Resources Page for MIS 611D and MIS 464

Class Resources for MIS 464, Data Analytics, and MIS 611D, Topics in Data and Web Mining (Spring 2019)

Instructor:  Hsinchun Chen, Ph.D., Professor, Management Information Systems Dept, Eller College of Management, University of Arizona

TOPIC 1: Introduction

  1. The 50 Best Jobs in America for 2019 (Glassdoor Ranking), January 23, 2019.
  2. University of Arizona MIS Program: Overview, by Dr. Hsinchun Chen
  3. ACM Student Membership Application and Order Form [PDF], [Online Link]
  4. IEEE Student Membership Application
  5. Ideas for the Future of the IS Field, by G. B. Davis, P. Gray, S. Madnick, J. F. Nunamaker, R. Sprague, and A. Whinston.  Transactions on Management Information Systems, Volume 1, Issue 1, pp. 2:1 - 2:15. [PDF copy here]
  6. Design Science, Grand Challenges, and Societal Impacts, by Hsinchun Chen. Transactions on Management Information Systems, Volume 2, Issue 1, pp. 1:1 - 1:10. [PDF copy here]
  7. Journals, Conferences, and Funding Sources for MIS Researchers and Educators: A Resource Guide, by Dr. Hsinchun Chen (Updated 2019)
  8. ISI Ranking of Top Computer Science and Information Systems Journals, from the ISI Web of Knowledge 2017 edition, updated in December 2018
  9. The H-Index for MIS, January 2019
  10. Template for Producing IT Research and Publication, by Dr. Hsinchun Chen
    1. IEEE Template - Word Document
    2. IEEE Paper Examples
      1. Exploring Threats and Vulnerabilities in Hacker Web: Forums, IRC and Carding Shops - PDF (Benjamin et al., 2015)
      2. Developing Understanding of Hacker Language through the use of Lexical Semantics - PDF (Benjamin and Chen, 2015)
      3. Detecting Cyber Threats in Non-English Dark Net Markets: A Cross-Lingual Transfer Learning Approach - PDF (Ebrahimi et al., 2018)
  11. Design Science in Information Systems Research, by Alan R. Hevner, Salvatore T. March, Jinsoo Park, and Sudha Ram.  MIS Quarterly, Volume 28, Number 1, pp. 75-105, March 2004.
  12. Positioning and Presenting Design Science Research for Maximum Impact, by Shirley Gregor and Alan R. Hevner. MIS Quarterly, Volume 37, Number 2, pp. 337-355, June 2013.
  13. Editor's Comments: Diversity of Design Science Research, by Arun Rai, Andrew Burton-Jones, Hsinchun Chen, Alok Gupta, Alan R. Hevner, and Wolfgang Ketter. MIS Quarterly, Volume 41, Number 1, pp. iii-xviii, March 2017.
  14. MISQ BI Special Issue: Business Intelligence and Analytics: From Big Data to Big Impact, by Hsinchun Chen et al. (2012).
  15. UC Berkeley’s Fastest-Growing Class Is Data Science 101, by Douglas Belkin, WSJ, November 2, 2018
  16. Special Issue: BD2K Centers Open Doors to Discovery, Biomedical Computation Review, Summer 2017. [Online]
  17. CyberGate: A Design Framework and System for Text Analysis of Computer-Mediated Communication, by Abbasi and Chen, December 2008 (MISQ) - PDF
  18. CyberGate: A Design Framework and System for Text Analysis of CMC, by Abbasi and Chen, 2008 - PPT
  19. Detecting Fake Websites: The Contribution of Statistical Learning Theory, by Abbasi et al., September 2010 (MISQ) - PDF
  20. Healthcare Predictive Analytics for Risk Profiling in Chronic Care: A Bayesian Multitask Learning Approach, by Yu-Kai Lin et al., 2017 (MISQ) - PDF
  21. DICE-E: A Framework for Conducting Darknet Identification, Collection, Evaluation, with Ethics, by Victor Benjamin, Joseph S. Valacich, and Hsinchun Chen, Forthcoming (MISQ) - PDF
  22. His Promise to Heal Bad Hearts Relied on Mountain of False Data, by Gina Kolata, WSJ, October 30, 2018. [Online]
  23. Big Data Technology - Hadoop, MapReduce, and Spark (Jonathan Jiang, with updates from Sagar Samtani and Shuo Yu, 2019)
  24. Introduction to Blockchain and a Demo on Financial Trades (Eric Tham, 2018)
  25. WEKA Overview (Sagar Samtani, Weifeng Li, and Hsinchun Chen, with updates from Shuo Yu, 2019)
    1. iris-train, iris-test, houses-train, houses-test
  26. Tableau Overview and Publicly Available Data Sources (Sagar Samtani and Hsinchun Chen, with updates from Hongyi Zhu, 2019)
    1. Sample NFL Dataset for Visualization

TOPIC 2: Web Mining

  1. Inside Internet Search Engines, by Jan Pedersen and William Chang (SIGIR 1999)
    1. Fundamentals, by Jan Pedersen and William Chang (SIGIR 1999) (398K)
    2. Spidering and Indexing, by Jan Pedersen and William Chang (SIGIR 1999) (41K)
    3. Search, by Jan Pedersen and William Chang (SIGIR 1999) (553K)
    4. Products, by William Chang and Jan Pedersen (SIGIR 1999) (75K)
    5. Business, by William Chang and Jan Pedersen (SIGIR 1999) (37K)
  2. Search Engines and Their Algorithms, by C. Lee Giles (2018) (33M)
  3. The Anatomy of a Large-Scale Hypertextual Web Search Engine, S. Brin and L. Page (1998)
  4. Page Rank and Google Story, by Vise and Malseed, 2005
  5. AI, Chapter 4. Search Algorithm, Winston (1984)
  6. GA Handout (27M)
  7. Network Science (Sagar Samtani, Weifeng Li, Hsinchun Chen, 2016)
  8. The Great Giveaway (25M), by Erick Schonfeld, Business 2.0 (April 2005)
  9. The Long Tail, by Chris Anderson, WIRED Magazine (December 2004)
  10. Web 2.0 ... The Machine is Us/ing Us (YouTube)
  11. What Is Web 2.0? Design Patterns and Business Models for the Next Generation of Software, by Tim O'Reilly (2005)
  12. Facebook Story (2012)
  13. Communications of the ACM (2011):
    1. Reflecting on the DARPA Red Balloon Challenge, by John C. Tang et al. (April 2011)
    2. Crowdsourcing Systems on the World-Wide Web, by Anhai Doan et al. (April 2011)
    3. An Overview of Business Intelligence Technology, by Surajit Chaudhuri et al. (August 2011)
  14. World (Patent) War, from the Bloomberg Businessweek Technology section, March 12, 2012.
  15. The Netflix Recommender System: Algorithms, Business Value, and Innovation (Gomez-Uribe and Hunt, 2015)
  16. Matrix Factorization Techniques for Recommender Systems (Koren, Bell, and Volinsky, 2009)
  17. Zillow awards $1 million to data scientists for improving its Zestimate algorithm, by Natalie Gagliordi (January 2019)
  18. Data Science and Prediction, by Vasant Dhar (2013)
  19. Harvard Business Review (October 2012)
    1. Big Data: The Management Revolution (from HBR 10/12)
    2. Data Scientist: The Sexiest Job of the 21st Century (from HBR 10/12)
    3. Making Advanced Analytics Work for You (from HBR 10/12)
  20. Hype Cycle for Business Intelligence, 2011, by Andreas Bitterer, Gartner Report (Aug. 12 2011)
  21. Magic Quadrant for Business Intelligence Platforms, by Rita L. Sallam et al., Gartner Report (Jan. 27 2011)
  22. The 2011 IBM Tech Trends Report, by IBM (Nov. 15th, 2011)
  23. The Economist, A Special Report on Social Networking: A World of Connections (January 30th 2010):
    1. A world of connections (from The Economist 1/30/10)
    2. Global swap shops (from The Economist 1/30/10)
    3. Twitter's transmitters (from The Economist 1/30/10)
    4. Profiting from friendship (from The Economist 1/30/10)
  24. The Economist, Data, Data, Everywhere: A Special Report on Managing Information (February 25th 2010); includes the following pieces:
    1. The data deluge
    2. Data, data everywhere
    3. All too much
    4. A different game
    5. Show me
    6. Needle in a haystack
    7. New rules for big data
    8. Clicking for gold
    9. Handling the cornucopia
    10. The open society
    11. Sources and acknowledgments
  25. The Economist, A Special Report on Personal Technology (October 8th 2011).  Includes the following sections:
    1. Beyond the PC
    2. The Power of Many
    3. The Beauty of Bite-sized Software
    4. IT's Arab Spring
    5. Up Close
  26. The Economist, Special Report, Cyber-Security, July 12, 2014: Defending the Digital Frontier.  Includes the following sections:
    1. Cybercrime: Hackers, Inc.
    2. Vulnerabilities: Zero-day game
    3. Business: Digital disease control
    4. Critical infrastructure: Crashing the system
    5. Market failures: Not my problem
    6. The Internet of Things: Home, hacked home
    7. Remedies: Prevention is better than cure
  27. The Economist, Technology Quarterly, Civilian Drones, June 10, 2017: Taking Flight. Includes the following sections:
    1. Give and take
    2. Seeing is believing
    3. Can drones deliver the goods?
    4. Rules and tools
  28. The Economist, Special Report, The Economics of Longevity, July 8, 2017: The New Old. Includes the following sections:
    1. Footloose and fancy-free
    2. Rock around the clock
    3. Don't call us silver
    4. Your money and your life
    5. Tablets for every problem
    6. A blessing, not a burden
  29. The Economist, September 9, 2017: Facial Industry. Includes the following sections:
    1. The facial-industry complex
    2. Keeping a straight face
    3. Making faces from DNA
  30. The Economist, Special Report, Autonomous Vehicles, March 3, 2018: Reinventing Wheels. Includes the following sections:
    1. From here to autonomy
    2. Selling rides, not cars
    3. The new autopia
    4. A different world
    5. Rules of the road
  31. The Economist, Special Report, AI in Business, March 31, 2018: GrAIt Expectations. Includes the following sections:
    1. In algorithms we trust
    2. Here to help
    3. Hire education
    4. Smile, you're on camera
    5. Leave it to the experts
    6. Two faced
  32. A Special Report on Artificial Intelligence. The New York Times, October 19, 2018. Includes the following articles:
    1. Workers Beware, by David Kaufman, October 18, 2018
    2. What Comes After the Roomba? by John Markoff, October 21, 2018
    3. The Computerized Chauffeur, by Norman Mayersohn, October 19, 2018
    4. A.I. Is Beginning to Assist Novelists, by David Streitfeld, October 18, 2018
    5. The A.I. Wave Is Here, by Steve Lohr, October 21, 2018
    6. Acknowledging the Pitfalls, Too, by Cade Metz, October 22, 2018
    7. Will There Be a Ban on Killer Robots? by Adam Satariano, October 19, 2018
    8. Breaking Big Tech's Hold on A.I., by Nathaniel Popper, October 20, 2018
  33. Chinese Unicorns Rush Out IPOs, by Julie Steinberg and Liza Lin, WSJ, May 2, 2018.
  34. Introduction to Web Application and APIs (Revised by Jonathan Jiang and Julian Guo):
    1. Flickr Photo Search API Sample Code
    2. Amazon Product Advertising API Sample Code
    3. YouTube Data API Sample Code
    4. Yelp API Sample Code

TOPIC 3: Data Mining

  1. Top 10 Algorithms in Data Mining (PDF)
  2. ID3 Handout
  3. Backpropagation Neural Network Handout
  4. Self-organizing maps: an introduction
  5. K-means algorithm
  6. Expert Prediction, Symbolic Learning, and Neural Networks: An Experiment on Greyhound Racing, by Hsinchun Chen et al., IEEE Expert (December 1994)
  7. Introduction to Support Vector Machine (SVM) and Conditional Random Field (CRF) (Long Version, Short Version)
  8. Pattern Recognition using Support Vector Machine and Principal Component Analysis (Ahmed Abbasi)
  9. Predictive Analytics - Regression and Classification (Weifeng Li, Sagar Samtani, Hsinchun Chen, 2016)
  10. Logistic Regression and Elastic Net (Weifeng Li and Hsinchun Chen, 2016)
  11. Publicly Available Data Sources (Sagar Samtani and Hsinchun Chen, with updates from Hongyi Zhu, 2019)
  12. Google masters Go (Nature, Elizabeth Gibney, January 28, 2016)
  13. Artificial Intelligence Go Showdown (The Economist, March 12, 2016)
  14. Artificial Intelligence - Million Dollar Babies - The Economist, April 2, 2016
  15. Mastering the game of Go with deep neural networks and tree search (Nature, Silver et al., 2016)
  16. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation, by Miles Brundage et al., February 2018.
  17. Deep Learning (Nature, LeCun et al., 2015)
  18. Machine Learning: Trends, Perspectives, and Prospects (Science, Jordan and Mitchell, 2015)
  19. Editorial: Chess, a Drosophila of Reasoning (Science, Kasparov, 2018)
  20. One Giant Step for a Chess-Playing Machine (New York Times, Strogatz, 2018)
  21. A General Reinforcement Learning Algorithm That Masters Chess, Shogi, and Go Through Self-play (Science, Silver et al., 2018)
  22. Deep Learning: An Overview (Weifeng Li, Victor Benjamin, Xiao Liu, and Hsinchun Chen, 2016)
  23. Topic Modeling and Latent Dirichlet Allocation: An Overview (Weifeng Li, Sagar Samtani, and Hsinchun Chen, 2016)
  24. An Introduction to Convolutional Neural Networks (Shuo Yu and Hsinchun Chen, 2018)
  25. An Introduction to Recurrent Neural Networks (Hongyi Zhu and Hsinchun Chen, 2018)
  26. Autoencoders: Overviews and Selected Application (Sagar Samtani and Hsinchun Chen, 2018)
  27. An Introduction to Deep Transfer Learning (Mohammadreza Ebrahimi and Hsinchun Chen, 2018)
  28. An Overview of Topic Modeling (Weifeng Li and Hsinchun Chen, 2018)
  29. Deep Generative Models: An Overview (Yidong Chai, Weifeng Li, and Hsinchun Chen, 2018)
  30. Artificial Intelligence and Deep Learning (Lee Giles, 2018)
  31. Representation Learning (Alexander G. Ororbia II and Lee Giles, 2018)

TOPIC 4: Text Mining

  1. Information Visualization
  2. Information Visualization for Digital Library (2.21M)
  3. Visualizing Data (Hongyi Zhu, Sagar Samtani, Hsinchun Chen, 2016)
  4. Text Mining: Techniques, Tools, Ontologies and Shared Tasks (Xiao Liu, with updates from Shuo Yu, 2019)

TOPIC 5: Emerging Research in Data and Web Mining (for MIS 611D)

  1. COPLINK, Dark Web, and Hacker Web: A Research Path in Security Informatics, by Dr. Hsinchun Chen
  2. Criminal Network Analysis and Visualization, by Jennifer Xu and Hsinchun Chen
  3. The Topology of Dark Networks, by Jennifer Xu and Hsinchun Chen
  4. Exploring Dark Networks: From the Surface Web to the Dark Web, by Hsinchun Chen, October 2017
  5. CyberGate: A Design Framework and System for Text Analysis of CMC, by Ahmed Abbasi and Hsinchun Chen
  6. MedTime: A Temporal Information Extraction System for Clinical Narratives, by Yu-Kai Lin, Hsinchun Chen and Randall A. Brown (2013)
  7. Smart and Connected Health: Guest Editors' Introduction, by Gondy Leroy, Hsinchun Chen, and Thomas C. Rindflesch (2014)
  8. Time-To-Event Predictive Modeling for Chronic Conditions Using Electronic Health Records, by Yu-Kai Lin, Hsinchun Chen, Randall A. Brown, Shu-Hsing Li, and Hung-Jen Yang (2014)
  9. Identifying Adverse Drug Events from Patient Social Media: A Case Study for Diabetes, by Xiao Liu and Hsinchun Chen (2015)
  10. HackerWeb and Shodan Access (Jonathan Jiang)
    1. Hacker Web Sample Code
    2. Shodan Sample Code
  11. Homeland Security Data Mining using Social (Dark) Network Analysis, ISI 2008, Keynote Address, by Dr. Chen (18.4M)
  12. Health Big Data Analytics: Clinical Decision Support and Patient Empowerment, by Dr. Hsinchun Chen
  13. IEEE Intelligent Systems, Trends & Controversies; with introductions by Dr. Hsinchun Chen (2009, 2010, 2011):
    1. AI and Global Science and Technology Assessment, by Hsinchun Chen (July/August 2009)
    2. AI, E-Government, and Politics 2.0, by Hsinchun Chen (September/October 2009)
    3. AI for Global Disease Surveillance, by Hsinchun Chen and Daniel Zeng (November/December 2009)
    4. Business and Market Intelligence 2.0, by Hsinchun Chen (January/February 2010)
    5. AI and Opinion Mining, by Hsinchun Chen and David Zimbra (May/June 2010)
    6. AI and Security Informatics, by Hsinchun Chen (September/October 2010)
    7. AI, Virtual Worlds, and Massively Multiplayer Online Games, by Hsinchun Chen and Yulei Zhang (January/February 2011)
    8. Smart Health and Wellbeing, by Hsinchun Chen (September/October 2011)
    9. Smart Market and Money, by Hsinchun Chen (November/December 2011)
  14. Recent Research at the Artificial Intelligence Lab of the University of Arizona: AZSecure for Advanced Cyber Threat Intelligence and SilverLink for Proactive Mobile Health, by Hsinchun Chen, March 2018.
  15. Cybersecurity and AI: A Data Science Perspective, by Hsinchun Chen, November 2018.
  16. Cyber Threat Intelligence (February 2019)
    1. Overview Fundamental
    2. Hacker Community Data
    3. Hacker Assets Portal
    4. Looking to the Future


Photo courtesy of DARPA, available through Wikimedia Commons.