MIS 510 "Data and Web Mining"
This course introduces data structures and algorithms suited to developing Internet-based information systems for business intelligence, search engines, digital libraries, knowledge management systems, web/data/text mining, national security, and biomedical informatics. The course consists of lectures, readings, programming assignments, lab sessions, and a large-scale, hands-on system development project. It begins with selected fundamental data structures (e.g., stacks, queues, lists, trees, and graphs) and sorting and searching algorithms. Newer and more robust web/data/text mining algorithms (e.g., neural networks, decision trees, genetic algorithms, spreading activation, information retrieval, natural language processing) are then introduced in the context of modern and emerging information systems in business, engineering, and bioinformatics.
Syllabus and Important Materials (Spring 2009)
- MIS 510 Syllabus
- Hsinchun Chen, (2001), Knowledge Management Systems: A Text Mining Perspective
- Hsinchun Chen, (2002), Trailblazing a Path Towards Knowledge and Transformation
- Web Mining Project Resources
- Introduction to Weka and NetDraw
Other Course-Related Materials
- Web 2.0: Introduction (Dr. Hsinchun Chen)
- Tim O'Reilly, (2005), What Is Web 2.0? Design Patterns and Business Models for the Next Generation of Software
- Web 2.0 (Wikipedia)
- Web 2.0 ... The Machine is Us/ing Us (YouTube)
- The Long Tail, by Chris Anderson, WIRED Magazine, December 2004
- The Great Giveaway (25M)
- Using Open Web APIs in Teaching Web Mining
- Web Mining: Machine Learning for Web Applications, H. Chen and M. Chau, (2004)
- The Anatomy of a Large-Scale Hypertextual Web Search Engine, S. Brin and L. Page, (1998)
- A Smart Itsy Bitsy Spider for the Web
- AI, Chapter 4, Winston, (1984)
- Assignment 1: GA (Spring, 2009)
- Assignment 2: Neural Network (Spring, 2009)
- Assignment 2: Iris dataset (Spring, 2009)
- GA Handout (27M)
- ID3 Handout
- Backpropagation Neural Network Handout
- Self-organizing Maps Handout
- Prim's and Kruskal's Minimum Spanning Tree Algorithms
- Credit Rating Analysis with Support Vector Machines and Neural Network: A Market Comparative Study
- An Automatic Classification Approach to Business Stakeholder Analysis on the Web
- Major Web Intelligence Tools
- Web Marketing Research (Dr. Hsinchun Chen)
Guest Lectures
- Programming with Amazon, Google, and eBay (Chun-Ju Tseng)
- Web Programming and Web Services (Chun-Ju Tseng)
- Software Agents, Multi-Agent Systems, and Data Mining (Dr. Daniel Zeng)
- Pattern Recognition using Support Vector Machine and Principal Component Analysis (Ahmed Abbasi)
- TimelyBid (Sean Humphreys)
- iDog (Chris Chang)
- Smart Gift Card (Gavin Zhang)
Slides
- UA MIS Program Overview (846K)
- Inside Internet Search Engines: Fundamentals (398K)
- Inside Internet Search Engines: Spidering and Indexing (41K)
- Inside Internet Search Engines: Search (553K)
- Inside Internet Search Engines: Products (75K)
- Inside Internet Search Engines: Business (37K)
- Page Rank and Google
- From Search Engines to Web Mining
- Web Mining: Machine Learning for Web Applications
- A Graph-based Recommender System
- Analytical and Visual Data Mining (5.29M)
- Data Mining: Part I (1.96M)
- Data Mining: Part II (3.83M)
- Data Mining: Part III (3.35M)
- Top 10 Algorithms in Data Mining (PDF)
- Information Visualization for Digital Library (2.26M)
- Knowledge Management Systems: Development and Applications Part I: Overview and Related Fields (1.91M)
- Knowledge Management Systems: Development and Applications Part II: Techniques and Examples (2.46M)
- Knowledge Management Systems: Development and Applications Part III: Case Studies and Future (13.91M)
- Internet Searching and Browsing in a Multilingual World (2.23M)
- An Automatic Text Mining Framework for Knowledge Discovery on the Web (3.43M)
- Achieving Information Resources Empowerment: A Digital Library and Knowledge Management Perspective (10.9M)
- Digital Library Development in the Asia Pacific (16.9M)
- Information Visualization




