CS725: Data Mining

Course Info

Course Category

Computer Science/Information Technology

Course Level

Graduate

Credit Hours

3

Pre-requisites

N/A

Instructor

Dr. Usman Ghani
PhD

Course Contents

Introduction: Why Data Mining?, Introduction: What Is Data Mining?, Introduction: A Multi-Dimensional View of Data Mining, Introduction: What Kind of Data Can Be Mined?, Introduction: Are All Patterns Interesting?, Introduction: What Technologies Are Used?, Introduction: What Kind of Applications Are Targeted?, Introduction: Major Issues in Data Mining, Data Objects and Attribute Types: Types of Data Sets, Data Objects and Attribute Types: Important Characteristics of Structured Data, Data Objects and Attribute Types: Data Objects, Data Objects and Attribute Types: Attributes, Data Objects and Attribute Types: Attribute Types, Data Objects and Attribute Types: Discrete vs. Continuous Attributes, Data Visualization: Introduction, Data Visualization: Pixel-Oriented Visualization Techniques, Basic Statistical Descriptions of Data: Introduction, Basic Statistical Descriptions of Data: Measuring the Central Tendency, Basic Statistical Descriptions of Data: Symmetric vs. Skewed Data, Basic Statistical Descriptions of Data: Measuring the Dispersion of Data, Basic Statistical Descriptions of Data: Box Plot Analysis, Basic Statistical Descriptions of Data: Graphic Displays of Basic Statistical Descriptions using Histogram, Basic Statistical Descriptions of Data: Graphic Displays of Basic Statistical Descriptions using Quantile Plot, Basic Statistical Descriptions of Data: Graphic Displays of Basic Statistical Descriptions using Scatter Plot, Data Visualization: Geometric Projection Visualization Techniques, Data Visualization: Icon-Based Visualization Techniques, Data Visualization: Hierarchical Visualization Techniques, Data Visualization: Hierarchical Visualization Examples, Measuring Data Similarity and Dissimilarity: Introduction, Measuring Data Similarity and Dissimilarity: Data Matrix and Dissimilarity Matrix, Measuring Data Similarity and Dissimilarity: Proximity Measure, Measuring Data Similarity and 
Dissimilarity: Standardizing Numeric Data, Measuring Data Similarity and Dissimilarity: Distance on Numeric Data, Measuring Data Similarity and Dissimilarity: Attributes of Mixed Type, Measuring Data Similarity and Dissimilarity: Cosine Similarity, Why Preprocess the Data: Introduction, Why Preprocess the Data: Why Is Data Dirty?, Why Preprocess the Data: Multi-Dimensional Measure of Data Quality, Why Preprocess the Data: Major Tasks in Data Preprocessing, Data Cleaning: Introduction, Data Cleaning: Missing Data, Data Cleaning: Noisy Data, Data Cleaning: How to Handle Noisy Data using Binning, Data Cleaning: How to Handle Noisy Data using Regression and Cluster Analysis, Data Integration and Transformation: Introduction, Data Integration and Transformation: Handling Redundancy in Data Integration, Data Integration and Transformation: Detecting Redundancy in Data Integration using Correlation Analysis, Data Integration and Transformation: Data Transformation Methods, Data Integration and Transformation: Normalization Example, Data Reduction: Introduction, Data Reduction: Data Cube Aggregation, Data Reduction: Data Compression, Data Reduction: Dimensionality Reduction using Wavelet Transformation, Data Reduction: Dimensionality Reduction using PCA, Data Reduction: Numerosity Reduction, Data Reduction: Numerosity Reduction using Regression and Log-Linear Models, Data Reduction: Numerosity Reduction using Histogram, Data Reduction: Numerosity Reduction using Clustering, Data Reduction: Numerosity Reduction using Sampling, What Is a Data Warehouse?: Introduction, What Is a Data Warehouse?: Subject-Oriented, Data Warehouse Architecture, What Is a Data Warehouse?: Data Warehouse vs. 
Operational DBMS, Data Warehouse Architecture: Data Warehouse Models, Data Warehouse Metadata, A Multi-Dimensional Data Model, A Multi-Dimensional Data Model: Example of Star Schema, A Multi-Dimensional Data Model: Example of Snowflake Schema, A Multi-Dimensional Data Model: Example of Fact Constellation, Concept Hierarchy, Multi-Dimensional Data Models, A Multi-Dimensional Data Model: Typical OLAP Operations, NLP, Stages of NLP, Syntax Processing, Other Stages of NLP, Regular Expressions, Errors, Tokenization & Issues in Tokenization, Word Normalization, Language Model, Chain Rule, Markov Assumption, N-gram Probabilities, LM Example, Spell Correction, Noisy Channel, Candidate Generation, Noisy Channel Probability, Bigram-Based Correction, Text Classification, Text Classification Examples, BOW Model, Formalizing Text Classification, Bayes Classification Methods: Why?, Naive Bayes Independence, Naive Bayes Parameter Learning, Naive Bayes Smoothing, Naive Bayes and LM, NB Example, NB Advantage, F-Measure, Basic Concepts of Mining Frequent Patterns: Introduction, Basic Concepts of Mining Frequent Patterns, Association and Correlation: Why Is Freq. Pattern Mining Important?, Market Basket Analysis, Frequent Itemset Mining Methods: Apriori: Example, Frequent Itemset Mining Methods: Apriori: Pseudo Code, Frequent Itemset Mining Methods: Mining Closed Frequent Patterns and Max-Patterns, Frequent Itemset Mining Methods: How to Count Supports of Candidates?, Frequent Itemset Mining Methods: Improving the Efficiency of Apriori, Basic Concepts of Mining Frequent Patterns, Association and Correlation: Computational Complexity of Frequent Itemset Mining, Frequent Itemset Mining Methods: ECLAT: Frequent Pattern Mining with Vertical Data Format, Which Patterns Are Interesting? - Pattern Evaluation Methods: Interest Measure, Basic Concepts of Classification: Introduction, Basic Concepts of Classification: Supervised vs. 
Unsupervised Learning, Basic Concepts of Classification: A Two-Step Process, Classification Issues, Classification Methods, Decision Tree Induction, Decision Trees: Introduction, Decision Tree Induction Algorithm, Pruning, Entropy, Attribute Selection Introduction, Information Gain, Gain Ratio, Gini Index, Attribute Selection Comparison, Attributes Measure, Enhancement to DTI, Introduction to RainForest, Example of RainForest, BOAT, Rule-Based Classification: Using IF-THEN Rules for Classification, Rule-Based Classification: Rule Extraction from a Decision Tree, Rule-Based Classification: Rule Induction: Sequential Covering Method, Model Evaluation and Selection: Introduction, Model Evaluation and Selection: Confusion Matrix, Model Evaluation and Selection: Accuracy, Error Rate, Sensitivity and Specificity using Evaluation Matrix, Model Evaluation and Selection: Holdout & Cross-Validation Methods, Model Evaluation and Selection: Bootstrap for Evaluation of Classifier, Model Evaluation and Selection: ROC Curves, Model Evaluation and Selection: Issues Affecting Model Selection, Techniques to Improve Classification Accuracy: Ensemble Methods: Introduction, Techniques to Improve Classification Accuracy: Ensemble Methods: Bagging: Bootstrap Aggregation, Techniques to Improve Classification Accuracy: Ensemble Methods: Boosting, Techniques to Improve Classification Accuracy: Ensemble Methods: Random Forest, Classification of Imbalanced Data, Classification by Backpropagation: Introduction, Classification by Backpropagation: Neural Network as a Classifier, Classification by Backpropagation: A Multi-Layer Feed-Forward Neural Network, Classification by Backpropagation: Defining a Network Topology, Classification by Backpropagation: Backpropagation, Neural Networks Evaluation, Support Vector Machines: Introduction, Support Vector Machines: History and Applications, Support Vector Machines: General Philosophy, Basic Concepts of Cluster Analysis: What Is Cluster 
Analysis?, Basic Concepts of Cluster Analysis: Clustering for Data Understanding and Applications, Basic Concepts of Cluster Analysis: Clustering as a Preprocessing Tool, Basic Concepts of Cluster Analysis: Quality: What Is Good Clustering?, Basic Concepts of Cluster Analysis: Measure the Quality of Clustering, Clustering Criteria, Basic Concepts of Cluster Analysis: Requirements and Challenges, Clustering Matrices, Types of Data in Cluster Analysis: Interval-Valued Variables, Types of Data in Cluster Analysis: Binary Variables, Types of Data in Cluster Analysis: Ratio-Scaled Variables, Partitioning Methods: Basic Concept, Partitioning Methods: The K-Means Clustering Method, Partitioning Methods: Comments on the K-Means Method, Partitioning Methods: Variations of the K-Means Method, Partitioning Methods: The K-Medoids Clustering Method, Hierarchical Methods: Introduction, Hierarchical Methods: AGNES (Agglomerative Nesting), Hierarchical Methods: DIANA (Divisive Analysis), Density-Based Methods: Introduction, DBM Parameters, Density-Based Methods: DBSCAN, Grid-Based Methods: Introduction, Grid-Based Methods: STING: A Statistical Information Grid Approach, Grid-Based Methods: Clustering by Wavelet Analysis, Model-Based Methods: Introduction, Model-Based Methods: EM - Expectation Maximization, Model-Based Methods: Conceptual Clustering, Model-Based Methods: COBWEB Clustering Method, Model-Based Methods: Neural Network Approach, Model-Based Methods: Self-Organizing Feature Map (SOM), Outlier Analysis: What Is Outlier Discovery?, Outlier Analysis: Statistical Approaches for Outlier Discovery, Outlier Analysis: Distance-Based Approach for Outlier Discovery, Outlier Analysis: Deviation-Based Approach for Outlier Discovery, Clustering High-Dimensional Data: Introduction, Clustering High-Dimensional Data: The Curse of Dimensionality, Clustering High-Dimensional Data: Why Subspace Clustering?, Clustering High-Dimensional Data: CLIQUE, Introduction to WEKA, Features of WEKA, 
Attributes of WEKA, Preprocessing in WEKA, WEKA Classifier: Example, Demo of WEKA Classifier, WEKA Classification: Results, Result Visualization, WEKA Clustering, Association Findings in WEKA, Web Mining, Web Mining Introduction, Introduction to Text Mining, IR, Information Extraction, Web Structure Mining, Web Search, Cyber Community, Types of Cyber Community, Web Mining: Usage, Web Mining Details, Strategies for Web Search, Web Architecture