Course Info

Course Category

Computer Science/Information Technology

Course Level

Graduate

Credit Hours

3

Prerequisites

N/A

Instructor

Dr. Usman Ghani, PhD


Course Contents

Introduction: Why Data Mining?,
Introduction: What Is Data Mining?,
Introduction: A Multi-Dimensional View of Data Mining,
Introduction: What Kind of Data Can Be Mined?,
Introduction: Are All Patterns Interesting?,
Introduction: What Technologies Are Used?,
Introduction: What Kind of Applications Are Targeted?,
Introduction: Major Issues in Data Mining,
Data Objects and Attribute Types: Types of Data Sets,
Data Objects and Attribute Types: Important Characteristics of Structured Data,
Data Objects and Attribute Types: Data Objects,
Data Objects and Attribute Types: Attributes,
Data Objects and Attribute Types: Attribute Types,
Data Objects and Attribute Types: Discrete vs. Continuous Attributes,
Data Visualization: Introduction,
Data Visualization: Pixel-Oriented Visualization Techniques,
Basic Statistical Descriptions of Data: Introduction,
Basic Statistical Descriptions of Data: Measuring the Central Tendency,
Basic Statistical Descriptions of Data: Symmetric vs. Skewed Data,
Basic Statistical Descriptions of Data: Measuring the Dispersion of Data,
Basic Statistical Descriptions of Data: Boxplot Analysis,
Basic Statistical Descriptions of Data: Graphic Displays of Basic Statistical Descriptions using Histogram,
Basic Statistical Descriptions of Data: Graphic Displays of Basic Statistical Descriptions using Quantile Plot,
Basic Statistical Descriptions of Data: Graphic Displays of Basic Statistical Descriptions using Scatter plot,
Data Visualization: Geometric Projection Visualization Techniques,
Data Visualization: Icon-Based Visualization Techniques,
Data Visualization: Hierarchical Visualization Techniques,
Data Visualization: Hierarchical Visualization examples,
Measuring Data Similarity and Dissimilarity: Introduction,
Measuring Data Similarity and Dissimilarity: Data Matrix and Dissimilarity Matrix,
Measuring Data Similarity and Dissimilarity: Proximity Measure,
Measuring Data Similarity and Dissimilarity: Standardizing Numeric Data,
Measuring Data Similarity and Dissimilarity: Distance on Numeric Data,
Measuring Data Similarity and Dissimilarity: Attributes of Mixed Type,
Measuring Data Similarity and Dissimilarity: Cosine Similarity,
Why Preprocess the Data: Introduction,
Why Preprocess the Data: Why Is Data Dirty?,
Why Preprocess the Data: Multi-Dimensional Measure of Data Quality,
Why Preprocess the Data: Major Tasks in Data Preprocessing,
Data Cleaning: Introduction,
Data Cleaning: Missing Data,
Data Cleaning: Noisy Data,
Data Cleaning: How to Handle Noisy data using Binning,
Data Cleaning: How to Handle Noisy data using Regression and Cluster Analysis,
Data integration and transformation: Introduction,
Data integration and transformation: Handling Redundancy in Data Integration,
Data integration and transformation: Detect Redundancy in Data Integration using Correlation Analysis,
Data integration and transformation: Data Transformation methods,
Data integration and transformation: Normalization Example,
Data reduction: Introduction,
Data reduction: Data cube aggregation,
Data reduction: Data Compression,
Data reduction: Dimensionality Reduction using Wavelet Transformation,
Data reduction: Dimensionality Reduction using PCA,
Data reduction: Numerosity Reduction,
Data reduction: Numerosity Reduction using Regression and Log-Linear Models,
Data reduction: Numerosity Reduction using Histogram,
Data reduction: Numerosity Reduction using Clustering,
Data reduction: Numerosity Reduction using Sampling,
What is a data warehouse?: Introduction,
What is a data warehouse?: Subject-Oriented,
Data warehouse architecture,
What is a data warehouse?: Data Warehouse vs. Operational DBMS,
Data warehouse architecture: Data Warehouse Models,
Data Warehouse Metadata,
A multidimensional data model,
A multidimensional data model: Example of Star Schema,
A multidimensional data model: Example of Snowflake Schema,
A multidimensional data model: Example of Fact Constellation,
Concept Hierarchy,
Multidimensional Data Models,
A multidimensional data model: Typical OLAP Operations,
NLP,
Stages of NLP,
Syntax processing,
Other stages of NLP,
Regular expressions,
Errors,
Tokenization & issues in tokenization,
Word normalization,
Language model,
Chain rule,
Markov assumption,
N-gram probabilities,
LM example,
Spell correction,
Noisy Channel,
Candidate Generation,
Noisy channel probability,
Bigram Based Correction,
Text Classification,
Text Classification Examples,
BOW Model,
Formalizing Text Classification,
Bayes Classification Methods: Why?,
Naive Bayes Independence,
Naive Bayes Parameters Learning,
Naive Bayes Smoothing,
Naive Bayes and LM,
NB Example,
NB Advantage,
F Measure,
Basic Concepts of Mining Frequent patterns: Introduction,
Basic Concepts of Mining Frequent Patterns, Association and Correlation: Why Is Freq. Pattern Mining Important?,
Market Basket Analysis,
Frequent Itemset Mining Methods: Apriori: Example,
Frequent Itemset Mining Methods: Apriori: Pseudo Code,
Frequent Itemset Mining Methods: Mining Closed Frequent Patterns and Max-Patterns,
Frequent Itemset Mining Methods: How to Count Supports of Candidates?,
Frequent Itemset Mining Methods: Improving the Efficiency of Apriori,
Basic Concepts of Mining Frequent Patterns, Association and Correlation: Computational Complexity of Frequent Itemset Mining,
Frequent Itemset Mining Methods: ECLAT: Frequent Pattern Mining with Vertical Data Format,
Which Patterns Are Interesting? Pattern Evaluation Methods: Interest Measure,
Basic Concepts of Classification: Introduction,
Basic Concepts of Classification: Supervised vs. Unsupervised Learning,
Basic Concepts of Classification: A Two-Step Process,
Classification Issues,
Classification Methods,
Decision Tree Induction,
Decision Trees: Introduction,
Decision Tree Induction Algorithm,
Pruning,
Entropy,
Attribute Selection Introduction,
Information Gain,
Gain Ratio,
Gini Index,
Attribute Selection Comparison,
Attributes Measure,
Enhancement to DTI,
Introduction to RainForest,
Example of RainForest,
BOAT,
Rule-Based Classification: Using IF-THEN Rules for Classification,
Rule-Based Classification: Rule Extraction from a Decision Tree,
Rule-Based Classification: Rule Induction: Sequential Covering Method,
Model Evaluation and Selection: Introduction,
Model Evaluation and Selection: Confusion Matrix,
Model Evaluation and Selection: Accuracy, Error Rate, Sensitivity and Specificity using the Evaluation Matrix,
Model Evaluation and Selection: Holdout & Cross-Validation Methods,
Model Evaluation and Selection: Bootstrap for evaluation of classifier,
Model Evaluation and Selection: ROC Curves,
Model Evaluation and Selection: Issues Affecting Model Selection,
Techniques to Improve Classification Accuracy: Ensemble Methods: Introduction,
Techniques to Improve Classification Accuracy: Ensemble Methods: Bagging: Bootstrap Aggregation,
Techniques to Improve Classification Accuracy: Ensemble Methods: Boosting,
Techniques to Improve Classification Accuracy: Ensemble Methods: Random Forest,
Classification of Imbalanced Data,
Classification by Backpropagation: Introduction,
Classification by Backpropagation: Neural Network as a Classifier,
Classification by Backpropagation: A Multi-Layer Feed-Forward Neural Network,
Classification by Backpropagation: Defining a Network Topology,
Classification by Backpropagation: Backpropagation,
Neural Networks Evaluation,
Support Vector Machines: Introduction,
Support Vector Machines: History and Applications,
Support Vector Machines: General Philosophy,
Basic Concepts of Cluster Analysis: What is Cluster Analysis?,
Basic Concepts of Cluster Analysis: Clustering for Data Understanding and Applications,
Basic Concepts of Cluster Analysis: Clustering as a Preprocessing Tool,
Basic Concepts of Cluster Analysis: Quality: What Is Good Clustering?,
Basic Concepts of Cluster Analysis: Measure the Quality of Clustering,
Clustering Criteria,
Basic Concepts of Cluster Analysis: Requirements and Challenges,
Clustering Matrices,
Types of data of Cluster Analysis: Interval-Valued Variables,
Types of data of Cluster Analysis: Binary Variables,
Types of data of Cluster Analysis: Ratio-Scaled Variables,
Partitioning Methods: Basic Concept,
Partitioning Methods: The K-Means Clustering Method,
Partitioning Methods: Comments on the K-Means Method,
Partitioning Methods: Variations of the K-Means Method,
Partitioning Methods: The K-Medoids Clustering Method,
Hierarchical Methods: Introduction,
Hierarchical Methods: AGNES (Agglomerative Nesting),
Hierarchical Methods: DIANA (Divisive Analysis),
Density-Based Methods: Introduction,
DBM Parameters,
Density-Based Methods: DBSCAN,
Grid-Based Methods: Introduction,
Grid-Based Methods: STING: A Statistical Information Grid Approach,
Grid-Based Methods: Clustering by Wavelet Analysis,
Model-Based Methods: Introduction,
Model-Based Methods: EM (Expectation Maximization),
Model-Based Methods: Conceptual Clustering,
Model-Based Methods: COBWEB Clustering Method,
Model-Based Methods: Neural Network Approach,
Model-Based Methods: Self-Organizing Feature Map (SOM),
Outlier Analysis: What Is Outlier Discovery?,
Outlier Analysis: Statistical Approaches for outlier discovery,
Outlier Analysis: Distance-Based Approach for outlier discovery,
Outlier Analysis: Deviation-Based Approach for outlier discovery,
Clustering High-Dimensional Data: Introduction,
Clustering High-Dimensional Data: The Curse of Dimensionality,
Clustering High-Dimensional Data: Why Subspace Clustering?,
Clustering High-Dimensional Data: CLIQUE,
Introduction to WEKA,
Features of WEKA,
Attributes of WEKA,
Preprocessing in WEKA,
WEKA Classifier Example,
Demo of WEKA Classifier,
WEKA Classification Results,
Result Visualization,
WEKA Clustering,
Association Findings in WEKA,
Web Mining,
Web Mining Introduction,
Introduction to Text Mining,
IR,
Information Extraction,
Web Structure Mining,
Web Search,
Cyber Community,
Types of Cyber Community,
Web Mining: Usage,
Web Mining Details,
Strategies for Web search,
Web Architecture


