Course Overview
|
Course Synopsis
|
This course covers the theory, design, and implementation of text-based information retrieval (IR) systems. The core components of an IR system covered here include the statistical characteristics of text, the representation of information needs and documents, several important retrieval models (Boolean, vector space, probabilistic, inference net, language modeling, link analysis), clustering algorithms, automatic text categorization, recommender systems, search computing, search engine optimization, multimedia IR, the semantic web, and experimental evaluation. The software architecture components include the design and implementation of high-capacity text retrieval and text filtering systems. Queries related to the “deep web” are discussed under the topic of Search Computing. Finally, PageRank computation, Latent Semantic Indexing, other advanced topics, and the latest research trends are also discussed.
|
Course Learning Outcomes
|
Developing an understanding of the theory and practice of text retrieval techniques
- You will be able to understand the theory of IR systems, how such systems work, and how they are applied to real-life problems.
|
Course Calendar
|
2
|
Information Retrieval Models, Boolean Retrieval Model
|
3
|
Boolean Retrieval Model, Ranked Retrieval Model
|
4
|
Vector Space Retrieval Model
|
5
|
TF-IDF Weighting, Document Representation in Vector Space, Query Representation in Vector Space, Similarity Measures
|
6
|
Similarity Measures; Cosine Similarity Measure
|
8
|
Tokens, Numbers, Stop Words
|
10
|
Lemmatization, Stemming
|
17
|
Processing a phrase query, Proximity queries
|
18
|
Wildcard Queries, B-Tree
|
19
|
Permuterm Index, k-gram Index
|
21
|
Spelling Correction (contd.)
|
22
|
Spelling Correction (contd.)
|
23
|
Performance Evaluation of Information Retrieval Systems
|
24
|
Benchmarks for the Evaluation of IR Systems
|
25
|
Benchmarks for the Evaluation of IR Systems (contd.)
|
27
|
Mean Average Precision, Non-Binary Relevance, DCG, NDCG
|
30
|
Sampling and pre-grouping
|
31
|
Dimensionality reduction
|
40
|
Top-k Query Processing
|
45
|
Final Notes on Information Retrieval
|