Text Based Information Retrieval
H02C8A
Marie-Francine Moens
Karl Gyllstrom
Katholieke Universiteit Leuven
Study points: 4
Language: English
Periodicity: Taught in the second semester
e-mail: sien.moens@cs.kuleuven.be, karl.gyllstrom@cs.kuleuven.be
2011-2012

• Aims of the course:
– Acquire the fundamental techniques for text based information retrieval and text mining
– Learn to design, partially implement, and evaluate a text based information retrieval system
– Acquire insights into current research questions
– Illustrate with commercial applications (1 lesson: speaker of an international company)

E.g., retrieval models (algebraic, probabilistic, link-based), advanced representations (e.g., LSA, LDA), index structures, ...

E.g., text categorization, information extraction, text clustering, summarization, cross-language and cross-media retrieval, ...

Prerequisites
• Basic knowledge of:
– Probability theory and statistics
– Information theory
– Linear algebra
– (Machine learning)

Course material
• Course slides and exercise questions/solutions can be downloaded from the Toledo platform
– http://toledo.kuleuven.be
– Background literature

Evaluation
• An assignment (grading: 33.3%):
– Paper or programming assignment
– Available week 7
– Solution is due week 16
– A score of 50% or more is transferred to the September exam session
• Theory exam (grading: 33.3%): Oral with written preparation, closed book.
• Exercise exam (grading: 33.3%): Written, open book.

Text Based Information Retrieval
H02C8B
Marie-Francine Moens
Karl Gyllstrom
Katholieke Universiteit Leuven
Study points: 6
Language: English
Periodicity: Taught in the second semester
e-mail: sien.moens@cs.kuleuven.be, karl.gyllstrom@cs.kuleuven.be
2011-2012

• See H02C8A
• Additional lectures and exercise session on:
– Data structures and search techniques
– Compression of textual data
– Fusion and learning to rank
• Assignment = programming assignment:
– Available week 7
– Solution of part 1 is due week 16
– Solution of part 2 is due week 21