Language-Modeling Kernel Based Approach for Information Retrieval
Department
Computer Science
Document Type
Article
Publication Date
9-2007
Abstract
In this presentation, we propose a novel integrated information retrieval approach that provides a unified solution for two challenging problems in the field of information retrieval. The first problem is how to build an optimal vector space corresponding to users' different information needs when applying the vector space model. The second one is how to smoothly incorporate the advantages of machine learning techniques into the language modeling approach. To solve these problems, we designed the language-modeling kernel function, which has all the modeling powers provided by language modeling techniques. In addition, for each information need, this kernel function automatically determines an optimal vector space, for which a discriminative learning machine, such as the support vector machine, can be applied to find an optimal decision boundary between relevant and nonrelevant documents. Large-scale experiments on standard test-beds show that our approach makes significant improvements over other state-of-the-art information retrieval methods.