Language-Modeling Kernel Based Approach for Information Retrieval

Ying Xie, Kennesaw State University
Vijay V. Raghavan, University of Louisiana at Lafayette


In this presentation, we propose a novel integrated information retrieval approach that provides a unified solution to two challenging problems in the field. The first is how to build an optimal vector space for users' different information needs when applying the vector space model. The second is how to incorporate the advantages of machine learning techniques smoothly into the language modeling approach. To solve these problems, we designed the language-modeling kernel function, which retains the full modeling power of language modeling techniques. In addition, for each information need, this kernel function automatically determines an optimal vector space in which a discriminative learning machine, such as a support vector machine, can be applied to find an optimal decision boundary between relevant and nonrelevant documents. Large-scale experiments on standard test beds show that our approach yields significant improvements over other state-of-the-art information retrieval methods.
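
The general idea of pairing a language-model-derived kernel with a discriminative learner can be illustrated with a small sketch. The abstract does not give the form of the language-modeling kernel, so everything below is an assumption made for illustration: the kernel is a Bhattacharyya kernel between Dirichlet-smoothed unigram models (a known positive semi-definite kernel on distributions, not necessarily the one proposed here), a kernel perceptron stands in for the support vector machine to keep the example dependency-free, and the toy documents, labels, and function names are hypothetical.

```python
import math
from collections import Counter


def smoothed_lm(tokens, collection_counts, collection_len, mu=1.0):
    """Dirichlet-smoothed unigram model over the collection vocabulary.

    p(w|d) = (c(w, d) + mu * p(w|C)) / (|d| + mu); out-of-vocabulary
    tokens are dropped so that the probabilities sum to 1.
    """
    counts = Counter(t for t in tokens if t in collection_counts)
    doc_len = sum(counts.values())
    return {
        w: (counts.get(w, 0) + mu * cw / collection_len) / (doc_len + mu)
        for w, cw in collection_counts.items()
    }


def lm_kernel(p, q):
    """Bhattacharyya kernel sum_w sqrt(p(w) * q(w)).

    Assumed stand-in; the abstract does not define the actual kernel.
    """
    return sum(math.sqrt(p[w] * q[w]) for w in p)


def train_kernel_perceptron(gram, labels, epochs=20):
    """Kernel perceptron on a precomputed Gram matrix (stand-in for an SVM)."""
    alpha = [0] * len(labels)
    for _ in range(epochs):
        mistakes = 0
        for i, y in enumerate(labels):
            s = sum(alpha[j] * labels[j] * gram[j][i] for j in range(len(labels)))
            if y * s <= 0:  # misclassified or on the boundary
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


# Hypothetical judged documents for one information need:
# +1 = relevant, -1 = nonrelevant.
docs = [
    "kernel methods for text retrieval".split(),
    "language model retrieval kernel".split(),
    "cooking pasta tomato sauce".split(),
    "tomato soup cooking recipe".split(),
]
labels = [1, 1, -1, -1]

collection_counts = Counter(w for d in docs for w in d)
collection_len = sum(collection_counts.values())
models = [smoothed_lm(d, collection_counts, collection_len) for d in docs]
gram = [[lm_kernel(p, q) for q in models] for p in models]
alpha = train_kernel_perceptron(gram, labels)


def score(tokens):
    """Signed decision value for an unseen document; positive means relevant."""
    m = smoothed_lm(tokens, collection_counts, collection_len)
    return sum(a * y * lm_kernel(mi, m) for a, y, mi in zip(alpha, labels, models))
```

Calling `score("kernel retrieval".split())` returns a positive value on this toy data and `score("tomato pasta".split())` a negative one. The point of the design is that the learner only ever sees kernel values, so the language-model smoothing and the decision boundary can be changed independently.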