Query Classification using Wikipedia's Category Graph
Milad AlemZadeh¹, Richard Khoury², and Fakhri Karray¹
1. Centre for Pattern Analysis and Machine Intelligence, University of Waterloo, Waterloo, Ontario, Canada
2. Department of Software Engineering, Lakehead University, Thunder Bay, Ontario, Canada
Abstract— Wikipedia’s category graph is a network of 300,000 interconnected category labels, and can be a powerful resource for many classification tasks. However, its size and the lack of order can make it difficult to navigate. In this paper, we present a new algorithm to efficiently exploit this graph and accurately rank classification labels given user-specified keywords. We highlight multiple possible variations of this algorithm, and study the impact of these variations on the classification results in order to determine the optimal way to exploit the category graph. We implement our algorithm as the core of a query classification system and demonstrate its reliability using the KDD CUP 2005 and TREC 2007 competitions as benchmarks.
Index Terms—keyword search, natural language processing, knowledge based systems, web sites, semantic web
Cite: Milad AlemZadeh, Richard Khoury, and Fakhri Karray, "Query Classification using Wikipedia's Category Graph," Journal of Emerging Technologies in Web Intelligence, Vol. 4, No. 3, pp. 207-220, August 2012. doi:10.4304/jetwi.4.3.207-220