Investigating User Behavior in Document Similarity Judgment for Interactive Clustering-based Search Engines

Minghuang Chen1, Seiji Yamada2, and Yasufumi Takama1
1. Tokyo Metropolitan University, Tokyo, Japan
2. National Institute of Informatics, Tokyo, Japan
Abstract—This paper investigates how users judge the similarity of documents, in order to examine the cost of user feedback in interactive document clustering. Modern web search engines employ linear-style SERPs (search engine result pages). To make use of the information on the continuously growing web, various next-generation search engines have been studied, among which clustering-based search engines are considered promising. It is also important to introduce an interactive user feedback mechanism into search engines. The aim of this paper is to study effective interface designs suitable for interactive clustering-based search engines. An experiment was conducted with 21 test participants, who were asked to judge the similarity of document pairs under three conditions: viewing snippets, topic terms, or the original text. These conditions are compared in terms of judgment time and accuracy using ANOVA and chi-square analysis. The typical judging behaviors of the participants are also investigated with an eye-tracking system. The results will contribute to the design of interfaces for next-generation interactive clustering-based search engines.

Index Terms—web search interfaces, interactive clustering, document similarity judgment, eye-tracking  

Cite: Minghuang Chen, Seiji Yamada, and Yasufumi Takama, "Investigating User Behavior in Document Similarity Judgment for Interactive Clustering-based Search Engines," Journal of Emerging Technologies in Web Intelligence, Vol. 3, No. 1, pp. 3-10, February 2011. doi:10.4304/jetwi.3.1.3-10
