3. Interactive Query Graph Refinement
The user can interactively clarify his/her vague information needs, by looking at and editing the query graph. Figure-2 depicts the user interface of our search system. The steps in refining a query graph are as follows.
- The user inputs sentences (keywords or documents) representing his/her information needs. The system displays the initial query graph made from the inputs.
- The user edits the query graph to match his/her information needs by removing and/or adding nodes and/or links.
- The system measures the relevance score of each document against the modified query graph. The score is defined using the similarity between the query graph and the subject graph of each document.
- This is called subject graph matching . Appendix A outlines the algorithm. Details of the matching algorithm and experimental results are shown in . Subject graph matching can measure the scores more precisely than the conventionally utilized vector space model , because the matching algorithm incorporates term associations in calculating the scores.
- The system ranks search results in descending score order, and displays their titles to the user interface with relevant checkboxes beside the titles that allow the user to select documents as desired.
- The user selects a checkbox if the document appears to be relevant to his/her information needs.
- The system makes a subject graph of the entire content of each selected document.
- It then extracts a neighbor subgraph, from each subject graph, including nodes matched to the query graph and their neighboring nodes. Appendix B outlines the algorithm for extracting a neighbor subgraph.
- It merges these subgraphs and the query graph. Figure-3 depicts these steps.
Figure-3: The steps of making a new query graph. (click for
- The system displays the merged graph, which can then be used to create a new query.
- Steps from 2. to 7. are repeated interactively until the user is satisfied with the search results.
In this way, the user clarifies his/her vague information needs by selecting relevant documents, and editing query graphs interactively. The user obtains the desired search results in response to a query graph that represents the clarified information needs of the user in an easy-to-understand manner.
As a preliminary evaluation, we showed a query graph and a list of terms, both representing the same information needs, to 5 users. All users indicated that the query graph was more intuitive and made the task (discerning the search goal) easier to understand than the list of terms. Comprehensive evaluations are planned of the interactive search process as well as query graph performance.
The proposed query graph has weights on its nodes and links, but these values are currently not visible, nor are they editable. Size or color clues could be utilized for visualizing them. The current query graphs use one word to represent one concept, but it is reasonable to expect that some concepts will actually need several words. We want to enhance the query graphs to handle such concepts, and different types of relations such as 'agent', 'object' and 'place'.
We have created a new search system that can clarify vague information needs. Because its process is visible and controllable, users can efficiently find the desired documents. Subject graph matching can measure the score between a query graph and each document precisely.
Appendix A: Subject graph matching
Each document is translated into a query graph (Figure-4). Each node represents a term in the text, while a link represents an association between the linked terms. Significance of the terms and the term-term associations are represented as weights assigned to them.
Figure-5 depicts the algorithm for query graph matching.
Figure-5: Algorithm for subject graph matching. (click for enlargement)
Appendix B: Extracting neighbor subgraph
Figure-6 depicts the algorithm for extracting the neighbor subgraph.
Figure-6: Algorithm for extracting neighbor subgraph. (click for enlargement)
Figure-7 depicts an example of transitive association in the algorithm. Italic numbers represent the transitive associations of each term from 'search'.
Figure-7: An example of transitive association. (click for enlargement)
- William B.Frakes and Ricardo Baeza-Yates. Information Retrieval, Data Structures & Algorithms, Prentice Hall, 1992.
- J. Koenemann and N. J. Belkin. A Case For Interaction: A Study of Interactive Information Retrieval Behavior and Effectiveness, CHI'96, 205-212, 1996.
- Junji Tomita and Yoshihiko Hayashi. Improving Effectiveness and Efficiency of Web Search by Graph-based Text Representation, Poster Proceedings of WWW9: 2000.
- Junji Tomita and Hiroshi Takeno. Proposal for a new IR system using subject graph and word's weighting by the relation (in Japanese), IPSJ-SIGFI, 52(3): 17-24, 1998.
- G. Salton and A. Wong and C. S. Yang. A Vector Space Model for Automatic Indexing, Communications of the ACM, 18:613-620,1975.