Book

New Directions in Cognitive Information Retrieval

Authors:
  • Colemining Inc. (consultancy)

Abstract

New Directions in Cognitive Information Retrieval presents an exciting new direction for research into cognitively oriented information retrieval (IR), a direction based on an analysis of the user’s problem situation and cognitive behavior when using the IR system. This contrasts with the current dominant IR research paradigm, which concentrates on improving IR system matching performance. The chapters describe the leading-edge concepts and models of cognitive IR that explore the nexus between human cognition, information and the social conditions that drive humans to seek information using IR systems. Chapter topics include: polyrepresentation, cognitive overlap and the boomerang effect; multitasking while conducting the search; Knowledge Diagram Visualizations of the topic space to facilitate user assimilation of information; task, relevance, selection state, knowledge need and knowledge behavior; search training built into the search; children’s collaboration for school projects; and other cognitive perspectives on IR concepts and issues.

Chapters (12)

The principle of polyrepresentation is a coherent and comprehensive cognitive framework that can be applied simultaneously to the cognitive space of the user and the information space of IR systems. The principle has the potential to guide the design of interactive IR systems that take full advantage of the available document representations and user’s context to improve retrieval performance. However, before this can be achieved a number of issues need further investigation. Among these are simulation studies that test which methods would be appropriate for matching different representations of the user’s cognitive space with document representations. Such simulations could apply simulated work task situations; or they could explore exhaustively the possibilities of a number of controlled variables and thus simulate all achievable combinations. Investigations involving test persons and experimental laboratory tests (simulations) must take into account the dependency of domains, media and representation styles. Studies of the principles for how a Request Model Builder should function would be fruitful, and how to match representations to generate strong cognitive overlaps, especially in best match settings. The latter issue is illustrated by the polyrepresentation continuum. It points to the investigation of flexible and powerful hybrid matching of representations as a challenge and opportunity for future research along these lines.
The chapter presented a framework for understanding and studying how behavior can be used as implicit relevance feedback. This included a classification and discussion of behaviors that have been used as implicit relevance feedback, a general discussion and characterization of implicit feedback research, and a presentation of example studies to illustrate how such studies have been conducted and how feedback has typically been measured and used. Finally, this chapter presented a discussion of key issues and problems associated with implicit feedback research and identified challenges for future research. The use of behavior as implicit relevance feedback is an exciting and promising approach to personalizing IR interactions. Although more effort needs to be made to fully understand how behaviors can be used as implicit relevance feedback, current research efforts offer a promising start.
The colossal landscape of scholarly knowledge, growing at exponential rates, now requires representational maps utilizing advanced techniques to provide insight into the structure and dynamics of scholarly domains. Today we need intellectual cartographers to assist students and scholars in navigating, understanding and internalizing the structure and dynamics of scholarly bodies of knowledge. There is compelling evidence of the utility of knowledge domain visualizations (KDVs) stemming from the fields of educational psychology, cognitive science, cartography, and information science. Well-designed KDVs can facilitate understanding and recall, and can convey to the user the schematic, geo-spatial, temporal, semantic, or social organization of the underlying domain. Though educational KDVs are still in their infancy, we believe that they have a promising future in assisting with access to and the navigation, understanding, management, and communication of large-scale information spaces. Furthermore, when used as an interface for information retrieval, KDVs have the potential to convey the structural organization of the domain to the user. In turn, this structural knowledge of the domain provides the cognitive scaffolding with which the user may associate additional details about the domain.
Developing and supporting human search capabilities is at least as important as developing the capabilities of search engines. Existing academic research suggests that inexperienced users in particular, when searching the Web, utilize only a modest subset of the capabilities that the tools offer and are weak at understanding their real information needs, articulating them in a way that allows for effective searches, and interpreting search results in the context of those needs. This clearly suggests that training users to become better searchers is a worthwhile effort, and that understanding what makes certain search interventions successful and others not is vitally important for enabling users to make effective use of their time.
At present no overall integrative framework exists for cognitive IR (CIR). An integrated approach, that is, one that distinguishes separate concepts and processes in information seeking and information search and then attempts to create an integrated conceptualization of the user who is both searching and seeking information within the larger context of human information behavior (HIB), has the potential to yield a more holistic theoretical and cognitive understanding that will assist IR and Web system designers. This book provides an overview of new directions in CIR research. The field of CIR is broad, international, interdisciplinary and dynamic, with tremendous potential to impact the everyday lives of people in both developed and developing countries as they increasingly need to interact with IR systems. This book is not an exhaustive or historical discussion of all possible areas of important and new directions in CIR research. Information retrieval, in all its technical, cognitive and other respects, continues to be an intractable research problem and research area. Our goal in producing this book was to stimulate the thinking of authors and readers alike.
... Relevance behavior studies are closely related to information seeking studies and to the broad area of human information behavior studies. Not surprisingly then, texts that deal with human information behavior, including cognitive IR, extensively deal with relevance as well (e.g., Ingwersen & Järvelin, 2005; Spink & Cole, 2005). Many studies on various aspects of human information behavior are related to relevance behavior, but are not included here for space reasons. ...
Article
All is flux. —Plato on Knowledge in the Theaetetus (about 369 BC) Relevance is a, if not even the, key notion in information science in general and information retrieval in particular. This two-part critical review traces and synthesizes the scholarship on relevance over the past 30 years or so and provides an updated framework within which the still widely dissonant ideas and works about relevance might be interpreted and related. It is a continuation and update of a similar review that appeared in 1975 under the same title, considered here as being Part I. The present review is organized in two parts: Part II addresses the questions related to nature and manifestations of relevance, and Part III addresses questions related to relevance behavior and effects. In Part II, the nature of relevance is discussed in terms of meaning ascribed to relevance, theories used or proposed, and models that have been developed. The manifestations of relevance are classified as to several kinds of relevance that form an interdependent system of relevancies. In Part III, relevance behavior and effects are synthesized using experimental and observational works that incorporated data. In both parts, each section concludes with a summary that in effect provides an interpretation and synthesis of contemporary thinking on the topic treated or suggests hypotheses for future research. Analyses of some of the major trends that shape relevance work are offered in conclusions.
... The main difference between the existing approaches discussed above, which incorporate a user's searching behaviors, and our approach is that they use a user's search behaviors to modify the weight of an individual term, while ours uses the captured user intent to modify the relationships among terms in a query. Most recently, a book edited by Amanda Spink and Charles Cole (Spink & Cole, 2005) has provided an excellent overview and new research directions to find a central ground for user-centered and system-centered approaches in information retrieval. Our work here certainly contributes to this stream of research. ...
Chapter
A user is an important factor that contributes to the success or failure of any information retrieval system. Unfortunately, users often do not have the same technical and/or domain knowledge as the designers of such a system, while the designers are often limited in their understanding of a target user’s needs. In this chapter, we study the problem of employing a cognitive user model for information retrieval in which knowledge about a user is captured and used for improving his/her performance in an information seeking task. Our solution is to improve the effectiveness of a user in a search by developing a hybrid user model that captures user intent dynamically and combines the captured intent with an awareness of the components of an information retrieval system. The term “hybrid” refers to the methodology of combining the understanding of a user with insights into a system, all unified within a decision theoretic framework. In this model, multi-attribute utility theory is used to evaluate values of the attributes describing a user’s intent in combination with the attributes describing an information retrieval system. We use the existing research on predicting query performance and on determining dissemination thresholds to create functions to evaluate these selected attributes. This approach also offers a fine-grained representation of the model and the ability to learn a user’s knowledge dynamically. We compare this approach with the best traditional approach for relevance feedback in the information retrieval community, Ide dec-hi, using term frequency-inverse document frequency (TFIDF) weighting on selected collections from the information retrieval community such as CRANFIELD, MEDLINE, and CACM.
Evaluations of our hybrid model on these testbeds show that this approach retrieves more relevant documents in the first 15 returned documents than the TFIDF approach for all three collections, as well as more relevant documents on MEDLINE and CRANFIELD in both initial and feedback runs, while being competitive with the Ide dec-hi approach in the feedback runs for the CACM collection. We also demonstrate the use of our user model to dynamically create a common knowledge base from the users’ queries and relevant snippets using the APEX 07 data set.
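For readers unfamiliar with the Ide dec-hi baseline named in this abstract, a minimal sketch of the feedback step may help. It shows the standard Ide dec-hi update: the new query is the old query plus every judged-relevant document vector, minus only the highest-ranked non-relevant one. The sparse-dictionary representation and the function name `ide_dec_hi` are assumptions made for illustration, the vectors are assumed to be TFIDF-weighted already, and nothing here reflects the chapter's own hybrid model.

```python
from collections import defaultdict


def ide_dec_hi(query, relevant_docs, top_nonrelevant_doc):
    """Return the reformulated query vector (hypothetical helper).

    query, top_nonrelevant_doc: {term: weight} dictionaries.
    relevant_docs: list of {term: weight} dictionaries judged relevant.
    """
    new_query = defaultdict(float)
    # Start from the original query weights.
    for term, weight in query.items():
        new_query[term] += weight
    # Add every relevant document vector in full.
    for doc in relevant_docs:
        for term, weight in doc.items():
            new_query[term] += weight
    # Subtract only the single highest-ranked non-relevant document.
    for term, weight in top_nonrelevant_doc.items():
        new_query[term] -= weight
    return dict(new_query)


updated = ide_dec_hi(
    {"retrieval": 1.0},
    [{"retrieval": 1.0, "cognitive": 2.0}],
    {"cognitive": 0.5, "hardware": 1.0},
)
print(updated)
```

Terms that occur only in the non-relevant document end up with negative weights; some implementations clip these to zero, a choice this sketch leaves open.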
... Yet, to develop information retrieval systems that actively support health-related decision making, it is necessary to understand the complex process of how people search for and review information when making decisions. 10 Our own prior analysis of clinician information search used a Bayesian belief revision framework to retrospectively model how documents might influence decisions during and after a search session. 11 The Bayesian model that best predicted the final clinical decision included numerical factors to account for several well known cognitive biases. ...
Article
To test whether individuals experience cognitive biases whilst searching using information retrieval systems. Biases investigated are anchoring, order, exposure and reinforcement. A retrospective analysis and a prospective experiment were conducted to investigate whether cognitive biases affect the way that documentary evidence is interpreted while searching online. The retrospective analysis was conducted on the search and decision behaviors of 75 clinicians (44 doctors, 31 nurses), answering questions for 8 clinical scenarios within 80 minutes in a controlled setting. The prospective study was conducted on 227 undergraduate students, who used the same search engine to answer two of six randomly assigned consumer health questions. Frequencies of correct answers pre- and post- search, and confidence in answers were collected. The impact of reading a document on the final decision was measured by the population likelihood ratio (LR) of the frequency of reading the document and the frequency of obtaining a correct answer. Documents with a LR > 1 were most likely to be associated with a correct answer, and those with a LR < 1 were most likely to be associated with an incorrect answer to a question. Agreement between a subject and the evidence they read was estimated by a concurrence rate, which measured the frequency that subjects' answers agreed with the likelihood ratios of a group of documents, normalized for document order, time exposure or reinforcement through repeated access. Serial position curves were plotted for the relationship between subjects' pre-search confidence, document order, the number of times and length of time a document was accessed, and concurrence with post-search answers. Chi-square analyses tested for the presence of biases, and the Kolmogorov-Smirnov test checked for equality of distribution of evidence in the comparison populations. 
A person's prior belief (anchoring) has a significant impact on their post-search answer (retrospective: P < 0.001; prospective: P < 0.001). Documents accessed at different positions in a search session (order effect [retrospective: P = 0.76; prospective: P = 0.026]), and documents processed for different lengths of time (exposure effect [retrospective: P = 0.27; prospective: P = 0.0081]) also influenced decision post-search more than expected in the prospective experiment but not in the retrospective analysis. Reinforcement through repeated exposure to a document did not yield statistical differences in decision outcome post-search (retrospective: P = 0.31; prospective: P = 0.81). People may experience anchoring, exposure and order biases while searching for information, and these biases may influence the quality of decision making during and after the use of information retrieval systems.
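The abstract describes the document-level likelihood ratio only in words, so the following is a sketch under a stated assumption: that LR is the proportion of correct answerers who read the document divided by the proportion of incorrect answerers who read it. The function name and the counts are hypothetical, and the authors' exact formula may differ.

```python
import math


def likelihood_ratio(read_and_correct, total_correct,
                     read_and_incorrect, total_incorrect):
    """Assumed definition: P(read | correct) / P(read | incorrect)."""
    p_read_given_correct = read_and_correct / total_correct
    p_read_given_incorrect = read_and_incorrect / total_incorrect
    if p_read_given_incorrect == 0:
        # No incorrect answerer read the document: ratio is unbounded.
        return math.inf
    return p_read_given_correct / p_read_given_incorrect


# Hypothetical counts: 20 of 40 correct answerers and 10 of 40
# incorrect answerers read the document.
lr = likelihood_ratio(20, 40, 10, 40)
print("LR > 1, associated with correct answers" if lr > 1
      else "LR <= 1, not associated with correct answers")
```

Under this reading, a value above 1 means the document was read proportionally more often by subjects who answered correctly, which matches the interpretation given in the abstract.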
Article
Introduction: Today, the evaluation of information retrieval systems, especially search engines, has become one of the most important areas of study in the field of information science. Nevertheless, no review of research on the evaluation of search engines was found. The aim, therefore, is to analyze the related literature of the information retrieval evaluation field using quantitative, qualitative and composite approaches. Methodology: This research is a review article. To find related articles, the terms «arzyabieh motor kavosh», «arzyabieh Nezam bazyabieh ettlaat», «arzyabieh rabt», «motor kavosh» and «motor jostejo» were searched in the Magiran and Sid databases. The terms «evaluation search engine», «evaluation information retrival», «method and information retrieval», «research and search engine» and «relevance» were also searched in Google Scholar, and the retrieved articles were then studied. Finally, research on the evaluation of general search engines published in Persian and English from 1998 was studied using a library method and an analytical approach. Results: The results showed that, in the reviewed studies, search engine evaluation has been conducted through one of the quantitative, qualitative and composite approaches. In the quantitative approach, there are several research categories such as coverage and overlap, abstracting and indexing quality, retrieval algorithms, recommender systems, interface, and document ranking quality. In the qualitative approach, two kinds of studies were observed: ethnography and grounded theory. In the ethnography research, the users’ information retrieval behavior is described. In the other category, namely grounded theory, two research areas may be identified: one is fully committed to the qualitative principles and the other is semi-committed to them.
Hence, Hjørland (2010) has criticized researchers who regard their research as a "functional" and qualitative approach but do not follow all of its principles. Conclusion: As in research methodology, where first a quantitative method and then a qualitative method were proposed, and finally a composite method was proposed to use the strengths of both, it seems that this process of change in research methods from quantitative to qualitative, and then from qualitative to composite, has also affected the evaluation of information retrieval, especially search engines. In data retrieval research, a systematic approach was initially introduced for evaluating information retrieval systems. Some researchers then criticized this approach, a criticism that stems from a user-oriented approach. In recent years, researchers such as Saracevic (2007) and Thoronley (2012) have considered both approaches necessary and point to a dual-axis approach to data retrieval assessment research.
Article
Information encountering (IE) often occurs during active information seeking and involves passively finding unsought, unexpected information that is subjectively considered interesting, useful, or potentially useful. While the idealized IE process involves engaging with information after noticing it (for example, by examining it, conducting follow‐up seeking to determine usefulness, then using or sharing it), the process can be disrupted—resulting in missed opportunities for knowledge and insight creation. This study provides a detailed understanding of when and why the process can be disrupted. Think‐aloud observations and Critical Incident Interviews were conducted with 15 web users, focusing on examining when they encountered information but did not engage with it. Factors that discouraged engagement and simultaneously encouraged participants to return to active, goal‐directed information seeking by disrupting the IE process were identified. These factors individually and collectively demonstrate that IE can instigate a highly uncertain cost–benefit trade‐off, sometimes resulting in encounterers being cautious by returning to “less risky” active seeking. Design suggestions are made for reducing the uncertainty of deciding whether to engage with encountered information and making it easier to return to the active seeking task if disruption occurs.
Article
The field of information retrieval (IR) is typically defined, in a variety of different wordings, as concerned with retrieval of documents that satisfy an information need. In this essay, I argue that these definitions are inaccurate, fail to capture major threads of activity in IR research, and in particular are flawed because they omit the element of human participation in the retrieval process. After outlining some perspectives to consider in formulating better definitions, I offer an option, as an illustration of how the field might be presented; this option is centred on the purpose of IR, namely, support of cognition. There is an obvious need for a clear statement of the purpose of the discipline: information access is recognized as a human right and IR is the basis of a critical technology for providing that access -- one that is deeply intertwined with daily life and is changing human psychology. Well-grounded descriptions can encourage IR researchers to embrace a view of the field that enables richer connection with other disciplines, and should embody a vision of what IR research can accomplish.
Conference Paper
To improve the effectiveness of users' information seeking experience in interactive web search, we hypothesize how people might be influenced when making relevance judgment decisions by introducing the Consensus Theory & Relevance Judgment Model (CT&M). This is combined with a practical path to assess the extent of difference between suggestions of current search engines versus user expectations. A user-centered, evidence-based, phenomenology approach is used to improve on Google PageRank (GPR) in two ways. The first is by biasing GPR's equal navigation probability assumption using (f)actual usage stats as implicit user consensus, which leads to the StatsRank (SR) algorithm. Second, we aggregate users' explicit ranking to derive Consensus Rank (CR), which is shown to predict individual user ranking significantly better than GPR and meta-search of modern search engines Google and Yahoo/Bing in real time. CT&M contextualizes CR, SR, and a live open online web experiment, called The Ranking Game, which is based on the August-2016 English Wikipedia corpus (12.7 million pages) and Page View Statistics for May to July 2016. Limiting this work to Wikipedia makes GPR topic-based since any Wikipedia page is focused on one topic. TREC's pooling is used to merge top 20 results from major search engines and present an alphabetized list for users' explicit ranking via drag and drop. The same platform captures implicit data for future research and can be used for controlled experiments. Our contributions are: CT&M, SR, CR, and the open online user feedback web experiment research platform.
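The abstract does not give the StatsRank formula, so the sketch below is only one plausible reading of "biasing the equal navigation probability assumption": a PageRank power iteration in which a surfer follows an outgoing link with probability proportional to the target page's view count instead of uniformly. The function name, the toy graph and the view counts are all hypothetical, every link target is assumed to appear as a key in `links`, and view counts are assumed to be positive.

```python
def pagerank(links, page_views=None, damping=0.85, iterations=50):
    """Power-iteration PageRank with optional view-weighted link choice.

    links: {page: [linked pages]}; page_views: {page: positive count}.
    With page_views=None every outgoing link is equally likely.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source in pages:
            targets = links[source]
            if not targets:
                # Dangling page: spread its rank uniformly over all pages.
                for page in pages:
                    new_rank[page] += damping * rank[source] / n
                continue
            weights = [page_views[t] if page_views else 1.0 for t in targets]
            total = sum(weights)
            for target, weight in zip(targets, weights):
                new_rank[target] += damping * rank[source] * weight / total
        rank = new_rank
    return rank


toy_links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
toy_views = {"A": 1, "B": 90, "C": 10}
print(pagerank(toy_links))             # B and C tie under uniform navigation
print(pagerank(toy_links, toy_views))  # B outranks C once views bias the walk
```

In the toy graph, B and C are structurally identical, so only the usage data separates them. That is the kind of signal the abstract attributes to implicit user consensus.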
Chapter
Web-oriented knowledge acquisition has become an important way for people to acquire knowledge. How to help users accurately obtain personalized knowledge on the network is an important problem for Web services. In this paper, we present a web knowledge acquisition model based on the human cognitive process that can improve the precision of knowledge acquisition and meet the personalized requirements of users. The user cognitive model (UCM) designed in this paper can improve interaction with the web and thereby promote the development of personalized, accurate and intelligent web services.
Chapter
This book provides a new framework for understanding the human information condition within the human condition. HIB is a social science that is continuing to adapt its framework with the evolving human information condition. There is a need to further develop a more overarching understanding of HIB within an interdisciplinary environment. This book contributes to the process of widening the HIB perspective so that it includes various diverse approaches within the broader framework of social science theories and models. In line with evolutionary psychology, we place information at the center of human adaptation. This is a worthy start.
Conference Paper
When reading online, users sometimes need auxiliary information to complement or fill in their own background knowledge in order to better understand a document that they are reading. We believe that delivering this information in the least intrusive fashion possible will improve their understanding. We have prototyped a system that selects a single Wikipedia article for users when they highlight text in an abstract. This prototype employs a contextual retrieval algorithm developed for high precision retrieval of Wikipedia articles that uses the terms in the abstract, currently being read, as a context for the search. The results from our evaluation reveal that the top-performing algorithm is able to respond with a single relevant article 77% of the time. The user study that we conducted indicates that participants have a strong preference for this approach to searching while reading.
Article
The stage-driven information seeking process to reduce uncertainty and increase value is systematically validated with real users (n=60) with real work tasks from social sciences and applied sciences domains at a UK and a Danish university. A broad set of information sources is applied and core relevance criteria are measured. The research seeks to test the hypothesis that the information seeking process is seen as a dynamic and iterative development to reduce uncertainty through four stages until the problem is solved: (1) problem recognition: kind of problem, (2) problem definition: nature of the problem, (3) problem resolution: finding an answer to the problem, (4) solution statement: answer to the problem or how to deal with it. The hypothesis can be rejected in this case since there is no significant decrease in uncertainty level from stage 1 to 4.
Article
Searching the Web for documents using information retrieval systems plays an important part in clinicians' practice of evidence-based medicine. While much research focuses on the design of methods to retrieve documents, there has been little examination of the way different search engine capabilities influence clinician search behaviors. Previous studies have shown that use of task-based search engines allows for faster searches with no loss of decision accuracy compared with resource-based engines. We hypothesized that changes in search behaviors may explain these differences. In all, 75 clinicians (44 doctors and 31 clinical nurse consultants) were randomized to use either a resource-based or a task-based version of a clinical information retrieval system to answer questions about 8 clinical scenarios in a controlled setting in a university computer laboratory. Clinicians using the resource-based system could select 1 of 6 resources, such as PubMed; clinicians using the task-based system could select 1 of 6 clinical tasks, such as diagnosis. Clinicians in both systems could reformulate search queries. System logs unobtrusively capturing clinicians' interactions with the systems were coded and analyzed for clinicians' search actions and query reformulation strategies. The most frequent search action of clinicians using the resource-based system was to explore a new resource with the same query, that is, these clinicians exhibited a "breadth-first" search behaviour. Of 1398 search actions, clinicians using the resource-based system conducted 401 (28.7%, 95% confidence interval [CI] 26.37-31.11) in this way. In contrast, the majority of clinicians using the task-based system exhibited a "depth-first" search behavior in which they reformulated query keywords while keeping to the same task profiles. Of 585 search actions conducted by clinicians using the task-based system, 379 (64.8%, 95% CI 60.83-68.55) were conducted in this way. 
This study provides evidence that different search engine designs are associated with different user search behaviors.
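The abstract reports 95% confidence intervals for two proportions (401 of 1398 and 379 of 585) without naming the interval method. As a worked check, the sketch below computes a Wilson score interval, which appears to reproduce the reported bounds to two decimal places. That the authors used this method is an inference, not something the abstract states, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / n
    denominator = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return center - half_width, center + half_width


# Counts taken from the abstract: breadth-first actions on the
# resource-based system, depth-first actions on the task-based system.
for successes, total in [(401, 1398), (379, 585)]:
    low, high = wilson_interval(successes, total)
    print(f"{successes}/{total} = {successes / total:.1%}, "
          f"95% CI {low:.2%} to {high:.2%}")
```

The two intervals do not overlap, which is consistent with the abstract's conclusion that the two system designs are associated with different search behaviors.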