Paul Kantor

Rutgers, The State University of New Jersey | Rutgers · Department of Library and Information Science

About

207 Publications
35,597 Reads
7,448 Citations

Publications (207)
Article
The First Search Futures Workshop, in conjunction with the Forty-sixth European Conference on Information Retrieval (ECIR) 2024, looked into the future of search to ask questions such as: • How can we harness the power of generative AI to enhance, improve and re-imagine Information Retrieval (IR)? • What are the principles and fundamental rights t...
Chapter
The field and community of Information Retrieval (IR) are changing and evolving in response to the latest developments and advances in Artificial Intelligence (AI) and research culture. As the field and community re-orient and re-consider their positioning within computing and information sciences more generally, it is timely to gather and discuss...
Article
Full-text available
Human-machine systems, especially those involving reinforcement learning (RL), are becoming increasingly common across application domains. Designing these systems to be effective and reliable requires a task-oriented understanding of both human learning (HL) and RL. In particular, how does the structure of a learning task affect the learning perfo...
Preprint
Full-text available
Reliable real-world deployment of reinforcement learning (RL) methods requires a nuanced understanding of their strengths and weaknesses and how they compare to those of humans. Human-machine systems are becoming more prevalent and the design of these systems relies on a task-oriented understanding of both human learning (HL) and RL. Thus, an impor...
Chapter
We present a model for layered security with applications to the protection of sites such as stadiums or large gathering places. We formulate the problem as one of maximizing the capture of illegal contraband. The objective function is indefinite and only limited information can be gained when the problem is solved by standard convex optimization m...
Preprint
Full-text available
As machine learning (ML) is more tightly woven into society, it is imperative that we better characterize ML's strengths and limitations if we are to employ it responsibly. Existing benchmark environments for ML, such as board and video games, offer well-defined benchmarks for progress, but constituent tasks are often complex, and it is frequently...
Preprint
Full-text available
We present a model for layered security with applications to the protection of sites such as stadiums or large gathering places. We formulate the problem as one of maximizing the capture of illegal contraband. The objective function is indefinite and only limited information can be gained when the problem is solved by standard convex optimization m...
Preprint
Full-text available
Information sharing is vital in resisting cyberattacks, and the volume and severity of these attacks are increasing very rapidly. Therefore, responders must triage incoming warnings in deciding how to act. This study asked a very specific question: "how can the addition of confidence information to alerts and warnings improve overall resistance to cy...
Preprint
What makes a task relatively more or less difficult for a machine compared to a human? Much AI/ML research has focused on expanding the range of tasks that machines can do, with a focus on whether machines can beat humans. Allowing for differences in scale, we can seek interesting (anomalous) pairs of tasks T, T'. We define interesting in this way:...
Preprint
Full-text available
We consider the problem of eliciting expert assessments of an uncertain parameter. The context is risk control, where there are, in fact, three uncertain parameters to be estimated. Two of these are probabilities, requiring that the experts be guided in the concept of "uncertainty about uncertainty." We propose a novel formulation for expert es...
Preprint
Full-text available
There are many legacy databases, and related stores of information that are maintained by distinct organizations, and there are other organizations that would like to be able to access and use those disparate sources. Among the examples of current interest are such things as emergency room records, of interest in tracking and interdicting illicit d...
Article
Recommender systems may accelerate knowledge discovery in many fields. However, their users may be competitors guarding their ideas before publication or for other reasons. We describe a simulation experiment to assess user privacy against targeted attacks, modeling recommendations based on co‐access data. The analysis uses an unusually long (14 ye...
Article
When a small number of observations is used to estimate the performance of a walk-through metal detector (WTMD), it is difficult to compare the observations made with different numbers of tests on different machines. Standard tests for metal detectors, whether hand-held (wand) or walk-through, are expressed in terms of specific test objects and a r...
Conference Paper
Effective management of border security requires effective measurement of the impact of alternative policies on unwanted cross-border flows. There are no agreed-upon ways to measure the amount of any particular inflow; some indicators are believed to rise and fall with each in-flow, but they do not measure the total flow. To combine multiple measur...
Presentation
Full-text available
Presentation at the INFORMS 2016 International Conference
Conference Paper
Walk-through metal detectors (WTMDs) are increasingly utilized as a security measure at large events held at stadiums. As this is a different environment than standard use cases (e.g. airports, prisons), work has been done to understand how to best integrate this technology into an outdoor high throughput screening environment. We have performed se...
Conference Paper
As we face an explosion of potential new applications for the fundamental concepts and technologies of information retrieval, ranging from ad ranking to social media, from collaborative recommending to question answering systems, many researchers are spending unnecessary time reinventing ideas and relationships that are buried in the prehistory of...
Conference Paper
When utilizing metal detectors at a large venue such as a sports stadium, there are the competing objectives of accuracy of the patron screening and the speed of throughput. This research, carried out in collaboration with the security staff at MetLife Stadium in New Jersey as well as other stadiums, analyzed two patron screening methods: handheld...
Conference Paper
Most ports in the United States now inspect a large number of goods for radioactive cargo to address the potential smuggling of illicit nuclear material for terrorism. The U.S. Department of Homeland Security (DHS) has sponsored research to develop systems for detection of illicit nuclear materials at the ports. We present a systematic approach for...
Conference Paper
We present a model and discrete event simulation of USCG Air Stations, accounting for the mission demands and maintenance procedures pertaining to USCG aircraft. The simulation provides aircraft availability distributions and mission performance metrics based on varying input scenarios, including changes in the number of stationed aircraft and main...
Conference Paper
Full-text available
We have developed a framework for flood mitigation risk analysis that applies generally to floods, and have applied it in this study of flood mitigation risk analysis for the Raritan Basin in New Jersey, USA. The framework we have developed involves a conceptual model of the relation among meteorological activity, hydrological models, infrastructur...
Conference Paper
Full-text available
A model was created for the United States Coast Guard (USCG) to maximize aircraft fleet operational performance subject to budgetary constraints, or, conversely, to minimize aircraft fleet operational costs subject to performance targets. This is a two-stage model: The first stage, prior work, is a simulation model of each USCG Air Station generati...
Article
Full-text available
The Department of Homeland Security identifies stadium safety as a crucial component of risk mitigation in the US. Patron screening poses difficult trade-offs for security officials: rigorous screening prevents weapons from entering the structure, but it also creates lines that become security hazards and may be infeasible if patrons are to get int...
Conference Paper
Full-text available
The United States sets fishing regulations to sustain healthy fish populations. The overall goal of the research reported on here is to increase the efficiency of the United States Coast Guard (USCG) when boarding commercial fishing vessels to ensure compliance with those regulations. We discuss scoring rules that indicate whether a given vessel mi...
Article
A recent study of patient decision making regarding acceptance of an implantable cardiac defibrillator (ICD) provides a substantial but nonrandom sample (N = 191) of telephone interviews with persons who have made an affirmative decision regarding an ICD. Using a coding scheme developed through qualitative analysis of transcribed interviews, these...
Technical Report
Full-text available
The Raritan River has a history of overflowing its banks and causing substantial damage to nearby townships and boroughs. The team has developed a framework for flood mitigation risk analysis that applies generally to floods, and has applied it in this preliminary study of flood mitigation risk analysis for the Raritan Basin. The framework we have...
Article
In this article, we report on an experiment to assess the possibility of rigorous evaluation of interactive question-answering (QA) systems using the cross-evaluation method. This method takes into account the effects of tasks and context, and of the users of the systems. Statistical techniques are used to remove these effects, isolating the effect...
Article
Full-text available
We consider the problem of combining a given set of diagnostic tests into an inspection system to classify items of interest (cases) with maximum accuracy such that the cost of performing the tests does not exceed a given budget constraint. One motivating application is sequencing diagnostic tests for container inspection, where the diagnostic test...
Conference Paper
Full-text available
Common approaches to assessing document quality look at shallow aspects, such as grammar and vocabulary. For many real-world applications, deeper notions of quality are needed. This work represents a first step in a project aimed at developing computational methods for deep assessment of quality in the domain of intelligence reports. We present an...
Conference Paper
As we face an explosion of potential new applications for the fundamental concepts and technologies of information retrieval, ranging from ad ranking to social media, from collaborative recommending to question answering systems, many researchers are spending unnecessary time reinventing ideas and relationships that are buried in the prehistory of...
Book
The explosive growth of e-commerce and online environments has made the issue of information search and selection increasingly serious; users are overloaded by options to consider and they may not have the time or knowledge to personally evaluate these options. Recommender systems have proven to be a valuable way for online users to cope with the i...
Article
Full-text available
We report a study of patient decision-making regarding acceptance of an implantable cardiac defibrillator (ICD), using 191 telephone interviews with patients who have an ICD. Key findings include: 1) patients' perceptions regarding decision-making reveal four broad themes -physical, psychological, social reasons driving agency in the decision, or n...
Article
Detection of contraband depends on countermeasures, some of which involve examining cargo containers and/or their associated documents. Document screening is the least expensive; physical methods, such as gamma ray detection, are more expensive; and definitive manual unpacking is most expensive. We cannot apply the full array of methods to all incom...
Article
Cargo ships arriving at US ports are inspected for unauthorized materials. Because opening and manually inspecting every container is costly and time-consuming, tests are applied to decide whether a container should be opened. By utilizing a polyhedral description of decision trees, we develop a large-scale linear programming model for sequential c...
Article
Evaluating interactive question answering (QA) systems with real users can be challenging because traditional evaluation measures based on the relevance of items returned are difficult to employ since relevance judgments can be unstable in multi-user evaluations. The work reported in this paper evaluates, in distinguishing among a set of interactiv...
Conference Paper
Full-text available
One knowledge discovery problem in the rapid response setting is the cost of learning which patterns are indicative of a threat. This typically involves a detailed follow-through, such as review of documents and information by a skilled analyst, or detailed examination of a vehicle at a border crossing point, in deciding which suspicious vehicles r...
Article
Full-text available
This is the report of the working group on the relation between, or hybrid combination of, design experiment optimization and R&S. The rapporteur, Paul Kantor, learned a great deal at the conference, which he summarized by sharing the cartoon shown here. (A student asking the teacher "... may I be excused, my brain is full" (from a 1986 cartoon by Gary Lar...
Article
We present a family of distance measures for comparing activation patterns captured in fMRI images. We model an fMRI image as a spatial object with varying density, and measure the distance between two fMRI images using a novel fixed-radius, distribution-based Earth Mover’s Distance that is computable in polynomial time. We also present two simplif...
Conference Paper
Full-text available
Several recent studies have found a weak relationship between system performance and search "success". We hypothesize that searchers are successful because they alter their search behavior. To clarify the relation between system performance and search behavior, we designed an experiment in which system performance is controlled in order to elicit a...
Chapter
Full-text available
The problem of container inspection at ports-of-entry is formulated in several different ways as an optimization problem. Data generated from different analytical methods, x-ray detectors, gamma-ray detectors and other sensors used for the detection of chemical, biological, radiological, nuclear, explosive, and other illicit agents are often relied...
Conference Paper
Security is a concern for persons, organizations, and nations. For the individual members of organizations and nations, personal privacy is also a concern. The technologies for monitoring electronic communication are at the same time tools to protect security and threats to personal privacy. Participants in this workshop address the interrelation o...
Conference Paper
Full-text available
This manuscript proposes a retrieval system for fMRI brain images. Our goal is to find a similarity-metric to enable us to support queries for "similar tasks" for retrieval on a large collection of brain experiments. The system uses a novel similarity measure between the result of probabilistic independent component analysis (PICA) of brain images....
Article
Full-text available
academic progenitor of an amazing number of important contributors to the field of Information Retrieval, begins this work with a delightful prologue. It is a “discussion” between himself (“K”) and two other individuals identified only as “B” and “N”, who are both skeptical about his ideas. Since he wields the pen, he is able to move them...
Article
We describe a procedure for quantitative evaluation of interactive question-answering systems and illustrate it with application to the High-Quality Interactive Question-Answering (HITIQA) system. Our objectives were (a) to design a method to realistically and reliably assess interactive question-answering systems by comparing the quality of report...
Article
Full-text available
Information retrieval research, at least as conceived by the SIGIR community, is fundamentally experimental in nature. As such, the presentation of results from controlled, reproducible experiments lies at the core of our work. Many reports follow the same general format: authors propose a new retrieval method, whose performance on some well-define...
Article
Full-text available
In this paper, we explore the concept of a "library of brain images", which implies not only a repository of brain images, but also efficient search and retrieval mechanisms that are based on models derived from IR practice. As a preliminary study, we have worked with a collection of functional MRI brain images assembled in the study of several di...
Article
The purpose of this work is to identify potential evaluation criteria for interactive, analytical question-answering (QA) systems by analyzing evaluative comments made by users of such a system. Qualitative data collected from intelligence analysts during interviews and focus groups were analyzed to identify common themes related to performance, us...
Conference Paper
Full-text available
The thresholded t-map produced by the General Linear Model (GLM) gives an effective summary of activation patterns in functional brain images and is widely used for feature selection in fMRI related classification tasks. As part of a project to build content-based retrieval systems for fMRI images, we have investigated ways to make GLM more adaptiv...
Conference Paper
Full-text available
We present a new Finite Impulse Response (FIR) model for hemodynamics in functional brain images. Like other FIR models, our method permits a flexible formulation of the hemodynamic response. The distinctive feature of this model is that the shape information of the canonical HRF is imposed on the FIR, with little loss of flexibility. Model fitt...
Article
The authors report on a series of experiments to automate the assessment of document qualities such as depth and objectivity. The primary purpose is to develop a quality-sensitive functionality, orthogonal to relevance, to select documents for an interactive question-answering system. The study consisted of two stages. In the classifier constructio...
Article
Full-text available
We describe a large-scale evaluation of four interactive question answering systems with real users. The purpose of the evaluation was to develop evaluation methods and metrics for interactive QA systems. We present our evaluation method as a case study, and discuss the design and administration of the evaluation components and the effectiveness o...
Article
In this article, we introduce a new information system evaluation method and report on its application to a collaborative information seeking system, AntWorld. The key innovation of the new method is to use precisely the same group of users who work with the system as judges, a system we call Cross-Evaluation. In the new method, we also propose to...
Article
Full-text available
The outputs of several information filtering (IF) systems can be combined to improve filtering performance. In this article the authors propose and explore a framework based on the so-called information structure (IS) model, which is frequently used in Information Economics, for combining the output of multiple IF systems according to each user's p...
Conference Paper
Full-text available
In this paper, we explore the concept of a "library of brain images", which implies not only a repository of brain images, but also efficient search and retrieval mechanisms that are based on models derived from IR practice. As a preliminary study, we have worked with a collection of functional MRI brain images assembled in the study of several disti...
Chapter
We describe an interactive approach to question answering where the user and the system first negotiate the scope and shape of information being sought and then cooperate in locating and assembling the answer. The system, which we call HITIQA, has access to a large repository of unprocessed and unformatted data, and is additionally equipped with...
Article
Full-text available
Cargo ships arriving at USA ports are inspected for unauthorized materials. Ideally, we want to open and check every container they carry, but it is costly and time-consuming. Instead, tests are developed to decide whether a container should be opened. By utilizing a polyhedral description of decision trees, we develop a large scale linear programm...
Poster
Assessing the Impact of IAIMS on the UMDNJ “Information Workspace.”
Article
Some aspects of Data Fusion (DF) for Information Retrieval (IR) are explored using a set of data from the Fifth International Conference on Text Retrieval, TREC5. It has been observed from time to time that DF applied to a pair of systems or schemes for IR may yield results that are better than those of either participating scheme. It has been conj...
Article
We analyzed textual properties of documents to identify predictive variables for various document qualities by means of statistical and linguistic methods. We have created a collection of 1000 documents; each document has been judged in terms of nine document qualities (accuracy, reliability, objectivity, depth, author/producer credibility, readabi...
Article
In this paper we report preliminary results of a study to develop, and subsequently to automate, new metrics for assessment of information quality in text documents, particularly in news. Through focus group studies, quality judgment experiments, and textual feature extraction and analysis, we were able to generate nine quality aspects and apply th...
Article
The work reports some initial success in extending the Rutgers Paradigm of IR evaluation to the realm of concrete measurement, not in information retrieval per se, but in the arguably more complex domain of Question Answering. Crucial to the paradigm are two components: cross evaluation, and an analytical model that controls for the potential probl...
Article
The goal of this research is to automatically predict human judgments of document qualities such as subjectivity, verbosity and depth. In this paper, we explore the behavior of adjectives as indicators of subjectivity in documents. Specifically, we test whether a subset of automatically derived subjective adjectives (Wiebe, 2000b), selected a prior...
Article
In addition to relevance, there are other factors that contribute to the utility of a document. For example, content properties like depth of analysis and multiplicity of viewpoints, and presentational properties like readability and verbosity, all will affect the usefulness of a document. These kinds of relevance-independent properties are diffic...
Conference Paper
Full-text available
This work compares two approaches to finding effective topic-independent classifier combinations. We suggest a new federated approach and compare it against the global approach. Our results indicate that the relative effectiveness of these approaches depends on the measure used to evaluate them. We suggest explanations for these results.
Conference Paper
Full-text available
This report describes DIMACS work on the text categorization task of the TREC 2005 Genomics track. Our approach to this task was similar to the triage subtask studied in the TREC 2004 Genomics track. We applied Bayesian logistic regression and achieved good effectiveness on all categories.
Conference Paper
Full-text available
We present a 3D matching framework based on a many-to-many matching algorithm that works with skeletal representations of 3D volumetric objects. We demonstrate the performance of this approach on a large database of 3D objects containing more than 1000 exemplars. The method is especially suited to matching objects with distinct part structure and i...
Article
Full-text available
on two of the groups of entity resolution problems, ER1 and ER2 for the KDD Challenge in 2005. We presume that the situation is intended to mimic, using abstracts and author information from the life sciences, some real world problem, in which it is important to recognize the identity of an individual, even though he may share that name with other...
Conference Paper
Full-text available
In this paper we suggest a new approach to analysis and design of IR systems. We argue for design space exploration in constructing IR systems and in analyzing the effects of individual modules and parameters. We present results of experiments with parametric interpolation, or "homotopy", between two systems, and show, incidentally, that the best r...
Article
Full-text available
In this paper we describe the analytic question answering system HITIQA (High-Quality Interactive Question Answering) which has been developed over the last 2 years as an advanced research tool for information analysts. HITIQA is an interactive open-domain question answering technology designed to allow analysts to pose complex exploratory question...
Article
Full-text available
We report here empirical results of a series of studies aimed at automatically predicting information quality in news documents. Multiple research methods and data analysis techniques enabled a good level of machine prediction of information quality. Procedures regarding user experiments and statistical analysis are described.
Conference Paper
Full-text available
DIMACS participated in the text categorization and ad hoc retrieval tasks of the TREC 2004 Genomics track. For the categorization task, we tackled the triage and annotation hierarchy subtasks. and biology of the laboratory mouse. In particular, the Mouse Genome Database (MGD) contains information on the characteristics and functions of genes in the...