Kshitij Fadnis’s research while affiliated with IBM and other places


Publications (17)


Figure 3: A custom OpenQA search application built with PRIMEQA. Additional screenshots are in Appendix A.
PRIMEQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development
  • Preprint
  • File available

January 2023 · 33 Reads · Jaydeep Sen · Bhavani Iyer · [...]

The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers. In this paper, we introduce PRIMEQA: a one-stop and open-source QA repository with an aim to democratize QA research and facilitate easy replication of state-of-the-art (SOTA) QA methods. PRIMEQA supports core QA functionalities like retrieval and reading comprehension as well as auxiliary capabilities such as question generation. It has been designed as an end-to-end toolkit for various use cases: building front-end applications, replicating SOTA methods on public benchmarks, and expanding pre-existing methods. PRIMEQA is available at: https://github.com/primeqa.



Doc2Bot: Document grounded Bot Framework

May 2021 · 3 Reads · 2 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Conversational agents, or chatbots, are widely used to provide customer care and other informational support. Currently, the development of chatbots using standard frameworks requires a lot of manual crafting by subject matter experts (SMEs). On the other hand, while learning-based approaches to dialog have made significant advancements, they require training with a large volume of dialog data, which chatbot developers typically do not have access to. To tackle these challenges, we introduce DOC2BOT, a system that supports the automated construction of chatbots by digesting various forms of documents that organizations own, such as business manuals, HowTos, and customer support pages. In addition, DOC2BOT provides a user-friendly experience to SMEs and minimizes their effort by supporting intuitive interactions and streamlining their workflow.


CLAI: A Platform for AI Skills on the Command Line

May 2020 · 30 Reads · 1 Citation

This paper reports on the open-source project CLAI (Command Line AI), which aims to bring the power of AI to the command-line interface (CLI). The platform sets up the CLI as a new environment for AI researchers to conquer by surfacing the command line as a generic environment that researchers can interface with using a simple sense-act API, much like the traditional AI agent architecture. In this paper, we discuss the design and implementation of the platform in detail, through illustrative use cases of new end-user interaction patterns enabled by this design, and through quantitative evaluation of the system footprint of a CLAI-enabled terminal. We also report on some early user feedback on its features from an internal survey.


Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

April 2020 · 49 Reads · 26 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Textual entailment is a fundamental task in natural language processing. Most approaches for solving this problem use only the textual content present in training data. A few approaches have shown that information from external knowledge sources like knowledge graphs (KGs) can add value, in addition to the textual content, by providing background knowledge that may be critical for a task. However, the proposed models do not fully exploit the information in the usually large and noisy KGs, and it is not clear how it can be effectively encoded to be useful for entailment. We present an approach that complements text-based entailment models with information from KGs by (1) using Personalized PageRank to generate contextual subgraphs with reduced noise and (2) encoding these subgraphs using graph convolutional networks to capture the structural and semantic information in KGs. We evaluate our approach on multiple textual entailment datasets and show that the use of external knowledge helps the model to be robust and improves prediction accuracy. This is particularly evident in the challenging BreakingNLI dataset, where we see an absolute improvement of 5-20% over multiple text-based entailment models.
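As a rough illustration of step (1) in the abstract above, the sketch below runs Personalized PageRank by power iteration on a toy undirected graph and keeps the top-scoring nodes as a contextual subgraph. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the damping value 0.85, the `top_k` cutoff, and the toy edges are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, alpha=0.85, iterations=50):
    """Power-iteration Personalized PageRank on an undirected graph.

    alpha is the probability of following an edge; with probability
    1 - alpha the walk restarts at one of the seed nodes.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)

    seed_set = {s for s in seeds if s in neighbors}
    if not seed_set:
        raise ValueError("no seed node occurs in the graph")

    # Restart distribution: uniform over the seed nodes.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            # Each node splits its current score evenly among its neighbors.
            share = alpha * scores[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        scores = nxt
    return scores


def contextual_subgraph(edges, seeds, top_k=4):
    """Keep only edges whose endpoints are both among the top_k scored nodes."""
    scores = personalized_pagerank(edges, seeds)
    keep = set(sorted(scores, key=scores.get, reverse=True)[:top_k])
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Hypothetical toy knowledge graph; the seeds stand in for premise/hypothesis terms.
TOY_EDGES = [
    ("dog", "animal"),
    ("animal", "mammal"),
    ("cat", "animal"),
    ("car", "vehicle"),
    ("vehicle", "machine"),
]
TOY_SEEDS = ["dog", "cat"]
```

Calling `contextual_subgraph(TOY_EDGES, TOY_SEEDS)` drops the component around "car", which is unreachable from the seeds and therefore receives no score. The abstract does not say how the real system chooses its cutoff or seed nodes.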


Doc2Dial: A Framework for Dialogue Composition Grounded in Documents

April 2020 · 27 Reads · 7 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

We introduce Doc2Dial, an end-to-end framework for generating conversational data grounded in given documents. It takes the documents as input and generates pipelined tasks for obtaining the annotations needed to produce simulated dialog flows. The dialog flows are then used to guide the collection of utterances via an integrated crowdsourcing tool. The outcomes include human-human dialogue data grounded in the given documents, as well as various types of automatically or human-labeled annotations that help ensure the quality of the dialog data, with the flexibility to (re)compose dialogues. We expect such data to facilitate building automated dialogue agents for goal-oriented tasks. We demonstrate the Doc2Dial system with documents from various customer care domains.


CLAI: A Platform for AI Skills on the Command Line

January 2020 · 31 Reads

This paper reports on the open-source project CLAI (Command Line AI), aimed at bringing the power of AI to the command-line interface (CLI). The platform sets up the CLI as a new environment for AI researchers to conquer by surfacing the command line as a generic environment that researchers can interface with using a simple sense-act API, much like the traditional AI agent architecture. In this paper, we discuss the design and implementation of the platform in detail, through illustrative use cases of new end-user interaction patterns enabled by this design, and through quantitative evaluation of the system footprint of a CLAI-enabled terminal. We also report on some early user feedback on its features from an internal survey.



Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

November 2019 · 63 Reads

Textual entailment is a fundamental task in natural language processing. Most approaches for solving the problem use only the textual content present in training data. A few approaches have shown that information from external knowledge sources like knowledge graphs (KGs) can add value, in addition to the textual content, by providing background knowledge that may be critical for a task. However, the proposed models do not fully exploit the information in the usually large and noisy KGs, and it is not clear how it can be effectively encoded to be useful for entailment. We present an approach that complements text-based entailment models with information from KGs by (1) using Personalized PageRank to generate contextual subgraphs with reduced noise and (2) encoding these subgraphs using graph convolutional networks to capture KG structure. Our technique extends the capability of text models exploiting structural and semantic information found in KGs. We evaluate our approach on multiple textual entailment datasets and show that the use of external knowledge helps improve prediction accuracy. This is particularly evident in the challenging BreakingNLI dataset, where we see an absolute improvement of 5-20% over multiple text-based entailment models.


Figure 1: An NLI instance situated in a knowledge graph. Premise nodes are blue, hypothesis nodes red.
Figure 2: Architecture of our GRN model.
Heuristics for Interpretable Knowledge Graph Contextualization

November 2019 · 286 Reads

In this paper, we introduce the problem of knowledge graph contextualization: given a specific context, extract the most relevant sub-graph of a given knowledge graph. The context in this paper is the textual entailment problem, and more specifically an instance of that problem where the entailment relationship between two sentences P and H has to be predicted automatically. This prediction takes the form of a classification task, and we seek to provide that task with the most relevant external knowledge while eliminating as much noise as possible. We base our methodology on finding the shortest paths in the cost-customized external knowledge graph that connect P and H, and build a series of methods, starting with manually curated search heuristics and culminating in automatically extracted heuristics, to find such paths and build the most relevant sub-graph. We evaluate our approaches by measuring the accuracy of the classification on the textual entailment problem, and show that modulating the external knowledge that is used has an impact on performance.
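The idea of shortest paths over a cost-customized graph can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method: it uses plain Dijkstra search with one non-negative cost per relation type, and the graph layout, relation names, cost values, and function names are all hypothetical.

```python
import heapq


def shortest_path(graph, source, target, relation_cost):
    """Dijkstra search over a graph given as {node: [(neighbor, relation), ...]}.

    relation_cost maps a relation name to a non-negative edge cost;
    relations missing from the map default to a cost of 1.0.
    Returns (total_cost, path) or None if target is unreachable.
    """
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, relation in graph.get(node, []):
            if neighbor not in settled:
                step = relation_cost.get(relation, 1.0)
                heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return None


def contextualize(graph, premise_nodes, hypothesis_nodes, relation_cost):
    """Collect every node lying on a cheapest path from a premise node to a hypothesis node."""
    kept = set()
    for p in premise_nodes:
        for h in hypothesis_nodes:
            found = shortest_path(graph, p, h, relation_cost)
            if found is not None:
                kept.update(found[1])
    return kept


# Hypothetical toy graph with two relation types.
TOY_KG = {
    "dog": [("animal", "IsA"), ("bone", "RelatedTo")],
    "bone": [("animal", "RelatedTo")],
    "animal": [],
}
PREFER_ISA = {"IsA": 1.0, "RelatedTo": 3.0}
PREFER_RELATED = {"IsA": 10.0, "RelatedTo": 1.0}
```

With `PREFER_ISA` the cheapest route from "dog" to "animal" is the direct `IsA` edge, while `PREFER_RELATED` makes the detour through "bone" cheaper. Changing the per-relation costs therefore changes which sub-graph is extracted, which is the sense in which the graph is cost-customized here.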


Citations (11)


... Sil et al. introduced PRIMEQA, an open-source repository to democratize cutting-edge QA methodologies. This end-to-end QA toolkit allows for custom app creation with trainable retrievers and readers for deployment [53]. Sun et al. [54] propose recitation-augmented language models, enabling LLMs to retrieve pertinent information from their own memory through sampling to answer questions. ...

Reference:

Large language models as tax attorneys: a case study in legal capabilities emergence
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development
  • Citing Conference Paper
  • January 2023

... It contains 4,793 goal-oriented dialogues and a total of 488 associated grounding documents from 4 domains for social welfare. The proof-of-concept doc2dial framework [10] and subsequent attempts [4,9,18] are following the knowledge extraction -response generation paradigm. However, their data pre-processing includes a well-designed text reading step, which draws the government document into the form of plain text. ...

Doc2Bot: Document grounded Bot Framework
  • Citing Article
  • May 2021

Proceedings of the AAAI Conference on Artificial Intelligence

... With the development of online business, customer service has been applied in various domains, including technical support, after-sales service, and banking applications. Customer service is one of the pillars for a company's success, as it is highly related to customers' satisfaction and affects how the company is viewed by the public [6]. To some extent, providing satisfying customer service can generate more marketing and sales opportunities. ...

Agent Assist through Conversation Analysis

... Crowdsourcing is the primary method for creating conversational datasets, where human workers generate data based on provided instructions [7,12,23,24,56,64,79,82,86,112,121]. This approach, however, is costly, time-consuming, and challenging to scale or adapt to new domains [33,84,92]. ...

Doc2Dial: A Framework for Dialogue Composition Grounded in Documents
  • Citing Article
  • April 2020

Proceedings of the AAAI Conference on Artificial Intelligence

... For graph-based external knowledge in the field of Natural Language Inference (NLI), there are a lot of attempts such as [47,52,53]. In particular, Wang et al. [47] presented a combination of techniques on text, graph, and text-and-graph-based models that can leverage external knowledge to improve performance on the NLI problem. ...

Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks
  • Citing Article
  • April 2020

Proceedings of the AAAI Conference on Artificial Intelligence

... However, all their models were tailored to specific games; in contrast, by reading the docs, DocCoder generalizes to unseen functions. Agarwal et al. (2020) retrieved from the tldr pages; however, their model only retrieved NL→Bash examples, without the further challenge of generating code. ...

CLAI: A Platform for AI Skills on the Command Line
  • Citing Conference Paper
  • May 2020

... IBM has developed a business group decision-making system. 149 The moral machine 150 elicits preferences about ethical issues and can be used to help people make ethical decisions, 151 for example, by voting. 131 Societal tradeoffs between activities can be decided by aggregating agents' individual tradeoffs. ...

Planning and visualization for a smart meeting room assistant: A case study in the Cognitive Environments Laboratory at IBM T.J. Watson Research Center, Yorktown
  • Citing Article
  • February 2019

AI Communications

... For instance, Krarup et al. (2019) [145] use waypoints for explanation, where this use of an execution-trace is a similar approach to that of rule traces and tracing nodes through a BN or DN. Similar approaches of generating explanations from actions using a model can be seen in other recent research [146][147][148][149][150][151][152][153][154][155][156][157][158]. Fox et al. (2017) [144] identify several questions that XAIP can answer. ...

Visualizations for an Explainable Planning Agent
  • Citing Conference Paper
  • July 2018

... To address the weaknesses above, our work revisits and further builds upon the method of Gunasekara et al. [18,19] that comprises two efficient steps in which utterances are first encoded into vectors and subsequently clustered. However, Gunasekara et al. [18,19] represent utterances as bag-of-words or skip-thought [20] vectors, which have been shown to perform poorly in semantic similarity tasks [21,22], and without considering dialogue context. ...

Quantized Dialog - A General Approach for Conversational Systems
  • Citing Article
  • June 2018

Computer Speech & Language

... In contrast to the traditional keyword-based image search, users easily obtain the expected images quickly and efficiently, while the system narrows down their search space dramatically by collecting a set of constraints. Considering its big commercial potential, a long track of research efforts have been dedicated to the conversational image search [6], [7]. Our investigation shows that existing studies usually concentrate on the "conversation" part [8]- [13], namely developing preference elicitation paradigm relying on natural language processing techniques to determine which question to ask at each time, so that the system can quickly understand the user need with fewer conversational rounds. ...

A Unified Implicit Dialog Framework for Conversational Search

Proceedings of the AAAI Conference on Artificial Intelligence