January 2022 · 23 Reads · 24 Citations
October 2021 · 15 Reads
September 2021 · 46 Reads · 1 Citation
We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning. We hypothesize, and validate, that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications than models fine-tuned on individual languages. We present a first-of-its-kind detailed study that tracks performance change as languages are added to a base language in a graded and greedy (in the sense of best performance boost) manner; it reveals that careful selection of a subset of related languages can yield significantly better performance than using all related languages. The Indo-Aryan (IA) language family is chosen for the study; the exact languages are Bengali, Gujarati, Hindi, Marathi, Oriya, Punjabi and Urdu. The script barrier is crossed by simple rule-based transliteration of the text of all languages into Devanagari. Experiments are performed on mBERT, IndicBERT, MuRIL and two RoBERTa-based LMs, the last two pre-trained by us. Low-resource languages, such as Oriya and Punjabi, are found to be the largest beneficiaries of multilingual fine-tuning. Our test bed comprises the Textual Entailment, Entity Classification and Section Title Prediction tasks of IndicGLUE, along with POS tagging. Compared to monolingual fine-tuning, we obtain relative performance improvements of up to 150% on the downstream tasks. The surprising take-away is that for any language there is a particular combination of other languages that yields the best performance, and any additional language is in fact detrimental.
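The abstract leaves the transliteration step abstract. As a rough illustration, a pass like the one below maps native-script text into Devanagari; this is a minimal sketch using the open-source indic-transliteration package, not the authors' exact rule set, and the language codes and scheme mapping are assumptions for the example.

```python
# Minimal sketch of script unification via rule-based transliteration,
# using the indic-transliteration package (pip install indic-transliteration).
from indic_transliteration import sanscript
from indic_transliteration.sanscript import transliterate

# Assumed mapping from language code to source script scheme.
SOURCE_SCHEMES = {
    "bn": sanscript.BENGALI,
    "gu": sanscript.GUJARATI,
    "pa": sanscript.GURMUKHI,   # Punjabi (Gurmukhi script)
    "or": sanscript.ORIYA,
}

def to_devanagari(text: str, lang: str) -> str:
    """Transliterate text from its native script into Devanagari."""
    scheme = SOURCE_SCHEMES.get(lang)
    if scheme is None:          # Hindi/Marathi are already in Devanagari.
        return text
    return transliterate(text, scheme, sanscript.DEVANAGARI)

print(to_devanagari("বাংলা", "bn"))  # Devanagari rendering of the Bengali word
```

Urdu, written in the Perso-Arabic script, would need a separate rule set and is therefore left out of this illustrative mapping.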
September 2021 · 70 Reads
Weakly-supervised table question answering (TableQA) models have achieved state-of-the-art performance by using a pre-trained BERT transformer to jointly encode a question and a table and produce a structured query for the question. However, in practical settings TableQA systems are deployed over table corpora whose topic and word distributions are quite distinct from BERT's pre-training corpus. In this work we simulate the practical topic-shift scenario by designing novel challenge benchmarks, WikiSQL-TS and WikiTQ-TS, consisting of train-dev-test splits over five distinct topic groups, based on the popular WikiSQL and WikiTableQuestions datasets. We empirically show that, despite pre-training on large open-domain text, the performance of models degrades significantly when they are evaluated on unseen topics. In response, we propose T3QA (Topic Transferable Table Question Answering), a pragmatic adaptation framework for TableQA comprising: (1) topic-specific vocabulary injection into BERT, (2) a novel natural language question generation pipeline based on a text-to-text transformer generator (such as T5 or GPT-2), focused on generating topic-specific training data, and (3) a logical form re-ranker. We show that T3QA provides a reasonably good baseline for our topic-shift benchmarks. We believe our topic-split benchmarks will lead to robust TableQA solutions that are better suited for practical deployment.
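Step (1), topic-specific vocabulary injection, maps naturally onto the standard Hugging Face tokenizer-extension pattern. The sketch below is an illustration under assumptions, not the paper's exact procedure: the model id and the topic term list are invented for the example.

```python
# Minimal sketch of injecting topic-specific vocabulary into BERT
# with the Hugging Face transformers API.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Hypothetical topic-specific terms mined from a target-topic corpus.
topic_vocab = ["midfielder", "hat-trick", "penalty-shootout"]

num_added = tokenizer.add_tokens(topic_vocab)
# Grow the embedding matrix so the new tokens get trainable vectors
# (typically initialized randomly, then tuned on topic text).
model.resize_token_embeddings(len(tokenizer))
print(f"Injected {num_added} topic tokens; vocab size is now {len(tokenizer)}")
```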
August 2021 · 329 Reads · 83 Citations
June 2021 · 62 Reads
Spoken intent detection has become a popular way to interface with various smart devices with ease. However, such systems are limited to a preset list of intent terms or commands, which restricts quick customization of personal devices to new intents. This paper presents a few-shot spoken intent classification approach with task-agnostic representations via the meta-learning paradigm. Specifically, we leverage popular representation-based meta-learning to build a task-agnostic representation of utterances, which is then fed to a linear classifier for prediction. We evaluate three such approaches on a novel experimental protocol developed on two popular spoken intent classification datasets: Google Commands and Fluent Speech Commands. For 5-shot (1-shot) classification of novel classes, the proposed framework provides an average classification accuracy of 88.6% (76.3%) on the Google Commands dataset and 78.5% (64.2%) on the Fluent Speech Commands dataset. The performance is comparable to that of traditional supervised classification models trained with abundant samples.
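For context, a representation-based approach in the spirit of Prototypical Networks reduces to a few lines once utterance embeddings are available. The sketch below is a generic illustration with random tensors standing in for a learned utterance encoder; it is not the paper's implementation.

```python
# Minimal sketch of nearest-prototype few-shot classification.
import torch

def prototypical_predict(support: torch.Tensor,   # (n_way, k_shot, dim)
                         query: torch.Tensor      # (n_query, dim)
                         ) -> torch.Tensor:
    # One prototype per class: the mean of its k support embeddings.
    prototypes = support.mean(dim=1)               # (n_way, dim)
    # Squared Euclidean distance from every query to every prototype.
    dists = torch.cdist(query, prototypes) ** 2    # (n_query, n_way)
    # Nearest prototype wins; a softmax over -dists would give probabilities.
    return dists.argmin(dim=1)                     # (n_query,)

# Toy 5-way 5-shot episode with random "embeddings".
support = torch.randn(5, 5, 64)
query = torch.randn(10, 64)
print(prototypical_predict(support, query))
```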
June 2021 · 89 Reads
Recent advances in transformers have enabled Table Question Answering (Table QA) systems to achieve high accuracy and SOTA results on open-domain datasets like WikiTableQuestions and WikiSQL. Such transformers are frequently pre-trained on open-domain content such as Wikipedia, where they effectively encode questions and corresponding tables from Wikipedia, as seen in Table QA datasets. However, web tables in Wikipedia are notably flat in their layout, with the first row as the sole column header. This layout lends itself to a relational view of tables, where each row is a tuple. In contrast, tables in domain-specific business or scientific documents often have a much more complex layout, including hierarchical row and column headers, in addition to specialized vocabulary terms from that domain. To address this problem, we introduce the domain-specific Table QA dataset AIT-QA (Airline Industry Table QA). The dataset consists of 515 questions authored by human annotators on 116 tables extracted from public U.S. SEC filings (publicly available at: https://www.sec.gov/edgar.shtml) of major airline companies for the fiscal years 2017-2019. We also provide annotations on the nature of the questions, marking those that require hierarchical headers, domain-specific terminology, and paraphrased forms. Our zero-shot baseline evaluation of three transformer-based SOTA Table QA methods - TaPAS (end-to-end), TaBERT (semantic-parsing-based), and RCI (row-column-encoding-based) - clearly exposes the limitations of these methods in this practical setting, with the best accuracy at just 51.8% (RCI). We also present pragmatic table preprocessing steps used to pivot and project these complex tables into a layout suitable for the SOTA Table QA models.
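The pivot-and-project preprocessing can be illustrated generically: a hierarchical column header is flattened into the single header row that flat-table QA models such as TaPAS expect. The pandas sketch below uses an invented toy table; it does not reproduce AIT-QA's actual transformation rules.

```python
# Minimal sketch of flattening a hierarchical (MultiIndex) column header.
import pandas as pd

# Toy table with a two-level column header, as in many SEC filing tables.
columns = pd.MultiIndex.from_tuples([
    ("Operating Revenue", "2018"),
    ("Operating Revenue", "2019"),
    ("Operating Expense", "2019"),
])
df = pd.DataFrame([[100, 120, 90]], columns=columns)

# Join the header levels so each column has one descriptive flat name.
df.columns = [" - ".join(level for level in col if level)
              for col in df.columns]
print(df.columns.tolist())
# ['Operating Revenue - 2018', 'Operating Revenue - 2019', 'Operating Expense - 2019']
```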
March 2021 · 805 Reads · 1 Citation
Recently, there has been increasing interest in multilingual automatic speech recognition (ASR), where a speech recognition system caters to multiple low-resource languages by taking advantage of small amounts of labeled corpora in multiple languages. With multilingualism becoming common in today's world, there has been increasing interest in code-switching ASR as well. In code-switching, multiple languages are freely interchanged within a single sentence or between sentences. The success of low-resource multilingual and code-switching ASR often depends on the variety of languages in terms of their acoustics and linguistic characteristics, as well as on the amount of data available and how carefully these factors are considered in building the ASR system. In this challenge, we focus on building multilingual and code-switching ASR systems through two subtasks covering a total of seven Indian languages, namely Hindi, Marathi, Odia, Tamil, Telugu, Gujarati and Bengali. For this purpose, we provide a total of ~600 hours of transcribed speech data, comprising train and test sets, in these languages, including two code-switched language pairs, Hindi-English and Bengali-English. We also provide a baseline recipe for both subtasks, with WERs of 30.73% and 32.45% on the test sets of the multilingual and code-switching subtasks, respectively.
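For reference, word error rates like those quoted above are conventionally computed as word-level edit distance over reference transcripts. A minimal sketch using the open-source jiwer package follows; the transcripts are invented placeholders, not challenge data.

```python
# Minimal sketch of WER computation (pip install jiwer).
import jiwer

references = ["मैं घर जा रहा हूँ", "this is a code switched sentence"]
hypotheses = ["मैं घर जा रहा", "this is code switch sentence"]

# jiwer.wer = (substitutions + deletions + insertions) / reference word count.
error_rate = jiwer.wer(references, hypotheses)
print(f"WER: {error_rate:.2%}")
```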
January 2021 · 13 Reads
January 2021 · 29 Reads · 2 Citations
... We collect raw tabular data from existing datasets, including typical datasets such as WTQ (Pasupat and Liang 2015), SQA (Iyyer, Yih, and Chang 2017), TabFact (Nan et al. 2022), FeTaQA (Nan et al. 2022), FinQA (Chen et al. 2021c), AIT-QA (Katsis et al. 2022), etc. To align closely with the "reasoning complexity of questions" dimension in real-world tabular problems, we do not specifically design for the complexity of the tables themselves, such as structural complexity or large-sized tables. ...
January 2022
... Composed Image Retrieval. Composed Image Retrieval (CIR) has garnered significant attention due to its flexibility in search systems [37,43,48]. Zero-shot CIR methods have been extensively explored, with textual inversion [4,8,15,49,55] emerging as a prominent technique. ...
April 2018
Proceedings of the AAAI Conference on Artificial Intelligence
... It benefits many downstream table-oriented tasks. For instance, in table question answering (Chemmengath et al., 2021), it helps locate the answer for entity-centric questions. In knowledge base population (Zhang et al., 2020a), it aids in mining relationships between entities from the tables. ...
January 2021
... At its core, the model leverages the pretrained transformer model ia-multilingual-transliterated-roberta (Dhamecha et al., 2021; Liu et al., 2019) from IBM as its primary language representation layer. This layer, denoted by self.l1, ...
January 2021
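The loading pattern the snippet above describes can be reproduced with the standard transformers auto classes. The hub id below is an assumption inferred from the model name in the text, not a verified identifier.

```python
# Minimal sketch of loading the model named in the snippet, assuming it is
# published on the Hugging Face hub; the hub id is an assumption.
from transformers import AutoModel, AutoTokenizer

model_id = "ibm/ia-multilingual-transliterated-roberta"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)  # the layer the snippet calls self.l1
```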
... However, including unrelated languages in multilingual fine-tuning might not be beneficial but could result in negative interference due to conflicting gradients. 28,29 This work investigates the use of multilingual FL in mental health by focusing on the challenge of detecting depression from social media. ...
September 2021
... Automatic speech recognition (ASR) systems have been widely researched and developed for various languages and domains. However, the development of ASR systems for resource-scarce languages (Diwan, 2021), such as Urdu, has been limited by the scarcity of annotated speech data and computational resources (Farooq M. A., 2019; Zia H. R., 2018; Reitmaier, 2022, April; Naeem, 2020). Statistical techniques like HMM-GMM are widely adopted for small-scale datasets across languages, with varied accuracies (Amoolya, 2022; Ashraf, Speaker independent Urdu speech recognition using HMM, 2010; Zhang, 2017). ...
August 2021
... The model accomplishes this by using multiple subtasks to learn the parameter initialization, so that fine-tuning can be applied to the initialization with only a few labels and still perform well on the targeted tasks [42]. In each approach, there is a task-independent encoding function and a task-specific classifier [43]. ProtoNet is a model that learns embeddings for classification, Ridge is a linear-regression model that prevents overfitting, and MetaOptNet is a model that optimizes feature representations. ...
October 2020
... In this chatbot we used the client-server architecture with the help of an Android GUI and natural language processing ("NLP"), with an RNN as the underlying technology. [14] proposed a methodology for designing a chatbot platform over databases; we planned a framework that presents an automatic technique. To the best of our knowledge, our approach is the first to utilize relational databases to bootstrap chatbots for natural-language-to-structured-query translation. ...
January 2021
... However, ATHENA is highly sensitive to changes in and interpretations of user queries [99]. Both the NLIDB system described in [116] and ATHENA++ [115] are extensions of ATHENA. They combine linguistic analysis with deep domain reasoning to translate complex joins and nested SQL queries. ...
August 2020
Proceedings of the VLDB Endowment
... For example, Peled and Reichart (2017) and Dubey et al. (2019) explored the conversion of sarcastic texts into their non-sarcastic interpretations using machine translation techniques. On the contrary, Mishra et al. (2019) have worked on using sentences that express a negative sentiment to generate corresponding sarcastic text. Desai et al. (2022) proposed the task of multimodal sarcasm explanation (MuSE) and introduced the MORE dataset for it. ...
January 2019