Vani Kanjirangat

Dalle Molle Institute for Artificial Intelligence Research | IDSIA

PhD


38
Publications
6,868
Reads
471
Citations
Introduction
Currently working on Deep Learning in Natural Language Processing.
Education
January 2013 - May 2018
Amrita Vishwa Vidyapeetham
Field of study
  • Natural Language Processing
August 2010 - May 2012
Cochin University of Science and Technology
Field of study
  • Computer Science

Publications (38)
Conference Paper
We explore and analyze the capabilities and limitations of large language models in understanding and distinguishing causal sentences in a zero-shot setting. We experiment on a multi-class dataset of direct causal, conditional causal, and correlational sentences. In the experiments, the GPT and Falcon models are validated against a ne...
Conference Paper
We report the approaches submitted to the NADI 2024 Subtask 1: Multi-label country-level Dialect Identification (MLDID). The core part was to adapt the information from multi-class data for a multi-label dialect classification task. We experimented with supervised and unsupervised strategies to tackle the task in this challenging setting. Under the...
Article
A suprasellar lesion is an unusual mass in the suprasellar region of the brain. Common suprasellar lesions include pituitary adenoma, craniopharyngioma, and meningioma. Patients may present with significant visual symptoms and other symptoms such as headache and hormonal imbalances. The proposed study utilizes 553 discharge summaries of suprasellar patien...
Article
We propose a novel approach that uses semi-supervised learning to extract triplets from domain-specific texts and create a Knowledge Graph (KG), with a focus on the agricultural domain. Building domain-specific knowledge graphs can be challenging due to several factors such as domain-specific vocabulary, data integration challenges, dynamic data, a...
Conference Paper
In this paper, we describe our systems submitted to the NADI Subtask 1: country-wise dialect classification. We designed two types of solutions. The first type is convolutional neural network (CNN) classifiers trained on subword segments of optimized lengths. The second type is fine-tuned classifiers with BERT-based language-specific pre-trained mo...
Conference Paper
This paper deals with the problem of incremental dialect identification. Our goal is to reliably determine the dialect before the full utterance is given as input. The major part of the previous research on dialect identification has been model-centric, focusing on performance. We address a new question: How much input is needed to identify a diale...
Conference Paper
In this paper, we discuss our contribution to the NII Testbeds and Community for Information Access Research (NTCIR)-16 Real-MedNLP shared task. Our team (ZuKyo) participated in the English subtask: Few-resource Named Entity Recognition. The main challenge in this low-resource task was a low number of training documents annotated with a high number...
Conference Paper
In this NTCIR-16 Real-MedNLP shared task paper, we present the methods of the ZuKyo-JA subteam for solving the Japanese part of Subtask1 and Subtask3 (Subtask1-CR-JA, Subtask1-RR-JA, Subtask3-RR-JA). Our solution is based on a sliding-window approach using a Japanese BERT pre-trained masked-language model, which was used as a common architecture f...
Article
Automatic document classification for highly interrelated classes is a demanding task that becomes more challenging when there is little labeled data for training. Such is the case of the coronavirus disease 2019 (COVID-19) Clinical repository—a repository of classified and translated academic articles related to COVID-19 and relevant to the clinic...
Article
Entity relation extraction plays an important role in the biomedical, healthcare, and clinical research areas. Recently, pre-trained models based on transformer architectures and their variants have shown remarkable performances in various natural language processing tasks. Most of these variants were based on slight modifications in the architectu...
Preprint
When coping with literary texts such as novels or short stories, the extraction of structured information in the form of a knowledge graph might be hindered by the huge number of possible relations between the entities corresponding to the characters in the novel and the consequent hurdles in gathering supervised information about them. Such issue...
Preprint
Lexical semantic change detection (also known as semantic shift tracing) is the task of identifying words that have changed their meaning over time. Unsupervised semantic shift tracing, the focal point of SemEval2020, is particularly challenging. Given the unsupervised setup, we propose to identify clusters among different occurrences of ea...
Chapter
Biomedical question answering is a great challenge in NLP due to complex scientific vocabulary and the lack of massive annotated corpora, but at the same time it has great potential to improve biomedical practices in critical ways. This paper describes the work carried out as part of the BioASQ challenge (Task-7B Phase-B), and targets an integ...
Preprint
We present two deep learning approaches to narrative text understanding for character relationship modelling. The temporal evolution of these relations is described by dynamic word embeddings, which are designed to learn semantic changes over time. An empirical analysis of the corresponding character trajectories shows that such approaches are effec...
Article
The objective of the work is to explore the potency of integrating structural and citation information with effective syntax‐semantic text‐based analysis for scientific plagiarism detection. One of the major limitations in today's plagiarism checkers is their sole dependence on text‐based detection, where they ignore the citation and structural inf...
Article
The proposed work aims to explore and compare the potency of syntactic-semantic based linguistic structures in plagiarism detection using natural language processing techniques. The current work explores linguistic features, viz., part of speech tags, chunks and semantic roles in detecting plagiarized fragments and utilizes a combined syntactic-sem...
Article
The proposed work models document level text plagiarism detection as a binary classification problem, where the task is to distinguish a given suspicious-source document pair as plagiarized or non-plagiarized. The objective is to explore the potency of syntax based linguistic features extracted using shallow natural language processing techniques f...
Article
The rapid evolution of information content and its ease of access have made research and academia highly vulnerable to plagiarism. Plagiarism is an act of intellectual theft and information breach which must be restricted to ensure educational integrity. Usually in plagiarism checking, exhaustive document comparisons with large repositorie...
Article
With the evolution of technologies like internet search engines and improved text editors, plagiarism has become a critical issue. Many approaches already exist for detecting verbatim plagiarism, which is simple copy-and-paste plagiarism, but the scenario becomes more complex when it comes to intelligent plagiarism. Intelligent plagiarism...
Article
The swift evolution of technology has made information accessible through many different means, which has opened the door to plagiarism. In today’s world of rapid technological growth, plagiarism is worsening and has become a serious concern in academia, research and many other fields. To curb this intellectual theft and to ensure academic integ...
Article
With the advent of the World Wide Web, plagiarism has become a prime issue in the field of academia. A plagiarized document may contain content from a number of sources available on the web, and it is beyond any individual to detect such plagiarism manually. This paper focuses on the exploration of soft clustering, via the Fuzzy C-Means algorithm, in the candid...
Article
Text document categorization is one of the rapidly emerging research fields, where documents are identified, differentiated and classified manually or algorithmically. The paper focuses on application of automatic text document categorization in plagiarism detection domain. In today's world plagiarism has become a prime concern, especially in resea...
Conference Paper
Plagiarism is one of the most serious crimes in academia and research fields. In this modern era, where access to information has become much easier, the act of plagiarism is rapidly increasing. This paper focuses on an external plagiarism detection method, where the source collection of documents is available against which the suspicious documents are...
