Shantipriya Parida

Senior AI Scientist, Silo AI

About

56
Publications
8,910
Reads
204
Citations

Publications (56)
Preprint
Full-text available
This paper provides the system description of "Silo NLP's" submission to the Workshop on Asian Translation (WAT2022). We have participated in the Indic Multimodal tasks (English->Hindi, English->Malayalam, and English->Bengali Multimodal Translation). For text-only translation, we trained Transformers from scratch and fine-tuned mBART-50 models. Fo...
Conference Paper
Full-text available
Multi-modal Machine Translation (MMT) enables the use of visual information to enhance the quality of translations. The visual information can serve as a valuable piece of context information to decrease the ambiguity of input sentences. Despite the increasing popularity of such a technique, good and sizeable datasets are scarce, limiting the full...
Preprint
Full-text available
This paper presents the first publicly available treebank of Odia, a morphologically rich low-resource Indian language. The treebank contains approx. 1082 tokens (100 sentences) in Odia selected from "Samantar", the largest available parallel corpora collection for Indic languages. All the selected sentences are manually annotated following the "U...
Preprint
Full-text available
Multi-modal Machine Translation (MMT) enables the use of visual information to enhance the quality of translations. The visual information can serve as a valuable piece of context information to decrease the ambiguity of input sentences. Despite the increasing popularity of such a technique, good and sizeable datasets are scarce, limiting the full...
Article
A major effort is currently underway to develop a large-scale treebank for Indian low resource Languages (ILRLs). Apart from that, a rich and large-scale treebank can be an essential resource for linguistic investigations. This paper presents the first publicly available treebank of Santali, a low-resource Indian language. The treebank contains 307 to...
Article
Full-text available
The healthcare system in the Indian subcontinent is plagued with numerous issues related to the access, transfer, and storage of patients' medical records. The lack of infrastructure to properly communicate and track records between all key participants has allowed the distribution of counterfeit drugs, dependency on unsafe methods of communication...
Chapter
Full-text available
Reddit is a platform with a heavy focus on its community forums and is hence comparatively distinct from other social media platforms. It is divided into sub-Reddits, resulting in distinct topic-specific communities. The convenience of expressing thoughts, the flexibility of describing emotions, the interoperability of using jargon, the security of user i...
Conference Paper
Full-text available
Multimodal Machine Translation (MMT) systems utilize additional information from other modalities beyond text to improve the quality of machine translation (MT). The additional modality is typically in the form of images. Despite proven advantages, it is indeed difficult to develop an MMT system for various languages primarily due to the lack of a...
Article
Full-text available
Heart disease is considered to be the most life-threatening ailment in the entire world and has been a major concern of developing countries. Heart disease also affects the fetus, which can be detected by cardiotocography tests conducted on the mother during her pregnancy. This paper analyses the presence of heart disease in the fetus by optimizin...
Article
Full-text available
EEG records the activity of a brain affected by mental illness or abnormality, including corollary discharge, which helps to identify spontaneous schizophrenia episodes in a patient. The recordings span a time interval and show the normal and abnormal activities of the brain's different nodes. The spiking neural network procedure can be applied here to detect t...
Article
Full-text available
According to the psychological literature, implicit motives allow for the characterization of behavior, subsequent success, and long-term development. Contrary to personality traits, implicit motives are often deemed to be rather stable personality characteristics. Normally, implicit motives are obtained by Operant Motives, unconscious intrinsic de...
Poster
Full-text available
To evaluate the performance of traditional statistical MT techniques, as well as some recent NMT techniques, under different configuration settings, e.g., one-to-one and one-to-many. Statistical Machine Translation Neural Machine Translation IBM Translation Model2 OpenNMT-Py OpenNMT-tf Version1 Version2 Version3 Version4 Version5 Spanish (es)-Hñähñu (o...
Conference Paper
Full-text available
This paper describes the submission of team "Tamalli" to the AmericasNLP2021 shared task on Open Machine Translation for low resource South American languages. Our goal was to evaluate different Machine Translation (MT) techniques, statistical and neural-based, under several configuration settings. We obtained the second-best results for the language...
Chapter
The world we live in today, where technology has become a very integral part of our lives, has new, untapped resources that can bring about massive changes in the health sector. The Internet and social media have become the flag bearers of the tech‐savvy world. Some of the services provided by the various social media platforms like chats, comments...
Chapter
Nowadays, more and more people are gaining interest in news and social media networks, and are also sharing their opinions freely in different languages. Such activities lead to interesting topics of research that scientists are working on. Considering news, it must be classified and easily accessible by the users for the information of th...
Conference Paper
Full-text available
This paper describes our participation in the shared evaluation campaign of MexA3T 2020. Our main goal was to evaluate a Supervised Autoencoder (SAE) learning algorithm in text classification tasks. For our experiments, we used three different sets of features as inputs, namely classic word n-grams, char n-grams, and Spanish BERT encodings. Our res...
Conference Paper
Full-text available
This paper describes the submission of team "ODIANLP" to WAT 2020. We participated in the English→Hindi Multimodal task and the Indic task. We used the state-of-the-art Transformer model for the translation task and InceptionResNetV2 for the Hindi Image Captioning task. Our submission tops the English→Hindi Multimodal task in its track and Od...
Conference Paper
Full-text available
Language detection is a key part of the NLP pipeline for text processing. The task of automatically detecting languages belonging to disjoint groups is relatively easy. It is considerably challenging to detect languages that have similar origins or dialects. This paper describes Idiap's submission to the 2020 Germeval evaluation campaign on Swiss...
Conference Paper
Full-text available
The preparation of parallel corpora is a challenging task, particularly for languages that suffer from under-representation in the digital world. In a multilingual country like India, the need for such parallel corpora is stringent for several low-resource languages. In this work, we provide an extended English-Odia parallel corpus, OdiEnCorp 2.0,...
Article
Full-text available
Electroencephalography is the recording of brain electrical activities that can be used to diagnose brain seizure disorders. By identifying brain activity patterns and their correspondence between symptoms and diseases, it is possible to give an accurate diagnosis and appropriate drug therapy to patients. This work aims to categorize electroencepha...
Chapter
Automatic text summarization is considered a challenging task in the field of natural language processing. In multilingual scenarios, particularly for low-resource, morphologically complex languages, summarization data sets are rare and difficult to construct. In this work, we propose a novel technique to extract Odia tex...
Article
Full-text available
Visual Genome is a dataset connecting structured image information with English language. We present “Hindi Visual Genome”, a multi-modal dataset consisting of text and images suitable for English-Hindi multi-modal machine translation task and multi-modal research. We have selected short English segments (captions) from Visual Genome along with the...
Conference Paper
Full-text available
Text summarization is considered a challenging task in the NLP community. Datasets for multilingual text summarization are rare and difficult to construct. In this work, we build an abstract text summarizer for German-language text using the state-of-the-art "Transformer" model. We propose an...
Preprint
Full-text available
Visual Genome is a dataset connecting structured image information with English language. We present "Hindi Visual Genome", a multimodal dataset consisting of text and images suitable for English-Hindi multimodal machine translation task and multimodal research. We have selected short English segments (captions) from Visual Genome along with asso...
Poster
Full-text available
English-to-Hindi multimodal dataset with a challenging test set that requires the image for disambiguation.
Conference Paper
A multi-lingual country like India needs language corpora for low resource languages not only to provide its citizens with technologies of natural language processing (NLP) readily available in other countries, but also to support its people in their education and cultural needs. In this work, we focus on one of the low resource languages, Odia, an...
Conference Paper
Full-text available
This paper describes the CUNI submission to WAT 2018 for the English-Hindi translation task using a transfer learning technique that has proven effective under low resource conditions. We have used the Transformer model and utilized an English-Czech parallel corpus as an additional data source. Our simple transfer learning approach first trains a “p...
Conference Paper
Full-text available
This paper presents a case study in translating short image captions of the Visual Genome dataset from English into Hindi using out-of-domain data sets of varying size. We experiment with three NMT models: the shallow and deep sequence-to-sequence and the Transformer model as implemented in Marian toolkit. Phrase-based Moses serves as the baseline....
Article
The huge number of voxels in fMRI over time poses a major challenge for effective analysis. Fast, accurate, and reliable classifiers are required for estimating the decoding accuracy of brain activities. Although machine-learning classifiers seem promising, individual classifiers have their own limitations. To address this limitation, the presen...
Chapter
Classification of brain states obtained through functional magnetic resonance imaging (fMRI) poses a serious challenge for the neuroimaging community to uncover discriminating patterns of brain state activity that define independent thought processes. This challenge came into existence because of the large number of voxels in a typical fMRI scan, the...
Chapter
Classification of brain states obtained through functional magnetic resonance imaging (fMRI) poses a serious challenge for the neuroimaging community to uncover discriminating patterns of brain state activity that define independent thought processes. This challenge came into existence because of the large number of voxels in a typical fMRI scan, the...
Article
Functional magnetic resonance imaging (fMRI) makes it possible to detect brain activities in order to elucidate cognitive states. The complex nature of fMRI data requires an understanding of the analyses applied to produce possible avenues for developing models of cognitive state classification and improving brain activity prediction. While many mode...
Article
The application of machine learning approaches to decode cognitive states through functional Magnetic Resonance Imaging (fMRI) is one of the emerging fields of research over the past decade. Multivoxel Pattern Analysis (MVPA) treats the activation of multiple voxels from the fMRI data as a pattern to decode the brain states using machine learning b...
Article
Classification of brain states obtained through functional magnetic resonance imaging (fMRI) poses a serious challenge for the neuroimaging community to uncover discriminating patterns of brain state activity that define independent thought processes. This challenge came into existence because of the large number of voxels in a typical fMRI scan, the...
Conference Paper
In this paper, an application of genetic algorithms (GAs) and the Gaussian Naïve Bayesian (GNB) approach is studied to explore brain activities by decoding specific cognitive states from functional magnetic resonance imaging (fMRI) data. However, in the case of fMRI data analysis, the large number of attributes may lead to a serious problem of classifyi...
Conference Paper
Functional magnetic resonance imaging (fMRI) is considered a powerful technique for performing brain activation studies by measuring neural activities. However, the large number of voxels over time poses a major challenge to neuroscientists and researchers in analyzing the data effectively. The decoding of brain activities requires fast, accurate, an...
Chapter
The classification of functional magnetic resonance imaging (fMRI) data involves many challenges due to the problem of high dimensionality, noise, and limited training samples. In particular, mental states classification, decoding brain activation, and finding the variable of interest by using fMRI data was one of the focused research topics among...
Article
Full-text available
Mobile broadband traffic continues to grow exponentially as subscribers increasingly turn to their mobile devices as their primary access to Internet services and applications. New smartphones, tablets, and machine-to-machine devices are providing a compelling mobile experience by allowing people to engage their social networks, conduct business, a...
Conference Paper
Full-text available
One of the key challenges in cognitive neuroscience is determining the mapping between neural activities and mental representations. Functional magnetic resonance imaging (fMRI) provides a measure of brain activity in response to cognitive tasks and has proved to be one of the most effective tools in brain imaging and in studying brain activities. The c...