
Nirmalya Thakur
- Doctor of Philosophy
- Assistant Professor at South Dakota School of Mines and Technology
Research Interests: Big Data, Data Analysis, Human-Computer Interaction, Machine Learning, and Natural Language Processing
About
- 60 Publications
- 11,474 Reads
- 932 Citations
Introduction
My research interests include Human-Computer Interaction, Artificial Intelligence, Machine Learning, Internet of Things, Data Science, and Natural Language Processing.
To learn more about my work please visit my website at https://www.nirmalyathakur.com/
Publications (60)
This paper makes four scientific contributions to the field of fall detection in the elderly to contribute to their assisted living in the future of internet of things (IoT)-based pervasive living environments, such as smart homes. First, it presents and discusses a comprehensive comparative study, where 19 different machine learning methods were u...
This framework for human behavior monitoring aims to take a holistic approach to study, track, monitor, and analyze human behavior during activities of daily living (ADLs). The framework consists of two novel functionalities. First, it can perform the semantic analysis of user interactions on the diverse contextual parameters during ADLs to identif...
Falls, which are increasing at an unprecedented rate in the global elderly population, are associated with a multitude of needs such as healthcare, medical, caregiver, and economic, and they are posing various forms of burden on different countries across the world, specifically in the low- and middle-income countries. For these respective countrie...
This presentation offers an in-depth exploration of applied computer vision, with a focus on utilizing OpenCV for image processing and real-time vision tasks. Developed from a workshop conducted by Dr. Nirmalya Thakur at the South Dakota School of Mines and Technology on October 29, 2024, the slides have been updated to serve a broader audience whi...
The outbreak of COVID-19 served as a catalyst for content creation and dissemination on social media platforms, as such platforms serve as virtual communities where people can connect with one another seamlessly. While there have been several works related to the mining and analysis of COVID-19-related posts on social media platforms such as Twitter...
The work of this paper presents multiple novel findings from a comprehensive analysis of a dataset that includes the stress, anxiety, and depression levels experienced by 95 young adults computed using the Depression Anxiety Stress Scale (DASS). First, for the age groups 18-20, 21-25, and 26-30, average stress and anxiety levels were higher in females...
The work presented in this paper makes multiple scientific contributions related to the investigation of the global fear associated with COVID-19 by performing a comprehensive analysis of a dataset comprising survey responses of participants from 40 countries. First, the results of subjectivity analysis performed using TextBlob, showed that in the...
The work presented in this paper makes multiple scientific contributions with a specific focus on the analysis of misinformation about COVID-19 on YouTube. First, the results of topic modeling performed on the video descriptions of YouTube videos containing misinformation about COVID-19 revealed four distinct themes or focus areas—Promotion and Out...
This work focuses on the analysis of user diversity-based patterns of the public discourse on Twitter about mental health in the context of online learning during COVID-19. Two aspects of user diversity – gender and location are the focus of this work. A dataset comprising 52,984 Tweets about online learning during COVID-19, posted on Twitter betwe...
The interdisciplinary work at the intersections of Big Data, Data Mining, and Data Analysis presented in this paper, focuses on the mining and analysis of web behavior on Google related to different online learning platforms from different countries, since the beginning of COVID-19. This paper makes multiple scientific contributions to these fields...
During virus outbreaks in the recent past, web behavior mining, modeling, and analysis have served as means to examine, explore, interpret, assess, and forecast the worldwide perception, readiness, reactions, and response linked to these virus outbreaks. The recent outbreak of the Marburg Virus disease (MVD), the high fatality rate of MVD, and the...
This paper presents several novel findings from a comprehensive analysis of about 50,000 Tweets about online learning during COVID-19, posted on Twitter between 9 November 2021 and 13 July 2022. First, the results of sentiment analysis from VADER, Afinn, and TextBlob show that a higher percentage of these Tweets were positive. The results of gender...
The World Health Organization (WHO) added Disease X to their shortlist of blueprint priority diseases to represent a hypothetical, unknown pathogen that could cause a future epidemic. During different virus outbreaks of the past, such as COVID-19, Influenza, Lyme Disease, and Zika virus, researchers from various disciplines utilized Google Trends t...
Exoskeletons have emerged as a vital technology in the last decade and a half, with diverse use cases in different domains. Even though several works related to the analysis of Tweets about emerging technologies exist, none of those works have focused on the analysis of Tweets about exoskeletons. The work of this paper aims to address this research...
This paper presents multiple novel findings from a comprehensive analysis of a dataset comprising 1,244,051 Tweets about Long COVID, posted on Twitter between 25 May 2020 and 31 January 2023. First, the analysis shows that the average number of Tweets per month wherein individuals self-reported Long COVID on Twitter was considerably high in 2022 as...
In the last decade and a half, the world has experienced outbreaks of a range of viruses such as COVID-19, H1N1, flu, Ebola, Zika virus, Middle East Respiratory Syndrome (MERS), measles, and West Nile virus, just to name a few. During these virus outbreaks, the usage and effectiveness of social media platforms increased significantly, as such platf...
Social media platforms are a type of web-based applications that are built on the conceptual and technical underpinnings of Web 2 [...]
Mining and analysis of the big data of Twitter conversations have been of significant interest to the scientific community in the fields of healthcare, epidemiology, big data, data science, computer science, and their related areas, as can be seen from several works in the last few years that focused on sentiment analysis and other forms of text an...
Falls, considered a serious health-related concern for the elderly people, are associated with multiple diverse and dynamic needs for the elderly people themselves, their caregivers, their family members, and healthcare professionals. The modern-day Internet of Everything lifestyle is characterized by people using the internet for a multitude of re...
The mining of Tweets to develop datasets on recent issues, global challenges, pandemics, virus outbreaks, emerging technologies, and trending matters has been of significant interest to the scientific community in the recent past, as such datasets serve as a rich data resource for the investigation of different research questions. Furthermore, the...
The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and diverse use cases in assisted living, military, healthcare, firefighting, and industry 4.0. The exoskeleton market is projected to increase by multiple times its current value within the next two years. Therefore, it is crucial to study...
The COVID-19 Omicron variant, reported to be the most immune-evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversation...
Falls, highly common in the constantly increasing global aging population, can have a variety of negative effects on their health, well-being, and quality of life, including restricting their capabilities to conduct activities of daily living (ADLs), which are crucial for one’s sustenance. Timely assistance during falls is highly necessary, which i...
This paper presents the findings of an exploratory study on the continuously generating Big Data on Twitter related to the sharing of information, news, views, opinions, ideas, knowledge, feedback, and experiences about the COVID-19 pandemic, with a specific focus on the Omicron variant, which is the globally dominant variant of SARS-CoV-2 at this...
COVID-19, a pandemic that the world has not seen in decades, has resulted in presenting a multitude of unprecedented challenges for student learning across the globe. The global surge in COVID-19 cases resulted in several schools, colleges, and universities closing in 2020 in almost all parts of the world and switching to online or remote learning,...
The rise of the Internet of Everything lifestyle in the last decade has had a significant impact on the increased emergence and adoption of online learning in almost all countries across the world. E-learning 3.0 is expected to become the norm of learning globally in almost all sectors in the next few years. The pervasiveness of the Semantic Web p...
Over the last decade, exoskeletons have had an extensive impact on different disciplines and application domains such as assisted living, military, healthcare, firefighting, and industries, on account of their diverse and dynamic functionalities to augment human abilities, stamina, potential, and performance in a multitude of ways. In view of this...
The United States of America has been the worst affected country in terms of the number of cases and deaths on account of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) or COVID-19, a highly transmissible and pathogenic coronavirus that started spreading globally in late 2019. On account of the surge of infections, accompanied by...
The scientific contribution of this paper is a Big Data-centric study, conducted using Google Trends, that involved analysis of the global, country-level, and state-level search trends related to indoor localization by mining relevant Google Search data from 2015–2020. There are three novel findings of this study. First, the current global search i...
The proposed approach can filter, study, analyze, and interpret written communications from social media platforms for early detection of Cognitive Impairment (CI) to connect individuals with CI with assistive services in their location. It has three novel functionalities. First, it presents a Big Data-centric Data Mining methodology that uses a ho...
The scientific contribution of this paper is a multilayered Human-Human Interaction driven framework that aims to connect the needs of different sectors of the society to provide a long-term, viable, robust, and implementable solution for addressing multiple societal, economic, and humanitarian issues related to the increasing population of the wor...
This paper presents a multifunctional interdisciplinary framework that makes four scientific contributions towards the development of personalized ambient assisted living (AAL), with a specific focus to address the different and dynamic needs of the diverse aging population in the future of smart living environments. First, it presents a probabilis...
The scientific contribution of this work to address multiple global societal and economic challenges associated with the increasing aging population is primarily two-fold. First, it presents and discusses a new computing methodology - Pervasive Activity Logging that involves the creation of an adaptive and semantic collection or ‘log’ of human acti...
This work makes multiple scientific contributions to the field of Indoor Localization for Ambient Assisted Living in Smart Homes. First, it presents a Big-Data driven methodology that studies the multimodal components of user interactions and analyzes the data from Bluetooth Low Energy (BLE) beacons and BLE scanners to detect a user’s indoor locati...
The future of Internet of Things (IoT)-based living spaces would involve interaction, coordination, collaboration and communication between humans, machines, robots and other technology-laden systems in the context of users performing their daily routine tasks. Through this Contextually Intelligent Activity Recognition framework, this work proposes...
This paper outlines the steps for a specification to create a language for defining human behavior in the context of complex activities, with a specific focus on Activities of Daily Living (ADLs). The proposed specification to create a language for defining human behavior is used to develop a framework for representation of (1) macro-level tasks or...
This research work discusses a mathematical foundation based on probability theory and related disciplines for development of a knowledge base that would list the exhaustive ways or approaches, arising from universal diversity, by which any activity can be performed by a given user. The global challenge in this field is to address the needs associa...
The proposed Ubiquitous Activity Aware Framework aims to leverage the immense potential that lies in the application of Activity Centered Computing in an Internet of Things (IoT) environment for the future of technology-based living spaces, for instance, in Smart Homes and Smart Cities. The scientific contribution in this work aims to address both...
This paper introduces a multilayered framework in the field of Human-Human interaction and its related disciplines, that envisions to take a holistic approach to provide a long-term, economic and sustainable solution to mitigate feelings of loneliness and social isolation in elderly people, as well as to address housing needs and caregiver needs fo...
This paper makes two scientific contributions to the field of exoskeleton-based action and movement recognition. First, it presents a novel machine learning and pattern recognition-based framework that can detect a wide range of actions and movements - walking, walking upstairs, walking downstairs, sitting, standing, lying, stand to sit, sit to sta...
One of the distinct features of this century has been the population of older adults which has been on a constant rise. Elderly people have several needs and requirements due to physical disabilities, cognitive issues, weakened memory and disorganized behavior, that they face with increasing age. The extent of these limitations also differs accordi...
Providing environmental and navigational safety to occupants in workplace, public venues, home and other environments is of a critical concern and has multiple applications for various end users. A framework, for monitoring movement, pose and behavior of elderly people for safe navigation in indoor environments is presented in this work. This frame...
The proposed Context Driven Indoor Localization Framework aims to implement a standard for indoor localization to address the multiple needs in different indoor environments with a specific focus to contribute towards Ambient Assisted Living (AAL) in the Future of Smart Homes for healthy aging of the rapidly increasing elderly population. This fram...
One of the distinct features of this century has been the population of older adults which has been on a constant rise. At present, the worldwide population of elderly people is about 962 million and their population is anticipated to rise even further by the year 2100 and reach to 3.1 billion. Elderly people have several needs and requirements due...
The proposed framework at the intersection of Internet of Things (IoT), Big Data, Human Computer Interaction, Artificial Intelligence and their interrelated disciplines aims to predict cramps during Activities of Daily Living (ADL) to contribute towards assisted living and healthy aging of the constantly increasing population of older adults, in th...
The proposed framework at the intersection of Internet of Things (IoT), Big Data, Human Computer Interaction, Assistive Technologies and their interrelated disciplines aims to take a holistic approach towards detecting walking related falls in the constantly increasing elderly population during Activities of Daily Living (ADL). Walking is one of th...
The essence of intelligent assistive technologies in smart homes can be outlined as their ability to enhance the user experience in many ways. A way to accomplish this goal is to make such systems aware and knowledgeable about user interactions in the context of the given environment. In the context of smart homes, the ability of such intelligent s...
The population of elderly people has been increasing at a rapid rate over the last few decades and their population is expected to further increase in the upcoming future. Their increasing population is associated with their increasing needs due to problems like physical disabilities, cognitive issues, weakened memory and disorganized behavior, tha...
The increasing population of elderly people is associated with the need to meet their increasing requirements and to provide solutions that can improve their quality of life in a smart home. In addition to fear and anxiety towards interfacing with systems; cognitive disabilities, weakened memory, disorganized behavior and even physical limitations...
Affect aware systems hold immense potential towards the creation of smart and assistive living spaces to establish paradigms of human-computer interactions through enhancing user experiences. The relevance of affect-aware systems not only lies in analyzing the affective components of user interactions, but it also involves analysis of the component...
Internet of Things (IoT) will provide a data rich world to afford smart systems which, in a context of a smart home, will have to adapt to people according to their needs for work and living. This paper describes the work to establish a framework for building a database to capture all possible user interactions associated with a given activity and...
The application of affect aware systems in a smart home environment has the potential to address challenges in creation of a smart, adaptive, context-aware and assistive living space for elderly people. Affect aware systems that can predict the user experience even before an activity is performed, would provide a scope for effective communications...
Interactive virtual assistants serve as a cost-effective solution to assist elderly people in several ways. They act as means to provide social support, manage loneliness, medium of communication, reminder systems and even instill positive moods in their users. The relevance of advancement in virtual assistant technology is not only confined to adv...
Human users of future smart ubiquitous systems will be immersed in a technology-laden environment; thus, it is important to understand how users interact effectively with these systems. This paper proposes an activity analysis model to study different affective states associated with different user interactions while performing the sentiment analys...
There has reportedly been an increase in the population of retired professionals and the further aged people. Before retirement these people had been pioneers in their field, but post retirement their knowledge, experience and wisdom go under-utilized. While these retired people have fewer chances to contribute to technological development, it becomes...
The usefulness of affect aware systems can be summarized as optimizing the system’s services to improve the user experience. A means to achieve this objective is to make the system aware of the real-world situations and enable intelligent emotion analysis of its users. The ability of smart systems to automatically recognize human affective behavior...
Questions (4)
This multilingual dataset presents 60,127 Instagram posts about Mpox (monkeypox), published between July 23, 2022, and September 5, 2024. This dataset contains Instagram posts about mpox in 52 languages. For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset.
After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were also performed.
This process included classifying each post into
- one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral
- hate or not hate
- anxiety/stress detected or no anxiety/stress detected
These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for sentiment, hate speech, and anxiety or stress detection, as well as for other applications.
The dataset is available at https://zenodo.org/records/13738598, and the paper associated with this dataset is available on arXiv at https://arxiv.org/abs/2409.05292
The following is a description of the attributes present in this dataset
- Post ID: Unique ID of each Instagram post
- Post Description: Complete description of each post in the language in which it was originally published
- Date: Date of publication in MM/DD/YYYY format
- Language: Language of the post as detected using the Google Translate API
- Translated Post Description: Translated version of the post description. All posts which were not in English were translated into English using the Google Translate API. No language translation was performed for English posts.
- Sentiment: Results of sentiment analysis (using translated Post Description) where each post was classified into one of the sentiment classes: fear, surprise, joy, sadness, anger, disgust, and neutral
- Hate: Results of hate speech detection (using translated Post Description) where each post was classified as hate or not hate
- Anxiety or Stress: Results of anxiety or stress detection (using translated Post Description) where each post was classified as stress/anxiety detected or no stress/anxiety detected.
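As a rough illustration of how the attributes listed above might be consumed, the following minimal sketch reads rows with Python's standard library, parses the MM/DD/YYYY date, and tallies the labels in one column. The two sample rows are invented placeholders, and the exact column headers and label spellings in the released file are assumptions based on the attribute descriptions, so they may need adjusting against the actual download.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical two-row sample. Column names mirror the attribute list above;
# the real headers and label spellings in the released file are assumptions.
SAMPLE = """Post ID,Post Description,Date,Language,Translated Post Description,Sentiment,Hate,Anxiety or Stress
p1,Example text,07/23/2022,en,Example text,fear,not hate,anxiety/stress detected
p2,Texto de ejemplo,09/05/2024,es,Example text,neutral,not hate,no anxiety/stress detected
"""


def load_posts(handle):
    """Read rows as dictionaries and parse the MM/DD/YYYY date field."""
    rows = []
    for row in csv.DictReader(handle):
        row["Date"] = datetime.strptime(row["Date"], "%m/%d/%Y").date()
        rows.append(row)
    return rows


def label_counts(rows, column):
    """Count how many posts carry each label in the given column."""
    return Counter(row[column] for row in rows)


posts = load_posts(io.StringIO(SAMPLE))
sentiment_counts = label_counts(posts, "Sentiment")
print(sentiment_counts)
```

To use this with the real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle (for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the download was saved).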
This dataset, available at https://zenodo.org/records/11711230, contains the data of 4011 videos about the ongoing outbreak of measles published on 264 websites on the internet between January 1, 2024, and May 31, 2024. These websites primarily include YouTube and TikTok, which account for 48.6% and 15.2% of the videos, respectively. The remainder of the websites include Instagram and Facebook as well as the websites of various global and local news organizations. For each of these videos, the URL of the video, title of the post, description of the post, and the date of publication of the video are presented as separate attributes in the dataset.
After developing this dataset, sentiment analysis (using VADER), subjectivity analysis (using TextBlob), and fine-grain sentiment analysis (using DistilRoBERTa-base) of the video titles and video descriptions were performed. This included classifying each video title and video description into (i) one of the sentiment classes i.e. positive, negative, or neutral, (ii) one of the subjectivity classes i.e. highly opinionated, neutral opinionated, or least opinionated, and (iii) one of the fine-grain sentiment classes i.e. fear, surprise, joy, sadness, anger, disgust, or neutral. These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for performing sentiment analysis or subjectivity analysis in this field as well as for other applications. The paper associated with this dataset (please see the following citation) also presents a list of open research questions that may be investigated using this dataset.
Please cite the following paper when using this dataset:
N. Thakur, V. Su, M. Shao, K. Patel, H. Jeong, V. Knieling, and A. Bian, “A labelled dataset for sentiment analysis of videos on YouTube, TikTok, and other sources about the 2024 outbreak of measles,” Proceedings of the 26th International Conference on Human-Computer Interaction (HCII 2024), Washington, USA, 29 June - 4 July 2024. (Accepted as a Late Breaking Paper, Preprint Available at: https://doi.org/10.48550/arXiv.2406.07693)
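The three-way sentiment classification described for this dataset (positive, negative, or neutral using VADER) can be sketched as a mapping from VADER's compound score, which ranges from -1 to 1, to a class label. The description above does not state which cut-offs were used; the ±0.05 threshold below is the convention commonly suggested in VADER's own documentation and is an assumption here, as is the function name `vader_label`.

```python
def vader_label(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment class.

    The 0.05 threshold is an assumed, conventional cut-off; the dataset
    description does not say which value was actually applied.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Illustrative scores only; these are not taken from the dataset.
for score in (0.6, 0.0, -0.3):
    print(score, vader_label(score))
```

In practice the compound score would come from a VADER analyzer applied to each video title or description; that step is omitted here to keep the sketch dependency-free.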