
Anupam Joshi- University of Maryland, Baltimore County
Anupam Joshi
- University of Maryland, Baltimore County
About
539
Publications
125,768
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
22,823
Citations
Publications
Publications (539)
The integration of privacy measures, including differential privacy techniques, ensures a provable privacy guarantee for the synthetic data. However, challenges arise for Generative Deep Learning models when tasked with generating realistic data, especially in critical domains such as Cybersecurity and Healthcare. Generative Models optimized for co...
In the realm of IoT/CPS systems connected over mobile networks, traditional intrusion detection methods analyze network traffic across multiple devices using anomaly detection techniques to flag potential security threats. However, these methods face significant privacy challenges, particularly with deep packet inspection and network communication...
Over the years, data anonymization and federated learning have been proposed to address the challenges of data privacy in distributed systems. Unfortunately, data anonymization techniques are not fool proof and can be broken with new information that an adversary may obtain or through weaknesses in the anonymization process. Federated learning tech...
Neurosymbolic artificial intelligence (AI) is an emerging and quickly advancing field that combines the subsymbolic strengths of (deep) neural networks and the explicit, symbolic knowledge contained in knowledge graphs (KGs) to enhance explainability and safety in AI systems. This approach addresses a key criticism of current generation systems, na...
Synthesizing information from collections of tables embedded within scientific and technical documents is increasingly critical to emerging knowledge-driven applications. Given their structural heterogeneity, highly domain-specific content, and diffuse context, inferring a precise semantic understanding of such tables is traditionally better accomp...
Neuro-Symbolic Artificial Intelligence (AI) is an emerging and quickly advancing field that combines the subsymbolic strengths of (deep) neural networks and explicit, symbolic knowledge contained in knowledge graphs to enhance explainability and safety in AI systems. This approach addresses a key criticism of current generation systems, namely thei...
Entity linking is an important step towards constructing knowledge graphs that facilitate advanced question answering over scientific documents, including the retrieval of relevant information included in tables within these documents. This paper introduces a general-purpose system for linking entities to items in the Wikidata knowledge base. It de...
AI models for cybersecurity have to detect and defend against constantly evolving cyber threats. Much effort is spent building defenses for zero days and unseen variants of known cyber-attacks. Current AI models for cybersecurity struggle with these yet unseen threats due to the constantly evolving nature of threat vectors, vulnerabilities, and exp...
Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks and is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG). There is a strong need to develop community-accessible datasets to train existing AI-based cybersecurity pipelines to efficiently a...
The Internet of Battlefield Things (IoBT) will advance the operational effectiveness of infantry units. However, this requires autonomous assets such as sensors, drones, combat equipment, and uncrewed vehicles to collaborate, securely share information, and be resilient to adversary attacks in contested multi-domain operations. CAPD addresses this...
Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks and is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG). There is a strong need to develop community-accessible datasets to train existing AI-based cybersecurity pipelines to efficiently a...
Storytelling, and the delivery of societal narratives, enable human beings to communicate, connect, and understand one another and the world around them. Narratives can be defined as spoken, visual, or written accounts of interconnected events and actors, generally evolving through some notion of time. Today, information is typically conveyed over...
Today there is a significant amount of fake cybersecurity related intelligence on the internet. To filter out such information, we build a system to capture the provenance information and represent it along with the captured Cyber Threat Intelligence (CTI). In the cybersecurity domain, such CTI is stored in Cybersecurity Knowledge Graphs (CKG). We...
Data confidentiality is an issue of increasing importance. Several authorities and regulatory bodies are creating new laws that control how web services data is handled and shared. With the rapid increase of such regulations, web service providers face challenges in complying with these evolving regulations across jurisdictions. Providers must upda...
The detection and removal of misinformation from social media during high impact events, e.g., COVID-19 pandemic, is a sensitive application since the agency in charge of this process must ensure that no unwarranted actions are taken. This suggests that any automated system used for this process must display both high prediction accuracy as well as...
In many social media applications, a small fraction of the members are highly linked while most are sparsely connected to the network. Such a skewed distribution is sometimes referred to as the "long tail". Popular applications like meme trackers and content aggregators mine for information from only the popular blogs located at the head of this cu...
Identifying topics and concepts associated with a set of documents is a task common to many applications. It can help in the annotation and categorization of documents and be used to model a person's current interests for improving search results, business intelligence or selecting appropriate advertisements. One approach is to associate a document...
Analysing complex natural phenomena often requires synthesized data that matches observed characteristics. Graph models are widely used in analyzing the Web in general, but are less suitable for modeling the Blogosphere. While blog networks resemble many properties of Web graphs, the dynamic nature of the Blogosphere, its unique structure and the e...
The entire scientific and academic community has been mobilized to gain a better understanding of the COVID-19 disease and its impact on humanity. Most research related to COVID-19 needs to analyze large amounts of data in very little time. This urgency has made Big Data Analysis, and related questions around the privacy and security of the data, a...
Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these sys...
Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these sys...
Smart farming also known as precision agriculture is gaining more traction for its promising potential to fulfill increasing global food demand and supply. In a smart farm, technologies and connected devices are used in a variety of ways, from finding the real-time status of crops and soil moisture content to deploying drones to assist with tasks s...
Detecting anomalies and attacks in smart cyber-physical systems are of paramount importance owing to their growing prominence in controlling critical systems. However, this is a challenging task due to the heterogeneity and variety of components of a CPS, and the complex relationships between sensed values and potential attacks or anomalies. Such c...
Cyber-Physical Systems (CPS) and Internet of Thing (IoT) generate large amounts of data spurring the rise of Artificial Intelligence (AI) based smart applications. Driven by rapid advancements in technologies that support smart devices, agriculture and farming sector is shifting towards IoT connected ecosystem to balance the increase in demand for...
Cyber-Physical Systems (CPS) and Internet of Thing (IoT) generate large amounts of data spurring the rise of Artificial Intelligence (AI) based smart applications. Driven by rapid advancements in technologies that support smart devices, agriculture and farming sector is shifting towards IoT connected ecosystem to balance the increase in demand for...
Social media has become an important communication channel during high impact events, such as the COVID-19 pandemic. As misinformation in social media can rapidly spread, creating social unrest, curtailing the spread of misinformation during such events is a significant data challenge. While recent solutions that are based on machine learning have...
This paper, based on data from the first nationwide survey of cybersecurity among local or grassroots governments in the United States, examines how these governments manage this important function. As we have shown elsewhere, cybersecurity among local governments is increasingly important because these governments are under constant or nearly cons...
With the recent developments in artificial intelligence and machine learning, anomalies in network traffic can be detected using machine learning approaches. Before the rise of machine learning, network anomalies which could imply an attack, were detected using well-crafted rules. An attacker who has knowledge in the field of cyber-defence could ma...
After Action Reports (AARs) provide incisive analysis of cyber-incidents. Extracting cyber-knowledge from these sources would provide security analysts with credible information, which they can use to detect or find patterns indicative of a cyber-attack. In this paper, we describe a system to extract information from AARs, aggregate the extracted i...
Named Entity Recognition (NER) has been studied for many languages like English, German, Spanish, and others but virtually no studies have focused on the Nepali language. One key reason is the lack of an appropriate, annotated dataset. In this paper, we describe a Nepali NER dataset that we created. We discuss and compare the performance of various...
Reproducibility of computations and data prove-nance are very important goals to achieve in order to improve the quality of one's research. Unfortunately, despite some efforts made in the past, it is still very hard to reproduce computational experiments with high degree of certainty. The Big Data phenomenon in recent years makes this goal even har...
In the modern age, the Internet of Things forms the basis of all the infrastructures for improving efficiency, reliability, and comfort. As applied to the power system, increasing the penetration of information and communication technologies at the device and process level is enabling devices to communicate with each other, thereby, facilitating wi...
Named Entity Recognition have been studied for different languages like English, German, Spanish and many others but no study have focused on Nepali language. In this paper we propose a neural based Nepali NER using latest state-of-the-art architecture based on grapheme-level which doesn't require any hand-crafted features and no data pre-processin...
We present an observational study on the relationship between demographic factors and phishing susceptibility at the University of Maryland, Baltimore County (UMBC). In spring 2018, we delivered phishing attacks to 450 randomly selected students on three different days (1,350 students total) to examine user click rates and demographics among UMBC’s...
Background
Based on the idea of cooperative communication, recently a lot of attention has been drawn to cooperative spectrum access for the secure information transmission in a cognitive radio network (CRN). Security is one of the most important aspects of these networks, as due to their open and dynamic nature, they are extremely vulnerable to ma...
The intercept probability performance of decode-and-forward (DF) underlay cognitive radio threshold-based network is investigated in this paper. Here, a secondary source is transferring data to secondary destination via secondary relay cooperation. There are interference limitations on secondary user of cognitive system from the primary licensed us...
Security Analysts that work in a `Security Operations Center' (SoC) play a major role in ensuring the security of the organization. The amount of background knowledge they have about the evolving and new attacks makes a significant difference in their ability to detect attacks. Open source threat intelligence sources, like text descriptions about c...
Keeping up with threat intelligence is a must for a security analyst today. There is a volume of information present in `the wild' that affects an organization. We need to develop an artificial intelligence system that scours the intelligence sources, to keep the analyst updated about various threats that pose a risk to her organization. A security...
This article examines data from the first‐ever nationwide survey of cybersecurity among American local governments. The data show that these governments are under constant or near‐constant cyberattack, yet, on average, they practice cybersecurity poorly. While nearly half reported experiencing cyberattacks at least daily, one‐third said that they d...
To detect energy theft attacks in Advanced Metering Infrastructure (AMI), we propose a detection method based on
principal component analysis (PCA) approximation. PCA approximation is introduced by dimensionality reduction of high dimensional
AMI data and we extract the underlying consumption trends of a consumer that repeat on a daily or weekly ba...
Contemporary smartphones are capable of generating and transmitting large amounts of data about their users. Recent advances in collaborative context modeling combined with a lack of adequate permission model for handling dynamic context sharing on mobile platforms have led to the emergence of a new class of mobile applications that can access and...
We present an observational study on the relationship between demographic factors and phishing susceptibility at the University of Maryland, Baltimore County (UMBC). In spring 2018, we delivered phishing attacks to 450 randomly-selected students on three different days (1,350 students total) to examine user click rates and demographics among UMBC's...
In this paper, we examine cybersecurity challenges faced by America’s local, governments, including: the extent of cyberattacks; problems faced in preventing attacks from being successful; barriers to providing high levels of cybersecurity management; and actions that local governments believe should be taken to improve cybersecurity practice. Our...
In this study, the authors propose a scheme based on Stackelberg game for price and power control in threshold-based relay network, where the source transmits message to destination with the cooperation of a relay, in the presence of an eavesdropper. The relay gets revenue for transmitting the source information and the source profits from the coop...
Open-Source Projects and Libraries are being used in software development while also bearing multiple security vulnerabilities. This use of third party ecosystem creates a new kind of attack surface for a product in development. An intelligent attacker can attack a product by exploiting one of the vulnerabilities present in linked projects and libr...
In the spectrum sharing mode, the transmitting power of the secondary user (SU) is optimally controlled, such that no additional interference occurs at the primary user (PU). In this paper, the secrecy outage performance is analyzed for such cognitive underlay decode-and-forward (DF) threshold-based relay network. The relayed and the direct signals...
The early detection of cybersecurity events such as attacks is challenging given the constantly evolving threat landscape. Even with advanced monitoring, sophisticated attackers can spend as many as 146 days in a system before being detected. This paper describes a novel, cognitive framework that assists a security analyst by exploiting the power o...
As AI systems become more ubiquitous, securing them becomes an emerging challenge. Over the years, with the surge in online social media use and the data available for analysis, AI systems have been built to extract, represent and use this information. The credibility of this information extracted from open sources, however, can often be questionab...
The multilingual nature of the Internet increases complications in the cybersecurity community's ongoing efforts to strategically mine threat intelligence from OSINT data on the web. OSINT sources such as social media, blogs, and dark web vulnerability markets exist in diverse languages and hinder security analysts, who are unable to draw conclusio...
In this paper, we investigate the secrecy outage performance of a dual-hop decode-and-forward (DF) threshold-based cooperative relay network, both with and without the direct links between source-eavesdropper and source-destination. Without assuming that all the relays can always perfectly decode, here we consider that only those relays who satisfy...
To minimize energy theft attacks in Advanced Metering Infrastructure (AMI), we propose statistical distance based theft detection method. In the proposed method, different statistical distance indices (Jensen-Shannon distance, Hellinger distance, and Cumulative Distribution Function based distance) are computed using historical measurement variatio...
Contemporary smartphones are capable of generating and transmitting large amounts of data about their users. Recent advances in collaborative context modeling combined with a lack of adequate permission model for handling dynamic context sharing on mobile platforms have led to the emergence of a new class of mobile applications that can access and...