About
189
Publications
23,713
Reads
1,862
Citations
Introduction
Data science, data analytics, machine learning, software engineering, computer science education
Additional affiliations
September 2002 - present
Publications (189)
Driving risk prediction has been a topic of much research over the past few decades to minimize driving risk and increase safety. The use of demographic information in risk prediction is a traditional solution with applications in insurance planning; however, it is difficult to capture true driving behavior via such coarse-grained factors. Therefor...
Road construction projects maintain transportation infrastructures. These projects range from the short-term (e.g., resurfacing or fixing potholes) to the long-term (e.g., adding a shoulder or building a bridge). Deciding what the next construction project is and when it is to be scheduled is traditionally done through inspection by humans using sp...
Adaptive optics imaging has enabled the enhanced in vivo retinal visualization of individual cone and rod photoreceptors. Effective analysis of such high-resolution, feature rich images requires automated, robust algorithms. This paper describes RC-UPerNet, a novel deep learning algorithm, for identifying both types of photoreceptors, and was evalu...
Edge computing is a growing paradigm where compute resources are provisioned between data sources and the cloud to decrease compute latency from data transfer, lower costs, comply with security policies, and more. Edge systems are as varied as their applications, serving internet services, IoT, and emerging technologies. Due to the tight constraint...
Prediction of indoor airborne pollutant concentrations can enable a smart indoor air quality control strategy that potentially reduces building energy use and improves occupant outcomes. In service of this overarching goal, this work pursues four objectives: 1) Determine which low-cost airborne pollutant sensors are useful for prediction of indoor...
Web-based interactions can be frequently represented by an attributed graph, and node clustering in such graphs has received much attention lately. Multiple efforts have successfully applied Graph Convolutional Networks (GCN), though with some limits on accuracy as GCNs have been shown to suffer from over-smoothing issues. Though other methods (par...
In recent years, there has been much interest in Graph Convolutional Networks (GCNs). There are several challenges associated with training GCNs. In particular, because of the massive scale of graphs, there is not only a large computation time but also a need to partition and load data multiple times. This paper presents a different...
Research papers on the perception, practice, and teaching-learning of free educational resources in mathematics education are scarce. This work presents a brief overview of Open Educational Resources (OER) with respect to their advantages, disadvantages, uses, areas of implementation, effect on the education system...
The current study was conducted to determine the degree of efficacy of research scholars; a significant difference was found with respect to type of university and experience. The sample included 525 research scholars conducting research in state universities of Tamil Nadu. The tool used for the present study was the Research Scholar's Effectiveness Scale (RES)....
Identifying driving styles is the task of analyzing the behavior of drivers in order to capture variations that will serve to discriminate different drivers from each other. This task has become a prerequisite for a variety of applications, including usage-based insurance, driver coaching, driver action prediction, and even in designing autonomous...
This paper presents techniques to detect the “offline” activity (such as dining, shopping, or entertainment) a person is engaged in when she is tweeting, in order to create a dynamic profile of the user, for uses such as better targeting of advertisements. To this end, we present a hybrid gated recurrent neural network (GRNN)-based model for rich...
Novel contexts may often arise in complex querying scenarios such as in evidence-based medicine (EBM) involving biomedical literature, that may not explicitly refer to entities or canonical concept forms occurring in any fact- or rule-based knowledge source such as an ontology like the UMLS. Moreover, hidden associations between candidate concepts...
Reducing traffic accidents is an important public safety challenge; therefore, accident analysis and prediction has been a topic of much research over the past few decades. Using small-scale datasets with limited coverage, depending on an extensive set of data, and not being applicable for real-time purposes are the important shortcomings of the...
Businesses communicate using Twitter for a variety of reasons -- to raise awareness of their brands, to market new products, to respond to community comments, and to connect with their customers and potential customers in a targeted manner. For businesses to do this effectively, they need to understand which content and structural elements about a...
Online customer reviews on large-scale e-commerce websites represent a rich and varied source of opinion data, often providing subjective qualitative assessments of product usage that can help potential customers to discover features that meet their personal needs and preferences. Thus, they have the potential to automatically answer specific queri...
Pattern discovery in geo-spatiotemporal data (such as traffic and weather data) is about finding patterns of collocation, co-occurrence, cascading, or cause and effect between geospatial entities. Using simplistic definitions of spatiotemporal neighborhood (a common characteristic of the existing general-purpose frameworks) is not semantically repr...
This paper presents techniques to detect the "offline" activity a person is engaged in when she is tweeting (such as dining, shopping or entertainment), in order to create a dynamic profile of the user, for uses such as better targeting of advertisements. To this end, we propose a hybrid LSTM model for rich contextual learning, along with studies o...
Reducing traffic accidents is an important public safety challenge. However, the majority of studies on traffic accident analysis and prediction have used small-scale datasets with limited coverage, which limits their impact and applicability; and existing large-scale datasets are either private, old, or do not include important contextual informat...
Dashboard camera installations are becoming increasingly common due to various Advanced Driver Assistance Systems (ADAS) based services provided by them. Though deployed primarily for crash recordings, calibrating these cameras can allow them to measure real-world distances, which can enable a broad spectrum of ADAS applications such as lane-detect...
Methods that are both computationally feasible and practically effective are needed to make sense of big corpuses of content, or “big content.” For example, supervised categorization techniques for open-access academic publishing are ill-suited for automated categorization because they rely on an existing categorization scheme, but no supervised sc...
In fact-based information retrieval, state-of-the-art performance is traditionally achieved by knowledge graphs driven by knowledge bases, as they can represent facts about and capture relationships between entities very well. However, in domains such as medical information retrieval, where addressing specific information needs of complex queries ma...
Telematics data is becoming increasingly available due to the ubiquity of devices that collect data during drives, for different purposes, such as usage based insurance (UBI), fleet management, navigation of connected vehicles, etc. Consequently, a variety of data-analytic applications have become feasible that extract valuable insights from the da...
In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with the suitable expertise. This problem domain has be...
In 2008, the National Science Foundation (NSF) released the report “Fostering Learning in the Networked World: The Cyberlearning Opportunity and Challenge”. NSF argued in this report that the heavy investment and focus on Cyberinfrastructures must be complemented by a parallel investment in Cyberlearning, “…learning that is mediated by networked co...
The ubiquity and variety of available sensors has enabled the collection of voluminous datasets of car trajectories that enable analysts to make sense of driving patterns and behaviors. One approach to obtain driving behaviors is to break a trajectory into its underlying patterns and then analyze these patterns (aka segmentation). To validate and i...
Because of the increasing availability of spatiotemporal data, a variety of data-analytic applications have become possible. Characterizing driving context, where context may be thought of as a combination of location and time, is a new challenging application. An example of such a characterization is finding the correlation between driving behavio...
Nowadays, the ubiquity of various sensors enables the collection of voluminous datasets of car trajectories. Such datasets enable analysts to make sense of driving patterns and behaviors: in order to understand the behavior of drivers, one approach is to break a trajectory into its underlying patterns and then analyze that trajectory in terms of de...
This study investigated the effects of a multicomponent, supplemental intervention on the reading fluency of second-grade African-American urban students who showed reading and special education risk. The packaged intervention combined repeated readings and culturally relevant stories, delivered through a novel computer software program to enhance...
Cloud-based services today depend on many layers of virtual technology and application services. Incidents and problems that arise in such complex operational environments are logged as a ticket, worked on by experts and finally resolved. To assist these experts, any machine recommendation method must meet the following critical business requiremen...
In this paper, we demonstrate a novel system, Soft-Swipe, which can enable highly accurate pairing of vehicles to respective lanes in a wide range of vehicle-based multi-lane service stations using economical general-purpose commodity communication and sensing technology. To study the system, we consider an example application of pairing vehicles to...
Smart-devices can render high quality location services when endowed with the ability to analyze information conveyed through video feed. In this paper, we aim to provide tracking services by using a mobile smart camera such as in google glasses and smartphones considering the following three objectives: (1) No additional deployment, (2) No user-si...
The number of freshmen interested in entrepreneurship has grown dramatically in the last few years. In response, many universities have created entrepreneurship programs, including ones focused on engineering entrepreneurship. In this paper, we report on NEWPATH, an innovative NSF-supported program at Ohio State, designed to nurture students to bec...
A new recommendation framework that addresses the correct and quick resolution of incidents that occur within the complex systems of an enterprise is introduced here. It uses statistical learning to mediate problem solving by large-scale Resolution Service Networks (with nodes as technical expert groups) that collectively resolve the incidents logg...
This paper is a contribution to the Computational Science & Engineering Software Sustainability and Productivity Challenges (CSESSP Challenges) Workshop (https://www.nitrd.gov/csessp/), sponsored by the Networking and Information Technology Research and Development (NITRD) Software Design and Productivity (SDP) Coordinating Group, held October 15th...
With the onset of social media and news aggregators on the Web, the newspaper industry is faced with a declining subscriber base. In order to retain customers both on-line and in print, it is therefore critical to predict and mitigate customer churn. Newspapers typically have heterogeneous sources of valuable data: circulation data, customer subscr...
Collaborative learning is a key component of software engineering (SE) courses in most undergraduate computing curricula. Thus these courses include fairly intensive team projects, the intent being to ensure that not only do students develop an understanding of key software engineering concepts and practices, but also develop the skills needed to w...
The ratings and rationales that primary-age urban learners gave culturally relevant reading passages were the focus of this descriptive study. First- and second-grade students each read 30 researcher-developed passages reflecting the students’ immediate and historical backgrounds. The students rated the passages and gave a reason for their ratings. A des...
The use of culturally relevant material for urban students with special education/reading risk is frequently promoted in the literature; however, the empirical evidence appears limited. This study included eight African American urban second-grade students who scored within the at-risk range on the DIBELS Oral Reading Fluency measure. The students...
Conflict and cooperation would seem to be ideas that are diametrically opposed to each other. But, in fact, classic work by Piaget on how children and adults learn shows that when learners engage with peers in critical discussion of ideas concerning which they have different understandings, that contributes very effectively to learners developing d...
Mobile information technologies can unshackle students from desks and classrooms and allow them to learn on the go. They can explore and consume information, record their learning, and collaborate with mentors and with each other at any time and in any place. Further, because the mobile device knows user location and identity, learning can be locat...
One of the significant challenges for Cloud Service Providers (CSPs) hosting “virtual desktop cloud” (VDC) infrastructures is to deliver a satisfactory quality of experience (QoE) to the user. In order to maximize the user QoE without expensive resource overprovisioning, there is a need to design and verify resource allocation schemes for a compreh...
We present our work with GeoGames that are played on top of online geographic maps, using the real world as the game world. The developed technology represents an innovative potential for geographic inquiry-based, learning through play, which through the internet can reach a massive audience. The described "Green Revolution" game is meant to teach...
Complex information technology (IT) installations within an enterprise raise contextual challenges when delivering IT service management (ITSM) that maintain and enhance software for the business. Enterprise Architecture (EA) tools are now available to represent context as business processes, services and enabling IT components. However, few method...
For purposes such as end-to-end monitoring, capacity planning, and performance bottleneck troubleshooting across multi-domain networks, there is an increasing trend to deploy interoperable measurement frameworks such as perfSONAR. These deployments expose vast data archives of current and historic measurements, which can be queried using web servic...
Information searches based on expert-seeking technology can prove time-consuming or unsuccessful if search terms do not turn up extrinsic identifiers in profiles and saved documents. In many such cases, knowledge brokers function as "humans in the loop," providing intrinsic enterprise knowledge to mediate between information seekers and expert sour...
Subsampling workloads compute statistics from a set of observed samples using a random subset of sample data (i.e., a subsample). Data-parallel platforms group these samples into tasks, and each task subsamples its data in parallel. In this paper, we study subsampling workloads that benefit from tiny tasks, i.e., tasks comprising few samples. Tiny tasks...
Large manufacturers increasingly leverage modelling and simulation to improve quality and reduce cost. Small manufacturers have not adopted these techniques due to sizable upfront costs for expertise, software and hardware. The software as a service (SaaS) model provides access to applications hosted in a cloud environment, allowing users to try se...