About
18
Publications
4,866
Reads
391
Citations
Publications (18)
We present a set of fault injection experiments performed on the ACES (LANL/SNL) Cray XE supercomputer Cielo. We use this experimental campaign to improve the understanding of failure causes and propagation that we observed in the field failure data analysis of NCSA's Blue Waters. We use the data collected from the logs and from network performance...
This work presents a software and hardware framework for a telerobotic surgery safety and motor skill training simulator. The aim is to provide trainees with a comprehensive simulator for acquiring the essential skills to perform telerobotic surgery. Existing commercial robotic surgery simulators lack features for safety training and optimal motion plan...
We present a set of fault injection experiments performed on the ACES (LANL/SNL) Cray XE supercomputer Cielo. We use this experimental campaign to improve the understanding of failure causes and propagation that we observed in the field failure data analysis of NCSA’s Blue Waters. We use the data collected from the logs and from network performan...
The proliferation of high-throughput sequencing machines allows for the rapid generation of billions of short nucleotide fragments in a short period. This massive amount of sequence data can quickly overwhelm today's storage and compute infrastructure. This poster explores the use of hardware acceleration to significantly improve the runtime of sho...
Today's cyber-physical systems (CPSs) can have very different characteristics in terms of control algorithms, configurations, underlying infrastructure, communication protocols, and real-time requirements. Despite these variations, they all face the threat of malicious attacks that exploit the vulnerabilities in the cyber domain as footholds to int...
This paper presents a simulation framework for recreating the realistic safety hazard scenarios commonly observed in robotic surgical systems, which can be used to prepare surgical trainees for handling safety-critical events during procedures. The proposed simulation platform is composed of a surgical simulator based on an open-source surgical rob...
Robotic telesurgical systems are among the most complex medical cyber-physical systems on the market, and have been used in over 1.75 million procedures during the last decade. Despite significant improvements in the design of robotic surgical systems through the years, there have been ongoing occurrences of safety incidents during procedures that neg...
With the advent of modern technologies, microprocessor-based devices are used to monitor and control critical infrastructures, e.g., electric power grids and oil and gas distribution. However, the security and reliability of these microprocessor-based systems are significant issues, since they are more susceptible to transient errors and malicious att...
This paper proposes a novel technique for preventing a wide range of data errors from corrupting the execution of applications. The proposed technique enables automated derivation of fine-grained, application-specific error detectors based on dynamic traces of application execution. The technique derives a set of error detectors using rule-based te...
We present CloudVal, a framework to validate the reliability of the virtualization environment in cloud computing infrastructure. A case study, based on injecting faults in the KVM hypervisor and Xen hypervisor, was conducted to show the viability of the framework. The study shows that due to the architectural differences between KVM and Xen, a direct...
This paper presents an approach to conducting experimental studies for the characterization and comparison of the error behavior in different computing systems. The proposed approach is applied to characterize and compare the error behavior of three commercial systems (Linux 2.6 on Pentium 4, Solaris 10 on UltraSPARC IIIi, and AIX 5.3 on POWER 5) u...
When an operating system crashes and hangs, it leaves the machine in an unusable state. All currently running program state and data are lost. The usual solution is to reboot the machine and restart user programs. However, it is possible that after a crash, user program state and most operating system state are still in memory and, hopefully, not corr...
This paper proposes a novel technique for preventing a wide range of data errors from corrupting the execution of applications. The proposed technique enables automated derivation of fine-grained, application-specific error detectors. An algorithm based on dynamic traces of application execution is developed for extracting the set of error detector...
This paper proposes a novel technique for automated derivation of fine-grained, application-specific error detectors. An algorithm based on dynamic traces of application execution is developed for extracting the optimal set of error detectors for a target application. An automatic framework is proposed for synthesizing the derived detectors in har...