About
234 Publications
11,574 Reads
1,975 Citations (since 2017)
Publications (234)
The field of computer architecture uses quantitative methods to drive the computer system design process. By quantitatively profiling the run-time characteristics of computer programs, the principal processing needs of commonly used programs become well understood, and computer architects can focus their design solutions toward those needs. The DESM...
Forward Time Population Genetic Simulations offer a flexible framework for modeling the various evolutionary processes occurring in nature. Often this model expressibility is countered by increased memory usage or computational overhead. With the complexity of simulation scenarios continuing to increase, addressing the scalability of the underly...
Transactional memory is a concurrency control mechanism that dynamically determines when threads may safely execute critical sections of code. It provides the performance of fine-grained locking mechanisms with the simplicity of coarse-grained locking mechanisms. With hardware-based transactions, the protection of shared data accesses and updates c...
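As an illustrative aside, the optimistic execute-validate-commit cycle that transactional memory is built on can be sketched in a few lines of software. This toy software transaction is our own construction for illustration (the abstract concerns hardware transactions); names such as `TVar` and `atomically` are hypothetical, and subtleties like opacity and contention management are ignored.

```python
import threading

# Toy software-transactional-memory sketch: transactions buffer reads and
# writes, then validate and commit atomically under a single global lock.
_commit_lock = threading.Lock()

class TVar:
    """A transactional variable carrying a version counter."""
    def __init__(self, value):
        self.value = value
        self.version = 0

def atomically(txn):
    """Run txn(read, write) optimistically; retry on conflict."""
    while True:
        reads, writes = {}, {}
        def read(tv):
            if tv in writes:              # read-your-own-writes
                return writes[tv]
            reads.setdefault(tv, tv.version)
            return tv.value
        def write(tv, val):
            writes[tv] = val              # buffered until commit
        result = txn(read, write)
        with _commit_lock:
            # Validate: nothing we read was modified since we read it.
            if all(tv.version == v for tv, v in reads.items()):
                for tv, val in writes.items():
                    tv.value = val
                    tv.version += 1
                return result
        # Conflict detected: discard buffers and re-execute the transaction.

acct = TVar(100)
def deposit(read, write):
    write(acct, read(acct) + 50)
atomically(deposit)
print(acct.value)  # -> 150
```

The conflict-detection-and-retry loop is the point: like the hardware transactions the abstract describes, no fine-grained locks are taken during the critical section itself.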
A considerable amount of research on parallel discrete event simulation has been conducted over the past few decades. However, most of this research has targeted the parallel simulation infrastructure, focusing on data structures, algorithms, and synchronization methods for parallel simulation kernels. Unfortunately, distributed environments often...
For safety-critical systems, hardware is often preferred over software because it is easier to achieve safety goals in hardware alone and because hardware is considered more reliable than software. But as systems become more complex, software solutions will also be important. Here we demonstrate, using a simple example, that formal methods are a us...
The rapid growth in the parallelism of multi-core processors has opened up new opportunities and challenges for parallel discrete event simulation (PDES). PDES simulators attempt to find parallelism within the pending event set to achieve speedup. Typically, the pending event set is sorted to preserve the causal order of the contained ev...
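The sorted pending event set mentioned above is, in its simplest form, a priority queue keyed on event timestamps. A minimal sketch (our own construction; `PendingEventSet` and its methods are hypothetical names, not an API from the paper):

```python
import heapq
import itertools

# Sketch of a pending event set: events are dequeued in timestamp order,
# so causally earlier events are always executed first.
class PendingEventSet:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, timestamp, event):
        heapq.heappush(self._heap, (timestamp, next(self._seq), event))

    def next_event(self):
        timestamp, _, event = heapq.heappop(self._heap)
        return timestamp, event

pes = PendingEventSet()
pes.schedule(12.0, "arrival")
pes.schedule(3.5, "departure")
pes.schedule(3.5, "timer")
print(pes.next_event())  # -> (3.5, 'departure')
```

A PDES kernel extracts parallelism by executing several of the earliest events from such a structure concurrently, which is why its insertion/removal cost matters so much.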
In recent years genetic data analysis has seen a rapid increase in the scale of data to be analyzed. Schadt et al. [1] noted that with data sets approaching the petabyte scale, data-related challenges such as formatting, management, and transfer are increasingly important topics that need to be addressed. The use of succinct data structures is on...
Clinical assessment systems deployed in rehabilitation settings are likely to consist of several software and hardware components that subjects directly interact with. While it is extremely important that all facets of the user interface are tested for critical ergonomic issues, it may be useful to identify those issues during the design and developmen...
This paper describes the design of a custom robotic system that will be used for clinical evaluation of upper-limb motor performance. The system consists of a mechatronic component that is central to the capture of motor inputs, and a software component that constitutes various user interface applications. The focus of this paper is to elaborate the...
This paper presents prototypes of a hardware interface that is directed towards possible integration with a Point-of-Care Testing Environment for Neurological Assessment (POCTENA). While the complete system is intended to assist with diagnosis of mild Traumatic Brain Injury (TBI), the focus of this paper is to present designs of necessary hardware...
Multi-core and many-core processing chips are becoming widespread and are now being widely integrated into Beowulf clusters. This poses a challenging problem for distributed simulation as it now becomes necessary to extend the algorithms to operate on a platform that includes both shared memory and distributed memory hardware. Furthermore, as the n...
Some emerging high performance many-core chips have support to enable software control of an individual core's operating frequency (and voltage). These controls can potentially be used to optimize execution for either performance (accelerating the critical path) or power savings (green computing). In Time Warp parallel simulators using the Virtual...
Time Warp synchronized parallel discrete event simulators are organized to operate asynchronously and aggressively without explicit synchronization between the concurrently executing simulation objects. In place of an explicit synchronization mechanism, the concurrent simulators implement an independent but common virtual clock model and a rollback...
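The rollback/recovery idea described above can be caricatured in a short sketch. This is our own toy model, not the simulator from the paper: it checkpoints state before each event and, when a straggler (an event in the logical process's past) arrives, restores the last checkpoint at or before the straggler's timestamp. Anti-messages and re-execution of rolled-back events are omitted for brevity.

```python
# Toy Time Warp logical process: optimistic execution with state saving
# and rollback on receipt of a straggler event.
class LogicalProcess:
    def __init__(self):
        self.lvt = 0.0           # local virtual time
        self.state = 0
        self.saved = [(0.0, 0)]  # (timestamp, state) checkpoints

    def execute(self, timestamp, delta):
        if timestamp < self.lvt:       # straggler: causality violated
            self.rollback(timestamp)
        self.saved.append((self.lvt, self.state))
        self.state += delta
        self.lvt = timestamp

    def rollback(self, timestamp):
        # Restore the most recent checkpoint not later than the straggler.
        # (A real kernel would also send anti-messages and re-execute the
        # rolled-back events; both are omitted here.)
        while self.saved and self.saved[-1][0] > timestamp:
            self.saved.pop()
        self.lvt, self.state = self.saved[-1]

lp = LogicalProcess()
lp.execute(10.0, 1)
lp.execute(20.0, 2)
lp.execute(15.0, 5)   # straggler: undoes the t=20 event, then executes
print(lp.lvt, lp.state)
```

The key property the sketch shows is that no explicit synchronization is needed between processes: causality is repaired after the fact rather than prevented in advance.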
In this paper, we present a prototype design of POCTENA (Point-Of-Care Testing Environment for Neurological Assessment), a medical computing system that will be used to assist with diagnosis of mild traumatic brain injury. The design includes an initial set of neurological tests that are built into the system. Component-based usability testing was...
Time Warp synchronized parallel discrete event simulators are organized to operate asynchronously and aggressively without explicit synchronization between the concurrently executing simulators. In place of an explicit synchronization mechanism, the concurrent simulators maintain a common virtual clock model and implement a rollback/recovery mechan...
When a scheme such as FDTD is used to perform full-wave electromagnetic simulation of electrically large objects, it can result in excessive computation times. Parallel processing is one means to reduce the computation time. A shortcoming of the parallel approach is the communication overhead that is responsible for reduced speedup efficiency but i...
In recent years, the number of hardware supported threads in desktop processors has increased dramatically. All but the very lowest-cost netbooks and embedded processors now have at least dual cores, and soon systems supporting upwards of 8 to 16 hardware threads are likely to be commonplace. Unfortunately, it will be difficult to take full advanta...
In North America, an estimated 30,000 patients annually experience an aneurysmal subarachnoid hemorrhage (SAH). In approximately five percent of these patients, the hemorrhage is not visible on computerized tomography scans due to the inability to image blood at time intervals greater than 12 h post symptom onset. For these patients (many of whom...
In North America, an estimated 30,000 patients annually experience an aneurysmal subarachnoid hemorrhage (SAH). In approximately five percent of these patients, the hemorrhage is not visible on computerized tomography scans due to the inability to image blood at time intervals greater than 12 hours post symptom onset. For these patients (many of wh...
Packing (or executable compression) is considered one of the most effective anti-reverse-engineering methods in the Microsoft Windows environment. Even though many reversing attacks are widely conducted on Linux-based embedded systems, there are no widely used secure binary code packing tools for Linux. This paper presents two secure packing me...
Understanding regional as well as global spinal alignment is increasingly recognized as important for the spine surgeon. A novel software program for virtual preoperative measurement and surgical manipulation of sagittal spinal alignment was developed to provide a research and educational tool for spine surgeons. This first-generation software prog...
Current trends in desktop processor design have been toward many-core solutions with increased parallelism. As the number of supported threads grows in these processors, it may prove difficult to exploit them on the commodity desktop. This paper presents a study that explores the spawning of the dynamic memory management activities into a separatel...
3D virtual worlds provide rich environments for collaborative work and social networking. The design of these spaces is largely confined by the classical time and space properties that we find in our own physical world. In some cases, some non-physical capabilities like teleportation are provided, but in general, most of the limits that we experien...
Described are a self-protecting storage device and method that can be used to monitor attempts to access protected information. Access is allowed for authorized host systems and devices while unauthorized access is prevented. Authorization involves inserting a watermark into access commands, such as I/O requests, sent to the storage device. The...
Time Warp distributed simulations can enter a catastrophic state of cascading rollbacks where out of order (premature) event executions propagate faster than the corrective measures (anti-messages) designed to terminate them. In this paper, a distributed, proactive cancellation mechanism designed to avoid this situation is presented. We also presen...
This paper presents an enhanced process model in an optimistic distributed simulation to develop a method for tracking causality and safe-time and its application in the design of a local fossil identification technique. The fossil identification process is reduced to a local operation that operates without requiring a GVT estimation method. The in...
Multi-resolution models are often used to accelerate simulation-based analysis without significantly impacting the fidelity of the simulations. We have developed a web-enabled, component-based, multi-resolution modeling and Time Warp synchronized parallel simulation environment called WESE (Web-Enabled Simulation Environment). WESE uses a methodolo...
In this article, we report the characterization results of two data integrated video sensors designed by Clifton Labs, Inc. A data integrated video sensor consists of an array of photodetectors that each provide both an analog (video) and digital (data) output based on the amount of incident light on the detector. Video capture occurs using a simpl...
The modeling and analysis of a USB storage device with a novel protection mechanism is described. The USB storage device contains an active monitoring subsystem, developed by Clifton Labs, that decodes and analyzes an encoded request stream. However, before moving to fabrication, there are several design parameters that need to be explored a...
Computing time in finite-difference time-domain can be saved by expressing a portion of the grid (macromodel) as a linear time-invariant (LTI) system. The output then becomes a convolution of the input with the LTI impulse response. To achieve a constant time for the convolution, the macromodel's response is expressed as a superposition of eigenmo...
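The constant-time trick alluded to above can be illustrated with toy numbers (the eigenvalues and weights below are invented for illustration, not taken from the paper). If the impulse response is a superposition of eigenmodes, h[n] = sum_k c_k * lam_k**n, then each mode obeys the one-term recursion s_k[n] = lam_k * s_k[n-1] + x[n], and the output y[n] = sum_k c_k * s_k[n] costs O(K) per time step instead of the O(n) direct convolution.

```python
# Compare direct convolution with the per-mode recursive update.
lams = [0.5, -0.3]          # assumed eigenvalues (decaying modes)
coeffs = [1.0, 2.0]         # assumed modal weights
x = [1.0, 0.0, 2.0, -1.0]   # input samples

# Direct convolution: y[n] = sum_{m<=n} h[n-m] * x[m]  (O(n) per step).
h = [sum(c * lam**n for c, lam in zip(coeffs, lams)) for n in range(len(x))]
y_direct = [sum(h[n - m] * x[m] for m in range(n + 1)) for n in range(len(x))]

# Eigenmode recursion: O(K) per step, K = number of modes.
s = [0.0] * len(lams)
y_modal = []
for xn in x:
    s = [lam * sk + xn for lam, sk in zip(lams, s)]     # mode states
    y_modal.append(sum(c * sk for c, sk in zip(coeffs, s)))

print(all(abs(a - b) < 1e-12 for a, b in zip(y_direct, y_modal)))  # -> True
```

The two computations agree because sum_k c_k * s_k[n] expands to exactly the convolution sum; only the cost per step differs.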
This paper presents a new scheduling mechanism to choose and process the next input event from the event-set of simulation objects in a logical process (LP). Events are prioritized based on a simulation object's working set. The working set of a simulation object is analogous to the working set defined in virtual memory, and consists of input and output ev...
This paper presents a time warp fossil collection mechanism that functions without need for a GVT estimation algorithm. Effectively each logical process (LP) collects causality information during normal event execution and then each LP utilizes this information to identify fossils. In this mechanism, LPs use constant size vectors (that are independ...
High resolution models of logic circuits need to be used in simulations to accurately track logic transitions or glitches, which account for the dominant portion of VLSI power dissipation. Unfortunately, simulating large, high resolution models is a time consuming task. Although more abstract models that simulate faster can be used, they are...
High frequency electromagnetic simulation requires computationally expensive methods such as FDTD. The simulation time can be reduced by creating macromodels that describe the entire sub-region but provide input/output information only at a limited number of points. An alternative to the conventional FDTD method is to compute the macromodel respons...
The steady growth in the multifaceted use of broadband asynchronous transfer mode (ATM) networks for time-critical applications has significantly increased the demands on the quality of service (QoS) provided by the networks. Satisfying these demands requires the networks to be carefully engineered based on inferences drawn from detailed analysis o...
Multi-resolution models can be statically (i.e., before simulation) or dynamically (i.e., during simulation) abstracted to accelerate the simulations without compromising the analysis goals. However, abstractions must be carefully chosen because not all abstractions improve performance. Unfortunately, identifying performance enhancing transformatio...
In this paper, the Simbus backplane is used in conjunction with SAVANT/TyVIS/WARPED, a parallel VHDL simulator, and Xyce, a parallel SPICE simulation engine, to model and simulate a mixed-signal ASIC-driven charging circuit simulation. In particular, the individual components of an airbag deployment system are described and modeled in the digital a...
Full wave electromagnetic simulation requires numerically expensive methods such as FDTD. The computation time depends superlinearly on the number of unknowns in the simulation region. In some situations, especially when the results are not needed at every point of the grid, simulation time can be reduced. This reduction can be accomplished by part...
Optimistic time warp simulators should stop the rapid propagation of incorrect events to avoid reaching a catastrophic state (a state where out of order event execution is always a step ahead of its corrective measures). A distributed cancellation mechanism using total clocks was proposed earlier to avoid such catastrophic states. In this paper, we...
In order to simulate electromagnetic phenomena at high frequencies, full wave solvers such as the FDTD method must be used. An alternative to the conventional FDTD method is to compute the zero state response with convolution. Convolution results in an increased computation time with every time step. By performing eigenmodal decomposition of the in...
We investigate the Linux implementation of the finite-difference time-domain (FDTD) algorithm on a Beowulf cluster of parallel workstations. We discuss synchronization and domain decomposition alternatives. To obtain performance characteristics the algorithm is applied to a test scattering problem. The computation has been done using varying numbe...
The FDTD method is used to find numerical solutions to Maxwell's equations when analytic ones are prohibitive. A brute-force approach to the FDTD method requires costly calculation at every point in the solution grid. If the solution is required at only a subset of the domain, then some computing time can be saved by expressing a portion of the grid...
In this article, we report the characterization results of two data integrated video sensors designed by Clifton Labs, Inc. A data integrated video sensor consists of an array of photodetectors that provide both an analog and digital output based on the amount of incident light on the detector. Two approaches can be used to capture the resulting re...
The factors influencing spread of Lyme disease are often studied using computer-based simulations and spatially explicit models. However, simulating large and complex models is a time consuming task, even when parallel simulation techniques are employed. In an endeavor to accelerate such simulations, an alternative approach involving dynamic (i.e.,...
This work discusses a simulation backplane called "Simbus", which allows analog and digital simulators to be connected and utilized together to perform mixed-signal simulation. The approach allows elements of each domain to be expressed in their native format, simulate on their native simulators, and be coupled into an aggregate mixed-signal si...
No Abstract Available.
Continuous system models are becoming increasingly more important in the modeling and analysis of complex systems. Unfortunately, the runtime simulation costs required to support continuous modeling can be prohibitive to their use. One technique to decrease simulation runtime costs is mixed-domain simulation where the system is modeled by a mixture...
Report developed under SBIR contract for topic AO1-58. PHOCI is an optical imaging system that is suitable for both image capture and reception of optical communications data. The envisioned system, called PHOCI, includes a novel image/data capture chip that embeds a high-speed optical data communications receiver technology into the image capture a...
The Web-based Environment for Systems Engineering (wese) is a Web-based modeling and simulation environment in which the level of abstraction of a model can be configured statically (prior to simulation) or dynamically (during simulation) by substituting a module (set of components) with an equivalent component or vice versa through a process calle...
The Web-based Environment for Systems Engineering (wese) is a web-based modeling and simulation environment in which the level of abstraction of a model can be configured statically (prior to simulation) or dynamically (during simulation) by substituting a module (set of components) with an equivalent component or vice versa through a process calle...
The Modular-Virtual Interface Architecture (M-VIA) is a communication software suite developed by the National Energy Research Scientific Center. M-VIA is based on the Intel hardware-based VI standard. This software currently outperforms TCP in both bandwidth and latency. A performance analysis of the M-VIA software is conducted to determine if th...
WARPED is a publicly available time warp simulation kernel. The kernel defines a standard interface to the application developer and is designed to provide a highly configurable environment for the integration of time warp optimizations. It is written in C++, uses the MPI message passing standard, and executes on a variety of parallel and distribut...
Many modern systems involve complex interactions between a large number of diverse entities that constitute these systems. Unfortunately, these large, complex systems frequently defy analysis by conventional analytical methods and their study is generally performed using simulation models. Further aggravating the situation, detailed simulations of...
The Time Warp synchronization protocol allows causality errors and then recovers from them with the assistance of a cancellation mechanism. Cancellation can cause the rollback of several other simulation objects that may trigger a cascading rollback situation where the rollback cycles back to the original simulation object. These cycles of rollback...
Web-based simulations are performed by utilizing the resources of the World Wide Web (WWW) such as proprietary components/models developed by third party modelers/manufacturers and web-based computational infrastructures (or compute servers). Access to such web-based resources, third party resources in particular, is usually circumscribed by a vari...
The Web-based Environment for Systems Engineering (WESE) is a web-based modeling and simulation environment in which the level of abstraction of a model can be configured statically (prior to simulation) or dynamically (during simulation) by substituting a module (set of components) with an equivalent component or vice versa through a process called...
Active networking techniques embed computational capabilities into conventional networks, thereby massively increasing the complexity and customization of the computations that are performed within a network. In depth studies of these large and complex networks that are still in their nascent stages cannot be effectively performed using analytical met...
The steady increase in size and complexity of communication networks, coupled with growing needs and demands, has motivated the development of active networks. Active networking techniques embed computational capabilities into conventional networks, thereby massively increasing the complexity and customization of the computations that are performed...
Efficient management of event lists is important in optimizing discrete event simulation performance. This is especially true in distributed simulation systems. The performance of simulators is directly dependent on the event list management operations such as insertion, deletion, and search. Several factors such as scheduling, checkpointing, and...
The emergence of mixed-signal (analog and digital) integrated circuits motivates the need for CAD tools supporting mixed-signal design and analysis. Furthermore, the presence of a large body of existing models in existing modeling languages and the need for modeling mixed-signal (analog and digital) circuits motivate the need for a single unified s...
Circuit simulation has proven to be one of the most important computer aided design (CAD) methods for verification and analysis of integrated circuit designs. A popular approach to modeling circuits for simulation purposes is to use a hardware description language such as VHDL. VHDL has had a tremendous impact in fostering and accelerating CAD sys...
The paper describes a formal framework developed using the Prototype Verification System (PVS) to model and verify distributed simulation kernels based on the Time Warp paradigm. The intent is to provide a common formal base from which domain specific simulators can be modeled, verified, and developed. PVS constructs are developed to represent basi...
We consider two models for the structure of the algorithm used for concurrent interpretation of MIMD code sequences on SIMD machines. The single-fetch model shares portions of the instruction execution among all the instructions, minimizing the interpreter length. Because the fetch of the next instruction is shared, only one instruction is executed...
Blue Gene/L (BG/L) is a 65,536-node massively parallel computer being developed at the IBM Thomas J. Watson Research Center that promises to revolutionize large-scale scientific computing. However, its size alone will make programming BG/L a major challenge, ...
Parallel simulations using optimistic synchronization strategies such as Time Warp, operate with no regard to global synchronization since this results in greater parallelism and lower synchronization cost. However, like virtual memory, the parallel simulators may end up thrashing instead of performing useful work. The complication in using a Time...
The integration of processing and DRAM offers a potential solution to the memory bottleneck problem. The bandwidth available within the chip is several orders of magnitude higher than that at the memory bus with a lower access time. As workloads shift towards data-intensive/multimedia applications, the wide bandwidth can be effectively utilized by...
The development of efficient parallel discrete event simulators is hampered by the large number of interrelated factors affecting performance. This problem is made more difficult by the lack of scalable representative models that can be used to analyze optimizations and isolate bottlenecks. This paper proposes a performance and scalability analysis...
SIMD machines are considered special purpose architectures chiefly because of their inability to support control parallelism. This restriction exists because there is a single control unit that is shared at the thread level; concurrent control threads must time-share the control unit (they are sequentially executed). We present an alternative model...
Modeling and simulation of large, high resolution network models is a time consuming task even when parallel simulation techniques are employed. Processing voluminous, detailed simulation data further increases the complexity of analysis. Consequently, the models (or parts of the models) are abstracted to improve performance of the simulations by t...
The gap between the speed of logic and DRAM access is widening. Traditional processors hide some of the mismatch in latency using techniques such as multi-level caches, instruction prefetching and memory interleaving/pipelining. Even with larger caches, cache miss rates are higher than the rate at which memory can provide data. Moreover, the memory...
The size and complexity of hardware systems motivates the use of simulation for their study and analysis. Parallelization techniques are often employed to meet the memory and computational requirements for simulating large hardware designs. Furthermore, partitioning the design for parallel simulation is vital for achieving acceptable simulation thr...
Web-based simulations are performed by utilizing the resources of the World Wide Web (WWW) such as proprietary components/models developed by third party modelers/manufacturers and web-based computational infrastructures (or compute servers). Access to such web-based resources, third party resources in particular, is usually circumscribed by...
The gap between the speed of logic and the DRAM memory access is widening. Traditional processors hide some of the mismatch in memory latency using techniques such as multi-level caches, instruction prefetching and memory interleaving. The bandwidth available at the system bus also forms a bottleneck; even an elaborate memory hierarchy with a perfe...
Circuit simulation has proven to be one of the most important computer aided design (CAD) methods for the analysis and validation of integrated circuit designs. A popular approach to describing circuits for simulation purposes is to use a hardware description language such as VHDL. Similar efforts have also been carried out in the analog domain tha...
Mixed-Mode simulation has been generating considerable interest in the simulation community and has continued to grow as an active research area. Traditional mixed-mode simulation involves the merging of digital and analog simulators in various ways. However, efficient methods for the synchronization between the two time domains remain elusive. Th...
The chief characteristic that differs markedly between parallel simulation techniques is how they manage process interaction. The linear control models for online configuration of the simulation presented in this article are different from traditional control theory. The reason is that data sampling and parameter adjustment are intrusive; these...
The design and development of modern systems is complicated by their size and complexity. Furthermore, many complex systems are built using subsystems and components available from third party manufacturers. The diversity of the available components has made exploration of design alternatives a large task. Simulation plays an important role in the...
Parallelization is a popular technique for improving the performance of discrete event simulation. Due to the complex, distributed nature of parallel simulation algorithms, debugging implemented systems is a daunting, if not impossible task. Developers are plagued with transient errors that prove difficult to replicate and eliminate. Recently, rese...
The web presents an opportunity for realizing a distributed design framework supporting multi-disciplinary, multi-organizational collaborative design and analysis activities. The potential for deploying online, reusable parts libraries for virtual prototyping and design analysis exists. However, several issues must be solved before vendors will be...
Parallel simulation techniques are often employed to meet the computational requirements of large hardware simulations in order to reduce simulation time. In addition, partitioning for parallel simulations has been shown to be vital for achieving higher simulation throughput. This paper presents the results of our partitioning studies conducted on...
Discrete event simulations are widely used to study and analyze active and conventional networking architectures and protocols. Active networks must coexist and communicate with conventional networks to effectively utilize and extend the infrastructure of the Internet. Hence, large scale network simulations containing both conventional and active c...
Recent breakthroughs in communication and software engineering have resulted in significant growth of Web-based computing. Web-based techniques have been employed for modeling, simulation and analysis of systems. The models for simulation are usually developed using component-based techniques. In a component-based model, a system is represented as a...
Recent breakthroughs in communication and software engineering have resulted in significant growth of web-based computing. Web-based techniques have been employed for modeling, simulation, and analysis of systems. The models for simulation are usually developed using component-based techniques. In a component-based model, a system is represented as...
The Single Instruction Multiple Data (SIMD) paradigm has several desirable characteristics from the perspective of massively-parallel algorithms. However, its restricted control organization makes it only useful for a small set of applications that fit this restricted model. The alternative for other applications has been to use Multiple Instructio...
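The MIMD-on-SIMD interpretation discussed in the entries above can be caricatured in a few lines. This toy model is our own construction, not the papers' system: a controller broadcasts every opcode each cycle, and each processing element (PE) executes only the broadcast opcode that matches the next instruction of its own private program (all other PEs sit masked off), which is how a single shared control unit can emulate independent control threads.

```python
# Toy MIMD interpretation on a SIMD machine via opcode broadcast + masking.
OPCODES = ("INC", "DEC")

class PE:
    """A processing element with its own program, program counter, and
    accumulator; it executes only when the broadcast opcode matches."""
    def __init__(self, program):
        self.program, self.pc, self.acc = program, 0, 0

    def step_if(self, opcode):
        if self.pc < len(self.program) and self.program[self.pc] == opcode:
            self.acc += 1 if opcode == "INC" else -1
            self.pc += 1

pes = [PE(["INC", "INC", "DEC"]), PE(["DEC", "INC"])]
while any(pe.pc < len(pe.program) for pe in pes):
    for op in OPCODES:        # controller broadcasts each opcode in turn
        for pe in pes:        # all PEs see it; only matching ones execute
            pe.step_if(op)
print([pe.acc for pe in pes])  # -> [1, 0]
```

The cost of the scheme is visible in the loop structure: every cycle pays for a broadcast of the full opcode set, which is why minimizing interpreter length (as in the single-fetch model above) matters.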