Samuel Pratt Midkiff

Purdue University | Purdue · School of Electrical and Computer Engineering

About

Publications: 72
Reads: 10,705
Citations: 2,422

Publications (72)
Conference Paper
Full-text available
Computed Tomographic (CT) image reconstruction is an important technique used in a wide range of applications. Among reconstruction methods, Model-Based Iterative Reconstruction (MBIR) is known to produce much higher quality CT images; however, the high computational requirements of MBIR greatly restrict their application. Currently, MBIR speed is...
Technical Report
Full-text available
Model-Based Iterative Reconstruction (MBIR) is a widely-explored fully 3D Computed Tomography (CT) image reconstruction technique that has a large impact on the image reconstruction community. The slow computation speed for MBIR, however, is a bottleneck for scientific advancements in fields that use imaging, such as materials. A recently proposed...
Conference Paper
Full-text available
Computed Tomography (CT) Image Reconstruction is an important technique used in a variety of domains, including medical imaging, electron microscopy, non-destructive testing and transportation security. Model-based Iterative Reconstruction (MBIR) using Iterative Coordinate Descent (ICD) is a CT algorithm that produces state-of-the-art results in te...
Article
Computed Tomography (CT) Image Reconstruction is an important technique used in a variety of domains, including medical imaging, electron microscopy, non-destructive testing and transportation security. Model-based Iterative Reconstruction (MBIR) using Iterative Coordinate Descent (ICD) is a CT algorithm that produces state-of-the-art results in te...
Conference Paper
Full-text available
Computed Tomography (CT) Image Reconstruction is an important technique used in a wide range of applications, ranging from explosive detection, medical imaging to scientific imaging. Among available reconstruction methods, Model Based Iterative Reconstruction (MBIR) produces higher quality images and allows for the use of more general CT scanner ge...
Article
Computed Tomography (CT) Image Reconstruction is an important technique used in a wide range of applications, ranging from explosive detection, medical imaging to scientific imaging. Among available reconstruction methods, Model Based Iterative Reconstruction (MBIR) produces higher quality images and allows for the use of more general CT scanner ge...
Poster
Full-text available
In Computed Tomography (CT) methods, Model Based Iterative Reconstruction (MBIR) produces higher quality images than commonly used Filtered Backprojection (FBP) but at a very high computational cost. We describe a new MBIR implementation, PSV-ICD, which significantly reduces the computational cost of MBIR while retaining its benefits. It describes...
Conference Paper
Full-text available
Virtual machines hosted in virtualized data centers are important providers of computational resources in the era of cloud computing. Efficient scheduling of data centers' virtual machines can reduce the number of physical servers needed to host the virtual machines and, in turn, reduce the energy and other capital costs for maintaining the virtual...
Conference Paper
This paper describes AntSM, a system that uses the inherent parallelism of multi-threaded programs to reduce the overhead of statistical and invariant violations detection-based debugging tools. The runtime monitoring of these tools leads to high overheads. The key insight of the AntSM system is that this overhead can be reduced in parallel program...
Conference Paper
Full-text available
Array data flow analysis (ADFA) is a classical method for collecting array section information in sequential programs. When applying ADFA to parallel OpenMP programs, array access information needs to be analyzed in loops whose iteration spaces are partitioned across threads. The analysis involves symbolic expressions that are functions of the orig...
Article
This paper provides an overview and an evaluation of the Cetus source-to-source compiler infrastructure. The original goal of the Cetus project was to create an easy-to-use compiler for research in automatic parallelization of C programs. In the meantime, ...
Conference Paper
Large-scale data mining and deep data analysis are in high demand in modern enterprises. This work describes the RABID (R Analytics for BIg Data) framework to provide a highly parallel R. We achieve the goal of providing data analysts with an easy-to-use R interface to effectively perform deep data analysis on clusters by integrating R and a MapRed...
Conference Paper
This paper summarizes our experiences and findings in teaching the concepts of parallel computing in two undergraduate programming courses and an undergraduate hardware design course. The first is a junior-senior level elective course Object-Oriented Programming using C++ and Java. The second is a sophomore-level required course on Advanced C Progr...
Conference Paper
This paper describes Ant, a debugging framework targeting MPI parallel programs. The Ant framework statically analyzes programs, marking code regions as being executed by all processes or executed by only some of the processes. The analyzed program is then instrumented with calls to an invariant violation monitoring and detection library. The analy...
Conference Paper
Full-text available
In the early 1980s, shared memory mini-super-computers had buses and memory whose speeds were relatively fast compared to processor speeds. This led to the widespread use of various producer/consumer (post/wait) synchronization schemes for enforcing data dependences within parallel doacross loops. The rise of the “killer micro”, instruction sets op...
Article
Full-text available
The Cetus tool provides an infrastructure for research on multicore compiler optimizations that emphasizes automatic parallelization. The compiler infrastructure, which targets C programs, supports source-to-source transformations, is user-oriented and easy to handle, and provides the most important parallelization passes as well as the underlying...
Conference Paper
In the presence of dynamic classloading, performing interprocedural analysis (IPA) too early can lead to repeatedly performing the IPA as new classes are loaded, while performing it too late will cause a performance degradation by running unoptimized code for too long. This paper investigates how programs load classes and how this affects the perfo...
Conference Paper
Well designed domain specific languages enable the easy expression of problems, the application of domain specific optimizations, and dramatic improvements in productivity for their users. In this paper we describe a compiler for polymer chemistry, and in particular rubber chemistry, that achieves all of these goals. The compiler allows the develop...
Article
Manual debugging is tedious, as well as costly. The high cost has motivated the development of fault localization techniques, which help developers search for fault locations. In this paper, we propose a new statistical method, called SOBER, which automatically localizes software faults without any prior knowledge of the program semantics. Unlike e...
Chapter
Full-text available
Static Single Assignment (SSA) form has shown its usefulness as a program representation for code optimization techniques in sequential programs. We introduce the Concurrent Static Single Assignment (CSSA) form to represent explicitly parallel programs with interleaving semantics and post-wait synchronization. The parallel construct considered in t...
Conference Paper
A number of hardware and software techniques have been proposed to detect dynamic program behaviors that may indicate a bug in a program. Because these techniques suffer from high overheads they are useful in finding bugs in programs before they are released, but are significantly less useful in finding bugs in long-running programs on production s...
Conference Paper
Peer-to-peer (P2P) cycle sharing over the Internet has become increasingly popular as a way to share idle cycles. A fundamental problem faced by P2P cycle sharing systems is how to incrementally monitor and verify, with low overhead, the execution of jobs submitted to a remote untrusted hosting machine, or cluster of machines. In this paper, we pre...
Chapter
The IBM Scalable Shared Memory Project Machine (SSMP) is an SCOMA research prototype machine. Due to large latencies of non-local data access, the primary concerns of the compiler are locality, data layout, and scheduling of work to where data is currently located — goals that are similar to a compiler for a distributed memory machine. The presence...
Conference Paper
Full-text available
The widespread popularity of languages allowing explicitly parallel, multi-threaded programming, e.g. Java and C#, has focused attention on the issue of memory model design. The Pensieve Project is building a compiler that will enable both language designers to prototype different memory models, and optimizing compilers to adapt to different memor...
Article
Automated localization of software bugs is one of the essential issues in debugging aids. Previous studies indicated that the evaluation history of program predicates may disclose important clues about underlying bugs. In this paper, we propose a new statistical model-based approach, called SOBER, which localizes software bugs without any prior kno...
Conference Paper
Full-text available
This paper makes two contributions to architectural support for software debugging. First, it proposes a novel statistics-based, on-the-fly bug detection method called PC-based invariant detection. The idea is based on the observation that, in most programs, a given memory location is typically accessed by only a few instructions. Therefore, by cap...
Conference Paper
Full-text available
The rise of Java, C#, and other explicitly parallel languages has increased the importance of compiling for different software memory models. This paper describes co-operating escape, thread structure, and delay set analyses that enable high performance for sequentially consistent programs. We compare the performance of a set of Java programs compil...
Article
The design of consistency models for both hardware and software is a difficult task. For programming languages it is particularly difficult because the target audience for a programming language is much wider than the target audience for a hardware programming language, making usability a more important criterion. Exacerbating this problem is the re...
Article
Full-text available
This article presents an escape analysis framework for Java to determine (1) if an object is not reachable after its method of creation returns, allowing the object to be allocated on the stack, and (2) if an object is reachable only from a single thread during its lifetime, allowing unnecessary synchronization operations on that object to be remov...
Article
The lack of direct support for multidimensional arrays in Java™ has been recognized as a major deficiency in the language's applicability to numerical computing. It has been shown that, when augmented with multidimensional arrays, Java can achieve very high-performance for numerical computing through the use of compiler techniques and efficient i...
Conference Paper
In general, the hardware memory consistency model in a multiprocessor system is not identical to the memory model at the programming language level. Consequently, the programming language memory model must be mapped onto the hardware memory model. Memory fence instructions can be inserted by the compiler where needed to accomplish this mapping. We...
Article
Full-text available
This paper presents a compilation framework that allows executable code to be shared across different Java Virtual Machine (JVM) instances. Current compliant JVMs for servers are burdened with large memory footprints (because of the size of the increasingly complicated compilers) and high startup costs, while compliant JVMs for embedded devices typ...
Conference Paper
Full-text available
Concurrent threads executing on a shared memory system can access the same memory locations. A consistency model defines constraints on the order of these shared memory accesses. For good run-time performance, these constraints must be as few as possible. Programmers who write explicitly parallel programs must take into account the consistency mode...
Article
The Java language specification requires that all array references be checked for validity. If a reference is invalid, an exception must be thrown. Furthermore, the environment at the time of the exception must be preserved and made available to whatever code handles the exception. Performing the checks at run-time incurs a large penalty in executi...
Conference Paper
Full-text available
The design of memory consistency models for both hardware and software is a difficult task. It is particularly difficult for a programming language because the target audience is much wider than the target audience for a machine language, making usability a more important criterion. Adding to this problem is the fact that the programming language c...
Article
Full-text available
When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progre...
Chapter
This paper presents a compilation framework that enables efficient use of volatile (e.g. RAM) storage while executing compiled Java code on an embedded device. Traditionally, Java Virtual Machines (JVM’s) have relied on dynamic compilation for performance. These JVMs suffer from large memory footprints and high startup costs, both of which are seri...
Article
Full-text available
this article from being used in a dynamic compiler. Moreover, by using the quasi-static dynamic compilation model [10], the more expensive optimization and analysis techniques employed by TPO can be done off-line, sharply reducing the impact of compilation overhead
Article
Full-text available
This article discusses the Numerically Intensive Java programming which is a prototype of Java environment and its applications involve high-performance numerical computing. When the Java programming language was introduced by Sun Microsystems Inc. in 1995, there was a perception that its many benefits came at a significant performance cost. Despit...
Article
Full-text available
This paper presents a compilation framework that enables efficient sharing of executable code across distinct Java Virtual Machine (JVM) instances. High-performance JVMs rely on run-time compilation, since static compilation cannot handle many dynamic features of Java. These JVMs suffer from large memory footprints and high startup costs, which are...
Conference Paper
The Java HotSpot™ Server Compiler achieves improved asymptotic performance through a combination of object-oriented and classical-compiler optimizations. Aggressive inlining using class-hierarchy analysis reduces function call overhead and ...
Article
Full-text available
This paper presents the design and implementation of the Quicksilver quasi-static compiler for Java. Quasi-static compilation is a new approach that combines the benefits of static and dynamic compilation, while maintaining compliance with the Java standard, including support of its dynamic features. A quasi-static compiler relies on the generati...
Conference Paper
Full-text available
The lack of direct support for multidimensional arrays in Java™ has been recognized as a major deficiency in the language's applicability to numerical computing. The typical approach to adding multidimensional arrays to Java has been through class libraries that implement these structures. It has been shown that the class library approach can...
Article
This paper presents the design and implementation of the Quicksilver quasi-static compiler for Java. Quasi-static compilation is a new approach that combines the benefits of static and dynamic compilation, while maintaining compliance with the Java standard, including support of its dynamic features. A quasi-static compiler relies on the generati...
Conference Paper
From a software engineering perspective, the Java programming language provides an attractive platform for writing numerically intensive applications. A major drawback hampering its widespread adoption in this domain has been its poor performance on numerical codes. This paper describes a prototype Java compiler which demonstrates that it is possib...
Article
Full-text available
We describe pHPF, a research prototype HPF compiler for the IBM SP series parallel machines. The compiler accepts as input Fortran 90 and Fortran 77 programs, augmented with HPF directives; sequential loops are automatically parallelized. The compiler supports symbolic analysis of expressions. This allows parameters such as the number of processor...
Article
Full-text available
In this article we show how optimizing array bounds checks and null pointer checks creates loop nests on which aggressive optimizations can be used. Applying these optimizations by hand to a simple matrix-multiply test case leads to Java-compliant programs whose performance is in excess of 500 Mflops on a four-processor 332MHz RS/6000 model F50 comput...
Article
Full-text available
This paper presents a simple and efficient data flow algorithm for escape analysis of objects in Java programs to determine (i) if an object can be allocated on the stack; (ii) if an object is accessed only by a single thread during its lifetime, so that synchronization operations on that object can be removed. We introduce a new program abstractio...
Article
First proposed as a mechanism for enhancing Web content, the Java™ language has taken off as a serious general-purpose programming language. Industry and academia alike have expressed great interest in using the Java language as a programming language for scientific and engineering computations. Applications in these domains are characterized by in...
Conference Paper
This paper presents the design and implementation of the Quicksilver quasi-static compiler for Java. Quasi-static compilation is a new approach that combines the benefits of static and dynamic compilation, while maintaining compliance with the Java standard, including support of its dynamic features. A quasi-static compiler relies on the generatio...
Conference Paper
Full-text available
Traditional compiler techniques developed for sequential programs do not guarantee the correctness (sequential consistency) of compiler transformations when applied to parallel programs. This is because traditional compilers for sequential programs do not account for the updates to a shared variable by different threads. We present a concurrent sta...
Article
This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer com...
Conference Paper
Poor performance on numerical codes has slowed the adoption of Java within the technical computing community. In this paper we describe a prototype array library and a research prototype compiler that support standard Java and deliver near-Fortran performance on numerically intensive codes. We discuss in detail our implementation of: (i) an efficien...
Conference Paper
Full-text available
One glaring weakness of Java for numerical programming is its lack of support for complex numbers. Simply creating a Complex number class leads to poor performance relative to Fortran. We show in this paper, however, that the combination of such a Complex class and a compiler that understands its semantics does indeed lead to Fortran-like performan...
Article
Full-text available
In this paper, we present a constant propagation algorithm for explicitly parallel programs, which we call the Concurrent Sparse Conditional Constant propagation algorithm. This algorithm is an extension of the Sparse Conditional Constant propagation algorithm. Without considering the interaction between threads, classical optimizations lead to an...
Article
Irregular applications comprise a significant and increasing portion of jobs running in parallel environments. Recent research has shown that, in parallel environments, both the system utilization and application turn around time improve when resources allocated to applications can be dynamically adjusted at run-time, depending on the workload. To...
Article
Full-text available
In this paper, we describe a new scheme for checkpointing parallel applications on message-passing scalable distributed memory systems. The novelty of our scheme is that a checkpointed application can be restored, from its checkpointed state, in a reconfigured form. Thus, a parallel application may be checkpointed while executing with t1 tasks o...
Article
Translating program loops into a parallel form is one of the most important transformations performed by concurrentizing compilers. This transformation often requires the insertion of synchronization instructions within the body of the concurrent loop. Several loop synchronization techniques are presented first. Compiler algorithms to generate sync...
Article
The development of high speed parallel multi-processors, capable of parallel execution of doacross and forall loops, has stimulated the development of compilers to transform serial FORTRAN programs to parallel forms. One of the duties of such a compiler must be to place synchronization instructions in the parallel version of the program to insure t...
Conference Paper
This paper presents methods for the compile time generation of synchronization instructions for parallel loops running on a multiprocessor. A synchronization instruction set and architecture are defined, and using these it is shown how to synchronize FORTRAN source loops so that dependences on the parallel loop are enforced. Construction of an appl...
