
Barbara Kreaseck - La Sierra University
About
Publications: 15
Reads: 2,605
Citations: 328
Current institution: La Sierra University
Publications (15)
Applications that manipulate sparse data structures contain memory reference patterns that are unknown at compile time due to indirect accesses such as A[B[i]]. To exploit parallelism and improve locality in such applications, prior work has developed a number of run-time reordering transformations (RTRTs). This paper presents the Sparse Polyhedra...
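
A minimal sketch of the kind of loop these transformations target, assuming a CSR-style layout (function and array names here are illustrative, not from the paper): the access x[col[j]] is an A[B[i]]-style indirection whose pattern is only known once col[] exists at run time, at which point an inspector-computed permutation can reorder the data for locality.

    /* Sparse computation with an A[B[i]]-style indirect access: the
       pattern of x[col[j]] is unknown until col[] exists at run time. */
    void spmv_csr(int n, const int *rowptr, const int *col,
                  const double *val, const double *x, double *y)
    {
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = rowptr[i]; j < rowptr[i + 1]; j++)
                sum += val[j] * x[col[j]];   /* indirect access */
            y[i] = sum;
        }
    }

    /* A run-time reordering transformation: an inspector examines col[]
       at run time and produces a permutation perm[] (not shown) that
       places entries of x that are reused together close in memory. */
    void apply_permutation(int n, const int *perm,
                           const double *x, double *x_new)
    {
        for (int i = 0; i < n; i++)
            x_new[perm[i]] = x[i];
    }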
In forward mode Automatic Differentiation, the derivative program computes a function f and its derivatives, f'. Activity analysis is important for AD. Our results show that when all variables are active, the runtime checks required for dynamic activity analysis incur a significant overhead. However, when as few as half of the input variables are i...
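
A rough illustration in C of forward mode with a dynamic activity check, using a dual-number representation; the struct and the flag are my assumptions, not the paper's implementation:

    #include <stdio.h>

    /* Forward-mode AD via dual numbers: each value carries its
       derivative. The active flag models a dynamic activity check. */
    typedef struct { double v; double d; int active; } dual;

    dual dmul(dual a, dual b) {
        dual r;
        r.v = a.v * b.v;
        r.active = a.active || b.active;
        /* The check skips derivative arithmetic for inactive values,
           but it executes on every operation, which is the overhead
           the experiments measure. */
        r.d = r.active ? a.d * b.v + a.v * b.d : 0.0;
        return r;
    }

    int main(void) {
        dual x = {3.0, 1.0, 1};        /* active input, seed d = 1 */
        dual y = dmul(x, x);           /* f(x) = x*x */
        printf("f = %g, f' = %g\n", y.v, y.d);   /* prints 9 and 6 */
        return 0;
    }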
Overlapping communication with computation is a well-known technique to increase application performance. While it is commonly assumed that communication and computation can be overlapped at no cost, in reality they interfere with each other. In this paper we empirically evaluate the interference rate of communication on computation via measu...
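
A sketch of the overlap pattern being measured, with illustrative buffer names and loop body; in practice the loop and the in-flight transfer contend for memory bandwidth and CPU cycles:

    #include <mpi.h>

    void exchange_and_compute(double *sendbuf, double *recvbuf, int n,
                              int peer, double *local, int m)
    {
        MPI_Request reqs[2];
        /* Start a nonblocking exchange with a neighbor... */
        MPI_Irecv(recvbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(sendbuf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

        /* ...then compute on data that does not depend on the message. */
        for (int i = 0; i < m; i++)
            local[i] = 2.0 * local[i] + 1.0;

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }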
Message passing via MPI is widely used in single-program, multiple-data (SPMD) parallel programs. Data-flow analysis frameworks that respect the semantics of message-passing SPMD programs are needed to obtain more accurate and in some cases correct analysis results for such programs. We qualitatively evaluate various approaches for performing dat...
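
A two-rank fragment showing why message semantics matter to such analyses (hypothetical code, not from the paper): without modeling the send/recv pair, an analysis cannot see that x at the receive is defined by rank 0's assignment.

    #include <mpi.h>

    void spmd_fragment(int rank)
    {
        int x = 0;
        if (rank == 0) {
            x = 42;
            MPI_Send(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            /* Here x holds the value sent by rank 0; a data-flow
               framework must model the send/recv edge to know this. */
        }
    }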
Overlapping communication with computation is a well-known technique to increase application performance. While it is commonly assumed that communication and computation can be overlapped at no cost, in reality, they do contend for resources and thus interfere with each other. Here we present an empirical quantification of...
In modern computers, a program's data locality can affect performance significantly. This paper details full sparse tiling, a run-time reordering transformation that improves the data locality for stationary iterative methods such as Gauss-Seidel operating on sparse matrices. In scientific applications such as finite element analysis, these itera...
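
For reference, one Gauss-Seidel sweep over a CSR matrix, in a standard formulation with an assumed diag[] index array: values of x updated earlier in the sweep are reused immediately, and full sparse tiling reorganizes this reuse across iterations so it stays in cache.

    void gauss_seidel_sweep(int n, const int *rowptr, const int *col,
                            const double *val, const int *diag,
                            const double *b, double *x)
    {
        for (int i = 0; i < n; i++) {
            double sum = b[i];
            for (int j = rowptr[i]; j < rowptr[i + 1]; j++)
                if (j != diag[i])
                    sum -= val[j] * x[col[j]];   /* uses fresh x values */
            x[i] = sum / val[diag[i]];           /* diag[i] indexes A(i,i) */
        }
    }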
In this paper we investigate protocols for scheduling applications that consist of large numbers of identical, independent tasks on large-scale computing platforms. By imposing a tree structure on an overlay network of computing nodes, our previous work showed that it is possible to compute the schedule which leads to the optimal steady-state task...
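
A much-simplified, bandwidth-centric reading of the steady-state idea (the struct fields and the capping rule are my assumptions, not the paper's formulation): each node contributes its own compute rate plus whatever its subtree can produce, capped by the bandwidth of the link that feeds tasks down.

    typedef struct node {
        double work;           /* tasks per second computed locally */
        double bw;             /* tasks per second forwardable to children */
        int nkids;
        struct node **kids;
    } node;

    double steady_state_rate(const node *t)
    {
        double from_kids = 0.0;
        for (int i = 0; i < t->nkids; i++)
            from_kids += steady_state_rate(t->kids[i]);
        if (from_kids > t->bw)        /* children limited by link bandwidth */
            from_kids = t->bw;
        return t->work + from_kids;
    }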
Finite Element problems are often solved using multigrid techniques. The most time consuming part of multigrid is the iterative smoother, such as Gauss-Seidel. To improve performance, iterative smoothers can exploit parallelism, intra-iteration data reuse, and inter-iteration data reuse. Current methods for parallelizing Gauss-Seidel on irregular g...
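
One standard way to expose the parallelism mentioned here is level scheduling, sketched below under the assumption that a preprocessing pass has grouped rows into dependence levels (lvlptr and rows are hypothetical names); rows within a level have no dependences on each other and can be updated in parallel.

    void gauss_seidel_levels(int nlev, const int *lvlptr, const int *rows,
                             const int *rowptr, const int *col,
                             const double *val, const int *diag,
                             const double *b, double *x)
    {
        for (int l = 0; l < nlev; l++) {
            /* All rows in level l are mutually independent. */
            #pragma omp parallel for
            for (int k = lvlptr[l]; k < lvlptr[l + 1]; k++) {
                int i = rows[k];
                double sum = b[i];
                for (int j = rowptr[i]; j < rowptr[i + 1]; j++)
                    if (j != diag[i])
                        sum -= val[j] * x[col[j]];
                x[i] = sum / val[diag[i]];
            }
        }
    }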
Traditional parallel compilers do not effectively parallelize irregular applications because they contain little loop-level parallelism. We explore Speculative Task Parallelism (STP), where tasks are full procedures and entire natural loops. Through profiling and compiler analysis, we find tasks that are speculatively memory- and control-independen...
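
An invented example of the code shape STP looks for: two regions that almost never touch the same data, which a compiler cannot prove independent, so they run in parallel speculatively while hardware detects the rare conflict.

    static int stats[100], grid[100];

    static void update_stats(void)   { for (int i = 0; i < 100; i++) stats[i]++; }
    static void transform_grid(void) { for (int i = 0; i < 100; i++) grid[i] *= 2; }

    void region(void)
    {
        update_stats();     /* speculative task: runs ahead in parallel */
        transform_grid();   /* continuation: overlaps with the task; a
                               conflict on shared data would squash the
                               speculative work (hardware, not shown) */
    }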
In modern computer architecture the use of memory hierarchies allows a program's data locality to directly affect performance. Data locality occurs when data is still in a cache upon reuse. This paper presents a technique for tiling iterative sparse matrix computations in order to improve the data locality in scientific applications, such as Finite Ele...
Tomorrow's microprocessors will be able to handle multiple flows of control. Applications that exhibit task level parallelism (TLP) and can be decomposed into parallel tasks will perform well on these platforms. TLP arises when a task is independent of its neighboring code. Traditional parallel compilers exploit one variety of TLP, loop level paral...
Data-flow analyses that include some model of the data-flow between MPI sends and receives result in improved precision in the analysis results. One issue that arises with performing data-flow analyses on MPI programs is that the interprocedural control-flow graph (ICFG) is often irreducible due to call and return edges, and the MPI-ICFG adds furthe...
Thesis (Ph.D.), University of California, San Diego, 2003. Vita. Includes bibliographical references (leaves 147-153).