Dan Stanzione
University of Texas at Austin · Texas Advanced Computing Center (TACC)

About

61 Publications · 8,089 Reads · 1,522 Citations

Publications (61)
Article
Full-text available
The DesignSafe cyberinfrastructure ( www.designsafe-ci.org ) is part of the NSF-funded Natural Hazard Engineering Research Infrastructure (NHERI) and provides cloud-based tools to manage, analyze, understand, and publish critical data for research to understand the impacts of natural hazards. The DesignSafe Data Depot provides private and public di...
Book
This book constitutes the refereed proceedings of the Second International Symposium on Benchmarking, Measuring, and Optimization, Bench 2019, held in Denver, CO, USA, in November 2019. The 20 full papers and 11 short papers presented were carefully reviewed and selected from 79 submissions. The papers are organized in topical sections named: Best...
Technical Report
Full-text available
We present early results of the deep learning work on the Stampede2 supercomputer. Our goal is to enable scalable and efficient deep learning model training and serving to expedite scientific discovery. We build three popular deep learning frameworks, namely, Intel-Caffe, MXNet, and TensorFlow. With the built-in applications of these frameworks (Ca...
Conference Paper
The Stampede 1 supercomputer was a tremendous success as an XSEDE resource, providing more than eight million successful computational simulations and data analysis jobs to more than ten thousand users. In addition, Stampede 1 introduced new technology that began to move users towards many core processors. As Stampede 1 reaches the end of its produ...
Article
Natural hazards engineering plays an important role in minimizing the effects of natural hazards on society through the design of resilient and sustainable infrastructure. The DesignSafe cyberinfrastructure has been developed to enable and facilitate transformative research in natural hazards engineering, which necessarily spans across multiple dis...
Conference Paper
Full-text available
Jetstream is a first-of-a-kind system for the NSF - a distributed production cloud resource. The NSF awarded funds to create Jetstream in November 2014. Here we review the purpose for creating Jetstream, present the acceptance test results that define Jetstream's key characteristics, describe our experiences in standing up an OpenStack-based cloud...
Article
Jetstream will be the first production cloud resource supporting general science and engineering research within the XD ecosystem. In this report we describe the motivation for proposing Jetstream, the configuration of the Jetstream system as funded by the NSF, the team that is implementing Jetstream, and the communities we expect to use this new s...
Conference Paper
With the growth of data in science and engineering fields and the I/O-intensive technologies used to carry out research with these massive datasets, it has become clear that new solutions to support data research are required. In support of this, the Texas Advanced Computing Center presents Wrangler, the first open science research platform built from the...
Article
The partial correlation coefficient with information theory (PCIT) method is an important technique for detecting interactions between networks. The PCIT algorithm has been used in the biological context to infer complex regulatory mechanisms and interactions in genetic networks, in genome wide association studies, and in other similar problems. In...
Conference Paper
The PCIT method is an important technique for detecting interactions between networks. The PCIT algorithm has been used in the biological context to infer complex regulatory mechanisms and interactions in genetic networks, in genome wide association studies, and in other similar problems. In this work, the PCIT algorithm is re-implemented with exem...
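As an illustration only (not the papers' implementation), the computational core of PCIT is the first-order partial correlation evaluated over every trio of variables in a correlation matrix. The sketch below shows that O(n^3) kernel in C and omits PCIT's information-theoretic tolerance threshold, which decides which edges are ultimately kept.

```c
/* Simplified sketch of the first-order partial-correlation step at the core
 * of PCIT (hypothetical standalone example, not the papers' code).
 * For a trio (x, y, z), the correlation between x and y with the effect of
 * z removed is:
 *   r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
 */
#include <math.h>
#include <stdio.h>

#define N 4  /* number of variables (genes) in this toy example */

static double partial_corr(double rxy, double rxz, double ryz)
{
    double denom = sqrt((1.0 - rxz * rxz) * (1.0 - ryz * ryz));
    return denom > 0.0 ? (rxy - rxz * ryz) / denom : 0.0;
}

int main(void)
{
    /* Toy symmetric correlation matrix. */
    double r[N][N] = {
        {1.0, 0.8, 0.3, 0.1},
        {0.8, 1.0, 0.4, 0.2},
        {0.3, 0.4, 1.0, 0.7},
        {0.1, 0.2, 0.7, 1.0},
    };

    /* Enumerate every trio (x, y, z); this O(n^3) loop is the part that
     * re-implementations parallelize across cores or accelerators. */
    for (int x = 0; x < N; x++)
        for (int y = x + 1; y < N; y++)
            for (int z = 0; z < N; z++) {
                if (z == x || z == y) continue;
                double p = partial_corr(r[x][y], r[x][z], r[y][z]);
                printf("r_%d%d.%d = %.3f\n", x, y, z, p);
            }
    return 0;
}
```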
Article
Full-text available
Understanding how complex organisms function as integrated units that constantly interact with their environment is a long-standing challenge in biology. To address this challenge, organismal biology reveals general organizing principles of physiological systems and behavior—in particular, in complex multicellular animals. Organismal biology also f...
Conference Paper
The iPlant Foundation API (fAPI) is a hosted, Software-as-a-Service (SaaS) resource for the computational biology field. Unlike many other grid-based approaches to distributed infrastructure, the fAPI provides a holistic view of core computing concepts such as security, data, applications, jobs, and systems. It also provides the support services, s...
Conference Paper
This paper introduces the iPlant Foundation API Data Services, a cloud-based, hosted solution for distributed data and metadata management in the iPlant Data Store. The iPlant Data Store is a virtual storage solution for over 7000 users providing seamless access to over 6PB of distributed storage within the iPlant cyberinfrastructure using com...
Article
Debugging is difficult; debugging parallel programs at large scale is particularly so. Interactive debugging tools continue to improve in ways that mitigate the difficulties, and the best such systems will continue to be mission critical. Such tools have their limitations, however. They are often unable to operate across many thousands of cores. Ev...
Article
Full-text available
The Arabidopsis Information Portal (AIP), a resource expected to provide access to all community data and combine outputs into a single user-friendly interface, has emerged from community discussions over the last 23 months. These discussions began during two closely linked workshops in early 2010 that established the International Arabidopsis Info...
Article
The cloud platform complements traditional compute and storage infrastructures by introducing capabilities for efficiently provisioning resources in a self-service, on-demand manner. The new provisioning model promises to accelerate scientific discovery by improving access to customizable and task-specific computing resources. This paradigm is well...
Article
Full-text available
The iPlant Collaborative is an NSF-funded cyberinfrastructure (CI) effort directed towards the plant sciences community. This paper enumerates the key concepts, middleware, tools, and extensions that create the unique capabilities of the iPlant Discovery Environment (DE) that provide access to our CI. The DE is a rich web-based application that bri...
Article
As plant biology becomes a data-driven science, new computing technologies are needed to address many formidable challenges. The iPlant Collaborative provides cyberinfrastructure for researchers and developers to collaborate in creating better tools, workflows, algorithms, and ontologies.
Conference Paper
We report on early programming experiences with the Intel® Many Integrated Core (Intel® MIC) Co-processor. This new, x86-based technology is Intel's answer to GPU-based accelerators by NVIDIA, AMD, and others. Accelerators have generally sparked interest in the HPC community because they have the potential to significantly increase the compute po...
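As a hedged sketch of what early MIC programming looked like, assuming the Intel compiler's explicit offload pragmas (array names and sizes are illustrative; a compiler without offload support simply ignores the pragma and runs the loop on the host):

```c
/* Minimal MIC offload sketch: transfer input arrays to the co-processor,
 * run an OpenMP loop there, and copy the result back to the host. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const int n = 1 << 20;
    float *a = malloc(n * sizeof(float));
    float *b = malloc(n * sizeof(float));
    float *c = malloc(n * sizeof(float));
    for (int i = 0; i < n; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* Copy a and b to the card, run the loop there with many threads,
     * and copy c back when the offloaded region completes. */
    #pragma offload target(mic) in(a : length(n)) in(b : length(n)) out(c : length(n))
    {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }

    printf("c[42] = %f\n", c[42]);
    free(a); free(b); free(c);
    return 0;
}
```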
Article
Full-text available
The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that...
Conference Paper
Full-text available
This paper describes a scalable approach to one of the most computationally intensive problems in molecular plant breeding, that of associating quantitative traits with genetic markers. The fundamental problem is to build statistical correlations between particular loci in the genome of an individual plant and the expressed characteristics of that...
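Purely to illustrate why this problem scales out so well (a toy example, not the paper's statistical model, which would account for covariates and population structure): each marker can be tested against the trait independently, so the marker loop distributes trivially across cores or cluster nodes.

```c
/* Toy single-marker association: correlate each marker's genotype scores
 * (0/1/2 minor-allele counts) with the observed trait values. */
#include <math.h>
#include <stdio.h>

#define N_IND     6   /* individuals (toy size) */
#define N_MARKERS 3   /* markers     (toy size) */

static double correlate(const double *x, const double *y, int n)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx += x[i]; sy += y[i];
        sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i];
    }
    double cov = sxy - sx * sy / n;
    double vx  = sxx - sx * sx / n;
    double vy  = syy - sy * sy / n;
    return (vx > 0 && vy > 0) ? cov / sqrt(vx * vy) : 0.0;
}

int main(void)
{
    double trait[N_IND] = { 4.1, 5.0, 6.2, 5.5, 7.0, 6.8 };
    double geno[N_MARKERS][N_IND] = {
        { 0, 1, 2, 1, 2, 2 },
        { 2, 2, 1, 1, 0, 0 },
        { 0, 0, 1, 0, 1, 1 },
    };

    /* Markers are independent, so this loop is embarrassingly parallel. */
    for (int m = 0; m < N_MARKERS; m++)
        printf("marker %d: r = %.3f\n", m, correlate(geno[m], trait, N_IND));
    return 0;
}
```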
Article
Full-text available
Phylogenetic analysis is a critical area of modern biology, and is among the most computationally intensive areas of the life sciences. Phylogenetics, the study of evolutionary relationships, is used to study a wide range of topics, including the evolution of critical traits, speciation, adaptation, and many other areas. In this paper, we pre...
Conference Paper
Full-text available
The iPlant Collaborative is a 5-year, National Science Foundation-funded effort to develop cyberinfrastructure to address a series of grand challenges in plant science. The second of these grand challenges is the Genotype-to-Phenotype project, which seeks to provide tools, in the form of a web-based Discovery Environment, for understanding the deve...
Article
The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee and the North American Arabidopsis Steering Committee. There are extensive tools and resources for inf...
Article
The computational problems that scientists face are rapidly escalating in size and scope. Moreover, the computer systems used to solve these problems are becoming significantly more complex than the familiar, well-understood sequential model on their desktops. While it is possible to re-train scientists to use emerging high-performance computing (H...
Conference Paper
Full-text available
Photonic crystals show great promise as a means to realize the goal of true integrated optics. However, for truly practical application, optical losses must be minimized. For this, it is desirable that these devices be made within a fully three-dimensional (3D) photonic crystal structure. However, the three dimensional aspect of the problem makes d...
Conference Paper
This paper provides an overview of the GDBase framework for offline parallel debuggers. The framework was designed to become the basis of debugging tools which scale successfully on systems with tens to hundreds of thousands of cores. With several systems coming online at more than 50,000 cores in the past year, debuggers which can run at the...
Article
Full-text available
Detection and analysis of faults in parallel applications is a difficult and tedious process. Existing tools attempt to solve this problem by extending traditional debuggers to inspect parallel applications. This technique is limited since it must connect to each computing process and will not scale to next-generation systems running on hundred...
Article
Full-text available
Photonic crystals have shown a great deal of promise for the realization of true integrated optics. Waveguides with small bends may be formed allowing compact integrated photonic circuits to be formed. Full three dimensional (3D) photonic simulations are required in order to realize very low loss, integrated photonic crystal circuits. Needless to s...
Article
Message passing, as implemented in message passing interface (MPI), has become the industry standard for programming on distributed memory parallel architectures, while the threading on shared memory machines is typically implemented in OpenMP. Outstanding performance has been achieved with these methods, but only on a relatively small number of co...
Conference Paper
Multiple clusters co-existing in a single research campus has become commonplace at many university and government labs, but effectively leveraging those resources is difficult. Intelligently forwarding and spanning jobs across clusters can increase throughput, decrease turnaround time, and improve overall utilization. Dynamic Virtual Clustering (D...
Conference Paper
Full-text available
Photonic crystals have shown a great deal of promise for the realization of true integrated optics. Waveguides with small bends may be formed allowing compact integrated photonic circuits to be formed. Full three-dimensional (3D) photonic simulations are required in order to realize very low loss, integrated photonic crystal circuits. Needless to s...
Article
At many university, government, and corporate facilities, it is increasingly common for multiple compute clusters to exist in a relatively small geographic area. These clusters represent a significant investment, but effectively leveraging this investment across clusters is a challenge. Dynamic Virtual Clustering has been shown to be an effective w...
Conference Paper
As larger and larger commodity clusters for high performance computing proliferate at research institutions around the world, challenges in maintaining effective use of these systems also continue to increase. Among the many challenges are maintaining the appropriate software stack for a broad array of applications, and sharing workload across clus...
Conference Paper
Full-text available
Blade servers are being increasingly deployed in modern datacenters due to their high performance/cost ratio and compact size. In this study, we document our work on blade server based datacenter thermal management. Our goal is to minimize the total energy costs (usage) of datacenter operation while providing a reasonable thermal environment for the...
Conference Paper
Full-text available
This paper examines the suitability of different virtualization techniques in a high performance cluster environment. A survey of virtualization techniques is presented. Two representative technologies (Xen and User Mode Linux) are selected for an in-depth analysis of cluster readiness in terms of their performance, reliability, and their overall im...
Conference Paper
InfiniBand has emerged as a new high bandwidth, low latency standard for high performance computing, but as a technology, is still focused on Layer 2 switching. Standards have not yet been defined for InfiniBand Layer 3 Routing, required for additional scalability, distance reach, security, and fault tolerance and isolation. The meeting will consist...
Article
Full-text available
In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We compare our dynamic model with previous research that utilizes a fixed execution time penalty for co-allocated jobs. We explore the interaction of simultaneously co-allocate...
Conference Paper
Summary form only given. The ability to easily span parallel and distributed jobs over multiple physical clusters is a potentially attractive proposal. Such an ability would allow researchers to pool all available cluster resources at a given site. Multiple clusters at a single research site has become the norm whether in an academic, government, o...
Article
Full-text available
In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocated jobs in a grid. We compare our dynamic model with previous research that utilizes a fixed execution time penalty for co-allocated jobs. We explore the interaction of simultaneously co-allocated jobs and the...
Conference Paper
Full-text available
The interaction of simultaneously co-allocated jobs can often create contention in the network infrastructure of a dedicated computational grid. This contention can lead to degraded job run-time performance. We present several bandwidth-aware co-allocating meta-schedulers. These schedulers take into account inter-cluster network utilization as a me...
Conference Paper
In this paper, we present a bandwidth-centric parallel job communication model that takes into account inter- cluster network utilization as a means by which to capture the interaction and impact of simultaneously co-allocated jobs in a mini-grid. Our model captures the time-varying utilization of shared inter-cluster network resources in the grid....
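A minimal sketch of one way such a bandwidth penalty could be computed (illustrative assumptions only; the model in these papers is dynamic and time-varying): when the jobs sharing an inter-cluster link together demand more bandwidth than the link provides, each job's communication phase stretches by the oversubscription ratio.

```c
/* Hypothetical bandwidth-centric slowdown estimate for co-allocated jobs. */
#include <stdio.h>

int main(void)
{
    double link_capacity = 10.0;          /* Gbit/s on the shared inter-cluster link */
    double demand[] = { 6.0, 5.0, 4.0 };  /* per-job bandwidth demand, Gbit/s        */
    int    njobs = 3;
    double comm_fraction = 0.4;           /* fraction of runtime spent communicating */

    double total = 0.0;
    for (int j = 0; j < njobs; j++) total += demand[j];

    /* Oversubscription factor: 1.0 when the link is not saturated. */
    double stretch = total > link_capacity ? total / link_capacity : 1.0;

    /* Communication time stretches; compute time does not. */
    double slowdown = (1.0 - comm_fraction) + comm_fraction * stretch;
    printf("link stretch = %.2f, job slowdown = %.2f\n", stretch, slowdown);
    return 0;
}
```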
Article
Full-text available
As FPGA density increases, so does the potential for configurable computing machines. Unfortunately, the larger designs which take advantage of the higher densities require much more effort and longer design cycles, making it even less likely to appeal to users outside the field of configurable computing. To combat this problem, we present the Recon...
Article
As Field Programmable Gate Array (FPGA) density increases, so does the potential for reconfigurable computing machines. Unfortunately, applications which take advantage of the higher densities require significant effort and involve prohibitively long design cycles when conventional methods are used. To combat this problem, we propose a design envir...
Conference Paper
As the quality and accuracy of available remote sensing instruments improve, the ability to quickly process remotely sensed data is in increasing demand. A side effect of this increased quality and accuracy is a tremendous increase in the amount of computational power required by modern remote sensing applications. In this work, a problem solving...
Conference Paper
This paper presents the status of an ongoing project in constructing a framework to create problem solving environments (PSEs). The framework is independent of any particular architecture, programming model, or problem domain. The framework makes use of compiler technology, but identifies and addresses several key differences between compilers and...
Conference Paper
In the past, reconfigurable computing has not been an option for accelerating scientific algorithms (which require complex floating-point operations) and other similar applications due to limited FPGA density. However, the rapid increase of FPGA densities over the past several years has altered this situation. The central goal of the Reconfigurable...
Conference Paper
Integral equations solved by the method of moments (MOM) are an important and computationally intense class of problems in antenna design and other areas of electromagnetics. Particularly when structures become electrically large, MOM solutions become intractable as they lead to large, densely-filled, complex matrices, the solution of which is nume...
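For context, in standard method-of-moments notation (not taken from this paper), discretizing the integral equation with N basis functions yields the dense complex system

$$\sum_{n=1}^{N} Z_{mn} I_n = V_m, \qquad m = 1, \dots, N,$$

where Z is the N-by-N impedance matrix obtained by testing the integral operator against the basis functions, I_n are the unknown current coefficients, and V_m is the excitation; storage grows as O(N^2) and a direct solve as O(N^3), which is why electrically large structures become intractable.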
Article
Full-text available
Effective resource management remains a challenge for large scale cluster computing systems, as well as for clusters of clusters. Resource management involves the ability to monitor resource usage and enforce policies to manage available resources and provide the desired level of service. One of the difficulties in resource management is that user...
Article
Full-text available
In a scientific community that increasingly relies upon High Performance Computing (HPC) for large scale simulations and analysis, the reliability of hardware and applications devoted to HPC is extremely important. While hardware reliability is not likely to dramatically increase in the coming years, software must be able to provide the reliabil...
Article
Full-text available
The Hybrid method of parallelization (using MPI for inter-node communication and OpenMP for intra-node communication) seems a natural fit for the way most clusters are built today. It is generally expected to help programs run faster due to factors like availability of greater bandwidth for intra-node communication. However, optimizing hybrid appl...
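As a minimal, self-contained sketch of the hybrid model this entry refers to (illustrative only, not the paper's benchmark code): one MPI rank per node performs inter-node communication while OpenMP threads share the intra-node work. Compile with an MPI wrapper plus OpenMP flags, e.g. mpicc -fopenmp.

```c
/* Hybrid MPI+OpenMP sketch: ranks split a global sum, threads split each
 * rank's share, and MPI_Reduce combines the partial results. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;
    /* Request a threading level that allows OpenMP regions between MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    if (provided < MPI_THREAD_FUNNELED && rank == 0)
        fprintf(stderr, "warning: MPI library lacks requested thread support\n");

    const int n = 1000000;
    double local = 0.0;

    /* Intra-node parallelism: threads split this rank's slice of the sum. */
    #pragma omp parallel for reduction(+:local)
    for (int i = rank; i < n; i += nranks)
        local += 1.0 / (i + 1.0);

    /* Inter-node parallelism: ranks combine their partial sums. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("harmonic sum over %d terms = %f\n", n, global);

    MPI_Finalize();
    return 0;
}
```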
Conference Paper
The iPlant Collaborative’s mission is to create a comprehensive set of cyberinfrastructure to support plant research. Concluding year four of this project, iPlant has successfully developed and deployed a variety of integrated technologies and computational resources that provide access to large data storage, high-performance computing, grid comput...
