December 2021
·
116 Reads
·
1 Citation
December 2021
·
182 Reads
·
10 Citations
Journal of Geovisualization and Spatial Analysis
Today’s supercomputing capabilities allow ocean scientists to generate simulation data at increasingly higher spatial and temporal resolutions. However, I/O bandwidth and data storage costs limit the amount of data saved to disk. In situ methods are one solution for generating reduced data extracts, with the potential to reduce disk storage requirements even at high spatial and temporal resolutions, a major advantage over storing raw output. Image proxies have become an efficient and accepted in situ reduced data extract. These extracts require innovative automated techniques to identify and analyze features. We present a framework of computer vision and image processing techniques to detect and analyze important features from in situ image proxies of large ocean simulations. We constrain the framework to techniques that emulate ocean-specific analysis tasks as accurately as possible, maximizing feature analysis capabilities while minimizing computational requirements. We demonstrate its use for image proxies extracted from the ocean component of the Model for Prediction Across Scales (MPAS) simulations to analyze ocean-specific features such as eddies and western boundary currents. The results obtained for specific data sets are compared to those of traditional methods, documenting the efficacy and advantages of our framework.
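As an illustration of the kind of image-based feature detection such a framework performs (this is not the authors' MPAS pipeline; the file name, threshold, and shape criteria below are illustrative assumptions), a minimal scikit-image sketch could segment an image proxy, label connected regions, and keep compact, roughly circular regions as eddy candidates:

```python
# Illustrative eddy-candidate detection on a grayscale image proxy (sketch only).
# This is NOT the authors' framework; file name, thresholds, and shape criteria
# are assumptions for demonstration.
import numpy as np
from skimage import io, filters, measure

image = io.imread("ocean_proxy.png", as_gray=True)   # hypothetical in situ image proxy

# Segment high-intensity regions (e.g., a rendered vorticity-like field).
mask = image > filters.threshold_otsu(image)
labels = measure.label(mask)

eddy_candidates = []
for region in measure.regionprops(labels):
    # Keep compact, roughly circular blobs; the bounds are illustrative.
    if 50 < region.area < 5000 and region.eccentricity < 0.8:
        eddy_candidates.append(region.centroid)       # (row, col) in image coordinates

print(f"{len(eddy_candidates)} eddy-like regions detected")
```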
November 2021
·
16 Reads
·
3 Citations
August 2021
·
121 Reads
·
4 Citations
Choosing salient time steps from spatio-temporal data is useful for summarizing the sequence and developing visualization choices for animations prior to committing time and resources to producing them for an entire time series. Animations can be developed more quickly with visualization choices that work well for a small set of important salient time steps. Here we introduce a new unsupervised learning method for finding such salient time steps. The volumetric data is represented by a 4-dimensional non-negative tensor, X(t, x, y, z). The presence of latent (not directly observable) structure in this tensor allows a unique representation and compression of the data. To extract the latent time-features we utilize non-negative Tucker tensor decomposition. We then map these time-features to their maximal values to identify the salient time steps. We demonstrate that this choice of time steps provides a good representation of the time series as a whole.
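A minimal sketch of this selection scheme using TensorLy is shown below; the synthetic data, the Tucker ranks, and the iteration count are illustrative assumptions rather than the paper's settings.

```python
# Salient time-step selection via non-negative Tucker decomposition (sketch).
# Data, ranks, and iteration count are illustrative assumptions.
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_tucker

# X(t, x, y, z): a small synthetic non-negative volumetric time series.
X = tl.tensor(np.random.rand(20, 16, 16, 16))

# Decompose; the mode-0 (time) factor matrix holds the latent time-features.
core, factors = non_negative_tucker(X, rank=[4, 8, 8, 8], n_iter_max=200)
time_features = tl.to_numpy(factors[0])               # shape (20, 4)

# Map each latent time-feature to the time step where it attains its maximum.
salient_steps = sorted({int(np.argmax(time_features[:, r]))
                        for r in range(time_features.shape[1])})
print("salient time steps:", salient_steps)
```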
June 2021
·
150 Reads
·
24 Citations
Extreme-scale cosmological simulations have been widely used by today’s researchers and scientists on leadership supercomputers. A new generation of error-bounded lossy compressors has been used in workflows to reduce storage requirements and minimize the impact of throughput limitations while saving large snapshots of high-fidelity data for post-hoc analysis. In this paper, we propose to adaptively provide compression configurations to compute partitions of cosmological simulations with newly designed post-analysis-aware rate-quality modeling. The contribution is fourfold: (1) We propose a novel adaptive approach to select feasible error bounds for different partitions, showing the possibility and efficiency of adaptively configuring lossy compression for each partition individually. (2) We build models to estimate the overall loss in post-analysis results due to lossy compression and to estimate the compression ratio, based on the properties of each partition. (3) We develop an efficient optimization guideline to determine the best-fit combination of error bounds in order to maximize the compression ratio under an acceptable post-analysis quality loss. (4) Our approach introduces negligible overheads for feature extraction and error-bound optimization for each partition, enabling post-analysis-aware in situ lossy compression for cosmological simulations. Experiments show that our proposed models are highly accurate and reliable. Our fine-grained adaptive configuration approach improves the compression ratio by up to 73% on the tested datasets at the same post-analysis distortion, with only 1% performance overhead.
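A schematic of the per-partition error-bound selection could look like the sketch below; the two estimator functions are toy stand-ins for the paper's rate-quality models, and the candidate bounds and quality budget are illustrative assumptions.

```python
# Per-partition error-bound selection under a post-analysis quality budget (sketch).
# The estimators below are toy stand-ins for the paper's rate-quality models.
import numpy as np

CANDIDATE_BOUNDS = [1e-4, 1e-3, 1e-2, 1e-1]   # illustrative absolute error bounds
QUALITY_BUDGET = 0.05                         # illustrative per-partition loss budget

def estimate_quality_loss(partition, error_bound):
    # Toy model: estimated loss grows with the bound relative to local variability.
    return error_bound / (np.std(partition) + 1e-12)

def estimate_compression_ratio(partition, error_bound):
    # Toy model: smoother partitions and looser bounds compress better.
    return (1.0 + 1e3 * error_bound) / (np.std(partition) + 1e-3)

def select_error_bound(partition):
    """Pick the loosest candidate bound whose estimated loss stays within budget."""
    best = min(CANDIDATE_BOUNDS)
    for eb in sorted(CANDIDATE_BOUNDS):
        if estimate_quality_loss(partition, eb) <= QUALITY_BUDGET:
            best = eb
    return best

# Three synthetic partitions with different variability.
partitions = [np.random.rand(32, 32, 32) * s for s in (0.1, 1.0, 10.0)]
for i, p in enumerate(partitions):
    eb = select_error_bound(p)
    print(f"partition {i}: error bound {eb:g}, "
          f"estimated ratio {estimate_compression_ratio(p, eb):.1f}")
```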
March 2021
·
50 Reads
February 2021
·
31 Reads
·
9 Citations
IEEE Transactions on Visualization and Computer Graphics
Contour trees are used for topological data analysis in scientific visualization. While originally computed with serial algorithms, recent work has introduced a vector-parallel algorithm. However, this algorithm is relatively slow for fully augmented contour trees, which are needed for many practical data analysis tasks. We therefore introduce a representation called the hyperstructure that enables efficient searches through the contour tree, and we use it to construct a fully augmented contour tree in a data-parallel fashion, on average 6 times faster than the state-of-the-art parallel algorithm in the TTK topological toolkit.
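For intuition only, the sketch below computes the saddle vertices of a join (merge) tree on a 1D scalar field with a serial union-find sweep; it illustrates the component-merging structure that contour trees encode, but it is not the data-parallel hyperstructure algorithm described here.

```python
# Serial join-tree saddle detection on a 1D scalar field (intuition only).
# This is NOT the data-parallel hyperstructure algorithm from the paper.
import numpy as np

def join_tree_saddles_1d(values):
    """Return (vertex, value) pairs where two superlevel-set components merge."""
    n = len(values)
    parent = {}                                    # union-find over processed vertices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]          # path compression
            i = parent[i]
        return i

    saddles = []
    for v in np.argsort(values)[::-1]:             # sweep from the maxima downward
        parent[v] = v
        neighbor_roots = {find(nb) for nb in (v - 1, v + 1)
                          if 0 <= nb < n and nb in parent}
        if len(neighbor_roots) >= 2:               # two components meet here: a saddle
            saddles.append((int(v), float(values[v])))
        for root in neighbor_roots:
            parent[root] = v                       # v absorbs the neighboring components
    return saddles

field = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4])
print(join_tree_saddles_1d(field))                 # -> [(2, 0.3), (4, 0.2)]
```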
December 2020
·
78 Reads
·
1 Citation
Computing in Science & Engineering
Application scientists often employ feature tracking algorithms to capture the temporal evolution of various features in their simulation data. However, as the complexity of scientific features increases with advanced simulation modeling techniques, quantifying the reliability of feature tracking algorithms is becoming important. One desired requirement for any robust feature tracking algorithm is to estimate its confidence at each tracking step so that the results obtained can be interpreted without ambiguity. To address this, we develop a confidence-guided feature tracking algorithm that allows reliable tracking of user-selected features and presents the tracking dynamics using a graph-based visualization along with the spatial visualization of the tracked feature. The efficacy of the proposed method is demonstrated by applying it to two scientific data sets containing different types of time-varying features.
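A simple overlap-based sketch of tracking one selected feature between two segmented time steps, with a per-step confidence score, is shown below; the Jaccard-overlap confidence is an illustrative stand-in, not the confidence measure developed in the paper.

```python
# Overlap-based feature tracking with a per-step confidence score (sketch).
# The Jaccard confidence is an illustrative stand-in for the paper's measure.
import numpy as np
from skimage import measure

def track_feature(labels_t, feature_id, labels_t1):
    """Match a feature at time t to the best-overlapping region at time t+1.

    Returns (matched_label, confidence), where confidence is the Jaccard index
    between the feature mask and the matched region (0.0 if nothing overlaps).
    """
    mask_t = labels_t == feature_id
    best_label, best_conf = None, 0.0
    for region in measure.regionprops(labels_t1):
        mask_t1 = labels_t1 == region.label
        intersection = np.logical_and(mask_t, mask_t1).sum()
        union = np.logical_or(mask_t, mask_t1).sum()
        confidence = intersection / union if union else 0.0
        if confidence > best_conf:
            best_label, best_conf = region.label, confidence
    return best_label, best_conf

# Toy example: a square feature that drifts by one cell between time steps.
field_t  = np.zeros((8, 8), dtype=int); field_t[2:5, 2:5] = 1
field_t1 = np.zeros((8, 8), dtype=int); field_t1[3:6, 3:6] = 1
labels_t, labels_t1 = measure.label(field_t), measure.label(field_t1)
print(track_feature(labels_t, 1, labels_t1))       # -> (1, 0.2857...)
```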
November 2020
·
9 Reads
·
7 Citations
November 2020
·
24 Reads
·
7 Citations
AIP Conference Proceedings
A variety of flow stress models exist, with new models constantly being developed. These models aim to approximate the strength of materials in a variety of regimes, from quasistatic loading through shock scenarios. All models contain an array of parameters that need to be tuned to the material under study. Some models perform well under limited conditions, requiring adjustment of the parameters when venturing outside of those predefined ranges. Other models perform well over a wide range of conditions with a single set of parameters, but may be outperformed by models optimized on a tighter range of conditions. Recent research at Los Alamos demonstrated the ability to optimize the Johnson-Cook (JC) model using a set of three plate-impact experiments on aluminum. They utilized Bayesian statistics and emulation to determine optimal parameters for the model along with a quantification of parameter uncertainty. We present an extension of this capability that incorporates velocimetry from plate-impact tests, stress-strain data from split Hopkinson pressure bar and quasistatic compression tests, plus profiles from Taylor cylinders in a unified fashion. Statistically robust comparisons of the performance and uncertainty of different realizations of the JC flow stress model were carried out based on calibration to several possible combinations of these three different experiment types.
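For reference, the JC flow stress model whose parameters (A, B, n, C, m) are calibrated in this kind of study takes the standard form

```latex
\sigma_y \;=\; \left( A + B\,\varepsilon_p^{\,n} \right)
               \left( 1 + C \ln \dot{\varepsilon}^{*} \right)
               \left( 1 - {T^{*}}^{\,m} \right),
\qquad
\dot{\varepsilon}^{*} = \frac{\dot{\varepsilon}_p}{\dot{\varepsilon}_0},
\qquad
T^{*} = \frac{T - T_{\mathrm{room}}}{T_{\mathrm{melt}} - T_{\mathrm{room}}},
```

where ε_p is the equivalent plastic strain, ε̇_0 a reference strain rate, and T the temperature; the Bayesian calibration described above places posterior distributions over A, B, n, C, and m.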
... For example, El-Rushaidat et al. [17] aimed to convert unstructured data into rectilinear grids, while Berger and Rigoutsos [28] further explored clustering and adaptive mesh refinement (AMR) algorithms to reduce grids into non-uniform structures consisting of fewer rectangular patches. Various studies have also addressed compressing data in AMR forms [29]-[31]. Our work is motivated by some mesh-to-grid data approximation techniques discussed in the aforementioned work but fundamentally differs in two key aspects: first, our primary objective is to reduce the storage cost rather than accelerate analytic tasks such as visualization; second, we aim to maintain the general structure of the reduced data and mathematically preserve errors incurred during data compression. ...
November 2023
... A pre-processing workflow to enhance the performance of the lossy compressor SZ for AMR data has been presented by Wang et al. [52]. Another pre-processing strategy for AMR data has been laid out by Li et. ...
November 2023
... Despite the existence of various AMR data compression solutions, none of the studies have comprehensively examined the impact of lossy compression on the visualization of AMR data. While there are studies that analyze the effects of data compression on non-AMR data visualization [31], the visualization of AMR data is more complex due to its hierarchical structure. ...
November 2022
... These steps can be seen in Figure 7. The interested reader is directed to Dutta et al. (2022a) and Dutta et al. (2022b) for details. ...
January 2023
Lecture Notes in Computer Science
... Because all children in this cohort entered school at the same point in time, we cannot distinguish these two exposures and determine the separate effects of school vs. non-school exposure. In the United States, COVID-19 incidence has been shown to vary with the start and return to school following breaks, and thus these secular trends may also have been driven by school attendance [42][43][44]. In particular, the initial wave was generally coincident with the start of the school year in February, with a second smaller peak occurring around the typical winter break in July. ...
September 2022
Epidemics
... The interested reader is directed to Dutta et al. (2022a) and Dutta et al. (2022b) for details. ...
July 2022
Journal of Computational Science
... Wang et al. [50] used a similar approach proposing an error-bounded lossy compression approach for 3D AMR data which was later extended in [51]. The authors compressed each refinement level of the AMR data separately. ...
May 2022
... Therefore, the histogram+gradient-based version is generally better at retaining important features of the data than the purely histogram-based version of the sampling algorithm. The interested reader is directed to Biswas et al. (2022, 2021) for further information. ...
January 2022
... Frey et al. [13] introduced a method for time-step selection to reduce memory cost and speed up the rendering process. Gross et al. [15] proposed a sub-sampling algorithm for time-dependent data. ...
November 2021
... To illustrate how pyDNMF-GPU can be used as a building block for more comprehensive workflows, we integrate pyDNMF-GPU with our existing model selection algorithm pyDNMFk 1 that enables automatic determination of the (usually unknown) number of latent features on large-scale datasets [4][5][6][7][8]. We previously utilized the integrated model selection algorithm to decompose the world's largest collection of human cancer genomes [9], defining cancer mutational signatures [10], and have successfully applied it to solve real-world problems in various fields [8,[11][12][13][14][15][16][17][18][19]. ...
August 2021