Scientific Computing in the Cloud

Univ. of Washington, Seattle, WA, USA
Computing in Science and Engineering (Impact Factor: 0.99). 07/2010; 12(3):34-43. DOI: 10.1109/MCSE.2010.70
Source: IEEE Xplore


Large, virtualized pools of computational resources raise the possibility of a new, advantageous computing paradigm for scientific research. To help achieve this, new tools make the cloud platform behave virtually like a local homogeneous computer cluster, giving users access to high-performance clusters without requiring them to purchase or maintain sophisticated hardware.



Available from: Jeffrey P. Gardner
  • Source
    • "The work in [9] is divided in two main areas: i) Benchmarking, where the performance of the FEFF codes on EC2 hardware have been tested, and ii) Development, where an environment is created which permits the FEFF user community to run different versions of the software in their own EC2-resident compute clusters."
    DESCRIPTION: Cloud Computing is emerging today as a commercial infrastructure that eliminates the need for maintaining expensive computing hardware. Through the use of virtualization, clouds promise to address with the same shared set of physical resources a large user base with different needs. Thus, clouds promise to be for scientists an alternative to clusters, grids, and supercomputers. However, virtualization may induce significant performance penalties for the demanding scientific computing workloads. In this work, we present an evaluation of the usefulness of the current cloud computing services for scientific computing. We analyze the performance of the Ubuntu Enterprise DIY Eucalyptus and Amazon EC2 platform using SPEC MPI 2007 Benchmarks Suite.
  • Source
    • "Accordingly, the adoption of a typical cloud dedicated paradigm, like the MapReduce framework, is not straightforward and not necessarily effective [33], [34], [35]. Finally, due to both the size of involved data, which can range from hundreds of GBytes up to tens of TBytes for the processing of a single interferometric dataset, destined to enlarge to PB-scale with Sentinel-1, and the nature of the operations which are performed on them, the majority of SAR algorithms are characterized by highly demanding requirements in terms of computing resources and hardware. "
    ABSTRACT: We present a case study on the migration to a cloud computing environment of the advanced differential synthetic aperture radar interferometry (DInSAR) technique, referred to as Small BAseline Subset (SBAS), which is widely used for the investigation of Earth surface deformation phenomena. In particular, we focus on the SBAS parallel algorithmic solution, namely P-SBAS, that allows the production of mean deformation velocity maps and the corresponding displacement time-series from a temporal sequence of radar images by exploiting distributed computing architectures. The Cloud migration is carried out by encapsulating the overall P-SBAS application in virtual machines running on the cloud; moreover, the cloud resources provisioning and configuration phases are implemented in an automatic way. Such an approach allows us to preserve the P-SBAS parallelization strategy and to straightforwardly evaluate its performance within a cloud environment by comparing it with those achieved on a HPC in-house cluster. The results we present were achieved by using the Amazon Elastic Compute Cloud (EC2) of the Amazon Web Services (AWS) to process SAR datasets collected by the ENVISAT satellite and show that, thanks to the cloud resources availability and flexibility, large DInSAR data volumes can be processed through the P-SBAS algorithm in short time frames and at reduced costs. As a case study, the mean deformation velocity map of the southern California area has been generated by processing 172 ENVISAT images. By exploiting 32 EC2 instances this processing took less than 17 hours to complete, with a cost of USD 850. 
    Considering the available PB-scale archives of SAR data and the upcoming huge SAR data flow relevant to the recently launched (April 2014) Sentinel-1A and the forthcoming Sentinel-1B satellites, the exploitation of cloud computing solutions is particularly relevant because of the possibility to provide cloud-based multi-user services allowing worldwide scientists to quickly process SAR data and to manage and access the achieved DInSAR results.
    IEEE Transactions on Cloud Computing 01/2015; DOI:10.1109/TCC.2015.2440267
  • Source
    • "EC2 is a cloud service whereby one can rent virtual machines from Amazon data center and deploy scalable applications on them. Several works are conducted to evaluate EC2 performance [64]. Wall et al. concluded that the effort to transform existing comparative genomics algorithms from local infrastructures to cloud is not trivial, but the cloud environment is an economical alternative in the speed and flexibility considerations [65]. "
    ABSTRACT: Recent progress in high-throughput instrumentations has led to an astonishing growth in both volume and complexity of biomedical data collected from various sources. The planet-size data brings serious challenges to the storage and computing technologies. Cloud computing is an alternative to crack the nut because it gives concurrent consideration to enable storage and high-performance computing on large-scale data. This work briefly introduces the data intensive computing system and summarizes existing cloud-based resources in bioinformatics. These developments and applications would facilitate biomedical research to make the vast amount of diversification data meaningful and usable.
    10/2013; 2013(9):185679. DOI:10.1155/2013/185679