Ivan Rodero

Rutgers, The State University of New Jersey | Rutgers · Rutgers Discovery Informatics Institute

Ph.D.

124
Publications
21,445
Reads
1,523
Citations
Additional affiliations
September 2009 - present
Rutgers, The State University of New Jersey
June 2007 - August 2007
IBM
Position
  • Research Intern
July 2005 - August 2009
Barcelona Supercomputing Center
Education
September 2004 - February 2009
Universitat Politècnica de Catalunya
Field of study
  • Computer Science and Engineering

Publications (124)
Preprint
Full-text available
Data collected by large-scale instruments, observatories, and sensor networks are key enablers of scientific discoveries in many disciplines. However, ensuring that these data can be accessed, integrated, and analyzed in a democratized and timely manner remains a challenge. In this article, we explore how state-of-the-art techniques for data discov...
Article
Full-text available
Computational science depends on complex, data-intensive applications operating on datasets from a variety of scientific instruments. A major challenge is the integration of data into the scientist’s workflow. Recent advances in dynamic, networked cloud resources provide the building blocks to construct reconfigurable, end-to-end infrastructure th...
Article
With the growing number and increasing availability of shared-use instruments and observatories, observational data is becoming an essential part of application workflows and contributor to scientific discoveries in a range of disciplines. However, the corresponding growth in the number of users accessing these facilities coupled with the expansion...
Preprint
Full-text available
With the growing number and increasing availability of shared-use instruments and observatories, observational data is becoming an essential part of application workflows and contributor to scientific discoveries in a range of disciplines. However, the corresponding growth in the number of users accessing these facilities coupled with the expansion...
Article
Urgent science describes time-critical, data-driven scientific workflows that can leverage distributed data sources in a timely way to facilitate important decision making. While our capacity for generating data is expanding dramatically, our ability to manage, analyze, and transform this data into knowledge in a timely manner has not kept pace. T...
Preprint
In order to achieve near-time insights, scientific workflows tend to be organized in a flexible and dynamic way. Data-driven triggering of tasks has been explored as a way to support workflows that evolve based on the data. However, the overhead introduced by such dynamic triggering of tasks is an under-studied topic. This paper discusses different...
Article
One of the major endeavors of modern cyberinfrastructure (CI) is to carry content produced on remote data sources, such as sensors and scientific instruments, and to deliver it to end users and workflow applications. Maintaining data quality, data resolution, and on-time data delivery and considering the increasing number of computing, storage, and...
Article
Full-text available
Research Infrastructures (RIs) are large-scale facilities encompassing instruments, resources, data and services used by the scientific community to conduct high-level research in their respective fields. The development and integration of marine environmental RIs as European Research Vessel Operators [ERVO] (2020) is the response of the European C...
Conference Paper
Full-text available
Our research aims to improve the accuracy of Earthquake Early Warning (EEW) systems by means of machine learning. EEW systems are designed to detect and characterize medium and large earthquakes before their damaging effects reach a certain location. Traditional EEW methods based on seismometers fail to accurately identify large earthquakes due to...
Poster
EMSO is a European Research Infrastructure Consortium (ERIC) with 8 member countries. It is coordinated by a central management office and promotes monitoring services offered by 11 fixed-point deep-sea and water column observatories around Europe, from the Atlantic, through the Mediterranean, and to the anoxic Black Sea. EMSO aims are to advance mar...
Poster
Full-text available
EMSO ERIC, a pan-European Research Infrastructure, is supported by an integrated system of Regional Facilities involving 11 multi-sensor fixed-point Regional facilities-platforms deployed in 'key environmental sites' across the European seas. These platforms are engaged to the long-term multidisciplinary observation of the deep-sea ocean and water...
Conference Paper
Data and services provided by shared facilities, such as large-scale observing facilities, have become important enablers of scientific insights and discoveries across many science and engineering disciplines. Ensuring satisfactory quality of service can be challenging for facilities, due to their remote locations and to the distributed nature of t...
Article
The Virtual Data Collaboratory is a federated data cyberinfrastructure designed to drive data-intensive, interdisciplinary and collaborative research that will impact researchers, educators and entrepreneurs across a broad range of disciplines and domains as well as institutional and geographic boundaries.
Article
Large-scale scientific facilities provide a broad community of researchers and educators with open access to instrumentation and data products generated from geographically distributed instruments and sensors. This paper discusses key architectural design, deployment, and operational aspects of a production cyberinfrastructure for the acquisition,...
Technical Report
Full-text available
The Rutgers Discovery Informatics Institute (RDI2), New Jersey's Center for Advanced Computation, broadens access to state-of-the-art computing technology that enables large-scale "Big Data" analytics, computational modeling, and visualization, all of which are playing increasingly important roles in education, research and innovation. RDI2 uses ad...
Conference Paper
Full-text available
Modern Cyberinfrastructures (CIs) operate to bring content produced from remote data sources such as sensors and scientific instruments and deliver it to end users and workflow applications. Maintaining data quality/resolution and on-time data delivery while considering an increasing number of computing, storage and network resources requires a rea...
Preprint
Emerging non-volatile memory technologies (NVRAM) offer alternatives to hard drives that are persistent, while providing similar latencies to DRAM. Intel recently released the Optane drive, which features 3D XPoint memory technology. This device can be deployed as an SSD or as persistent memory. In this paper, we provide a performance comparison be...
Article
Full-text available
The Ocean Observatories Initiative (OOI) is an integrated suite of instrumented platforms and discrete instruments that measure physical, chemical, geological, and biological properties from the seafloor to the sea surface. The OOI provides data to address large-scale scientific challenges such as coastal ocean dynamics, climate and ecosystem healt...
Conference Paper
A Distributed Denial of Service (DDoS) attack is an attempt to make an online service, a network, or even an entire organization, unavailable by saturating it with traffic from multiple sources. DDoS attacks are among the most common and most devastating threats that network defenders have to watch out for. DDoS attacks are becoming bigger, more fr...
Conference Paper
Enterprise and Cloud environments are rapidly evolving with the use of lightweight virtualization mechanisms such as containers. Containerization allows users to deploy applications in any environment faster and more efficiently than using virtual machines. However, most of the work in this area focused on Linux-based containerization such as Docker...
Conference Paper
As data analytics applications become increasingly important in a wide range of domains, the ability to develop large-scale and sustainable platforms and software infrastructure to support these applications has significant potential to drive research and innovation in both science and business domains. This paper characterizes performance and powe...
Article
Full-text available
Internet of Things (IoT) is bringing an increasing number of connected devices that have a direct impact on the growth of data and energy-hungry services. These services are relying on Cloud infrastructures for storage and computing capabilities, transforming their architecture into a more distributed one based on edge facilities provided by Intern...
Conference Paper
Full-text available
Large scientific facilities provide researchers with instrumentation, data, and data products that can accelerate scientific discovery. However, increasing data volumes coupled with limited local computational power prevents researchers from taking full advantage of what these facilities can offer. Many researchers looked into using commercial and...
Conference Paper
Full-text available
Large scale observatories are shared-use resources that provide open access to data from geographically distributed sensors and instruments. This data has the potential to accelerate scientific discovery. However, seamlessly integrating the data into scientific workflows remains a challenge. In this paper, we summarize our ongoing work in supportin...
Conference Paper
Full-text available
Scientific simulation workflows executing on very large scale computing systems are essential modalities for scientific investigation. The increasing scales and resolution of these simulations provide new opportunities for accurately modeling complex natural and engineered phenomena. However, the increasing complexity necessitates managing, transpo...
Conference Paper
Global energy problems necessitate an urgent transformation of the existing electrical generation grid into a smart grid, rather than a gradual evolution. A smart grid is a real-time bi-directional communication network between end users and their utility companies which monitors power demand and manages the provisioning and transport of electricit...
Conference Paper
Cloud computing technology is being adopted by a growing number of organizations as a way to manage, store and process data. Protecting this infrastructure and key resources such as data and recovering against cyber threats has become a critical concern. At the same time, cloud computing technology such as virtualization and on-demand provisioning...
Conference Paper
Sensitivity analysis (SA) is a fundamental tool of uncertainty quantification (UQ). Adjoint-based SA is the optimal approach in many large-scale applications, such as the direct numerical simulation (DNS) of combustion. However, one of the challenges of the adjoint workflow for time-dependent applications is the storage and I/O requirements for the...
Article
Full-text available
As the number of people who interact on social networks increases, coupled with the greater capability made available within our computational devices, there is the potential to establish "Social Clouds" - a resource sharing infrastructure that enables people who have trust relationships to come together to share computational/data services with...
Chapter
Cloud computing has emerged as a dominant paradigm that has been widely adopted by enterprises. Clouds provide on-demand access to computing utilities, an abstraction of unlimited computing resources, and support for on-demand scale up, scale down and scale out. Clouds are also rapidly joining high performance computing systems, clusters and grids a...
Article
Full-text available
Background: The development of digital imaging technology is creating extraordinary levels of accuracy that provide support for improved reliability in different aspects of the image analysis, such as content-based image retrieval, image segmentation, and classification. This has dramatically increased the volume and rate at which data are generated...
Article
Emerging scientific simulations on leadership class systems are generating huge amounts of data and processing this data in an efficient and timely manner is critical for generating insights from the simulations. However, the increasing gap between computation and disk I/O speeds makes traditional data analytics pipelines based on post-processing c...
Technical Report
Grid resource management research has achieved several advances in recent years. Interoperability among different grid systems, Service Level Agreements, and job self-scheduling are some of the current research topics. In this paper we present the new design and implementation of the eNANOS Grid Resource Broker that we have developed to suppo...
Conference Paper
The development of digital imaging technology is creating extraordinary levels of accuracy that provide support for improved reliability in different aspects of the image analysis such as content-based image retrieval, image segmentation and classification. This has dramatically increased the volume and generation rates of data, which make querying...
Article
Power and energy are critical concerns for high performance computing systems from multiple perspectives, including cost, reliability/resilience and sustainability. At the same time, data locality and the cost of data movement have become dominating concerns in scientific workflows. One potential solution for reducing data movement costs is to use...
Article
Full-text available
Mobile platforms are becoming the predominant medium of access to Internet services due to the tremendous increase in their computation and communication capabilities. However, enabling applications that require real-time in-the-field data collection and processing using mobile platforms is still challenging due to i) the insufficient computing cap...
Article
Full-text available
The complexity of many problems in science and engineering requires computational capacity exceeding what the average user can expect from a single computational center. While many of these problems can be viewed as a set of independent tasks, their collective complexity easily requires millions of core-hours on any high-power computing (HPC) resou...
Conference Paper
The increasing gap between the rate at which large scale scientific simulations generate data and the corresponding storage speeds and capacities is leading to more complex system architectures with deep memory hierarchies. Advances in non-volatile memory (NVRAM) technology have made it an attractive candidate as intermediate storage in this memory...
Conference Paper
Full-text available
As scientific applications target exascale, challenges related to data and energy are becoming dominating concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address costs and overheads due to data movement and I/O. However, it is also critical to understand these overheads...
Conference Paper
In recent years, the MapReduce programming model, and specifically its open source implementation Hadoop, has been widely used by organizations to perform large-scale data processing tasks such as web indexing and data mining, as well as scientific simulations. The key benefits of this programming model include its simple programming interface and ability to...
Conference Paper
We present a federation model to support the dynamic federation of resources and autonomic management mechanisms that coordinate multiple workflows to use resources based on objectives. We illustrate the effectiveness of the proposed framework and autonomic mechanisms through the discussion of representative use case application scenarios, and from...
Article
Full-text available
Clouds are rapidly joining high-performance computing (HPC) systems, clusters, and grids as viable platforms for scientific exploration and discovery. As a result, understanding application formulations and usage modes that are meaningful in such a hybrid infrastructure, and how application workflows can effectively utilize it, is critical. Here, t...
Article
Full-text available
The goal of Grid computing is to integrate the usage of computer resources from cooperating partners in the form of Virtual Organizations (VO). One of its key functions is to match jobs to execution resources efficiently. For interoperability between VOs, this matching operation occurs in resource brokering middleware, commonly referred to as the m...
Technical Report
Full-text available
High Performance Computing (HPC) has evolved over the past decades into increasingly complex and powerful systems. Current HPC systems consume several MWs of power, enough to power small towns, and are in fact approaching the limits of the power available to them. Estimates are that, with current technology, achieving exascale will require...
Article
Social Clouds provide the capability to share resources among participants within a social network—leveraging on the trust relationships already existing between such participants. In such a system, users are able to trade resources between each other rather than make use of capability offered at a (centralized) data center. Although such an enviro...
Chapter
The purpose of this chapter is to identify and analyze the challenges of creating new software in the public cloud due to legal regulations. Specifically, this chapter explores how the Sarbanes-Oxley Act (SOX) will indirectly affect the development and implementation process of cloud computing applications in terms of software engineering and actua...
Conference Paper
Full-text available
Emerging scientific simulations on leadership class systems are generating huge amounts of data. However, the increasing gap between computation and disk I/O speeds makes traditional data analytics pipelines based on post-processing cost prohibitive and often infeasible. In this paper, we investigate an alternate approach that aims to bring the ana...
Article
We show how a layered Cloud service model of software (SaaS), platform (PaaS), and infrastructure (IaaS) leverages multiple independent Clouds by creating a federation among the providers. The layered architecture leads naturally to a design in which inter-Cloud federation takes place at each service layer, mediated by a broker specific to the conc...
Article
Full-text available
Enabling data- and compute-intensive applications that require real-time in-the-field data collection and processing using mobile platforms is still a significant challenge due to i) the insufficient computing capabilities and unavailability of complete data on individual mobile devices and ii) the prohibitive communication cost and response time...
Article
Full-text available
Virtualized datacenters and clouds are being increasingly considered for traditional High-Performance Computing (HPC) workloads that have typically targeted Grids and conventional HPC platforms. However, maximizing energy efficiency and utilization of datacenter resources, and minimizing undesired thermal behavior while ensuring application perform...
Chapter
Introduction; Background and Related Work; Proactive, Component-Based Power Management; Quantifying Energy Saving Possibilities; Evaluation of the Proposed Strategies; Results; Concluding Remarks; Summary; References
Technical Report
Full-text available
High-performance parallel computing architectures are increasingly based on multi-core processors. While current commercially available processors are at 8 and 16 cores, technological and power constraints are limiting the performance growth of the cores and are resulting in architectures with much higher core counts, such as the experimental many-...
Conference Paper
Full-text available
Social Clouds provide the capability to share resources among partic