Conference Paper

Towards a Generic Cloud-Based Virtual Research Environment


Abstract

Virtual collaboration is important to the success of scientific projects, especially when the participating researchers are distributed across the globe. In the recent past, several systems, so-called virtual research environments, have been presented to support collaborative work, but they are restricted to certain research domains. This article proposes a concept for a generic framework that makes it easy to build personal, cloud-based virtual research environments. Such an environment can be defined by composing arbitrary services according to the requirements of a particular scientist. Because funding is low in some scientific areas, we also provide a flexible billing strategy based on the cloud-specific pay-per-use model. Thus, each service has to be paid for only as long as it is utilized.

Keywords: Virtual Research Environment; Cloud Computing; Layered Architecture; Virtualization; Virtual Collaboration; Pay-per-use
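
The two ideas in the abstract, composing an environment from arbitrary services and paying for each service only while it is used, can be illustrated with a minimal Python sketch. It is not the authors' framework: the class names, the service names, and the hourly rates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Service:
    """A billable service; the name and hourly rate are hypothetical."""
    name: str
    rate_per_hour: float      # assumed pricing unit, not taken from the paper
    hours_used: float = 0.0   # accumulated utilization

    def cost(self) -> float:
        # Pay-per-use: a service that was never used (0 hours) costs nothing.
        return self.rate_per_hour * self.hours_used


@dataclass
class PersonalVRE:
    """A personal environment defined by composing arbitrary services."""
    owner: str
    services: Dict[str, Service] = field(default_factory=dict)

    def add_service(self, name: str, rate_per_hour: float) -> None:
        # Adding a service to the composition does not incur any cost by itself.
        self.services[name] = Service(name, rate_per_hour)

    def record_usage(self, name: str, hours: float) -> None:
        if hours < 0:
            raise ValueError("hours must be non-negative")
        self.services[name].hours_used += hours

    def bill(self) -> float:
        # Total charge is the sum over services of rate * time actually used.
        return sum(s.cost() for s in self.services.values())


vre = PersonalVRE(owner="alice")
vre.add_service("wiki", rate_per_hour=0.5)
vre.add_service("compute", rate_per_hour=2.0)
vre.record_usage("wiki", 4)
print(vre.bill())  # only the wiki has been used so far
```

In this sketch a scientist with little funding can keep a service in the composition without being charged until usage is recorded for it.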


... Scientific workflow management systems are other examples of support systems; they aim to automate the execution of data pipelines or computing tasks across remotely distributed infrastructures [15]. VREs or Science Gateways can, to a certain extent, be seen as a new generation of those early systems, with better support for data-intensive applications on remote infrastructures (e.g., cloud) [16], research activities across entire research lifecycles, and collaboration and sharing within a community [17]. In many cases, the differences among those systems are not clear-cut. ...
Article
Virtual research environments (VREs) provide user‐centric support in the lifecycle of research activities, for example, discovering and accessing research assets or composing and executing application workflows. A typical VRE is often implemented as an integrated environment, including a catalog of research assets, a workflow management system, a data management framework, and tools for enabling user collaboration. In contrast, notebook environments like Jupyter allow researchers to rapidly prototype scientific code and share their experiments as online accessible notebooks. Jupyter can support several popular languages used by data scientists, such as Python, R, and Julia. However, such notebook environments do not have seamless support for running heavy computations on remote infrastructure or finding and accessing collaborative software code inside notebooks. This article investigates the gap between a notebook environment and a VRE and proposes an embedded VRE solution for the Jupyter environment called Notebook‐as‐a‐VRE (NaaVRE). The NaaVRE solution provides functional components via a component marketplace and allows users to create a customized VRE on top of the Jupyter environment. From the VRE, a user can search research assets (data, software, and algorithms), compose workflows, manage the lifecycle of an experiment, and share the results among users in the community. We demonstrate how such a solution can enhance a legacy workflow that uses Light Detection and Ranging (LiDAR) data from country‐wide airborne laser scanning surveys for deriving geospatial data products of ecosystem structure at high resolution over broad spatial extents. This enables users to scale out the processing of multi‐terabyte LiDAR point clouds for ecological applications to more data sources in a distributed cloud environment. Similar applications could be developed for workflows producing other essential biodiversity variables.
... These working environments, which comprehensively serve the needs of a community of practice, are commonly referred to as Virtual Research Environments (VREs). Roth et al. (2011) and Assante et al. (2019) identify cloud resources as the underlying "global virtual infrastructure" for these systems and provide two similar implementations that offer on-demand allocation of VREs. Both implementations allow VREs to be composed dynamically from a collection of scientific applications, which are nevertheless installed directly on Virtual Machines (VMs). ...
Article
The computational demands for scientific applications are continuously increasing. The emergence of cloud computing has enabled on-demand resource allocation. However, relying solely on infrastructure as a service does not achieve the degree of flexibility required by the scientific community. Here we present a microservice-oriented methodology, where scientific applications run in a distributed orchestration platform as software containers, referred to as on-demand, virtual research environments. The methodology is vendor agnostic and we provide an open source implementation that supports the major cloud providers, offering scalable management of scientific pipelines. We demonstrate applicability and scalability of our methodology in life science applications, but the methodology is general and can be applied to other scientific domains.
... Bastian Roth et al. [10] examined the challenges in scientific collaboration and proposed an approach that leverages groupware tools and hypervisor-based virtualization techniques such as KVM, VMware vSphere, or Xen to run a generic collaboration platform. ...
Article
Cloud-based research collaboration platforms render scalable, secure and inventive environments that enable academic and scientific researchers to share research data and applications and provide access to high-performance computing resources. Dynamic allocation of resources according to the unpredictable needs of applications used by researchers is a key challenge in collaborative research environments. We propose the design of the Cloud Container based Collaborative Research (CCCORE) framework to address dynamic resource provisioning according to the variable workload of compute- and data-intensive applications or analysis tools used by researchers. Our proposed approach relies on on-demand, customized containerization and a comprehensive assessment of resource requirements to achieve optimal resource allocation in a dynamic collaborative research environment. We propose algorithms for the dynamic resource allocation problem in a collaborative research environment, which aim to minimize finish time, improve throughput and achieve optimal resource utilization by employing the underutilized residual resources.
... stical analysis platforms (e.g. Biocep-R [8] and CloudNumbers [2]), and social networks (e.g. Mendeley and Academia.edu). Efforts for designing domain-driven solutions include the following two architectural proposals. [20] presents a use case of similar architectural elements and hybrid infrastructure deployment, but does not use RESTful services. [19] defines a generic high-level framework for assembling virtual research environments. Domain-driven solutions are used for data discovery, data normalisation, and workflow execution. In the domain of environmental and geosciences, the NSF-funded Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI) developed seve ...
Article
Environmental science is often fragmented: data is collected using mismatched formats and conventions, and models are misaligned and run in isolation. Cloud computing offers a lot of potential in the way of resolving such issues by supporting data from different sources and at various scales, by facilitating the integration of models to create more sophisticated software services, and by providing a sustainable source of suitable computational and storage resources. In this paper, we highlight some of our experiences in building the Environmental Virtual Observatory pilot (EVOp), a tailored cloud-based infrastructure and associated web-based tools designed to enable users from different backgrounds to access data concerning different environmental issues. We review our architecture design, the current deployment and prototypes. We also reflect on lessons learned. We believe that such experiences are of benefit to other scientific communities looking to assemble virtual observatories or similar virtual research environments.
Chapter
With advances in information technologies and a better mobile environment, virtual and personal research environments can be made available to researchers. At the same time, as more investment is made in R&D, efforts to enhance R&D productivity are becoming important. This study proposes a conceptual design of a network-based personalized research environment, called NePRE, as a basis for developing a tool that assists researchers in their R&D efforts and that they can use more easily in their R&D information activities. To do this, we analyze existing research support environments in terms of R&D information activities, and we analyze changes in the information environment in terms of personalization. Subsequently, we define three design principles of NePRE for personalization. Finally, we define its key functions to assist researchers with respect to six R&D information activities.
Article
With advances in information technologies and a better mobile environment, the service paradigm is shifting again, from web portals to networked applications based on individual application programs. Furthermore, as more investment is made in R&D, efforts to enhance R&D productivity are becoming important. In this paper, we designed a personalized service model for developing a tool to assist researchers in their R&D activities. To do this, we first compared services and tools in terms of the information activities of researchers in R&D. We also analyzed changes in the information environment in terms of personalization, such as the open expansion of information and data, stronger protection of personal information, the popularization of social networking services, very large content, and advances in web platform technology, and defined some directions for developing a personalized service. Subsequently, we designed a personalized service model of a research support tool from the viewpoints of functions, contents, and operation, and defined the personalized design goals and principles for implementing it as standard, participation, and open.
Conference Paper
Virtual Research Environments (VREs) are intended to support the research process through collections of web-based services and to enable platform- and location-independent scientific work, including across disciplines. Numerous publicly accessible offerings, some of them commercial, are already on the market; even more university "home-grown" solutions are in the planning or implementation phase. For the development of such IT artifacts in particular, the question arises as to which requirements the platforms should fulfil. Contributions from 41 journals were examined by means of a literature review. For the period 2008 to 2012, 44 contributions were identified that address functional and non-functional requirements for virtual research environments.
Conference Paper
In times of tight budgets and simultaneously increasing financial autonomy, ideas and steering instruments of New Public Management have for some time been finding their way into higher education. These include, for example, issuing budgets as budgeted global budgets and, in particular, the use of cost and performance accounting to satisfy internal and external stakeholders. In this context, the importance of IT controlling is growing, not least because IT services are charged back to the service recipients. After an outline of the existing literature, this paper develops an overall model for integrating "cost accounting" at universities and finally illustrates it with a case study.
Conference Paper
With the emergence and adoption of cloud computing, the cloud has become an effective collaboration platform for integrating various software tools and delivering them as services. In this paper, we present a cloud-based image processing toolbox that integrates Galaxy, Hadoop and our proprietary image processing tools. This toolbox allows users to easily design and execute complex image processing tasks by sharing various advanced image processing tools and scalable cloud computation capacity. The paper provides the integration architecture and technical details about the whole system. In particular, we present our investigations into using Hadoop to handle massive image processing jobs in the system. A number of real image processing examples are used to demonstrate the usefulness and scalability of this class of data-intensive applications.
Article
Purpose: Today's requirements concerning successful learning support comprise a variety of application scenarios. Therefore, the development of supporting software preferably aims at modular design. This article discusses requirements regarding the flexibility of e‐learning systems and presents important principles that successful systems should meet. The purpose of this paper is to achieve a highly flexible system as follows: first, the system itself should be easy to integrate into other systems. Second, the approach should allow easy integration of new components or existing resources without the need to adapt the whole system.
Design/methodology/approach: Guided by the results of previous projects and by various experiences in online education, the importance of modular structures for an effective architecture as well as for system usage was discovered. Accordingly, existing e‐learning systems were examined, and some deficiencies regarding support of synchronous learning activities were found.
Findings: The architecture of the Meeting Room Platform (MRP) is introduced as an example implementation of synchronous communication and collaboration systems. In addition to fulfilling the flexibility requirements explained, it is configurable so that the user can choose the set of services he wants to provide in online meetings.
Originality/value: With the aforementioned aspects of flexibility in mind, the concept of the MRP system differs from existing systems and constitutes a new approach to designing synchronous e‐learning environments. Finally, various use cases described in this article show the benefit of this approach in more detail.
Conference Paper
Cloud computing delivers platform, software application and hardware infrastructure as a service over the Internet. It allows users to utilize services on demand under a pay-per-use model, as offered by Amazon EC2. There are two types of cost involved, fixed and variable, and cost rises as demand increases. In this paper, the web service requests coming from users are categorized into groups, and virtual machines (VMs) are created group-wise from the physical servers. The mapping between these two group IDs is done by an application provisioner. A fixed cost is charged for requesting reserved resources as per the Service Level Agreement (SLA), and a variable cost for requesting instantaneous resources. The ultimate cost charged to the user is reduced by 16%. Efficient workload monitoring and grouping of VMs and user requests help to finish user requests within time and to make better use of the available resources (physical servers). Idle time of resources is reduced from 50% to 23%.
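
The split into a fixed cost for reserved resources and a variable cost for instantaneous resources can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of the paper: the function name, the rates, and the rule that demand above the reserved capacity is served by instantaneous resources are all hypothetical.

```python
from typing import Dict, List


def billing_summary(
    demand: List[int],       # units requested in each billing period
    reserved: int,           # units reserved per period (assumed SLA value)
    fixed_rate: float,       # price per reserved unit and period (assumed)
    variable_rate: float,    # price per instantaneous unit (assumed)
) -> Dict[str, float]:
    """Compute fixed cost, variable cost, total, and idle share of the reservation."""
    periods = len(demand)

    # Reserved capacity is paid for in every period, whether used or not.
    fixed = reserved * fixed_rate * periods

    # Assumption: only demand exceeding the reservation needs instantaneous
    # resources, which are billed at the variable rate.
    overflow = sum(max(d - reserved, 0) for d in demand)
    variable = overflow * variable_rate

    # Idle fraction: share of the reserved capacity that was never used.
    capacity = reserved * periods
    used_reserved = sum(min(d, reserved) for d in demand)
    idle_fraction = 1 - used_reserved / capacity if capacity else 0.0

    return {
        "fixed": fixed,
        "variable": variable,
        "total": fixed + variable,
        "idle_fraction": idle_fraction,
    }


summary = billing_summary(demand=[2, 4, 6], reserved=4,
                          fixed_rate=1.0, variable_rate=3.0)
print(summary)
```

Varying `reserved` in such a sketch shows the trade-off the abstract refers to: a larger reservation lowers the variable cost but raises the idle share of the reserved resources.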
Conference Paper
In this paper we present the principles and the architecture of the Meeting Room Platform (MRP) as an example implementation of a synchronous communication and collaboration system. Our main goal is to achieve a highly flexible system as follows: First, the system itself should be easy to integrate into other systems such as Learning Management Systems (LMS). Second, the approach allows new components or existing resources to be integrated without the need to adapt the whole system. Finally, the system is configurable, so the user can choose the set of services that he wants to provide in his online meetings. With these three aspects of flexibility, the concept of the MRP system differs from existing systems and therefore constitutes a new approach to designing synchronous e-learning environments. Furthermore, various use cases of the system described in this paper show the benefit of this approach in more detail.
Article
This study investigated international developments in Virtual Research Communities (VRCs) and evaluated them in relation to the activities in the JISC's VRE programme. The study examined programmes in a number of key countries, along with significant projects and communities, as well as some countries where developments on this front are just beginning. There has been a great deal of activity over the past few years in terms of prototype and demonstration systems moving into the mainstream of research practice. Notable trends are emerging as researchers increasingly apply collaborative systems to everyday research tasks.
Article
The Science Clouds provide EC2-style cycles to scientific projects. This document contains a description of technologies enabling this project and an early summary of its experiences.
Article
We propose an integrated Cloud computing stack architecture to serve as a reference point for future mash-ups and comparative studies. We also show how the existing Cloud landscape maps into this architecture and identify an infrastructure gap that we plan to address in future work.
Article
Trust has been a focus of research on virtual collaboration in distributed teams, e-commerce, e-learning, and telemedicine. Central to several models of trust and virtual collaboration is the user's disposition to trust. This construct, however, has generally been conceptualized as a stand-alone trait without a substantive theoretical background in personality theory. This paper advances the interpersonal circumplex model (ICM) as a theoretical framework for understanding the role of personal traits in collaboration in virtual contexts. The ICM posits that tendencies in interpersonal interaction stem from personal dispositions that can be understood in terms of dimensions of power and affiliation, fundamental constituents of the user's personality. We develop a model that proposes that interpersonal traits, specifically personality type as defined by the circumplex, affect the individual's disposition to trust, perceived trustworthiness, and communication, and thereby affect willingness to collaborate and the sustainability and productivity of the collaboration. The model enables us to unpack the black box concepts of disposition to trust, faith in others, and trusting stance that are currently incorporated in theories of trust in information systems. The theory also enables explanation of trust dynamics at the dyadic and group levels. We develop propositions positing that an individual's traits and dyadic complementarity are mediating factors in interpersonal trust and willingness to use new technologies and significantly affect the initiation, duration, and productivity of computer-mediated collaboration.
Conference Paper
Many scientific workflow systems have been developed and are serving to benefit science. In this paper we look outside the workflow to consider the use of workflows within scientific practice, and we argue that the tremendous scientific potential of workflows will be achieved through mechanisms for sharing and collaboration – empowering the scientist to spread their experimental protocols and to benefit from the protocols of others. We discuss issues in workflow sharing, propose a set of design principles for collaborative e-Science software, and illustrate these principles in action through the design of the myExperiment Virtual Research Environment for collaboration and sharing of experiments.
Article
Cloud computing emerges as a new computing paradigm which aims to provide reliable, customized and QoS guaranteed dynamic computing environments for end-users. This paper reviews our early experience of Cloud computing based on the Cumulus project for data centers. In this paper, we introduce the Cumulus project with its various aspects, such as testbed, infrastructure, middleware and application models.
Conference Paper
This paper describes Biocep-R, an Open Source platform for the virtualization of Scientific Computing Environments (SCEs) such as R and Scilab. To our knowledge it is the first time that a software platform enables geographically distributed collaborators to view and analyze terabytes of data interactively and collaboratively, using standard computational tools. Those tools can be running on high performance machines or on a Cloud. This is also the first time that a full end-to-end solution is proposed for reproducible computational research in a Cloud and for virtual appliances-based education.
Article
This paper describes the £120M UK ‘e-Science’ (http://www.research-councils.ac.uk/ and http://www.escience-grid.org.uk) initiative and begins by defining what is meant by the term e-Science. The majority of the £120M, some £75M, is funding large-scale e-Science pilot projects in many areas of science and engineering. The infrastructure needed to support such projects must permit routine sharing of distributed and heterogeneous computational and data resources as well as supporting effective collaboration between groups of scientists. Such an infrastructure is commonly referred to as the Grid. Apart from £10M towards a Teraflop computer, the remaining funds, some £35M, constitute the e-Science ‘Core Programme’. The goal of this Core Programme is to advance the development of robust and generic Grid middleware in collaboration with industry. The key elements of the Core Programme will be outlined including details of a UK e-Science Grid testbed. The pilot e-Science projects that have so far been announced are then briefly described. These projects span a range of disciplines from particle physics and astronomy to engineering and healthcare, and illustrate the breadth of the UK e-Science Programme. In addition to these major e-Science projects, the Core Programme is funding a series of short-term e-Science demonstrators across a number of disciplines as well as projects in network traffic engineering and some international collaborative activities. We conclude with some remarks about the need to develop a data architecture for the Grid that will allow federated access to relational databases as well as flat files.
Article
This paper discusses the concept of Cloud Computing to achieve a complete definition of what a Cloud is, using the main characteristics typically associated with this paradigm in the literature. More than 20 definitions have been studied allowing for the extraction of a consensus definition as well as a minimum definition containing the essential characteristics. This paper pays much attention to the Grid paradigm, as it is often confused with Cloud technologies. We also describe the relationships and distinctions between the Grid and Cloud approaches.