Conference Paper

Data as a Service (DaaS) for Sharing and Processing of Large Data Collections in the Cloud


Abstract

Data as a Service (DaaS) is among the latest kinds of services being investigated in the Cloud computing community. The main aim of DaaS is to overcome the limitations of state-of-the-art approaches in data technologies, in which data is stored in and accessed from repositories whose location is known and is relevant to its sharing and processing. Besides limiting data sharing, current approaches also fail to fully decouple software services from data, and thus impose limitations on interoperability. In this paper we propose a DaaS approach for the intelligent sharing and processing of large data collections, with the aim of abstracting the data location (making it irrelevant to the needs of sharing and access) and of fully decoupling the data from its processing. The aim of our approach is to build a Cloud computing platform offering DaaS to support large communities of users that need to share, access, and process data in order to collectively build knowledge from it. We exemplify the approach with large data collections from the health and biology domains.
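The paper's architecture is not reproduced here, but the core idea of location abstraction can be illustrated with a minimal sketch. All names below (DataHandle, DaaSClient, the catalog) are hypothetical, invented for illustration: consumers address a logical data collection, and the platform resolves the physical replica.

```python
# Hypothetical sketch of a location-transparent DaaS client; not the paper's API.
from dataclasses import dataclass

@dataclass
class DataHandle:
    """Logical identifier for a data collection; carries no physical location."""
    collection: str          # e.g. "genomes/1000g"
    version: str = "latest"

class DaaSClient:
    def __init__(self, catalog):
        # catalog maps logical names to whichever replica the platform
        # currently considers best; consumers never see these locations.
        self._catalog = catalog

    def fetch(self, handle: DataHandle):
        location = self._catalog[handle.collection]   # resolved by the platform
        return f"reading {handle.collection}@{handle.version} from {location}"

catalog = {"genomes/1000g": "replica-eu-3"}           # maintained by the platform
client = DaaSClient(catalog)
print(client.fetch(DataHandle("genomes/1000g")))
```

The point of the sketch is that the same client code keeps working if the platform moves the data to another replica; only the catalog changes.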


... Using a specialist's substantial infrastructure instead of creating your own individual set-up has been key to the success of cloud computing. In the last decade there has been significant research on the allocation of resources in cloud computing [1,27]. Several companies (such as Amazon EC2, Microsoft Azure, etc.) have come up with technologies to support the key viewpoint of cloud computing. ...
... Data-as-a-Service (DaaS) is emerging as an alternative school of thought in cloud computing, where data is made available as a service over the network [1,2,3]. In [1] a DaaS architecture is presented for data discovery, storing and moving data, and processing of the data, with consideration of big data as a service. A pricing scheme for query processing in a DaaS platform is addressed in [3]. ...
Chapter
Full-text available
Data as a Service (DaaS) is the next emerging technology in cloud computing research. Small clouds operating as a group may exploit DaaS efficiently to perform a substantial amount of work. In this paper an auction framework is studied for the case where the small clouds are strategic in nature. We present the system model and a formal definition of the problem. Several DaaS-based auction mechanisms are proposed and their correctness and computational complexity analysed. To the best of our knowledge, this is the first realistic attempt to study DaaS in a strategic setting.
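As an illustration of the kind of mechanism such an auction framework builds on, here is a minimal sketch of a sealed-bid second-price (Vickrey) auction, a standard truthful mechanism; it is not one of the chapter's actual mechanisms, whose details are not given here.

```python
# Illustrative only: second-price auction among strategic "small clouds".
def vickrey_auction(bids):
    """bids: dict mapping small-cloud id -> bid for a DaaS task.
    The highest bidder wins but pays the second-highest bid, which
    makes truthful bidding a dominant strategy."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price

print(vickrey_auction({"cloudA": 10.0, "cloudB": 7.5, "cloudC": 9.0}))
# -> ('cloudA', 9.0)
```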
... However, such services are known and described as Data-as-a-Service (Delen and Demirkan, 2013). In (Terzo et al., 2013), Data-as-a-Service is interpreted as a storing and processing service, and the authors propose a layered architecture over an IaaS. In (Seibold and Kemper, 2012), different types of Database-as-a-Service (shared machine, shared process, shared table) are described; these can appear as SaaS, PaaS or IaaS, depending on the complexity of the delivered systems. ...
... Normally the initiation of a data science process is driven by user or business requirements that expect some kind of knowledge or insight from data. Thus, the terms business analytics, Insight-as-a-Service and Knowledge-as-a-Service (Terzo et al., 2013) are used. For big data problems, different approaches exist under the names Big-Data-Analytics-as-a-Service or Big-Data-as-a-Service. ...
... When processing large data and attempting to provide flexible services, there are still issues due to differing data sources and changing user needs [1,2]: (a) not all types of data processing are suitable for the MapReduce model. ...
... Most of us are even unaware of the existence of little-known data brokers who profit by gathering, aggregating, analyzing, hoarding, commodifying, trading or using personal data without our knowledge or consent [11]. There are many other situations where data monetization is an integral and indispensable part of a system in the form of a data-as-a-service model [14]. Some interesting fields spawned by and coexisting with the use of big data are machine learning, deep learning, artificial intelligence, data science, etc., which may demand training datasets in a pay-per-use fashion. ...
Chapter
Big data, as a driving force of business growth, creates a new paradigm that encourages a large number of start-ups and little-known data brokers to adopt data monetization as their key role in the data marketplace. As a pitfall, such data-driven scenarios make big data prone to various threats, such as ownership claiming, illegal reselling, tampering, etc. Unfortunately, existing watermarking solutions are ill-suited to big data due to a number of challenging factors, such as the V's of big data, involvement of multiple owners, incremental watermarking, large cover size and limited watermark capacity, non-interference, etc. This paper presents a novel approach, BDmark, that provides a transparent immutable audit trail for data movement in big data monetizing scenarios, by exploiting the power of both watermarking and blockchain technologies. We describe in detail how our approach overcomes the above-mentioned challenging factors. As a proof of concept, we present a prototype implementation of the proposed system using Java and Solidity on the Ethereum platform, and the experimental results on smart contracts show a linear growth of gas consumption w.r.t. input data size. To the best of our knowledge, this is the first proposal which deals with watermarking issues in the context of big data.
... With GraphQL, the concept of data-as-a-service (DaaS) [16] becomes more concrete: data is provided on demand, and clients can specify the structure, filters or even operations for the data retrieved. ...
Conference Paper
A crucial part of data-driven ecosystems is the management and processing of complex data structures, as well as the proper handling of the data flows within the ecosystem. To manage these data flows, data-driven ecosystems need high levels of interoperability, as this allows the collaboration and independence of both internal and external components. REST APIs are a common solution to achieve interoperability, but they sometimes lack flexibility and performance. The rise of GraphQL APIs as a flexible, fast and stable protocol for data fetching makes it an interesting approach for data-intensive and complex data-driven (eco)systems. This paper outlines the GraphQL protocol and the benefits derived from its use, and presents a case study of the improvement experienced by the Observatory of Employment and Employability (also known as OEEU) ecosystem after adopting GraphQL as the main API in several components. The results of the paper show promising improvements regarding flexibility, maintainability and performance, among other benefits.
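To make the flexibility concrete, the sketch below shows how a client might fetch exactly the fields it needs from a GraphQL endpoint in Python. The endpoint URL and schema (the students field, its arguments, the field names) are hypothetical, not the OEEU API.

```python
# Hedged sketch: a client asks a GraphQL endpoint for exactly the fields
# it needs. URL and schema below are invented for illustration.
import requests

query = """
query ($degree: String!) {
  students(degree: $degree, limit: 10) {   # client picks the filter...
    id
    employabilityScore                      # ...and the exact fields returned
  }
}
"""

resp = requests.post(
    "https://example.org/graphql",          # hypothetical endpoint
    json={"query": query, "variables": {"degree": "master"}},
    timeout=10,
)
print(resp.json())
```

Compared with a fixed REST resource, the response contains only the requested fields, which is the over-fetching reduction the snippet above alludes to.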
... Although clouds provide an ideal environment for big data storage, the access delay to them is generally high due to network limitations (Terzo et al. 2013). The delay can be particularly problematic for frequently accessed data (e.g., the index of a big data search engine). ...
... This can lead to a complex and time-consuming process to gather and combine all necessary data. Data-as-a-service is an approach in software infrastructure where consuming users can access data in a standard format without the need to perform ETL manually [9,16]. Similar to its sister concept, Function-as-a-service, Data-as-a-service provides both scalability and a standardized mechanism to access data. ...
Chapter
Full-text available
Generally speaking, healthcare service providers, such as hospitals, maintain large collections of data. In the last decade, the healthcare industry has become aware that data analytics is a crucial tool to help provide better services. However, several obstacles prevent a successful deployment of such systems, among them data quality and system performance. To address these issues, this paper proposes a distributed data-as-a-service framework that helps to assure a level of data quality and also improves the performance of data analytics. Preliminary evaluation suggests that the proposed system scales well to a large number of user requests.
... This removes the need to download data from a portal or data store, to know what data exists and where it resides, to be able to understand and decode the storage format and to manually convert it to a form that adds value to the end user (such as changing units, datum, etc.). The DaaS concept enables machine systems to discover, access and deliver data, providing an underlying set of services on which information systems can be built (Terzo et al., 2013). ...
Article
Full-text available
In the next decade the pressures on ocean systems and the communities that rely on them will increase along with impacts from the multiple stressors of climate change and human activities. Our ability to manage and sustain our oceans will depend on the data we collect and the information and knowledge derived from it. Much of the uptake of this knowledge will be outside the ocean domain, for example by policy makers, local Governments, custodians, and other organizations, so it is imperative that we democratize or open the access and use of ocean data. This paper looks at how technologies, scoped by standards, best practice and communities of practice, can be deployed to change the way that ocean data is accessed, utilized, augmented and transformed into information and knowledge. The current portal-download model which requires the user to know what data exists, where it is stored, in what format and with what processing, limits the uptake and use of ocean data. Using examples from a range of disciplines, a web services model of data and information flows is presented. A framework is described, including the systems, processes and human components, which delivers a radical rethink about the delivery of knowledge from ocean data. A series of statements describe parts of the future vision along with recommendations about how this may be achieved. The paper recommends the development of virtual test-beds for end-to-end development of new data workflows and knowledge pathways. This supports the continued development, rationalization and uptake of standards, creates a platform around which a community of practice can be developed, promotes cross discipline engagement from ocean science through to ocean policy, allows for the commercial sector, including the informatics sector, to partner in delivering outcomes and provides a focus to leverage long term sustained funding. The next 10 years will be “make or break” for many ocean systems. The decadal challenge is to develop the governance and co-operative mechanisms to harness emerging information technology to deliver on the goal of generating the information and knowledge required to sustain oceans into the future.
... In [13] the authors proposed a framework that allows data from different sources to be integrated with the semantic data. Moreover, this work tries to solve interoperability issues and proposes an access control system in order to define an explicit privacy constraint. ...
Chapter
Full-text available
Nowadays there is a myriad of different Cloud services deployed in datacenters around the world. These services are presented by means of models offered on demand (SaaS, PaaS and IaaS). At a different level, companies aim at finding services that manage their data in order to enhance their processes and increase their profits. The massive quantity and heterogeneous types of data deployed in the Cloud make the discovery task all the more important. One big problem in the discovery task is the absence of a description model to represent data services (Data as a Service; DaaS). Moreover, the location of services makes the selection and composition process very hard, especially for functional and non-functional parameters. In this paper, we address these problems by introducing an extended description model for data services, in order to reduce the number of candidate services during the selection and composition process. The implementation shows the effectiveness of the proposed model.
... The huge amount of data made available over the last decades has given rise to a complex technology ecosystem, ranging from data collection and storage infrastructures to hardware and software tools for efficient computation of analytics. Recently, this ecosystem has undergone radical change due to the advent of Big Data and Cloud Computing techniques [5]. The main requirements these systems need to meet are flexibility and scalability in processing large and heterogeneous quantities of data within tolerable elapsed times. ...
Conference Paper
The growing availability of data over the last decades has given rise to a number of successful technologies, ranging from data collection and storage infrastructures to hardware and software tools for efficient computation of analytics. This context, in principle, places a great demand on data quality. As a matter of fact, experience has shown that the open Web and other platforms hosting user-generated content or real-time data can provide little quality control at content production time. To address these challenges, our aim is to provide a general and configurable model for assessing data quality that supports task composition. In particular, we introduce a model built around the notion of matching, illustrating the issues that can be addressed by this approach with a concrete case study. We also identify and discuss challenges to be addressed in future research to strengthen this idea.
... Cloud systems can measure resource usage and monitor, control, and report on it in a completely transparent manner [21][22]. ...
... This shall be understood more generally as an issue of traceability (Terzo et al., 2013), which itself depends on the architecture both of the data platform and of the developed services. If the data are provided in CSV format, any re-user may download the data once and embed it in a service without any further interaction with the platform. ...
Conference Paper
Full-text available
The main purpose of this article is to address the issue of monitoring, in an integrated way, the uptake and the impacts of Open Data in a context where the links between government and re-users are weakening as a result of the use of digital artefacts. It is based on the case of an actual Open Data platform. The contribution first provides a literature review of the assessment frameworks developed to describe the various impacts of Open Data, as well as the different methods leveraged to capture these impacts. This yields a generic framework suitable for describing the impacts triggered by Open Data. Qualitative methodologies (focus groups with the stakeholders involved in the Open Data value chain) and quantitative ones (surveys and platform log analyses) are combined to showcase how they complement each other. The main findings show the complementarities between these methodologies, their suitability for constructing automatic monitoring indicators, and the barriers to overcome. One of the most important barriers is how to align indicators designed for other purposes with the needs of impact assessment, and with what reliability. These results aim to be useful both for researchers and practitioners, as gaps for future research are highlighted, and technical issues and recommendations are identified for other empirical experiments and policy deployment.
... The Applications hosted in the Cloud access the data sensed in the Things layer through virtualized components hosted in the Services layer. Sensing and Actuation as a Service (SAaaS) [5], Data as a Service (DaaS) [6], and Sensor Event as a Service (SEaaS) [7] are examples of Cloud-centric systems in IoT. The robustness and flexibility of the Cloud make the data processing efficient and reliable [8]; however, the time data streams take to reach the Cloud may affect the accurate decision-making over that data [9]. ...
Conference Paper
Full-text available
Current networking integrates common "Things" into the Web, creating the Internet of Things (IoT). The considerable number of heterogeneous Things that can be part of an IoT network demands efficient management of resources. With the advent of Fog computing, some IoT management tasks can be distributed toward the edge of constrained networks, closer to physical devices. Blockchain protocols hosted on Fog networks can handle IoT management tasks such as communication, storage, and authentication. This research goes beyond the current definition of Things and presents the Internet of "Smart Things." Smart Things are provisioned with Artificial Intelligence (AI) features based on the CLIPS programming language to become self-inferenceable and self-monitorable. This work uses the permission-based blockchain protocol Multichain to connect many Smart Things by reading and writing blocks of information. This paper evaluates Smart Things deployed on Edison Arduino boards, as well as Multichain hosted on a Fog network.
... Ongoing work aims to incorporate a data layer into our secure Web service CLMS Grid architecture (see Fig. 3) based on the Data as a Service (DaaS) model of Cloud technology, as the appropriate service for effective large-scale data storage and on-demand data provision to the user regardless of the geographic or organizational separation of provider and consumer (Terzo et al., 2013). In this context, SOA has also rendered irrelevant the actual platform on which, and the location where, the data resides. ...
Article
Full-text available
The paper presents innovative trustworthy services to support secure e-assessment in web-based collaborative learning grids. Although e-Learning has been widely adopted, there still exist drawbacks which limit its potential. Among these limitations, we investigate information security requirements in on-line assessment learning activities (e-assessment). In previous research, we proposed a trustworthiness model to support secure e-assessment requirements for e-Learning. In this paper, we present effective applications of our approach by integrating flexible and interoperable Web-based secure e-learning services, based on our trustworthiness model, into e-assessment activities in on-line collaborative learning courses. Moreover, we leverage Grid technology to meet the further demanding requirements of collaborative learning applications in terms of computational performance and management of large data sets, so that the trustworthy collaborative learning services can be continuously adapted, adjusted, and personalised to each specific target learning group. Evaluation in a real context is provided, and the implications of this study are finally remarked upon and discussed.
... For the DaaS research [1,3,4,7,11,13,17,19,22,[25][26][27], papers were discarded when they focused on security or structural aspects of the service. As our focus was on the data aspects of the service, we retained twelve papers, which were mapped into the bubble chart in Figure 2. The chart is organized following a structure similar to that of the SaaS chart. ...
Conference Paper
Full-text available
Software as a Service (SaaS) and Data as a Service (DaaS) prove to be two promising areas of research in the cloud computing field; however, interoperability among different cloud providers is as yet poorly explored. Today, clients looking for content or services from different providers need extra time and resources to learn and implement the required adaptations for the other parties. In this paper we propose MIDAS, a novel middleware to interoperate SaaS and DaaS services seamlessly and independently of the provider. That is, SaaS applications are able to get data from DaaS datasets by sending a query to our middleware and letting it mediate the communication and return the expected results. We evaluate our proposal by developing a prototype from two case studies and by analyzing the time overhead of querying through our middleware. Our results show that no significant overhead is imposed on providers or on the final user.
... Infrastructure as a Service (IaaS): cloud service providers supply infrastructure such as storage, computing capacity, etc. IaaS is a form of cloud computing that provides virtualized computing resources over the Internet; in an IaaS model, a third-party provider hosts hardware, software, servers, storage and other infrastructure components on behalf of its users [25][26]. Data as a Service (DaaS): the alternative cloud computing model, which differs from traditional models (SaaS, IaaS, PaaS) in providing data to users through the network, as data is the core value of this model [27]; in conjunction with cloud computing, it solves some of the challenges in managing huge amounts of data. For these reasons, DaaS is closely related to big data, whose technologies must be utilized [28]. ...
Article
Full-text available
Communication using information technology in its various forms produces large amounts of data. Such data requires processing and storage. The cloud is an online storage model where data is stored on multiple virtual servers. Big data processing represents a new challenge in computing, especially in cloud computing. Data processing involves data acquisition, storage and analysis. In this respect, there are many questions, including: what is the relationship between big data and cloud computing? And how is big data processed in cloud computing? These questions are discussed in this paper, where big data and cloud computing are studied, along with the relationship between them in terms of security and challenges. We also suggest a definition for big data, and a model that illustrates the relationship between big data and cloud computing.
... Generally, the traditional cloud provides three kinds of service models, i.e., infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS) [15]. With the development of data-intensive sciences and big data technologies, the data-as-a-service (DaaS) model has been proposed in recent years to facilitate the intelligent sharing and processing of large data collections [16]. ...
... In this type of cloud service, cloud service providers handle everything, including the middleware and operating system. Terzo et al. [33] opined that, apart from these three cloud computing service models, an alternative cloud computing model called DaaS (Data as a Service) has been developed for effective big data processing, distribution and management. This model is closely connected to SaaS and can be easily paired with either one or both of the other models [34]. ...
Chapter
Nowadays, all organizations without exception depend on data and on its management and storage to ensure that success is achieved on a project. Data management involves the processes of collecting, storing, sharing, controlling, and retrieving the information contained in the project, coming from different sources. These data need to be stored properly to serve as a reference whenever needed (availability) for smooth project execution. The aim of this study is to discuss the capabilities of cloud storage for data management in the built environment during the various project execution phases. The study adopted a literature review methodology to draw knowledge on the way cloud storage operates as well as the services it provides. It is evident from the reviewed literature that data storage is highly important for data record-keeping and protection during the project lifecycle, because it gives maximum security to data owners compared to the traditional data storage facilities available. The study concluded that a good data storage platform will help project parties to avoid data and information loss, scattered data, data incompleteness, project interruption due to data unavailability, as well as theft of data. The study recommended that construction parties acquire knowledge of the different technologies required for cloud data storage to improve data management on built environment projects.
... As discussed earlier, healthcare is one of the sectors generating huge data sets. ...
Article
Full-text available
The critical challenge that healthcare organizations face is analyzing large-scale data. With the rapid growth of various healthcare applications, the many devices used in healthcare generate varieties of data. These data need to be processed and effectively analyzed for better decision making. Cloud computing is a promising technology which can provide on-demand services for storing, processing and analyzing the data. Traditional data processing systems no longer have the ability to process such huge data. In order to achieve better performance and to solve the scalability issues, we need a better distributed system in a cloud environment. Hadoop is a framework which can process large-scale data sets in a distributed environment, and it can be deployed in a cloud environment to process large-scale healthcare data. Healthcare applications are being supplied through the internet and cloud services rather than used as traditional software. Healthcare providers need real-time information to provide quality healthcare. This paper discusses the impact of processing and analyzing large-scale healthcare data in a cloud computing environment.
... With recent developments in cloud computing, it has become easier to build DaaS offerings that provide larger data sets at lower costs in the clouds. The Amazon Simple Storage Service (S3) provides a simple web services interface that can be used to store and retrieve, as stated by Amazon, any amount of data, at any time, from anywhere on the Web [164][165][166][159]. ...
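Since the snippet names S3's web services interface, a minimal store-and-retrieve example using the real boto3 SDK may help. The bucket and key names are made up, and credentials are assumed to come from the environment.

```python
# Minimal sketch of the S3 web-services interface mentioned above,
# using the boto3 SDK; bucket and key names are hypothetical.
import boto3

s3 = boto3.client("s3")                       # credentials from environment

# Store an object: any amount of data, addressed by bucket + key.
s3.put_object(Bucket="my-daas-bucket", Key="datasets/sample.csv",
              Body=b"id,value\n1,42\n")

# Retrieve it later, from anywhere on the Web.
obj = s3.get_object(Bucket="my-daas-bucket", Key="datasets/sample.csv")
print(obj["Body"].read().decode())
```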
Technical Report
Full-text available
My PhD qualification - Federal University of Ceará, Brazil (July 15, 2019).
... Data discovery is responsible for selecting and preparing in advance the required data sources that contain pertinent data, which will be needed for the later search execution. The output of the data discovery phase is a list of data sources, which also includes supplementary meta-information (Terzo et al., 2013). ...
... These ever-growing demands of big data in a rapidly changing competitive environment offer a new paradigm which encourages enterprises to adopt data monetization and has initiated the establishment of a large number of start-up companies that sell and purchase our personal data on a daily basis. A few, among many others, include Datacamp, Datawallet, Dawex, etc. There are many other situations where data monetization is an integral and indispensable part of a system in the form of a data-as-a-service model (Terzo et al., 2013). Some interesting fields spawned by and co-existing with the use of big data are machine learning, deep learning, artificial intelligence, data science, etc., which may demand training datasets in a pay-per-use fashion. ...
Article
Full-text available
In the era of big data, modern data marketplaces have received much attention as they allow not only large enterprises but also individuals to trade their data. This new paradigm makes the data prone to various threats, including piracy, illegal reselling, tampering, illegal redistribution, ownership claiming, forgery, theft, misappropriation, etc. Although digital watermarking is a promising technique to address the above-mentioned challenges, the existing solutions in the literature are deemed to be incompetent in big data scenarios due to the following factors: V's of big data, involvement of multiple owners, incremental watermarking, large cover-size and limited watermark-capacity, non-interference, etc. In this paper, we propose a novel big data watermarking technique that leverages the power of blockchain technology and provides a transparent immutable audit trail for data movement in big data monetizing scenarios. In this context, we address all the crucial challenges mentioned above. We present a prototype implementation of the system as a proof of concept using Solidity on Ethereum platform, and we perform experimental evaluation to demonstrate its feasibility and effectiveness in terms of execution gas costs. To the best of our knowledge, this is the first proposal which deals with watermarking issues in the context of big data.
... Data as a Service (DaaS): the alternative cloud computing model, which differs from traditional models (SaaS, IaaS, PaaS) in providing data to users through the network, as data is the core value of this model [14]; in conjunction with cloud computing, it solves some of the challenges in managing huge amounts of data. For these reasons, DaaS is closely related to big data, whose technologies must be utilized [15]. ...
Conference Paper
Full-text available
Cloud computing gives relevant and adaptable support for Big Data through ease of use, access to resources, low-cost use of resources, and the use of powerful equipment to process big data. Cloud and big data both center on increasing the value of a business while reducing capital costs. Big data and cloud computing both favor companies, and because of their benefits the use of big data is growing rapidly in the cloud. With this sharp increase, several security risk concerns are emerging. Big data has more vulnerabilities in comparison to classical databases, as the data are stored on servers owned by the cloud provider. The varied usage of the data makes securing big data in the cloud infeasible with traditional security measures. The security of big data in the cloud therefore needs to be examined and discussed. In this paper, my colleagues and I present and discuss the risk assessment of big-data applications in cloud computing environments and present some ideas for assessing these risks.
Conference Paper
The scientific community is now spending more and more effort on defining and developing effective methodologies and technologies to ease the design and development of Cloud solutions. Exploiting the features of existing Cloud services, resource orchestration has become a hot research topic. In this scenario, Cloud designers promote reuse, but a clear and simple design and verification methodology is still missing in the literature. A simple (UML-based) modelling profile and a Model-Driven Engineering methodology for Cloud-based Value Added Services are therefore very appealing. In this work we define a modelling profile able to describe orchestrated Cloud services and resources by means of Cloud Design Patterns, and we show how a Cloud designer can use it both for composition and for verification purposes.
Chapter
With the increasing use of technologically advanced equipment in the medical, biomedical and healthcare fields, the collection of patients' data from various hospitals is becoming necessary. Having the data available at a central location is convenient so that it can be used for pharmaceutical feedback, equipment reporting, analysis of and results for any disease, and much more. Collected data can also be used for modelling or predicting an upcoming health crisis due to a disaster, virus or climatic changes. Collecting data from various health-related entities or from any patient raises some serious questions about leakage, integrity, security and privacy of data. In this chapter the term Big Data and its usage in healthcare applications is discussed. The questions and issues are highlighted and discussed in the last section of the chapter to emphasize the broad pre-deployment issues. Available platforms and solutions to these questions on the usage and deployment of Big Data in healthcare-related fields and applications are also described in detail, along with the available data privacy and data security measures, user access mechanisms, authentication procedures and privileges.
Conference Paper
There is a growing interest in health information technology using evidence-based approaches in clinical decision-support systems; the goal for such systems is 'precision medicine' using 'interventional informatics'. However, the impact has been less than positive, and it has been argued that interventional informatics using data-driven interventions is required to achieve evidence-based clinical decision support. In this paper we discuss context-aware, evidence-based, data-driven development of diagnostic scales created using multi-disciplinary collaborative development. The goal is the development of novel dynamic scales for decision support in healthcare provision; while clinicians may derive benefit from such systems, there are potentially greater benefits for all stakeholders in medical triage systems using both face-to-face and remote consultations. While our focus lies in depression, the proposed approach will generalise to a diverse range of domains, systems, and technologies.
Chapter
Demographic changes are resulting in a rapidly growing elderly population, with healthcare implications which importantly include dementia, a condition that requires long-term support and care to manage its negative behavioural symptoms. In order to optimise the management of healthcare professionals and provide an enhanced quality of life for patients and carers alike, Remote Electronic Health Monitoring plays a crucial role. This requires myriad functions and components to achieve patient monitoring while accommodating technological, medical, legal, regulatory, ethical, and privacy considerations. The chapter considers the components and functions of the current state of the art relevant to the provision of effective Remote Electronic Health Monitoring. The authors present the background and related research, and then focus on the technological aspects of Remote Electronic Health Monitoring, to which Cloud-Based Systems and the closely related Cloud Service Modules are central. A number of scenarios illustrating the concepts are discussed in the chapter.
Article
Full-text available
The Internet of Things (IoT) has made it possible for devices around the world to acquire information and store it, in order to be able to use it at a later stage. However, this potential opportunity is often not exploited because of the excessively long interval between data collection and the capability to process and analyse it. In this paper, we review current IoT technologies, approaches and models in order to discover what challenges need to be met to make more sense of data. The main goal of this paper is to review the surveys related to IoT in order to provide well-integrated and context-aware intelligent services for IoT. Moreover, we present a state of the art of IoT from the context-aware perspective that allows the integration of IoT and social networks in the emerging Social Internet of Things (SIoT) paradigm.
Chapter
Today we witness a growing change in how public health administration thinks about medical data. We have slowly moved from paper-based patient records to digitally stored medical data, in support of advanced evidence-based mining and decision support processes. With this change comes great responsibility, within which efficiently storing and accessing the patient's health status is particularly important. In this chapter, the authors analyze current storage technologies for storing medical data. We are witnessing a shift from traditional relational database support to NoSQL technologies capable of offering great availability and scalability options, and back to the mixture between the SQL and NoSQL worlds with scalable SQL databases. All these alternatives come with their own pros and cons, which the authors carefully analyze. They believe that their survey will help medical practitioners and developers of health applications make a more informed decision when designing medical data storage support.
Conference Paper
Big Data has become the cornerstone of modern knowledge-based systems. However, taking advantage of the knowledge found in big data sets requires advanced solutions to store, access and analyze data in a feasible way, whether online, offline or both. Such solutions comprise, on the one hand, a better understanding of the computational needs of big data and, on the other, the design of new computational infrastructures for that purpose. This paper evaluates the performance, in terms of CPU, load and memory utilization, and the scalability of some clustering and collaborative filtering algorithms in Apache Spark MLlib, which provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. The aim is to reveal the performance of such algorithms and draw conclusions for their application to real-life problems. To that end, the performance evaluations are done using a large-scale Google cluster usage trace dataset.
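As a rough illustration of the kind of measurement the paper performs, the sketch below times a single MLlib clustering job. It uses toy data as a stand-in for the Google cluster usage trace and is not the paper's benchmark harness.

```python
# Hedged sketch: timing one Spark MLlib clustering fit. Assumes a local
# Spark installation; the four toy points stand in for real trace data.
import time
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("mllib-bench").getOrCreate()
df = spark.createDataFrame(
    [(Vectors.dense([0.1, 0.2]),), (Vectors.dense([8.0, 9.0]),),
     (Vectors.dense([0.2, 0.1]),), (Vectors.dense([9.0, 8.0]),)],
    ["features"])

start = time.time()
model = KMeans(k=2, seed=1).fit(df)           # the step whose cost is measured
print("centers:", model.clusterCenters())
print("fit took %.2fs" % (time.time() - start))
spark.stop()
```

In a real evaluation, CPU, load and memory would be sampled externally (e.g. at the OS level) while such jobs run at increasing data sizes.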
Chapter
Given the large amount of data sensed by IoT devices and various wireless sensor networks, traditional data services lack the necessary resources to store and process that data, as well as to disseminate high-quality data to a variety of potential consumers. In this paper, we propose a framework for Data as a Service (DaaS) provisioning, which relies on deploying DaaS services on the cloud and using a DaaS agency to mediate between data consumers and DaaS services using a publish/subscribe model. Furthermore, we describe a decision algorithm for the selection of appropriate DaaS services that can fulfill consumers' requests for high-quality data. One of the benefits of the proposed approach is the elasticity of the cloud resources used by DaaS services. Moreover, the selection algorithm allows ranking DaaS services by matching their quality-of-data (QoD) offers against the QoD needs of the data consumer.
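A minimal sketch of what such QoD-based selection could look like follows; the attribute names, weights, and scoring rule are assumptions for illustration, not the chapter's actual algorithm.

```python
# Illustrative QoD matching/ranking; attributes and weights are invented.
def rank_daas(offers, needs, weights):
    """offers: {service: {attr: offered_level}}; needs: {attr: min_level};
    weights: {attr: importance}. Services failing any minimum are dropped;
    the rest are scored by the weighted surplus over the consumer's needs."""
    ranked = []
    for service, qod in offers.items():
        if all(qod.get(a, 0) >= lvl for a, lvl in needs.items()):
            score = sum(weights[a] * (qod[a] - needs[a]) for a in needs)
            ranked.append((score, service))
    return [s for _, s in sorted(ranked, reverse=True)]

offers = {"daas1": {"freshness": 0.9, "accuracy": 0.8},
          "daas2": {"freshness": 0.6, "accuracy": 0.95}}
print(rank_daas(offers, needs={"freshness": 0.7, "accuracy": 0.7},
                weights={"freshness": 0.5, "accuracy": 0.5}))
# -> ['daas1']  (daas2 fails the freshness minimum)
```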
Article
The explosive growth of information has rapidly ushered people into the era of big data. Due to the large volume, high variety, and rapid velocity characteristics of big data, most traditional data mining methods developed for a centralized data analysis process cannot be applied directly. AI leverages machine intelligence to provide insights, automation, and new methods to interact with data, thereby promoting data literacy throughout the organization. Based on the theory of the Big Data Cycle, this paper discusses the relationship between big data and AI and how they interact with and influence each other. It adopts the integrative review research method to screen the latest literature and summarizes the role of AI in the different phases of the big data cycle. We also provide insight into the applications of big data and AI in three different areas, namely social networks, health care, and finance.
Article
Big data-driven innovations are key to improving healthcare system sustainability. Given the complexity, these are frequently conducted by public-private partnerships (PPPs) between profit and non-profit parties. However, information on how to manage big data-driven healthcare innovations by PPPs is limited. This study elucidates challenges and best practices in managing big data-driven healthcare innovations by PPPs in the Netherlands. Fifteen technical, organizational, competence and ethical/legal challenges, and best practices to overcome them, were identified in expert interviews with key opinion leaders (KOLs) and through literature review. They were prioritized by a second KOL panel in an online questionnaire, and the results were interpreted by a focus group. 'Data variety' was the main challenge, followed by 'lack of data sharing' and 'insufficient data quality'. PPP respondents ranked appropriate big data skills significantly lower (P = 0.049) and conservatism towards health care decisions significantly differently (P = 0.026) than non-PPP respondents. The profit sub-group ranked data access higher than the non-profit sub-group (P = 0.022). Continuous dialogue between stakeholders, cost-benefit analyses and pilot experiments might overcome conservatism. In conclusion, PPPs should blend skills and resources to maximize the benefits of big data-driven healthcare innovations. Mitigating actions could overcome technical issues, whilst a better common support base might prevent conservatism and lack of data sharing.
Article
The impressive progress in sensing technology over the last few years has contributed to the proliferation and popularity of the Internet of Things (IoT) paradigm and to the adoption of Sensor Clouds for provisioning smart ubiquitous services. The massive amount of data generated by sensors and smart objects has also led to a new kind of service known as Data-as-a-Service (DaaS). The quality of these services is highly dependent on the quality of the sensed data (QoD), which is characterized by a number of quality attributes. DaaS provisioning is typically governed by a Service Level Agreement (SLA) between data consumers and DaaS providers. In this work, we propose a game-based approach for DaaS provisioning, which relies on a signaling-based model for the negotiation of several QoD attributes between DaaS providers and data consumers. We consider that these entities are adaptive, rational, and able to negotiate the QoD offering even in the case of incomplete information about the other party. In the negotiation between the two parties, we use a Q-learning algorithm for the signaling model and a Multi-Attribute Decision Making (MADM) model to select the best signal. Moreover, we empirically validate the MADM model using Shannon's entropy. The results obtained in a multi-stage negotiation scenario show convergence towards the pooling equilibrium.
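To make the Q-learning side concrete, here is a toy tabular sketch of learning which signal to send at a negotiation stage; the states, signals, and reward values are invented and do not reproduce the article's model or its MADM step.

```python
# Toy tabular Q-learning for signal selection; everything here is a
# stand-in for illustration, not the article's negotiation model.
import random

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2
signals = ["low_qod", "mid_qod", "high_qod"]
Q = {}  # (state, signal) -> estimated value

def choose(state):
    if random.random() < EPS:                           # explore
        return random.choice(signals)
    return max(signals, key=lambda s: Q.get((state, s), 0.0))

def update(state, signal, reward, next_state):
    best_next = max(Q.get((next_state, s), 0.0) for s in signals)
    old = Q.get((state, signal), 0.0)
    Q[(state, signal)] = old + ALPHA * (reward + GAMMA * best_next - old)

# One simulated negotiation step: assume the consumer rewards mid offers most.
state, nxt = "stage0", "stage1"
sig = choose(state)
reward = {"low_qod": 0.2, "mid_qod": 1.0, "high_qod": 0.5}[sig]
update(state, sig, reward, nxt)
print(sig, Q)
```

Repeated over many negotiation episodes, the provider's Q-values converge toward the signal the consumer responds to best at each stage.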
Chapter
Big Data has become an enabling technology for many of today's innovations. Given the exponential rate at which data is produced, there is a clear necessity for scalable solutions to control the overwhelming flow of new streams of information and to extract information out of DaaS Clouds. In this paper we review and analyze some VM deployment methods and their suitability for the Data as a Service (DaaS) model in Clouds. We then approach some novel aspects of VM deployment, including VM migration.
Chapter
Decision Support Systems (DSS) assist professionals in several fields in their decision-making processes, whether short-term or mid/long-term. The advent of Big Data, Big Data Streams, and Cloud computing ecosystems enables new, faster, and more effective forms of decision-making, thereby laying the basis for a new generation of DSS. Knowledge-base and rule-base systems play an important role in the development of DSS. This chapter illustrates different types of Decision Support Systems and their application in healthcare, the barriers they encounter, implementation strategies, challenges, and the outlook for the future of DSS.
Chapter
As a unified data access model, the data service has become a promising technique to integrate and share heterogeneous datasets. In order to publish overwhelming amounts of data on the web, it is key to automatically extract and encapsulate data services from various datasets in a cloud environment. In this paper, a novel data service generation approach for cross-origin datasets is proposed. An attribute dependency graph (ADG) is constructed by using inherent data dependencies. Based on the ADG, an automatic data service extraction algorithm is implemented. The extracted atomic data services are further organized into another representation named the data service dependency graph (DSDG). Then, a data service encapsulation framework is designed, comprising an entity layer, a data access object layer and a service layer. Via a flexible RESTful service template, this framework can automatically encapsulate the extracted data services into RESTful services accessible through the exposed interfaces. In addition, a data service generation system has been developed. Experimental results show that the system achieves high efficiency and good quality of data service generation.
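A minimal sketch of the ADG idea follows: build a graph from declared attribute dependencies, then read off an atomic data service as a key attribute plus everything it transitively determines. The dependency data and the RESTful endpoint shape are illustrative assumptions, not the chapter's algorithm.

```python
# Illustrative ADG construction and atomic-service extraction;
# dependencies and endpoint format are invented for the example.
from collections import defaultdict

deps = [("order_id", "customer_id"), ("order_id", "total"),
        ("customer_id", "customer_name")]

adg = defaultdict(list)                 # attribute -> attributes it determines
for src, dst in deps:
    adg[src].append(dst)

def atomic_service(key):
    """Collect every attribute transitively determined by `key`."""
    seen, stack = [], [key]
    while stack:
        a = stack.pop()
        if a not in seen:
            seen.append(a)
            stack.extend(adg[a])
    return {"endpoint": f"/data/{key}/{{value}}", "returns": seen}

print(atomic_service("order_id"))
# -> {'endpoint': '/data/order_id/{value}',
#     'returns': ['order_id', 'total', 'customer_id', 'customer_name']}
```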
Chapter
The rapid expansion of the Internet of Things (IoT) will generate a diverse range of data types that need to be handled, processed and stored. This paper aims to create a multi-agent system that suits the needs introduced by the IoT expansion, able to oversee Big Data collection and processing and also to maintain the semantic links between data sources and data consumers. In order to build a complex agent-oriented architecture, we have assessed the existing agent-oriented methodologies, searching for the best solution that is not bound to a specific programming language or framework and is flexible enough to be applied in a domain as diverse as IoT. As a complex scenario, the proposed approach has been applied to medical diagnosis and monitoring of mental disorders.
Chapter
In recent years, cloud computing and the Internet of Things (IoT) have been used to enhance and accelerate today's technological evolution efficiently and securely. Some issues (i.e., privacy, data security, data protection, malicious attacks) are not solvable by cloud computing or IoT individually. Today, the cloud-based Internet of Things (CBIT) mechanism has very far-reaching applications across enterprise domains. Successful implementation of CBIT remains a challenging task for researchers, although a great deal of prior research has integrated IoT with cloud computing. In this book chapter, a simple description of the CBIT architecture and its applications is presented. The chapter also surveys the various CBIT application domains, describes the useful ones in an accessible way, and shows specific improvements within particular application domains.
Article
Cloud computing is a widely accepted, emerging paradigm known for its 'pay as you go' approach, massive economies of scale, and 'global in minutes' concept. Over the years, different cloud providers have emerged with various services to meet the requirements of the end user. With the increase in the diversity of services, complexity increases, and customers cannot easily decide on the optimal service to fulfill their requirements. This paper provides a comparative analysis of the services of the top public cloud providers, namely AWS, GCP, Oracle, and Microsoft Azure. Every public cloud provider strives to be efficient in every technological aspect, though some are better for certain tasks than others. As a solution, this paper introduces the concept of Multi-Cloud computing, to leverage the benefits of the different cloud providers and to maximize their utility in a single network architecture.
Chapter
This chapter outlines the technological evolution experienced by the Observatory for University Employability and Employment's information system to become a data-driven technological ecosystem. This observatory collects data from more than 50 Spanish universities and their graduate students (bachelor's degree, master's degree) with the goal of measuring the factors that lead to students' employability and employment. The goals pursued by the observatory need strong technological support to gather, process, and disseminate the related data. The system that supports these tasks has evolved from a standard (traditional) information system to a data-driven ecosystem, which provides remarkable benefits covering the observatory's requirements. The benefits, the foundations, and the way the data-driven ecosystem is built are described throughout the chapter, as well as how the information obtained is exploited in order to provide insights about the employment and employability variables.
Chapter
In recent years, companies have seen that the quality of the services they provide is becoming more and more important. They try to reach as many clients as possible and to improve their services. New technologies (the internet, computers, smartphones) are something companies are taking advantage of, and there has been a huge change in the way services are provided. There is less physical contact between companies and clients, and even less contact by phone (which is already being superseded). Companies like Amazon, Netflix, or Uber are good examples of how the way companies provide services is changing, taking advantage of new technologies and making everyone's life easier.
Chapter
With the exponential increase of data produced and processed in all industries, data-oriented services are gaining importance and popularity. Companies are seeking services related to the administration of data, data analysis, and the preparation of insights, especially when it comes to Big Data. Additionally, external data providers are available to enrich companies' own data sets. In this chapter, we focus on a detailed understanding of both Data-as-a-Service and Information-as-a-Service in the cloud. As the importance of cloud computing and Big Data continues to rise, service providers, business leaders, consumers, and researchers alike need a clear understanding of these two concepts to be able to distinguish between them. This chapter seeks to find a clear definition and specification for Data-as-a-Service and Information-as-a-Service, and contrasts the two concepts, highlighting which service is better in which situation for what kind of company.
Chapter
Nowadays, artificial intelligence is a tool widely used in several areas, such as medicine, personal assistance, regular/special education, and leisure, among many others. Advances in additive manufacturing allow designing and building innovative low-cost robotic assistants for educational inclusion processes in developing countries. There are not enough assistive technologies for special education centers in Ecuador due to the lack of resources. For these reasons, in this paper we describe the robot AsiRo-μ (multi-purpose robotic assistant, for its Spanish name). AsiRo-μ is based on the open-source 3D-printed robot InMoov. It implements the following functionalities: hand gesture recognition, automatic speech recognition and text-to-speech through the IBM Watson Cloud services, and gesture imitation. The robot aims to conduct a dialogue with children, motivate them to carry out exercise/rehabilitation activities, and engage them in therapy sessions (for children with disabilities). To determine the children's level of acceptance, we carried out a pilot experiment with two groups of children. The first group consisted of 7 children with multiple disabilities (intellectual disability, Joubert syndrome, Down syndrome, autism, etc.), whereas the second comprised 30 children aged between 4 and 6 years (without disabilities). Each child interacted with the robot, and a group of experts evaluated the children's perceptions.
Chapter
Full-text available
The impact of ICT on professional practice has mainly been in making jobs easier for the professions, facilitating decision-making, and producing savings in operating costs, among others. The inefficient national electric power supply system and the high cost of computer hardware and software, relative to the dwindling fortunes of the professions in Nigeria's depressed economy, are the key obstacles to increased investment in ICT. The aim of this study is to understand the extent of ICT application by professionals in built-environment-related vocations, with a view to improving the level of ICT application and adoption in Nigeria. A sample of 82 respondents was used, with questionnaires distributed to construction professionals comprising Architects, Builders, Engineers, Surveyors, and Quantity Surveyors; three methods of data analysis were employed. The study examined the current status of ICT use in the built environment. It found that the most commonly used software packages are Microsoft Excel (100.0%), Microsoft Word (98.8%) and Microsoft PowerPoint (93.8%), whereas AutoCAD is the most popular for architectural/engineering design and drawing (87.7%), QSCAD for quantity surveying (21.0%), BIM 360 for project management (32.1%) and Co-Construct for building management (19.8%). The top three benefits of ICT as perceived by the respondents are that it saves time, makes the job easier, and enhances productivity. The three major challenges faced were erratic power supply; the high cost of purchasing ICT-related software and/or hardware; and job size and fees. Based on these results, the study recommends that the government ensure the provision of a steady power supply, and that each organization also provide backup power options in case of power failure.
Article
Full-text available
The 20th annual Database Issue of Nucleic Acids Research includes 176 articles, half of which describe new online molecular biology databases and the other half provide updates on the databases previously featured in NAR and other journals. This year’s highlights include two databases of DNA repeat elements; several databases of transcriptional factors and transcriptional factor-binding sites; databases on various aspects of protein structure and protein–protein interactions; databases for metagenomic and rRNA sequence analysis; and four databases specifically dedicated to Escherichia coli. The increased emphasis on using the genome data to improve human health is reflected in the development of the databases of genomic structural variation (NCBI’s dbVar and EBI’s DGVa), the NIH Genetic Testing Registry and several other databases centered on the genetic basis of human disease, potential drugs, their targets and the mechanisms of protein–ligand binding. Two new databases present genomic and RNAseq data for monkeys, providing wealth of data on our closest relatives for comparative genomics purposes. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and currently lists 1512 online databases. The full content of the Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
Article
Full-text available
By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.
Conference Paper
Full-text available
This paper discusses the challenges that are imposed by Big Data Science on the modern and future Scientific Data Infrastructure (SDI). The paper refers to different scientific communities to define requirements on data management, access control and security. The paper introduces the Scientific Data Lifecycle Management (SDLM) model that includes all the major stages and reflects specifics in data management in modern e-Science. The paper proposes the SDI generic architecture model that provides a basis for building interoperable data or project centric SDI using modern technologies and best practices. The paper explains how the proposed models SDLM and SDI can be naturally implemented using modern cloud based infrastructure services provisioning model.
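As a rough illustration of what a Scientific Data Lifecycle Management model implies for an implementation, the sketch below encodes lifecycle stages as a small state machine. The stage names and transitions are assumptions chosen for illustration; the paper defines its own SDLM stages, which are not reproduced here.

```python
# Sketch of an SDLM-style lifecycle state machine. Stage names and
# transitions are illustrative assumptions, not the paper's SDLM model.

from enum import Enum, auto

class Stage(Enum):
    PLANNING = auto()
    COLLECTION = auto()
    PROCESSING = auto()
    PUBLICATION = auto()
    ARCHIVING = auto()
    REUSE = auto()

# Allowed transitions between stages (reuse may feed new processing).
TRANSITIONS = {
    Stage.PLANNING: {Stage.COLLECTION},
    Stage.COLLECTION: {Stage.PROCESSING},
    Stage.PROCESSING: {Stage.PUBLICATION, Stage.ARCHIVING},
    Stage.PUBLICATION: {Stage.ARCHIVING, Stage.REUSE},
    Stage.ARCHIVING: {Stage.REUSE},
    Stage.REUSE: {Stage.PROCESSING},
}

def advance(current: Stage, target: Stage) -> Stage:
    """Move a dataset to the next lifecycle stage, enforcing valid transitions."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```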
Article
Full-text available
The 19th annual Database Issue of Nucleic Acids Research features descriptions of 92 new online databases covering various areas of molecular biology and 100 papers describing recent updates to the databases previously described in NAR and other journals. The highlights of this issue include, among others, a description of neXtProt, a knowledgebase on human proteins; a detailed explanation of the principles behind the NCBI Taxonomy Database; NCBI and EBI papers on the recently launched BioSample databases that store sample information for a variety of database resources; descriptions of the recent developments in the Gene Ontology and UniProt Gene Ontology Annotation projects; updates on Pfam, SMART and InterPro domain databases; update papers on KEGG and TAIR, two universally acclaimed databases that face an uncertain future; and a separate section with 10 wiki-based databases, introduced in an accompanying editorial. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and now lists 1380 databases. Brief machine-readable descriptions of the databases featured in this issue, according to the BioDBcore standards, will be provided at the http://biosharing.org/biodbcore web site. The full content of the Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
Article
Full-text available
Data storage costs have become an appreciable proportion of total cost in the creation and analysis of DNA sequence data. Of particular concern is that the rate of increase in DNA sequencing is significantly outstripping the rate of increase in disk storage capacity. In this paper we present a new reference-based compression method that efficiently compresses DNA sequences for storage. Our approach works for resequencing experiments that target well-studied genomes. We align new sequences to a reference genome and then encode the differences between the new sequence and the reference genome for storage. Our compression method is most efficient when we allow controlled loss of data in the saving of quality information and unaligned sequences. With this new compression method we observe exponential efficiency gains as read lengths increase, and the magnitude of this efficiency gain can be controlled by changing the amount of quality information stored. Our compression method is tunable: the storage of quality scores and unaligned sequences may be adjusted for different experiments to conserve information or to minimize storage costs, providing one way to address the threat that increasing DNA sequence volumes will outstrip our ability to store them.
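A minimal sketch of the core idea, difference encoding against a reference: each aligned read is stored as its position plus only the bases that differ. Alignment is assumed already done, and quality scores and unaligned reads (which the paper handles with tunable loss) are ignored here.

```python
# Sketch of reference-based compression: store each aligned read as its
# position plus the bases that differ from the reference, instead of the
# full sequence. Alignment is assumed done; quality scores and unaligned
# reads are out of scope for this illustration.

def encode_read(reference: str, read: str, pos: int):
    """Return (pos, length, [(offset, base), ...]) for mismatching bases."""
    diffs = [(i, b) for i, b in enumerate(read)
             if reference[pos + i] != b]
    return pos, len(read), diffs

def decode_read(reference: str, pos: int, length: int, diffs) -> str:
    """Reconstruct the original read from the reference and the diff list."""
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)

# Example: one substitution is stored instead of the full 12-base read.
ref = "ACGTACGTACGTACGT"
encoded = encode_read(ref, "ACGTACGAACGT", 0)   # -> (0, 12, [(7, 'A')])
assert decode_read(ref, *encoded) == "ACGTACGAACGT"
```

The gain grows with read length because a read that matches the reference well compresses to a position and a handful of diffs regardless of how long it is.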
Article
The computer industry is being challenged to develop methods and techniques for affordable data processing on large datasets at optimum response times. The technical challenges in dealing with the increasing demand to handle vast quantities of data are daunting and on the rise. One of the recent processing models offering a more efficient and intuitive solution for rapidly processing large amounts of data in parallel is MapReduce, a framework defining a template programming approach for large-scale data computation on clusters of machines in a cloud computing environment. MapReduce provides automatic parallelization and distribution of computation across several processors and hides the complexity of writing parallel and distributed code. This paper provides a comprehensive systematic review and analysis of large-scale dataset processing, and of dataset handling challenges and requirements, in a cloud computing environment using the MapReduce framework and its open-source implementation Hadoop. We define requirements for MapReduce systems to perform large-scale data processing, propose a MapReduce framework and an implementation of it on Amazon Web Services, and present an experiment running a MapReduce system in a cloud environment. The paper concludes that MapReduce is one of the best techniques for processing large datasets and can help developers perform parallel and distributed computation in a cloud environment.
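To make the programming model concrete, the sketch below is a minimal local simulation of the classic MapReduce word-count pattern: map emits (key, value) pairs, a shuffle groups them by key, and reduce aggregates each group. This illustrates the model only; it is not the paper's Hadoop/AWS implementation.

```python
# Minimal local simulation of MapReduce word count. In a real framework
# the map tasks, shuffle and reduce tasks run distributed across a cluster.

from collections import defaultdict

def map_phase(document: str):
    """Emit (word, 1) for every word in one input split."""
    for word in document.split():
        yield word.lower(), 1

def shuffle(pairs):
    """Group intermediate pairs by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Sum the counts for one word."""
    return key, sum(values)

splits = ["big data in the cloud", "data as a service in the cloud"]
intermediate = [pair for split in splits for pair in map_phase(split)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
# counts["data"] == 2, counts["cloud"] == 2, counts["service"] == 1
```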
Article
Today’s cloud ecosystem features a number of differing and increasingly diverging interfaces for management. In fact, a number of bridging efforts, such as the well-known libcloud or DeltaCloud projects, have been started to ameliorate the resulting vendor lock-in for the customers. With the current pace of growth in the number of providers, this approach shows its main drawback: the maintenance of adapter implementations. The Open Cloud Computing Interface (OCCI) brings this effort to the next level by taking a different approach: building on the fundamentals of modern web-based services, it defines a standardised interface towards cloud environments while enabling service providers to differentiate at the same time. In this article, we review the need for standards in the cloud arena, discuss the technicalities of OCCI, show its impact, and review its merits with respect to the currently popular proxy approach taken by competing efforts.
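To give a flavour of a standardised, non-vendor-specific interface, the sketch below issues an OCCI-style request to create a compute resource using the text rendering's Category and X-OCCI-Attribute headers. The endpoint URL and attribute values are placeholders, and the exact rendering should be checked against the OCCI specifications rather than taken from this sketch.

```python
# Sketch of creating a compute resource through an OCCI endpoint using the
# text rendering (Category / X-OCCI-Attribute headers). Endpoint URL and
# attribute values are placeholders; consult the OCCI specs for the
# authoritative rendering.

import requests

OCCI_ENDPOINT = "http://cloud.example.org:8787/compute/"  # placeholder

headers = {
    "Content-Type": "text/occi",
    "Category": ('compute; '
                 'scheme="http://schemas.ogf.org/occi/infrastructure#"; '
                 'class="kind"'),
    "X-OCCI-Attribute": "occi.compute.cores=2, occi.compute.memory=4.0",
}

response = requests.post(OCCI_ENDPOINT, headers=headers)
response.raise_for_status()
print("New resource:", response.headers.get("Location"))
```

Because every compliant provider accepts the same rendering, a client like this needs no per-provider adapter, which is the maintenance burden the proxy approach suffers from.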
Conference Paper
The amount of computing resources currently available on Clouds is large and easily accessible under a pay-per-use cost model. E-Science applications that need on-demand execution benefit from Clouds because no permanent computing resources have to be acquired to support peak demand. In this paper, we present AMOS, a system that automates the creation and management of temporary Grids on a Cloud to execute (parts of) application workflows. We performed experiments with AMOS and a representative e-Science application on a research Grid and on the Amazon EC2 Cloud. The results show that AMOS is a viable approach for managing and executing e-Science applications in a flexible Grid environment and for exploring novel mechanisms that allow optimal utilization of Cloud resources. Furthermore, we consider AMOS a step towards an operating system for (virtual) infrastructures that enables Grid applications to control their computational resources at run-time.
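AMOS's internals are not described here, but the primitive such systems build on can be sketched: acquire a pool of temporary Cloud nodes for a workflow run, then release them when it finishes. The sketch below uses boto3 against Amazon EC2; the AMI ID and instance type are placeholders, and this is an assumed illustration, not AMOS's actual implementation.

```python
# Rough sketch of the primitive an AMOS-like system builds on: start a pool
# of temporary worker instances on EC2, then terminate them after the
# workflow completes. AMI ID and instance type are placeholders.

import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

def acquire_temporary_grid(n_nodes: int):
    """Start n_nodes worker instances and wait until they are running."""
    instances = ec2.create_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder worker image
        InstanceType="t3.medium",          # placeholder size
        MinCount=n_nodes,
        MaxCount=n_nodes,
    )
    for instance in instances:
        instance.wait_until_running()
    return instances

def release_temporary_grid(instances):
    """Terminate the temporary nodes once the workflow has finished."""
    for instance in instances:
        instance.terminate()
```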