
Zhiming Zhao
University of Amsterdam | UVA · Institute of Informatics
Ph.D.
158
Publications
31,648
Reads
1,544
Citations
Publications (158)
The IaaS model provides elastic infrastructure that enables the migration of legacy applications to cloud environments. Many cloud computing vendors such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform offer a pay-per-use policy that allows for a sustainable reduction in costs compared to on-premise hosting, as well as enable use...
Virtual research environments (VREs) provide user‐centric support in the lifecycle of research activities, for example, discovering and accessing research assets or composing and executing application workflows. A typical VRE is often implemented as an integrated environment, including a catalog of research assets, a workflow management system, a d...
Industrial applications often require federated cloud services from multiple providers to improve reliability and flexibility. Traditional selection methods through auctions usually involve a centralized auctioneer to coordinate the auction procedure. Blockchain and smart contracts provide a decentralized mechanism to automate the cloud auction pro...
Effectively managing decentralized applications in cloud environments using a decentralized control paradigm is essential, as current cloud providers usually only offer a control interface for monitoring cloud infrastructures. This study proposes a decentralized service control framework for implementing the control across various organizations and...
The FAIR principles have been accepted globally as guidelines for improving data-driven science and data management practices, yet the incentives for researchers to change their practices are presently weak. In addition, data-driven science has been slow to embrace workflow technology despite clear evidence of recurring practices. To overcome these...
Blockchain technologies, e.g., Hyperledger Fabric and Sawtooth, have evolved rapidly in recent years and enable potential decentralised innovations in a substantial number of business applications, e.g., crowd journalism, car-sharing and energy trading. The development of decentralised business applications has to face challenges in selectin...
Data is one of the most valuable assets of an organization and has a tremendous impact on its long-term success and decision-making processes. Typically, organizational data error and outlier detection processes are performed manually and reactively, making them time-consuming and prone to human error. Additionally, rich data types, unlabeled data, and...
Virtual Research Environments (VREs) provide user-centric support in the lifecycle of research activities, e.g., discovering and accessing research assets, or composing and executing application workflows. A typical VRE is often implemented as an integrated environment, which includes a catalog of research assets, a workflow management system, a da...
Research infrastructures play an increasingly essential role in scientific research. They provide rich data sources for scientists, such as services and software packages, via catalog and virtual research environments. However, such research infrastructures are typically domain-specific and often not connected. Accordingly, researchers and practiti...
Cloud computing has been one of the disruptive technologies changing traditional application operation over the last decades. The success of the cloud has driven the construction of ever more data centers. Although these data centers are distributed all around the world, the computing resources are managed in a relatively centralized manner within one big data...
In recent years, blockchain has gained widespread attention as an emerging technology for decentralization, transparency, and immutability in advancing online activities over public networks. As an essential market process, auctions have been well studied and applied in many business fields due to their efficiency and contributions to fair trade. C...
The real‐world complex networks, such as biological, transportation, biomedical, web, and social networks, are usually dynamic and change over time. The communities which reflect the substructures hidden in the networks usually overlap each other, and detecting overlapping communities in the dynamic complex networks is a challenging task. Prior res...
Peak mitigation is of interest to power companies as peak periods may require the operator to over provision supply in order to meet the peak demand. Flattening the usage curve can result in cost savings, both for the power companies and the end users. Integration of renewable energy into the energy infrastructure presents an opportunity to use exc...
Social media applications are essential for next generation connectivity. Today, social media are centralized platforms with a single proprietary organization controlling the network and posing critical trust and governance issues over the created and propagated content. The ARTICONF project [1] funded by the European Union’s Horizon 2020 program r...
Scholars worldwide leverage science gateways/VREs for a wide variety of research and education endeavors spanning diverse scientific fields. Evaluating the value of a given science gateway/VRE to its constituent community is critical in obtaining the financial and human resources necessary to sustain operations and increase adoption in the user com...
In this paper, a Non-negative Matrix Factorization Feature Expansion (NMFFE) approach was proposed to overcome the feature-sparsity issue when expanding features of short-text. Firstly, we took the internal relationships of short texts and words into account when segmenting words from texts and constructing their relationship matrix. Secondly, we u...
In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach was proposed to overcome the feature‐sparsity issue when expanding features of short‐text. First, we took the internal relationships of short texts and words into account when segmenting words from texts and constructing their relationship matrix. Second, we utili...
The provenance of research data is of critical importance to the reproducibility of and trust in scientific results. As research infrastructures provide more amalgamated datasets for researchers and more integrated facilities for processing and publishing data, the capture of provenance in a standard, machine-actionable form becomes especially impo...
Environmental research infrastructures aim to provide scientists with facilities, resources and services to enable scientists to effectively perform advanced research. When addressing societal challenges such as climate change and pollution, scientists usually need data, models and methods from different domains to tackle the complexity of the comp...
E-Infrastructures play an increasingly important part in the provision of digital services to environmental researchers and other users. The availability of reliable networks, storage facilities, high performance and high throughput computers and associated middleware and services to ease their utilisation all contribute to enabling research and it...
Environmental research infrastructures (RIs) support their respective research communities by integrating large-scale sensor/observation networks with data curation and management services, analytical tools and common operational policies. These RIs are developed as service pillars for intra- and interdisciplinary research; however, comprehension o...
The use of metadata to characterise scientific datasets, making data easier to discover and use directly by researchers and via various online data services, is one of the primary concerns of research infrastructures (RIs); also, of concern is the use of metadata to describe equipment, facilities, services and other research assets. Metadata models...
The ENVRI Reference Model provides architects and engineers with the means to describe the architecture and operational behaviour of environmental and Earth science research infrastructures (RIs) in a standardised way using the standard terminology. This terminology and the relationships between specific classes of concept can be used as the basis...
The increasing volumes of data being produced, curated and made available by research infrastructures in the environmental science domain require services able to optimise the delivery staging and process of data on behalf of researchers. Specialised data services for managing the data lifecycle, for creating and delivering data products, and for c...
Advances in automation, communication, sensing and computation enable experimental scientific processes to generate data at increasingly great speeds and volumes. Research infrastructures are devised to take advantage of these data, providing advanced capabilities for acquisition, sharing, processing, and analysis; enabling advanced research and pl...
To perform data-centric research in environmental and earth sciences, researchers need to effectively query, select and access data products from different research infrastructures. When providing observation data continuously, infrastructure is expected to create and deliver customised data products, e.g. for specific geo-regions, time durations or o...
Research infrastructures available for researchers in environmental and Earth science are diverse and highly distributed; dedicated research infrastructures exist for atmospheric science, marine science, solid Earth science, biodiversity research, and more. These infrastructures aggregate and curate key research datasets and provide consolidated da...
After a brief reminder on general concepts used in data cataloguing activities, this chapter provides information concerning the architecture and design recommendations for the implementation of catalogue systems for the ENVRIplus community. The main objective of this catalogue is to offer a unified discovery service allowing cross-disciplinary sea...
The ARTICONF project funded by the European Horizon 2020 program addresses issues of trust, time-criticality and democratisation for a new generation of federated infrastructure, to fulfil the privacy, robustness, and autonomy related promises critical in proprietary social media platforms. It aims to: (1) simplify the creation of open and agile so...
Cloud environments can provide virtualized, elastic, controllable and high-quality on-demand infrastructure services for supporting complex distributed applications. However, existing IaaS (Infrastructure-as-a-Service) solutions mainly focus on the automated integration or deployment of generic applications; they lack flexible infrastructure planni...
The current cloud market is dominated by a few providers, which offer cloud services in a take‐it‐or‐leave‐it manner. However, the dynamism and uncertainty of cloud environments may require the change over time of both application requirements and service capabilities. The current service‐level agreement (SLA) management solutions cannot easily gua...
This deliverable introduces the FAIR principles, describes the approach chosen for the FAIRness assessment, gives insights into the assessment results at the project/subdomain level (and for each RI in the protected project-internal Redmine environment) and discusses the requirements for achieving FAIRer data and services. It provides a summary of...
This open access book summarises the latest developments on data management in the EU H2020 ENVRIplus project, which brought together more than 20 environmental and Earth science research infrastructures into a single community. It provides readers with a systematic overview of the common challenges faced by research infrastructures and how a ‘refe...
As microservice architecture is becoming more popular than ever, developers intend to transform traditional monolithic applications into service-based applications (composed of a number of services). To deploy a service-based application in clouds, besides the resource demands of each service, the traffic demands between collaborative services are...
Purpose
The purpose of this paper is to boost multidisciplinary research by building an integrated catalogue of research assets metadata. Such an integrated catalogue should enable researchers to solve problems or analyse phenomena that require a view across several scientific domains.
Design/methodology/approach
There are two main approach...
Cloud customers and providers lack the mutual trust needed to enforce a traditional cloud SLA (Service Level Agreement), a problem for which blockchain seems a promising solution. However, current approaches still struggle to prove that off‐chain SLO (Service Level Objective) violations really happened before they are recorded into the on‐chain trans...
By effectively virtualizing operating systems and encapsulating necessary runtime contexts of software components and services, container technologies can significantly improve portability and efficiency for distributed application deployment. It flexibly extends virtual machine based cloud (Infrastructure-as-a-Service) as a much lighter virtual en...
Semantic annotation is a crucial part of achieving the vision of the Semantic Web and has long been a research topic among various communities. The most challenging problem in reaching the Semantic Web’s real potential is the gap between a large amount of unlabeled existing/new data and the limited annotation capability available. To resolve this p...
The ARTICONF project funded by the European Horizon 2020 program addresses issues of trust, time-criticality and democratisation for a new generation of federated infrastructure, to fulfil the privacy, robustness, and autonomy related promises critical in proprietary social media platforms. It aims to: (1) simplify the creation of open and agile...
The infrastructure‐as‐a‐service (IaaS) model of cloud computing provides virtual infrastructure functions (VIFs), which allow application developers to flexibly provision suitable virtual machines' (VM) types and locations, and even configure the network connection for each VM. Because of the pay‐as‐you‐go business model, IaaS provides an elastic w...
Well-founded data management systems are of vital importance for ocean observing systems as they ensure that essential data are not only collected but also retained and made accessible for analysis and application by current and future users. Effective data management requires collaboration across activities including observations, metadata and dat...
Schema matching is a long-standing challenge in many database-related applications, such as data integration, where two databases with different schemas have to be integrated. With the evolution from databases to big data, schema matching has been enriched with various purposes and application contexts, ranging from data integration, to s...
In this paper, we study the scheduling decisions for handling deadline-constrained workflows in the context of planning customized virtual infrastructures in the cloud. We specifically focus on the effects of using different types of greediness in selecting cost-effective virtual machines for the tasks in an application's workflow graph. The profil...
Virtual Research Environments (VREs), also known as science gateways or virtual laboratories, assist researchers in data science by integrating tools for data discovery, data retrieval, workflow management and researcher collaboration, often coupled with a specific computing infrastructure. Recently, the push for better open data science has led to...
Time-critical applications, such as early warning systems or live event broadcasting, present particular challenges. They have hard limits on Quality of Service constraints that must be maintained, despite network fluctuations and varying peaks of load. Consequently, such applications must adapt elastically on-demand, and so must be capable of reco...
We propose a novel mechanism to generate a family of deterministic small-world and scale-free networks by inserting new nodes into old nodes. These models can characterize the distinguishing properties of many real-world networks, because this novel class of networks incorporates some key properties characterizing a majority of real-world networked...