
Hans-Arno Jacobsen
University of Toronto | U of T
571 Publications
95,242 Reads
11,238 Citations
Publications (571)
EV charging infrastructures traditionally rely on untrusted centralized infrastructures that pose several privacy and security threats to EVs’ personal information. Targeted advertisements, privacy leaks, and the sale of data to third parties are among the threats to privacy and security. By utilizing blockchain-based solutions, recent work addresses the se...
Graph neural networks (GNNs) are a type of neural network capable of learning on graph-structured data. However, training GNNs on large-scale graphs is challenging due to iterative aggregations of high-dimensional features from neighboring vertices within sparse graph structures combined with neural network operations. The sparsity of graphs freque...
Stream processing acceleration is driven by the continuously increasing volume and velocity of data generated on the Web and the limitations of storage, computation, and power consumption. Hardware solutions provide better performance and power consumption, but they are hindered by the high research and development costs and the long time to market...
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients. However, new approaches to FL often discuss their contributions involving small deep-learning models only and focus on training full models on clients. In the wake of Foundation Models (FM), the reality...
The European Union Artificial Intelligence Act mandates clear stakeholder responsibilities in developing and deploying machine learning applications to avoid substantial fines, prioritizing private and secure data processing with data remaining at its origin. Federated Learning (FL) enables the training of generative AI models across data silos, s...
Similar to other transaction processing frameworks, blockchain systems need to be dynamically reconfigured to adapt to varying workloads and changes in network conditions. However, achieving optimal reconfiguration is particularly challenging due to the complexity of the blockchain stack, which has diverse configurable parameters. This paper explor...
This paper aims to answer the question: Can deep learning models be cost-efficiently trained on a global market of spot VMs spanning different data centers and cloud providers? To provide guidance, we extensively evaluate the cost and throughput implications of training in different zones, continents, and clouds for representative CV, NLP and ASR m...
The field of quantum computing has the potential to transform quantum chemistry. The variational quantum eigensolver (VQE) algorithm has allowed quantum computing to be applied to chemical problems in the...
Byzantine fault-tolerant (BFT) consensus algorithms are at the core of providing safety and liveness guarantees for distributed systems that must operate in the presence of arbitrary failures. Recently, numerous new BFT algorithms have been proposed, not least due to the traction blockchain technologies have garnered in the search for consensus sol...
Recently, graph neural networks (GNNs) have gained much attention as a growing area of deep learning capable of learning on graph-structured data. However, the computational and memory requirements for training GNNs on large-scale graphs can exceed the capabilities of single machines or GPUs, making distributed GNN training a promising direction fo...
This paper proposes PrestigeBFT, a novel leader-based BFT consensus algorithm that addresses the weaknesses of passive view-change protocols. Passive protocols blindly rotate leadership among servers on a predefined schedule, potentially selecting unavailable or slow servers as leaders. PrestigeBFT proposes an active view-change protocol using repu...
Federated Machine Learning (FL) has received considerable attention in recent years. FL benchmarks are predominantly explored in either simulated systems or data center environments, neglecting the setups of real-world systems, which are often closely linked to edge computing. We close this research gap by introducing FLEdge, a benchmark targeting...
Training deep learning models in the cloud or on dedicated hardware is expensive. A more cost-efficient option is to use hyperscale clouds offering spot instances, a cheap but ephemeral alternative to on-demand resources. As spot instance availability can change depending on the time of day, continent, and cloud provider, it could be more cost-efficient...
Aside from the conception of new blockchain architectures, existing blockchain optimizations in the literature primarily focus on system or data-oriented optimizations within prevailing blockchains. However, since blockchains handle multiple aspects ranging from organizational governance to smart contract design, a holistic approach that encompasse...
Blockchains are decentralized; are they genuinely? We analyze blockchain decentralization's often-overlooked but quantifiable dimension: geospatial distribution of transaction processing. Blockchains bring with them the potential for geospatially distributed transaction processing. They enable validators from geospatially distant locations to parta...
Graph Neural Networks (GNNs) are an emerging research field. This specialized Deep Neural Network (DNN) architecture is capable of processing graph structured data and bridges the gap between graph processing and Deep Learning (DL). As graphs are everywhere, GNNs can be applied to various domains including recommendation systems, computer vision, n...
For distributed graph processing on massive graphs, a graph is partitioned into multiple equally-sized parts which are distributed among machines in a compute cluster. In the last decade, many partitioning algorithms have been developed which differ from each other with respect to the partitioning quality, the run-time of the partitioning and the t...
Although rooftop PV panels and battery energy storage systems have been well established for detached residential buildings, there is still a lack of access to the advantages of onsite renewable energy generation and consumption for residents of multi-unit buildings. To understand the effects of developing distributed renewable energy sources for m...
This paper presents V-Guard, a new permissioned blockchain that achieves consensus for vehicular data under changing memberships, targeting the problem in V2X networks where vehicles are often intermittently connected on the roads. To achieve this goal, V-Guard integrates membership management into the consensus process for agreeing on data entries...
Operating a scalable and reliable server application, such as publish/subscribe (pub/sub) systems, requires tremendous development efforts and resources. The emerging serverless paradigm simplifies the development and deployment of highly available applications by delegating most operational concerns to the cloud providers. The serverless paradigm...
With the ongoing integration of Renewable Energy Sources (RES), the complexity of power grids is increasing. Due to the fluctuating nature of RES, ensuring the reliability of power grids can be challenging. One possible approach for addressing these challenges is Demand Response (DR) which is described as matching the demand for electrical energy a...
Existing permissioned blockchains often rely on coordination-based consensus protocols to ensure the safe execution of applications in a Byzantine environment. Furthermore, these protocols serialize the transactions by ordering them into a total global order. The serializability preserves the correctness of the application's state stored on the blo...
Non-intrusive load monitoring (NILM) aims at energy consumption and appliance state information retrieval from aggregated consumption measurements, with the help of signal processing and machine learning algorithms. Representation learning with deep neural networks is successfully applied to several related disciplines. The main advantage of repres...
The DEBS Grand Challenge (GC) is an annual programming competition open to practitioners from both academia and industry. The GC 2022 edition focuses on real-time complex event processing of high-volume tick data provided by Infront Financial Technology GmbH. The goal of the challenge is to efficiently compute specific trend indicators and detect p...
The growing number of data centers consumes a vast amount of energy for processing. There is a desire to reduce the environmental footprint of the IT industry, and one way to achieve this is to use renewable energy sources. A challenge with using renewable resources is that the energy output is irregular as a consequence of the intermittent nature...
Byzantine fault-tolerant (BFT) consensus algorithms are at the core of providing safety and liveness guarantees for distributed systems that must operate in the presence of arbitrary failures. Recently, numerous new BFT algorithms have been proposed, not least due to the traction blockchain technologies have garnered in the search for consensus solutio...
Graph edge partitioning is an important preprocessing step to optimize distributed computing jobs on graph-structured data. The edge set of a given graph is split into $k$ equally-sized partitions, such that the replication of vertices across partitions is minimized. Out-of-core edge partitioning algorithms are able to tackle the problem with low m...
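The objective stated in this abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the paper: the names `replication_factor` and `partition_sizes` are hypothetical, and it assumes the common definition of the replication factor as the average number of partitions in which a vertex appears. It evaluates a given assignment of edges to partitions; it does not perform the partitioning itself.

```python
from collections import Counter, defaultdict


def replication_factor(edge_partition):
    """Average number of partitions each vertex is replicated to.

    edge_partition maps an edge (u, v) to a partition id in range(k).
    (Assumed definition; lower is better, 1.0 means no replication.)
    """
    parts_of = defaultdict(set)
    for (u, v), part in edge_partition.items():
        parts_of[u].add(part)
        parts_of[v].add(part)
    return sum(len(parts) for parts in parts_of.values()) / len(parts_of)


def partition_sizes(edge_partition):
    """Number of edges assigned to each partition (the balance constraint)."""
    return Counter(edge_partition.values())


# A 4-cycle split into k = 2 partitions of two edges each, in two ways.
contiguous = {(0, 1): 0, (1, 2): 0, (2, 3): 1, (3, 0): 1}
alternating = {(0, 1): 0, (1, 2): 1, (2, 3): 0, (3, 0): 1}

print(replication_factor(contiguous))   # 1.5
print(replication_factor(alternating))  # 2.0
```

Both assignments are perfectly balanced, but keeping adjacent edges together replicates fewer vertices, which is the quantity an edge partitioner tries to minimize.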
Growing excitement around permissionless blockchains is uncovering their latent scalability concerns. Permissioned blockchains offer high transactional throughput and low latencies while compromising decentralization. In the quest for a decentralized, scalable blockchain fabric, i.e., to offer the scalability of permissioned blockchain in a permissio...
Interwell connectivity plays a key role in waterflooding for guiding water injection. The existing works focus on the response relationship between one injection well and one production well. No research has explored the structural information of waterflooding on a well pattern. To address this challenge, this paper proposes cooperation-mission neu...
Leader-based consensus protocols must undergo a view-change phase to elect a new leader when the current leader fails. The new leader is often a candidate server that collects votes from a quorum of servers. However, voting-based election mechanisms intrinsically cause competition in leadership candidacy when each candidate collects on...
Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy. Maximizing resource utilization is becoming more challenging as the throughput of training processes increases with hardware innovations (e.g., faster GPUs, TPUs, and inter-connects) and advanced parallelization techniques that yi...
Transfer learning is a well-studied concept in machine learning that relaxes the assumption that training and testing data need to be drawn from the same distribution. Recent success in applying transfer learning in the area of computer vision has also motivated research on transfer learning in the context of time series data. This benefits learning i...
In smart grids, the large-scale integration of distributed renewable energy resources has enabled the provisioning of alternative sources of supply. Peer-to-peer (P2P) energy trading among local households is becoming an emerging technique that benefits both energy prosumers and operators. Since conventional energy supply is still needed to help fi...
We present a new approach for designing reliable and scalable overlay networks to support topic-based pub/sub communication. We propose the $\mathsf{MinAvg}\text{-}k\mathsf{TCO}$ problem parameterized by $k$: use the minimum number of edges to create a $k$-topic-connected overlay ($k\mathsf{TCO}$) for pub/sub systems, i.e., for each...
Blockchains have witnessed widespread adoption in the past decade in various fields. The growing demand makes their scalability and sustainability challenges more evident than ever. As a result, more and more blockchains have begun to adopt proof-of-stake (PoS) consensus protocols to address those challenges. One of the fundamental characteristics...
Current Internet of Things (IoT) infrastructures rely on cloud storage; however, relying on a single cloud provider puts limitations on the IoT applications and Service Level Agreement (SLA) requirements. Recently, multiple decentralized storage solutions (e.g., based on blockchains) have entered the market with distinct architecture, Quality of Ser...
Due to the recent explosion of data volume and velocity, a new array of lightweight key-value stores have emerged to serve as alternatives to traditional databases. The majority of these storage engines, however, sacrifice their read performance in order to cope with write throughput by avoiding random disk access when writing a record in favor of...
Distributed systems that manage and process graph-structured data internally solve a graph partitioning problem to minimize their communication overhead and query run-time. Besides computational complexity -- optimal graph partitioning is NP-hard -- another important consideration is the memory overhead. Real-world graphs often have an immense size...
Permissioned blockchain systems promise to provide both decentralized trust and privacy. Hyperledger Fabric is currently one of the most widespread permissioned blockchain systems and is heavily promoted both in industry and academia. Due to its optimistic concurrency model, the transaction failure rates in Fabric can become a bottleneck. While th...
Adaptive workflow management systems allow workflows to be changed in both the modeling and runtime stages, resulting in many workflow variants. Identifying a minimum sequence of high-level changes between two workflows represents a fundamental yet critical issue. The state-of-the-art approach utilizes digital logic to seek the optimal solution; ho...
Process mining aims at discovering behavioral knowledge of business processes from their event logs, which has received increasing attention in the era of cloud computing and big data. Surprisingly, to date, discovering structural errors (e.g., deadlocks and lack of synchronization) from event logs has not been considered in state-of-the-art pro...
In peer-to-peer (P2P) energy trading, the incorporation of distributed energy resources with unprotected data, originating from sources such as home energy management systems that are connected through the Internet, creates vulnerabilities that can lead to security breaches. In this article, two threat scenarios based on a novel false data inject...
The Intelligent Transportation System (ITS) has become essential for the economic and technological development of a country. The maturity of communication technologies (Vehicle to Infrastructure (V2I) and Vehicle to Vehicle (V2V)) and the amalgamation of smart grids, electric vehicles (EVs) and energy trading resulted in a storm of research oppo...
Monitoring the internal conditions of a machine is essential to increase its production efficiency and to reduce energy waste. Non-intrusive condition monitoring techniques, such as analysing electrical signals, provide insights by disaggregating a composite signal of a machine as a whole into the individual components to determine their states. De...
Due to the recent explosion of data volume and velocity, a new array of lightweight key-value stores have emerged to serve as alternatives to traditional databases. The majority of these storage engines, however, sacrifice their read performance in order to cope with write throughput by avoiding random disk access when writing a record in favor of fast...
The accurate detection of appliance state transitions in electrical signals is fundamental for numerous energy-conserving applications. We present an extensive overview and categorization of the current state in event detection on high-sampling-rate signals. Existing approaches are designed for specific environments and need to be tediously adapted...
With energy consumption in high-performance computing clouds growing rapidly, energy saving has become an important topic. Virtualization provides opportunities to save energy by enabling one physical machine (PM) to host multiple virtual machines (VMs). Dynamic voltage and frequency scaling (DVFS) is another technology to reduce energy consumption...
In traditional IP-based publish/subscribe middleware, a detour through an overlay network is required to match events with defined filters, which introduces more latency overhead for delivering events from publishers to subscribers. The emerging Software Defined Networking (SDN) creates boundless possibilities to improve the efficiency of event delivery b...