Bernhard Haslhofer

AIT Austrian Institute of Technology · Center of Digital Safety & Security, Digital Insight Lab

PhD

About

106
Publications
34,084
Reads
1,726
Citations
Introduction
Bernhard Haslhofer currently works as a Senior Scientist at AIT's Digital Insight Lab. His research interest lies in finding and applying quantitative methods for gaining new insights from large-scale, connected datasets. He often acts as a bridge researcher between fields and contributes practical methods and tools drawn from machine learning, network analytics, and text mining. His current research topics are Cryptocurrency Analytics, Predictive Maintenance, and Knowledge Graph Engineering.
Additional affiliations
March 2013 - February 2014
University of Vienna
Position
  • PostDoc Position
March 2011 - February 2013
Cornell University
Position
  • PostDoc Position

Publications (106)
Preprint
Decentralized finance (DeFi) has been the target of numerous profit-driven crimes, but the prevalence and cumulative impact of these crimes have not yet been assessed. This study provides a comprehensive assessment of profit-driven crimes targeting the DeFi sector. We collected data on 1,153 crime events from 2017 to 2022. Of these, 1,048 were relat...
Article
Full-text available
Decentralized Finance (DeFi) is a new financial paradigm that leverages distributed ledger technologies to offer services such as lending, investing, or exchanging cryptoassets without relying on traditional centralized intermediaries. A range of DeFi protocols implements these services as a suite of smart contracts, i.e., software programs that en...
Chapter
The rapid growth of the Ethereum ecosystem since 2020 has been driven by the proliferation of several DeFi protocols [10], which are application-layer programs that provide Decentralized Finance (DeFi) services [14, 16] such as the exchange of cryptoassets on decentralized exchanges (DEXs) [2, 7, 15], their lending and borrowing [1, 4, 8], or the c...
Article
Full-text available
Investors commonly exhibit the disposition effect—the irrational tendency to sell their winning investments and hold onto their losing ones. While this phenomenon has been observed in many traditional markets, it remains unclear whether it also applies to atypical markets like cryptoassets. This paper investigates the prevalence of the disposition...
Article
Full-text available
We present a measurement study on compositions of Decentralized Finance (DeFi) protocols, which aim to disrupt traditional finance and offer services on top of distributed ledgers, such as Ethereum. Understanding DeFi compositions is of great importance, as they may impact the development of ecosystem interoperability, are increasingly integrated w...
Preprint
Full-text available
One of the defining features of Bitcoin and the thousands of cryptocurrencies that have been derived from it is a globally visible transaction ledger. While Bitcoin uses pseudonyms as a way to hide the identity of its participants, a long line of research has demonstrated that Bitcoin is not anonymous. This has been perhaps best exemplified by the...
Preprint
Full-text available
We present the first study on compositions of Decentralized Finance (DeFi) protocols, which aim to disrupt traditional finance and offer financial services on top of distributed ledgers, such as Ethereum. Starting from a ground-truth of 23 DeFi protocols and 10,663,881 associated accounts, we study the interactions of DeFi protocols and ass...
Chapter
Bitcoin (BTC) pseudonyms (layer 1) can effectively be deanonymized using heuristic clustering techniques. However, while performing transactions off-chain (layer 2) in the Lightning Network (LN) seems to enhance privacy, a systematic analysis of the anonymity and privacy leakages due to the interaction between the two layers is missing. We present...
Preprint
Full-text available
We present a first measurement study on two popular wallets with built-in distributed CoinJoin functionality, Wasabi and Samourai, in the context of the broader Bitcoin ecosystem. By applying two novel heuristics, we can effectively pinpoint 25,070 Wasabi and 134,569 Samourai transactions within the first 689,255 (2021-07-01) blocks. Our study reve...
Preprint
Full-text available
There is currently an increasing demand for cryptoasset analysis tools among cryptoasset service providers, the financial industry in general, as well as across academic fields. At the moment, one can choose between commercial services or low-level open-source tools providing programmatic access. In this paper, we present the design and implementat...
Preprint
Full-text available
Investors tend to sell their winning investments and hold onto their losers. This phenomenon, known as the disposition effect in the field of behavioural finance, is well-known and its prevalence has been shown in a number of existing markets. But what about new atypical markets like cryptocurrencies? Do investors act as irrationally as in t...
Chapter
In the proof-of-stake (PoS) paradigm for maintaining decentralized, permissionless cryptocurrencies, Sybil attacks are prevented by basing the distribution of roles in the protocol execution on the stake distribution recorded in the ledger itself. However, for various reasons this distribution cannot be completely up-to-date, introducing a gap betw...
Preprint
Payment channel networks (PCNs) have emerged as a promising alternative to mitigate the scalability issues inherent to cryptocurrencies like Bitcoin and are often assumed to improve privacy, as payments are not stored on chain. However, a systematic analysis of possible deanonymization attacks is still missing. In this paper, we focus on the Bitcoi...
Article
Full-text available
Analyzing cryptocurrency payment flows has become a key forensic method in law enforcement and is nowadays used to investigate a wide spectrum of criminal activities. However, despite its widespread adoption, the evidential value of obtained findings in court is still largely unclear. In this paper, we focus on the key ingredients of modern cryptoc...
Chapter
Travelogues represent an important and intensively studied source for scholars in the humanities, as they provide insights into people, cultures, and places of the past. However, existing studies rarely utilize more than a dozen primary sources, since the human capacities of working with a large number of historical sources are naturally limited. I...
Preprint
Blockchains are typically managed by peer-to-peer (P2P) networks providing the support and substrate to the so-called distributed ledger (DLT), a replicated, shared, and synchronized data structure, geographically spread across multiple nodes. The Bitcoin (BTC) blockchain is by far the best-known DLT, used to record transactions among peers, ba...
Preprint
In the proof-of-stake (PoS) paradigm for maintaining decentralized, permissionless cryptocurrencies, Sybil attacks are prevented by basing the distribution of roles in the protocol execution on the stake distribution recorded in the ledger itself. However, for various reasons this distribution cannot be completely up-to-date, introducing a gap betw...
Preprint
Full-text available
Travelogues represent an important and intensively studied source for scholars in the humanities, as they provide insights into people, cultures, and places of the past. However, existing studies rarely utilize more than a dozen primary sources, since the human capacities of working with a large number of historical sources are naturally limited. I...
Conference Paper
Full-text available
In the past year, a new spamming scheme has emerged: sexual extortion messages requiring payments in the cryptocurrency Bitcoin, also known as sextortion. This scheme represents a first integration of the use of cryptocurrencies by members of the spamming industry. Using a dataset of 4,340,736 sextortion spams, this research aims at understanding s...
Chapter
Monero is a privacy-centric cryptocurrency that makes payments untraceable by adding decoys to every real input spent in a transaction. Two studies from 2017 found methods to distinguish decoys from real inputs, which enabled traceability for a majority of transactions. Since then, a number of protocol changes have been introduced, but their effective...
Preprint
Full-text available
In the past year, a new spamming scheme has emerged: sexual extortion messages requiring payments in the cryptocurrency Bitcoin, also known as sextortion. This scheme represents a first integration of the use of cryptocurrencies by members of the spamming industry. Using a dataset of 4,340,736 sextortion spams, this research aims at understanding s...
Preprint
Analyzing cryptocurrency payment flows has become a key forensic method in law enforcement and is nowadays used to investigate a wide spectrum of criminal activities. However, despite its widespread adoption, the evidential value of obtained findings in court is still largely unclear. In this paper, we focus on the key ingredients of modern cryptoc...
Preprint
Full-text available
Miners play a key role in cryptocurrencies such as Bitcoin: they invest substantial computational resources in processing transactions and minting new currency units. It is well known that an attacker controlling more than half of the network's mining power could manipulate the state of the system at will. While the influence of large mining pools...
Preprint
Full-text available
Monero is a privacy-centric cryptocurrency that makes payments untraceable by adding decoys to every real input spent in a transaction. Two studies from 2017 found methods to distinguish decoys from real inputs, which enabled traceability for a majority of transactions. Since then, a number of protocol changes have been introduced, but their effective...
Conference Paper
Full-text available
Investigating perceptions of Otherness is the overarching goal of the Travelogues project. It studies a corpus comprising thousands of recently digitized travelogues dating back to the 16th century held by the Austrian National Library. Driven by an interdisciplinary team of historians and data scientists, it aims at making knowledge that is now...
Article
Full-text available
Ransomware can prevent a user from accessing a device and its files until a ransom is paid to the attacker, most frequently in Bitcoin. With over 500 known ransomware families, it has become one of the dominant cybercrime threats for law enforcement, security professionals and the public. However, a more comprehensive, evidence-based picture on the...
Article
Full-text available
Knowledge graphs represent concepts (e.g., people, places, events) and their semantic relationships. As a data structure, they underpin a digital information system, support users in resource discovery and retrieval, and are useful for navigation and visualization purposes. Within the libraries and humanities domain, knowledge graphs are typically r...
Chapter
Bitcoin is a decentralized virtual currency that can be used to make pseudo-anonymized payments worldwide within a short time and at comparatively low transaction costs. In this paper, we present the first results of a long-term study of the Bitcoin address curve, covering all addresses and transactions since the...
Poster
Full-text available
Knowledge graphs have become increasingly important in information retrieval tasks. However, if semantically interlinked concepts do not reflect the semantics of a document corpus, users might be confronted with non-relevant query results. In this work, we propose a network-metrics based method that allows assessment of knowledge graph quality with...
Research
Full-text available
This data literacy module results from the Cluster K “Data Librarian” working group, which was active between September 2015 and June 2016. The goal was to suggest two modules that can supplement current librarian/information professional courses at Austrian universities. It should qualify information professionals and students to deal with dig...
Poster
Bitcoin is a rising digital currency and exemplifies the growing need for systematically gathering and analyzing public transaction data sets such as the blockchain. However, the blockchain in its raw form is just a large ledger listing transfers of currency units between alphanumeric character strings, without revealing contextually relevant real-...
Article
Cultural institutions are increasingly contributing content to social media platforms to raise awareness and promote use of their collections. Furthermore, they are often the recipients of user comments containing information that may be incorporated in their catalog records. However, not all user-generated comments can be used for the purpose of e...
Conference Paper
Full-text available
Web vocabularies provide organization and orientation in information environments and can facilitate resource discovery and retrieval. Several tools have been developed that support quality assessment for the increasing number of vocabularies expressed in SKOS and published as Linked Data. However, these tools do not yet take into account the users...
Article
Europeana is the European Union's flagship digital cultural heritage initiative. The Europeana portal, launched in November 2008, showcases the possibility of cross-cultural domain interoperability on a pan-European level. To date, metadata and thumbnails for over 23 million objects have been aggregated from over 1500 providers from the library, ar...
Conference Paper
Full-text available
Cultural institutions are increasingly opening up their repositories and contributing digital objects to social media platforms such as Flickr. In return they often receive user comments containing information that could be incorporated in their catalog records. Since judging the usefulness of a large number of user comments is a labor-intensive task...
Article
Full-text available
Maintenance of multiple, distributed up-to-date copies of collections of changing Web resources is important in many application contexts and is often achieved using ad hoc or proprietary synchronization solutions. ResourceSync is a resource synchronization framework that integrates with the Web architecture and leverages XML sitemaps. We define a...
Conference Paper
Full-text available
Knowledge organization systems such as thesauri or taxonomies are increasingly being expressed using the Simple Knowledge Organization System (SKOS) and published as structured data on the Web. Search engines can exploit these vocabularies and improve search by expanding terms at query or document indexing time. We propose a SKOS-based term expansi...
Conference Paper
Many applications need up-to-date copies of collections of changing Web resources. Such synchronization is currently achieved using ad-hoc or proprietary solutions. We propose ResourceSync, a general Web resource synchronization protocol that leverages XML Sitemaps. It provides a set of capabilities that can be combined in a modular manner to meet...
Conference Paper
We address the Named Entity Disambiguation (NED) problem for short, user-generated texts on the social Web. In such settings, the lack of linguistic features and sparse lexical context result in a high degree of ambiguity and sharp performance drops of nearly 50% in the accuracy of conventional NED systems. We handle these challenges by developing...
Conference Paper
We address the Named Entity Disambiguation (NED) problem for short, user-generated texts on the social Web. In such settings, the lack of linguistic features and sparse lexical context result in a high degree of ambiguity and sharp performance drops of nearly 50% in the accuracy of conventional NED systems. We handle these challenges by developing...
Article
Full-text available
Many applications need up-to-date copies of collections of changing Web resources. Such synchronization is currently achieved using ad-hoc or proprietary solutions. We propose ResourceSync, a general Web resource synchronization protocol that leverages XML Sitemaps. It provides a set of capabilities that can be combined in a modular manner to meet...
Article
Full-text available
We address the Named Entity Disambiguation (NED) problem for short, user-generated texts on the social Web. In such settings, the lack of linguistic features and sparse lexical context result in a high degree of ambiguity and sharp performance drops of nearly 50% in the accuracy of conventional NED systems. We handle these challenges by developing...
Conference Paper
Full-text available
Tags assigned by users to shared content can be ambiguous. As a possible solution, we propose semantic tagging as a collaborative process in which a user selects and associates Web resources drawn from a knowledge context. We applied this general technique in the specific context of online historical maps and allowed users to annotate and tag them....
Article
This is the second paper in D-Lib Magazine about the ResourceSync effort conducted by the National Information Standards Organization (NISO) and the Open Archives Initiative (OAI). The first part provided a perspective on the resource synchronization problem and introduced a template that organized possible components of a resource synchronization...
Article
Europeana is a single access point to millions of books, paintings, films, museum objects and archival records that have been digitized throughout Europe. The data.europeana.eu Linked Open Data pilot dataset contains open metadata on approximately 2.4 million texts, images, videos and sounds gathered by Europeana. All metadata are released under Cr...
Article
With the increasing storage capacity of personal computing devices, the problems of information overload and information fragmentation are apparent on users’ desktops. For the Web, semantic technologies solve this problem by adding a machine-interpretable information layer on top of existing resources. It has been shown that the application of thes...
Article
Web applications frequently leverage resources made available by remote web servers. As resources are created, updated, deleted, or moved, these applications face challenges to remain in lockstep with changes on the server. Several approaches exist to help meet this challenge for use cases where "good enough" synchronization is acceptable. But when...
Article
Full-text available
Linked Data is a way of exposing and sharing data as resources on the Web and interlinking them with semantically related resources. In the last three years significant amounts of data have been generated, increasingly forming a globally connected, distributed data space. For multimedia content, metadata are a key factor for efficient management, o...
Conference Paper
Full-text available
The Simple Knowledge Organization System (SKOS) is a standard model for controlled vocabularies on the Web. However, SKOS vocabularies often differ in terms of quality, which reduces their applicability across system boundaries. Here we investigate how we can support taxonomists in improving SKOS vocabularies by pointing out quality issues that go...
Article
Full-text available
Many Web portals allow users to associate additional information with existing multimedia resources such as images, audio, and video. However, these portals are usually closed systems and user-generated annotations are almost always kept locked up and remain inaccessible to the Web of Data. We believe that an important step to take is the integrati...
Conference Paper
The World Wide Web has changed the way of publishing and distributing scholarly results. However, scholarly publications are still organized linearly and point to supplemental or related information only by textual references or at most by hyperlinks embedded into PDF documents. They are stored in closed repositories and we can hardly access, navig...
Chapter
With the increasing storage capacity of personal computing devices, the problems of information overload and information fragmentation are apparent on users’ desktops. For the Web, semantic technologies solve this problem by adding a machine-interpretable information layer on top of existing resources. It has been shown that the application of thes...
Conference Paper
Full-text available
The MEKETREpository (MR) allows scholars to collect and publish artwork descriptions from Egypt's Middle Kingdom (MK) period on the Web. Collaboratively developed vocabularies can be used for the semantic classification and annotation of uploaded media. This allows all users with system access to contribute their knowledge about the published artwo...
Article
The dynamics of linked datasets may lead to broken links if data providers do not react to changes appropriately. Such broken links denote interrupted navigational paths between resources and may lead to unavailability of data. As a possible solution, we developed DSNotify, an event-detection framework that informs actors about various types of cha...
Article
Full-text available
Annotations allow users to associate additional information with existing resources. Using proprietary and closed systems on the Web, users are already able to annotate multimedia resources such as images, audio and video. So far, however, this information is almost always kept locked up and inaccessible to the Web of Data. We believe that an impor...
Conference Paper
Full-text available
Historic maps are valuable scholarly resources that record information often retained by no other written source. With the YUMA Map Annotation Tool we want to facilitate collaborative annotation for scholars studying historic maps, and allow for semantic augmentation of annotations with structured, contextually relevant information retrieved from L...
Conference Paper
Annotations are a fundamental scholarly practice common across disciplines. They enable scholars to organize, share and exchange knowledge, and collaborate in the interpretation of source material. In this paper, we introduce the YUMA Media Annotation Framework, an ongoing open source effort to provide integrated collaborative annotation functional...
Article
Full-text available
Cultural heritage institutions and private collections such as the Library of Congress or the David Rumsey Map Collection are increasingly providing free online access to high-resolution scans of old maps. With the YUMA Map Annotation Tool, we want to facilitate collaborative scholarly annotation for such online resources. A central feature of our...
Article
Full-text available
With our YUMA annotation tool we allow scholars to annotate digitized historic maps. Besides common annotation functionality it supports novel annotation features, such as semantic linking and georeferencing. In this document, we briefly outline these features, the collections and users we are targeting, as well as the project background and organ...
Conference Paper
Full-text available
Interoperability is a qualitative property of computing infrastructures that denotes the ability of sending and receiving systems to exchange and properly interpret information objects across system boundaries. Since this property is not given by default, the interoperability problem and the representation of semantics have been an active research...
Article
Full-text available
Cultural heritage institutions and private collections such as the Library of Congress or the David Rumsey Map Collection are increasingly providing free online access to high-resolution scans of old maps. With the YUMA Map Annotation Tool, we want to facilitate collaborative scholarly annotation for such online resources. A central feature of our...
Article
Full-text available
data.europeana.eu is an ongoing effort to make Europeana metadata available as Linked Open Data on the Web. It allows others to access metadata collected from Europeana data providers via standard Web technologies. The data are represented in the Europeana Data Model (EDM) and the described resources are addressable and dereferenceable by their UR...