Ohad Rodeh

Ultima Genomics · Software

PhD


36
Publications
28,395
Reads
1,408
Citations
Citations since 2017
1 Research Item
470 Citations


Publications (36)
Preprint
Full-text available
As ever-larger cohorts of human genomes are collected in pursuit of genotype/phenotype associations, sequencing informatics must scale up to yield complete and accurate genotypes from vast raw datasets. Joint variant calling, a data processing step entailing simultaneous analysis of all participants sequenced, exhibits this scaling challenge acutel...
Article
Full-text available
Massive block IO systems are the workhorses powering many of today’s largest applications. Databases, health care systems, and virtual machine images are examples of block storage applications. The massive scale of these workloads, and the complexity of the underlying storage systems, make it difficult to pinpoint problems when they occur. This w...
Article
Full-text available
BTRFS is a Linux filesystem that has been adopted as the default filesystem in some popular versions of Linux. It is based on copy-on-write, allowing for efficient snapshots and clones. It uses B-trees as its main on-disk data structure. The design goal is to work well for many use cases and workloads. To this end, much effort has been directed to...
Conference Paper
Full-text available
Deduplicating in-line data on primary storage is hampered by the disk bottleneck problem, an issue which results from the need to keep an index mapping portions of data to hash values in memory in order to detect duplicate data without paying the performance penalty of disk paging. The index size is proportional to the volume of unique data, so pla...
Article
Full-text available
Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., pa...
Article
Full-text available
Object-based storage is the natural evolution of the block storage interface, aimed at efficiently and effectively meeting the performance, reliability, security, and service requirements demanded by current and future applications. The object-based storage interface provides an organizational container, called an object, into which higher-level so...
Article
Full-text available
B-trees are used by many file systems to represent files and directories. They provide guaranteed logarithmic time key-search, insert, and remove. File systems like WAFL and ZFS use shadowing, or copy-on-write, to implement snapshots, crash recovery, write-batching, and RAID. Serious difficulties arise when trying to use b-trees and shadowing in a...
Conference Paper
Full-text available
Developers often describe testing as being tedious and boring. This work challenges this notion; we describe tools and methodologies crafted to test object-based storage devices (OSDs) for correctness and compliance with the T10 OSD standard. A special consideration is given to test the security model of an OSD implementation. Additionally, some wo...
Conference Paper
Full-text available
The concept of object storage was introduced in the early 1990's by the research community. Since then it has greatly matured and is now in its early stages of adoption by the industry. Yet, object storage is still not widely accepted. Viewing object store technology as the future building block particularly for large storage systems, our team in I...
Article
Full-text available
The concept of object storage was introduced in the early 1990's by the research community. Since then it has greatly matured and is now in its early stages of adoption by the industry. Yet, object storage is still not widely accepted. Viewing object store technology as the future building block, particularly for large storage systems, our team...
Article
Full-text available
Group communication is used in many file systems and storage controller systems. Ideally, there could be a single group communication system that serves the needs of all these various projects---software that was built once, tested once, and deployed everywhere. In reality, each project builds its own custom group services component. There are thre...
Article
Full-text available
Storage Area Networks (SAN) are based on direct interaction between clients and storage servers exposing the storage server to network attacks. Giving the client direct access to the storage servers requires verification that the client requests conform with the system protection policy. Today, the only available solutions enforce access control at...
Article
Full-text available
We present a protocol for diffusion of updates among replicas in a distributed system where up to b replicas may suffer Byzantine failures. Our algorithm ensures that no correct replica accepts spurious updates introduced by faulty replicas, by requiring that a replica accepts an update only after receiving it from at least b+1 distinct replicas (o...
Conference Paper
Full-text available
zFS is a research project aimed at building a decentralized file system that distributes all aspects of file and storage management over a set of cooperating machines interconnected by a high-speed network. zFS is designed to be a file system that scales from a few networked computers to several thousand machines and to be built from commodity off-...
Conference Paper
Storage Area Networks (SAN) are based on direct interaction between clients and storage servers. This unmediated access exposes the storage server to network attacks, necessitating a verification, by the server, that the client requests conform with the system protection policy. Solutions today can only enforce access control at the granularity of...
Conference Paper
Full-text available
Today's SAN architectures promise unmediated host access to storage (i.e., without going through a server). To achieve this promise, however, we must address several issues and opportunities raised by SANs, including security, scalability and management. Object storage, such as introduced by the NASD work (14), is a means of addressing thes...
Article
In this paper we describe an efficient algorithm for the management of group-keys for Group Communication Systems. Our algorithm is based on the notion of key-graphs, previously used for managing keys in large IP-multicast groups.
Conference Paper
We present a protocol for diffusion of updates among replicas in a distributed system where up to b replicas may suffer Byzantine failures. Our algorithm ensures that no correct replica accepts spurious updates introduced by faulty replicas, by requiring that a replica accepts an update only after receiving it from at least b+1 distinct replicas (o...
Article
Full-text available
In this paper we describe an efficient algorithm for the management of group-keys for Group Communication Systems. Our algorithm is based on the notion of key-graphs, previously used for managing keys in large IP-multicast groups. The standard protocol requires a centralized key-server that has knowledge of the full key-graph. Our protocol does not de...
Article
Full-text available
Ensemble is a Group Communication System built at Cornell and the Hebrew universities. It allows processes to create process groups within which scalable reliable fifo-ordered multicast and point-to-point communication are supported. The system also supports other communication properties, such as causal and total multicast ordering, flow control, etc....
Article
In this paper we study the key management problem, in the context of Group Communication Systems (GCS). GCSs are mid-sized systems, scaling up to 100 members. We present a side-by-side comparison of three ways of managing keys, studying bandwidth and latency.
Article
Full-text available
In this paper we describe an efficient algorithm for the management of group-keys. Our algorithm is based on a protocol for secure IP-multicast and is used to manage group-keys in group-communication systems. Unlike prior work, based on centralized key-servers, our solution is completely distributed and fault-tolerant and its performance is comparab...
Conference Paper
Full-text available
The Horus and Ensemble efforts culminated a multi-year Cornell research program in process group communication used for fault-tolerance, security and adaptation. Our intent was to understand the degree to which a single system could offer flexibility and yet maintain high performance, to explore the integration of fault tolerance with security and...
Article
Full-text available
A secure reliable multicast protocol enables a process to send a message to a group of recipients such that all correct destinations receive the same message, despite the malicious efforts of fewer than a third of the total number of processes, including the sender. This has been shown to be a useful tool in building secure distributed services, a...
Article
Full-text available
Transis [ADKM92, AAD93, ADM+93] is a tool for group communication that provides reliable ordered multicast along with membership services and strong group semantics. Transis can currently be used by processes residing on nodes within a BCD (Broadcast Domain). Building distributed applications on top of these services enables the programmer to assum...
Article
Full-text available
We extend traditional Virtual Private Networks (VPNs) with fault-tolerance and dynamic membership properties, defining a Dynamic Virtual Private Network (DVPN). We require no new hardware and make no special assumptions about line security. An implementation exhibits low overhead, provides guarantees of authenticity and confidentiality to any IP a...
Article
Ensemble is a Group Communication System built at Cornell and the Hebrew Universities. It allows processes to create process groups in which scalable reliable fifo-ordered multicast and point-to-point communication are supported. The system also supports other communication properties, such as multicast causal and total ordering, flow control, etc....
Conference Paper
Full-text available
A secure reliable multicast protocol enables a process to send a message to a group of recipients such that all honest destinations receive the same message, despite the malicious efforts of fewer than a third of them, including the sender. This has been shown to be a useful tool in building secure distributed services, albeit with a cost that typi...
Article
Full-text available
This paper describes a method for constructing a distributed database from a set of compute-nodes, a local area network, and a set of object-disks. We assume object-disks do not fail; nodes can fail or go to sleep for long periods. In order for a node to access an object inside an object-disk a valid lease is required. There are two issues that nee...
Article
Full-text available
Deduplication is rarely used on primary storage because of the disk bottleneck problem, which results from the need to keep an index mapping chunks of data to hash values in memory in order to detect duplicate blocks. This index grows with the number of unique data blocks, creating a scalability problem, and at current prices the cost of addi-ti...

Projects (5)
Archived project
Archived project