Yehuda Afek

  • Tel Aviv University

About

177
Publications
14,203
Reads
5,132
Citations
Current institution
Tel Aviv University

Publications (177)
Preprint
Full-text available
We present a novel yet simple and comprehensive DNS cache POisoning Prevention System (POPS), designed to integrate as a module in Intrusion Prevention Systems (IPS). POPS addresses statistical DNS poisoning attacks, including those documented from 2002 to the present, and offers robust protection against similar future threats. It consists of two...
Article
Full-text available
This paper presents a new localhost browser based vulnerability and corresponding attack that opens the door to new attacks on private networks and local devices. We show that this new vulnerability may put hundreds of millions of internet users and their IoT devices at risk. We demonstrate the viability of the attack on a real product, "Folding@Ho...
Chapter
This paper presents a new localhost browser based vulnerability and corresponding attack that opens the door to new attacks on private networks and local devices. We show that this new vulnerability may put hundreds of millions of internet users and their IoT devices at risk. Following the attack presentation, we suggest three new protection mechan...
Preprint
The Domain Name System (DNS) infrastructure, a most critical system the Internet depends on, has recently been the target for different DDoS and other cyber-attacks, e.g., the notorious Mirai botnet. While these attacks can be destructive to both recursive and authoritative DNS servers, little is known about how recursive resolvers operate under su...
Preprint
A new scalable ISP level system architecture to secure and protect all IoT devices in a large number of homes is presented. The system is based on whitelisting, as in the Manufacturer Usage Description (MUD) framework, implemented as a VNF. Unlike common MUD suggestions that place the whitelist application at the home/enterprise network, our approa...
Preprint
We think of a tournament $T=([n], E)$ as a communication network where in each round of communication processor $P_i$ sends its information to $P_j$, for every directed edge $ij \in E(T)$. By Landau's theorem (1953) there is a King in $T$, i.e., a processor whose initial input reaches every other processor in two rounds or less. Namely, a processor...
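The two-round reachability property of a King can be checked directly on a small example. The sketch below is illustrative only: it relies on the classical fact that a vertex of maximum out-degree in a tournament is a king, and the 4-processor tournament `edges`, along with the helper names `out_neighbors`, `find_king`, and `reaches_within_two`, are assumptions made for this example rather than anything taken from the paper.

```python
def out_neighbors(n, edges):
    """Map each processor to the set of processors it sends to in one round."""
    out = {v: set() for v in range(n)}
    for i, j in edges:
        out[i].add(j)
    return out


def find_king(n, edges):
    """Return a vertex of maximum out-degree; in a tournament such a vertex is a king."""
    out = out_neighbors(n, edges)
    return max(range(n), key=lambda v: len(out[v]))


def reaches_within_two(n, edges, v):
    """Check that the input of v reaches every processor in at most two rounds."""
    out = out_neighbors(n, edges)
    reached = {v} | out[v]
    for u in out[v]:
        reached |= out[u]
    return len(reached) == n


# Hypothetical tournament on processors 0..3 (exactly one directed edge per pair).
edges = [(0, 1), (2, 0), (0, 3), (1, 2), (1, 3), (3, 2)]
king = find_king(4, edges)
print(king, reaches_within_two(4, edges, king))
```

Picking the maximum out-degree vertex is only one way to exhibit a king; a tournament may contain several.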
Preprint
In this paper we present three attacks on private internal networks behind a NAT and a corresponding new protection mechanism, Internal Network Policy, to mitigate a wide range of attacks that penetrate internal networks behind a NAT. In the attack scenario, a victim is tricked to visit the attacker's website, which contains a malicious script that...
Article
We present a basic tool for zero day attack signature extraction. Given two large sets of messages, P the messages captured in the network at peacetime (i.e., mostly legitimate traffic) and A the messages captured during attack time (i.e., contains many attack messages), we present a tool for extracting a set S of strings that are frequently found...
Conference Paper
Random Subdomain DDoS attacks on the Domain Name System (DNS) infrastructure are becoming a popular vector in recent attacks (e.g., recent Mirai attack on Dyn). In these attacks, many queries are sent for a single or a few victim domains, yet they include highly varying non-existent subdomains generated randomly. Motivated by these attacks we desig...
Article
Efficient algorithms and techniques to detect and identify large flows in a high throughput traffic stream in the SDN match-and-action model are presented. This is in contrast to previous work that either deviated from the match and action model by requiring additional switch level capabilities or did not exploit the SDN data plane. Our constructio...
Preprint
Efficient algorithms and techniques to detect and identify large flows in a high throughput traffic stream in the SDN match-and-action model are presented. This is in contrast to previous work that either deviated from the match and action model by requiring additional switch level capabilities or did not exploit the SDN data plane. Our constructio...
Article
Motivated by a recent new type of randomized Distributed Denial of Service (DDoS) attacks on the Domain Name Service (DNS), we develop novel and efficient distinct heavy hitters algorithms and build an attack identification system that uses our algorithms. Heavy hitter detection in streams is a fundamental problem with many applications, including...
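To make the notion of a distinct heavy hitter concrete, here is a naive exact baseline, not the streaming algorithms the paper develops. It counts the number of distinct subdomains queried per domain and reports domains above a threshold. The function name `distinct_heavy_hitters`, the threshold, and the simplification that a domain is the last two labels of a query name are all assumptions made for this sketch.

```python
from collections import defaultdict


def distinct_heavy_hitters(queries, threshold):
    """Return {domain: distinct_subdomain_count} for domains at or above threshold.

    Exact and memory-hungry: it stores every distinct subdomain it sees.
    Assumes the domain is the last two labels of the name (a simplification).
    """
    subdomains = defaultdict(set)
    for name in queries:
        labels = name.lower().rstrip(".").split(".")
        if len(labels) < 3:
            continue  # no subdomain part to count
        domain = ".".join(labels[-2:])
        subdomains[domain].add(".".join(labels[:-2]))
    return {d: len(s) for d, s in subdomains.items() if len(s) >= threshold}


# Hypothetical traffic: 50 random-looking subdomains of one domain,
# and 50 repeated queries for a single name under another domain.
queries = ["r%d.example.com" % i for i in range(50)] + ["www.example.org"] * 50
print(distinct_heavy_hitters(queries, threshold=10))
```

Both domains receive 50 queries, but only the one with many distinct subdomains is reported, which is the distinction between plain heavy hitters and distinct heavy hitters.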
Article
Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path lacks hardware-software concurrency, because all f...
Conference Paper
Hardware lock-elision (HLE) introduces concurrency into legacy lock-based code by optimistically executing critical sections in a fast-path as hardware transactions. Its main limitation is that in case of repeated aborts, it reverts to a fallback-path that acquires a serial lock. This fallback-path lacks hardware-software concurrency, because all f...
Article
Configuring range based packet classification rules in network switches is crucial to all network core functionalities, such as firewalls and routing. However, OpenFlow, the leading management protocol for SDN switches, lacks the interface to configure range rules directly and only provides mask based rules, named flow entries. In this work we pres...
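As background for the gap between range rules and mask-based rules, the following sketch shows the classic range-to-prefix expansion, which covers an integer range with aligned power-of-two blocks. It is a textbook technique offered for illustration and is not claimed to be the construction presented in this work; the names `range_to_prefixes` and `as_ternary` are hypothetical.

```python
def range_to_prefixes(lo, hi, width):
    """Cover [lo, hi] over width-bit values with (value, prefix_len) blocks."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it no longer overshoots hi.
        while size > hi - lo + 1:
            size >>= 1
        prefixes.append((lo, width - size.bit_length() + 1))
        lo += size
    return prefixes


def as_ternary(value, plen, width):
    """Render a prefix as a mask-style string, with '*' for wildcard bits."""
    bits = format(value, "0{}b".format(width))
    return bits[:plen] + "*" * (width - plen)


# The range [1, 14] on 4-bit fields is a known worst case: 2*4 - 2 = 6 entries.
for value, plen in range_to_prefixes(1, 14, 4):
    print(as_ternary(value, plen, 4))
```

A single range on a w-bit field can expand to as many as 2w − 2 mask entries, which is why configuring ranges through a mask-only interface can be costly.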
Article
This paper introduces a temporally bounded total store ordering (TBTSO) memory model, and shows that it enables nonblocking fence-free solutions to asymmetric synchronization problems, such as those arising in memory reclamation and biased locking. TBTSO strengthens the TSO memory model by bounding the time it takes a store to drain from the store...
Patent
Full-text available
A method for processing communication traffic includes receiving an incoming stream of compressed data conveyed by a sequence of data packets, each containing a respective portion of the compressed data. The respective portion of the compressed data contained in the first packet is stored in a buffer, having a predefined buffer size. Upon receiving...
Conference Paper
Full-text available
A non-transactional load (NTL) is a load instruction that is invisible to the transactional system, even when executed within a transaction. It ignores, and is supposed to be ignored by, other concurrent transactions; thus an NTL does not introduce any conflicts. An analysis of the potential benefits of NTL and the issues in introducing it are discussed....
Article
A recursive and fast construction of an n-element priority queue from exponentially smaller hardware priority queues and size n RAM is presented. All priority queue implementations to date require either O(log n) instructions per operation, exponential (with key size) space, or expensive special hardware whose cost and latency dramaticall...
Article
Work stealing is the method of choice for load balancing in task parallel programming languages and frameworks. Yet despite considerable effort invested in optimizing work stealing task queues, existing algorithms issue a costly memory fence when removing a task, and these fences are believed to be necessary for correctness. This paper refutes this...
Article
Work stealing is the method of choice for load balancing in task parallel programming languages and frameworks. Yet despite considerable effort invested in optimizing work stealing task queues, existing algorithms issue a costly memory fence when removing a task, and these fences are believed to be necessary for correctness. This paper refutes this...
Conference Paper
Work stealing is the method of choice for load balancing in task parallel programming languages and frameworks. Yet despite considerable effort invested in optimizing work stealing task queues, existing algorithms issue a costly memory fence when removing a task, and these fences are believed to be necessary for correctness. This paper refutes this...
Conference Paper
We present a basic tool for zero day attack signature extraction. Given two large sets of messages, P of messages captured in the network at peacetime (i.e., mostly legitimate traffic) and A captured during attack time (i.e., contains many attack messages), we present a tool for extracting a set S of strings, that are frequently found in A and not...
Conference Paper
A recursive and fast construction of an n-element priority queue from exponentially smaller hardware priority queues and size n RAM is presented. All priority queue implementations to date either require O(log n) instructions per operation or exponential (with key size) space or expensive special hardware whose cost and latency dramatically incre...
Conference Paper
Conventional wisdom in designing concurrent data structures is to use the most powerful synchronization primitive, namely compare-and-swap (CAS), and to avoid contended hot spots. In building concurrent FIFO queues, this reasoning has led researchers to propose combining-based concurrent queues. This paper takes a different approach, showing how to...
Conference Paper
We present a simple yet effective technique for improving performance of lock-based code using the hardware lock elision (HLE) feature in Intel's upcoming Haswell processor. We also describe how to extend Haswell's HLE mechanism to achieve a similar effect to our lock elision scheme entirely in hardware.
Conference Paper
Full-text available
This paper takes advantage of the emerging multi-core computer architecture to design a general framework for mitigating network-based complexity attacks. In complexity attacks, an attacker carefully crafts "heavy" messages (or packets) such that each heavy message consumes substantially more resources than a normal message. Then, it sends a suffic...
Conference Paper
Read-write locks are one of the most prevalent lock forms in concurrent applications because they allow read accesses to locked code to proceed in parallel. However, they do not offer any parallelism between reads and writes. This paper introduces pessimistic lock-elision (PLE), a new approach for non-speculatively replacing read-write locks with p...
Conference Paper
We present the CB tree, a counting-based self-adjusting binary search tree in which, as in splay trees, more-frequently accessed items move closer to the root. In a sequential execution, after m operations of which c(v) access item v, an access of v traverses a path of length O(1 + log(m/c(v))) while doing few if any rotations. Unlike the original splay...
Article
Full-text available
In the {\em Musical Chairs} game $MC(n,m)$ a team of $n$ players plays against an adversarial {\em scheduler}. The scheduler wins if the game proceeds indefinitely, while termination after a finite number of rounds is declared a win of the team. At each round of the game each player {\em occupies} one of the $m$ available {\em chairs}. Termination...
Article
The Java™ developers kit requires a size() operation for all objects, tracking the number of elements in the object. Unfortunately, the best known solution, available in the Java concurrency package, has a blocking concurrent implementation that does not scale. This paper presents a highly scalable wait-free implementation of a concurrent size()...
Article
Full-text available
We consider the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing. The model consists of an anonymous broadcast network in which nodes have no knowledge about the topology of the network or even an upper bound on its size. Furthermore, it is assumed that an adversary choos...
Article
In this paper we focus on the process of deep packet inspection of compressed web traffic. The major limiting factor in this process, imposed by the compression, is the high memory requirement of 32 KB per connection. This leads to a requirement of hundreds of megabytes to gigabytes of main memory in a multi-connection setting. We introduce new...
Conference Paper
Full-text available
A synchronous message passing complete network with an adversary that may purge messages is used to precisely model tasks that are read-write wait-free computable. In the past, adversaries that reduce the computational power of a system as they purge messages were studied in the context of their ability to foil consensus. This paper considers the o...
Article
The Java™ developers kit requires a size() operation for all objects, tracking the number of elements in the object. Unfortunately, the best known solution, available in the Java concurrency package, has a blocking concurrent implementation that does not scale. This paper presents a highly scalable wait-free implementation of a concu...
Conference Paper
Full-text available
It is well known that guaranteeing program consistency when accessing shared data comes at the price of degraded performance and scalability. This paper initiates the investigation of consistency oblivious programming (COP). In COP, sections of concurrent code that meet certain criteria are executed without checking for consistency. However, checkp...
Conference Paper
Full-text available
Poor placement of data blocks in memory may negatively impact application performance because of an increase in the cache conflict miss rate [18]. For dynamically allocated structures this placement is typically determined by the memory allocator. Cache index-oblivious allocators may inadvertently place blocks on a restricted fraction of the availa...
Conference Paper
Full-text available
In an asymmetric rendezvous system, such as an unfair synchronous queue and an elimination array, threads of two types, consumers and producers, show up and are matched, each with a unique thread of the other type. Here we present a new highly scalable, high throughput asymmetric rendezvous system that outperforms prior synchronous queue and elimin...
Conference Paper
Full-text available
We consider the problem of computing a maximal independent set (MIS) in an extremely harsh broadcast model that relies only on carrier sensing. The model consists of an anonymous broadcast network in which nodes have no knowledge about the topology of the network or even an upper bound on its size. Furthermore, it is assumed that nodes wake up asyn...
Conference Paper
Full-text available
We introduce oblivious protocols, a new framework for distributed computation with limited communication. Within this model we consider the musical chairs task MC(n,m), involving n players (processors) and m chairs. Initially, players occupy arbitrary chairs. Two players are in conflict if they both occupy the same chair. The task terminates when t...
Article
Full-text available
Humans are very good at optimizing solutions for specific problems. Biological processes, on the other hand, have evolved to handle multiple constrained distributed environments and so they are robust and adaptable. Inspired by observations made in a biological system we have recently presented a simple new randomized distributed MIS algorithm \cit...
Conference Paper
Full-text available
We consider the power of objects in the unbounded concurrency shared memory model, where there is an infinite set of processes and the number of processes active concurrently may increase without bound. By studying this model we obtain new results and observations that are relevant and meaningful to the standard bounded concurrency model. First we...
Conference Paper
Lock-based software transactional memory algorithms do not perform well in workloads with a high rate of context switches, which is caused for example by scheduling events or page faults. This occurs since threads that are switched-out by the operating system while holding locks block other threads from progressing, causing their transactions to ab...
Conference Paper
Full-text available
Compressing web traffic using standard GZIP is becoming both popular and challenging due to the huge increase in wireless web devices, where bandwidth is limited. Security and other content-based networking devices are required to decompress the traffic of tens of thousands of concurrent connections in order to inspect the content for different signat...
Article
Full-text available
Computational and biological systems are often distributed so that processors (cells) jointly solve a task, without any of them receiving all inputs or observing all outputs. Maximal independent set (MIS) selection is a fundamental distributed computing procedure that seeks to elect a set of local leaders in a network. A variant of this problem is...
Conference Paper
Full-text available
Linearizability, the key correctness condition that most optimized concurrent object implementations comply with, imposes tight synchronization between the object concurrent operations. This tight synchronization usually comes with a performance and scalability price. Yet, these implementations are often employed in an environment where a more rela...
Conference Paper
Full-text available
Working on shared mutable data requires synchronization through barriers, locks or transactional memory mechanisms. To avoid this overhead a thread may privatize part of the data and work on it locally. By privatizing a data item a thread is guaranteed that it is the only one accessing this data, i.e., that it accesses the data item in exclusion. T...
Conference Paper
Full-text available
Software Transactional Memory (STM) compilers commonly instrument memory accesses by transforming them into calls to STM library functions. Done naïvely, this instrumentation imposes a large overhead, slowing down the transaction execution. Many compiler optimizations have been proposed in an attempt to lower this overhead. In this paper we attempt...
Conference Paper
Full-text available
Many linearizable and optimized concurrent algorithms are available for known algorithms and data structures, such as, Queue, Tree, Stack, Counter and HashTable. However, sometimes these implementations are used in a more relaxed environment, provided as part of larger design pattern where a relaxed linearizability suffices rather than a strict one...
Conference Paper
Full-text available
Producer-consumer pools, that is, collections of unordered objects or tasks, are a fundamental element of modern multiprocessor software and a target of extensive research and development. For example, there are three common ways to implement such pools in the Java JDK6.0: the SynchronousQueue, the LinkedBlockingQueue, and the ConcurrentLinkedQueue...
Conference Paper
Full-text available
We present view transactions, a model for relaxed consistency checks in software transactional memory (STM). View transactions always operate on a consistent snapshot of memory but may commit in a different snapshot. They are therefore simpler to reason about, provide opacity and maintain composability. In addition, view transactions avoid many of...
Article
Full-text available
This paper introduces and investigates the k-simultaneous consensus problem: each process participates at the same time in k independent consensus instances until it decides in any one of them. Two results are presented. The first shows that the k-simultaneous consensus problem and the k-set agreement problem are wait-free equivalent in read/write...
Conference Paper
We address two problems, the g-tight group renaming task and what we call, safe-consensus task, and show the relations between them. We show that any g-tight group renaming task, the first problem, implements g processes consensus. We show this by introducing an intermediate task, the safe-consensus task, the second problem, and showing that g-tigh...
Conference Paper
This paper explores the power of failure detectors in read/write shared memory systems with n processes whose names are drawn from the set {1, ..., m}, m >= 2n-1. We do so by making an additional assumption, name obliviousness, on top of the three failure detector assumptions introduced by Zieliński. We present name non-oblivious failure detectors that a...
Conference Paper
Full-text available
We study the group renaming task, which is a natural generalization of the renaming task. An instance of this task consists of n processors, partitioned into m groups, each of at most g processors. Each processor knows the name of its group, which is in { 1, ..., M }. The task of each processor is to choose a new name for its group such that proces...
Article
Full-text available
This paper extends Common2, the family of objects that implement and are wait-free implementable from 2 consensus objects, in two ways: First, the stack object is shown to be in the family, refuting a conjecture to the contrary [6]. Second, Common2 is investigated in the unbounded concurrency model, whereas until now it was considered only in an n-...
Article
Deterministic collect algorithms are presented that are adaptive to total contention and are efficient with respect to both the number of registers used and the step complexity. One of them has optimal O(k) step and O(n) space complexities, but assumes that processes’ identifiers are in O(n), where n is the total number of processes in the system a...
Conference Paper
Full-text available
We address the problem of solving a task $T=(T_1,\ldots,T_m)$ (called (m,1)-BG), in which a processor returns in an arbitrary one of m simultaneous consensus subtasks $T_1,\ldots,T_m$. Processor $p_i$ submits to T an input vector of proposals $(prop_{i,1},\ldots,prop_{i,m})$, one entry per subtask, and outputs, from just one subtask $\ell$, a pair $(\ell, prop_{j,\ell})$ for some j...
Conference Paper
What characteristics of an object determine its consensus number? Here we analyze how the consensus power of various objects changes without changing their functionality, but by placing certain restrictions on the object usage. For example it is shown that the consensus number of either a bounded-use queue or stack is 3 while the consensus number o...
Conference Paper
Full-text available
Common2, the family of objects that implement and are wait-free implementable from 2 consensus objects, is extended here in two ways: First, the stack object is added to the family, an object that was conjectured not to be in the family. Second, Common2 is investigated in the unbounded concurrency model, whereas until now it was considered onl...
Chapter
A simple, general and optimal procedure to adapt algorithms designed for fixed topology networks to run on a network with dynamically changing topology is presented. The communication and time complexities of the procedure, per topological change, are independent of the number of topological changes and are linearly bounded by the size of the subne...
Conference Paper
Full-text available
This paper presents an economical, randomized, wait-free construction of an n-process test-and-set bit from read write registers. The test-and-set shared object has two atomic operations, test&set, which atomically reads the bit and sets its value to 1, and the reset operation that resets the bit to 0. We identify two new complexity measures by whi...
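The sequential behaviour of the test-and-set object described above can be sketched as follows. This is only a specification-level illustration that uses a lock to emulate atomicity; it is emphatically not the randomized wait-free construction from read/write registers that the paper presents. The class name `TASBit` and the helper `race` are hypothetical.

```python
import threading


class TASBit:
    """Specification-level test-and-set bit; a lock stands in for atomicity."""

    def __init__(self):
        self._bit = 0
        self._guard = threading.Lock()

    def test_and_set(self):
        """Atomically read the bit and set it to 1; return the value read."""
        with self._guard:
            old = self._bit
            self._bit = 1
            return old

    def reset(self):
        """Reset the bit to 0."""
        with self._guard:
            self._bit = 0


def race(n_threads):
    """Let n_threads call test_and_set once each and collect what they read."""
    tas = TASBit()
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(tas.test_and_set()))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = race(8)
print(results.count(0), "winner out of", len(results))
```

Exactly one caller reads 0 and thereby wins the bit; every other caller reads 1 until a reset occurs.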
Conference Paper
Full-text available
Deterministic collect algorithms that are adaptive to total contention and efficient in both space and step complexity are presented. One of them has optimal O(k) step and O(n) space complexities, but restricts the size of the processes' identifiers to O(n), where n is the total number of processes in the system and k is the total contention, the total number of processes ac...
Article
Full-text available
Labovitz et al. (2001) and Labovitz et al. (2000) noticed that sometimes it takes border gateway protocol (BGP) a substantial amount of time and messages to converge and stabilize following the failure of some node in the Internet. In this paper, we suggest a minor modification to BGP that eliminates the problem pointed out and substantially reduce...
Conference Paper
Full-text available
In (Ref.1), (Ref.2) it was noticed that sometimes it takes BGP a substantial amount of time and messages to converge and stabilize following the failure of some node in the Internet. In this paper we suggest a minor modification to BGP that eliminates the problem pointed out and substantially reduces the convergence time and communication complexit...
Article
Layered communication protocols frequently implement a FIFO message facility on top of an unreliable non-FIFO service such as that provided by a packet-switching network. This paper investigates the possibility of implementing a reliable message layer on top of an underlying layer that can lose packets and deliver them out of order, with the additi...
Article
The notion of Internet Policy Atoms has been recently introduced in [1], [2] as groups of prefixes sharing a common BGP AS path at any Internet backbone router. In this paper we further research these 'Atoms'. First we offer a new method for computing the Internet policy atoms, and use the RIPE RIS database [6] to derive their structure. Second, we...
Article
Full-text available
this paper. Motivated by their first work Moir and Anderson developed renaming algorithms, in the read/write model, when such a bound on the maximum number of processes is known in advance. This led to a sequence of works on the renaming problem in this model [MA95, MG96, BGHM95] that led to a long-lived (2K − 1)-renaming algorithm with O(K)...
Article
Though it is common practice to treat synchronization primitives for multiprocessors as abstract data types, they are in reality machine instructions on registers. A crucial theoretical question with practical implications is the relationship between the size of the register and its computational power. We wish to study this question and choose as...
Article
Full-text available
Long-lived and adaptive implementations of mutual exclusion and renaming in the read/write shared memory model are presented. An implementation of a task is adaptive if the step complexity of any operation in the implementation is a function of the number of processes that take steps concurrently with the operation. The renaming algorithm assigns a...
Article
Full-text available
A new general theory about restoration of network paths is first introduced. The theory pertains to restoration of shortest paths in a network following failure, e.g., we prove that a shortest path in a network after removing k edges is the concatenation of at most k+1 shortest paths in the original network. The theory is then combined with efficie...
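The decomposition claim can be checked on a toy unweighted network. The sketch below removes k = 1 edge from a 6-node ring, finds the new shortest path, and greedily splits it into maximal pieces that are shortest paths in the original ring. The ring topology and all helper names (`bfs_dist`, `bfs_path`, `count_original_segments`, `ring`, `remove_edges`) are assumptions for illustration; this is a sanity check of the statement, not the restoration scheme the paper builds on it.

```python
from collections import deque


def bfs_dist(adj, src):
    """Hop distances from src in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def bfs_path(adj, src, dst):
    """One shortest path from src to dst, or None if dst is unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if dst not in parent:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def count_original_segments(adj, path):
    """Greedily split path into maximal pieces that are shortest paths in adj."""
    segments, start = 0, 0
    while start < len(path) - 1:
        dist = bfs_dist(adj, path[start])
        end = start + 1  # a single edge is always a shortest path
        while end + 1 < len(path) and dist[path[end + 1]] == end + 1 - start:
            end += 1
        segments += 1
        start = end
    return segments


def ring(n):
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}


def remove_edges(adj, edges):
    damaged = {v: set(ns) for v, ns in adj.items()}
    for u, v in edges:
        damaged[u].discard(v)
        damaged[v].discard(u)
    return damaged


original = ring(6)
damaged = remove_edges(original, [(0, 1)])       # k = 1 failed edge
new_path = bfs_path(damaged, 0, 1)               # detour around the ring
print(new_path, count_original_segments(original, new_path))
```

Here the detour splits into two original shortest paths, matching the k + 1 bound for k = 1.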
Article
A new paradigm for the design of self-stabilizing distributed algorithms, called local detection, is introduced. The essence of the paradigm is in defining a local condition based on the state of a processor and its immediate neighborhood such that the system is in a globally legal state if and only if the local condition is satisfied at all the no...
Article
A new general theory about restoration of network paths is first introduced. The theory pertains to restoration of shortest paths in a network following failure, e.g., we prove that a shortest path in a network after removing k edges is the concatenation of at most k + 1 shortest paths in the original network. The theory is then combined with effici...
Article
This paper presents Phantom, a simple constant space algorithm for rate-based flow control. As shown by our simulations, it converges fast to a fair rate allocation while generating a moderate queue length. While our approach can be easily implemented in ATM switches for managing available bit rate (ABR) traffic, it is also suitable for flow contro...
Conference Paper
Full-text available
A new general theory about restoration of network paths is first introduced. The theory pertains to restoration of shortest paths in a network following failure, e.g., we prove that a shortest path in a network after removing k edges is the concatenation of at most k + 1 shortest paths in the original network. The theory is then combined with effici...
Article
Full-text available
A new general theory about restoration of network paths is first introduced. The theory pertains to restoration of shortest paths in a network following failure, e.g., we prove that a shortest path in a network after removing k edges is the concatenation of at most k + 1 shortest paths in the original network. The theory is then combined with efficient...
Conference Paper
In this paper we prove: For any constant d there is a large enough n such that there is no long-lived adaptive implementation of collect or renaming in the read/write model with n processes that uses d or fewer MWMR registers. In other words, there is no implementation of a long-lived and adaptive renaming or collect object in the atomic read/write m...
Conference Paper
Long-lived and adaptive to point contention implementations of snapshot and immediate snapshot objects in the read/write shared-memory model are presented. In [2] we presented adaptive algorithms for mutual exclusion, collect and snapshot. However, the collect and snapshot algorithms were adaptive only when the number of local primitive operations...
Conference Paper
Trainet, a new scheme to extend MPLS (multi-protocol label switching) is presented. The scheme works much like the subway system in a large metropolitan area. Each (unidirectional) subway line corresponds to a labeled path, and a route in the network is defined by either a pair 〈label, count-value〉, where count specifies how many hops a packet stil...
Article
Full-text available
Several adaptive algorithms are automatically generated via a simple transformation from single-writer multi-reader algorithms, using the O(k) adaptive collect algorithm of Attiya and Fouren [AF98a]. Among these algorithms are an adaptive snapshot algorithm with step complexity O(k^2), and three algorithms solving (2k − 1)-renaming, but with h...
Conference Paper
Full-text available
We suggest a new simple forwarding technique to speed up IP destination address lookup. The technique is a natural extension of IP, requires 5 bits in the IP header (IPv4, 7 in IPv6), and performs IP lookup nearly as fast as IP/Tag switching but with a smaller memory requirement and a much simpler protocol. The basic idea is that each router adds a...
Article
We consider shared memory systems that support multiobject operations in which processes may simultaneously access several objects in one atomic operation. We provide upper and lower bounds on the synchronization power (consensus number) of multiobject systems as a function of the type and the number of objects that may be simultaneously accessed i...
Conference Paper
Two implementations of an adaptive, wait-free, and long-lived renaming task in the read/write shared memory model are presented. Implementations of long-lived and adaptive objects were previously known only in the much stronger model of load-linked and store-conditional (i.e., read-modify-write) shared memory. In read/write shared-memory only one-s...
Article
Full-text available
An abstract is not available.
Article
Full-text available
An implementation of a fast, wait-free, long-lived and dynamic renaming task in the read/write shared memory model is presented. The algorithm assigns a new unique id in the range 1, ..., 4k^2 to any process whose initial unique name is taken from a set of size N, for an arbitrary N and where k is the number of processors that actually take steps or ho...
Article
This paper studies basic properties of rate-based flow-control algorithms and of the max-min fairness criteria. For the algorithms we suggest a new approach for their modeling and analysis, which may be considered more “optimistic” and realistic than traditional approaches. Three variations of the approach are presented, and their rate of convergen...
