Amr El Abbadi's research while affiliated with University of California, Santa Barbara and other places

Publications (489)

Article
Diversity and Inclusion (D&I) are core to fostering innovative thinking. Existing theories demonstrate that to facilitate inclusion, multiple types of exclusionary dynamics, such as self-segregation, communication apprehension, and stereotyping and stigmatizing, must be overcome [11]. A diverse group of people tends to surface different perspective...
Article
Today's large-scale data management systems need to address distributed applications' confidentiality and scalability requirements among a set of collaborative enterprises. This paper presents Qanaat, a scalable multi-enterprise permissioned blockchain system that guarantees the confidentiality of enterprises in collaboration workflows. Qanaat pre...
Preprint
Linear sketches have been widely adopted to process fast data streams, and they can be used to accurately answer frequency estimation, approximate top K items, and summarize data distributions. When data are sensitive, it is desirable to provide privacy guarantees for linear sketches to preserve private information while delivering useful results w...
Preprint
Full-text available
Byzantine fault-tolerant protocols cover a broad spectrum of design dimensions from environmental setting on communication topology, to more technical features such as commitment strategy and even fundamental social choice related properties like order fairness. Designing and building BFT protocols remains a laborious task despite years of inten...
Article
In this paper, we propose the first deterministic algorithms to solve the frequency estimation and frequent item problems in the bounded-deletion model. We establish the space lower bound for solving the deterministic frequent items problem in the bounded-deletion model, and propose Lazy SpaceSaving± and SpaceSaving± algorithms with optimal space...
Preprint
In this paper, we propose the first deterministic algorithms to solve the frequency estimation and frequent item problems in the bounded deletion model. We establish the space lower bound for solving the deterministic frequent items problem in the bounded deletion model, and propose the Lazy SpaceSaving± and SpaceSaving± algorithms with o...
Preprint
Full-text available
Today's large-scale data management systems need to address distributed applications' confidentiality and scalability requirements among a set of collaborative enterprises. In this paper, we present Qanaat, a scalable multi-enterprise permissioned blockchain system that guarantees confidentiality. Qanaat consists of multiple enterprises where each...
Conference Paper
Metadata from voice calls, such as the knowledge of who is communicating with whom, contains rich information about people’s lives. Indeed, it is a prime target for powerful adversaries such as nation states. Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. This pape...
Article
Recently the long-standing problem of optimal construction of quantile sketches was resolved by Karnin, Lang, and Liberty using the KLL sketch (FOCS 2016). The algorithm for KLL is restricted to online insert operations and no delete operations. For many real-world applications, it is necessary to support delete operations. When the data set is...
Article
This errata article discusses and corrects a minor error in our work published in VLDB 2019. The discrepancy specifically pertains to Algorithms 3 and 4. The algorithms presented in the paper are biased towards a commit decision in a specific failure scenario. We explain the error using an example before correcting the algorithm.
Conference Paper
Full-text available
The unique features of blockchains such as immutability, transparency, provenance, and authenticity have been used by many large-scale data management systems to deploy a wide range of distributed applications including supply chain management, health-care, and crowdworking in a permissioned setting. Unlike permissionless settings, e.g., Bitcoin, w...
Chapter
This chapter begins with a case study of Strava, a fitness app that inadvertently exposed sensitive military information even while protecting individual users' information privacy. The case study is analyzed as an example of how recent advances in algorithmic group inference technologies threaten privacy, both for individuals and for groups. It th...
Chapter
This chapter begins with a case study of Strava, a fitness app that inadvertently exposed sensitive military information even while protecting individual users' information privacy. The case study is analyzed as an example of how recent advances in algorithmic group inference technologies threaten privacy, both for individuals and for groups. It th...
Preprint
Distributed caches are widely deployed to serve social networks and web applications at billion-user scales. This paper presents Cache-on-Track (CoT), a decentralized, elastic, and predictive caching framework for cloud environments. CoT proposes a new cache replacement policy specifically tailored for small front-end caches that serve skewed workl...
Preprint
Full-text available
Despite recent intensive research, existing crowdworking systems do not adequately address all the requirements of a real-world crowdworking environment. First, crowdworking platforms need to integrate within society and in particular to interface with legal and social institutions. Global regulations must be enforced, such as minimal and maximal w...
Article
The recent adoption of blockchain technologies and open permissionless networks suggests the importance of peer-to-peer atomic cross-chain transaction protocols. Users should be able to atomically exchange tokens and assets without depending on centralized intermediaries such as exchanges. Recent peer-to-peer atomic cross-chain swap protocols use ha...
Conference Paper
Full-text available
Modern large-scale data management systems utilize consensus protocols to provide fault tolerance. Consensus protocols are extensively used in the distributed database infrastructure of large enterprises such as Google, Amazon, and Facebook as well as permissioned blockchain systems like IBM's Hyperledger Fabric. In the last four decades, numerous...
Preprint
Full-text available
Significant amounts of data are currently being stored and managed on third-party servers. It is impractical for many small scale enterprises to own their private datacenters, hence renting third-party servers is a viable solution for such businesses. But the increasing number of malicious attacks, both internal and external, as well as buggy softw...
Preprint
Full-text available
Scalability is one of the main roadblocks to business adoption of blockchain systems. Despite recent intensive research on using sharding techniques to enhance the scalability of blockchain systems, existing solutions do not efficiently address cross-shard transactions. In this paper, we introduce SharPer, a permissioned blockchain system that enha...
Preprint
Full-text available
This paper presents TXSC, a framework that provides smart contract developers with transaction primitives. These primitives allow developers to write smart contracts without the need to reason about the anomalies that can arise due to concurrent smart contract function executions.
Conference Paper
Full-text available
Despite recent intensive research, existing blockchain systems do not adequately address all the characteristics of distributed applications. In particular, distributed applications collaborate with each other following service level agreements (SLAs) to provide different services. While collaboration between applications, e.g., cross-application t...
Conference Paper
Full-text available
IoT devices influence many different spheres of society and are predicted to have a huge impact on our future. Extracting real-time insights from diverse sensor data and dealing with the underlying uncertainty of sensor data are two main challenges of the IoT ecosystem. In this paper, we propose a data processing architecture, M-DB, to effectively i...
Conference Paper
Permissioned blockchain systems rely mainly on Byzantine fault-tolerant protocols to establish consensus on the order of transactions. While Byzantine fault-tolerant protocols mostly guarantee consistency (safety) in an asynchronous network using 3f+1 machines to overcome the simultaneous malicious failure of any f nodes, in many systems, e.g., blo...
Article
Despite recent intensive research, existing blockchain systems do not adequately address all the characteristics of distributed applications. In particular, distributed applications collaborate with each other following service level agreements (SLAs) to provide different services. While collaboration between applications, e.g., cross-application t...
Preprint
Full-text available
Large scale data management systems utilize State Machine Replication to provide fault tolerance and to enhance performance. Fault-tolerant protocols are extensively used in the distributed database infrastructure of large enterprises such as Google, Amazon, and Facebook, as well as permissioned blockchain systems like IBM's Hyperledger Fabric. How...
Conference Paper
Full-text available
The rise of Bitcoin and other peer-to-peer cryptocurrencies has opened many interesting and challenging problems in cryptography, distributed systems, and databases. The main underlying data structure is blockchain, a scalable fully replicated structure that is shared among all participants and guarantees a consistent view of all user transaction...
Preprint
Full-text available
Recent works in social network stream analysis show that a user's online persona attributes (e.g., gender, ethnicity, political interest, location, etc.) can be accurately inferred from the topics the user writes about or engages with. Attribute and preference inferences have been widely used to serve personalized recommendations, directed ads, and...
Preprint
Full-text available
Permissionless blockchains (e.g., Bitcoin, Ethereum, etc.) have shown wide success in implementing global-scale peer-to-peer cryptocurrency systems. In such blockchains, new currency units are generated through the mining process and are used in addition to transaction fees to incentivize miners to maintain the blockchain. Although it is clear how...
Preprint
Full-text available
The recent adoption of blockchain technologies and open permissionless networks suggests the importance of peer-to-peer atomic cross-chain transaction protocols. Users should be able to atomically exchange tokens and assets without depending on centralized intermediaries such as exchanges. Recent peer-to-peer atomic cross-chain swap protocols use ha...
Preprint
Full-text available
Many existing blockchains do not adequately address all the characteristics of distributed system applications and suffer from serious architectural limitations resulting in performance and confidentiality issues. While recent permissioned blockchain systems have tried to overcome these limitations, their focus has mainly been on workloads with no...
Article
Data storage in the Cloud needs to be scalable and fault-tolerant. Atomic commitment protocols such as Two Phase Commit (2PC) provide ACID guarantees for transactional access to sharded data and help in achieving scalability, whereas consensus protocols such as Paxos consistently replicate data across different servers and provide fault tolerance....
Article
Machine learning and data mining threaten personal privacy, and many tools exist to help users protect their privacy (e.g., available privacy settings on Facebook, anonymization and encryption of personal data, etc.). But such technologies also pose threats to "group privacy," which is a concept scholars know relatively little about. Moreover, ther...
Article
Full-text available
Bitcoin is a successful and interesting example of a global scale peer-to-peer cryptocurrency that integrates many techniques and protocols from cryptography, distributed systems, and databases. The main underlying data structure is blockchain, a scalable fully replicated structure that is shared among all participants and guarantees a consistent v...
Preprint
Full-text available
Collaborative Filtering (CF) is one of the most commonly used recommendation methods. CF consists in predicting whether, or how much, a user will like (or dislike) an item by leveraging the knowledge of the user's preferences as well as that of other users. In practice, users interact and express their opinion on only a small subset of items, which...
Conference Paper
In this paper, we propose Dynamic Paxos (DPaxos), a Paxos-based consensus protocol to manage access to partitioned data across globally-distributed datacenters and edge nodes. DPaxos is intended to implement a State Machine Replication component in data management systems for the edge. DPaxos targets the unique opportunities of utilizing edge compu...
Conference Paper
Companies are often motivated to evaluate their environmental sustainability, and to make public pronouncements about their performance with respect to quantitative sustainability metrics. Public trust in these declarations is enhanced if the claims are certified by a recognized authority. Because accurate evaluations of environmental impacts requi...
Article
Cloud-based data-intensive applications have to process high volumes of transactional and analytical requests on large-scale data. Businesses base their decisions on the results of analytical requests, creating a need for real-time analytical processing. We propose Janus, a hybrid scalable cloud datastore, which enables the efficient execution of d...
Conference Paper
Social media streams analysis can reveal the characteristics of people who engage with or write about different topics. Recent works show that it is possible to reveal sensitive attributes (e.g., location, gender, ethnicity, political views, etc.) of individuals by analyzing their social media streams. Although the prediction of a user's sensitive...
Article
Today's web applications and social networks are serving billions of users around the globe. These users generate billions of key lookups and millions of data object updates per second. A single user's social network page load requires hundreds of key lookups. This scale creates many design challenges for the underlying storage systems. First, thes...
Conference Paper
Trending Topic Detection has been one of the most popular methods to summarize what happens in the real world through the analysis and summarization of social media content. However, as trending topic extraction algorithms become more sophisticated and report additional information like the characteristics of users that participate in a trend, sign...
Conference Paper
Today's web applications and social networks are serving billions of users around the globe. These users generate billions of key lookups and millions of data object updates per second. A single user's social network page load requires hundreds of key lookups. This scale creates many design challenges for the underlying storage systems. First, thes...
Conference Paper
Life Cycle Assessment (LCA) is crucial for evaluating the ecological sustainability of a product or service, and the accurate evaluation of sustainability requires detailed and transparent information about industrial activities. However, such information is usually considered confidential and withheld from the public. In this paper, we present a ri...
Article
Full-text available
(Read online at: http://rdcu.be/nb5t) Life cycle assessment (LCA) is the standard technique used to make a quantitative evaluation about the ecological sustainability of a product or service. The life cycle inventory (LCI) data sets that provide input to LCA computations can express essential information about the operation of a process or productio...
Conference Paper
A thorough understanding of social media discussions and the demographics of the users involved in these discussions has become critical for many applications like business or political analysis. Such an understanding and its ramifications on the real world can be enabled through the automatic summarization of Social Media. Trending topics are offe...
Book
This book constitutes the thoroughly refereed conference proceedings of the 5th International Conference on Networked Systems, NETYS 2017, held in Marrakech, Morocco, in May 2017. The 28 full and 6 short papers presented together with 3 keynotes were carefully reviewed and selected from 81 submissions. They are organized around the following topics...
Conference Paper
Geo-replication is the process of maintaining copies of data at geographically dispersed datacenters for better availability and fault-tolerance. The distinguishing characteristic of geo-replication is the large wide-area latency between datacenters that varies widely depending on the location of the datacenters. Thus, choosing which datacenters to...
Conference Paper
Global-scale data management (GSDM) empowers systems by providing higher levels of fault-tolerance, read availability, and efficiency in utilizing cloud resources. This has led to the emergence of global-scale data management and event processing. However, the Wide-Area Network (WAN) latency separating data is orders of magnitude larger than conven...
Conference Paper
Full-text available
NOTE: this conference paper has been expanded and published as a full article in Environment Systems and Decisions: https://www.researchgate.net/publication/311337557_Privacy-preserving_aggregation_in_life_cycle_assessment Many different kinds of organizations are motivated to make public disclosures about their environmental performance. These mo...
Conference Paper
Physical events in the real world are known to trigger reactions and then discussions in online social media. Mining these reactions through online social sensors offers a fast and low cost way to understand what is happening in the physical world. In some cases, however, further study of the affected population's emotional state can improve this u...
Article
With the advent of Web 2.0 users are producing bigger and bigger amounts of diverse data, which are stored in a large variety of systems. Since the users’ data spaces are scattered among those independent systems, data sharing becomes a challenging problem. Distributed search and recommendation provides a general solution for data sharing and among...
Conference Paper
For data-intensive applications with many concurrent users, modern distributed main memory database management systems (DBMS) provide the necessary scale-out support beyond what is possible with single-node systems. These DBMSs are optimized for the short-lived transactions that are common in on-line transaction processing (OLTP) workloads. One way...
Conference Paper
Cross-datacenter replication is increasingly being deployed to bring data closer to the user and to overcome datacenter outages. The extent of the influence of wide-area communication on serializable transactions is not yet clear. In this work, we derive a lower bound on commit latency. The sum of the commit latency of any two datacenters is at lea...
Article
With large elastic and scalable infrastructures, the Cloud is the ideal storage repository for Big Data applications. Big Data is typically characterized by three V's: Volume, Variety and Velocity. Supporting these properties raises significant challenges in a cloud setting, including partitioning for scale out; replication across data centers for...
Conference Paper
In the context of Web 2.0, the users become massive producers of diverse data that can be stored in a large variety of systems. The fact that the users’ data spaces are distributed in many different systems makes data sharing difficult. In this context of large scale distribution of users and data, a general solution to data sharing is offered by d...
Article
Data outsourcing or database as a service is a new paradigm for data management. The third party service provider hosts databases as a service. These parties provide efficient and cheap data management by obviating the need to purchase expensive hardware and software, deal with software upgrades and hire professionals for administrative and mainten...
Conference Paper
With hundreds of millions of users worldwide, social networks provide incredible opportunities for social connection, learning, political and social change, and individual entertainment and enhancement in multiple contexts. Because many social interactions currently take place in online networks, social scientists have access to unprecedented amo...