Paris Carbone

KTH Royal Institute of Technology | KTH · Department of Software and Computer systems


18
Publications
12,412
Reads
1,327
Citations


Publications (18)
Article
Full-text available
Apache Flink is an open-source system for processing streaming and batch data. Flink is built on the philosophy that many classes of data processing applications, including real-time analytics, continuous data pipelines, historic data processing (batch), and iterative algorithms (machine learning, graph analysis) can be expressed and executed as...
Preprint
Full-text available
Agreement among a set of processes in the presence of partial failures is one of the fundamental problems of distributed systems. In the most general case, many decisions must be agreed upon over the lifetime of a system with dynamically changing membership. Such a sequence of decisions represents a distributed log, and can form the underlying...
Preprint
Full-text available
Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the fu...
Conference Paper
Contemporary end-to-end data pipelines need to combine many diverse workloads such as machine learning, relational operations, stream dataflows, tensor transformations, and graphs. For each of these workload types, there exist several frontends (e.g., SQL, Beam, Keras) based on different programming languages as well as different runtimes (e.g., S...
Article
Stream processing can generate insights from big data in real time as it is being produced. This paper reports findings from a 2017 seminar on big stream processing, focusing on applications, systems, and languages.
Article
Graph partitioning is an essential yet challenging task for massive graph analysis in distributed computing. Common graph partitioning methods scan the complete graph to obtain structural characteristics offline, before partitioning. However, the emerging need for low-latency, continuous graph analysis led to the development of online partitioning...
Conference Paper
Message-based programming frameworks facilitate the development and execution of core distributed computing algorithms today. Their twofold aim is to expose a programming model that minimises logical errors incurred during translation from an algorithmic specification to executable program, and also to provide an efficient runtime for event pattern...
Article
Full-text available
Stream processors are emerging in industry as an apparatus that drives analytical but also mission-critical services handling the core of persistent application logic. Thus, apart from scalability and low latency, a rising system need is first-class support for application state together with strong consistency guarantees, and adaptivity to cluster...
Chapter
In our data-centric society, online services, decision making, and other aspects are increasingly becoming heavily dependent on trends and patterns extracted from data. A broad class of societal-scale data management problems requires system support for processing unbounded data with low latency and high throughput. Large-scale data stream processi...
Conference Paper
Full-text available
Aggregation queries on data streams are evaluated over evolving and often overlapping logical views called windows. While the aggregation of periodic windows was extensively studied in the past through the use of aggregate sharing techniques such as Panes and Pairs, little to no work has been put into optimizing the aggregation of very common, non-p...
Article
Full-text available
Distributed stateful stream processing enables the deployment and execution of large scale continuous computations in the cloud, targeting both low latency and high throughput. One of the most fundamental challenges of this paradigm is providing processing guarantees under potential failures. Existing approaches rely on periodic global state snapsh...
Conference Paper
Full-text available
Recent advances in distributed computing have made it possible to achieve high availability on traditional systems and thus serve them as reliable services. For several offline computational applications, such as fine-grained batch processing, their parallel nature in addition to weak consistency requirements allowed a more trivial transition. On t...
Conference Paper
Streaming a live music concert over the Internet is a challenging task as it requires real-time, high-quality data delivery over a large number of geographically distributed nodes. In this paper we propose MusiCast, a real-time peer-to-peer multicast system for streaming MIDI events and compressed audio data. We present a scalable and distributed t...
