Mihail Stoian’s scientific contributions

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (1)


Lightweight Correlation-Aware Table Compression
  • Preprint

October 2024

·

3 Reads

Mihail Stoian

·

·

Jan Kobiolka

·

[...]

·

The growing adoption of data lakes for managing relational data necessitates efficient, open storage formats that provide high scan performance and competitive compression ratios. While existing formats achieve fast scans through lightweight encoding techniques, they have reached a plateau in terms of minimizing storage footprint. Recently, correlation-aware compression schemes have been shown to reduce file sizes further. Yet, current approaches either incur significant scan overheads or require manual specification of correlations, limiting their practicability. We present Virtual\texttt{Virtual}, a framework that integrates seamlessly with existing open formats to automatically leverage data correlations, achieving substantial compression gains while having minimal scan performance overhead. Experiments on data.gov\texttt{data.gov} datasets show that Virtual\texttt{Virtual} reduces file sizes by up to 40% compared to Apache Parquet.