Tim Gubner Cwi’s scientific contributions


Publications (1)


Exploring Query Execution Strategies for JIT, Vectorization and SIMD
  • Conference Paper

June 2017 · 245 Reads · 18 Citations · Tim Gubner Cwi

This paper partially explores the design space for efficient query processors on future hardware that is rich in SIMD capabilities. It departs from two well-known approaches: (1) interpreted block-at-a-time execution (a.k.a. "vectorization") and (2) "data-centric" JIT compilation, as in the HyPer system. We argue that in between these two design points in terms of granularity of execution and unit of compilation, there is a whole design space to be explored, in particular when considering exploiting SIMD. We focus on TPC-H Q1, providing implementation alternatives ("flavors") and benchmarking these on various architectures. In doing so, we explain in detail considerations regarding operating on SQL data in compact types, and the system features that could help using as compact data as possible. We also discuss various implementations of aggregation, and propose a new strategy called "in-register aggregation" that reduces memory pressure but also allows to compute on more compact, SIMD-friendly data types. The latter is related to an in-depth discussion of detecting numeric overflows, where we make a case for numeric overflow prevention, rather than detection. Our evaluation shows positive results, confirming that there is still a lot of design headroom.
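
The abstract's case for overflow prevention rather than detection can be illustrated roughly as follows. This is a minimal sketch and not the paper's implementation: it assumes that per-column minimum and maximum values and an upper bound on the row count are known ahead of execution, and every function name here is hypothetical. The idea is to derive worst-case bounds for an expression or a SUM from those statistics and then pick the narrowest integer width that cannot overflow, so no per-value overflow check is needed at run time.

```python
from typing import Tuple

# Signed integer widths (in bits) considered by this sketch.
SIGNED_WIDTHS = (8, 16, 32, 64)


def smallest_signed_width(lo: int, hi: int) -> int:
    """Return the narrowest signed width that holds every value in [lo, hi]."""
    for bits in SIGNED_WIDTHS:
        type_min = -(1 << (bits - 1))
        type_max = (1 << (bits - 1)) - 1
        if type_min <= lo and hi <= type_max:
            return bits
    raise OverflowError("value range does not fit in 64 bits")


def product_bounds(a_min: int, a_max: int, b_min: int, b_max: int) -> Tuple[int, int]:
    """Bounds of a * b for a in [a_min, a_max] and b in [b_min, b_max]."""
    corners = (a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max)
    return min(corners), max(corners)


def sum_bounds(col_min: int, col_max: int, max_rows: int) -> Tuple[int, int]:
    """Worst-case bounds of SUM over at most max_rows values in [col_min, col_max].

    An empty group sums to 0, so 0 is always inside the returned range.
    """
    return min(col_min, 0) * max_rows, max(col_max, 0) * max_rows


# Example with made-up statistics: a quantity column known to lie in [1, 50],
# summed over at most 1000 rows per group.
lo, hi = sum_bounds(1, 50, 1000)
accumulator_bits = smallest_signed_width(lo, hi)
```

With these made-up statistics the values themselves fit in 8 bits, while the accumulator needs a wider type. Choosing widths once from the bounds is what lets the inner loop stay free of overflow checks.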


Citations (1)


... There exist breaking points for Linear-S-CPU and Chain-S-CPU because their retired instructions are increased with the number of tuples. For example, in Fig. 8b, when there are 2^15, 2^16, 2^17, 2^18, and 2^19 tuples, Linear-S-CPU executes 720,000, 1,440,000, 3,240,000, 5,400,000, and 12,240,000 instructions respectively. The CPI (cycles per instruction retired) is in the range of 6.7-9.5. ...

Reference:

An experimental study of group-by and aggregation on CPU-GPU processors
Exploring Query Execution Strategies for JIT, Vectorization and SIMD