Conference Paper

POET: Parameterized Optimizations for Empirical Tuning.

DOI: 10.1109/IPDPS.2007.370637 Conference: 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), Proceedings, 26-30 March 2007, Long Beach, California, USA
Source: DBLP

ABSTRACT The excessive complexity of both machine architectures and applications has made it difficult for compilers to statically model and predict application behavior. This observation motivates the recent interest in performance tuning using empirical techniques. We present a new embedded scripting language, POET (Parameterized Optimization for Empirical Tuning), for parameterizing complex code transformations so that they can be empirically tuned. The POET language aims to significantly improve the generality, flexibility, and efficiency of existing empirical tuning systems. We have used the language to parameterize and empirically tune three loop optimizations (interchange, blocking, and unrolling) for two linear algebra kernels. We show experimentally that the time required to tune these optimizations using POET, which does not require any program analysis, is significantly shorter than when using a full compiler-based source-code optimizer that performs sophisticated program analysis and optimizations.
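The three loop optimizations named in the abstract can be illustrated on a naive matrix multiply. The sketch below is purely illustrative and is not POET syntax (POET parameterizes such transformations in its own scripting language); it only shows what interchange, blocking, and unrolling each do to the loop nest.

```python
def matmul_naive(A, B, n):
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):              # original i-j-k loop order
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tuned(A, B, n, bs=2):
    C = [[0.0] * n for _ in range(n)]
    # Blocking: iterate over bs-by-bs tiles of i and k for locality.
    # Interchange: within a tile, order loops i-k-j instead of i-j-k.
    for ii in range(0, n, bs):
        for kk in range(0, n, bs):
            for i in range(ii, min(ii + bs, n)):
                for k in range(kk, min(kk + bs, n)):
                    a = A[i][k]
                    # Unrolling: two j iterations per step
                    # (n assumed even here for brevity).
                    for j in range(0, n, 2):
                        C[i][j] += a * B[k][j]
                        C[i][j + 1] += a * B[k][j + 1]
    return C
```

An empirical tuner would vary parameters such as the block size `bs` and the unroll factor, compile and time each variant, and keep the best.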

  • Source
    ABSTRACT: Scientific programmers often turn to vendor-tuned Basic Linear Algebra Subprograms (BLAS) to obtain portable high performance. However, many numerical algorithms require several BLAS calls in sequence, and those successive calls result in suboptimal performance. The entire sequence needs to be optimized in concert. Instead of vendor-tuned BLAS, a programmer could start with source code in Fortran or C (e.g., based on the Netlib BLAS) and use a state-of-the-art optimizing compiler. However, our experiments show that optimizing compilers often attain only one-quarter the performance of hand-optimized code. In this paper we present a domain-specific compiler for matrix algebra, the Build to Order BLAS (BTO), that reliably achieves high performance using a scalable search algorithm for choosing the best combination of loop fusion, array contraction, and multithreading for data parallelism. The BTO compiler generates code that is between 16% slower and 39% faster than hand-optimized code.
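A minimal sketch of the loop fusion the BTO abstract describes, under assumed function names: a sequence of two BLAS-style operations (y = A*x, then r = y + z) traverses the temporary y twice, while the fused version computes each y[i] and consumes it immediately, allowing the temporary array to be contracted to a scalar.

```python
def unfused(A, x, z, n):
    # Two separate BLAS-like calls: GEMV, then an elementwise add.
    y = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
    return [y[i] + z[i] for i in range(n)]

def fused(A, x, z, n):
    # Fused loop: the intermediate y is contracted to the scalar yi.
    r = [0.0] * n
    for i in range(n):
        yi = 0.0
        for k in range(n):
            yi += A[i][k] * x[k]
        r[i] = yi + z[i]
    return r
```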
  • Source
    ABSTRACT: Parameterizing source code for architecture-bound optimization is a common approach to high-performance programming, but one that makes the programmer's task arduous and the resulting code difficult to maintain. Certain parameterizations, such as changing loop order, may require elaborate code instrumenting that distracts from the main objective. In this paper, we propose a templating and automatic code generation approach based on standard Python modules and the Opal library for algorithm optimization. Advantages of our approach include its programmatic simplicity and the flexibility offered by the templating engine. We provide a complete example for the matrix multiply where optimization with respect to blocking, loop unrolling and compiler flags takes place.
    Cahier du GERAD. 01/2013;
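The templating idea above can be sketched with Python's standard `string.Template`. The template text and parameter names below are hypothetical, not the paper's actual templates: a Python template expands into C source for a matrix-multiply inner loop with a tunable unroll factor, and an external search tool (the paper uses the Opal library) would vary the factor, regenerate, and recompile.

```python
from string import Template

# Hypothetical template for a C inner loop; $unroll and $body are
# placeholders filled in by the generator below.
kernel = Template("""\
for (int j = 0; j < n; j += $unroll) {
$body}
""")

def generate(unroll):
    # Emit one unrolled statement per offset 0..unroll-1.
    body = "".join(
        f"    C[i][j + {u}] += a * B[k][j + {u}];\n" for u in range(unroll)
    )
    return kernel.substitute(unroll=unroll, body=body)

print(generate(2))
```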
  • Source
    ABSTRACT: GeCoS is an open source framework that provides a highly productive environment for hardware design. GeCoS primarily targets custom hardware design using High Level Synthesis, distinguishing itself from classical compiler infrastructures. Compiling for custom hardware makes use of domain-specific semantics that are not considered by general-purpose compilers. The goal is to find the right balance between various performance criteria, such as area, speed, and accuracy, in contrast to the typical high-performance computing goal of maximizing speed. The GeCoS infrastructure facilitates the prototyping of hardware design flows, going beyond compiler analyses and transformations. Hardware designers must interact with the compiler for design space exploration, and it is important to be able to give instant feedback to the users.
    Source Code Analysis and Manipulation (SCAM), 2013 IEEE 13th International Working Conference on; 01/2013
