Harry Wagstaff's research while affiliated with The University of Edinburgh and other places

Publications (15)

Article
System-level Dynamic Binary Translation (DBT) provides the capability to boot an Operating System (OS) and execute programs compiled for an Instruction Set Architecture (ISA) different from that of the host machine. Due to their performance-critical nature, system-level DBT frameworks are typically hand-coded and heavily optimized, both for their g...
Conference Paper
Many Virtual Execution Environments (VEEs) rely on Just-in-time (JIT) compilation technology for code generation at runtime, e.g. in Dynamic Binary Translation (DBT) systems or language Virtual Machines (VMs). While JIT compilation improves native execution performance as opposed to e.g. interpretive execution, the JIT compilation process itself in...
Preprint
SLAM is becoming a key component of robotics and augmented reality (AR) systems. While a large number of SLAM algorithms have been presented, there has been little effort to unify the interface of such algorithms, or to perform a holistic comparison of their capabilities. This is a problem since different SLAM applications can have different functi...
Article
Visual understanding of 3D environments in real-time, at low power, is a huge computational challenge. Often referred to as SLAM (Simultaneous Localisation and Mapping), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, virtual and augmented reality. This paper describes the results of a major research ef...
Article
Hardware virtualization solutions provide users with benefits ranging from application isolation through server consolidation to improved disaster recovery and faster server provisioning. While hardware assistance for virtualization is supported by all major processor architectures, including Intel, ARM, PowerPC, and MIPS, these extensions are targ...
Conference Paper
System designers typically use well-studied benchmarks to evaluate and improve new architectures and compilers. We design tomorrow's systems based on yesterday's applications. In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. Until now, this application cou...
Conference Paper
Instruction set simulators (ISS) have many uses in embedded software and hardware development and are typically based on dynamic binary translation (DBT), where frequently executed regions of guest instructions are compiled into host instructions using a just-in-time (JIT) compiler. Full-system simulation, which necessitates handling of asynchronou...
Article
Instruction set simulators (ISS) have many uses in embedded software and hardware development and are typically based on dynamic binary translation (DBT), where frequently executed regions of guest instructions are compiled into host instructions using a just-in-time (JIT) compiler. Full-system simulation, which necessitates handling of asynchronou...
Article
Processor design tools integrate in their workflows generators for instruction set simulators (ISS) from architecture descriptions. However, it is difficult to validate the correctness of these simulators. ISA coverage analysis is insufficient to isolate modelling faults, which might only be exposed in corner cases. We present a novel ISA branch co...
Article
Region-based JIT compilation operates on translation units comprising multiple basic blocks and, possibly cyclic or conditional, control flow between these. It promises to reconcile aggressive code optimisation and low compilation latency in performance-critical dynamic binary translators. Whilst various region selection schemes and isolated code o...
Conference Paper
Modern processor design tools integrate in their workflows generators for instruction set simulators (ISS) from architecture descriptions. Whilst these generated simulators are useful for design evaluation and software development, they suffer from poor performance. We present an ultra-fast JIT-compiled ISS generated from an ArchC description. We a...

Citations

... QEMU [9] might not be the fastest DBT engine today [10], but it has unparalleled stability, supports many target and host CPUs, is open source, and has a large and very active community supported by the industry. It has become the reference for many users and developers, which makes it the de facto standard for cross-ISA emulation. ...
... [Figure residue: series Prune=63 and Prune=127, values between 1.0x and 1.1x] for different sizes of a convolutional layer, as well as lower-level details about the execution in hardware, we executed the workloads in a Full-System Mali GPU simulator [22]. ...
... It was out of the scope of this work to evaluate the performance of the SLAM algorithms used. Therefore, we selected the ORB-SLAM2 framework, which obtained good results in the SLAM benchmark SLAMBench2 [6], as a representative state-of-the-art SLAM framework. For the server video streamer and the client video receiver, we use Gstreamer or a custom implementation of an MJPEG encoder/decoder as alternatives in our evaluations below. ...
... One possible solution to improve the kit assembly operations is the use of Augmented Reality (AR), considered one of the nine pillars of Industry 4.0, to support operators with real-time information for faster decision-making while improving work processes [15,24,50,54,56,62]. This technology can integrate virtual information in the operators' workspace [35,42], helping them in assembly tasks [18,43,49] and providing context-aware assistance [5], data visualization and interaction (acting as a Human-Machine Interface (HMI)) [16,40], indoor localization [60], maintenance applications [8,18,61], quality control [4,65], material management [16,51] or remote collaboration [7,39,66], by presenting additional layers of digital information on top of real-world environments [3,28,33,37,38,57]. Prior studies identify certain benefits of applying AR for technological industrialization, like increased work safety, effective learning and training, and greater task effectiveness [10,12,31], as well as improved Human-Robot Interaction (HRI) [1,13,19,34]. ...
... one aspect is usually at the expense of performance degradation in other aspects, and the two cancel each other out eventually. An example of this problem is that some individual benchmarks in the SPEC CPU 2006 [4] experienced performance fluctuations or even degradation among different QEMU versions [11]. Virtualization benchmarks, such as SPECvirt 2013 [9], which are a combination of several common application benchmarks in data centers, focus only on server's consolidation capacity. ...
... In addition to being more explainable, tree-based models are paving the way for the use of ML in computer architecture due to being generally more resilient to the magnitude of the input features and fairly easy to use and deploy. Recent works have used tree-based models for performance prediction [130,16] and automatic design space exploration [63,35]. Within the scope of extracting relevant software/architectural insight from profiling, Fenacci et al. [45] employ decision trees to gather insights on benchmarks targeting embedded applications. ...
... 2. Hypervisors (VMM, Virtual Machine Manager) that use processor architecture extensions to provide hardware virtualization: Xen [5], KVM [6]. ...
... • The Shader Program Execution Engine, which allows us to simulate the behaviour of Mali programs. Future plans for simulation include extending the infrastructure to support real time graphics simulation, increasing GPU Simulation performance using Dynamic Binary Translation (DBT) [79], [82], [85] techniques, and extending the Mali Model to support performance modelling. We have also continued to investigate new techniques for full-system dynamic binary translation (such as exploiting hardware features on the host to further accelerate simulation performance), as well as new methodologies for accelerating the implementation and verification of full system instruction set simulators. ...
... Hybrid interpreter/DBT systems [17] offload the expensive JIT compilation of work-units to threads [3], whilst still making forward progress in the interpreter, and thus hiding the latency of JIT compilation. Such set-ups (described as Asynchronous Mixed-mode Translation by [24]) have a greater scope for implementation, and raise questions such as what, when, and how guest code should be translated. Figure 1 contrasts a typical configuration for a hybrid interpreter/DBT-based VEE against our novel scheme. ...
... In this context, optimization techniques for constraint propagation [6], execution path coverage models [7] and mining techniques for processor manuals [8] have been considered. Alternative approaches integrate coverage-guided test generation based on Bayesian networks [9] and other machine learning techniques [10] as well as fuzzing [11] and symbolic execution [12]. However, these approaches are either not designed for RTL verification or impose restrictions on the generated instruction streams. ...