Figure 4 - available via license: Creative Commons Attribution 4.0 International
FP64 single core comparison against x86, reporting number of times faster or slower than the baseline SG2042
Source publication
The Sophon SG2042 is the world's first commodity 64-core RISC-V CPU for high performance workloads, and an important question is whether the SG2042 has the potential to encourage the HPC community to embrace RISC-V. In this paper we undertake a performance exploration of the SG2042 against existing RISC-V hardware and high performance x86 CPUs in...
Contexts in source publication
Context 1
... For all systems we execute with the most performant number of threads: on all the x86 systems this was found to equal the number of physical cores, whereas for the SG2042 it was demonstrated in Section 3.2 that for some benchmark classes 32 threads provided better performance than 64 threads. Figure 4 reports single core performance running the benchmark suite at FP64 for the x86 CPUs, baselined against the SG2042. This graph is organised in the same way as the RISC-V commodity hardware comparison graph: the bars show the average number of times faster or slower across the class, and the whiskers range from the minimum to the maximum. ...
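The statistics behind Figure 4 can be sketched as follows. This is a minimal illustration, not the authors' code: the kernel names and runtimes are hypothetical, and it simply assumes each bar is the mean per-kernel speedup over the SG2042 baseline with whiskers spanning the minimum and maximum.

```python
# Hedged sketch of the "times faster/slower than the baseline" statistics
# described for Figure 4. All kernel names and timings below are invented
# for illustration; they are not measurements from the paper.

def relative_performance(baseline_times, other_times):
    """Per-kernel speedup of `other` relative to the SG2042 baseline.

    A value > 1 means the other CPU ran that kernel faster than the
    baseline; < 1 means it ran slower.
    """
    return {k: baseline_times[k] / other_times[k] for k in baseline_times}

def class_summary(speedups):
    """Bar height (mean) and whisker extent (min..max) for one class."""
    vals = list(speedups.values())
    return {"mean": sum(vals) / len(vals), "min": min(vals), "max": max(vals)}

# Hypothetical FP64 single-core runtimes in seconds for one benchmark class.
sg2042_times = {"kernel_a": 12.0, "kernel_b": 8.0, "kernel_c": 20.0}
x86_times    = {"kernel_a": 3.0,  "kernel_b": 4.0, "kernel_c": 2.5}

speedups = relative_performance(sg2042_times, x86_times)
summary = class_summary(speedups)
print(summary)  # {'mean': 4.666..., 'min': 2.0, 'max': 8.0}
```

Under these assumed numbers the x86 bar would sit at roughly 4.7x the SG2042, with whiskers from 2x to 8x, matching the bar-plus-whisker layout the text describes.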
Context 2
... the average bars in Figure 5 do not provide a complete picture. As described in Section 3.2, the C920 only supports vectorisation for FP32, and indeed it can be seen from the whiskers in Figures 4 and 5 that for many benchmark classes the maximum times faster is lower at FP32 than at FP64. Furthermore, at FP32 more of the slowest-running kernels perform slower on the x86 CPUs than on the C920. ...
Similar publications
The increasing availability of machines relying on non-GPU architectures, such as ARM A64FX in high-performance computing, provides a set of interesting challenges to application developers. In addition to requiring code portability across different parallelization schemes, programs targeting these architectures have to be highly adaptable in terms...
Scientific applications are starting to explore the viability of quantum computing. This exploration typically begins with quantum simulations that can run on existing classical platforms, albeit without the performance advantages of real quantum resources. In the context of high-performance computing (HPC), the incorporation of simulation software...
Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (AI) research, development, and deployment. The extensive use of supercomputers for AI training has heightened concerns about energy consumption and ca...
Compute nodes on modern heterogeneous supercomputing systems comprise CPUs, GPUs, and high-speed network interconnects (NICs). Parallelization is identified as a technique for effectively utilizing these systems to execute scalable simulation and deep learning workloads. The resulting inter-process communication from the distributed execution of th...
Recent commodity x86 CPUs still dominate the majority of supercomputers, and most of them implement vector architectures to support single instruction multiple data (SIMD). Although research for architectural exploration requires computer architecture simulators, and a number of simulators have been developed, only a few tools support recent x86 SIMD i...