FP32 single core comparison against x86, reporting number of times faster or slower than the baseline SG2042


Source publication
Preprint
The Sophon SG2042 is the world's first commodity 64-core RISC-V CPU for high performance workloads, and an important question is whether the SG2042 has the potential to encourage the HPC community to embrace RISC-V. In this paper we undertake a performance exploration of the SG2042 against existing RISC-V hardware and high performance x86 CPUs in...

Contexts in source publication

Context 1
... AMD Rome and Intel Icelake CPUs tend to outperform the Intel Broadwell, which is understandable given that the Broadwell is the oldest of the three. Figure 5 reports results from the same experiment using FP32 and it can be seen that the AMD Rome CPU is fairly lacklustre when executing at single precision compared to double, whereas the Intel processors on average perform just as well, and indeed the Sandybridge outperforms the C920 on average in each class when using FP32. However, the average bars in Figure 5 do not provide a complete picture. ...
Context 2
... 5 reports results from the same experiment using FP32 and it can be seen that the AMD Rome CPU is fairly lacklustre when executing at single precision compared to double, whereas the Intel processors on average perform just as well, and indeed the Sandybridge outperforms the C920 on average in each class when using FP32. However, the average bars in Figure 5 do not provide a complete picture. As described in Section 3.2, the C920 only supports vectorisation for FP32 and indeed it can be seen from the whiskers in Figures 5 and 4 that the maximum times faster is less for many benchmark classes at FP32 than FP64. ...
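
The point about average bars versus whiskers can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the signed "times faster or slower" convention used here (0 means equal runtime, +1 means twice as fast as the baseline, -1 means twice as slow) is only one plausible reading of the figure caption, and the function names, class name, and runtimes are hypothetical.

```python
from statistics import mean


def times_faster(baseline_s, other_s):
    """Signed factor relative to the baseline runtime.

    ASSUMED convention (not confirmed by the source):
    positive = other CPU is faster, negative = slower, 0 = equal.
    """
    if baseline_s <= 0 or other_s <= 0:
        raise ValueError("runtimes must be positive")
    if other_s <= baseline_s:
        return baseline_s / other_s - 1.0
    return -(other_s / baseline_s - 1.0)


def summarise(baseline, other):
    """Per benchmark class, return the mean (the bar) and min/max (the whiskers).

    Both arguments map a class name to a list of kernel runtimes in seconds,
    with kernels listed in the same order in both dictionaries.
    """
    summary = {}
    for cls, base_times in baseline.items():
        factors = [times_faster(b, o) for b, o in zip(base_times, other[cls])]
        summary[cls] = {
            "mean": mean(factors),
            "min": min(factors),
            "max": max(factors),
        }
    return summary


# Hypothetical runtimes: one kernel runs twice as fast, one twice as slow.
baseline_runtimes = {"class_a": [4.0, 2.0]}
x86_runtimes = {"class_a": [2.0, 4.0]}
result = summarise(baseline_runtimes, x86_runtimes)
```

With these made-up numbers the mean for `class_a` is 0, although the individual kernels range from one time slower to one time faster, which is the kind of spread that an average bar alone would hide.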
