
Android 12 and 13 Benchmarks and Cortex-X2 CPU With Low MP Efficiency

Authors:
  • Independent Researcher

Abstract

Benchmarks and stress tests were run on existing devices, confirming that they ran successfully following upgrades to Android 12 and 13. A new phone was included, with one Cortex-X2 CPU core and seven others of two varieties, producing the highest speeds but with questionable performance characteristics.
Contents
Summary Introduction Configurations
Whetstone Benchmark Dhrystone Benchmark Linpack Benchmark
Livermore Loops Benchmark MemSpeed Benchmark NeonSpeed Benchmark
BusSpeed Benchmark RandMem Benchmark FFT Benchmarks
MP-Whetstone Benchmark MP-Dhrystone Benchmark MP-BusSpeed Benchmark
MP-RandMem Benchmark MP-MFLOPS Benchmark NEON-MFLOPS-MP Benchmark
Java OpenGL Benchmark Java Drawing Benchmark Java Whetstone Benchmark
Java Linpack Benchmark DriveSpeed Benchmark CPU Stress Tests
Integer Stress Benchmark Floating Point Stress Benchmark Integer Stress Tests
Floating Point Stress Tests More Integer Stress Tests More Floating Point Stress Tests
Downloading PDF Files Alternative HTM Files
Summary
The main purpose of this report was to confirm that the benchmarks and stress testing programs continued to run successfully on the later
versions of Android, which they did. Again, the systems tested had eight-core ARM big.LITTLE CPU configurations. One of them was run using
an earlier version of Android, which effectively produced the same levels of performance. Configurations comprised:
System 1 - 2 x Cortex-A76 at 2.05 GHz, 6 x Cortex-A55 at 2.00 GHz
System 2 - 2 x Cortex-A76 at 2.00 GHz, 6 x Cortex-A55 at 1.80 GHz
System 3 - 2 x Cortex-A75 at 2.00 GHz, 6 x Cortex-A55 at 2.00 GHz
System 4 - 1 x Cortex-X2 at 2.80 GHz, 3 x Cortex-A710 at 2.52 GHz, 4 x Cortex-A510 at 1.82 GHz
Particular attention is given to comparing the performance of the newer Cortex-X2 processor with that of the Cortex-A76, and to
multiprocessor efficiency and battery/mains power effects.
10 Single Core Benchmarks
These and most others are normally run on mains power, to avoid the reduction in clock speed as the battery discharges, and are expected
to use the fastest CPU core. The first set were the old Classic Benchmarks, comprising Whetstone, Dhrystone, Linpack and Livermore Loops.
For these, performance is expected to be proportional to CPU MHz, with some variation due to architectural changes. System 4/System 2
performance gains are included, with a minimum expectation of 1.4 times (2.8/2.0 GHz). Ratings of overall performance were
1.50, 2.40, 2.37# and 2.15 times over the four benchmarks. This includes a revised calculation for Benchmark 3#, where one test produced a
lower Cortex-X2 gain of 1.35 times, demonstrating that it was slower than when running on battery. Benchmark 4 was the last one that
appeared to run on the X2 at full speed under power (see recorded dates/times).
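The comparison with the 1.4 times clock speed expectation can be sketched as follows. This is only an illustration, using the overall gains quoted above; the function name per_clock_gain is hypothetical and not part of the benchmark programs.

```python
# Clock speeds from the configuration list (GHz).
X2_GHZ = 2.8   # System 4 Cortex-X2
A76_GHZ = 2.0  # System 2 Cortex-A76

# Overall System 4/System 2 gains quoted above for the four Classic Benchmarks.
measured_gains = {
    "Whetstone": 1.50,
    "Dhrystone": 2.40,
    "Linpack": 2.37,
    "Livermore Loops": 2.15,
}


def per_clock_gain(gain, fast_ghz=X2_GHZ, slow_ghz=A76_GHZ):
    """Return the part of a speed gain that is not explained by clock speed."""
    return gain / (fast_ghz / slow_ghz)


for name, gain in measured_gains.items():
    print(f"{name:16s} gain {gain:4.2f}  per clock {per_clock_gain(gain):4.2f}")
```

A per clock value near 1.0 suggests the gain comes from MHz alone, while larger values point to architectural improvements.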
6 Single Core Memory Benchmarks - Each of these carries out four to six computing variations using 10 or 11 different memory levels,
covering all caches and RAM. Initial runs, under power, demonstrated low System 4/System 2 performance ratios, some being less than 1.0.
Later they were run under battery power, where most of the 300+ ratios were greater than 1.4 times, with highest gains of up to 10
times, using the Cortex-X2 L3 cache.
5 Multithreading Benchmarks - These were run using 1, 2, 4 and 8 threads. Two of them are included to demonstrate programming code that
is unsuitable for using multiple CPU cores. The report again includes more than 300 System 4/2 comparisons, with minimum,
average and maximum ratios of 1.03, 2.21 and 13.3 times, ignoring one at 0.89.
Calculations of Multithreading Efficiency are also provided, with examples below. PC results are for an Intel Core i5 CPU (using 8
threads on a 4 core processor), based on ratios from Cray 1 Supercomputer Performance Comparisons With_Home Computers Phones and Tablets.pdf,
demonstrating what might be expected. With the relatively simple code in the Whetstone Benchmark, 8 PC threads produced more than 7
times the performance of 1 thread. The Android systems were less efficient using 8 real cores, with gains of
around 5 times. Android MP-MFLOPS 8 core gains were particularly low, at less than 3 times, with an unbalanced Cortex-X2
ratio of 1.31 times using 2 cores.
                    Whetstone                      MP-MFLOPS
Threads      PC SSE   S2 A76    S4 X2      PC SSE   S2 A76    S4 X2
1              1.00     1.00     1.00        1.00     1.00     1.00
2              1.99     2.13     1.87        1.97     1.92     1.31
4              3.61     3.47     3.57        2.59     2.07     2.11
8              7.02     5.14     4.64        3.59     2.71     2.21
Max MFLOPS     8773     4869     7993      119460    34024    70401
PC 4.15 GHz AVX512 325915
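As an illustration, the speed ratios in the table can be turned into percentage efficiency by dividing by the number of threads. This is a minimal sketch; the efficiency function and dictionary layout are assumptions for the example, not part of the benchmark code.

```python
THREADS = [1, 2, 4, 8]

# Speed relative to one thread, copied from the table above.
speedups = {
    "Whetstone PC SSE": [1.00, 1.99, 3.61, 7.02],
    "Whetstone S2 A76": [1.00, 2.13, 3.47, 5.14],
    "Whetstone S4 X2": [1.00, 1.87, 3.57, 4.64],
    "MP-MFLOPS PC SSE": [1.00, 1.97, 2.59, 3.59],
    "MP-MFLOPS S2 A76": [1.00, 1.92, 2.07, 2.71],
    "MP-MFLOPS S4 X2": [1.00, 1.31, 2.11, 2.21],
}


def efficiency(speedup, threads):
    """Return multithreading efficiency as a percentage of ideal scaling."""
    return 100.0 * speedup / threads


for name, ratios in speedups.items():
    cells = "  ".join(
        f"{t}T {efficiency(s, t):5.1f}%" for t, s in zip(THREADS, ratios)
    )
    print(f"{name:18s} {cells}")
```

On this measure, the 2 thread MP-MFLOPS result for System 4 is 65.5% efficient, against 96% for System 2.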
MP-MFLOPS had been run earlier on power, with the latest run on battery, providing faster performance, as mentioned above.
NEON-MFLOPS-MP, the same program but using NEON SIMD functions, was run next on power, immediately followed by a run on battery. This time,
performance via power and battery was similar. Other applicable results are from MP-BusSpeed and MP-RandMem, only run on
battery, where average System 4/2 random access performance gains were up to 10.6 times using the former’s large L3 cache.
System 4 provided widely variable gains on other test functions.
Java and Graphics Benchmarks
These comprise OpenGL and Drawing programs, then Java versions of Whetstone and Linpack benchmarks. The main point is that they all ran
successfully but some of the results are questionable.
CPU Stress Tests
There are two main CPU stress testing programs that can use up to 32 threads, one executing floating point instructions and the other
integer arithmetic. Parameters are provided to specify testing time, number of threads and memory size. The floating point program also
has a parameter for floating point complexity.
Benchmarks are provided, using all options, to help in determining stress testing parameters, each thread using different segments of the
data, with repetitive calculations. As with other programs, performance varies depending on the current environment including battery or
mains power and temperature state.
The Integer Benchmark uses up to 32 threads where, with 8 CPU cores, similar performance is expected using 16 and 32 threads. However,
there are unexpected faster speeds where data used by threads can be transferred from lower level caches. The System 4 test
demonstrated the usual performance gains over System 2 but, this time, the earlier run on power was faster than the later one on battery.
The Floating Point Benchmark uses up to 8 threads but executes increasing computation levels. System 4/2 comparisons demonstrated
the former’s inferior performance gains using 2 threads and inexplicable Battery/Power ratios between 0.53 and 1.60.
Integer Stress Tests - These were run for 15 minutes using 8 threads on the three older systems, also including samples of CPU MHz
measurements of the 8 cores. There, the 6 classified as LITTLE mainly appeared to run at constant frequencies, with variations on the BIG
ones. Runs using mains and battery power are included, producing similar performance, with variations in LITTLE CPU MHz on the latter.
Tests were run on System 4 at 8, 4, 2 and 1 threads, including connecting mains power during the third one. For this phone, the MHz
program failed to operate properly and the stress test timed out, without manual intervention (see More Integer Stress Tests).
Floating Point Stress Tests - These were also run for 15 minutes using 8 threads, demonstrating the same pattern of frequency changes,
using 32 calculations per word. This time, System 2 shows all cores running at maximum MHz throughout the 15 minute test. The first
System 4 run was on battery, with performance reducing by 36% over the testing time, probably due to battery discharge. The
immediately following test was on power, starting at an increased speed. This time, the reduction was 31%. Later came tests using 4, 2 and 1
threads, all of which suffered from performance degradation, the last one including a long timeout that led to increased speed.
More Integer and Floating Point Stress Tests - Based on some of the above results and later ones, minimum, average and maximum
performance ratios are provided for all four systems using 1, 2, 4, and 8 threads. Following are sample highlight results of multithreading
efficiency and measured performance using 2 and 8 threads of Systems 2 and 4. Performance MOPS are averages of MFLOPS for floating
point and measured MBytes per second divided by four for integers. Although System 4 is shown to be faster on all measurements, it is
indicated as having the lower thread efficiency at all thread levels, but only on floating point calculations. Efficiency using 4 and 8
threads was particularly low in all cases.
                    Integer             Floating Point
Threads        S2 A76    S4 X2        S2 A76    S4 X2
1                 1.0      1.0           1.0      1.0
2                 1.7      1.9           1.9      1.2
4                 2.3      2.5           2.4      1.8
8                 3.3      3.8           2.6      2.1
MOPS 2T          7677    10027         24896    36783
MOPS 8T         13840    26147         37249    65709
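The MOPS figures can be related to each other with a short calculation. The sketch below is an illustration only: it assumes the integer divisor of four corresponds to 4-byte words, and the function names are hypothetical.

```python
# Measured MOPS from the table above: (2 threads, 8 threads).
mops = {
    "Integer S2 A76": (7677, 13840),
    "Integer S4 X2": (10027, 26147),
    "Floating Point S2 A76": (24896, 37249),
    "Floating Point S4 X2": (36783, 65709),
}


def integer_mops(mbytes_per_second):
    """Convert integer MBytes/second to MOPS (assumed 4 bytes per word)."""
    return mbytes_per_second / 4.0


def scaling(two_thread, eight_thread):
    """Return the 8 thread over 2 thread gain; ideal scaling would be 4.0."""
    return eight_thread / two_thread


for name, (two, eight) in mops.items():
    print(f"{name:22s} 8T/2T gain {scaling(two, eight):4.2f} of ideal 4.00")
```

All four gains are well below 4.0, in line with the low 4 and 8 thread efficiency noted above.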
Performance Running On Battery Or Power
Performance is likely to reduce at increased CPU temperatures and as the battery becomes discharged. These test results could not identify
the best option for maximum performance using the Cortex-X2 based phone. Perhaps there are additional heating effects running certain
programs whilst the battery is being charged at a high rate. Maybe all comparative power and battery tests should initially run with fully
charged batteries, after a cooling off period.
Introduction
The results of the programs used are not intended to be used to produce an artificial single number rating. The aim is to identify a wide
range of performance attributes that indicate strengths and weaknesses and ongoing hardware and software compatibility.
In 2017, I published Android Benchmarks For 32 Bit and 64 Bit CPUs from ARM Intel and MIPS.pdf with background and details of the small
change required for my benchmarks to run under Android 8, with appropriate references and links to earlier programs and results. Later, I
repeated the tests covered in Android 9 Benchmarks and Stress Tests On 32 Bit and 64 Bit CPUs.pdf. In 2021 the programs were run again
and reported in Android 10 and 11 Benchmarks and ARM big.LITTLE Architecture Issues.pdf.
The documents, from the first two of the above links, provided the options to independently download and install all the programs used.
However, Compuserve changed the file structure making these inaccessible. But they can currently be obtained, via the main page index,
from the author’s website (with limited life) or via the Wayback Archive (Select latest year, month and highlighted day). These files also
include detailed descriptions, not provided here.
This 2023 report covers the latest releases of Android available at the time. These can complicate on-line installation and the easiest way
was to download the files to a PC and copy them onto an appropriate SD card or into the device’s memory.
With the original benchmarks, the only way I could find to report computer readable results, in the standard monospaced format, was by
Email via the Save button. There, I was the default receiver but this could be changed. Now various options are offered, with the original
procedure followed on selecting the Gmail icon. My preference is the Google Drive option, allowing me to access the files on my PCs.
The programs provide the following range of activities, the actual testing functions being mainly produced using the same C code as my
Windows, Linux and Raspberry Pi benchmarks.
CPU Benchmarks - The first set are the Classic Benchmarks that were the original 1970s to 1980s programs that set standards of
performance for computers, comprising Whetstone, Dhrystone, Linpack and Livermore Loops.
Memory Benchmarks - Next are programs that measure performance with data from caches and RAM. MemSpeed (including NeonSpeed
variant), BusSpeed and RandMem all use the same range of data sizes between 4 KB and 64 MB. Then there is a Fast Fourier Transform
benchmark with multiple data sizes.
MultiThreading Benchmarks - These all measure performance using 1, 2, 4 and 8 threads. The first are MP-Whetstone, MP-Dhrystone and
MP-Linpack. The next batch use memory sized 12.8 KB, 128 KB and mainly 12.8 MB, comprising MP-MFLOPS (including NEON-MFLOPS MP),
MP-BusSpeed and MP-RandMem.
Java Benchmarks - These comprise Java versions of the Whetstone and Linpack benchmarks, a graphics one using drawing functions and
another using OpenGL.
DriveSpeed Benchmarks - For measuring main drive speeds.
CPU Stress Testing Programs - These have variable parameters to run MP benchmarks for extended periods, for identifying overheating
and discharging battery performance issues.
Run Time Procedures - Initially, as usual, the benchmarks were run on power, to avoid slow performance induced at discharged battery
levels. Then, single core programs are expected to run on the fastest processor core in a big.LITTLE arrangement.
Configurations
Many ARM processors have options for different sizes of L1, L2 and L3 caches and whether shared by multiple processor cores. It is often
difficult to discover the sizes in a particular device. However, memory benchmark results can provide an indication.
CPUID From Benchmarks               From CPU-Z or Searches
System 1 Android 11                 Lenovo Tab P11 Plus
Screen pixels w x h 1200 x 1928     SOC MediaTek Helio G90 12nm
Android Build Version 11            2x 2.05 GHz ARM Cortex-A76 and 6x 2.0 GHz ARM Cortex-A55
                                    Has L3 cache
                                    GPU Mali-G76 MC4 720 MHz
processor : 5
BogoMIPS : 26.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd05
CPU revision : 0
processor : 6
BogoMIPS : 26.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x3
CPU part : 0xd0b
CPU revision : 0
System 2 Android 12
Device Motorola moto g(50)          SOC Snapdragon 750 8 nm, claimed to be based on ARM Cortex-A76 and ARM Cortex-A55
Screen pixels w x h 720 x 1339      CPUs - 2 x 2.0 GHz Kryo 480 and 6 x 1.8 GHz Kryo 460
Android Build Version 12            Both Kryo caches L1 64 KB, L2 512 KB, L3 2 MB shared
                                    GPU Adreno 619 450 MHz
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x51
CPU architecture: 8
CPU variant : 0xd
CPU part : 0x805
CPU revision : 14
processor : 5
BogoMIPS : 38.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x51
CPU architecture: 8
CPU variant : 0xd
CPU part : 0x805
CPU revision : 14
processor : 7
BogoMIPS : 38.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x51
CPU architecture: 8
CPU variant : 0x8
CPU part : 0x804
CPU revision : 14
System 3 Android 13
Device Samsung SM-X200              SOC Tiger T618 12 nm
Screen pixels w x h 1920 x 1128     2x 2.0 GHz Cortex-A75 & 6x 2.0 GHz Cortex-A55
Android Build Version 13            Caches L1 16 KB, L2 256 KB, L3 ?
                                    GPU Mali G52 MP2 850 MHz
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd05
CPU revision : 0
processor : 6
BogoMIPS : 52.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x3
CPU part : 0xd0a
CPU revision : 1
Armv9 CPU Phone Configuration
This new phone’s CPU is based on the Arm®v9.0-A architecture. As can be seen here, the program functions used identify a completely
different variety of features and limited information about the technology used. CPU-Z provided limited information and numerous searches
did not help in finding more.
CPUID From Benchmarks               From CPU-Z or Searches
System 4 Android 13                 Samsung S22
Device Samsung SM-S901B             1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710
Screen pixels w x h 1080 x 2009     SOC Exynos 2200 4nm
                                    Caches L1 64 KB, L2 between 512 & 1024 KB, L3 between 512 KB and 8 MB
                                    GPU Xclipse 920
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc
dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes
svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 bti
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd48
CPU revision : 0
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd47
CPU revision : 0
processor : 6
BogoMIPS : 51.20
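The CPU implementer and CPU part values in the listings above can be used to name the core types. The following is a rough sketch, not the method used by the benchmark programs. The lookup table is an assumption based on commonly published part numbers and should be verified against vendor documentation; identify_cores is a hypothetical helper.

```python
import re

# Assumed (implementer, part) to core name mapping; not an authoritative
# or complete list, check against vendor documentation before relying on it.
CORE_NAMES = {
    (0x41, 0xD05): "Cortex-A55",
    (0x41, 0xD0A): "Cortex-A75",
    (0x41, 0xD0B): "Cortex-A76",
    (0x41, 0xD46): "Cortex-A510",
    (0x41, 0xD47): "Cortex-A710",
    (0x41, 0xD48): "Cortex-X2",
    (0x51, 0x804): "Kryo 4xx Gold (Cortex-A76 based)",
    (0x51, 0x805): "Kryo 4xx Silver (Cortex-A55 based)",
}


def identify_cores(cpuinfo_text):
    """Return a core name for each implementer/part pair, in listing order."""
    implementers = re.findall(
        r"CPU implementer\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text
    )
    parts = re.findall(r"CPU part\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text)
    names = []
    for imp, part in zip(implementers, parts):
        key = (int(imp, 16), int(part, 16))
        names.append(CORE_NAMES.get(key, f"unknown {imp}/{part}"))
    return names


# The two System 4 entries shown above.
sample = """CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd48
CPU revision : 0
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd47
CPU revision : 0
"""
print(identify_cores(sample))
```

With the assumed table, the two System 4 entries resolve to Cortex-X2 and Cortex-A710, matching the CPU-Z description.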
Maximum CPU Speed Summary
System 1 - 2 x Cortex-A76 at 2.05 GHz, 6 x Cortex-A55 at 2.00 GHz
System 2 - 2 x Cortex-A76 at 2.00 GHz, 6 x Cortex-A55 at 1.80 GHz
System 3 - 2 x Cortex-A75 at 2.00 GHz, 6 x Cortex-A55 at 2.00 GHz
System 4 - 1 x Cortex-X2 at 2.80 GHz, 3 x Cortex-A710 at 2.52 GHz, 4 x Cortex-A510 at 1.82 GHz
The following single threaded CPU benchmarks are expected to run on the fastest CPU core. The same applies to the MP multithreading
programs when run using a single thread.
Whetstone Benchmark - NativeWhetstone2.apk
This benchmark carries out both single precision floating point and integer calculations, the overall MWIPS rating being mainly dependent on
the former. Systems 1 and 2 are shown to provide the same performance characteristics, with the former slightly faster, as expected with the
increase in CPU MHz. System 3 was somewhat slower, due to the older CPU technology. All produced the same numeric results. For
System 4, the simple calculations used are completely unsuitable for vector processing, relative performance mainly being proportional to
CPU MHz, in the case shown 2.8 GHz versus 2.0. The overall gain of 1.5 was influenced by tests using functions such as COS and EXP.
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel Native Whetstone Benchmark 4A8 04-Feb-2023 13.11
Compiled for 64 bit ARM v8a
Test          MFLOPS      MOPS  millisecs        Results
N1 float     1087.84                0.018   -1.124750137
N2 float      846.07                0.159   -1.131330490
N3 if                  3066.65      0.034    1.000000000
N4 fixpt               5109.38      0.062   12.000000000
N5 cos                  147.35      0.565    0.499109805
N6 float      816.02                0.661    0.999999821
N7 equal               2043.99      0.090    3.000000000
N8 exp                   76.12      0.489    0.935364604
MWIPS        4815.37                2.077
Total Elapsed Time 18.3 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel Native Whetstone Benchmark 4A8 05-Feb-2023 10.04
Test          MFLOPS      MOPS  millisecs        Results
N1 float     1068.88                0.018   -1.124750137
N2 float      886.76                0.152   -1.131330490
N3 if                  2991.53      0.035    1.000000000
N4 fixpt               5013.41      0.063   12.000000000
N5 cos                  141.39      0.588    0.499109805
N6 float      801.74                0.673    0.999999821
N7 equal               2004.78      0.092    3.000000000
N8 exp                   70.97      0.524    0.935364604
MWIPS        4663.10                2.144
Total Elapsed Time 16.2 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel Native Whetstone Benchmark 4A8 04-Feb-2023 15.31
Test          MFLOPS      MOPS  millisecs        Results
N1 float      819.73                0.023   -1.124750137
N2 float      665.33                0.202   -1.131330490
N3 if                  2997.37      0.035    1.000000000
N4 fixpt               3331.87      0.095   12.000000000
N5 cos                  130.91      0.636    0.499109805
N6 float      666.54                0.809    0.999999821
N7 equal               1332.93      0.139    3.000000000
N8 exp                   63.31      0.588    0.935364604
MWIPS        3959.52                2.526
Total Elapsed Time 15.6 seconds
System 4 Android 13 1x 2.80 GHz Cortex-X2
ARM/Intel Native Whetstone Benchmark 4A8 20-Apr-2023 20.18
Test          MFLOPS      MOPS  millisecs        Results  System 4/System 2
N1 float     1491.65                0.013   -1.124750137               1.40
N2 float     1231.55                0.109   -1.131330490               1.39
N3 if                  3598.79      0.029    1.000000000               1.20
N4 fixpt               6992.04      0.045   12.000000000               1.39
N5 cos                  246.11      0.338    0.499109805               1.74
N6 float     1118.73                0.482    0.999999821               1.40
N7 equal               2796.29      0.066    3.000000000               1.39
N8 exp                  106.54      0.349    0.935364604               1.50
MWIPS        6986.66                1.431                              1.50
Total Elapsed Time 16.4 seconds
Dhrystone Benchmark - Dhrystone2i.apk
The Dhrystone integer benchmark produces a performance rating in VAX MIPS (AKA DMIPS). Results from two runs are provided, for the first
3 systems, to demonstrate variance in measured MIPS speeds. These are generally in line with performance expectations, but a single run
can provide a false impression. The program checks for correct numeric results.
With this benchmark often being used to identify performance of ARM CPUs, more hardware tweaks may have been added to increase the
rating to 12 MIPS per MHz on System 4, against 6 to 7 MIPS per MHz for the others shown here. The program does not appear to be suitable
for vector operation. In 2015, ratings were around 2 MIPS/MHz with my 64 bit program and 4 on high end Intel CPUs.
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 14.25
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 40
Dhrystones per Second 24826887
VAX MIPS rating 14130
ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 16.32
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 40
Dhrystones per Second 24821062
VAX MIPS rating 14127
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 14.35
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 40
Dhrystones per Second 24750676
VAX MIPS rating 14087
ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 16.38
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 40
Dhrystones per Second 24841761
VAX MIPS rating 14139
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 14.20
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 47
Dhrystones per Second 21287928
VAX MIPS rating 12116
ARM/Intel Dhrystone 2 Benchmark 4A8 05-Feb-2023 16.22
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 47
Dhrystones per Second 21373535
VAX MIPS rating 12165
System 4 Android 13 1x 2.80 GHz Cortex-X2
ARM/Intel Dhrystone 2 Benchmark 4A8 20-Apr-2023 20.26
Compiled for 64 bit ARM v8a
Nanoseconds one Dhrystone run 17 System 4/System 2
Dhrystones per Second 59677446
VAX MIPS rating 33966 2.40
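The ratings above are consistent with the conventional Dhrystone calculation, where Dhrystones per second are divided by 1757, the score of the VAX 11/780 reference machine. A minimal sketch follows, using first-run figures from the results above; the function names are assumptions for illustration.

```python
# Conventional reference: the VAX 11/780 (1 MIPS) scored 1757 Dhrystones/second.
VAX_11_780_DHRYSTONES_PER_SECOND = 1757

# (Dhrystones per second, CPU MHz) copied from the results above.
results = {
    "System 1 Cortex-A76": (24826887, 2050),
    "System 2 Cortex-A76": (24750676, 2000),
    "System 3 Cortex-A75": (21287928, 2000),
    "System 4 Cortex-X2": (59677446, 2800),
}


def vax_mips(dhrystones_per_second):
    """Convert Dhrystones per second to a VAX MIPS (DMIPS) rating."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def mips_per_mhz(dhrystones_per_second, cpu_mhz):
    """Return the VAX MIPS rating normalised by CPU clock speed."""
    return vax_mips(dhrystones_per_second) / cpu_mhz


for name, (dps, mhz) in results.items():
    print(f"{name:20s} {vax_mips(dps):8.0f} MIPS {mips_per_mhz(dps, mhz):6.2f} MIPS/MHz")
```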
Linpack Tests - LinpackDP2.apk, LinpackSP2.apk, NEON-Linpacki.apk
The Linpack benchmark speed is measured in MFLOPS. Three versions are provided, the original using double precision floating point
calculations, then one with single precision arithmetic, with the third via NEON SIMD single precision intrinsic functions. Results for this
benchmark code should not be compared with those from the High Performance Linpack (HPL) benchmark. Again, the first two systems produced
similar performance, with the third much slower. Single precision calculations were somewhat faster than those using double precision,
producing different numeric sumchecks, yet consistent across all platforms. NEON functions led to at least a doubling of measured MFLOPS,
with the same single precision sumchecks.
System 4 - This is the first indication of possible heating issues, when running in the preferred power on mode, where the third test
appeared to be slower than expected. Note that a number of other benchmarks were run between the last two tests, also indicating slow
performance. This benchmark can be compiled to use vector processing, but is limited to two floating point operations per word, similar to
MemSpeed and part of the MFLOPS benchmarks.
The later System 4 gains over System 2 were all greater than twice, with the NEON test achieving nearly 9.5 GFLOPS or 3.38 MFLOPS per
MHz.
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark
4A8 06-Feb-2023 12.19 4A8 06-Feb-2023 12.20 4A8 06-Feb-2023 13.38
Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a
Speed 2047.81 MFLOPS Speed 2186.84 MFLOPS Speed 4705.52 MFLOPS
norm. resid 1.7 norm. resid 1.6 norm. resid 1.6
resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05
machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07
x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05
x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark
4A8 06-Feb-2023 14.59 4A8 06-Feb-2023 15.11 4A8 06-Feb-2023 15.13
Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a
Speed 2027.77 MFLOPS Speed 2150.02 MFLOPS Speed 4614.88 MFLOPS
norm. resid 1.7 norm. resid 1.6 norm. resid 1.6
resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05
machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07
x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05
x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark
4A8 06-Feb-2023 15.44 4A8 06-Feb-2023 15.45 4A8 06-Feb-2023 15.47
Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a
Speed 1474.16 MFLOPS Speed 1664.41 MFLOPS Speed 3294.97 MFLOPS
norm. resid 1.7 norm. resid 1.6 norm. resid 1.6
resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05
machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07
x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05
x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06
System 4 Android 13 1x 2.80 GHz Cortex-X2 Power then Battery
ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark
4A8 20-Apr-2023 20.28 4A8 20-Apr-2023 20.30 4A8 20-Apr-2023 20.45 ##
Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a
Speed 4834.32 MFLOPS Speed 4965.85 MFLOPS Speed 6246.93 MFLOPS
norm. resid 1.7 norm. resid 1.6 norm. resid 1.6
resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05
machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07
x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05
x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06
After Memory Benchmarks ##
System 4/System 2 MFLOPS 2.38 2.36 SLOW 1.35
2 ARM/Intel DP Linpack Benchmark ARM/Intel SP Linpack Benchmark ARM NEON Linpack Benchmark
4A8 23-Apr-2023 14.23 4A8 23-Apr-2023 14.21 4A8 23-Apr-2023 14.19
Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a Compiled for 64 bit ARM v8a
Speed 4826.04 MFLOPS Speed 5083.03 MFLOPS Speed 9466.57 MFLOPS
norm. resid 1.7 norm. resid 1.6 norm. resid 1.6
resid 7.41628980e-14 resid 3.80277634e-05 resid 3.80277634e-05
machep 2.22044605e-16 machep 1.19209290e-07 machep 1.19209290e-07
x[0]-1 -1.49880108e-14 x[0]-1 -1.38282776e-05 x[0]-1 -1.38282776e-05
x[n-1]-1 -1.89848137e-14 x[n-1]-1 -7.51018524e-06 x[n-1]-1 -7.51018524e-06
System 4/System 2 MFLOPS 2.38 2.36 2.05
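The machep values in the results above are the machine epsilon for IEEE 754 double and single precision arithmetic, which is why they are identical on every system. The sketch below reproduces them in Python. It is an illustration only, not the Linpack code, and the helper names are hypothetical.

```python
import struct
import sys


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def machine_epsilon(round_fn=float):
    """Halve eps until adding eps/2 to 1.0 no longer changes the result."""
    eps = 1.0
    while round_fn(1.0 + eps / 2.0) != 1.0:
        eps /= 2.0
    return eps


dp_eps = machine_epsilon()            # double precision
sp_eps = machine_epsilon(to_float32)  # single precision, emulated via struct

print(f"machep DP {dp_eps:.8e}")
print(f"machep SP {sp_eps:.8e}")
```

The single precision value relies on struct rounding a double to the nearest float, so it emulates rather than executes single precision arithmetic.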
Livermore Loops Benchmark - LivermoreLoops2.apk
The Livermore Loops comprise 24 kernels of numerical applications with speeds calculated in MFLOPS (double precision). A summary is also
produced, with maximum, minimum and various mean values, geometric mean being the official average. They are repeated three times at
different array dimension spans.
Below are MFLOPS scores for the 24 kernels, at one data span, and overall ratings of Maximum, Average, Geometric mean, Harmonic mean
and Minimum MFLOPS. Again, the slightly faster CPU MHz of System 1 gave it a lead over System 2, with System 3 far behind. Results are also
provided using System 3 for a second power on run and on battery at 45% charge, all indicating the same performance.
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel Livermore Loops Benchmark 4A8 06-Feb-2023 12.22
Compiled for 64 bit ARM v8a
MFLOPS for 24 loops Do Span 471
2603.8 1889.6 1644.0 1670.3 790.6 1433.2
2606.3 3006.5 2780.7 1905.8 941.0 2110.0
524.5 756.1 1414.9 1560.5 1533.0 2645.4
715.3 1930.0 1766.2 1300.3 1554.1 672.2
Overall Weighted MFLOPS Do Spans 471, 90, 19
Maximum Average Geomean Harmean Minimum
3007.5 1651.3 1495.8 1335.2 524.5
Results of last two calculations
4.850340602749970e+02 1.300000000000000e+01
Total Elapsed Time 8.8 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel Livermore Loops Benchmark 4A8 06-Feb-2023 15.15
Compiled for 64 bit ARM v8a
MFLOPS for 24 loops Do Span 471
2558.0 1853.8 1592.7 1636.9 774.7 1402.6
2553.6 2942.2 2730.5 1869.7 968.9 2086.2
516.0 745.9 1362.3 1525.4 1508.5 2594.3
700.3 1894.8 1736.3 1221.3 1521.9 658.0
Overall Weighted MFLOPS Do Spans 471, 90, 19
Maximum Average Geomean Harmean Minimum
2942.2 1619.1 1466.2 1308.7 516.0
Results of last two calculations
4.850340602749970e+02 1.300000000000000e+01
Total Elapsed Time 8.8 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel Livermore Loops Benchmark 4A8 06-Feb-2023 15.48
Compiled for 64 bit ARM v8a
MFLOPS for 24 loops Do Span 471
2138.1 1346.2 1329.3 1308.0 668.8 929.1
2183.1 2718.9 2443.1 1380.8 667.8 1375.9
410.7 534.2 961.6 1003.3 1241.0 1755.8
429.5 1328.1 1256.7 958.1 1234.5 440.7
Overall Weighted MFLOPS Do Spans 471, 90, 19
Maximum Average Geomean Harmean Minimum
2718.9 1258.8 1111.2 964.9 371.3
Results of last two calculations
4.850340602749970e+02 1.300000000000000e+01
Total Elapsed Time 9.0 seconds
System 3 Rerun
2137.7 1344.8 1329.4 1307.4 668.3 934.9
2182.5 2719.7 2443.9 1379.3 668.5 1376.4
412.5 533.2 961.2 1012.2 1241.8 1755.9
429.6 1328.9 1255.6 958.0 1234.6 440.7
System 3 Battery 45%
2137.8 1338.8 1329.3 1307.8 668.6 920.2
2181.5 2717.3 2443.8 1380.2 668.5 1380.1
413.1 535.0 961.2 1010.0 1235.2 1756.1
429.7 1328.5 1256.2 957.9 1233.8 440.7
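The five summary statistics can be illustrated with the System 1 loop speeds. This sketch covers a single data span with equal weights, so its output is not expected to match the Overall Weighted MFLOPS lines, which combine three spans. The summarise function is hypothetical.

```python
import math

# MFLOPS for the 24 loops, System 1, Do Span 471, copied from the results above.
system1 = [
    2603.8, 1889.6, 1644.0, 1670.3, 790.6, 1433.2,
    2606.3, 3006.5, 2780.7, 1905.8, 941.0, 2110.0,
    524.5, 756.1, 1414.9, 1560.5, 1533.0, 2645.4,
    715.3, 1930.0, 1766.2, 1300.3, 1554.1, 672.2,
]


def summarise(mflops):
    """Return unweighted summary statistics for a list of loop speeds."""
    n = len(mflops)
    return {
        "maximum": max(mflops),
        "average": sum(mflops) / n,
        "geomean": math.exp(sum(math.log(v) for v in mflops) / n),
        "harmean": n / sum(1.0 / v for v in mflops),
        "minimum": min(mflops),
    }


for label, value in summarise(system1).items():
    print(f"{label:8s} {value:8.1f}")
```

The harmonic mean is never above the geometric mean, which is never above the average, so slow loops pull the first two down most.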
Livermore Loops Armv9 CPU Phone
This benchmark was first run before temperature increases led to noticeable performance deterioration. Then, the maximum speed of an
individual loop was nearly 7 GFLOPS, 2.33 times faster than the older System 2, but with a maximum gain of 3.63 times on another loop.
Rerunning the benchmark on battery (cooler) indicated slightly faster performance.
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1 On Power
ARM/Intel Livermore Loops Benchmark 4A8 20-Apr-2023 20.32
Compiled for 64 bit ARM v8a
MFLOPS for 24 loops Do Span 471
6669.7 4873.3 2659.1 3066.6 1131.5 2339.6
6444.7 6866.3 6740.1 4898.4 1372.2 6161.1
1871.2 1695.0 3828.5 3432.4 2452.0 6094.3
927.5 2690.4 2831.2 3429.4 2301.9 1363.1
Overall Weighted MFLOPS Do Spans 471, 90, 19
Maximum Average Geomean Harmean Minimum
6867.1 3634.5 3136.9 2652.1 927.5
Results of last two calculations
4.850340602749970e+02 1.300000000000000e+01
Total Elapsed Time 9.6 seconds
Test 2 On Battery
ARM/Intel Livermore Loops Benchmark 4A8 30-Apr-2023 13.40
Compiled for 64 bit ARM v8a
MFLOPS for 24 loops Do Span 471
6827.0 4835.3 2747.5 3172.3 1136.1 2343.6
6520.9 6984.3 6718.6 4888.5 1375.9 6192.2
1928.1 1750.2 3963.8 3588.4 2550.2 6333.6
962.4 2699.2 2932.5 3547.5 2304.3 1361.5
Overall Weighted MFLOPS Do Spans 471, 90, 19
Maximum Average Geomean Harmean Minimum
7032.9 3662.7 3158.9 2667.6 929.0
Results of last two calculations
4.850340602749970e+02 1.300000000000000e+01
Total Elapsed Time 9.3 seconds
Test 1/System 2
MFLOPS for 24 loops Do Span 471
2.61 2.63 1.67 1.87 1.46 1.67
2.52 2.33 2.47 2.62 1.42 2.95
3.63 2.27 2.81 2.25 1.63 2.35
1.32 1.42 1.63 2.81 1.51 2.07
Maximum Average Geomean Harmean Minimum
2.33 2.24 2.14 2.03 1.80
Test 2/System 2
MFLOPS for 24 loops Do Span 471
2.67 2.61 1.73 1.94 1.47 1.67
2.55 2.37 2.46 2.61 1.42 2.97
3.74 2.35 2.91 2.35 1.69 2.44
1.37 1.42 1.69 2.90 1.51 2.07
Maximum Average Geomean Harmean Minimum
2.39 2.26 2.15 2.04 1.80
MemSpeed Benchmark - MemSpeedi.apk
This benchmark measures data reading speeds in MegaBytes per second, carrying out calculations on arrays of cache and RAM data, sized 2
x 8 KB to 2 x 32 MB. Calculations are x[m]=x[m]+s*y[m] and x[m]=x[m]+y[m], using double and single precision (DP and SP) floating point,
and x[m]=x[m]+s+y[m] and x[m]=x[m]+y[m] with integers. Millions of Floating Point Operations Per Second (MFLOPS) can be calculated
by dividing DP MB/second by 8 and 16, for the two tests, and SP speeds by 4 and 8.
The results clearly demonstrate differences in CPU, RAM and cache speeds, in double and single precision floating point performance
and in cache sizes, indicating the invalidity of an overall single number rating.
Calculated single precision MFLOPS greater than the CPU MHz, or double precision MFLOPS greater than half that rate, indicate that SIMD
instructions are being executed. For some reason, the older technology Cortex-A75 was best on L1 cache based double precision MFLOPS.
This and later benchmarks demonstrate that System 3 RAM speeds are much slower than those for the other two.
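The MB/second to MFLOPS conversion described above can be sketched as follows. The divisors of 8, 16, 4 and 8 are those given in the text; the explanation coded here (both x[m] and y[m] are read, with two floating point operations for x+s*y and one for x+y) is an assumption made for illustration, and the names used are hypothetical.

```python
# Bytes read per array element: x[m] and y[m] are both read (assumed).
BYTES_PER_ELEMENT = {"DP": 2 * 8, "SP": 2 * 4}
# Floating point operations per element for the two MemSpeed calculations.
FLOPS_PER_ELEMENT = {"x+s*y": 2, "x+y": 1}


def mflops(mb_per_second, precision, calculation):
    """Convert a MemSpeed MB/second figure to MFLOPS."""
    # Divisor works out as 8 or 16 for DP and 4 or 8 for SP, as in the text.
    divisor = BYTES_PER_ELEMENT[precision] / FLOPS_PER_ELEMENT[calculation]
    return mb_per_second / divisor


if __name__ == "__main__":
    # System 1 best speeds for x[m]=x[m]+s*y[m], from the table below.
    print(round(mflops(14377, "DP", "x+s*y")))
    print(round(mflops(12749, "SP", "x+s*y")))
```

Applied to the best System 1 speeds of 14377 (DP) and 12749 (SP) MB/second, this gives 1797 and 3187, agreeing with the Max MFLOPS line reported for that system.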
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel MemSpeed Benchmark 4A8 07-Feb-2023 10.21
Compiled for 64 bit ARM v8a
Reading Speed in MBytes/Second
Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m]
KBytes Dble Sngl Int Dble Sngl Int
16 14368 12749 13579 25806 13430 13114 L1
32 14377 12612 13629 25300 13078 12931
64 14315 12442 13534 26042 12740 12967
128 13677 12190 13147 21466 12434 12616 L2
256 13537 12097 13036 21231 12311 12491
512 13432 12018 12831 20618 12261 12454
1024 13230 11924 12791 18379 12173 12401 L3
4096 11013 10328 10937 10390 10612 10386
16384 9371 9342 9406 8997 9282 9084 RAM
65536 8799 8846 8878 8636 8801 8665
Max MFLOPS 1797 3187
Total Elapsed Time 12.2 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel MemSpeed Benchmark 4A8 07-Feb-2023 10.26
Compiled for 64 bit ARM v8a
Reading Speed in MBytes/Second
Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m]
KBytes Dble Sngl Int Dble Sngl Int
16 14059 12474 13286 26090 13109 12806 L1
32 14045 12320 13326 26087 13023 12843
64 14061 12187 13323 25871 12544 12729
128 13455 11979 12990 21318 12189 12418 L2
256 13100 11827 12715 20903 12119 12290
512 13309 11892 12791 21008 12129 12291
1024 13295 11932 12788 21078 11992 12281 L3 2 MB
4096 9419 9354 9522 8907 9251 6848 RAM
16384 7912 7797 7883 6614 7549 7320
65536 7722 7788 7530 7333 7467 7255
Max MFLOPS 1757 3119
Total Elapsed Time 11.8 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel MemSpeed Benchmark 4A8 07-Feb-2023 21.49
Compiled for 64 bit ARM v8a
Reading Speed in MBytes/Second
Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m]
KBytes Dble Sngl Int Dble Sngl Int
16 19342 12941 14154 18768 10836 10799 L1
32 19432 12942 14187 18798 10784 10970
64 19430 12940 14184 18651 10803 10971
128 9987 9084 9830 10006 9040 9114 L2
256 10341 9551 10274 10461 10125 10120
512 10239 9563 10283 10398 10030 10021
1024 9249 8657 9109 9267 8923 8959 L3
4096 4942 4881 4926 4879 4917 4888 RAM
16384 4577 4511 4565 4522 4532 4542
65536 4408 4509 4523 4527 4512 4510
Max MFLOPS 2429 3236
Total Elapsed Time 10.1 seconds
Continued Below
MemSpeed Armv9 CPU Phone
This was the first benchmark, run on 20th April, where some performance ratios were lower than those based on the CPU MHz difference, the
penalty imposed by running continuously using the power connection. Later, running on battery, performance gains were between 1.49 and
4.22, the larger L3 cache being responsible for the latter. This time, a repeat of Test 2 obtained the same performance on power (details not
shown here).
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1 On Power
ARM/Intel MemSpeed Benchmark 4A8 20-Apr-2023 20.40
Compiled for 64 bit ARM v8a
Reading Speed in MBytes/Second
Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m]
KBytes Dble Sngl Int Dble Sngl Int
16 18273 16318 13593 34975 21486 21577 L1
32 15278 13607 13606 34968 21565 21690
64 15230 13584 13562 34953 21214 21543
128 15301 13604 13578 34717 21359 21555 L2
256 15244 13599 13599 34859 21152 21389
512 15311 13611 13610 34911 21257 21269
1024 15236 13590 13529 34630 21168 21299
4096 15269 13588 13570 34599 21601 21495 L3
16384 15075 13472 13449 21727 18962 19053 RAM
65536 13210 13468 13460 18029 16851 14148
Max MFLOPS 2284 4080
Total Elapsed Time 11.3 seconds
Test 2 On Battery
ARM/Intel MemSpeed Benchmark 4A8 23-Apr-2023 13.52
Compiled for 64 bit ARM v8a
Reading Speed in MBytes/Second
Memory x[m]=x[m]+s*y[m] Int+ x[m]=x[m]+y[m]
KBytes Dble Sngl Int Dble Sngl Int
16 22292 19857 19860 51064 31512 31522 L1
32 22342 19872 19842 51115 31999 32111
64 22229 19706 19782 51115 31400 31663
128 22300 19864 19858 50730 31237 31454 L2
256 22298 19875 19844 50906 31585 31959
512 22265 19873 19859 50290 30853 31149
1024 22346 19865 19872 49249 29985 30510
4096 21319 18952 19300 43691 28347 28916 L3
16384 19239 17066 15105 19805 19700 20244 RAM
65536 16165 15122 15114 17565 17043 17009
Max MFLOPS 2793 4968
Total Elapsed Time 10.4 seconds
Test 1/System 2
KBytes Dble Sngl Int Dble Sngl Int
16 1.30 1.31 1.02 1.34 1.64 1.68
32 1.09 1.10 1.02 1.34 1.66 1.69
64 1.08 1.11 1.02 1.35 1.69 1.69
128 1.14 1.14 1.05 1.63 1.75 1.74
256 1.16 1.15 1.07 1.67 1.75 1.74
512 1.15 1.14 1.06 1.66 1.75 1.73
1024 1.15 1.14 1.06 1.64 1.77 1.73
4096 1.62 1.45 1.43 3.88 2.33 3.14
16384 1.91 1.73 1.71 3.29 2.51 2.60
65536 1.71 1.73 1.79 2.46 2.26 1.95
Test 2/System 2
KBytes Dble Sngl Int Dble Sngl Int
16 1.59 1.59 1.49 1.96 2.40 2.46
32 1.59 1.61 1.49 1.96 2.46 2.50
64 1.58 1.62 1.48 1.98 2.50 2.49
128 1.66 1.66 1.53 2.38 2.56 2.53
256 1.70 1.68 1.56 2.44 2.61 2.60
512 1.67 1.67 1.55 2.39 2.54 2.53
1024 1.68 1.66 1.55 2.34 2.50 2.48
4096 2.26 2.03 2.03 4.91 3.06 4.22 L3 vs RAM
16384 2.43 2.19 1.92 2.99 2.61 2.77
65536 2.09 1.94 2.01 2.40 2.28 2.34
NeonSpeed Benchmark - NeonSpeedi.apk
This benchmark carries out the same calculations as the MemSpeed Benchmark, except that they are all in single precision, as applicable to
NEON calculations. The latter are carried out using NEON intrinsic functions. Using these SIMD instructions, four results per clock cycle are
possible, or 8 GFLOPS at 2 GHz, rising to 16 GFLOPS with fused multiply and add instructions, as possible with the first two columns. Here we
have a maximum of nearly 10 GFLOPS. But more than 12 GFLOPS are demonstrated later under the MP-MFLOPS Benchmark, with compiled
code using a single CPU core.
NEON integer operations per second were slightly higher than those for floating point, where integer instructions per second would be
somewhat higher, due to the inclusion of load, store and branching instructions.
With NEON operation, the much slower performance of the older System 3 processor is clearly shown.
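The peak rates quoted above follow from simple arithmetic, sketched here. It assumes one 128 bit NEON instruction completing per clock cycle, with four single precision lanes, and counts a fused multiply and add as two operations per lane; the function name and parameters are hypothetical.

```python
def peak_gflops(clock_ghz, lanes=4, fused_multiply_add=False):
    """Theoretical peak GFLOPS for one SIMD instruction per clock cycle."""
    # A fused multiply and add counts as two floating point operations.
    ops_per_lane = 2 if fused_multiply_add else 1
    return clock_ghz * lanes * ops_per_lane


if __name__ == "__main__":
    print(peak_gflops(2.0))                           # add only
    print(peak_gflops(2.0, fused_multiply_add=True))  # multiply and add
```

At 2 GHz this gives the 8 and 16 GFLOPS figures from the text. Processors that can issue more than one SIMD instruction per cycle would exceed these values, so the result is a guide rather than a hard limit.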
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM NeonSpeed Benchmark 4A8 08-Feb-2023 10.50
Compiled for 64 bit ARM v8a
Vector Reading Speed in MBytes/Second
Memory Float v=v+s*v Int v=v+v+s Neon v=v+v
KBytes Norm Neon Norm Neon Float Int
16 13068 39594 13739 43318 54907 54817 L1
32 13074 39493 13764 43255 46180 45660
64 13065 39273 13749 43106 45044 43823
128 12888 28829 13632 29341 29244 29271 L2
256 12647 26631 13425 26850 26852 26837
512 12629 22447 13434 22401 22417 22393
1024 12465 18418 13194 18358 18375 18341 L3
4096 11104 10324 11518 10239 9853 10056
16384 9022 8691 9324 8638 8589 8648 RAM
65536 8898 8365 8936 8322 8374 8312
Max MFLOPS 3269 9899
Total Elapsed Time 11.0 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM NeonSpeed Benchmark 4A8 08-Feb-2023 11.26
Compiled for 64 bit ARM v8a
Vector Reading Speed in MBytes/Second
Memory Float v=v+s*v Int v=v+v+s Neon v=v+v
KBytes Norm Neon Norm Neon Float Int
16 12829 38832 13490 42520 53871 53927 L1
32 12827 38786 13499 42635 53916 53880
64 12804 38518 13479 42122 43667 43600
128 12599 28491 13330 28704 28805 28773 L2
256 12488 27960 13172 28234 28509 28465
512 12547 27304 13238 27373 27753 27759
1024 12499 23922 13222 24250 24376 25347 L3
4096 9494 8896 10109 9242 9403 9242 RAM
16384 7968 7476 8194 7719 7735 7642
65536 7892 7274 7914 6716 7229 7226
Max MFLOPS 3207 9708
Total Elapsed Time 10.6 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM NeonSpeed Benchmark 4A8 08-Feb-2023 12.22
Compiled for 64 bit ARM v8a
Vector Reading Speed in MBytes/Second
Memory Float v=v+s*v Int v=v+v+s Neon v=v+v
KBytes Norm Neon Norm Neon Float Int
16 12933 21026 14176 21588 20680 20761 L1
32 12685 20668 13506 21296 20824 20824
64 12540 20612 13405 21227 20822 20844
128 9358 10086 10182 10055 10007 10016 L2
256 9843 10438 10550 10388 10379 10383
512 9827 10359 10414 10335 10270 10324
1024 8380 8886 8706 8902 8986 9011 L3
4096 4467 4561 4363 4576 4591 4596 RAM
16384 4656 4736 4674 4613 4741 4759
65536 4387 4601 4514 4588 4588 4588
Max MFLOPS 3233 5257
Total Elapsed Time 10.3 seconds
Continued Below
NeonSpeed Armv9 CPU Phone
The first on power test was run after MemSpeed and other memory benchmarks, with some integer results slower than those from the older
System 2. Rerunning on battery provided gains similar to MemSpeed, with integer performance gains around the MHz ratio but L3 cache
floating point improvements of up to 6.31 times. Comparing these two results, as shown below, performance was reduced by up to
32% during the hotter On Power run (Battery 47% faster).
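The two percentages quoted describe the same difference from opposite directions, which can be confusing. A minimal sketch of the conversion, using hypothetical function names:

```python
def slowdown_from_speedup(speedup):
    """Fractional speed reduction of the slower run, given fast/slow ratio."""
    return 1.0 - 1.0 / speedup


def speedup_from_slowdown(slowdown):
    """Fast/slow speed ratio, given the fractional reduction of the slower run."""
    return 1.0 / (1.0 - slowdown)


if __name__ == "__main__":
    print(f"{slowdown_from_speedup(1.47):.0%}")  # On Power relative to Battery
    print(f"{speedup_from_slowdown(0.32):.2f}")  # Battery relative to On Power
```

A Battery/Power ratio of 1.47 corresponds to the On Power run being about 32% slower, and a 32% reduction corresponds to a ratio of about 1.47.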
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1 On Power
ARM NeonSpeed Benchmark 4A8 20-Apr-2023 20.47
Compiled for 64 bit ARM v8a
Vector Reading Speed in MBytes/Second
Memory Float v=v+s*v Int v=v+v+s Neon v=v+v
KBytes Norm Neon Norm Neon Float Int
16 16313 85505 13260 72262 73469 73365 L1
32 13567 71006 12891 72240 73527 73528
64 13591 61035 12889 65553 62412 60633
128 13599 45930 12889 45743 45572 45718 L2
256 13606 46165 12891 46201 46187 46215
512 13595 45389 12878 45385 45550 45544
1024 13603 45930 12886 45922 45797 45865
4096 13595 38351 12878 38425 38827 38993 L3
16384 13482 22725 12767 22666 22942 22846 RAM
65536 13367 15431 12790 17360 18269 18185
Max MFLOPS 4078 21376
Total Elapsed Time 10.3 seconds
Test 2 on Battery
ARM NeonSpeed Benchmark 4A8 23-Apr-2023 13.55
Compiled for 64 bit ARM v8a
Vector Reading Speed in MBytes/Second
Memory Float v=v+s*v Int v=v+v+s Neon v=v+v
KBytes Norm Neon Norm Neon Float Int
16 19862 102403 18573 102534 103684 103639 L1
32 19381 100863 18167 101897 103666 103409
64 19051 85761 18163 91701 85459 88190
128 19187 64767 18183 64770 64783 64820 L2
256 19199 64334 18184 65047 65140 65178
512 19185 63656 18192 64717 65401 65100
1024 19181 62057 18172 63202 62816 62338
4096 19153 56099 18067 56160 56082 55613 L3
16384 17795 24262 16849 24127 24352 23700 RAM
65536 15837 18834 15683 18968 19080 19083
Max MFLOPS 4966 25601
Total Elapsed Time 10.4 seconds
Test 1/System 2
KBytes Norm Neon Norm Neon Float Int
16 1.27 2.20 0.98 1.70 1.36 1.36
32 1.06 1.83 0.95 1.69 1.36 1.36
64 1.06 1.58 0.96 1.56 1.43 1.39
128 1.08 1.61 0.97 1.59 1.58 1.59
256 1.09 1.65 0.98 1.64 1.62 1.62
512 1.08 1.66 0.97 1.66 1.64 1.64
1024 1.09 1.92 0.97 1.89 1.88 1.81
4096 1.43 4.31 1.27 4.16 4.13 4.22
16384 1.69 3.04 1.56 2.94 2.97 2.99
65536 1.69 2.12 1.62 2.58 2.53 2.52
Test2/System 2
KBytes Norm Neon Norm Neon Float Int
16 1.55 2.64 1.38 2.41 1.92 1.92
32 1.51 2.60 1.35 2.39 1.92 1.92
64 1.49 2.23 1.35 2.18 1.96 2.02
128 1.52 2.27 1.36 2.26 2.25 2.25
256 1.54 2.30 1.38 2.30 2.28 2.29
512 1.53 2.33 1.37 2.36 2.36 2.35
1024 1.53 2.59 1.37 2.61 2.58 2.46
4096 2.02 6.31 1.79 6.08 5.96 6.02 L3 vs RAM
16384 2.23 3.25 2.06 3.13 3.15 3.10
65536 2.01 2.59 1.98 2.82 2.64 2.64
Battery/Power Best Case
512 1.41 1.40 1.41 1.43 1.44 1.43
1024 1.41 1.35 1.41 1.38 1.37 1.36
4096 1.41 1.46 1.40 1.46 1.44 1.43
BusSpeed Benchmark - BusSpeedv7i.apk
This benchmark is designed to identify reading data in bursts over buses. The program starts by reading a word (4 bytes) with an address
increment of 32 words (128 bytes) before reading another word. The increment is reduced by half on successive tests, until all data is read.
On reading data from RAM, 64 Byte bursts are typically used. Then, measured reading speed reduces from a maximum, when all data is read,
to a minimum on using 16 word increments (64 bytes). Potential maximum bus speed can be estimated by multiplying the Inc16 value by 16.
Then, for each half reduction in increments, a near doubling of MB/second could be expected. Burst reading is also indicated on some cache
based data transfers.
The near constant Read All performance indicates a CPU speed limitation, influenced by the calculations involved, where RAM Inc2 to Read
All data transfer speeds do not approach doubling on Systems 1 and 2. This effect also disguises System 3’s slower RAM.
See MP-BusSpeed results, indicating that access by multiple cores is necessary to obtain maximum memory throughput, where adequate CPU
performance is provided.
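The increment sequence and the bus speed estimate described above can be sketched as follows. The 64 byte burst size and the multiplication of the Inc16 value by 16 come from the text; the function names are hypothetical and the code is not taken from the benchmark itself.

```python
WORD_BYTES = 4  # the benchmark reads 4 byte words


def increments():
    """Yield address increments in words: 32, 16, 8, 4, 2, then 1 (Read All)."""
    inc = 32
    while inc >= 1:
        yield inc
        inc //= 2


def estimated_bus_mb_per_second(inc16_mb_per_second, burst_bytes=64):
    """Estimate potential bus speed from the Inc16 result.

    With a 16 word increment only one word of each 64 byte burst is used,
    so the data actually moved is 16 times the measured figure.
    """
    words_per_burst = burst_bytes // WORD_BYTES
    return inc16_mb_per_second * words_per_burst


if __name__ == "__main__":
    print(list(increments()))
    # System 1, 65536 KB, Inc16 result from the table below.
    print(estimated_bus_mb_per_second(873))
```

For System 1 at 65536 KB, the Inc16 value of 873 MB/second gives an estimate of 13968 MB/second, well above the 7957 MB/second measured for Read All by a single core.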
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel BusSpeed Benchmark 4A8 08-Feb-2023 10.52
Compiled for 64 bit ARM v8a
Reading Speed 4 Byte Words in MBytes/Second
Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read
KBytes Words Words Words Words Words All
16 3887 5358 7637 8100 8113 8111 L1
32 7697 7796 7836 8102 8103 8111
64 6288 6426 7983 8114 8118 8111
128 2017 3596 6107 8099 8104 8108 L2
256 1646 2526 4675 7276 8065 8094
512 863 1304 2723 5462 8104 8101
1024 791 1128 2277 4449 7705 7907 L3
4096 608 996 1965 3548 7123 7894
16384 558 886 1791 3198 6659 7945 RAM
65536 548 873 1768 3199 6494 7957
Total Elapsed Time 5.0 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel BusSpeed Benchmark 4A8 08-Feb-2023 11.31
Compiled for 64 bit ARM v8a
Reading Speed 4 Byte Words in MBytes/Second
Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read
KBytes Words Words Words Words Words All
16 6809 6976 7643 7939 7952 7942 L1
32 7561 7650 7685 7951 7958 7952
64 6197 6285 7820 7959 7964 7946
128 1977 3555 5903 7894 7925 7938 L2
256 1526 2513 4872 7650 7913 7945
512 1022 1838 3661 7276 5696 6919
1024 910 1560 3071 5808 7796 6611 L3
4096 648 992 2132 4132 7393 7440 RAM
16384 586 877 1792 3650 6820 7898
65536 570 857 1763 3501 6647 7896
Total Elapsed Time 5.2 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel BusSpeed Benchmark 4A8 08-Feb-2023 12.24
Compiled for 64 bit ARM v8a
Reading Speed 4 Byte Words in MBytes/Second
Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read
KBytes Words Words Words Words Words All
16 6671 6851 7497 7964 7981 7983 L1
32 7330 7498 7498 7979 7980 7990
64 2827 2565 5606 7463 7836 7953
128 1566 1426 2322 4300 6046 7990 L2
256 1213 991 2076 3945 5492 7983
512 604 625 1851 3750 5444 7974
1024 616 588 1726 3202 4796 7103 L3
4096 579 522 1228 2419 4788 7448 RAM
16384 541 537 1135 2230 4545 7510
65536 496 520 1145 2292 4582 7528
Total Elapsed Time 4.9 seconds
Continued Below
BusSpeed Armv9 CPU Phone
This integer benchmark was run immediately after MemSpeed, when results from the on power run were often slower than those from the
older phone. The on battery results were often proportional to the MHz ratio of 1.4. These latest results indicate that many of
the on battery measurements were around 50% faster than those on power.
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1 On Power
ARM/Intel BusSpeed Benchmark 4A8 20-Apr-2023 20.42
Compiled for 64 bit ARM v8a
Reading Speed 4 Byte Words in MBytes/Second
Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read
KBytes Words Words Words Words Words All
16 7774 8000 8823 9146 9166 7630 L1
32 7262 7357 7375 7641 7648 7635
64 6110 7378 7575 7644 7654 7633
128 3745 3985 7557 7653 7653 7635 L2
256 3742 3917 7567 7648 7654 7633
512 3785 4060 7419 7652 7654 7597
1024 3727 4073 6810 7647 7654 7626
4096 3246 2934 5918 7611 7641 7625 L3
16384 1803 1692 3441 6450 7556 7572 RAM
65536 1485 1535 3175 6175 7495 7544
Total Elapsed Time 5.1 seconds
Test 2 On Battery
ARM/Intel BusSpeed Benchmark 4A8 23-Apr-2023 14.03
Compiled for 64 bit ARM v8a
Reading Speed 4 Byte Words in MBytes/Second
Memory Inc32 Inc16 Inc8 Inc4 Inc2 Read
KBytes Words Words Words Words Words All
16 9518 9771 10736 11133 11157 11145 L1
32 10614 10747 10799 11160 11174 11153
64 8911 10778 11062 11163 11167 11155
128 5472 5824 11046 11169 11182 11152 L2
256 5504 5782 11121 11174 11179 11155
512 5544 5911 11065 11181 11172 11146
1024 5479 6056 10871 11177 11178 11150
4096 4731 4097 8153 11145 11145 11146 L3
16384 2432 2023 4103 7354 10873 11063 RAM
65536 1484 1712 3572 6648 10627 11050
Total Elapsed Time 5.0 seconds
Test1/System 2
KBytes Inc32 Inc16 Inc8 Inc4 Inc2 All
16 1.14 1.15 1.15 1.15 1.15 0.96
32 0.96 0.96 0.96 0.96 0.96 0.96
64 0.99 1.17 0.97 0.96 0.96 0.96
128 1.89 1.12 1.28 0.97 0.97 0.96
256 2.45 1.56 1.55 1.00 0.97 0.96
512 3.70 2.21 2.03 1.05 1.34 1.10
1024 4.10 2.61 2.22 1.32 0.98 1.15
4096 5.01 2.96 2.78 1.84 1.03 1.02
16384 3.08 1.93 1.92 1.77 1.11 0.96
65536 2.61 1.79 1.80 1.76 1.13 0.96
Test2/System 2
KBytes Inc32 Inc16 Inc8 Inc4 Inc2 All
16 1.40 1.40 1.40 1.40 1.40 1.40
32 1.40 1.40 1.41 1.40 1.40 1.40
64 1.44 1.71 1.41 1.40 1.40 1.40
128 2.77 1.64 1.87 1.41 1.41 1.40
256 3.61 2.30 2.28 1.46 1.41 1.40
512 5.42 3.22 3.02 1.54 1.96 1.61
1024 6.02 3.88 3.54 1.92 1.43 1.69
4096 7.30 4.13 3.82 2.70 1.51 1.50
16384 4.15 2.31 2.29 2.01 1.59 1.40
65536 2.60 2.00 2.03 1.90 1.60 1.40
Battery/Power Best Case
512 1.46 1.46 1.49 1.46 1.46 1.47
1024 1.47 1.49 1.60 1.46 1.46 1.46
4096 1.46 1.40 1.38 1.46 1.46 1.46
RandMem Benchmark - RandMemi.apk
RandMem benchmark carries out four tests comprising serial and random address selections using the same program structure, with read and
read/write tests, where the data read points to the next address, with no arithmetic calculations. The main purpose is to demonstrate how
much slower performance can be through using random access. Here, speed can be considerably influenced by reading and writing in bursts,
where much of the data is not used, and by the size of preceding caches.
This benchmark demonstrates the best and worst data transfer speeds from RAM, running a single program. Best is serial reading, which has
minimum CPU instruction execution time, reading all data in a burst. Worst is random access, with a low probability of reading data from the
same burst.
Some of System 3’s results were noticeably slower than those in the other memory benchmarks.
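The idea of each data word pointing to the next address, for both serial and random selection, can be illustrated with the sketch below. It shows the access pattern only and will not reproduce the measured speeds; the function names, the seed and the use of a single closed cycle are assumptions made for illustration, not details of the benchmark.

```python
import random


def build_chain(n_words, randomise, seed=1):
    """Return a list where chain[i] is the index to visit after index i."""
    order = list(range(n_words))
    if randomise:
        random.Random(seed).shuffle(order)
    chain = [0] * n_words
    # Link each index to its successor in the visiting order, closing the loop.
    for position, index in enumerate(order):
        chain[index] = order[(position + 1) % n_words]
    return chain


def chase(chain, start=0):
    """Follow the chain until it returns to the start, counting the reads."""
    index = start
    steps = 0
    while True:
        index = chain[index]  # the data read is the next address
        steps += 1
        if index == start:
            return steps


if __name__ == "__main__":
    print(build_chain(8, randomise=False))
    print(chase(build_chain(1000, randomise=True)))
```

With serial selection, consecutive reads fall within the same burst, whereas with random selection each read is likely to need a new burst, where most of the data transferred is not used.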
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel RandMem Benchmark 4A8 08-Feb-2023 10.53
Compiled for 64 bit ARM v8a
MBytes/Second Transferring 4 Byte Words
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 8659 13607 14309 13669 L1
32 14800 15595 14275 13640
64 14693 15357 14261 13579
128 12719 13268 8758 7856 L2
256 12616 13060 4867 5225
512 12746 13177 2816 3274
1024 12251 12337 1416 1908 L3
4096 11763 7213 664 717
16384 11472 6327 556 597 RAM
65536 11481 5996 526 565
Total Elapsed Time 8.1 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel RandMem Benchmark 4A8 08-Feb-2023 11.37
Compiled for 64 bit ARM v8a
MBytes/Second Transferring 4 Byte Words
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 14413 15265 14036 13429 L1
32 14467 15309 14068 13413
64 14558 15147 14022 13378
128 12462 13066 6195 6645 L2
256 12480 13083 4764 4853
512 10959 12560 1962 2452
1024 10617 12740 1195 1534 L3
4096 12067 6824 534 538 RAM
16384 12051 6031 409 415
65536 12002 5763 349 364
Total Elapsed Time 8.6 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel RandMem Benchmark 4A8 08-Feb-2023 12.25
Compiled for 64 bit ARM v8a
MBytes/Second Transferring 4 Byte Words
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 12972 15051 12798 12393 L1
32 13116 15184 12788 13243
64 12814 15150 11406 12759
128 8668 8727 2588 3199 L2
256 8078 7972 2279 2567
512 8017 7301 1555 1779
1024 7165 6442 1056 1268 L3
4096 7481 3425 484 410 RAM
16384 7453 3262 343 273
65536 7080 3014 333 292
Total Elapsed Time 8.5 seconds
Continued Below
RandMem Armv9 CPU Phone
Run during the hot time period, the on power test still produced better performance than the older phone, particularly during the
random tests using the larger L3 cache. This helped even more during the later battery tests, where the maximum gain was 10.72 times. This
time, maximum On Battery speed gains over On Power were around 30%, but speeds from RAM were little different.
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1
ARM/Intel RandMem Benchmark 4A8 20-Apr-2023 20.43
Compiled for 64 bit ARM v8a
MBytes/Second Transferring 4 Byte Words
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 26053 25057 23723 18443 L1
32 23084 22915 22289 18268
64 21887 22732 21187 16691
128 20287 21627 13268 10698 L2
256 20283 21661 10263 9161
512 20217 21467 8842 8383
1024 20015 21326 7138 7354
4096 20218 20853 3323 4499 L3
16384 19874 12556 1568 1962 RAM
65536 19649 11471 983 1328
Total Elapsed Time 7.9 seconds
Test 2 Battery
ARM/Intel RandMem Benchmark 4A8 23-Apr-2023 14.00
Compiled for 64 bit ARM v8a
MBytes/Second Transferring 4 Byte Words
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 31747 30518 30144 24456 L1
32 30682 30415 29525 24245
64 29039 30172 28299 22411
128 26821 28695 17049 14193 L2
256 26980 28762 13155 11756
512 25989 27680 11462 10935
1024 25887 27358 9344 9597
4096 25894 25909 4078 5770 L3
16384 23440 13046 1647 1987 RAM
65536 22756 11750 1023 1372
Test1/System 2
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 1.81 1.64 1.69 1.37
32 1.60 1.50 1.58 1.36
64 1.50 1.50 1.51 1.25
128 1.63 1.66 2.14 1.61
256 1.63 1.66 2.15 1.89
512 1.84 1.71 4.51 3.42
1024 1.89 1.67 5.97 4.79
4096 1.68 3.06 6.22 8.36 L3
16384 1.65 2.08 3.83 4.73 RAM
65536 1.64 1.99 2.82 3.65
Test2/System 2
Memory Serial....... Random.......
KBytes Read Rd/Wrt Read Rd/Wrt
16 2.20 2.00 2.15 1.82
32 2.12 1.99 2.10 1.81
64 1.99 1.99 2.02 1.68
128 2.15 2.20 2.75 2.14
256 2.16 2.20 2.76 2.42
512 2.37 2.20 5.84 4.46
1024 2.44 2.15 7.82 6.26
4096 2.15 3.80 7.64 10.72 L3
16384 1.95 2.16 4.03 4.79 RAM
65536 1.90 2.04 2.93 3.77
Battery/Power Best and Worst Case
1024 1.29 1.28 1.31 1.31
4096 1.28 1.24 1.23 1.28
16384 1.18 1.04 1.05 1.01
65536 1.16 1.02 1.04 1.03
FFT Benchmarks - fft1.apk, fft3c.apk
The benchmarks run code for single and double precision Fast Fourier Transforms of size 1024 to 1048576 (1K to 1024K), with running times
in milliseconds. Two versions are available: FFT1, the original version, and FFT3c, with optimised C code. Memory used increases with FFT
size, up to use of RAM, and is often accessed on a skipped sequential basis, leading to burst reading effects. The change to using a different
cache or RAM is demonstrated where execution time is more than double on doubling the FFT size.
Here, on executing FFT1, System 2 is shown to be faster than System 1. This test was repeated later, showing System 1 slightly faster, as
expected. As with all these first tests, the benchmarks were run with power connected, with the reason for the difference being unknown.
This demonstrates the danger in assessing performance by running a single benchmark.
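The changes between cache levels or RAM mentioned above can be located by examining the running time ratio for each doubling of FFT size, as in this sketch. As FFT work grows slightly faster than the size, ratios a little above 2 are normal, so a threshold of 3.0 is used here. That threshold and the function name are hypothetical choices; the times are the first single precision column of the System 3 FFT1 results below.

```python
def slow_steps(sizes_k, times_ms, threshold=3.0):
    """Return the sizes where time grew by more than threshold on doubling."""
    flagged = []
    for i in range(1, len(times_ms)):
        ratio = times_ms[i] / times_ms[i - 1]
        if ratio > threshold:
            flagged.append(sizes_k[i])
    return flagged


if __name__ == "__main__":
    sizes_k = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
    # System 3, FFT1, single precision, first of the three timings.
    system3_sp_ms = [0.034, 0.065, 0.141, 0.329, 0.872, 2.182,
                     5.401, 16.977, 85.865, 215.935, 506.663]
    print(slow_steps(sizes_k, system3_sp_ms))
```

For these System 3 times, the 128K and 256K sizes are flagged, with ratios of about 3.1 and 5.1, consistent with data moving out of cache at those sizes.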
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel FFT Benchmark 1 4A8 08-Feb-2023 10.55
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.047 0.044 0.042 0.044 0.044 0.042
2 0.092 0.091 0.091 0.092 0.091 0.091
4 0.197 0.197 0.196 0.204 0.202 0.203
8 0.434 0.429 0.429 0.573 0.461 0.302
16 1.196 1.199 1.183 1.395 1.428 1.265
32 3.331 3.275 3.271 4.362 4.296 4.123
64 7.407 7.325 6.456 8.545 8.260 7.313
128 14.196 13.447 12.777 24.470 24.741 23.636
256 43.757 43.396 43.050 66.080 65.481 65.891
512 121.602 121.637 121.264 157.855 157.641 157.182
1024 310.438 309.197 303.803 369.157 364.380 362.249
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 4.3 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel FFT Benchmark 1 4A8 08-Feb-2023 11.40
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.037 0.030 0.030 0.031 0.031 0.030
2 0.065 0.064 0.064 0.065 0.065 0.064
4 0.140 0.139 0.139 0.144 0.143 0.143
8 0.306 0.303 0.303 0.420 0.411 0.410
16 0.697 0.668 0.666 1.002 0.875 0.836
32 1.740 1.744 1.707 2.158 2.112 2.090
64 4.656 4.247 4.453 5.826 5.675 6.420
128 17.591 12.325 11.902 23.000 23.823 22.929
256 45.956 47.550 46.355 64.257 63.979 63.376
512 120.193 120.099 124.833 156.133 155.517 156.019
1024 295.659 334.325 304.642 361.975 360.212 361.947
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 4.1 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel FFT Benchmark 1 4A8 08-Feb-2023 12.26
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.034 0.030 0.030 0.026 0.025 0.025
2 0.065 0.065 0.065 0.055 0.055 0.054
4 0.141 0.142 0.139 0.154 0.152 0.154
8 0.329 0.337 0.335 0.440 0.442 0.454
16 0.872 0.895 0.877 1.054 1.071 1.089
32 2.182 2.168 2.146 2.729 2.840 2.793
64 5.401 5.475 5.492 9.277 9.631 9.695
128 16.977 17.529 17.099 39.834 43.928 43.814
256 85.865 82.130 81.941 112.404 108.405 110.697
512 215.935 221.886 219.700 258.905 259.124 258.621
1024 506.663 504.806 500.864 604.900 598.287 595.695
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 6.5 seconds
Continued Below
FFT Benchmark 1 Armv9 CPU Phone
The FFTs access data on a skipped sequential basis, with the same sort of impact as random access on burst reading. This affects the larger
FFTs and use of the large L3 cache, leading to performance gains over the older phone of more than four times. The ups and downs in
other performance comparisons are difficult to explain. The smaller FFTs are more dependent on data transmission speed where, in this case,
some On Power measurements indicate faster speeds.
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1
ARM/Intel FFT Benchmark 1 4A8 20-Apr-2023 20.58
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.028 0.024 0.022 0.024 0.023 0.022
2 0.050 0.049 0.049 0.050 0.049 0.048
4 0.108 0.130 0.099 0.103 0.102 0.102
8 0.224 0.223 0.223 0.404 0.372 0.365
16 0.803 0.782 0.792 0.827 0.698 0.696
32 1.394 1.428 1.402 1.313 1.343 1.211
64 2.364 2.368 2.373 2.606 2.441 2.213
128 4.666 4.417 4.580 5.713 5.632 5.501
256 11.612 11.316 11.384 14.595 13.892 14.434
512 27.517 26.152 25.995 38.339 41.675 41.686
1024 79.904 78.725 78.795 105.524 105.813 107.723
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 1.3 seconds
Test 2 Battery
ARM/Intel FFT Benchmark 1 4A8 23-Apr-2023 14.06
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.027 0.023 0.022 0.024 0.023 0.022
2 0.050 0.178 0.049 0.050 0.049 0.049
4 0.108 0.107 0.107 0.112 0.111 0.111
8 0.245 0.245 0.242 0.400 0.412 0.397
16 0.850 0.857 0.865 0.950 0.892 0.694
32 1.524 1.404 1.417 1.391 1.259 1.212
64 2.543 2.188 2.174 2.316 2.287 2.183
128 4.584 4.687 4.464 4.886 4.555 4.635
256 9.222 9.279 9.224 10.926 10.972 10.583
512 22.076 21.046 21.753 33.690 31.855 33.518
1024 59.946 61.047 60.812 89.821 90.799 90.701
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 1.1 seconds
Average Comparisons
Test 1/Old 2 Test 2/old 2 Battery/Power
SP DP SP DP SP DP
1 1.31 1.32 1.35 1.32 1.03 1.00
2 1.30 1.37 0.70 1.29 0.53 0.94
4 1.24 1.15 1.30 1.08 1.05 0.94
8 1.36 1.18 1.25 1.06 0.92 0.90
16 0.85 1.49 0.79 1.42 0.92 0.95
32 1.23 2.18 1.19 2.28 0.97 1.04
64 1.88 3.64 1.93 4.20 1.03 1.16
128 3.06 4.37 3.04 5.61 0.99 1.28
256 4.08 4.00 5.04 5.01 1.24 1.25
512 4.58 3.52 5.63 4.19 1.23 1.19
1024 3.94 3.40 5.14 4.00 1.31 1.18
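A minimal sketch of how such average comparisons might be derived is given here, assuming each entry is the mean of the three older running times divided by the mean of the three newer ones. That method is an assumption, as the report does not state it, but it does reproduce the 1K and 2K single precision entries above. The function name is hypothetical.

```python
from statistics import mean


def gain(old_times_ms, new_times_ms):
    """Performance gain as mean old running time over mean new running time."""
    return mean(old_times_ms) / mean(new_times_ms)


if __name__ == "__main__":
    # 1K single precision: System 2 against System 4 Test 1.
    print(round(gain([0.037, 0.030, 0.030], [0.028, 0.024, 0.022]), 2))
    # 2K single precision: System 2 against System 4 Test 2.
    print(round(gain([0.065, 0.064, 0.064], [0.050, 0.178, 0.049]), 2))
```

The second case shows how a single slow measurement (0.178 ms) among three can pull an averaged comparison below 1.0, which is one reason for caution with the smallest FFT sizes.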
Second FFT Benchmark Results below
FFT fft3c.apk Results
With all these performance measurements, selecting a representative sample is difficult. At least averaging the three running times is helpful.
An example of comparing those for this benchmark with the earlier one (including using the revised System 1 times) shows that the latter was
faster on the small FFTs. The average FFT3c/FFT1 running time ratios were between 0.76 and 3.11, with 1.53 average. Taking the total
running time of one of each FFT size produced ratios of 2.79 single precision and 2.47 double precision. For FFT3c, total running time of one
of each indicated that System 1 was 2.6% faster than System 2 and 43% faster than System 3.
System 1 Android 11 2.05 GHz ARM Cortex-A76
ARM/Intel FFT Benchmark 3c 4A8 08-Feb-2023 10.56
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.035 0.029 0.028 0.030 0.028 0.028
2 0.066 0.062 0.061 0.063 0.059 0.062
4 0.141 0.132 0.134 0.136 0.136 0.134
8 0.307 0.290 0.290 0.360 0.350 0.338
16 0.702 0.676 0.675 0.790 0.766 0.790
32 1.545 1.476 1.472 1.754 1.766 1.783
64 3.423 3.333 3.367 4.380 4.278 4.231
128 8.240 8.024 8.108 11.155 10.916 10.553
256 19.756 19.283 19.493 26.542 26.701 26.368
512 43.903 43.320 43.422 60.771 61.454 60.828
1024 94.409 93.012 93.336 145.439 142.632 144.625
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 2.1 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
ARM/Intel FFT Benchmark 3c 4A8 08-Feb-2023 11.42
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.044 0.030 0.030 0.031 0.028 0.028
2 0.069 0.063 0.063 0.061 0.060 0.060
4 0.162 0.135 0.135 0.135 0.133 0.133
8 0.347 0.301 0.298 0.317 0.314 0.337
16 0.841 0.722 0.908 0.826 1.134 0.840
32 1.795 1.753 1.652 2.089 2.047 1.987
64 3.586 3.422 3.732 4.646 4.674 4.701
128 8.411 8.138 7.877 10.902 10.906 10.933
256 19.554 20.523 19.439 25.088 24.605 26.126
512 47.427 44.633 44.105 56.174 63.102 62.016
1024 107.446 102.961 101.591 145.147 141.521 141.941
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 2.1 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
ARM/Intel FFT Benchmark 3c 4A8 08-Feb-2023 12.27
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.054 0.035 0.034 0.035 0.032 0.032
2 0.076 0.073 0.073 0.073 0.070 0.070
4 0.165 0.157 0.161 0.169 0.167 0.165
8 0.381 0.353 0.360 0.391 0.382 0.379
16 0.856 0.823 0.836 0.991 0.966 0.983
32 1.861 1.852 1.899 2.365 2.297 2.317
64 4.402 4.224 4.266 6.097 5.913 6.111
128 10.802 10.491 10.793 15.843 15.477 15.512
256 26.539 25.950 26.473 37.175 37.135 37.191
512 58.571 57.610 56.704 88.722 90.241 88.155
1024 125.591 124.655 126.555 217.677 222.146 221.802
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Continued Below
FFT Benchmark 3C Armv9 CPU Phone
With these, the smaller sizes are more dependent on processing speed, where the most noticeable feature is the increased performance
gains with the double precision versions compared with single precision ones. Many of the power and battery results were similar.
Running time of this benchmark is now less than one second, with some measured FFT times being at microsecond level, possibly
dependent on timer resolution and calling into question the validity of comparisons.
System 4 Android 13 1x 2.80 GHz Cortex-X2
Test 1 Power
ARM/Intel FFT Benchmark 3c 4A8 20-Apr-2023 20.59
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.039 0.026 0.025 0.013 0.011 0.011
2 0.061 0.054 0.054 0.025 0.050 0.023
4 0.128 0.115 0.115 0.053 0.051 0.051
8 0.303 0.254 0.253 0.124 0.121 0.120
16 0.641 0.607 0.606 0.296 0.284 0.285
32 1.345 1.339 1.042 0.627 0.609 0.611
64 2.434 2.049 1.824 1.360 1.406 1.322
128 3.597 3.419 3.412 2.985 2.890 2.957
256 6.718 6.180 6.077 7.266 7.216 7.083
512 13.537 12.908 12.913 17.726 19.994 20.027
1024 31.804 30.518 30.398 46.998 44.458 44.174
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 0.7 seconds
Test 2 Battery
ARM/Intel FFT Benchmark 3c 4A8 23-Apr-2023 14.10
Compiled for 64 bit ARM v8a
Size milliseconds
K Single Precision Double Precision
1 0.051 0.026 0.025 0.013 0.011 0.011
2 0.061 0.054 0.053 0.025 0.023 0.023
4 0.139 0.115 0.115 0.053 0.051 0.051
8 0.276 0.257 0.254 0.123 0.121 0.141
16 0.646 0.607 0.604 0.295 0.284 0.284
32 1.366 0.985 0.979 0.632 0.619 0.618
64 2.240 2.054 1.869 1.394 1.333 1.328
128 3.824 3.569 2.988 3.016 2.914 2.882
256 6.934 6.319 6.096 7.240 7.160 7.170
512 13.635 13.227 13.144 17.729 17.652 17.596
1024 30.851 30.087 29.866 40.093 38.497 38.396
1024 Square Check Maximum Noise Average Noise
SP 9.999520e-01 3.346482e-06 4.565234e-11
DP 1.000000e+00 1.133294e-23 1.428110e-28
Total Elapsed Time 0.7 seconds
Average Comparisons
Test 1/Old 2 Test 2/Old 2 Battery/Power
SP DP SP DP SP DP
1 1.16 2.02 1.02 2.53 0.88 1.25
2 1.15 2.30 1.16 2.58 1.01 1.12
4 1.21 2.63 1.17 2.54 0.97 0.96
8 1.17 3.06 1.20 3.02 1.03 0.99
16 1.33 3.29 1.33 3.27 1.00 0.99
32 1.40 3.39 1.56 3.40 1.12 1.00
64 1.70 3.62 1.74 3.63 1.02 1.00
128 2.34 3.57 2.35 3.57 1.00 1.00
256 3.14 3.24 3.08 3.45 0.98 1.06
512 3.46 3.15 3.40 3.59 0.98 1.14
1024 3.36 3.16 3.44 3.66 1.02 1.16
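The Battery/Power column above appears to be the ratio of average running times, so that values above 1.0 mean the battery run was faster. The following is a minimal sketch under that assumption, using the 1K single precision and 1024K double precision rows from the two System 4 tables. The function names are illustrative only and the report does not state the exact averaging method.

```python
def mean(values):
    """Arithmetic mean of a list of timings."""
    return sum(values) / len(values)


def battery_over_power(power_ms, battery_ms):
    """Speed ratio: above 1.0 means the battery run took less time."""
    return mean(power_ms) / mean(battery_ms)


# Three timings in milliseconds per row, copied from the tables above
power_sp_1k = [0.039, 0.026, 0.025]
battery_sp_1k = [0.051, 0.026, 0.025]
power_dp_1024k = [46.998, 44.458, 44.174]
battery_dp_1024k = [40.093, 38.497, 38.396]

print(round(battery_over_power(power_sp_1k, battery_sp_1k), 2))
print(round(battery_over_power(power_dp_1024k, battery_dp_1024k), 2))
```

Both values round to the figures shown in the comparison table for those rows (0.88 and 1.16), which supports the assumed definition.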
MP-Whetstone Benchmark - MP-WHETSi.apk
For more information on the Whetstone Benchmark see the stand alone version, above. The multithreading version runs multiple copies of the same
shared code, with separate variables.
Before comparing results, it should be noted that the high Fixpt MOPS are impossible to achieve, where the compiler has found that some of
the code can be ignored without changing the calculated result. However, the time for this function has little effect on the overall MWIPS rating.
With mixed MHz CPU cores and big.LITTLE architectures, comparisons become more complex, where each one indicates superior performance
in specific areas.
For this benchmark, overall seconds depend on calibrations and should not be compared. However, in an ideal world, on each system the time
would be constant up to 8 threads accessing 8 CPU cores. Comparing overall MWIPS ratings, throughput relative to one thread was around
twice using 2 threads, then about 3.4 times at 4 threads, then between 5.1 and 6.2 times with 8 threads.
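The throughput ratios quoted above can be reproduced from the MWIPS column of the three result tables that follow. This is a minimal sketch: the function names are illustrative, and the definition of efficiency as gain divided by thread count is an assumption, not a formula given in the report.

```python
# MWIPS ratings for 1, 2, 4 and 8 threads, copied from the tables below
MWIPS = {
    "System 1": {1: 3886.0, 2: 7695.3, 4: 12943.7, 8: 24326.3},
    "System 2": {1: 4064.7, 2: 8660.6, 4: 14117.1, 8: 20887.7},
    "System 3": {1: 3856.7, 2: 7716.2, 4: 13246.4, 8: 20674.2},
}


def mp_gain(ratings):
    """Throughput at each thread count relative to one thread."""
    base = ratings[1]
    return {threads: rating / base for threads, rating in ratings.items()}


def mp_efficiency(ratings):
    """Gain divided by thread count (1.0 would be perfect scaling)."""
    return {threads: gain / threads for threads, gain in mp_gain(ratings).items()}


for name, ratings in MWIPS.items():
    gains = {t: round(g, 2) for t, g in mp_gain(ratings).items()}
    effs = {t: round(e, 2) for t, e in mp_efficiency(ratings).items()}
    print(name, gains, effs)
```

With eight threads spread over two fast and six slower cores, an efficiency well below 1.0 is expected even with perfect scheduling.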
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz
ARM Cortex-A55
ARM/Intel MP-Whetstone Benchmark 4A8 08-Feb-2023 16.56
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 3886.0 680.6 663.1 734.2 119.3 62.5 23326.1 1976.9 741.3
2T 7695.3 1541.7 1409.7 1456.9 240.9 115.2 98493.7 4205.8 1474.0
4T 12943.7 2547.7 2495.0 2575.8 365.2 220.6 148870.7 8186.7 2268.0
8T 24326.3 4564.2 4353.4 4700.6 695.7 435.2 323353.9 22743.2 4101.4
Overall Seconds 2.91 1T, 2.91 2T, 3.93 4T, 4.83 8T
All calculations produced consistent numeric results
Total Elapsed Time 14.9 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76
and 6 x 1.8 GHz Cortex-A55)
ARM/Intel MP-Whetstone Benchmark 4A8 08-Feb-2023 17.22
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 4064.7 957.0 728.2 738.0 129.4 64.6 18308.6 2444.3 751.1
2T 8660.6 1757.5 1505.7 1596.2 270.0 142.0 85717.2 5241.1 1505.1
4T 14117.1 3461.0 3322.1 2696.8 439.0 239.9 140592.8 11249.6 2471.6
8T 20887.7 4732.1 4868.8 4176.3 518.0 386.2 309958.3 19432.5 3457.2
Overall Seconds 2.74 1T, 2.67 2T, 3.98 4T, 4.57 8T
All calculations produced consistent numeric results
Total Elapsed Time 14.3 seconds
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz
Cortex-A55
ARM/Intel MP-Whetstone Benchmark 4A8 08-Feb-2023 15.43
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 3856.7 819.3 818.8 666.1 130.1 63.4 50817.9 2984.8 562.2
2T 7716.2 1637.2 1636.4 1332.3 260.3 126.9 112199.9 5982.7 1124.6
4T 13246.4 2792.2 2730.5 2385.4 421.9 230.4 192831.0 11651.4 1966.1
8T 20674.2 4431.3 4528.9 3840.0 596.8 390.2 289064.4 21237.2 3009.2
Overall Seconds 4.99 1T, 4.99 2T, 6.67 4T, 8.09 8T
All calculations produced consistent numeric results
Total Elapsed Time 25.7 seconds
MP-Whetstone Benchmark Armv9 CPU Phone
As indicated by Overall Seconds, the running time on the earlier twin fast core devices was effectively the same using 1 and 2 threads. The System 4 CPU
has only one fast core, resulting in running time being 26% longer using two threads.
Some of the test function running times were, again, in the microsecond range, possibly distorting comparisons. The single core benchmark
obtained an overall speed rating of 1.50 times that of the older phone used for comparison purposes. This time it was between 1.50 and 1.75 times,
depending on the thread count.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz Cortex A510,
3x 2.52 GHz Cortex A710
Test 1 Battery
ARM/Intel MP-Whetstone Benchmark 4A8 23-Apr-2023 14.30
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
MWIPS MFLOPS MFLOPS MFLOPS Cos Exp Fixpt If Equal
1 2 3 MOPS MOPS MOPS MOPS MOPS
1T 6937.5 1419.7 1389.3 1092.2 236.7 103.3 83276.2 3985.3 1923.4
2T 12987.7 2695.4 2562.8 2087.1 449.1 197.9 125540.0 7418.4 3212.5
4T 24739.0 5315.1 5214.5 4090.7 835.9 384.5 244974.6 14825.7 5227.7
8T 32198.9 7676.4 7993.3 5510.1 1035.3 510.5 331293.6 25184.5 5594.2
Overall Seconds 4.20 1T, 5.29 2T, 6.00 4T, 9.04 8T
All calculations produced consistent numeric results
Total Elapsed Time 25.1 seconds
Test1/System 2
1T 1.71 1.48 1.91 1.48 1.83 1.60 4.55 1.63 2.56
2T 1.50 1.53 1.70 1.31 1.66 1.39 1.46 1.42 2.13
4T 1.75 1.54 1.57 1.52 1.90 1.60 1.74 1.32 2.12
8T 1.54 1.62 1.64 1.32 2.00 1.32 1.07 1.30 1.62
MP Dhrystone Benchmark - MP-Dhryi.apk
This benchmark does not provide reasonable increases in measured performance using multiple cores, probably because many of the variables
used are shared by all threads. Results using one thread are only slightly slower than from the single core version, indicating that threading
overheads were not excessive. The lack of improvement using multiple cores probably invalidates comparisons of the two systems. Even so,
the System 4/System 2 performance comparison indicated a gain of between 2.0 and 2.45 times.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and
6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 08-Feb-2023 16.58
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
Threads 1 2 4 8
Seconds 0.80 2.03 5.47 14.00
Dhrystones per Second 25133472 19708774 14614211 11428905
VAX MIPS rating 14305 11217 8318 6505
Internal pass count correct all threads
Total Elapsed Time 22.7 seconds
System 2 Android 12 2 x 2.0 GHz ARM Cortex-A76 and
6 x 1.8 GHz Cortex-A55
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 08-Feb-2023 17.24
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
Threads 1 2 4 8
Seconds 0.84 2.24 6.23 14.31
Dhrystones per Second 23687920 17834612 12843313 11183452
VAX MIPS rating 13482 10151 7310 6365
Internal pass count correct all threads
Total Elapsed Time 24.1 seconds
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz
Cortex-A55
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 08-Feb-2023 15.45
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
Threads 1 2 4 8
Seconds 0.75 1.97 4.98 12.88
Dhrystones per Second 21326073 16280555 12851505 9937004
VAX MIPS rating 12138 9266 7314 5656
Internal pass count correct all threads
Total Elapsed Time 21.3 seconds
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
ARM/Intel MP-Dhrystone 2 Benchmark 4A8 23-Apr-2023 14.32
Compiled for 64 bit ARM v8a
Using 1, 2, 4 and 8 Threads
Threads 1 2 4 8
Seconds 0.69 2.01 5.08 14.28
Dhrystones per Second 57735505 39843345 31467495 22401220
VAX MIPS rating 32860 22677 17910 12750
Internal pass count correct all threads
Total Elapsed Time 22.6 seconds
System 4/System 2 2.44 2.23 2.45 2.00
NEON-Linpack-MP Benchmark - NEON-Linpacki-MP.apk
This is a multithreading version of the above. Further details and results can be found in Android Benchmarks For 32 Bit and 64 Bit CPUs from
ARM Intel and MIPS.pdf.
This benchmark is not available in the new 4A8 compilation, as overall running time had increased to more than 400 seconds on a
new phone.
MP-BusSpeed Benchmark - MP-BusSpd2i.apk
This is a multithreading version of BusSpeed above, except that, as for other memory benchmarks, it is restricted to three memory sizes that
were originally representative of using L1 cache, L2 cache and RAM data. To avoid caching effects of RAM based data, this version arranges
for threads to have staggered starting points, each reading all the data.
Considering Read All, performance of all three systems was virtually the same for cache based data, using the simple integer arithmetic
involved. Systems 1 and 2 RAM speeds were quite similar, with System 3 far behind, maybe due to dual channel versus single channel
operation.
Estimated bus speeds, calculated as 16 times the Inc16 results, were similar to the measured Read All MB/second when more than one thread was used.
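The bus speed estimate is simple arithmetic on the Inc16 column. The sketch below applies it to the 8 thread RAM rows annotated "Calculated Bus Speed" in the System 2 and System 4 tables. The function name is illustrative, and the reason for the factor of 16 (presumably that an Inc16 read counts only one word in every sixteen transferred) is an assumption rather than something stated here.

```python
def estimated_bus_speed(inc16_mb_per_sec, factor=16):
    """Estimated bus speed in MB/second from an Inc16 result."""
    return inc16_mb_per_sec * factor


# 8 thread results at 49152 KB: (Inc16, RdAll) in MB/second
ram_8t = {
    "System 2": (861, 13906),
    "System 4": (2968, 34406),
}

for name, (inc16, rd_all) in ram_8t.items():
    print(name, "estimated", estimated_bus_speed(inc16), "measured RdAll", rd_all)
```

For System 2 the estimate is within about 1% of the Read All figure, while for System 4 the estimate is noticeably higher than the measured Read All speed.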
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and
6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-BusSpd2 Benchmark 4A8 08-Feb-2023 17.01
Compiled for 64 bit ARM v8a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 7329 7124 7451 7334 7341 7101 L1
2T 10325 13362 14290 7684 8059 7832 <<< Later 14130
4T 17070 19398 25187 24043 27212 20101
8T 14174 17228 36750 29288 41665 29522
122.9 1T 1878 2887 4854 7296 7368 6407 L2
2T 1863 3247 6737 7374 13119 7689
4T 3830 6261 9539 14764 17344 15561
8T 5462 8906 16427 25436 32650 29293
49152 1T 404 569 1155 2233 4053 4376 RAM
2T 409 777 1583 3176 6429 9715
4T 564 942 1821 3646 7426 11040
8T 598 970 1950 3715 7974 15460
No Errors Found
Total Elapsed Time 58.4 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 2 x 2.0 GHz
Cortex-A76 and 6 x 1.8 GHz Cortex-A55
ARM/Intel MP-BusSpd2 Benchmark 4A8 08-Feb-2023 17.26
Compiled for 64 bit ARM v8a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 7161 7297 7497 7588 7702 7460 L1
2T 8249 12429 13881 13746 15061 15482
4T 7947 10882 15414 19060 22373 19375
8T 12283 11971 29090 27379 39212 26439
122.9 1T 1992 3367 6029 7489 7375 7503 L2
2T 3907 7106 11767 14529 15642 15813
4T 4709 7833 12544 18015 19659 19260
8T 4742 8651 15108 25444 37308 32776
49152 1T 528 789 1730 3469 6325 7353 RAM
2T 726 988 1832 3623 7074 13999 Calculated
4T 719 882 1762 3321 6886 13740 Bus Speed
8T 681 861 1800 3451 7147 13906 13776
No Errors Found
Total Elapsed Time 52.9 seconds
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and
6 x 2.0 GHz Cortex-A55
ARM/Intel MP-BusSpd2 Benchmark 4A8 08-Feb-2023 15.47
Compiled for 64 bit ARM v8a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 7116 7549 7746 7936 7963 7976 L1
2T 12590 13817 14578 15785 15865 15924
4T 19944 23807 26173 27694 28498 20714
8T 16635 16726 35602 29673 43358 32010
122.9 1T 1232 1142 2415 4406 5734 7975 L2
2T 2718 3123 5270 8813 11478 15947
4T 3100 4607 7739 13599 18013 20644
8T 3189 6323 9391 19850 27135 30640
49152 1T 547 540 1116 2269 4488 7518 RAM
2T 581 580 1140 2289 4582 9156
4T 642 625 1691 3324 8091 9188
8T 601 687 1586 3099 5079 9027
No Errors Found
Total Elapsed Time 48.8 seconds
MP-BusSpeed Benchmark Armv9 CPU Phone
This was run on battery, when maximum speeds were expected. Examining the main measurements, reading all data, performance
improvements, executing the integer functions, were mainly proportional to the CPU MHz ratio using one thread, but increased with multiple
threads. Maximum gain was with RAM data transfers using all CPU cores, reaching an increase of 2.47 times, with an estimated bus speed of
47.5 GB per second (2.968 x 16).
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
ARM/Intel MP-BusSpd2 Benchmark 4A8 23-Apr-2023 14.34
Compiled for 64 bit ARM v8a
MB/Second Reading Data, 1, 2, 4 and 8 Threads
KB Inc32 Inc16 Inc8 Inc4 Inc2 RdAll
12.3 1T 9766 10568 10655 10683 10312 10724 L1
2T 16006 17611 19131 19771 19436 19690
4T 29633 30846 35796 35823 37133 37949
8T 17413 18447 42972 39292 53233 52517
122.9 1T 4904 5381 8001 9509 9478 9553 L2
2T 8182 8579 15623 18945 19070 19051
4T 15433 15194 26980 34383 31191 35705
8T 14336 15505 27156 35831 39276 47641
49152 1T 1158 1163 2593 5707 10124 10218 RAM
2T 2580 2145 4723 9139 16890 18311 Calculated
4T 4236 3485 7626 12461 21916 30342 Bus Speed
8T 2821 2968 6508 10792 21131 34406 47488
No Errors Found
Total Elapsed Time 50.9 seconds
System 4 / System 2
12.3 1T 1.36 1.45 1.42 1.41 1.34 1.44
2T 1.94 1.42 1.38 1.44 1.29 1.27
4T 3.73 2.83 2.32 1.88 1.66 1.96
8T 1.42 1.54 1.48 1.44 1.36 1.99
122.9 1T 2.46 1.60 1.33 1.27 1.29 1.27
2T 2.09 1.21 1.33 1.30 1.22 1.20
4T 3.28 1.94 2.15 1.91 1.59 1.85
8T 3.02 1.79 1.80 1.41 1.05 1.45
49152 1T 2.19 1.47 1.50 1.65 1.60 1.39
2T 3.55 2.17 2.58 2.52 2.39 1.31
4T 5.89 3.95 4.33 3.75 3.18 2.21
8T 4.14 3.45 3.62 3.13 2.96 2.47
MP-RandMem Benchmark - MP-RndMemi.apk
This is a multithreading version of RandMem above. The most striking feature of these MP results is the apparently constant, or near
constant, performance at all thread counts during read/write tests, over the memory area covered. This is probably because write back involves
accessing RAM.
This program simply reads (or writes) data that supplies the next location to access. This lack of arithmetic calculations apparently provides
faster data transmission speeds than BusSpeed.
Repeating the benchmark on System 1 continued to produce variable performance on RndRDWR tests using RAM.
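The access pattern described above, where each value read supplies the next location to access, can be illustrated with a small index chain. This is a sketch of the general technique only, not the benchmark's own code: `build_chain` and `walk` are hypothetical names, and Python will not reproduce the measured speeds.

```python
import random


def build_chain(n, randomise, seed=1):
    """Return a list where chain[i] is the index to visit after i.

    The chain forms a single cycle that covers every index once,
    either in serial order or in a shuffled (random) order.
    """
    order = list(range(n))
    if randomise:
        random.Random(seed).shuffle(order)
    chain = [0] * n
    for pos, idx in enumerate(order):
        chain[idx] = order[(pos + 1) % n]
    return chain


def walk(chain, steps, start=0):
    """Follow the chain: each loaded value is the next location to read."""
    i = start
    for _ in range(steps):
        i = chain[i]
    return i


serial = build_chain(1000, randomise=False)
scattered = build_chain(1000, randomise=True)
print(walk(serial, 1000), walk(scattered, 1000))
```

In the random case successive reads land in unrelated places, which is why the RndRD and RndRDWR columns fall so sharply once the data no longer fits in cache.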
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and
6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-RndMem Benchmark 4A8 08-Feb-2023 17.04
Compiled for 64 bit ARM v8a
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 15672 16244 15166 13508 L1
2T 14435 10708 21438 9174
4T 35744 8391 34088 7762
8T 52284 8129 32321 7232
122.9 1T 11052 11762 7956 7209 L2
2T 17349 9400 14378 5457
4T 30743 7405 18898 5343
8T 44553 6837 21266 4174
12288 1T 11287 6549 407 424 RAM
2T 9081 4458 641 223
4T 14381 3463 539 64
8T 16627 2564 1061 121
No Errors Found
Total Elapsed Time 47.9 seconds
System 2 Android 12 2.0 GHz Snapdragon 750
2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 15277 15160 13995 13764 L1
2T 27401 14764 27575 13529
4T 30145 14883 29903 13394
8T 43856 14293 33190 13297
122.9 1T 12005 13509 7296 7303 L2
2T 25241 12840 14676 7336
4T 30128 12674 15276 7226
8T 46484 11959 18064 7166
12288 1T 11371 6158 437 429 RAM
2T 15348 5818 471 402
4T 14136 5793 499 404
8T 17555 5276 597 392
No Errors Found
Total Elapsed Time 47.2 seconds
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75
and 6 x 2.0 GHz Cortex-A55
ARM/Intel MP-RndMem Benchmark 4A8 08-Feb-2023 15.49
Compiled for 64 bit ARM v8a
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 13840 15739 13328 13741 L1
2T 25791 15710 25075 13919
4T 34426 15334 33819 13779
8T 50511 15029 38275 13788
122.9 1T 8965 9269 2727 3397 L2
2T 16943 9249 6348 3391
4T 24738 9152 8399 3410
8T 42321 9190 12827 3402
12288 1T 7704 3364 510 358 RAM
2T 9140 3371 550 334
4T 15521 3367 574 358
8T 14550 3358 747 358
No Errors Found
Total Elapsed Time 42.6 seconds
MP-RandMem Benchmark Armv9 CPU Phone
Cache based measurements indicated gains over the older phone of between 1.03 and 2.94 times. The best gains were for random access of RAM
based data, between 4.53 and 10.56 times, influenced by the larger L3 cache.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
ARM/Intel MP-RndMem Benchmark 4A8 23-Apr-2023 14.38
Compiled for 64 bit ARM v8a
Battery
MB/Second Using 1, 2, 4 and 8 Threads
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 30966 18582 17061 14184
2T 29434 17173 29474 14222
4T 58965 25538 88024 22464
8T 91141 23590 67089 21167
122.9 1T 26009 19920 12525 10496
2T 39542 23049 23892 13454
4T 71554 23106 39923 12058
8T 75854 20575 42745 9824
12288 1T 23597 12335 1980 2921
2T 33194 11639 3260 2735
4T 44727 10552 5269 2372
8T 50346 9798 5297 1920
No Errors Found
System 4 / System 2
KB SerRD SerRDWR RndRD RndRDWR
12.29 1T 2.03 1.23 1.22 1.03
2T 1.07 1.16 1.07 1.05
4T 1.96 1.72 2.94 1.68
8T 2.08 1.65 2.02 1.59
122.9 1T 2.17 1.47 1.72 1.44
2T 1.57 1.80 1.63 1.83
4T 2.38 1.82 2.61 1.67
8T 1.63 1.72 2.37 1.37
12288 1T 2.08 2.00 4.53 6.81
2T 2.16 2.00 6.92 6.80
4T 3.16 1.82 10.56 5.87
8T 2.87 1.86 8.87 4.90
MP-MFLOPS Benchmark - MP-MFLOPS2i.apk
The arithmetic operations executed are of the form x[i] = (x[i] + a) * b - (x[i] + c) * d + (x[i] + e) * f with 2 and 32 operations per input
data word, using 1, 2, 4 and 8 threads. Data sizes are limited to three, to use L1 cache, L2 cache and RAM, at 12.8, 128 and 12800 KB (3200,
32000 and 3200000 single precision floating point words). Each thread uses the same calculations but accesses different segments of the
data. The program checks for consistent numeric results, primarily to show that all calculations are carried out and can be run.
As indicated earlier, using SIMD with 128 bit registers and linked (fused) multiply and add, up to eight single precision floating point
operations could be expected per clock cycle, or 16 GFLOPS per core at 2 GHz. The first two systems, with Cortex-A76 CPUs, appear to
have a reasonable implementation of SIMD, achieving over 12 GFLOPS at 32 operations per word, with System 3 far behind. All show
acceptable improvements using two cores, with performance improvements then becoming disappointing using four cores, with these big.LITTLE
CPU architectures.
Note that all systems obtained the same sumchecks of numeric calculations at all levels of threading.
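As a rough illustration of the figures above, the sketch below evaluates the quoted expression, which contains eight add, subtract and multiply operations as written, and derives the expected peak from the register width. The function names and the constants a to f used in the example are illustrative. One fused multiply and add instruction issued per clock cycle is an assumption implied by the eight operations per cycle quoted above.

```python
def kernel_8_ops(x, a, b, c, d, e, f):
    """Apply the quoted expression to every word: 3 adds, 3 multiplies,
    1 subtract and 1 further add, giving 8 operations per word."""
    return [(xi + a) * b - (xi + c) * d + (xi + e) * f for xi in x]


def peak_gflops(ghz, register_bits=128, word_bits=32, ops_per_lane=2):
    """Peak GFLOPS per core: lanes per register times 2 operations
    (multiply and add) per lane per clock cycle, times the clock in GHz."""
    lanes = register_bits // word_bits
    return ghz * lanes * ops_per_lane


print(kernel_8_ops([1.0], 1.0, 2.0, 1.0, 1.0, 1.0, 1.0))
print(peak_gflops(2.0))
```

At 2 GHz this gives the 16 GFLOPS per core quoted above, against which the measured single thread results of about 12 GFLOPS can be judged.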
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76
and 6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-MFLOPS2 Benchmark 4A8 08-Feb-2023 17.06
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 5378 5545 3318 12106 12306 11395
2T 10988 10354 3174 22955 23278 12780
4T 9979 10692 2591 25718 25633 24694
8T 13285 14803 2433 30061 31648 28941
Results x 100000, 0 indicates ERRORS
1T 40392 76406 99700 35218 66014 99520
2T 40392 76406 99700 35218 66014 99520
4T 40392 76406 99700 35218 66014 99520
8T 40392 76406 99700 35218 66014 99520
Total Elapsed Time 8.1 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 2 x 2.0 GHz
Cortex-A76 and 6 x 1.8 GHz Cortex-A55
ARM/Intel MP-MFLOPS2 Benchmark 4A8 08-Feb-2023 17.31
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 6819 6238 2804 12506 12537 12441
2T 8797 9307 2946 22427 24126 22731
4T 9364 9132 2554 25008 26004 25345
8T 10985 13262 2398 33664 34024 32553
Results x 100000, 0 indicates ERRORS
1T 40392 76406 99700 35218 66014 99520
2T 40392 76406 99700 35218 66014 99520
4T 40392 76406 99700 35218 66014 99520
8T 40392 76406 99700 35218 66014 99520
Total Elapsed Time 7.5 seconds
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and
6 x 2.0 GHz Cortex-A55
ARM/Intel MP-MFLOPS2 Benchmark 4A8 08-Feb-2023 15.52
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 5825 4972 1567 7724 7327 7052
2T 11131 11772 1673 14574 15183 14065
4T 11598 13049 1775 17670 17991 17216
8T 13773 15038 1748 23906 24232 22806
Results x 100000, 0 indicates ERRORS
1T 40392 76406 99700 35218 66014 99520
2T 40392 76406 99700 35218 66014 99520
4T 40392 76406 99700 35218 66014 99520
8T 40392 76406 99700 35218 66014 99520
Total Elapsed Time 11.5 seconds
MP-MFLOPS Benchmark Armv9 CPU Phone
Using the older CPUs, 8 single precision floating point operations per clock cycle could be expected, which would lead to 22.4 GFLOPS at 2.8
GHz. Measured results indicate a maximum of 31.8 GFLOPS using 1 core, at 11.36 operations per clock cycle, when perhaps 6.5 could be expected
with the particular code used. It seems that extended SIMD operation has been applied to existing SIMD vector instructions.
Again, running times of individual tests could be too short to provide accurate performance estimates and comparisons. But it is clear that
more than twice the speed of the older phone can be achieved. On Power, heating effects indicate possible reductions in performance of more
than 25%.
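The operations per clock cycle figure, and the size of the reduction on power, follow from simple arithmetic. This is a minimal sketch with illustrative function names. It uses the rounded 31.8 GFLOPS maximum and the largest Battery/Power ratio of 1.40 from the comparison table further below, and assumes the single thread ran on the 2.8 GHz core.

```python
def ops_per_cycle(mflops, ghz):
    """Floating point operations completed per clock cycle."""
    return mflops / (ghz * 1000.0)


def reduction_from_ratio(battery_over_power):
    """Fractional loss of speed on power, given the Battery/Power ratio."""
    return 1.0 - 1.0 / battery_over_power


print(round(ops_per_cycle(31800, 2.8), 2))
print(round(reduction_from_ratio(1.40) * 100, 1))
```

The first value matches the 11.36 quoted above, and the second is just under 29%, consistent with a reduction of more than 25%.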
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
Test 1 Power
ARM/Intel MP-MFLOPS2 Benchmark 4A8 20-Apr-2023 20.48
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 15822 13964 5340 24338 21850 21784
2T 17818 22599 5582 30511 30811 29994
4T 28770 25695 14235 48935 51359 48815
8T 44099 36862 25214 66160 71096 74910
Results x 100000, 0 indicates ERRORS
1T 40392 76406 99700 35218 66014 99520
2T 40392 76406 99700 35218 66014 99520
4T 40392 76406 99700 35218 66014 99520
8T 40392 76406 99700 35218 66014 99520
Total Elapsed Time 4.0 seconds
Test 2 Battery
ARM/Intel MP-MFLOPS2 Benchmark 4A8 23-Apr-2023 14.13
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 15285 14091 6830 31790 30388 30516
2T 21438 19857 8629 40890 41320 41764
4T 38093 25569 14322 64398 66969 64473
8T 40847 39072 31887 66206 68989 70401
Results x 100000, 0 indicates ERRORS
1T 40392 76406 99700 35218 66014 99520
2T 40392 76406 99700 35218 66014 99520
4T 40392 76406 99700 35218 66014 99520
8T 40392 76406 99700 35218 66014 99520
Total Elapsed Time 3.2 seconds
Test1/System 2
1T 2.32 2.24 1.90 1.95 1.74 1.75
2T 2.03 2.43 1.89 1.36 1.28 1.32
4T 3.07 2.81 5.57 1.96 1.98 1.93
8T 4.01 2.78 10.51 1.97 2.09 2.30
Test2/System 2
1T 2.24 2.26 2.44 2.54 2.42 2.45
2T 2.44 2.13 2.93 1.82 1.71 1.84
4T 4.07 2.80 5.61 2.58 2.58 2.54
8T 3.72 2.95 13.30 1.97 2.03 2.16
Battery/Power
1T 0.97 1.01 1.28 1.31 1.39 1.40
2T 1.20 0.88 1.55 1.34 1.34 1.39
4T 1.32 1.00 1.01 1.32 1.30 1.32
8T 0.93 1.06 1.26 1.00 0.97 0.94
NEON-MFLOPS-MP Benchmark - NEON-MFLOPS2i-MP.apk
This benchmark carries out the same calculations as MP-MFLOPS but uses hand coded NEON Intrinsic Functions. Measured maximum
performance was essentially the same. In both cases, performance at 2 operations per word can vary significantly, being more dependent on
data flow than processing speed.
All produced identical sumchecks, these being different from those of MP-MFLOPS, probably due to a variance in initial run time calibration or
SIMD content.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76
and 6 x 2.0 GHz ARM Cortex-A55
ARM NEON-MFLOPS2-MP Benchmark 4A8 08-Feb-2023 17.07
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 7929 7999 3322 13136 13104 13090
2T 14163 13998 3171 25686 25710 25825
4T 15732 15495 3008 27646 27012 24837
8T 9105 12776 2439 29803 28991 27127
Results x 100000, 12345 indicates ERRORS
1T 44934 86735 99850 36770 79897 99759
2T 44934 86735 99850 36770 79897 99759
4T 44934 86735 99850 36770 79897 99759
8T 44934 86735 99850 36770 79897 99759
Total Elapsed Time 3.6 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 2 x 2.0 GHz
Cortex-A76 and 6 x 1.8 GHz Cortex-A55
ARM NEON-MFLOPS2-MP Benchmark 4A8 08-Feb-2023 17.33
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 4396 4753 2555 12669 11585 11782
2T 4661 6779 2894 22112 21236 21738
4T 7706 6001 2561 23015 26865 24635
8T 7286 7062 2397 35348 31644 29849
Results x 100000, 12345 indicates ERRORS
1T 44934 86735 99850 36770 79897 99759
2T 44934 86735 99850 36770 79897 99759
4T 44934 86735 99850 36770 79897 99759
8T 44934 86735 99850 36770 79897 99759
Total Elapsed Time 4.1 seconds
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and
6 x 2.0 GHz Cortex-A55
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 5486 5040 1706 7138 7167 7176
2T 11637 11560 1787 14195 14325 14398
4T 10948 10623 1853 17213 17304 17096
8T 12279 11952 1846 23173 23078 23495
Results x 100000, 12345 indicates ERRORS
1T 44934 86735 99850 36770 79897 99759
2T 44934 86735 99850 36770 79897 99759
4T 44934 86735 99850 36770 79897 99759
8T 44934 86735 99850 36770 79897 99759
Total Elapsed Time 5.9 seconds
NEON-MFLOPS-MP Benchmark Armv9 CPU Phone
The On Battery and On Power tests were carried out consecutively, when the phone was not particularly warm. Subject to inaccuracies due
to short running times, it can be assumed, from the calculations below, that performance was the same. Similarly, performance was
at least twice that of the older phone.
Comparing NEON-MFLOPS-MP with MP-MFLOPS indicates that performance was similar at 32 Ops/Word but the latter could be faster at 2
Ops/Word.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
Test 1 Battery
ARM NEON-MFLOPS2-MP Benchmark 4A8 23-Apr-2023 14.16
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 8810 9275 5476 29395 28331 30117
2T 16769 11592 8731 39449 40333 40440
4T 6822 17552 12335 62263 59255 59455
8T 25900 24135 18693 66554 64566 65969
Results x 100000, 12345 indicates ERRORS
1T 44934 86735 99850 36770 79897 99759
2T 44934 86735 99850 36770 79897 99759
4T 44934 86735 99850 36770 79897 99759
8T 44934 86735 99850 36770 79897 99759
Total Elapsed Time 1.8 seconds
Test 2 Power
ARM NEON-MFLOPS2-MP Benchmark 4A8 23-Apr-2023 14.17
Compiled for 64 bit ARM v8a
FPU Add & Multiply using 1, 2, 4 and 8 Threads
2 Ops/Word 32 Ops/Word
KB 12.8 128 12800 12.8 128 12800
MFLOPS
1T 9327 9188 5500 28636 28474 29788
2T 18024 18392 8359 38319 39596 39531
4T 31653 20778 10451 61957 64741 61611
8T 24930 22931 18816 56111 59569 66356
Results x 100000, 12345 indicates ERRORS
1T 44934 86735 99850 36770 79897 99759
2T 44934 86735 99850 36770 79897 99759
4T 44934 86735 99850 36770 79897 99759
8T 44934 86735 99850 36770 79897 99759
Total Elapsed Time 1.8 seconds
Test2/Test1 - Power/Battery
1T 1.06 0.99 1.00 0.97 1.01 0.99
2T 1.07 1.59 0.96 0.97 0.98 0.98
4T 4.64 1.18 0.85 1.00 1.09 1.04
8T 0.96 0.95 1.01 0.84 0.92 1.01
Test1/System2
1T 2.00 1.95 2.14 2.32 2.45 2.56
2T 3.60 1.71 3.02 1.78 1.90 1.86
4T 0.89 2.92 4.82 2.71 2.21 2.41
8T 3.55 3.42 7.80 1.88 2.04 2.21
Battery NEON/Normal MFLOPS
1T 0.58 0.66 0.80 0.92 0.93 0.99
2T 0.78 0.58 1.01 0.96 0.98 0.97
4T 0.18 0.69 0.86 0.97 0.88 0.92
8T 0.63 0.62 0.59 1.01 0.94 0.94
OpenGL Benchmark - JavaOpenGL1.apk
Necessary for early Android devices, the benchmark does not rely on complex visual scenes or mathematical functions, the objective being
to generate moderate to excessive loading via multiple simple objects. It uses all Java code, with OpenGL ES GL10 statements, to measure
graphics performance in Frames Per Second (FPS). Four tests draw a background of 50 cubes, first as wireframes then colour shaded. The
third test views the cubes in and out of a tunnel with slotted sides and roof, also containing rotating plates. The last test adds textures to
the cubes and plates. The 50 cubes are redrawn 15, 30 and 60 times, with randomised positions, colours and rotational settings. With 6 x 2
triangles per cube, minimum triangles per frame for the three sets of tests are 9000, 18000 and 36000.
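The minimum triangle counts follow directly from the cube count, the triangles per cube and the number of redraws. A short sketch of that arithmetic, with illustrative names:

```python
CUBES = 50
TRIANGLES_PER_CUBE = 6 * 2  # six faces, two triangles per face


def min_triangles_per_frame(redraws):
    """Minimum triangles drawn per frame for a given number of redraws."""
    return CUBES * TRIANGLES_PER_CUBE * redraws


for redraws in (15, 30, 60):
    print(redraws, min_triangles_per_frame(redraws))
```

The tunnel and rotating plates add further triangles, which is why the tables label the rows 9000+, 18000+ and 36000+.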
Systems 1 and 3 do not appear to have the option to run with a faster refresh rate than 60 Hz, so maximum performance cannot be
demonstrated. The System 2 default is much higher, providing up to near 90 FPS, but a 60 Hz refresh rate was also set to enable comparisons. These
results still show significantly superior performance. On the other hand, it should be borne in mind that System 2 has fewer than half the number of
pixels to deal with.
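The pixel counts behind that remark can be checked from the "Screen Pixels" lines reported by the benchmark on each system. A minimal sketch, using the dimensions shown in the OpenGL results:

```python
# Screen dimensions (wide, high) as reported in the OpenGL result tables
SCREENS = {
    "System 1": (1200, 1928),
    "System 2": (1339, 720),
    "System 3": (1200, 1848),
    "System 4": (1080, 2009),
}


def pixel_count(name):
    """Total pixels for the named system."""
    wide, high = SCREENS[name]
    return wide * high


for name in SCREENS:
    ratio = pixel_count(name) / pixel_count("System 1")
    print(name, pixel_count(name), round(ratio, 2))
```

System 2 draws fewer than half the pixels of any of the other three systems, so part of its higher frame rate may simply reflect the smaller display area.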
System 1 Android 11 2.05 GHz ARM Cortex-A76
Graphics Mali-76 MC4, refresh 60 Hz
Android Java OpenGL Benchmark 4A8 09-Feb-2023 10.56
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 59.50 60.06 59.37 49.99
18000+ 44.03 44.23 38.75 30.12
36000+ 22.78 23.19 21.54 16.32
Screen Pixels 1200 Wide 1928 High
Total Elapsed Time 120.4 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
Graphics 660 MHz Adreno 619, default refresh rate
Android Java OpenGL Benchmark 4A8 09-Feb-2023 11.29
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 88.58 86.98 89.82 76.51
18000+ 63.02 63.01 55.57 45.03
36000+ 33.92 33.76 31.49 25.04
Screen Pixels 1339 Wide 720 High
Total Elapsed Time 120.5 seconds
System 2 Android 12 2.0 GHz ARM Cortex-A76
Graphics 660 MHz Adreno 619, refresh 60 Hz
Android Java OpenGL Benchmark 4A8 09-Feb-2023 19.14
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 50.43 47.05 53.55 56.48
18000+ 59.00 59.39 54.26 44.57
36000+ 33.35 33.50 31.14 25.02
Screen Pixels 1339 Wide 720 High
Total Elapsed Time 120.5 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
Graphics Mali-62, refresh 60 Hz
Android Java OpenGL Benchmark 4A8 09-Feb-2023 15.07
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 37.88 59.82 54.16 41.19
18000+ 26.59 35.84 31.73 28.13
36000+ 16.46 20.42 19.35 15.65
Screen Pixels 1200 Wide 1848 High
Total Elapsed Time 120.6 seconds
Armv9 CPU Phone
All these results were exceptionally slow. Perhaps the old version of OpenGL ES used to produce the benchmark is no longer applicable.
System 4 Android 13 1x 2.80 GHz Cortex-X2
Graphics Xclipse 920
Power
Android Java OpenGL Benchmark 4A8 20-Apr-2023 21.02
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 24.12 24.24 14.92 16.06
18000+ 8.46 8.46 6.11 6.67
36000+ 2.53 2.47 2.07 2.32
Screen Pixels 1080 Wide 2009 High
Total Elapsed Time 121.9 seconds
Battery
Android Java OpenGL Benchmark 4A8 20-Apr-2023 21.05
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 24.01 24.20 14.83 15.71
18000+ 8.41 8.37 6.06 6.63
36000+ 2.52 2.45 2.06 2.31
Screen Pixels 1080 Wide 2009 High
Total Elapsed Time 122.1 seconds
Battery Later
Android Java OpenGL Benchmark 4A8 23-Apr-2023 14.49
--------- Frames Per Second --------
Triangles WireFrame Shaded Shaded+ Textured
9000+ 33.77 31.61 18.93 18.33
18000+ 9.08 8.81 6.25 6.60
36000+ 2.53 2.46 2.07 2.32
Screen Pixels 1080 Wide 2009 High
Total Elapsed Time 121.9 seconds
Java Drawing Benchmark - JavaDraw.apk
This all Java benchmark uses a small to rather excessive number of simple objects to measure drawing performance, again via Frames Per Second (FPS).
Five 10 second tests draw on a background of continuously changing colour shades.
Test 1 loads a PNG file twice, the bitmaps moving for each frame, side to side or circling.
Plus Test 2 generates 2 SweepGradient multi-coloured circles moving around.
Plus Test 3 draws 200 random small circles in the middle of the screen.
Plus Test 4 draws 80 lines from the centre of each side to the opposite side, with changing colours.
Plus Test 5 draws the same small random circles as Test 3 but with 4000, filling the screen.
As with the OpenGL benchmark, these results depend on the available refresh rates and screen pixel content. In this case, System 2 was
the only one allowed to run free of the imposition of VSYNC, which limits the maximum frame rate to 60 FPS. But, as shown, 60 FPS can be
selected in Settings, showing that it was slower than System 1.
System 1 Android 11 2.05 GHz ARM Cortex-A76
Graphics Mali-76 MC4, refresh 60 Hz
Android Java Drawing Benchmark 4A8 09-Feb-2023 11.04
Test Frames FPS
Display PNG Bitmap Twice 599 59.88
Plus 2 SweepGradient Circles 601 60.03
Plus 200 Random Small Circles 601 60.03
Plus 320 Long Lines 518 51.75
Plus 4000 Random Small Circles 217 21.68
Screen pixels 1200 Wide 1928 High
Total Elapsed Time 50.1 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
Graphics 660 MHz Adreno 619, default refresh rate
Android Java Drawing Benchmark 4A8 09-Feb-2023 11.25
Test Frames FPS
Display PNG Bitmap Twice 879 87.81
Plus 2 SweepGradient Circles 893 89.22
Plus 200 Random Small Circles 844 84.37
Plus 320 Long Lines 202 20.11
Plus 4000 Random Small Circles 136 13.55
Screen pixels 1339 Wide 720 High
Total Elapsed Time 50.2 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
Graphics 660 MHz Adreno 619, refresh 60 Hz
Android Java Drawing Benchmark 4A8 09-Feb-2023 19.18
Test Frames FPS
Display PNG Bitmap Twice 497 49.48
Plus 2 SweepGradient Circles 476 47.47
Plus 200 Random Small Circles 516 51.55
Plus 320 Long Lines 209 20.85
Plus 4000 Random Small Circles 139 13.90
Screen pixels 1339 Wide 720 High
Total Elapsed Time 50.2 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
Graphics Mali-62, refresh 60 Hz
Android Java Drawing Benchmark 4A8 09-Feb-2023 15.12
Test Frames FPS
Display PNG Bitmap Twice 596 59.58
Plus 2 SweepGradient Circles 600 59.98
Plus 200 Random Small Circles 407 40.63
Plus 320 Long Lines 106 10.54
Plus 4000 Random Small Circles 74 7.33
Screen pixels 1920 Wide 1128 High
Total Elapsed Time 50.2 seconds
Armv9 CPU Phone
System 4 also does not impose VSYNC. The first three tests indicate that the graphics speed was around 33% faster than System 2, and
the last two tests, which depend on CPU speed, were around 70% faster.
System 4 Android 13 2.80 GHz Cortex-X2
Graphics Xclipse 920
Battery
Android Java Drawing Benchmark 4A8 30-Apr-2023 13.48
Test Frames FPS
Display PNG Bitmap Twice 1187 118.61
Plus 2 SweepGradient Circles 1194 119.30
Plus 200 Random Small Circles 1162 116.19
Plus 320 Long Lines 343 34.21
Plus 4000 Random Small Circles 236 23.51
Screen pixels 1080 Wide 2009 High
Total Elapsed Time 50.1 seconds
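The percentage gains quoted for System 4 can be reproduced from the FPS columns. The following is a minimal Python sketch, not part of the benchmark itself; the dictionary and function names are illustrative, and the values are copied from the System 2 (default refresh) and System 4 logs.

```python
# FPS values copied from the System 2 (default refresh) and System 4 logs.
TESTS = [
    "Display PNG Bitmap Twice",
    "Plus 2 SweepGradient Circles",
    "Plus 200 Random Small Circles",
    "Plus 320 Long Lines",
    "Plus 4000 Random Small Circles",
]
SYSTEM2_FPS = dict(zip(TESTS, [87.81, 89.22, 84.37, 20.11, 13.55]))
SYSTEM4_FPS = dict(zip(TESTS, [118.61, 119.30, 116.19, 34.21, 23.51]))


def percent_faster(new_fps, old_fps):
    """Return how much faster new_fps is than old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# One gain per test, System 4 relative to System 2.
gains = {t: percent_faster(SYSTEM4_FPS[t], SYSTEM2_FPS[t]) for t in TESTS}

for test, gain in gains.items():
    print(f"{test:32s} {gain:5.1f}% faster")
```

On these figures the first three tests come out roughly 34% to 38% faster and the last two roughly 70% to 74% faster.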
Java Whetstone Benchmark - Java Whetstone.apk
Java performed quite well on all systems, at around half the speed of the optimised compiled C version above. Compared with System 2,
some System 4 speeds were lower than expected from the MHz ratio. Then, as before, higher gains were observed with functions such as
COS and EXP.
System 1 Android 11 2.05 GHz ARM Cortex-A76
Android Java Whetstone Benchmark 4A8 02-Mar-2023 17.13
Test MFLOPS MOPS millisecs Results
N1 float 620.56 0.031 -1.124750137
N2 float 571.43 0.235 -1.131330490
N3 if 1014.71 0.102 1.000000000
N4 fixpt 2881.98 0.109 12.000000000
N5 cos 139.13 0.598 0.499110132
N6 float 274.09 1.968 0.999999821
N7 equal 630.29 0.293 3.000000000
N8 exp 72.73 0.512 0.935364604
MWIPS 2598.66 3.848
Total Elapsed Time 13.5 seconds
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
Android Java Whetstone Benchmark 4A8 02-Mar-2023 17.26
Test MFLOPS MOPS millisecs Results
N1 float 605.30 0.032 -1.124750137
N2 float 559.53 0.240 -1.131330490
N3 if 993.28 0.104 1.000000000
N4 fixpt 2720.21 0.116 12.000000000
N5 cos 134.19 0.620 0.499110132
N6 float 270.51 1.994 0.999999821
N7 equal 405.80 0.455 3.000000000
N8 exp 68.38 0.544 0.935364604
MWIPS 2435.86 4.105
Total Elapsed Time 14.6 seconds
System 3 Android 13 2.0 GHz ARM Cortex-A75
Android Java Whetstone Benchmark 4A8 02-Mar-2023 17.33
Test MFLOPS MOPS millisecs Results
N1 float 385.54 0.050 -1.124750137
N2 float 359.17 0.374 -1.131330490
N3 if 1000.00 0.104 1.000000000
N4 fixpt 1913.73 0.165 12.000000000
N5 cos 125.02 0.666 0.499110132
N6 float 184.60 2.922 0.999999821
N7 equal 310.33 0.596 3.000000000
N8 exp 59.71 0.623 0.935364604
MWIPS 1818.81 5.498
System 4 Android 13 1x 2.80 GHz Cortex-X2
Battery
Android Java Whetstone Benchmark 4A8 30-Apr-2023 13.44
Test MFLOPS MOPS millisecs Results System 4/System 2
N1 float 798.00 0.024 -1.124750137 1.32
N2 float 736.04 0.183 -1.131330490 1.32
N3 if 1352.94 0.077 1.000000000 1.36
N4 fixpt 4186.05 0.075 12.000000000 1.54
N5 cos 227.32 0.366 0.499110132 1.69
N6 float 367.44 1.468 0.999999821 1.36
N7 equal 835.44 0.221 3.000000000 2.06
N8 exp 101.20 0.368 0.935364604 1.48
MWIPS 3595.56 2.781 1.48
Total Elapsed Time 15.8 seconds
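To see which tests fall short of the clock speed ratio, the System 4/System 2 column can be divided by the 2.80/2.0 GHz ratio. This is a hedged sketch in Python; the names are illustrative, the scores are copied from the two logs above, and the clock speeds are the nominal values from the system headings.

```python
CLOCK_RATIO = 2.80 / 2.0  # nominal Cortex-X2 GHz over Cortex-A76 GHz

# (System 2, System 4) MFLOPS or MOPS, copied from the logs above.
SCORES = {
    "N1 float": (605.30, 798.00),
    "N2 float": (559.53, 736.04),
    "N3 if": (993.28, 1352.94),
    "N4 fixpt": (2720.21, 4186.05),
    "N5 cos": (134.19, 227.32),
    "N6 float": (270.51, 367.44),
    "N7 equal": (405.80, 835.44),
    "N8 exp": (68.38, 101.20),
}


def per_clock_gain(old, new, clock_ratio=CLOCK_RATIO):
    """Speed ratio divided by clock ratio; below 1.0 means slower per MHz."""
    return (new / old) / clock_ratio


# Tests where System 4 gained less than its clock advantage.
below_clock = [name for name, (old, new) in SCORES.items()
               if per_clock_gain(old, new) < 1.0]

for name, (old, new) in SCORES.items():
    print(f"{name:9s} ratio {new / old:4.2f}  per clock {per_clock_gain(old, new):4.2f}")
```

With these numbers the three floating point tests and the N3 if test fall below the 1.4 clock ratio, while N4, N5, N7 and N8 exceed it.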
Java Linpack Benchmark - LinpackJava.apk
The Java version carries out double precision floating point calculations. Performance is shown to be much slower than the C results, where
the sumcheck values are the same as here, showing that identical arithmetic calculations were executed. The System 4 speed was 2.65
times that of the older System 2.
System 1 Android 11 2.05 GHz ARM Cortex-A76
Android Java Linpack Benchmark 4A8 03-Mar-2023 10.52
Speed 920.22 MFLOPS
norm. resid 1.67
resid 7.41628980e-14
machep 2.22044605e-16
x[0]-1 -1.49880108e-14
x[n-1]-1 -1.89848137e-14
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
Android Java Linpack Benchmark 4A8 03-Mar-2023 10.49
Speed 884.88 MFLOPS
norm. resid 1.67
resid 7.41628980e-14
machep 2.22044605e-16
x[0]-1 -1.49880108e-14
x[n-1]-1 -1.89848137e-14
System 3 Android 13 2.0 GHz ARM Cortex-A75
Android Java Linpack Benchmark 4A8 03-Mar-2023 10.56
Speed 645.24 MFLOPS
norm. resid 1.67
resid 7.41628980e-14
machep 2.22044605e-16
x[0]-1 -1.49880108e-14
x[n-1]-1 -1.89848137e-14
System 4 Android 13 1x 2.80 GHz Cortex-X2
Battery
Android Java Linpack Benchmark 4A8 30-Apr-2023 13.46
Speed 2346.11 MFLOPS
norm. resid 1.67
resid 7.41628980e-14
machep 2.22044605e-16
x[0]-1 -1.49880108e-14
x[n-1]-1 -1.89848137e-14
System 4/System 2 MFLOPS 2.65
DriveSpeed Benchmarks - DriveSpd1.apk
DriveSpeed carries out four tests.
Test 1 - Write and read three 8 and 16 MB files; Results given in MBytes/second
Test 2 - Write three 8 MB files, read can be cached in RAM; Results given in MBytes/second
Test 3 - Random write and read 1 KB from 4 to 16 MB; Results are average time in milliseconds
Test 4 - Write and read 200 files 4 KB to 16 KB; Results in MB/sec, msecs/file and delete seconds.
Buttons - RunS (SD card, not used now), RunI (main drive), More > (Don't Delete, Read Only or Both) and Save (see below).
As can be seen, there were wide variations in measured performance, making it difficult to declare a winner, but System 3 appears to have
the greatest number of lowest scores. Random reading times were too short to register within the calculations used.
This was not run on System 4.
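The idea behind Test 1, timing a file write followed by a read and reporting MBytes/second, can be sketched as below. This is an illustration in Python under stated assumptions, not the DriveSpeed source: the function name, the 1 MB block size and the use of a temporary file are all hypothetical choices.

```python
import os
import tempfile
import time

MB = 1024 * 1024


def time_write_read(size_mb, directory=None):
    """Write, then read back, one temporary file; return (write, read) MB/second."""
    block = bytes(MB)  # 1 MB of zeros, an assumed data pattern
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as out:
            for _ in range(size_mb):
                out.write(block)
            out.flush()
            os.fsync(out.fileno())  # ask the OS to push data to the device
        write_secs = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as src:
            while src.read(MB):  # read until end of file
                pass
        read_secs = time.perf_counter() - start
    finally:
        os.remove(path)
    # Guard against a zero interval on very fast, cached reads.
    return size_mb / max(write_secs, 1e-9), size_mb / max(read_secs, 1e-9)


if __name__ == "__main__":
    for size in (8, 16):
        write_mbs, read_mbs = time_write_read(size)
        print(f"{size:3d} MB  write {write_mbs:8.1f}  read {read_mbs:8.1f} MB/sec")
```

A read that immediately follows a write is likely to be served from RAM, which fits the "Data Cached" label on the logs and the much lower speeds in the Read Only runs.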
System 1 Android 11 2.05 GHz ARM Cortex-A76
Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.30
Internal Drive Data Cached
Compiled for 64 bit ARM v8a
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 1249.5 1264.4 1293.2 2927.6 2978.2 3162.4
16 1272.8 1314.8 1335.7 2970.1 3168.8 3539.9
Cached
8 871.2 455.3 1264.1 2847.8 3026.6 3206.2
Random Write Read
From MB 4 8 16 4 8 16
msecs 0.16 0.16 0.19 0.00 0.00 0.00
200 Files Write Read Delete
File KB 4 8 16 4 8 16 secs
MB/sec 16.70 35.42 60.87 126.61 245.65 344.08
msecs 0.25 0.23 0.27 0.03 0.03 0.05 0.027
No delete
Total Elapsed Time 16.4 seconds
Path Used /data/user/0/com.drivespeed/files/
Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.37
Internal Drive Read Only
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 0.0 0.0 0.0 420.3 396.7 420.9
System 2 Android 12 2.0 GHz Snapdragon 750 (Cortex-A76)
Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.43
Internal Drive Data Cached
Compiled for 64 bit ARM v8a
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 1661.5 1649.2 1831.3 1993.7 2369.6 2969.9
16 1669.1 1530.7 1117.6 2125.8 2612.2 2167.9
Cached
8 1070.1 1557.8 1790.7 2124.0 2607.2 3217.0
Random Write Read
From MB 4 8 16 4 8 16
msecs 0.22 0.43 0.47 0.00 0.00 0.00
200 Files Write Read Delete
File KB 4 8 16 4 8 16 secs
MB/sec 44.73 83.50 70.39 388.90 455.49 435.65
msecs 0.09 0.10 0.23 0.01 0.02 0.04 0.011
No delete
Total Elapsed Time 16.3 seconds
Path Used /data/user/0/com.drivespeed/files/
System 2
Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.45
Internal Drive Read Only
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 0.0 0.0 0.0 338.4 425.1 393.9
System 3 Android 13 2.0 GHz ARM Cortex-A75
Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.51
Internal Drive Data Cached
Compiled for 64 bit ARM v8a
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 849.8 1095.8 1478.3 2370.3 2270.8 2500.7
16 1519.2 1351.3 1234.3 1760.5 1853.6 1810.0
Cached
8 1612.1 1493.0 1262.3 2056.7 2007.7 1926.3
Random Write Read
From MB 4 8 16 4 8 16
msecs 0.36 0.37 0.36 0.00 0.00 0.00
200 Files Write Read Delete
File KB 4 8 16 4 8 16 secs
MB/sec 66.03 178.95 323.02 519.97 837.43 1283.88
msecs 0.06 0.05 0.05 0.01 0.01 0.01 0.006
No delete
Total Elapsed Time 16.5 seconds
Path Used /data/user/0/com.drivespeed/files/
Android DriveSpeed1 Benchmark 4A8 05-Mar-2023 10.59
Internal Drive Read Only
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 0.0 0.0 0.0 169.3 167.5 192.1
System 3 SD Card Option
Using RunS produces results on the latest versions of Android, but does not access the SD card. Following is the start of a log produced by
selecting this button without the SD card inserted; the log is the same with the card in place. So, it is using a different file path on the
internal drive. Writing speeds were much slower than via RunI but, on using the Read Only procedures, reading performance was the
same.
MBytes/Second
MB Write1 Write2 Write3 Read1 Read2 Read3
8 60.0 62.5 62.3 1211.5 1140.5 1106.5
16 65.5 67.5 56.5 1256.2 1756.7 2147.9
Path Used /storage/emulated/0/
CPU Stress Tests - MP-FPU-Stress.apk, MP-Int-Stress.apk, CP_MHz2.apk
USE AT YOUR OWN RISK
There are two main stress test programs that can use multiple threads to exercise (presently) all CPU cores, one using floating point
instructions, and the other carrying out integer arithmetic. Further detail is covered in the earlier report - Android Benchmarks For 32 Bit and
64 Bit CPUs from ARM Intel and MIPS.pdf. The third program monitors MHz of up to 8 cores. Each of the stress test applications has five
buttons:
RunB - Run Benchmark - Runs most combinations of number of threads, data sizes and calculations per data word for the FPU tests. This is
mainly to help to decide which options to use for stress testing. The benchmark runs using fixed parameters, carrying out exactly the same
number of calculations using all thread combinations and data sizes. The pass count changes according to the number of calculations per
word, for the FPU tests.
RunS - Run Stress Tests - Default running time is 15 minutes, with the middle data size, intended for containment in L2 cache, using 8
threads and, in the FPU tests, 32 operations per word.
False Errors - These can be caused if the run button is tapped again when the tests are running. The main unique symptoms are multiple
“End Time” message displays.
SetS - Specify run time parameters for stress test - These are 1, 2, 4, 8, 16 or 32 threads, 2, 8 or 32 Operations per word for FPU tests,
12.8 or 16 KB, 128 or 160 KB, 12.8 or 16 MB for FPU or Integer tests, and running time in minutes.
Info - Test description and details - This is essentially the same as details provided here.
Save - This provides alternative methods to divert the logged output. Currently I select the Google Drive option, allowing me to access the
files on my PCs.
Unexpected Faster Speed - Performance depends on whether the data comes from caches or RAM. Then, increasing the number of
threads can lead to CPU cores using dedicated smaller and faster caches.
Sumchecks - The programs include sumchecks to show whether the correct arithmetic calculations were produced, as shown for the
benchmark results. For integers, each test section uses a different data pattern for all words, checked by the program after manipulation.
Floating point numeric results depend on the number of calculations carried out, constant for stress test reported time slots, easily verified
manually.
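The integer sumcheck idea, one data pattern per test section that is verified after manipulation, can be sketched as follows. This is a Python illustration only: the pattern values are those shown in the Sumcheck column of the logs, but the XOR and add/subtract operations, the per-pass constant and the function name are assumptions, since the report does not describe the actual manipulation.

```python
# Patterns as they appear in the Sumcheck column of the integer test logs.
PATTERNS = [0x00000000, 0xFFFFFFFF, 0x5A5A5A5A, 0xAAAAAAAA, 0xCCCCCCCC, 0x0F0F0F0F]
MASK32 = 0xFFFFFFFF


def run_section(pattern, words=4096, passes=8):
    """Fill a buffer with one pattern, disturb and restore it, then verify every word."""
    data = [pattern] * words
    for p in range(passes):
        key = (0x9E3779B9 * (p + 1)) & MASK32  # hypothetical per-pass constant
        # Disturb every word; XOR and 32 bit addition are both reversible.
        data = [((w ^ key) + key) & MASK32 for w in data]
        # Undo the two operations in the opposite order.
        data = [((w - key) & MASK32) ^ key for w in data]
    same_all = all(w == pattern for w in data)
    return data[0], same_all


for pattern in PATTERNS:
    sumcheck, same_all = run_section(pattern)
    print(f"{sumcheck:08X} {'Yes' if same_all else 'No'}")
```

If any word were corrupted along the way, the final comparison would report No, which corresponds to the Same All column in the logs.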
CP_MHz2 measurements are instantaneous values taken at a constant sampling interval, not averages over that interval. The program has Set,
Run and Save buttons, as above. Default running time is 15 minutes, with samples every 10 seconds.
Example results of the Stress Test Benchmarks are given later, followed by extended reliability-type tests. Those for the stress tests are from
logs using default parameters, with 15 minutes running time. Some of the latter include only the necessary detail. Examples of full output are
as follows.
ARM/Intel MP-Int Stress Test 4A8 09-Mar-2023 10.24.37
Compiled for 64 bit ARM v8a
Data Same All
Seconds Size Threads MB/sec Sumcheck Threads
8.7 160 KB 8 57397 00000000 Yes
17.4 160 KB 8 56966 00000000 Yes
ARM/Intel MP-FPU Stress Test 4A8 13-Mar-2023 11.59.35
Compiled for 64 bit ARM v8a
Data Ops/ Numeric
Seconds Size Threads Word MFLOPS Results
9.4 128 KB 8 32 38431 35216
18.6 128 KB 8 32 37721 35216
As seen via the CPU-Z utility app, core MHz values change at extremely rapid rates. Here, CP_MHz2.apk provides samples at a
selected interval in seconds, which are representative values and not averages. Example output:
MHz Measurement Test 4A8 13-Mar-2023 12.00.55
Running time 15 minutes, 30 second samples
MHz for Core
Secs 0 1 2 3 4 5 6 7
0.00 1805 1478 1805 1805 1805 1805 1651 1651
30.10 1805 1805 1805 1805 1805 1805 2035 2035
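On Linux-based systems, one common way to obtain an instantaneous per-core frequency is to read the cpufreq files under /sys/devices/system/cpu. Whether CP_MHz2 uses this route is not stated in the report, so the following Python sketch is an assumption for illustration; the function names are hypothetical, and access to these files may be restricted on some Android versions.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")  # standard Linux cpufreq location


def read_core_mhz(core, root=CPU_ROOT):
    """Return the current frequency of one core in MHz, or None if unreadable."""
    path = Path(root) / f"cpu{core}" / "cpufreq" / "scaling_cur_freq"
    try:
        return int(path.read_text().strip()) // 1000  # the file holds kHz
    except (OSError, ValueError):
        return None


def sample_row(elapsed_secs, cores=8, root=CPU_ROOT):
    """Format one log row: elapsed seconds, then one MHz value per core."""
    values = [read_core_mhz(c, root) for c in range(cores)]
    cells = " ".join("   -" if v is None else f"{v:4d}" for v in values)
    return f"{elapsed_secs:8.2f} {cells}"


if __name__ == "__main__":
    print(sample_row(0.0))
```

Each call gives a single snapshot, which matches the description of the samples as representative rather than averaged.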
Integer Stress Test Benchmark
Measured performance was similar to earlier tests, such as MP-RandMem Serial Read, but shows improved throughput using more than eight
threads. Maximum single core Integer MOPS (Million Operations Per Second) would be around 2400 for System 1 and 3800 for System 2,
the latter in particular suggesting SIMD activity.
The usual relative performance attributes are shown to apply, with System 2 indicated as much faster, with cache based data, using 1 or 2
threads, then possibly slower at 4 and 8.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and
6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-Int Stress Test 4A8 07-Mar-2023 10.40.16
Compiled for 64 bit ARM v8a
MB/second
KB KB MB Same All
Secs Thrds 16 160 16 Sumcheck Tests
1.8 1 14159 14594 13354 00000000 Yes
1.2 2 21954 29948 13697 FFFFFFFF Yes
1.1 4 32124 32881 13805 5A5A5A5A Yes
1.0 8 41607 40944 14064 AAAAAAAA Yes
1.0 16 42412 44068 13862 CCCCCCCC Yes
0.8 32 42941 50142 20698 0F0F0F0F Yes
End Time 07-Mar-2023 10.40.31
System 2 Android 12 2.0 GHz Snapdragon 750 2 x 2.0 GHz
Cortex-A76 and 6 x 1.8 GHz Cortex-A55
ARM/Intel MP-Int Stress Test 4A8 07-Mar-2023 10.44.17
Compiled for 64 bit ARM v8a
MB/second
KB KB MB Same All
Secs Thrds 16 160 16 Sumcheck Tests
1.8 1 15333 14398 12557 00000000 Yes
1.2 2 25656 25554 13615 FFFFFFFF Yes
1.2 4 29025 31166 13079 5A5A5A5A Yes
1.1 8 43667 40739 12317 AAAAAAAA Yes
1.0 16 39954 43161 13182 CCCCCCCC Yes
0.9 32 40849 42656 15047 0F0F0F0F Yes
End Time 07-Mar-2023 10.44.27
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and
6 x 2.0 GHz Cortex-A55
ARM/Intel MP-Int Stress Test 4A8 07-Mar-2023 10.48.58
Compiled for 64 bit ARM v8a
MB/second
KB KB MB Same All
Secs Thrds 16 160 16 Sumcheck Tests
2.8 1 11252 11433 6011 00000000 Yes
1.9 2 20286 16018 8505 FFFFFFFF Yes
1.7 4 24332 23788 8086 5A5A5A5A Yes
1.5 8 36755 33932 8156 AAAAAAAA Yes
1.4 16 37736 39228 8096 CCCCCCCC Yes
1.1 32 35649 36291 12974 0F0F0F0F Yes
End Time 07-Mar-2023 10.49.16
Armv9 CPU Phone
Performance gains over the older device continued to be similar to MP-RandMem Serial Read, except for RAM speed improvements, which
were more significant. This time, performance on power was significantly faster than on battery.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
System 4 Battery
ARM/Intel MP-Int Stress Test 4A8 23-Apr-2023 14.41.16
Compiled for 64 bit ARM v8a
MB/second
KB KB MB Same All
Secs Thrds 16 160 16 Sumcheck Tests
1.6 1 19675 16316 13029 00000000 Yes
1.1 2 31241 28440 15894 FFFFFFFF Yes
0.9 4 46282 40016 16222 5A5A5A5A Yes
0.7 8 59097 56981 18473 AAAAAAAA Yes
0.5 16 63286 67726 30086 CCCCCCCC Yes
0.4 32 65657 64560 61397 0F0F0F0F Yes
End Time 23-Apr-2023 14.41.27
System 4 Power
ARM/Intel MP-Int Stress Test 4A8 20-Apr-2023 20.51.13
Compiled for 64 bit ARM v8a
MB/second
KB KB MB Same All
Secs Thrds 16 160 16 Sumcheck Tests
1.2 1 23224 20831 19265 00000000 Yes
0.9 2 38975 37282 18468 FFFFFFFF Yes
0.5 4 62257 66630 40302 5A5A5A5A Yes
0.4 8 82663 90286 51540 AAAAAAAA Yes
0.3 16 88619 89234 72478 CCCCCCCC Yes
0.3 32 94039 86710 74422 0F0F0F0F Yes
End Time 20-Apr-2023 20.51.21
System 4 Power/System 2
Thrds 16 KB 160 KB 16 MB
1 1.51 1.45 1.53
2 1.52 1.46 1.36
4 2.14 2.14 3.08
8 1.89 2.22 4.18
16 2.22 2.07 5.50
32 2.30 2.03 4.95
System 4 Power/Battery
Thrds 16 KB 160 KB 16 MB
1 1.18 1.28 1.48
2 1.25 1.31 1.16
4 1.35 1.67 2.48
8 1.40 1.58 2.79
16 1.40 1.32 2.41
32 1.43 1.34 1.21
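Multiprocessor scaling can be read from the same tables by dividing each MB/second figure by the single thread result. The Python sketch below does this for the 16 KB column, using values copied from the System 2 and System 4 Power logs; the function name and the efficiency measure (speedup divided by thread count) are illustrative choices, and with mixed core types an efficiency of 1.0 should not be expected.

```python
# 16 KB column, MB/second by thread count, copied from the logs above.
SYSTEM2_16KB = {1: 15333, 2: 25656, 4: 29025, 8: 43667}
SYSTEM4_POWER_16KB = {1: 23224, 2: 38975, 4: 62257, 8: 82663}


def scaling(results):
    """Map thread count to (speedup over 1 thread, speedup / threads)."""
    base = results[1]
    return {n: (mbs / base, mbs / base / n) for n, mbs in results.items()}


s2 = scaling(SYSTEM2_16KB)
s4 = scaling(SYSTEM4_POWER_16KB)

for n in sorted(s4):
    print(f"{n} threads  System 2 x{s2[n][0]:4.2f} ({s2[n][1]:4.2f})  "
          f"System 4 x{s4[n][0]:4.2f} ({s4[n][1]:4.2f})")
```

On these figures, eight threads deliver about 2.8 times the single thread speed on System 2 and about 3.6 times on System 4.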
Floating Point Stress Test Benchmark
This program uses the same C code as MP-MFLOPS, with the addition of tests using 8 floating point calculations per data word read/written.
Performance was also similar, including variations with multithreaded activity, apparent in results from multiple runs.
Again, at 12.8 and 128 KB, System 2 was much faster using 1 or 2 threads, but not so at more than 2.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and
6 x 2.0 GHz ARM Cortex-A55
ARM/Intel MP-FPU Stress Test 4A8 07-Mar-2023 10.41.57
Compiled for 64 bit ARM v8a
MFLOPS Numeric Results
Ops/ KB KB MB KB KB MB
Secs Thrd Word 12.8 128 12.8 12.8 128 12.8
0.3 T1 2 9427 8174 3316 40392 76406 99700
0.4 T2 2 12505 9288 2517 40392 76406 99700
0.4 T4 2 11865 15337 2318 40392 76406 99700
0.4 T8 2 14857 16797 2240 40392 76406 99700
0.7 T1 8 12064 11755 11519 54760 85092 99819
0.5 T2 8 22060 21418 10649 54760 85092 99819
0.5 T4 8 26292 24186 9696 54760 85092 99819
0.5 T8 8 26257 24723 8943 54760 85092 99819
2.5 T1 32 12560 12096 11976 35218 66014 99520
1.4 T2 32 20570 23527 22632 35218 66014 99520
1.2 T4 32 25966 26414 25899 35218 66014 99520
1.1 T8 32 28518 30202 28717 35218 66014 99520
End Time 07-Mar-2023 10.42.09
System 2 Android 12 2.0 GHz Snapdragon 750 2 x 2.0 GHz
Cortex-A76 and 6 x 1.8 GHz Cortex-A55
ARM/Intel MP-FPU Stress Test 4A8 07-Mar-2023 10.46.20
Compiled for 64 bit ARM v8a
MFLOPS Numeric Results
Ops/ KB KB MB KB KB MB
Secs Thrd Word 12.8 128 12.8 12.8 128 12.8
0.4 T1 2 7773 7983 2859 40392 76406 99700
0.4 T2 2 8975 7726 2545 40392 76406 99700
0.4 T4 2 8026 7542 2467 40392 76406 99700
0.4 T8 2 13882 11752 2336 40392 76406 99700
0.7 T1 8 11229 10090 11035 54760 85092 99819
0.6 T2 8 15553 17641 10259 54760 85092 99819
0.6 T4 8 18031 15945 10135 54760 85092 99819
0.5 T8 8 21272 21474 9410 54760 85092 99819
2.5 T1 32 11955 11956 12435 35218 66014 99520
1.4 T2 32 22202 22806 22787 35218 66014 99520
1.3 T4 32 23857 24021 25369 35218 66014 99520
1.0 T8 32 28250 32201 28726 35218 66014 99520
End Time 07-Mar-2023 10.46.33
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and
6 x 2.0 GHz Cortex-A55
ARM/Intel MP-FPU Stress Test 4A8 07-Mar-2023 10.50.13
Compiled for 64 bit ARM v8a
MFLOPS Numeric Results
Ops/ KB KB MB KB KB MB
Secs Thrd Word 12.8 128 12.8 12.8 128 12.8
0.7 T1 2 5440 4195 1617 40392 76406 99700
0.5 T2 2 9855 10851 1781 40392 76406 99700
0.5 T4 2 8167 8485 1881 40392 76406 99700
0.5 T8 2 12014 10806 1847 40392 76406 99700
1.3 T1 8 6384 6381 5647 54760 85092 99819
0.8 T2 8 12496 12140 6674 54760 85092 99819
0.8 T4 8 12311 11922 7397 54760 85092 99819
0.6 T8 8 17907 17982 7476 54760 85092 99819
4.5 T1 32 6903 6912 6866 35218 66014 99520
2.2 T2 32 13696 13797 13740 35218 66014 99520
2.0 T4 32 13620 16951 16788 35218 66014 99520
1.4 T8 32 21211 21290 22181 35218 66014 99520
End Time 07-Mar-2023 10.50.32
Armv9 CPU Phone
Battery/Power performance comparisons indicated wide variances, but overall elapsed time was much longer on power, unlike the integer
stress testing benchmark, where it was somewhat shorter.
System 4 Android 13 1x 2.80 GHz Cortex-X2, 4x 1.82 GHz
Cortex A510, 3x 2.52 GHz Cortex A710
System 4 Battery
ARM/Intel MP-FPU Stress Test 4A8 23-Apr-2023 14.45.52
Compiled for 64 bit ARM v8a
MFLOPS Numeric Results
Ops/ KB KB MB KB KB MB
Secs Thrd Word 12.8 128 12.8 12.8 128 12.8
0.2 T1 2 15743 13802 6168 40392 76406 99700
0.1 T2 2 23790 22564 8635 40392 76406 99700
0.1 T4 2 31487 16944 11190 40392 76406 99700
0.1 T8 2 29239 16754 14704 40392 76406 99700
0.5 T1 8 17614 16465 14614 54760 85092 99819
0.4 T2 8 23473 21702 13270 54760 85092 99819
0.4 T4 8 28836 22915 14793 54760 85092 99819
0.3 T8 8 35877 33822 26051 54760 85092 99819
1.7 T1 32 14379 21304 22032 35218 66014 99520
1.1 T2 32 24714 27766 30000 35218 66014 99520
0.7 T4 32 44493 37534 46516 35218 66014 99520
0.7 T8 32 40943 39881 52404 35218 66014 99520
End Time 23-Apr-2023 14.46.02
System 4 Power
ARM/Intel MP-FPU Stress Test 4A8 20-Apr-2023 20.49.55
Compiled for 64 bit ARM v8a
MFLOPS Numeric Results
Ops/ KB KB MB KB KB MB
Secs Thrd Word 12.8 128 12.8 12.8 128 12.8
0.2 T1 2 13959 13834 5427 40392 76406 99700
0.1 T2 2 21365 24557 9061 40392 76406 99700
0.1 T4 2 21907 21840 12173 40392 76406 99700
0.1 T8 2 18322 31692 12821 40392 76406 99700
0.5 T1 8 17088 17742 16266 54760 85092 99819
0.4 T2 8 23468 22740 13810 54760 85092 99819
0.4 T4 8 31470 24004 14281 54760 85092 99819
0.3 T8 8 28966 26081 23677 54760 85092 99819
1.7 T1 32 14975 20595 21972 35218 66014 99520
1.2 T2 32 24720 26515 28342 35218 66014 99520
0.8 T4 32 45125 33106 45770 35218 66014 99520
0.7 T8 32 49057 37660 46982 35218 66014 99520
End Time 20-Apr-2023 20.50.18
System 4 Power/System 2
Thrd Ops/Word 12.8 KB 128 KB 12.8 MB
T1 2 1.80 1.73 1.90
T2 2 2.38 3.18 3.56
T4 2 2.73 2.90 4.93
T8 2 1.32 2.70 5.49
T1 8 1.52 1.76 1.47
T2 8 1.51 1.29 1.35
T4 8 1.75 1.51 1.41
T8 8 1.36 1.21 2.52
T1 32 1.25 1.72 1.77
T2 32 1.11 1.16 1.24
T4 32 1.89 1.38 1.80
T8 32 1.74 1.17 1.64
System 4 Battery/Power
Thrd Ops/Word 12.8 KB 128 KB 12.8 MB
T1 2 1.13 1.00 1.14
T2 2 1.11 0.92 0.95
T4 2 1.44 0.78 0.92
T8 2 1.60 0.53 1.15
T1 8 1.03 0.93 0.90
T2 8 1.00 0.95 0.96
T4 8 0.92 0.95 1.04
T8 8 1.24 1.30 1.10
T1 32 0.96 1.03 1.00
T2 32 1.00 1.05 1.06
T4 32 0.99 1.13 1.02
T8 32 0.83 1.06 1.12
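The overall elapsed times for the System 4 battery and power runs can be derived from the start stamp in each log header and its End Time line. The Python sketch below parses the stamps by hand to avoid locale-dependent month names; the names are illustrative and the stamps are copied from the System 4 integer and floating point benchmark logs.

```python
from datetime import datetime

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def parse_stamp(stamp):
    """Parse a log stamp such as '23-Apr-2023 14.45.52'."""
    date_part, time_part = stamp.split()
    day, month, year = date_part.split("-")
    hour, minute, second = (int(x) for x in time_part.split("."))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def elapsed_seconds(start, end):
    """Seconds between a log's start stamp and its End Time stamp."""
    return (parse_stamp(end) - parse_stamp(start)).total_seconds()


# (start, end) stamps copied from the System 4 benchmark logs.
RUNS = {
    "Integer Battery": ("23-Apr-2023 14.41.16", "23-Apr-2023 14.41.27"),
    "Integer Power": ("20-Apr-2023 20.51.13", "20-Apr-2023 20.51.21"),
    "FPU Battery": ("23-Apr-2023 14.45.52", "23-Apr-2023 14.46.02"),
    "FPU Power": ("20-Apr-2023 20.49.55", "20-Apr-2023 20.50.18"),
}

elapsed = {name: elapsed_seconds(*stamps) for name, stamps in RUNS.items()}

for name, secs in elapsed.items():
    print(f"{name:16s} {secs:4.0f} seconds")
```

These stamps give 11 and 8 seconds for the integer runs on battery and power, against 10 and 23 seconds for the floating point runs.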
Integer Stress Tests
Following are results from 15 minute tests at 160 KB and 8 threads. MHz samples were taken at 30 second intervals, with MB/second
averaged over the same time slot. System 1 tests were run both with power connected and on battery, the latter starting with 5% available
charge, without a major reduction in performance.
In all cases, CPU MHz of each of the six LITTLE CPU cores was essentially constant, performance degradation being imposed by MHz
reductions on the two main cores. Performance of System 2 was better than System 1, in spite of its LITTLE CPU cores running at lower MHz.
This is probably because System 2 was produced at a later fabrication level. As expected, the older technology based System 3 was the
slowest.
System                 1 Power   1 Battery   2 Power   3 Power
Mean MB/second           48110       48088     54838     39839
Usual Slow CPU MHz        2000        2000      1805      2002
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
MHz for Core
Secs MB/sec 0 1 2 3 4 5 6 7 Average
0 52349
30 51608 2000 2000 2000 2000 2000 2000 2050 2050 2013
60 48982 2000 2000 2000 2000 2000 2000 1796 1796 1949
90 46641 1275 875 1275 1175 1375 1275 1986 1986 1403
120 50087 2000 2000 2000 2000 2000 1800 1308 1308 1802
150 49026 2000 2000 2000 2000 2000 2000 1530 1530 1883
180 46743 2000 2000 2000 2000 2000 2000 1530 1419 1869
210 48994 2000 2000 2000 2000 2000 2000 1733 1733 1933
240 49110 2000 2000 2000 2000 2000 2000 1530 1530 1883
270 48631 2000 2000 2000 2000 2000 2000 1419 1419 1855
300 48052 2000 2000 2000 2000 2000 2000 1530 1530 1883
330 48752 2000 2000 2000 2000 2000 2000 1530 1308 1855
360 47384 2000 2000 2000 2000 2000 2000 1419 1530 1869
390 48812 2000 2000 2000 2000 2000 2000 1530 1419 1869
420 47352 2000 2000 2000 2000 2000 2000 1530 1530 1883
450 46944 2000 2000 2000 2000 2000 2000 1419 1419 1855
480 47086 2000 2000 2000 2000 2000 2000 1419 1419 1855
510 47789 2000 2000 2000 2000 2000 2000 1419 1419 1855
540 47799 2000 2000 2000 2000 2000 2000 1169 1308 1810
570 46693 2000 2000 2000 2000 2000 2000 1308 1419 1841
600 49389 2000 2000 2000 2000 2000 2000 1419 1308 1841
630 48092 2000 2000 2000 2000 2000 2000 1419 1308 1841
660 47454 2000 2000 2000 2000 2000 2000 1419 1419 1855
690 46836 2000 2000 2000 2000 2000 2000 1530 1530 1883
720 47261 2000 2000 2000 2000 2000 2000 1308 1419 1841
750 47122 2000 2000 2000 2000 2000 2000 1419 1419 1855
780 47362 2000 2000 2000 2000 2000 2000 1169 1419 1824
810 48045 2000 2000 2000 2000 2000 2000 1419 1419 1855
840 46429 1175 1933 2000 2000 2000 2000 1530 1419 1757
870 46835 2000 2000 2000 2000 2000 2000 1419 1308 1841
900 47738 1866 1866 1866 1866 2000 2000 1419 1530 1802
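The summary figures can be checked against this table. The Python sketch below averages the 31 MB/sec samples of the System 1 power run, including the sample at 0 seconds, and recomputes the Average MHz column for one row; the variable and function names are illustrative.

```python
# MB/sec samples from the System 1 power run, 0 to 900 seconds.
MB_PER_SEC = [
    52349, 51608, 48982, 46641, 50087, 49026, 46743, 48994, 49110, 48631,
    48052, 48752, 47384, 48812, 47352, 46944, 47086, 47789, 47799, 46693,
    49389, 48092, 47454, 46836, 47261, 47122, 47362, 48045, 46429, 46835,
    47738,
]


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


mean_mbs = average(MB_PER_SEC)

# Row at 60 seconds: six LITTLE cores at 2000 MHz, two main cores at 1796 MHz.
row_60s = average([2000] * 6 + [1796, 1796])

print(f"Mean MB/second {mean_mbs:.0f}")
print(f"Average MHz at 60 seconds {row_60s:.0f}")
```

The mean rounds to 48110, matching the summary line, and the 60 second row gives 1949, matching its Average column.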
System 1 Battery - Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
0 53347
30 52694 2000 2000 2000 2000 2000 2000 1923 2050 1997
60 48780 2000 2000 2000 2000 2000 2000 1733 1733 1933
90 49702 2000 2000 2000 2000 2000 2000 1670 1530 1900
120 49449 2000 2000 2000 2000 2000 2000 1530 1670 1900
150 49864 1075 1375 1375 1375 1375 1075 1986 1419 1382
180 49477 2000 2000 2000 2000 2000 2000 1530 1530 1883
210 47739 2000 2000 2000 2000 2000 2000 1530 1530 1883
240 47961 2000 2000 2000 2000 2000 2000 1530 1530 1883
270 46765 2000 2000 2000 2000 2000 2000 1419 1419 1855
300 48323 2000 2000 2000 2000 2000 2000 1670 1419 1886
330 46877 2000 2000 2000 2000 2000 2000 919 919 1730
360 48398 2000 2000 2000 2000 2000 2000 1670 1670 1918
390 47699 2000 2000 2000 2000 2000 2000 1419 1419 1855
420 46764 2000 2000 2000 2000 2000 2000 1419 1419 1855
450 48355 2000 2000 2000 2000 2000 2000 1308 1419 1841
480 46643 2000 2000 2000 2000 2000 2000 1419 1419 1855
510 47094 1933 1933 1933 1933 1933 1933 1308 1085 1749
540 47462 2000 2000 2000 2000 2000 2000 1419 1419 1855
570 47156 2000 2000 2000 2000 2000 2000 1530 1530 1883
600 47482 2000 2000 2000 2000 2000 2000 1419 1419 1855
630 47205 2000 2000 2000 2000 2000 2000 1419 1419 1855
660 46806 2000 2000 2000 2000 2000 2000 1419 1419 1855
690 47632 2000 2000 2000 2000 2000 2000 1419 1419 1855
720 45909 1800 1800 1800 1800 1800 1800 1419 1419 1705
750 45615 1866 1866 1866 1866 1866 1866 1085 1419 1713
780 47168 1866 1866 1866 1866 1866 1866 1419 1085 1713
810 26772 2000 2000 2000 2000 2000 2000 774 774 1694
840 46179 2000 2000 2000 2000 2000 2000 1419 1419 1855
870 46743 1933 1933 1933 1933 1933 1933 1308 1419 1791
900 45630 1933 1933 1933 1933 1933 1933 1419 1419 1805
Integer Stress Tests Continued
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
MHz for Core
Secs MB/sec 0 1 2 3 4 5 6 7 Average
0 57397
30 56976 1805 1805 1805 1805 1805 1805 2035 2035 1863
60 56325 1805 1805 1805 1805 1805 1805 2035 2035 1863
90 56726 1805 1805 1805 1805 1805 1805 2035 2035 1863
120 56830 1805 1805 1805 1805 1805 1805 2035 2035 1863
150 56265 1805 1805 1805 1805 1805 1805 2035 2035 1863
180 56821 1805 1805 1805 1805 1805 1805 2035 2035 1863
210 56761 1805 1805 1805 1805 1805 1805 2035 2035 1863
240 56769 1805 1805 1805 1805 1805 1805 2035 2035 1863
270 56569 1805 1805 1805 1805 1805 1805 2035 2035 1863
300 56707 1805 1805 1805 1805 1805 1805 2035 2035 1863
330 56857 1805 1805 1805 1805 1805 1805 2035 2035 1863
360 56524 1805 1805 1805 1805 1805 1805 2035 2035 1863
390 56576 1805 1805 1805 1805 1805 1805 2035 2035 1863
420 56923 1805 1805 1805 1805 1805 1805 2035 2035 1863
450 56738 1805 1805 1805 1805 1805 1805 2035 2035 1863
480 56887 1805 1805 1805 1805 1805 1805 2035 2035 1863
510 55698 1805 1805 1805 1805 1805 1805 2035 2035 1863
540 56602 1805 1805 1805 1805 1805 1805 2035 2035 1863
570 56645 1805 1805 1805 1805 1805 1805 2035 2035 1863
600 56850 1805 1805 1805 1805 1805 1805 2035 2035 1863
630 56741 1805 1805 1805 1805 1805 1805 2035 2035 1863
660 56755 1805 1805 1805 1805 1805 1805 2035 2035 1863
690 56257 1805 1805 1805 1805 1805 1805 2035 2035 1863
720 55140 1805 1805 1805 1805 1805 1805 2035 2035 1863
750 56556 1805 1805 1805 1805 1805 1805 2035 2035 1863
780 56802 1805 1805 1805 1805 1805 1805 2035 2035 1863
810 56824 1805 1805 1805 1805 1805 1805 2035 2035 1863
840 30514 1805 1805 1805 1805 1805 1805 2035 2035 1863
870 33652 1709 1709 1709 1709 1709 1709 1805 1805 1733
900 50302 1709 1709 1709 1709 1709 1709 1805 1805 1733
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
MHz for Core
Secs MB/sec 0 1 2 3 4 5 6 7 Average
0 44416
30 44323 2002 2002 2002 2002 2002 2002 2002 2002 2002
60 43513 2002 2002 2002 2002 2002 2002 1872 1872 1970
90 43487 2002 2002 2002 2002 2002 2002 1536 1536 1886
120 43751 2002 2002 2002 2002 2002 2002 1742 1742 1937
150 43154 2002 2002 2002 2002 2002 2002 1229 1229 1809
180 42516 2002 2002 2002 2002 2002 2002 1536 1536 1886
210 42549 2002 2002 2002 2002 2002 2002 1482 1482 1872
240 42621 2002 2002 2002 2002 2002 2002 1229 1229 1809
270 40041 2002 2002 2002 2002 2002 2002 1742 1742 1937
300 42976 2002 2002 2002 2002 2002 2002 1229 1229 1809
330 39246 2002 2002 2002 2002 2002 2002 2002 2002 2002
360 40390 2002 2002 2002 2002 2002 2002 1536 1536 1886
390 38808 2002 2002 2002 2002 2002 2002 2002 2002 2002
420 38806 2002 2002 2002 2002 2002 2002 1536 1536 1886
450 39480 2002 2002 2002 2002 2002 2002 2002 2002 2002
480 38574 2002 2002 2002 2002 2002 2002 2002 2002 2002
510 38316 2002 2002 2002 2002 2002 2002 2002 2002 2002
540 38770 2002 2002 2002 2002 2002 2002 1872 1872 1970
570 38459 2002 2002 2002 2002 2002 2002 1229 1872 1889
600 37892 2002 2002 2002 2002 2002 2002 2002 2002 2002
630 38772 2002 2002 2002 2002 2002 2002 1536 1536 1886
660 39099 2002 2002 2002 2002 2002 2002 2002 1229 1905
690 38011 2002 2002 2002 2002 2002 2002 1229 1536 1847
720 39059 2002 2002 2002 2002 2002 2002 2002 1742 1970
750 39290 2002 2002 2002 2002 2002 2002 2002 1742 1970
780 38913 2002 2002 2002 2002 2002 2002 1742 1742 1937
810 39524 2002 2002 2002 2002 2002 2002 1872 1872 1970
840 37500 2002 2002 2002 2002 2002 2002 1229 1229 1809
870 24380 2002 2002 2002 2002 2002 2002 1872 1536 1928
900 38368 2002 2002 2002 2002 2002 2002 1742 1742 1937
Armv9 CPU Phone
At least on this particular hardware and software, the MHz measuring program would not run properly in the background. An example is
provided below, where recording stopped when the stress test started execution. Because of this, the table only provides performance
measurements, using 8, 4, 2 and 1 threads, executed in that order.
Timeout variance reference other results
System 4 Android 13 1x 2.80 GHz Cortex-X2,
4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710
Threads 8 4 2 1
Battery Battery Bat+Pow Power
30-Apr 30-Apr 30-Apr 30-Apr
Start 15:00 15:17 15:44 16:06
End 15:17 15:44 16:06 16:32
Secs MB/sec MB/sec MB/sec MB/sec
10 133083 96175 44760 16160
30 119760 88868 47865 15773
60 111445 82186 47151 15757
90 111613 82591 43305 15771
120 109574 81741 43289 15977
150 109483 74503 44553 15769
180 108523 80390 41614 15768
210 106909 79071 43289 15770
240 107657 76151 43296 15768
270 104187 66732 41341 15731
300 104027 73007 40234 15765
330 40548 15985
360 Timeout 42721 15770
390 Timeout 69770 39264 15766
420 61693 38915 15991
450 63592 41352 15768
480 63941 40039 15770
510 111579 62500 39279 15761
540 111350 62786 39488 15769
570 109626 62670 33665 15768
600 108377 62609 39265 15769
630 106509 62758 37640 15771
660 106738 62372 38942 15721
690 105756 62816 37879 16274
720 90875 62794 38051 15769
750 87526 62403 36682 15771
780 89403 62037 37333 15708
810 91222 62149 35351 15746
840 90148 62758 35344 15765
870 90497 62562 37108 15765
900 88864 62803 33745 15769
Start S 133083 96175 44760 16160
End E 88864 62803 33745 15769
%E/S 67 65 75 98
Benchmk 90286 66630 37282 20831
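The %E/S row above appears to be the final sample expressed as a percentage of the first sample. A minimal Python sketch, assuming that interpretation and rounding to the nearest whole number (the function and variable names are illustrative and do not come from the benchmark source):

```python
# End/start throughput ratio, as in the "%E/S" row of the System 4 table.
# Keys are thread counts; values are the first (10 s) and last (900 s)
# MB/sec samples copied from the table.
start = {8: 133083, 4: 96175, 2: 44760, 1: 16160}
end = {8: 88864, 4: 62803, 2: 33745, 1: 15769}


def end_over_start_percent(start_rate, end_rate):
    """Return the end rate as a whole-number percentage of the start rate."""
    # Adding 0.5 before truncating rounds positive values to the nearest integer.
    return int(100 * end_rate / start_rate + 0.5)


percent = {t: end_over_start_percent(start[t], end[t]) for t in start}

for threads, pct in percent.items():
    print(f"{threads} threads: end speed is {pct}% of start speed")
```

A value well below 100 indicates that throughput fell during the 15 minute run, while the single thread test stayed close to its starting speed.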
MHz Measurement Test 4A8 30-Apr-2023 15.01.35
Running time 15 minutes, 30 second samples
MHz for Core
Secs 0 1 2 3 4 5 6 7
0.00 960 960 1152 960 1920 1632 1152 1344
30.09 1728 1728 1728 1728 2112 1824 1824 2304
60.32 1440 1728 1728 1728 1824 1824 1824 960
90.53 1728 1728 1728 1728 2016 1728 1728 2208
821.61 1344 1344 1344 1344 1536 1536 1536 1536
1277.15 1152 1056 1056 1056 2515 2515 2400 2400
End Time 30-Apr-2023 15.23.14
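The report does not say how the MHz measuring program obtains its readings. As an illustration only, one common approach on Linux-based systems, including Android, is to read each core's `scaling_cur_freq` file under the cpufreq sysfs directory, which reports kHz. The sketch below assumes that interface is present and readable, which varies by device; `read_core_mhz` and its `root` parameter are hypothetical names.

```python
from pathlib import Path


def read_core_mhz(root="/sys/devices/system/cpu"):
    """Return {core_number: MHz} for every core exposing scaling_cur_freq.

    Assumption: the Linux cpufreq sysfs layout
    <root>/cpuN/cpufreq/scaling_cur_freq, with the value given in kHz.
    """
    mhz = {}
    for freq_file in Path(root).glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        try:
            # Directory name is "cpuN"; strip the "cpu" prefix to get N.
            core = int(freq_file.parent.parent.name[3:])
            khz = int(freq_file.read_text().strip())
        except (OSError, ValueError):
            continue  # core offline, file unreadable, or unexpected content
        mhz[core] = khz // 1000
    return dict(sorted(mhz.items()))


if __name__ == "__main__":
    # Prints an empty dict on systems without the cpufreq interface.
    print(read_core_mhz())
```

Calling the function at fixed intervals, for example every 30 seconds, would give one row per sample in the style of the log above.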
Floating Point Stress Tests
These were also run for 15 minutes using 8 threads, but with 128 KB data. The testing arrangements were as used for the integer exercise.
Performance is measured in MFLOPS. The significant observation here is that System 2 performed relatively better than in the integer stress
tests, with all cores running at maximum MHz throughout the 15 minute test.
System 1 Power 2 Power 3 Power
Mean MFLOPS 31603 37395 22990
Usual Slow CPU MHz 2000 1805 2002
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
MHz for Core
Secs MFLOPS 0 1 2 3 4 5 6 7 Average
0 34841
30 32620 2000 2000 2000 2000 2000 2000 2050 2050 2013
60 32965 2000 2000 2000 2000 2000 2000 1796 1796 1949
90 32142 2000 2000 2000 2000 2000 2000 1733 1733 1933
120 31115 2000 2000 2000 2000 2000 2000 1733 1733 1933
150 31404 2000 2000 2000 2000 2000 2000 1670 1670 1918
180 32130 2000 2000 2000 2000 2000 2000 1530 1796 1916
210 31275 2000 2000 2000 2000 2000 2000 1670 1530 1900
240 31024 2000 2000 2000 2000 2000 2000 1796 1796 1949
270 31986 2000 2000 2000 2000 2000 2000 1670 1670 1918
300 32255 2000 2000 2000 2000 2000 2000 1530 1530 1883
330 32591 2000 2000 2000 2000 2000 2000 1530 1733 1908
360 31627 2000 2000 2000 2000 2000 2000 1419 1670 1886
390 31064 2000 2000 2000 2000 2000 2000 1530 1530 1883
420 32626 2000 2000 2000 2000 2000 2000 1530 1530 1883
450 31898 2000 2000 2000 2000 2000 2000 1530 1530 1883
480 30940 1866 1933 2000 2000 2000 2000 1530 1530 1857
510 31994 2000 2000 2000 2000 2000 2000 1860 1419 1910
540 31563 2000 2000 2000 2000 2000 1933 1419 1419 1846
570 30872 2000 2000 2000 2000 2000 2000 1733 1169 1863
600 31143 2000 2000 2000 2000 2000 2000 1670 1670 1918
630 31670 2000 2000 2000 2000 2000 2000 1419 1419 1855
660 31703 2000 2000 2000 2000 2000 2000 1530 1530 1883
690 30936 1866 1800 1800 1800 1800 1800 1670 1670 1776
720 30664 2000 2000 2000 2000 2000 2000 1530 1530 1883
750 31153 2000 2000 2000 2000 2000 2000 1530 1530 1883
780 30367 1933 1933 2000 2000 2000 2000 1670 1308 1856
810 30412 2000 2000 2000 2000 2000 2000 1733 1733 1933
840 30837 2000 2000 2000 2000 2000 2000 1530 1530 1883
870 30699 2000 2000 2000 2000 2000 2000 1419 1308 1841
900 31165 2000 2000 2000 2000 2000 2000 1530 1530 1883
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
MHz for Core
Secs MFLOPS 0 1 2 3 4 5 6 7 Average
0 38431
30 37700 1805 1805 1805 1805 1805 1805 2035 2035 1863
60 37537 1805 1805 1805 1805 1805 1805 2035 2035 1863
90 37643 1805 1805 1805 1805 1805 1805 2035 2035 1863
120 37777 1805 1805 1805 1805 1805 1805 2035 2035 1863
150 37524 1805 1805 1805 1805 1805 1805 2035 2035 1863
180 37956 1805 1805 1805 1805 1805 1805 2035 2035 1863
210 32704 1805 1805 1805 1805 1805 1805 2035 2035 1863
240 37343 1805 1805 1805 1805 1805 1805 2035 2035 1863
270 35775 1805 1805 1805 1805 1805 1805 2035 2035 1863
300 37173 1805 1805 1805 1805 1805 1805 2035 2035 1863
330 37469 1805 1805 1805 1805 1805 1805 2035 2035 1863
360 37749 1805 1805 1805 1805 1805 1805 2035 2035 1863
390 37643 1805 1805 1805 1805 1805 1805 2035 2035 1863
420 37404 1805 1805 1805 1805 1805 1805 2035 2035 1863
450 37339 1805 1805 1805 1805 1805 1805 2035 2035 1863
480 37850 1805 1805 1805 1805 1805 1805 2035 2035 1863
510 36378 1805 1805 1805 1805 1805 1805 2035 2035 1863
540 37348 1805 1805 1805 1805 1805 1805 2035 2035 1863
570 37537 1805 1805 1805 1805 1805 1805 2035 2035 1863
600 37885 1805 1805 1805 1805 1805 1805 2035 2035 1863
630 37787 1805 1805 1805 1805 1805 1805 2035 2035 1863
660 37526 1805 1805 1805 1805 1805 1805 2035 2035 1863
690 37721 1805 1805 1805 1805 1805 1805 2035 2035 1863
720 37841 1805 1805 1805 1805 1805 1805 2035 2035 1863
750 37871 1805 1805 1805 1805 1805 1805 2035 2035 1863
780 37513 1805 1805 1805 1805 1805 1805 2035 2035 1863
810 37863 1805 1805 1805 1805 1805 1805 2035 2035 1863
840 37711 1805 1805 1805 1805 1805 1805 2035 2035 1863
870 37709 1805 1805 1805 1805 1805 1805 2035 2035 1863
900 37528 1805 1805 1805 1805 1805 1805 2035 2035 1863
Floating Point Stress Tests Continued
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
MHz for Core
Secs MFLOPS 0 1 2 3 4 5 6 7 Average
0 24716
30 24173 2002 2002 2002 2002 2002 2002 2002 2002 2002
60 23879 2002 2002 2002 2002 2002 2002 2002 2002 2002
90 24361 2002 2002 2002 2002 2002 2002 1742 1742 1937
120 24068 2002 2002 2002 2002 2002 2002 1872 1872 1970
150 23441 2002 2002 2002 2002 2002 2002 2002 2002 2002
180 23664 2002 2002 2002 2002 2002 2002 1872 1872 1970
210 23991 2002 2002 2002 2002 2002 2002 1536 1536 1886
240 23745 2002 2002 2002 2002 2002 2002 2002 2002 2002
270 23953 2002 2002 2002 2002 2002 2002 2002 2002 2002
300 23268 2002 2002 2002 2002 2002 2002 1872 1872 1970
330 23559 2002 2002 2002 2002 2002 2002 2002 1872 1986
360 23203 2002 2002 2002 2002 2002 2002 1536 1536 1886
390 23776 2002 2002 2002 2002 2002 2002 1482 1482 1872
420 22230 2002 2002 2002 2002 2002 2002 1536 1536 1886
450 23387 2002 2002 2002 2002 2002 2002 2002 2002 2002
480 23495 2002 2002 2002 2002 2002 2002 1742 1742 1937
510 23657 2002 2002 2002 2002 2002 2002 2002 2002 2002
540 23402 2002 2002 2002 2002 2002 2002 2002 2002 2002
570 21686 1820 1820 1820 1820 2002 2002 1872 1872 1879
600 23454 2002 2002 2002 2002 2002 2002 1872 1872 1970
630 22161 2002 2002 2002 2002 2002 2002 1742 1742 1937
660 20981 2002 2002 2002 2002 2002 2002 1536 2002 1944
690 21042 2002 2002 2002 2002 2002 2002 2002 2002 2002
720 22550 2002 2002 2002 2002 2002 2002 2002 2002 2002
750 22236 2002 2002 2002 2002 2002 2002 2002 2002 2002
780 22007 2002 2002 2002 2002 2002 2002 2002 2002 2002
810 20619 2002 2002 2002 2002 2002 2002 1229 1229 1809
840 21725 2002 2002 2002 2002 2002 2002 1229 1742 1873
870 21958 2002 2002 2002 2002 2002 2002 1536 1536 1886
900 22304 2002 2002 2002 2002 2002 2002 1742 1742 1937
Armv9 CPU Phone
System 4 Android 13 1x 2.80 GHz Cortex-X2,
4x 1.82 GHz Cortex A510, 3x 2.52 GHz Cortex A710
Threads 8 8 4 2 1
Battery Power Battery Battery Battery
27-Apr 27-Apr 30-Apr 30-Apr 30-Apr
Start 20:35 20:50 14:06 14:22 14:40
End 20:50 21:09 14:22 14:40 14:57
Secs MFLOPS MFLOPS MFLOPS MFLOPS MFLOPS
Start 84416 75701 66146 40172 18014
30 78275 72473 62252 40037 18003
60 77460 61675 61739 Timeout 18000
90 76556 65468 60870 41500 18007
120 75133 62711 60758 38685 18011
150 74824 62759 60320 39085 18002
180 74002 62159 60111 38975 18017
210 71878 58489 59853 38780 18014
240 73117 58442 59472 38367 18006
270 72064 55940 58854 38418 18005
300 72885 53904 35216 37431 18002
330 71437 55761 58663 36239 18015
360 71531 54161 57187 36538
390 70866 53668 54066 35590
420 70526 53834 51857 35860
450 69574 53701 55682 35227
480 62070 53907 50873 Timeout Timeout
510 62157 53930 52357 34290
540 59206 53534 51482 34310
570 57785 53970 49558 35564
600 56564 53967 Timeout 36059
630 59496 68216 59774 36938
660 55328 53941 47969 35854 31675
690 55826 59331 52595 34714 31642
720 56265 57811 57567 36331 30553
750 53968 58897 49164 36803 25729
780 56221 55074 59303 34276 22781
810 54436 56509 49458 34620 22491
840 55442 56757 58579 35851 22494
870 53653 53610 51860 35835 22493
900 54026 52228 50180 34358 22486
Start S 84416 75701 66146 40172 18014
End E 54026 52228 50180 34358 22486
%E/S 64 69 76 86 125
More Integer Stress Tests
Following are summary results from 15 minute tests at 160 KB using 1, 2, 4 and 8 threads, comparing changes in average, maximum and
typical minimum multiprocessing gains (typical to exclude the odd exceptionally slow result).
Main observations are that average performance can reduce following extended running time and MP gains can be nowhere near
proportional to the number of CPU cores used. For example, using 8 cores might lead to a three times improvement over that from a single
core, with less than four times apparently inevitable.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
System 4 Android 13 1 x 2.80 GHz Cortex-X2, 4 x 1.82 GHz Cortex A510, 3 x 2.52 GHz Cortex A710
System 1 2 3 4
Threads MB/sec Gain MB/sec Gain MB/sec Gain MB/sec Gain
1 Best 14594 1.0 14398 1.0 11433 1.0 20831 1.0
2 Minimum 23529 1.6 30435 2.1 20842 1.8 33665 1.6
Average 25460 1.7 30707 2.1 21712 1.9 40107 1.9
Maximum 29863 2.0 30833 2.1 22919 2.0 47865 2.3
4 Minimum 30093 2.1 30379 2.1 23305 2.0 61693 3.0
Average 34008 2.3 35550 2.5 28169 2.5 69532 3.3
Maximum 40437 2.8 36440 2.5 29441 2.6 96175 4.6
8 Minimum 44260 3.0 50302 3.5 36674 3.2 87526 4.2
Average 48066 3.3 55361 3.8 39996 3.5 104589 5.0
Maximum 53708 3.7 57397 4.0 44521 3.9 133083 6.4
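The Gain columns above are consistent with dividing each multi-thread MB/sec figure by the best single thread result for the same system. A minimal Python sketch using the System 4 values, assuming that definition and rounding to one decimal place (names are illustrative):

```python
# Multiprocessing gain = multi-thread rate / best single-thread rate.
# System 4 MB/sec figures copied from the summary table.
single_best = 20831

multi = {
    2: {"Minimum": 33665, "Average": 40107, "Maximum": 47865},
    4: {"Minimum": 61693, "Average": 69532, "Maximum": 96175},
    8: {"Minimum": 87526, "Average": 104589, "Maximum": 133083},
}


def mp_gain(rate, single):
    """Return the speed-up over one thread, to one decimal place."""
    return round(rate / single, 1)


gains = {
    threads: {label: mp_gain(rate, single_best) for label, rate in rows.items()}
    for threads, rows in multi.items()
}

for threads, rows in gains.items():
    for label, gain in rows.items():
        print(f"{threads} threads {label:8s} gain {gain}")
```

The same function applied to the other three systems, with their own single thread best, reproduces the remaining Gain columns.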
More Floating Point Stress Tests
These were run using the same profile as the integer stress tests, but MP gains were even worse. Here, running a program that uses all
eight CPU cores can be shown to be only three times faster than using a single core.
System 1 Android 11 2 x 2.05 GHz ARM Cortex-A76 and 6 x 2.0 GHz ARM Cortex-A55
System 2 Android 12 2.0 GHz Snapdragon 750 (2 x 2.0 GHz Cortex-A76 and 6 x 1.8 GHz Cortex-A55)
System 3 Android 13 2 x 2.0 GHz ARM Cortex-A75 and 6 x 2.0 GHz Cortex-A55
System 4 Android 13 1 x 2.80 GHz Cortex-X2, 4 x 1.82 GHz Cortex A510, 3 x 2.52 GHz Cortex A710
System 1 2 3 4
Threads MFLOPS Gain MFLOPS Gain MFLOPS Gain MFLOPS Gain
1 Best 12096 1.0 12413 1.0 6917 1.0 31675 1.0
2 Minimum 22221 1.8 24629 2.0 13358 1.9 34276 1.1
Average 23468 1.9 24896 2.0 13821 2.0 36783 1.2
Maximum 24427 2.0 24990 2.0 13830 2.0 41500 1.3
4 Minimum 21944 1.8 26128 2.1 16433 2.4 35216 1.1
Average 25164 2.1 27510 2.2 16859 2.4 55459 1.8
Maximum 28083 2.3 27807 2.2 17087 2.5 66146 2.1
8 Minimum 29787 2.5 35775 2.9 20619 3.0 53653 1.7
Average 31555 2.6 37249 3.0 22881 3.3 65709 2.1
Maximum 34876 2.9 38431 3.1 24716 3.6 84416 2.7
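One way to put the 8 thread results above on a common scale is to divide the gain by the number of threads. This efficiency figure is not tabulated in the report; the definition below is an assumption for illustration, and it treats all eight cores as equal, which is not the case on these big.LITTLE designs. A minimal Python sketch using the Best and 8 thread Average rows:

```python
# MP efficiency = (multi-thread rate / best single-thread rate) / threads.
# Assumed definition; the report itself tabulates gain only.
THREADS = 8

# system number: (best 1-thread MFLOPS, average 8-thread MFLOPS)
results = {
    1: (12096, 31555),
    2: (12413, 37249),
    3: (6917, 22881),
    4: (31675, 65709),
}


def mp_efficiency(single, multi, threads):
    """Return gain per thread, where 1.0 would mean perfect linear scaling."""
    return (multi / single) / threads


efficiency = {
    system: mp_efficiency(single, multi, THREADS)
    for system, (single, multi) in results.items()
}

for system, value in efficiency.items():
    print(f"System {system}: {value:.0%} of linear 8 thread scaling")
```

On this measure System 4 comes out lowest, at roughly a quarter of linear scaling, despite having the highest absolute MFLOPS.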
Downloading PDF Files
On selecting a PDF file from ResearchGate, the file can be read but menu selections and internal links have no effect. In order to use them,
the file should be downloaded (if allowed by ResearchGate) by clicking on the "More v" option on the top line (See Example PDF File Index), then
opening the file in a browser or suitable PDF file reader. The alternative is to load an HTM version of the report - see below.
Alternative HTM Files
Unfortunately, ResearchGate changed the file structure, so older compressed files containing benchmark programs and source code
are no longer available there. In the HTM version, access to reports that provide these files is currently from the author’s website (with limited life)
or via the Wayback Archive (Archive - mainly select latest year, month and highlighted day). This home page is an index of all publications
with links to view most HTM and PDF versions of reports.