
Celebrating 50 years of computer benchmarking and stress testing


Abstract

Background to support claims made in other publications.
Roy Longbottom
From 1972 to 2022 I produced and ran computer benchmarking and stress testing programs.
The Whetstone Benchmark, for which I became the design authority, also covered exactly the
same time span.
Stress Tests 1972 to 1980
I was a member of the engineering support group of the UK Government Central Computer and
Telecommunications Agency (CCTA) from 1960 until my early retirement in 1993. For a long
period I was responsible for designing and running contractually required acceptance trials for
Government computers and those centrally funded for Universities. In the 1970s, major
changes were required for stress testing under forthcoming Multiprogramming Operating
Systems. For these, I wrote a series of 17 Fortran programs: five for CPUs, four for disk drives, three
for magnetic tape units, and one each for card readers, card punches, paper tape readers, paper tape
punches, and line printers. From 1972, these were used in many hundreds of acceptance trials, up to 1990.
Details of these trials, programs and some results are covered in my 1980 book “Computer
System Reliability”.
My hands-on involvement included on-site acceptance trials of the latter-day supercomputers IBM
360/85, IBM 360/195 and CDC 7600 in 1972, then, in 1979/80, of a Cray 1 and a CDC Cyber 205,
for which I produced a new series of CPU benchmarks that were compiled with automatic
vectorisation. The latter two also had pre-delivery trials in the USA. Of these seven trials,
three failed and required second trials, following appropriate delays. My stress testing
programs were responsible for all of the failures: one was due to an excessive number of CPU
fault incidents, one to reading the wrong files, and the other to the I/O interface not
correctly transmitting my data patterns.
Benchmarks and Performance 1972 to 1993
Besides the Whetstone benchmark, I produced performance ratings from all CPU tests used
during the acceptance trials that I was involved in, covering 72 different processors.
From 1981 to 1987, I was mainly involved with the performance of data processing systems,
covering sizing, modelling, performance monitoring, general advice and attending user-specified
benchmarking sessions.
From 1987, I continued with data processing consultancy, plus renewed involvement in University
supercomputer activities, including acting as an independent adviser for the benchmarking of NEC
and Fujitsu systems in Japan during 1992.
Whetstone Benchmark 1972 to 2022
The Whetstone benchmark was produced by my CCTA colleague Harold Curnow, with the initial
official results obtained in 1972. The Fortran version became the first general purpose
benchmark to set industry standards for computer system performance. In its day, it was the
equivalent of today’s Geekbench; then it was minicomputer manufacturers who asked “Now who
has the fastest computer?”, based on a single number score. I later took over design
responsibility for this benchmark, although I had included some changes earlier. The main one
was to produce performance measurements for each of the eight test functions, to provide fairer
comparisons and to identify any underhand activities. I also produced versions for vector processors and
multiprocessor systems. The benchmark was also run in all the acceptance tests and I
continued collecting and reporting results until retirement. See
https://www.researchgate.net/project/Whetstone-Benchmark.
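For illustration only, the following minimal C sketch (not the official Whetstone source, and with a simple stand-in loop body) shows that reporting style: one test section is timed on its own and its speed printed, where the real benchmark does the same for each of the eight test functions.

    #include <stdio.h>
    #include <time.h>

    /* Minimal illustrative sketch, not the official Whetstone code: time a
       test section separately and report its own speed, so an unusually fast
       or slow section is visible rather than hidden in a single figure. */

    static double seconds(void)
    {
        return (double)clock() / CLOCKS_PER_SEC;   /* CPU time for this process */
    }

    int main(void)
    {
        const long reps = 100000000L;
        double x = 0.75;
        double start, elapsed;

        /* Section 1: simple floating point arithmetic (a stand-in for one of
           the eight Whetstone-style test functions) */
        start = seconds();
        for (long i = 0; i < reps; i++)
            x = (x + 1.25) * 0.4999975;            /* stays bounded, ~2 operations */
        elapsed = seconds() - start;
        printf("Floating point section: %8.1f MFLOPS  (check value %f)\n",
               reps * 2.0 / elapsed / 1.0e6, x);

        /* Further sections (trig functions, exp/sqrt, integer arithmetic,
           branches, array access) would be timed and reported the same way. */
        return 0;
    }

Printing the check value both guards against the compiler removing the loop and gives a simple confirmation that different systems calculated the same answer.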
Later Benchmarks and Stress Tests
With my first access to a PC at CCTA, I acquired IBM BASIC and Fortran compilers to produce
new versions of the Whetstone benchmark. On retirement, with my own PC, C/C++ compiler
and website, I started my PC Benchmark Collection at roylongbottom.org.uk (1996).
Initial benchmarks were mainly for measuring CPU performance, aimed at identifying best and
worst characteristics rather than a single number rating. These were followed by a number
covering cache and RAM, then input/output devices, networks and graphics. Stress testing
programs for all these areas were produced as needed. All had parameters to specify running
time, most with regular performance reports to identify time-related changes.
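As an illustration of the cache and RAM measurements, the sketch below (not code from the collection) sums an array repeatedly at increasing sizes and reports MB/second at each size; speeds step down as the data no longer fits in each cache level and has to come from RAM.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Illustrative sketch only: read an array at increasing sizes and report
       MB/second for each size. Compile with optimisation, e.g. gcc -O2. */

    int main(void)
    {
        const size_t max_kb = 65536;                        /* sweep up to 64 MB */
        size_t max_words = max_kb * 1024 / sizeof(double);
        double *data = malloc(max_words * sizeof(double));
        if (data == NULL) return 1;
        for (size_t i = 0; i < max_words; i++) data[i] = 1.0;

        for (size_t kb = 16; kb <= max_kb; kb *= 4) {
            size_t words = kb * 1024 / sizeof(double);      /* a multiple of 4 */
            long passes = (long)(max_words / words) * 16;   /* similar total work per size */
            double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;  /* separate accumulators help
                                                               keep the loop streaming */
            double t0 = (double)clock() / CLOCKS_PER_SEC;
            for (long p = 0; p < passes; p++)
                for (size_t i = 0; i < words; i += 4) {
                    s0 += data[i];     s1 += data[i + 1];
                    s2 += data[i + 2]; s3 += data[i + 3];
                }
            double secs = (double)clock() / CLOCKS_PER_SEC - t0;
            printf("%8zu KB  %8.1f MB/second  (check value %.1f)\n",
                   kb, (double)passes * kb / 1024.0 / secs, s0 + s1 + s2 + s3);
        }
        free(data);
        return 0;
    }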
The website currently has 86 HTM reports and 75 compressed files containing benchmarks,
source codes and descriptions, all FREE with no advertising. Also, 37 of these, or variations,
are in PDF files at ResearchGate, along with benchmark codes.
Over the years I produced different and new versions of the programs for 32-bit and 64-bit
working, new compilers, new CPU architectures and multiple CPU cores. Each of these varieties
covered a range of platforms, including Intel and ARM type processors, running under DOS, OS/2
(1997), and various flavours of Windows (e.g. XP 2005), Linux (2010), Android (2012) and
Raspberry Pi (2013). Where appropriate, in each area there are around 20 benchmarking and
stress testing programs, at both 32 bits and 64 bits.
Stress testing of PCs started with programs considering the implications of overclocking, then
laptop performance issues. Next were those for evaluating the heating effects of newer
technology, then advanced vector architectures and unbalanced CPU arrangements (like
big.LITTLE). My stress testing programs have parameters covering running time, alternative
hardware and validation of correct results, and they report performance regularly (as opposed
to producing a single result).
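A minimal sketch of that structure, illustrative only and not one of the actual programs, takes the running time as a command line parameter, reports speed at regular intervals to show time-related changes such as thermal throttling, and checks that the calculated result never changes, which would indicate a hardware error.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Illustrative stress test sketch: repeat a fixed block of work for a
       requested number of seconds, report speed at intervals, and validate
       that the deterministic result stays the same on every pass. */

    int main(int argc, char *argv[])
    {
        double run_seconds  = (argc > 1) ? atof(argv[1]) : 60.0;  /* total running time */
        double report_every = 10.0;                    /* seconds between reports */
        const long reps = 20000000L;
        double expected = -1.0;                        /* reference result from first pass */
        double start = (double)clock() / CLOCKS_PER_SEC;
        double next_report = report_every;

        for (;;) {
            double x = 0.75;
            double t0 = (double)clock() / CLOCKS_PER_SEC;
            for (long i = 0; i < reps; i++)
                x = (x + 1.25) * 0.4999975;            /* bounded, deterministic work */
            double t1 = (double)clock() / CLOCKS_PER_SEC;

            if (expected < 0.0)
                expected = x;                          /* first pass sets the expected value */
            else if (x != expected)
                printf("ERROR at %.1f seconds: %.10f instead of %.10f\n",
                       t1 - start, x, expected);

            if (t1 - start >= next_report) {
                printf("%7.1f seconds  %8.1f MFLOPS\n",
                       t1 - start, reps * 2.0 / (t1 - t0) / 1.0e6);
                next_report += report_every;
            }
            if (t1 - start >= run_seconds)
                break;
        }
        return 0;
    }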
By 2019 I had produced a number of reports on Raspberry Pi 1, 2 and 3, including comprehensive
stress tests. Then, in 2019 (aged 84), I was recruited as a voluntary member of the Raspberry Pi
pre-release Alpha testing team. This has continued to the present time, leading to a further 9
PDF reports, most based on those produced for the Alpha testing team. My latest, for the
Raspberry Pi Pico W, is described in my LinkedIn posts. All are available from ResearchGate.