William B. Langdon’s research while affiliated with WWF United Kingdom and other places


Publications (346)


Left: adding two 8-bit numbers to give an 8-bit result. Information is lost as the inputs contain at most 2×8 bits (≤16 bits) of information and the output can contain at most 8 bits. Right: red, 0–9: actual distribution of the digits 0–9 in 37 VIPS C source files. Dashed blue, 0–18: distribution if they are added. Although the output of + is wider and has higher entropy (3.75), it is smoother and has less entropy than the combined entropy (5.76) of the two inputs to +. (Example expanded in the appendix)
FlameGraph of Linux perf profile of gem5 simulating our RNAfold fragment (Section 4.2). Used functions are spread horizontally, whilst vertical axis indicates depth of function call nesting. (An interactive version is available via https://github.com/wblangdon/Deep-Imperative-Mutations-have-Less-Impact)
Left: VIPS 3264×2448 benchmark input image (23 970 833 bytes). Right: 128×96 thumbnail image generated by VIPS (36 919 bytes, see left of Figure 5 for enlarged thumbnail)
Twenty base RNA molecule used in gem5 test case higher_order_code_209. The figure shows the minimum free energy secondary structure, which is found by RNAfold. Note the C – G pair bindings form a characteristic low energy “hairpin” spiral, often found in both RNA and DNA molecules
Left: original VIPS thumbnail output. Almost all mutants which produce output, give images which are identical. Right: a similar but different mutant image


Deep imperative mutations have less impact
  • Article
  • Full-text available

December 2024

·

17 Reads

Automated Software Engineering

W. B. Langdon

·

David Clark

Information theory and entropy loss predict that deeper, more hierarchical software will be more robust. This suggests that silent errors and equivalent mutations will be more common in deeper code and that highly structured code will be hard to test, which would explain the best-practice preference for unit testing of small methods rather than system-wide analysis. Using the genetic improvement (GI) tool MAGPIE, we measure the impact of source code mutations and how this varies with execution depth in two diverse multi-level nested software systems. gem5 is a million-line, single-threaded, state-of-the-art C++ discrete time VLSI circuit simulator, whilst PARSEC VIPS is a non-deterministic, parallel, multi-threaded image processing benchmark written in C. More than 28–53% of mutants compile and generate identical results to the original program. We observe 12% and 16% Failed Disruption Propagation (FDP). Excluding internal errors, exceptions and asserts, here most faults below about 30 nested function levels which are Executed and Infect data or divert control flow are not Propagated to the output, i.e. these deep PIE changes have no visible external effect. This suggests that automatic software engineering on highly structured code will be hard.
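The entropy-loss argument can be illustrated with a small calculation. The following is a minimal sketch, not taken from the paper: it assumes two independent inputs drawn uniformly from the digits 0–9 (the paper instead uses the measured digit distribution in the VIPS sources, so the numbers differ from the 3.75 and 5.76 bits quoted in the figure caption). The helper name `entropy_bits` is hypothetical.

```python
import math
from collections import Counter
from itertools import product


def entropy_bits(counts):
    """Shannon entropy in bits of a distribution given as value -> count."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


# Assumption: each input is uniform over the digits 0-9 and the two are independent.
digits = Counter({d: 1 for d in range(10)})
h_input = entropy_bits(digits)

# For independent inputs the joint entropy is the sum of the individual entropies.
h_inputs_combined = 2 * h_input

# Distribution of a + b over all 100 equally likely input pairs (values 0-18).
sums = Counter(a + b for a, b in product(range(10), repeat=2))
h_output = entropy_bits(sums)

# Information that cannot be recovered from the output of "+".
information_lost = h_inputs_combined - h_output

print(f"one input: {h_input:.2f} bits, both inputs: {h_inputs_combined:.2f} bits")
print(f"output of +: {h_output:.2f} bits, lost: {information_lost:.2f} bits")
```

Under these assumptions the output of + carries more entropy than either input alone but less than the two inputs combined, so some disruption to an input need not be visible in the output. Nesting such operations repeats the loss at each level.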


Sustaining Evolution for Shallow Embodied Intelligence

December 2024

·

13 Reads

IOP Conference Series Materials Science and Engineering

Lenski’s experiments with E. coli show that biology can sustain continual evolutionary improvement. However, long-term evolutionary experiments (LTEE) with evolutionary computing find that information theory’s failed disruption propagation (FDP) in deeply nested genetic programming (GP) hierarchies can greatly slow adaptation. We propose that researchers aiming at embodied artificial intelligence should control software robustness by using porous, high-surface-area geometrical architectures, perhaps composed of many shallow mangrove-like tree structures intimately linked to their data-rich fitness environment.


Implicit Test Oracles for Quantum Computing

September 2024

·

9 Reads

Testing can be key to software quality assurance. Automated verification may increase throughput and reduce human fallibility errors. Test scripts supply inputs, run programs and check their outputs mechanically using test oracles. In software engineering implicit oracles automatically check for universally undesirable behaviour, such as the software under test crashing. We propose 4 properties (probability distributions, fixed qubit width, reversibility and entropy conservation) which all quantum computing must have and suggest they could be implicit test oracles for automatic, random, or fuzz testing of quantum circuits and simulators of quantum programs.
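As an illustration of how such properties could serve as implicit oracles, here is a minimal sketch that checks three of the four (probability distribution, fixed qubit width, reversibility) for a gate given as a plain matrix. It is not the authors' tooling: the function names `is_unitary`, `apply_gate` and `implicit_oracles` are hypothetical, reversibility is approximated by a unitarity check, and entropy conservation is omitted.

```python
import math


def is_unitary(m, tol=1e-9):
    """Return True if the conjugate transpose of m times m is the identity."""
    n = len(m)
    for i in range(n):
        for j in range(n):
            # Entry (i, j) of (m^dagger m) is sum_k conj(m[k][i]) * m[k][j].
            s = sum(m[k][i].conjugate() * m[k][j] for k in range(n))
            expected = 1.0 if i == j else 0.0
            if abs(s - expected) > tol:
                return False
    return True


def apply_gate(m, state):
    """Multiply the gate matrix m by the state vector."""
    return [sum(row[k] * state[k] for k in range(len(state))) for row in m]


def implicit_oracles(m, state, tol=1e-9):
    """Check properties that should hold for any gate and any valid input state."""
    out = apply_gate(m, state)
    probs = [abs(a) ** 2 for a in out]
    return {
        # Output has the same number of amplitudes, i.e. the same qubit width.
        "fixed_width": len(out) == len(state),
        # Measurement probabilities are non-negative and sum to one.
        "probabilities": all(p >= -tol for p in probs)
        and abs(sum(probs) - 1.0) < tol,
        # A unitary gate can always be undone by its conjugate transpose.
        "reversible": is_unitary(m, tol),
    }


h = 1 / math.sqrt(2)
hadamard = [[h, h], [h, -h]]
good = implicit_oracles(hadamard, [1, 0])

# A deliberately faulty, non-unitary "gate" standing in for a simulator bug.
broken = implicit_oracles([[1, 1], [0, 1]], [1, 0])

print(good)
print(broken)
```

None of these checks needs to know what the circuit is meant to compute, which is what would make them usable with random or fuzzed inputs. Note that the faulty gate still yields a valid probability distribution for this particular input, so a single oracle on a single input can miss a defect.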


Computer hardware and software used during evaluation stages.
Number of fuzzed test inputs exposing bugs in gem5 found during 24-hour fuzzing aggregated by configuration.
Enhancing Search-Based Testing with LLMs for Finding Bugs in System Simulators

August 2024

·

40 Reads

Aidan Dakhama

·

·

W.B Langdon

·

[...]

·

Despite wide availability of automated testing techniques such as fuzzing, little attention has been devoted to testing computer architecture simulators. We propose a fully automated approach for this task. Our approach uses large language models to create input programs, including information about their parameters and their types, as test cases for the simulators. The LLM’s output becomes the initial seed for an existing fuzzer, AFL, which has been enhanced with three mutation operators, targeting both the input binary program and its parameters. We implement our approach in a tool called SearchSYS. We use it to test the gem5 system simulator. SearchSYS discovered 21 new bugs in gem5, 14 where gem5’s software prediction differs from the real behaviour on actual hardware and 7 where it crashed. New defects were uncovered with each of the 6 LLMs used.




The 13th International Workshop on Genetic Improvement (GI @ ICSE 2024)

July 2024

·

7 Reads

ACM SIGSOFT Software Engineering Notes

The GI @ ICSE 2024 workshop, held 16 April, in addition to presentations, contained a keynote on how to use Genetic Improvement to control deep AI large language models in software engineering and a tutorial on a language independent tool for GI research. We summarise these, the papers, people, prizes, acknowledgements, discussions and hopes for the future.



SearchGEM5: Towards Reliable Gem5 with Search Based Software Testing and Large Language Models

December 2023

·

15 Reads

·

4 Citations

Lecture Notes in Computer Science

We introduce a novel automated testing technique that combines LLMs and search-based fuzzing. We use ChatGPT to parameterise C programs. We compile the resultant code snippets, and feed compilable ones to SearchGEM5, our extension to the AFL++ fuzzer with customised new mutation operators. We run the 4005 binaries thus created through our system under test, gem5, increasing its existing test coverage by more than 1000 lines. We discover 244 instances where gem5 simulation of the binary differs from the binary’s expected behaviour.



Citations (60)


... Schulte et al (2014); Schulte (2014); Chen and Venkataramani (2016); Dorn et al (2019); Bruce et al (2021). Indeed we used it in Langdon and Clark (2024b). We downloaded the 64bit X86 version of PARSEC 3.0 from GitHub 5 and extracted the VIPS library from it. ...

Reference:

Deep imperative mutations have less impact
Genetic Improvement of Last Level Cache
  • Citing Chapter
  • March 2024

Lecture Notes in Computer Science

... Previously [2] we proposed a novel way of testing system simulator software. Our differential testing [3] approach uses large language models (LLMs) to first generate a set of initial programs, which are then compiled and fed through a fuzzer (a modified version of AFL++, see Section 3.2). ...

SearchGEM5: Towards Reliable Gem5 with Search Based Software Testing and Large Language Models
  • Citing Chapter
  • December 2023

Lecture Notes in Computer Science

... Indeed researchers are usually interested in the size of programs rather than their depth (Blot and Petke 2022a, p15). We did some work on integer Langdon (2022b) and floating point Langdon (2022d) functions, where fault masking could be total if the program nesting was deep enough, however all were artificially evolved (genetic programming Koza (1992); Poli et al (2008)) not real programs. For details see Sect. ...

Open to Evolve Embodied Intelligence

IOP Conference Series Materials Science and Engineering

... Running for up to a million generations without size limits has generated, at two billion nodes, the biggest programs yet evolved and forced the development [11,12,13] of, at the equivalent of more than a trillion GP operations per second, the fastest GP system [14,15,16]. It has also prompted detailed analysis of programs [17], including from an information theoretic [18] perspective [19,20,21]. (Of course information theory has long been used with evolutionary computing, e.g. ...

Dissipative Arithmetic
  • Citing Article
  • October 2022

Complex Systems

... Indeed researchers are usually interested in the size of programs rather than their depth (Blot and Petke 2022a, p15). We did some work on integer Langdon (2022b) and floating point Langdon (2022d) functions, where fault masking could be total if the program nesting was deep enough, however all were artificially evolved (genetic programming Koza (1992); Poli et al (2008)) not real programs. For details see Sect. ...

Failed disruption propagation in integer genetic programming
  • Citing Conference Paper
  • July 2022

... Running for up to a million generations without size limits has generated, at two billion nodes, the biggest programs yet evolved and forced the development [11,12,13] of, at the equivalent of more than a trillion GP operations per second, the fastest GP system [14,15,16]. It has also prompted detailed analysis of programs [17], including from an information theoretic [18] perspective [19,20,21]. (Of course information theory has long been used with evolutionary computing, e.g. ...

Measuring failed disruption propagation in genetic programming
  • Citing Conference Paper
  • July 2022

... Overall, when applying incremental evaluation to the Sextic polynomial problem with an unlisted 16-core Intel CPU, 571 billion effective GPops/s were achieved [68]. Finally, Langdon and Banzhaf extended the ideas presented in [67] to include those presented in [68], which, when using a 3.00GHz Intel Xeon Gold 6136 server, allowed for a significant 1.103 trillion effective GPops/s on the Sextic polynomial problem [70]. However, when distinguishing between effective GPops/s and the original definition of GPops/s, we submit that the value of 139 billion listed in [67] is still the largest GPops/s value reported for non-Boolean domains. ...

Long-Term Evolution Experiment with Genetic Programming
  • Citing Article
  • June 2022

Artificial Life

... Firstly, using Poli's submachine code GP [6,7], to evolve large binary Boolean trees [8] and more recently exploiting SIMD Intel AVX and multi-core parallelism to evolve floating point GP [9,10]. Running for up to a million generations without size limits has generated, at two billion nodes, the biggest programs yet evolved and forced the development [11,12,13] of, at the equivalent of more than a trillion GP operations per second, the fastest GP system [14,15,16]. It has also prompted detailed analysis of programs [17], including from an information theoretic [18] perspective [19,20,21]. ...

Deep Genetic Programming Trees Are Robust
  • Citing Article
  • June 2022

ACM Transactions on Evolutionary Learning and Optimization