OVERVIEW

Free and open source software for computational chemistry education

Susi Lehtola¹ | Antti J. Karttunen²

¹ Molecular Sciences Software Institute, Blacksburg, Virginia, USA

² Department of Chemistry and Materials Science, Aalto University, Espoo, Finland

Correspondence

Susi Lehtola, Molecular Sciences Software Institute, Blacksburg, VA 24061, USA.

Email: susi.lehtola@alumni.helsinki.fi

Funding information

Business Finland, Grant/Award Number: 3767/31/2019

Edited by: Peter R. Schreiner, Editor-in-Chief

Abstract

After decades of waiting, computational chemistry for the masses is finally here. Our brief review on free and open source software (FOSS) packages points out the existence of software offering a wide range of functionality, all the way from approximate semiempirical calculations with tight-binding density functional theory to sophisticated ab initio wave function methods such as coupled-cluster theory, covering both molecular and solid-state systems. Given the remarkable increase in the computing power of personal devices, which now rivals that of the fastest supercomputers in the world in the 1990s, we demonstrate that a decentralized model for teaching computational chemistry is now possible thanks to FOSS packages, enabling students to perform reasonable modeling on their own computing devices in the bring your own device (BYOD) scheme. FOSS can be made trivially simple to install and keep up to date, eliminating the need for departmental support, and also enables comprehensive teaching strategies, as the actual implementations of various algorithms can be used in teaching. We exemplify what kinds of calculations are feasible with four FOSS electronic structure programs, assuming only extremely modest computational resources, to illustrate how FOSS packages enable decentralized approaches to computational chemistry education within the BYOD scheme. FOSS also has further benefits driving its adoption: the open access to the source code of FOSS packages democratizes the science of computational chemistry, and FOSS packages can also be used without limitation beyond education, for example in academic and industrial applications.

This article is categorized under:

Software > Quantum Chemistry

KEYWORDS

computational chemistry education, free software, open source

Received: 26 November 2021 Revised: 14 February 2022 Accepted: 22 February 2022

DOI: 10.1002/wcms.1610

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided

the original work is properly cited.

© 2022 The Authors. WIREs Computational Molecular Science published by Wiley Periodicals LLC.

WIREs Comput Mol Sci. 2022;e1610. wires.wiley.com/compmolsci 1of33

https://doi.org/10.1002/wcms.1610

1 | INTRODUCTION

Quantum chemical research methods have been used extensively in the chemical industry for several decades [1–4]. In addition to its widespread use in industry and academia, quantum chemistry is also utilized in chemical education to provide atomic-level understanding of fundamental chemical concepts and phenomena [5,6]. For example, in undergraduate general and organic chemistry curricula, students get hands-on experience with concepts such as three-dimensional molecular structure, structural isomerism, conformers, and stereochemistry by means of computational exercises or computer laboratory sessions [7–9].

Although some of the aforementioned aspects can in principle be studied even with simpler methodologies such as classical force fields, quantum chemical calculations with state-of-the-art software packages allow students to gain a first-hand understanding of more advanced topics such as molecular orbitals, chemical bonding, energetics [10], thermodynamics [11,12], reaction mechanisms [13], and various spectroscopies [14–18].

The ability to interpret and understand chemical phenomena with the help of quantum chemical calculations is a valuable skill in every chemist's professional life: nowadays, a significant portion of even the experimental studies reported in the chemical literature is tightly integrated with quantum chemical investigations. Moreover, as quantum chemistry is the critical bridging component between experimental work and machine learning methods, the ability to run quantum chemical calculations can be expected to become increasingly relevant and necessary in working life in the near future.

Although computational chemistry for the masses (a pervasive inclusion of computational modeling in the chemistry curriculum) has long been thought to be coming [19], it does not appear to have arrived yet. In their recent overview, Grushow and Reeves [20] have summarized some select landmarks in computational chemistry education. At the same time, Grushow and Reeves note how computational chemistry still has a somewhat limited presence in undergraduate curricula, which can be attributed at least in part to the history of computational chemistry software.

In the 1990s, commercial software companies started selling graphical user interfaces to their quantum chemistry

packages, some of which were particularly geared toward educational use. Such software was and still is typically used

in a computer classroom setting, where a limited number of relatively powerful desktop computers are available for the

students during the teaching sessions. The benefit of a computer classroom setting is that all software can be pre-

installed for the students and the standardized software environment makes the possibilities (and limitations) of the

software setup clear for the teachers in charge of the educational content. However, the computer classroom approach

has limited scalability, as the number of students is limited by the number of workstations; this often makes the

approach impractical for large-scale undergraduate teaching. Furthermore, while the computer classroom setting may

be useful for teaching during contact sessions, the students' possibilities for running calculations outside the contact ses-

sions are limited by the requirement of physical access to the computer classroom—which has proved to be challenging

especially during the ongoing global coronavirus disease pandemic which has required social distancing. The classroom

setting also typically limits the teacher and students to using the pre-installed software, while costs for the required soft-

ware licenses can be unfeasibly high for educational institutions with limited budgets. Someone also has to maintain

the software on the classroom computers and ensure it is kept up to date.

In the early 2000s, the WebMO package introduced a web-based approach to computational chemistry education, in which the quantum chemistry software only needs to be installed and maintained on a central server, and the teachers and students can then access it through a web browser interface [21,22]. A number of quantum chemistry software packages have been integrated with WebMO, whose integrated molecular editor and analysis tools make it a rather low-barrier interface to quantum chemistry. As the users thus only need a web browser to access the computing software, WebMO was the first tool to enable a bring your own device (BYOD) paradigm in computational chemistry, in which the students can use their personal devices to take part in the teaching.

However, WebMO still requires someone to set up and administer the WebMO server, even though the need to purchase actual server hardware has been removed by the possibility of installing the service on cloud platforms such as Amazon Web Services or Google Cloud. Recently, the cloud-based Chem Compute platform has also begun to offer web access to computational chemistry software. Chem Compute provides computing resources for undergraduate teaching as well as research at no cost to the teachers [23], thus allowing institutions that do not have the personnel or financial resources to set up their own physical or cloud servers to offer computational chemistry education. However, Chem Compute relies on computational resources volunteered by third parties, whose continued future availability is not guaranteed.

2of33 LEHTOLA AND KARTTUNEN

As discussed above, great advances like WebMO and Chem Compute have been made in the direction of the BYOD paradigm, to which many universities have already shifted in order to cut down on the costs associated with the now-deprecated computer classroom model. In this work, we will show that free and open source software (FOSS) can be used in the context of the BYOD paradigm to achieve computational chemistry for the masses, all the while democratizing science by tearing down established power structures and barriers for research and education. (Inroads into BYOD in the context of virtual laboratories have also been recently discussed by Kobayashi et al. [24])

The layout of this work is as follows. In Section 2, we will begin by defining what we mean by FOSS (Section 2.1).

Then, we discuss why FOSS has not been the norm in science (Section 2.2), what FOSS enables for the teaching of com-

putational chemistry (Section 2.3), and why it would be a good time now to switch over to FOSS in teaching

(Section 2.4). We present a brief overview of available FOSS packages in Section 3. We include several practical demon-

strations of using state-of-the-art FOSS programs for computational chemistry education in Section 4, showcasing the

kinds of calculations that are possible assuming only limited computer resources. The article concludes with a brief summary and discussion in Section 5.

2 | FREE AND OPEN SOURCE SOFTWARE

2.1 | Definitions

As our readers may not be familiar with the concept of FOSS, some definitions are necessary before the present discus-

sion can take place. For the purposes of this article, we will adopt three key criteria for FOSS:

1. The ability of anyone to freely use the software for any purpose.

2. The ability to freely study the operation of the software, and modify it at will.

3. The ability to freely redistribute copies of the software—as well as modified versions thereof—to others.

Consequently, any software that does not satisfy these criteria for FOSS is referred to as proprietary or closed source

software.

What is the significance of these criteria? The first criterion means simply that there can be no limitations on poten-

tial uses of the software: for instance, in addition to use in academic research and education, commercial use must also

be permitted by the license. Moreover, the first criterion bars license terms that prohibit use of the software for purposes

deemed questionable by the licensors, such as use in nuclear power plants or in research on genetic engineering. FOSS

can be used by anyone for anything.

The second criterion means that the source code of the software must not only be available, but also that customizations to the source code must be allowed. This is of major importance for developing new features or computational models, for example. Being able to use software written by other authors to accomplish certain tasks eliminates the need to “reinvent the wheel” and thereby results in faster scientific development [25]. This phenomenon has traditionally been the main enticement of contributing to closed-source or “open teamware” [26] packages, as access to their source code partly eliminates the need to start from scratch, as algorithms implemented in the package by its other contributors can be leveraged to develop new computational models.

However, the control of access to the source code of such closed-source programs leads to the perpetuation of power structures and may inhibit academic collaborations between authors of different program packages [27], contrary to the Popperian ideal of science: the selfless pursuit of truth [28], and a fair and unbiased competition of ideas and methods in the context of computational chemistry. Key persons in control of the access to the source codes of various software packages are able to hold back equitable competition and collaboration between scientists developing new methods and algorithms. The issue with gatekeepers is not a new phenomenon: as was already quipped by Max Planck, “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it”; this apt observation is supported by a recent study that investigated the dynamics of scientific evolution with the standard empirical tools of applied microeconomics [29]. This problem is less likely to manifest in FOSS, as will be explained in the next paragraph.

The third criterion means that anyone who has a copy of the software can redistribute it to others. One does not need to ask case-by-case permission from the authors of the software in order to share it with one's collaborators or the reviewers of a scientific paper, for instance. It also means that anyone who has added new features to the program can freely distribute their version. This eliminates the problematic role of the gatekeepers in the “open teamware” model, as alternative versions of the software, commonly known as forks, can be distributed. It also eliminates the possibility of the infamous practice [30] of preventing one's competitors from using one's software, which may have the result of hiding deficiencies and bugs in one's software. Case in point: the “war on supercooled water” [31] exemplifies the problems of having prominent figures as exclusive gatekeepers. The “war” was only resolved once Princeton scientists gained access to their Berkeley competitors' source code and found a coarse error therein [32]. Such problems are much less likely to exist if FOSS is used, as FOSS programs are freely redistributable and can be thoroughly inspected by anyone.

In our opinion, the three criteria laid out above condense the essence of both the generally accepted 10-item definition for “open source software” by the Open Source Initiative [33] as well as the four essential freedoms of “free software” or “libre software” defined by the Free Software Foundation [34]. Note that there is a wide variety of FOSS licenses that fit these criteria and that can be adopted by software projects. New software projects should choose their license with care [35]. It is always easier to switch to a more permissive license later on than to move to a more restrictive license: any versions released under a FOSS license will continue being FOSS in the future as well, even if newer versions switch to using a proprietary license, for example.

2.2 | Why is free/open-source software not the default?

2.2.1 | Code distribution

The ideology of FOSS is in line with the demands of science [36], as much like the Schrödinger or Dirac equation, computational models should ideally always be publicly available. Moreover, as the initial development and ongoing use of most scientific software has been and continues to be funded by public research funding, the results of such work, that is, the developed program source code, should be available to everyone.

It is worthwhile to comment on the reasons for the longstanding status quo. As discussed by Hinsen [37], before the advent of electronic computers, algorithms were developed with pen and paper, and the traditional paper journal article format is ideally suited to fully describe such algorithms. But, when implemented on a computer, algorithms often become too complicated to thoroughly describe in a journal article, and significant portions of the implementation are always left out. As this tacit information on what happens “under the hood” of various computational chemistry packages is typically passed only within the academic groups contributing to those codes, lack of access to the source code creates another barrier of entry for third parties, and again ends up perpetuating established power structures.

However, nowadays there are well-established ways for distributing scientific software. Version control systems such as Git [38] facilitate robust development of software, which can be hosted at no cost on sites such as GitHub [39] and GitLab [40]. GitHub and GitLab also enable a community approach to code development through the use of public code review, which is leveraged by many program packages to improve code quality and to flatten the learning curve for potential new contributors to the package. Stable releases of software can be made available on Open Science data repositories such as Zenodo [41] with version-specific Digital Object Identifiers (DOIs). Precompiled versions can nowadays also be easily distributed, as we will discuss in Section 3.

2.2.2 | Maintenance and user support

A commonly cited impediment to FOSS in science is that funding its maintenance and/or user support is challenging [26,42,43]. However, there are several companies whose whole business model is founded on the use, development, and support of FOSS. For instance, Red Hat Inc. broke $1 billion in annual revenue in 2012, and its revenue has increased ever since, surpassing $3 billion in 2018 [44]. There is clearly money to be made in selling support for FOSS. Moreover, in contrast to proprietary software, maintenance and support for FOSS can be acquired from third parties if the original author(s) are either unavailable or unwilling to support their code; this is the key to the Red Hat style business model.

The business model also works for scientific FOSS. For example, Kitware Inc., established in 1998 [45], has built its business model around developing and supporting a variety of scientific FOSS. Paraview [46] and ITK [47] enable modeling, visualization and data analysis for large datasets, while the CMake build system has become a quintessential tool for building scientific software [48]. As of 2022, Kitware has more than 200 employees and their FOSS projects span many fields of science and technology, including quantum chemistry [49].

Due to the relatively small market for specialized scientific software, the availability of public research funding has always played a key role in the development of computational chemistry software. Related to future development of FOSS in science, the European Commission has outlined Open Science as their policy priority and the standard method of working under its research and innovation funding programs [50].

As evidenced by forums such as the Computational Chemistry List [51] and the present authors' professional experience, online peer-to-peer user support (whose motivations have been studied, for example, by Constant, Sproull, and Kiesler [52]) is invaluable even in the case of proprietary programs. In the case of FOSS, this peer-to-peer support has an enhanced role, and is one of the keys behind the success of FOSS [53]. Because anyone can modify the software and distribute modified copies thereof, anyone can fix the bugs they run into, and gain fame even for small contributions.

Importantly, the possibility to contribute bug fixes to FOSS projects reduces the barrier between users and developers, and is the typical route by which a project gains new developers. The fostering of new developers can also be greatly aided by practices such as open code review, which serves the double purpose of ensuring a top quality code base and teaching both the new contributor and any other project followers about the structure and design philosophy of the project. This naturally also leads to a more sustainable development environment, since a constant influx of new developers is secured, and enables expert knowledge (also known as tacit knowledge) to be passed on to new members of the development team.

Other aspects of the economic principles of FOSS have also been studied extensively [54–74]. FOSS is a public good [55,56,68]. Participation in the development and support of FOSS has been found to be more motivating than that of proprietary software [69,70], and participation in FOSS projects is motivating and carries economic benefits [71,72]. FOSS promotes peer review, free exchange of ideas, and maintainability [73], and competition of FOSS packages promotes innovation [74].

2.2.3 | Linux distributions

The Linux operating system is a prime example of FOSS. Originating from the University of Helsinki, Finland, it is nowadays ubiquitous. It is used in billions of mobile phones, laptops, workstations, as well as servers and compute clusters all around the world. All supercomputers on the TOP500 list [75] and the majority of the world's internet servers have run on Linux for a long time; Android smartphones likewise run on Linux. Because of Linux, proprietary operating systems have been irrelevant in high-performance computing for many years. Chemists had good reasons to switch to Linux long ago [76]; the present authors have used Linux as their main computational research platform for over 20 years.

A valuable feature of Linux distributions is that they are usually cross-platform: in addition to the usual x86 and

x86-64 platforms, consisting of processors by, for example, the Intel Corporation and Advanced Micro Devices Inc.

(AMD), Fedora packages are also available on s390x processors used on IBM mainframe computers and ARM proces-

sors such as the ones used in Raspberry Pi and new Mac computers, for instance. This versatility allows the use of het-

erogeneous hardware and ensures seamless compatibility even if students have dissimilar computing devices at their

disposal.

Several Linux distributions, such as Ubuntu, Debian, and Fedora Linux, also solved the problem of efficient distribution of software decades ago. Our criteria for FOSS in Section 2.1 allow such scientific software to be packaged as part of Linux distributions, and indeed several powerful program packages are already available as distribution packages thanks to the grand entrance of FOSS in quantum chemistry in recent years. Some FOSS quantum chemistry packages such as Erkale [77], Psi4 [78] and its predecessor Psi3 [79], and PySCF [80] have been developed in a fully free/open-source development model since their beginning, while other packages that originated within a closed-source licensing model have also become open-sourced recently, such as OpenMolcas [81], Dalton [82], and NWChem [83].

2.2.4 | Case study: Libxc library of density functional approximations

An example of a successful scientific FOSS project can be found in the Libxc library of density functional approximations [84]. The modular library currently implements over 600 density functional approximations such as PBE [85], B3LYP [86], and SCAN [87], and is used by over 30 electronic structure programs ranging from programs using Gaussian basis sets (Erkale [77], Psi4 [78], and PySCF [80]) to plane-wave codes (ABINIT [88], INQ [89], and Quantum Espresso [90]), finite element programs (HelFEM [91–94] and DFT-FE [95]), and multiresolution adaptive grids (MADNESS [96]). In order to facilitate wider use by the community, Libxc recently switched to a more permissive FOSS license that allows the library to be more easily included in closed-source programs. Libxc is now used in several proprietary and commercial software packages, for example, the Slater-type orbital ADF package [96], and the Gaussian-type orbital GAMESS-US [97], Molpro [98], MRCC [99], ORCA [100], and TURBOMOLE [101] programs; several other packages are also contemplating migrating to Libxc.

The advantages of the community adoption of Libxc are manifold. A new density functional approximation only needs to be implemented in Libxc to become available in any of the electronic structure programs that support Libxc, underlining the efficiency of the modular FOSS model. Moreover, access to the same implementation of a density functional approximation enables, for example, the study of reproducibility across various numerical approaches [102], which is important for being able to compare results obtained with different methods or software packages. Indeed, economic gains in terms of software development productivity and product quality can be achieved by reuse of mature FOSS components that are of the highest quality [103].

We believe that computational chemistry will continue to transform by adopting more and more FOSS components, the electronic structure library (ESL) being one of the notable pushes in this direction [104]. Well-designed, modular FOSS components can be maintained even by a single academic group; the semi-empirical dispersion library of the Grimme group is a successful recent example [105–107]. We will discuss this topic further in Section 3.5.

2.3 | What does FOSS offer for teaching?

2.3.1 | Free redistribution: Install and maintenance

In addition to its benefits for general use cases [108], FOSS has three major advantages for teaching: the availability of precompiled binaries, the availability of the source code, and the general applicability of the software beyond academia. Starting out with the first advantage, software that satisfies the criteria for FOSS discussed in Section 2.1 can be redistributed, and included in Linux distributions, for example. This greatly facilitates the installation of these programs, as prepackaged software can be installed in a matter of minutes on a wide range of hardware, ranging from students' laptops to compute servers, simply by running a single command, or alternatively, finding the program in the distribution's graphical application manager and clicking on “Install.”

We wish to note here that although installing scientific software by hand by compiling from source code affords cus-

tomized tunings that may result in faster operation, that is, decreased runtimes of quantum chemistry packages, in

many cases the gains realizable in computational chemistry education or small-scale computing are relatively modest

and pale in comparison with the ease of effort afforded by the centralized packaging system. Compiling from source

takes a lot of time as well as expertise, and can lead to poor performance if the compiler options are not adequately cho-

sen; note that several proprietary programs have likewise adopted a binary-only distribution model with the same

limitations.

However, installation is only a part of the problem: the software must also be kept up to date. This does not happen

automatically, and a constant level of administration effort is then required to monitor new releases, and to download

and install new versions of the software. In contrast, the Linux distribution packages get automatically updated with

the rest of the system whenever new package versions come out: Linux package managers not only handle updates to

the Linux operating system kernel, but also all other software, such as the internet browsers, the email clients, the office

productivity software suites, the Fortran and LaTeX compilers, and so on. Computational chemistry packages are likewise updated automatically.

2.3.2 | Access to source code

The second advantage of FOSS is that as the source code is available, it can be used in teaching. For instance, a course on electronic structure calculations can exemplify the basic algorithms by showing how they are implemented in an openly available program. Some codes go even further: for instance, Psi4Numpy [109] is a project that aims to supply simple, easily modifiable Python algorithms for educational and proof-of-concept purposes. The PySCF quantum chemistry program [80] makes it easy to override and customize all algorithms, as they are mostly written in Python. Similarly, DFTK [110] has been designed to facilitate algorithmic development and might therefore also be useful for educational purposes.

Access to these kinds of projects not only facilitates research in and development of new electronic structure methods, but also means that teaching no longer has to be limited to pen and paper exercises: instead, it can also include real-life demonstrations. For example, an advanced course on electronic structure theory could involve asking students to write their own, customized solver for self-consistent field theory [111].
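As a rough illustration of the kind of student exercise described here, the following minimal sketch runs a self-consistent field loop for a toy model: the two-site Hubbard dimer at half filling, treated with restricted Hartree–Fock. It is not taken from any of the programs discussed in this article; the function names, the model parameters `t` and `U`, the starting guess, and the simple linear damping are all assumptions chosen for illustration.

```python
import math


def lowest_eigenpair_2x2(a, b, d):
    """Lowest eigenvalue and normalized eigenvector of [[a, b], [b, d]] (b != 0)."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam = mean - radius
    # (b, lam - a) satisfies (a - lam) * x + b * y = 0
    x, y = b, lam - a
    norm = math.hypot(x, y)
    return lam, (x / norm, y / norm)


def scf_hubbard_dimer(t=1.0, U=1.0, guess=(0.8, 0.2), damping=0.0,
                      tol=1e-10, max_iter=200):
    """Toy restricted Hartree-Fock for the two-site Hubbard model (hypothetical helper).

    guess holds the per-spin site occupations (they sum to one).
    Returns (energy, occupations, number of iterations).
    """
    p1, p2 = guess
    for iteration in range(1, max_iter + 1):
        # Fock matrix: on-site repulsion from the opposite-spin density, hopping -t
        _, (c1, c2) = lowest_eigenpair_2x2(U * p1, -t, U * p2)
        new_p1, new_p2 = c1 * c1, c2 * c2
        change = abs(new_p1 - p1)
        # Linear mixing of old and new densities (damping = 0 means plain iteration)
        p1 = damping * p1 + (1.0 - damping) * new_p1
        p2 = damping * p2 + (1.0 - damping) * new_p2
        if change < tol:
            # Two electrons in one orbital: hopping energy plus on-site repulsion
            energy = -4.0 * t * c1 * c2 + U * (p1 * p1 + p2 * p2)
            return energy, (p1, p2), iteration
    raise RuntimeError("SCF did not converge")


energy, occupations, n_iter = scf_hubbard_dimer()
print(f"E = {energy:.6f}, occupations = {occupations}, iterations = {n_iter}")
```

With the default parameters the loop should converge to the symmetric solution with energy -2t + U/2. For stronger repulsion (for example `U=3.0`), the undamped iteration oscillates instead of converging in this toy model, whereas `damping=0.5` restores convergence, which is one way to motivate convergence accelerators in a classroom setting.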

2.3.3 | Sophisticated workflows

The third advantage of FOSS for teaching is that since students (like anyone else) can access the full power of various computational chemistry programs, they also have the possibility to develop more general technical skills such as programming and interfacing programs with each other, for instance by generating sophisticated workflows that automate complex tasks. Automated workflows are highly useful tools for practical computations, as they can be leveraged to easily run and analyze thousands to even millions of calculations that are needed for high-performance screening of materials, for instance. Several large-scale projects such as Materials Project [112], Materials Cloud [113], AiiDA [114], Atomic Simulation Recipes [115], and QCEngine [116] are FOSS and provide immediate access to powerful automated workflows for computational chemistry. As was summarized in the first criterion in Section 2.1, FOSS can also be freely used without limitations in industry to develop new thermoelectric energy conversion materials [117] or semiconductor devices [118], for example, underlining its freedom and flexibility.
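To make the idea of an automated workflow concrete, the sketch below scans a set of bond lengths, evaluates an energy for each in parallel, and picks the lowest-energy geometry. It does not use the API of any of the workflow projects named above: `toy_energy` is a stand-in Morse potential with hypothetical parameters, and in a real exercise it would be replaced by a call to an actual electronic structure program.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def toy_energy(r, d_e=0.17, a=1.0, r_e=1.4):
    """Stand-in for a real calculation: Morse potential with hypothetical parameters."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2 - d_e


def run_scan(bond_lengths, workers=4):
    """Evaluate the energy at every bond length, several at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        energies = list(pool.map(toy_energy, bond_lengths))
    return dict(zip(bond_lengths, energies))


def best_geometry(results):
    """Return the bond length with the lowest energy."""
    return min(results, key=results.get)


# Generate inputs, run all calculations, and analyze the results
bond_lengths = [round(1.0 + 0.1 * i, 1) for i in range(11)]
results = run_scan(bond_lengths)
r_min = best_geometry(results)
print(f"lowest energy {results[r_min]:.4f} at r = {r_min}")
```

The same three-stage structure (generate inputs, run calculations, analyze results) carries over when the grid contains thousands of candidate structures; only the function that performs the individual calculation changes.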

2.4 | Why would it be timely to switch to free/open-source software?

We have argued above that FOSS has important ramifications for the reproducibility of science and also has several

advantages for teaching. Although it is possible to switch from proprietary programs to FOSS within the traditional

setup based on computer classrooms and/or central compute servers, there is yet another important aspect to consider:

the BYOD approach discussed in Section 1. In this section, we wish to examine FOSS from the point of view of the

ongoing paradigm shift to the BYOD scheme.

As the price of laptop computers has dropped, many students now bring their own devices to the classroom. This

paradigm shift has also affected university policies. Students preferring to use their own devices have led to a significant

decrease in the demand for computer classrooms. Universities may now find it cheaper to just offer a laptop to all stu-

dents. For instance, the Faculty of Science of the University of Helsinki pivoted to such an approach several years ago.

As a result, the university has been able to cut down on computer classrooms that are expensive to maintain, even while several students decline the laptop offered by the university and opt to use their private laptops instead.

Although a centralized compute server approach is compatible with the BYOD paradigm, as was already discussed in Section 1, the effortless availability of FOSS programs can be used to finally bring computational chemistry to the masses and thereby truly democratize science. As FOSS packages can be made instantly available to everyone, the FOSS approach is ideally suited for personal devices in the BYOD approach. Such a distributed approach is

optimal also for massive open online courses (MOOCs), as enrollment does not have to be limited based on the avail-

able centralized computer resources. Instead, the students can run all of the necessary calculations on their own

hardware.

Naturally, certain tradeoffs are implied in a course employing heterogeneous BYOD approaches, as one cannot

assume personal devices to have the same computational power as purpose-built, dedicated compute servers. However,

we argue that this is not much of an impediment due to the immense developments in the speed of processors and

improved algorithms achieved during the past several decades. A concrete example of this is the TOP500 list of supercomputers, which contains almost 30 years' worth of data on the most powerful supercomputers in the world.

119,120

The

estimated performance of the fastest and slowest supercomputer on the list on a year-by-year basis is shown in Figure 1

in units of $10^9$ floating-point operations per second (GFlops). Figure 1 also shows analogous benchmark data for commodity hardware: a cheap tablet computer with an Intel Celeron N4000 processor and a high-end business laptop with

an Intel i7-10610U processor of one of the present authors (SL). A Raspberry Pi 4 minicomputer was also assessed, and

found to perform similarly to the Celeron N4000 processor.


As Figure 1 illustrates, personal devices have performance in the tens to hundreds of gigaflops, which is comparable

to the performance of the fastest supercomputers of the mid-1990s, or to the slowest supercomputer on the TOP500 list in

the mid-2000s. This amazing development in computational power means that the content of classic books on quantum

chemistry such as Szabo and Ostlund

121

could nowadays be reproduced on commodity hardware; however, there is no reason to, since better computational methods and basis sets are available in many FOSS packages. Many calculations could probably even be carried out on an up-to-date smartphone!

The data in Figure 1 suggest that a variety of calculations are possible within a reasonable time with personal

devices. Combined with FOSS program packages that can be installed and kept up to date in a trivial fashion with a

package manager, computational chemistry can finally be made available to the masses, as students are able to run

(and modify!) FOSS packages on their own devices. The skills they gain doing so are directly transferable to both

research and industry, as the same packages can also be used for heavy-duty calculations on supercomputers, which is also freely allowed by their permissive licenses.

3 | OVERVIEW OF AVAILABLE FOSS PROGRAM PACKAGES

This section presents an overview of available FOSS program packages for computational chemistry. As the number of

FOSS projects has grown immensely in recent years, we restrict the overview to self-contained packages which are able

to run quantum electronic structure calculations from atomistic input. FOSS for other types of molecular modeling has

been discussed elsewhere,

122,123

while various computational chemistry resources for education have been recently

summarized by Rodríguez-Becerra et al.

124

As the availability of software is a moving goalpost, since new packages appear and old ones become technologically

obsolete and stop being maintained, any review can by force of necessity only represent the situation at a given point in

time. Continuously updated databases are an alternative that is (hopefully) always up to date,

125

but any observations

made on their basis similarly are tied to the time of observation and become outdated as enough time passes. For this

reason, new reviews are typically published whenever the availability of software has changed enough.

[Figure 1 plot. Legend: budget laptop, Celeron N4000; high-end business laptop, Core i7-10610U. Vertical axis: GFlops, logarithmic, from 10^-1 to 10^9. Horizontal axis: year, 1992 to 2021.]

FIGURE 1 The best-performing (red stars) and worst-performing (blue squares) supercomputer on the TOP500 list,

120

as well as the performance of a budget laptop with a Celeron N4000 processor and a high-end business laptop with a Core i7-10610U processor (see Supporting Information). Note the logarithmic scale on the y axis. The performance of the Raspberry Pi 4 was found to be similar to that of the Celeron N4000.


The main goal of this section is merely to illustrate the breadth of software that is already available for use in com-

putational chemistry. We have assembled the collection of packages by thorough literature and internet searches.

Because unmaintained packages are unlikely to be easy to install, or to become available as prepackaged software, we

limit the overview to software that shows at least some development activity in recent years, as checked from the

upstream development repositories. Even if it later turns out that we have missed some recently published software

package in this review, or if some packages become replaced by newer competitors after the publication of this article,

our main points should remain unaffected: there will likely still be a similar breadth of FOSS packages suitable for a

variety of purposes within computational chemistry and computational chemistry education.

As FOSS, the programs listed here can be packaged and distributed openly without restriction; several of them are

already available as part of Linux distributions such as Debian, Ubuntu, and Fedora Linux. Linux distribution packages

are centrally maintained by the Linux distribution's packagers, and require no special knowledge or local department

personnel to install them or keep the software up to date, in contrast to typical proprietary packages. As we show in the

Supporting Information, the packages can be installed on the command line; alternatively, they can also be installed

using the distribution's application store. Importantly, the software is also automatically kept up to date by the distribu-

tion package manager, whereas the installation and upkeep of proprietary packages tends to require significant local

expertise and time effort.

It is not even necessary to be running Linux to use such prepackaged programs. Windows users can run the

software under the Windows Subsystem for Linux (WSL), which allows installing and using a Linux distribution

easily inside Windows 10. The cross-platform Python Package Index

126

(PyPI) and Conda

127

package managers are

other alternatives for easy access to an increasing number of quantum chemistry packages on Linux, Windows,

and macOS. Computer laboratory settings can also be imitated using pre-made, customized live CDs or live USBs,

for example.
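Once the packages have been installed by any of the routes above, a quick check that the executables are reachable can save time at the start of a course session. The following is a minimal sketch in Python using only the standard library. The executable names (`xtb`, `nwchem`, `psi4`, and `pw.x` for Quantum Espresso) are assumptions about how the programs are typically exposed on the command line and may differ between distributions; the helper names `find_programs` and `report` are hypothetical.

```python
import shutil

# Assumed executable names; adjust them to match the local installation.
PROGRAMS = {
    "xtb": "xtb",
    "NWChem": "nwchem",
    "Psi4": "psi4",
    "Quantum Espresso": "pw.x",
}


def find_programs(programs=PROGRAMS):
    """Map each program name to the full path of its executable, or None."""
    return {name: shutil.which(exe) for name, exe in programs.items()}


def report(found):
    """Turn the result of find_programs into human-readable lines."""
    lines = []
    for name, path in found.items():
        status = path if path is not None else "not found on PATH"
        lines.append(f"{name}: {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(find_programs())))
```

A program reported as not found may simply be installed under a different executable name, for example an MPI-specific variant.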

Because of the large number of packages to review, we organize the discussion as follows:

• Programs for molecular calculations with Gaussian basis sets, Section 3.1

• Programs for solid-state calculations with various numerical approaches, Section 3.2

• Programs employing fully numerical methods, Section 3.3

• Programs employing semiempirical methods, Section 3.4

Due to space constraints, we only include minimalistic descriptions of the programs, and advise the reader to look

up the programs' evolving capabilities in detail on the internet to assess their usefulness for a given computational

chemistry course or other application. Most of the electronic structure programs support Hartree–Fock

(HF) and/or density functional theory

128,129

(DFT); several molecular programs also support various post-HF methods.

We will also discuss projects of a more limited scope in Section 3.5.

3.1 | Programs for molecular calculations with Gaussian basis sets

Gaussian basis sets dominate the field of quantum chemistry, since all electrons can efficiently be included in the calcu-

lation, the electronic Coulomb integrals can be evaluated analytically in the Gaussian basis,

130

and the evaluation is

efficient when recursion relations are used.

131,132

Thanks to many decades of work on the development of Gaussian

basis sets,

133–135

basis sets exist for the accurate reproduction of various molecular properties at several levels of theory.

Access to analytical integrals greatly facilitates the implementation of post-HF theories, and also guarantees accurate

force and Hessian evaluations.
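The analytical tractability of Gaussians can be illustrated with the simplest possible case. The sketch below is not a Coulomb integral but a one-dimensional overlap integral between two unnormalized Gaussians, which has the closed form $\sqrt{\pi/(a+b)}\,\exp[-ab(A-B)^2/(a+b)]$ by the Gaussian product theorem; the same theorem underlies the analytical evaluation of the more complicated integrals. The function names, exponents, and centers are illustrative assumptions, and the brute-force quadrature is included only as a cross-check.

```python
import math


def overlap_analytic(a, center_a, b, center_b):
    """Closed-form overlap of exp(-a (x-A)^2) and exp(-b (x-B)^2) over the real line."""
    p = a + b
    return math.sqrt(math.pi / p) * math.exp(-a * b / p * (center_a - center_b) ** 2)


def overlap_numeric(a, center_a, b, center_b, lo=-20.0, hi=20.0, n=40001):
    """The same integral by the trapezoidal rule on a uniform grid (cross-check only)."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(-a * (x - center_a) ** 2) * math.exp(-b * (x - center_b) ** 2)
    return total * h


if __name__ == "__main__":
    # Illustrative exponents and centers, not taken from any basis set.
    args = (0.5, 0.0, 1.2, 1.0)
    print(overlap_analytic(*args), overlap_numeric(*args))
```

The closed form costs a handful of floating-point operations, whereas the quadrature needs tens of thousands of function evaluations to reach comparable accuracy.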

Bagel

136

is a C++ program package that features, for example, analytical complete active space perturbation theory at

the second order (CASPT2) nuclear energy gradients and derivative couplings, relativistic multireference wave func-

tions based on the Dirac equation, and implementations of novel electronic structure theories.

Chronus Quantum

137

is a C++ program package that focuses on the consistent treatment of time dependence and spin

in the electronic wave function, as well as the inclusion of relativistic effects in said treatments.

Dalton

138

is a Fortran program that specializes in molecular properties at various levels of theory, such as frequency-

dependent response properties; one-, two-, and three-photon processes, etc. In addition to HF and DFT, Dalton features

several post-HF methods like multiconfigurational self-consistent field (MCSCF) theory and coupled-cluster theory.


Ergo

139

is a C++ program for linear-scaling HF and DFT calculations for molecules.

ERKALE

77

is a C++ program implementing HF and DFT that specializes in the modeling of inelastic X-ray spectros-

copies, self-interaction corrected DFT, as well as various orbital localization methods.

eT

140

is a C++ program primarily aimed at coupled-cluster calculations of molecular systems, which specializes in

multiscale and multilevel methods, as well as modern Cholesky decomposition techniques for two-electron integrals.

Fermi.jl

141

is a Julia package for HF and post-HF calculations.

JuliaChem

142

is a Julia package for HF calculations.

LSDalton

138

is a Fortran code targeted for linear-scaling HF and DFT calculations on large molecular systems, and also

includes some coupled-cluster capabilities.

MolGW

143

is a Fortran/C++ package that implements HF and DFT, but specializes in many-body perturbation theory:

the GW approximation and the Bethe–Salpeter equation.

MPQC

144

is a C++ program for massively parallel quantum chemistry, which originally focused on HF and DFT but

has later evolved support for post-HF many-body theories.

NWChem

83

is a major quantum chemistry package written in Fortran and has a variety of features for both molecular

and solid-state calculations.

Psi4

78

is a modular C++/Python package for HF, DFT, and various post-HF calculations that can be used either as a

traditional quantum chemistry package with simple and intuitive input files, or as Python modules for running calcula-

tions in Python.

PySCF

80

is a collection of Python modules for electronic structure calculations with significant capabilities also for

solid-state simulations, including, for example, coupled-cluster implementations for crystalline systems.

PyQuante

145

is a Python package for quantum chemistry with some C extensions that emphasizes ease of understanding

the code over performance.

OpenMolcas

81

is a Fortran package that specializes in multiconfigurational approaches to electronic structure theory,

but also implements various DFT calculations, for example.

Serenity

146

is a C++ program for subsystem quantum chemical methods.

SlowQuant

147

is a Python program for molecular quantum chemistry that derives its name from the use of Python for

even the computationally demanding parts of the program.

VeloxChem

148

is a C++/Python package for molecular properties and for modeling various spectroscopies based on

response theory.

Uquantchem

149

is a Fortran 90 program written for HF, DFT, Møller–Plesset perturbation theory, configuration interac-

tion singles and doubles, quantum Monte Carlo, and so on.

3.2 | Programs for solid-state calculations

The major difference between solid-state and molecular calculations is that the orbitals experience exponential decay in

molecular calculations, while solid-state calculations are performed on periodic crystals where the wave function has to

obey Bloch's theorem.

150

Because of the periodicity, calculations in the solid state are in many ways more difficult than

those in molecules due to the need for k-point sampling, for instance; see Ref. [151] for a recent introduction. Post-HF

methods are much less prominent in the solid state than in molecules. Instead, calculations on solids are typically car-

ried out with DFT and pseudopotentials

152

; pseudopotentials make the calculations less costly while introducing an

error which is typically negligible compared with the error in the density functional approximation itself.
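For reference, Bloch's theorem can be written as follows. The notation is the standard textbook one and is an assumption here, as it is not defined elsewhere in this article: $n$ is a band index, $\mathbf{k}$ a wave vector in the first Brillouin zone, $\mathbf{R}$ any lattice vector, and $u_{n\mathbf{k}}$ a function with the periodicity of the lattice.

```latex
\begin{align}
\psi_{n\mathbf{k}}(\mathbf{r}) &= e^{i\mathbf{k}\cdot\mathbf{r}}\, u_{n\mathbf{k}}(\mathbf{r}),
& u_{n\mathbf{k}}(\mathbf{r}+\mathbf{R}) &= u_{n\mathbf{k}}(\mathbf{r}), \\
\psi_{n\mathbf{k}}(\mathbf{r}+\mathbf{R}) &= e^{i\mathbf{k}\cdot\mathbf{R}}\, \psi_{n\mathbf{k}}(\mathbf{r}).
\end{align}
```

The second line follows directly from the first. Because the orbitals carry the continuous label $\mathbf{k}$, observables involve integrals over the Brillouin zone, which are approximated in practice by sums over a finite mesh of k-points; this is the k-point sampling mentioned above.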

The conventional way to model crystalline systems is to use plane waves. However, many other numerical schemes

have also been pursued. Note that the programs listed here that employ (pseudo)atomic basis functions can naturally

handle periodicity in 0, 1, 2, or 3 dimensions, corresponding to atoms and molecules, chains, sheets, and crystals,

respectively. Still, we have listed them as solid state codes because they are most often used for calculations with DFT

and pseudopotentials.

ABINIT

88

is a Fortran program for plane wave calculations that supports DFT as well as more advanced formalisms like

many-body perturbation theory.

ACE-Molecule

153

is a C++ program that employs uniform real-space grids of Lagrange sinc functions and

pseudopotentials, and supports density functional calculations on both periodic and non-periodic systems and wave

function theory calculations based on Kohn–Sham orbitals.


BigDFT

154

is a Fortran program that is based on the use of pseudopotentials and a two-tier Daubechies wavelet basis to

achieve a spatially localized basis.

Conquest

155

is a Fortran program for large-scale DFT calculations employing pseudo-atomic orbital basis sets.

CP2K

156

is a Fortran package based on Gaussian basis sets specializing in solid state physics, implementing HF, DFT,

Møller–Plesset perturbation theory, and the random phase approximation.

DFTK

110

or the density-functional toolkit is a collection of Julia routines for experimenting with plane-wave DFT that emphasizes simplicity and flexibility with the aim of facilitating algorithmic and numerical developments and simplifying interdisciplinary collaboration in solid-state research.

ELK,

157

EXCITING,

158

and FLEUR

159

are Fortran programs for linearized augmented plane wave calculations, which can reach microhartree-accurate total energies for carefully chosen basis sets.

GPAW

160

is a Python/C electronic structure program for DFT calculations within the projector-augmented wave

approach which supports three modes of operation: (i) finite-difference grids, (ii) numerical atomic orbitals, and

(iii) plane waves.

INQ

89

is a new, modular implementation of plane wave DFT and time-dependent DFT written from scratch to work on

graphics processing units (GPUs).

JDFTx

161

is a C++ plane wave DFT code aimed to be easy to develop and easy to use, whose key feature is support for

joint DFT for the description of electronic systems in contact with molecular liquids.

M-SPARC

162

is a MATLAB package for prototyping DFT calculations employing finite-difference grids and

pseudopotentials.

Octopus

163

is a Fortran program based on pseudopotentials and finite difference grids that focuses on time-dependent

DFT for handling non-equilibrium phenomena.

OpenMX

164

is a C package for DFT calculations with pseudopotentials and numerical atomic orbitals.

PARSEC

165

is a Fortran program based on finite-difference grids for density functional calculations with

pseudopotentials.

PWDFT.jl

166

is a Julia package written from scratch to facilitate development of novel computational methods using

plane waves.

RMG

167

is a C++/Fortran program employing real space grids and multigrid algorithms for density functional calcula-

tions with pseudopotentials.

Siesta

168

is a Fortran program for electronic structure calculations and ab initio molecular dynamics of molecules and

solids that employs a basis set of numerical atomic orbitals, which are strictly localized, enabling the use of sparsity.

Qbox

169

is a C++ program aimed for first principles molecular simulations using plane waves and pseudopotentials.

Quantum Espresso

90

is a Fortran/C program for plane wave calculations with pseudopotentials on a wide range of hard-

ware from laptops to supercomputers.

SPARC

170

is a C program for parallel DFT calculations employing finite-difference grids and pseudopotentials.

3.3 | Programs relying on fully numerical representations

The idea in modern fully numerical methods is to represent the orbitals directly in real space, and to use a representa-

tion of non-uniform accuracy (more grid points near the nuclei and fewer points in empty regions of the system) so that

all-electron calculations become feasible. Although fully numerical approaches have a long history for calculations on

atoms and diatomic molecules,

171

they are otherwise a relatively recent development in electronic structure theory and

have only recently become competitive with e.g. Gaussian-basis calculations whenever high accuracy is needed.

172

DFT-FE

95

is a C++ program that employs spectral finite-element basis sets for a local real-space variational formula-

tion of DFT, and is able to handle pseudopotential and all-electron calculations within the same framework and arbi-

trary periodicity.

HelFEM is a C++ program for fully numerical calculations on atoms

92,94

and diatomic molecules

91

at the HF or DFT

levels of theory employing high-order numerical basis functions and yielding fully variational energies.

MADNESS

173

is a C++ program that relies on the use of multiresolution adaptive grids, which has been used in a vari-

ety of studies on novel real-space approaches to electron correlation, for instance.

MRChem

172

is a C++ program that also relies on multiresolution adaptive grids for HF and DFT calculations on mole-

cules; its specialty is the computation of magnetic properties such as nuclear magnetic shielding constants.


x2dhf

174

is a Fortran program for non-relativistic finite difference restricted open-shell HF and DFT calculations on

diatomic molecules.

3.4 | Programs employing semiempirical models

Semiempirical models offer affordable techniques for approximate quantum mechanical calculations that fall

in accuracy in-between ab initio density functional calculations and force field techniques. Tight-binding

DFT

175–177

is probably the best-known semiempirical model, and it is available in several program packages.

Other types of semiempirical methods exist as well; please refer to Thiel

178

and Bannwarth et al.

179

for

discussion.

DFTB+

180

is a Fortran package for various calculations based on tight-binding DFT.

Latte

181

is a Fortran program for tight-binding DFT molecular dynamics.

Sparrow

182

is a C++/Python program for fast semiempirical quantum chemical calculations, including tight-binding

DFT.

xtb

179

is a Fortran package that implements various semiempirical eXtended Tight-Binding methods.

3.5 | Limited-scope projects

Although the main focus of our review is on self-contained packages for quantum electronic structure calculations for

computational chemistry education, this narrow scope risks not seeing the forest for the trees. The major part of

FOSS—the forest in the analogy—is a huge thriving ecosystem of small projects with limited scope, which wildly out-

number the more conspicuous large program packages—the trees—which exist in synergy with the smaller projects:

the smaller subprojects are often used by the larger programs. Therefore, in order to gain a thorough overview of FOSS it

is invaluable to extend our review from the self-contained packages reviewed above to projects of a more limited scope

which often have little user visibility.

The proliferation of small projects has multiple raisons d'être. The most common one is simply a specific personal

need. The good news is that because of the limited effort required to develop and maintain a code with a well-defined

scope, they can be developed and maintained by a single research group, or often even by a single person. The bad news

is that probably the majority of all FOSS projects in existence are unmaintained, simply because the authors moved on

to other things. As was already mentioned in the beginning of Section 3, we have not considered such projects in this

review.

3.5.1 | Keys to modular design

There is a systematic reason for the origin of the specific personal need mentioned in the previous paragraph: the DRY

[Don't Repeat Yourself] and KISS [Keep It Simple, Stupid!] principles, which have been key principles in software engi-

neering for an extended time and are still used to teach programming.

183

DRY is a reminder to avoid code duplication: a given functionality should only be programmed once and that imple-

mentation called everywhere it is needed, instead of repeating the same functionality in several places of the program.

The latter approach is more verbose, making it less maintainable and more prone to bugs.
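The DRY principle can be illustrated with a minimal sketch. In the Python example below, the interatomic distance is implemented once in `distance` and reused by two otherwise unrelated routines, instead of each routine repeating the square-root expression. All function names are hypothetical, and atomic units (bohr for coordinates, hartree for the energy) are assumed.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def bond_lengths(coords, bonds):
    """Lengths of the bonds listed as pairs of atom indices."""
    return [distance(coords[i], coords[j]) for i, j in bonds]


def nuclear_repulsion(coords, charges):
    """Sum of Z_i Z_j / r_ij over all distinct pairs of nuclei (atomic units assumed)."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i):
            energy += charges[i] * charges[j] / distance(coords[i], coords[j])
    return energy


if __name__ == "__main__":
    # Two unit charges 2 bohr apart: repulsion 1*1/2 = 0.5 hartree.
    coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
    print(bond_lengths(coords, [(0, 1)]), nuclear_repulsion(coords, [1, 1]))
```

If the distance later needs to change, for instance to account for periodic boundary conditions, only `distance` has to be modified and both callers pick up the change.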

In KISS, a complex problem is broken down into smaller subtasks. Once the subtasks—the common pieces of the

problem—have been identified, the principle is reapplied to the subtasks themselves: can they be broken down to a

compact collection of even simpler tasks?

Once a KISS design has been established, each component has a clear role in the design of the whole program. Even

though achieving the best design may in reality require several iterations of refactoring (restructuring) the code, the

effort in each iteration of the refactor is limited because even the code one is starting with should be quite simple if the

initial application of KISS was even partly successful.


3.5.2 | Is modular design a limitation?

A well-made design is like a puzzle: each software component fills in a piece of the puzzle by carrying out a small, well-

defined task. Each piece should ideally be so small that a working implementation can be developed in a matter of

hours.

The first attempt at the design of the program layout is often not fully successful, because the structure of a scientific

problem is not always clear before it has been fully solved. For this reason, program structures tend to develop

over time.

If a redesign of the modular structure of a problem leads to a more elegant or efficient implementation, it is

often adopted in a new version of the software. Such redesigns are extremely common in software development,

and are the reason for versioning software: the major version changes whenever the interface becomes incompati-

ble with the older version.

184

However, the redesign is often achievable through simple reorganizations of the ear-

lier code base. The software does not have to be rewritten, as the existing pieces can just be rearranged to fit the

new pattern.
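The role of the major version number can be made concrete with a small sketch. The Python example below assumes version strings of the form MAJOR.MINOR.PATCH and the rule that only a change in the major number signals an incompatible interface; the function names are hypothetical, and special cases such as pre-release tags or 0.x versions are deliberately ignored.

```python
def parse_version(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(required, installed):
    """Check whether the installed version can stand in for the required one.

    Assumed rule: the major versions must match, and the installed
    minor/patch level must be at least the required one.
    """
    req = parse_version(required)
    inst = parse_version(installed)
    return inst[0] == req[0] and inst[1:] >= req[1:]


if __name__ == "__main__":
    print(is_compatible("1.2.0", "1.4.3"))  # same major version, newer minor
    print(is_compatible("1.2.0", "2.0.0"))  # major version changed
```

A program written against major version 1 of a library would thus accept any later 1.x release, but would need to be ported before it can use version 2.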

If the design of a modular library changes enough, it can essentially become a wholly new library. In this case,

migrating to the newer version of the library may be a significant task for other projects, and the old and the new ver-

sion of the library may coexist for an extended time. A good example in the field of quantum chemistry is the libint

library of two-electron integrals,

185

which is used by several FOSS codes. A new major version of the library was intro-

duced in 2014 to take advantage of the new features afforded by modern processors, but many quantum chemistry pro-

grams still use the original version published in the early 2000s, since the functionality provided by the older version

suffices for the purposes it was designed for.

3.5.3 | The importance of interoperability

An example of a modular design that has stood the test of time is the Basic Linear Algebra Subprograms (BLAS) library,

which was originally introduced in the late 1970s.

186

BLAS implements elementary linear algebra operations, such as

adding, scaling, and multiplying vectors and matrices; operations which hold a central place in most branches of com-

putational science, including quantum chemistry, much of which is linear algebra.

Although a simple for-loop based implementation of BLAS operations, such as matrix–matrix multiplication $C_{ik} = \sum_{j} A_{ij} B_{jk}$, can be written up in minutes, the mathematical structure of the problems can be employed to design a

faster implementation. In a later step, the implementation can even be hand-optimized to the specific processor used in

the machine; competing optimized BLAS implementations are an active area of research.

187,188
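As a minimal sketch of the simple for-loop approach, the following pure-Python function evaluates $C_{ik} = \sum_j A_{ij} B_{jk}$ with three nested loops on lists of lists. The function name `matmul` is hypothetical, and the code is an illustration of the textbook algorithm rather than an excerpt from any BLAS implementation.

```python
def matmul(a, b):
    """Naive matrix product C = A B for matrices stored as lists of rows."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    if any(len(row) != n_inner for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    c = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(n_cols):
            total = 0.0
            for j in range(n_inner):
                total += a[i][j] * b[j][k]
            c[i][k] = total
    return c


if __name__ == "__main__":
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The loop structure makes the cubic cost evident: multiplying two n-by-n matrices takes n³ multiplications and n³ additions. Optimized implementations perform the same arithmetic but reorganize it, for example into blocks that fit in the processor cache.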

Although BLAS was published well before the FOSS movement gained steam via the internet, it serves as an excellent example of what can be achieved by the use of open source, or at least by sharing a common programming interface. BLAS is ubiquitous: everyone uses it, and there are many competing implementations.

When individual projects are interoperable, such as in the case of BLAS, the development of efficient programs is

greatly hastened. Simply using an optimized BLAS library instead of the reference implementation can in many

cases yield speedups of several orders of magnitude.

Unfortunately, interoperability is still hampered in the field of quantum chemistry since components are not

truly interoperable due to the lack of common standards. The evaluation of two-electron integrals is a good example: it is the rate-determining step in conventional HF calculations, and several implementations of two-electron

integrals have been published.

185,189–191

However, these implementations do not share a common interface.

Instead, the interfaces tend to reflect the structure of earlier legacy codes that have a large number of differing con-

ventions on the ordering, normalization, and signs of Gaussian basis functions, for instance. Despite some

attempts,

192,193

two-electron integrals libraries—or quantum chemistry programs, for that matter!—are still not

interoperable.

3.5.4 | The move to increased modularity

The situation may, however, be slowly changing. Libxc

84

has already standardized density functional calculations in over

30 electronic structure programs; XCFun

194

is another implementation of density functional approximations like Libxc that


has also been adopted by many codes, several of which support both Libxc and XCFun. Other types of libraries are also fol-

lowing suit. There is a growing ecosystem of modular electronic structure libraries as recently discussed by Oliveira et al.

104

in the scope of solid state calculations. We will complement it below with a brief overview of some modular open source projects that have come into use within several quantum chemistry programs. The use of common implementations will hopefully lead to more interoperability between electronic structure programs also in other aspects.

Given the multitude of small libraries that are available, the listing in this subsection is likely far from complete;

however, its goal is merely to illustrate that there is more to FOSS than the self-contained packages listed above. Spe-

cialized projects like these eliminate redundant work and enable rapid implementation of new features in quantum

chemistry programs.

Polarization, embedding, and quantum chemical models are a good example of modular functionality, since the

data structures needed to implement such models fit well in the modular design. Examples of such projects

include:

adcc

195

is a toolkit for implementing algebraic-diagrammatic construction (ADC) methods.

CheMPS2

196

is an implementation of the density matrix renormalization group method.

cppe

197

is an implementation of polarizable embedding.

DFT-D3

198

and DFT-D4

106

are implementations of semiempirical dispersion corrections for density functional

calculations.

libefp

199

is an implementation of the effective fragment potential method.

Libxc

84

contains implementations of density functional approximations which have been generated with computer

algebra.

PCMSolver

200

is an open-source library for the polarizable continuum model electrostatic problem.

XCFun

194

contains implementations of density functional approximations which employ automatic differentiation.

There are also several projects that specifically deal with Gaussian basis sets and that are thereby used by several

quantum chemistry codes.

The Basis Set Exchange

201

is a Python library for storing and managing Gaussian basis sets and converting basis sets

between various program formats; the project also has a web interface at http://www.basissetexchange.org which will

be more familiar to most readers.

erd

189

computes two-electron integrals with Rys quadrature.

libint

185

is a library for the evaluation of molecular integrals of many-body operators over Gaussian functions employing

Obara–Saika recursion routines.

libcint

190

is an integral library for automatically implementing general integrals for Gaussian-type scalar and spinor

basis functions using Rys quadrature.

simint

191

is a vectorized library for electron repulsion integrals employing Obara–Saika recursions.

libecpint

202

is a software library for evaluating effective core potential integrals.

3.5.5 | Visualization, manipulation, and analysis

The visualization, manipulation, and analysis tools discussed in this subsection are user-facing programs and are

thereby a more visible showcase of limited-scope projects than the lower-level libraries that were discussed in

Section 3.5.4. Indeed, simplified frontends are often invaluable for initializing, visualizing and analyzing calculations.

Several FOSS packages with graphical user interfaces are also available for this purpose; some even come with integra-

tion with FOSS electronic structure programs that allow running calculations within a graphical interface. For creating

models and visualizing computational results, FOSS graphical user interfaces such as Jmol,

203

Avogadro,

204

IQmol,

205

and PyMol

206

can be installed and used.

Unfortunately, the interoperability challenges mentioned in Section 3.5.3 affect visualization and analysis tools especially acutely, because these applications tend to require access to the electronic wave function, for which no universally accepted standard exists. This problem plagues the whole field of computational chemistry, affecting both FOSS and proprietary programs. In the absence of a universal standard, the interconversion of various input and output file formats between different programs can be carried out for example with the Open Babel

207

and cclib

208

packages.


The atomic simulation environment (ASE)

209

contains versatile tools for building molecular and periodic models

and enables easy retrieval of molecular structures from structural databases such as PubChem.

210

It can also act as a

frontend to several quantum chemical programs, thus offering a unified interface.

Calculations can be postprocessed with the Multiwfn

211

and ORBKIT

212

packages, for instance, which both support

several file formats.

4 | ILLUSTRATIONS OF FEASIBLE COMPUTATIONS

To enable a practical demonstration of the BYOD paradigm within computational chemistry education, it is time to

illustrate the easy access to several powerful FOSS quantum chemistry packages in two widely used Linux distributions:

Fedora and Ubuntu. The Supporting Information contains practical step-by-step examples of combining the BYOD paradigm with FOSS packages to run quantum chemical calculations. Four program packages are used in the practical illustrations: xtb (Section 4.1), NWChem (Section 4.2), Psi4 (Section 4.3), and

Quantum Espresso (Section 4.4). Installation instructions are provided for each code and all examples can be run under

Linux, macOS, or the Windows Subsystem for Linux. In all cases, the software can be installed in a matter of minutes

on a personal computer, either using a Linux distribution package manager or the Conda package manager. For conve-

nience, the Supporting Information is also available as a git repository.

213

4.1 | xtb

The primary design goal of xtb has been the fast calculation of structures and noncovalent interaction energies for molecular systems with up to roughly 1000 atoms [179,214]. The GFNn-xTB methods implemented in xtb are semiempirical quantum chemical methods [179] parametrized for the whole periodic table up to radon (Z = 86). A highly attractive feature of xtb is its performance: calculations on small molecules (10–20 atoms) finish in a matter of seconds even on a low-performance laptop computer. For instance, xtb is a powerful tool for preoptimizing geometries and molecular conformations before computationally more demanding calculations; see Ref. [215] for a recent application to water oxidation catalysis.

The Supporting Information includes step-by-step guidelines for installing xtb and using it to study structures, conformations, energetics, and molecular orbitals of inorganic and organic molecules. Calculations on the pharmaceutically relevant cisplatin and transplatin molecules shown in Figure 2 are briefly summarized here to showcase the basic use of xtb. Cisplatin, cis-[Pt(NH3)2Cl2], is a chemotherapy medication used in cancer treatment, whereas its stereoisomer transplatin, trans-[Pt(NH3)2Cl2], is ineffective in cancer treatment.

The Pt(II) atom is square-planar coordinated in both cisplatin and transplatin. Which configuration, cis or trans, is lower in energy? We use the xtb program to answer this question. The first task is to obtain initial geometries for the two molecules. In general, initial geometries can be obtained from structural databases such as PubChem [210]; built in a graphical user interface with programs such as Jmol, Avogadro, or IQmol; or built by hand in internal coordinates (bond lengths, angles, and dihedrals) in the Z-matrix formalism, for example. Hand-built molecular geometries for cisplatin and transplatin are given in XYZ format in Figures 3 and 4, respectively. While these geometries should be sufficiently close to optimal to allow for a straightforward optimization, they are still quite rough: the total energy is expected to change by several millihartrees in the geometry optimization, corresponding to changes in the energy of several kcal/mol.
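To make the XYZ format concrete, the following minimal sketch parses the hand-built cisplatin geometry of Figure 3 and measures the starting Pt–Cl and Pt–N distances. The helper names `parse_xyz` and `distance` are hypothetical and introduced only for this illustration; only the Python standard library is assumed.

```python
import math

# Hand-built cisplatin geometry copied from Figure 3 (XYZ format, angstrom).
CISPLATIN_XYZ = """11
cis-[Pt(NH3)2Cl2] (cisplatin); angstrom units
Pt 0.00000000 -0.00000000 -0.19134710
Cl 0.00000000 1.61220407 1.42085566
Cl 0.00000000 -1.61220407 1.42085566
N 0.00000000 1.40714181 -1.59849021
H 0.81649658 1.30951047 -2.16752575
H -0.81649658 1.30951047 -2.16752575
N 0.00000000 -1.40714181 -1.59849021
H -0.81649658 -1.30951047 -2.16752575
H 0.81649658 -1.30951047 -2.16752575
H 0.00000000 2.30951093 -1.16752621
H 0.00000000 -2.30951093 -1.16752621
"""


def parse_xyz(text):
    """Return a list of (symbol, (x, y, z)) tuples from XYZ-format text."""
    lines = text.strip().splitlines()
    natoms = int(lines[0])            # line 1: number of atoms
    atoms = []                        # line 2 is a free-form comment
    for line in lines[2:2 + natoms]:
        symbol, x, y, z = line.split()
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def distance(atom_a, atom_b):
    """Euclidean distance between two atoms in angstrom."""
    return math.dist(atom_a[1], atom_b[1])


atoms = parse_xyz(CISPLATIN_XYZ)
pt = atoms[0]
pt_cl = [distance(pt, atom) for atom in atoms if atom[0] == "Cl"]
pt_n = [distance(pt, atom) for atom in atoms if atom[0] == "N"]
print("Pt-Cl distances (angstrom):", [round(d, 3) for d in pt_cl])
print("Pt-N distances (angstrom):", [round(d, 3) for d in pt_n])
```

The starting geometry thus has Pt–Cl and Pt–N distances of about 2.28 and 1.99 Å, which the geometry optimization then refines.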

FIGURE 2 Cisplatin (left) and transplatin (right). Color coding: Pt = gray, Cl = green, N = blue, and H = white


The next step is to bring both molecules into a (local) minimum of the potential energy surface (PES) by optimizing the geometries with xtb. The point groups of the initial geometries are approximately C2v and C2h for cisplatin and transplatin, respectively, but symmetry is not enforced during the xtb optimizations. The only input needed by xtb in this case is the set of Cartesian coordinates of each molecule in XYZ format, which were given in Figures 3 and 4 for cisplatin and transplatin, respectively.
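As a hedged sketch of what such a run might look like, the snippet below assembles xtb command lines for the two geometry optimizations. The file names `cisplatin.xyz` and `transplatin.xyz` and the helper `xtb_opt_command` are assumptions made for this illustration; the Supporting Information remains the authoritative source for the exact commands. The snippet only builds and prints the commands and does not execute xtb.

```python
def xtb_opt_command(xyz_file, charge=0, gfn=2):
    """Build an xtb command line for a geometry optimization.

    --opt requests a geometry optimization, --gfn selects the GFNn-xTB
    parametrization, and --chrg sets the total molecular charge.
    """
    return ["xtb", xyz_file, "--opt", "--gfn", str(gfn), "--chrg", str(charge)]


# Hypothetical file names holding the geometries of Figures 3 and 4.
commands = [xtb_opt_command(name) for name in ("cisplatin.xyz", "transplatin.xyz")]
for cmd in commands:
    # Each command could be passed to subprocess.run(cmd, check=True),
    # ideally in its own working directory so that output files do not clash.
    print(" ".join(cmd))
```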

The geometry optimizations complete in seconds even on a low-performance computer; the Supporting Information contains all of the necessary inputs. For cisplatin, the optimized Pt–Cl and Pt–N distances are 2.24 and 2.15 Å, respectively. Considering the relatively low level of theory, the obtained distances are in reasonable agreement with the Pt–Cl and Pt–N distances of 2.25 and 2.06 Å, respectively, obtained with the much higher-level methods of Tasinato, Puzzarini, and Barone [216], who employed coupled-cluster theory with full single and double substitutions and perturbative triple substitutions, CCSD(T).

Comparing the total energies of the two stereoisomers after geometry optimization shows that the total energy of transplatin is 20 kJ/mol lower, that is, more negative, than that of cisplatin. This means that transplatin is the energetically more favorable stereoisomer of diamminedichloroplatinum(II), [Pt(NH3)2Cl2]. For comparison, Liu and Franke [217] reported an energy difference of 56 kJ/mol with a much higher level of theory: relativistic CCSD(T) employing direct perturbation theory, a 13s9p7d5f2g contracted Gaussian basis for Pt and aug-cc-pVQZ for other elements, evaluated on top of molecular geometries optimized with the Becke'88–Perdew'86 functional [218,219]. The result from xtb, which we were able to get in a matter of seconds, is in good qualitative (or even semiquantitative) agreement with the result obtained at the high level of theory. Next, in Section 4.2, we will revisit cisplatin and transplatin with DFT calculations that afford a step up in accuracy over xtb.
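Since the text mixes millihartrees, kcal/mol, and kJ/mol, a small unit-conversion sketch may help in comparing the numbers. The conversion factors are standard values; the function names are hypothetical and chosen only for this illustration.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996  # 1 hartree expressed in kJ/mol
KCAL_TO_KJ = 4.184                 # thermochemical calorie


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ/mol to kcal/mol."""
    return energy_kj / KCAL_TO_KJ


def kj_to_millihartree(energy_kj):
    """Convert an energy from kJ/mol to millihartree."""
    return 1000.0 * energy_kj / HARTREE_TO_KJ_PER_MOL


# Cis/trans energy differences quoted in the text, in kJ/mol.
differences = {"xtb": 20.0, "CCSD(T), Liu and Franke": 56.0}
for label, value in differences.items():
    print(f"{label}: {value:.0f} kJ/mol = {kj_to_kcal(value):.2f} kcal/mol "
          f"= {kj_to_millihartree(value):.2f} mEh")
```

In these units, the 20 kJ/mol difference from xtb corresponds to roughly 4.8 kcal/mol, or about 7.6 millihartree.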

11
cis-[Pt(NH3)2Cl2] (cisplatin); angstrom units
Pt 0.00000000 -0.00000000 -0.19134710
Cl 0.00000000 1.61220407 1.42085566
Cl 0.00000000 -1.61220407 1.42085566
N 0.00000000 1.40714181 -1.59849021
H 0.81649658 1.30951047 -2.16752575
H -0.81649658 1.30951047 -2.16752575
N 0.00000000 -1.40714181 -1.59849021
H -0.81649658 -1.30951047 -2.16752575
H 0.81649658 -1.30951047 -2.16752575
H 0.00000000 2.30951093 -1.16752621
H 0.00000000 -2.30951093 -1.16752621

FIGURE 3 Molecular geometry of cisplatin in XYZ format

11
trans-[Pt(NH3)2Cl2] (transplatin); angstrom units
Pt 0.00000000 0.00000000 0.00000000
Cl 2.27999997 -0.00036653 0.00000000
Cl -2.27999997 0.00036653 0.00000000
N -0.00031991 -1.98999997 0.00000000
H 0.46944690 -2.32340883 -0.81740913
H 0.46944690 -2.32340883 0.81740913
N 0.00031991 1.98999997 0.00000000
H -0.46944690 2.32340883 -0.81740913
H -0.46944690 2.32340883 0.81740913
H 0.94318252 2.32318174 0.00000000
H -0.94318252 -2.32318174 0.00000000

FIGURE 4 Molecular geometry of transplatin in XYZ format

4.2 | NWChem

NWChem is a program that has been developed for almost 30 years. Consequently, a large number of features are available in the code: HF, DFT, as well as post-HF calculations, ab initio molecular dynamics, and so on. NWChem has been designed to run on high-performance parallel supercomputers as well as on conventional workstations. The Supporting Information includes step-by-step guidelines for installing NWChem and using it to study the same pharmaceutically relevant cisplatin and transplatin molecules that were studied with xtb in Section 4.1.

We choose to use non-empirical DFT in the NWChem examples. Although NWChem also includes more accurate ab initio methods such as coupled-cluster theories, we shall not consider them in this work, since their proper use requires much more understanding and computational power than DFT does, and since such methods are typically not included in undergraduate-level courses. We choose the non-empirical PBE0 hybrid functional [85,220,221] (sometimes also known as hybrid PBE or PBEh), which provides reasonable geometries and energetics across the periodic table and shows good performance for complexes with d- and f-metals [222,223].

Even though DFT is simpler than many post-HF theories, setting up adequate DFT calculations still requires some consideration. The one-electron basis set is one of the most important aspects to consider in any electronic structure calculation, such as our PBE0 calculation with NWChem, because its choice has an immense effect on the computational cost and accuracy of the calculation. The GFNn-xTB methods discussed above in Section 4.1 did not require the specification of a basis set, as the basis set is already an essential part of the specification of the GFNn-xTB methods themselves. In HF, DFT, and post-HF calculations, however, the basis set, which parametrizes the allowed degrees of freedom for the movement of the electrons, does need to be specified.

Because of the profound importance of the choice of the basis set, various types of Gaussian basis sets have a long history in quantum chemistry [133]. Although many readers will be familiar with traditional basis sets like STO-3G [224], 3-21G [225], and 6-31G* [226], the development of computer processors and quantum chemical models in recent decades has also led to significant advances in basis set design. Hundreds of Gaussian basis sets intended for various purposes are nowadays available on the Basis Set Exchange [201], for example.

Because the basis set is an approximation, it is highly desirable to be able to control its accuracy in order to make tradeoffs between the cost of the calculation and the accuracy of the obtained results. Accordingly, modern basis sets typically come in families of varying size [134,135]: the smallest sets enable quick but qualitative calculations, while the larger sets enable quantitative computations at the cost of more computer time. In contrast to traditional basis sets, modern basis set families allow for a cost-efficient approach to the complete basis set limit, at which point the error in the one-electron basis set no longer affects the calculation. Note that basis sets other than Gaussians may also be used for quantum chemistry; see Ref. [171] for further discussion.

In this work, we will only consider the Karlsruhe def2 family of Gaussian basis sets [227], which are a good all-round choice for general chemistry as they are available for the whole periodic table up to radon (Z = 86). As radon is an element of the 6th period, while relativistic effects are already essential for the chemistry of the 5th period [228,229], relativistic effects are described in the def2 basis sets through the use of effective core potentials (ECPs) [230]. The ECP describes the chemically inactive, deep-core electrons only implicitly, which also decreases the overall cost of the calculation.
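The effect of the ECP on the size of the problem can be illustrated with a simple electron count for cisplatin. The sketch below assumes that the def2 ECP for Pt replaces 60 core electrons; this number is not stated in the text and should be checked against the basis set library actually used. The function name is hypothetical.

```python
ATOMIC_NUMBER = {"H": 1, "N": 7, "Cl": 17, "Pt": 78}

# Assumption: the def2 ECP for Pt replaces 60 core electrons;
# the lighter elements in cisplatin carry no ECP.
ECP_CORE_ELECTRONS = {"Pt": 60}

# Composition of cisplatin, [Pt(NH3)2Cl2].
CISPLATIN = {"Pt": 1, "N": 2, "H": 6, "Cl": 2}


def count_electrons(composition, charge=0):
    """Return (all electrons, electrons treated explicitly with the ECP)."""
    total = sum(ATOMIC_NUMBER[el] * n for el, n in composition.items()) - charge
    core = sum(ECP_CORE_ELECTRONS.get(el, 0) * n for el, n in composition.items())
    return total, total - core


total, explicit = count_electrons(CISPLATIN)
print(f"All electrons: {total}, explicit electrons with ECP: {explicit}")
```

Under this assumption, fewer than 60% of the 132 electrons of cisplatin need to be described explicitly.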

The Karlsruhe def2 sets come in three levels of accuracy. Split-valence (SV) basis sets are the smallest reasonable basis sets for general applications. The def2-SVP basis is an SV basis set with polarization (P) functions, and is similar in size to the 6-31G** basis set, also known as 6-31G(d,p). Like 6-31G**, the def2-SVP set can also be used without polarization functions on hydrogen atoms; this basis is called def2-SV(P), it is smaller than the 6-31G* basis, and it is often useful for quick qualitative or semiquantitative calculations. For more quantitative calculations, the def2 series also contains a triple-ζ valence polarization set (def2-TZVP) as well as a quadruple-ζ valence polarization set (def2-QZVP), which typically suffice for achieving the complete basis set limit in HF and DFT calculations. Calculations at post-HF levels of theory, however, require larger basis sets with additional polarization functions; the def2-TZVPP and def2-QZVPP basis sets exist for this purpose. Diffuse functions (D) are necessary for the proper description of anions as well as for modeling, for example, electric polarizabilities; diffuse-augmented sets are likewise available at all levels of accuracy (def2-SVPD, def2-TZVPD, def2-TZVPPD, def2-QZVPD, and def2-QZVPPD) [231].

For the present demonstration, we choose the def2-TZVP basis set, as triple-ζ basis sets are well known to yield energies that are sufficiently close to the complete basis set limit (see also the applications in Sections 4.3.1 and 4.3.2). Although hybrid functionals are computationally more demanding than non-hybrid functionals, it is notable that the dispersion-corrected hybrid PBE0-D4 generalized gradient approximation (GGA) functional was recently shown to outperform the dispersion-corrected, meta-GGA-type non-hybrid r2SCAN-D4 functional in accuracy even for reaction energies of metal–organic reactions [232].


Having completed our introduction to DFT calculations, basis sets, and NWChem, we proceed as in the xtb workflow: the first task is to bring both molecules into a (local) minimum of the potential energy surface (PES) by means of geometry optimization. The geometry optimization is started from the same hand-built initial geometries presented in Section 4.1. In contrast to xtb, NWChem is capable of employing the point group symmetry (C2v and C2h for cisplatin and transplatin, respectively) during the geometry optimization in order to speed up both the electronic structure calculation and the geometry optimization, and will do so by default. This means that the calculation runs faster, but also that the molecule is constrained to the point group of the initial geometry during the whole optimization. If the user is not careful, this constraint may be harmful, as the use of symmetry may sometimes lead to convergence to a saddle point instead of a local minimum.

The input required for NWChem is more complicated than that for xtb. Running NWChem requires setting up an input file that contains various computational parameters in addition to the input geometry. Fully annotated input files can be found in the Supporting Information; a shortened example is shown in Figure 5.

The geometry optimizations of cisplatin and transplatin finish in a matter of minutes on one processor core, depending on the computer used. The optimized Pt–Cl and Pt–N distances for cisplatin are 2.28 and 2.08 Å, respectively. These values are in excellent agreement with the values of Tasinato, Puzzarini, and Barone [216] that were discussed in Section 4.1, that is, Pt–Cl and Pt–N distances of 2.25 and 2.06 Å, respectively: the bond lengths agree to within 0.03 Å.

Next, comparing the total PBE0/def2-TZVP energies of the two stereoisomers shows that transplatin is 54 kJ/mol lower (more negative) than cisplatin. Our DFT value is in good quantitative agreement with the energy difference of 56 kJ/mol obtained by Liu and Franke [217] using a high-level CCSD(T) method; however, in contrast to their CCSD(T) calculations, our DFT calculations can be performed in a matter of minutes even on a personal computer.

For cisplatin, we also write out the molecular orbitals after the geometry has been optimized. The molecular orbitals from the non-empirical PBE0/def2-TZVP calculations can now be compared with the ones from the semiempirical xtb calculations from Section 4.1; see Figure 6. The frontier orbitals, that is, the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO), from the xtb and NWChem calculations are in good agreement. HOMO-3, HOMO-2, and HOMO-1 also appear similar; the HOMO-2 and HOMO-1 orbitals are merely switched between the NWChem and xtb calculations. The energetic ordering of orbitals can easily switch when the orbitals have similar energies; reorderings of the occupied orbitals have no effect on the properties of the system.

title "Cisplatin"
charge 0
geometry units angstroms autosym 0.1
  Pt 0.00000000 -0.00000000 -0.19134710
  Cl 0.00000000 1.61220407 1.42085566
  Cl 0.00000000 -1.61220407 1.42085566
  N 0.00000000 1.40714181 -1.59849021
  H 0.81649658 1.30951047 -2.16752575
  H -0.81649658 1.30951047 -2.16752575
  N 0.00000000 -1.40714181 -1.59849021
  H -0.81649658 -1.30951047 -2.16752575
  H 0.81649658 -1.30951047 -2.16752575
  H 0.00000000 2.30951093 -1.16752621
  H 0.00000000 -2.30951093 -1.16752621
end
dft
  xc pbe0
  mult 1
  iterations 100
end
basis spherical
  * library def2-tzvp
end
ecp
  Pt library def2-ecp
end
driver
  maxiter 100
  xyz
end
task dft optimize

FIGURE 5 NWChem example: PBE0/def2-TZVP geometry optimization of cisplatin; for transplatin, the nuclear coordinates given in Figure 4 are used instead

From the point of view of crystal field theory, the Pt(II) atom in cisplatin has a square-planar coordination and eight 5d electrons. The four HOMOs and the LUMO all involve Pt 5d orbitals. In line with crystal field theory, both NWChem and xtb show that the LUMO involves the Pt $5d_{x^2-y^2}$ orbital. HOMO-3 involves the Pt $5d_{z^2}$ orbital, while the $5d_{xy}$, $5d_{xz}$, and $5d_{yz}$ orbitals contribute to HOMO-2, HOMO-1, and HOMO. As is clearly seen from the data presented above, the non-empirical PBE0/def2-TZVP and the semiempirical GFN2-xTB levels of theory provide a similar description of the frontier orbitals of the Pt(II) complex. Again, the full inputs for the calculations are given in the Supporting Information.

4.3 | Psi4

While NWChem represents the older and more established quantum chemistry codes, Psi4 represents the newer generation of quantum chemistry codes. The origins of Psi4 trace to the Psi3 research code written in C++ for high-accuracy studies on small molecules [79]. Compared with Psi3, Psi4 is designed to be a user-friendly, general-purpose code for fast, automated computations on molecules with hundreds of atoms [78]. Psi4 contains a number of computational methods ranging from HF and DFT to post-HF methods such as Møller–Plesset perturbation theory [233], coupled-cluster theory [234], configuration interaction theory, orbital-optimized correlation methods, symmetry-adapted perturbation theory, multireference methods, and so on [78]. Although the core of the program is still in C++, Psi4 has thorough Python interfaces and can be used either as a traditional quantum chemistry program with input files, or directly from Python.

We will demonstrate the use of Psi4 in the context of two common exercises in elementary courses on computational chemistry: a conformational study of methylcyclohexane and the reproduction of the molecular geometry of the chromyl fluoride (CrO2F2) molecule, with special consideration of the one-electron basis set. We will again focus on the def2 family of basis sets that was introduced in Section 4.2.

4.3.1 | Methylcyclohexane

Starting out with the conformational study of methylcyclohexane, the workflow is as follows. First, the molecule is built

in a molecular editor such as Avogadro, IQmol, or Jmol, and the drawn molecular structure is preoptimized using a

force field available in the editor; the goal of the preoptimization is merely to ensure that the bond lengths are realistic

so that the electronic structure calculations during the geometry optimization converge without problems, and so that

the bonding pattern does not change.

FIGURE 6 The four highest occupied MOs (HOMOs) and the lowest unoccupied MO (LUMO) of cisplatin as obtained from NWChem (PBE0/def2-TZVP) and xtb (GFN2-xTB). The color code for the nuclei is the same as in Figure 2, while red and blue denote positive and negative orbital amplitudes, respectively (note that the overall sign of the orbital can be freely chosen). The isovalue used for the orbitals is 0.04 electrons/Bohr^3


In the next step, the molecular structure is reoptimized with xtb, and a conformational search is carried out with the Conformer-Rotamer Ensemble Sampling Tool (CREST) program, which employs xtb and has been shown to reproduce conformational ensembles to good accuracy [235–237]. Again, the Supporting Information includes short tutorials for installing and using the CREST code [236]. CREST finds four conformers and outputs them in order of increasing energy.

The four conformers are then reoptimized in Psi4 at the PBE0/def2-TZVP [85,220,221,227] level of theory introduced above in Section 4.2. Psi4 employs density fitting [238–242] by default; this means that the universal fitting basis for Hartree–Fock calculations [243] is used in the calculation. The Psi4 input file for the first conformer is shown in Figure 7. The inputs for the other conformers are analogous and are not repeated here; they are, however, available in the Supporting Information.

molecule {
0 1
C -1.0139237009 0.0001157060 -0.3320119090
C -0.3010211074 1.2491572923 0.1879180723
C -0.3011951696 -1.2490517349 0.1878718396
C 1.1683390004 1.2516621049 -0.2233071254
C 1.1681695646 -1.2517772582 -0.2232981267
C 1.8703096243 -0.0000923733 0.2934985390
C -2.4834630882 0.0000222911 0.0795247173
H -0.9582190930 0.0002005854 -1.4269602139
H -0.3718670923 1.2740378936 1.2781840671
H -0.7951641526 2.1435985756 -0.1985907954
H -0.7954642203 -2.1434127996 -0.1986469736
H -0.3720420205 -1.2738839625 1.2781559690
H 1.6616052523 2.1443202680 0.1678151692
H 1.2391547021 1.2815104197 -1.3133695212
H 1.2390062002 -1.2817145988 -1.3133390411
H 1.6612233508 -2.1444918905 0.1679208818
H 2.9153982958 -0.0001245162 -0.0238784763
H 1.8521966765 -0.0001224730 1.3859698783
H -2.5743116471 0.0004900512 1.1639789401
H -2.9899376694 0.8819226593 -0.3066017637
H -2.9892458557 -0.8827595520 -0.3054682049
}
set basis def2-tzvp
optimize('pbe0')

FIGURE 7 Psi4 example: PBE0/def2-TZVP geometry optimization for the lowest-lying methylcyclohexane conformer
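Because the inputs for the other conformers differ from Figure 7 only in the coordinates, they can be generated from XYZ-format structures. The following is a minimal sketch of such a converter that follows the layout of Figure 7; the function name `xyz_to_psi4_input` is hypothetical, and the transplatin geometry of Figure 4 is used only to exercise the function.

```python
# Transplatin geometry copied from Figure 4, used here only as test data.
TRANSPLATIN_XYZ = """11
trans-[Pt(NH3)2Cl2] (transplatin); angstrom units
Pt 0.00000000 0.00000000 0.00000000
Cl 2.27999997 -0.00036653 0.00000000
Cl -2.27999997 0.00036653 0.00000000
N -0.00031991 -1.98999997 0.00000000
H 0.46944690 -2.32340883 -0.81740913
H 0.46944690 -2.32340883 0.81740913
N 0.00031991 1.98999997 0.00000000
H -0.46944690 2.32340883 -0.81740913
H -0.46944690 2.32340883 0.81740913
H 0.94318252 2.32318174 0.00000000
H -0.94318252 -2.32318174 0.00000000
"""


def xyz_to_psi4_input(xyz_text, charge=0, multiplicity=1,
                      basis="def2-tzvp", functional="pbe0"):
    """Return Psi4 input text in the layout of Figure 7 for one XYZ structure."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0])                        # first line: atom count
    coords = [line.strip() for line in lines[2:2 + natoms]]
    return (
        "molecule {\n"
        f"{charge} {multiplicity}\n"              # charge and spin multiplicity
        + "\n".join(coords) + "\n"
        "}\n"
        f"set basis {basis}\n"
        f"optimize('{functional}')\n"
    )


psi4_input = xyz_to_psi4_input(TRANSPLATIN_XYZ)
print(psi4_input)
```

The resulting text could be written to a file and passed to Psi4 in the same way as the input of Figure 7.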

TABLE 1 Conformer energy differences $\Delta E_{\text{conformer } n} = E_{\text{conformer } n} - E_{\text{conformer } 1}$ in kcal/mol and number of basis functions $N_{\text{bf}}$ for the methylcyclohexane conformers according to PBE0 calculations with various basis sets, evaluated at the PBE0/def2-TZVP optimized geometries

| Method | $N_{\text{bf}}$ | Conformer 2 | Conformer 3 | Conformer 4 |
|---|---|---|---|---|
| PBE0/STO-3G | 49 | 1.19 | 5.54 | 5.78 |
| PBE0/STO-6G | 49 | 1.25 | 5.57 | 5.84 |
| PBE0/MINAO | 49 | 0.85 | 5.08 | 5.05 |
| PBE0/def2-SV(P) | 126 | 2.00 | 6.62 | 7.07 |
| PBE0/def2-SVP | 168 | 1.97 | 6.57 | 7.01 |
| PBE0/def2-TZVP | 301 | 2.10 | 6.31 | 6.74 |
| PBE0/def2-QZVP | 819 | 2.11 | 6.31 | 6.73 |
| GFN2-xTB (CREST geometry) | | 1.51 | 5.32 | 5.36 |

Note: For comparison, the GFN2-xTB data from the CREST output is also included.


With the PBE0/def2-TZVP optimized geometries at hand for each of the four conformers, we perform single-point calculations on each conformer in a variety of basis sets; the resulting energy differences to the lowest-energy conformer (#1) are given in Table 1. In addition to the def2 family, we have also included data for the MINAO basis, consisting of the minimal-basis Hartree–Fock orbitals extracted from the triple-ζ cc-pVTZ basis set [244], as well as the STO-3G and STO-6G basis sets, which are 3-Gaussian and 6-Gaussian function expansions of a minimal-basis Slater-type orbital (STO) basis set, respectively [224]. (It is important to note in this context that not all STO basis sets are minimal: STO basis sets of various sizes ranging up to polarized quadruple-ζ have been reported [245,246] and remain widely used for practical calculations in programs employing STO basis sets.)
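The numbers of basis functions in Table 1 can be rationalized by counting contracted shells. The sketch below assumes the commonly quoted contraction patterns for carbon and hydrogen (for example, [5s3p2d1f] for carbon and [3s1p] for hydrogen in def2-TZVP) and spherical functions with 2l + 1 components per shell; these patterns are not given in the text and are stated here as assumptions.

```python
# Number of contracted shells per angular momentum (s, p, d, f, g),
# assumed contraction patterns for C and H in each basis set.
SHELLS = {
    "STO-3G":     {"C": (2, 1),          "H": (1,)},
    "def2-SV(P)": {"C": (3, 2, 1),       "H": (2,)},
    "def2-SVP":   {"C": (3, 2, 1),       "H": (2, 1)},
    "def2-TZVP":  {"C": (5, 3, 2, 1),    "H": (3, 1)},
    "def2-QZVP":  {"C": (7, 4, 3, 2, 1), "H": (4, 3, 2, 1)},
}

# Methylcyclohexane, C7H14.
COMPOSITION = {"C": 7, "H": 14}


def functions_per_atom(shell_counts):
    """Spherical basis functions: each shell of angular momentum l has 2l + 1."""
    return sum(n * (2 * l + 1) for l, n in enumerate(shell_counts))


def n_basis_functions(basis, composition=COMPOSITION):
    """Total number of spherical basis functions for the molecule."""
    return sum(functions_per_atom(SHELLS[basis][el]) * count
               for el, count in composition.items())


for basis in SHELLS:
    print(f"{basis:12s} {n_basis_functions(basis):4d}")
```

With these assumptions, the counts agree with the $N_{\text{bf}}$ column of Table 1 for the listed basis sets.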

The data in Table 1 lead us to the following insights. First, even the minimal basis sets correctly identify the lowest-energy conformer and largely reproduce the energy ordering of the conformers: although MINAO flips the order of conformers 3 and 4, it still predicts conformer 1 to be the lowest in energy. Note that this comparison is restricted to the use of fixed geometries; relaxing the geometries in each basis might change the conclusion somewhat. The good performance of the minimal basis sets for this application shows that conformational energies enjoy an excellent degree of error cancellation, which is one of the main motivations for using atomic basis sets in the first place [171].

The shortcomings of minimal basis sets are showcased by the large differences between the results obtained with the MINAO and STO-nG basis sets. Minimal basis sets are as small as possible and thereby have very little flexibility: good accuracy for one type of system does not translate to good accuracy in another system, and minimal basis sets generally have poor predictive power for chemistry [134,135]. MINAO is derived from atomic calculations only, and is thereby fully biased toward atoms, while the Slater-type orbital basis used by Hehre, Stewart, and Pople [224] is optimized for an average molecular environment, which is reflected in the slightly improved results in Table 1. However, this is only achieved at the cost of a bias toward molecules, meaning that the STO-nG basis sets are not as good for isolated atoms.

TABLE 2 Geometric parameters of chromyl fluoride (CrO2F2) at various levels of theory

| Method | Basis | r(CrF) (Å) | r(CrO) (Å) | ∠(OCrO) (°) | ∠(FCrF) (°) |
|---|---|---|---|---|---|
| GFN1-xTB | | 1.525 | 1.597 | 111.37 | 106.53 |
| GFN2-xTB | | 1.548 | 1.671 | 111.50 | 110.38 |
| PW92 | STO-3G | 1.491 | 1.584 | 109.44 | 108.14 |
| | STO-6G | 1.495 | 1.589 | 109.59 | 107.71 |
| | def2-SV(P) | 1.548 | 1.684 | 108.41 | 110.80 |
| | def2-SVP | 1.541 | 1.675 | 108.35 | 110.58 |
| | def2-TZVP | 1.551 | 1.693 | 108.33 | 110.26 |
| | def2-QZVP | 1.554 | 1.695 | 108.20 | 110.48 |
| PBE | STO-3G | 1.504 | 1.606 | 109.47 | 108.05 |
| | STO-6G | 1.507 | 1.611 | 109.61 | 107.65 |
| | def2-SV(P) | 1.565 | 1.713 | 108.41 | 110.75 |
| | def2-SVP | 1.557 | 1.704 | 108.38 | 110.48 |
| | def2-TZVP | 1.568 | 1.721 | 108.45 | 110.01 |
| | def2-QZVP | 1.571 | 1.724 | 108.30 | 110.23 |
| r2SCAN | STO-3G | 1.497 | 1.602 | 109.98 | 106.94 |
| | STO-6G | 1.500 | 1.605 | 110.26 | 106.22 |
| | def2-SV(P) | 1.553 | 1.700 | 108.83 | 109.48 |
| | def2-SVP | 1.545 | 1.692 | 108.77 | 109.25 |
| | def2-TZVP | 1.554 | 1.706 | 108.89 | 108.80 |
| | def2-QZVP | 1.556 | 1.708 | 108.76 | 108.96 |
| Experiment (a) | | 1.575 | 1.720 | 107.8 | 111.9 |
| Experiment (b) | | 1.55 | 1.71 | | |

(a) Experimental values from Ref. [255].
(b) Experimental values from Ref. [256].

It is generally preferable to use larger and more flexible basis sets in applications, which guarantee a uniform accuracy for all types of systems, and to try to converge the results to the complete basis set limit. This means controllably removing the error made in the one-electron basis set approximation until the error becomes negligible either in absolute value, or in comparison to the other sources of error in the calculation, such as the error inherent in the employed density functional approximation, for example.

As discussed above, the smallest reasonable basis for general applications is def2-SV(P). It predicts conformational energies to within roughly 0.3 kcal/mol of the converged quadruple-ζ values, as can be seen from Table 1. As shown by the comparison between the def2-SV(P) and def2-SVP data, the role of polarization functions on hydrogen is small for the studied conformational energies.

Systematically more converged energies are obtained by going to the triple-ζ def2-TZVP basis and the quadruple-ζ def2-QZVP basis. The data show that the triple-ζ calculations are already converged to 0.01 kcal/mol in the conformer energy differences, demonstrating the usefulness of modern, systematic basis set families: the complete basis set limit can be approached simply by using larger and larger basis sets.
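The convergence statements above can be checked directly against Table 1. The following sketch computes, for each basis set, the largest absolute deviation of the conformer energy differences from the def2-QZVP reference; the data are copied from Table 1, and the function name is hypothetical.

```python
# Conformer energy differences (kcal/mol) for conformers 2, 3, and 4,
# copied from the PBE0 rows of Table 1.
TABLE_1 = {
    "STO-3G":     (1.19, 5.54, 5.78),
    "STO-6G":     (1.25, 5.57, 5.84),
    "MINAO":      (0.85, 5.08, 5.05),
    "def2-SV(P)": (2.00, 6.62, 7.07),
    "def2-SVP":   (1.97, 6.57, 7.01),
    "def2-TZVP":  (2.10, 6.31, 6.74),
    "def2-QZVP":  (2.11, 6.31, 6.73),
}
REFERENCE = "def2-QZVP"


def max_abs_deviation(basis, reference=REFERENCE):
    """Largest absolute deviation from the reference basis in kcal/mol."""
    return max(abs(value - ref)
               for value, ref in zip(TABLE_1[basis], TABLE_1[reference]))


for basis in TABLE_1:
    if basis != REFERENCE:
        print(f"{basis:12s} {max_abs_deviation(basis):.2f} kcal/mol")
```

The largest deviation drops from more than 0.8 kcal/mol for the minimal basis sets to about 0.3 kcal/mol for the split-valence sets and 0.01 kcal/mol for def2-TZVP.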

For comparison, Table 1 also includes data for the GFN2-xTB method from the CREST output [214]. An inspection of the data confirms that GFN2-xTB correctly reproduces the energy ordering of the conformers, and that the conformer energy differences are reproduced at an accuracy comparable to the minimal basis set calculations, with the converged PBE0/def2-QZVP data as reference. These data suggest that historical applications of minimal basis sets in quantum chemistry can be straightforwardly replaced with modern semiempirical calculations with xtb, for instance, which have a much lower computational cost.

Studying a single molecular geometry is in general insufficient if the molecule has the potential for multiple low-lying conformers. The data in Table 1 demonstrate the importance of proper conformational sampling in applications to thermochemistry or chemical reactions, for instance: in the case of methylcyclohexane, insufficient conformational sampling can cause errors of up to 7 kcal/mol, which may easily surpass the error arising from the level of theory or the basis set.

4.3.2 | Geometry of chromyl fluoride

For a somewhat more complicated example, we study the equilibrium geometry of chromyl fluoride (CrO2F2) at various levels of DFT, which is known to be surprisingly accurate for simple transition metal complexes [247]. CrO2F2 assumes a tetrahedral geometry. Again, the workflow is to build the molecule in a molecular editor, preoptimize the molecular geometry with xtb, and then run the geometry optimizations in Psi4; however, the optimization is now done separately for each basis set, in contrast to the procedure used in Section 4.3.1.

For this study, we choose the GFN1-xTB [248] and GFN2-xTB [214] semiempirical methods as well as a set of non-empirical density functionals: the Perdew–Wang 1992 (PW92) local density approximation (LDA) [150,249,250], the Perdew–Burke–Ernzerhof (PBE) GGA [85], and the r2SCAN meta-GGA functional, which represents the state of the art in non-empirical density functionals [251,252]. The geometry optimizations are undertaken with very tight convergence thresholds to ensure benchmark-quality geometries.

Density fitting is again used in these calculations. As we only consider density functionals that do not contain exact exchange in this application, smaller auxiliary basis sets optimized for reproducing only Coulomb interactions could be employed [253]; however, for simplicity we stick to the Psi4 default, which is to use the larger auxiliary basis sets [243] that also work in the presence of exact exchange, as in the PBE0 functional used in Sections 4.2 and 4.3.1.

The results shown in Table 2 demonstrate that while the STO-nG minimal basis sets [224,254] yield relatively poor geometries compared with the experimental values from Refs. [255,256], already the split-valence def2-SV(P) basis set [227] leads to bond lengths that are converged to 0.03 Å and angles that are converged to fractions of a degree. The differences become smaller, that is, the bond lengths and angles become more converged on going to the larger basis sets, with the differences between the def2-TZVP and def2-QZVP results being already negligible.

The bond lengths from the PBE/def2-QZVP calculations are in excellent agreement with the older experimental values from Ref. [255]; the bond angles are in reasonable agreement with the experimental data from the same reference. r2SCAN/def2-QZVP, in turn, is in excellent agreement with the newer experimental bond lengths from Ref. [256].
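The quality of the agreement can be quantified with the bond lengths of Table 2. The sketch below computes the signed deviations of the def2-QZVP bond lengths from the two sets of experimental values; the data are copied from Table 2, and the helper name is hypothetical.

```python
# Bond lengths (r(CrF), r(CrO)) in angstrom, copied from Table 2.
CALCULATED = {
    "PBE/def2-QZVP":    (1.571, 1.724),
    "r2SCAN/def2-QZVP": (1.556, 1.708),
}
EXPERIMENT = {
    "Ref. [255]": (1.575, 1.720),
    "Ref. [256]": (1.55, 1.71),
}


def deviations(method, reference):
    """Signed deviations (calculated minus experiment) in angstrom."""
    return tuple(round(calc - exp, 3)
                 for calc, exp in zip(CALCULATED[method], EXPERIMENT[reference]))


pbe_vs_older = deviations("PBE/def2-QZVP", "Ref. [255]")
r2scan_vs_newer = deviations("r2SCAN/def2-QZVP", "Ref. [256]")
print("PBE vs Ref. [255] (CrF, CrO):", pbe_vs_older)
print("r2SCAN vs Ref. [256] (CrF, CrO):", r2scan_vs_newer)
```

In both comparisons the calculated bond lengths deviate from the respective experimental values by less than 0.01 Å.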


4.4 | Quantum Espresso

Quantum Espresso (QE) is an integrated suite of FOSS codes for electronic structure calculations based on DFT, plane waves, and pseudopotentials. The QE distribution consists of a set of core components and programs, a set of plug-ins for more advanced tasks, and a number of third-party packages designed to be interoperable with the core components. QE can be used to study the geometries, energetics, thermodynamics, electronic properties, response properties, spectroscopic properties, and transport properties of solid-state materials. The Supporting Information includes step-by-step guidelines for installing QE and using it to study two polymorphs of zinc(II) sulfide, ZnS.

ZnS crystallizes in two principal forms, sphalerite and wurtzite (Figure 8). Sphalerite is a naturally occurring mineral belonging to the cubic crystal system with space group F4̄3m (No. 216). Both Zn and S atoms are tetrahedrally coordinated in the sphalerite structure and the crystal structure can be considered as a diamond lattice with two atom types. Wurtzite is also a naturally occurring mineral and it can be considered as a hexagonal polymorph of sphalerite, crystallizing in the space group P6₃mc (No. 186). The coordination with nearest and next-nearest neighbors in wurtzite is identical to that in sphalerite. The first structural differences between the two polymorphs arise only in the third shell of neighbors [257]. From a thermodynamical point of view, sphalerite is the low-temperature ZnS polymorph in bulk form

FIGURE 8 Two polymorphs of ZnS: Sphalerite (left) and wurtzite (right). Zinc atoms in blue, sulfur atoms in yellow. For wurtzite, the c-axis points upward

&CONTROL
calculation='vc-relax'
prefix='zns'
/
&SYSTEM
space_group=216 ! Space group
a=5.4093 ! Lattice parameter a in angstroms
nat=2 ! Number of atoms in the asymmetric unit
ntyp=2 ! Number of atom types. Here, Zn and S.
ecutwfc=40 ! Kinetic energy cutoff for wavefunctions (Ry)
ecutrho=200 ! Kinetic energy cutoff for charge density and potential (Ry)
/
ATOMIC_SPECIES
Zn 65.38 zn_pbe_v1.uspp.F.UPF
S 32.065 s_pbe_v1.4.uspp.F.UPF
ATOMIC_POSITIONS crystal_sg
Zn 0.00000 0.00000 0.00000
S 0.25000 0.25000 0.25000
K_POINTS automatic
8 8 8 0 0 0

FIGURE 9 Quantum Espresso example: Geometry optimization of sphalerite-ZnS with the PBE functional and GBRV pseudopotentials. Fully annotated input files can be found in the Supporting Information


and the transition temperature to wurtzite is 1293 ± 10 K [258]. Wurtzite-ZnS is thus metastable at room temperature, but it is found in nature and can also be produced synthetically.

The illustrative QE calculations are carried out with the non-empirical PBE exchange-correlation functional [85]. To run the calculations with QE, we need pseudopotentials that have been developed for this functional. Here we use the ultrasoft Garrity–Bennett–Rabe–Vanderbilt (GBRV) pseudopotentials, which form a highly accurate and computationally inexpensive open-source pseudopotential library that has been designed and optimized for use in high-throughput DFT calculations [259]. The main attractive feature of the GBRV pseudopotentials is that they are tailored for relatively small plane wave cutoffs of 40 Rydberg for wave functions and 200 Rydberg for the charge density and potential [259], resulting in affordable computational costs.
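To put the quoted cutoffs into more familiar units, the following minimal sketch converts them from Rydberg to hartree and electronvolts. The variable names mirror the `ecutwfc` and `ecutrho` keywords of the input file; the helper functions are hypothetical and written only for this illustration.

```python
# Unit conversions for the plane wave cutoffs quoted in the text.
RYDBERG_IN_EV = 13.605693123  # CODATA value of 1 Ry in eV
RYDBERG_IN_HARTREE = 0.5      # 1 Ry is exactly half a hartree


def ry_to_ev(energy_ry):
    """Convert an energy from Rydberg to electronvolts."""
    return energy_ry * RYDBERG_IN_EV


def ry_to_hartree(energy_ry):
    """Convert an energy from Rydberg to hartree."""
    return energy_ry * RYDBERG_IN_HARTREE


ecutwfc_ry = 40.0   # wave function cutoff from the text
ecutrho_ry = 200.0  # charge density and potential cutoff from the text

for label, value in (("ecutwfc", ecutwfc_ry), ("ecutrho", ecutrho_ry)):
    print(f"{label}: {value:.0f} Ry = {ry_to_hartree(value):.0f} Eh "
          f"= {ry_to_ev(value):.1f} eV")

# Ratio between the density cutoff and the wave function cutoff
cutoff_ratio = ecutrho_ry / ecutwfc_ry
print(f"ecutrho / ecutwfc = {cutoff_ratio:.1f}")
```

The density cutoff is five times the wave function cutoff here, which is larger than the factor of four that suffices for norm-conserving pseudopotentials; ultrasoft pseudopotentials generally need the larger ratio.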

To study sphalerite-ZnS and wurtzite-ZnS with QE, we need their crystal structures. A good source for crystal structure data is the Crystallography Open Database (COD) [260], which is where we obtained the structures in the Crystallographic Information File (CIF) format; the COD structures are available in the Supporting Information.

There are several ways in which the crystal structures can be entered in QE input files. In the example here, we have directly used the crystallographic information to create an input file, which is shown in Figure 9; a helpful resource for building QE input files is the QE input generator and structure visualizer provided by the Materials Cloud [261].
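To illustrate what the space group and asymmetric unit encode, the sketch below expands the two atoms of the sphalerite input into the eight atoms of the conventional cubic cell and checks the tetrahedral coordination of zinc. It is an illustration under stated assumptions: only the F-centering translations are applied, which is sufficient for the special positions (0, 0, 0) and (1/4, 1/4, 1/4) but would not be for a general position, and the function names are hypothetical.

```python
import math

a = 5.4093  # experimental lattice parameter in angstrom, from the input file

# F-centering translations of the conventional cubic cell (fractional units)
FCC_SHIFTS = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]

# Asymmetric unit as given in the input file
ASYMMETRIC_UNIT = [("Zn", (0.0, 0.0, 0.0)), ("S", (0.25, 0.25, 0.25))]


def expand_cell(asymmetric_unit, shifts):
    """Apply the centering translations to every atom of the asymmetric unit."""
    atoms = []
    for symbol, position in asymmetric_unit:
        for shift in shifts:
            frac = tuple((p + s) % 1.0 for p, s in zip(position, shift))
            atoms.append((symbol, frac))
    return atoms


def distance(frac1, frac2, lattice_parameter):
    """Minimum-image distance between two fractional positions in a cubic cell."""
    delta = [x - y for x, y in zip(frac1, frac2)]
    delta = [d - round(d) for d in delta]
    return lattice_parameter * math.sqrt(sum(d * d for d in delta))


atoms = expand_cell(ASYMMETRIC_UNIT, FCC_SHIFTS)
zn_origin = atoms[0][1]  # the Zn atom at (0, 0, 0)

# Distances from the Zn atom at the origin to every S atom in the cell
s_distances = sorted(distance(zn_origin, frac, a)
                     for symbol, frac in atoms if symbol == "S")
nearest = s_distances[0]
coordination = sum(1 for d in s_distances if abs(d - nearest) < 1e-6)

print(f"atoms in conventional cell: {len(atoms)}")
print(f"Zn-S nearest-neighbor distance: {nearest:.4f} angstrom")
print(f"number of nearest S neighbors: {coordination}")
```

The nearest-neighbor distance equals a√3/4, as expected for a diamond-type lattice with two atom types.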

4.4.1 | Optimal geometry

Before attempting any calculations, it is important to determine how dense a sampling of the reciprocal space (k-sampling) is needed to describe the materials sufficiently accurately. The convergence tests described in the Supporting Information show that an 8 × 8 × 8 Monkhorst–Pack [262] k-point mesh leads to a truncation error smaller than 1 meV for sphalerite-ZnS. A comparable k-point spacing is then also used for wurtzite-ZnS.
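The notion of a k-point spacing can be made concrete with a short calculation. The sketch below computes the lengths of the reciprocal lattice vectors of a cell and divides them by the mesh dimensions. It assumes that the 8 × 8 × 8 mesh refers to the primitive FCC cell of sphalerite with the experimental lattice parameter; the choice of primitive vectors and the function names are illustrative, and the meshes actually used are documented in the Supporting Information.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def reciprocal_lengths(a1, a2, a3):
    """Lengths of the reciprocal lattice vectors, including the factor 2*pi."""
    volume = dot(a1, cross(a2, a3))
    b_vectors = (cross(a2, a3), cross(a3, a1), cross(a1, a2))
    return [2 * math.pi * math.sqrt(dot(b, b)) / abs(volume) for b in b_vectors]


def kpoint_spacing(lengths, mesh):
    """Spacing between neighboring k-points along each reciprocal vector."""
    return [length / n for length, n in zip(lengths, mesh)]


a = 5.4093  # experimental sphalerite lattice parameter in angstrom
# One common choice of primitive FCC lattice vectors (an assumption here)
fcc_primitive = [(0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0)]

lengths = reciprocal_lengths(*fcc_primitive)
spacing = kpoint_spacing(lengths, (8, 8, 8))
print("reciprocal vector lengths (1/angstrom):", [f"{x:.4f}" for x in lengths])
print("k-point spacing (1/angstrom):", [f"{x:.4f}" for x in spacing])
```

The same functions can be applied to the hexagonal wurtzite cell to judge which mesh gives a similar spacing along each reciprocal lattice vector.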

The geometry optimization of sphalerite-ZnS finishes in a few minutes, while that of wurtzite-ZnS may take tens of minutes when run on a single processor core. The optimized lattice parameters are in good agreement with the experimental lattice parameters found in the COD. The optimized lattice parameters are a = 5.447 Å for sphalerite-ZnS and a = 3.846 Å and c = 6.304 Å for wurtzite-ZnS, whereas the experimental lattice parameters are a = 5.4093 Å for sphalerite-ZnS and a = 3.811 Å and c = 6.234 Å for wurtzite-ZnS [260]. This means that the computations overestimate the lattice parameters by approximately 1% relative to experiment.
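The size of the overestimation can be checked directly from the numbers quoted above. The following short sketch computes the relative deviation for each lattice parameter; the dictionary layout and function name are chosen only for this illustration.

```python
# (optimized, experimental) lattice parameters in angstrom, as quoted in the text
LATTICE_PARAMETERS = {
    "sphalerite a": (5.447, 5.4093),
    "wurtzite a": (3.846, 3.811),
    "wurtzite c": (6.304, 6.234),
}


def percent_deviation(computed, reference):
    """Signed relative deviation of a computed value from a reference, in percent."""
    return 100.0 * (computed - reference) / reference


deviations = {name: percent_deviation(computed, experimental)
              for name, (computed, experimental) in LATTICE_PARAMETERS.items()}

for name, deviation in deviations.items():
    print(f"{name}: {deviation:+.2f} %")
```

The deviations come out at roughly 0.7% to 1.1%, all positive, in line with the well-known tendency of PBE to slightly overestimate lattice parameters.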

The energy comparison of the optimized sphalerite-ZnS and wurtzite-ZnS structures shows that the total energies differ by only 0.6 kJ/mol per formula unit. This value is in good agreement with Cardona et al. [263], who reported an energy difference of less than 0.008 eV (0.8 kJ/mol) per formula unit from LDA and GGA calculations on ZnS polymorphs. The energy difference is so small because the crystal structures are so similar: differences arise only in the

[Figure 10 plot. Title: Sphalerite-ZnS band structure. Vertical axis: Energy / eV, from -4 to 4. Horizontal axis: Wave vector, along Γ, X, W, K, Γ, L, U, W, L, K.]

FIGURE 10 Electronic band structure of sphalerite-ZnS obtained with PBE functional and GBRV pseudopotentials


third-nearest neighbor shell, as was already mentioned above. Note that so far we have only compared electronic total

energies; Gibbs free energies should be considered instead for a full understanding of the thermodynamics, but this is

beyond the scope of this work.
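Since the energy differences above are quoted in both kJ/mol and eV, a small conversion sketch may help in comparing them. The conversion factor is the standard value of 1 eV per particle expressed in kJ/mol; the function names are hypothetical.

```python
EV_IN_KJ_PER_MOL = 96.48533212  # 1 eV per particle expressed in kJ/mol


def ev_to_kj_per_mol(energy_ev):
    """Convert an energy per formula unit from eV to kJ/mol."""
    return energy_ev * EV_IN_KJ_PER_MOL


def kj_per_mol_to_mev(energy_kj_per_mol):
    """Convert an energy per formula unit from kJ/mol to meV."""
    return 1000.0 * energy_kj_per_mol / EV_IN_KJ_PER_MOL


# Upper bound reported in the literature, quoted in the text as 0.008 eV
literature_bound_kj = ev_to_kj_per_mol(0.008)
# Energy difference from the present calculations, quoted as 0.6 kJ/mol
computed_mev = kj_per_mol_to_mev(0.6)

print(f"0.008 eV = {literature_bound_kj:.2f} kJ/mol")
print(f"0.6 kJ/mol = {computed_mev:.1f} meV")
```

The computed difference of about 6 meV per formula unit is only a few times larger than the 1 meV k-point truncation error targeted in the convergence tests, which is one reason why tight convergence matters when comparing polymorphs.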

4.4.2 | Band structure

The second practical example illustrates how the electronic band structure of sphalerite-ZnS can be calculated and plotted with QE. In any band structure calculation, the band path in the reciprocal space has to be defined in terms of k-points. The band path depends on the Bravais lattice of the crystal structure. An excellent source for band paths is the SeeK-path service [264], which readily provides crystal-structure-based band paths for several program packages. Here, we use the face-centered cubic (FCC) band path from Setyawan and Curtarolo [265], and the resulting electronic band structure of sphalerite-ZnS is illustrated in Figure 10.
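How a band path translates into the horizontal axis of a band structure plot can be sketched as follows. The path order is read from the axis labels of Figure 10. The fractional coordinates of the special points are the commonly tabulated FCC values, referred to the reciprocal vectors of one particular choice of primitive cell; both should be treated as assumptions and checked against the SeeK-path output before use, because other programs may use a different primitive cell convention.

```python
import math

# Reciprocal lattice vectors in units of 2*pi/a, for the primitive FCC cell
# a1 = (0, a/2, a/2), a2 = (a/2, 0, a/2), a3 = (a/2, a/2, 0) (assumed convention)
B = [(-1.0, 1.0, 1.0), (1.0, -1.0, 1.0), (1.0, 1.0, -1.0)]

# Special points in fractional coordinates of B (assumed tabulation); G is Gamma
SPECIAL_POINTS = {
    "G": (0.0, 0.0, 0.0),
    "X": (0.5, 0.0, 0.5),
    "W": (0.5, 0.25, 0.75),
    "K": (0.375, 0.375, 0.75),
    "L": (0.5, 0.5, 0.5),
    "U": (0.625, 0.25, 0.625),
}

# Path order as read from the axis labels of the band structure figure
PATH = ["G", "X", "W", "K", "G", "L", "U", "W", "L", "K"]


def to_cartesian(frac):
    """Convert fractional reciprocal coordinates to Cartesian, in units of 2*pi/a."""
    return tuple(sum(frac[i] * B[i][j] for i in range(3)) for j in range(3))


def cumulative_distances(path):
    """Cumulative path length at each special point, in units of 2*pi/a."""
    points = [to_cartesian(SPECIAL_POINTS[label]) for label in path]
    distances = [0.0]
    for start, end in zip(points, points[1:]):
        distances.append(distances[-1] + math.dist(start, end))
    return distances


ticks = cumulative_distances(PATH)
for label, position in zip(PATH, ticks):
    print(f"{label}: {position:.4f}")
```

The printed values are the positions at which the special-point labels would be placed on the wave vector axis if the segments are drawn to scale.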

From the band structure plot in Figure 10, we can see that sphalerite-ZnS has a direct band gap of about 2 eV at the Γ point when using the PBE functional and the GBRV pseudopotentials. The band structure in Figure 10 is in good agreement with the PBE band structure available in the Materials Project [112]. However, the PBE calculations severely underestimate the experimental band gap measured at 10 K, which is about 3.8 eV [266]. The agreement with experiment could be improved, for example, with the DFT+U approach or with hybrid density functionals, both of which are outside the scope of this work.

5 | SUMMARY AND CONCLUSIONS

We have argued that FOSS allows for a BYOD approach to the teaching of computational chemistry, and finally affords computational chemistry for the masses, thereby also democratizing the science of computational chemistry. The distributed BYOD approach to computational chemistry also supports the delivery of massive open online courses (MOOCs), avoiding the need to organize computing resources for a large number of students in a cost-effective and secure way. We have briefly reviewed the current selection of FOSS programs for electronic structure calculations, and illustrated the installation and practical use of several programs for computational chemistry education on personal computers. As the technical barriers for running quantum chemical calculations on personal laptops have practically vanished, educators can focus on content creation and developing practices for sharing and co-creating computational chemistry teaching material as Open Educational Resources [267]. The Psi4Education project [5,268] is one such attempt at open teaching materials. We hope open materials become more readily available and more thoroughly used in the future.

On a final note, we would like to point out that the free availability of FOSS operating system kernels, compilers, debuggers, and user-space tools (which have not been discussed in this review) has had a critical role in enabling the development of the plethora of FOSS projects discussed within this work, as well as our own work. We would like to thank the entire FOSS community for providing high-quality tools for a variety of purposes, and invite our readers to join the FOSS movement.

ACKNOWLEDGMENTS

We thank Paul Saxe and Jonathan Moussa for invaluable comments on an early stage of this manuscript. We also thank

all the anonymous peer reviewers of this manuscript for constructive criticisms which have similarly helped to improve

the structure and content of this paper. A. J. K. thanks Business Finland for Co-Innovation funding (Grant

No. 3767/31/2019).

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS

Susi Lehtola: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); resources (equal); software (equal); validation (equal); visualization (equal); writing – original draft (equal); writing – review and editing (lead). Antti Karttunen: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); resources (equal); software (equal); validation (equal); visualization (equal); writing – original draft (equal); writing – review and editing (supporting).

DATA AVAILABILITY STATEMENT

Data available in article supplementary material. The data is also openly available in a public repository that does not

issue DOIs.

ORCID

Susi Lehtola https://orcid.org/0000-0001-6296-8103

Antti J. Karttunen https://orcid.org/0000-0003-4187-5447

RELATED WIRES ARTICLES

The Chronus Quantum software package

VeloxChem: A Python-driven density-functional theory program for spectroscopy simulations in high-performance computing environments

Extended tight-binding quantum chemistry methods

REFERENCES

1. Westmoreland P. Applying molecular and materials modeling. 1st ed. Netherlands: Springer; 2002.

2. Head-Gordon M, Artacho E. Chemistry on the computer. Phys Today. 2008;61:58–63.

3. Deglmann P, Schäfer A, Lennartz C. Application of quantum calculations in the chemical industry: an overview. Int J Quantum Chem.

2015;115:107–36.

4. Weiß H, Deglmann P, In't Veld PJ, Cetinkaya M, Schreiner E. Multiscale materials modeling in an industrial environment. Annu Rev

Chem Biomol Eng. 2016;7:65–86.

5. Fortenberry RC, McDonald AR, Shepherd TD, Kennedy M, Sherrill CD. PSI4Education: computational chemistry labs using free soft-

ware. In: Daus K, Rigsby R, editors. The promise of chemical education: addressing our Students' needs. Washington, DC: American

Chemical Society; 2015. p. 85–98.

6. Grushow A, Reeves M. Using computational methods to teach chemical principles. Washington, DC: American Chemical Society; 2019.

7. Esselman BJ, Hill NJ. Integration of computational chemistry into the undergraduate organic chemistry laboratory curriculum. J Chem

Educ. 2016;93:932–6.

8. Winfield LL, McCormack K, Shaw T. Using iSpartan to support a student-centered activity on alkane conformations. J Chem Educ.

2018;96:89–92.

9. Esselman BJ, Hill NJ. Integrating computational chemistry into an organic chemistry laboratory curriculum using WebMO. Using com-

putational methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 139–62.

10. Phillips JA. Modeling reaction energies and exploring noble gas chemistry in the physical chemistry laboratory. Using computational

methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 33–50.

11. Reeves MS, Berghout HL, Perri MJ, Singleton SM, Whitnell RM. How can you measure a reaction enthalpy without going into the lab?

Using computational methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 51–63.

12. Martini SR, Hartzell CJ. Integrating computational chemistry into a course in classical thermodynamics. J Chem Educ. 2015;92:1201–3.

13. Stocker KM. Using electronic structure calculations to investigate the kinetics of gas-phase ammonia synthesis. Using computational

methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 21–32.

14. Snyder HD, Kucukkal TG. Computational chemistry activities with Avogadro and ORCA. J Chem Educ. 2021;98:1335–41.

15. Hoover GC, Dicks AP, Seferos DS. Upper-year materials chemistry computational modeling module for organic display technologies.

J Chem Educ. 2021;98:805–11.

16. Furlan PY, Bell-Loncella ET. Integrating computation and visualization to enhance learning IR spectroscopy in the general chemistry

laboratory: computer-assisted learning of IR spectroscopy. Spectrosc Lett. 2010;43:618–25.

17. Martin WR, Ball DW. Using computational chemistry to extend the acetylene rovibrational spectrum to C₂T₂. Using computational methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 93–107.

18. DeVore TC. Introducing quantum calculations into the physical chemistry laboratory. Using computational methods to teach chemical

principles. Washington, DC: American Chemical Society. 2019. p. 109–25.

19. JCE Staff. Computational chemistry for the masses. J Chem Educ. 1996;73:104.

20. Grushow A, Reeves MS. Using computational methods to teach chemical principles: overview. Using computational methods to teach chemical principles. Washington, DC: American Chemical Society; 2019. https://pubs.acs.org/doi/abs/10.1021/bk-2019-1312.ch001

21. WebMO A web-based interface to computational chemistry packages [cited 2021 May 8]. Available from: https://www.webmo.net/

22. Polik WF, Schmidt JR. WebMO: web-based computational chemistry calculations in education and research. Wiley Interdiscip Rev

Comput Mol Sci. 2022. 12 (1):e1554. https://doi.org/10.1002/wcms.1554

23. Perri MJ, Akinmurele M, Haynie M. Chem Compute Science Gateway: an online computational chemistry tool. Using computational

methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 79–92.


24. Kobayashi R, Goumans TPM, Carstensen NO, Soini TM, Marzari N, Timrov I, et al. Virtual computational chemistry teaching

laboratories—hands-on at a distance. J Chem Educ. 2021;98:3163–71.

25. Schwalbe S, Fiedler L, Kraus J, Kortus J, Trepte K, Lehtola S. PYFLOSIC: python-based Fermi–Löwdin orbital self-interaction correc-

tion. J Chem Phys. 2020;153:084104.

26. Krylov AI, Herbert JM, Furche F, Head-Gordon M, Knowles PJ, Lindh R, et al. What is the price of open-source software? J Phys Chem

Lett. 2015;6:2751–4.

27. Jacob CR. How open is commercial scientific software? J Phys Chem Lett. 2016;7:351–3.

28. Li L. Why should anyone become a scientist? The ideal of science and its importance. J Chem Educ. 1999;76:20.

29. Azoulay P, Fons-Rosen C, Zivin JSG. Does science advance one funeral at a time? Am Econ Rev. 2019;109:2889–920.

30. Giles J. Software company bans competitive users. Nature. 2004;429:231–1.

31. Smart AG. The war over supercooled water. Phys Today. 2018.

32. Palmer JC, Haji-Akbari A, Singh RS, Martelli F, Car R, Panagiotopoulos AZ, et al. Comment on “the putative liquid-liquid transition is

a liquid-solid transition in atomistic models of water” [I and II: J. Chem. Phys. 135, 134503 (2011); J. Chem. Phys. 138, 214504 (2013)].

J Chem Phys. 2018;148:137101.

33. Open Source Initiative. The open source definition [cited 2021 May 13]. Available from: https://opensource.org/osd

34. Free Software Foundation. What is free software? [cited 2021 May 13]. Available from: https://www.gnu.org/philosophy/free-sw.

html.en

35. Stahl MT. Open-source software: not quite endsville. Drug Discov Today. 2005;10:219–22.

36. Gezelter JD. Open source and open data should be standard practices. J Phys Chem Lett. 2015;6:1168–9.

37. Hinsen K. Computational science: shifting the focus from tools to models. F1000Research. 2014;3:101.

38. Git Community. Git, a free and open source distributed version control system [cited 2021 May 20]. Available from: https://git-

scm.com/

39. GitHub, Inc. Github collaboration platform [cited 2021 May 20]. Available from: https://github.com/

40. GitLab, Inc. Gitlab collaboration platform [cited 2021 May 20]. Available from: https://gitlab.com/

41. European Organization For Nuclear Research and OpenAIRE. Zenodo; 2013.

42. Swarts J. Open-source software in the sciences: the challenge of user support. J Bus Tech Commun. 2018;33:60–90.

43. Dalke A. The chemfp project. J Cheminform. 2019;11:76.

44. Haff G. How open source ate software. Berkeley, CA: Apress; 2018.

45. Kitware Inc. About Kitware [cited 2021 Jan 28]. Available from: https://www.kitware.com/about/

46. Ahrens J, Geveci B, Law C. ParaView: an end-user tool for large-data visualization. Visualization handbook. Oxford, UK: Elsevier;

2005. p. 717–31.

47. McCormick M, Liu X, Jomier J, Marion C, Ibanez L. ITK: enabling reproducible research and open science. Front Neuroinform. 2014;8:13.

48. Hoffman B, Cole D, Vines J. Software process for rapid development of HPC software using CMake. In: 2009 DoD high performance computing modernization program users group conference (IEEE); 2009.

49. Hanwell MD, Harris C, Genova A, Haghighatlari M, Khatib ME, Avery P, et al. Open chemistry, JupyterLab, REST, and quantum

chemistry. Int J Quantum Chem. 2021;121:e26472.

50. European Commission. Open science [cited 2021 Jan 28]. Available from: https://ec.europa.eu/info/research-and-innovation/strategy/

strategy-2020-2024/our-digital-future/open-science

51. Wieber F, Pisanty A, Hocquet A. “We were here before the web and hype…”: a brief history of and tribute to the computational chemistry list. J Cheminform. 2018;10:67.

52. Constant D, Sproull L, Kiesler S. The kindness of strangers: the usefulness of electronic weak ties for technical advice. Organ Sci. 1996;

7:119–35.

53. Lakhani KR, von Hippel E. How open source software works: “free” user-to-user assistance. Res Policy. 2003;32:923–43.

54. Schiff A. The economics of open source software: a survey of the early literature. Rev Netw Econ. 2002;1:66–74.

55. Myatt DP. Equilibrium selection and public-good provision: the development of open-source software. Oxf Rev Econ Policy. 2002;18:

446–61.

56. Johnson JP. Open source software: private provision of a public good. J Econ Manage Strategy. 2002;11:637–62.

57. Mustonen M. Copyleft—the economics of Linux and other open source software. Inf Econ Policy. 2003;15:99–121.

58. Lerner J, Tirole J. Some simple economics of open source. J Ind Econ. 2003;50:197–234.

59. Bonaccorsi A, Rossi C. Why open source software can succeed. Res Policy. 2003;32:1243–58.

60. Hawkins RE. The economics of open source software for a competitive firm. Netnomics. 2004;6:103–17.

61. Bitzer J. Commercial versus open source software: the role of product heterogeneity in competition. Econ Syst. 2004;28:369–81.

62. Lerner J, Tirole J. The economics of technology sharing: open source and beyond. J Econ Perspect. 2005;19:99–120.

63. Lerner J. The scope of open source licensing. J Law Econ Organ. 2005;21:20–56.

64. Bitzer J, Schröder PJH. Bug-fixing and code-writing: the private provision of open source software. Inf Econ Policy. 2005;17:389–406.

65. West J, Gallagher S. Challenges of open innovation: the paradox of firm investment in open-source software. R&D Management. 2006;

36:319–31.

66. Rossi MA. Decoding the free/open source software puzzle. In: Bitzer J, Schröder PJH, editors. The economics of open source software

development. Amsterdam, The Netherlands: Elsevier; 2006. p. 15–55.


67. Gaudeul A. Do open source developers respond to competition? The (LA)TEX case study. Rev Netw Econ. 2007;6:239–63.

68. von Krogh G, von Hippel E. The promise of research on open source software. Manage Sci. 2006;52:975–83.

69. Hars A, Ou S. Working for free? Motivations for participating in open-source projects. Int J Electron Commer. 2002;6:25–39.

70. Bitzer J, Schrettl W, Schröder PJH. Intrinsic motivation in open source software development. J Comp Econ. 2007;35:160–9.

71. Lerner J, Pathak PA, Tirole J. The dynamics of open-source contributors. Am Econ Rev. 2006;96:114–8.

72. Fershtman C, Gandal N. Open source software: motivation and restrictive licensing. Int Econ Econ Policy. 2007;4:209–25.

73. Johnson JP. Collaboration, peer review and open source software. Inf Econ Policy. 2006;18:477–97.

74. Bitzer J, Schröder PJH. The impact of entry and competition by open source software on innovation activity. In: Bitzer J, Schröder PJH,

editors. The economics of open source software development. Amsterdam, The Netherlands: Elsevier; 2006. p. 219–46.

75. top500.org. Top500 operating system statistics [cited 2021 July 6]. Available from: https://www.top500.org/statistics/details/osfam/1/

76. Moore JF, McCann MP. Linux and the chemist. J Chem Educ. 2003;80:219.

77. Lehtola J, Hakala M, Sakko A, Hämäläinen K. ERKALE: a flexible program package for X-ray properties of atoms and molecules.

J Comput Chem. 2012;33:1572–85.

78. Smith DGA, Burns LA, Simmonett AC, Parrish RM, Schieber MC, Galvelis R, et al. PSI4 1.4: open-source software for high-throughput

quantum chemistry. J Chem Phys. 2020;152:184108.

79. Crawford TD, Sherrill CD, Valeev EF, Fermann JT, King RA, Leininger ML, et al. PSI3: an open-source ab initio electronic structure

package. J Comput Chem. 2007;28:1610–6.

80. Sun Q, Zhang X, Banerjee S, Bao P, Barbry M, Blunt NS, et al. Recent developments in the PYSCF program package. J Chem Phys.

2020;153:024109.

81. Aquilante F, Autschbach J, Baiardi A, Battaglia S, Borin VA, Chibotaru LF, et al. Modern quantum chemistry with [open]Molcas.

J Chem Phys. 2020;152:214117.

82. Olsen JMH, Reine S, Vahtras O, Kjellgren E, Reinholdt P, Hjorth Dundas KO, et al. Dalton project: a python platform for molecular-

and electronic-structure simulations of complex systems. J Chem Phys. 2020;152:214115.

83. Aprà E, Bylaska EJ, de Jong WA, Govind N, Kowalski K, Straatsma TP, et al. NWChem: past, present, and future. J Chem Phys. 2020;

152:184102.

84. Lehtola S, Steigemann C, Oliveira MJT, Marques MAL. Recent developments in LIBXC: a comprehensive library of functionals for den-

sity functional theory. SoftwareX. 2018;7:1–5.

85. Perdew JP, Burke K, Ernzerhof M. Generalized gradient approximation made simple. Phys Rev Lett. 1996;77:3865–8.

86. Stephens PJ, Devlin FJ, Chabalowski CF, Frisch MJ. Ab initio calculation of vibrational absorption and circular dichroism spectra using

density functional force fields. J Phys Chem. 1994;98:11623–7.

87. Sun J, Ruzsinszky A, Perdew J. Strongly constrained and appropriately normed semilocal density functional. Phys Rev Lett. 2015;115:

036402.

88. Romero AH, Allan DC, Amadon B, Antonius G, Applencourt T, Baguet L, et al. ABINIT: overview and focus on selected capabilities.

J Chem Phys. 2020;152:124102.

89. Andrade X, Pemmaraju CD, Kartsev A, Xiao J, Lindenberg A, Rajpurohit S, et al. Inq, a modern GPU-accelerated computational frame-

work for (time-dependent) density functional theory. J Chem Theory Comput. 2021;17:7447–67.

90. Giannozzi P, Baseggio O, Bonfà P, Brunato D, Car R, Carnimeo I, et al. QUANTUM ESPRESSO toward the exascale. J Chem Phys.

2020;152:154105.

91. Lehtola S. Fully numerical Hartree–Fock and density functional calculations. II. Diatomic molecules. Int J Quantum Chem. 2019;119:

e25944.

92. Lehtola S. Fully numerical Hartree–Fock and density functional calculations. I. Atoms. Int J Quantum Chem. 2019;119:e25945.

93. Lehtola S, Dimitrova M, Sundholm D. Fully numerical electronic structure calculations on diatomic molecules in weak to strong mag-

netic fields. Mol Phys. 2020;118:e1597989.

94. Lehtola S. Fully numerical calculations on atoms with fractional occupations and range-separated exchange functionals. Phys Rev A.

2020;101:012516.

95. Motamarri P, Das S, Rudraraju S, Ghosh K, Davydov D, Gavini V. DFT-FE: a massively parallel adaptive finite-element code for large-

scale density functional theory calculations. Comput Phys Commun. 2020;246:106853.

96. te Velde G, Bickelhaupt FM, Baerends EJ, Fonseca Guerra C, van Gisbergen SJA, Snijders JG, et al. Chemistry with ADF. J Comput

Chem. 2001;22:931–67.

97. Barca GMJ, Bertoni C, Carrington L, Datta D, De Silva N, Deustua JE, et al. Recent developments in the general atomic and molecular

electronic structure system. J Chem Phys. 2020;152:154102.

98. Werner H-J, Knowles PJ, Manby FR, Black JA, Doll K, Heßelmann A, et al. The Molpro quantum chemistry package. J Chem Phys.

2020;152:144107.

99. Kállay M, Nagy PR, Mester D, Rolik Z, Samu G, Csontos J, et al. The MRCC program system: accurate quantum chemistry from water

to proteins. J Chem Phys. 2020;152:074107.

100. Neese F, Wennmohs F, Becker U, Riplinger C. The ORCA quantum chemistry program package. J Chem Phys. 2020;152:224108.

101. Balasubramani SG, Chen GP, Coriani S, Diedenhofen M, Frank MS, Franzke YJ, et al. TURBOMOLE: modular program suite for

ab initio quantum-chemical and condensed-matter simulations. J Chem Phys. 2020;152:184107.

102. Lejaeghere K, Bihlmayer G, Bjorkman T, Blaha P, Blugel S, Blum V, et al. Reproducibility in density functional theory calculations of

solids. Science. 2016;351:aad3000.

103. Ajila SA, Wu D. Empirical study of the effects of open source adoption on software development economics. J Syst Softw. 2007;80:1517–29.


104. Oliveira MJT, Papior N, Pouillon Y, Blum V, Artacho E, Caliste D, et al. The CECAM electronic structure library and the modular soft-

ware development paradigm. J Chem Phys. 2020;153:024117.

105. Caldeweyher E, Bannwarth C, Grimme S. Extension of the D3 dispersion coefficient model. J Chem Phys. 2017;147:034112.

106. Caldeweyher E, Ehlert S, Hansen A, Neugebauer H, Spicher S, Bannwarth C, et al. A generally applicable atomic-charge dependent

London dispersion correction. J Chem Phys. 2019;150:154122.

107. Caldeweyher E, Mewes J-M, Ehlert S, Grimme S. Extension and evaluation of the D4 London-dispersion model for periodic systems.

Phys Chem Chem Phys. 2020;22:8499–512.

108. DeLano WL. The case for open-source software in drug discovery. Drug Discov Today. 2005;10:213–7.

109. Smith DGA, Burns LA, Sirianni DA, Nascimento DR, Kumar A, James AM, et al. PSI4NumPy: an interactive quantum chemistry pro-

gramming environment for reference implementations and rapid development. J Chem Theory Comput. 2018;14:3504–11.

110. Herbst MF, Levitt A, Cancès E. DFTK: a Julian approach for simulating electrons in solids. JuliaCon Proc. 2021;3:69.

111. Lehtola S, Blockhuys F, Van Alsenoy C. An overview of self-consistent field calculations within finite basis sets. Molecules. 2020;25:

1218.

112. Jain A, Ong SP, Hautier G, Chen W, Richards WD, Dacek S, et al. Commentary: the materials project: a materials genome approach to

accelerating materials innovation. APL Mater. 2013;1:011002.

113. Talirz L, Kumbhar S, Passaro E, Yakutovich AV, Granata V, Gargiulo F, et al. Materials cloud, a platform for open computational sci-

ence. Sci Data. 2020;7:299.

114. Huber SP, Zoupanos S, Uhrin M, Talirz L, Kahle L, Häuselmann R, et al. AiiDA 1.0, a scalable computational infrastructure for auto-

mated reproducible workflows and data provenance. Sci Data. 2020;7:300.

115. Gjerding M, Skovhus T, Rasmussen A, Bertoldo F, Larsen AH, Mortensen JJ, et al. Atomic simulation recipes: a python framework and

library for automated workflows. Comput Mater Sci. 2021;199:110731.

116. Smith DGA, Lolinco AT, Glick ZL, Lee J, Alenaizan A, Barnes TA, et al. Quantum chemistry common driver and databases (QCDB)

and quantum chemistry engine (QCEngine): automation and interoperability among computational chemistry programs. J Chem Phys.

2021;155:204801.

117. Samsonidze G, Kozinsky B. Half-heusler compounds for use in thermoelectric generators. US Patent 20170141282; May 2017.

118. Ye J-H, Huang C-L. Method for crystallizing metal oxide semiconductor layer, semiconductor structure, active array substrate, and

indium gallium zinc oxide crystal. US Patent 20180166474; June 2018.

119. Strohmaier E, Meuer HW, Dongarra J, Simon HD. The TOP500 list and progress in high-performance computing. Computer. 2015;48:42–9.

120. Meuer HW, Strohmaier E, Dongarra J, Simon H, Meuer M. Top500 [cited 2021 May 20]. Available from: https://top500.org/

121. Szabo A, Ostlund NS. Modern quantum chemistry: introduction to advanced electronic structure theory. Dover Pubns; 1996.

122. Geldenhuys WJ, Gaasch KE, Watson M, Allen DD, der Schyf CJV. Optimizing the use of open-source software applications in drug dis-

covery. Drug Discov Today. 2006;11:127–32.

123. Pirhadi S, Sunseri J, Koes DR. Open source molecular modeling. J Mol Graph Model. 2016;69:127–43.

124. Rodríguez-Becerra J, Cáceres-Jensen L, Díaz T, Druker S, Padilla VB, Pernaa J, et al. Developing technological pedagogical science

knowledge through educational computational chemistry: a case study of pre-service chemistry teachers' perceptions. Chem Educ Res

Pract. 2020;21:638–54.

125. Talirz L, Ghiringhelli LM, Smit B. Trends in atomistic simulation software usage. J Comp Mol Sci. 2021;3 (1):1483.

126. Python package index—pypi [cited 2021 July 7]. Available from: https://pypi.org/

127. Continuum Analytics. Conda package manager [cited 2021 May 26]. Available from: https://conda.io/

128. Hohenberg P, Kohn W. Inhomogeneous electron gas. Phys Rev. 1964;136:B864–71.

129. Kohn W, Sham LJ. Self-consistent equations including exchange and correlation effects. Phys Rev. 1965;140:A1133–8.

130. Boys SF. Electronic wave functions. I. A general method of calculation for the stationary states of any molecular system. Proc R Soc

Lond Ser A Math Phys Eng Sci. 1950;200:542–54.

131. McMurchie LE, Davidson ER. One- and two-electron integrals over cartesian Gaussian functions. J Comput Phys. 1978;26:218–31.

132. Obara S, Saika A. Efficient recursive computation of molecular integrals over cartesian Gaussian functions. J Chem Phys. 1986;84:3963.

133. Davidson ER, Feller D. Basis set selection for molecular calculations. Chem Rev. 1986;86:681–96.

134. Hill JG. Gaussian basis sets for molecular applications. Int J Quantum Chem. 2013;113:21–34.

135. Jensen F. Atomic orbital basis sets. Wiley Interdiscip Rev Comput Mol Sci. 2013;3:273–95.

136. Shiozaki T. BAGEL: brilliantly advanced general electronic-structure library. Wiley Interdiscip Rev Comput Mol Sci. 2018;8:e1331.

137. Williams-Young DB, Petrone A, Sun S, Stetina TF, Lestrange P, Hoyer CE, et al. The Chronus quantum software package. Wiley Inter-

discip Rev Comput Mol Sci. 2020;10:e1436.

138. Aidas K, Angeli C, Bak KL, Bakken V, Bast R, Boman L, et al. The Dalton quantum chemistry program system. Wiley Interdiscip Rev

Comput Mol Sci. 2014;4:269–84.

139. Rudberg E, Rubensson EH, Sałek P, Kruchinina A. Ergo: an open-source program for linear-scaling electronic structure calculations.

SoftwareX. 2018;7:107–11.

140. Folkestad SD, Kjønstad EF, Myhre RH, Andersen JH, Balbi A, Coriani S, et al. eT 1.0: an open source electronic structure program with

emphasis on coupled cluster and multilevel methods. J Chem Phys. 2020;152:184103.

141. Aroeira GJR, Davis MM, Turney JM, Schaefer HF. Fermi.jl: a modern design for quantum chemistry. J Chem Theory Comput. 2022. 18

(2):677–686. https://doi.org/10.1021/acs.jctc.1c00719


142. Poole D, Vallejo JLG, Gordon MS. A new kid on the block: application of Julia to Hartree–Fock calculations. J Chem Theory Comput.

2020;16:5006–13.

143. Bruneval F, Rangel T, Hamed SM, Shao M, Yang C, Neaton JB. MOLGW 1: many-body perturbation theory software for atoms, mole-

cules, and clusters. Comput Phys Commun. 2016;208:149–61.

144. Peng C, Lewis CA, Wang X, Clement MC, Pierce K, Rishi V, et al. Massively parallel quantum chemistry: a high-performance research

platform for electronic structure. J Chem Phys. 2020;153:044120.

145. Mueller RP. PyQuante: Python quantum chemistry [cited 2021 July 6]. Available from: http://pyquante.sourceforge.net/

146. Unsleber JP, Dresselhaus T, Klahr K, Schnieders D, Böckers M, Barton D, et al. Serenity: a subsystem quantum chemistry program.

J Comput Chem. 2018;39:788–98.

147. Kjellgren E. SlowQuant [cited 2021 July 6]. Available from: https://github.com/erikkjellgren/SlowQuant

148. Rinkevicius Z, Li X, Vahtras O, Ahmadzadeh K, Brand M, Ringholm M, et al. VeloxChem: a python-driven density-functional theory program for spectroscopy simulations in high-performance computing environments. Wiley Interdiscip Rev Comput Mol Sci. 2019;10:e1457.

149. Souvatzis P. Uquantchem: a versatile and easy to use quantum chemistry computational software. Comput Phys Commun. 2014;185:415–21.

150. Bloch F. Bemerkung zur Elektronentheorie des Ferromagnetismus und der elektrischen Leitfähigkeit. Z Phys. 1929;57:545–55.

151. Kratzer P, Neugebauer J. The basics of electronic structure theory for periodic systems. Front Chem. 2019;7:1–18.

152. Schwerdtfeger P. The pseudopotential approximation in electronic structure theory. ChemPhysChem. 2011;12:3143–55.

153. Kang S, Woo J, Kim J, Kim H, Kim Y, Lim J, et al. ACE-molecule: an open-source real-space quantum chemistry package. J Chem Phys. 2020;152:124110.

154. Ratcliff LE, Dawson W, Fisicaro G, Caliste D, Mohr S, Degomme A, et al. Flexibilities of wavelets as a computational basis set for large-scale electronic structure calculations. J Chem Phys. 2020;152:194110.

155. Nakata A, Baker JS, Mujahed SY, Poulton JTL, Arapan S, Lin J, et al. Large scale and linear scaling DFT with the CONQUEST code. J Chem Phys. 2020;152:164112.

156. Kühne TD, Iannuzzi M, Del Ben M, Rybkin VV, Seewald P, Stein F, et al. CP2K: an electronic structure and molecular dynamics software package - quickstep: efficient and accurate electronic structure calculations. J Chem Phys. 2020;152:194103.

157. The Elk Code. Available from: http://elk.sourceforge.net/

158. Gulans A, Kontur S, Meisenbichler C, Nabok D, Pavone P, Rigamonti S, et al. Exciting: a full-potential all-electron package implementing density-functional theory and many-body perturbation theory. J Phys Condens Matter. 2014;26:363202.

159. FLEUR. Available from: http://www.flapw.de

160. Enkovaara J, Rostgaard C, Mortensen JJ, Chen J, Dułak M, Ferrighi L, et al. Electronic structure calculations with GPAW: a real-space implementation of the projector augmented-wave method. J Phys Condens Matter. 2010;22:253202.

161. Sundararaman R, Letchworth-Weaver K, Schwarz KA, Gunceler D, Ozhabes Y, Arias TA. JDFTx: software for joint density-functional theory. SoftwareX. 2017;6:278–84.

162. Xu Q, Sharma A, Suryanarayana P. M-SPARC: Matlab-simulation package for ab-initio real-space calculations. SoftwareX. 2020;11:100423.

163. Tancogne-Dejean N, Oliveira MJT, Andrade X, Appel H, Borca CH, Le Breton G, et al. Octopus, a computational framework for exploring light-driven phenomena and quantum dynamics in extended and finite systems. J Chem Phys. 2020;152:124119.

164. Ozaki T, Kino H. Numerical atomic basis orbitals from H to Kr. Phys Rev B. 2004;69:195113.

165. Saad Y, Chelikowsky JR, Shontz SM. Numerical methods for electronic structure calculations of materials. SIAM Rev. 2010;52:3–54.

166. Fathurrahman F, Agusta MK, Saputro AG, Dipojono HK. PWDFT.jl: a Julia package for electronic structure calculation using density functional theory and plane wave basis. Comput Phys Commun. 2020;256:107372.

167. Briggs EL, Sullivan DJ, Bernholc J. Real-space multigrid-based approach to large-scale electronic structure calculations. Phys Rev B. 1996;54:14362–75.

168. García A, Papior N, Akhtar A, Artacho E, Blum V, Bosoni E, et al. SIESTA: recent developments and applications. J Chem Phys. 2020;152:204108.

169. Gygi F. Architecture of Qbox: a scalable first-principles molecular dynamics code. IBM J Res Dev. 2008;52:137–44.

170. Xu Q, Sharma A, Comer B, Huang H, Chow E, Medford AJ, et al. SPARC: simulation package for ab-initio real-space calculations. SoftwareX. 2021;15:100709.

171. Lehtola S. A review on non-relativistic, fully numerical electronic structure calculations on atoms and diatomic molecules. Int J Quantum Chem. 2019;119:e25968.

172. Jensen SR, Flå T, Jonsson D, Monstad RS, Ruud K, Frediani L. Magnetic properties with multiwavelets and DFT: the complete basis set limit achieved. Phys Chem Chem Phys. 2016;18:21145–61.

173. Harrison RJ, Beylkin G, Bischoff FA, Calvin JA, Fann GI, Fosso-Tande J, et al. MADNESS: a multiresolution, adaptive numerical environment for scientific simulation. SIAM J Sci Comput. 2016;38:S123–42.

174. Kobus J. A finite difference Hartree–Fock program for atoms and diatomic molecules. Comput Phys Commun. 2013;184:799–811.

175. Koskinen P, Mäkinen V. Density-functional tight-binding for beginners. Comput Mater Sci. 2009;47:237–53.

176. Seifert G, Joswig J-O. Density-functional tight binding—an approximate density-functional theory method. Wiley Interdiscip Rev Comput Mol Sci. 2012;2:456–65.

177. Gaus M, Cui Q, Elstner M. Density functional tight binding: application to organic and biological molecules. Wiley Interdiscip Rev Comput Mol Sci. 2013;4:49–61.

178. Thiel W. Semiempirical quantum-chemical methods. Wiley Interdiscip Rev Comput Mol Sci. 2014;4:145–57.

179. Bannwarth C, Caldeweyher E, Ehlert S, Hansen A, Pracht P, Seibert J, et al. Extended tight-binding quantum chemistry methods. Wiley Interdiscip Rev Comput Mol Sci. 2021;11:e1493.

180. Hourahine B, Aradi B, Blum V, Bonafé F, Buccheri A, Camacho C, et al. DFTB+, a software package for efficient approximate density functional theory based atomistic simulations. J Chem Phys. 2020;152:124101.

181. Bock N, Cawkwell MJ, Coe JD, Krishnapriyan A, Kroonblawd MP, Lang A, Liu C, Saez EM, Mniszewski SM, Negre CFA, Niklasson AMN, Sanville E, Wood MA, Yang P. LATTE [cited 2021 July 12]. Available from: https://github.com/lanl/LATTE

182. Husch T, Reiher M. Comprehensive analysis of the neglect of diatomic differential overlap approximation. J Chem Theory Comput. 2018;14:5169–79.

183. Cabezas I, Segovia R, Caratozzolo P, Webb E. Using software engineering design principles as tools for freshman students learning. In: 2020 IEEE Frontiers in education conference (FIE). IEEE; 2020.

184. Lam P, Dietrich J, Pearce DJ. Putting the semantics into semantic versioning. In: Proceedings of the 2020 ACM SIGPLAN international symposium on new ideas, new paradigms, and reflections on programming and software. ACM; 2020.

185. Valeev EF. Libint: a library for the evaluation of molecular integrals of many-body operators over gaussian functions. Available from: http://libint.valeyev.net/

186. Lawson CL, Hanson RJ, Krogh FT, Kincaid DR. Algorithm 539: basic linear algebra subprograms for fortran usage [f1]. ACM Trans Math Softw. 1979;5:324–5.

187. Zee FGV, van de Geijn RA. BLIS: a framework for rapidly instantiating BLAS functionality. ACM Trans Math Softw. 2015;41:1–33.

188. Whaley RC, Dongarra JJ. Automatically tuned linear algebra software. In: Proceedings of the IEEE/ACM SC98 conference. IEEE; 1998.

189. Flocke N, Lotrich V. Efficient electronic integrals and their generalized derivatives for object oriented implementations of electronic structure calculations. J Comput Chem. 2008;29:2722–36.

190. Sun Q. Libcint: an efficient general integral library for Gaussian basis functions. J Comput Chem. 2015;36:1664–71.

191. Pritchard BP, Chow E. Horizontal vectorization of electron repulsion integrals. J Comput Chem. 2016;37:2537–46.

192. Peng F, Wu M-S, Sosonkina M, Windus T, Bentz J, Gordon M, Kenny J, Janssen C. Tackling component interoperability in quantum chemistry software. In: Proceedings of the 2007 symposium on component and framework technology in high-performance and scientific computing (CompFrame '07). ACM Press; 2007.

193. Kenny JP, Janssen CL, Valeev EF, Windus TL. Components for integral evaluation in quantum chemistry. J Comput Chem. 2008;29:562–77.

194. Ekström U, Visscher L, Bast R, Thorvaldsen AJ, Ruud K. Arbitrary-order density functional response theory from automatic differentiation. J Chem Theory Comput. 2010;6:1971–80.

195. Herbst MF, Scheurer M, Fransson T, Rehn DR, Dreuw A. adcc: a versatile toolkit for rapid development of algebraic-diagrammatic construction methods. Wiley Interdiscip Rev Comput Mol Sci. 2020;10(6):e1462.

196. Wouters S, Poelmans W, Ayers PW, Van Neck D. CheMPS2: a free open-source spin-adapted implementation of the density matrix renormalization group for ab initio quantum chemistry. Comput Phys Commun. 2014;185:1501–14.

197. Scheurer M, Reinholdt P, Kjellgren ER, Olsen JMH, Dreuw A, Kongsted J. CPPE: an open-source C++ and python library for polarizable embedding. J Chem Theory Comput. 2019;15:6154–63.

198. Grimme S, Antony J, Ehrlich S, Krieg H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J Chem Phys. 2010;132:154104.

199. Kaliman IA, Slipchenko LV. LIBEFP: a new parallel implementation of the effective fragment potential method as a portable software library. J Comput Chem. 2013;34:2284–92.

200. Remigio RD, Frediani L, Steindal AH, Bast R, Burns LA, Crawford TD, Weijo V. PCMSolver, an open-source library for the polarizable continuum model electrostatic problem [cited 2021 Feb 2]. Available from: https://github.com/PCMSolver/pcmsolver

201. Pritchard BP, Altarawy D, Didier B, Gibson TD, Windus TL. New basis set exchange: an open, up-to-date resource for the molecular sciences community. J Chem Inf Model. 2019;59:4814–20.

202. Shaw R, Hill J. Libecpint: a C++ library for the efficient evaluation of integrals over effective core potentials. J Open Source Softw. 2021;6:3039.

203. Jmol: an open-source Java viewer for chemical structures in 3D. Available from: http://www.jmol.org

204. Hanwell MD, Curtis DE, Lonie DC, Vandermeersch T, Zurek E, Hutchison GR. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J Cheminf. 2012;4:17.

205. Gilbert A. Iqmol, a free open-source molecular editor and visualization package [cited 2021 June 26]. Available from: http://iqmol.org

206. Schrödinger Inc. Pymol, a molecular visualization system [cited 2021 July 6]. Available from: https://github.com/schrodinger/pymol-open-source

207. O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open babel: an open chemical toolbox. J Cheminf. 2011;3:33.

208. O'Boyle NM, Tenderholt AL, Langner KM. Cclib: a library for package-independent computational chemistry algorithms. J Comput Chem. 2008;29:839–45.

209. Larsen AH, Mortensen JJ, Blomqvist J, Castelli IE, Christensen R, Dułak M, et al. The atomic simulation environment—a python library for working with atoms. J Phys Condens Matter. 2017;29:273002.

210. Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 2020;49:D1388–95.

211. Lu T, Chen F. Multiwfn: a multifunctional wavefunction analyzer. J Comput Chem. 2011;33:580–92.

212. Hermann G, Pohl V, Tremblay JC, Paulus B, Hege H-C, Schild A. ORBKIT: a modular python toolbox for cross-platform postprocessing of quantum chemical wavefunction data. J Comput Chem. 2016;37:1511–20.

213. Lehtola S, Karttunen AJ. git repository containing a copy of the supporting information [cited 2021 Aug 8]. Available from: https://github.com/susilehtola/fosschemistry

214. Bannwarth C, Ehlert S, Grimme S. GFN2-xTB—an accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions. J Chem Theory Comput. 2019;15:1652–71.

215. Menzel JP, Kloppenburg M, Belić J, Groot HJM, Visscher L, Buda F. Efficient workflow for the investigation of the catalytic cycle of water oxidation catalysts: combining GFN-xTB and density functional theory. J Comput Chem. 2021;42:1885–94.

216. Tasinato N, Puzzarini C, Barone V. Correct modeling of cisplatin: a paradigmatic case. Angew Chem Int Ed Engl. 2017;56:13838–41.

217. Liu W, Franke R. Comprehensive relativistic ab initio and density functional theory studies on PtH, PtF, PtCl, and Pt(NH₃)₂Cl₂. J Comput Chem. 2002;23:564–75.

218. Becke AD. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys Rev A. 1988;38:3098–100.

219. Perdew JP. Density-functional approximation for the correlation energy of the inhomogeneous electron gas. Phys Rev B. 1986;33:8822–4.

220. Adamo C, Barone V. Toward reliable density functional methods without adjustable parameters: the PBE0 model. J Chem Phys. 1999;110:6158–70.

221. Ernzerhof M, Scuseria GE. Assessment of the Perdew–Burke–Ernzerhof exchange-correlation functional. J Chem Phys. 1999;110:5029–36.

222. Vetere V, Adamo C, Maldivi P. Performance of the “parameter free” PBE0 functional for the modeling of molecular properties of heavy metals. Chem Phys Lett. 2000;325:99–105.

223. Bühl M, Reimann C, Pantazis DA, Bredow T, Neese F. Geometries of third-row transition-metal complexes from density-functional theory. J Chem Theory Comput. 2008;4:1449–59.

224. Hehre WJ, Stewart RF, Pople JA. Self-consistent molecular-orbital methods. I. Use of Gaussian expansions of Slater-type atomic orbitals. J Chem Phys. 1969;51:2657.

225. Binkley JS, Pople JA, Hehre WJ. Self-consistent molecular orbital methods. 21. Small split-valence basis sets for first-row elements. J Am Chem Soc. 1980;102:939.

226. Hehre WJ, Ditchfield R, Pople JA. Self-consistent molecular orbital methods. XII. Further extensions of Gaussian-type basis sets for use in molecular orbital studies of organic molecules. J Chem Phys. 1972;56:2257–61.

227. Weigend F, Ahlrichs R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: design and assessment of accuracy. Phys Chem Chem Phys. 2005;7:3297–305.

228. Pyykkö P. Relativistic effects in chemistry: more common than you thought. Annu Rev Phys Chem. 2012;63:45–64.

229. Pyykkö P. The physics behind chemistry and the periodic table. Chem Rev. 2012;112:371–84.

230. Dolg M. Chapter 14 relativistic effective core potentials. In: Springborg M, Li J, Vázquez AMM, editors. Theoretical and computational chemistry. Amsterdam, The Netherlands: Elsevier; 2002. p. 793–862.

231. Rappoport D, Furche F. Property-optimized gaussian basis sets for molecular response calculations. J Chem Phys. 2010;133:134105.

232. Ehlert S, Huniar U, Ning J, Furness JW, Sun J, Kaplan AD, et al. r²SCAN-D4: dispersion corrected meta-generalized gradient approximation for general chemical applications. J Chem Phys. 2021;154:061101.

233. Møller C, Plesset MSM. Note on an approximation treatment for many-electron systems. Phys Rev. 1934;46:618–22.

234. Čížek J. On the correlation problem in atomic and molecular systems. Calculation of wavefunction components in Ursell-type expansion using quantum-field theoretical methods. J Chem Phys. 1966;45:4256–66.

235. Grimme S. Exploration of chemical compound, conformer, and reaction space with meta-dynamics simulations based on tight-binding quantum chemical calculations. J Chem Theory Comput. 2019;15:2847–62.

236. Pracht P, Bohle F, Grimme S. Automated exploration of the low-energy chemical space with fast quantum chemical methods. Phys Chem Chem Phys. 2020;22:7169–92.

237. Pracht P, Grimme S. Calculation of absolute molecular entropies and heat capacities made simple. Chem Sci. 2021;12:6551–68.

238. Whitten JL. Coulombic potential energy integrals and approximations. J Chem Phys. 1973;58:4496.

239. Baerends EJ, Ellis DE, Ros P. Self-consistent molecular Hartree–Fock–Slater calculations I. The computational procedure. Chem Phys. 1973;2:41–51.

240. Dunlap BI, Connolly JWD, Sabin JR. On the applicability of LCAO-Xα methods to molecules containing transition metal atoms: the nickel atom and nickel hydride. Int J Quantum Chem. 1977;12:81–7.

241. Dunlap BI, Connolly JWD, Sabin JR. On some approximations in applications of Xα theory. J Chem Phys. 1979;71:3396.

242. Dunlap BI, Rösch N, Trickey SB. Variational fitting methods for electronic structure calculations. Mol Phys. 2010;108:3167–80.

243. Weigend F. Hartree–Fock exchange fitting basis sets for H to Rn. J Comput Chem. 2008;29:167–75.

244. Dunning TH. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J Chem Phys. 1989;90:1007.

245. Van Lenthe E, Baerends EJ. Optimized slater-type basis sets for the elements 1-118. J Comput Chem. 2003;24:1142–56.

246. Chong DP, van Lenthe E, Van Gisbergen S, Baerends EJ. Even-tempered Slater-type orbitals revisited: from hydrogen to krypton. J Comput Chem. 2004;25:1030–6.

247. Bühl M, Kabrede H. Geometries of transition-metal complexes from density-functional theory. J Chem Theory Comput. 2006;2:1282–90.

248. Grimme S, Bannwarth C, Shushkov P. A robust and accurate tight-binding quantum chemical method for structures, vibrational frequencies, and noncovalent interactions of large molecular systems parametrized for all spd-block elements (Z = 1–86). J Chem Theory Comput. 2017;13:1989–2009.

249. Dirac PAM. Note on exchange phenomena in the Thomas atom. Math Proc Cambridge Philos Soc. 1930;26:376–85.

250. Perdew JP, Wang Y. Accurate and simple analytic representation of the electron-gas correlation energy. Phys Rev B. 1992;45:13244–9.

251. Furness JW, Kaplan AD, Ning J, Perdew JP, Sun J. Accurate and numerically efficient r²SCAN meta-generalized gradient approximation. J Phys Chem Lett. 2020;11:8208–15.

252. Furness JW, Kaplan AD, Ning J, Perdew JP, Sun J. Correction to "accurate and numerically efficient r²SCAN meta-generalized gradient approximation". J Phys Chem Lett. 2020;11:9248–8.

253. Weigend F. Accurate Coulomb-fitting basis sets for H to Rn. Phys Chem Chem Phys. 2006;8:1057–65.

254. Pietro WJ, Hehre WJ. Molecular orbital theory of the properties of inorganic and organometallic compounds. 3. STO-3G basis sets for first- and second-row transition metals. J Comput Chem. 1983;4:241–51.

255. French RJ, Hedberg L, Hedberg K, Gard GL, Johnson BM. Molecular structure and quadratic force field of chromyl fluoride, CrO₂F₂. Inorg Chem. 1983;22:892–5.

256. Levason WL, Ogden JS, Saad AK, Young NA, Brisdon AK, Holliman PJ, et al. Metal K-edge EXAFS (extended x-ray absorption fine structure) studies of CrO₂F₂ and MnO₃F at 10 K. J Fluorine Chem. 1991;53:43–51.

257. Gilbert B, Frazer BH, Zhang H, Huang F, Banfield JF, Haskel D, et al. X-ray absorption spectroscopy of the cubic and hexagonal polytypes of zinc sulfide. Phys Rev B. 2002;66:245205.

258. Gardner PJ, Pang P. Thermodynamics of the zinc sulphide transformation, sphalerite → wurtzite, by modified entrainment. J Chem Soc Faraday Trans 1. 1988;84:1879.

259. Garrity KF, Bennett JW, Rabe KM, Vanderbilt D. Pseudopotentials for high-throughput DFT calculations. Comput Mater Sci. 2014;81:446–52. arXiv:1305.5973.

260. COD. Crystallography open database [cited 2021 July 20]. Available from: http://www.crystallography.net/cod/

261. Materials Cloud. Quantum espresso input generator and structure visualizer [cited 2021 July 20]. Available from: https://www.materialscloud.org/work/tools/qeinputgenerator

262. Monkhorst HJ, Pack JD. Special points for Brillouin-zone integrations. Phys Rev B. 1976;13:5188–92.

263. Cardona M, Kremer RK, Lauck R, Siegle G, Muñoz A, Romero AH, et al. Electronic, vibrational, and thermodynamic properties of ZnS with zinc-blende and rocksalt structure. Phys Rev B. 2010;81:075207.

264. Materials Cloud. Seek-path: the k-path finder and visualizer [cited 2021 July 20]. Available from: https://www.materialscloud.org/work/tools/seekpath

265. Setyawan W, Curtarolo S. High-throughput electronic band structure calculations: challenges and tools. Comput Mater Sci. 2010;49:299–312.

266. Tran TK, Park W, Tong W, Kyi MM, Wagner BK, Summers CJ. Photoluminescence properties of ZnS epilayers. J Appl Phys. 1997;81:2803–9.

267. McDonald AR, Nash JA, Nerenberg PS, Ball KA, Sode O, Foley JJ, et al. Building capacity for undergraduate education and training in computational molecular science: a collaboration between the MERCURY consortium and the molecular sciences software institute. Int J Quantum Chem. 2020;120:e26359.

268. Magers DB, Chávez VH, Peyton BG, Sirianni DA, Fortenberry RC, Ringer McDonald A. PSI4EDUCATION: free and open-source programing activities for chemical education with free and open-source software. In: McDonald AR, Nash JA, editors. Teaching programming across the chemistry curriculum. Washington, DC: American Chemical Society; 2021. p. 107–22.

SUPPORTING INFORMATION

Additional supporting information may be found in the online version of the article at the publisher's website.

How to cite this article: Lehtola S, Karttunen AJ. Free and open source software for computational chemistry education. WIREs Comput Mol Sci. 2022;e1610. https://doi.org/10.1002/wcms.1610
