Free and open source software for computational chemistry education

Abstract

After decades of waiting, computational chemistry for the masses is finally here. Our brief review on free and open source software (FOSS) packages points out the existence of software offering a wide range of functionality, all the way from approximate semiempirical calculations with tight‐binding density functional theory to sophisticated ab initio wave function methods such as coupled‐cluster theory, covering both molecular and solid‐state systems. Combined with the remarkable increase in the computing power of personal devices, which now rivals that of the fastest supercomputers in the world in the 1990s, we demonstrate that a decentralized model for teaching computational chemistry is now possible thanks to FOSS packages, enabling students to perform reasonable modeling on their own computing devices in the bring your own device (BYOD) scheme. FOSS software can be made trivially simple to install and keep up to date, eliminating the need for departmental support, and also enables comprehensive teaching strategies, as various algorithms' actual implementations can be used in teaching. We exemplify what kinds of calculations are feasible with four FOSS electronic structure programs, assuming only extremely modest computational resources, to illustrate how FOSS packages enable decentralized approaches to computational chemistry education within the BYOD scheme. FOSS also has further benefits driving its adoption: the open access to the source code of FOSS packages democratizes the science of computational chemistry, and FOSS packages can be used without limitation also beyond education, in academic and industrial applications, for example. This article is categorized under: Software > Quantum Chemistry
OVERVIEW
Susi Lehtola (1) | Antti J. Karttunen (2)
(1) Molecular Sciences Software Institute, Blacksburg, Virginia, USA
(2) Department of Chemistry and Materials Science, Aalto University, Espoo, Finland
Correspondence: Susi Lehtola, Molecular Sciences Software Institute, Blacksburg, VA 24061, USA.
Email: susi.lehtola@alumni.helsinki.fi
Funding information: Business Finland, Grant/Award Number: 3767/31/2019
Edited by: Peter R. Schreiner, Editor-in-Chief
KEYWORDS
computational chemistry education, free software, open source
Received: 26 November 2021 Revised: 14 February 2022 Accepted: 22 February 2022
DOI: 10.1002/wcms.1610
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided
the original work is properly cited.
© 2022 The Authors. WIREs Computational Molecular Science published by Wiley Periodicals LLC.
WIREs Comput Mol Sci. 2022;e1610. wires.wiley.com/compmolsci 1of33
https://doi.org/10.1002/wcms.1610
1 | INTRODUCTION
Quantum chemical research methods have been used extensively in the chemical industry for several
decades.[1–4] In addition to their widespread use in industry and academia, quantum chemistry is also utilized in
chemical education to provide atomic-level understanding of fundamental chemical concepts and phenomena.[5,6] For
example, in undergraduate general and organic chemistry curricula, students get hands-on experience with concepts such
as three-dimensional molecular structure, structural isomerism, conformers, and stereochemistry by means of
computational exercises or computer laboratory sessions.[7–9]
Although some of the aforementioned aspects can in principle be studied even with simpler methodologies such as
classical force fields, quantum chemical calculations with state-of-the-art software packages allow students to gain a
first-hand understanding of more advanced topics such as molecular orbitals, chemical bonding, energetics,[10]
thermodynamics,[11,12] reaction mechanisms,[13] and various spectroscopies.[14–18]
The ability to interpret and understand chemical phenomena with the help of quantum chemical calculations is a
valuable skill in every chemist's professional life: nowadays, a significant portion of even the experimental studies
reported in the chemical literature is tightly integrated with quantum chemical investigations. Moreover, as quantum
chemistry is the critical bridging component between experimental work and machine learning methods, the ability to
run quantum chemical calculations can be expected to become increasingly relevant and necessary to working
life in the near future.
Although computational chemistry for the masses (a pervasive inclusion of computational modeling in the chemistry
curriculum) has long been thought to be coming,[19] it does not appear to have arrived yet. In their recent overview,
Grushow and Reeves[20] have summarized some select landmarks in computational chemistry education. At the same
time, Grushow and Reeves note how computational chemistry still has a somewhat limited presence in undergraduate
curricula, which can be attributed at least in part to the history of computational chemistry software.
In the 1990s, commercial software companies started selling graphical user interfaces to their quantum chemistry
packages, some of which were particularly geared toward educational use. Such software was and still is typically used
in a computer classroom setting, where a limited number of relatively powerful desktop computers are available for the
students during the teaching sessions. The benefit of a computer classroom setting is that all software can be pre-
installed for the students and the standardized software environment makes the possibilities (and limitations) of the
software setup clear for the teachers in charge of the educational content. However, the computer classroom approach
has limited scalability, as the number of students is limited by the number of workstations; this often makes the
approach impractical for large-scale undergraduate teaching. Furthermore, while the computer classroom setting may
be useful for teaching during contact sessions, the students' possibilities for running calculations outside the contact
sessions are limited by the requirement of physical access to the computer classroom, which has proved to be challenging
especially during the ongoing global coronavirus disease pandemic, which has required social distancing. The classroom
setting also typically limits the teacher and students to using the pre-installed software, while costs for the required soft-
ware licenses can be unfeasibly high for educational institutions with limited budgets. Someone also has to maintain
the software on the classroom computers and ensure it is kept up to date.
In the early 2000s, the WebMO package introduced a web-based approach to computational chemistry education, in
which the quantum chemistry software only needs to be installed and maintained on a central server, and the teachers
and students can then access it through a web browser interface.[21,22] A number of quantum chemistry software
packages have been integrated with WebMO, whose integrated molecular editor and analysis tools make it a rather
low-barrier interface to quantum chemistry. As the users thus only need a web browser to access the computing software,
WebMO was the first tool to enable a bring your own device (BYOD) paradigm in computational chemistry, in which
the students can use their personal devices to take part in the teaching.
However, WebMO still requires someone to set up and administer the WebMO server, even though the need to
purchase actual server hardware has been removed by the possibility of installing the service on cloud platforms such as
Amazon Web Services or Google Cloud. Recently, the cloud-based Chem Compute platform has also begun to
offer web access to computational chemistry software. Chem Compute provides computing resources for undergraduate
teaching as well as research at no cost to the teachers,[23] thus allowing institutions that do not have the personnel or
financial resources to set up their own physical or cloud servers to offer computational chemistry education. However,
Chem Compute relies on computational resources volunteered by third parties whose continued future availability is
not guaranteed.
2of33 LEHTOLA AND KARTTUNEN
As discussed above, great advances like WebMO and Chem Compute have been made in the direction of the BYOD
paradigm, to which many universities have already shifted in order to cut down on the costs associated with the
now-deprecated computer classroom model. In this work, we will show that free and open source software (FOSS) can be
used in the context of the BYOD paradigm to achieve computational chemistry for the masses, all the while
democratizing science by tearing down established power structures and barriers for research and education. (Inroads
into BYOD in the context of virtual laboratories have also been recently discussed by Kobayashi et al.[24])
The layout of this work is as follows. In Section 2, we will begin by defining what we mean by FOSS (Section 2.1).
Then, we discuss why FOSS has not been the norm in science (Section 2.2), what FOSS enables for the teaching of com-
putational chemistry (Section 2.3), and why it would be a good time now to switch over to FOSS in teaching
(Section 2.4). We present a brief overview of available FOSS packages in Section 3. We include several practical demon-
strations of using state-of-the-art FOSS programs for computational chemistry education in Section 4, showcasing the
kinds of calculations that are possible assuming only limited computer resources. The article concludes in a brief sum-
mary and discussion in Section 5.
2 | FREE AND OPEN SOURCE SOFTWARE
2.1 | Definitions
As our readers may not be familiar with the concept of FOSS, some definitions are necessary before the present discus-
sion can take place. For the purposes of this article, we will adopt three key criteria for FOSS:
1. The ability of anyone to freely use the software for any purpose.
2. The ability to freely study the operation of the software, and modify it at will.
3. The ability to freely redistribute copies of the software, as well as modified versions thereof, to others.
Consequently, any software that does not satisfy these criteria for FOSS is referred to as proprietary or closed source
software.
What is the significance of these criteria? The first criterion means simply that there can be no limitations on poten-
tial uses of the software: for instance, in addition to use in academic research and education, commercial use must also
be permitted by the license. Moreover, the first criterion bars license terms that prohibit use of the software for purposes
deemed questionable by the licensors, such as use in nuclear power plants or in research on genetic engineering. FOSS
can be used by anyone for anything.
The second criterion means that the source code of the software must not only be available, but also that
customizations to the source code must be allowed. This is of major importance for developing new features or
computational models, for example. Being able to use software written by other authors to accomplish certain tasks
eliminates the need to reinvent the wheel, and thereby results in faster scientific development.[25] This phenomenon has
traditionally been the main enticement of contributing to closed-source or "open teamware"[26] packages, as access to
their source code partly eliminates the need to start from scratch: algorithms implemented in the package by its other
contributors can be leveraged to develop new computational models.
However, the control of access to the source code of such closed-source programs leads to perpetuating power
structures and may inhibit academic collaborations between authors of different program packages,[27] instead of the
Popperian ideal of science: the selfless pursuit of truth,[28] and a fair and unbiased competition of ideas and methods in
the context of computational chemistry. Key persons in control of the access to the source codes of various software
packages are able to hold back equitable competition and collaboration between scientists developing new methods
and algorithms. The issue with gatekeepers is not a new phenomenon: as was already quipped by Max Planck, "A new
scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its
opponents eventually die, and a new generation grows up that is familiar with it"; this apt observation is supported by a
recent study that investigated the dynamics of scientific evolution with the standard empirical tools of applied
microeconomics.[29] This problem is less likely to manifest in FOSS, as will be explained in the next paragraph.
The third criterion means that anyone who has a copy of the software can redistribute it to others. One does not
need to ask case-by-case permission from the authors of the software in order to share it with one's collaborators or the
reviewers of a scientific paper, for instance. It also means that anyone who has added new features to the program can
freely distribute their version. This eliminates the problematic role of the gatekeepers in the "open teamware" model,
as alternative versions of the software, commonly known as forks, can be distributed. It also eliminates the possibility
of the infamous practice[30] of preventing one's competitors from using one's software, which may have the result of
hiding deficiencies and bugs in one's software. Case in point: the "war" on supercooled water[31] exemplifies the
problems of having prominent figures as exclusive gatekeepers. The "war" was only resolved once Princeton scientists
gained access to their Berkeley competitors' source code and found a coarse error therein.[32] Such problems are much
less likely to exist if FOSS is used, as FOSS programs are freely redistributable and can be thoroughly inspected by anyone.
In our opinion, the three criteria laid out above condense the essence of both the generally accepted 10-item
definition for "open source software" by the Open Source Initiative[33] as well as the four essential freedoms of
"free software" or "libre software" defined by the Free Software Foundation.[34] Note that there is a wide variety of
FOSS licenses that fit these criteria and that can be adopted by software projects. New software projects should choose
their license with care.[35] It is always easier to switch to a more permissive license later on than to move to a more
restrictive license: any versions released under a FOSS license will continue being FOSS in the future as well, even if
newer versions switch to using a proprietary license, for example.
2.2 | Why is free/open-source software not the default?
2.2.1 | Code distribution
The ideology of FOSS is in line with the demands of science,[36] as much like the Schrödinger or Dirac equation,
computational models should ideally always be publicly available. Moreover, as the initial development and ongoing use
of most scientific software has been and continues to be funded by public research funding, the results of such work
(the developed program source code) should be available to everyone.
It is worthwhile to comment on the reasons for the longstanding status quo. As discussed by Hinsen,[37] before the
advent of electronic computers, algorithms were developed with pen and paper, and the traditional paper journal article
format is ideally suited to fully describe such algorithms. But, when implemented on a computer, algorithms often
become too complicated to thoroughly describe in a journal article, and significant portions of the implementation are
always left out. As this tacit information on what happens "under the hood" of various computational chemistry
packages is typically passed only within the academic groups contributing to those codes, lack of access to the source
code creates another barrier of entry for third parties, and again ends up perpetuating established power structures.
However, nowadays there are well-established ways for distributing scientific software. Version control systems
such as Git[38] facilitate robust development of software, which can be hosted at no cost on sites such as GitHub[39]
and GitLab.[40] GitHub and GitLab also enable a community approach to code development through the use of public code
review, which is leveraged by many program packages to improve code quality and to decrease the learning curve for
potential new contributors to the package. Stable releases of software can be made available on Open Science data
repositories such as Zenodo[41] with version-specific Digital Object Identifiers (DOIs). Precompiled versions can
also be distributed easily nowadays, as we will discuss in Section 3.
2.2.2 | Maintenance and user support
A commonly cited impediment to FOSS in science is that funding its maintenance and/or user support is
challenging.[26,42,43] However, there are several companies whose whole business model is founded on the use,
development, and support of FOSS. For instance, Red Hat Inc. broke $1 billion in annual revenue in 2012, and its revenue
has increased ever since, surpassing $3 billion in 2018.[44] There is clearly money to be made in selling support for
FOSS. Moreover, in contrast to proprietary software, maintenance and support for FOSS can be acquired from third parties
if the original author(s) are either unavailable or unwilling to support their code; this is the key to the Red Hat style
business model.
The business model also works for scientific FOSS. For example, Kitware Inc., established in 1998,[45] has built its
business model around developing and supporting a variety of scientific FOSS. ParaView[46] and ITK[47] enable modeling,
visualization and data analysis for large datasets, while the CMake build system has become a quintessential tool for
building scientific software.[48] As of 2022, Kitware has more than 200 employees and their FOSS projects span many
fields of science and technology, including quantum chemistry.[49]
Due to the relatively small market for specialized scientific software, the availability of public research funding has
always played a key role in the development of computational chemistry software. Related to future development of
FOSS in science, the European Commission has outlined Open Science as their policy priority and the standard method
of working under its research and innovation funding programs.[50]
As evidenced by forums such as the Computational Chemistry List[51] and the present authors' professional
experience, online peer-to-peer user support (whose motivations have been studied, for example, by Constant, Sproull, and
Kiesler[52]) is invaluable even in the case of proprietary programs. In the case of FOSS, this peer-to-peer support has an
enhanced role, and is one of the keys behind the success of FOSS.[53] Because anyone can modify the software and
distribute modified copies thereof, anyone can fix the bugs they run into, and gain fame even for small contributions.
Importantly, the possibility to contribute bug fixes to FOSS projects reduces the barrier between users and
developers, and is the typical route by which a project gains new developers. The fostering of new developers can also
be greatly aided by practices such as open code review, which serves the double purpose of ensuring a top quality code
base and teaching the new contributor, as well as any other project followers, about the structure and design philosophy
of the project. This naturally also leads to a more sustainable development environment, since a constant influx of new
developers is secured, and enables expert knowledge (also known as tacit knowledge) to be passed on to new members
of the development team.
Other aspects of the economic principles of FOSS have also been studied extensively.[54–74] FOSS is a public
good.[55,56,68] Participation in the development and support of FOSS has been found to be more motivating than that of
proprietary software,[69,70] and participation in FOSS projects carries economic benefits.[71,72] FOSS promotes peer
review, free exchange of ideas, and maintainability,[73] and competition between FOSS packages promotes
innovation.[74]
2.2.3 | Linux distributions
The Linux operating system is a prime example of FOSS. Originating from the University of Helsinki, Finland, it is
nowadays ubiquitous. It is used in billions of mobile phones, laptops, workstations, as well as servers and compute
clusters all around the world. All supercomputers on the TOP500 list[75] and the majority of the world's internet servers
have run on Linux for a long time; Android smartphones likewise run on Linux. Because of Linux, proprietary operating
systems have been irrelevant in high-performance computing for many years. Chemists had good reasons to switch to
Linux long ago[76]; the present authors have used Linux as their main computational research platform for over
20 years.
A valuable feature of Linux distributions is that they are usually cross-platform: in addition to the usual x86 and
x86-64 platforms, consisting of processors by, for example, the Intel Corporation and Advanced Micro Devices Inc.
(AMD), Fedora packages are also available on s390x processors used on IBM mainframe computers and ARM proces-
sors such as the ones used in Raspberry Pi and new Mac computers, for instance. This versatility allows the use of het-
erogeneous hardware and ensures seamless compatibility even if students have dissimilar computing devices at their
disposal.
Several Linux distributions, such as Ubuntu, Debian, and Fedora Linux, also solved the problem of efficient
distribution of software decades ago. Our criteria for FOSS in Section 2.1 allow such scientific software to be packaged as
part of Linux distributions, and indeed several powerful program packages are already available as distribution
packages thanks to the grand entrance of FOSS in quantum chemistry in recent years. Some FOSS quantum chemistry
packages like Erkale,[77] Psi4[78] and its predecessor Psi3,[79] and PySCF[80] have been developed in a fully
free/open-source development model since their beginning, while other packages that originated within a closed-source
licensing model have also become open-sourced recently, such as OpenMolcas,[81] Dalton,[82] and NWChem.[83]
2.2.4 | Case study: Libxc library of density functional approximations
An example of a successful scientific FOSS project can be found in the Libxc library of density functional
approximations.[84] The modular library currently implements over 600 density functional approximations such as
PBE,[85] B3LYP,[86]
and SCAN,[87] and is used by over 30 electronic structure programs ranging from programs using Gaussian basis sets
(Erkale,[77] Psi4,[78] and PySCF[80]) to plane-wave codes (ABINIT,[88] INQ,[89] and Quantum Espresso[90]), finite
element programs (HelFEM[91–94] and DFT-FE[95]), and multiresolution adaptive grids (MADNESS[96]). In order to
facilitate wider use by the community, Libxc recently switched to a more permissive FOSS license that allows the library
to be more easily included in closed-source programs. Libxc is now used in several proprietary and commercial software
packages, for example, the Slater-type orbital ADF package,[96] and the Gaussian-type orbital GAMESS-US,[97]
Molpro,[98] MRCC,[99] ORCA,[100] and TURBOMOLE[101] programs; several other packages are also contemplating
migrating to Libxc.
The advantages of the community adoption of Libxc are manifold. A new density functional approximation only
needs to be implemented in Libxc to become available in any of the electronic structure programs that support Libxc,
underlining the efficiency of the modular FOSS model. Moreover, access to the same implementation of a density
functional approximation enables, for example, the study of reproducibility across various numerical approaches,[102]
which is important for comparing results obtained with different methods or software packages. Indeed, economic gains
in terms of software development productivity and product quality can be achieved by reuse of mature FOSS components
that are of the highest quality.[103]
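To make the idea of a reusable functional component concrete, the following minimal sketch implements the simplest density functional approximation, the local density approximation for exchange of a spin-unpolarized electron gas, $e_x(\rho) = -C_x \rho^{4/3}$ with $C_x = \tfrac{3}{4}(3/\pi)^{1/3}$ in atomic units. This is an illustration only: the function names are hypothetical, and this is not the Libxc interface. The point is that a functional is a self-contained mapping from the density to an energy density and its derivative, which any electronic structure program can call.

```python
import math

# Exchange constant C_x = (3/4) * (3/pi)^(1/3) in atomic units.
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def lda_exchange_energy_density(rho):
    """Exchange energy per unit volume, e_x(rho) = -C_x * rho^(4/3)."""
    return -C_X * rho ** (4.0 / 3.0)


def lda_exchange_potential(rho):
    """Derivative v_x(rho) = d e_x / d rho = -(4/3) * C_x * rho^(1/3)."""
    return -(4.0 / 3.0) * C_X * rho ** (1.0 / 3.0)


def numerical_derivative(func, x, step=1e-6):
    """Central finite difference, used here only as a consistency check."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


if __name__ == "__main__":
    for rho in (0.01, 0.1, 1.0):
        analytic = lda_exchange_potential(rho)
        numeric = numerical_derivative(lda_exchange_energy_density, rho)
        print(f"rho={rho:6.2f}  v_x={analytic: .8f}  finite difference={numeric: .8f}")
```

Comparing the analytic potential with a finite difference of the energy density is the same kind of check one would run to validate a newly added functional before other programs rely on it.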
We believe that computational chemistry will continue to transform by adopting more and more FOSS components,
the electronic structure library (ESL) being one of the notable pushes in this direction.[104] Well-designed, modular FOSS
components can be maintained even by a single academic group; the semi-empirical dispersion library of the Grimme
group is a successful recent example.[105–107] We will discuss this topic further in Section 3.5.
2.3 | What does FOSS offer for teaching?
2.3.1 | Free redistribution: Install and maintenance
In addition to its benefits for general use cases,[108] FOSS has three major advantages for teaching: the availability
of precompiled binaries, the availability of the source code, and the general applicability of the software beyond
academia. Starting out with the first advantage, software that satisfies the criteria for FOSS discussed in Section 2.1
can be redistributed, and included in Linux distributions, for example. This greatly facilitates the installation of these
programs, as prepackaged software can be installed in a matter of minutes on a wide range of hardware, ranging from
students' laptops to compute servers, simply by running a single command, or alternatively, finding the program in the
distribution's graphical application manager and clicking on "Install."
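In a BYOD course it can be handy to let students verify that the packaged programs actually landed on their device. The following is a minimal sketch that only uses the Python standard library; the command names `psi4`, `nwchem`, and `pw.x` and the module name `pyscf` are assumptions about how the packages are usually installed and may differ between distributions and versions.

```python
import importlib.util
import shutil

# Assumed command and module names; adjust them to the local installation.
EXECUTABLES = {"Psi4": "psi4", "NWChem": "nwchem", "Quantum Espresso": "pw.x"}
PYTHON_MODULES = {"PySCF": "pyscf"}


def find_installed_packages():
    """Return a dict mapping each package name to True (found) or False."""
    found = {}
    for name, command in EXECUTABLES.items():
        # shutil.which returns None when the command is not on the PATH.
        found[name] = shutil.which(command) is not None
    for name, module in PYTHON_MODULES.items():
        # find_spec returns None when the top-level module cannot be imported.
        found[name] = importlib.util.find_spec(module) is not None
    return found


if __name__ == "__main__":
    for name, is_found in sorted(find_installed_packages().items()):
        print(f"{name:18s} {'found' if is_found else 'missing'}")
```

A check like this could be run at the start of a course session so that missing packages are noticed before the exercises begin.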
We wish to note here that although installing scientific software by hand by compiling from source code affords cus-
tomized tunings that may result in faster operation, that is, decreased runtimes of quantum chemistry packages, in
many cases the gains realizable in computational chemistry education or small-scale computing are relatively modest
and pale in comparison with the ease of effort afforded by the centralized packaging system. Compiling from source
takes a lot of time as well as expertise, and can lead to poor performance if the compiler options are not adequately cho-
sen; note that several proprietary programs have likewise adopted a binary-only distribution model with the same
limitations.
However, installation is only a part of the problem: the software must also be kept up to date. This does not happen
automatically, and a constant level of administration effort is then required to monitor new releases, and to download
and install new versions of the software. In contrast, the Linux distribution packages get automatically updated with
the rest of the system whenever new package versions come out: Linux package managers not only handle updates to
the Linux operating system kernel, but also all other software, such as the internet browsers, the email clients, the office
productivity software suites, the Fortran and LaTeX compilers, and so on. Computational chemistry packages get
updated automatically as well.
2.3.2 | Access to source code
The second advantage of FOSS is that as the source code is available, it can be used in teaching. For instance, a course on
electronic structure calculations can exemplify the basic algorithms by showing how they are implemented in an openly
available program. Some codes go even further: for instance, Psi4Numpy[109] is a project that aims to supply simple, easily
modifiable Python algorithms for educational and proof-of-concept purposes. The PySCF quantum chemistry program[80]
makes it easy to override and customize all algorithms, as they are mostly written in Python. Similarly, DFTK[110] has
been designed to facilitate algorithmic development and might therefore also be useful for educational purposes.
Access to these kinds of projects not only facilitates research in and development of new electronic structure
methods, but also means that teaching no longer has to be limited to pen and paper exercises: instead, it can also
include real-life demonstrations. For example, an advanced course on electronic structure theory could involve asking
students to write their own, customized solver for self-consistent field theory.[111]
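As a flavor of what such an exercise might look like, the sketch below runs a self-consistent field iteration for a toy problem chosen purely for illustration (it is not taken from any of the packages discussed here): two electrons on two sites with on-site repulsion, treated in a restricted mean-field approximation in which each electron feels the average opposite-spin occupation of its site. All names and parameter values are hypothetical. The loop shows the usual ingredients of an SCF solver: build the mean-field matrix from the current density, diagonalize, form the new density, and mix it with the old one until nothing changes.

```python
import math


def lowest_eigenpair(a, b, d):
    """Lowest eigenvalue and normalized eigenvector of [[a, b], [b, d]] (b != 0)."""
    e_low = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    v0, v1 = b, e_low - a  # satisfies (a - e_low) * v0 + b * v1 = 0
    norm = math.hypot(v0, v1)
    return e_low, (v0 / norm, v1 / norm)


def scf_two_site(site_energies=(-0.5, 0.5), hopping=1.0, repulsion=2.0,
                 mixing=0.5, tolerance=1e-10, max_iterations=200):
    """Return (site occupations, orbital energy, iterations) at self-consistency."""
    occupations = [1.0, 1.0]  # initial guess: one electron per site
    for iteration in range(1, max_iterations + 1):
        # Mean-field matrix: site energy plus repulsion from the opposite-spin
        # electron, whose average occupation on site i is occupations[i] / 2.
        diagonal = [site_energies[i] + 0.5 * repulsion * occupations[i]
                    for i in range(2)]
        orbital_energy, coefficients = lowest_eigenpair(
            diagonal[0], -hopping, diagonal[1])
        # Both electrons occupy the lowest orbital.
        new_occupations = [2.0 * c * c for c in coefficients]
        change = max(abs(new - old)
                     for new, old in zip(new_occupations, occupations))
        if change < tolerance:
            return new_occupations, orbital_energy, iteration
        # Damping: mix old and new densities to stabilize the iteration.
        occupations = [(1.0 - mixing) * old + mixing * new
                       for old, new in zip(occupations, new_occupations)]
    raise RuntimeError("SCF did not converge")


if __name__ == "__main__":
    occupations, energy, iterations = scf_two_site()
    print(f"converged in {iterations} iterations")
    print(f"site occupations: {occupations[0]:.6f}, {occupations[1]:.6f}")
    print(f"occupied orbital energy: {energy:.6f}")
```

Students can vary the mixing parameter or the repulsion strength and observe how the convergence behavior changes, which is the kind of experimentation that pen and paper exercises do not offer.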
2.3.3 | Sophisticated workflows
The third advantage of FOSS for teaching is that since students (like anyone else) can access the full power of various
computational chemistry programs, they also have the possibility to develop more general technical skills such as pro-
gramming and interfacing programs with each other, for instance by generating sophisticated workflows that automate
complex tasks. Automated workflows are highly useful tools for practical computations, as they can be leveraged to
easily run and analyze thousands to even millions of calculations that are needed for high-performance screening of
materials, for instance. Several large-scale projects such as Materials Project,[112] Materials Cloud,[113] AiiDA,[114]
Atomic Simulation Recipes,[115] and QCEngine[116] are FOSS and provide immediate access to powerful automated
workflows for computational chemistry. As was summarized in the first criterion in Section 2.1, FOSS can also be freely
used without limitations in industry to develop new thermoelectric energy conversion materials[117] or semiconductor
devices,[118] for example, underlining its freedom and flexibility.
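The structure of such a workflow can be sketched in a few lines of Python. In the following minimal example, `run_calculation` is a hypothetical stand-in for a call to a real quantum chemistry program: it evaluates a Morse potential with made-up parameters so that the example is self-contained. The workflow itself (generate inputs, run the jobs concurrently, collect and analyze the results) is the part that carries over to real calculations; the dedicated workflow tools named above offer far more, such as provenance tracking and error handling.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def run_calculation(bond_length):
    """Stand-in for one quantum chemistry job; returns an energy.

    Uses a Morse potential with made-up parameters (atomic units).
    """
    well_depth, width, r_eq = 0.17, 1.0, 1.4
    return well_depth * (1.0 - math.exp(-width * (bond_length - r_eq))) ** 2 - well_depth


def scan(bond_lengths, max_workers=4):
    """Run all jobs concurrently; return (bond_length, energy) pairs in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        energies = list(pool.map(run_calculation, bond_lengths))
    return list(zip(bond_lengths, energies))


def analyze(results):
    """Pick the grid point with the lowest energy."""
    return min(results, key=lambda pair: pair[1])


if __name__ == "__main__":
    grid = [0.8 + 0.1 * i for i in range(23)]  # 0.8 to 3.0 in steps of 0.1
    r_min, e_min = analyze(scan(grid))
    print(f"lowest energy {e_min:.6f} at bond length {r_min:.2f}")
```

Threads are used here because a real job would typically spend its time inside an external program; for work done in pure Python, a process pool would be the more appropriate choice.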
2.4 | Why would it be timely to switch to free/open-source software?
We have argued above that FOSS has important ramifications for the reproducibility of science and also has several
advantages for teaching. Although it is possible to switch from proprietary programs to FOSS within the traditional
setup based on computer classrooms and/or central compute servers, there is yet another important aspect to consider:
the BYOD approach discussed in Section 1. In this section, we wish to examine FOSS from the point of view of the
ongoing paradigm shift to the BYOD scheme.
As the price of laptop computers has dropped, many students now bring their own devices to the classroom. This
paradigm shift has also affected university policies. Students preferring to use their own devices have led to a significant
decrease in the demand for computer classrooms. Universities may now find it cheaper to just offer a laptop to all stu-
dents. For instance, the Faculty of Science of the University of Helsinki pivoted to such an approach several years ago.
As a result, the university has been able to cut down on computer classrooms that are expensive to maintain even while
several students refuse the laptop offered by the university and opt to use their private laptops instead.
Although a centralized compute server approach is compatible with the BYOD paradigm, as was already discussed in Section 1, the effortless availability of FOSS programs can be used to finally bring computational chemistry to the masses and thereby truly democratize science. As FOSS packages can be made instantly available to everyone, the FOSS approach is ideally suited for personal devices in the BYOD approach. Such a distributed approach is
optimal also for massive open online courses (MOOCs), as enrollment does not have to be limited based on the avail-
able centralized computer resources. Instead, the students can run all of the necessary calculations on their own
hardware.
Naturally, certain tradeoffs are implied in a course employing heterogeneous BYOD approaches, as one cannot
assume personal devices to have the same computational power as purpose-built, dedicated compute servers. However,
we argue that this is not much of an impediment due to the immense developments in the speed of processors and
improved algorithms achieved during the past several decades. A concrete example of this is the TOP500 list of supercomputers, which contains almost 30 years' worth of data on the most powerful supercomputers in the world [119,120]. The estimated performance of the fastest and slowest supercomputer on the list on a year-by-year basis is shown in Figure 1 in units of 10^9 floating-point operations per second (GFlops). Figure 1 also shows analogous benchmark data for commodity hardware: a cheap tablet computer with an Intel Celeron N4000 processor and a high-end business laptop with an Intel i7-10610U processor belonging to one of the present authors (SL). A Raspberry Pi 4 minicomputer was also assessed, and found to perform similarly to the Celeron N4000 processor.
LEHTOLA AND KARTTUNEN 7of33
As Figure 1 illustrates, personal devices have performance in the tens to hundreds of gigaflops, which is comparable to the performance of the fastest supercomputers of the mid-1990s, or to the slowest supercomputer on the TOP500 list in the mid-2000s. This amazing development in computational power means that the content of classic books on quantum chemistry such as Szabo and Ostlund [121] could be reproduced nowadays on commodity hardware; however, there is no reason to, since better computational methods and basis sets are now available in many FOSS packages. Many calculations could probably even be carried out on an up-to-date smartphone!
The data in Figure 1 suggest that a variety of calculations are possible within a reasonable time with personal
devices. Combined with FOSS program packages that can be installed and kept up to date in a trivial fashion with a
package manager, computational chemistry can finally be made available to the masses, as students are able to run
(and modify!) FOSS packages on their own devices. The skills they gain doing so are directly transferable to both
research and industry, as the same packages can also be used for heavy-duty calculations on supercomputers, which is
also freely allowed by their permissive licenses.
3 | OVERVIEW OF AVAILABLE FOSS PROGRAM PACKAGES
This section presents an overview of available FOSS program packages for computational chemistry. As the number of
FOSS projects has grown immensely in recent years, we restrict the overview to self-contained packages which are able
to run quantum electronic structure calculations from atomistic input. FOSS for other types of molecular modeling has been discussed elsewhere [122,123], while various computational chemistry resources for education have been recently summarized by Rodríguez-Becerra et al. [124]
As the availability of software is a moving target, since new packages appear and old ones become technologically obsolete and stop being maintained, any review can by necessity only represent the situation at a given point in time. Continuously updated databases are an alternative that is (hopefully) always up to date [125], but any observations made on their basis are similarly tied to the time of observation and become outdated as enough time passes. For this reason, new reviews are typically published whenever the availability of software has changed enough.
[Figure 1 plot: performance in GFlops (logarithmic y axis, 10^-1 to 10^9) versus year (x axis, 1992 to 2021). Legend entries: Budget laptop, Celeron N4000; High-end business laptop, Core i7-10610U.]
FIGURE 1 The best-performing (red stars) and worst-performing (blue squares) supercomputer on the TOP500 list [120], as well as the performance of a budget laptop with a Celeron N4000 processor and a high-end business laptop with a Core i7-10610U processor (see Supporting Information). Note the logarithmic scale on the y axis. The performance of Raspberry Pi 4 was found to be similar to Celeron N4000.
The main goal of this section is merely to illustrate the breadth of software that is already available for use in com-
putational chemistry. We have assembled the collection of packages by thorough literature and internet searches.
Because unmaintained packages are unlikely to be easy to install, or to become available as prepackaged software, we
limit the overview to software that shows at least some development activity in recent years, as checked from the
upstream development repositories. Even if it later turns out that we have missed some recently published software
package in this review, or if some packages become replaced by newer competitors after the publication of this article,
our main points should remain unaffected: there will likely still be a similar breadth of FOSS packages suitable for a
variety of purposes within computational chemistry and computational chemistry education.
As FOSS, the programs listed here can be packaged and distributed openly without restriction; several of them are
already available as part of Linux distributions such as Debian, Ubuntu, and Fedora Linux. Linux distribution packages
are centrally maintained by the Linux distribution's packagers, and require no special knowledge or local department
personnel to install them or keep the software up to date, in contrast to typical proprietary packages. As we show in the
Supporting Information, the packages can be installed on the command line; alternatively, they can also be installed
using the distribution's application store. Importantly, the software is also automatically kept up to date by the distribution package manager, whereas the installation and upkeep of proprietary packages tends to require significant local expertise, time, and effort.
It is not even necessary to be running Linux to use such prepackaged programs. Windows users can run the
software under the Windows Subsystem for Linux (WSL), which allows installing and using a Linux distribution
easily inside Windows 10. The cross-platform Python Package Index [126] (PyPI) and Conda [127] package managers are other alternatives for easy access to an increasing number of quantum chemistry packages on Linux, Windows,
and macOS. Computer laboratory settings can also be imitated using pre-made, customized live CDs or live USBs,
for example.
Because of the large number of packages to review, we organize the discussion into:
• Programs for molecular calculations with Gaussian basis sets, Section 3.1
• Programs for solid-state calculations with various numerical approaches, Section 3.2
• Programs employing fully numerical methods, Section 3.3
• Programs employing semiempirical methods, Section 3.4
Due to space constraints, we only include minimalistic descriptions of the programs, and advise the reader to look
up the programs' evolving capabilities in detail on the internet to assess their usefulness for a given computational
chemistry course or other application. Most of the electronic structure programs support Hartree–Fock (HF) and/or density functional theory [128,129] (DFT); several molecular programs also support various post-HF methods.
We will also discuss projects of a more limited scope in Section 3.5.
3.1 | Programs for molecular calculations with Gaussian basis sets
Gaussian basis sets dominate the field of quantum chemistry, since all electrons can efficiently be included in the calculation, the electronic Coulomb integrals can be evaluated analytically in the Gaussian basis [130], and the evaluation is efficient when recursion relations are used [131,132]. Thanks to many decades of work on the development of Gaussian basis sets [133–135], basis sets exist for the accurate reproduction of various molecular properties at several levels of theory. Access to analytical integrals greatly facilitates the implementation of post-HF theories, and also guarantees accurate force and Hessian evaluations.
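To give a flavor of why Gaussians lead to analytical integrals, the sketch below evaluates the overlap of two unnormalized one-dimensional Gaussians in closed form, using the Gaussian product theorem, and checks it against a brute-force trapezoidal quadrature. This is a deliberately simplified illustration: it is an overlap integral rather than a Coulomb integral, it is one-dimensional, and all function names and parameter values are chosen here for the example. The two-electron integrals used in actual programs are considerably more involved.

```python
import math


def overlap_analytic(a, A, b, B):
    """Closed-form integral of exp(-a (x-A)^2) * exp(-b (x-B)^2) over x.

    By the Gaussian product theorem the product is a single Gaussian with
    exponent p = a + b, scaled by exp(-a b / p * (A - B)^2).
    """
    p = a + b
    return math.sqrt(math.pi / p) * math.exp(-a * b / p * (A - B) ** 2)


def overlap_numeric(a, A, b, B, lo=-20.0, hi=20.0, n=20001):
    """The same integral by the trapezoidal rule, for comparison only."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(-a * (x - A) ** 2 - b * (x - B) ** 2)
    return total * h


# Illustrative exponents and centers (arbitrary values)
a, A, b, B = 0.5, 0.0, 1.2, 0.7
exact = overlap_analytic(a, A, b, B)
approx = overlap_numeric(a, A, b, B)
print(f"analytic: {exact:.10f}  numeric: {approx:.10f}")
```

The closed form costs a handful of floating-point operations, whereas the quadrature needs thousands of function evaluations to reach comparable accuracy.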
Bagel [136] is a C++ program package that features, for example, analytical complete active space perturbation theory at the second order (CASPT2) nuclear energy gradients and derivative couplings, relativistic multireference wave functions based on the Dirac equation, and implementations of novel electronic structure theories.
Chronus Quantum [137] is a C++ program package that focuses on the consistent treatment of time dependence and spin in the electronic wave function, as well as the inclusion of relativistic effects in said treatments.
Dalton [138] is a Fortran program that specializes in molecular properties at various levels of theory, such as frequency-dependent response properties; one-, two-, and three-photon processes, etc. In addition to HF and DFT, Dalton features several post-HF methods like multiconfigurational self-consistent field (MCSCF) theory and coupled-cluster theory.
Ergo [139] is a C++ program for linear-scaling HF and DFT calculations for molecules.
ERKALE [77] is a C++ program implementing HF and DFT that specializes in the modeling of inelastic X-ray spectroscopies, self-interaction corrected DFT, as well as various orbital localization methods.
eT [140] is a C++ program primarily aimed at coupled-cluster calculations of molecular systems, which specializes in multiscale and multilevel methods, as well as modern Cholesky decomposition techniques for two-electron integrals.
Fermi.jl [141] is a Julia package for HF and post-HF calculations.
JuliaChem [142] is a Julia package for HF calculations.
LSDalton [138] is a Fortran code targeted at linear-scaling HF and DFT calculations on large molecular systems, and also includes some coupled-cluster capabilities.
MolGW [143] is a Fortran/C++ package that implements HF and DFT, but specializes in many-body perturbation theory: the GW approximation and the Bethe–Salpeter equation.
MPQC [144] is a C++ program for massively parallel quantum chemistry, which originally focused on HF and DFT but has later evolved support for post-HF many-body theories.
NWChem [83] is a major quantum chemistry package written in Fortran and has a variety of features for both molecular and solid-state calculations.
Psi4 [78] is a modular C++/Python package for HF, DFT, and various post-HF calculations that can be used either as a traditional quantum chemistry package with simple and intuitive input files, or as Python modules for running calculations in Python.
PySCF [80] is a collection of Python modules for electronic structure calculations with significant capabilities also for solid-state simulations, including, for example, coupled-cluster implementations for crystalline systems.
PyQuante [145] is a Python package for quantum chemistry with some C extensions that emphasizes ease of understanding the code over performance.
OpenMolcas [81] is a Fortran package that specializes in multiconfigurational approaches to electronic structure theory, but also implements various DFT calculations, for example.
Serenity [146] is a C++ program for subsystem quantum chemical methods.
SlowQuant [147] is a Python program for molecular quantum chemistry that derives its name from the use of Python for even the computationally demanding parts of the program.
VeloxChem [148] is a C++/Python package for molecular properties and for modeling various spectroscopies based on response theory.
Uquantchem [149] is a Fortran 90 program written for HF, DFT, Møller–Plesset perturbation theory, configuration interaction singles and doubles, quantum Monte Carlo, and so on.
3.2 | Programs for solid-state calculations
The major difference between solid-state and molecular calculations is that the orbitals experience exponential decay in molecular calculations, while solid-state calculations are performed on periodic crystals where the wave function has to obey Bloch's theorem [150]. Because of the periodicity, calculations in the solid state are in many ways more difficult than those in molecules due to the need for k-point sampling, for instance; see Ref. [151] for a recent introduction. Post-HF methods are much less prominent in the solid state than in molecules. Instead, calculations on solids are typically carried out with DFT and pseudopotentials [152]; pseudopotentials make the calculations less costly while introducing an error which is typically negligible compared with the error in the density functional approximation itself.
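For reference, Bloch's theorem and the idea behind k-point sampling can be written out as follows. The notation (band index $n$, crystal momentum $\mathbf{k}$, lattice vector $\mathbf{R}$, Brillouin zone volume $\Omega_{\mathrm{BZ}}$, sampling weights $w_{\mathbf{k}}$) follows common textbook conventions and is an assumption made here, since the text above does not fix any symbols.

```latex
% Bloch's theorem: translation by a lattice vector R only changes the phase,
% equivalently the orbital is a plane wave times a lattice-periodic function.
\psi_{n\mathbf{k}}(\mathbf{r} + \mathbf{R})
  = e^{i \mathbf{k} \cdot \mathbf{R}} \, \psi_{n\mathbf{k}}(\mathbf{r})
\quad \Longleftrightarrow \quad
\psi_{n\mathbf{k}}(\mathbf{r})
  = e^{i \mathbf{k} \cdot \mathbf{r}} \, u_{n\mathbf{k}}(\mathbf{r}),
\qquad
u_{n\mathbf{k}}(\mathbf{r} + \mathbf{R}) = u_{n\mathbf{k}}(\mathbf{r})

% k-point sampling: Brillouin zone averages are replaced by weighted sums
% over a finite set of k-points, with the weights summing to one.
\frac{1}{\Omega_{\mathrm{BZ}}} \int_{\mathrm{BZ}} f(\mathbf{k}) \, \mathrm{d}\mathbf{k}
  \approx \sum_{\mathbf{k}} w_{\mathbf{k}} \, f(\mathbf{k}),
\qquad
\sum_{\mathbf{k}} w_{\mathbf{k}} = 1
```

In practice this means that a separate set of orbitals has to be solved for at every sampled k-point, which is one reason why periodic calculations are more demanding than their molecular counterparts.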
The conventional way to model crystalline systems is to use plane waves. However, many other numerical schemes
have also been pursued. Note that the programs listed here that employ (pseudo)atomic basis functions can naturally
handle periodicity in 0, 1, 2, or 3 dimensions, corresponding to atoms and molecules, chains, sheets, and crystals,
respectively. Still, we have listed them as solid state codes because they are most often used for calculations with DFT
and pseudopotentials.
ABINIT [88] is a Fortran program for plane wave calculations that supports DFT as well as more advanced formalisms like many-body perturbation theory.
ACE-Molecule [153] is a C++ program that employs uniform real-space grids of Lagrange sinc functions and pseudopotentials, and supports density functional calculations on both periodic and non-periodic systems and wave function theory calculations based on Kohn–Sham orbitals.
BigDFT [154] is a Fortran program that is based on the use of pseudopotentials and a two-tier Daubechies wavelet basis to achieve a spatially localized basis.
Conquest [155] is a Fortran program for large-scale DFT calculations employing pseudo-atomic orbital basis sets.
CP2K [156] is a Fortran package based on Gaussian basis sets specializing in solid state physics, implementing HF, DFT, Møller–Plesset perturbation theory, and the random phase approximation.
DFTK [110] or the density-functional toolkit is a collection of Julia routines for experimenting with plane-wave DFT that emphasizes simplicity and flexibility with the aim of facilitating algorithmic and numerical developments and simplifying interdisciplinary collaboration in solid-state research.
ELK [157], EXCITING [158], and FLEUR [159] are Fortran programs for linearized augmented-plane wave calculations which can reach microhartree accurate total energies for carefully chosen basis sets.
GPAW [160] is a Python/C electronic structure program for DFT calculations within the projector-augmented wave approach which supports three modes of operation: (i) finite-difference grids, (ii) numerical atomic orbitals, and (iii) plane waves.
INQ [89] is a new, modular implementation of plane wave DFT and time-dependent DFT written from scratch to work on graphics processing units (GPUs).
JDFTx [161] is a C++ plane wave DFT code aimed to be easy to develop and easy to use, whose key feature is support for joint DFT for the description of electronic systems in contact with molecular liquids.
M-SPARC [162] is a MATLAB package for prototyping DFT calculations employing finite-difference grids and pseudopotentials.
Octopus [163] is a Fortran program based on pseudopotentials and finite difference grids that focuses on time-dependent DFT for handling non-equilibrium phenomena.
OpenMX [164] is a C package for DFT calculations with pseudopotentials and numerical atomic orbitals.
PARSEC [165] is a Fortran program based on finite-difference grids for density functional calculations with pseudopotentials.
PWDFT.jl [166] is a Julia package written from scratch to facilitate development of novel computational methods using plane waves.
RMG [167] is a C++/Fortran program employing real space grids and multigrid algorithms for density functional calculations with pseudopotentials.
Siesta [168] is a Fortran program for electronic structure calculations and ab initio molecular dynamics of molecules and solids that employs a basis set of numerical atomic orbitals, which are strictly localized, enabling the use of sparsity.
Qbox [169] is a C++ program aimed at first principles molecular simulations using plane waves and pseudopotentials.
Quantum Espresso [90] is a Fortran/C program for plane wave calculations with pseudopotentials on a wide range of hardware from laptops to supercomputers.
SPARC [170] is a C program for parallel DFT calculations employing finite-difference grids and pseudopotentials.
3.3 | Programs relying on fully numerical representations
The idea in modern fully numerical methods is to represent the orbitals directly in real space, and to use a representa-
tion of non-uniform accuracy (more grid points near the nuclei and fewer points in empty regions of the system) so that
all-electron calculations become feasible. Although fully numerical approaches have a long history for calculations on atoms and diatomic molecules [171], they are otherwise a relatively recent development in electronic structure theory and have only recently become competitive with e.g. Gaussian-basis calculations whenever high accuracy is needed [172].
DFT-FE [95] is a C++ program that employs spectral finite-element basis sets for a local real-space variational formulation of DFT, and is able to handle pseudopotential and all-electron calculations within the same framework and arbitrary periodicity.
HelFEM is a C++ program for fully numerical calculations on atoms [92,94] and diatomic molecules [91] at the HF or DFT levels of theory employing high-order numerical basis functions and yielding fully variational energies.
MADNESS [173] is a C++ program that relies on the use of multiresolution adaptive grids, which has been used in a variety of studies on novel real-space approaches to electron correlation, for instance.
MRChem [172] is a C++ program that also relies on multiresolution adaptive grids for HF and DFT calculations on molecules; its specialty is the computation of magnetic properties such as nuclear magnetic shielding constants.
x2dhf [174] is a Fortran program for non-relativistic finite difference restricted open-shell HF and DFT calculations on diatomic molecules.
3.4 | Programs employing semiempirical models
Semiempirical models offer affordable techniques for approximate quantum mechanical calculations that fall in accuracy in-between ab initio density functional calculations and force field techniques. Tight-binding DFT [175–177] is probably the best-known semiempirical model, and it is available in several program packages. Other types of semiempirical methods exist as well; please refer to Thiel [178] and Bannwarth et al. [179] for discussion.
DFTB+ [180] is a Fortran package for various calculations based on tight-binding DFT.
Latte [181] is a Fortran program for tight-binding DFT molecular dynamics.
Sparrow [182] is a C++/Python program for fast semiempirical quantum chemical calculations, including tight-binding DFT.
xtb [179] is a Fortran package that implements various semiempirical eXtended Tight-Binding methods.
3.5 | Limited-scope projects
Although the main focus of our review is on self-contained packages for quantum electronic structure calculations for computational chemistry education, this narrow scope risks not seeing the forest for the trees. The major part of FOSS (the forest in the analogy) is a huge thriving ecosystem of small projects with limited scope, which wildly outnumber the more conspicuous large program packages (the trees), which exist in synergy with the smaller projects: the smaller subprojects are often used by the larger programs. Therefore, in order to gain a thorough overview of FOSS it is invaluable to extend our review from the self-contained packages reviewed above to projects of a more limited scope which often have little user visibility.
The proliferation of small projects has multiple raisons d'être. The most common one is simply a specific personal
need. The good news is that because of the limited effort required to develop and maintain a code with a well-defined
scope, they can be developed and maintained by a single research group, or often even by a single person. The bad news
is that probably the majority of all FOSS projects in existence are unmaintained, simply because the authors moved on
to other things. As was already mentioned in the beginning of Section 3, we have not considered such projects in this
review.
3.5.1 | Keys to modular design
There is a systematic reason for the origin of the specific personal need mentioned in the previous paragraph: the DRY [Don't Repeat Yourself] and KISS [Keep It Simple, Stupid!] principles, which have been key principles in software engineering for an extended time and are still used to teach programming [183].
DRY is a reminder to avoid code duplication: a given functionality should only be programmed once and that imple-
mentation called everywhere it is needed, instead of repeating the same functionality in several places of the program.
The latter approach is more verbose, making it less maintainable and more prone to bugs.
In KISS, a complex problem is broken down into smaller subtasks. Once the subtasks (the common pieces of the problem) have been identified, the principle is reapplied to the subtasks themselves: can they be broken down to a compact collection of even simpler tasks?
Once a KISS design has been established, each component has a clear role in the design of the whole program. Even
though achieving the best design may in reality require several iterations of refactoring (restructuring) the code, the
effort in each iteration of the refactor is limited because even the code one is starting with should be quite simple if the
initial application of KISS was even partly successful.
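The two principles can be illustrated with a small, hypothetical example: the interatomic distance is implemented exactly once (DRY) as a tiny function with a single responsibility (KISS), and two higher-level routines reuse it. The function names and the choice of quantities are made up for this sketch; the nuclear repulsion energy is given in atomic units.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def bond_lengths(coords, bonds):
    """Lengths of the listed bonds; reuses distance() instead of repeating it."""
    return [distance(coords[i], coords[j]) for i, j in bonds]


def nuclear_repulsion(coords, charges):
    """Sum of Z_i Z_j / r_ij over unique atom pairs; also reuses distance()."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i):
            energy += charges[i] * charges[j] / distance(coords[i], coords[j])
    return energy


# Two unit charges separated by 1.4 length units (illustrative geometry)
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)]
charges = [1, 1]
print(bond_lengths(coords, [(0, 1)]), nuclear_repulsion(coords, charges))
```

If the distance formula ever needs to change, for example to account for periodic boundary conditions, only `distance` has to be touched, and both callers pick up the change automatically.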
3.5.2 | Is modular design a limitation?
A well-made design is like a puzzle: each software component fills in a piece of the puzzle by carrying out a small, well-
defined task. Each piece should ideally be so small that a working implementation can be developed in a matter of
hours.
The first attempt at the design of the program layout is often not fully successful, because the structure of a scientific
problem is not always clear before it has been fully solved. For this reason, program structures tend to develop
over time.
If a redesign of the modular structure of a problem leads to a more elegant or efficient implementation, it is often adopted in a new version of the software. Such redesigns are extremely common in software development, and are the reason for versioning software: the major version changes whenever the interface becomes incompatible with the older version [184]. However, the redesign is often achievable through simple reorganizations of the earlier code base. The software does not have to be rewritten, as the existing pieces can just be rearranged to fit the new pattern.
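The versioning rule described above can be sketched in a few lines, assuming a three-part MAJOR.MINOR.PATCH version string. This scheme, and the helper names `parse_version` and `is_compatible`, are assumptions made for the illustration; individual projects may follow different conventions.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(required, installed):
    """True if `installed` can stand in for `required`.

    Assumed rule: the major version must match exactly (a change signals an
    incompatible interface), and the installed minor/patch level must be at
    least as new as the required one.
    """
    req = parse_version(required)
    inst = parse_version(installed)
    return inst[0] == req[0] and inst[1:] >= req[1:]


print(is_compatible("1.2.0", "1.4.1"))  # newer minor version, same major
print(is_compatible("1.2.0", "2.0.0"))  # major version changed
```

Under this rule a program written against version 1.2.0 of a library would accept 1.4.1 but reject 2.0.0, which is exactly the situation in which an old and a new major version of a library end up coexisting.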
If the design of a modular library changes enough, it can essentially become a wholly new library. In this case,
migrating to the newer version of the library may be a significant task for other projects, and the old and the new version of the library may coexist for an extended time. A good example in the field of quantum chemistry is the libint library of two-electron integrals [185], which is used by several FOSS codes. A new major version of the library was introduced in 2014 to take advantage of the new features afforded by modern processors, but many quantum chemistry programs still use the original version published in the early 2000s, since the functionality provided by the older version suffices for the purposes it was designed for.
3.5.3 | The importance of interoperability
An example of a modular design that has stood the test of time is the Basic Linear Algebra Subprograms (BLAS) library, which was originally introduced in the late 1970s [186]. BLAS implements elementary linear algebra operations, such as adding, scaling, and multiplying vectors and matrices; operations which hold a central place in most branches of computational science, including quantum chemistry, much of which is linear algebra.
Although a simple for-loop based implementation of BLAS operations, such as matrix–matrix multiplication $C_{ik} = \sum_j A_{ij} B_{jk}$, can be written up in minutes, the mathematical structure of the problems can be employed to design a faster implementation. In a later step, the implementation can even be hand-optimized to the specific processor used in the machine; competing optimized BLAS implementations are an active area of research [187,188].
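For concreteness, a simple for-loop implementation of the matrix-matrix product C_ik = sum_j A_ij B_jk might look like the sketch below. It is written in plain Python with matrices stored as lists of rows; the function name `matmul` is chosen here for illustration and this is not how any BLAS library is implemented.

```python
def matmul(A, B):
    """Naive matrix-matrix product C[i][k] = sum_j A[i][j] * B[j][k].

    A is n x m and B is m x p, both given as lists of rows.
    """
    n, m, p = len(A), len(B), len(B[0])
    if any(len(row) != m for row in A):
        raise ValueError("inner dimensions of A and B do not match")
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):          # rows of A
        for k in range(p):      # columns of B
            for j in range(m):  # contracted index
                C[i][k] += A[i][j] * B[j][k]
    return C


C = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C)
```

The triple loop performs n·m·p multiply-add operations regardless of how it is arranged; optimized implementations gain their speed by reordering and blocking these same operations to make better use of the processor's caches and vector units.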
Although BLAS was published well before the FOSS movement gained steam via the internet, it serves as an excellent example of what can be achieved by the use of open source, or at least by sharing a common programming interface. BLAS is ubiquitous: everyone uses it, and there are many competing implementations. When individual projects are interoperable, such as in the case of BLAS, the development of efficient programs is greatly hastened. Simply using an optimized BLAS library instead of the reference implementation can in many cases yield speedups of several orders of magnitude.
Unfortunately, interoperability is still hampered in the field of quantum chemistry since components are not truly interoperable due to the lack of common standards. The evaluation of two-electron integrals is a good example: it is the rate-determining step in conventional HF calculations, and several implementations of two-electron integrals have been published [185,189–191]. However, these implementations do not share a common interface. Instead, the interfaces tend to reflect the structure of earlier legacy codes that have a large number of differing conventions on the ordering, normalization, and signs of Gaussian basis functions, for instance. Despite some attempts [192,193], two-electron integrals libraries (or quantum chemistry programs, for that matter!) are still not interoperable.
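The kind of bookkeeping that differing conventions force on developers can be sketched as follows. The two orderings of the five spherical d functions below, labeled by their magnetic quantum number m, are given purely for illustration and are not attributed to any particular program; `reorder` is a hypothetical helper. Differences in normalization and sign would require additional per-function scale factors, which this sketch leaves out.

```python
# Two illustrative orderings of the five spherical d functions, labeled by m
ORDER_A = [-2, -1, 0, 1, 2]
ORDER_B = [0, 1, -1, 2, -2]


def reorder(coefficients, source, target):
    """Permute coefficients from the `source` ordering to the `target` one."""
    position = {m: index for index, m in enumerate(source)}
    return [coefficients[position[m]] for m in target]


# Coefficients listed in ORDER_A (arbitrary values)
coeffs_a = [0.1, 0.2, 0.3, 0.4, 0.5]
coeffs_b = reorder(coeffs_a, ORDER_A, ORDER_B)
print(coeffs_b)
```

With a common standard, conversion layers of this kind would be unnecessary; without one, every pair of programs that exchange orbitals or integrals needs its own translation of this type.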
3.5.4 | The move to increased modularity
The situation may, however, be slowly changing. Libxc [84] has already standardized density functional calculations in over 30 electronic structure programs; XCFun [194] is another implementation of density functional approximations like Libxc that
has also been adopted by many codes, several of which support both Libxc and XCFun. Other types of libraries are also following suit. There is a growing ecosystem of modular electronic structure libraries as recently discussed by Oliveira et al. [104] in the scope of solid state calculations. We will complement it with a brief overview of some modular open source projects that are used within several quantum chemistry programs below. The use of common implementations will hopefully lead to more interoperability between electronic structure programs also in other aspects.
Given the multitude of small libraries that are available, the listing in this subsection is likely far from complete;
however, its goal is merely to illustrate that there is more to FOSS than the self-contained packages listed above. Spe-
cialized projects like these eliminate redundant work and enable rapid implementation of new features in quantum
chemistry programs.
Polarization, embedding, and quantum chemical models are a good example of modular functionality, since the data structures needed to implement such models fit well in the modular design. Examples of such projects include:
adcc [195] is a toolkit for implementing algebraic-diagrammatic construction (ADC) methods.
CheMPS2 [196] is an implementation of the density matrix renormalization group method.
cppe [197] is an implementation of polarizable embedding.
DFT-D3 [198] and DFT-D4 [106] are implementations of semiempirical dispersion corrections for density functional calculations.
libefp [199] is an implementation of the effective fragment potential method.
Libxc [84] contains implementations of density functional approximations which have been generated with computer algebra.
PCMSolver [200] is an open-source library for the polarizable continuum model electrostatic problem.
XCFun [194] contains implementations of density functional approximations which employ automatic differentiation.
There are also several projects that specifically deal with Gaussian basis sets and that are thereby used by several
quantum chemistry codes.
The Basis Set Exchange [201] is a Python library for storing and managing Gaussian basis sets and converting basis sets between various program formats; the project also has a web interface at http://www.basissetexchange.org which will be more familiar to most readers.
erd [189] computes two-electron integrals with Rys quadrature.
libint [185] is a library for the evaluation of molecular integrals of many-body operators over Gaussian functions employing Obara–Saika recursion routines.
libcint [190] is an integral library for automatically implementing general integrals for Gaussian-type scalar and spinor basis functions using Rys quadrature.
simint [191] is a vectorized library for electron repulsion integrals employing Obara–Saika recursions.
libecpint [202] is a software library for evaluating effective core potential integrals.
3.5.5 | Visualization, manipulation, and analysis
The visualization, manipulation, and analysis tools discussed in this subsection are user-facing programs and are
thereby a more visible showcase of limited-scope projects than the lower-level libraries that were discussed in
Section 3.5.4. Indeed, simplified frontends are often invaluable for initializing, visualizing and analyzing calculations.
Several FOSS packages with graphical user interfaces are also available for this purpose; some even come with integration with FOSS electronic structure programs that allow running calculations within a graphical interface. For creating models and visualizing computational results, FOSS graphical user interfaces such as Jmol [203], Avogadro [204], IQmol [205], and PyMol [206] can be installed and used.
Unfortunately, the interoperability challenges mentioned in Section 3.5 affect visualization and analysis tools especially acutely, because these applications tend to require access to the electronic wave function, for which no universally accepted standard exists. This problem plagues the whole field of computational chemistry, affecting both FOSS and proprietary programs. In the absence of a universal standard, the interconversion of various input and output file formats between different programs can be carried out for example with the Open Babel [207] and cclib [208] packages.
The atomic simulation environment (ASE) [209] contains versatile tools for building molecular and periodic models and enables easy retrieval of molecular structures from structural databases such as PubChem [210]. It can also act as a frontend to several quantum chemical programs, thus offering a unified interface.
Calculations can be postprocessed with the Multiwfn [211] and ORBKIT [212] packages, for instance, which both support several file formats.
4 | ILLUSTRATIONS OF FEASIBLE COMPUTATIONS
To demonstrate the BYOD paradigm in computational chemistry education in practice, we now
illustrate the easy access to several powerful FOSS quantum chemistry packages in two widely used Linux distributions:
Fedora and Ubuntu. The Supporting Information contains practical step-by-step examples of running quantum chemical
calculations with FOSS packages on the students' own devices, that is, according to the BYOD-FOSS paradigm. Four program
packages are used in the practical illustrations: xtb (Section 4.1), NWChem (Section 4.2), Psi4 (Section 4.3), and
Quantum Espresso (Section 4.4). Installation instructions are provided for each code and all examples can be run under
Linux, macOS, or the Windows Subsystem for Linux. In all cases, the software can be installed in a matter of minutes
on a personal computer, either using a Linux distribution package manager or the Conda package manager. For conve-
nience, the Supporting Information is also available as a git repository.
213
4.1 | xtb
The primary design goal of xtb has been the fast calculation of structures and noncovalent interaction energies for
molecular systems with up to roughly 1000 atoms.
179,214
The GFNn-xTB methods implemented in xtb are semiempirical
quantum chemical methods
179
parametrized for the whole periodic table up to radon (Z=86). A highly attractive
feature of xtb is its performance: calculations on small molecules (10–20 atoms) finish in a matter of seconds even on a
low-performance laptop computer. The xtb program is also a powerful tool for the pre-optimization of geometries and molecular
conformations before computationally more demanding calculations; see Ref. [215] for a recent application to
water oxidation catalysis.
The Supporting Information includes step-by-step guidelines for installing xtb and using it to study structures, con-
formations, energetics, and molecular orbitals of inorganic and organic molecules. Calculations on pharmaceutically
relevant cisplatin and transplatin molecules shown in Figure 2 are briefly summarized here to showcase the basic use
of xtb. Cisplatin, cis-[Pt(NH3)2Cl2], is a chemotherapy medication used in cancer treatment, whereas its stereoisomer,
transplatin, trans-[Pt(NH3)2Cl2], is ineffective in cancer treatment.
The Pt(II) atom is square-planar coordinated in both cisplatin and transplatin. Which configuration, cis or trans, is
lower in energy? We use the xtb program to answer this question. The first task is to have initial geometries for the two
molecules. In general, initial geometries can be obtained from structural databases such as PubChem
210
; built in a
graphical user interface with programs such as Jmol, Avogadro, or IQmol; or built by hand in internal coordinates
(bond lengths, angles and dihedrals) in the Z-matrix formalism, for example. Hand-built molecular geometries for cis-
platin and transplatin are given in XYZ format in Figures 3 and 4, respectively. While these geometries should be suffi-
ciently close to optimal to allow for a straightforward optimization without difficulties, they are still quite rough in that
the total energy is expected to change by several millihartrees in the geometry optimization, corresponding to changes
in the energy of several kcal/mol.
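To make these energy scales concrete, the unit conversion behind the estimate can be sketched in a few lines of Python. The conversion factors are the standard values (1 hartree ≈ 627.5095 kcal/mol ≈ 2625.4996 kJ/mol); the function name and the example values of 1 and 5 millihartree are illustrative assumptions and are not taken from the calculations.

```python
# Standard conversion factors from hartree (Eh) to chemical energy units.
HARTREE_TO = {"kcal/mol": 627.5095, "kJ/mol": 2625.4996}


def millihartree_to(value_mEh, unit="kcal/mol"):
    """Convert an energy given in millihartree to kcal/mol or kJ/mol."""
    return value_mEh * 1.0e-3 * HARTREE_TO[unit]


# Illustrative values only: a change of a few millihartree in the total energy.
for mEh in (1.0, 5.0):
    print(f"{mEh:4.1f} mEh = {millihartree_to(mEh):6.3f} kcal/mol"
          f" = {millihartree_to(mEh, 'kJ/mol'):6.3f} kJ/mol")
```

A change of several millihartrees thus indeed corresponds to several kcal/mol, which is comparable to the energy differences between isomers and conformers discussed in this section.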
FIGURE 2 Cisplatin (left) and transplatin (right). Color coding: Pt = gray, Cl = green, N = blue, and H = white
The next step is to bring both molecules into a (local) minimum of the potential energy surface (PES) by optimizing the
geometries with xtb. The point groups of the initial geometries are approximately $C_{2v}$ and $C_{2h}$ for cisplatin and transplatin,
respectively, but symmetry is not enforced during the xtb optimizations. The only inputs needed by xtb in this case are the Cartesian
coordinates of both molecules in XYZ format, which were given in Figures 3 and 4 for cisplatin and transplatin, respectively.
The geometry optimizations complete in seconds even on a low-performance computer; the Supporting Information
contains all of the necessary inputs. For cisplatin, the optimized Pt–Cl and Pt–N distances are 2.24 and 2.15 Å, respectively.
Considering the relatively low level of theory, the obtained distances are in reasonable agreement with the Pt–Cl
and Pt–N distances of 2.25 and 2.06 Å, respectively, obtained with the much higher-level methods of Tasinato,
Puzzarini, and Barone
216
who employed coupled-cluster theory with full single and double substitutions and
perturbative triple substitutions, CCSD(T).
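Bond lengths such as those quoted above can be measured directly from an XYZ file. The following minimal sketch, in which all function and variable names are illustrative, reads the hand-built cisplatin geometry of Figure 3 and evaluates the Pt–Cl and Pt–N distances of the starting structure; the same functions can be applied to an optimized geometry.

```python
import math

# Hand-built cisplatin geometry from Figure 3 (XYZ format, angstrom).
CISPLATIN_XYZ = """11
cis-[Pt(NH3)2Cl2] (cisplatin); angstrom units
Pt 0.00000000 -0.00000000 -0.19134710
Cl 0.00000000 1.61220407 1.42085566
Cl 0.00000000 -1.61220407 1.42085566
N 0.00000000 1.40714181 -1.59849021
H 0.81649658 1.30951047 -2.16752575
H -0.81649658 1.30951047 -2.16752575
N 0.00000000 -1.40714181 -1.59849021
H -0.81649658 -1.30951047 -2.16752575
H 0.81649658 -1.30951047 -2.16752575
H 0.00000000 2.30951093 -1.16752621
H 0.00000000 -2.30951093 -1.16752621
"""


def read_xyz(text):
    """Parse XYZ-format text into a list of (symbol, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    natoms = int(lines[0])  # line 1: number of atoms; line 2: comment
    atoms = []
    for line in lines[2:2 + natoms]:
        symbol, x, y, z = line.split()
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def first_atom(atoms, symbol):
    """Return the coordinates of the first atom with the given symbol."""
    return next(xyz for sym, xyz in atoms if sym == symbol)


atoms = read_xyz(CISPLATIN_XYZ)
pt = first_atom(atoms, "Pt")
d_pt_cl = math.dist(pt, first_atom(atoms, "Cl"))
d_pt_n = math.dist(pt, first_atom(atoms, "N"))
print(f"Pt-Cl = {d_pt_cl:.2f} A, Pt-N = {d_pt_n:.2f} A")
```

For the starting structure this gives Pt–Cl and Pt–N distances of about 2.28 and 1.99 Å, so the optimization changes the bond lengths only by hundredths to tenths of an ångström.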
Comparing the total energies of the two stereoisomers after geometry optimization shows that the total energy of
transplatin is 20 kJ/mol lower, that is, more negative than that of cisplatin. This means that transplatin is the energetically
more favorable stereoisomer of diamminedichloroplatinum(II), [Pt(NH3)2Cl2]. For comparison, Liu and Franke
217
reported an energy difference of 56 kJ/mol with a much higher level of theory: relativistic CCSD(T) employing direct
perturbation theory, a 13s9p7d5f2g contracted Gaussian basis for Pt and aug-cc-pVQZ for other elements, evaluated on
top of molecular geometries optimized for the Becke'88–Perdew'86 functional.
218,219
The result from xtb, which we were
able to get in a matter of seconds, is in good qualitative (or even semiquantitative) agreement with the result obtained
with the high level of theory. Next, in Section 4.2, we will revisit cisplatin and transplatin with DFT calculations that
afford a step up in accuracy over xtb.
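The xtb workflow described in this section can also be scripted. The sketch below is a non-authoritative illustration that makes several assumptions: an `xtb` executable is available on the search path, the geometries of Figures 3 and 4 have been saved under the hypothetical file names `cisplatin.xyz` and `transplatin.xyz`, and the program output reports the final energy on a line of the form `| TOTAL ENERGY ... Eh |`. The annotated inputs in the Supporting Information remain the authoritative reference.

```python
import os
import re
import shutil
import subprocess

HARTREE_TO_KJ_PER_MOL = 2625.4996


def parse_total_energy(output):
    """Return the last total energy (hartree) reported in xtb output text.

    Assumes output lines of the form '| TOTAL ENERGY  -12.345678 Eh |'.
    """
    matches = re.findall(r"TOTAL ENERGY\s+(-?\d+\.\d+)\s+Eh", output)
    if not matches:
        raise ValueError("no total energy found in the output")
    return float(matches[-1])


def optimize_with_xtb(xyz_file):
    """Run an xtb geometry optimization and return the final total energy."""
    result = subprocess.run(["xtb", xyz_file, "--opt"],
                            capture_output=True, text=True, check=True)
    return parse_total_energy(result.stdout)


files = ("cisplatin.xyz", "transplatin.xyz")  # hypothetical file names
if shutil.which("xtb") and all(os.path.exists(name) for name in files):
    e_cis, e_trans = (optimize_with_xtb(name) for name in files)
    diff = (e_trans - e_cis) * HARTREE_TO_KJ_PER_MOL
    print(f"E(trans) - E(cis) = {diff:.1f} kJ/mol")
else:
    print("xtb or the input files were not found; nothing was run")
```

A negative energy difference printed by such a script corresponds to transplatin being the lower-energy stereoisomer.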
4.2 | NWChem
NWChem is a program that has been developed for almost 30 years. Consequently, a large number of features are avail-
able in the code: HF, DFT, as well as post-HF calculations, ab initio molecular dynamics, and so on. NWChem has been
11
cis-[Pt(NH3)2Cl2] (cisplatin); angstrom units
Pt 0.00000000 -0.00000000 -0.19134710
Cl 0.00000000 1.61220407 1.42085566
Cl 0.00000000 -1.61220407 1.42085566
N 0.00000000 1.40714181 -1.59849021
H 0.81649658 1.30951047 -2.16752575
H -0.81649658 1.30951047 -2.16752575
N 0.00000000 -1.40714181 -1.59849021
H -0.81649658 -1.30951047 -2.16752575
H 0.81649658 -1.30951047 -2.16752575
H 0.00000000 2.30951093 -1.16752621
H 0.00000000 -2.30951093 -1.16752621
FIGURE 3 Molecular geometry of cisplatin in XYZ format
11
trans-[Pt(NH3)2Cl2] (transplatin); angstrom units
Pt 0.00000000 0.00000000 0.00000000
Cl 2.27999997 -0.00036653 0.00000000
Cl -2.27999997 0.00036653 0.00000000
N -0.00031991 -1.98999997 0.00000000
H 0.46944690 -2.32340883 -0.81740913
H 0.46944690 -2.32340883 0.81740913
N 0.00031991 1.98999997 0.00000000
H -0.46944690 2.32340883 -0.81740913
H -0.46944690 2.32340883 0.81740913
H 0.94318252 2.32318174 0.00000000
H -0.94318252 -2.32318174 0.00000000
FIGURE 4 Molecular geometry of transplatin in XYZ format
designed to run on high-performance parallel supercomputers as well as on conventional workstations. The Supporting
Information includes step-by-step guidelines for installing NWChem and using it to study the same pharmaceutically
relevant cisplatin and transplatin molecules that were studied with xtb in Section 4.1.
We choose to use non-empirical DFT in the NWChem examples. Although NWChem also includes more accurate
ab initio methods such as coupled-cluster theories, we shall not consider them in this work since their proper use
requires much more understanding and computational power than DFT does, and as such methods are typically not
included in undergraduate level courses. We choose the non-empirical PBE0 hybrid functional
85,220,221
(sometimes also
known as hybrid PBE or PBEh) that provides reasonable geometries and energetics across the periodic table and shows
good performance for complexes with d- and f-metals.
222,223
Even though DFT is simpler than many post-HF theories, setting up adequate DFT calculations still requires some
considerations. The one-electron basis set is one of the most important aspects to consider in any electronic structure
calculation in general, such as our attempted PBE0 calculation with NWChem. The choice of the one-electron basis set
has an immense importance on the computational cost and accuracy of the resulting calculations. While the GFNn-xTB
methods discussed above in Section 4.1 did not require the specification of a basis set, as the basis set is already an
essential part of the specification of the GFNn-xTB methods themselves, the basis set, which parametrizes the allowed
degrees of freedom for the movement of the electrons, does need to be specified for HF, DFT and post-HF
calculations.
Because of the profound importance of the choice of the basis set, various types of Gaussian basis sets have a long
history in quantum chemistry.
133
Although many readers will be familiar with traditional basis sets like STO-3G,
224
3-21G,
225
and 6-31G*,
226
the development of computer processors and quantum chemical models in recent decades has
also led to significant advances in basis set design. Hundreds of Gaussian basis sets intended for various purposes are
nowadays available on the Basis Set Exchange,
201
for example.
Because the basis set is an approximation, it is highly desirable to be able to control its accuracy in order to make
tradeoffs between the cost of the calculation and the accuracy of the obtained results. Accordingly, modern basis sets
typically come in families of varying size
134,135
: the smallest sets enable quick but qualitative calculations, while the
larger sets enable quantitative computations at the cost of more computer time. In contrast to traditional basis sets,
modern basis set families allow for a cost-efficient approach to the complete basis set limit, at which point the error in
the one-electron basis set no longer affects the calculation. Note that basis sets other than Gaussians may also
be used for quantum chemistry; see Ref. [171] for further discussion.
In this work, we will only consider the Karlsruhe def2 family of Gaussian basis sets,
227
which are a good all-round
choice for general chemistry as they are available for the whole periodic table up to radon (Z=86). As radon is an ele-
ment of the 6th period, while relativistic effects are already essential for chemistry of the 5th row,
228,229
relativistic
effects are described in the def2 basis sets through the use of effective core potentials (ECPs).
230
The ECP is used to
describe the chemically inactive, deep-core electrons only implicitly; this also decreases the overall cost of the
calculation.
The Karlsruhe def2 sets come in three levels of accuracy. Split-valence (SV) basis sets are the smallest reasonable
basis set for general applications. The def2-SVP basis is a SV basis set with polarization (P) functions, and is similar in
size to the 6-31G** basis set, also known as 6-31G(d,p). Like 6-31G**, the def2-SVP set can also be used without
polarization functions on hydrogen atoms; the resulting basis is called def2-SV(P). It is smaller than the 6-31G* basis and is
often useful for quick qualitative/semi-quantitative calculations. For more quantitative calculations, the def2 series also
contains a triple-ζ valence polarization set (def2-TZVP) as well as a quadruple-ζ valence polarization set (def2-QZVP),
which typically suffice for achieving the complete basis set limit in HF and DFT calculations. Calculations at post-HF
levels of theory, however, require larger basis sets with additional polarization functions; the def2-TZVPP and
def2-QZVPP basis sets exist for this purpose. Diffuse functions (D) are necessary for the proper description of anions as
well as to model, for example, electric polarizabilities; sets are likewise available at all levels of accuracy (def2-SVPD,
def2-TZVPD, def2-TZVPPD, def2-QZVPD, and def2-QZVPPD) for this purpose.
231
For the present demonstration, we choose the def2-TZVP basis set, as triple-ζ basis sets are well-known to yield
energies that are sufficiently close to the complete basis set limit (see also the applications in Sections 4.3.1 and 4.3.2).
Although hybrid functionals are computationally more demanding than non-hybrid functionals, it is notable that the
dispersion-corrected hybrid PBE0-D4 generalized gradient approximation (GGA) functional was recently shown to outperform
the dispersion-corrected, meta-GGA-type non-hybrid r²SCAN-D4 functional in accuracy even for reaction
energies of metalorganic reactions.
232
Having completed our introduction to DFT calculations, basis sets, and NWChem, similarly to the workflow in the
case of xtb, the first task is to bring both molecules into a (local) minimum of the potential energy surface (PES) by
means of geometry optimization. The geometry optimization is started from the same hand-built initial geometries presented
in Section 4.1. In contrast to xtb, NWChem is capable of employing the point group symmetry ($C_{2v}$ and $C_{2h}$ for
cisplatin and transplatin, respectively) during the geometry optimization in order to speed up both the electronic structure
calculation and the geometry optimization, and will do so by default. This means that the calculation runs
faster, but also that the molecule is constrained to the same point group as the initial geometry during the whole opti-
mization. If the user is not careful, this may also be a bad thing, as the use of symmetry may sometimes lead to conver-
gence to a saddle point instead of a local minimum.
The input required for NWChem is more complicated than that for xtb. Running NWChem requires setting up an
input file that contains various computational parameters in addition to the input geometry. Fully annotated input files
can be found in the Supporting Information; a shortened example is shown in Figure 5.
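Since the NWChem inputs for the two stereoisomers differ only in the title and the coordinates, the input file can be generated from an XYZ geometry with a short script. The sketch below wraps the transplatin geometry of Figure 4 in the keywords of Figure 5; the template variable, the function name, and the indentation of the directives are illustrative choices, and the fully annotated inputs of the Supporting Information should be preferred in practice.

```python
# Keywords follow the shortened input of Figure 5; only the title, charge,
# multiplicity and atomic coordinates are filled in by the script.
NWCHEM_TEMPLATE = """title "{title}"
charge {charge}
geometry units angstroms autosym 0.1
{atoms}
end
dft
  xc pbe0
  mult {mult}
  iterations 100
end
basis spherical
  * library def2-tzvp
end
ecp
  Pt library def2-ecp
end
driver
  maxiter 100
  xyz
end
task dft optimize
"""

# Hand-built transplatin geometry from Figure 4 (XYZ format, angstrom).
TRANSPLATIN_XYZ = """11
trans-[Pt(NH3)2Cl2] (transplatin); angstrom units
Pt 0.00000000 0.00000000 0.00000000
Cl 2.27999997 -0.00036653 0.00000000
Cl -2.27999997 0.00036653 0.00000000
N -0.00031991 -1.98999997 0.00000000
H 0.46944690 -2.32340883 -0.81740913
H 0.46944690 -2.32340883 0.81740913
N 0.00031991 1.98999997 0.00000000
H -0.46944690 2.32340883 -0.81740913
H -0.46944690 2.32340883 0.81740913
H 0.94318252 2.32318174 0.00000000
H -0.94318252 -2.32318174 0.00000000
"""


def xyz_to_nwchem(xyz_text, title, charge=0, mult=1):
    """Wrap an XYZ geometry into a PBE0/def2-TZVP optimization input."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0])
    atoms = "\n".join("  " + line.strip() for line in lines[2:2 + natoms])
    return NWCHEM_TEMPLATE.format(title=title, charge=charge,
                                  mult=mult, atoms=atoms)


nwchem_input = xyz_to_nwchem(TRANSPLATIN_XYZ, "Transplatin")
print(nwchem_input)
```

The resulting text can be written to a file and passed to NWChem in the same way as the cisplatin input.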
The geometry optimizations of cisplatin and transplatin finish in a matter of minutes on one processor core,
depending on the computer used. The optimized Pt–Cl and Pt–N distances for cisplatin are 2.28 and 2.08 Å, respectively.
These values are in excellent agreement with the values of Tasinato, Puzzarini, and Barone
216
that were discussed
in Section 4.1, that is, Pt–Cl and Pt–N distances of 2.25 and 2.06 Å, respectively: the geometries agree to 0.03 Å.
Next, comparing the total PBE0/def2-TZVP energies of the two stereoisomers shows that transplatin is 54 kJ/mol
lower (more negative) than cisplatin. Our DFT value is in good quantitative agreement with the energy difference of
56 kJ/mol obtained by Liu and Franke
217
using a high-level CCSD(T) method; however, in contrast to their CCSD(T)
calculations, our DFT calculations can be performed in a matter of minutes even on a personal computer.
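The comparison with the high-level reference data can be summarized with a few lines of arithmetic. The sketch below collects the values quoted in Sections 4.1 and 4.2 and evaluates the deviations from the CCSD(T) references; the dictionary layout and variable names are illustrative.

```python
# Values quoted in Sections 4.1 and 4.2.
# Distances in angstrom; cis-trans energy differences in kJ/mol.
reference_distances = {"Pt-Cl": 2.25, "Pt-N": 2.06}  # CCSD(T), Ref. [216]
reference_energy = 56.0                              # CCSD(T), Ref. [217]

computed = {
    "xtb": {"Pt-Cl": 2.24, "Pt-N": 2.15, "dE": 20.0},
    "PBE0/def2-TZVP": {"Pt-Cl": 2.28, "Pt-N": 2.08, "dE": 54.0},
}

max_distance_error = {}
energy_error = {}
for method, values in computed.items():
    # Largest absolute deviation over the two bond lengths.
    max_distance_error[method] = max(
        abs(values[bond] - ref) for bond, ref in reference_distances.items())
    energy_error[method] = values["dE"] - reference_energy
    print(f"{method:15s} max |dr| = {max_distance_error[method]:.2f} A, "
          f"error in dE = {energy_error[method]:+.0f} kJ/mol")
```

The largest bond length deviation decreases from 0.09 Å for xtb to 0.03 Å for PBE0/def2-TZVP, while the error in the energy difference decreases from 36 to 2 kJ/mol.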
For cisplatin, we also write out the molecular orbitals after the geometry has been optimized. The molecular orbitals
obtained from the non-empirical PBE0/def2-TZVP calculations can now be compared with the ones from the semiempirical
xtb calculations from Section 4.1; see Figure 6. The frontier orbitals, namely the highest occupied molecular orbital
title "Cisplatin"
charge 0
geometry units angstroms autosym 0.1
Pt 0.00000000 -0.00000000 -0.19134710
Cl 0.00000000 1.61220407 1.42085566
Cl 0.00000000 -1.61220407 1.42085566
N 0.00000000 1.40714181 -1.59849021
H 0.81649658 1.30951047 -2.16752575
H -0.81649658 1.30951047 -2.16752575
N 0.00000000 -1.40714181 -1.59849021
H -0.81649658 -1.30951047 -2.16752575
H 0.81649658 -1.30951047 -2.16752575
H 0.00000000 2.30951093 -1.16752621
H 0.00000000 -2.30951093 -1.16752621
end
dft
xc pbe0
mult 1
iterations 100
end
basis spherical
* library def2-tzvp
end
ecp
Pt library def2-ecp
end
driver
maxiter 100
xyz
end
task dft optimize
FIGURE 5 NWChem example: PBE0/def2-TZVP geometry optimization of cisplatin; for transplatin, the nuclear coordinates given in
Figure 4 are used, instead
(HOMO) as well as the lowest unoccupied molecular orbital (LUMO), from the xtb and NWChem calculations are in
good agreement. Also HOMO-3, HOMO-2, and HOMO-1 appear similar; the HOMO-2 and HOMO-1 orbitals are merely
switched between the NWChem and xtb calculations. The energetic ordering of orbitals can easily switch when the
orbitals have similar energies; reorderings of the occupied orbitals have no effect on the properties of the system.
From the point of view of crystal field theory, the Pt(II) atom in cisplatin has a square planar coordination and eight
5d electrons. The four HOMOs and the LUMO all involve Pt 5d orbitals. In line with crystal field theory, both NWChem
and xtb show that the LUMO involves the Pt $5d_{x^2-y^2}$ orbital. HOMO-3 involves the Pt $5d_{z^2}$ orbital, while the $5d_{xy}$, $5d_{xz}$,
and $5d_{yz}$ orbitals contribute to HOMO-2, HOMO-1, and HOMO. As is clearly seen from the data presented above, the
non-empirical PBE0/def2-TZVP and the semiempirical GFN2-xTB level of theory provide a similar description of the
frontier orbitals of the Pt(II) complex. Again, the full inputs for the calculations are given in the Supporting
Information.
4.3 | Psi4
While NWChem represented older and more established quantum chemistry codes, Psi4 represents the newer genera-
tion of quantum chemistry codes. The origins of Psi4 trace to the Psi3 research code written in C++ for high-accuracy
studies on small molecules.
79
Compared with Psi3, Psi4 is designed to be a user-friendly, general-purpose code for fast,
automated computations on molecules with hundreds of atoms.
78
Psi4 contains a number of computational methods
ranging from HF and DFT to post-HF methods such as Møller–Plesset perturbation theory,
233
coupled-cluster theory,
234
configuration interaction theory, orbital-optimized correlation methods, symmetry-adapted perturbation theory,
multireference methods, and so on.
78
Although the core of the program is still in C++, Psi4 has thorough Python inter-
faces and can be used either as a traditional quantum chemistry program with input files, or directly from Python.
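As an illustration of the second mode of use, the following minimal sketch drives a geometry optimization through the Psi4 Python module. It assumes that the `psi4` module is installed (for example via Conda); the water geometry, the def2-SVP basis, and the function name are illustrative choices made to keep the example fast, and are not taken from the calculations of this work.

```python
# Rough water geometry (charge 0, singlet; angstrom); illustrative only.
WATER = """
0 1
O  0.000000  0.000000  0.117790
H  0.000000  0.755453 -0.471161
H  0.000000 -0.755453 -0.471161
"""


def optimize_pbe0(geometry, basis="def2-svp"):
    """Optimize a geometry at the PBE0 level and return (energy, molecule)."""
    import psi4  # imported here so that the script loads without Psi4

    molecule = psi4.geometry(geometry)
    psi4.set_options({"basis": basis})
    energy = psi4.optimize("pbe0", molecule=molecule)
    return energy, molecule


if __name__ == "__main__":
    try:
        energy, molecule = optimize_pbe0(WATER)
        print(f"Final PBE0 energy: {energy:.6f} Eh")
    except ImportError:
        print("Psi4 is not installed; skipping the calculation")
```

Working from Python makes it straightforward to loop over molecules, basis sets, or functionals, and to postprocess the results with standard scientific Python tools.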
We will demonstrate the use of Psi4 in the context of two common exercises in elementary courses on computa-
tional chemistry: a conformational study of methylcyclohexane and the reproduction of the molecular geometry of the
chromyl fluoride (CrO2F2) molecule with special consideration of the one-electron basis set. We will again focus on the
def2 family of basis sets that was introduced in Section 4.2.
4.3.1 | Methylcyclohexane
Starting out with the conformational study of methylcyclohexane, the workflow is as follows. First, the molecule is built
in a molecular editor such as Avogadro, IQmol, or Jmol, and the drawn molecular structure is preoptimized using a
force field available in the editor; the goal of the preoptimization is merely to ensure that the bond lengths are realistic
so that the electronic structure calculations during the geometry optimization converge without problems, and so that
the bonding pattern does not change.
FIGURE 6 The four highest occupied MOs (HOMOs) and the lowest unoccupied MO (LUMO) of cisplatin as obtained from NWChem
(PBE0/def2-TZVP) and xtb (GFN2-xTB). The color code for the nuclei is the same as in Figure 2, while red and blue denote positive and
negative orbital amplitudes, respectively (note that the overall sign of the orbital can be freely chosen). The isovalue used for the orbitals is
0.04 electrons/Bohr³
In the next step, the molecular structure is reoptimized with xtb, and a conformational search is carried out with
the Conformer-Rotamer Ensemble Sampling Tool (CREST) program, which has been shown to reproduce conformational
ensembles to good accuracy.
235–237
Again, the Supporting Information includes short tutorials for installing
and using the CREST code, which employs xtb to carry out conformational searches of molecules.
236
CREST finds four
conformers and outputs them in order of increasing energy.
The four conformers are then reoptimized in Psi4 using the PBE0/def2-TZVP
85,220,221,227
level of theory introduced
above in Section 4.2. Psi4 employs density fitting
238–242
by default; this means that the universal fitting basis for
Hartree–Fock calculations
243
is used in the calculation. The Psi4 input file for the first conformer is shown in Figure 7.
The inputs for the other molecules are analogous and shall not be repeated here; they are, however, available in the
Supporting Information.
molecule {
0 1
C -1.0139237009 0.0001157060 -0.3320119090
C -0.3010211074 1.2491572923 0.1879180723
C -0.3011951696 -1.2490517349 0.1878718396
C 1.1683390004 1.2516621049 -0.2233071254
C 1.1681695646 -1.2517772582 -0.2232981267
C 1.8703096243 -0.0000923733 0.2934985390
C -2.4834630882 0.0000222911 0.0795247173
H -0.9582190930 0.0002005854 -1.4269602139
H -0.3718670923 1.2740378936 1.2781840671
H -0.7951641526 2.1435985756 -0.1985907954
H -0.7954642203 -2.1434127996 -0.1986469736
H -0.3720420205 -1.2738839625 1.2781559690
H 1.6616052523 2.1443202680 0.1678151692
H 1.2391547021 1.2815104197 -1.3133695212
H 1.2390062002 -1.2817145988 -1.3133390411
H 1.6612233508 -2.1444918905 0.1679208818
H 2.9153982958 -0.0001245162 -0.0238784763
H 1.8521966765 -0.0001224730 1.3859698783
H -2.5743116471 0.0004900512 1.1639789401
H -2.9899376694 0.8819226593 -0.3066017637
H -2.9892458557 -0.8827595520 -0.3054682049
}
set basis def2-tzvp
optimize('pbe0')
FIGURE 7 Psi4 example: PBE0/def2-TZVP geometry optimization for the lowest-lying methylcyclohexane conformer
TABLE 1 Conformer energy differences $\Delta E_{\text{conformer } n} = E_{\text{conformer } n} - E_{\text{conformer } 1}$ in kcal/mol and number of basis functions $N_{\text{bf}}$ for the
methylcyclohexane conformers according to PBE0 calculations with various basis sets, evaluated at the PBE0/def2-TZVP optimized
geometries

Method                       N_bf   Conformer 2   Conformer 3   Conformer 4
PBE0/STO-3G                    49          1.19          5.54          5.78
PBE0/STO-6G                    49          1.25          5.57          5.84
PBE0/MINAO                     49          0.85          5.08          5.05
PBE0/def2-SV(P)               126          2.00          6.62          7.07
PBE0/def2-SVP                 168          1.97          6.57          7.01
PBE0/def2-TZVP                301          2.10          6.31          6.74
PBE0/def2-QZVP                819          2.11          6.31          6.73
GFN2-xTB (CREST geometry)                  1.51          5.32          5.36
Note: For comparison, the GFN2-xTB data from the CREST output is also included.
With the PBE0/def2-TZVP optimized geometries at hand for each of the four conformers, we perform single-point
calculations on each conformer in a variety of basis sets; the resulting energy differences to the lowest-energy con-
former (#1) are given in Table 1. In addition to the def2 family, we also have included data for the MINAO basis consisting
of the minimal-basis Hartree–Fock orbitals extracted from the triple-ζ cc-pVTZ basis set,
244
as well as the STO-3G
and STO-6G basis sets, which are 3-Gaussian and 6-Gaussian function expansions of a minimal-basis Slater-type
orbital (STO) basis set, respectively.
224
(It is important to note in this context that not all STO basis sets are minimal:
STO basis sets of various sizes ranging up to polarized quadruple-ζ have been reported
245,246
and remain widely used
for practical calculations in programs employing STO basis sets.)
The data in Table 1 leads us to the following insights. First, even the minimal basis sets successfully predict the
energy ordering of the conformers: although MINAO flips the order of conformers 3 and 4, it still predicts conformer
1 to be the lowest in energy. Note that this comparison is restricted to the use of fixed geometries; relaxing the geome-
tries in each basis might change the conclusion somewhat. The good performance of the minimal basis sets for this
application shows that conformational energies enjoy an excellent degree of error cancellation, which is one of the
main motivations for using atomic basis sets in the first place.
171
The shortcomings of minimal basis sets are showcased by the large differences between the results obtained with
the MINAO and STO-nG basis sets. Minimal basis sets are as small as possible and thereby have very little flexibility:
good accuracy for one type of system does not translate to good accuracy in another system, and minimal basis sets gen-
erally have poor predictive power for chemistry.
134,135
MINAO is derived from atomic calculations only, and is thereby
fully biased toward atoms, while the Slater-type orbital basis used by Hehre, Stewart, and Pople
224
is optimized for an
TABLE 2 Geometric parameters of chromyl fluoride (CrO2F2) at various levels of theory

Method           Basis         r(Cr–F) (Å)   r(Cr–O) (Å)   ∠(O–Cr–O) (°)   ∠(F–Cr–F) (°)
GFN1-xTB                       1.525         1.597         111.37          106.53
GFN2-xTB                       1.548         1.671         111.50          110.38
PW92             STO-3G        1.491         1.584         109.44          108.14
                 STO-6G        1.495         1.589         109.59          107.71
                 def2-SV(P)    1.548         1.684         108.41          110.80
                 def2-SVP      1.541         1.675         108.35          110.58
                 def2-TZVP     1.551         1.693         108.33          110.26
                 def2-QZVP     1.554         1.695         108.20          110.48
PBE              STO-3G        1.504         1.606         109.47          108.05
                 STO-6G        1.507         1.611         109.61          107.65
                 def2-SV(P)    1.565         1.713         108.41          110.75
                 def2-SVP      1.557         1.704         108.38          110.48
                 def2-TZVP     1.568         1.721         108.45          110.01
                 def2-QZVP     1.571         1.724         108.30          110.23
r²SCAN           STO-3G        1.497         1.602         109.98          106.94
                 STO-6G        1.500         1.605         110.26          106.22
                 def2-SV(P)    1.553         1.700         108.83          109.48
                 def2-SVP      1.545         1.692         108.77          109.25
                 def2-TZVP     1.554         1.706         108.89          108.80
                 def2-QZVP     1.556         1.708         108.76          108.96
Experiment (a)                 1.575         1.720         107.8           111.9
Experiment (b)                 1.55          1.71

(a) Experimental values from Ref. [255].
(b) Experimental values from Ref. [256].
average molecular environment, which is reflected in the slightly improved results in Table 1. However, this is only
achieved at the cost of a bias toward molecules, meaning that the STO-nG basis sets are not as good for isolated atoms.
It is generally preferable to use larger and more flexible basis sets in applications, which guarantee a uniform accu-
racy for all types of systems, and to try to converge the results to the complete basis set limit. This means controllably
removing the error made in the one-electron basis set approximation until the error becomes negligible either in abso-
lute value, or in comparison to the other sources of error in the calculation, such as the error inherent in the employed
density functional approximation, for example.
As discussed above, the smallest reasonable basis for general applications is def2-SV(P). It predicts
conformational energies roughly within 0.3 kcal/mol compared with the converged quadruple-ζ values, as can be
seen from Table 1. As shown by the comparison between the def2-SV(P) and def2-SVP data, the role of polarization
functions on hydrogen is small for the studied conformational energies.
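The sizes of the basis sets in Table 1 follow directly from their contraction patterns. The sketch below counts spherical basis functions for methylcyclohexane (C7H14), assuming the usual contraction patterns of the listed basis sets for H and C; the patterns are stated here as assumptions and can be checked against the Basis Set Exchange. All names in the code are illustrative.

```python
# Number of spherical functions in a shell of angular momentum l: 2l + 1.
SHELL_SIZE = {"s": 1, "p": 3, "d": 5, "f": 7, "g": 9}

# Contracted shells per element (assumed contraction patterns).
CONTRACTIONS = {
    "STO-3G": {"H": "1s", "C": "2s1p"},
    "def2-SV(P)": {"H": "2s", "C": "3s2p1d"},
    "def2-SVP": {"H": "2s1p", "C": "3s2p1d"},
    "def2-TZVP": {"H": "3s1p", "C": "5s3p2d1f"},
    "def2-QZVP": {"H": "4s3p2d1f", "C": "7s4p3d2f1g"},
}

METHYLCYCLOHEXANE = {"C": 7, "H": 14}


def functions_per_atom(pattern):
    """Count spherical basis functions for a pattern such as '5s3p2d1f'."""
    # The pattern alternates single-digit shell counts and shell letters.
    return sum(int(count) * SHELL_SIZE[shell]
               for count, shell in zip(pattern[0::2], pattern[1::2]))


n_bf = {}
for basis, patterns in CONTRACTIONS.items():
    n_bf[basis] = sum(number * functions_per_atom(patterns[element])
                      for element, number in METHYLCYCLOHEXANE.items())
    print(f"{basis:12s} N_bf = {n_bf[basis]}")
```

The counts reproduce the values of $N_{\text{bf}}$ in Table 1; in particular, def2-SVP differs from def2-SV(P) by one p shell (three functions) on each of the 14 hydrogen atoms, that is, by 42 functions.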
Systematically more converged energies are obtained by going to the triple-ζ def2-TZVP basis and the quadruple-ζ
def2-QZVP basis. The data show that already the triple-ζ calculations are converged to 0.01 kcal/mol in the conformer
energy differences, demonstrating the usefulness of modern, systematic basis set families: the complete basis set limit
can be reached simply by using larger and larger basis sets.
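The basis set convergence can be quantified directly from the data in Table 1. The following sketch evaluates, for each row of the table, the largest absolute deviation of the conformer energy differences from the PBE0/def2-QZVP reference values; the variable names are illustrative.

```python
# Conformer energy differences from Table 1 (kcal/mol; conformers 2, 3, 4).
table1 = {
    "STO-3G": (1.19, 5.54, 5.78),
    "STO-6G": (1.25, 5.57, 5.84),
    "MINAO": (0.85, 5.08, 5.05),
    "def2-SV(P)": (2.00, 6.62, 7.07),
    "def2-SVP": (1.97, 6.57, 7.01),
    "def2-TZVP": (2.10, 6.31, 6.74),
    "GFN2-xTB": (1.51, 5.32, 5.36),
}
reference = (2.11, 6.31, 6.73)  # PBE0/def2-QZVP

max_deviation = {}
for label, energies in table1.items():
    # Largest absolute deviation from the quadruple-zeta reference.
    max_deviation[label] = max(abs(e - ref)
                               for e, ref in zip(energies, reference))
    print(f"{label:12s} max |deviation| = {max_deviation[label]:.2f} kcal/mol")
```

The largest deviations are roughly 0.9 to 1.7 kcal/mol for the minimal basis sets, about 0.3 kcal/mol for the split-valence sets, and 0.01 kcal/mol for def2-TZVP.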
For comparison, Table 1 also includes data for the GFN2-xTB method from the CREST output.
214
A visual assess-
ment of the data confirms that GFN2-xTB correctly reproduces the energy ordering of the conformers, and that the con-
former energy differences are reproduced at an accuracy comparable to the minimal basis set calculations, with the
converged PBE0/def2-QZVP data as reference. This data emphatically suggests that historical applications of minimal
basis sets in quantum chemistry can be straightforwardly replaced with modern semiempirical calculations with xtb,
for instance, which have much lower computational cost.
Studying a single molecular geometry is in general insufficient, if the molecule has the potential for multiple low-
lying conformers. The data in Table 1 demonstrates the importance of proper conformational sampling in applications
to thermochemistry or chemical reactions, for instance: in the case of methylcyclohexane, insufficient conformational
sampling can cause errors of up to 7 kcal/mol, which may easily surpass the error arising from the level of theory or the
basis set.
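To give a feeling for what these energy differences mean, one can estimate the relative populations of the four conformers from Boltzmann factors. The sketch below is only a rough illustration under strong assumptions: the PBE0/def2-QZVP electronic energy differences of Table 1 are used in place of free energies, each conformer is counted once (degeneracies are ignored), and a temperature of 298.15 K is assumed.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol K)
TEMPERATURE = 298.15        # K (assumed)

# PBE0/def2-QZVP energy differences from Table 1 (kcal/mol), conformers 1-4.
delta_e = [0.0, 2.11, 6.31, 6.73]

# Boltzmann factors relative to the lowest-energy conformer.
weights = [math.exp(-de / (GAS_CONSTANT * TEMPERATURE)) for de in delta_e]
total = sum(weights)
populations = [w / total for w in weights]

for index, population in enumerate(populations, start=1):
    print(f"conformer {index}: {100.0 * population:8.4f} %")
```

Within these assumptions the lowest-energy conformer dominates the ensemble, which illustrates why starting a study from one of the high-lying conformers would give a misleading picture of the molecule.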
4.3.2 | Geometry of chromyl fluoride
For a somewhat more complicated example, we study the equilibrium geometry of chromyl fluoride (CrO2F2) at various
levels of DFT, which is known to be surprisingly accurate for simple transition metal complexes.
247
CrO2F2 assumes a
tetrahedral geometry. Again, the workflow is to build the molecule in a molecular editor, preoptimize the molecular
geometry with xtb, and then run the geometry optimizations in Psi4; however, now the optimization is done separately
for each basis set in contrast to the procedure used in Section 4.3.1.
For this study, we choose the GFN1-xTB
248
and GFN2-xTB
214
semiempirical methods as well as a set of non-empirical
density functionals: the Perdew–Wang 1992 (PW92) local density approximation (LDA),
150,249,250
the Perdew–Burke–Ernzerhof (PBE) GGA,
85
as well as the r²SCAN meta-GGA functional that represents the state of the art in
non-empirical density functionals.
251,252
The geometry optimizations are undertaken with very tight convergence
thresholds to ensure benchmark quality geometries.
Density fitting is again used in these calculations. As we only consider density functionals that do not contain exact
exchange in this application, smaller auxiliary basis sets optimized for reproducing only Coulomb interactions could be
employed
253
; however, for simplicity we stick to using the Psi4 default which is to use the larger auxiliary basis sets
243
that also work in the presence of exact exchange, such as the PBE0 functional used in Sections 4.2 and 4.3.1.
The results shown in Table 2 demonstrate that while the STO-nG minimal basis sets
224,254
yield relatively poor
geometries compared with the experimental values from Refs. [255,256], already the split-valence def2-SV(P) basis
set
227
leads to bond lengths that are converged to 0.03 Å and fractions of a degree in angles. The differences become
smaller, that is, the bond lengths and angles become more converged going to the larger basis sets, with the differences
between the def2-TZVP and def2-QZVP results being already negligible.
The bond lengths from the PBE/def2-QZVP calculations are in excellent agreement with the older experimental
values from Ref. [255]; the bond angles are in reasonable agreement with the experimental data from the same reference.
r²SCAN/def2-QZVP, in turn, is in excellent agreement with the newer experimental bond lengths from Ref. [256].
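The agreement with experiment can be checked with simple arithmetic on the def2-QZVP rows of Table 2. The sketch below computes the mean absolute deviation of the two bond lengths for each functional and each set of experimental values; the bond lengths are listed in the column order of Table 2, and the variable names are illustrative.

```python
# def2-QZVP bond lengths from Table 2 (angstrom), in the table's column order.
qzvp = {
    "PW92": (1.554, 1.695),
    "PBE": (1.571, 1.724),
    "r2SCAN": (1.556, 1.708),
}
experiment = {
    "Ref. [255]": (1.575, 1.720),
    "Ref. [256]": (1.55, 1.71),
}

best = {}
for source, exp_values in experiment.items():
    # Mean absolute deviation over the two bond lengths for each functional.
    mad = {functional: sum(abs(c - e) for c, e in zip(calc, exp_values)) / 2
           for functional, calc in qzvp.items()}
    best[source] = min(mad, key=mad.get)
    for functional, value in mad.items():
        print(f"{source}  {functional:7s} MAD = {value:.4f} A")
```

The smallest deviation with respect to Ref. [255] is obtained with PBE, whereas r²SCAN is closest to the values of Ref. [256]; in both cases the mean absolute deviation is 0.004 Å.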
4.4 | Quantum Espresso
Quantum Espresso (QE) is an integrated suite of FOSS codes for electronic structure calculations based on DFT, plane
waves, and pseudopotentials. The QE distribution consists of a set of core components and programs, a set of plug-ins
for more advanced tasks, and a number of third-party packages designed to be interoperable with the core components.
QE can be used to study the geometries, energetics, thermodynamics, electronic properties, response properties, spec-
troscopic properties, and transport properties of solid-state materials. The Supporting Information includes step-by-step
guidelines for installing QE and using it to study two polymorphs of zinc(II) sulfide, ZnS.
ZnS crystallizes in two principal forms, sphalerite and wurtzite (Figure 8). Sphalerite is a naturally occurring mineral
belonging to the cubic crystal system with space group $F\bar{4}3m$ (No. 216). Both Zn and S atoms are tetrahedrally
coordinated in the sphalerite structure and the crystal structure can be considered as a diamond lattice with two atom
types. Wurtzite is also a naturally occurring mineral and it can be considered as a hexagonal polymorph of sphalerite,
crystallizing in the space group $P6_3mc$ (No. 186). The coordination with nearest and next-nearest neighbors in wurtzite
is identical to that in sphalerite. The first structural differences between the two polymorphs arise only in the third shell
of neighbors.
257
From a thermodynamical point of view, sphalerite is the low-temperature ZnS polymorph in bulk form
FIGURE 8 Two polymorphs of ZnS: Sphalerite (left) and wurtzite (right). Zinc atoms in blue, sulfur atoms in yellow. For wurtzite, the
c-axis points upward
&CONTROL
calculation=’vc-relax’
prex=’zns’
/
&SYSTEM
space_group=216 ! Space group
a=5.4093 ! Lattice parameter a in angstroms
nat=2 ! Number of atoms in the asymmetric unit
ntyp=2 ! Number of atom types. Here, Zn and S.
ecutwfc=40 ! Kinetic energy cutoff for wavefunctions (Ry)
ecutrho=200 ! Kinetic energy cutoff for charge density and potential (Ry)
/
ATOMIC_SPECIES
Zn 65.38 zn_pbe_v1.uspp.F.UPF
S 32.065 s_pbe_v1.4.uspp.F.UPF
ATOMIC_POSITIONS crystal_sg
Zn 0.00000 0.00000 0.00000
S 0.25000 0.25000 0.25000
K_POINTS automatic
888 000
FIGURE 9 Quantum espresso example: Geometry optimization of sphalerite-ZnS with PBE functional and GBRV pseudopotentials.
Fully annotated input files can be found from the Supporting Information
LEHTOLA AND KARTTUNEN 23 of 33
and the transition temperature to wurtzite is 1293 ± 10 K.
258
Wurtzite-ZnS is thus metastable at room temperature, but
it is found in nature and can also be produced synthetically.
The illustrative QE calculations are carried out with the non-empirical PBE exchange-correlation functional [85]. To
run the calculations with QE, we need pseudopotentials that have been developed for this functional. Here we use the
ultrasoft Garrity–Bennett–Rabe–Vanderbilt (GBRV) pseudopotentials, which form a highly accurate and computationally
inexpensive open-source pseudopotential library that has been designed and optimized for use in high-throughput
DFT calculations [259]. The main attractive feature of the GBRV pseudopotentials is that they are tailored for relatively
small plane wave cutoffs of 40 Rydberg for wave functions and 200 Rydberg for the charge density and potential [259],
resulting in affordable computational costs.
To study sphalerite-ZnS and wurtzite-ZnS with QE, we need their crystal structures. A good source for crystal structure
data is the Crystallography Open Database (COD) [260], which is where we obtained the structures in the Crystallographic
Information File (CIF) format; the COD structures are available in the Supporting Information.
There are several ways in which the crystal structures can be entered in QE input files. In the example here, we have
directly used the crystallographic information to create an input file, which is shown in Figure 9; a helpful resource for
building QE input files is the QE input generator and structure visualizer provided by the Materials Cloud [261].
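When several cutoffs or k-point meshes need to be tested, it can be convenient to generate the input of Figure 9 from a script. The following Python sketch shows one way this could be done. The function name `make_pw_input` and its arguments are hypothetical, and the generated text only mirrors the keywords shown in Figure 9; a complete input may require further settings, for which the annotated files in the Supporting Information are the authoritative source.

```python
def make_pw_input(a=5.4093, ecutwfc=40, ecutrho=200, kmesh=(8, 8, 8), prefix="zns"):
    """Return input text mirroring Figure 9 (illustrative helper, not part of QE)."""
    k1, k2, k3 = kmesh
    lines = [
        "&CONTROL",
        "calculation='vc-relax'",
        f"prefix='{prefix}'",
        "/",
        "&SYSTEM",
        "space_group=216",          # sphalerite, F-43m
        f"a={a}",                   # lattice parameter in angstroms
        "nat=2",
        "ntyp=2",
        f"ecutwfc={ecutwfc}",       # wave function cutoff (Ry)
        f"ecutrho={ecutrho}",       # density and potential cutoff (Ry)
        "/",
        "ATOMIC_SPECIES",
        "Zn 65.38 zn_pbe_v1.uspp.F.UPF",
        "S 32.065 s_pbe_v1.4.uspp.F.UPF",
        "ATOMIC_POSITIONS crystal_sg",
        "Zn 0.00000 0.00000 0.00000",
        "S 0.25000 0.25000 0.25000",
        "K_POINTS automatic",
        f"{k1} {k2} {k3} 0 0 0",    # unshifted mesh
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print inputs for a small series of k-point meshes
    for n in (4, 6, 8):
        print(make_pw_input(kmesh=(n, n, n)))
```

Keeping the template in one function means that only the scanned parameter changes between runs, which makes convergence series less error-prone than editing files by hand.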
4.4.1 | Optimal geometry
Before attempting any calculations, it is important to determine how dense a sampling of the reciprocal space (k-sampling)
is needed to describe the materials sufficiently accurately. The convergence tests described in the Supporting
Information show that an 8 × 8 × 8 Monkhorst–Pack [262] k-point mesh leads to a truncation error smaller than 1 meV for
sphalerite-ZnS. A comparable k-point spacing is then also used for wurtzite-ZnS.
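One way to make a "comparable k-point spacing" concrete is to divide the length of each reciprocal lattice vector by the number of mesh points along it, and then to pick the mesh of the second structure so that its spacing is no coarser. The sketch below does this with the experimental lattice parameters quoted in this section. The helper names, the inclusion of the factor 2π in the reciprocal vectors, and the round-up rule are assumptions made for illustration; the mesh actually used for wurtzite-ZnS is documented in the Supporting Information.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def reciprocal_lengths(a1, a2, a3):
    """Lengths of b_i = 2*pi*(a_j x a_k)/V for the cell (a1, a2, a3)."""
    volume = abs(dot(a1, cross(a2, a3)))
    normals = (cross(a2, a3), cross(a3, a1), cross(a1, a2))
    return [2.0 * math.pi * math.sqrt(dot(n, n)) / volume for n in normals]


def mesh_for_spacing(lengths, spacing):
    """Smallest mesh whose spacing does not exceed `spacing`.

    The small tolerance guards against floating-point round-off.
    """
    return tuple(max(1, math.ceil(length / spacing - 1e-9)) for length in lengths)


# Sphalerite: primitive FCC cell, a in angstroms, 8 x 8 x 8 mesh
a = 5.4093
fcc_cell = ((0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0))
target_spacing = reciprocal_lengths(*fcc_cell)[0] / 8

# Wurtzite: hexagonal cell with a and c in angstroms
a_hex, c_hex = 3.811, 6.234
hex_cell = ((a_hex, 0.0, 0.0),
            (-a_hex / 2, a_hex * math.sqrt(3.0) / 2, 0.0),
            (0.0, 0.0, c_hex))
wurtzite_mesh = mesh_for_spacing(reciprocal_lengths(*hex_cell), target_spacing)

print(f"target spacing: {target_spacing:.4f} 1/angstrom")
print("wurtzite mesh:", wurtzite_mesh)
```

Rounding up guarantees that the second mesh is never coarser than the converged one, at the price of a slightly denser mesh when the ratio falls just above an integer.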
The geometry optimization of sphalerite-ZnS finishes in a few minutes, while that of wurtzite-ZnS may take tens of
minutes when run on a single processor core. The optimized lattice parameters are a = 5.447 Å for sphalerite-ZnS and
a = 3.846 Å and c = 6.304 Å for wurtzite-ZnS, whereas the experimental lattice parameters found on COD are a = 5.4093 Å
for sphalerite-ZnS and a = 3.811 Å and c = 6.234 Å for wurtzite-ZnS [260]. The computations thus overestimate
the experimental lattice parameters by approximately 1%, which constitutes good agreement.
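The deviation of approximately 1% follows directly from the lattice parameters quoted above, as the short calculation below illustrates. The dictionary layout and the function name are illustrative only.

```python
# (calculated, experimental) lattice parameters in angstroms, as quoted in the text
lattice = {
    "sphalerite a": (5.447, 5.4093),
    "wurtzite a": (3.846, 3.811),
    "wurtzite c": (6.304, 6.234),
}


def percent_deviation(calculated, experimental):
    """Signed relative deviation of the calculated value, in percent."""
    return 100.0 * (calculated - experimental) / experimental


deviations = {name: percent_deviation(c, e) for name, (c, e) in lattice.items()}
for name, deviation in deviations.items():
    print(f"{name}: {deviation:+.2f} %")
```

All three deviations are positive and lie between roughly 0.7% and 1.1%.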
FIGURE 10 Electronic band structure of sphalerite-ZnS obtained with PBE functional and GBRV pseudopotentials. Axes: energy
(eV, from -4 to 4) versus wave vector along the path Γ–X–W–K–Γ–L–U–W–L–K
The energy comparison of the optimized sphalerite-ZnS and wurtzite-ZnS structures shows that the total energies
differ by only 0.6 kJ/mol per formula unit. This value is in good agreement with Cardona et al. [263], who reported an
energy difference of less than 0.008 eV (0.8 kJ/mol) per formula unit from LDA and GGA calculations on ZnS polymorphs.
The energy difference is so small because the crystal structures are so similar: differences arise only in the
third-nearest neighbor shell, as was already mentioned above. Note that so far we have only compared electronic total
energies; Gibbs free energies should be considered instead for a full understanding of the thermodynamics, but this is
beyond the scope of this work.
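The unit conversion behind the comparison with the literature value can be checked in a few lines. The factor 1 eV ≈ 96.485 kJ/mol is the standard conversion for an energy per particle; the variable names are illustrative.

```python
EV_TO_KJ_PER_MOL = 96.485  # 1 eV per formula unit expressed in kJ/mol (approximate)

literature_bound_ev = 0.008   # upper bound on the energy difference from the literature
this_work_kj_per_mol = 0.6    # energy difference from the QE calculations

# Convert each value into the other unit system
literature_bound_kj = literature_bound_ev * EV_TO_KJ_PER_MOL
this_work_mev = 1000.0 * this_work_kj_per_mol / EV_TO_KJ_PER_MOL

print(f"literature bound: {literature_bound_kj:.2f} kJ/mol")
print(f"this work: {this_work_mev:.1f} meV")
```

The computed difference of about 6 meV per formula unit lies below the literature bound, while remaining larger than the k-point truncation error of 1 meV established for sphalerite-ZnS.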
4.4.2 | Band structure
The second practical example illustrates how the electronic band structure of sphalerite-ZnS can be calculated and
plotted with QE. In any band structure calculation, the band path in the reciprocal space has to be defined in terms of
k-points. The band path depends on the Bravais lattice of the crystal structure. An excellent source for band paths is
the SeeK-path service [264], which readily provides crystal-structure-based band paths for several program packages. Here,
we use the face-centered cubic (FCC) band path from Setyawan and Curtarolo [265], and the resulting electronic band
structure of sphalerite-ZnS is illustrated in Figure 10.
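To make the notion of a band path concrete, the sketch below lists the high-symmetry points of the FCC Brillouin zone in Cartesian coordinates (units of 2π/a) and formats them as a QE `K_POINTS tpiba_b` card, in which the integer on each line gives the number of points generated between that point and the next. The point order follows the axis labels of Figure 10; the choice of 20 points per segment and the helper name `kpoints_card` are assumptions made for illustration, and the input actually used for Figure 10 is in the Supporting Information.

```python
from fractions import Fraction as F

# FCC high-symmetry points in Cartesian coordinates, units of 2*pi/a ("G" is Gamma)
POINTS = {
    "G": (F(0), F(0), F(0)),
    "X": (F(0), F(1), F(0)),
    "W": (F(1, 2), F(1), F(0)),
    "K": (F(3, 4), F(3, 4), F(0)),
    "L": (F(1, 2), F(1, 2), F(1, 2)),
    "U": (F(1, 4), F(1), F(1, 4)),
}

# Order of the labels on the wave-vector axis of Figure 10
PATH = ["G", "X", "W", "K", "G", "L", "U", "W", "L", "K"]


def kpoints_card(path, points_per_segment=20):
    """Format a band path as a K_POINTS tpiba_b card (illustrative helper)."""
    lines = ["K_POINTS tpiba_b", str(len(path))]
    for index, label in enumerate(path):
        x, y, z = (float(c) for c in POINTS[label])
        # The last point closes the path, so no further segment starts from it
        weight = points_per_segment if index < len(path) - 1 else 1
        lines.append(f"{x:.4f} {y:.4f} {z:.4f} {weight}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(kpoints_card(PATH))
```

Cartesian coordinates are used here because they do not depend on how the primitive reciprocal lattice vectors are chosen, whereas fractional coordinates taken from one source may not match the primitive cell convention of a given program.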
From the band structure plot in Figure 10, we can see that sphalerite-ZnS has a direct band gap of about 2 eV at the
Γ point when using the PBE functional and the GBRV pseudopotentials. The band structure in Figure 10 is in good
agreement with the PBE band structure available in the Materials Project [112]. However, the PBE calculations severely
underestimate the experimental band gap measured at 10 K, which is about 3.8 eV [266]. The agreement with experiment
could be improved for example with the DFT+U approach or with hybrid density functionals, both of which are outside
the scope of this work.
5 | SUMMARY AND CONCLUSIONS
We have argued that FOSS allows for a BYOD approach to the teaching of computational chemistry, and finally affords
computational chemistry for the masses, thereby also democratizing the science of computational chemistry. The distributed
BYOD approach to computational chemistry also supports the delivery of massive open online courses (MOOCs), avoiding
the need to organize computing resources for a large number of students in a cost-effective and secure way. We have briefly
reviewed the current selection of FOSS programs for electronic structure calculations, and illustrated the installation and
practical use of several programs for computational chemistry education on personal computers. As the technical barriers
for running quantum chemical calculations on personal laptops have practically vanished, educators can focus on content
creation and developing practices for sharing and co-creating computational chemistry teaching material as Open Educational
Resources [267]. The Psi4Education project [5,268] is one such attempt at open teaching materials. We hope open materials
become more readily available and more thoroughly used in the future.
On a final note, we would like to point out that the free availability of FOSS operating system kernels, compilers,
debuggers as well as user-space tools, which have not been discussed in this review, have had a critical role in
enabling the development of the plethora of the FOSS projects discussed within this work, as well as our own work. We
would like to thank the entire FOSS community for providing high-quality tools for a variety of purposes, and invite
our readers to join the FOSS movement.
ACKNOWLEDGMENTS
We thank Paul Saxe and Jonathan Moussa for invaluable comments on an early stage of this manuscript. We also thank
all the anonymous peer reviewers of this manuscript for constructive criticisms which have similarly helped to improve
the structure and content of this paper. A. J. K. thanks Business Finland for Co-Innovation funding (Grant
No. 3767/31/2019).
CONFLICT OF INTEREST
The authors declare no conflict of interest.
AUTHOR CONTRIBUTIONS
Susi Lehtola: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal);
methodology (equal); resources (equal); software (equal); validation (equal); visualization (equal); writing – original draft
(equal); writing – review and editing (lead). Antti Karttunen: Conceptualization (equal); data curation (equal); formal
analysis (equal); investigation (equal); methodology (equal); resources (equal); software (equal); validation (equal);
visualization (equal); writing – original draft (equal); writing – review and editing (supporting).
DATA AVAILABILITY STATEMENT
Data available in article supplementary material. The data is also openly available in a public repository that does not
issue DOIs.
ORCID
Susi Lehtola https://orcid.org/0000-0001-6296-8103
Antti J. Karttunen https://orcid.org/0000-0003-4187-5447
RELATED WIRES ARTICLES
The Chronus Quantum software package
VeloxChem: A Python-driven density-functional theory program for spectroscopy simulations in high-performance
computing environments
Extended tight-binding quantum chemistry methods
REFERENCES
1. Westmoreland P. Applying molecular and materials modeling. 1st ed. Netherlands: Springer; 2002.
2. Head-Gordon M, Artacho E. Chemistry on the computer. Phys Today. 2008;61:5863.
3. Deglmann P, Schäfer A, Lennartz C. Application of quantum calculations in the chemical industry: an overview. Int J Quantum Chem.
2015;115:10736.
4. Weiß H, Deglmann P, In't Veld PJ, Cetinkaya M, Schreiner E. Multiscale materials modeling in an industrial environment. Annu Rev
Chem Biomol Eng. 2016;7:6586.
5. Fortenberry RC, McDonald AR, Shepherd TD, Kennedy M, Sherrill CD. PSI4Education: computational chemistry labs using free soft-
ware. In: Daus K, Rigsby R, editors. The promise of chemical education: addressing our Students' needs. Washington, DC: American
Chemical Society; 2015. p. 8598.
6. Grushow A, Reeves M. Using computational methods to teach chemical principles. Washington, DC: American Chemical Society; 2019.
7. Esselman BJ, Hill NJ. Integration of computational chemistry into the undergraduate organic chemistry laboratory curriculum. J Chem
Educ. 2016;93:9326.
8. Winfield LL, McCormack K, Shaw T. Using iSpartan to support a student-centered activity on alkane conformations. J Chem Educ.
2018;96:8992.
9. Esselman BJ, Hill NJ. Integrating computational chemistry into an organic chemistry laboratory curriculum using WebMO. Using com-
putational methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 13962.
10. Phillips JA. Modeling reaction energies and exploring noble gas chemistry in the physical chemistry laboratory. Using computational
methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 3350.
11. Reeves MS, Berghout HL, Perri MJ, Singleton SM, Whitnell RM. How can you measure a reaction enthalpy without going into the lab?
Using computational methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 5163.
12. Martini SR, Hartzell CJ. Integrating computational chemistry into a course in classical thermodynamics. J Chem Educ. 2015;92:12013.
13. Stocker KM. Using electronic structure calculations to investigate the kinetics of gas-phase ammonia synthesis. Using computational
methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 2132.
14. Snyder HD, Kucukkal TG. Computational chemistry activities with Avogadro and ORCA. J Chem Educ. 2021;98:133541.
15. Hoover GC, Dicks AP, Seferos DS. Upper-year materials chemistry computational modeling module for organic display technologies.
J Chem Educ. 2021;98:80511.
16. Furlan PY, Bell-Loncella ET. Integrating computation and visualization to enhance learning IR spectroscopy in the general chemistry
laboratory: computer-assisted learning of IR spectroscopy. Spectrosc Lett. 2010;43:61825.
17. Martin WR, Ball DW. Using computational chemistry to extend the acetylene rovibrational spectrum to C2T2. Using computational
methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 93–107.
18. DeVore TC. Introducing quantum calculations into the physical chemistry laboratory. Using computational methods to teach chemical
principles. Washington, DC: American Chemical Society. 2019. p. 10925.
19. JCE Staff. Computational chemistry for the masses. J Chem Educ. 1996;73:104.
20. Grushow A, Reeves MS. Using computational methods to teach chemical principles: overview. Using computational methods to teach
chemical principles. Washington, DC: American Chemical Society; 2019. https://pubs.acs.org/doi/abs/10.1021/bk-2019-1312.ch001
21. WebMO A web-based interface to computational chemistry packages [cited 2021 May 8]. Available from: https://www.webmo.net/
22. Polik WF, Schmidt JR. WebMO: web-based computational chemistry calculations in education and research. Wiley Interdiscip Rev
Comput Mol Sci. 2022. 12 (1):e1554. https://doi.org/10.1002/wcms.1554
23. Perri MJ, Akinmurele M, Haynie M. Chem Compute Science Gateway: an online computational chemistry tool. Using computational
methods to teach chemical principles. Washington, DC: American Chemical Society. 2019. p. 7992.
24. Kobayashi R, Goumans TPM, Carstensen NO, Soini TM, Marzari N, Timrov I, et al. Virtual computational chemistry teaching
laboratories – hands-on at a distance. J Chem Educ. 2021;98:3163–71.
25. Schwalbe S, Fiedler L, Kraus J, Kortus J, Trepte K, Lehtola S. PYFLOSIC: python-based Fermi–Löwdin orbital self-interaction
correction. J Chem Phys. 2020;153:084104.
26. Krylov AI, Herbert JM, Furche F, Head-Gordon M, Knowles PJ, Lindh R, et al. What is the price of open-source software? J Phys Chem
Lett. 2015;6:27514.
27. Jacob CR. How open is commercial scientific software? J Phys Chem Lett. 2016;7:3513.
28. Li L. Why should anyone become a scientist? The ideal of science and its importance. J Chem Educ. 1999;76:20.
29. Azoulay P, Fons-Rosen C, Zivin JSG. Does science advance one funeral at a time? Am Econ Rev. 2019;109:2889920.
30. Giles J. Software company bans competitive users. Nature. 2004;429:2311.
31. Smart AG. The war over supercooled water. Phys Today. 2018.
32. Palmer JC, Haji-Akbari A, Singh RS, Martelli F, Car R, Panagiotopoulos AZ, et al. Comment on "The putative liquid-liquid transition is
a liquid-solid transition in atomistic models of water" [I and II: J. Chem. Phys. 135, 134503 (2011); J. Chem. Phys. 138, 214504 (2013)].
J Chem Phys. 2018;148:137101.
33. Open Source Initiative. The open source definition [cited 2021 May 13]. Available from: https://opensource.org/osd
34. Free Software Foundation. What is free software? [cited 2021 May 13]. Available from: https://www.gnu.org/philosophy/free-sw.html.en
35. Stahl MT. Open-source software: not quite endsville. Drug Discov Today. 2005;10:21922.
36. Gezelter JD. Open source and open data should be standard practices. J Phys Chem Lett. 2015;6:11689.
37. Hinsen K. Computational science: shifting the focus from tools to models. F1000Research. 2014;3:101.
38. Git Community. Git, a free and open source distributed version control system [cited 2021 May 20]. Available from: https://git-scm.com/
39. GitHub, Inc. Github collaboration platform [cited 2021 May 20]. Available from: https://github.com/
40. GitLab, Inc. Gitlab collaboration platform [cited 2021 May 20]. Available from: https://gitlab.com/
41. European Organization For Nuclear Research and OpenAIRE. Zenodo; 2013.
42. Swarts J. Open-source software in the sciences: the challenge of user support. J Bus Tech Commun. 2018;33:6090.
43. Dalke A. The chemfp project. J Chem. 2019;11:76.
44. Haff G. How open source ate software. Berkeley, CA: Apress; 2018.
45. Kitware Inc. About Kitware [cited 2021 Jan 28]. Available from: https://www.kitware.com/about/
46. Ahrens J, Geveci B, Law C. ParaView: an end-user tool for large-data visualization. Visualization handbook. Oxford, UK: Elsevier;
2005. p. 71731.
47. McCormick M, Liu X, Jomier J, Marion C, Ibanez L. ITK: enabling reproducible research and open science. Front Neuroinform. 2014;8:13.
48. Hoffman B, Cole D, Vines J. Software process for rapid development of HPC software using CMake. In: 2009 DoD high performance
computing modernization program users group conference (IEEE); 2009.
49. Hanwell MD, Harris C, Genova A, Haghighatlari M, Khatib ME, Avery P, et al. Open chemistry, JupyterLab, REST, and quantum
chemistry. Int J Quantum Chem. 2021;121:e26472.
50. European Commission. Open science [cited 2021 Jan 28]. Available from: https://ec.europa.eu/info/research-and-innovation/strategy/strategy-2020-2024/our-digital-future/open-science
51. Wieber F, Pisanty A, Hocquet A. We were here before the web and hype: a brief history of and tribute to the computational
chemistry list. J Chem. 2018;10:67.
52. Constant D, Sproull L, Kiesler S. The kindness of strangers: the usefulness of electronic weak ties for technical advice. Organ Sci. 1996;
7:11935.
53. Lakhani KR, von Hippel E. How open source software works: "free" user-to-user assistance. Res Policy. 2003;32:923–43.
54. Schiff A. The economics of open source software: a survey of the early literature. Rev Netw Econ. 2002;1:6674.
55. Myatt DP. Equilibrium selection and public-good provision: the development of open-source software. Oxf Rev Econ Policy. 2002;18:
44661.
56. Johnson JP. Open source software: private provision of a public good. J Econ Manage Strategy. 2002;11:63762.
57. Mustonen M. Copyleft – the economics of Linux and other open source software. Inf Econ Policy. 2003;15:99–121.
58. Lerner J, Tirole J. Some simple economics of open source. J Ind Econ. 2003;50:197234.
59. Bonaccorsi A, Rossi C. Why open source software can succeed. Res Policy. 2003;32:124358.
60. Hawkins RE. The economics of open source software for a competitive firm. Netnomics. 2004;6:10317.
61. Bitzer J. Commercial versus open source software: the role of product heterogeneity in competition. Econ Syst. 2004;28:36981.
62. Lerner J, Tirole J. The economics of technology sharing: open source and beyond. J Econ Perspect. 2005;19:99120.
63. Lerner J. The scope of open source licensing. J Law Econ Organ. 2005;21:2056.
64. Bitzer J, Schröder PJH. Bug-fixing and code-writing: the private provision of open source software. Inf Econ Policy. 2005;17:389406.
65. West J, Gallagher S. Challenges of open innovation: the paradox of firm investment in open-source software. R&D Management. 2006;
36:31931.
66. Rossi MA. Decoding the free/open source software puzzle. In: Bitzer J, Schröder PJH, editors. The economics of open source software
development. Amsterdam, The Netherlands: Elsevier; 2006. p. 1555.
67. Gaudeul A. Do open source developers respond to competition? The (LA)TEX case study. Rev Netw Econ. 2007;6:23963.
68. von Krogh G, von Hippel E. The promise of research on open source software. Manage Sci. 2006;52:97583.
69. Hars A, Ou S. Working for free? Motivations for participating in open-source projects. Int J Electron Commer. 2002;6:2539.
70. Bitzer J, Schrettl W, Schröder PJH. Intrinsic motivation in open source software development. J Comp Econ. 2007;35:1609.
71. Lerner J, Pathak PA, Tirole J. The dynamics of open-source contributors. Am Econ Rev. 2006;96:1148.
72. Fershtman C, Gandal N. Open source software: motivation and restrictive licensing. Int Econ Econ Policy. 2007;4:20925.
73. Johnson JP. Collaboration, peer review and open source software. Inf Econ Policy. 2006;18:47797.
74. Bitzer J, Schröder PJH. The impact of entry and competition by open source software on innovation activity. In: Bitzer J, Schröder PJH,
editors. The economics of open source software development. Amsterdam, The Netherlands: Elsevier; 2006. p. 21946.
75. top500.org. Top500 operating system statistics [cited 2021 July 6]. Available from: https://www.top500.org/statistics/details/osfam/1/
76. Moore JF, McCann MP. Linux and the chemist. J Chem Educ. 2003;80:219.
77. Lehtola J, Hakala M, Sakko A, Hämäläinen K. ERKALE: a flexible program package for X-ray properties of atoms and molecules.
J Comput Chem. 2012;33:157285.
78. Smith DGA, Burns LA, Simmonett AC, Parrish RM, Schieber MC, Galvelis R, et al. PSI4 1.4: open-source software for high-throughput
quantum chemistry. J Chem Phys. 2020;152:184108.
79. Crawford TD, Sherrill CD, Valeev EF, Fermann JT, King RA, Leininger ML, et al. PSI3: an open-source ab initio electronic structure
package. J Comput Chem. 2007;28:16106.
80. Sun Q, Zhang X, Banerjee S, Bao P, Barbry M, Blunt NS, et al. Recent developments in the PYSCF program package. J Chem Phys.
2020;153:024109.
81. Aquilante F, Autschbach J, Baiardi A, Battaglia S, Borin VA, Chibotaru LF, et al. Modern quantum chemistry with [open]Molcas.
J Chem Phys. 2020;152:214117.
82. Olsen JMH, Reine S, Vahtras O, Kjellgren E, Reinholdt P, Hjorth Dundas KO, et al. Dalton project: a python platform for molecular-
and electronic-structure simulations of complex systems. J Chem Phys. 2020;152:214115.
83. Aprà E, Bylaska EJ, de Jong WA, Govind N, Kowalski K, Straatsma TP, et al. NWChem: past, present, and future. J Chem Phys. 2020;
152:184102.
84. Lehtola S, Steigemann C, Oliveira MJT, Marques MAL. Recent developments in LIBXC: a comprehensive library of functionals for
density functional theory. SoftwareX. 2018;7:1–5.
85. Perdew JP, Burke K, Ernzerhof M. Generalized gradient approximation made simple. Phys Rev Lett. 1996;77:38658.
86. Stephens PJ, Devlin FJ, Chabalowski CF, Frisch MJ. Ab initio calculation of vibrational absorption and circular dichroism spectra using
density functional force fields. J Phys Chem. 1994;98:116237.
87. Sun J, Ruzsinszky A, Perdew J. Strongly constrained and appropriately normed semilocal density functional. Phys Rev Lett. 2015;115:
036402.
88. Romero AH, Allan DC, Amadon B, Antonius G, Applencourt T, Baguet L, et al. ABINIT: overview and focus on selected capabilities.
J Chem Phys. 2020;152:124102.
89. Andrade X, Pemmaraju CD, Kartsev A, Xiao J, Lindenberg A, Rajpurohit S, et al. Inq, a modern GPU-accelerated computational
framework for (time-dependent) density functional theory. J Chem Theory Comput. 2021;17:7447–67.
90. Giannozzi P, Baseggio O, Bonfà P, Brunato D, Car R, Carnimeo I, et al. QUANTUM ESPRESSO toward the exascale. J Chem Phys.
2020;152:154105.
91. Lehtola S. Fully numerical Hartree–Fock and density functional calculations. II. Diatomic molecules. Int J Quantum Chem. 2019;119:
e25944.
92. Lehtola S. Fully numerical Hartree–Fock and density functional calculations. I. Atoms. Int J Quantum Chem. 2019;119:e25945.
93. Lehtola S, Dimitrova M, Sundholm D. Fully numerical electronic structure calculations on diatomic molecules in weak to strong
magnetic fields. Mol Phys. 2020;118:e1597989.
94. Lehtola S. Fully numerical calculations on atoms with fractional occupations and range-separated exchange functionals. Phys Rev A.
2020;101:012516.
95. Motamarri P, Das S, Rudraraju S, Ghosh K, Davydov D, Gavini V. DFT-FE: a massively parallel adaptive finite-element code for
large-scale density functional theory calculations. Comput Phys Commun. 2020;246:106853.
96. te Velde G, Bickelhaupt FM, Baerends EJ, Fonseca Guerra C, van Gisbergen SJA, Snijders JG, et al. Chemistry with ADF. J Comput
Chem. 2001;22:93167.
97. Barca GMJ, Bertoni C, Carrington L, Datta D, De Silva N, Deustua JE, et al. Recent developments in the general atomic and molecular
electronic structure system. J Chem Phys. 2020;152:154102.
98. Werner H-J, Knowles PJ, Manby FR, Black JA, Doll K, Heßelmann A, et al. The Molpro quantum chemistry package. J Chem Phys.
2020;152:144107.
99. Kállay M, Nagy PR, Mester D, Rolik Z, Samu G, Csontos J, et al. The MRCC program system: accurate quantum chemistry from water
to proteins. J Chem Phys. 2020;152:074107.
100. Neese F, Wennmohs F, Becker U, Riplinger C. The ORCA quantum chemistry program package. J Chem Phys. 2020;152:224108.
101. Balasubramani SG, Chen GP, Coriani S, Diedenhofen M, Frank MS, Franzke YJ, et al. TURBOMOLE: modular program suite for
ab initio quantum-chemical and condensed-matter simulations. J Chem Phys. 2020;152:184107.
102. Lejaeghere K, Bihlmayer G, Björkman T, Blaha P, Blügel S, Blum V, et al. Reproducibility in density functional theory calculations of
solids. Science. 2016;351:aad3000.
103. Ajila SA, Wu D. Empirical study of the effects of open source adoption on software development economics. J Syst Softw. 2007;80:151729.
104. Oliveira MJT, Papior N, Pouillon Y, Blum V, Artacho E, Caliste D, et al. The CECAM electronic structure library and the modular
software development paradigm. J Chem Phys. 2020;153:024117.
105. Caldeweyher E, Bannwarth C, Grimme S. Extension of the D3 dispersion coefficient model. J Chem Phys. 2017;147:034112.
106. Caldeweyher E, Ehlert S, Hansen A, Neugebauer H, Spicher S, Bannwarth C, et al. A generally applicable atomic-charge dependent
London dispersion correction. J Chem Phys. 2019;150:154122.
107. Caldeweyher E, Mewes J-M, Ehlert S, Grimme S. Extension and evaluation of the D4 London-dispersion model for periodic systems.
Phys Chem Chem Phys. 2020;22:8499512.
108. DeLano WL. The case for open-source software in drug discovery. Drug Discov Today. 2005;10:2137.
109. Smith DGA, Burns LA, Sirianni DA, Nascimento DR, Kumar A, James AM, et al. PSI4NumPy: an interactive quantum chemistry
programming environment for reference implementations and rapid development. J Chem Theory Comput. 2018;14:3504–11.
110. Herbst MF, Levitt A, Cancès E. DFTK: a Julian approach for simulating electrons in solids. JuliaCon Proc. 2021;3:69.
111. Lehtola S, Blockhuys F, Van Alsenoy C. An overview of self-consistent field calculations within finite basis sets. Molecules. 2020;25:
1218.
112. Jain A, Ong SP, Hautier G, Chen W, Richards WD, Dacek S, et al. Commentary: the materials project: a materials genome approach to
accelerating materials innovation. APL Mater. 2013;1:011002.
113. Talirz L, Kumbhar S, Passaro E, Yakutovich AV, Granata V, Gargiulo F, et al. Materials cloud, a platform for open computational sci-
ence. Sci Data. 2020;7:299.
114. Huber SP, Zoupanos S, Uhrin M, Talirz L, Kahle L, Häuselmann R, et al. AiiDA 1.0, a scalable computational infrastructure for
automated reproducible workflows and data provenance. Sci Data. 2020;7:300.
115. Gjerding M, Skovhus T, Rasmussen A, Bertoldo F, Larsen AH, Mortensen JJ, et al. Atomic simulation recipes: a python framework and
library for automated workflows. Comput Mater Sci. 2021;199:110731.
116. Smith DGA, Lolinco AT, Glick ZL, Lee J, Alenaizan A, Barnes TA, et al. Quantum chemistry common driver and databases (QCDB)
and quantum chemistry engine (QCEngine): automation and interoperability among computational chemistry programs. J Chem Phys.
2021;155:204801.
117. Samsonidze G, Kozinsky B. Half-heusler compounds for use in thermoelectric generators. US Patent 20170141282; May 2017.
118. Ye J-H, Huang C-L. Method for crystallizing metal oxide semiconductor layer, semiconductor structure, active array substrate, and
indium gallium zinc oxide crystal. US Patent 20180166474; June 2018.
119. Strohmaier E, Meuer HW, Dongarra J, Simon HD. The TOP500 list and progress in high-performance computing. Computer. 2015;48:42–9.
120. Meuer HW, Strohmaier E, Dongarra J, Simon H, Meuer M. Top500 [cited 2021 May 20]. Available from: https://top500.org/
121. Szabo A, Ostlund NS. Modern quantum chemistry: introduction to advanced electronic structure theory. Dover Pubns; 1996.
122. Geldenhuys WJ, Gaasch KE, Watson M, Allen DD, der Schyf CJV. Optimizing the use of open-source software applications in drug
discovery. Drug Discov Today. 2006;11:127–32.
123. Pirhadi S, Sunseri J, Koes DR. Open source molecular modeling. J Mol Graph Model. 2016;69:12743.
124. Rodríguez-Becerra J, Cáceres-Jensen L, Díaz T, Druker S, Padilla VB, Pernaa J, et al. Developing technological pedagogical science
knowledge through educational computational chemistry: a case study of pre-service chemistry teachers' perceptions. Chem Educ Res
Pract. 2020;21:638–54.
125. Talirz L, Ghiringhelli LM, Smit B. Trends in atomistic simulation software usage. J Comp Mol Sci. 2021;3 (1):1483.
126. Python package index – PyPI [cited 2021 July 7]. Available from: https://pypi.org/
127. Continuum Analytics. Conda package manager [cited 2021 May 26]. Available from: https://conda.io/
128. Hohenberg P, Kohn W. Inhomogeneous electron gas. Phys Rev. 1964;136:B86471.
129. Kohn W, Sham LJ. Self-consistent equations including exchange and correlation effects. Phys Rev. 1965;140:A11338.
130. Boys SF. Electronic wave functions. I. A general method of calculation for the stationary states of any molecular system. Proc R Soc
Lond Ser A Math Phys Eng Sci. 1950;200:54254.
131. McMurchie LE, Davidson ER. One- and two-electron integrals over cartesian Gaussian functions. J Comput Phys. 1978;26:21831.
132. Obara S, Saika A. Efficient recursive computation of molecular integrals over cartesian Gaussian functions. J Chem Phys. 1986;84:3963.
133. Davidson ER, Feller D. Basis set selection for molecular calculations. Chem Rev. 1986;86:68196.
134. Hill JG. Gaussian basis sets for molecular applications. Int J Quantum Chem. 2013;113:2134.
135. Jensen F. Atomic orbital basis sets. Wiley Interdiscip Rev Comput Mol Sci. 2013;3:27395.
136. Shiozaki T. BAGEL: brilliantly advanced general electronic-structure library. Wiley Interdiscip Rev Comput Mol Sci. 2018;8:e1331.
137. Williams-Young DB, Petrone A, Sun S, Stetina TF, Lestrange P, Hoyer CE, et al. The Chronus quantum software package. Wiley
Interdiscip Rev Comput Mol Sci. 2020;10:e1436.
138. Aidas K, Angeli C, Bak KL, Bakken V, Bast R, Boman L, et al. The Dalton quantum chemistry program system. Wiley Interdiscip Rev
Comput Mol Sci. 2014;4:26984.
139. Rudberg E, Rubensson EH, Sałek P, Kruchinina A. Ergo: an open-source program for linear-scaling electronic structure calculations.
SoftwareX. 2018;7:10711.
140. Folkestad SD, Kjønstad EF, Myhre RH, Andersen JH, Balbi A, Coriani S, et al. eT 1.0: an open source electronic structure program with
emphasis on coupled cluster and multilevel methods. J Chem Phys. 2020;152:184103.
141. Aroeira GJR, Davis MM, Turney JM, Schaefer HF. Fermi.jl: a modern design for quantum chemistry. J Chem Theory Comput. 2022. 18
(2):677686. https://doi.org/10.1021/acs.jctc.1c00719
142. Poole D, Vallejo JLG, Gordon MS. A new kid on the block: application of Julia to Hartree–Fock calculations. J Chem Theory Comput.
2020;16:5006–13.
143. Bruneval F, Rangel T, Hamed SM, Shao M, Yang C, Neaton JB. MOLGW 1: many-body perturbation theory software for atoms,
molecules, and clusters. Comput Phys Commun. 2016;208:149–61.
144. Peng C, Lewis CA, Wang X, Clement MC, Pierce K, Rishi V, et al. Massively parallel quantum chemistry: a high-performance research
platform for electronic structure. J Chem Phys. 2020;153:044120.
145. Mueller RP. PyQuante: Python quantum chemistry [cited 2021 July 6]. Available from: http://pyquante.sourceforge.net/
146. Unsleber JP, Dresselhaus T, Klahr K, Schnieders D, Böckers M, Barton D, et al. Serenity: a subsystem quantum chemistry program.
J Comput Chem. 2018;39:78898.
147. Kjellgren E. SlowQuant [cited 2021 July 6]. Available from: https://github.com/erikkjellgren/SlowQuant
148. Rinkevicius Z, Li X, Vahtras O, Ahmadzadeh K, Brand M, Ringholm M, et al. VeloxChem: a python-driven density-functional theory
program for spectroscopy simulations in high-performance computing environments. Wiley Interdiscip Rev Comput Mol Sci. 2019;10:e1457.
149. Souvatzis P. Uquantchem: a versatile and easy to use quantum chemistry computational software. Comput Phys Commun. 2014;185:
41521.
150. Bloch F. Bemerkung zur Elektronentheorie des Ferromagnetismus und der elektrischen Leitfähigkeit. Z Phys. 1929;57:54555.
151. Kratzer P, Neugebauer J. The basics of electronic structure theory for periodic systems. Front Chem. 2019;7:118.
152. Schwerdtfeger P. The pseudopotential approximation in electronic structure theory. ChemPhysChem. 2011;12:314355.
153. Kang S, Woo J, Kim J, Kim H, Kim Y, Lim J, et al. ACE-molecule: an open-source real-space quantum chemistry package. J Chem
Phys. 2020;152:124110.
154. Ratcliff LE, Dawson W, Fisicaro G, Caliste D, Mohr S, Degomme A, et al. Flexibilities of wavelets as a computational basis set for large-
scale electronic structure calculations. J Chem Phys. 2020;152:194110.
155. Nakata A, Baker JS, Mujahed SY, Poulton JTL, Arapan S, Lin J, et al. Large scale and linear scaling DFT with the CONQUEST code.
J Chem Phys. 2020;152:164112.
156. Kühne TD, Iannuzzi M, Del Ben M, Rybkin VV, Seewald P, Stein F, et al. CP2K: an electronic structure and molecular dynamics soft-
ware package - quickstep: efficient and accurate electronic structure calculations. J Chem Phys. 2020;152:194103.
157. The Elk Code. Available from: http://elk.sourceforge.net/
158. Gulans A, Kontur S, Meisenbichler C, Nabok D, Pavone P, Rigamonti S, et al. Exciting: a full-potential all-electron package
implementing density-functional theory and many-body perturbation theory. J Phys Condens Matter. 2014;26:363202.
159. FLEUR. Available from: http://www.flapw.de
160. Enkovaara J, Rostgaard C, Mortensen JJ, Chen J, Dułak M, Ferrighi L, et al. Electronic structure calculations with GPAW: a real-space
implementation of the projector augmented-wave method. J Phys Condens Matter. 2010;22:253202.
161. Sundararaman R, Letchworth-Weaver K, Schwarz KA, Gunceler D, Ozhabes Y, Arias TA. JDFTx: software for joint density-functional
theory. SoftwareX. 2017;6:27884.
162. Xu Q, Sharma A, Suryanarayana P. M-SPARC: Matlab-simulation package for ab-initio real-space calculations. SoftwareX. 2020;11:
100423.
163. Tancogne-Dejean N, Oliveira MJT, Andrade X, Appel H, Borca CH, Le Breton G, et al. Octopus, a computational framework for explor-
ing light-driven phenomena and quantum dynamics in extended and finite systems. J Chem Phys. 2020;152:124119.
164. Ozaki T, Kino H. Numerical atomic basis orbitals from H to Kr. Phys Rev B. 2004;69:195113.
165. Saad Y, Chelikowsky JR, Shontz SM. Numerical methods for electronic structure calculations of materials. SIAM Rev. 2010;52:354.
166. Fathurrahman F, Agusta MK, Saputro AG, Dipojono HK. PWDFT.Jl: a Julia package for electronic structure calculation using density
functional theory and plane wave basis. Comput Phys Commun. 2020;256:107372.
167. Briggs EL, Sullivan DJ, Bernholc J. Real-space multigrid-based approach to large-scale electronic structure calculations. Phys Rev B.
1996;54:1436275.
168. García A, Papior N, Akhtar A, Artacho E, Blum V, Bosoni E, et al. SIESTA: recent developments and applications. J Chem Phys. 2020;
152:204108.
169. Gygi F. Architecture of Qbox: a scalable first-principles molecular dynamics code. IBM J Res Dev. 2008;52:13744.
170. Xu Q, Sharma A, Comer B, Huang H, Chow E, Medford AJ, et al. SPARC: simulation package for ab-initio real-space calculations.
SoftwareX. 2021;15:100709.
171. Lehtola S. A review on non-relativistic, fully numerical electronic structure calculations on atoms and diatomic molecules. Int J Quan-
tum Chem. 2019;119:e25968.
172. Jensen SR, Flå T, Jonsson D, Monstad RS, Ruud K, Frediani L. Magnetic properties with multiwavelets and DFT: the complete basis
set limit achieved. Phys Chem Chem Phys. 2016;18:2114561.
173. Harrison RJ, Beylkin G, Bischoff FA, Calvin JA, Fann GI, Fosso-Tande J, et al. MADNESS: A multiresolution, adaptive numerical envi-
ronment for scientific simulation. SIAM J Sci Comput. 2016;38:S12342.
174. Kobus J. A finite difference HartreeFock program for atoms and diatomic molecules. Comput Phys Commun. 2013;184:799811.
175. Koskinen P, Mäkinen V. Density-functional tight-binding for beginners. Comput Mater Sci. 2009;47:23753.
176. Seifert G, Joswig J-O. Density-functional tight bindingan approximate density-functional theory method. Wiley Interdiscip Rev:
Comput Mol Sci. 2012;2:45665.
30 of 33 LEHTOLA AND KARTTUNEN
177. Gaus M, Cui Q, Elstner M. Density functional tight binding: application to organic and biological molecules. Wiley Interdiscip Rev:
Comput Mol Sci. 2013;4:4961.
178. Thiel W. Semiempirical quantum-chemical methods. Wiley Interdiscip Rev Comput Mol Sci. 2014;4:14557.
179. Bannwarth C, Caldeweyher E, Ehlert S, Hansen A, Pracht P, Seibert J, et al. Extended tight-binding quantum chemistry methods.
Wiley Interdiscip Rev Comput Mol Sci. 2021;11:e1493.
180. Hourahine B, Aradi B, Blum V, Bonafé F, Buccheri A, Camacho C, et al. DFTB+, a software package for efficient approximate density
functional theory based atomistic simulations. J Chem Phys. 2020;152:124101.
181. Bock N, Cawkwell MJ, Coe JD, Krishnapriyan A, Kroonblawd MP, Lang A, Liu C, Saez EM, Mniszewski SM, Negre CFA,
Niklasson AMN, Sanville E, Wood MA, Yang P, Latte [cited 2021 July 12]. Available from: https://github.com/lanl/LATTE.
182. Husch T, Reiher M. Comprehensive analysis of the neglect of diatomic differential overlap approximation. J Chem Theory Comput.
2018;14:516979.
183. Cabezas I, Segovia R, Caratozzolo P & Webb E Using software engineering design principles as tools for freshman students learning.
In: 2020 IEEE Frontiers in education conference (FIE) (IEEE); 2020).
184. Lam P, Dietrich J & Pearce DJ Putting the semantics into semantic versioning. In: Proceedings of the 2020 ACM SIGPLAN interna-
tional symposium on new ideas, new paradigms, and reflections on programming and software (ACM); 2020.
185. Valeev EF. Libint: a library for the evaluation of molecular integrals of many-body operators over gaussian functions. Available from:
http://libint.valeyev.net/
186. Lawson CL, Hanson RJ, Krogh FT, Kincaid DR. Algorithm 539: basic linear algebra subprograms for fortran usage [f1]. ACM Trans
Math Softw. 1979;5:3245.
187. Zee FGV, van de Geijn RA. BLIS: a framework for rapidly instantiating BLAS functionality. ACM Trans Math Softw. 2015;41:133.
188. Whaley RC & Dongarra JJ Automatically tuned linear algebra software. In: Proceedings of the IEEE/ACM SC98 conference (IEEE);
1998.
189. Flocke N, Lotrich V. Efficient electronic integrals and their generalized derivatives for object oriented implementations of electronic
structure calculations. J Comput Chem. 2008;29:272236.
190. Sun Q. Libcint: an efficient general integral library for Gaussian basis functions. J Comput Chem. 2015;36:166471.
191. Pritchard BP, Chow E. Horizontal vectorization of electron repulsion integrals. J Comput Chem. 2016;37:253746.
192. Peng F, Wu M-S, Sosonkina M, Windus T, Bentz J, Gordon M, Kenny J & Janssen C Tackling component interoperability in quantum
chemistry software. In: Proceedings of the 2007 symposium on component and framework technology in high-performance and scien-
tific computingCompFrame '07 (ACM Press); 2007.
193. Kenny JP, Janssen CL, Valeev EF, Windus TL. Components for integral evaluation in quantum chemistry. J Comput Chem. 2008;29:
56277.
194. Ekström U, Visscher L, Bast R, Thorvaldsen AJ, Ruud K. Arbitrary-order density functional response theory from automatic differentia-
tion. J Chem Theory Comput. 2010;6:197180.
195. Herbst MF, Scheurer M, Fransson T, Rehn DR, Dreuw A adcc: A versatile toolkit for rapid development of algebraic-diagrammatic con-
struction methods. Wiley Interdiscip Rev Comput Mol Sci. 2020;10: (6):e1462.
196. Wouters S, Poelmans W, Ayers PW, Van Neck D. CheMPS2: a free open-source spin-adapted implementation of the density matrix
renormalization group for ab initio quantum chemistry. Comput Phys Commun. 2014;185:150114.
197. Scheurer M, Reinholdt P, Kjellgren ER, Olsen JMH, Dreuw A, Kongsted J. CPPE: an open-source C++ and python library for polariz-
able embedding. J Chem Theory Comput. 2019;15:615463.
198. Grimme S, Antony J, Ehrlich S, Krieg H. A consistent and accurate ab initio parametrization of density functional dispersion correction
(DFT-D) for the 94 elements H-Pu. J Chem Phys. 2010;132:154104.
199. Kaliman IA, Slipchenko LV. LIBEFP: a new parallel implementation of the effective fragment potential method as a portable software
library. J Comput Chem. 2013;34:228492.
200. Remigio RD, Frediani L, Steindal AH, Bast R, Burns LA, Crawford TD, Weijo V. PCMSolver, an open-source library for the polarizable
continuum model electrostatic problem [cited 2021 Feb 2]. Available from: https://github.com/PCMSolver/pcmsolver
201. Pritchard BP, Altarawy D, Didier B, Gibson TD, Windus TL. New basis set exchange: an open, up-to-date resource for the molecular
sciences community. J Chem Inf Model. 2019;59:481420.
202. Shaw R, Hill J. Libecpint: a c++ library for the efficient evaluation of integrals over effective core potentials. J Open Source Softw.
2021;6:3039.
203. Jmol: an open-source Java viewer for chemical structures in 3D. Available from: http://www.jmol.org
204. Hanwell MD, Curtis DE, Lonie DC, Vandermeersch T, Zurek E, Hutchison GR. Avogadro: an advanced semantic chemical editor, visu-
alization, and analysis platform. J Cheminf. 2012;4:17.
205. Gilbert A. Iqmol, a free open-source molecular editor and visualization package [cited 2021 June 26]. Available from: http://iqmol.org
206. Schrödinger Inc. Pymol, a molecular visualization system [cited 2021 July 6]. Available from: https://github.com/schrodinger/pymol-
open-source
207. O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open babel: an open chemical toolbox. J Cheminf.
2011;3:33.
208. O'Boyle NM, Tenderholt AL, Langner KM. Cclib: a library for package-independent computational chemistry algorithms. J Comput
Chem. 2008;29:83945.
LEHTOLA AND KARTTUNEN 31 of 33
209. Larsen AH, Mortensen JJ, Blomqvist J, Castelli IE, Christensen R, Dułak M, et al. The atomic simulation environmenta python
library for working with atoms. J Phys Condens Matter. 2017;29:273002.
210. Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem in 2021: new data content and improved web interfaces. Nucleic
Acids Res. 2020;49:D138895.
211. Lu T, Chen F. Multiwfn: a multifunctional wavefunction analyzer. J Comput Chem. 2011;33:58092.
212. Hermann G, Pohl V, Tremblay JC, Paulus B, Hege H-C, Schild A. ORBKIT: a modular python toolbox for cross-platform postprocessing
of quantum chemical wavefunction data. J Comput Chem. 2016;37:151120.
213. Lehtola S, Karttunen AJ. git repository containing a copy of the supporting information [cited 2021 Aug 8]. Available from: https://
github.com/susilehtola/fosschemistry
214. Bannwarth C, Ehlert S, Grimme S. GFN2-xTBan accurate and broadly parametrized self-consistent tight-binding quantum chemical
method with multipole electrostatics and density-dependent dispersion contributions. J Chem Theory Comput. 2019;15:165271.
215. Menzel JP, Kloppenburg M, Beli
c J, Groot HJM, Visscher L, Buda F. Efficient workflow for the investigation of the catalytic cycle of
water oxidation catalysts: combining GFN-xTB and density functional theory. J Comput Chem. 2021;42:188594.
216. Tasinato N, Puzzarini C, Barone V. Correct modeling of cisplatin: a paradigmatic case. Angew Chem Int Ed Engl. 2017;56:1383841.
217. Liu W, Franke R. Comprehensive relativistic ab initio and density functional theory studies on PtH, PtF, PtCl, and Pt(NH
3
)
2
Cl
2
.
J Comput Chem. 2002;23:56475.
218. Becke AD. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys Rev A. 1988;38:3098100.
219. Perdew JP. Density-functional approximation for the correlation energy of the inhomogeneous electron gas. Phys Rev B. 1986;33:
88224.
220. Adamo C, Barone V. Toward reliable density functional methods without adjustable parameters: the PBE0 model. J Chem Phys. 1999;
110:615870.
221. Ernzerhof M, Scuseria GE. Assessment of the PerdewBurkeErnzerhof exchange-correlation functional. J Chem Phys. 1999;110:
502936.
222. Vetere V, Adamo C, Maldivi P. Performance of the parameter freePBE0 functional for the modeling of molecular properties of heavy
metals. Chem Phys Lett. 2000;325:99105.
223. Bühl M, Reimann C, Pantazis DA, Bredow T, Neese F. Geometries of third-row transition-metal complexes from density-functional the-
ory. J Chem Theory Comput. 2008;4:144959.
224. Hehre WJ, Stewart RF, Pople JA. Self-consistent molecular-orbital methods. I. Use of Gaussian expansions of slater-type atomic
orbitals. J Chem Phys. 1969;51:2657.
225. Binkley JS, Pople JA, Hehre WJ. Self-consistent molecular orbital methods. 21. Small split-valence basis sets for first-row elements.
J Am Chem Soc. 1980;102:939.
226. Hehre WJ, Ditchfield R, Pople JA. Self-consistent molecular orbital methods. XII. Further extensions of Gaussian-type basis sets for use
in molecular orbital studies of organic molecules. J Chem Phys. 1972;56:225761.
227. Weigend F, Ahlrichs R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: design
and assessment of accuracy. Phys Chem Chem Phys. 2005;7:3297305.
228. Pyykkö P. Relativistic effects in chemistry: more common than you thought. Annu Rev Phys Chem. 2012;63:4564.
229. Pyykkö P. The physics behind chemistry and the periodic table. Chem Rev. 2012;112:37184.
230. Dolg M. Chapter 14 relativistic effective core potentials. In: Springborg M, Li J, V
azquez AMM, editors. Theoretical and computational
chemistry. Amsterdam, The Netherlands: Elsevier; 2002. p. 793862.
231. Rappoport D, Furche F. Property-optimized gaussian basis sets for molecular response calculations. J Chem Phys. 2010;133:134105.
232. Ehlert S, Huniar U, Ning J, Furness JW, Sun J, Kaplan AD, et al. r
2
SCAN-D4: dispersion corrected meta-generalized gradient approxi-
mation for general chemical applications. J Chem Phys. 2021;154:061101.
233. Møller C, Plesset MSM. Note on an approximation treatment for many-electron systems. Phys Rev. 1934;46:61822.
234. Čížek J. On the correlation problem in atomic and molecular systems. Calculation of wavefunction components in Ursell-type expan-
sion using quantum-field theoretical methods. J Chem Phys. 1966;45:425666.
235. Grimme S. Exploration of chemical compound, conformer, and reaction space with meta-dynamics simulations based on tight-binding
quantum chemical calculations. J Chem Theory Comput. 2019;15:284762.
236. Pracht P, Bohle F, Grimme S. Automated exploration of the low-energy chemical space with fast quantum chemical methods. Phys
Chem Chem Phys. 2020;22:716992.
237. Pracht P, Grimme S. Calculation of absolute molecular entropies and heat capacities made simple. Chem Sci. 2021;12:655168.
238. Whitten JL. Coulombic potential energy integrals and approximations. J Chem Phys. 1973;58:4496.
239. Baerends EJ, Ellis DE, Ros P. Self-consistent molecular HartreeFockslater calculations I. the computational procedure. Chem Phys.
1973;2:4151.
240. Dunlap BI, Connolly JWD, Sabin JR. On the applicability of LCAO-Xαmethods to molecules containing transition metal atoms: the
nickel atom and nickel hydride. Int J Quantum Chem. 1977;12:817.
241. Dunlap BI, Connolly JWD, Sabin JR. On some approximations in applications of Xαtheory. J Chem Phys. 1979;71:3396.
242. Dunlap BI, Rösch N, Trickey SB. Variational fitting methods for electronic structure calculations. Mol Phys. 2010;108:316780.
243. Weigend F. HartreeFock exchange fitting basis sets for H to Rn. J Comput Chem. 2008;29:16775.
32 of 33 LEHTOLA AND KARTTUNEN
244. Dunning TH. Gaussian basis sets for use in correlated molecular calculations. I. the atoms boron through neon and hydrogen. J Chem
Phys. 1989;90:1007.
245. Van Lenthe E, Baerends EJ. Optimized slater-type basis sets for the elements 1-118. J Comput Chem. 2003;24:114256.
246. Chong DP, van Lenthe E, Van Gisbergen S, Baerends EJ. Even-tempered slater-type orbitals revisited: from hydrogen to krypton.
J Comput Chem. 2004;25:10306.
247. Bühl M, Kabrede H. Geometries of transition-metal complexes from density-functional theory. J Chem Theory Comput. 2006;2:
128290.
248. Grimme S, Bannwarth C, Shushkov P. A robust and accurate tight-binding quantum chemical method for structures, vibrational fre-
quencies, and noncovalent interactions of large molecular systems parametrized for all spd-block elements (z=186). J Chem Theory
Comput. 2017;13:19892009.
249. Dirac PAM. Note on exchange phenomena in the Thomas atom. Math Proc Cambridge Philos Soc. 1930;26:37685.
250. Perdew JP, Wang Y. Accurate and simple analytic representation of the electron-gas correlation energy. Phys Rev B. 1992;45:132449.
251. Furness JW, Kaplan AD, Ning J, Perdew JP, Sun J. Accurate and numerically efficient r
2
SCAN meta-generalized gradient approxima-
tion. J Phys Chem Lett. 2020;11:820815.
252. Furness JW, Kaplan AD, Ning J, Perdew JP, Sun J. Correction to "accurate and numerically efficient r
2
SCAN meta-generalized gradient
approximation". J Phys Chem Lett. 2020;11:92488.
253. Weigend F. Accurate Coulomb-fitting basis sets for H to Rn. Phys Chem Chem Phys. 2006;8:105765.
254. Pietro WJ, Hehre WJ. Molecular orbital theory of the properties of inorganic and organometallic compounds. 3. STO-3G basis sets for
first- and second-row transition metals. J Comput Chem. 1983;4:24151.
255. French RJ, Hedberg L, Hedberg K, Gard GL, Johnson BM. Molecular structure and quadratic force field of chromyl fluoride, CrO
2
F
2
.
Inorg Chem. 1983;22:8925.
256. Levason WL, Ogden JS, Saad AK, Young NA, Brisdon AK, Holliman PJ, et al. Metal K-edge EXAFS (extended x-ray absorption fine
structure) studies of CrO
2
F
2
and MnO
3
F at 10K. J Fluorine Chem. 1991;53:4351.
257. Gilbert B, Frazer BH, Zhang H, Huang F, Banfield JF, Haskel D, et al. X-ray absorption spectroscopy of the cubic and hexagonal pol-
ytypes of zinc sulfide. Phys Rev B. 2002;66:245205.
258. Gardner PJ, Pang P. Thermodynamics of the zinc sulphide transformation, sphalerite !wurtzite, by modified entrainment. J Chem
Soc Faraday Trans. 1988;1 84:1879.
259. Garrity KF, Bennett JW, Rabe KM, Vanderbilt D. Pseudopotentials for high-throughput DFT calculations. Comput Mater Sci. 2014;81:
44652. arXiv:1305.5973.
260. COD. Crystallography open database [cited 2021 July 20]. Available from: http://www.crystallography.net/cod/
261. Materials Cloud. Quantum espresso input generator and structure visualizer [cited 2021 July 20]. Available from: https://www.
materialscloud.org/work/tools/qeinputgenerator
262. Monkhorst HJ, Pack JD. Special points for Brillouin-zone integrations. Phys Rev B. 1976;13:518892.
263. Cardona M, Kremer RK, Lauck R, Siegle G, Muñoz A, Romero AH, et al. Electronic, vibrational, and thermodynamic properties of ZnS
with zinc-blende and rocksalt structure. Phys Rev B. 2010;81:075207.
264. Materials Cloud. Seek-path: the k-path finder and visualizer [cited 2021 July 20]. Available from: https://www.materialscloud.org/
work/tools/seekpath
265. Setyawan W, Curtarolo S. High-throughput electronic band structure calculations: challenges and tools. Comput Mater Sci. 2010;49:
299312.
266. Tran TK, Park W, Tong W, Kyi MM, Wagner BK, Summers CJ. Photoluminescence properties of ZnS epilayers. J Appl Phys. 1997;81:
28039.
267. McDonald AR, Nash JA, Nerenberg PS, Ball KA, Sode O, Foley JJ, et al. Building capacity for undergraduate education and training in
computational molecular science: a collaboration between the MERCURY consortium and the molecular sciences software institute.
Int J Quantum Chem. 2020;120:e26359.
268. Magers DB, Ch
avez VH, Peyton BG, Sirianni DA, Fortenberry RC, Ringer McDonald A. PSI4EDUCATION: free and open-source pro-
graming activities for chemical education with free and open-source software. In: McDonald AR, Nash JA, editors. Teaching program-
ming across the chemistry curriculum. Washington, DC: American Chemical Society; 2021. p. 10722.
SUPPORTING INFORMATION
Additional supporting information may be found in the online version of the article at the publisher's website.
How to cite this article: Lehtola S, Karttunen AJ. Free and open source software for computational chemistry
education. WIREs Comput Mol Sci. 2022. e1610. https://doi.org/10.1002/wcms.1610
... An important observation related to these two developments was recently made by Lehtola and Karttunen: 2 commodity hardware in the present day is as fast as the fastest supercomputer in the world in the 1990s. Combined with the present-day availability of faster algorithms, then-pioneering calculations can nowadays be reproduced even on students' laptop computers with easily installable free and open source software (see Ref. 2 for the employed definition and more details). ...
... If state-ofthe-art open source code development practices such as code review are employed, this also supports the training of new developers and maintainers for the project. 2 The issue with scoped development efforts is likewise less of an issue in production grade open source reusable libraries than in similar libraries embedded within a monolithic package. The former project has a clear scope, guiding and simplifying its development and maintenance, and also attracting the attention of world wide subject matter experts who are inherently familiar with the state-ofthe-art in that field. ...
... An extreme example is the case of the basic linear algebra subsystem (BLAS) discussed by Lehtola and Karttunen. 2 It is noteworthy that while BLAS is nowadays ubiquitous, 60 it took years of concerted efforts by the BLAS team to convince both academia and industry to adopt the standard. ...
Article
The traditional foundation of science lies on the cornerstones of theory and experiment. Theory is used to explain experiment, which in turn guides the development of theory. Since the advent of computers and the development of computational algorithms, computation has risen as the third cornerstone of science, joining theory and experiment on an equal footing. Computation has become an essential part of modern science, amending experiment by enabling accurate comparison of complicated theories to sophisticated experiments, as well as guiding by triage both the design and targets of experiments and the development of novel theories and computational methods. Like experiment, computation relies on continued investment in infrastructure: it requires both hardware (the physical computer on which the calculation is run) as well as software (the source code of the programs that performs the wanted simulations). In this Perspective, I discuss present-day challenges on the software side in computational chemistry, which arise from the fast-paced development of algorithms, programming models, as well as hardware. I argue that many of these challenges could be solved with reusable open source libraries, which are a public good, enhance the reproducibility of science, and accelerate the development and availability of state-of-the-art methods and improved software.
... In contrast, in the modern open source paradigm of software development, common tasks are accomplished via reusable shared modular libraries. 31 In the present case of the evaluation of DFAs, the aforementioned Libxc 9 is the implementation of choice: Libxc is used by around 40 electronic structure programs based on various numerical approaches, such as atomic-orbital basis sets, plane waves, as well as real-space approaches. Thanks to the modular approach to software development, new functionals only need to be implemented in Libxc to become useable in a large number of programs. ...
... In addition to established commercial packages, several programs that are free and open source software (FOSS) have also become available in recent years. 31 Here, we especially want to mention PySCF 96 and Psi4, 97 which are both interfaced to Libxc and enable efficient density functional calculations. ...
... We have suggested several straightforward ways in which to determine such data with publicly available free and open source software. 31 The systems for which reference data are reported should include both spin-restricted and spin-unrestricted systems; the N and Ne atoms offer excellent test systems as they have well-behaved electronic structures. ...
Article
Density functional theory is the workhorse of chemistry and materials science, and novel density functional approximations are published every year. To become available in program packages, the novel density functional approximations (DFAs) need to be (re)implemented. However, according to our experience as developers of Libxc [Lehtola et al., SoftwareX 7, 1 (2018)], a constant problem in this task is verification due to the lack of reliable reference data. As we discuss in this work, this lack has led to several non-equivalent implementations of functionals such as Becke–Perdew 1986, Perdew–Wang 1991, Perdew–Burke–Ernzerhof, and Becke’s three-parameter hybrid functional with Lee–Yang–Parr correlation across various program packages, yielding different total energies. Through careful verification, we have also found many issues with incorrect functional forms in recent DFAs. The goal of this work is to ensure the reproducibility of DFAs. DFAs must be verifiable in order to prevent the reappearance of the above-mentioned errors and incompatibilities. A common framework for verification and testing is, therefore, needed. We suggest several ways in which reference energies can be produced with free and open source software, either with non-self-consistent calculations with tabulated atomic densities or via self-consistent calculations with various program packages. The employed numerical parameters—especially the quadrature grid—need to be converged to guarantee a ≲0.1 μEh precision in the total energy, which is nowadays routinely achievable in fully numerical calculations. Moreover, as such sub-μEh level agreement can only be achieved when fully equivalent implementations of the DFA are used, the source code of the reference implementation should also be made available in any publication describing a new DFA.
... An important observation related to these two developments was recently made by Lehtola and Karttunen: 2 commodity hardware in the present day is as fast as the fastest supercomputer in the world in the 1990s. Combined with the present-day availability of faster algorithms, then-pioneering calculations can nowadays be reproduced even on students' laptop computers with easily installable free and open source software (see Ref. 2 for the employed definition and more details). ...
... If state-ofthe-art open source code development practices such as code review are employed, this also supports the training of new developers and maintainers for the project. 2 The issue with scoped development efforts is likewise less of an issue in production grade open source reusable libraries than in similar libraries embedded within a monolithic package. The former project has a clear scope, guiding and simplifying its development and maintenance, and also attracting the attention of world wide subject matter experts who are inherently familiar with the state-ofthe-art in that field. ...
... An extreme example is the case of the basic linear algebra subsystem (BLAS) discussed by Lehtola and Karttunen. 2 It is noteworthy that while BLAS is nowadays ubiquitous, 60 it took years of concerted efforts by the BLAS team to convince both academia and industry to adopt the standard. ...
Preprint
The traditional foundation of science lies on the cornerstones of theory and experiment. Theory is used to explain experiment, which in turn guides the development of theory. Since the advent of computers and the development of computational algorithms, computation has risen as the third cornerstone of science, joining theory and experiment on an equal footing. Computation has become an essential part of modern science, amending experiment by enabling accurate comparison of complicated theories to sophisticated experiments, as well as guiding by triage both the design and targets of experiments and the development of novel theories and computational methods. Like experiment, computation relies on continued investment in infrastructure: it requires both hardware (the physical computer on which the calculation is run) as well as software (the source code of the programs that performs the wanted simulations). In this Perspective, I discuss present-day challenges on the software side in computational chemistry, which arise from the fast-paced development of algorithms, programming models, as well as hardware. I argue that many of these challenges could be solved with reusable open source libraries, which are a public good, enhance the reproducibility of science, and accelerate the development and availability of state-of-the-art methods and improved software.
... In contrast, in the modern open source paradigm of software development, common tasks are accomplished via reusable shared modular libraries. 31 In the present case of the evaluation of DFAs, the aforementioned Libxc 9 is the implementation of choice: Libxc is used by around 40 electronic structure programs based on various numerical approaches, such as atomic-orbital basis sets, plane waves, as well as real-space approaches. Thanks to the modular approach to software development, new functionals only need to be implemented in Libxc to become usable in a large number of programs. ...
... In addition to established commercial packages, several programs that are free and open source software (FOSS) have also become available in recent years. 31 Here we especially want to mention PySCF 95 and Psi4, 96 which are both interfaced to Libxc and enable efficient density functional calculations. ...
... We have suggested several straightforward ways in which to determine such data with publicly available free and open source software. 31 The systems for which reference data is reported should include both spin-restricted and spin-unrestricted systems; the N and Ne atoms offer excellent test systems as they have well-behaved electronic structures. Regardless of the employed approach, it is essential to converge the calculation of the reference energy with respect to all numerical parameters, most notably the quadrature grid, as we demonstrated in ref. 39 and section V B. The reference energy should be computed and reported to very high precision: using suitably large integration grids and small cutoff thresholds, an agreement of better than 0.1 µE h in total energies is typically achievable in Gaussian-basis calculations across programs. ...
Preprint
Full-text available
Density functional theory is the workhorse of chemistry and materials science, and novel density functional approximations (DFAs) are published every year. To become available in program packages, the novel DFAs need to be (re)implemented. However, according to our experience as developers of Libxc [Lehtola et al, SoftwareX 7, 1 (2018)], a constant problem in this task is verification, due to the lack of reliable reference data. As we discuss in this work, this lack has lead to several non-equivalent implementations of functionals such as BP86, PW91, PBE, and B3LYP across various program packages, yielding different total energies. Through careful verification, we have also found many issues with incorrect functional forms in recent DFAs. The goal of this work is to ensure the reproducibility of DFAs: DFAs must be verifiable in order to prevent reappearances of the abovementioned errors and incompatibilities. A common framework for verification and testing is therefore needed. We suggest several ways in which reference energies can be produced with free and open source software, either with non-self-consistent calculations with tabulated atomic densities or via self-consistent calculations with various program packages. The employed numerical parameters -- especially, the quadrature grid -- need to be converged to guarantee the $\lesssim0.1\mu E_{h}$ precision for fully numerical calculations which routinely afford such precision in the total energy. Such sub-$\mu E_{h}$ level of agreement can only be achieved when fully equivalent implementations of the DFA are used. Therefore, also the source code of the reference implementation should be made available in any publication describing a new DFA.
... As anticipated in 2000, computational cost has continued decreasing, and much larger problems can be solved computationally. 1,32 Computational toxicology is increasingly accepted by regulators as a surrogate for in vivo testing. 33,34 While machine learning and artificial intelligence have started displacing more fundamental modeling techniques for optimization or troubleshooting of commercial reactors, reactor design is still focused on fundamentals. ...
Article
This perspective provides the collective opinions of a dozen chemical reaction engineers from academia and industry. In this sequel to the “Vision 2020: Reaction Engineering Roadmap,” published in 2001, we provide our opinions about the field of reaction engineering by addressing the current situation, identifying barriers to progress, and recommending research directions in the context of four industry sectors (basic chemicals, specialty chemicals, pharmaceuticals, and polymers) and five technology areas (reactor system selection, design and scale-up, chemical mechanism development and property estimation, catalysis, nonstandard reactor types, and electrochemical systems). Our collective input in this report includes numerous recommendations regarding research needs in the field of reaction engineering in the coming decades, including guidance for prioritizing efforts in workforce development, measurement science, and computational methods. We see important roles for reaction engineers in the plastics circularity challenge, decarbonization of processes, electrification of chemical reactors, conversion of batch processes to continuous processes, and development of intensified, dynamic reaction processes.
... Chemistry is abstract and has a strong relationship with daily needs, so reasoning is needed in studying each discussion (Carmel et al., 2019; Ugwu, 2020; Lehtola & Karttunen, 2022). Chemistry learning teaches the ability to identify chemistry problems and make inferences based on facts to discover various changes in nature and the effects of human interactions with nature (Mahaffy et al., 2018; Flynn et al., 2019; Holme, 2019). ...
Article
This research focuses on the analysis of the ethnochemistry potential of the vines contained in the Lontar Usada Taru Pramana. Lontar Usada Taru Pramana is a note written on palm leaves about plants that are useful as medicines used as a reference for traditional Balinese medicine. This study aims to analyze the effectiveness of task-based learning that utilizes the ethnochemistry potential of vines contained in Lontar Usada Taru Pramana, on students’ scientific explanation skills. This research was conducted during the post-Covid-19 period in vocational and high schools with 234 students. This research was quantitative and applied the One-Group Pretest-Posttest Design with replication, where there was no control class and all research subjects were given the same treatment. Task instructions were passed through pre-task, process task, and post-task. The type of task in learning is to make scientific studies of ethnochemistry by sharing personal experiences and solving problems. The data collection technique used tests in the form of descriptive essay questions to measure students’ scientific explanation skills on some materials in booklets of Taru Pramana Lontar. The tests in this study described several components: plant classification, chemical content, benefits and methods of concocting it as medicine, and the scientific version of the Lontar Usada Taru Pramana composition. The effectiveness of task-based learning was analyzed using the N-Gain and T-test. The results of this study indicate that giving assignments based on Lontar Usada Taru Pramana in chemistry learning is effective in increasing students’ ability to explain the scientific study of vines as medicine. The N-gain results are in the high category of 0.76 for vocational students and 0.72 for high school students. The T-test result shows a significant difference between students’ pretest and posttest results in both vocational school and high school, with a significance of .01.
Students tend to correctly give scientific explanations to the plants they often encounter. This study shows that the ethnochemistry potential of the vines on Lontar Usada Taru Pramana can improve students’ scientific explanation skills. This study recommends elaborating chemistry concepts in the preservation of cultural heritage through transferring knowledge on using traditional and modern medicinal plants and their development in research.
... In this regard, Tuvi-Arad and Blonder [5] highlight the importance of chemical compound databases, which can serve as valuable resources for teaching chemistry. Similarly, Lehtola and Karttunen [6] acknowledge that CC now offers a vast array of open-source and freely available software, enabling its integration with Massive Open Online Courses and thus reaching a wide range of students. ...
Article
The use of technology in education has experienced significant growth in recent years. In this regard, computational chemistry is considered a dynamic element due to the constant advances in computational methods in chemistry, making it an emerging technology with high potential for application in teaching chemistry. This article investigates the characteristics and perceptions of in-service chemistry teachers who participated in an e-learning educational computational chemistry course. Additionally, it examines how educational data mining techniques can contribute to optimising and developing e-learning environments. The results indicate that teachers view incorporating computational chemistry elements in their classes positively but that this is not profoundly reflected in their teaching activity planning. On the other hand, generated statistical models demonstrate that the most relevant variables to consider in the instructional design of an e-learning educational computational chemistry course are related to participation in various course instances and partial evaluations. In this sense, the need to provide additional support to students during online learning is highlighted, especially during critical moments such as evaluations. In conclusion, this study offers valuable information on the characteristics and perceptions of in-service chemistry teachers and demonstrates that educational data mining techniques can help improve e-learning environments.
... Both HelFEM and Erkale are free and open-source software. 65 As was already mentioned above in section 2, all GTO basis sets are employed in fully uncontracted form. ...
Preprint
Strong magnetic fields such as those found on white dwarfs have significant effects on the electronic structure of atoms and molecules. However, the vast majority of molecular studies in the literature in such fields are carried out with Gaussian basis sets designed for zero field, leading to large basis set truncation errors [Lehtola et al, Mol. Phys. 2020, 118, e1597989]. In this work, we aim to identify the failures of the Gaussian basis sets in atomic calculations to guide the design of new basis sets for strong magnetic fields. We achieve this by performing fully numerical electronic structure calculations at the complete basis set (CBS) limit for the ground state and low lying excited states of the atoms $1 \le Z \le 18$ in weak to intermediate magnetic fields. We also carry out finite-field calculations for a variety of Gaussian basis sets, introducing a real-orbital approximation for the magnetic-field Hamiltonian. Our primary focus is on the aug-cc-pVTZ basis set, which has been used in many works in the literature. A study of the differences in total energies of the fully numerical CBS limit calculations and the approximate Gaussian basis calculations is carried out to provide insight into basis set truncation errors. Examining a variety of states over the range of magnetic field strengths from $B = 0$ to $B = 0.6 B_0$, we observe significant differences for the aug-cc-pVTZ basis set, while much smaller errors are afforded by the benchmark-quality AHGBSP3-9 basis set [Lehtola, J. Chem. Phys. 2020, 152, 134108]. This suggests that there is considerable room to improve Gaussian basis sets for calculations at finite magnetic fields.
Article
This qualitative research explored the rationales of open-source development in cheminformatics. The objective was to promote open science by mapping out and categorizing the reasons why open-source development is being carried out. This topic is important because cheminformatics has an industrial background and open-source is the key solution in promoting the growth of cheminformatics as an independent academic field. The data consisted of 87 research articles that were analyzed using qualitative content analysis. The analysis produced six rationale categories: (1) Develop New Software, (2) Update Current Features, Tools, or Processes, (3) Improve Usability, (4) Support Open-source Development and Open Science, (5) Fulfill Chemical Information Needs, and (6) Support Chemistry Learning and Teaching. This classification can be used in designing rationales for future software development projects, which is one of the largest research areas in cheminformatics. In particular, there is a need to develop cheminformatics education for which software development can serve as an interesting multidisciplinary framework.
Article
Integration of computational data science (CDS) into the university curriculum offers several advantages for students, faculty and the institution. This article discusses the benefits to students of introducing CDS into the university curriculum with a focus on developing skills in cheminformatics, data analysis, structure–activity relationships, modelling and simulation. Moreover, CDS can enable students to engage with complex chemical and toxicological data in new and dynamic ways, helping them to develop a more nuanced understanding of the potential hazards and risks associated with different chemicals and substances. On the other hand, it can foster greater collaboration between students and faculty and with external partners in industry and government. This can lead to the development of more effective and efficient toxicological testing methods and tools to screen chemicals for potential hazards and aid the development of environmentally friendly chemicals. Overall, the integration of CDS into the university curriculum will help prepare the next generation of scientists giving them a competitive edge to make considerable contributions to green chemistry, designing safer chemicals and non-animal testing methods. It will enable them to tackle modern challenges facing society including identifying safer and more sustainable chemicals and predicting the health and environmental impacts of novel chemical substances.
Article
We present inq, a new implementation of density functional theory (DFT) and time-dependent DFT (TDDFT) written from scratch to work on graphics processing units (GPUs). Besides GPU support, inq makes use of modern code design features and takes advantage of newly available hardware. By designing the code around algorithms, rather than against specific implementations and numerical libraries, we aim to provide a concise and modular code. The result is a fairly complete DFT/TDDFT implementation in roughly 12 000 lines of open-source C++ code representing a modular platform for community-driven application development on emerging high-performance computing architectures.
Article
Driven by the unprecedented computational power available to scientific research, the use of computers in solid-state physics, chemistry and materials science has been on a continuous rise. This review focuses on the software used for the simulation of matter at the atomic scale. We provide a comprehensive overview of major codes in the field, and analyze how citations to these codes in the academic literature have evolved since 2010. An interactive version of the underlying data set is available at https://atomistic.software.
Article
Photocatalytic water oxidation remains the bottleneck in many artificial photosynthesis devices. The efficiency of this challenging process is inherently linked to the thermodynamic and electronic properties of the chromophore and the water oxidation catalyst (WOC). Computational investigations can facilitate the search for favorable chromophore‐catalyst combinations. However, this remains a demanding task due to the requirements on the computational method that should be able to correctly describe different spin and oxidation states of the transition metal, the influence of solvation and the different rates of the charge transfer and water oxidation processes. To determine a suitable method with favorable cost/accuracy ratios, the full catalytic cycle of a molecular ruthenium based WOC is investigated using different computational methods, including density functional theory (DFT) with different functionals (GGA, Hybrid, Double Hybrid) as well as the semi‐empirical tight binding approach GFN‐xTB. A workflow with low computational cost is proposed that combines GFN‐xTB and DFT and provides reliable results. GFN‐xTB geometries and frequencies combined with single‐point DFT energies give free energy changes along the catalytic cycle that closely follow the full DFT results and show satisfactory agreement with experiment, while significantly decreasing the computational cost. This workflow allows for cost efficient determination of energetic, thermodynamic and dynamic properties of WOCs. Water oxidation catalysts play a crucial role in the development of solar energy conversion devices. Thus, it is important to have computational tools that can reliably predict the catalytic mechanism and the energetics along the catalytic cycle. The approach proposed in this work combines a good accuracy with a small computational cost.
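The combination of GFN-xTB geometries and frequencies with single-point DFT energies can be sketched as simple arithmetic. The abstract does not spell out the formula, so the sketch below assumes a common composite form: the thermal correction (free energy minus electronic energy) is taken from the tight-binding calculation and added to the DFT single-point energy evaluated at the tight-binding geometry. The class `Species`, the function `reaction_free_energy`, and all numbers are hypothetical placeholders, not results from the cited work.

```python
from dataclasses import dataclass

HARTREE_TO_KCAL_PER_MOL = 627.5095  # unit conversion factor


@dataclass(frozen=True)
class Species:
    """Energies of one intermediate, all in hartree (assumed inputs)."""
    e_dft_single_point: float  # DFT electronic energy at the xTB geometry
    e_xtb: float               # xTB electronic energy at the xTB geometry
    g_xtb: float               # xTB free energy from xTB frequencies

    def thermal_correction(self) -> float:
        # Free energy correction evaluated entirely at the cheap level.
        return self.g_xtb - self.e_xtb

    def composite_free_energy(self) -> float:
        # Assumed composite scheme: high-level energy + low-level correction.
        return self.e_dft_single_point + self.thermal_correction()


def reaction_free_energy(reactants, products) -> float:
    """Free energy change of one step in hartree (products minus reactants)."""
    g_products = sum(s.composite_free_energy() for s in products)
    g_reactants = sum(s.composite_free_energy() for s in reactants)
    return g_products - g_reactants


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the arithmetic.
    reactant = Species(e_dft_single_point=-100.00, e_xtb=-20.00, g_xtb=-19.90)
    product = Species(e_dft_single_point=-100.02, e_xtb=-20.01, g_xtb=-19.92)
    delta_g = reaction_free_energy([reactant], [product])
    print(f"step free energy: {delta_g * HARTREE_TO_KCAL_PER_MOL:.2f} kcal/mol")
```

Under this assumption, the expensive method is called once per intermediate, while geometry optimization and frequency calculations stay at the inexpensive level, which is where the cost saving described above would come from.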
Article
Community efforts in the computational molecular sciences (CMS) are evolving toward modular, open, and interoperable interfaces that work with existing community codes to provide more functionality and composability than could be achieved with a single program. The Quantum Chemistry Common Driver and Databases (QCDB) project provides such capability through an application programming interface (API) that facilitates interoperability across multiple quantum chemistry software packages. In tandem with the Molecular Sciences Software Institute and their Quantum Chemistry Archive ecosystem, the unique functionalities of several CMS programs are integrated, including CFOUR, GAMESS, NWChem, OpenMM, Psi4, Qcore, TeraChem, and Turbomole, to provide common computational functions, i.e., energy, gradient, and Hessian computations as well as molecular properties such as atomic charges and vibrational frequency analysis. Both standard users and power users benefit from adopting these APIs as they lower the language barrier of input styles and enable a standard layout of variables and data. These designs allow end-to-end interoperable programming of complex computations and provide best practices options by default.
Article
The COVID-19 pandemic disrupted chemistry teaching practices globally as many courses were forced online, necessitating adaptation to the digital platform. The biggest impact was to the practical component of the chemistry curriculum—the so-called wet lab. Naively, it would be thought that computer-based teaching laboratories would have little problem in making the move. However, this is not the case as there are many unrecognized differences between delivering computer-based teaching in-person and virtually: software issues, technology, and classroom management. Consequently, relatively few “hands-on” computational chemistry teaching laboratories are delivered online. In this paper, we describe these issues in more detail and how they can be addressed, drawing on our experience in delivering a third-year computational chemistry course as well as remote hands-on workshops for the Virtual Winter School on Computational Chemistry and the European BIG-MAP project.
Article
The Atomic Simulation Recipes (ASR) is an open source Python framework for working with atomistic materials simulations in an efficient and sustainable way that is ideally suited for high-throughput projects. Central to ASR is the concept of a Recipe: a high-level Python script that performs a well-defined simulation task robustly and accurately while keeping track of the data provenance. The ASR leverages the functionality of the Atomic Simulation Environment (ASE) to interface with external simulation codes and attain a high abstraction level. We provide a library of Recipes for common simulation tasks employing density functional theory and many-body perturbation schemes. These Recipes utilize the GPAW electronic structure code, but may be adapted to other simulation codes with an ASE interface. Being independent objects with automatic data provenance control, Recipes can be freely combined through Python scripting giving maximal freedom for users to build advanced workflows. ASR also implements a command line interface that can be used to run Recipes and inspect results. The ASR Migration module helps users maintain their data while the Database and App modules make it possible to create local databases and present them as customized web pages.
Article
WebMO is a web-based interface for all major quantum chemistry programs. WebMO uses a server–client architecture that installs on a single server or cluster computer and provides access to state-of-the-art computational chemistry programs from a standard web browser. The web interface provides a 3-D molecular editor, pre-defined calculation types, job submission and monitoring, visualization of results, and user management tools. Barriers to using state-of-the-art computational chemistry in teaching and research are minimized through WebMO's universal accessibility, its intuitive and uniform interface to all programs, no software to install on client computers, and support for multiple users with a single instance. Applications of WebMO throughout the undergraduate curriculum are provided. The extensible open-architecture design allows for collaboration among educators, researchers, quantum chemistry program developers, and the WebMO interface developers. This article is categorized under: • Computer and Information Science > Visualization • Electronic Structure Theory > Ab Initio Electronic Structure Methods • Software > Quantum Chemistry Abstract WebMO is a web-based interface for all major quantum chemistry programs. It includes a 3-D molecular editor, pre-defined calculation types, job submission and monitoring, visualization of results, and user management tools. WebMO's server-client architecture supports universal accessibility via the web, an intuitive and uniform interface, no software installation on client computers, and multiple users with a single instance. Examples of WebMO usage throughout the undergraduate curriculum are provided.
Article
We present SPARC: Simulation Package for Ab-initio Real-space Calculations. SPARC can perform Kohn–Sham density functional theory calculations for isolated systems such as molecules as well as extended systems such as crystals and surfaces, in both static and dynamic settings. It is straightforward to install/use and highly competitive with state-of-the-art planewave codes, demonstrating comparable performance on a small number of processors and increasing advantages as the number of processors grows. Notably, SPARC brings solution times down to a few seconds for systems with O(100–500) atoms on large-scale parallel computers, outperforming planewave counterparts by an order of magnitude and more.