Preprint
LabInform ELN: A lightweight and flexible electronic laboratory notebook
for academic research based on the open-source software DokuWiki
Mirjam Schröder¹ and Till Biskup²∗
¹Leibniz-Institut für Katalyse e.V., Albert-Einstein-Straße 29a, 18059 Rostock, Germany
²Physikalische Chemie, Albert-Ludwigs-Universität Freiburg, Albertstr. 21, 79104 Freiburg, Germany
Scientific recordkeeping is a key prerequisite of reproducibility and hence an essential aspect of
conducting science. Researchers need to be able to retrospectively figure out what they or others did,
how they collected data and how they drew conclusions. This is typically the realm of laboratory
notebooks, and with the advent of the digital era, there is an increasing move towards digitising those
notebooks as well. Here, we present LabInform ELN, a lightweight and flexible electronic laboratory
notebook for academic research based on the open-source software DokuWiki. Key features are its
minimal system requirements, flexibility, modularity, and compliance with auditing requirements.
The LabInform ELN is compared with other leading open-source solutions, its key concepts are
discussed, and full working examples for a spectroscopic laboratory as well as for quantum-chemical
calculations are presented. The minimalistic system requirements allow for using it in small groups
and even by individual scientists and help with improving both reproducibility and access to the
notes taken. At the same time, thanks to its fine-grained access management, it scales well to
larger groups. Furthermore, it can be easily adjusted to specific needs from within its web interface.
Therefore, we anticipate LabInform ELN and the ideas behind its implementation to have a high
impact in the field, particularly for groups with limited IT resources.
Keywords: electronic laboratory notebook, ELN, reproducible research, research data management, meta-
data, wiki, inventory, FAIR data, knowledge management
I. INTRODUCTION
Scientific recordkeeping is an essential aspect of con-
ducting science and is crucial for knowledge creation and
dissemination.[1] For research to be reproducible [2, 3],
researchers need to be able to retrospectively figure out
what they or others did, how they collected data and how
they drew conclusions. Furthermore, recordkeeping is a
necessary prerequisite for FAIR(er) data [4].
Often, the primary recording takes place in laboratory
notebooks [5, 6] that can contain anything from details of
an experiment to developing hypotheses and document-
ing the analysis of recorded data. Nowadays, though,
it is increasingly rare for hand-written lab notebooks to
contain the actual raw data of measurements or observa-
tions. This has mainly two reasons: the amount of data
recorded has grown tremendously [7–9], and data are increasingly recorded in a digital fashion. Therefore, often
only graphical representations of data or results of anal-
yses may find their way into lab notebooks, but not the
actual data. This makes it necessary to provide stable
references to the actual data that are still operational
years later. This does not necessarily mean that ‘true’
persistent identifiers (PID) need to be used or actual links
provided within an (electronic) lab notebook, although
this can be quite convenient. Establishing and consis-
tently adhering to a system, e.g. numbering samples and
measurements consecutively, combined with a hierarchy
of directories on the file system reflecting this system,
would suffice and even allow for a semi-automatic access to the data.[10]
∗ E-mail: research@till-biskup.de
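Such a numbering scheme mirrored in the directory hierarchy can be resolved programmatically. As a minimal Python sketch (the prefixes, zero-padding, and directory layout are purely illustrative assumptions, not part of LabInform ELN itself):

```python
from pathlib import Path

def data_path(root, sample_no, measurement_no):
    """Resolve consecutive sample/measurement numbers to a data directory.

    Assumes a hypothetical convention <root>/sa<NNN>/m<NNN> with
    zero-padded consecutive numbers; adapt prefixes and padding to
    whatever scheme your group has agreed upon.
    """
    return Path(root) / ("sa%03d" % sample_no) / ("m%03d" % measurement_no)
```

With such a resolver, a plain identifier written in a (paper or electronic) lab notebook, e.g. "SA042/M003", remains a stable reference to the data for as long as the directory hierarchy is preserved.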
The concept of lab notebooks is probably as old as
science, and in some sense, the recordings of astronomical
observations on clay tablets in the ancient Babylonian
imperium [11] can be regarded as a lab notebook as well.
Clearly, the original reason for lab notebooks is scientific
recordkeeping and reproducibility, not the need to prove
the origin and precedence of ideas in a patent application,
as sometimes stated. The latter has only led to imposing
constraints on how to use lab notebooks (mainly) in the
industry and to emphasising their usefulness.
While paper-based lab notebooks have clearly a num-
ber of advantages, such as no external dependencies, im-
mediate and intuitive usability, and flexibility, they come
with a number of disadvantages in the digital age as well.
To mention just two aspects: (i) metadata [12–14] (i.e.,
information about the data recorded) are written on paper
and per se not accessible for processing and analysis
programs, i.e. they are not machine-actionable, and
(ii) access to the lab notebook is only possible physically
in one place. These and other aspects have led to an
increasing interest in digital, i.e. electronic laboratory
notebooks (ELNs) in the last years in the academic com-
munity [15, 16]. The idea, however, is much older: The
first ELNs can be traced back to the 1950s [17], and RS/1
as a ‘true’ ELN is from the 1980s [18, 19]. Furthermore,
in the experimental area of the chemical and pharmaceu-
tical industry, ELNs have successfully been introduced
already quite some time ago. Here, the standard has
been set with the ‘good laboratory practice’ (GLP) [20]
as a system to ensure inter alia reliability, reproducibil-
ity, and quality of chemicals, and many other directives
refer to it. In these highly regulated contexts, often laboratory
information management systems (LIMS) are
used that cover much more than the traditional labora-
tory notebook, such as availability and maintenance of
devices used and often an end-to-end digital workflow
from the raw data to the final report. Typical LIMS,
however, are rather rigid, require fixed routines to be fol-
lowed strictly and are therefore often not applicable to
an academic research context.
While scientific recordkeeping is a necessary prereq-
uisite for reproducibility and as such at the heart of the
scientific method, it has gained tremendous interest in
the realm of data-driven science referred to as ‘fourth
paradigm’[21] by Jim Gray [9, 22, 23]. The availabil-
ity of a vast amount of data is about to fundamentally
change the way we perform science, with reuse of results,
i.e. actual recorded data, on an entirely different scale.
This is why the FAIR principles [4] have been spelled out,
although it should be mentioned that they rest on ear-
lier work [24, 25]. Basically, they are an adoption of Tim
Berners-Lee’s concept of linked data [26] and the semantic
web [27–30] to research data. They address the mere necessity
to allow and ease data sharing and reuse and to enhance
awareness of the necessary prerequisites for reuse. Besides
astronomy, the particle physics community was and
still is at the forefront of this data-intensive ‘big science’
[31]. There are good reasons for the world-wide web to
have been developed at CERN in the context of particle
physics, with a huge scientific community spread all over
the globe [28]. Hence it is only logical that scientific
workflows were mentioned early on as a key area of the
semantic web [27, 29]. One prerequisite for the semantic web
[27–30] are structured (and linked) metadata. An ELN
can help here, particularly with structured metadata re-
garding provenance of the data and stable (hyper-)links
to all relevant pieces of information.
The ‘data deluge’ [7], as it has been eloquently de-
scribed, somewhat echoes Edsger Dijkstra’s statement re-
garding the development of software engineering: as soon
as there were general-purpose computers, programming
became a gigantic problem [32]. With (digital) data be-
coming ubiquitous, handling of data became a ubiquitous
problem for the scientists, as nobody was trained for it
and the necessary tools did not exist. One of the first
disciplines to realise this problem was bioinformatics, be-
sides particle physics and astronomy, and they all devel-
oped tools to work both with huge individual datasets
as well as huge numbers of individual datasets. However,
apart from these disciplines, there still seems to be an apparent
lack of awareness of the problems with, as well as the solu-
tions to, handling increasingly larger amounts of (digital)
data [33–36].
Despite the ELNs and the larger e-science infrastruc-
ture [37, 38] being driven by data-intensive ‘big science’
applications, ELNs are advantageous even for ‘little sci-
ence’ and ‘little data’ [31, 39] as discussed here. Similarly,
many of the concepts (though not necessarily always the
tools) developed for handling large data volumes can be
advantageously applied to the smaller scale. Automation
[40, p. 34][41] and machine-actionable information are at
the core of more reliable and reproducible data handling
and thus more reproducible science.
Before actually describing the LabInform ELN and the
design principles behind it, three questions need to be ad-
dressed first: (i) Who is the target audience of both this
article and the software presented? (ii) What is (and is
not) an ELN anyway? (iii) Why yet another ELN? What
makes this one different to the others on the market? The
latter two questions will be addressed in more detail in
the two following sections. Nevertheless, short answers
are provided here already to each of the three questions.
Who is the target audience of both this article and
the software presented? The article as well as the soft-
ware presented aim at academic researchers, either in-
dividuals or small groups, with limited time and (IT)
resources. While generally applicable, the use cases pre-
sented stem from the personal experience of close to two
decades working in spectroscopy applied to fundamen-
tal research [42–44]. This field is characterised by
intrinsically digital data acquisition that is, however, rarely
directly automatable and automated. Furthermore, it is
rather on the small-scale end of science, often referred to
as ‘little science’ as compared to the data-intensive ‘big
science’ [31, 39].
What is (and is not) an ELN anyway? There are proba-
bly as many answers as there are products on the market.
In the context of this article, an ELN is primarily the digi-
tal analogue of the paper-based laboratory notebook, i.e.
a place for the individual researcher to document their
work in the lab, ideally in a structured and organised
way, helping to (retrospectively) reproduce or at least
figure out what has been done and why. Key differences
to the paper-based notebook are the much easier acces-
sibility, the ease with which media such as images can be
included, and the possibility of clickable cross-references
(actual hypertext). This means that things such as an
inventory or data storage are per se not part of an ELN,
although many implementations add those features to
their portfolio. For details, see the next section.
Why yet another ELN? What makes the LabInform
ELN different to the others available on the market? Ba-
sically, it tries to follow the UNIX philosophy [45, 46] in
many aspects, namely the mantra: do one thing and do it
well. It is lightweight, e.g. not using a database as central
element. It is flexible and thus adaptable by other disci-
plines. Furthermore, it is based on (battle-proven) open-
source wiki software, i.e. not a dedicated ELN software.
This translates into a much larger and broader user base
of the underlying technology with prospective long-term
support. Last but not least, it has a small footprint and
minimal requirements in terms of hardware and personal
infrastructure, i.e. skilled administrators. For details,
see the section after next.
II. WHAT IS AND WHY USING AN ELN?
There are many different competing and at least partly
incompatible definitions of the term ‘electronic lab note-
book’. Hence it seems reasonable to first provide our
definition of an ELN. Compared to many products avail-
able on the market, we adopt a rather narrow definition
of the term and mention both key features and
non-features of ELNs. Many of these non-features are
valid requirements for a digital research infrastructure,
but they should be implemented using dedicated though
interoperable tools, following the separation of concerns
approach [32, 47] known from software development.
Very simply put, an ELN is ‘just’ a digital replace-
ment for the paper-based lab notebook, and one of the
simple reasons why to use an ELN would be its accessi-
bility via a web browser from basically everywhere. We
will not provide a detailed account for when and when
not to use ELNs and their advantages and disadvantages.
The interested reader is referred to the body of literature
available [15, 16, 48–51]. We rather summarise key and
additional useful features of ELNs before continuing in
the next section with the specifics of the LabInform ELN
and what makes this one different from existing solutions.
Key features First of all, an ELN should be able to re-
place the conventional, paper-based lab notebook. While
switching from one system to another is always a natural
place to rethink previous decisions, basically everything a
scientist has done so far with their paper-based notebook
should be possible with the ELN of choice as well. Next
is access to the ELN from wherever necessary, definitely
from within the lab and when analysing the data, be it
in the office or perhaps at home. Therefore, a web-based
ELN is clearly a natural choice nowadays. Note that
‘web-based’ does not imply a cloud solution, though. For
a more detailed discussion why an ELN should always be
hosted locally rather than cloud-based, see below. Of
course, including images and other media files is a neces-
sary requirement, but the ELN should allow for including
and storing structured information (e.g., key–value pairs)
as well that can ideally be accessed digitally via an appli-
cation programming interface (API). From own and oth-
ers’ experience, creating new pages should be template-
based [52], and the templates used should be easily adjustable
from within the ELN, without needing to resort to (external)
programmers or technical administrators. Depending on
the context, it is a (legal) requirement of an ELN to com-
ply with auditing requirements. An additional necessary
prerequisite of an ELN is an easy (tabular) overview of
experiments or measurements that is at least sortable,
better with the possibility to apply filters. Furthermore,
being able to export all contents to a generic format is
a strict requirement. No ELN will exist forever, and the
information contained therein is too valuable to be lost.
Hence an ELN needs to be able to automatically export
(convert) its contents to a generic format.[53]
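As an illustration of how structured information in wiki pages becomes machine-actionable, the following Python sketch parses simple key–value lines from a page fetched via DokuWiki's XML-RPC interface. The URL, page name, and the exact metadata format are assumptions for illustration, not a prescription:

```python
import xmlrpc.client

def parse_metadata(page_text):
    """Extract simple 'key: value' metadata lines from wiki page source.

    Lines starting with heading, table, or cell markup are skipped;
    the 'key: value' convention itself is a hypothetical example.
    """
    metadata = {}
    for line in page_text.splitlines():
        if ": " in line and not line.startswith(("=", "^", "|")):
            key, _, value = line.partition(": ")
            metadata[key.strip()] = value.strip()
    return metadata

def fetch_metadata(url, page):
    """Fetch a page via DokuWiki's XML-RPC endpoint and parse its metadata.

    DokuWiki exposes XML-RPC at lib/exe/xmlrpc.php; URL and page name
    below are placeholders to be adapted to the actual installation.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    return parse_metadata(proxy.wiki.getPage(page))

# Example (placeholder URL and page name, not executed here):
# fetch_metadata("https://eln.example.org/lib/exe/xmlrpc.php",
#                "labbook:2024:0042")
```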
Useful additional features Besides the key require-
ments for an ELN listed above, there are additional fea-
tures making an ELN more powerful and sometimes eas-
ier to use. Two should be mentioned here: an inven-
tory, at least for samples, and interfaces to processing
and analysis software that allow automatically generating
contents for the ELN. Particularly with the inventory,
cross-links from an individual labbook page to the sam-
ple measured and back come in quite handy, a key com-
ponent of the semantic web [27–30] mentioned earlier.
With regard to the processing and analysis software, it is
important to note that an ELN should provide an inter-
face to those routines, but not the actual routines, again
a matter of separation of concerns [32, 47].
Non-features Besides the features listed above, there
are a few things that are often named in conjunction with
ELNs but are not key components of any ELN. An ELN
in itself is already complex, and there is no such thing
as a turnkey one-size-fits-all solution, particularly in
academic research. Therefore, it is much better to sep-
arate concerns [32, 47] and deploy a series of individ-
ual, though interacting components. This is at the heart
of the UNIX philosophy: do one thing and do it well.
[45, 46] An ELN is not a repository for your data. Al-
though it is tempting to store your data within an ELN,
this is usually not a good idea, as the data should sur-
vive for much longer than the (ELN) software used as a
repository. Data should be stored using a dedicated so-
lution that takes care of preserving a structure as well as
integrity and proper backup of the precious data. [54] Of
course, it is a good idea to provide stable references from
within the ELN to the actual data, another application
of key concepts of the semantic web [27–30]. Depending
on the context, this could be a PID or simply a naming
scheme that is reflected in the directory hierarchy of the
data in the file system. For a more detailed discussion
see the LabInform datasafe [54] and LOI (Lab Object
Identifier) components [55]. Furthermore, an ELN is not
a catalogue of your data. One of the key features listed
above for an ELN is an overview of experiments or mea-
surements, but that is much more low-level than a proper
catalogue. Last but not least, an ELN is not an inventory
for samples and the like. Admittedly, having a small-scale
inventory of the samples available within an ELN solu-
tion can come in quite handy, and the LabInform ELN
does come with such inventory. But that is a matter of
convenience, not a requirement.
Increasingly, ELNs include functionality for data pro-
cessing and analysis, a prominent example being Chemo-
tion [56] including Chemspectra [57] for NMR, IR, and
MS data. However, this too is not part of an ELN,
and there are good reasons to separate it from an ELN
solution and provide a separate, though seamlessly inter-
acting solution. Particularly useful would be to automat-
ically include results from data analyses into the individual
labbook pages documenting the recording of the under-
lying primary data. This is possible using APIs that allow
accessing the ELN from the analysis framework. For an
approach to a fully reproducible data analysis see the
ASpecD framework [58] and packages based on it [59–63].
Other aspects sometimes covered by ELNs or LIMS
are the management of measurement setups in the lab or
even a more general project management, including pro-
posals and publications. While again not part of an ELN,
this has been implemented, e.g. in the larger LabInform
infrastructure [55] but somewhat separate from and in-
dependent of the actual ELN presented here.
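Pushing analysis results into labbook pages, as described above, can be sketched using DokuWiki's XML-RPC interface. The URL, page name, and report layout below are illustrative assumptions; `dokuwiki.appendPage` appends text to an existing page:

```python
import xmlrpc.client

def format_result(figure_file, parameters):
    """Render an analysis result as a DokuWiki text block.

    Embeds the figure using DokuWiki's {{:file}} media syntax and lists
    the (hypothetical) fit parameters as an unordered list.
    """
    lines = ["===== Analysis =====", "", "{{:%s}}" % figure_file, ""]
    lines += ["  * **%s:** %s" % (key, value)
              for key, value in parameters.items()]
    return "\n".join(lines)

def append_report(url, page, report):
    """Append a report to a labbook page via DokuWiki's XML-RPC API."""
    proxy = xmlrpc.client.ServerProxy(url)
    proxy.dokuwiki.appendPage(page, report, {"sum": "automatic analysis"})

# Example (placeholder URL and page name, not executed here):
# append_report("https://eln.example.org/lib/exe/xmlrpc.php",
#               "labbook:2024:0042",
#               format_result("fit.png", {"T1": "2.3 us"}))
```

In this way, the analysis framework stays a separate component, as argued above, and merely writes its results back through a universal interface.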
III. WHY LABINFORM ELN? WHAT MAKES
THIS ONE DIFFERENT?
There are already quite a number of ELNs and even
open-source solutions available, for an overview see
[15, 16]. Hence why yet another ELN solution, and what
makes this one different? Besides the fact that the LabIn-
form ELN described here has been used successfully over
the past ten years by a number of people, we are con-
vinced that it has a few features and design decisions that
make it stand out from the existing solutions and hence
worth sharing with a larger community. Most notably,
it tries to adhere to the UNIX philosophy [45, 46]. This
is a clear difference to most other ELNs that increasingly
try to incorporate many more things, as particularly seen
with elabFTW [64] and Chemotion [56], but openBIS [65]
as well, the latter introducing itself as ELN-LIMS.
The LabInform ELN is based on battle-proven wiki
software, DokuWiki [66], with minimal system require-
ments (e.g., no database server) and a small footprint.
Furthermore, it comes with a hierarchical structure, sup-
port for templates, and fine-grained access control, and
allows administration and maintenance almost exclusively
from within its web interface, minimising the effort for
dedicated IT personnel. Finally, it is highly flexible
and adaptable due to a robust plugin architecture and a
series of high-quality plugins maintained by the author(s)
of DokuWiki itself.
In the next section, we describe the individual design
decisions of the LabInform ELN in quite some detail,
as we value these decisions higher than the actual im-
plementation. Therefore, we anticipate our ELN to be
useful even for those using other solutions, as they may
well adapt these to their needs.
In this section we will discuss why the LabInform ELN
is wiki-based, why we use DokuWiki (and not, e.g. Me-
diaWiki), compare the LabInform ELN to other leading
open-source ELNs, and briefly mention why ELNs should
always be hosted locally with full (and exclusive) control
of the respective institution, group or scientist over the
data.
A. Why wiki-based? Flexibility
Scientific recordkeeping [1], and particularly a labora-
tory notebook [5, 6], requires a lot of flexibility in how
information can be recorded, both in terms of content and
its structuring. Therefore, a wiki that provides a simple
syntax, freedom of formatting, but as well pre-defined
(and user-adjustable) templates is an excellent choice.
Actually, the first wiki was created by Ward Cunning-
ham in 1994 [67], only shortly after the invention of the
World Wide Web by Tim Berners-Lee in 1990 [28], and
the ‘wiki way’ implemented ideas that Berners-Lee had
in mind originally: web pages were always meant to be
writable as well, for the community to contribute. Fur-
thermore, we are not the first to successfully use a wiki as
an ELN [68].
A wiki usually provides some special (reduced) markup
language focussing on logical text markup rather than
formatting, making it much easier for the
users to write structured text than hard-coding all
this in HTML. Compared to word processors, this en-
forces structured information to a certain extent, help-
ing tremendously with storing and afterwards accessing
structured information in the ELN. This is part of the
way towards machine-actionable information.
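For illustration, a hypothetical labbook entry in DokuWiki markup, using headings, a list with logical markup, a table, and a cross-link to an (invented) inventory page:

```
====== Measurement M003 ======

  * **Sample:** [[inventory:samples:sa042|SA042]]
  * **Date:** 2024-05-17
  * **Operator:** MS

^ Parameter       ^ Value ^
| Temperature     | 80 K  |
| Microwave power | 2 mW  |
```

The markup stays close to the logical structure of the record, and the rendered page is immediately readable, while the plain-text source remains easy to parse.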
Another reason for using a wiki, and hence a pre-
existing software solution, is the technical debt coming
with every piece of software [69–71]. Hence, it is always
good to not reinvent the wheel, but to use existing, suc-
cessful software components. A last reason in favour of
using a wiki as technical foundation for an ELN: no ELN
will ever be a turnkey solution, and you will always need to
adapt it to your needs, be it for a single person, a group,
or a larger organisation. Therefore, an ELN should be
modular, easily expandable and simple to adjust by your-
self to your specific needs. Key to this are a template sys-
tem and an administration from within the web interface,
with no need for direct access to the file system. The lat-
ter raises both security and competence concerns,
given the usual lack of trained IT personnel.
B. Why DokuWiki? Robust, modular, small
footprint
One of the key reasons for using DokuWiki as the tech-
nical base of the LabInform ELN is a key design phi-
losophy of DokuWiki, namely that it does not rely on a
database as central storage, but on plain text files on the
file system. What may appear as a minor detail makes it
stand out from most other wiki systems and gives rise to
its robustness, resilience, and long-term availability.[72]
The latter is particularly important in the context of science:
The DokuWiki engine may long be gone and defunct,
but you will still be able to easily retrieve all information
contained within your wiki using whatever tools you have
available in your operating system for searching through
(text) files on the file system. For a more detailed discus-
sion on file formats and why to (always) prefer text files
for information storage, cf. [46, ch. 5] and the discussion
in [73]. Besides the resilience and long-term availability
and readability of the contents, relying on plain text files
in a folder hierarchy on the file system comes with obvi-
ous additional benefits: Backups are tremendously sim-
plified, as only one directory needs to be backed up. Fur-
thermore, this architecture allows for easily adding whole
content areas (namespaces) by simply copying the rele-
vant files on the file system level.
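As a sketch of how simple such a backup can be (the directory locations are placeholders to be adapted to the actual installation):

```python
import shutil
import time

def backup_wiki(data_dir, target_dir):
    """Archive the whole DokuWiki data directory (pages, media, history).

    Returns the path of the created archive. A typical data directory
    might be /var/www/dokuwiki/data, depending on the installation.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    return shutil.make_archive(
        "%s/eln-backup-%s" % (target_dir, stamp), "gztar", data_dir)
```

A single call, or a one-line cron job wrapping it, thus preserves pages, media files, and the complete version history in one archive.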
As mentioned, DokuWiki is a proven, lightweight wiki
software with minimal requirements. Thanks to omit-
ting the database as central storage, it comes with much
lower system requirements than e.g. MediaWiki, simpli-
fying setup and administration, particularly in the context of
small groups in science that often lack IT personnel. Fur-
thermore, DokuWiki is open source, has a good and up-to-
date code base, many extensions, and a large active user
base. The current development of the wiki core focusses
on maintainability and robustness, i.e. code quality [74].
Thanks to its robust plugin architecture, DokuWiki
comes with many helpful extensions for knowledge man-
agement and process control that can be used as well
to provide core ELN functionality. Central plugins used
for the LabInform ELN are maintained by the DokuWiki
author himself; therefore, there are no additional exter-
nal dependencies. Nevertheless, thanks to the well-main-
tained interface and documentation, it is easy to write one's own
extensions. Last but not least, there is a company sup-
porting the development of DokuWiki that could take
over development if needed, even though anyone who can
work with PHP is qualified to do so. This adds to the
resilience, long-term availability, and independence from
a vendor or service provider.
On a more general note, DokuWiki has been devel-
oped for knowledge management from the start, hence
it is well suited for (structured) documentation as is the
nature of scientific recordkeeping and writing lab note-
books. In contrast to other wikis, DokuWiki is a hierar-
chical wiki. This allows for a better separation of areas,
such as (sub-)groups or different topics. Furthermore, it
provides ‘hackable URIs’ that allow for easy access when
knowing the structure of the wiki. Another important
aspect for the use as an ELN is the fine-grained access
control: Not everyone should be allowed to read every-
thing, let alone edit it, not even in a working group or
department; hence access control can quickly become im-
portant. Furthermore, DokuWiki comes with complete
built-in version control, allowing for a complete audit trail
[75]. Last but not least, a wiki has a much broader user
base than a dedicated ELN and hence possibly a much
longer survival time for the software as such. DokuWiki
itself has existed since July 2004, and therefore already much
longer than the average ELN [48].
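To illustrate this fine-grained access control, a hypothetical fragment of DokuWiki's `conf/acl.auth.php` (group and namespace names are invented; permission levels range from 0, no access, to 16, delete):

```
# <page or namespace>   <user or @group>   <permission>
*                       @ALL        0
labbook:groupa:*        @groupa     8
labbook:groupa:*        @groupb     0
inventory:*             @ALL        1
inventory:*             @admins     16
```

Here, members of one (invented) group can read, edit, and upload within their own labbook namespace, other groups cannot even read it, and the shared inventory is readable by everyone but fully manageable only by administrators.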
Given that MediaWiki, the engine behind the well-
known Wikipedia, is probably the most widespread wiki
software, a few words seem justified why MediaWiki does
not make for a decent technical base for an ELN in our
opinion. MediaWiki has been developed for an entirely
different use case, namely high load with many users and
traffic. Furthermore, it does not provide an appropriate
access control system, as MediaWiki was developed for
Wikipedia, and was always meant to be as open as possi-
ble. Actually, MediaWiki was never intended to provide
access control, and all plugins for this purpose are inse-
cure according to official MediaWiki developers [76]. Fur-
thermore, MediaWiki does not provide any hierarchy and
hence has a rather flat structure, making organising your
information in different areas a lot harder. Besides that,
MediaWiki uses a database engine at its core as central
storage. This makes the setup and maintenance much
more complicated compared to DokuWiki and comes
with more dependencies. Furthermore, there is no sim-
ple adding/moving of content, let alone its backup, and
export of the contents to other formats is more difficult.
Finally, administration of a MediaWiki instance is mainly
done via direct access to the underlying server, thus re-
quiring trained IT personnel. Taken together, while def-
initely an excellent piece of software with a vibrant user
and developer community, MediaWiki just does not seem the
right tool to base an ELN on.
C. Comparison to other open-source ELNs
In light of a number of open-source ELNs available,
not to mention the countless commercial products on the
market, we compare here the LabInform ELN to four
open-source ELNs, namely openBIS [65], elabFTW [64],
Chemotion [56], and Kadi4Mat [77]. This comparison is
clearly not exhaustive, and there are good reasons for
using each one of these or any other ELN.
Generally, most other ELNs try to provide much more
than an ELN. The LabInform ELN is much closer to the
UNIX philosophy [45, 46]: do one thing well, use univer-
sal simple interfaces to other components, be modular,
interoperable, and flexible. All other open-source ELNs
mentioned are based on a database, making installation
and maintenance more difficult. Furthermore, none of
the ELNs discussed have a broad developer community
in the background, at least none larger than that of Doku-
Wiki. While at least Chemotion is partly supported by
and funded within the German national research data in-
frastructure (NFDI) programme [78], this is not necessar-
ily sustainable, as comparable programmes from other coun-
tries have shown [39, 79, 80]. Tools developed within such
a funding programme will unfortunately often not last
much longer than the funding period. Key to sustainable
operation of a software is hence a large user base as well
as a group of developers, not to mention a high-quality
code base making it easy for others to take over develop-
ment. Quite naturally, due to the much broader audience
and use cases of DokuWiki, the user community of Doku-
Wiki exceeds that of any of the ELNs [81]. Furthermore,
DokuWiki has a community of programmers interested
in knowledge management solutions, hence programmers
that themselves use the product they develop, but whose
day job is software engineering. ELNs are either devel-
oped by programmers (increasingly) disconnected from
science (at least its day-to-day operation), or by scien-
tists that rarely have been trained how to professionally
develop software that is deployed large-scale. [82–86]
As for the individual ELNs, openBIS [65] has a clear
focus on (micro-)biology, as the full name open Biol-
ogy Information System reveals. Furthermore, it intro-
duces itself as ELN-LIMS, hence with a much broader
approach than ‘just’ an ELN. elabFTW [64] is the pro-
totype of a software increasingly incorporating many ad-
ditional features beyond its original purpose as an ELN.
This results in a much more complex setup that is correspond-
ingly more complicated and difficult to maintain. Fur-
thermore, there is basically one main developer, and
only installation as (docker) container is officially sup-
ported due to the many dependencies. While perfectly
valid from a developer’s point of view, this does not
necessarily point towards a robust and simple installa-
tion and maintenance. Chemotion [56] has a clear fo-
cus on synthetic chemistry, including nice features such
as a molecule viewer and the display and even manip-
ulation of spectra [57]. Kadi4Mat [77], the Karlsruhe
Data Infrastructure for Materials Science, is again much
more than an ELN, and it combines a repository (for
‘warm’, i.e. unpublished research data) with an ELN.
Furthermore, the focus of its ELN component is on the
automated and documented execution of heterogeneous
workflows. While open for other research disciplines as
well, Kadi4Mat is rooted in and developed for materials
science and its special needs and workflows.
Just to be clear: None of the arguments above provide
a case against each of the named ELNs. Each of them is a
nice piece of software a perfectly valid choice as an ELN.
Our aim is just to highlight where the LabInform ELN
differs from the other available (open-source) solutions,
and why it may be suited for scientists not happy with
any of the others, whatever the reason.
D. Data storage: local vs. cloud
There are a number of popular (commercial) ELNs, such
as LabFolder and SciNote, that are usually not hosted
on-premise, but provided as software as a service (SaaS)
by the respective vendor. The same applies to using
OneNote and similar tools. This warrants some discus-
sion as to where to store the contents of an ELN. Three
rather generic aspects are (i) intellectual property rights
(IPR), (ii) the true costs of SaaS, and (iii) limited access
from the laboratory network.
Generally, the question of where the information con-
tained in an ELN is stored is directly connected to IPR.
Therefore, without explicit technical measures ensuring
that access to the information is restricted and controlled,
you may simply not be allowed by applicable law to use
a cloud-based solution for your ELN,
even more so in the European Union with its rather
even more so in the European Union with its rather
strict privacy legislation, e.g. the General Data Protec-
tion Regulation (GDPR). Of course there exist general
technical solutions to this problem, e.g. encrypting and
decrypting all data locally and storing only the encrypted
data in the cloud. However, this is not always easy or at
all possible to implement.
While often quite attractive, particularly for small sci-
entific groups with limited IT capabilities, using ELNs
provided as SaaS has clear disadvantages. As a mat-
ter of fact, SaaS usually ends up being more ex-
pensive, with less control over both costs and available
services. Hence it appears only initially more attractive
than hosting a technical solution locally. This is by no
means restricted to ELNs but holds true for most digital
businesses, although outsourcing is still pretty much en vogue.
As a last general remark, depending on the IT secu-
rity context, access to an externally hosted ELN may not
be possible from within the laboratory network. There
are excellent reasons to shield the laboratory network
from the internet, perhaps the most intuitive being that
hardware controlling measurement setups is much longer-
lived than operating system versions, but often does
not allow for updates due to incompatibilities. In case
the ELN resides in the institutional demilitarised zone
(DMZ), none of these problems exist, and access from
outside can rather easily be provided employing well-
established techniques such as virtual private networks
(VPN) that are routinely used in this context.
Ultimately, the question of whether to install an ELN
locally or in the cloud does not really arise, as there are
few kinds of data more sensitive and private than a lab book. Local
here means at least internal to the institution, but it can
also be installed and used decentrally by (sub)groups and
individual researchers. Such a local or institutional in-
stallation is dramatically simplified by low resource con-
sumption and low administrative effort in installation
and maintenance, as is true for DokuWiki and hence the
LabInform ELN. DokuWiki can even be installed com-
pletely locally on a memory stick, but typically one will
install it on a local server that can be accessed from the
lab etc. as well as from the workplace. As mentioned,
external access can be realised via VPN if this additional
security is desired or required. Last but not least, a reg-
ular (automatic) backup of the LabInform ELN is very
simple, as only two directories (conf, data) need to be
backed up and no databases are involved [87, 88].
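The two-directory backup mentioned above can be sketched in a few lines. The following is a minimal illustration only, assuming a standard DokuWiki directory layout; the function name and paths are ours, not part of the LabInform ELN:

```python
import tarfile
import time
from pathlib import Path


def backup_dokuwiki(wiki_root: str, backup_dir: str) -> Path:
    """Create a timestamped tar.gz archive of a DokuWiki instance.

    Only the two directories mentioned in the text -- 'conf' and
    'data' -- need to be saved, as DokuWiki stores everything in
    plain files and no database dump is required.
    """
    wiki = Path(wiki_root)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = target / f"dokuwiki-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for directory in ("conf", "data"):
            tar.add(wiki / directory, arcname=directory)
    return archive
```

Scheduling such a script, e.g. via cron, yields the regular automatic backup described in the text.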
IV. KEY CONCEPTS AND FEATURES OF THE
LABINFORM ELN
As mentioned already, the LabInform ELN has been
developed by spectroscopists for their specific use cases.
This does not imply that the LabInform ELN is limited
to spectroscopy. It is rather that spectroscopy is the field
the authors feel competent in and for which they can share
real-world experience.
A. Workflows implemented in the LabInform ELN
A characteristic of spectroscopy and its associated
workflows that informed the design of the LabInform
Batch → Sample → Measurement
Figure 1. Workflow of documenting a measurement as im-
plemented in the LabInform ELN: Each measurement is car-
ried out on a sample that in turn is derived from a batch.
Therefore, you usually start with creating a page for a batch,
derive a sample from it and create the respective page for
the sample, and only then you create the labbook page for
the measurement. Due to the PIDs for batches and samples,
cross-references are automatically created. For details, see
the text.
ELN is its focus on measurements with a given method
performed on a particular sample. Hence a key require-
ment of an ELN in this context is to document in suf-
ficient detail the measurements performed on individ-
ual samples, while thanks to its digital nature providing
cross-links from a labbook entry of an individual mea-
surement to the relevant sample as well as automatically
aggregated overview tables of measurements, both for a
given method and for measurements on a particular sam-
ple. As we will demonstrate, a very similar workflow can
be developed for quantum-chemical calculations. The au-
thors have successfully used the LabInform ELN for doc-
umenting their quantum-chemical calculations as well.
The workflow of documenting a measurement as imple-
mented in the LabInform ELN is graphically represented
in Fig. 1. As mentioned, in spectroscopy each measure-
ment is carried out on a sample. Usually, such a sample
is not consumed by the measurement, and even if it is, there
is usually a stock from which it originated, so the same
batch of material is still there for creating new samples
and performing comparison measurements. This leads
first to the discrimination between sample and batch. A
sample is the entity that gets investigated (measured),
i.e. the entity that is actually put into the spectrometer
in spectroscopy, and it is the result of some preparation
starting with (parts of) a batch. A batch is the individual
supply from collaborations, syntheses, or manufacturers
and the starting point of a sample. To allow for auto-
matic cross-referencing between measurements, samples,
and batches, and to provide a place for storing additional
information about samples and batches, the LabInform
ELN comes with an inventory of samples and batches.
For the workflow of documenting a measurement, this
means that you usually start with creating a page for a
batch that originated either from a collaboration, a syn-
thesis or a manufacturer. Afterwards, you (physically)
derive a sample from it that you can actually measure,
e.g. by dissolving or diluting it or simply putting it into
a compartment for measuring, be it a tube, an optical
cell, or a rotor for MAS NMR. Within the LabInform
ELN you create the respective page for the physically
existing sample, and only then do you create the labbook
page for the measurement. Upon creating the entries
Molecule → Geometry → Calculation
Figure 2. Workflow of documenting (quantum-chemical) cal-
culations as implemented in the LabInform ELN: Each calcu-
lation is carried out on a geometry that in turn is connected
to a molecule. Therefore, you usually start with creating a
page for a molecule, create a geometry of this molecule and
create the respective page for the molecular geometry, and
only then do you create the labbook page for the calculation. For
details, see the text.
for batches and samples, both are automatically given
unique (and persistent) identifiers that allow for auto-
matic cross-referencing within the LabInform ELN.
Especially from a spectroscopy point of view, the
overview of the measurements performed on an individ-
ual sample and an overview of the existing samples are
crucial aspects. Often, several different measurements
with different purposes will be performed on a single sam-
ple. Similarly, an overview of all measurements of a par-
ticular type in the form of a table that can be sorted and
filtered is a very useful tool. Hence, the LabInform ELN
provides two distinct ways to find and access a laboratory
book entry: by type of measurement and by sample. De-
tails of how to create the respective entries for batches,
samples, and measurements will be given below, as well
as how to access the individual pages.
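To make the batch–sample–measurement relations concrete, the following minimal in-memory model mimics the sequential identifiers and automatic cross-references described above. Class names and identifier formats are our own illustration, not the actual PID scheme of the LabInform ELN:

```python
from itertools import count


class Inventory:
    """Toy model of the batch -> sample -> measurement hierarchy."""

    def __init__(self):
        self._batch_counter = count(1)
        self._sample_counter = count(1)
        self.samples_of_batch = {}        # batch PID -> list of sample PIDs
        self.measurements_of_sample = {}  # sample PID -> list of records

    def new_batch(self) -> str:
        """Create a batch entry and return its (sequential) identifier."""
        pid = f"batch-{next(self._batch_counter):04d}"
        self.samples_of_batch[pid] = []
        return pid

    def new_sample(self, batch_pid: str) -> str:
        """Derive a sample from a batch, cross-referencing both."""
        pid = f"sample-{next(self._sample_counter):04d}"
        self.samples_of_batch[batch_pid].append(pid)
        self.measurements_of_sample[pid] = []
        return pid

    def new_measurement(self, sample_pid: str, method: str) -> dict:
        """Document a measurement; it appears in the sample's overview."""
        record = {"sample": sample_pid, "method": method}
        self.measurements_of_sample[sample_pid].append(record)
        return record
```

The enforced order of the calls, batch before sample before measurement, is exactly the workflow of Fig. 1.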
As mentioned, a very similar workflow to the one de-
scribed above for documenting (spectroscopic) measure-
ments on samples can be implemented for documenting
(quantum-chemical) calculations. A first graphical rep-
resentation is given in Fig. 2, and the similarities to the
workflow for documenting measurements (Fig. 1) are im-
mediately obvious. The focus here is on documenting
quantum-chemical calculations. Each individual calcu-
lation is carried out on a given geometry, and a geom-
etry is always connected to a molecule as a chemical en-
tity. Therefore, the LabInform ELN provides an inven-
tory for molecules and geometries, and as with batches
and samples, upon creating the respective entries they
are automatically given unique (and persistent) identi-
fiers that are used for automatic cross-referencing. To
perform a calculation, you first need to create an entry
for a molecule, then create an entry for a particular ge-
ometry of this molecule, and only then can you create the
entry for documenting the calculation, i.e. the actual lab-
book page. As with batches and samples, there can (and
usually will) be several geometries for one and the same
molecule, and the molecule's page contains an automat-
ically generated overview table of the connected geome-
tries with links to their respective entries. The same is
true for calculations on a given geometry. Furthermore, a
geometry can be the result of a calculation, as is the case
in the usual first step of quantum-chemical calculations,
namely the geometry optimisation. Last but not least,
as the calculations usually have some relevance for the
experimental spectroscopic work, a molecule can have a
reference to a physical batch.
B. Core aspects of the LabInform ELN concept
The authors firmly believe that only systems that are
sufficiently easy to use and whose use promises obvious
advantages will be used. For an ELN this means that
it needs to contain structured information that is ideally
at least partially machine-actionable. Furthermore, an
ELN needs to provide a convenient, intuitive and user-
friendly interface fitting seamlessly into the workflow of
the individual scientist. How does this translate to the
concepts and features of the LabInform ELN?
Labbook pages, regardless of whether they document
measurements or calculations, are generated via a (web)
form, which in turn uses (user-defined) templates for the
pages. The same is true for the pages documenting the
inventory, namely batches and samples for measurements
and molecules and geometries for (quantum-chemical)
calculations, each using their own respective template.
Templates can even be chosen automatically depending
on the value of certain fields in the web form. For conve-
nience, the following discussion will focus on experiments
but can be applied analogously to calculations. Gener-
ally, there is one page per documented experiment, at
least for one sample and one type of measurement. The
basic metadata for each labbook entry are summarised
at the top of the page as key–value pairs and partly re-
quested via the web form used to create the labbook en-
try. Furthermore, these metadata include automatically
generated cross-links, e.g. to the sample measured. As a
rule of thumb, changes to these metadata listed at the top
of the page, be it for another measurement or another
type of experiment, logically require creating a new lab-
book page. This rule is not technically enforced by the
LabInform ELN, but necessary for its consistent use. For
each type of experiment there is a separate type of lab-
book entry, each with its own (user-defined) template.
Labbook pages are created chronologically, per month
in a separate namespace (directory) to keep the number
of media files per directory manageable. Last but not
least, the basic information contained in the metadata at
the top of the individual labbook entries is aggregated in
overview tables.
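The idea of machine-actionable key–value metadata feeding overview tables can be illustrated with a small sketch. The header format and function names below are our own assumptions; the actual LabInform ELN realises this via DokuWiki templates and plugins:

```python
def parse_metadata(page_text: str) -> dict:
    """Extract 'key: value' pairs from the header block of a labbook page.

    Illustrative only: the header format is our own assumption; the
    metadata block is taken to end at the first blank line.
    """
    metadata = {}
    for line in page_text.splitlines():
        line = line.strip()
        if not line:
            break  # metadata block ends at the first blank line
        if ":" in line:
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip()
    return metadata


def overview_table(pages: list[str], columns: list[str]) -> list[list[str]]:
    """Aggregate selected metadata fields of many pages into table rows."""
    return [[parse_metadata(page).get(col, "") for col in columns]
            for page in pages]
```

Each row of such a table corresponds to one labbook entry, which is precisely how the overview tables described below are populated.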
A prototypical example of a labbook page of an indi-
vidual measurement of a sample is shown in Fig. 3. The
contents of such a labbook page are: (i) the basic meta-
data as key–value pairs that can be used to aggregate the
information contained therein in overview tables; (ii) a
detailed log (with timestamps) of the history of the measure-
ment; (iii) some general comments; (iv) a first evalua-
tion or analysis of the recorded data, typically in the form of
a graphical representation that could be automatically
generated, e.g. using analysis tools [59–62] based on the
ASpecD framework [58]; these analyses serve as a quick
overview, especially if one does not have a catalogue, and
they could eventually be included automatically by the
analysis tools via the XML-RPC interface provided by
DokuWiki; (v) further plans; (vi) metadata for the mea-
surement as a downloadable code block, e.g. an Infofile
[58, 59, 73].
Figure 3. Prototypical labbook entry for a measurement on a
sample. Elements are the basic metadata of the measurement,
most of them entered in the form for creating the labbook
entry, a detailed log, comments, a first analysis (typically a
graphical representation of the data), further ideas, and the
Infofile [73] of the measurement. For details see the text.
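Such an automatic inclusion via DokuWiki's XML-RPC interface could look roughly as follows. This is a sketch only: the page name, media ID, and credential handling are assumptions, and the remote API needs to be enabled in the wiki configuration:

```python
import xmlrpc.client


def analysis_snippet(media_id: str, caption: str) -> str:
    """Build DokuWiki markup embedding an analysis figure.

    Uses plausible DokuWiki image syntax; the conventions of a
    given wiki may differ.
    """
    return f"{{{{:{media_id}?400|{caption}}}}}\n\n//{caption}//\n"


def append_analysis(xmlrpc_url: str, page: str, snippet: str) -> None:
    """Append a snippet to a labbook page via DokuWiki's XML-RPC API.

    Assumes the remote API is enabled and that the URL carries HTTP
    basic auth credentials, e.g.
    https://user:pass@wiki.example.org/lib/exe/xmlrpc.php
    """
    proxy = xmlrpc.client.ServerProxy(xmlrpc_url)
    proxy.dokuwiki.appendPage(page, "\n" + snippet,
                              {"sum": "automated analysis update"})
```

A data-analysis framework could call `append_analysis` after each processing step, yielding the automatically updated labbook pages envisioned in the text.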
As mentioned, by providing the unique (persistent)
identifier of the sample in the form used for creating the
labbook entry a cross-reference to the page of this sam-
ple in the inventory is created, and at the same time,
the measurement appears in the overview table of mea-
surements on the sample page. This shows the power of
having an inventory as part of the LabInform ELN, al-
though strictly speaking a sample inventory is not a key
component of an ELN.
Recurring structures of a labbook page can be conve-
niently included using building blocks as templates for
parts of a page. These templates can be fully controlled
by the individual users and previewed before including
them into the actual page. A typical example would be
the Infofile of a measurement in case more than one mea-
surement is documented on a single labbook page.
C. Inventory
As described above for the workflows, the LabInform
ELN comes with inventories for both samples that are
experimentally characterised and molecules whose prop-
erties are theoretically calculated. As with the labbook
pages, entries are created using web forms, and unique
and persistent identifiers are automatically assigned to each
individual entry. For samples, the inventory is separated
into batches and samples, as described above, with the
sample being the actual entity experiments are carried
out with. Similarly, for molecules there is the distinction
between molecule and geometry, with the geometry being
the actual entity the calculation is performed on.
Samples get automatically linked from the labbook
pages documenting individual measurements, and sum-
mary tables of the measurements on a sample are automati-
cally created on the individual sample pages. Similarly,
the page of an individual batch contains an automatically
generated overview table listing the samples derived from
this batch. Sample pages provide a link back to the batch
from which the sample was derived. As a batch can be
derived from another batch, e.g. by dissolving a solid mate-
rial, the page of an individual batch in this case provides
a link back to the originating batch. Similarly, the
page of a batch other batches are derived from contains
an overview table listing those batches that were created
from it.
It should be noted that the idea of the inventory of the
LabInform ELN is to provide convenient ways to access
the information on individual measurements or calcula-
tions and allow for (automatic) cross-linking. Therefore,
an inventory for chemicals or details regarding the storage
locations of samples etc. are clearly beyond the
scope of the LabInform ELN, although at least the latter
could be added in a rather straightforward manner to
the templates for the sample pages.
D. Cross-linking
Providing (automatic) references to other pages within
the labbook is one of the crucial characteristics of an ELN
and a clear advantage of the digital implementation as
compared to the physical paper-based labbook. Further-
more, cross-linking is a crucial aspect of the Semantic
Web [27] envisioned by the originator of the World Wide
Web (WWW), Tim Berners-Lee, as well as the WWW
itself. Hence, it comes as no surprise that the
WWW was created in the context of documenting experi-
ments and organising the knowledge of a large-scale scientific
facility (CERN in Geneva).
Within the LabInform ELN, cross-linking is mainly
done automatically using the unique and persistent iden-
tifiers of the individual pages of the inventory. Addition-
ally, there is a special syntax for referring to these PIDs
of samples, batches, molecules, and geometries, provid-
ing the user with a convenient way to add manual cross-
references in a robust way that is independent of the
actual place these pages are located within the wiki.
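The effect of such a place-independent PID syntax can be sketched as follows. The `[[pid>...]]` notation and the registry are our own illustration of the concept, not the actual syntax used by the LabInform ELN:

```python
import re

# Hypothetical registry mapping PIDs to the wiki pages they currently
# live on; moving a page only requires updating this mapping, while
# all pid references in the text remain valid.
PID_REGISTRY = {
    "sample-0001": "inventory:samples:sample-0001",
    "batch-0001": "inventory:batches:batch-0001",
}


def resolve_pid_links(text: str) -> str:
    """Replace PID references with links to the registered page location."""
    def repl(match: re.Match) -> str:
        pid = match.group(1)
        page = PID_REGISTRY.get(pid)
        # Unknown PIDs are left untouched rather than producing dead links.
        return f"[[{page}|{pid}]]" if page else match.group(0)
    return re.sub(r"\[\[pid>([^\]]+)\]\]", repl, text)
```

The indirection through the registry is what makes the references robust against reorganising the wiki.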
Furthermore, when performing a series of experiments
documented on individual labbook pages, it is good prac-
tice to manually add links to the previous and next lab-
book page in such a series within the detailed log. This
makes it much easier to navigate when coming from the
overview table of measurements and having just clicked
the first entry that seemed somewhat reasonable.
E. Overview tables: sortable and with filters
Providing access to the wealth of information con-
tained in an ELN is a crucial aspect. To this end, the
LabInform ELN makes heavy use of automatically cre-
ated overview tables, be it of all measurements using one
method, of the measurements carried out on an individual
sample, of the samples derived from a batch, and more.
Those overview tables will be automatically created for
batches, samples, molecules, and geometries, as they are
part of the templates used for creating the individual
pages.
Crucial for their functionality is that each of these ta-
bles can be both sorted and filtered by each individual
column. As each row in such a table provides at least one
cross-link to another page, the information is easily ac-
cessible. In some sense, one can think of the LabInform
ELN as a ‘poor man’s catalogue’ thanks to this feature,
although a proper catalogue of your research data has
many more features and is clearly beyond the scope of
the LabInform ELN.
F. Help directly within ELN
While the characteristics discussed so far are mostly
related to the contents of the ELN and the workflow,
the following aspects are more technical, though not less
important. A simple yet often overlooked fact: an indivi-
dual's lifetime does not scale, whereas written documentation does.
Therefore, the LabInform ELN comes with help directly
built in. Systems that are used regularly should have a
user interface that is as intuitive as possible. Neverthe-
less, while we can try to minimise accidental complexity
as much as possible, every non-trivial process comes with
inherent complexity we have to cope with [89]. In terms
of an ELN, those who originally designed it and use it
regularly don't need any manual. However, usually there will
be some people new to the lab or only temporarily
present who are going to use the ELN, and others may
only use it infrequently.
While in-person training is highly valuable to get peo-
ple started, being the single point of contact for all ques-
tions regarding a particular tool doesn’t scale well. This
is why in the LabInform ELN, help texts are included
right into the ELN, in a way that they don't disturb the fre-
quent user but help those unfamiliar with the system or only
using it occasionally. Here, brief and to-the-point
explanations of how to perform the task at hand as well
as (lab-specific) conventions are provided. The texts
explain the general features and can be adapted to the
individual requirements.
G. Structures providing overview and simple access
Thanks to the hierarchical nature of DokuWiki and
quite in contrast to alternative wiki engines such as Me-
diaWiki, the LabInform ELN is organised not only in
inventory and labbook pages, but in namespaces (i.e. di-
rectories) for each individual method. Furthermore, the
consistent use of icons throughout the entire ELN helps
with easy recognition of the currently active area. High-
level pages using these very same icons and linking to the
respective areas provide simple access and the necessary
overview. An example is shown in Fig. 4.
H. Administration from within the Web UI
Another strength of the DokuWiki wiki engine, besides
its simple usage, robustness, and small footprint: most
adjustments can be made from within the Web UI. The
same is true therefore for the LabInform ELN. Web forms
and templates are entirely created using the Web UI, and
even moving individual pages as well as larger chunks
of content is possible, besides configuring nearly every
aspect of the wiki engine. Some of these features are
provided by plugins, mostly maintained by the DokuWiki
core developer team.
Figure 4. Start page of the LabInform ELN providing the first
overview and guiding the user. The icons are clickable and
used consistently throughout the ELN, helping to recognise
the respective area.
As a consequence, operating and adjusting the LabIn-
form ELN does not require any detailed IT know-how
(server, terminal) nor access to the file system. This is
particularly helpful for small groups or situations with
limited IT capacities. Additionally, due to the minimal
system requirements, maintenance of the underlying op-
erating system can be limited to a minimum as well.
I. Fine-grained access control and roles
Basically, three roles can be distinguished for the
LabInform ELN, each with a different focus and level
of access: user, wiki administrator, and system admin-
istrator. The regular user of the LabInform ELN uses
the system on a daily basis and, depending on the rights
(role) set, can also create and adjust templates. As
mentioned already, everything can be done via the user
interface of the wiki, hence little to no IT knowledge is
necessary for the user.
The wiki administrator is responsible for the admin-
istration, setup and maintenance of the wiki, including
updates if necessary, as well as creating users and set-
ting appropriate access rights. Again, everything can be
done via the wiki’s admin interface, hence little time is re-
quired for the day-to-day operations, and little IT knowl-
edge is necessary, but a certain willingness to familiarise
oneself with the syntax of DokuWiki and the possibilities
of the plugins.
Finally, the system administrator is responsible for
installing the wiki. Hence this person needs access to the
underlying operating system. Furthermore, this person
is responsible for setting up backups, monitoring, etc.,
as well as for performing updates for the system. Never-
theless, due to the minimal system requirements of the
DokuWiki engine the LabInform ELN is based on, only
a low time commitment is necessary during routine op-
eration.
As an aside, DokuWiki additionally provides the role
of a manager, with extended rights compared to normal
users, but with fewer rights than the wiki administrator.
In a larger installation, such a role can be assigned to in-
dividual persons in a group in order to relieve the central
wiki administrator.
Besides the roles mentioned, access control can be set
in a fine-grained manner, both for individual pages and for
namespaces (i.e. directories). Furthermore, user groups
can be created to account for the specific requirements of a
research group, and access control can be set not only on a
per-user but also on a per-group basis.
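To give an impression of what such fine-grained access control looks like in DokuWiki itself, the following is an illustrative fragment of a `conf/acl.auth.php` file. Group names and namespaces are made up; the permission levels are DokuWiki's documented cumulative values (0 = none, 1 = read, 2 = edit, 4 = create, 8 = upload, 16 = delete):

```
# conf/acl.auth.php -- illustrative entries only
*               @ALL     0    # no access for anonymous visitors
*               @staff   8    # lab members: read, edit, create, upload
labbook:epr:*   @guests  1    # guests may only read the EPR labbook
inventory:*     @staff   8
inventory:*     @guests  0    # inventory hidden from guests
```

Roughly speaking, more specific entries override namespace defaults, so a namespace-wide rule can be relaxed or tightened for individual pages or groups; in practice, these entries are managed conveniently from the ACL manager in the admin interface.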
J. Documentation
While DokuWiki is well documented, the LabInform
ELN comes with additional extensive documentation for
users, administrators, and developers that is available
online [90]. All components of the LabInform ELN are
open-source and readily available. The documentation
should enable everybody to set up a DokuWiki instance
and convert it into a fully functional ELN without relying
on the authors of the LabInform ELN. This is an impor-
tant prerequisite for a sustainable and robust ELN, and
as a matter of fact, the usability of software generally
scales with the quality of the documentation available.
V. OUTLOOK
An ELN is neither an end in itself nor is it a silver bul-
let. Nevertheless, it can clearly be a crucial component
of your research data management [91–93], particularly
given that data are becoming both increasingly digital and volumi-
nous. The workflows described here and implemented
in the LabInform ELN work well for the authors in a
spectroscopic setting deeply rooted in physical chemistry.
Hence they most probably need to be adapted to your
own needs. Bear in mind that no ELN will ever be a
turn-key solution and that we need to first gain a thor-
ough understanding of our own processes before we can
start mapping them to a digital workflow. Furthermore,
the LabInform ELN is but one component of a larger in-
frastructure for research data management employed by
the authors, others being tools for metadata acquisition
during data recording [73], a framework for reproducible
data analysis [58], a local repository [54], and a LIMS
[55]. The interplay of these different components will
briefly be described, and afterwards other, partly similar
solutions mentioned.
While certainly the purpose of an ELN is to document
measurements, machine-actionable metadata recorded
during data acquisition should be stored next to the ac-
tual data files. This is the realm of the Infofile [73]: While
the Infofile aims at collecting all necessary metadata dur-
ing data acquisition, it does not provide an overview of
the measurements (and other actions) that have been
done, and does not provide the necessary context in itself.
However, as shown in Fig. 3, including an Infofile into a
labbook entry is both trivial and highly valuable. Data
processing and analysis should be automated as much as
possible, and a gap-less automatically written protocol of
each individual step including all implicit and explicit pa-
rameters is of particular importance. This has been im-
plemented in the ASpecD framework [58] and packages
based on it [59–63]. Already now, the reporting capa-
bilities of the ASpecD framework can be used to provide
code snippets with graphical (or tabular) representations
of the data analysis that are manually included into a
labbook page (as seen in Fig. 3). The XML-RPC inter-
face of DokuWiki makes it possible to further automate
this process and update labbook pages from within the
ASpecD framework. Extending the functionality of the
ASpecD framework in this direction is currently actively
being considered. In terms of research data management
[91–93] and the research data life cycle [94], further as-
pects that need to accompany an ELN are a (local) data
repository as well as PIDs. Those concepts have been
implemented in the wider LabInform infrastructure [55],
particularly with the Datasafe [54] as repository and the
Lab Object Identifier (LOI) concept for PIDs. Other as-
pects implemented in wiki components of the LabInform
LIMS are more related to knowledge and project as well
as lab management and hence similar to Premier [95] and
LabCIRS [96], respectively.
VI. CONCLUSIONS
Although ELNs are sometimes heralded as the solution
to research data management, there is no such thing as
a silver bullet [89, 97] and it only leads to frustration to
mistake a tool for the solution. Scientific recordkeeping
is clearly a key aspect of science and a prerequisite of
(more) reproducible research. In the progressively dig-
ital environment and with the tremendously increasing
amount of data, ELNs can clearly contribute to reducing
the accidental complexity [89] and help us to focus on the
essential, i.e. intrinsic complexity of science. The best
tools are those that feel natural in their handling rather
than forcing us to do things in ways we didn't intend.
With its minimal system requirements, robustness, mod-
ularity and resilience, we are sure the LabInform ELN is
such a tool, particularly when adapted to one's own needs and
workflows. Thus we anticipate the LabInform ELN and
the ideas behind its implementation to have a high im-
pact in the field, particularly for groups with limited IT
resources, and to help with research data management
resulting in more reproducible research.
ACKNOWLEDGEMENTS
The authors thank all people using earlier instances of
the LabInform ELN and contributing ideas on how to im-
prove it, in chronological order: D. Meyer, K. Serrer,
D. Nohr, J. Popp, C. Matt. Earlier, J. Löwenstein pi-
oneered using DokuWiki as a (less structured) ELN. We
thank D. Meyer in particular for contributing the simple yet pow-
erful idea of labelling samples with numbers, allowing for
simple cross-referencing between inventory and ELN and
providing the most primitive implementation of a PID.
We thank Th. Berthold for showing TB the structure and extent
of the information necessary for a given measurement, and for
his idea to ensure independence and transferability of this
information by storing it in text files located next to the
actual data, an early implementation of a distributed
ELN that gave birth to the Infofile [73]. We thank K. Heidtke for
the reassurance that no ELN will ever be a turn-key solution
and that you first need to understand your processes before
you can map them to a digital workflow, and K. Boldt and B.
Corzilius for their explicit interest in using and further
developing the LabInform ELN in their groups, thus pro-
viding crucial motivation to speed things up. Last but
not least, the main author and maintainer of DokuWiki:
A. Gohr; as well as all other contributors and authors of
valuable plugins used for creating an ELN using Doku-
Wiki.
SOFTWARE AVAILABILITY
The LabInform ELN is free software available under
a BSD license from GitHub:
https://github.com/tillbiskup/labinform-eln.
Extensive documentation can be found online at
https://eln.docs.labinform.de, a demo instance at
https://eln.labinform.de/.
SUPPLEMENTAL MATERIAL
Demo: https://eln.labinform.de/
Documentation: https://eln.docs.labinform.de
[1] Shankar, K. Order from chaos: The poetics and prag-
matics of scientific recordkeeping. J. Am. Soc. Inf. Sci.
Technol. 2007,58, 1457–1466.
[2] Baker, M. Is there a reproducibility crisis? Nature 2016,
533, 452–454.
[3] Stodden, V., Leisch, F., Peng, R. D., Eds. Implementing
Reproducible Research; CRC Press: Boca Raton, 2014.
[4] Wilkinson, M. D. et al. The FAIR Guiding Principles for
scientific data management and stewardship. Sci. Data
2016,3, 160018.
[5] Ebel, H. F.; Bliefert, C.; Russey, W. E. The Art of Sci-
entific Writing; Wiley-VCH: Weinheim, 2004.
[6] Eisenberg, A. Keeping a laboratory notebook. J. Chem.
Educ. 1982,59, 1045–1046.
[7] Bell, G.; Hey, T.; Szalay, A. Beyond the data deluge.
Science 2009,323, 1297–1298.
[8] Szalay, A.; Gray, J. Science in an exponential world. Na-
ture 2006,440, 413–414.
[9] Hey, T., Tansley, S., Tolle, K., Eds. The Fourth
Paradigm; Microsoft Research: Redmond, Washington,
2009.
[10] Note, however, that simply providing the paths to the
data on the file system of a single computer would not
be helpful, as those paths are typically not long-term
stable.
[11] Neugebauer, O. Astronomical Cuneiform Texts; Lund
Humphries: London, 1955.
[12] Zeng, M. L. Metadata, 3rd ed.; Facet Publishing: Lon-
don, 2022.
[13] Riley, J. Understanding Metadata; National Information
Standards Organization (NISO): Baltimore, MD, 2017.
[14] Brand, A.; Daly, F.; Meyers, B. Metadata Demystified;
The Sheridan Press & NISO Press: Hanover, PA, 2003.
[15] Kanza, S.; Willoughby, C.; Gibbins, N.; Whitby, R.;
Frey, J. G.; Zupančič, J. E. A. K.; Hren, M.; Kovač, K.
Electronic lab notebooks: can they replace paper? J.
Cheminf. 2017,9, 31.
[16] Bird, C. L.; Willoughby, C.; Frey, J. G. Laboratory note-
books in the digital era: the role of ELNs in record keep-
ing for chemistry and other sciences. Chem. Soc. Rev.
2013,42, 8157–8175.
[17] Waldo, W. H.; Barnett, E. H. An electronic computer as
a research assistant. Ind. Eng. Chem. 1958,50, 1641–
1643.
[18] Gilbert, W. A. RS/1: An Electronic Laboratory Note-
book. Bioscience 1985,35, 588–590.
[19] Borman, S. A. Scientific Software. Anal. Chem. 1985,57,
983A–994A.
[20] OECD, OECD Principles of Good Laboratory Practice;
1998.
[21] The other three paradigms are: theory, experiment, and
simulation.
[22] Hey, T.; Hey, J. e-Science and its implications for the
library community. Library Hi Tech 2006,24, 515–528.
[23] Hey, T.; Trefethen, A. The fourth paradigm ten years on.
Informatik Spektrum 2020,42, 441–447.
[24] OECD, Recommendation of the Council concerning Ac-
cess to Research Data from Public Funding; 2006;
amended 2021.
[25] Borgman, C. L. The conundrum of sharing research data.
J. Am. Soc. Inf. Sci. Technol. 2012,63, 1059–1078.
[26] Berners-Lee, T. Linked Data. 2006; https://www.w3.
org/DesignIssues/LinkedData.html.
[27] Berners-Lee, T.; Hendler, J.; Lassila, O. The Semantic
Web. Sci. Am. 2001,284, 34–43.
[28] Berners-Lee, T. Weaving the Web: the original design and
ultimate destiny of the World Wide Web by its inventor;
HarperSanFrancisco: New York, 1999.
[29] Shadbolt, N.; Hall, W.; Berners-Lee, T. The semantic
web revisited. IEEE Intell. Syst. 2006,21, 96–101.
[30] Frey, J. G. The value of the Semantic Web in the labo-
ratory. Drug Discov. Today 2009,14, 552–561.
[31] Price, D. J. de Solla. Little Science, Big Science; Columbia
University Press: New York, 1963.
[32] Dijkstra, E. W. The humble programmer. Commun.
ACM 1972,15, 859–865.
[33] Wilson, G. Software carpentry. Getting scientists to write
better code by making them more productive. Comput.
Sci. Eng. 2006,8, 66–69.
[34] Wilson, G. What should computer scientists teach to
physical scientists and engineers? IEEE Comput. Sci.
Eng. 1996,3, 46–55.
[35] Koltay, T. Data governance, data literacy and the man-
agement of data quality. IFLA J. 2016,42, 303–312.
[36] Koltay, T. Data literacy for researchers and data librari-
ans. J. Libr. Info. Sci. 2017,49, 3–14.
[37] Kanza, S.; Willoughby, C.; Bird, C. L.; Frey, J. G.
eScience infrastructures in physical chemistry. Annu.
Rev. Phys. Chem. 2022,73, 97–116.
[38] Jablonka, K. M.; Patiny, L.; Smit, B. Making the col-
lective knowledge of chemistry open and machine action-
able. Nat. Chem. 2022,14, 365–376.
[39] Borgman, C. L. Big Data, Little Data, No Data: Schol-
arship in the Networked World; MIT Press: Cambridge,
MA, 2015.
[40] Whitehead, A. N. An Introduction to Mathematics;
Dover Publications: Mineola, 2017; original 1911.
[41] Allesina, S.; Wilmes, M. Computing Skills for Biologists;
Princeton University Press: Princeton and Oxford, 2019.
[42] Biskup, T. Time-resolved EPR of radical pair intermedi-
ates in cryptochromes. Mol. Phys. 2013,111, 3698–3703.
[43] Biskup, T. Structure–function relationship of organic
semiconductors: Detailed insights from time-resolved
EPR spectroscopy. Front. Chem. 2019,7, 10.
[44] Biskup, T. Doping of organic semiconductors: Insights
from EPR spectroscopy. Appl. Phys. Lett. 2021,119,
010503.
[45] McIlroy, M. D.; Pinson, E. N.; Tague, B. A. UNIX time-
sharing system: foreword. Bell Syst. Tech. J. 1978,57,
1899–1904.
[46] Raymond, E. S. The Art of UNIX Programming; Addison
Wesley: Boston, 2004.
[47] Dijkstra, E. W. A Discipline of Programming; Prentice-
Hall: Englewood Cliffs, New Jersey, 1976.
[48] Higgins, S. G.; Nogiwa-Valdez, A. A.; Stevens, M. M.
Considerations for implementing electronic laboratory
notebooks in an academic research environment. Nat.
Protoc. 2022,17, 179–189.
[49] Kwok, R. How to pick an electronic laboratory notebook.
Nature 2018,560, 269–270.
[50] Dirnagl, U.; Przesdzing, I. A pocket guide to elec-
tronic laboratory notebooks in the academic life sciences.
F1000Research 2016,5, 2.
[51] Badiola, K. A. et al. Experiences with a researcher-
centric ELN. Chem. Sci. 2015,6, 1614–1629.
[52] Willoughby, C.; Logothetis, T. A.; Frey, J. G. Effects of
using structured templates for recalling chemistry exper-
iments. 2016,8, 9.
[53] Whether the ELN file format [98] based on the RO-Crate
metadata specification [99] will eventually provide a vi-
able data exchange format remains to be seen, particu-
larly as this format seems to focus on storing the actual
data in an ELN, which is clearly not intended for the
LabInform ELN.
[54] Schröder, M.; Biskup, T. LabInform datasafe. 2023;
https://datasafe.docs.labinform.de/.
[55] Biskup, T. LabInform: A modular laboratory informa-
tion system built from open source components. Chem-
Rxiv 2022, 10.26434/chemrxiv-2022-vz360.
[56] Tremouilhac, P.; Nguyen, A.; Huang, Y.; Kotov, S.;
Lütjohann, D. S.; Hübsch, F.; Jung, N.; Bräse, S. Chemo-
tion ELN: an Open Source electronic lab notebook for
chemists in academia. J. Cheminf. 2017,9, 54.
[57] Huang, Y.; Tremouilhac, P.; Nguyen, A.; Jung, N.;
Bräse, S. ChemSpectra: a web-based spectra editor for
analytical data. J. Cheminf. 2021,13, 8.
[58] Popp, J.; Biskup, T. ASpecD: A modular framework for
the analysis of spectroscopic data focussing on repro-
ducibility and good scientific practice. Chem. Methods
2022,2, e202100097.
[59] Schröder, M.; Biskup, T. cwepr - A Python package for
analysing cw-EPR data focussing on reproducibility and
simple usage. J. Magn. Reson. 2022,335, 107140.
[60] Schröder, M.; Biskup, T. cwepr Python package. 2021;
https://docs.cwepr.de/, doi:10.5281/zenodo.4896687.
[61] Popp, J.; Schröder, M.; Biskup, T. trEPR
Python package. 2021; https://docs.trepr.de/,
doi:10.5281/zenodo.4897112.
[62] Biskup, T. UVVisPy Python package. 2021; https://
docs.uvvispy.de/, doi:10.5281/zenodo.5106817.
[63] Biskup, T. FitPy Python package. 2022; https://docs.
fitpy.de/, doi:10.5281/zenodo.5920380.
[64] CARPi, N.; Minges, A.; Piel, M. eLabFTW: An open
source laboratory notebook for research labs. J. Open
Source Software 2017,2, 146.
[65] Barillari, C.; Ottoz, D. S. M.; Fuentes-Serna, J. M.; Ra-
makrishnan, C.; Rinn, B.; Rudolf, F. openBIS ELN-
LIMS: an open-source database for academic laborato-
ries. Bioinformatics 2016,32, 638–640.
[66] DokuWiki. 2023; https://dokuwiki.org/.
[67] Leuf, B.; Cunningham, W. The Wiki Way. Quick Col-
laboration on the Web; Addison-Wesley: Upper Saddle
River, NJ, 2001.
[68] Lawrie, G. A.; Grøndahl, L.; Boman, S.; Andrews, T.
Wiki laboratory notebooks: supporting student learning
in collaborative inquiry-based laboratory experiments. J.
Sci. Educ. Technol. 2016,25, 394–409.
[69] Cunningham, W. The WyCash Portfolio Management
System. Addendum to the Proceedings on Object-
Oriented Programming Systems, Languages, and Ap-
plications (Addendum). New York, NY, USA, 1992; p
29–30.
[70] Allman, E. Managing Technical Debt. Commun. ACM
2012,55, 50–55.
[71] Kerievsky, J. Refactoring to Patterns; Addison-Wesley:
Boston, 2005.
[72] As C. Odebrecht put it: Every database is ephemeral.
What counts are the data, i.e. the information contained
in the database. Therefore, we need to be able to throw
away the database and the fancy interface and start from
scratch while keeping access to our data/content.
[73] Paulus, B.; Biskup, T. Towards more reproducible and
FAIRer research data: documenting provenance during
data acquisition using the Infofile format. Digit. Discov.
2023,2, 234–244.
[74] Gohr, A. Refactoring. 2018; https://www.
patreon.com/posts/refactoring-18685665.
[75] DokuWiki: Old Revisions. 2018; https://www.
dokuwiki.org/attic.
[76] MediaWiki: Security issues with authorization
extensions. 2022; https://www.mediawiki.org/
wiki/Special:MyLanguage/Security_issues_with_
authorization_extensions, visited 2023-03-12.
[77] Brandt, N.; Griem, L.; Herrmann, C.; Schoof, E.;
Tosato, G.; Zhao, Y.; Zschumme, P.; Selzer, M.
Kadi4Mat: A research data infrastructure for materials
science. Data Sci. J. 2021,20, 8.
[78] Herres-Pawlis, S.; Liermann, J. C.; Koepler, O. Research
data in chemistry – results of the first NFDI4Chem com-
munity survey. Z. Anorg. Allg. Chem. 2020,646, 1748–
1757.
[79] Hey, T.; Trefethen, A. E. The UK e-science core pro-
gramme and the grid. Futur. Gener. Comput. Syst. 2002,
18, 1017–1031.
[80] Hey, T.; Trefethen, A. E. UK e-science programme: next
generation grid applications. Int. J. High Perform. Com-
put. Appl. 2004,18, 285–291.
[81] As of 09/2021, between 50k and 250k DokuWiki instal-
lations are estimated: https://www.dokuwiki.org/faq:
installcount.
[82] Merali, Z. ...why scientific programming does not com-
pute. Nature 2010,467, 775–777.
[83] Baxter, S. M.; Day, S. W.; Fetrow, J. S.; Reisinger, S. J.
Scientific software development is not an oxymoron.
PLoS Comput. Biol. 2006,2, e87.
[84] Goble, C. Better software, better research. IEEE Internet
Comput. 2014,18, 4–8.
[85] Prlić, A.; Procter, J. B. Ten simple rules for the open
development of scientific software. PLoS Comput. Biol.
2012,8, e1002802.
[86] De Roure, D.; Goble, C. Software design for empowering
scientists. IEEE Softw. 2009,26, 88–95.
[87] Biskup, T. SOLVed-IT. 2023; https://www.solved-it.
org/.
[88] In those cases where a database is used internally, e.g. for
the structured data plugin, the database backend is SQLite,
and the database is therefore contained in a single file that
can and will be backed up together with the other con-
tent.
[89] Brooks, F. P., Jr. No Silver Bullet – Essence and Accidents
of Software Engineering. Computer 1987,20, 10–19.
[90] Schröder, M.; Biskup, T. LabInform ELN documenta-
tion. 2023; https://eln.docs.labinform.de/.
[91] Strasser, C. Research Data Management; National Infor-
mation Standards Organization (NISO): Baltimore, MD,
2015.
[92] Corti, L.; Van den Eynden, V.; Bishop, L.; Woollard, M.
Managing and Sharing Research Data: A Guide to Good
Practice; SAGE Publications: Thousand Oaks, CA,
2020.
[93] Briney, K. Data Management for Researchers: Organize,
Maintain and Share your Data for Research Success;
Pelagic Publishing: Exeter, UK, 2015.
[94] Cox, A. M.; Tam, W. W. T. A critical analysis of life-
cycle models of the research process and research data
management. Aslib J. Inf. Manag. 2018,70, 142–157.
[95] Dirnagl, U.; Kurreck, C.; Castaños-Vélez, E.; Bernard, R.
Quality management for academic laboratories: burden
or boon? EMBO Rep. 2018,19, e47143.
[96] Dirnagl, U.; Przesdzing, I.; Kurreck, C.; Major, S.
A laboratory critical incident and error reporting sys-
tem for experimental biomedicine. PLoS Biol. 2016,14,
e2000705.
[97] Brooks, F. P. The Mythical Man-Month, anniversary ed.
with four new chapters; Addison-Wesley Longman:
Boston, 1995.
[98] The ELN Consortium, ELN file format. 2022; https:
//github.com/TheELNConsortium/TheELNFileFormat.
[99] RO-Crate Metadata Specification 1.1. 2022; https://
w3id.org/ro/crate/1.1.