This is the 52nd report of a series of workshops
organised by the European Centre for the
Validation of Alternative Methods (ECVAM). The
main objective of ECVAM, as defined in 1993 by its
Scientific Advisory Committee, is to promote the
scientific and regulatory acceptance of alternative
methods which are of importance to the biosciences,
and that reduce, refine or replace the use of labora-
tory animals.
The ECVAM workshop on the quantitative
structure-activity relationship applicability
domain was held at ECVAM on 29 September–1
October 2004, under the chairmanship of Andrew
Worth. The workshop was attended by experts
from academia, industry, international organisa-
tions and regulatory authorities. The aim of the
workshop was to review the state of the art of
methods for identifying the domain of applicabil-
ity of structure-activity relationships (SARs) and
quantitative structure-activity relationships
(QSARs), collectively referred to as (Q)SARs. The
report is intended to provide a source of input to
the development of an OECD Guidance Document
on (Q)SAR Validation. The report also makes rec-
ommendations for further research needed to
understand and apply the concept of the (Q)SAR
applicability domain (AD).
(Q)SARs are theoretical models that can be used to
predict the physicochemical, biological and environ-
mental properties of chemicals.
Current Status of Methods for Defining the Applicability
Domain of (Quantitative) Structure–Activity Relationships
The Report and Recommendations of ECVAM Workshop 52¹,²
Tatiana I. Netzeva,³ Andrew P. Worth,³ Tom Aldenberg,⁴ Romualdo Benigni,⁵ Mark T.D.
Cronin,⁶ Paola Gramatica,⁷ Joanna S. Jaworska,⁸ Scott Kahn,⁹ Gilles Klopman,¹⁰ Carol A.
Marchant,¹¹ Glenn Myatt,¹² Nina Nikolova-Jeliazkova,¹³ Grace Y. Patlewicz,¹⁴ Roger Perkins,¹⁵
David W. Roberts,¹⁶ Terry W. Schultz,¹⁷ David T. Stanton,¹⁸ Johannes J.M. van de Sandt,¹⁹
Weida Tong,¹⁵ Gilman Veith²⁰ and Chihae Yang¹²
³ECVAM, Institute for Health & Consumer Protection, European Commission Joint Research Centre, Ispra,
Italy; ⁴RIVM, Bilthoven, The Netherlands; ⁵Experimental and Computational Carcinogenesis Unit,
Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy; ⁶School of Pharmacy and
Chemistry, John Moores University, Liverpool, UK; ⁷QSAR and Environmental Chemistry Research Unit,
Department of Structural and Functional Biology, University of Insubria, Varese, Italy; ⁸Central Product
Safety, Procter & Gamble, Strombeek–Bever, Belgium; ⁹Accelrys Inc., San Diego, CA, USA; ¹⁰MULTICASE
Inc., Beachwood, OH, USA; ¹¹Lhasa Ltd, Department of Chemistry, University of Leeds, Leeds, UK;
¹²Leadscope Inc., Columbus, OH, USA; ¹³Institute of Parallel Processing, Bulgarian Academy of Sciences,
Sofia, Bulgaria; ¹⁴SEAC, Unilever, Colworth House, Sharnbrook, UK; ¹⁵Center for Toxicoinformatics,
Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug
Administration, Jefferson, AR, USA; ¹⁶Bebington, Wirral, Merseyside, UK; ¹⁷Biological Activity Testing &
Modeling Laboratory, College of Veterinary Medicine, University of Tennessee, Knoxville, TN, USA; ¹⁸Miami
Valley Laboratory, Procter & Gamble, Cincinnati, OH, USA; ¹⁹Food and Chemical Risk Analysis Department,
TNO, Zeist, The Netherlands; ²⁰Environment, Health and Safety Division, OECD, Paris, France
Address for correspondence: A. Worth, ECB, Institute for Health & Consumer Protection, European
Commission Joint Research Centre, 21020 Ispra (VA), Italy.
ATLA 33, 1–19, 2005
Address for reprints: ECVAM, Institute for Health & Consumer Protection, European Commission Joint Research
Centre, 21020 Ispra (VA), Italy.
¹ECVAM: The European Centre for the Validation of Alternative Methods. ²This document represents the agreed
report of the participants as individual scientists.
A QSAR expresses in a mathematical form the
quantitative relationship that may exist between
the chemical structure of a series of chemicals and
their measured effect or activity. Multiple linear
regression analysis is often used as the method for
developing such a relationship, although partial
least squares (PLS) analysis, neural networks and
other mathematical tools are often used as well. A
SAR expresses the qualitative relationship between
a two-dimensional molecular fragment (structural
alert), or a three-dimensional arrangement of
molecular features (pharmacophore), and the pres-
ence or absence of a particular effect or activity.
As a result of recent policy developments in the
European Union (EU), it is expected that the use of
(Q)SARs for regulatory purposes will increase. On
29 October 2003, the European Commission (EC)
adopted a legislative proposal (1) for a new chemical
management system called REACH (Registration,
Evaluation and Authorisation of Chemicals), which
is intended to harmonise the information require-
ments applied to New and Existing Chemicals.
Annex IX of the legislative proposal for REACH
provides for the use of valid (Q)SARs for predicting
the environmental and toxicological properties of
chemicals, in the interests of time-effectiveness,
cost-effectiveness and animal welfare.
The development of valid (Q)SARs for human
health endpoints will also contribute to meeting the
needs of the Seventh Amendment to the Cosmetics
Directive (2). This lays down deadlines for the
replacement of animal tests via the gradual imposi-
tion of testing bans on cosmetics (i.e. products or
ingredients), which are reinforced by the gradual
imposition of marketing bans.
According to a recent assessment by the
European Chemicals Bureau (ECB), which, like
ECVAM, is part of the EC’s Joint Research Centre
(JRC), approximately 3.9 million additional verte-
brate test animals could be used as a consequence of
the implementation of REACH, if alternative meth-
ods are not accepted by regulatory authorities and
adopted by industry (3). However, a considerable
reduction in animal use could be obtained if alter-
natives were applied more extensively. According to
the ECB report (3), a “standard scenario” based on
the average acceptance of (Q)SARs and related
techniques (for example, read-across) would lead to
a saving of 1.3 million test animals, whereas the
maximum acceptance of these techniques would
enhance this saving potential to 1.9 million test animals.
The recent chemical policy developments are
placing an enormous challenge on (Q)SAR develop-
ers, regulators and the EC, and have raised the
need to develop internationally accepted guidance
on good (Q)SAR modelling practices. The JRC,
being responsible for the provision of independent
scientific advice to policy makers in the EU, estab-
lished an activity (called a “JRC Action”) on
(Q)SARs in January 2003, with the overall aim of
promoting the availability of valid (Q)SARs for reg-
ulatory use. One activity of the JRC Action on
(Q)SARs is the development of technical guidance
on (Q)SAR validation. This guidance document is
being developed within the framework of the
Organisation for Economic Cooperation and
Development (OECD) Group on (Q)SARs (4).
In November 2004, the OECD Member Countries
adopted five principles for the validation of (Q)SAR
models for regulatory purposes, now referred to as
the OECD Principles for (Q)SAR Validation.
According to these principles, and in order to facili-
tate its consideration for regulatory purposes, a
(Q)SAR model should be associated with the follow-
ing information:
1. a defined endpoint;
2. an unambiguous algorithm;
3. a defined domain of applicability;
4. appropriate measures of goodness-of-fit, robust-
ness and predictivity; and
5. a mechanistic interpretation, if possible.
Principle 3 expresses the need to define an AD for
(Q)SARs. This need is based on the fact that
(Q)SARs are reductionist models, which are
inevitably associated with limitations in terms of
the types of chemical structures, physicochemical
properties and mechanisms of action for which they
can generate reliable predictions.
The ECVAM workshop on the (Q)SAR AD was
organised to help to determine what types of infor-
mation are needed to define (Q)SAR ADs, and to
review the current status of methods for defining
the ADs of (Q)SARs. This workshop report is
intended to serve as a source of input to the OECD
guidance document on (Q)SAR validation, the aim
of which is to provide detailed guidance on how to
apply the (Q)SAR validation principles to various
types of models.
The Concept of the (Q)SAR
Applicability Domain
In the (Q)SAR field, the AD is widely understood to
express the scope and limitations of a model, i.e. the
range of chemical structures for which the model is
considered to be applicable. However, in the
(Q)SAR literature, it is not always apparent
whether (or to what extent) the AD concept has
been applied. In some cases, the AD concept is
implicit in the original publication; for example, the
model has been developed from a training set of
chemicals that belong to a single chemical class or
2 T.I. Netzeva et al.
ECVAM Workshop 52: (quantitative) structure–activity relationships 3
that are considered to share a common mechanism
of action. In other cases, the AD concept has been
explicitly defined. In such cases, the most com-
monly adopted approach has been to define the AD
of the model with structural rules and/or a range of
(continuous) descriptor variables (see, for example,
5). If continuous descriptor variables are used, it is
possible to define the AD in terms of coverage of the
training set in the model descriptor space (6). Such
approximations are statistically based, since inter-
polated estimates are considered to be more reliable
than extrapolated ones.
Other approaches have been based on: a) the
application of multiple linear regression (MLR)
analysis in combination with the distance approach
(see, for example, 7); b) definition of a “tolerance
volume” around a model by using PLS analysis (8);
and c) decision tree analysis (9).
Various approaches for defining the AD have
been based on similarity analysis. A comprehensive
review of such approaches has been produced by
Nikolova & Jaworska (10). All of these approaches
are based on the premise that a QSAR prediction is
reliable if the chemical for which a prediction is
being made is “similar” to the compounds in the
training set. The assessment of chemical similarity
is not trivial, since the concept of “similarity” is
sometimes used in a subjective manner, and in
cases where the concept is used in a quantitative
manner, different measures of chemical similarity
have been proposed. Furthermore, in addition to
structural and/or physicochemical similarity, it is
also possible to consider similarity in terms of the
response and/or mode of action.
The AD concept is applied in several commercially
available (Q)SAR prediction systems, including
MCASE (developed by Multicase Inc., Beachwood,
OH, USA), TOPKAT (developed by Accelrys Inc., San
Diego, CA, USA), and the Chem-Tox platform (devel-
oped by Leadscope Inc., Columbus, OH, USA).
It can therefore be seen that, where the AD concept
has been applied, it has been applied in different ways,
depending on the type of (Q)SAR model and modelling
approach. Given the regulatory need for transparency
in the reporting of (Q)SAR models, including their
ADs, and given the diversity of approaches for defin-
ing an AD, it can be concluded that there is a need to
develop a single, but flexible, conceptual framework
capable of expressing the ADs of various types of mod-
els, developed by different approaches and possibly for
different purposes. Figure 1 summarises the multiple
aspects of the AD concept. For any given (Q)SAR, one
or more of these aspects could be relevant. For exam-
ple, a structural fragment could be used to derive a
qualitative model (SAR), but could also be used in a
quantitative manner to develop a QSAR. Figure 1
should not be over-interpreted to imply that certain
aspects are mutually exclusive: for example, it is pos-
sible to develop a QSAR based on structural descrip-
tors (count variables for the presence/absence of
specified structural features) and/or continuous phys-
icochemical descriptors. Furthermore, while (Q)SARs
are sometimes referred to as “mechanistically based”
or “statistically based”, this should be taken to reflect
the philosophy and approach adopted in the develop-
ment of the model, but does not imply that the two
types of models are mutually exclusive. In fact, many
(Q)SARs are both statistically and mechanistically based.
A general definition of the (Q)SAR AD
Taking into account the multiple aspects of the AD
concept, the authors of this report proposed the fol-
lowing general definition for the concept of the AD:
“The applicability domain of a (Q)SAR model is the
response and chemical structure space in which the
model makes predictions with a given reliability”.
Figure 1: Aspects of the (quantitative) structure-activity relationship ([Q]SAR) applicability domain
[Diagram not reproduced. Box labels: Applicability domain; Structural; Physicochemical; Coverage; Similarity; SAR; QSAR; Mechanistic; Statistical. Grouping labels: AD approach; Modelling method; Philosophy.]
In this definition, chemical structure can be
expressed by physicochemical and/or fragmental
information, and response can be any physicochemical,
biological or environmental effect that is being predicted.
The importance of the AD in the (Q)SAR life-cycle
The AD is an important consideration in all three
phases of the (Q)SAR life-cycle (development, vali-
dation and application), as illustrated by Figure 2.
The concept should be applied during model devel-
opment, to ensure that a domain is defined as
broadly as possible for a desired level of predictivity.
It should be noted that, for a model with a given
number of descriptors, there is generally a trade-off
between the breadth of the domain and the level of
predictivity. Thus, in general, one would either aim
to develop a model with broad applicability, sacri-
ficing to some extent the level of predictivity, or one
would aim to develop a model with narrow applica-
bility (for example, a specific class of chemicals), but
with greater predictivity. Both types of model,
sometimes (rather confusingly) called “global” and
“local” models, respectively, can be useful, depend-
ing on the desired application.
The AD is important during (Q)SAR validation,
in the sense that a predefined AD can be verified
and possibly refined. In particular, if an external
validation is being performed, predictions must be
made for compounds that do not form part of the
training set. However, to ensure that the external
“validation” set is appropriate for model validation,
the test chemical structures should fall within the
AD of the model, as deduced by analysis of the
training set. An open question is whether external
validation should also include chemical structures
that are considered to fall outside the defined AD,
to check whether the boundary is correctly defined,
and to investigate the effect of extending the AD on
the predictivity of the model.
The ultimate reason for having a well-defined AD
is to assist the regulatory application of (Q)SARs to
particular chemicals. The decision to use a (Q)SAR
for regulatory purposes will generally require an
assessment of whether the chemical of interest (for
example, a chemical registered under REACH) fits
within the AD of the model. This is an essential
piece of information during the regulatory assess-
ment of chemicals, because it informs the model
user as to whether the endpoint of interest can be
reliably predicted for the chemical of interest.
Furthermore, in the case where multiple (Q)SARs
are available for a given chemical, the model user
may also wish to compare the reliability of the pre-
dictions made by different models.
“Mechanistic QSARs” based on
Physicochemical Descriptors
The term “mechanism of toxic action” can be
defined as the action of a toxicant at the molecular
level, whereas “mode of [toxic] action” refers to a
more general effect or physiological response at a
higher level of biological organisation. For example,
there is discussion of the mode of action of nar-
cotics, which is displayed at the organism level as a
general decrease in activity and, within this mode of
action, several distinct mechanisms are sometimes
considered (which may result from different inter-
actions at the molecular level; 11). In this report,
the terms “mechanism” and “mode of action” are
used interchangeably, even though “mechanism” is
sometimes used to provide a more detailed or lower-
level description of events in the cause-to-effect
chain than is “mode of action”.
Figure 2: The central place of the applicability domain in various stages of the (quantitative)
structure-activity relationship ([Q]SAR) life cycle
[Diagram not reproduced. Stage labels: Model development; Model validation; Model application. Other labels: Biological data; Physicochemical and structural data; Statistical analysis; OECD principles for (Q)SAR validation; Prioritisation; Chemical categories; Risk assessment.]
The terms “mechanistic” and “mechanistically
based” have been used in relation to toxicological
QSARs for about two decades, with different mean-
ings (12). Initially, the expression “different mech-
anism of toxic action” was used to explain outliers
to simple regression-derived acute toxicity QSARs
developed with data from a particular chemical
class. Subsequently, rules based on two-dimen-
sional structures were used to identify chemicals
considered to elicit toxicity by the same mechanism
or mode of action. As the number of potential
chemical descriptors increased, the term “mech-
anistically based” became increasingly used to
refer to QSARs developed by using descriptors that
were interpretable in terms of the physicochemical
properties they encoded and the causal link
between the physicochemical properties and the
endpoint modelled. Examples of “interpretable”
descriptors include the octanol–water partition
coefficient (logKow) for hydrophobicity, and the
energy of the lowest unoccupied molecular orbital
(ELUMO) for soft electrophilicity. Models based
exclusively on “interpretable” descriptors are often
referred to as “mechanistic QSARs”.
In principle, if one develops a mechanistically
based QSAR, it should be possible to define an AD
with data for fewer chemicals than would be the
case if one developed a QSAR that is only statisti-
cally based. In the development of a mechanisti-
cally based QSAR for a specific endpoint, it is
hypothesised that the endpoint is a function of cer-
tain physicochemical properties, because the end-
point results from a particular molecular
mechanism of action. Data on selected chemicals
can then be compiled to test the hypothesis for the
key physicochemical properties. If the additional
data support the hypothesis, there would be a
mechanistic aspect to the AD, and the boundaries
of the AD could probably be defined by using data
for fewer chemicals.
In the development of a purely statistically based
QSAR, no assumptions are made about the cause of
the endpoint, or more than one cause is anticipated.
Thus, it is necessary to test a larger number of
chemicals to capture the variation in the descriptor
space before using statistics in the definition of the AD.
There are relatively few regulatory endpoints for
which “mechanistic QSARs” have been proposed,
due to gaps in our understanding of underlying
mechanisms of action and the scarcity of high-qual-
ity data sets suitable for hypothesis testing. Two
examples are acute aquatic toxicity and skin sensitisation.
Mechanistic (Q)SARs for aquatic toxicity
In terms of their acute aquatic toxicity, the major-
ity of industrial organic chemicals are considered to
exhibit a narcosis mechanism of toxic action (13).
Narcotic chemicals cause only non-covalent and
reversible alterations at the theoretical site of
action, which is considered to be the cell membrane.
In the modelling of the narcosis mode of action,
octanol is often regarded as an appropriate surro-
gate for the target lipid, and the logKow is used as a
descriptor for the chemical interaction with the
membrane. The AD of a QSAR for narcosis can be
expressed either as a set of exclusion rules (i.e. all
compounds that do not fall into certain classes are
narcotics; 14), or as a set of inclusion rules that
identify chemicals (not necessarily classes) capable of
exhibiting the narcosis mode of action (5). The first
approach can be applicable to more chemicals, but
the AD risks being reduced by additional exclusion
rules, added to account for chemicals exhibiting
mechanisms that were not known or that were not
taken into account when the rules were developed.
The second approach can give a higher confidence
in the domain, but it may restrict the number of
chemicals that can be predicted by the model.
As toxicity data sets grew larger with the testing
of more so-called “reactive” chemicals (i.e. chemi-
cals with measured toxicity significantly greater
than that predicted by narcosis models), the devel-
opment of QSARs took different approaches. In one
approach, QSARs were developed on the basis of
generic electrophilic and hydrophobic terms (the
response–surface approach), which sacrificed fit in
order to expand the AD while maintaining inter-
pretability of descriptors. In a second approach,
QSARs were developed on the basis of larger num-
bers of descriptors which sacrificed the inter-
pretability of descriptors in order to expand the AD,
while maintaining fit. A third approach was to
develop QSARs that represent well-studied molecu-
lar mechanisms of action, while maintaining both
fit and interpretability of descriptors. The ADs of
such QSARs were limited to narrowly-defined sets
of chemicals (defined in terms of classical organic
chemical reactions, such as Michael addition).
Mechanistic (Q)SARs for skin sensitisation
Another field where “mechanistic” QSARs have
been developed is the modelling of skin sensitisa-
tion. For chemicals that act as skin sensitisers, elec-
trophilic or pro-electrophilic behaviour is almost
always the key step in the mechanism of action.
Therefore, it is natural to group the chemicals into
ADs based on the various (pro)electrophilic mecha-
nisms, such as SN2 electrophiles, Michael-type
acceptors, SNAr electrophiles, activated esters, and
“poison ivy” type pro-electrophiles.
As yet, there is no “global” QSAR for any of these
natural domains. In principle, it should be possible to
develop, for example, a “global Michael-type acceptor
QSAR” for skin sensitisation, covering, for example,
α,β-unsaturated aldehydes, ketones, nitriles,
nitroaliphatics and sulphones. In practice, the devel-
opment of such a QSAR is difficult, because it most
likely requires extensive new testing. Therefore, the
modelling of skin sensitisation is often based on a set
of structure-based rules and QSARs for several
structural domains (for example, the aldehydes
domain and the sulphonate esters domain) which cut
across the natural mechanistic domains (15–17). For
example, the aldehydes domain cuts across Schiff
base formers (a domain which also includes non-
aldehydes such as diketones and pyruvate esters)
and Michael acceptors (a domain including many
non-aldehydes). Within a given structure domain, a
QSAR is typically developed for a “tested domain”,
i.e. a subset of chemicals that occupy a smaller region
of the descriptor space of the entire structure domain.
Statistically Based QSARs Based on
Physicochemical Descriptors
This section describes a variety of interpolation
methods that have been developed for statistically
based QSARs (6). Interpolation is a mathematical
term that describes the process of predicting the
value of a function at a point from its known values
at two or more surrounding points. Interpolation
methods make estimations from the training set of
data, which are represented as a set of points in
n-dimensional descriptor space, where n is the number
of descriptors in the model.
An interpolation region in one-dimensional
descriptor space is simply the interval between the
minimum and the maximum values of the training
data set. Interpolation regions in multivariate
descriptor space are more complex. Four major
approaches have been recognised to estimate inter-
polation regions in multivariate space. These are
based on ranges, geometry, distances and probabil-
ity density distribution functions (6).
Range-based methods
The simplest method for describing the AD is to
consider ranges of individual descriptors. This
defines an n-dimensional hyper-rectangle with
sides parallel to the coordinate axes. The data dis-
tribution is assumed to be uniform. Two limitations
of this approach are that interior empty space is not
detected and there is no correction for correlations
(linear or non-linear) between descriptors.
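
The range check described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`descriptor_ranges`, `in_range_domain`) and the two-descriptor training values are hypothetical and are not taken from any of the cited models.

```python
from typing import List, Sequence, Tuple


def descriptor_ranges(training: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return the (minimum, maximum) of each descriptor over the training set."""
    columns = zip(*training)  # one tuple of values per descriptor
    return [(min(col), max(col)) for col in columns]


def in_range_domain(query: Sequence[float], ranges: Sequence[Tuple[float, float]]) -> bool:
    """True if every descriptor of the query lies inside its training range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(query, ranges))


# Hypothetical training set with two strongly correlated descriptors.
training = [(1.0, 0.2), (2.0, 0.4), (3.0, 0.6), (4.0, 0.8)]
ranges = descriptor_ranges(training)

inside = in_range_domain((2.5, 0.5), ranges)        # between training points
outside = in_range_domain((5.0, 0.5), ranges)       # first descriptor too large
off_diagonal = in_range_domain((1.0, 0.8), ranges)  # far from all training data
```

In this sketch the query (1.0, 0.8) is accepted even though no training chemical combines a low first descriptor with a high second one, which illustrates the two limitations noted above: empty interior space and uncorrected descriptor correlation.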
Principal components analysis (PCA) is a mathe-
matical method in which the original data set is
transformed by rotation of the axes, to correct for
the correlations between the descriptors. The PCA
procedure involves centring the data around the
standard mean, and manipulating the covariance
matrix of the transformed data to form a new coor-
dinate system, in which the new axes, called princi-
pal components (PCs), are orthogonal to one
another. The PCs are aligned with the directions of
the greatest variations in the data set. An n-dimen-
sional hyper-rectangle can then be defined with
sides parallel to the PCs and with the data points
between the minimum and maximum value of each
PC. The hyper-rectangle also includes empty space,
but this is less empty than the hyper-rectangle
based on the original descriptor ranges.
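
A minimal two-descriptor sketch of the PCA-based hyper-rectangle is given below. It assumes exactly two descriptors, so that the principal axes can be obtained in closed form from the rotation angle of the 2 x 2 covariance matrix; the function names and the training values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def project(point: Point, mean: Point, axes) -> Tuple[float, ...]:
    """Centre a point and express it in principal-component coordinates."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return tuple(dx * ax + dy * ay for ax, ay in axes)


def fit_pca_box(training: Sequence[Point]):
    """Fit the mean, the two principal axes and the score ranges on each axis."""
    n = len(training)
    mean = (sum(p[0] for p in training) / n, sum(p[1] for p in training) / n)
    sxx = sum((p[0] - mean[0]) ** 2 for p in training) / (n - 1)
    syy = sum((p[1] - mean[1]) ** 2 for p in training) / (n - 1)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in training) / (n - 1)
    # Rotation angle of the axis of greatest variance (2 x 2 case only).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    axes = ((c, s), (-s, c))  # PC1 and PC2, orthogonal unit vectors
    scores = [project(p, mean, axes) for p in training]
    bounds = tuple(
        (min(sc[k] for sc in scores), max(sc[k] for sc in scores)) for k in range(2)
    )
    return mean, axes, bounds


def in_pca_box(query: Point, model, tol: float = 1e-9) -> bool:
    """True if the query scores fall within the training range on every PC."""
    mean, axes, bounds = model
    scores = project(query, mean, axes)
    return all(lo - tol <= sc <= hi + tol for sc, (lo, hi) in zip(scores, bounds))


# Hypothetical, strongly correlated training data.
training = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.0)]
model = fit_pca_box(training)

near_trend = in_pca_box((3.0, 3.0), model)  # lies along the data trend
off_trend = in_pca_box((1.0, 5.0), model)   # inside the raw ranges, far off trend
```

The query (1.0, 5.0) lies within the original descriptor ranges but outside the rotated box, which is the sense in which the PCA-based hyper-rectangle contains less empty space.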
A variation of the PCA domain is implemented as
the optimum prediction space (OPS) in the TOPKAT
software (18). In the case of the OPS, the data
are also centred around the average of each param-
eter, and the PCA procedure is applied to generate
the new orthogonal coordinate system (called the
OPS coordinate system). The minimum and maxi-
mum values of the data points on each axis of the
OPS coordinate system define the OPS boundary.
In addition, to deal with data sets comprising non-
uniformly distributed data, the confidence of the
prediction is estimated in terms of the property sen-
sitive object similarity (PSS) between the training
set and a queried point. The PSS is the TOPKAT
implementation of a heuristic solution to reflect
dense and sparse regions of the data set. A “simi-
larity search” enables the user to check the per-
formance of TOPKAT in predicting the effects of a
chemical which is structurally similar to the test
structure. The user is also given literature refer-
ences to the original sources of information.
Geometric methods
The most straightforward empirical method for
defining the coverage of an n-dimensional set is the
convex hull, which is the smallest convex area that
contains the original set. This method is illustrated
in Figure 3, in which the two-dimensional inter-
polation space is defined by two descriptors (logKow
and acceptor delocalisability), which were found to
be important predictors of acute toxicity to fish
(19). Even within the convex hull defined by these
two descriptors, there are regions with a high den-
sity of data points, and regions where the data are
sparse. To address these limitations, more sophisticated
methods for defining the domain have been proposed.
Calculation of the convex hull is a computational
geometry problem (20). Efficient algorithms for
convex hull calculation are available for two and
three dimensions, but the order of complexity (O) of
the algorithms increases with increasing numbers
of data points and dimensions. For n points
and d dimensions, the complexity is of the order of
$O(n^{\lfloor d/2 \rfloor + 1})$. A disadvantage of this approach is that
potential empty spaces within the convex hull can-
not be identified.
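
For the two-dimensional case, the convex hull and a membership test can be sketched without external libraries. The example below uses Andrew's monotone chain algorithm, which is one of several standard choices and is not claimed to be the method used in reference 20; the point coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def in_hull(query: Point, hull: Sequence[Point]) -> bool:
    """True if the query is inside or on the hull (needs at least 3 vertices)."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], query) >= 0 for i in range(n))


# Hypothetical two-descriptor training set.
training = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (2.0, 2.0), (1.0, 3.0)]
hull = convex_hull(training)

interpolated = in_hull((3.0, 1.0), hull)  # inside the hull
extrapolated = in_hull((5.0, 2.0), hull)  # outside the hull
```

The query (3.0, 1.0) is reported as interpolated although no training point lies near it, which reflects the disadvantage noted above: empty regions inside the hull are not detected.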
Distance-based methods
Distance-based approaches calculate the distance
from a query data point to a data set. The query point
is judged to be close to the data set if this distance
falls below a predefined threshold.
Regions at a constant distance are called iso-dis-
tance contours. The shape of the iso-distance con-
tours depends on the particular distance measure
used and the particular approach for measuring the
distance between a query point and a data set.
Examples of different approaches include: a) dis-
tance to the mean; b) average distance between the
query point and all data set points; and c) maximum
distance between the query point and all data set points.
Distance-based approaches can be used to sepa-
rate regions of varying density by imposing cut-off
values. However, these regions do not reflect the
actual information density of the data set, and the
cut-off values do not correspond with the density of
the data.
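
The three ways of measuring the distance from a query point to a data set, together with a cut-off decision, can be sketched as follows. The Euclidean metric, the function names, the data values and the threshold are all assumptions chosen for illustration.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_to_mean(query: Vector, data: Sequence[Vector]) -> float:
    """a) Distance from the query to the centroid of the data set."""
    centroid = [sum(col) / len(data) for col in zip(*data)]
    return euclidean(query, centroid)


def average_distance(query: Vector, data: Sequence[Vector]) -> float:
    """b) Average distance between the query and all data set points."""
    return sum(euclidean(query, p) for p in data) / len(data)


def maximum_distance(query: Vector, data: Sequence[Vector]) -> float:
    """c) Maximum distance between the query and all data set points."""
    return max(euclidean(query, p) for p in data)


def within_threshold(distance: float, threshold: float) -> bool:
    """Cut-off decision: the query is 'close' if the distance is small enough."""
    return distance <= threshold


# Hypothetical data set and arbitrary cut-off value.
data = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
threshold = 2.0

central = distance_to_mean((1.0, 1.0), data)
remote = distance_to_mean((4.0, 1.0), data)
central_ok = within_threshold(central, threshold)
remote_ok = within_threshold(remote, threshold)
```

Note that the central query (1.0, 1.0) has zero distance to the mean although it coincides with no training point, so a small distance to the mean says little about the local density of the data.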
Three distance-based approaches have been
found to be most useful in QSAR research, namely,
the Euclidean, Mahalanobis and city-block distance
measures (6). Related approaches are based on the
Hotelling's T² test and the leverage, which is calculated
from the hat matrix.
Euclidean, Mahalanobis and city-block distances
The Euclidean distance is the square root of the sum of
the squared differences between corresponding elements
of the rows (or columns) in the data matrix.
This is probably the most commonly used
distance metric. The Mahalanobis distance is a
weighted Euclidean distance, where the weighting
is determined by the sample variance–covariance
matrix. The block distance is the sum of the
absolute differences between corresponding elements
of the rows (or columns) in the data matrix.
The block distance is also known as the city-block
or Manhattan distance.
Methods based on the Euclidean and
Mahalanobis distance measures identify the inter-
polation regions by assuming that the data are nor-
mally distributed. In contrast to the Euclidean
distance, the Mahalanobis distance takes into
account the correlation between descriptor axes.
The city-block distance assumes a uniform distribu-
tion of data points.
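
The behaviour of the three distance measures can be compared in a short sketch. The functions below are illustrative only; the small Gaussian-elimination solver is included to keep the example self-contained, and the training values are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def covariance_matrix(data: Sequence[Vector]) -> List[List[float]]:
    """Sample variance-covariance matrix (denominator n - 1)."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    centred = [[x - m for x, m in zip(row, means)] for row in data]
    p = len(means)
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
            for i in range(p)]


def solve(matrix: Sequence[Sequence[float]], rhs: Vector) -> List[float]:
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def mahalanobis(a: Vector, b: Vector, cov: Sequence[Sequence[float]]) -> float:
    """sqrt(diff^T * cov^-1 * diff), computed without forming the inverse."""
    diff = [x - y for x, y in zip(a, b)]
    z = solve(cov, diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, z)))


# Hypothetical training set with positively correlated descriptors.
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
cov = covariance_matrix(data)
centre = (3.0, 3.0)  # mean of the training set

along = mahalanobis((4.0, 4.0), centre, cov)   # displaced along the correlation
across = mahalanobis((4.0, 2.0), centre, cov)  # displaced against it
```

Both queries are at the same Euclidean and city-block distance from the centre, but the Mahalanobis distance is three times larger for the query displaced against the direction of correlation, because it accounts for the correlation between the descriptor axes.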
The Euclidean distance, calculated according to
Equation 1, places an equal weight on each dimen-
sion (descriptor) in the model data space.
$$d(i,j) = \sqrt{(x_{i1} - x_{j1})^2 + (x_{i2} - x_{j2})^2 + \dots + (x_{ip} - x_{jp})^2}$$
(Equation 1)
where $d(i,j)$ is the distance between two points i and
j; $x_{ip}$ is the value of point i along axis p; and $x_{jp}$ is
the value of point j along axis p.
To increase the accuracy and relevance of the
similarity measure, a correction is made to account
for the fact that not all descriptors are equally
important to the overall model. A simple correction is
to use the weighted Euclidean distance (21), as
indicated in Equation 2:
$$d(i,j) = \sqrt{w_1 (x_{i1} - x_{j1})^2 + w_2 (x_{i2} - x_{j2})^2 + \dots + w_p (x_{ip} - x_{jp})^2}$$
(Equation 2)
where $w_n$ is the weight assigned on the basis of the
importance of the nth descriptor in the model.
In Equation 2, the overall weights of the descrip-
tors in the QSAR model are used. These are
obtained by calculating the QSAR model coeffi-
cients with auto-scaled (mean-centred and vari-
ance-normalised) descriptors. The magnitude of
each resulting QSAR model coefficient reflects the
relative contribution of a single descriptor to the
calculated value of the modelled property. The
weights can be obtained by normalising all coeffi-
cients, so that the most important descriptor has a
coefficient of 1.0.
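
The weighting scheme of Equation 2 can be sketched as follows. The coefficients are hypothetical values for a three-descriptor model fitted on auto-scaled descriptors, and the use of absolute coefficient values is an assumption of this sketch, made because the text refers to the magnitude of each coefficient.

```python
import math
from typing import List, Sequence


def weights_from_coefficients(coefficients: Sequence[float]) -> List[float]:
    """Normalise coefficient magnitudes so that the largest weight is 1.0."""
    largest = max(abs(c) for c in coefficients)
    return [abs(c) / largest for c in coefficients]


def weighted_euclidean(a: Sequence[float], b: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Equation 2: square root of the weighted sum of squared differences."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


# Hypothetical coefficients of a QSAR fitted on auto-scaled descriptors.
coefficients = [0.8, -0.2, 0.4]
weights = weights_from_coefficients(coefficients)

point_i = (0.0, 0.0, 0.0)
point_j = (1.0, 2.0, 2.0)

unweighted = weighted_euclidean(point_i, point_j, [1.0, 1.0, 1.0])  # Equation 1
weighted = weighted_euclidean(point_i, point_j, weights)            # Equation 2
```

With these values, differences in the second and third descriptors contribute less to the weighted distance than to the conventional one, in line with their smaller contribution to the modelled property.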
Figure 3: Representation of the applicability domain by a two-dimensional plot
The training data are represented by circles, and a new
chemical to be predicted by the model is represented by
the triangle. The training set was used to derive a
two-dimensional linear regression model for acute fish
toxicity on the basis of two descriptors: logKow and
acceptor delocalisability of the ether oxygen of the ester
group (19).
[Plot not reproduced. Recovered axis label: acceptor delocalisability, with tick values from 0.102 to 0.116.]
To compare the usefulness of conventional and
weighted Euclidean distances, a data set was
compiled for a series of similar QSAR models for the
prediction of boiling points (21–23). Two main
observations were made. Firstly, the conventional
Euclidean distance did not always correctly deter-
mine when a chemical was in the domain of the
model (i.e. when it should be accurately predicted).
Chemicals with large distances could still be pre-
dicted accurately. In other words, a model would
not be considered to be applicable to these new
chemicals, despite the fact that it predicted them
accurately. Secondly, the use of weighted-Euclidean
distances improved this situation. In all cases
where the external prediction set member had a rel-
ative distance greater than 1.0, it also had a less
accurate prediction.
These observations lead to two general conclu-
sions. Firstly, it is necessary to account for the
influence of each descriptor in the model when
quantifying similarity based on the model data
space. The descriptors do not contribute equally to
the model prediction, so they should not be
expected to contribute equally to the assessment of
molecular similarity. Such a modification appears
to avoid the discounting of a model when it is actu-
ally applicable to a new chemical. Secondly, it is
insufficient to determine molecular similarity only
in the model data space. It is possible that a new
chemical can be very similar to the training set
chemicals in all respects that the model considers,
but it could have an additional feature that also
affected the property in question, and that was not
properly accounted for by the model.
Hotelling’s test and leverage
The Hotelling’s T² statistic is the multivariate
equivalent of Student’s t statistic, and provides a
check for observations adhering to multivariate
normality (24). A similar statistic is the leverage
value (25), which is proportional to the Hotelling T²
and to the Mahalanobis distance. Both Hotelling T²
and leverage correct for co-linearity in the descriptors
through the use of the covariance matrix.
The model space can be represented by a two-dimensional
matrix comprising n chemicals (rows)
and k variables (columns), called the descriptor
matrix (X). The leverage of a chemical provides a
measure of the distance of the chemical from the
centroid of X. Chemicals close to the centroid are
less influential in model building than are extreme
points. The leverages of all chemicals in the data set
are generated by manipulating X according to
Equation 3, to give the so-called Influence Matrix or
Hat Matrix (H).
$$H = X (X^T X)^{-1} X^T$$
(Equation 3)
where $X$ is the descriptor matrix, $X^T$ is the transpose
of $X$, and $(A)^{-1}$ is the inverse of matrix $A$,
where $A = (X^T X)$.
The leverages or hat values ($h_i$) of the chemicals
($i$) in the descriptor space are the diagonal elements
of $H$, and can be computed by Equation 4 (25).
$$h_{ii} = x_i^T (X^T X)^{-1} x_i$$
(Equation 4)
where $x_i$ is the descriptor row-vector of the query
chemical.
A “warning leverage” (h*) is generally fixed at
3p/n, where n is the number of training chemicals,
and p the number of model variables plus one. A
chemical with high leverage in the training set
greatly influences the regression line: the fitted
regression line is forced near to the observed value
and its residual (observed-predicted value) is small,
so the chemical does not appear to be an outlier,
even though it may actually be outside the AD. In
contrast, if a chemical in the test set has a hat value
greater than the warning leverage h*, this means
that the prediction is the result of substantial
extrapolation and therefore may not be reliable.
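
The leverage calculation and the warning leverage can be illustrated with a short, dependency-free Python sketch. It is not the procedure of any particular software package. The descriptor values are hypothetical, and the sketch assumes that the descriptors are mean-centred with the training means and that X carries no intercept column; the text does not specify either point.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(m):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; descriptors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def leverage(x, xtx_inv):
    """Equation 4 for a single descriptor vector x."""
    row = matmul([x], xtx_inv)[0]
    return sum(a * b for a, b in zip(row, x))

# Hypothetical training descriptors: n = 12 chemicals, k = 2 descriptors
raw = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0],
       [7.0, 8.0], [8.0, 7.0], [9.0, 10.0], [10.0, 9.0], [11.0, 12.0], [12.0, 11.0]]
n, k = len(raw), len(raw[0])
means = [sum(col) / n for col in zip(*raw)]

def centre(x):
    """Assumption: descriptors are centred on the training means."""
    return [v - m for v, m in zip(x, means)]

X = [centre(r) for r in raw]
xtx_inv = inverse(matmul(transpose(X), X))

p = k + 1               # number of model variables plus one
h_star = 3 * p / n      # warning leverage, 3p/n
h_train = [leverage(x, xtx_inv) for x in X]
h_query = leverage(centre([12.0, 1.0]), xtx_inv)   # hypothetical query chemical
outside = h_query > h_star
```

The hypothetical query lies within the range of each descriptor taken separately, yet its leverage far exceeds h*, because it contradicts the correlation between the two descriptors in the training set.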
The Hotelling T² and leverage statistics can be
used in the assessment of whether test chemicals
fall outside the QSAR AD, as illustrated by
Gramatica et al. (26), Tropsha et al. (27), and
Eriksson et al. (8). The observation that a chemical
has a hat value greater than the warning leverage
indicates that the chemical falls outside the AD.
However, the observation that a chemical has a hat
value less than the warning leverage does not nec-
essarily indicate that the chemical falls within the
AD. Chemicals may also fall outside the AD, if they
are outliers as defined by their large standardised
residuals. To identify chemicals that are outside the
AD on the basis of both leverages and standardised
residuals, the Williams plot is sometimes used, as
illustrated in Figure 4.
The model used for this illustration, taken from
Kulkarni et al. (28), uses three descriptors for 32
chemicals (six chemicals were excluded as outliers)
to predict acute toxicity to the fish, Pimephales
promelas, where p = 4, n = 32, and h* = 3 × 4/32
= 0.375. When the model was redeveloped (29) by
using a training and a test set, four points with
extreme leverages were identified on the Williams
graph (Figure 4a). Of these, one chemical (com-
pound 32) with a large leverage from the training
set, was predicted correctly, but would be expected
to have a disproportionate influence on the regres-
sion line, whereas three chemicals (compounds 37,
41, and 43) with a large leverage from the test set,
were not well predicted. At the same time, there
were several additional outliers which were not
identified on the basis of leverage alone (Figure 4b).
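
The logic of reading a Williams plot can be sketched as follows. The hat values and standardised residuals are hypothetical, and the residual cut-off of 3.0 standard deviation units is an assumption made for illustration only, since no value is prescribed in the text.

```python
def williams_flags(hat_values, std_residuals, h_star, residual_limit=3.0):
    """Label each chemical by the region of the Williams plot it falls in.

    residual_limit is a hypothetical cut-off in standard deviation units.
    """
    flags = []
    for h, r in zip(hat_values, std_residuals):
        high_leverage = h > h_star
        large_residual = abs(r) > residual_limit
        if high_leverage and large_residual:
            flags.append("high leverage and large residual")
        elif high_leverage:
            flags.append("high leverage")
        elif large_residual:
            flags.append("large residual")
        else:
            flags.append("within limits")
    return flags

# Warning leverage 3p/n with p = 4 and n = 32
h_star = 3 * 4 / 32
flags = williams_flags(
    hat_values=[0.10, 0.45, 0.20, 0.60],
    std_residuals=[0.5, 0.3, -3.4, 3.2],
    h_star=h_star,
)
```

Only the leverage test remains available for a chemical without an experimental value, because its standardised residual cannot be computed.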
In the assessment of a new chemical for which a
prediction can be made, but for which there is no
experimental value, it is not possible to determine
the standardised residual, so the conclusion can
only be based on the leverage. Thus, the leverage
can be useful in identifying some of the chemicals
that fall outside the AD, but it does not necessarily
identify all of them.

Figure 4: Examination of outliers in a regression-based quantitative structure-activity relationship
The plots distinguish training and test chemicals.
a) Williams plot, i.e. plot of standardised residuals versus hat values, with a warning leverage of 0.375.
b) Plot of predicted versus observed toxicity.
Both plots were derived by re-analysis of data from Kulkarni et al. (28).

Probability density distribution-based methods
The probability density function of a data set can be
estimated by parametric or non-parametric methods
(6). Parametric methods assume that the density
function has the shape of a standard distribution
(for example, a Gaussian or Poisson distribution).
Alternatively, a number of non-parametric tech-
niques are available which do not make any assump-
tions about the data distribution. Non-parametric
techniques allow the probability density to be esti-
mated solely from data by kernel density estimation
or mixture density methods. For the assessment of
QSAR ADs, emphasis has been on the investigation
of non-parametric techniques.
Probability density methods are the only meth-
ods capable of identifying internal empty regions
within the convex hull of a QSAR AD. Further-
more, if empty regions are located close to the con-
vex hull border, probability density methods can
generate concave regions to reflect the actual data
distribution. The first step is to estimate the prob-
ability density of the data set. Density estimation
is an area of extensive research, but most methods
focus on low dimensional (1D, 2D, 3D) densities,
unless some further assumptions are made (30).
An algorithm for multivariate kernel density esti-
mation has been developed by Gray & Moore (31,
The next step after the probability density esti-
mation is to find the smallest region that comprises
some predefined fraction of the total probability
mass. The smallest interval (in 1D) or multidimensional
region (>1D), comprising (1 - α) × 100 percent
of the probability mass, where 0 < α < 1, is known
as the (1 - α)-highest density region (HDR). A 90%
HDR is illustrated in Figure 5.
It is not a trivial task to calculate the HDR, because
it becomes increasingly computationally intensive for
higher dimensions, unless one assumes a Gaussian
model or another parametric distribution (33). An
example of probability density estimation applied to
the model of Dimitrov et al. (19) is illustrated in
Figure 6.
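
The two steps (density estimation, then identification of the HDR) can be illustrated in one dimension with the following Python sketch. It is not the algorithm of Gray & Moore. It assumes a Gaussian kernel with a fixed, hypothetical bandwidth, and it approximates the HDR boundary by the α-quantile of the densities at the training points, which is one simple sample-based choice among several. All data values are invented.

```python
import math

def gaussian_kde(train, bandwidth):
    """Return a function that estimates the 1D probability density."""
    norm = 1.0 / (len(train) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - t) / bandwidth) ** 2)
                          for t in train)

    return density

def hdr_threshold(train, density, alpha):
    """Density level below which roughly a fraction alpha of training points fall."""
    levels = sorted(density(t) for t in train)
    return levels[int(alpha * len(levels))]

# Hypothetical descriptor values: two clusters separated by an empty region
train = [1.0, 1.2, 1.4, 1.5, 1.7, 1.9, 2.0, 2.2, 2.4, 2.6,
         7.0, 7.2, 7.4, 7.5, 7.7, 7.9, 8.0, 8.2, 8.4, 8.6]
density = gaussian_kde(train, bandwidth=0.3)   # hypothetical bandwidth
cutoff = hdr_threshold(train, density, alpha=0.10)

def in_hdr(x):
    """True if x lies in the approximate 90% highest density region."""
    return density(x) >= cutoff

inside_cluster = in_hdr(1.8)   # within the first cluster
in_gap = in_hdr(4.5)           # within the overall range, but in the empty region
```

A simple range check would accept the value 4.5, since it lies between the minimum and maximum of the training data, whereas the density-based region excludes it.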
Another probabilistic approach for assessing the
AD of regression-based QSARs, based on the “joint
applicability domain”, has been proposed by T.
Aldenberg (unpublished results). In this approach,
the probability contours for the joint distribution of
X (predictor) and Y (response) is calculated on the
basis of the bivariate or multivariate distribution.
This is then used to identify data points as inside or
outside the domain.
Probability cut-offs can be provided by the well-known
graphical device in exploratory data analysis
called the box-and-whisker (or simply box) plot (34).
For univariate empirical data, “near” outside val-
ues are characterised by being beyond 1.5 times the
interquartile range above the third quartile (or
below the first quartile). Extreme (“far”) outside
values are those located beyond 3.0 times the
interquartile range above the third quartile, or
below the first quartile. When a data set displays
several near or far outside values, it may contain
erroneous data, and/or the data may come from
another distribution (for example, a more skewed
distribution), in which case a transformation may
be needed.
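
The box plot criteria can be written out as a minimal Python sketch. The data are hypothetical, and the quartile convention (the "inclusive" method of the standard library) is an assumption, since several conventions for computing quartiles are in use.

```python
import statistics

def outside_value_fences(values):
    """Return the (lower, upper) limits for near and far outside values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    near = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    far = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
    return near, far

def classify(value, near, far):
    """Label a value as inside, near outside or far outside."""
    if value < far[0] or value > far[1]:
        return "far outside"
    if value < near[0] or value > near[1]:
        return "near outside"
    return "inside"

# Hypothetical univariate data
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
near, far = outside_value_fences(values)
labels = [classify(v, near, far) for v in (5.0, 15.0, 20.0)]
```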
The idea of “outside values” can be transferred to
the multivariate normal distribution. Points within
the (elliptical) contour that captures 99% of the
probability distribution are inside points. They are
in the “code green” zone. The points between this
boundary and the outer ellipse covering 99.99% of
the cases are “outside”. A model should be used for
those values with caution. This is the “code orange”
zone. Data points outside the outer ellipse are in the
“code red” zone. The model should not be used for
these predictions.
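
For the bivariate case, the three zones can be sketched with the squared Mahalanobis distance. The sketch relies on a standard property of the bivariate normal distribution: the ellipse containing a probability mass P is the set of points whose squared Mahalanobis distance does not exceed -2 ln(1 - P). The data points and function names are hypothetical, and the mean and covariance are simply estimated from the sample.

```python
import math

def fit_bivariate_normal(points):
    """Sample mean and covariance of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance, using the inverse of the 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def zone(point, mean, cov):
    """Assign a point to the green, orange or red zone."""
    d2 = mahalanobis_sq(point, mean, cov)
    if d2 <= -2.0 * math.log(1.0 - 0.99):      # inside the 99% ellipse
        return "green"
    if d2 <= -2.0 * math.log(1.0 - 0.9999):    # inside the 99.99% ellipse
        return "orange"
    return "red"

# Hypothetical (predictor, response) pairs
data = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (1, -1), (-1, 1)]
mean, cov = fit_bivariate_normal(data)
zones = [zone(q, mean, cov) for q in [(1, 1), (3, 2), (4, 3)]]
```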
Figure 7 illustrates the joint applicability
domain of a single-descriptor QSAR, in which the
predictor is logKow and the response is muta-
genicity (log TA98). The original model by
Debnath et al. (35) was based on a data set of 88
chemicals. The labelled points are 18 chemicals
taken from Glende et al. (36). It can be seen that
eight of the labelled compounds fall in the “green
zone”, eight fall in the “orange zone”, and two in
the “red zone”. At present, the joint AD approach
has been found to be useful when applied to sin-
gle-descriptor QSARs. It is considered possible to
extend the joint AD approach to more-complex
models (involving the x-dependent conditional
distribution of Y), thus providing a more-sophisti-
cated estimate of the predictor (descriptor)
domain, but this needs further research.

Figure 5: Probability density and the 90% highest density region (HDR)
d = the probability density corresponding to the upper (a2) and lower (a1) limits of x, which define the boundaries of the HDR. x defines the 1D property space. The vertical axis shows the probability density.

Statistical QSARs based on Structural Fragments
It is important to discriminate between QSARs that
predict physical properties and those that predict
(bio)chemical activities, because there are concep-
tual differences between the two that affect the
assessment of their ADs. Physical properties (for
example, molecular weight, solubility, partitioning
properties) are global properties, in that every atom
of the molecule contributes to the observed property,
and while the relative locations of the atoms are
relevant, the effect is mostly limited to immediate
neighbouring atoms. The ability to define the ADs
of such QSARs is limited by lack of prior knowledge
of the contribution of the selected descriptors to the
properties of tested molecules. The applicability of a
QSAR is compromised when the values of one or
more descriptors fall outside the range of values
used in the derivation of the model.
Chemical and biochemical activities (for example,
chemical reactivity, metabolism, biodegradation,
some toxic and pharmacological properties, and
active membrane transport) are properties which
are determined primarily by a specific part of the
molecule. These properties result from a specific
“chemical functionality” that must be present in
the molecule, and which enables the molecule to
bind or react in a defined way. Thus, in contrast to
models for physical properties, not all molecules
will exhibit the biological property, since they need
the proper structural feature(s) to be active. If the
“chemical functionality” is unknown, or several
functionalities need to be present for the activity,
the ability to define the AD is complicated by the
difficulty of ensuring that the appropriate struc-
tural features are represented in both the training
and test sets.

Figure 6: Interpolation region estimated with a kernel density approach
The training set was used to derive a two-dimensional linear regression model for acute fish toxicity on the basis of two descriptors: logKow and acceptor delocalisability of the ether oxygen of the ester group (19). The triangle represents a new chemical whose toxicity is to be predicted by the quantitative structure-activity relationship. The legend distinguishes the bands 0–10%, 10–20%, 20–30%, 30–40%, 40–50%, 50–60%, 60–70%, 70–80%, 80–90%, 90–99% and 99–100%.

The occurrence of unknown fragments in
a test set is inevitable when applying QSARs based
on structural fragments. It is widely accepted that
the accuracy of prediction for molecules with
unknown fragments is lower than the accuracy for
those that contain known fragments. Therefore, it is
considered important to inform the user of the
QSAR model when an unknown fragment appears.
Such a warning is generated by the
QSAR models in the MULTICASE platform.
MCASE (and MC4PC) evaluate the structural fea-
tures of a set of non-congeneric molecules and iden-
tify the substructural fragments, called biophores,
that are considered responsible for the observed
activity (37). Chemicals containing the same bio-
phore are grouped into subsets for which independ-
ent QSAR models are developed. The descriptors of
these models are called modulators, and consist of
fragments found within the individual sets, as well
as calculated transport, partitioning and quantum
mechanical properties that may be relevant to the
chemical activities of the individual biophores. The
result of this operation is a set of QSAR models for
the congeneric sets of molecules containing the
same biophore, which is identified as a “chemical
functionality” responsible for the observed property.
The validity of the model is expressed as the proba-
bility that the corresponding biophore is indeed
related to activity. Model predictions are also accom-
panied by an assessment of whether every group of
three bonded non-hydrogen atoms in the test struc-
ture has been seen and therefore evaluated by the
model builder, or not seen and therefore of unknown
effect on the prediction results (38, 39).
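
The idea of checking every group of three bonded non-hydrogen atoms against the training set can be illustrated with a deliberately simplified Python sketch. It is not the MULTICASE implementation: the molecular representation (lists of element symbols and bonded index pairs), the function names and the example structures are all hypothetical, and bond orders and ring membership are ignored.

```python
def three_atom_fragments(atoms, bonds):
    """Collect element triples for every chain of three bonded non-hydrogen atoms.

    atoms: list of element symbols; bonds: list of (i, j) atom index pairs.
    """
    neighbours = {i: set() for i, el in enumerate(atoms) if el != "H"}
    for i, j in bonds:
        if i in neighbours and j in neighbours:
            neighbours[i].add(j)
            neighbours[j].add(i)
    fragments = set()
    for centre, adjacent in neighbours.items():
        ordered = sorted(adjacent)
        for pos, a in enumerate(ordered):
            for b in ordered[pos + 1:]:
                # Sort the end atoms so that A-B-C and C-B-A give the same key
                ends = sorted([atoms[a], atoms[b]])
                fragments.add((ends[0], atoms[centre], ends[1]))
    return fragments

def unknown_fragments(query, training_set):
    """Fragments of the query that occur in no training structure."""
    seen = set()
    for atoms, bonds in training_set:
        seen |= three_atom_fragments(atoms, bonds)
    return three_atom_fragments(*query) - seen

# Hypothetical heavy-atom graphs
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propane = (["C", "C", "C"], [(0, 1), (1, 2)])
chloroethanol = (["Cl", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])

warning = unknown_fragments(chloroethanol, [ethanol, propane])
```

A non-empty result would trigger a warning that part of the query structure has not been seen by the model builder.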
Another way of expressing the AD of a QSAR
with structural fragments is by calculating similar-
ity measures. An example based on the experience
of Leadscope Inc. includes the generation of a warn-
ing for unknown substituents, as well as determi-
nation of whether a compound is sufficiently
similar to the training set. For the latter, the
Tanimoto score is calculated. A test compound or
the entire test set can be compared with the entire
training set. In Figure 8, a test set is compared with
the training set for two models. For each compound
in the test set, a pairwise similarity score is computed
against every other test compound. The median pairwise similarity within the
test set is then obtained for each test compound.
Next, for each compound in the test set, a pairwise
similarity score is calculated for all the compounds
in the training set. The median pairwise similarity
between the test and training sets is calculated for
each test compound. Figure 8 shows the correlation
of median similarities “within” the test set against
the median similarities “between” test and training
sets. Those compounds within the density ellipse
are structurally similar to those in the training set,
and can be considered to lie within the AD of the
model.
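
The median similarity calculation can be sketched as follows. This is an illustration rather than the Leadscope procedure. Each compound is represented by a hypothetical set of structural features, and the Tanimoto score is taken in its usual set form, i.e. the number of shared features divided by the number of features present in either compound.

```python
import statistics

def tanimoto(a, b):
    """Tanimoto score for two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def median_within(index, compounds):
    """Median similarity of one compound to all other compounds of the same set."""
    return statistics.median(
        tanimoto(compounds[index], other)
        for pos, other in enumerate(compounds) if pos != index
    )

def median_between(compound, training):
    """Median similarity of one test compound to all training compounds."""
    return statistics.median(tanimoto(compound, t) for t in training)

# Hypothetical feature sets
training = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}]
test = [{"a", "b", "c", "d"}, {"a", "b"}, {"x", "y"}]

within = [median_within(i, test) for i in range(len(test))]
between = [median_between(c, training) for c in test]
```

Plotting the "within" values against the "between" values for each test compound gives the kind of scatter on which a density ellipse can then be drawn.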

Figure 7: Joint applicability domain of a single-descriptor quantitative structure-activity relationship
The elliptical contours contain a given fraction of the bivariate normal probability mass: 50%, 99% (used as the AD) and 99.99%. The model may be used for predictions lying inside the inner two ellipses (green zone). However, the model should not be applied outside the outer ellipse (red zone), and should only be used with caution for predictions lying between the outer two ellipses (orange zone).

The AD of QSARs with structural fragments can
also be defined in terms of the coverage of fragmental
space. In this case, the above-mentioned interpolation
methods (for example, ranges, distances,
leverages and probability density approaches) are
applicable. However, some data pre-processing is
generally needed, due to the nature of the data. For
example, a scaling of descriptors (for example, in
the range 0–1) is useful when the descriptors dis-
play different numerical ranges, to ensure that all
the variables have the same chance of influencing a
regression model. Another important pre-treat-
ment is to analyse the correlations between the
descriptors, and, if the descriptors are highly corre-
lated, to apply PCA to develop new orthogonal axes
(new descriptors).
A comparative study of different interpolation
methods (40) was carried out on the Syracuse Research
Corporation (SRC) KOWWIN model for logKow pre-
diction, which uses the group-contribution method
(41). In this study, it was shown that the probabil-
ity density approach is more restrictive than the
range, distance and leverage methods. As a crite-
rion of success, the root mean square error (RMSE)
of the compounds from an external validation set
(the same for all methods) that fall in the AD of the
model was considered (at a cut-off threshold equal
to the lowest probability density value of a training
set data point). The result was not surprising, since
probability density approaches do not require
descriptors to be normally distributed, and are
therefore well suited for descriptors based on structural
fragments. However, the result was also
attributed to a dramatic reduction in the number of
structures that were classified “in the AD” (nearly
half of the total number of chemical structures that
were considered to be in the AD by the other methods).
The SAR Applicability Domain
The term SAR describes a qualitative relationship,
which means that no mathematical model needs to
be applied in order for a prediction for a new chem-
ical to be made. The simplest example is the struc-
tural alert. Although a SAR may be qualitative, it is
not necessarily the case that the derivation of the
SAR itself is achieved by non-statistical means. A
structural alert, for example, can be identified by
the automated statistical analysis of a training set
of chemicals or by expert judgement. In the former
case, useful structural alerts can be generated, even
in the absence of mechanistic understanding. In the
latter case, additional information relating to the
known or putative mechanism of action of a chemical
can be considered, and this additional information
may compensate for gaps in the available data.

Figure 8: Applicability domain based on a measure of similarity defined by Leadscope Inc. (Columbus, OH, USA)
a) Global; b) aromatic amines.
The plots distinguish points with correct predictions from points with incorrect predictions. Axes in both panels: median correlation within test set; median correlation between training/test sets.

An example of the structural alert approach has
been published by Ashby and co-workers for the
identification of chemicals with carcinogenic potential
based on their DNA reactivity, either directly or
following metabolic activation (42, 43). Such structural
alerts can be applied to new chemicals by
expert judgement, or can be incorporated into com-
puter systems (44). The automated use of structural
alerts facilitates their rapid and reproducible use in
the absence of human error.
Several computer systems for toxicity prediction
make use of structural alerts (in addition to rules
based on physicochemical properties). DEREK for
Windows (45) is an example of such a system, and is
used here for illustration purposes. Other systems
include HazardExpert (46), the OncoLogic carcino-
genicity prediction program (47), and the decision
support system for irritation and corrosivity devel-
oped by the German Bundesinstitut für
Risikobewertung (BfR), formerly called the
Bundesinstitut für gesundheitlichen Verbraucher-
schutz und Veterinärmedizin (BgVV; 48).
DEREK for Windows is an expert system that
makes use of a knowledge base composed of struc-
tural alerts, examples and rules, each of which may
contribute to the toxicity predictions made by the
system. Each alert in the knowledge base describes
the relationship between a structural feature, or
toxicophore, and the toxicological endpoint with
which it is associated. When a chemical is
processed, the system reports any matches of alerts
present in the knowledge base with the query struc-
ture. For example, decanoyl chloride is predicted by
DEREK for Windows to be a possible skin sensitiser
in humans, as a result of the presence of an alert
describing the relationship between a carboxylic
acid halide group and the occurrence of skin sensitisation.
The AD for an alert of this type can be defined
simply in terms of the scope of the alert. If a chem-
ical contains the alert, then it lies within the
domain; if it does not contain the alert, then it lies
outside the domain, in which case no conclusion for
or against toxicity can be drawn. This scenario is
analogous to the situation with any QSAR model.
For example, a query chemical may lie outside the
AD of a QSAR describing the skin sensitisation of
carboxylic acid halides, either because it is a car-
boxylic acid halide, but possesses some property
which is not adequately represented in the model,
or because it is a member of an entirely different
chemical class. In either case, no conclusion can be
drawn about the skin sensitisation potential of the
chemical, because activity may still occur by some
entirely different mechanism.
The skin sensitisation alert for carboxylic acid
halides in DEREK for Windows is comparatively
simple in scope and, as a result, many chemicals
which contain this functional group will activate
the alert. In practice, it is unlikely that all such
compounds in the chemical universe will exhibit
skin sensitisation. However, in the absence of toxi-
city data for chemicals of sufficient structural diver-
sity, more-stringent constraints to the scope of the
alert cannot currently be defined. This is equivalent
to the generation of a QSAR model from a training
set of chemicals which identifies, for example, an
electronic descriptor as the primary determinant of
the observed biological activity. Other physico-
chemical properties, such as steric parameters, may
also be influential, but, unless sufficient variation
in these parameters is represented within the train-
ing set, their importance may not be identified dur-
ing development of the QSAR model. As a
consequence, a query chemical with steric proper-
ties which differ significantly from the chemicals in
the training set, would appear to lie within the AD,
but the resulting prediction could be unreliable.
More-refined alerts can be derived for chemical
classes where more toxicity data and other support-
ing evidence are available. Refinements of the alert
provide information on the boundaries of an alert,
and can take at least two forms. One type of refine-
ment refers to the presence of particular functional
groups and their locations, which leads to some
compounds within the general alert class being
excluded as active. Another type of refinement
refers to a range of physicochemical values, outside
which the alert is not considered a reliable indicator
of activity.
As an illustration of the association of physico-
chemical ranges with structural alerts, DEREK for
Windows makes use of SAR rules which are depend-
ent on physicochemical and toxicologically relevant
biological properties. For example, query chemicals
with a molecular weight above 1000 are considered
unlikely to result in oestrogenic activity, according
to reported screening filters (49). The rule con-
cerned in this case can be applied universally, on
the basis that a molecular weight can be unambigu-
ously calculated for any chemical of defined compo-
sition, and on the mechanistic understanding that
chemicals above a certain size will be physically too
large to bind to the oestrogen receptor. On the
other hand, the system considers that the likelihood
of skin sensitisation in humans is reduced for chem-
icals with a percutaneous absorption below
10⁻⁵ cm/hour. Currently, percutaneous absorption is
determined from the Potts & Guy equation (50), as
applied to all chemicals for which molecular weight
and logKow values can be calculated. The algorithm
used to calculate the logKow value for use in the
Potts & Guy equation, can itself be considered a
QSAR model and will therefore be associated with
its own AD.
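
The two physicochemical rules described above can be sketched as simple range checks. The sketch does not reproduce the DEREK for Windows implementation. The coefficients of the Potts & Guy equation are not given in this report; the values used below are the commonly quoted form for the permeability coefficient in cm/hour and should be treated as an assumption. The function names and input values are hypothetical.

```python
def potts_guy_log_kp(log_kow, mol_weight):
    """log10 of the permeability coefficient Kp in cm/hour.

    Assumption: commonly quoted Potts & Guy coefficients, not taken
    from this report.
    """
    return -2.7 + 0.71 * log_kow - 0.0061 * mol_weight

def rule_flags(log_kow, mol_weight):
    """Apply the molecular weight and percutaneous absorption rules."""
    kp = 10.0 ** potts_guy_log_kp(log_kow, mol_weight)
    return {
        "oestrogenicity_unlikely": mol_weight > 1000.0,
        "skin_sensitisation_less_likely": kp < 1e-5,
    }

# Hypothetical query chemicals
small = rule_flags(log_kow=2.0, mol_weight=150.0)
large = rule_flags(log_kow=1.0, mol_weight=1200.0)
```

Because the logKow value is itself a model output, a fuller treatment would also check the query chemical against the AD of the logKow model before applying the second rule.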
For QSARs, one approach for avoiding the inap-
propriate application of a model in cases where a
query chemical appears to fall within the AD,
involves the use of a similarity measure to compare
the query chemical with those present in the train-
ing set. This similarity measure should ideally
reflect the mechanistic basis of the QSAR, although
the use of structural analogy alone may be adequate
in situations where the mechanism is unclear. The
same approach could also be applied to the application
of structural alerts for a particular query chemical,
provided that the training set for each alert is
available. Currently, each DEREK for Windows
alert includes links only to selected chemicals from
the training set, chosen to reflect the scope of the
alert, in the interests of conciseness and, in some
instances, data confidentiality.
The Applicability Domain of Decision
Trees and Decision Forests
This section addresses the definition of ADs for
models based on decision tree (DT) and decision for-
est (DF) approaches.
An approach developed by Tong and colleagues
has been applied to a novel DF consensus modelling
method (51, 52), which uses the consensus predic-
tion of multiple, comparable and heterogeneous
DTs. The critical assumption in consensus model-
ling is that multiple models will effectively identify
and encode more aspects of the relationship than
will a single model. The DF method attempts to
minimise overfitting by combining DTs and by max-
imising the differences among individual DTs,
thereby cancelling some random noise. The
approach specifies the AD in terms of prediction
confidence and domain extrapolation.
Prediction confidence is a measure of the cer-
tainty of prediction of a specific chemical. In the DF
method, prediction confidence is probabilistically
calculated for each unknown chemical by averaging
the predictions over all the DTs that are combined
to form the model. Figure 9 gives an example to
illustrate how prediction accuracy and prediction
confidence are related. Prediction accuracy is plot-
ted versus prediction confidence for both a DT and
a DF, for a problem where oestrogen receptor bind-
ing activity was modelled by applying 2000 runs of
10-fold cross-validation to a data set containing 232
chemicals (ER232). A strong trend of increasing
accuracy with increasing confidence is apparent for
both the DT and the DF, as is the substantially
higher accuracy for the DF across the entire range
of confidence levels.
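
The averaging step can be sketched in a few lines. The text states only that the predictions of the individual DTs are averaged; the conversion of the mean into a confidence value between 0 and 1 (the distance of the mean from 0.5, rescaled) is an assumption made here for illustration, as are the function name and the example probabilities.

```python
def forest_prediction(tree_probabilities):
    """Average the active-class probabilities given by the individual trees."""
    mean_p = sum(tree_probabilities) / len(tree_probabilities)
    label = "active" if mean_p > 0.5 else "inactive"
    # Assumption: confidence is the distance of the mean from 0.5, rescaled to 0-1
    confidence = abs(mean_p - 0.5) / 0.5
    return label, mean_p, confidence

# Hypothetical probabilities from four decision trees
agreeing = forest_prediction([1.0, 0.75, 0.75, 0.5])
undecided = forest_prediction([0.5, 0.25, 0.5, 0.75])
```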

Figure 9: Decision forest prediction accuracy versus confidence level for oestrogen receptor binding
The statistics were calculated by applying 2000 runs of 10-fold cross-validation to a data set containing 232 chemicals. Percentage accuracy is plotted against confidence level (0 to 1.0) for the forest and for a single tree.

Domain extrapolation is the prediction accuracy
for a chemical that is outside the training domain,
i.e. the model space defined by the training set
chemicals. Domain extrapolation can be probabilistically
calculated as the average Euclidean distance
by which an unknown chemical’s descriptors in tree
paths are outside the range of those same descriptors,
based on all chemicals in the training set that
determines the AD. Figure 10 shows the results of
evaluation of DF domain extrapolation for two
oestrogen receptor binding data sets, ER232 con-
taining 232 chemicals, and ER1092 containing 1092
chemicals. Specifically, Figure 10 compares the
overall prediction accuracy for chemicals within the
training domain with accuracy for chemicals falling
several degrees of extrapolation outside the focused
domain. In general, the further away the chemicals
are from the training domain, the smaller the pre-
diction accuracy, and the larger the data set, the
greater the degree of extrapolation for a given pre-
diction accuracy.
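
One simple reading of domain extrapolation as "a percentage away from the training domain" is sketched below. It is not the published DF procedure: the sketch averages, over the descriptors used, the distance by which the query lies outside each training range, expressed as a percentage of that range. The descriptor names and values are hypothetical, and every descriptor is assumed to vary within the training set.

```python
def domain_extrapolation(query, training):
    """Mean percentage by which the query lies outside the training ranges.

    query: dict mapping descriptor name to value (descriptors in the tree paths).
    training: dict mapping descriptor name to the list of training values.
    Returns 0.0 when every descriptor is inside its training range.
    """
    excesses = []
    for name, value in query.items():
        low, high = min(training[name]), max(training[name])
        span = high - low   # assumed non-zero
        if value < low:
            excesses.append(100.0 * (low - value) / span)
        elif value > high:
            excesses.append(100.0 * (value - high) / span)
        else:
            excesses.append(0.0)
    return sum(excesses) / len(excesses)

# Hypothetical training ranges and query chemicals
training = {"logKow": [0.0, 2.0, 4.0], "mw": [100.0, 200.0, 300.0]}
inside = domain_extrapolation({"logKow": 1.0, "mw": 250.0}, training)
beyond = domain_extrapolation({"logKow": 5.0, "mw": 250.0}, training)
```

Values computed in this way could then be binned (for example 0–10%, 10–20%, 20–30% and >30%) to compare prediction accuracy across degrees of extrapolation.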
Conclusions
1. The validity of a (Q)SAR model depends on its
goodness-of-fit, robustness and predictivity. The
degree of predictivity needs to be considered in
conjunction with the breadth of the AD, since
there is generally a trade-off between the two.
2. When determining the properties of an individ-
ual chemical by means of a (Q)SAR, the AD of
the (Q)SAR is an essential piece of information
in judging the reliability of the prediction for
that chemical.
3. A general definition of the (Q)SAR AD is pro-
posed by the authors of this report, to cover dif-
ferent types of models and different types of
(statistical) modelling approaches. However, it is
recognised that the specific definitions for indi-
vidual models will be model-dependent.
4. The starting point for the definition of any
specific (Q)SAR AD is the training domain, i.e.
the chemical space of the training set.
Therefore, to develop an adequate definition of
the AD of a given model, the full training set
comprising both structures and descriptors is
needed.
5. Despite the model-specificity of (Q)SAR ADs, a
single conceptual framework can be developed,
based on the various elements that can be
included in the definition of an AD. The ele-
ments identified so far, include the modelling
method (SAR, QSAR, decision tree), the philoso-
phy of the modelling approach (mechanistically
based, statistically based), the types of descrip-
tor used (structural, physicochemical), and the
general AD approach (coverage/interpolation,
chemical similarity). These elements are not
intended to be mutually exclusive, so the definition
of a given AD could be composed of multiple
elements.
6. There are various methods for AD estimation
that are dependent on a number of factors,
including the model dimensionality, the descrip-
tors used and the underlying data distribution.
These various methods will agree to the extent
that the underlying assumptions in model devel-
opment are met.
7. There will always be an uncertainty associated
with any method for assessing (Q)SAR ADs, just
as there is always uncertainty associated with
individual (Q)SAR predictions. One type of
uncertainty is the “unexpected deviation from
the model”, and this relates to the fact that a
prediction may fall within the defined AD of a
model, and yet still be unreliable, due to the fact
that the chemical has some additional prop-
erty/feature, not accounted for by the model.
Another type of uncertainty relates to the fact
that a chemical falling outside the defined AD of
a given model may still exhibit the response
being modelled, because it elicits this response
by a mechanism not accounted for by the model
in question.
Figure 10: Decision forest prediction
accuracy versus domain
extrapolation for two oestrogen
receptor binding data sets (ER232 and
ER1092)
The statistics were generated by performing 2000 runs of
10-fold cross-validation on two data sets (ER232 and
ER1092) containing 232 and 1092 chemicals,
respectively (9). Domain extrapolation for a chemical is
defined as a percentage away from the training domain,
while the prediction accuracy for a domain is
calculated by dividing the number of correct predictions
by the total number of chemicals in that domain.
[Plot not reproduced. x-axis: domain extrapolation (in
domain, 0–10%, 10–20%, 20–30%, >30%); y-axis: percentage
accuracy.]
8. Mechanistic information can be useful as a
supplement to information provided by mathe-
matical/statistical methods. For example, mech-
anistic information may be useful when
assessing the reliability of predictions in “empty
spaces”, or when rationalising unexpected devia-
tions from the model.
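The coverage/interpolation approach named in conclusion 5, and the notion of domain extrapolation used in Figure 10, can be illustrated with a minimal Python sketch. It assumes that the training domain is taken as the per-descriptor range (a bounding box) of the training set, which is only one of the possible AD definitions. The function names and the percentage formula are illustrative assumptions, not the definition used in reference 9.

```python
from typing import List, Sequence, Tuple

Range = Tuple[float, float]


def descriptor_ranges(training: Sequence[Sequence[float]]) -> List[Range]:
    """Return the (min, max) of each descriptor over the training set."""
    columns = zip(*training)  # one tuple per descriptor
    return [(min(col), max(col)) for col in columns]


def extrapolation_percent(query: Sequence[float], ranges: Sequence[Range]) -> float:
    """Largest excursion outside any descriptor range, as a percentage
    of that range's width (assumed formula). 0.0 means inside the box."""
    worst = 0.0
    for value, (lo, hi) in zip(query, ranges):
        width = hi - lo
        if width == 0:
            # Constant descriptor in training: any deviation is outside.
            excess = 0.0 if value == lo else float("inf")
        else:
            excess = max(lo - value, value - hi, 0.0) * 100.0 / width
        worst = max(worst, excess)
    return worst


def in_domain(query: Sequence[float], ranges: Sequence[Range]) -> bool:
    """True if every descriptor lies within its training range."""
    return extrapolation_percent(query, ranges) == 0.0


# Hypothetical two-descriptor training set, for illustration only.
training_set = [[1.0, 10.0], [3.0, 20.0], [2.0, 15.0]]
box = descriptor_ranges(training_set)
print(in_domain([2.5, 12.0], box))              # True
print(extrapolation_percent([3.5, 12.0], box))  # 25.0
```

A bounding box ignores correlations between descriptors and empty regions inside the ranges, so a chemical reported as in-domain by this check may still lie far from any training compound. As conclusions 6 and 7 note, such a check reduces, but does not remove, the uncertainty attached to a prediction.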
Recommendations
1. There is a need to develop a global similarity test
to determine whether the structural features in
a new test compound are covered in the original
training set of chemicals (for example, a quanti-
tative measure of uniqueness relative to the
training set).
2. Further work is needed to elaborate the concep-
tual framework proposed in this report.
3. Further work is needed to explore the possibility
of associating confidence limits with the AD.
Confidence limits could be a useful addition to
the AD, since they would allow “fuzzy”
boundaries rather than simply “black and
white” boundaries.
4. There is a need to develop automated tools, to
help (Q)SAR users to appreciate the limitations
of the (Q)SAR models they are applying.
Mathematical and statistical approaches are par-
ticularly well-suited to automation via com-
puter-based tools.
5. The definition of the AD should be the
responsibility of the model developer rather than
the model user. The reason is that the model developer
generally has a better understanding of the
training set, the method(s) used for model devel-
opment, and the limitations of these methods.
6. There is a need for training to improve aware-
ness of the AD concept and its implications for
the assessment and application of (Q)SAR mod-
els, and to familiarise end-users with automated
tools for the assessment of ADs.
7. This report should be used as an input to the
development of the OECD Guidance Document
on (Q)SAR Validation.
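As an illustration of the kind of similarity test called for in recommendation 1, the following Python sketch computes a simple uniqueness score for a query compound relative to a training set. It assumes that each compound is represented as a set of structural feature labels and that similarity is measured with the Tanimoto coefficient; both choices, and all function and feature names, are assumptions made for illustration and are not prescribed by this report.

```python
from typing import FrozenSet, Sequence, Set

Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Tanimoto coefficient: shared features / all features in either set."""
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def uniqueness(query: Features, training: Sequence[Features]) -> float:
    """1 minus the similarity to the nearest training compound.
    0.0 means an identical feature set is present in the training set."""
    if not training:
        raise ValueError("training set must not be empty")
    return 1.0 - max(tanimoto(query, t) for t in training)


def uncovered_features(query: Features, training: Sequence[Features]) -> Set[str]:
    """Features of the query that occur in no training compound."""
    covered: Set[str] = set().union(*training)
    return set(query) - covered


# Hypothetical feature sets, for illustration only.
training_set = [
    frozenset({"aromatic_ring", "hydroxyl"}),
    frozenset({"aromatic_ring", "nitro"}),
    frozenset({"alkyl_chain", "hydroxyl"}),
]
query = frozenset({"aromatic_ring", "hydroxyl", "chloro"})
print(uncovered_features(query, training_set))  # {'chloro'}
```

The continuous score returned by `uniqueness` could also serve as a starting point for the "fuzzy" boundaries discussed in recommendation 3, since it grades how far a compound lies from the training set instead of giving a yes/no answer. Any threshold applied to it would need to be justified for the model in question.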
References
1. Anon. (2003). Proposal Concerning the Registration,
Evaluation, Authorisation and Restriction of Chem-
icals (REACH). COM(2003)644 final. Brussels, Belg-
ium: European Commission. Website http://europa.
(Accessed 20.12.04).
2. Anon. (2003). Directive 2003/15/EC of the European
Parliament and of the Council of 27 February 2003
amending Council Directive 76/768/EEC on the
approximation of the laws of the Member States
relating to cosmetic products (Text with EEA rele-
vance). Official Journal of the European Union L66,
3. Van der Jagt, K., Munn, S., Tørsløv, J. & de Bruijn,
J. (2004). Alternative Approaches can Reduce the Use
of Test Animals under REACH. Addendum to the
Report “Assessment of Additional Testing Needs
under REACH. Effects of (Q)SARs, Risk Based
Testing and Voluntary Industry Initiatives”. JRC
Report EUR 21405 EN, 25 pp. Ispra, Italy: European
Commission Joint Research Centre. Website (Accessed 16.3.05).
4. Worth, A.P., van Leeuwen, C.J. & Hartung, T.
(2004). The prospects for using (Q)SARs in a chang-
ing political environment: high expectations and a
key role for the Commission’s Joint Research
Centre. SAR & QSAR in Environmental Research
15, 331–343.
5. Schultz, T.W., Cronin, M.T.D., Netzeva, T.I. &
Aptula, A.O. (2002). Structure-toxicity relationships
for aliphatic chemicals evaluated with Tetrahymena
pyriformis. Chemical Research in Toxicology 15,
6. Jaworska, J., Aldenberg, T. & Nikolova, N. (2005).
Review of methods for assessing the applicability
domains of SARs and QSARs. Final report to the
Joint Research Centre (Contract No. ECVA-CCR.
496575-Z). Part 1: Review of statistical methods for
QSAR AD estimation by the training set. Website (Accessed 16.3.05).
7. Gramatica, P., Pilutti, P. & Papa, E. (2003).
Predicting the NO3 radical tropospheric degradability of
organic pollutants by theoretical molecular descrip-
tors. Atmospheric Environment 37, 3115–3124.
8. Eriksson, L., Jaworska, J., Worth, A.P., Cronin,
M.T.D., McDowell, R.M. & Gramatica, P. (2003).
Methods for reliability, uncertainty assessment, and
applicability evaluations of classification and regres-
sion based QSARs. Environmental Health Per-
spectives 111, 1361–1375.
9. Tong, W., Xie, Q., Hong, H., Shi, L., Fang, H. &
Perkins, R. (2004). Assessment of prediction confi-
dence and domain extrapolation of two structure-
activity relationship models for predicting estrogen
receptor binding activity. Environmental Health
Perspectives 112, 1249–1254.
10. Nikolova, N. & Jaworska, J. (2003). Approaches to
measure chemical similarity: a review. QSAR &
Combinatorial Science 22, 1006–1026.
11. Cronin, M.T.D. (2003). Quantitative structure-activ-
ity relationships for acute aquatic toxicity: the role
of mechanism of toxic action in successful modeling.
In Quantitative Structure-Activity Relationship
(QSAR) Models of Mutagens and Carcinogens (ed. R
Benigni), pp. 235–258. Boca Raton, FL, USA: CRC
12. Schultz, T.W., Cronin, M.T.D., Walker, J.D. &
Aptula, A.O. (2003). Quantitative structure-activity
relationships (QSARs) in toxicology: a historical
perspective. Journal of Molecular Structure:
THEOCHEM 622, 1–22.
13. Bradbury, S.P. & Lipnick, R.L. (1990). Introduction:
structural properties for determining mechanisms of
toxic action. Environmental Health Perspectives 87,
14. Schultz, T.W., Sinks, G.D. & Cronin, M.T.D. (1997).
Identification of mechanisms of toxic action of phe-
nols to Tetrahymena pyriformis from molecular
descriptors. In Quantitative Structure-Activity
Relationships in Environmental Sciences, Vol. VII,
Proceedings of QSAR 96, Elsinore, DK, June 24–28,
1996 (ed. F. Chen & G. Schüürmann), pp. 329–342.
Pensacola, FL, USA: SETAC Press.
15. Patlewicz, G., Basketter, D.A., Smith, C.K.,
Hotchkiss, S.A. & Roberts, D.W. (2001). Skin-sensi-
tization structure-activity relationships for aldehy-
des. Contact Dermatitis 44, 331–336.
16. Roberts, D.W. & Patlewicz, G. (2002). Mechanism
based structure-activity relationships for skin sensi-
tisation: the carbonyl group domain. SAR & QSAR
in Environmental Research 13, 145–152.
17. Patlewicz, G.Y., Wright, Z.M., Basketter, D.A.,
Pease, C.K., Lepoittevin, J.P. & Arnau, E.G. (2002).
Structure-activity relationships for selected fra-
grance allergens. Contact Dermatitis 47, 219–226.
18. Anon. (2000). US patent no. 6 036 349: Method and
Apparatus for Validation of Model-based Pre-
dictions. Issued March 14, 2000. Washington, DC:
19. Dimitrov, S.D., Mekenyan, O.G., Sinks, G.D. &
Schultz, T.W. (2003). Global modeling of narcotic
chemicals: ciliate and fish toxicity. Journal of
Molecular Structure: THEOCHEM 622, 63–70.
20. Preparata, F.P. & Shamos, M.I. (1991). Computa-
tional Geometry: An Introduction, 390pp. New York,
NY, USA: Springer Verlag.
21. Stanton, D.T. & Jurs, P.C. (1991). Computer-
assisted prediction of normal boiling points of
furans, tetrahydrofurans, and thiophenes. Journal
of Chemical Information and Computer Sciences 31,
22. Stanton, D.T., Egolf, L.M. & Jurs, P.C. (1992).
Computer-assisted prediction of normal boiling
points of pyrans and pyrroles. Journal of Chemical
Information and Computer Sciences 32, 306–316.
23. Stanton, D.T. (2000). Development of a quantitative
structure-property relationship model for estimating
normal boiling points of small multifunctional
organic molecules. Journal of Chemical Information
and Computer Sciences 40, 81–90.
24. Seber, G.A.F. (2004). Multivariate Observations,
686pp. New York, NY, USA: John Wiley & Sons.
25. Atkinson, A.C. (1991). Plots, Transformation,
Regression, 282pp. Oxford, UK: Clarendon Press.
26. Gramatica, P., Pilutti, P. & Papa, E. (2004).
Validated QSAR prediction of OH tropospheric
degradation of VOCs: splitting into training-test sets
and consensus modeling. Journal of Chemical
Information and Computer Sciences 44, 1794–1802.
27. Tropsha, A., Gramatica, P. & Gombar, V. (2003).
The importance of being earnest: validation is the
absolute essential for successful application and
interpretation of QSPR models. QSAR &
Combinatorial Science 22, 69–77.
28. Kulkarni, S.A., Raje, D.V. & Chakrabarti, T. (2001).
Quantitative structure-activity relationships based
on functional and structural characteristics of
organic compounds. SAR and QSAR in Environ-
mental Research 12, 565–591.
29. Gramatica, P. (2004). Evaluation of Different Stat-
istical Approaches to the Validation of Quantitative
Structure-activity Relationships. Final report to the
Joint Research Centre. Contract No. ECVA-CCR.
496576-Z. 177pp. Website
Documents (Accessed 16.3.05).
30. Silverman, B. W. (1986). Density Estimation for
Statistics and Data Analysis, 176pp. London, UK:
Chapman & Hall.
31. Gray, A. & Moore, A. (2003). Nonparametric Density
Estimation: Toward Computational Tractability. In
Proceedings of SIAM International Conference on
Data Mining, San Francisco, USA, 2003, 9p.
Website (Accessed
32. Gray, A. & Moore, A. (2003). Very fast multivariate
kernel density estimation via computational
geometry. Proceedings of Joint Statistics Meeting
2003. Alexandria, VA, USA: The American Stat-
istical Association (Website
meetings/jsm/2003 (Accessed 16.3.05).
33. Chen, M-H. & Shao, Q-M. (1999). Monte Carlo
estimation of Bayesian credible and HPD intervals.
Journal of Computational and Graphical Statistics
8, 69–92.
34. Tukey, J.W. (1977). Exploratory Data Analysis,
688pp. Reading, MA, USA: Addison-Wesley.
35. Debnath, A.K., Debnath, G., Shusterman, A.J. &
Hansch, C. (1992). A QSAR investigation of the role
of hydrophobicity in regulating mutagenicity in the
Ames test. I. Mutagenicity of aromatic and het-
eroaromatic amines in Salmonella typhimurium
TA98 and TA100. Environmental and Molecular
Mutagenesis 19, 37–52.
36. Glende, C., Schmitt, H., Erdinger, L., Engelhardt, G.
& Boche, G. (2001). Transformation of mutagenic
aromatic amines into non-mutagenic species by alkyl
substituents. Part I. Alkylation ortho to the amino
function. Mutation Research 498, 19–37.
37. Klopman, G. (1992). MULTICASE: a hierarchical
computer automated structure evaluation program.
Quantitative Structure-Activity Relationships 11,
38. Klopman, G. & Chakravarti, S.K. (2003). Structure-
activity relationship study of a diverse set of estro-
gen receptor ligands (I) using MultiCASE expert
system. Chemosphere 51, 445–459.
39. Klopman, G. & Chakravarti, S.K. (2003). Screening
of high production volume chemicals for estrogen
receptor binding affinity (II) by the MultiCASE
expert system. Chemosphere 51, 461–468.
40. Jaworska, J., Aldenberg, T. & Nikolova, N. (2005).
Review of methods for assessing the applicability
domains of SARs and QSARs. Final report to the
Joint Research Centre (Contract No. ECVA-
CCR.496575-Z). Part 2: An approach to determining
applicability domain for QSAR group contribution
models: an analysis of SRC KOWWIN. Website (Accessed 16.3.05).
41. Meylan, W.M. & Howard, P.H. (1995). Atom frag-
ment contribution method for estimating octanol-
water partition-coefficients. Journal of Pharma-
ceutical Sciences 84, 83–92.
42. Ashby, J., Tennant, R.W., Zeiger, E. & Stasiewicz, S.
(1989). Classification according to chemical struc-
ture, mutagenicity to Salmonella and level of car-
cinogenicity of a further 42 chemicals tested for
carcinogenicity by the U.S. National Toxicology
Program. Mutation Research 223, 73–103.
43. Tennant, R.W. & Ashby, J. (1991). Classification
according to chemical structure, mutagenicity to
Salmonella and level of carcinogenicity of a further
39 chemicals tested for carcinogenicity by the U.S.
National Toxicology Program. Mutation Research
257, 209–227.
44. Ridings, J.E., Barratt, M.D., Cary, R., Earnshaw, C.G.,
Eggington, C.E., Ellis, M.K., Judson, P.N., Langowski,
J.J., Marchant, C.A., Payne, M.P., Watson, W.P. &
Yih, T.D. (1996). Computer prediction of possible toxic
action from chemical structure: an update on the
DEREK system. Toxicology 106, 267–279.
45. Judson, P.N., Marchant, C.A. & Vessey, J.D. (2003).
Using argumentation for absolute reasoning about
the potential toxicity of chemicals. Journal of
Chemical Information and Computer Sciences 43,
46. Smithing, M.P. & Darvas, F. (1992). HazardExpert:
an expert system for predicting chemical toxicity. In
Food Safety Assessment (ed. J.W. Finley, S.F.
Robinson & D.J. Armstrong), ACS Symposium
Series, pp. 191–200. Washington, DC, USA: Ameri-
can Chemical Society.
47. Woo, Y., Lai, D.Y., Argus, M.F. & Arcos, J.C. (1995).
Development of structure-activity relationship rules
for predicting carcinogenic potential of chemicals.
Toxicology Letters 79, 219–228.
48. Gerner, I., Zinke, S., Graetschel, G. & Schlede, E.
(2000). Development of a decision support system for
the introduction of alternative methods into local
irritancy/corrosivity testing strategies. Creation of
fundamental rules for a decision support system.
ATLA 28, 665–698.
49. Hong, H., Tong, W., Fang, H., Shi, L., Xie, Q., Wu, J.,
Perkins, R., Walker, J.D., Branham, W. & Sheehan,
D.M. (2002). Prediction of estrogen receptor binding
for 58,000 chemicals using an integrated system of a
tree-based model with structural alerts. Environ-
mental Health Perspectives 110, 29–36.
50. Potts, R.O. & Guy, R.H. (1992). Predicting skin per-
meability. Pharmaceutical Research 9, 663–669.
51. Tong, W., Hong, H., Fang, H., Xie, Q. & Perkins, R.
(2003). Decision forest: combining the predictions of
multiple independent decision tree models. Journal
of Chemical Information and Computer Sciences 43,
52. Tong, W., Hong, H., Xie, Q., Xie, L., Fang, H. &
Perkins, R. (2004). Assessing QSAR limitations: a
regulatory perspective. Current Computer Aided
Drug Design 1, 65–72.