EADB: An Estrogenic Activity Database for Assessing Potential Endocrine Activity

Endocrine-active chemicals can potentially have adverse effects on both humans and wildlife. They can interfere with the body’s endocrine system through direct or indirect interactions with many protein targets. Estrogen receptors (ERs) are one of the major targets, and many endocrine disruptors are estrogenic and affect the normal estrogen signaling pathways. However, ERs can also serve as therapeutic targets for various medical conditions, such as menopausal symptoms, osteoporosis, and ER-positive breast cancer. Because of the decades-long interest in the safety and therapeutic utility of estrogenic chemicals, a large number of chemicals have been assayed for estrogenic activity, but these data exist in various sources and different formats that restrict the ability of regulatory and industry scientists to utilize them fully for assessing risk-benefit. To address this issue, we have developed an Estrogenic Activity Database (EADB; http://www.fda.gov/ScienceResearch/BioinformaticsTools/EstrogenicActivityDatabaseEADB/default.htm) and made it freely available to the public. EADB contains 18,114 estrogenic activity data points collected for 8212 chemicals tested in 1284 binding, reporter gene, cell proliferation, and in vivo assays in 11 different species. The chemicals cover a broad chemical structure space and the data span a wide range of activities. A set of tools allow users to access EADB and evaluate potential endocrine activity of chemicals. As a case study, a classification model was developed using EADB for predicting ER binding of chemicals.
Published by Oxford University Press on behalf of the Society of Toxicology 2013.
This work is written by (a) US Government employee(s) and is in the public domain in the US.
JieShen,*
LeiXu, HongFang, Ann M.Richard,§ Jeffrey D.Bray, Richard S.Judson,§ GuangxuZhou,*
Thomas J.Colatsky,|| Jason L.Aungst,||| ChristinaTeng,|||| Steve C.Harris,*
WeigongGe,*
Susie Y.Dai,# ZhenqiangSu,*
Abigail C.Jacobs,**
WafaHarrouk,†† RogerPerkins,*
WeidaTong,*
and
HuixiaoHong*
,1
*Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079;
School of Materials Science and Engineering, Changan University, Nan Er Huan Zhong Duan, Xian City 710064, China; Office of Scientific Coordination,
National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079; §National Center for Computational Toxicology,
Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711; Division of Reproductive and
Urological Products and ||Division of Drug Safety Research, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring,
Maryland 20993; |||Division of Food Contact Notifications, Office of Food Additive Safety, Center for Food Safety and Applied Nutrition, U.S. Food and Drug
Administration, College Park, Maryland 20740; ||||Division of National Toxicology Program, National Institute of Environmental Health Sciences, National
Institutes of Health, Research Triangle Park, North Carolina 27709; #Office of the Texas State Chemist and Veterinary Pathobiology, Texas A&M University,
College Station, Texas 77843; **Office of New Drugs and ††Division of Nonprescription Clinical Evaluation, Center for Drug Evaluation and Research,
U.S. Food and Drug Administration, Silver Spring, Maryland 20993
1 To whom correspondence should be addressed at Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and
Drug Administration, 3900 NCTR Road, Jefferson, AR 72079. Fax: (870) 543-7854. E-mail: huixiao.hong@fda.hhs.gov
Received May 20, 2013; accepted July 16, 2013
Key Words: endocrine disruptor; estrogen receptor; estrogenic
activity; database.
Disclaimer: The findings and conclusions in this article have not been
formally disseminated by the U.S. Food and Drug Administration, the U.S.
Environmental Protection Agency, and National Institutes of Health and should
not be construed to represent any agency determination or policy.
135(2), 277–291, 2013
doi:10.1093/toxsci/kft164
Advance Access publication July 27, 2013
Endocrine-active chemicals and endocrine disruptors (EDs)
have been the subject of intense scientific discussions over the
past two decades because of their potential to interfere with
hormone (endocrine) systems in both humans and wildlife (De
Coster and van Larebeke, 2012; Zoeller et al., 2012). In 1996,
the U.S. Congress passed two laws, the Food Quality Protection
Act of 1996 (FQPA 1996) and the Safe Drinking Water Act
Amendments of 1996 (SDWA Amendments 1996). Pursuant to
the two acts, the U.S. Environmental Protection Agency (EPA)
launched the Endocrine Disruptor Screening Program (EDSP)
to evaluate chemicals for possible effects on the endocrine system
in humans and wildlife (EPA, 1998; Willett et al., 2011).
The endocrine system is composed of glands that produce
and secrete hormones, their corresponding receptors, and
metabolizing and synthesizing (i.e., steroidogenic) enzymes
(Luu-The and Labrie, 2010; Miller, 2002; Nelson and Bulun,
2001; Wilson, 2009), as well as the proteins that compete
for EDs in serum (Hong et al., 2012). EDs can mimic the
effect of endogenous hormones and exert a harmful effect by
causing inappropriate responses or can block the interaction
of hormones with endogenous receptors, resulting in adverse
effects on developmental, reproductive, neurological, and
immune systems (Daston et al., 2003). Binding to hormone
receptors is one class of molecular initiating events that can
lead to disruption of the endocrine function. The tier 1 assays
of EPA’s EDSP were developed to identify chemicals with the
potential to interact with major endocrine pathways, and include
tests for direct interactions with the estrogen receptor (ER) and
androgen receptor (Willett et al., 2011). ER is arguably the
most important receptor and has been the subject of extensive
study (Shanle and Xu, 2011; Watson et al., 2010).
ER belongs to the nuclear receptor superfamily and is widely
expressed in various tissues within the body. In humans, there
are two major subtypes of ER, ER-α and ER-β, which have
a high degree of structural homology (Hall and McDonnell,
2005). EDs can bind to ER and interfere with normal estrogen
signaling through genomic and nongenomic pathways. In the
genomic pathway, estrogenic molecules such as 17β-estradiol
(E2) bind to ER to form ER-ligand complexes, which then cre-
ate ER-ER dimers that recruit cofactors and undergo a substan-
tial conformational change induced by ligand binding, leading
to the mature transcription factor. The ER-ER transcription
factor complex can then directly bind to DNA or create transcription
factors that bind to DNA, such as specificity protein
1 and activator protein 1 (Safe and Kim, 2008; Wormke et al.,
2003). In the nongenomic pathway, estrogenic molecules bind
to membrane-bound ERs and subsequently interfere with the
membrane ER-mediated activation of second-messenger and
protein kinase signaling (Yager and Davidson, 2006). Each
of these mechanisms involving ER binding can lead to down-
stream changes that have the potential to disrupt the endocrine
system depending on dose and situation.
In the last two decades, a large number of chemicals have
been assayed for estrogenic activity by government agencies
and academic research groups (Blair et al., 2000; Branham
et al., 2002; Shen et al., 2010, 2012). In addition, new chemical
entities have been synthesized to target ER for the treatment
of various diseases (Komm and Chines, 2012; Minutolo
et al., 2011; Silverman, 2010). To enable and optimize the
use of the data generated by these studies, we developed the
Estrogenic Activity Database (EADB) to provide both scientific
and regulatory communities a comprehensive and up-to-date
resource for evaluating potential endocrine activity of
chemicals. EADB was developed as a Java Web Start application
backed by an Oracle database. We also implemented EADB in
Instant JChem (http://www.chemaxon.com/), which facilitates
easy browsing, querying, and exporting. The
database incorporates the most extensive collection of chemicals
with publicly available estrogenic activity data obtained
from in vitro and in vivo assays. The chemicals contained in
EADB are from diverse sources, including drugs, pesticides,
industrial chemicals, consumer product chemicals, and new
chemical entities.
The estrogenic activity data curated in EADB are converted
to standardized representations for comparability, demonstrate
a high degree of concordance for the large majority of chemi-
cals for which multiple study results are available, and cover
a wide range of activity types and values. The large chemical
space coverage and standardized representations of estrogenic
activity data make EADB a useful resource and tool for assess-
ing potential estrogenic activity of chemicals.
EADB is publicly available from
http://www.fda.gov/ScienceResearch/BioinformaticsTools/EstrogenicActivityDatabaseEADB/default.htm.
It provides the scientific community a
free resource to search estrogenic activity data for chemicals of
interest and to develop predictive models for assessing potential
estrogenic activity of chemicals for which no estrogenic
activity data are available. As a case study to demonstrate the
utility of the database, a classification model for predicting ER
binding of chemicals was developed using EADB.
The U.S. Food and Drug Administration (FDA) is in the
process of consolidating information on substances into an
agency-wide Substance Registration System (SRS). The SRS
assigns a Unique Ingredient Identifier (UNII) that is used in
product listing to unambiguously identify a substance. A future
version of EADB will contain the UNII, which will allow this
essential resource to be integrated into the regulatory framework
across the FDA and outside the FDA.
MATERIALS AND METHODS
Data sources and curation. Data curated for EADB were extracted
from three sources: the published literature, the FDA’s Endocrine Disruptor
Knowledge Base (EDKB), and other publicly available data sets. Figure 1
illustrates the EADB data sources together with the workflows to curate the
data. The published literature (up through 30 June 2012) covered several
research areas, primarily toxicology, environmental science, and medicinal
chemistry. Estrogenic in vivo data (uterotrophic assays) were transferred from
EDKB. Public data sets from the National Toxicology Program and Ministry
of Economy, Trade and Industry (METI) of Japan were also incorporated into
EADB. The estrogenic data and associated metadata from the raw sources were
separated into four categories: assay descriptions, activity data, references, and
chemicals. As shown in Figure 1, each category contained multiple related data
entries. The assays were further categorized into four types: binding assays,
reporter gene assays, cell proliferation assays, and in vivo assays. For binding
assays, ER subtypes and ER protein domains (full-length or ligand-binding
domain) used in the assays were recorded in EADB. The estrogenic activity
data for each compound in each assay were recorded separately and associated
with the assay description, literature or database reference, and chemical structure
using the internal identification numbers. For compounds in the Chemical
database of European Molecular Biology Laboratory (ChEMBL)
(https://www.ebi.ac.uk/chembl/; Gaulton et al., 2012), the structures, Chemical Abstracts Service
(CAS) numbers, and chemical names (including synonyms) were retrieved
directly from ChEMBL. For all other compounds, the structures were drawn
and named using Marvin Sketch (http://www.chemaxon.com/).
Briefly, all the data were sourced from accessible references and curated
manually. Chemical structures, assay descriptions, and activity data were
parsed into different data formats, and the links between the different types of data
were generated at the same time. The data were manually checked. Obvious
typing errors were corrected immediately. Suspected errors were checked
against the original source and corrected where necessary.
Data standardization. The chemical structure of each compound in
EADB was processed and dearomatized by JChem Standardizer (http://www.chemaxon.com/).
The estrogenic activity data were standardized as described
herein. In each assay, if E2 (a potent and active endogenous estrogen) was
tested, then the estrogenic activity data of the other tested compounds were
normalized to the relative activity (RA) data of E2. For example, the binding
affinity data were normalized to relative binding affinity (RBA) data for the
binding assays in which E2 was assayed. In the same way, the activity data
were normalized to RA data for the reporter gene assays and cell proliferation
assays when E2 was tested. Both RBA and RA of E2 were set to 100
for normalization. After normalization, base-10 logarithmic transformation
was applied to the RA data, resulting in the endpoints logRBA and logRA in
EADB. Except for the unitless RA data, activity data were standardized to
nanomolar concentrations. Some of the activity data in the original references
are qualitative descriptions rather than quantitative values, such as “not active,”
“no binding,” or “weak binding.” We used −10,000 to represent “not active”
and “no binding,” and −5000 to represent “weak binding” in EADB, in order to
facilitate numeric search on the activity data.
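The standardization rules can be illustrated with a minimal Python sketch. The text states only that E2 is assigned a relative value of 100, that a base-10 logarithm is then taken, and that −10,000 and −5000 encode the qualitative results. The ratio used below, RBA = 100 × IC50(E2) / IC50(compound), is a common convention assumed here for illustration, and the function and constant names are hypothetical rather than part of EADB.

```python
import math

# Sentinel values described in the text for qualitative results.
NOT_ACTIVE = -10000.0    # "not active" and "no binding"
WEAK_BINDING = -5000.0   # "weak binding"

QUALITATIVE = {
    "not active": NOT_ACTIVE,
    "no binding": NOT_ACTIVE,
    "weak binding": WEAK_BINDING,
}


def log_rba(ic50_compound_nm, ic50_e2_nm):
    """Return logRBA, ASSUMING RBA = 100 * IC50(E2) / IC50(compound).

    Both concentrations are expected in nanomolar units, and E2 itself
    gets RBA = 100, that is, logRBA = 2.
    """
    if ic50_compound_nm <= 0 or ic50_e2_nm <= 0:
        raise ValueError("concentrations must be positive")
    rba = 100.0 * ic50_e2_nm / ic50_compound_nm
    return math.log10(rba)


def standardize(value, ic50_e2_nm):
    """Map a reported result (number in nM or qualitative label) to a number."""
    if isinstance(value, str):
        return QUALITATIVE[value.strip().lower()]
    return log_rba(float(value), ic50_e2_nm)
```

Under this assumed convention, a compound that binds 100 times more weakly than E2 has logRBA = 0. Because the sentinels lie far below any realistic logRBA, a numeric filter such as "value > −5000" keeps only quantitative results.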
Data schemes and database implementation. The data in EADB
are organized as two hierarchical, relational data trees (Fig. 1 and
Supplementary fig. S1). The first data tree, named DATA, is a joint table containing
the data from the tables of estrogenic activity data, assay descriptions, and references.
This data tree corresponds to the Biological Activity Interface for EADB
(Fig. 1). The compound table is a subtree with a one-to-many relationship with
the root (Supplementary fig. S1). The second data tree, named MOLECULES,
contains the molecular structures and other chemical identification information.
This data tree corresponds to the Chemical Interface for EADB (Fig. 1). It
contains three subtrees: the estrogenic activity data, CAS, and synonym tables
(Supplementary fig. S1). Two interfaces (form views) enable browsing, querying,
and other functions in EADB. The Java Web Start application consists of a
client front-end and a database storing all the data. The Instant JChem version
uses the local database to store and manage the data trees.
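As a rough illustration of the MOLECULES data tree, the following Python sketch models a molecule as a root record with three child collections (activity data, CAS numbers, and synonyms), where each activity record links to an assay and a reference. All class and field names are hypothetical and chosen for readability; the actual EADB schema is given in Supplementary fig. S1 and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    assay_id: int
    assay_type: str          # "binding", "reporter gene", "cell proliferation", "in vivo"
    species: str = "unknown"


@dataclass
class Reference:
    reference_id: int
    citation: str


@dataclass
class ActivityRecord:
    """One data point: a compound measured in one assay, from one reference."""
    compound_id: int
    assay: Assay
    reference: Reference
    endpoint: str            # for example "logRBA"
    value: float


@dataclass
class Molecule:
    """Root of the (hypothetical) MOLECULES tree with its three subtrees."""
    compound_id: int
    structure: str           # any structure encoding; not interpreted here
    cas_numbers: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    activities: List[ActivityRecord] = field(default_factory=list)

    def add_activity(self, assay, reference, endpoint, value):
        """Attach a data point, linking it through the internal compound id."""
        record = ActivityRecord(self.compound_id, assay, reference, endpoint, value)
        self.activities.append(record)
        return record
```

The DATA tree would be the mirror image of this arrangement: activity records at the root, each pointing back to its compound.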
Data sets used in chemical structure space comparison. To examine the
chemical structure space coverage of EADB, three data sets from DrugBank,
FDA’s UNII from SRS, and the DSSTox TOX21S inventory were used to compare
with EADB. DrugBank contains 6516 FDA-approved and experimental
drugs. The chemical structures were downloaded from http://www.drugbank.ca/downloads.
The UNII from SRS were generated based on molecular structures
and/or descriptive information (http://fdasis.nlm.nih.gov/srs/srs.jsp). The
UNII list contains 26,733 unique CAS numbers. The chemical structures were
generated from the CAS numbers using an in-house program and output to a
structure-data file (SDF). The TOX21S data set contains 8193 chemicals undergoing
testing under the cross-federal agency Tox21 high-throughput screening
(HTS) program (http://epa.gov/ncct/Tox21/; Tice et al., 2013). The chemical
structures were downloaded from the EPA DSSTox Web site
(http://www.epa.gov/ncct/dsstox/sdf_tox21s.html).
Calculation of Mold2 molecular descriptors. Chemical structure space
can be described using molecular descriptors. The molecular descriptors used in
this study were generated using Mold2
(http://www.fda.gov/ScienceResearch/BioinformaticsTools/Mold2/default.htm),
a free software tool developed at the
FDA and demonstrated to be reliable for numerically describing chemical structures
(Hong et al., 2008). Specifically, 777 Mold2 descriptors were separately
generated for the chemicals in each of the four SDF files that have molecular
descriptions of chemicals from EADB, DrugBank, UNII, and TOX21S. The
calculated Mold2 descriptors were output in text files for subsequent chemical
structure space coverage analysis.
FIG. 1. EADB overview. After estrogenic activity data and related information were curated from the literature, FDA’s EDKB, and public data sets, data
standardization and chemical structure cleaning were performed. The data were categorized into four different tables, which were internally associated. Two
separate interfaces based on estrogenic activity data and chemical structures were designed and implemented.
Comparison of chemical structure spaces. Principal component analysis
(PCA) was applied on the four data sets. Prior to PCA, the molecular descriptors
were filtered using Shannon entropy (Shannon, 1948). More specifically,
for each of the four data sets, Shannon entropy was first calculated for each
of the 777 Mold2 molecular descriptors. Then, the molecular descriptors
were sorted based on their Shannon entropy values, and the top 300 molecular
descriptors were retained. Thereafter, the 300 molecular descriptors were
scaled into values between 0 and 1. Finally, PCA was applied to the data sets
represented by the scaled values, and the first 3 principal components (PCs)
were used to compare the chemical structure spaces.
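The descriptor filtering and scaling steps that precede PCA can be sketched in plain Python as follows. The text does not say how Shannon entropy was estimated for continuous descriptors, so the equal-width histogram with 20 bins used here is an assumption, and all function names are hypothetical. PCA itself is omitted; any standard implementation could be applied to the scaled matrix.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=20):
    """Entropy (in bits) of one descriptor, estimated from an equal-width histogram.

    The binning scheme is an assumption made for this sketch.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def top_entropy_columns(matrix, keep=300, bins=20):
    """Indices of the `keep` descriptor columns with the highest entropy."""
    columns = list(zip(*matrix))
    ranked = sorted(range(len(columns)),
                    key=lambda j: shannon_entropy(columns[j], bins),
                    reverse=True)
    return sorted(ranked[:keep])


def min_max_scale(matrix):
    """Scale every column to [0, 1]; constant columns are mapped to 0."""
    scaled_columns = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]
```

Here `matrix` is a list of rows (one per chemical) with one column per descriptor. In the setting described above it would have 777 columns, of which the 300 with the highest entropy are kept.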
The chemical structure space of a data set is defined by a cuboid in the
three-dimensional (3D) space of the first 3 PCs. In addition to visual comparison,
chemical structure spaces of two data sets i and j are compared quantitatively
using two measures: chemical coverage ($CC_{ij}$) and chemical distribution
similarity ($CDS_{ij}$). $CC_{ij}$ is the ratio of chemicals covered by both chemical
structure spaces (within both cuboids) to the total number of chemicals in an individual
data set. It is used to measure how many chemicals in a data set are covered
by the other data set. To measure the similarity of distributions of chemicals
between two data sets, each of the two covered cuboids is first divided into 216
subspaces (6 × 6 × 6 in the 3D space). Then, the number of chemicals in each
of the 216 subspaces is counted for the two data sets separately. The $CDS_{ij}$ is
measured using the correlation coefficient between the two vectors of 216
chemical counts.
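The two measures can be sketched in Python as below. Several details are assumptions made for illustration: the cuboid is taken as the axis-aligned bounding box of a data set's PC scores, the 6 × 6 × 6 grid is laid over the region shared by both cuboids, and the correlation coefficient is Pearson's. The function names are hypothetical.

```python
import math


def cuboid(points):
    """Axis-aligned bounding box (min corner, max corner) of 3D PC scores."""
    lo = tuple(min(p[k] for p in points) for k in range(3))
    hi = tuple(max(p[k] for p in points) for k in range(3))
    return lo, hi


def inside(p, box):
    lo, hi = box
    return all(lo[k] <= p[k] <= hi[k] for k in range(3))


def chemical_coverage(set_i, set_j):
    """CC_ij: fraction of chemicals in set_i that lie within both cuboids."""
    box_j = cuboid(set_j)
    # Every point of set_i is inside its own cuboid, so only box_j is checked.
    return sum(inside(p, box_j) for p in set_i) / len(set_i)


def cell_counts(points, box, n=6):
    """Number of points in each cell of an n x n x n grid laid over `box`."""
    lo, hi = box
    counts = [0] * (n ** 3)
    for p in points:
        if not inside(p, box):
            continue
        idx = 0
        for k in range(3):
            span = hi[k] - lo[k]
            cell = int((p[k] - lo[k]) / span * n) if span else 0
            idx = idx * n + min(cell, n - 1)
        counts[idx] += 1
    return counts


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one count vector is constant
    return sxy / math.sqrt(sxx * syy)


def distribution_similarity(set_i, set_j, n=6):
    """CDS_ij computed over the region shared by both cuboids (assumed reading)."""
    (lo_i, hi_i), (lo_j, hi_j) = cuboid(set_i), cuboid(set_j)
    shared = (tuple(max(a, b) for a, b in zip(lo_i, lo_j)),
              tuple(min(a, b) for a, b in zip(hi_i, hi_j)))
    return pearson(cell_counts(set_i, shared, n), cell_counts(set_j, shared, n))
```

Note that $CC_{ij}$ is not symmetric under this reading: a small data set can be fully covered by a large one while covering only a fraction of it.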
Analysis of data concordance. When a chemical has multiple data points
from the same type of assays, e.g., ER binding assays, the data concordance
within the assay type for the chemical is defined and calculated by using
Equation 1 as follows:
$$\text{Concordance} = \frac{\left| d_{\text{active}} - d_{\text{inactive}} \right|}{d_{\text{active}} + d_{\text{inactive}}} \times 100\% \tag{1}$$
where $d_{\text{active}}$ represents the number of active data points and $d_{\text{inactive}}$ indicates
the number of inactive data points. The concordance values of chemicals with
multiple data points from the same type of assays are given in Supplementary
table S2.
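A small Python sketch of the within-assay-type concordance follows. It assumes that Equation 1 is the absolute difference between the active and inactive counts divided by their sum, which is consistent with the later statement that discordant chemicals are those with concordance below 100%. The function name is hypothetical.

```python
def concordance(d_active, d_inactive):
    """Within-assay-type concordance (%) for one chemical.

    ASSUMES Equation 1 is |d_active - d_inactive| / (d_active + d_inactive) * 100.
    """
    total = d_active + d_inactive
    if total == 0:
        raise ValueError("the chemical has no data points for this assay type")
    return abs(d_active - d_inactive) / total * 100.0
```

Under this reading, a chemical with five active and no inactive results scores 100%, three active and one inactive scores 50%, and an even split scores 0%.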
When analyzing data concordance between two related but different types
of assays, e.g., ER binding assays versus reporter gene assays, the overall concordance
is defined and calculated using Equation 2 as follows:
$$\text{Concordance}_{\text{overall}} = \frac{c_{\text{active}} + c_{\text{inactive}} + c_{\text{discordant}}}{c_{\text{total}}} \times 100\% \tag{2}$$
where $c_{\text{active}}$ is the number of chemicals with all active data points from the
two types of assays, $c_{\text{inactive}}$ is the number of chemicals with all inactive data
points, $c_{\text{discordant}}$ is the number of chemicals with discordant data points within
the same assay type (i.e., some are active and the rest are inactive) for both
types of assays, and $c_{\text{total}}$ is the number of chemicals assayed in both assay
types. The concordances for the different activity sets within an assay type (all
active, all inactive, and discordant data) in each type of assays are defined and
calculated by using the following Equations 3–5:
$$\text{Concordance}_{\text{active}} = \frac{c_{\text{active}}}{n_{\text{active}}} \times 100\% \tag{3}$$
$$\text{Concordance}_{\text{inactive}} = \frac{c_{\text{inactive}}}{n_{\text{inactive}}} \times 100\% \tag{4}$$
$$\text{Concordance}_{\text{discordant}} = \frac{c_{\text{discordant}}}{n_{\text{discordant}}} \times 100\% \tag{5}$$
where $n_{\text{active}}$, $n_{\text{inactive}}$, and $n_{\text{discordant}}$ are the numbers of chemicals with all active,
all inactive, and discordant data from the same type of assays, respectively.
$c_{\text{active}}$, $c_{\text{inactive}}$, and $c_{\text{discordant}}$ are the same as defined in Equation 2.
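The cross-assay concordance measures (Equations 2 to 5) can be sketched in Python as follows. Each chemical is first labeled within an assay type as all active, all inactive, or discordant, and the labels from the two assay types are then compared. Restricting the per-class denominators to chemicals tested in both assay types, and computing them relative to the first assay type, are assumptions of this sketch. The function names and input layout are hypothetical, and every chemical is assumed to have at least one call.

```python
from collections import Counter


def status(calls):
    """Label one chemical within an assay type from its 'active'/'inactive' calls."""
    unique = set(calls)
    if unique == {"active"}:
        return "active"
    if unique == {"inactive"}:
        return "inactive"
    return "discordant"


def cross_assay_concordance(type_a, type_b):
    """Return (overall %, per-class % relative to type_a).

    type_a and type_b map a chemical id to its list of calls in that assay type.
    """
    shared = set(type_a) & set(type_b)          # chemicals assayed in both types
    if not shared:
        raise ValueError("no chemical was assayed in both assay types")
    status_a = {c: status(type_a[c]) for c in shared}
    status_b = {c: status(type_b[c]) for c in shared}
    # c_active, c_inactive, c_discordant: chemicals with the same label in both types
    agree = Counter(status_a[c] for c in shared if status_a[c] == status_b[c])
    overall = 100.0 * sum(agree.values()) / len(shared)
    # n_active, n_inactive, n_discordant within assay type A
    n_a = Counter(status_a.values())
    per_class = {label: 100.0 * agree[label] / n_a[label] for label in n_a}
    return overall, per_class
```

Swapping the two arguments gives the per-class values relative to the second assay type, while the overall concordance is unchanged.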
Development of classification model for predicting ER binding activity
of chemicals. As a case study to demonstrate the utility of EADB in predictive
toxicology, we developed a model for predicting ER binding activity.
After removing 103 chemicals with discordant ER binding activity data (i.e.,
concordance calculated from Equation 1 is < 100%), EADB had 4719 ER
binders (chemicals with concordant positive results in all the tested binding
assays) and 675 nonbinders (chemicals with concordant negative results in all
the tested binding assays). The molecular structures of these 5394 chemicals
were exported from EADB in an SDF file that was used for calculating the
777 Mold2 (Hong et al., 2008) molecular descriptors. Thereafter, the descriptors
with constant value across all 5394 chemicals were removed. The values
for each of the remaining 633 Mold2 descriptors were then scaled to values
between 0 and 1. The supervised machine learning methodology, decision forest
(DF; Tong et al., 2003), was used to build the ER binding activity prediction
model based on the scaled Mold2 descriptors. To assess the performance of
the DF model, fivefold cross-validation was conducted as shown in Figure 2.
In one cross-validation step, the 5394 chemicals were randomly split into five
equal portions. Four of the five portions were used to train a DF model, which
was then used to predict ER binding activity for the remaining portion. This
process was repeated sequentially so that each of the five portions was left out
once as the testing set. The prediction results were then averaged to provide
the estimate of model performance. The fivefold cross-validation was repeated
10 times using different random divisions of the 5394 chemicals. Accuracy,
sensitivity, specificity, balanced accuracy, and Matthews correlation coefficient
(MCC, defined and calculated in Equation 6) were calculated and reported for
each of the 10 cross-validations.
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{6}$$
TP, TN, FP, and FN indicate numbers of true positives, true negatives, false
positives and false negatives, respectively.
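The evaluation protocol can be sketched in Python without reference to any particular classifier. The first function computes the five reported metrics from confusion-matrix counts, with MCC in its standard form. The second generates the index splits for 10 repetitions of fivefold cross-validation. The decision forest model itself is not reproduced here, and the function names, the seed, and the interleaved fold assignment are choices made for this sketch.

```python
import math
import random


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, balanced accuracy, and MCC."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        # MCC is conventionally reported as 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def repeated_kfold(n_items, k=5, repeats=10, seed=0):
    """Yield (repeat, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)                        # a new random division per repeat
        folds = [order[i::k] for i in range(k)]   # k nearly equal portions
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield r, train, test
```

With 5394 chemicals, each repeat yields four test portions of 1079 chemicals and one of 1078. Because binders outnumber nonbinders roughly 7 to 1 in this data set, balanced accuracy and MCC are more informative than raw accuracy.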
Statistical analysis. All statistical analyses, including two-tailed t-test,
PCA, and box-plot, as well as Shannon entropy calculation and scaling, were
conducted using packages in R 2.15.1 (http://www.r-project.org/).
RESULTS
Data Curation
A comprehensive set of estrogenic activity data from a
variety of data sources was assembled and curated (Fig. 1),
with the primary data source being the published literature.
We systematically searched the literature published before
30 June 2012 by using Web of Knowledge with keywords
of “estrogen receptor” or “estrogenic.” In total, 14,873 data points
were curated from the literature. Estrogenic activity data from
444 papers published in 21 journals were loaded into EADB
(Supplementary table S1 lists the details of the publications
and the corresponding summaries of the data curated). The
second major data source comprised reports and databases in the
public domain, including 667 estrogenic activity data points from
the Interagency Coordinating Committee on the Validation of
Alternative Methods report “Current Status of Test Methods
for Detecting Endocrine Disruptors: In Vitro Estrogen Receptor
Binding Assays” (http://iccvam.niehs.nih.gov/docs/endo_docs/final1002/erbndbrd/ERBd034504.pdf)
and 938 data points from Risk
Assessment of Endocrine Disrupters, METI, Japan
(http://www.meti.go.jp/english/report/data/g020205ae.html). The
1640 in vivo uterotrophic assay activity data points (1604 of mouse
and 36 of rat) from FDA’s EDKB were also included.
In total, EADB contains 18,114 estrogenic activity data points for
8212 molecules tested in 1284 assays (binding assays, reporter
gene assays, cell proliferation assays, and in vivo assays).
Table 1 presents the statistics on chemicals, data, assays, and
references in EADB.
The same type of assays often measured and reported various
endpoints in different units. Table 2 summarizes the endpoints
and their corresponding data curated in EADB. The
endpoints, units, and transformation methods are recorded in
the database to retain all the original information pertaining to
the data stored in EADB.
Species used in the assays and the procedures of the assays
are important for assessing potential estrogenic activity of
chemicals, and these types of metadata have also been entered
into EADB. Table 3 lists the 11 species verifiably used in the
assays curated into EADB. For 12% (2059) of the data, we could not confirm
the species used in the assays from the original publications
and consequently marked the species as
“unknown” in EADB.
User Interfaces
EADB provides different user interfaces (Fig. 3) to
accommodate users with different knowledge backgrounds
or different purposes. The biological data focused interface
(Fig. 3A) stresses examining chemical structures with a
specific estrogenic activity. The chemical structure focused
interface (Fig. 3B) stresses exploring estrogenic activity data
for specific chemicals. The primary component of EADB
is the activity table located on the right of the window. It
displays the database content and the querying results. The
searching and chemical structure displaying panels are on the
left of the window. They provide the structure searching and
data filtering functions (Fig. 3A). Clicking the “Individual
Compound” button opens the molecule interface, in which
all the information, including the molecular identifications,
properties, and experimental data, is shown
(Fig. 3B).
FIG. 2. The flowchart of fivefold cross-validations. The chemicals were first randomly partitioned into five equal portions. One portion was retained for testing
the DF model that was trained using the remaining four portions. The process was repeated until each of the five portions was retained once, and the results
from predicting the five portions were then averaged as a performance measure of the DF model. The fivefold cross-validation was iterated 10 times with different
random divisions of the chemicals into five portions.
TABLE 1
Summary of Data Contained in EADB
Assay type          Chemicals   Data     Assays   References
Binding             5494        10,853   751      377
Reporter gene       1371        2633     234      80
Cell proliferation  1540        3039     297      107
In vivo             1351        1640     2        1
Total               8212        18,165   1284     447
The Instant JChem version of EADB provides similar
interfaces as shown in Supplementary figure S2. Each interface
contains three functional windows. The project window shows
the organization of database components in EADB, including
data trees and interfaces for using the database. The query
window provides users an easy and visually oriented way to
build complex queries through logical operations. The main
window displays the database content or search results in a
manner suited to different purposes. The project window
and query window are the same for both interfaces. The only
difference between the two interfaces comes from the main
windows. Supplementary figure S2A gives a screenshot of the
chemical structure focused interface, whereas Supplementary
figure S2B shows the biological data focused interface
implemented in EADB. The chemical structure focused
interface displays molecular structure and related information
such as name, physicochemical properties, CAS, synonyms, as
well as links to PubChem (http://pubchem.ncbi.nlm.nih.gov/)
and ChemSpider (http://www.chemspider.com/) in the left
panel (Supplementary fig. S2A). The right panel of the main
window is a joint dynamic table that lists estrogenic activity
data related to the compound displayed in the left panel, with
each row describing one datum and the columns representing
different types of data such as estrogenic activity data, assay
descriptions, and literature references. The biological data
focused interface shows all the activity data in a dynamic table
along with the related assay, reference, and chemical structure
displayed on the right side (Supplementary fig. S2B).
Table 4 summarizes the database functions implemented
in EADB. Detailed instructions on using the database and the
functions implemented are given in the EADB users’ manual
(Supplementary data).
Chemical Space Coverage
The utility of the database for assessing estrogenic potential
of chemicals largely depends on structural similarity between
TABLE2
Statistics of Data in Different Types of Assays and Corresponding Endpoints
Endpoint Description Data
Binding assay 10,853
logRBA Log transfer of RBA compared with E2 8478
IC
50
50% inhibition concentration 1128
EC
50
50% effective concentration 63
Ki Binding afnity 728
Kd Dissociation constant 19
Ka Association constant 41
INH Inhibition 396
Reporter gene 2633
logRA Log transfer of RA compared with E2 1212
logRA10 Log transfer of effective concentration equal to 10% of E2 518
REC10 Effective concentration equal to 10% of E2 13
EC
50
50% effective concentration 60
IC
50
50% inhibition concentration 496
Ki Antagonist activity 25
INH Inhibition percentage 167
Agonism Agonistic estrogenic activities 71
Antagonism Antiestrogenic antagonistic activities 71
Cell proliferation 3039
logRPE Log transfer of relative proliferative effect compared with E2 57
logRPP Log transfer of relative proliferative potency compared with E2 90
logRE Log transfer of relative efciency compared with E2 207
EC
50
50% effective concentration 18
IC
50
50% inhibition concentration 1649
ED
50
50% effective dose 44
GI
50
Concentration of drug that reduces cell growth by 50% 403
IC
30
30% inhibition concentration 10
Ki Constant of cytotoxicity 22
INH Inhibition percentage 480
Agonism Agonistic activity 7
Antagonism Antagonistic activity 52
In vivo 1640
logRP Log transferred relative potency compared with E2 1640
282
ESTROGENIC ACTIVITY DATABASE
TABLE3
Statistics of Data for Species in Different Assay Types
Human Rat Mouse Cattle Sheep Rabbit Trout Lizard Chicken Escherichia coli Monkey Unknown
Binding Chemical 2458 1619 223 423 266 38 42 25 22 11 6 1084
Data 5531 2241 353 579 373 44 42 25 22 22 6 1615
Assay 430 107 24 42 30 7 1 1 1 2 1 105
Reference 195 66 15 34 22 5 1 1 1 1 1 73
Reporter gene Chemical 1225 32 17 0 0 0 0 0 0 0 0 172
Data 2164 32 89 0 0 0 0 0 0 0 0 348
Assay 186 2 7 0 0 0 0 0 0 0 0 39
Reference 66 2 1 0 0 0 0 0 0 0 0 15
Cell proliferation Chemical 1464 1 0 0 0 0 0 0 0 0 0 95
Data 2942 1 0 0 0 0 0 0 0 0 0 96
Assay 292 1 0 0 0 0 0 0 0 0 0 4
Reference 107 1 0 0 0 0 0 0 0 0 0 3
In vivo Chemical 0 19 1343 0 0 0 0 0 0 0 0 0
Data 0 36 1604 0 0 0 0 0 0 0 0 0
Assay 0 1 1 0 0 0 0 0 0 0 0 0
Reference 0 1 1 0 0 0 0 0 0 0 0 0
Total Chemical 4468 1630 1534 423 266 38 42 25 22 11 6 1229
Data 10,637 2310 2046 579 373 44 42 25 22 22 6 2059
Assay 908 111 32 42 30 7 1 1 1 2 1 148
Reference 298 68 16 34 22 5 1 1 1 1 1 86
the chemical being considered and chemicals selected for
comparison. To assess the applicability of EADB in the evalu-
ation of drug product safety in terms of estrogenic potential,
we compared chemical structure space in EADB with drugs
in DrugBank (Knox et al., 2011), a database containing the
most known marketed drug products. The CC
ij
for the chemi-
cal structure spaces of EADB and DrugBank (Fig.4A) were
99.68 and 99.75%, respectively, indicating 99.68% chemicals
in DrugBank are covered by the chemical space of EADB and
99.75% chemicals in EADB are covered by the chemical space
of DrugBank. The CDS
ij
between DrugBank and EADB was
0.793, indicating the distributions of chemicals in chemical
structure spaces of DrugBank and EADB were comparable.
SRS is a compilation of the substances used in drugs,
biologics, foods, and medical devices regulated by the FDA. To
evaluate the utility of EADB in assessing estrogenic potential
for all FDA-regulated products, we compared the chemical
structure spaces of the UNII list in SRS and the chemicals in
EADB. We observed that the chemical structure space of EADB is
similar to that of the UNII list in SRS (Fig. 4B), with a high
CC_ij value for EADB (100%) and a slightly lower CC_ij value
for the UNII list (97.12%). Moreover, the CDS_ij between the
UNII list and EADB in the covered space was 0.802.
As a multiple-agency collaborative project, Tox21 (Kavlock
et al., 2009) aims to develop, validate, and translate
innovative HTS chemical testing methods to characterize key
interactions in cellular pathways for toxicological evaluation
of a wide range of environmental and commercial chemicals that
are regulated by and of interest to the EPA, the National
Institute of Environmental Health Sciences, the National
Institutes of Health (NIH), the NIH Chemical Genomics Center
(NCGC), and the FDA. TOX21S lists the unique chemical inventory
currently undergoing HTS testing in Tox21. The Tox21 library
includes approximately one-third marketed drugs (NCGC), with
the remaining two-thirds comprising a broad diversity of
environmental chemicals of concern for potential exposure or
toxicity. The Tox21 inventory could be considered broadly
representative of the chemical structure space needed for
toxicological evaluation across EPA, NIH, and FDA programs. To
further assess the applicability of EADB for safety evaluation
of potentially estrogenic chemicals, we compared the chemical
structure spaces of EADB and TOX21S. The CC_ij values were
99.29 and 99.70% for TOX21S and EADB, respectively, indicating
that the chemical structure spaces (Fig. 4C) of the two data
sets are similar. Furthermore, the distributions of chemicals
in the covered chemical structure space are similar, and the
CDS_ij was 0.644. After excluding 997 drug compounds from
TOX21S, the chemical structure space of the remaining
environmental chemicals still showed high mutual coverage with
EADB (Fig. 4D), with slightly lower CC_ij values of 99.26 and
99.68% for TOX21S and EADB, respectively, and a slightly lower
CDS_ij of 0.603.
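As an illustration only, the following Python sketch shows one simple way to quantify how much of one chemical set falls inside the principal-component space occupied by another. It is not the CC_ij or CDS_ij calculation used in this study, whose equations are defined in the Methods; the grid-occupancy measure, the function names, the cell width, and the toy coordinates are all assumptions made for this example.

```python
def cell_of(point, origin, width):
    """Return the integer grid-cell index that contains `point`."""
    return tuple(int((x - o) // width) for x, o in zip(point, origin))


def grid_coverage(set_i, set_j, width=1.0):
    """Fraction of points in set_i lying in a grid cell occupied by set_j.

    Assumed stand-in measure; NOT the paper's CC_ij definition.
    """
    dims = range(len(set_i[0]))
    # Anchor the grid at the minimum coordinate over both sets.
    origin = tuple(min(p[k] for p in set_i + set_j) for k in dims)
    occupied_j = {cell_of(p, origin, width) for p in set_j}
    hits = sum(1 for p in set_i if cell_of(p, origin, width) in occupied_j)
    return hits / len(set_i)


# Toy (PC1, PC2, PC3) scores; made-up values, not EADB data.
set_a = [(0.1, 0.1, 0.1), (0.5, 0.5, 0.5), (2.5, 0.2, 0.2)]
set_b = [(0.2, 0.3, 0.4), (5.0, 5.0, 5.0)]

coverage_a_by_b = grid_coverage(set_a, set_b)  # share of set_a covered by set_b
coverage_b_by_a = grid_coverage(set_b, set_a)  # share of set_b covered by set_a
```

A measure of this kind is asymmetric, which is why two coverage values are reported for each pair of data sets.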
Estrogenic ActivityRanges
To ensure EADB contains a suitable set of chemicals with a
sufficiently wide estrogenic activity range for assessing
estrogenic potential, we analyzed the distribution of estrogenic
activity data in EADB. The results plotted in Figure 5
demonstrate a wide estrogenic activity range, including inactive
chemicals.
FIG.3. Snapshots of EADB interfaces. Both the biological data focused interface (A) and chemical structure focused interface (B) consist of the panels of
molecular structure and assay data. The query and ltering functions are implemented in the biological data focused interface. The chemical structure focused
interface can be opened by clicking the “Show” Individual Compound at the top of the biological data focused interface.
EADB contains four types of assays, each with many data
endpoints (Table 2). Among the 10,818 data from binding assays,
8471 were obtained from binding assays with the reference
compound, E2, assayed in the same experiments. Of these 8471
binding data, 799 showed no binding activity or very weak
binding affinity (the left panel of Fig. 5A). The distribution
of the logRBA values of the remaining 7672 is plotted in the
right panel of Figure 5A. In EADB,
(17β)-3-aminoestra-1,3,5(10)-trien-17-ol, a synthesized E2
analog (Wiese et al., 1997), is the most potent ER binder, with
a logRBA value of 3.876, whereas desethylatrazine is the least
potent ER binder, with a logRBA value of −4.7. Thus, EADB
contains ER binding affinity data that span a wide range of
more than eight orders of magnitude. Moreover, the distribution
of logRBA values plotted in the right panel of Figure 5A is not
sparse. The binding activity distributions of other endpoints
with > 100 data points also show wide binding affinity ranges
(Supplementary fig. S3).
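The logRBA scale and the quoted range can be illustrated with a short Python sketch. It assumes the conventional definition RBA = 100 × IC50(E2)/IC50(chemical), under which E2 itself has a logRBA of 2; that definition is not restated in this section, and the function name and IC50 values are hypothetical. The two logRBA extremes are the values reported above.

```python
import math


def log_rba(ic50_e2, ic50_chemical):
    """log10 of relative binding affinity, with E2 set to RBA = 100.

    Assumed conventional definition; both IC50 values must share one unit.
    """
    rba = 100.0 * ic50_e2 / ic50_chemical
    return math.log10(rba)


# Hypothetical IC50 values in mol/L, for illustration only.
e2_reference = log_rba(1e-9, 1e-9)   # E2 against itself
weak_binder = log_rba(1e-9, 1e-6)    # 1000-fold weaker than E2

# Span between the most and least potent binders reported in the text.
log_rba_max = 3.876
log_rba_min = -4.7
orders_of_magnitude = log_rba_max - log_rba_min
```

Because the scale is logarithmic, the difference between the two extremes is directly the number of orders of magnitude covered, here a little over eight.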
TABLE4
Database Functions Implemented in EADB
Function Description
Browsing The database or searching results can be browsed easily in different ways.
Searching Searching can be carried out on structure (substructure search, super structure search, similarity search, full search, R-group search,
and exclusion search) or on data, including numerical data (various estrogenic activity data) and text data (assay descriptions and
literature references), as well as logical combinations of multiple searching operations.
Updating The database can be updated through adding new chemicals or estrogenic activity data and editing the structures or data whenever
errors are found.
Exporting Structures and data can be exported in various formats
FIG.4. Comparisons of EADB chemical structure space with chemical structure spaces of DrugBank (A), UNII (B), TOX21S (C), and the environmental
chemicals in TOX21S (D; after excluding 997 drug compounds). PCA was conducted on the ve sets of chemicals that are described by the chemical descriptors
calculated using Mold
2
. The rst 3 PCs were used to represent chemical structure spaces for each of the ve data sets. The pairwise comparisons of chemical
structure spaces between EADB and the four data sets were performed using scatter plots of the three PCs. Color codes: red for EADB; blue for DrugBank; cyan
for UNII; green for TOX21S and black for TOX21S after excluding drug compounds.
For the reporter gene assays, 1211 data were from experiments
in which the reference compound E2 was also tested. Among those
1211 data, 105 showed no activity in the reporter gene assays
(the left-most bar in Fig. 5B). The distribution of the logRA
values for the remaining 1106 is plotted in the right panel of
Figure 5B. The largest logRA is 2.841 for
(1S,4R,5R)-4-(2-fluoro-4-hydroxyphenyl)-2,2,6-trimethyl-3-oxabicyclo[3.3.1]non-6-en-1-ol
and the smallest logRA is −5.375 for
1-chloro-2-[2,2-dichloro-1-(4-chlorophenyl)ethenyl]benzene.
The majority (> 78%) of logRA values are between −3 and 3,
spanning more than six orders of magnitude (the right panel of
Fig. 5B). Similar to the binding data, the reporter gene
activity data in EADB are not sparse. The reporter gene
activity distributions of other endpoints with > 100 data are
similar to that of the logRA values (Supplementary fig. S3).
The cell proliferation assays covered a variety of experiment
types and correspondingly diverse sets of endpoints (Table 2).
Some assays tested antiproliferation activity of compounds. For
the endpoints of those assays (e.g., IC50, ED50, GI50), the
original data were recorded in EADB without normalization or
transformation. The most prevalent type of cell proliferation
activity data in EADB is the IC50 value (concentration of test
chemical that reduces cell growth by 50%). The 1512 IC50 values
from cell proliferation assays are between 100 µM and 0.1 nM,
spanning a wide range of more than six orders of magnitude.
There are 134 inactive data for endpoint IC50 (the left-most
bar in Fig. 5C) for which an IC50 value cannot be detected or
extrapolated. The distribution of IC50 values plotted in
Figure 5C indicates that the cell proliferation activity data
are not sparse. The cell proliferation activity distributions
of other endpoints with > 100 data are similar to that for IC50
data (Supplementary fig. S5).
The in vivo assay data were generated from two different
experiments. The data have been normalized to the endpoint
logRP, which is the base-10 logarithm of relative potency
compared with E2 (Table 2). Of the 1640 in vivo data, a very
high proportion, 1455, are inactive. The logRP values of the
remaining 185 range from −4 to 4, covering eight orders of
magnitude (Fig. 5D).
FIG.5. Distributions of estrogenic activity data of the most popular endpoints from the four types of assays: logRBA of binding assays (A); logRA of reporter
gene assays (B); logIC
50
of cell proliferation assays (C); logRP of in vivo assays (D). For each assay type, the activity ranges were rst divided into multiple even
bins indicated along the x-axis. The data falling into each bin were then counted and drawn as a bar with the height at y-axis representing the number of data. The
label “NA” located left-most at the x-axes were used to represent inactive data and the label “Weak” at the x-axis for binding assays (A) indicates weak binders.
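The binning procedure described in the caption can be sketched in Python as follows. This is a minimal illustration and not the plotting code used for Figure 5; the function name, the use of None to mark inactive results, the bin edges, and the sample values (apart from the two logRBA extremes quoted in the text) are assumptions.

```python
def bin_activities(values, lo, hi, n_bins):
    """Return (n_inactive, counts) for equal-width bins spanning [lo, hi].

    `None` marks an inactive result and is tallied separately ("NA").
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    n_inactive = 0
    for v in values:
        if v is None:
            n_inactive += 1
            continue
        # Clamp so that values at or beyond the edges land in the end bins.
        idx = int((v - lo) / width)
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    return n_inactive, counts


# Two inactive results plus four log-scale activities (illustrative).
log_values = [None, -4.7, -1.0, 0.5, 3.876, None]
n_na, histogram = bin_activities(log_values, lo=-5.0, hi=4.0, n_bins=9)
```

Keeping the inactive count outside the numeric bins mirrors the separate “NA” bar, so inactive results do not distort the distribution of measured activities.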
Data Concordance
We found that the majority of the chemicals in EADB are
concordant within assay types (Supplementary table S3).
Relatively few chemicals are discordant: 3.3, 13.2, 14.7, and
4.8% for binding, reporter gene, cell proliferation, and in
vivo assays, respectively (Fig. 6A). An interesting
observation, consistent with expectation, is that the more
complex the biological endpoint in an in vitro assay, the
higher the probability of discordance (three left-most bars in
Fig. 6A).
For chemicals with discordant data, the distributions of
chemicals are plotted in Figure 6B. Most of these chemicals
have very low concordance, i.e., the number of active data is
equal or close to the number of inactive data.
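A per-chemical concordance score with this behavior can be sketched in Python. Equation 1 is defined in the Methods and is not reproduced in this section, so the formula below is an assumed form, chosen only because it returns zero when active and inactive results are equal in number and one when all results agree; the function name is hypothetical.

```python
def concordance(n_active, n_inactive):
    """Assumed concordance score: |active - inactive| / (active + inactive).

    Returns 0.0 for an even split and 1.0 when every result agrees.
    This is an illustrative form, not necessarily the paper's Equation 1.
    """
    total = n_active + n_inactive
    if total == 0:
        raise ValueError("at least one assay result is required")
    return abs(n_active - n_inactive) / total


even_split = concordance(3, 3)      # equal numbers of active and inactive data
all_active = concordance(5, 0)      # fully concordant chemical
mostly_active = concordance(3, 1)   # partially discordant chemical
```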
We also analyzed concordance among the different types of
assays. We first identified the chemicals tested using two types
of assays. The concordance between the two compared types of
assays was then analyzed using Equations 2–5.
There were 667 chemicals tested in both binding and
reporter gene assays. Their overall concordance was 73.0%
(Supplementary table S4). Further, concordance between these
assays for active chemicals was much higher than for inactive
chemicals and for chemicals with discordant data (i.e., partially
active). Interestingly, most of the chemicals that did not show
activity in any binding assay (101/110) were active in reporter
gene assays.
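A cross-assay comparison of this kind can be sketched in Python as a simple agreement rate over the chemicals tested in both assay types. Equations 2–5 are not reproduced in this section, so this is an assumed simplification and not the exact calculation; the three call categories follow the text, while the function names, chemical identifiers, and counts are invented for illustration.

```python
def summarize_calls(n_active, n_inactive):
    """Collapse one chemical's results within an assay type into a single call."""
    if n_active == 0 and n_inactive == 0:
        raise ValueError("at least one assay result is required")
    if n_inactive == 0:
        return "all active"
    if n_active == 0:
        return "all inactive"
    return "discordant"


def overall_agreement(calls_a, calls_b):
    """Return (number of shared chemicals, fraction with identical calls)."""
    shared = sorted(set(calls_a) & set(calls_b))
    if not shared:
        raise ValueError("no chemicals were tested in both assay types")
    agree = sum(1 for chem in shared if calls_a[chem] == calls_b[chem])
    return len(shared), agree / len(shared)


# Hypothetical (active, inactive) counts per chemical; not EADB data.
binding = {
    "chem1": summarize_calls(3, 0),
    "chem2": summarize_calls(0, 2),
    "chem3": summarize_calls(2, 1),
    "chem4": summarize_calls(1, 0),
}
reporter = {
    "chem1": summarize_calls(2, 0),
    "chem2": summarize_calls(1, 0),
    "chem3": summarize_calls(1, 1),
    "chem5": summarize_calls(0, 1),
}

n_shared, agreement = overall_agreement(binding, reporter)
```

Only chemicals present in both dictionaries enter the comparison, which is why the number of overlapping chemicals is reported alongside each concordance value.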
Activity data from both binding and cell proliferation assays
were available for 768 chemicals in EADB. The overall
concordance between binding data and cell proliferation data
was 83.5% (Supplementary table S5). The concordance for
chemicals with all active data was higher: 96.2 and 86.2% for
binding and cell proliferation, respectively. It should be
noted, however, that the high concordances in this case are
largely due to the very high “all active” rates for both types
of assays. Similar to the concordance between binding and
reporter gene assays, the concordance between binding and cell
proliferation assays for chemicals having all inactive data was
very low, again heavily influenced by the very low overall
rates of “all inactive” chemicals for both binding and cell
proliferation, some 1–2%.
EADB contains only 145 chemicals with activity data from
both reporter gene and cell proliferation assays. Their overall
concordance was 79.3% (Supplementary table S6). Once again,
high concordance was observed for chemicals with all active
data, 87.4 and 88.8% for reporter gene and cell proliferation,
respectively, whereas concordance for chemicals having all
inactive data was much lower, again largely due to the much
higher incidences of “all active” versus “all inactive” in the
two assay groups.
FIG. 6. Data concordance analysis results. The percentages of chemicals having discordant data in the four types of assays are plotted in a bar chart (A) with
each bar representing one assay type. The distributions of chemicals with discordant data at different concordance levels in the four types of assays are plotted
in (B). The x-axis value of each of the points indicates the concordance of chemicals, whereas the y-axis gives the number of chemicals.
Analysis of data concordance between in vivo and in vitro
assays was conducted, and the results are summarized in
Supplementary table S7 (between binding and in vivo),
Supplementary table S8 (between reporter gene and in vivo),
and Supplementary table S9 (between cell proliferation and in
vivo). Note that the total number of overlapping chemicals
being compared in each case is significantly smaller than for
the in vitro to in vitro comparisons in Supplementary tables
S4–S6. Also noteworthy is the more balanced distribution of
“all active” versus “all inactive” for the in vivo assay group
in each case. As expected, the concordance between in vivo and
in vitro assays was much lower than that between in vitro
assays. Furthermore, chemicals that show estrogenic activity in
an in vivo assay most likely exert estrogenic activity in an in
vitro assay, whereas a large portion of chemicals active in
vitro did not show estrogenic activity in vivo.
Concordance analyses demonstrated that the estrogenic
activity data in EADB are generally concordant both within the
same type of assays and between different types of assays,
indicating the usefulness and reliability of EADB for safety
assessment related to estrogenic potential of chemicals. In
summary, the within-assay type concordance is higher than the
cross-assay type concordance. Moreover, the concordance between
in vivo and in vitro assays is lower than the concordance
between the in vitro assays.
Prediction of ER Binding Activity
As a case study to demonstrate the utility of EADB, a DF
model was developed for predicting ER binding activity. Ten
repetitions of fivefold cross-validation (Fig. 2) were performed
to estimate the predictive performance of the DF model, and
the results are given in Figure 7. The mean accuracy,
sensitivity, specificity, balanced accuracy, and MCC were
93.84% (SD = 0.25%), 98.03% (SD = 0.21%), 64.53% (SD = 2.51%),
81.35% (SD = 1.29%), and 69.66% (SD = 1.50%), respectively.
DISCUSSION
EADB is a rich data source for research and regulatory
scientists to use to assess a chemical’s potential for
estrogenicity. Endocrine disruption in both humans and wildlife
is a priority concern for environmental sciences, particularly
where a no-effect level of exposure may be nonexistent. Given
that the different modes of estrogen action figure prominently
in assessing such potential, and given the continuing need to
assess a vast and growing number of industrial chemicals for
potential estrogenicity, the EADB fills an important need. FDA
regulates therapeutic compounds that may contain ER agonists,
partial agonists, or antagonists, as well as medical devices,
cosmetics, veterinary medicine products, and foods and food
packaging that may contain estrogenic compounds.
FIG. 7. Performance of the 10 iterations of fivefold cross-validation. Accuracy was plotted in diamonds, sensitivity in up-triangles, specificity in down-triangles,
MCC in circles, and balanced accuracy in stars.
EADB’s value is best realized for screening and prediction.
Chemical structure and similarity search capabilities provide a
simple means of comparing an untested chemical structure with
the body of data for tested chemical structures. More valuable
still is the use of the data to supervise the training of
predictive models to estimate estrogenic activity solely based
on chemical structure. Preferably, training set chemicals are
selected to span the chemical structure space and activity
range of untested chemicals on which the model will be used.
Care should be taken to exclude false-positive or
false-negative data points. In addition, a reasonable balance
between active and inactive chemicals is desirable. Such models
are valuable to industry and regulatory authorities alike to
screen for potential estrogenic activity and to inform
decisions as to the need for additional testing. According to
EPA’s EDSP21 work plan (EPA, 2011), EPA will use computational
or in silico models and molecular-based in vitro HTS assays to
prioritize and screen chemicals to determine their potential to
interact with endocrine systems. As a case study, the DF model
we developed for predicting ER binding activity based on EADB
is more accurate in cross-validation than earlier models (Hong
et al., 2008; Tong et al., 2004). Moreover, the performance
(Fig. 7) of the model is very stable, with a very small SD of
accuracy, 0.25%, for the 10 iterations of fivefold
cross-validation. The lower specificity (64.5%) compared with
the sensitivity (98.0%) is likely influenced by the very
unbalanced nature of the data set (87.5% ER binders and only
12.5% ER nonbinders), reflecting less structural information
about inactives compared with actives. The specificity could be
expected to improve if more inactive chemicals were added to
the training set.
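The resampling scheme behind such estimates, 10 iterations of fivefold cross-validation, can be sketched in Python as shown below. The sketch only generates the train/test index splits and does not train a DF model; stratifying each fold by class is an assumption made here to keep the binder/nonbinder ratio constant across folds, and the function name, random seed, and toy labels are hypothetical.

```python
import random


def repeated_stratified_kfold(labels, k=5, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) tuples.

    Within each repeat, every class is shuffled and dealt round-robin
    into k folds so that each fold keeps roughly the same class ratio.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        folds = [[] for _ in range(k)]
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            for pos, i in enumerate(idx):
                folds[pos % k].append(i)
        for f in range(k):
            test = sorted(folds[f])
            train = sorted(i for g in range(k) if g != f for i in folds[g])
            yield rep, f, train, test


# Toy labels with the same 87.5% / 12.5% class ratio as the data set
# described above (1 = ER binder, 0 = nonbinder); not EADB data.
labels = [1] * 35 + [0] * 5
splits = list(repeated_stratified_kfold(labels, k=5, repeats=10, seed=0))
```

Each of the 50 splits would be used to fit the model on the training indices and score it on the test indices; the SD across the 10 repetitions then indicates how stable the performance estimate is.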
EADB provides an open public resource to quickly estimate
the potential estrogenic activity of a new chemical entity
before any testing has begun. Safety evaluation is an important
part of the FDA’s mission, with risk assessment a key part of
the evaluation. For cases of inadvertent exposure in regulated
products where extensive testing in animals and humans is not
routinely conducted, EADB may provide sufficient evidence that
estrogenic activity is unlikely. For drugs, risk assessment
usually takes a number of factors into consideration, such as
the indication, patient population, route of exposure,
duration, and the safety margins calculated from nonclinical
findings at exposures relative to the expected clinical
exposure. EADB would permit rapid assessment to determine
whether testing for endocrine activity should be conducted
earlier in development to mitigate the risk, or whether any
additional studies would even be needed beyond those normally
conducted.
Two publicly available chemical databases, PubChem,
developed by the NIH (Wang et al., 2009), and ChEMBL,
developed by the European Bioinformatics Institute (Gaulton
et al., 2012), provide comprehensive and well-organized
biological data for a large number of chemicals. Two
large toxicity-specific data resources, ACToR (Aggregated
Computational Toxicology Resource, http://actor.epa.gov),
developed by EPA (EPA, 2012), and TOXNET (Toxicology
Data Network, http://toxnet.nlm.nih.gov), developed by NIH’s
National Library of Medicine (Wexler, 2001), provide free
access and easy searching across publicly available data for
evaluating the potential risks of chemicals to human health
and the environment (the full ACToR database is available for
downloading, whereas TOXNET is only searchable online).
However, these resources lack domain-specific knowledge
for endocrine-active chemicals. The Comparative Toxicogenomics
Database (CTD, http://ctdbase.org) provides information about
interactions between environmental chemicals and gene products
and their relationships to diseases (Davis et al., 2013). It
contains information on > 600,000 chemical-gene interactions,
including thousands of chemical-ER interactions. Unlike
EADB, CTD is a biology-oriented database and focuses
on chemical-gene associations rather than specific assay data.
The FDA’s EDKB is an endocrine activity-specific knowledge
base. It was developed to serve as a free resource for scientists
to foster development of predictive computational toxicology
models and to reduce dependency on slow and expensive animal
experiments (Ding et al., 2010). EDKB provides domain-specific
knowledge and estrogenic activity data, along with
data for other types of endocrine-related endpoints, and has
been frequently used by scientists for > 10 years. However, the
EDKB has not been recently updated. To enhance the knowledge
base, EDKB is now undergoing redevelopment to incorporate
up-to-date and comprehensive sources of data related
to all aspects of endocrine activity. EADB will be one of the
databases incorporated into the new version of EDKB. EADB
contains 5700 new chemicals and 15,000 new estrogenic activity
data that are not included in EDKB. EADB collects only
estrogenic activity data, whereas EDKB has other types of data
related to endocrine activity, such as androgenic activity data.
Therefore, we suggest that users consult EADB when interested
in estrogenic activity and EDKB for other types of endocrine
activity.
Using EADB, we observed that the percentage of discordant
chemicals increases as the complexity of the biological
endpoint increases across the in vitro assays. Binding assays
are biochemical in nature and generally much simpler than
the other two types of in vitro assays, which involve cellular
processes and functions. Although different binding assays use
ERs extracted from different species or different ER domains,
and the assays might have different experimental procedures,
they all directly measure binding affinity of a chemical with
ERs and, thus, could easily provide similar results for the
same chemical. The underlying mechanisms of reporter gene and
cell proliferation assays are more complex than those of
binding assays, and thus are likely to be more variable within
the assay type and more sensitive to differing experimental
protocols. Protocol variation might explain our observation
that the percentages of chemicals with discordant estrogenic
activity data within reporter gene and cell proliferation
assays are higher than within binding assays. There are only
two uterotrophic models in EADB, and most chemicals tested in
one of these in vivo assays were not tested in the other. The
low percentage (4.8%) of discordant chemicals in in vivo assays
is likely influenced by the limited amount of data.
Alternatively, this could mean that only potent and efficacious
compounds were tested to confirm the in vitro findings.
Interestingly, the concordance distributions of chemicals with
discordant data are quite uneven (Fig. 6B). More than 50% of
chemicals with discordant data have the lowest concordance of
zero by Equation 1; such chemicals have an equal number
of active and inactive assay results. Closer examination of the
experiments for discordant chemicals could delineate causes of
the differing experimental outcomes. Some discordant chemicals
may also be selective ER modulators, i.e., active in some types
of cells but not others, possibly because they differ in how
they affect cofactor recruitment. These discordant chemicals
could be useful in the forward validation step. However, until
such time as these discordances are better understood, we
suggest that chemicals with very low concordance among
different assays be removed from a training data set prior
to developing predictive models, as we did in developing the
DF model for predicting ER binding activity.
A key use of in vitro ER assays is to screen compounds
for their ability to interact with ER, and then prioritize these
chemicals for more rigorous in vivo testing. An in vitro assay
needs to be very sensitive for this purpose in that it should have
a minimal number of false negatives. Data in Supplementary
tables S7–S9 show that the binding and transactivation assays
have high but not perfect sensitivity. There are a few chemicals
that are active in some but not all in vitro assays, which then
show activity in the in vivo uterotrophic assay. The properties
and experimental results for these chemicals should be examined
further. However, the main point is that no single in vitro
assay is perfect, so one would want to deploy multiple in vitro
assays in the initial screening step to minimize false negatives.
For example, Supplementary table S9 shows that the cell
proliferation assays included in EADB poorly identify chemicals
that will be in vivo positive. Hence, such assays would not be
appropriate for a screening and prioritization effort.
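The benefit of deploying several in vitro assays together can be illustrated with a small Python sketch in which a chemical is flagged for follow-up if any assay calls it active. The combination rule, the function names, and the activity calls are hypothetical illustrations and are not taken from Supplementary tables S7–S9.

```python
def sensitivity(predicted, truth):
    """Fraction of truly active chemicals that were flagged as active."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    return tp / (tp + fn)


def any_positive(*assay_calls):
    """Flag a chemical if at least one assay calls it active (logical OR)."""
    return [any(calls) for calls in zip(*assay_calls)]


# Made-up activity calls for six chemicals (True = active).
in_vivo = [True, True, True, True, False, False]
binding = [True, True, False, True, False, True]
reporter = [True, False, True, True, False, False]

combined = any_positive(binding, reporter)

sens_binding = sensitivity(binding, in_vivo)
sens_reporter = sensitivity(reporter, in_vivo)
sens_combined = sensitivity(combined, in_vivo)
```

An OR rule can never lower sensitivity relative to any single assay in the panel, but it can add false positives, so the gain in sensitivity comes at some cost in specificity.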
Both potency and efficacy data are important for evaluating
endocrine activity of chemicals. Currently, EADB has rich
potency data. However, the amount of efficacy data is relatively
small. The effort to curate more comprehensive efficacy data
is ongoing. The future version of EADB is expected to include
more efficacy data.
In summary, EADB is the most comprehensive public database
of chemicals assayed for estrogenic activities. It contains
carefully curated estrogenic activity data extracted from a wide
array of public and literature sources for > 8000 chemicals.
With the powerful database functions implemented in EADB,
users can easily browse, query, and export the data. Where
multiple data are available for a given chemical, the data
curated in EADB display a high degree of concordance in
activity calls. Additionally, the results span a wide range of
estrogenic activity potency. These characteristics make EADB a
valuable resource and a convenient tool for assessing potential
estrogenic activity of chemicals and for developing predictive
models, as demonstrated by the high accuracy of the DF model
developed based on the database. EADB is openly available to
the public; it can be supplemented, corrected, and updated, and
it is easily accessed and used by scientists.
SUPPLEMENTARYDATA
Supplementary data are available online at http://toxsci.
oxfordjournals.org/.
FUNDING
Research Participation Program at the National Center for
Toxicological Research (J.S.) administered by the Oak Ridge
Institute for Science and Education through an interagency
agreement between the U.S. Department of Energy and the
U.S. Food and Drug Administration.
ACKNOWLEDGMENTS
The authors acknowledge and thank Dr Lawrence N. Callahan
(Substance Registration System, Division of Scientific
Computing and Medical Information, U.S. Food and Drug
Administration) for his critical evaluation of an earlier draft of
this article and for his editing and constructive comments. All
high-performance computations were performed using the Blue
Meadow in Food and Drug Administration Scientific Computing
Lab. The authors declare that there are no conflicts of interest.
REFERENCES
Blair, R. M., Fang, H., Branham, W. S., Hass, B. S., Dial, S. L., Moland, C.
L., Tong, W., Shi, L., Perkins, R., and Sheehan, D. M. (2000). The estrogen
receptor relative binding affinities of 188 natural and xenochemicals:
Structural diversity of ligands. Toxicol. Sci. 54, 138–153.
Branham, W. S., Dial, S. L., Moland, C. L., Hass, B. S., Blair, R. M., Fang, H., Shi,
L., Tong, W., Perkins, R. G., and Sheehan, D. M. (2002). Phytoestrogens and
mycoestrogens bind to the rat uterine estrogen receptor. J. Nutr. 132, 658–664.
Daston, G. P., Cook, J. C., and Kavlock, R. J. (2003). Uncertainties for
endocrine disrupters: Our view on progress. Toxicol. Sci. 74, 245–252.
Davis, A. P., Murphy, C. G., Johnson, R., Lay, J. M., Lennon-Hopkins, K.,
Saraceni-Richards, C., Sciaky, D., King, B. L., Rosenstein, M. C., Wiegers,
T. C., et al. (2013). The Comparative Toxicogenomics Database: Update
2013. Nucleic Acids Res. 41, D1104–D1114.
De Coster, S., and van Larebeke, N. (2012). Endocrine-disrupting chemicals:
Associated disorders and mechanisms of action. J. Environ. Public Health
2012, 713696.
Ding, D., Xu, L., Fang, H., Hong, H., Perkins, R., Harris, S., Bearden, E. D.,
Shi, L., and Tong, W. (2010). The EDKB: An established knowledge base
for endocrine disrupting chemicals. BMC Bioinformatics 11(Suppl. 6), S5.
EPA. (1998). Endocrine Disruptor Screening Program. Available at:
http://www.epa.gov/endo/pubs/081198frnotice.pdf. Accessed November 15, 2012.
EPA. (2011). EDSP21 Work Plan. Available at:
http://www.epa.gov/endo/pubs/edsp21_work_plan_summary%20_overview_final.pdf.
Accessed November 15, 2012.
EPA. (2012). ACToR: Aggregated Computational Toxicology Resource.
Available at: http://actor.epa.gov/actor/faces/ACToRHome.jsp. Accessed
November 15, 2012.
FQPA. (1996). http://www.epa.gov/opp00001/regulating/laws/fqpa/. Accessed
November 13, 2012.
Gaulton, A., Bellis, L. J., Bento, A. P., Chambers, J., Davies, M., Hersey, A.,
Light, Y., McGlinchey, S., Michalovich, D., Al-Lazikani, B., et al. (2012).
ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic
Acids Res. 40, D1100–D1107.
Hall, J. M., and McDonnell, D. P. (2005). Coregulators in nuclear estrogen
receptor action: From concept to therapeutic targeting. Mol. Interv. 5,
343–357.
Hong, H., Branham, W. S., Dial, S. L., Moland, C. L., Fang, H., Shen, J.,
Perkins, R., Sheehan, D., and Tong, W. (2012). Rat α-Fetoprotein binding
affinities of a large set of structurally diverse chemicals elucidated the
relationships between structures and binding affinities. Chem. Res. Toxicol.
25, 2553–2566.
Hong, H., Xie, Q., Ge, W., Qian, F., Fang, H., Shi, L., Su, Z., Perkins, R.,
and Tong, W. (2008). Mold(2), molecular descriptors from 2D structures
for chemoinformatics and toxicoinformatics. J. Chem. Inf. Model. 48,
1337–1344.
Kavlock, R. J., Austin, C. P., and Tice, R. R. (2009). Toxicity testing in the
21
st
century: Implications for human health risk assessment. Risk Anal. 29,
485–487.
Knox, C., Law, V., Jewison, T., Liu, P., Ly, S., Frolkis, A., Pon, A., Banco, K.,
Mak, C., Neveu, V., etal. (2011). DrugBank 3.0: Acomprehensive resource
for ‘omics’ research on drugs. Nucleic Acids Res. 39, D1035–D1041.
Komm, B. S., and Chines, A. A. (2012). An update on selective estrogen receptor
modulators for the prevention and treatment of osteoporosis. Maturitas
71, 221–226.
Luu-The, V., and Labrie, F. (2010). The intracrine sex steroid biosynthesis
pathways. Prog. Brain Res. 181, 177–192.
Miller, W. L. (2002). Androgen biosynthesis from cholesterol to DHEA. Mol.
Cell. Endocrinol. 198, 7–14.
Minutolo, F., Macchia, M., Katzenellenbogen, B. S., and Katzenellenbogen,
J. A. (2011). Estrogen receptor β ligands: Recent advances and biomedical
applications. Med. Res. Rev. 31, 364–442.
Nelson, L. R., and Bulun, S. E. (2001). Estrogen production and action. J. Am.
Acad. Dermatol. 45, S116–S124.
Safe, S., and Kim, K. (2008). Non-classical genomic estrogen receptor (ER)/
specicity protein and ER/activating protein-1 signaling pathways. J. Mol.
Endocrinol. 41, 263–275.
SDWA Amendments. (1996). http://water.epa.gov/lawsregs/guidance/sdwa/theme.cfm.
Accessed November 13, 2012.
Shanle, E. K., and Xu, W. (2011). Endocrine disrupting chemicals targeting
estrogen receptor signaling: Identification and mechanisms of action. Chem.
Res. Toxicol. 24, 6–19.
Shannon, C. E. (1948). A mathematical theory of communication. Bell Syst.
Tech. J. 27, 379–423.
Shen, J., Jiang, J., Kuang, G., Tan, C., Liu, G., Huang, J., and Tang, Y. (2012).
Discovery and structure-activity analysis of selective estrogen receptor
modulators via similarity-based virtual screening. Eur. J. Med. Chem. 54, 188–196.
Shen, J., Tan, C., Zhang, Y., Li, X., Li, W., Huang, J., Shen, X., and Tang, Y.
(2010). Discovery of potent ligands for estrogen receptor beta by
structure-based virtual screening. J. Med. Chem. 53, 5361–5365.
Silverman, S. L. (2010). New selective estrogen receptor modulators (SERMs)
in development. Curr. Osteoporos. Rep. 8, 151–153.
Tice, R. R., Austin, C. P., Kavlock, R. J., and Bucher, J. R. (2013). Improving
the human hazard characterization of chemicals: A Tox21 update. Environ.
Health Perspect. 121, 756–765.
Tong, W., Hong, H., Fang, H., Xie, Q., and Perkins, R. (2003). Decision forest:
Combining the predictions of multiple independent decision tree models. J.
Chem. Inf. Comput. Sci. 43, 525–531.
Tong, W., Xie, Q., Hong, H., Shi, L., Fang, H., and Perkins, R. (2004).
Assessment of prediction condence and domain extrapolation of two struc-
ture-activity relationship models for predicting estrogen receptor binding
activity. Environ. Health Perspect. 112, 1249–1254.
Wang, Y., Xiao, J., Suzek, T. O., Zhang, J., Wang, J., and Bryant, S. H. (2009).
PubChem: Apublic information system for analyzing bioactivities of small
molecules. Nucleic Acids Res. 37, W623–W633.
Watson, C. S., Jeng, Y. J., and Kochukov, M. Y. (2010). Nongenomic signaling
pathways of estrogen toxicity. Toxicol. Sci. 115, 1–11.
Wexler, P. (2001). TOXNET: An evolving web resource for toxicology and
environmental health information. Toxicology 157, 3–10.
Wiese, T. E., Polin, L. A., Palomino, E., and Brooks, S. C. (1997). Induction
of the estrogen specic mitogenic response of MCF-7 cells by selected
analogues of estradiol-17 beta: A 3D QSAR study. J. Med. Chem. 40,
3659–3669.
Willett, C. E., Bishop, P. L., and Sullivan, K. M. (2011). Application of an
integrated testing strategy to the U.S. EPA endocrine disruptor screening
program. Toxicol. Sci. 123, 15–25.
Wilson, M. R. (2009). The Endocrine System: Hormones, Growth, and
Development. The Rosen Publishing Group, New York, NY.
Wormke, M., Stoner, M., Saville, B., Walker, K., Abdelrahim, M., Burghardt,
R., and Safe, S. (2003). The aryl hydrocarbon receptor mediates degradation
of estrogen receptor alpha through activation of proteasomes. Mol. Cell.
Biol. 23, 1843–1855.
Yager, J. D., and Davidson, N. E. (2006). Estrogen carcinogenesis in breast
cancer. N. Engl. J.Med. 354, 270–282.
Zoeller, R. T., Brown, T. R., Doan, L. L., Gore, A. C., Skakkebaek, N. E., Soto,
A. M., Woodruff, T. J., and Vom Saal, F. S. (2012). Endocrine-disrupting
chemicals and public health protection: A statement of principles from The
Endocrine Society. Endocrinology 153, 4097–4110.
... toxicological effects of chemicals. [1][2][3][4][5][6][7][8][9][10][11][12][13][14] QSAR builds a quantitative relationship between the structural or physicochemical characteristics of chemicals and their toxic effects. It has been one of the widely used methods to build toxicity prediction models. ...
... Many toxicity studies collected experimental data from a variety of data sources and established databases to manage the collected data, including ToxCast/Tox21, 127 ChEMBL, 41 ToxRefDB, 128 PubChem, 40 CPDB, 59 EDKB, 129 and EADB. 5 Since these databases contain data that were generated from different experiments and in various formats, data processing and curation are needed to prepare datasets from these databases for developing ML and DL models. For example, datasets extracted from the ToxCast/Tox21 database have been used to develop models for predicting reprotoxicity, 128,130 hepatotoxicity, 131 and other organ toxicity. ...
Article
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the frequently modeled toxicity endpoints in predictive toxicology. It is known that datasets impact model performance. The quality of datasets used in the development of toxicity prediction models using machine learning and deep learning is vital to the performance of the developed models. Different toxicity assignments for the same chemicals have been observed among datasets of the same toxicity type, indicating that benchmark datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Article
Background The endocrine-disrupting effects of phytopharmaceutical active substances (PAS) on human health are a public health concern. The CIPATOX-PE database, created in 2018, listed the PAS authorized in France between 1961 and 2014 presenting endocrine-disrupting effects for humans according to data from official international organizations. Since the creation of CIPATOX-PE, European regulations have changed, and new initiatives identifying substances with endocrine-disrupting effects have been implemented and new PAS have been licensed. Objectives The study aimed to update the CIPATOX-PE database by considering new 2018 European endocrine-disrupting effect identification criteria as well as the new PAS authorized on the market in France since 2015. Methods The endocrine-disrupting effect assessment of PAS from five international governmental and non-governmental initiatives was reviewed, and levels of evidence were retained by these initiatives for eighteen endocrine target organs. Results The synthesis of the identified endocrine-disrupting effects allowed to assign an endocrine-disrupting effect level of concern for 241 PAS among 980 authorized in France between 1961 and 2021. Thus, according to the updated CIPATOX-PE data, 44 PAS (18.3%) had an endocrine-disrupting effect classified as “high concern,” 133 PAS (55.2%) “concern,” and 64 PAS (26.6%) “unknown effect” in the current state of knowledge. In the study, 42 PAS with an endocrine-disrupting effect of “high concern” are similarly classified in CIPATOX-PE-2018 and 2021, and 2 new PAS were identified as having an endocrine-disrupting effect of “high concern” in the update, and both were previously classified with an endocrine-disrupting effect of “concern” in CIPATOX-PE-2018. Finally, a PAS was identified as having an endocrine-disrupting effect of “high concern” in CIPATOX-PE-2018 but is now classified as a PAS not investigated for endocrine-disrupting effects in CIPATOX-PE-2021. 
The endocrine target organ associated with the largest number of PAS with an endocrine-disrupting effect of “high concern” is the reproductive system, with 31 PAS. This is followed by the thyroid with 25 PAS and the hypothalamic–pituitary axis (excluding the gonadotropic axis) with 5 PAS. Discussion The proposed endocrine-disrupting effect indicator, which is not a regulatory classification, can be used as an epidemiological tool for occupational risks and surveillance.
Chapter
To advance the field of computational toxicology, it is vital to leverage new machine learning and deep learning algorithms. Historically, these algorithms were limited to fields with larger datasets, but the advent of big data has reached maturity and the time is now right to begin tackling larger and more difficult questions. The ever-increasing panoply of potential toxicants precipitates the need for faster and more cost-effective methods for pinpointing toxicants and their methods of toxicity. Machine learning and deep learning approaches present the ability to move beyond simple correlations and instead identify more complex patterns, which is a much closer representation of the biological truth. The integration of logical reasoning from the human eye and linear experiments to artificial intelligence will improve computational toxicology for risk assessment by unearthing novel discoveries through making unexpected connections across data types, datasets, and toxicology disciplines.
Chapter
Organ toxicity is a leading concern in safety assessment for regulatory agencies and pharmaceutical companies. Due to the huge amount of compounds and the limitation in animal toxicity testing, alternative and innovative approaches are urgently needed for toxicity assessment. It is a tremendous challenge for organ level toxicity prediction due to the differences among organs and the complicated mechanisms of organ toxicity induction. With the advantages of efficiency and low cost, machine learning methods have been used as promising alternatives for organ toxicity prediction. This chapter reviews the current advances in organ toxicity prediction using machine learning.
Chapter
Numerical description of chemical structures is necessary for development of machine learning and deep learning models for predicting the potential toxicity of chemicals. Mold2 is a software tool developed in C++ for fast calculation of molecular descriptors from two-dimensional structures. Mold2 descriptors contain rich information and can be used to build high-performance models in computational toxicology. Multiple studies have compared Mold2 descriptors with other descriptors and fingerprints in machine learning and deep learning models for predicting the toxicity of chemicals. These studies have demonstrated that models built with Mold2 descriptors outperform models developed with other descriptors and fingerprints.
Chapter
Metal organic frameworks (MOFs) have been widely used in gas adsorption-based applications due to their high porosities and modification in chemical and physical properties. There are many MOFs available for applications. However, gas adsorption capacities are not known for most MOFs and it is not practical to experimentally test their gas adsorption capacities. Therefore, a variety of machine learning models have been developed for predicting gas adsorption capacities of MOFs. In this chapter, we summarized the machine learning models developed for predicting gas adsorption capacities of MOFs and their applications.
Article
Background: In 2008, the National Institute of Environmental Health Sciences/National Toxicology Program, the U.S. Environmental Protection Agency’s National Center for Computational Toxicology, and the National Human Genome Research Institute/National Institutes of Health Chemical Genomics Center entered into an agreement on “high throughput screening, toxicity pathway profiling, and biological interpretation of findings.” In 2010, the U.S. Food and Drug Administration (FDA) joined the collaboration, known informally as Tox21. Objectives: The Tox21 partners agreed to develop a vision and devise an implementation strategy to shift the assessment of chemical hazards away from traditional experimental animal toxicology studies to one based on target-specific, mechanism-based, biological observations largely obtained using in vitro assays. Discussion: Here we outline the efforts of the Tox21 partners up to the time the FDA joined the collaboration, describe the approaches taken to develop the science and technologies that are currently being used, assess the current status, and identify problems that could impede further progress as well as suggest approaches to address those problems. Conclusion: Tox21 faces some very difficult issues. However, we are making progress in integrating data from diverse technologies and end points into what is effectively a systems-biology approach to toxicology. This can be accomplished only when comprehensive knowledge is obtained with broad coverage of chemical and biological/toxicological space. The efforts thus far reflect the initial stage of an exceedingly complicated program, one that will likely take decades to fully achieve its goals. However, even at this stage, the information obtained has attracted the attention of the international scientific community, and we believe these efforts foretell the future of toxicology.
Article
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between environmental chemicals and gene products and their relationships to diseases. Chemical–gene, chemical–disease and gene–disease interactions manually curated from the literature are integrated to generate expanded networks and predict many novel associations between different data types. CTD now contains over 15 million toxicogenomic relationships. To navigate this sea of data, we added several new features, including DiseaseComps (which finds comparable diseases that share toxicogenomic profiles), statistical scoring for inferred gene–disease and pathway–chemical relationships, filtering options for several tools to refine user analysis and our new Gene Set Enricher (which provides biological annotations that are enriched for gene sets). To improve data visualization, we added a Cytoscape Web view to our ChemComps feature, included color-coded interactions and created a ‘slim list’ for our MEDIC disease vocabulary (allowing diseases to be grouped for meta-analysis, visualization and better data management). CTD continues to promote interoperability with external databases by providing content and cross-links to their sites. Together, this wealth of expanded chemical–gene–disease data, combined with novel ways to analyze and view content, continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases.
Article
The incidence and/or prevalence of health problems associated with endocrine-disruption have increased. Many chemicals have endocrine-disrupting properties, including bisphenol A, some organochlorines, polybrominated flame retardants, perfluorinated substances, alkylphenols, phthalates, pesticides, polycyclic aromatic hydrocarbons, alkylphenols, solvents, and some household products including some cleaning products, air fresheners, hair dyes, cosmetics, and sunscreens. Even some metals were shown to have endocrine-disrupting properties. Many observations suggesting that endocrine disruptors do contribute to cancer, diabetes, obesity, the metabolic syndrome, and infertility are listed in this paper. An overview is presented of mechanisms contributing to endocrine disruption. Endocrine disruptors can act through classical nuclear receptors, but also through estrogen-related receptors, membrane-bound estrogen-receptors, and interaction with targets in the cytosol resulting in activation of the Src/Ras/Erk pathway or modulation of nitric oxide. In addition, changes in metabolism of endogenous hormones, cross-talk between genomic and nongenomic pathways, cross talk with estrogen receptors after binding on other receptors, interference with feedback regulation and neuroendocrine cells, changes in DNA methylation or histone modifications, and genomic instability by interference with the spindle figure can play a role. Also it was found that effects of receptor activation can differ in function of the ligand.
Article
An endocrine-disrupting chemical (EDC) is an exogenous chemical, or mixture of chemicals, that can interfere with any aspect of hormone action. The potential for deleterious effects of EDC must be considered relative to the regulation of hormone synthesis, secretion, and actions and the variability in regulation of these events across the life cycle. The developmental age at which EDC exposures occur is a critical consideration in understanding their effects. Because endocrine systems exhibit tissue-, cell-, and receptor-specific actions during the life cycle, EDC can produce complex, mosaic effects. This complexity causes difficulty when a static approach to toxicity through endocrine mechanisms driven by rigid guidelines is used to identify EDC and manage risk to human and wildlife populations. We propose that principles taken from fundamental endocrinology be employed to identify EDC and manage their risk to exposed populations. We emphasize the importance of developmental stage and, in particular, the realization that exposure to a presumptive "safe" dose of chemical may impact a life stage when there is normally no endogenous hormone exposure, thereby underscoring the potential for very low-dose EDC exposures to have potent and irreversible effects. Finally, with regard to the current program designed to detect putative EDC, namely, the Endocrine Disruptor Screening Program, we offer recommendations for strengthening this program through the incorporation of basic endocrine principles to promote further understanding of complex EDC effects, especially due to developmental exposures.
Article
Consumption of phytoestrogens and mycoestrogens in food products or as dietary supplements is of interest because of both the potential beneficial and adverse effects of these compounds in estrogen-responsive target tissues. Although the hazards of exposure to potent estrogens such as diethylstilbestrol in developing male and female reproductive tracts are well characterized, less is known about the effects of weaker estrogens including phytoestrogens. With some exceptions, ligand binding to the estrogen receptor (ER) predicts uterotrophic activity. Using a well-established and rigorously validated ER-ligand binding assay, we assessed the relative binding affinity (RBA) for 46 chemicals from several chemical structure classes of potential phytoestrogens and mycoestrogens. Although none of the test compounds bound to ER with the affinity of the standard, 17β-estradiol (E2), ER binding was found among all classes of chemical structures (flavones, isoflavones, flavanones, coumarins, chalcones and mycoestrogens). Estrogen receptor relative binding affinities were distributed across a wide range (from ~43 to 0.00008; E2 = 100). These data can be utilized before animal testing to rank order estimates of the potential for in vivo estrogenic activity of a wide range of untested plant chemicals (as well as other chemicals) based on ER binding.
Article
The incidence of allergies and asthma has increased significantly in the past few decades. The objectives of this study were to establish an allergy model in weanling rats to more closely reflect the developing immune system of children, and to determine whether systemic administration of inactivated Bordetella pertussis could enhance pulmonary and systemic immune responses to locally administered house dust mite antigen (HDM). Three-week old female Brown Norway rats were sensitized with 10 μg HDM intratracheally or intraperitoneally, with or without a simultaneous injection of 10^8 whole killed B. pertussis organisms. Ten days later, all the rats were challenged with 5 μg HDM via the trachea. Bronchial lymph nodes and bronchoalveolar lavage fluid (BAL) were collected 0, 2, and 4 days post-challenge. Coadministration of pertussis and intratracheal instillation of HDM enhanced HDM-specific lymphoproliferative responses and increased BAL levels of total protein, lactate dehydrogenase, HDM-specific IgE and IgG antibodies, and the number of eosinophils in BAL to the same extent as had occurred in the systemically immunized animals. The data show that intratracheal instillation of HDM induces a mild allergic sensitization in juvenile rats, and that ip injection with B. pertussis enhances this sensitization process to levels seen in animals injected with antigen and B. pertussis together. These results suggest that simultaneous exposure to Th2-inducing vaccine components and allergenic proteins may be a risk factor for allergic sensitization and the development of asthma in susceptible individuals.
Article
With virtual screening based on a structure optimized through molecular dynamics (MD) and bioassays, 18 potent ligands of estrogen receptor (ER) beta were discovered from 70 purchased compounds here. Among them, dual profile was observed in two ligands (1a and 1b), as agonists for ERbeta and antagonists for ERalpha, and they might serve as lead compounds for selective ER modulators. The results also suggest that structures optimized through MD are applicable to lead discovery.
Article
Bell System Technical Journal, also pp. 623-656 (October)
Article
Endocrine disrupting chemicals interfere with the endocrine system in animals, including humans, to exert adverse effects. One of the mechanisms of endocrine disruption is through the binding of receptors such as the estrogen receptor (ER) in target cells. The concentration of any chemical in serum is important for its entry into the target cells to bind the receptors. α-Fetoprotein (AFP) is a major transport protein in rodent serum that can bind with estrogens and thus change a chemical's availability for entrance into the target cell. Sequestration of an estrogen in the serum can alter the chemical's potential for disrupting estrogen receptor-mediated responses. To better understand endocrine disruption, we developed a competitive binding assay using rat amniotic fluid, which contains very high levels of AFP, and measured the binding to the rat AFP for 125 structurally diverse chemicals, most of which are known to bind ER. Fifty-three chemicals were able to bind the rat AFP in the assay, while 72 chemicals were determined to be nonbinders. Observations from closely examining the relationship between the binding data and structures of the tested chemicals are rationally explained in a manner consistent with proposed binding regions of rat AFP in the literature. The data reported here represent the largest data set of structurally diverse chemicals tested for rat AFP binding. The data assist in elucidating binding interactions and mechanisms between chemicals and rat AFP and, in turn, assist in the evaluation of the endocrine disrupting potential of chemicals.
Article
A number of selective estrogen receptor modulators (SERMs) were discovered from the SPECS database via a simple protocol. Based on two reference SERMs we identified via structure-based virtual screening previously, ligand-based similarity search and molecular docking filtering were conducted to identify novel SERMs from SPECS library. Among the 36 purchased compounds, 21 were confirmed to be active by in vitro assays, and 10 showed dual profile properties, namely as antagonists of ERα and agonists of ERβ. The anti-proliferative potency of these ligands was also evaluated against MCF-7 cell lines. Further investigation of the anti-proliferative mechanism of compound 3a suggested that it induced a G1 cell cycle arrest in ERα positive MCF-7 cell through ERα mediated cyclin D1 down-regulation.