Advanced data mining approaches in the assessment of urinary
concentrations of bisphenols, chlorophenols, parabens and benzophenones
in Brazilian children and their association to DNA damage
Bruno A. Rocha, Alexandros G. Asimakopoulos, Masato Honda, Nattane L. da Costa,
Rommel M. Barbosa, Fernando Barbosa Jr, Kurunthachalam Kannan
Laboratório de Toxicologia e Essencialidade de Metais, Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, São Paulo
Wadsworth Center, New York State Department of Health, and Department of Environmental Health Sciences, School of Public Health, State University of New York at
Albany, New York 12201, United States
Department of Chemistry, The Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
Instituto de Informática, Universidade Federal de Goiás, Goiânia, Goiás 74690-900, Brazil
Biochemistry Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
Handling editor: Lesa Aylward
Endocrine disrupting chemicals
Human exposure to endocrine disrupting chemicals (EDCs) has received considerable attention over the last
three decades. However, little is known about the inﬂuence of co-exposure to multiple EDCs on eﬀect-bio-
markers such as oxidative stress in Brazilian children. In this study, concentrations of 40 EDCs were determined
in urine samples collected from 300 Brazilian children of ages 6–14 years and data were analyzed by advanced
data mining techniques. Oxidative DNA damage was evaluated from the urinary concentrations of 8-hydroxy-2′-
deoxyguanosine (8OHDG). Fourteen EDCs, including bisphenol A (BPA), methyl paraben (MeP), ethyl paraben
(EtP), propyl paraben (PrP), 3,4-dihydroxy benzoic acid (3,4-DHB), methyl-protocatechuic acid (OH-MeP),
ethyl-protocatechuic acid (OH-EtP), triclosan (TCS), triclocarban (TCC), 2-hydroxy-4-methoxybenzophenone
(BP3), 2,4-dihydroxybenzophenone (BP1), bisphenol A bis(2,3-dihydroxypropyl) glycidyl ether (BADGE·2H2O),
2,4-dichlorophenol (2,4-DCP), and 2,5-dichlorophenol (2,5-DCP), were found in > 50% of the urine samples
analyzed. The highest geometric mean concentrations were found for MeP (43.1 ng/mL), PrP (3.12 ng/mL),
3,4-DHB (42.2 ng/mL), TCS (8.26 ng/mL), BP3 (3.71 ng/mL), and BP1 (4.85 ng/mL); exposure to most of these
was associated with personal care product (PCP) use. Statistically significant associations were found between
urinary concentrations of 8OHDG and BPA, MeP, 3,4-DHB, OH-MeP, OH-EtP, TCS, BP3, 2,4-DCP, and 2,5-DCP.
After clustering the data on the basis of i) 14 EDCs (exposure levels), ii) demography (age, gender and geo-
graphic location), and iii) 8OHDG (eﬀect), two distinct clusters of samples were identiﬁed. 8OHDG con-
centration was the most critical parameter that diﬀerentiated the two clusters, followed by OH-EtP. When
8OHDG was removed from the dataset, predictability of exposure variables increased in the order of: OH-
EtP > OH-MeP > 3,4-DHB > BPA > 2,4-DCP > MeP > TCS > EtP > BP1 > 2,5-DCP. Our results
showed that co-exposure to OH-EtP, OH-MeP, 3,4-DHB, BPA, 2,4-DCP, MeP, TCS, EtP, BP1, and 2,5-DCP was
associated with DNA damage in children. This is the ﬁrst study to report exposure of Brazilian children to a wide
range of EDCs and the data mining approach further strengthened our ﬁndings of chemical co-exposures and
biomarkers of eﬀect.
Received 1 January 2018; Received in revised form 15 April 2018; Accepted 16 April 2018
Corresponding author at: Wadsworth Center, Empire State Plaza, P. O. Box 509, Albany, NY 12201-0509, United States.
E-mail address: email@example.com (K. Kannan).
Environment International 116 (2018) 269–277
0160-4120/ © 2018 Elsevier Ltd. All rights reserved.
1. Introduction
Populations throughout the world are exposed to a wide range of
synthetic environmental chemicals, which are harmful to human health
(Naidu et al., 2016; Scognamiglio et al., 2016; Woodruff, 2015; Zidek
et al., 2017). Many of these chemicals are endocrine disrupting chemicals
(EDCs), exposure to which has been associated with the progression of
metabolic disorders, including obesity and diabetes, as well as cancer
and endometriosis (Diamanti-Kandarakis et al., 2009; Giulivo et al.,
2016; Jimenez-Diaz et al., 2015; Naidu et al., 2016; Smarr et al., 2016;
Xue et al., 2015). Exposure to EDCs has been linked to oxidative stress
in human populations (Asimakopoulos et al., 2016;Bledzka et al., 2014;
Ferguson et al., 2016;Franken et al., 2017;Lu et al., 2016;Lv et al.,
2017;Rocha et al., 2017;Tavares et al., 2016;Watkins et al., 2015;
Zhang et al., 2016). Oxidative stress is a condition that arises from an
imbalance between the endogenous formation of reactive oxygen spe-
cies (ROS), and the organism's capacity to detoxify or eliminate the ROS
or to repair damage caused by the ROS. This condition can disrupt
normal cellular signaling and can act as a trigger for numerous diseases,
such as cancer, cardiovascular disease, and infertility. Oxidized DNA
repair products are excreted in urine, and therefore urinary 8OHDG is
considered an important marker of oxidative stress (Asimakopoulos
et al., 2016;Bisht et al., 2017;Di Minno et al., 2016;Kelly and Fussell,
2017;Reuter et al., 2010;Rocha et al., 2017;Zhang et al., 2016).
Human biomonitoring programs, implemented by public health
agencies in various countries, assess exposure of populations to EDCs.
Due to the smaller body weight and higher calorie intake per kilogram,
children can be exposed to greater levels of EDCs than adults. Several
studies have reported the occurrence of EDCs in children
(Asimakopoulos et al., 2016;Calafat et al., 2017;CDC, 2015;Covaci
et al., 2015;Frederiksen et al., 2013;Health Canada, 2013;Heﬀernan
et al., 2015;Jiménez-Díaz et al., 2016;Larsson et al., 2014;Myridakis
et al., 2015;Xue et al., 2015). However, to the best of our knowledge,
no data exist on the exposure of Brazilian children to these EDCs.
Furthermore, data analysis models often link exposure to a single
chemical with health outcomes. Recent developments in data mining
techniques make it possible to analyze the effects of co-exposure to
multiple chemicals on health outcomes. The present study was
conducted to determine urinary concentrations of EDCs in
Brazilian children and to examine the association between urinary
concentrations of EDCs and 8OHDG.
2. Materials and methods
2.1. Study population and sample collection
Urine samples were collected from 300 urban resident Brazilian
school children aged 6 to 14 years from ﬁve geographic regions in
Brazil (Southeast, South, Central-West, Northeast, and North) in
2012–2013 (Rocha et al., 2017). The demographic characteristics
(gender, age, and region) of the population studied are shown in Table
S1. Spot urine samples were collected in polypropylene conical tubes
from healthy donors and stored at −80 °C until analysis. Informed
consent was obtained from legal guardian(s) of every child. The study
was approved by the Institutional Ethical Review Board of the School of
Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo,
2.2. Chemical analysis
2.2.1. Sample preparation and instrumental analysis
Urine samples were analyzed for 40 EDCs, 8OHDG, creatinine and
speciﬁc gravity. The list of all EDCs analyzed is shown in Table 1. In-
dividual stock solutions of each compound and internal standards were
prepared by dissolution in methanol (MeOH) and stored in amber glass
vials at −20 °C. The calibration and working standard solutions were
prepared daily from the stock solutions through serial dilution with
MeOH, and stored in amber glass vials at 4 °C until analysis. The
methods for the analysis of EDCs in urine samples have been described
in Asimakopoulos et al. (2016). EDCs were extracted after enzymatic
deconjugation of urine samples, followed by liquid-liquid extraction.
The extracts were analyzed by liquid chromatography-tandem mass
spectrometry (LC-MS/MS) methods. Details of the analytical methods
are presented elsewhere (Asimakopoulos et al., 2016). Urinary con-
centrations of 8OHDG, creatinine, and speciﬁc gravity were determined
as reported elsewhere (Rocha et al., 2017). However, the previous
methods did not cover chlorophenols; their analysis is described below.
The chromatographic separation of dichlorophenols and trichlorophenols
was carried out using a Waters Acquity™ ultra performance liquid
chromatography (UPLC) system (Waters; Milford, MA, U.S.), which consisted
of a binary pump and an autosampler. Identification and quantification of
target analytes were accomplished with an Applied Biosystems API 5500™
electrospray triple quadrupole mass spectrometer (APCI–MS/MS; Applied
Biosystems; Foster City, CA, U.S.) under the negative ionization mode. An
ACQUITY UPLC® BEH C18 column (2.1 mm × 50 mm, 1.7 μm; Waters; Milford, MA, U.S.) was
used for the separation of target compounds. The mobile phase com-
prised MeOH (0.01% ammonium hydroxide) and Milli-Q water (0.01%
ammonium hydroxide) with gradient elution at a ﬂow rate of 300 μL/
min starting at 5% MeOH which was held for 0.5 min, increased to 45%
MeOH within 0.1 min, then increased again to 99% MeOH within
2 min, held for 0.7 min, then decreased to 5% MeOH within 0.10 min,
held for 0.70 min, for a total run time of 5.0 min. The ionization voltage
was −4500 V. The curtain and collision gas (nitrogen) ﬂow rates were
set at 25 and 10 psi, respectively, and the source heater was set at
300 °C. The nebulizer gas (ion source gas 1) was set at 30 psi, and the
heater gas (ion source gas 2) was set at 40 psi. The mass spectrometer
was operated in multiple reaction monitoring (MRM) negative ioniza-
tion mode. Detailed information regarding MS transitions for each
target chemical and internal standard are presented in the supple-
mentary material (Table S2).
2.2.2. Quality assurance/quality control
EDCs were determined after enzymatic deconjugation. Quality as-
surance and quality control parameters included procedural blanks,
matrix spikes and analysis of Standard Reference Materials (SRM).
Labeled internal standards were spiked into all samples and quantiﬁ-
cation was by isotope dilution method (Table S2). Contamination that
arises from laboratory materials and solvents was monitored by the
analysis of procedural blanks. A 20-point instrumental calibration curve
was prepared in MeOH at concentrations that ranged from 0.01 to
100 ng/mL, except for chlorophenols that ranged from 0.01 to 20 ng/
mL. The regression coeﬃcients of the calibration curves were > 0.99.
For each batch of 25 samples analyzed, two procedural blanks and two
pre-extraction matrix spikes (prepared by spiking known concentra-
tions [40 ng/mL] of target compounds) were analyzed by passing them
through the entire analytical procedure. In addition, SRMs 3672
(Organic Contaminants in Smokers' Urine) and 3673 (Organic
Contaminants in Non-Smokers' Urine) from the National Institute of
Standards & Technology (Gaithersburg, MD, U.S.), which were certiﬁed
for select EDCs, were analyzed with every 50 samples to assure accu-
racy of the analytical method. Our results for NIST SRMs were
within ± 15% of the certiﬁed values. A calibration check standard and
methanol were injected after every 25 samples as a check for drift in
instrumental sensitivity and carry-over between samples, respectively.
The limits of detection (LODs) and the limits of quantification (LOQs)
were calculated following Asimakopoulos et al. (2016). Briefly, the
LODs and LOQs were calculated as 3 and 10 times, respectively, the standard
deviation of 6 replicate analyses of the lowest calibration standard (or of
the concentration found in procedural blanks, if the target analyte
maintained measurable background levels) divided by the slope of the
regression, after adjusting for recovery/loss during extraction and matrix
effects. The exception was chlorophenols, for which the limits were derived
from the lowest calibration standard that fitted the calibration curve,
divided by 3. The LODs of EDCs in urine varied from 0.004 to 0.60 ng/mL (Table S2).
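The LOD/LOQ rule described above (3 and 10 times the replicate standard deviation, divided by the calibration slope) can be illustrated with a minimal sketch. The replicate responses and the slope below are hypothetical values chosen for illustration, and the adjustment for recovery and matrix effects mentioned in the text is deliberately left out.

```python
import statistics

def lod_loq(replicates, slope):
    """Return (LOD, LOQ) as 3 and 10 times the replicate SD divided by the slope."""
    sd = statistics.stdev(replicates)  # sample standard deviation of the replicates
    return 3 * sd / slope, 10 * sd / slope

# Hypothetical instrument responses for 6 replicate analyses of the lowest
# calibration standard, and a hypothetical slope (response per ng/mL).
replicates = [102.0, 98.0, 105.0, 99.0, 101.0, 95.0]
slope = 1000.0

lod, loq = lod_loq(replicates, slope)
print(f"LOD = {lod:.4f} ng/mL, LOQ = {loq:.4f} ng/mL")
```

By construction the LOQ is always 10/3 times the LOD under this rule, whatever the replicate data are.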
2.3. Statistical analysis
2.3.1. Data analysis
Data analysis was performed using SPSS, version 20 and Microsoft
Excel 2013®. Median, mean, geometric mean and percentiles of urinary
EDC concentrations were calculated on a volume-based, creatinine-
adjusted and speciﬁc-gravity (SG) adjusted basis. The results reported
herein were for volume-based concentrations (i.e., ng/mL); creatinine
and SG-adjusted concentrations are provided in the supporting in-
formation. For EDC concentrations below the LOD, we used a value
equal to the LOD divided by the square root of 2 (Hornung and Reed,
1990). We only calculated geometric means for EDCs that were de-
tected in > 50% of the samples. Only concentrations above the LOD
were used in statistical analysis and comparisons. Since the urinary
concentrations were not normally distributed (as determined by Sha-
piro-Wilk test), data were log-transformed for statistical analysis. The
nonparametric Mann-Whitney U test was used to examine the difference
between two groups of data, whereas the nonparametric Kruskal-Wallis
test was used to test the differences among three or more groups.
Moreover, multiple linear regression (MLR) analysis was conducted to
elucidate statistical association between each urinary EDC concentra-
tion and demographic characteristics, after adjusting for covariates
(gender, age and region and interaction between age/gender). Natural
log transformations were performed on all EDC concentrations because
of the skewed distribution of data. Log-transformed data were normally
distributed. To examine the relationship among EDCs, Spearman's
correlation was applied. Analytes with detection rates below 50% were
excluded from statistical analysis. All statistical tests were considered
signiﬁcant if the p-value was < 0.05.
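As an illustration of the data handling described in this section, the sketch below substitutes LOD/√2 for non-detects, computes the detection rate, and computes a geometric mean via natural-log transformation. The concentrations and the LOD are hypothetical, and the function names are assumptions made for this example rather than part of the study's SPSS/Excel workflow.

```python
import math

def substitute_below_lod(values, lod):
    """Replace non-detects (None or < LOD) with LOD / sqrt(2)."""
    fill = lod / math.sqrt(2)
    return [fill if (v is None or v < lod) else v for v in values]

def detection_rate(values, lod):
    """Percentage of samples with a concentration at or above the LOD."""
    detected = sum(1 for v in values if v is not None and v >= lod)
    return 100.0 * detected / len(values)

def geometric_mean(values):
    """Geometric mean computed as exp(mean(ln(x)))."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical urinary concentrations (ng/mL); None marks a non-detect.
concentrations = [1.2, None, 0.8, 3.5, 0.05]
lod = 0.1

filled = substitute_below_lod(concentrations, lod)
rate = detection_rate(concentrations, lod)
# Following the text, a GM is reported only when the detection rate exceeds 50%.
gm = geometric_mean(filled) if rate > 50 else None
print(rate, gm)
```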
Table 1. Unadjusted urinary concentrations (ng/mL) and detection rates of various endocrine disrupting chemicals (EDCs), including bisphenols, chlorophenols, parabens and benzophenones, in Brazilian children.
Abbreviation | Name | Chemical name | DR% | GM | Median | Min | Max
BPA | Bisphenol A | 2,2-Bis(4-hydroxyphenyl)propane | 98 | 1.74 | 1.66 | 0.30 | 35.9
BPS | Bisphenol S | 4,4′-Sulfonyldiphenol | 23 | nc | < LOD | 0.06 | 22.6
BPAP | Bisphenol AP | 4,4′-(1-Phenylethylidene)bisphenol | 5 | nc | < LOD | 0.20 | 1.88
BPB | Bisphenol B | 2,2-Bis(4-hydroxyphenyl)butane | 3 | nc | < LOD | 0.09 | 3.42
BPP | Bisphenol P | 4,4′-(1,4-Phenylenediisopropylidene)bisphenol | 16 | nc | < LOD | 0.09 | 1.28
BPF | Bisphenol F | 4,4′-Dihydroxydiphenylmethane | 9 | nc | < LOD | 0.56 | 8.33
BPAF | Bisphenol AF | 4,4′-(Hexafluoroisopropylidene)-diphenol | 1 | nc | < LOD | 0.45 | 1.53
BPZ | Bisphenol Z | 4,4′-Cyclohexylidenebisphenol | 0.3 | nc | < LOD | 3.67 | 3.67
BPM | Bisphenol M | 4,4′-(1,3-Phenylenediisopropylidene)bisphenol | 0 | nd | nd | nd | nd
BADGE | – | Bisphenol A diglycidyl ether | 20 | nc | < LOD | 0.10 | 4.76
BADGE·2H2O | – | Bisphenol A bis(2,3-dihydroxypropyl) glycidyl ether | 57 | 0.30 | 0.24 | 0.10 | 33.8
BADGE·H2O | – | Bisphenol A (2,3-dihydroxypropyl) glycidyl ether | 14 | nc | < LOD | 0.10 | 2.03
BADGE·H2O·HCl | – | Bisphenol A (3-chloro-2-hydroxypropyl) (2,3-dihydroxypropyl) glycidyl ether | 8 | nc | < LOD | 0.05 | 1.26
BADGE·HCl | – | Bisphenol A (3-chloro-2-hydroxypropyl) glycidyl ether | 1 | nc | < LOD | 0.10 | 0.49
BADGE·2HCl | – | Bisphenol A bis(3-chloro-2-hydroxypropyl) glycidyl ether | 7 | nc | < LOD | 0.48 | 1.31
BFDGE | – | Bisphenol F diglycidyl ether | 0 | nd | nd | nd | nd
BFDGE·2HCl | – | Bisphenol F bis(3-chloro-2-hydroxypropyl) glycidyl ether | 0 | nd | nd | nd | nd
BFDGE·2H2O | – | Bisphenol F bis(2,3-dihydroxypropyl) glycidyl ether | 0 | nd | nd | nd | nd
3RNOGE | – | 3-Ring Novolac glycidyl ether | 0 | nd | nd | nd | nd
4RNOGE | – | 4-Ring Novolac glycidyl ether | 0 | nd | nd | nd | nd
MeP | Methyl-paraben | Methyl-paraben | 100 | 43.1 | 38.8 | 0.20 | 10,647
EtP | Ethyl-paraben | Ethyl-paraben | 83 | 0.19 | 0.32 | 0.003 | 185
PrP | Propyl-paraben | Propyl-paraben | 97 | 3.12 | 2.68 | 0.03 | 3366
BuP | Butyl-paraben | Butyl-paraben | 43 | nc | < LOD | 0.02 | 18.9
BzP | Benzyl-paraben | Benzyl-paraben | 27 | nc | < LOD | 0.01 | 1.01
HeP | Heptyl-paraben | Heptyl-paraben | 14 | nc | < LOD | 0.004 | 1.01
3,4-DHB | – | 3,4-Dihydroxy benzoic acid | 76 | 8.24 | 28.8 | 10.0 | 515
OH-MeP | – | Methyl-protocatechuic acid | 100 | 2.17 | 2.16 | 0.07 | 72.6
OH-EtP | – | Ethyl-protocatechuic acid | 71 | 0.22 | 0.23 | 0.05 | 7.28
TCS | Triclosan | 5-Chloro-2-(2,4-dichlorophenoxy)phenol | 90 | 8.26 | 14.0 | 0.02 | 874
TCC | Triclocarban | 1-(4-Chlorophenyl)-3-(3,4-dichlorophenyl)urea | 70 | 0.02 | 0.02 | 0.004 | 0.94
BP3 | Benzophenone-3 | 2-Hydroxy-4-methoxybenzophenone | 100 | 3.71 | 2.99 | 0.70 | 983
BP1 | Benzophenone-1 | 2,4-Dihydroxybenzophenone | 95 | 4.85 | 5.86 | 0.01 | 1910
BP8 | Benzophenone-8 | 2,2′-Dihydroxy-4-methoxybenzophenone | 29 | nc | < LOD | 0.01 | 2.69
BP2 | Benzophenone-2 | 2,2′,4,4′-Tetrahydroxybenzophenone | 8 | nc | < LOD | 0.25 | 8.25
4-OHBP | – | 4-Hydroxybenzophenone | 38 | nc | < LOD | 0.02 | 2.92
2,4-DCP | Dichlorophenol | 2,4-Dichlorophenol | 99 | 2.60 | 2.47 | 0.35 | 56.7
2,5-DCP | Dichlorophenol | 2,5-Dichlorophenol | 99 | 4.59 | 3.19 | 0.080 | 810
2,4,5-TCP | Trichlorophenol | 2,4,5-Trichlorophenol | 16 | nc | < LOD | 0.084 | 10.1
2,4,6-TCP | Trichlorophenol | 2,4,6-Trichlorophenol | 42 | nc | < LOD | 0.021 | 4.58
DR%, detection rate in percentage.
GM, geometric mean (nc: not calculated; GMs were calculated only for analytes with detection rate higher than 50%); nd = not detected.
Min/Max, minimum/maximum detected among positive samples.
2.3.2. Cluster analysis
Cluster analysis is a family of unsupervised pattern recognition
techniques that identify a finite set of categories or clusters in such a
way that objects within a group are more similar to each other than to
objects in another group (Maione et al., 2017; Zidek et al., 2017). We
used k-medoids, a technique based on the partitioning around medoids
(PAM) algorithm developed by Kaufman and Rousseeuw (1990). The
objective of this algorithm is to partition the dataset into k clusters, with
each cluster having one representative object known as a medoid. A
medoid is defined as the most centrally located object within a cluster,
i.e., the object that has the minimum sum of distances to the other points
in the cluster. We used the Euclidean distance as the measure of distance.
The algorithm is defined by the following steps: 1. Choose k objects at
random to be the initial cluster medoids, called representative objects.
2. Assign each object to the cluster associated with the closest medoid.
3. Randomly select a non-representative object O. 4. Calculate the total
cost S of swapping the medoid M with O (sum of distances of points to
medoid O minus the sum of distances of points to medoid M). 5. If S < 0,
then swap M with O to form the new set of medoids. 6. Repeat steps 3 to 5
until the medoids become fixed. One of the advantages of using k-medoids
instead of the popular k-means is that the k-medoids algorithm is
more robust with respect to outliers (Kaufman and Rousseeuw, 1990;
Reynolds et al., 2004).
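The swap procedure of k-medoids described in Section 2.3.2 can be sketched in plain Python as follows. This is an illustrative variant, not the implementation used in the study: instead of drawing the candidate object O at random, it scans every non-medoid candidate in turn so that "until the medoids become fixed" has a well-defined stopping point. The toy points and all function names are assumptions made for this example.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def total_cost(points, medoids):
    """Sum of distances from every point to its closest medoid (medoids are indices)."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)

def assign(points, medoids):
    """For each point, the position (0..k-1) of its closest medoid."""
    return [min(range(len(medoids)), key=lambda j: euclidean(p, points[medoids[j]]))
            for p in points]

def k_medoids(points, k, seed=0):
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)       # step 1: random initial medoids
    cost = total_cost(points, medoids)
    improved = True
    while improved:                                    # step 6: repeat until fixed
        improved = False
        for j in range(k):
            for o in range(len(points)):               # step 3: candidate object O
                if o in medoids:
                    continue
                trial = medoids[:j] + [o] + medoids[j + 1:]
                trial_cost = total_cost(points, trial)
                s = trial_cost - cost                  # step 4: cost of the swap
                if s < -1e-12:                         # step 5: accept if cost drops
                    medoids, cost = trial, trial_cost
                    improved = True
    return medoids, assign(points, medoids)            # step 2: final assignment

# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids(points, k=2)
print(medoids, labels)
```

Because each accepted swap strictly lowers the total cost and there are finitely many medoid sets, the loop always terminates.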
2.3.3. Classiﬁcation algorithms and model evaluation
Decision trees have become a powerful and popular approach in
data science due to their simplicity, low computational cost, and quick
generalization of data (Rokach, 2016). The structure of a decision tree
consists of internal nodes, each representing a test on a variable;
branches, each representing an outcome of the test; and leaf nodes, each
representing a class label. We used the C4.5 decision tree, implemented as
J48 in R software, for the analysis of our data (Hornik et al.,
The Random Forest (RF) algorithm is a classiﬁer that generates
multiple decision trees using bootstrap samples from the original
training data. Around one third of data (called out-of-bag [OOB]) is
separated to test the respective tree constructed from the bootstrap
sample. Each tree in the RF is a Classiﬁcation and Regression Tree
(CART). Each node in the tree corresponds to a variable, and each edge
originating from a node x represents a value, or a range of values, for
the variable x. The tree chooses which variable will be a node based on
the Gini criteria and the node splits into new nodes until a stopping
criterion is met or until the terminal nodes are pure. The classiﬁcation
is decided by majority vote among the trees. The RF algorithm
is an effective tool for prediction due to its high level of
accuracy (Breiman, 2001). In this study, the number of trees was set
to 500, and the number of randomly selected variables tried at each
node was set to the square root of the number of variables in the
dataset (in our case, approximately 4) (Liaw and Wiener, 2002).
To evaluate the performance of the classification model, we used
k-fold cross validation with k = 10. This method splits the data into
k subsets (folds), using k − 1 folds to train the model and the remaining
fold to test it, in turn for each fold. The relationship of correct and incorrect
classiﬁcations is organized in a confusion matrix (Table S3) to obtain
the measurement performances of accuracy, sensitivity and speciﬁcity.
The matrix values are true positive (TP) for samples correctly classiﬁed
as positive; true negative (TN) for samples correctly classiﬁed as ne-
gative; false negative (FN) for the positive samples that were classiﬁed
as negative; and false positive (FP) for negative samples that were
classiﬁed as positive.
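The evaluation scheme described above can be sketched as follows: a k-fold index splitter, and accuracy, sensitivity and specificity computed from confusion-matrix counts. The labels are hypothetical, the splitter does not shuffle the data, and the function names are assumptions for this example; the study itself used R packages for this step.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def confusion_counts(actual, predicted, positive):
    """Return (TP, FN, FP, TN) with `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn

def metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical cluster labels, with Cluster 1 taken as the positive class.
actual = [1, 1, 1, 2, 2, 2]
predicted = [1, 1, 2, 2, 2, 1]
counts = confusion_counts(actual, predicted, positive=1)
result = metrics(*counts)
splits = list(k_fold_indices(20, k=10))
print(counts, result)
```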
3. Results
Geometric mean, percentiles and range for volume-based concentrations
of 40 EDCs (ng/mL) measured in 300 urine samples of
Brazilian children are summarized in Table 1. The results for creatinine-
adjusted and SG-adjusted concentrations of EDCs are summarized in
Tables S4 and S5, respectively. Among the 40 EDCs measured, 14 were
commonly detected in Brazilian children with the detection rates above
50%. Nineteen chemicals were detected in < 50% of the samples
whereas 7 chemicals remained undetected. Correlations among urinary
EDCs (Table S6) ranged from negative (r = −0.2) to positive (r = 0.8).
Positive correlations (r > 0.6) were observed between BP3 and BP1;
MeP and PrP; OH-MeP and OH-EtP; TCS and 2,4-DCP. Strong positive
correlations between chemicals often suggest common sources and co-
Among bisphenols, BPA was the major compound, found in 98% of
the samples analyzed at concentrations ranging from < LOD to
35.9 ng/mL with a GM value of 1.74 ng/mL. The other BPA analogues
were found at much lower detection rates: BPS (23%), BPP (16%), BPF
(9%), BPAP (5%), BPB (3%) and BPAF (1%). All BADGE derivatives
were found in urine samples, but only BADGE·2H2O was frequently
detected (57%), at concentrations that ranged from < LOD to 33.8 ng/mL
with a GM concentration of 0.30 ng/mL. Other BADGE derivatives
were found in < 20% of the samples. BFDGEs and NOGE derivatives
were not detected.
Parabens and their metabolites were found at detection rates that
ranged from 14% to 100%. MeP was detected in all samples at a GM
concentration of 43.1 ng/mL. PrP and EtP were also frequently detected
with GM concentrations of 3.12 and 0.19 ng/mL, respectively. Paraben
metabolites, 3,4-DHB, OH-MeP and OH-EtP, were frequently detected,
with detection rates > 70% and GM concentrations of 8.24, 2.17
and 0.22 ng/mL, respectively.
BP3 and BP1 were detected at rates of 100% and 95%, respectively,
with GM concentrations of 3.71 and 4.85 ng/mL, respectively. The
detection rates of BP8, BP2 and 4-OHBP were < 40%. TCS and TCC
were detected in 90% and 70% of the samples, respectively, at concentrations
ranging from < LOD to 874 ng/mL (GM: 8.26 ng/mL) for TCS and from < LOD to
0.94 ng/mL (GM: 0.02 ng/mL) for TCC.
Dichlorophenols were frequently detected in urine. The GM con-
centrations of 2,4-DCP and 2,5-DCP were 2.60 ng/mL and 4.59 ng/mL,
respectively. 2,4,5-TCP and 2,4,6-TCP were less frequently detected.
Unadjusted urinary concentrations of EDCs stratiﬁed by gender and
age are summarized in Table 2 and urinary concentrations of EDCs
adjusted for creatinine and SG are listed in Tables S7 and S8. When
children were grouped by gender, statistically significant differences
(p < 0.05; Mann-Whitney U test) were observed for the concentrations
of MeP, PrP, TCS, BP3 and BP1. The results of MLR analysis, presented
in Table S9, suggested that the urinary concentrations of
MeP, EtP, PrP, BP3, and BP1 differed significantly between sexes, as
observed with the non-parametric comparison (except for EtP). The
urinary concentrations of MeP, EtP, PrP, TCS, BP3 and BP1 were higher
in females than in males. Furthermore, significant differences (p < 0.05;
Kruskal-Wallis test) were observed in the concentrations of MeP, PrP,
and BP1 among the age and gender groups (6–10 and 11–14 year old
female and male children; Table S9). In addition, MLR analysis showed
that the urinary concentrations of MeP, PrP, OH-MeP, TCS, BP1, and
BP3 differed significantly (p < 0.05) by the interaction term of
age and gender. Higher concentrations of these compounds (except for
OH-MeP) were observed in older female children.
The urinary concentrations of EDCs, except for MeP, PrP, paraben
metabolites and TCS, showed significant differences (p < 0.05;
Kruskal-Wallis test) among the geographic regions of Brazil (Table
S10). The median concentrations of EDCs were lower in the Southern and
Southeastern regions than in the other regions of Brazil. Furthermore,
MLR analysis showed that the urinary concentrations of BPA,
BADGE·2H2O, MeP, EtP, antimicrobials and dichlorophenols were
significantly different by geographic region.
Table 2. Unadjusted median urinary concentrations of endocrine disrupting chemicals (ng/mL), including bisphenols, chlorophenols, parabens and benzophenones, in Brazilian children grouped by age and gender.
Group | N | BPA | BADGE·2H2O | MeP | EtP | PrP | 3,4-DHB | OH-MeP | OH-EtP | TCS | TCC | BP3 | BP1 | 2,4-DCP | 2,5-DCP
6–10 years | 134 | 1.76 | 1.17 | 27.4 | 0.36 | 2.61 | 41.4 | 2.28 | 0.59 | 14.1 | 0.032 | 3.07 | 6.29 | 2.22 | 2.96
11–14 years | 166 | 1.64 | 0.70 | 48.5 | 0.44 | 3.30 | 36.4 | 2.12 | 0.40 | 21.5 | 0.033 | 2.96 | 6.77 | 2.61 | 4.06
p-Value | | 0.470 | 0.125 | 0.093 | 0.171 | 0.446 | 0.329 | 0.083 | 0.062 | 0.106 | 0.896 | 0.778 | 0.382 | 0.073 | 0.166
Female | 151 | 1.67 | 0.85 | 57.9 | 0.45 | 4.48 | 40.0 | 2.47 | 0.49 | 27.4 | 0.033 | 3.39 | 8.08 | 2.74 | 3.72
Male | 149 | 1.68 | 0.93 | 26.5 | 0.38 | 1.76 | 44.7 | 1.97 | 0.41 | 13.17 | 0.033 | 2.47 | 4.77 | 2.20 | 3.13
p-Value | | 0.580 | 0.487 | 0.000* | 0.086 | 0.000* | 0.168 | 0.375 | 0.652 | 0.015* | 0.958 | 0.047* | 0.017* | 0.134 | 0.580
* Significance level is p < 0.05.
8OHDG was detected in 94.6% of the samples at concentrations
ranging from 0.40 to 29.5 ng/mL (GM: 4.40). Correlations
(r = 0.145–0.359) were found between log-transformed urinary con-
centrations of 8OHDG and BPA, MeP, 3,4-DHB, OH-MeP, OH-EtP, TCS,
BP3, 2,4-DCP and 2,5-DCP. No signiﬁcant diﬀerences were found in the
urinary concentrations of 8OHDG among various demographic (i.e.,
age, gender and region) groups. However, MLR analysis revealed the
inﬂuence of age and region (Table S11) on the concentrations of
8OHDG. Only the urinary concentration of OH-MeP was significantly
correlated (p = 0.004) with 8OHDG after adjusting for covariates
(gender, age and region).
Urinary concentrations of 8OHDG and 14 commonly detected EDCs
were subjected to cluster analysis (k-medoids) and classification
algorithms (RF and J48) to identify common groups within the dataset. To
estimate the number of clusters, the NbClust package in R
was used (Charrad et al., 2014). This package implements 30 distinct,
efficient, and widely used indexes for estimating the number of clusters.
The function was set to determine the best number of clusters
between two and ten. The indexes combine information
about intracluster compactness and intercluster isolation, as well as
other factors, such as geometric or statistical properties of the data, the
number of data objects and dissimilarity or similarity measurements,
to determine the optimal number of clusters (Charrad et al., 2014).
Among all the indexes, 10 proposed 2 as the best number of clusters; 7
proposed 3; 1 proposed 5; 4 proposed 7; and 2 proposed 10. We decided
to use two clusters because this value was the one most frequently
suggested by the indexes. The k-medoids algorithm was then run on
standardized data (all features have a mean of
0 and a standard deviation of 1) to search for the two clusters. It was
found that 65.7% of the samples (197 samples) were grouped as Cluster
1, and the remainder (34.3%) of the samples were grouped as Cluster 2.
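The two preparatory steps described above, choosing the number of clusters by the most frequent index vote and standardizing each feature, can be illustrated with a short sketch. The vote counts are those reported in the text; the example feature column is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics
from collections import Counter

# Number of NbClust indexes proposing each cluster count, as reported in the text.
votes = Counter({2: 10, 3: 7, 5: 1, 7: 4, 10: 2})
best_k, n_votes = votes.most_common(1)[0]

def standardize(column):
    """Z-score a feature column: subtract the mean, divide by the sample SD (assumed)."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]

# Hypothetical urinary concentrations (ng/mL) for one analyte.
column = [1.0, 2.0, 3.0, 4.0, 5.0]
z = standardize(column)
print(best_k, n_votes, z)
```

Standardizing before clustering keeps analytes with large concentration ranges (for example MeP) from dominating the Euclidean distance.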
After clustering the data, the RF and J48 decision tree classification
algorithms were applied to investigate the characteristics of the clusters. The
RF algorithm was selected because it uses decision trees to classify the
samples and provides a feature importance ranking, allowing us to
analyze how the EDCs affect the clusters. The J48 algorithm was
selected to construct a decision tree that represented the results of the
feature importance ranking obtained by RF. The performance measures
for the RF algorithm with 10-fold cross validation using all
variables (8OHDG, BPA, BADGE·2H2O, MeP, EtP, PrP, 3,4-DHB,
OH-MeP, OH-EtP, TCS, TCC, BP3, BP1, 2,5-DCP, 2,4-DCP, region, age, and
gender) are shown in Table 3. The RF algorithm classified 94.3% of the
samples correctly. In addition, RF provides a variable importance
measure, based on the classification accuracy of the OOB data, called
Random Forest Importance (RFI) (Breiman, 2001). Fig. S1 shows the
importance values obtained by RFI. The higher the value, the more
important the variable is in the classification scheme.
The most important RFI variable identiﬁed by this analysis was
8OHDG. In order to investigate whether 8OHDG was the most im-
portant variable, we applied the decision tree algorithm for further
interpretation of this ﬁnding. Fig. S2 shows the tree structure generated
by J48 algorithm. According to this tree, given an arbitrary urine
sample, if the urinary concentration of 8OHDG for this sample was
≤6.60 ng/mL, the sample was in Cluster 1. The accuracy of the J48 was
86.3%. As can be seen in Fig. S2, only 13.2% of the samples which had
8OHDG concentration ≤6.60 ng/mL were misclassiﬁed in Cluster 1.
The remaining 86.8% samples in Cluster 1 were correctly classiﬁed.
Among the samples of Cluster 2, 14.8% of the samples with 8OHDG
concentration > 6.60 ng/mL were misclassiﬁed in this cluster. The
remaining 78.4% samples in Cluster 2 were correctly classiﬁed. These
results suggest that the variable 8OHDG has a major importance in the
formation of the clusters.
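Read as a single split, the J48 rule described above amounts to a threshold on urinary 8OHDG. The sketch below applies that rule and scores it against cluster labels; the sample values are hypothetical, and treating the tree as a one-node rule is a simplifying assumption based only on the description in the text.

```python
THRESHOLD_NG_ML = 6.60  # 8OHDG split value reported for the J48 tree

def predict_cluster(ohdg_ng_ml, threshold=THRESHOLD_NG_ML):
    """Cluster 1 if 8OHDG is at or below the threshold, otherwise Cluster 2."""
    return 1 if ohdg_ng_ml <= threshold else 2

def rule_accuracy(samples):
    """samples: list of (8OHDG in ng/mL, cluster assigned by k-medoids)."""
    hits = sum(1 for conc, cluster in samples if predict_cluster(conc) == cluster)
    return hits / len(samples)

# Hypothetical samples: (8OHDG concentration, k-medoids cluster label).
samples = [(2.0, 1), (8.0, 2), (5.0, 2), (9.0, 2)]
print(rule_accuracy(samples))
```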
To analyze the importance of the remaining variables, we used the
RF algorithm to classify 17 nested subsets, where the i-th subset contained
the i most important variables (i = 1, 2, …, 17) according to the ranking
presented in Fig. S1 after removing 8OHDG from the list. The first
subset contained only OH-EtP, and the second subset contained
OH-EtP and OH-MeP (Table S12). The subset that showed the highest
prediction capability for classifying the samples into Cluster 1 and
Cluster 2 was #RFI05, which comprised OH-EtP, OH-MeP, 3,4-DHB,
BPA, 2,4-DCP, MeP, TCS, EtP, and BP1, with a precision of 81.7%.
The descriptive statistics of these variables for each cluster are shown in
BPA was the major bisphenol frequently detected in children's urine
from Brazil. The GM urinary concentrations of BPA in Brazilian children
(1.74 ng/mL) was similar to those reported for children from the USA
(1.58 ng/mL), Canada (1.3 ng/mL), and several European countries
(1.48 to 2.35 ng/mL) but lower than those reported from India
(5.08 ng/mL) and China (4.10 ng/mL) (CDC, 2015;Covaci et al., 2015;
Health Canada, 2013;Li et al., 2013;Xue et al., 2015). BPA is gradually
being replaced with other analogues (Asimakopoulos et al., 2016;Liao
et al., 2012a, b;Ye et al., 2015;Zhang et al., 2016). For instance, BPS
has been used as an alternative for BPA in baby bottles and thermal
papers, and other bisphenol analogues are used in the manufacture of
certain consumer products (Liao et al., 2012a, b;Rocha et al., 2015;
Simoneau et al., 2011;Ye et al., 2015). BPS was also detected in urine
samples from other countries such as Saudi Arabia (100%), Japan
(100%), USA (97%), China (82%), India (76%), and Korea (42%) at
higher detection rates (Asimakopoulos et al., 2016;Liao et al., 2012a;
Xue et al., 2015). However, BPS and other bisphenol analogues were
less frequently detected in Brazilian children. These results are in line
with a study from Brazil that showed low detection of BPS and other
bisphenol analogues in adult urine samples (Rocha et al., 2016).

Confusion matrix obtained for Random Forest (RF) classification on 10-fold
Predicted class Cluster 1 Cluster 2
Cluster 1 64.0 (TP)
Cluster 2 1.7 (FN)
The matrix values are true positive (TP) for samples correctly classified as
positive; true negative (TN) for samples correctly classified as negative; false
negative (FN) for positive samples that were classified as negative; and false
positive (FP) for negative samples that were classified as positive.

Minimum, maximum and mean unadjusted urinary concentrations (ng/mL) of OH-EtP,
OH-MeP, 3,4-DHB, BPA, 2,4-DCP, MeP, TCS, EtP and BP1 found in the two
clusters of samples identified by data mining techniques. The EDCs listed are
the most significant for differentiating the samples of the two clusters, according to
the Random Forest Importance (RFI) and Random Forest (RF) classifier.

          Cluster 1               Cluster 2
          Min    Max     Mean     Min    Max      Mean
OH-EtP    nd     1.97    0.25     nd     7.28     1.23
OH-MeP    0.07   14.0    2.17     0.50   72.6     7.90
3,4-DHB   nd     350     36.0     nd     515      66.6
BPA       nd     20.6    2.08     0.50   35.9     3.72
2,4-DCP   nd     30.8    3.54     nd     56.7     7.12
MeP       nd     1252    93.9     nd     10,647   343
TCS       nd     507     37.1     nd     874      93.0
EtP       nd     54.8    1.56     nd     185      5.27
BP1       nd     1910    37.4     nd     1812     56.8

B.A. Rocha et al. Environment International 116 (2018) 269–277
Humans are exposed to BADGEs mainly through canned foods and
drinks. However, studies on human exposure to BADGEs are still lim-
ited. The occurrence of BADGEs and BFDGEs was shown in populations
in the USA, China, India, and Greece (Asimakopoulos et al., 2014;Wang
et al., 2012;Xue et al., 2015). The urinary GM concentration of
BADGE·2H2O in Brazilian children was 0.30 ng/mL, which was close to
that reported for Chinese children (0.59 ng/mL) and much lower than
that reported for Indian children (12.2 ng/mL) (Wang et al., 2012;Xue
et al., 2015).
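Geometric means such as those compared above are computed on the log scale. The following is a minimal sketch, assuming that nondetects are replaced by LOD/√2 before averaging (the substitution described by Hornung and Reed, 1990, which appears in the reference list). This section does not state how nondetects were handled, so the substitution rule, the LOD value and the toy concentrations are assumptions made for illustration only.

```python
import math


def geometric_mean(values, lod, nd_marker=None):
    """Geometric mean of concentrations; entries equal to nd_marker
    (non-detects) are replaced by lod / sqrt(2) before averaging."""
    substituted = [lod / math.sqrt(2) if v is nd_marker else v for v in values]
    log_sum = sum(math.log(v) for v in substituted)
    return math.exp(log_sum / len(substituted))


# Hypothetical concentrations (ng/mL); None marks a non-detect.
toy_values = [0.42, None, 0.15, 0.88, None, 0.31]
gm = geometric_mean(toy_values, lod=0.05)
print(round(gm, 3))
```

Because the mean is taken over logarithms, a few very high concentrations influence the GM far less than they would influence an arithmetic mean.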
Chlorophenols have been used as pesticides or as intermediates in
the production of pesticides, dyes, and pharmaceuticals. The high de-
tection frequencies of dichlorophenols (99%) indicate widespread ex-
posure to these chemicals among Brazilian children. The urinary GM
concentrations of 2,4-DCP and 2,5-DCP (2.60 and 4.59 ng/mL, respec-
tively) in Brazilian children were higher than those reported for
American, Canadian and Asian children (except for 2,5-DCP) (CDC,
2015;Health Canada, 2013;Zidek et al., 2017). One possible source of
2,4-DCP in urine is the metabolic transformation of TCS (Ye et al.,
2014). We found high concentrations of TCS in urine from these chil-
dren and a signiﬁcant correlation existed between urinary concentra-
tions of 2,4-DCP and TCS (r = 0.690), suggesting that TCS is a major
source of 2,4-DCP found in urine.
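A correlation coefficient of the kind reported above can be sketched as follows. This section does not specify which coefficient was used or whether concentrations were transformed, so the choice of a Pearson coefficient on log-transformed values is an assumption, and the paired toy concentrations are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired urinary concentrations (ng/mL), not study data.
toy_tcs = [1.2, 8.5, 40.0, 3.3, 120.0, 15.0]
toy_dcp = [0.9, 2.0, 6.5, 1.1, 9.0, 4.8]

r = pearson_r([math.log(v) for v in toy_tcs],
              [math.log(v) for v in toy_dcp])
print(round(r, 3))
```

A positive coefficient is compatible with a shared source or a metabolic link, but it does not by itself establish that one compound is derived from the other.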
Parabens are widely used as antimicrobial preservatives in cosmetics,
pharmaceuticals, and foodstuffs (Bledzka et al., 2014;Guo and
Kannan, 2013;Liao et al., 2013). Urinary GM concentrations of MeP
and PrP in Brazilian children were higher than those previously re-
ported for American, Indian and Chinese children (CDC, 2015;Wang
et al., 2013;Xue et al., 2015). A correlation among MeP, EtP and PrP
(r = 0.275–0.593) in urine samples suggested that these compounds are
used in combination in various consumer products (Asimakopoulos
et al., 2014;Jiménez-Díaz et al., 2016;Larsson et al., 2014;Wang et al.,
Benzophenones (BPs) have been used as sunscreen agents in per-
sonal care products for the protection of skin and hair from UV irra-
diation. BP3 is the most commonly used sunscreen agent in a variety of
cosmetics (Asimakopoulos et al., 2014;Gao et al., 2015;Heﬀernan
et al., 2015;Kunisue et al., 2012;Liao and Kannan, 2014;Wang and
Kannan, 2013). The GM urinary concentration of BP3 measured in the
present study (3.71 ng/mL) was much higher than that previously re-
ported for Indian (0.91 ng/mL) and Chinese children (0.62 ng/mL),
which may be attributed to lower sunscreen usage in India and China
than in Brazil (Wang and Kannan, 2013;Xue et al., 2015). However,
urinary BP3 concentrations in Brazilian children were lower than those
reported for the U.S. (18.7 ng/mL) and Australian children
(26.2–96.2 ng/mL) (CDC, 2015;Heﬀernan et al., 2015). The urinary
GM concentration of BP1 (5.86 ng/mL) was higher than those reported
for the U.S. and Chinese children (4.21 and 0.115 ng/mL, respectively)
(Wang and Kannan, 2013). There was a signiﬁcant correlation between
BP3 and BP1 concentrations, suggesting concomitant exposure of chil-
dren to these compounds.
TCS and TCC have been widely used as antimicrobial agents in PCPs
such as soaps, toothpastes and deodorants (Larsson et al., 2014;H. Ma
et al., 2013;Pycke et al., 2014). The urinary GM concentration of TCS
in Brazilian children (8.26 ng/mL) was similar to those reported for the
American (7.2 ng/mL), Canadian (8.5 ng/mL), Indian (9.6 ng/mL) and
Chinese children (7.5 ng/mL), but much lower than those reported for
Australian children (59.8–106 ng/mL) (CDC, 2015;Health Canada,
2013;Heﬀernan et al., 2015;Li et al., 2013;Xue et al., 2015). TCC
(70%) was more frequently detected in our study than in studies from
Greece, Saudi Arabia, Canada and the USA (Asimakopoulos et al., 2016,
2014;CDC, 2015;Ye et al., 2016).
Several studies have associated the use of PCPs with high
concentrations of parabens, benzophenones, TCC and TCS in urine
(Frederiksen et al., 2013;Gao et al., 2015;Guo and Kannan, 2013;
Heﬀernan et al., 2015;Larsson et al., 2014;Liao and Kannan, 2014;W.
L. Ma et al., 2013;Pycke et al., 2014;Schebb et al., 2011;Wang et al.,
2013). Brazil is one of the leading countries in the consumption of
beauty and personal care products and it is the second largest market
for children's personal care products globally. Furthermore, Brazil is the
biggest consumer market for fragrances and deodorants in the world
(Euromonitor International, 2016a, b;Rocha et al., 2017). Therefore,
high urinary concentrations of several EDCs may be associated with
heavy usage of PCPs among the Brazilian population. Our results also
showed signiﬁcantly higher concentrations of MeP, PrP, TCS, BP3 and
BP1 in females than males (Table 2). This could be related to more
frequent use of personal care products and cosmetics by females.
Exposure to EDCs has been associated with oxidative stress
(Asimakopoulos et al., 2016;Bledzka et al., 2014;Franken et al., 2017;
Hong et al., 2009;Lu et al., 2016;Lv et al., 2017;Rocha et al., 2017;
Zhang et al., 2016). In this study, urinary concentrations of BPA, MeP,
3,4-DHB, OH-MeP, OH-EtP, TCS, BP3, 2,4-DCP and 2,5-DCP exhibited a
signiﬁcant correlation with 8OHDG (r = 0.158–0.359). BPA has been
shown to induce oxidative stress in laboratory experimental
(Bindhumol et al., 2003;Hassan et al., 2012) and human epidemiolo-
gical studies (Asimakopoulos et al., 2016;Ferguson et al., 2016;Hong
et al., 2009;Lv et al., 2017;Zhang et al., 2016). Similarly, relationships
between urinary paraben concentrations and markers of oxidative damage
(8OHDG and/or malondialdehyde) have been shown in human
(Bledzka et al., 2014;Kang et al., 2013;Watkins et al., 2015) and animal
studies (Bledzka et al., 2014;Popa et al., 2011). A statistically significant
association between urinary concentrations of dichlorophenols
and 8OHDG was found in our study. A recent study also reported an
association between 2,5-DCP and 8OHDG (Franken et al., 2017). A
correlation between urinary concentrations of BP3 and 8OHDG was
found in our study. However, Watkins et al. (2015) did not observe a
similar relationship, although animal studies corroborate our
findings (Gao et al., 2013;Kato et al., 2006). Previous in vitro studies
have shown that TCS can induce oxidative stress (Ma et al., 2013;Zeng
et al., 2016).
The associations between EDC exposures and 8OHDG described
above are based on conventional univariate statistics. However, uni-
variate or even some multivariate regression statistical tools have lim-
itations in identifying realistic associations between exposure and
outcome variables, mainly in scenarios in which coexposure to multiple
contaminants exists. Recently, data mining techniques or Knowledge
Discovery from Databases (KDD) have been used as alternative and
powerful mathematical tools to discover hidden patterns in data or
correlations among data (Tan et al., 2005). Data mining is an inter-
disciplinary subﬁeld of computer science and deﬁned as a computing
process of discovering or searching for patterns in datasets (Tan et al.,
2005). Data mining is a vast area of research and there is an abundance
of techniques and methods that can uncover new knowledge (Witten
et al., 2016). It involves methods at the intersection of artificial
intelligence, machine learning, statistics, and database systems. A dataset
is usually analyzed using different tools such as clustering, classification
and feature selection. These tools help to classify data, to identify
co-occurring sequences, and to reveal correlations among
activities (Tan et al., 2005). Data mining has been
used successfully in many ﬁelds of research and its application in the
environmental sciences is on the rise (Lausch et al., 2015;Barbosa et al.,
2014;Corstanje et al., 2016;Marvuglia et al., 2015;Tsai et al., 2017).
According to Lausch et al. (2015), the use of data mining in
environmental science remains limited because it requires extensive
programming expertise, so these techniques are currently used
mainly in the areas of computer science, physics and mathematics.
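Clustering is the first of the tools listed above, and the dataset in this study was clustered with k-medoids. As an illustration of the idea only, the following sketch implements a simple alternating k-medoids heuristic (assign each sample to its nearest medoid, then re-pick each medoid as the member with the smallest total distance to its cluster). It is not the partitioning implementation used by the authors; the Manhattan distance, the deterministic initialisation from the first k samples, and the toy points are all assumptions.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two points."""
    return sum(abs(p - q) for p, q in zip(a, b))


def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids heuristic.

    Returns (medoid_indices, labels), where labels[i] is the position
    in medoid_indices of the medoid nearest to points[i]."""
    medoids = list(range(k))  # assumption: start from the first k samples
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest medoid for every sample.
        labels = [
            min(range(k), key=lambda j: manhattan(p, points[medoids[j]]))
            for p in points
        ]
        # Update step: the member minimising total distance to its cluster.
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[j])
                continue
            best = min(
                members,
                key=lambda i: sum(manhattan(points[i], points[m]) for m in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Hypothetical two-variable samples forming two obvious groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]
medoids, labels = k_medoids(points, k=2)
print(medoids, labels)
```

Unlike k-means, each cluster centre is an actual sample, which keeps the result interpretable and makes the method less sensitive to extreme concentrations.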
Data mining approaches were used in this study to evaluate or
search for patterns in our EDC database. First, the whole dataset was
subject to clustering (k-medoids) with the identiﬁcation of two diﬀerent
clusters. These clusters revealed previously unknown features of
interest in the data. Moreover, when the samples were categorized
by region, age, and gender, the distribution of samples along
the three categories was similar between the two clusters, indicating
that region, age, and gender did not determine the formation of
the clusters. Interestingly, the variable importance
measurements identified 8OHDG as the most significant
variable for separating the clusters, with a “cutoff” value of 6.60 ng/mL
(provided by the decision tree algorithm) in 86.3% of cases. This indicates that
subjects belonging to cluster 1 (i.e. 8OHDG ≤6.60 ng/mL) have lower
DNA effects in 86.8% of cases, whereas the subjects belonging to cluster
2 (i.e. 8OHDG > 6.60 ng/mL) have higher DNA effects in 85.2% of cases.
With this “cutoff” value, we could identify target EDCs that were
associated with cluster 2 and are therefore linked to oxidative DNA damage. Thus,
after removing 8OHDG values from the database and after applying the
classiﬁcation models (by using diﬀerent subsets of variables rated ac-
cording to the Random Forest Importance [RFI]), it was observed that
exposure variables increased the diﬀerentiation of the two clusters in
the following order: OH-EtP > OH-MeP > 3,4-DHB > BPA > 2,4-DCP >
MeP > TCS > EtP > BP1 > 2,5-DCP. The order of the
compounds was determined by RFI algorithm. This method of selection
generated a ranking of importance in which the top variable was the
most signiﬁcant to diﬀerentiate between the classes of the data. The
second variable had the second highest significance, and so on. This
sequence of variables allowed us to classify the data iteratively to
identify which variable subset gave the best classification of the
data. The combined use of the variables (OH-EtP, OH-MeP, 3,4-DHB,
BPA, 2,4-DCP, MeP, TCS, EtP, BP1) resulted in the highest accuracy,
indicating that this subset of variables was able to discriminate the classes
better than the other sets of variables. The order of the compounds
allowed identification of this variable subset. In terms of classification,
the order of the compounds within a variable subset did not
change the results of the classification model because of the way the RFI
algorithm operates. Interestingly, the levels of OH-EtP alone
differentiated the two clusters with a 74.0% precision. Moreover, mean levels
of each of these 10 EDCs were higher in cluster 2 (higher 8OHDG levels)
compared to that in cluster 1. In summary, our study suggests that
coexposure to EDCs (mainly OH-EtP, OH-MeP, 3,4-DHB, BPA, 2,4-DCP,
MeP, TCS, EtP, BP1, 2,5-DCP) is associated with DNA damage and that
OH-EtP is a major contributor to such effects, followed by other paraben
metabolites, chlorophenols, TCS and benzophenones.
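The subset search described above (rank the variables, then classify with the top 1, top 2, … variables and keep the best-scoring subset) can be sketched as follows. The study used a Random Forest classifier with an RFI ranking; to keep the example self-contained, this sketch swaps in a nearest-centroid classifier scored by leave-one-out accuracy, and it takes the ranking as a given input. The function names, the ranking and the toy data are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the class whose centroid is closest to x (squared Euclidean)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def loo_accuracy(X, y):
    """Leave-one-out accuracy of the stand-in classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def evaluate_nested_subsets(X, y, ranking):
    """Score the top-1, top-2, ... variable subsets of a given ranking.

    ranking lists column indices, most important first."""
    scores = []
    for size in range(1, len(ranking) + 1):
        cols = ranking[:size]
        X_sub = [[row[c] for c in cols] for row in X]
        scores.append((size, loo_accuracy(X_sub, y)))
    return scores


# Hypothetical data: column 0 separates the classes, column 1 is noise.
X = [[1.0, 5.0], [1.2, 1.0], [0.8, 3.0], [3.0, 1.0], [3.2, 5.0], [2.8, 3.0]]
y = [1, 1, 1, 2, 2, 2]

scores = evaluate_nested_subsets(X, y, ranking=[0, 1])
best_size = max(scores, key=lambda s: s[1])[0]
print(scores, best_size)
```

In this toy example adding the noisy second variable lowers the accuracy, which is why the procedure compares all nested subsets rather than assuming that more variables always classify better.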
To our knowledge, this is the ﬁrst study to examine the association
between oxidative stress (DNA damage) and coexposure to multiple
classes of EDCs through a data mining approach. This approach pro-
vides much more relevant information related to multiple EDC ex-
posures than univariate statistical models. Overall, our ﬁndings suggest
that coexposures to BPA, parabens, and dichlorophenols contribute to
DNA damage in Brazilian children.
Conﬂict of interest
The authors declare no conﬂict of interest.
Acknowledgments
We thank all Brazilian children for providing urine samples for this
study. This research was supported in part by São Paulo Research
Foundation (FAPESP, grant numbers 2015/20928-3 and 2013/23710-
3). The sample analysis was conducted at Wadsworth Center, New York
State Department of Health. Research reported in this publication was
supported in part by the National Institute of Environmental Health
Sciences of the National Institutes of Health under Award Number
U2CES026542-01. The content is solely the responsibility of the authors
and does not necessarily represent the oﬃcial views of the National
Institutes of Health.
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://
References
Asimakopoulos, A.G., Thomaidis, N.S., Kannan, K., 2014. Widespread occurrence of bi-
sphenol A diglycidyl ethers, p-hydroxybenzoic acid esters (parabens), benzophenone
type-UV ﬁlters, triclosan, and triclocarban in human urine from Athens, Greece. Sci.
Total Environ. 470–471, 1243–1249. http://dx.doi.org/10.1016/j.scitotenv.2013.10.
Asimakopoulos, A.G., Xue, J., De Carvalho, B.P., Iyer, A., Abualnaja, K.O., Yaghmoor,
S.S., Kumosani, T.A., Kannan, K., 2016. Urinary biomarkers of exposure to 57 xe-
nobiotics and its association with oxidative stress in a population in Jeddah, Saudi
Arabia. Environ. Res. 150, 573–581. http://dx.doi.org/10.1016/j.envres.2015.11.
Barbosa, R.M., Batista, B.L., Varrique, R.M., Coelho, V.A., Campiglia, A.D., Barbosa, F.,
2014. The use of advanced chemometric techniques and trace element levels for
controlling the authenticity of organic coﬀee. Food Res. Int. 61, 246–251. http://dx.
Bindhumol, V., Chitra, K.C., Mathur, P.P., 2003. Bisphenol A induces reactive oxygen
species generation in the liver of male rats. Toxicology 188, 117–124. http://dx.doi.
Bisht, S., Faiq, M., Tolahunase, M., Dada, R., 2017. Oxidative stress and male infertility.
Nat. Rev. Urol. 14, 470–485. http://dx.doi.org/10.1038/nrurol.2017.69.
Bledzka, D., Gromadzinska, J., Wasowicz, W., 2014. Parabens. From environmental stu-
dies to human health. Environ. Int. 67, 27–42. http://dx.doi.org/10.1016/j.envint.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32. http://dx.doi.org/10.1023/
Calafat, A.M., Ye, X., Valentin-Blasini, L., Li, Z., Mortensen, M.E., Wong, L.Y., 2017. Co-
exposure to non-persistent organic chemicals among American pre-school aged
children: a pilot study. Int. J. Hyg. Environ. Health 220, 55–63. http://dx.doi.org/10.
CDC: Centers for Disease Control and Prevention, National Center for Environmental
Health, Division of Laboratory Sciences. Fourth National Report on Human Exposure to
Environmental Chemicals (Updated Tables, February, 2015). https://www.cdc.gov/
biomonitoring/pdf/fourthreport_updatedtables_feb2015.pdf, Accessed date: 15
Charrad, M., Ghazzali, N., Boiteau, V., Niknafs, A., 2014. NbClust: an R package for
determining the relevant number of clusters in a data set. J. Stat. Softw. 61. http://dx.
Corstanje, R., Graﬁus, D.R., Zawadzka, J., Moreira Barradas, J., Vince, G., Ivanoﬀ, D.,
Pietro, K., 2016. A datamining approach to identifying spatial patterns of phosphorus
forms in the Stormwater treatment areas in the Everglades, US. Ecol. Eng. 97,
Covaci, A., Den Hond, E., Geens, T., Govarts, E., Koppen, G., Frederiksen, H., Knudsen,
L.E., Mørck, T.A., Gutleb, A.C., Guignard, C., Cocco, E., Horvat, M., Heath, E., Kosjek,
T., Mazej, D., Tratnik, J.S., Castaño, A., Esteban, M., Cutanda, F., Ramos, J.J.,
Berglund, M., Larsson, K., Jönsson, B.A.G., Biot, P., Casteleyn, L., Joas, R., Joas, A.,
Bloemen, L., Sepai, O., Exley, K., Schoeters, G., Angerer, J., Kolossa-Gehring, M.,
Fiddicke, U., Aerts, D., Koch, H.M., 2015. Urinary BPA measurements in children and
mothers from six European member states: overall results and determinants of ex-
posure. Environ. Res. 141, 77–85. http://dx.doi.org/10.1016/j.envres.2014.08.008.
Di Minno, A., Turnu, L., Porro, B., Squellerio, I., Cavalca, V., Tremoli, E., Di Minno,
M.N.D., 2016. 8-Hydroxy-2-deoxyguanosine levels and cardiovascular disease: a
systematic review and meta-analysis of the literature. Antioxid. Redox Signal. 24,
Diamanti-Kandarakis, E., Bourguignon, J.P., Giudice, L.C., Hauser, R., Prins, G.S., Soto,
A.M., Zoeller, R.T., Gore, A.C., 2009. Endocrine-disrupting chemicals: an Endocrine
Society scientiﬁc statement. Endocr. Rev. 30, 293–342. http://dx.doi.org/10.1210/
Euromonitor International, 2016a. Fragrances in Brazil [WWW Document] (URL). http://
www.euromonitor.com/fragrances-in-brazil/report, Accessed date: 15 November
Euromonitor International, 2016b. Beauty and Personal Care in Brazil [WWW Document]
Accessed date: 15 November 2016.
Ferguson, K.K., Cantonwine, D.E., McElrath, T.F., Mukherjee, B., Meeker, J.D., 2016.
Repeated measures analysis of associations between urinary bisphenol-A concentra-
tions and biomarkers of inﬂammation and oxidative stress in pregnancy. Reprod.
Toxicol. 66, 93–98. http://dx.doi.org/10.1016/j.reprotox.2016.10.002.
Franken, C., Koppen, G., Lambrechts, N., Govarts, E., Bruckers, L., Den Hond, E., Loots, I.,
Nelen, V., Sioen, I., Nawrot, T.S., Baeyens, W., Van Larebeke, N., Boonen, F., Ooms,
D., Wevers, M., Jacobs, G., Covaci, A., Schettgen, T., Schoeters, G., 2017.
Environmental exposure to human carcinogens in teenagers and the association with
DNA damage. Environ. Res. 152. http://dx.doi.org/10.1016/j.envres.2016.10.012.
Frederiksen, H., Nielsen, J.K.S., Mørck, T.A., Hansen, P.W., Jensen, J.F., Nielsen, O.,
Andersson, A.M., Knudsen, L.E., 2013. Urinary excretion of phthalate metabolites,
phenols and parabens in rural and urban Danish mother-child pairs. Int. J. Hyg.
Environ. Health 216, 772–783. http://dx.doi.org/10.1016/j.ijheh.2013.02.006.
Gao, L., Yuan, T., Zhou, C., Cheng, P., Bai, Q., Ao, J., Wang, W., Zhang, H., 2013. Eﬀects
of four commonly used UV ﬁlters on the growth, cell viability and oxidative stress
responses of the Tetrahymena thermophila. Chemosphere 93, 2507–2513. http://dx.
Gao, C.J., Liu, L.Y., Ma, W.L., Zhu, N.Z., Jiang, L., Li, Y.F., Kannan, K., 2015.
Benzonphenone-type UV ﬁlters in urine of Chinese young adults: concentration,
source and exposure. Environ. Pollut. 203, 1–6. http://dx.doi.org/10.1016/j.envpol.
Giulivo, M., de Alda, M., Capri, E., Barceló, D., 2016. Human exposure to endocrine
disrupting compounds: their role in reproductive systems, metabolic syndrome and
breast cancer. A review. Environ. Res. 151, 251–264. http://dx.doi.org/10.1016/j.
Guo, Y., Kannan, K., 2013. A survey of phthalates and parabens in personal care products
from the United States and its implications for human exposure. Environ. Sci.
Technol. 47, 14442–14449. http://dx.doi.org/10.1021/es4042034.
Hassan, Z.K., Elobeid, M.A., Virk, P., Omer, S.A., Elamin, M., Daghestani, M.H., Alolayan,
E.M., 2012. Bisphenol A induces hepatotoxicity through oxidative stress in rat model.
Oxidative Med. Cell. Longev. http://dx.doi.org/10.1155/2012/194829.
Health Canada, 2013. Second Report on Human Biomonitoring of Environmental
Chemicals in Canada: Results of the Canadian Health Measures Survey Cycle 2
(2009–2011). Health Canada, Ottawa, Ontario, Canada (Available at). http://www.
Heﬀernan, A.L., Baduel, C., Toms, L.M.L., Calafat, A.M., Ye, X., Hobson, P., Broomhall, S.,
Mueller, J.F., 2015. Use of pooled samples to assess human exposure to parabens,
benzophenone-3 and triclosan in Queensland, Australia. Environ. Int. 85, 77–83.
Hong, Y.C., Park, E.Y., Park, M.S., Ko, J.A., Oh, S.Y., Kim, H., Lee, K.H., Leem, J.H., Ha,
E.H., 2009. Community level exposure to chemicals and oxidative stress in adult
population. Toxicol. Lett. 184, 139–144. http://dx.doi.org/10.1016/j.toxlet.2008.11.
Hornik, K., Buchta, C., Zeileis, A., 2009. Open-source machine learning: R meets Weka.
Comput. Stat. 24, 225–232. http://dx.doi.org/10.1007/s00180-008-0119-7.
Hornung, R.W., Reed, L.D., 1990. Estimation of average concentration in the presence of
nondetectable values. Appl. Occup. Environ. Hyg. 5, 46–51.
Jimenez-Diaz, I., Vela-Soria, F., Rodriguez-Gomez, R., Zafra-Gomez, A., Ballesteros, O.,
Navalon, A., 2015. Analytical methods for the assessment of endocrine disrupting
chemical exposure during human fetal and lactation stages: a review. Anal. Chim.
Acta 892, 27–48. http://dx.doi.org/10.1016/j.aca.2015.08.008.
Jiménez-Díaz, I., Artacho-Cordón, F., Vela-Soria, F., Belhassen, H., Arrebola, J.P.,
Fernández, M.F., Ghali, R., Hedhili, A., Olea, N., 2016. Urinary levels of bisphenol A,
benzophenones and parabens in Tunisian women: a pilot study. Sci. Total Environ.
562, 81–88. http://dx.doi.org/10.1016/j.scitotenv.2016.03.203.
Kang, S., Kim, S., Park, J., Kim, H.J., Lee, J., Choi, G., Choi, S., Kim, S., Kim, S.Y., Moon,
H.B., Kim, S., Kho, Y.L., Choi, K., 2013. Urinary paraben concentrations among
pregnant women and their matching newborn infants of Korea, and the association
with oxidative stress biomarkers. Sci. Total Environ. 461–462, 214–221. http://dx.
Kato, T., Tada-Oikawa, S., Takahashi, K., Saito, K., Wang, L., Nishio, A., Hakamada-
Taguchi, R., Kawanishi, S., Kuribayashi, K., 2006. Endocrine disruptors that deplete
glutathione levels in APC promote Th2 polarization in mice leading to the exacer-
bation of airway inﬂammation. Eur. J. Immunol. 36, 1199–1209. http://dx.doi.org/
Kaufman, L., Rousseeuw, P.J., 1990. Finding Groups in Data. Wiley
Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ, USA.
Kelly, F.J., Fussell, J.C., 2017. Role of oxidative stress in cardiovascular disease outcomes
following exposure to ambient air pollution. Free Radic. Biol. Med. 110, 345–367.
Kunisue, T., Chen, Z., Buck Louis, G.M., Sundaram, R., Hediger, M.L., Sun, L., Kannan, K.,
2012. Urinary concentrations of benzophenone-type UV ﬁlters in U.S. women and
their association with endometriosis. Environ. Sci. Technol. 46, 4624–4632. http://
Larsson, K., Ljung Björklund, K., Palm, B., Wennberg, M., Kaj, L., Lindh, C.H., Jönsson,
B.A.G., Berglund, M., 2014. Exposure determinants of phthalates, parabens, bi-
sphenol A and triclosan in Swedish mothers and their children. Environ. Int. 73,
Lausch, A., Schmidt, A., Tischendorf, L., 2015. Data mining and linked open data–new
perspectives for data analysis in environmental research. Ecol. Model. 295, 5–17.
Li, X., Ying, G.G., Zhao, J.L., Chen, Z.F., Lai, H.J., Su, H.C., 2013. 4-Nonylphenol, bi-
sphenol-A and triclosan levels in human urine of children and students in China, and
the eﬀects of drinking these bottled materials on the levels. Environ. Int. 52, 81–86.
Liao, C., Kannan, K., 2014. Widespread occurrence of benzophenone-type UV light ﬁlters
in personal care products from China and the United States: an assessment of human
exposure. Environ. Sci. Technol. 48, 4103–4109. http://dx.doi.org/10.1021/
Liao, C., Liu, F., Alomirah, H., Loi, V.D., Mohd, M.A., Moon, H.B., Nakata, H., Kannan, K.,
2012a. Bisphenol S in urine from the United States and seven Asian countries: oc-
currence and human exposures. Environ. Sci. Technol. 46, 6860–6866. http://dx.doi.
Liao, C., Liu, F., Kannan, K., 2012b. Bisphenol S, a new bisphenol analogue, in paper
products and currency bills and its association with bisphenol A residues. Environ.
Sci. Technol. 46, 6515–6522. http://dx.doi.org/10.1021/es300876n.
Liao, C., Chen, L., Kannan, K., 2013. Occurrence of parabens in foodstuﬀs from China and
its implications for human dietary exposure. Environ. Int. 57–58, 68–74. http://dx.
Liaw, A., Wiener, M., 2002. Classiﬁcation and Regression by randomForest. R News 2. pp.
Lu, S., Li, Y.-X., Zhang, J.-Q., Zhang, T., Liu, G.-H., Huang, M.-Z., Li, X., Ruan, J.J.,
Kannan, K., Qiu, R.L., 2016. Associations between polycyclic aromatic hydrocarbon
(PAH) exposure and oxidative stress in people living near e-waste recycling facilities
in China. Environ. Int. 94, 161–169. http://dx.doi.org/10.1016/j.envint.2016.05.
Lv, Y., Lu, S., Dai, Y., Rui, C., Wang, Y., Zhou, Y., Li, Y., Pang, Q., Fan, R., 2017. Higher
dermal exposure of cashiers to BPA and its association with DNA oxidative damage.
Environ. Int. 98, 69–74. http://dx.doi.org/10.1016/j.envint.2016.10.001.
Ma, H., Zheng, L., Li, Y., Pan, S., Hu, J., Yu, Z., Zhang, G., Sheng, G., Fu, J., 2013.
Triclosan reduces the levels of global DNA methylation in HepG2 cells. Chemosphere
90, 1023–1029. http://dx.doi.org/10.1016/j.chemosphere.2012.07.063.
Ma, W.L., Wang, L., Guo, Y., Liu, L.Y., Qi, H., Zhu, N.Z., Gao, C.J., Li, Y.F., Kannan, K.,
2013. Urinary concentrations of parabens in Chinese young adults: implications for
human exposure. Arch. Environ. Contam. Toxicol. 65, 611–618. http://dx.doi.org/
Maione, C., de Oliveira Souza, V.C., Togni, L.R., da Costa, J.L., Campiglia, A.D., Barbosa,
F., Barbosa, R.M., 2017. Using cluster analysis and ICP-MS to identify groups of ec-
stasy tablets in Sao Paulo State, Brazil. J. Forensic Sci. 62, 1479–1486. http://dx.doi.
Marvuglia, A., Kanevski, M., Benetto, E., 2015. Machine learning for toxicity character-
ization of organic chemical emissions using USEtox database: learning the structure
of the input space. Environ. Int. 83, 72–85. http://dx.doi.org/10.1016/j.envint.2015.
Myridakis, A., Fthenou, E., Balaska, E., Vakinti, M., Kogevinas, M., Stephanou, E.G., 2015.
Phthalate esters, parabens and bisphenol-A exposure among mothers and their chil-
dren in Greece (Rhea cohort). Environ. Int. 83, 1–10. http://dx.doi.org/10.1016/j.
Naidu, R., Arias Espana, V.A., Liu, Y., Jit, J., 2016. Emerging contaminants in the en-
vironment: risk-based analysis for better management. Chemosphere 154, 350–357.
Popa, D.-S., Kiss, B., Vlase, L., Pop, A., Iepure, R., Pǎltinean, R., Loghin, F., 2011. Study of
oxidative stress induction after exposure to bisphenol A and methylparaben in rats.
Farmacia 59, 539–549.
Pycke, B.F.G., Geer, L.A., Dalloul, M., Abulaﬁa, O., Jenck, A.M., Halden, R.U., 2014.
Human fetal exposure to triclosan and triclocarban in an urban population from
Brooklyn, New York. Environ. Sci. Technol. 48, 8831–8838. http://dx.doi.org/10.
Reuter, S., Gupta, S.C., Chaturvedi, M.M., Aggarwal, B.B., 2010. Oxidative stress, in-
ﬂammation, and cancer: how are they linked? Free Radic. Biol. Med. 49, 1603–1616.
Reynolds, A.P., Richards, G., Rayward-Smith, V.J., 2004. The Application of K-Medoids
and PAM to the Clustering of Rules. pp. 173–178. http://dx.doi.org/10.1007/978-3-
Rocha, B.A., Azevedo, L.F., Gallimberti, M., Campiglia, A.D., Barbosa, F., 2015. High
levels of bisphenol A and bisphenol S in Brazilian thermal paper receipts and esti-
mation of daily exposure. J. Toxicol. Environ. Health, Part A 78, 1181–1188. http://
Rocha, B.A., Da Costa, B.R.B., De Albuquerque, N.C.P., De Oliveira, A.R.M., Souza,
J.M.O., Al-Tameemi, M., Campiglia, A.D., Barbosa, F., 2016. A fast method for bi-
sphenol A and six analogues (S, F, Z, P, AF, AP) determination in urine samples based
on dispersive liquid-liquid microextraction and liquid chromatography-tandem mass
spectrometry. Talanta 154, 511–519. http://dx.doi.org/10.1016/j.talanta.2016.03.
Rocha, B.A., Asimakopoulos, A.G., Barbosa, F., Kannan, K., 2017. Urinary concentrations
of 25 phthalate metabolites in Brazilian children and their association with oxidative
DNA damage. Sci. Total Environ. 586, 152–162. http://dx.doi.org/10.1016/j.
Rokach, L., 2016. Decision forest: twenty years of research. Inf. Fusion 27, 111–125.
Schebb, N.H., Inceoglu, B., Ahn, K.C., Morisseau, C., Gee, S.J., Hammock, B.D., 2011.
Investigation of human exposure to triclocarban after showering and preliminary
evaluation of its biological eﬀects. Environ. Sci. Technol. 45. http://dx.doi.org/10.
Scognamiglio, V., Antonacci, A., Patrolecco, L., Lambreva, M.D., Litescu, S.C., Ghuge,
S.A., Rea, G., 2016. Analytical tools monitoring endocrine disrupting chemicals. TrAC
Trends Anal. Chem. 80. http://dx.doi.org/10.1016/j.trac.2016.04.014.
Simoneau, C., Valzacchi, S., Morkunas, V., van den Eede, L., 2011. Comparison of mi-
gration from polyethersulphone and polycarbonate baby bottles. Food Addit.
Contam. Part A Chem. Anal. Control Expo. Risk Assess. 28, 1763–1768. http://dx.doi.
Smarr, M.M., Kannan, K., Buck Louis, G.M., 2016. Endocrine disrupting
chemicals and endometriosis. Fertil. Steril. 106, 959–966. http://dx.doi.org/10.
Tan, P.N., Steinbach, M., Kumar, V., 2005. Introduction to Data Mining. Addison-Wesley
Longman Publishing Co., Inc., Boston, MA.
Tavares, R.S., Escada-Rebelo, S., Correia, M., Mota, P.C., Ramalho-Santos, J., 2016. The
non-genomic eﬀects of endocrine-disrupting chemicals on mammalian sperm.
Reproduction 151, R1–R13. http://dx.doi.org/10.1530/REP-15-0355.
Tsai, W.P., Huang, S.P., Cheng, S.T., Shao, K.T., Chang, F.J., 2017. A data-mining fra-
mework for exploring the multi-relation between ﬁsh species and water quality
through self-organizing map. Sci. Total Environ. 579, 474–483. http://dx.doi.org/10.
Wang, L., Kannan, K., 2013. Characteristic proﬁles of benzonphenone-3 and its deriva-
tives in urine of children and adults from the United States and China. Environ. Sci.
Technol. 47, 12532–12538. http://dx.doi.org/10.1021/es4032908.
Wang, L., Wu, Y., Zhang, W., Kannan, K., 2012. Widespread occurrence and distribution
of bisphenol A diglycidyl ether (BADGE) and its derivatives in human urine from the
United States and China. Environ. Sci. Technol. 46, 12968–12976. http://dx.doi.org/
Wang, L., Wu, Y., Zhang, W., Kannan, K., 2013. Characteristic proﬁles of urinary p-hy-
droxybenzoic acid and its esters (parabens) in children and adults from the United
States and China. Environ. Sci. Technol. 47, 2069–2076. http://dx.doi.org/10.1021/
Watkins, D.J., Ferguson, K.K., Anzalota Del Toro, L.V., Alshawabkeh, A.N., Cordero, J.F.,
Meeker, J.D., 2015. Associations between urinary phenol and paraben concentrations
and markers of oxidative stress and inﬂammation among pregnant women in Puerto
Rico. Int. J. Hyg. Environ. Health 218, 212–219. http://dx.doi.org/10.1016/j.ijheh.
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J., 2016. Data Mining: Practical Machine
Learning Tools and Techniques, 4th ed. Morgan Kaufmann.
Woodruﬀ, T.J., 2015. Making it real–the environmental burden of disease. What does it
take to make people pay attention to the environment and health? J. Clin. Endocrinol.
Metab. 100, 1241–1244. http://dx.doi.org/10.1210/jc.2015-1622.
Xue, J., Wu, Q., Sakthivel, S., Pavithran, P.V., Vasukutty, J.R., Kannan, K., 2015. Urinary
levels of endocrine-disrupting chemicals, including bisphenols, bisphenol A diglycidyl
ethers, benzophenones, parabens, and triclosan in obese and non-obese Indian
children. Environ. Res. 137, 120–128. http://dx.doi.org/10.1016/j.envres.2014.12.
Ye, X., Wong, L.Y., Zhou, X., Calafat, A.M., 2014. Urinary concentrations of 2,4-
dichlorophenol and 2,5-dichlorophenol in the U.S. population (National Health and
Nutrition Examination Survey, 2003–2010): trends and predictors. Environ. Health
Perspect. 122, 351–355. http://dx.doi.org/10.1289/ehp.1306816.
Ye, X., Wong, L.Y., Kramer, J., Zhou, X., Jia, T., Calafat, A.M., 2015. Urinary concentrations
of bisphenol A and three other bisphenols in convenience samples of U.S.
adults during 2000–2014. Environ. Sci. Technol. 49, 11834–11839. http://dx.doi.
Ye, X., Wong, L.-Y., Dwivedi, P., Zhou, X., Jia, T., Calafat, A.M., 2016. Urinary concentrations
of the antibacterial agent Triclocarban in United States residents:
2013–2014 National Health and Nutrition Examination Survey. Environ. Sci.
Technol. 50, 13548–13554. http://dx.doi.org/10.1021/acs.est.6b04668.
Zeng, L., Ma, H., Pan, S., You, J., Zhang, G., Yu, Z., Sheng, G., Fu, J., 2016. LINE-1 gene
hypomethylation and p16 gene hypermethylation in HepG2 cells induced by low-
dose and long-term triclosan exposure: the role of hydroxyl group. Toxicol. in Vitro
34, 35–44. http://dx.doi.org/10.1016/j.tiv.2016.03.002.
Zhang, T., Xue, J., Gao, C.Z., Qiu, R.L., Li, Y.X., Li, X., Huang, M.Z., Kannan, K., 2016.
Urinary concentrations of bisphenols and their association with biomarkers of oxidative
stress in people living near E-waste recycling facilities in China. Environ. Sci.
Technol. 50, 4045–4053. http://dx.doi.org/10.1021/acs.est.6b00032.
Zidek, A., Macey, K., MacKinnon, L., Patel, M., Poddalgoda, D., Zhang, Y., 2017. A review
of human biomonitoring data used in regulatory risk assessment under Canada's
chemicals management program. Int. J. Hyg. Environ. Health 220, 167–178. http://