Williams et al.
Environmental Systems Research (2024) 13:53
https://doi.org/10.1186/s40068-024-00384-1
RESEARCH Open Access
© The Author(s) 2024, corrected publication 2025. Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any
medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons
licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material
derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons
licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and
your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Application of a modified set of GoogLeNet
and ResNet-18 convolutional neural networks
towards the identification of environmentally
derived-MPLs in the Yadkin-Pee Dee river basin
Wesley Allen Williams1*, Anirudh Arunprasad2 and Shyam Aravamudhan1
Abstract
Microplastic (MPL) abundance is a well-elucidated problem in the marine environment but less so
in the terrestrial environment. To help address this research gap, a field study was performed in the Yadkin-Pee Dee
River Basin. Because MPLs are heterogeneous and difficult to characterize, a diverse set of pictorial
training data from µ-Raman was used to perform transfer learning on 2 CNNs of interest: GoogLeNet (GN)
and ResNet-18 (RN). In the first trial, using 10% of the initial training dataset, the CNNs exhibited high
accuracy rates, generally above 90%. Irrespective of spectroscopic mode, marginal improvements in accuracy rates
were seen, with the best improvements occurring in the Raman-based models (U[GN(FTIR), RN(Raman), GN(FTIR),
and RN(Raman)]: 39, 42, 38.5, and 34.5; p-value: 1, 0.6753, 0.9719, and 0.4978). However, for the external trial, pictorial
data from Primpke (FTIR) and DongMiller (Raman) were predicted less accurately, with the largest loss occurring
across the following sets: U[GN(Raman) and RN(FTIR)], 45.5 and 35; p-value: 0.3268 and 0.5476. However, set RN fared
marginally better, and owing to the usage of µ-Raman and its performance in the 10% trial, RN18_ADAM_.0011
was selected as the champion model for the field study data. In the unknown microparticle (MP) trial,
the most frequently identified polymer types were generally CA, PET, and PE, representing relative concentration
ranges for a given water source and area (MPL/MP) of 4.17–37.5%, 4.17–8.33%, and 4.17–8.33% for the CNN and
OpenSpecy (OS). A FEDS algorithm, equipped with natural and synthetic polymer standards and biological material
and used to compare the strength of each model, determined a similar frequency of positive MPL results across
both models, with corroboration between the CNN and OS around 1/3 of the time. Results indicate that the models
detect MPLs with similar frequency, elucidating the comparable strength of the CNN as well as a focus on particle
type distribution rather than individual identification. Moreover, the most influential factor in this study appears
to be either laundry wastewater effluent or atmospheric deposition, which is stressed as a primary focus for
remediating MPLs in similar freshwater environments. Lastly, it appears that these MPLs are of primary origin,
as opposed to the secondary origin seen in oceanic and coastal environments.
Keywords Microplastics, Primary microplastics, Neural networks, Artificial intelligence, μ-Raman
*Correspondence:
Wesley Allen Williams
wawilliams1@aggies.ncat.edu
Full list of author information is available at the end of the article
Introduction
Spectroscopic characterization of MPLs from the environment varies considerably
in ease due to varying application and length of exposure to biodegradation,
photodegradation, chemical degradation, and thermal degradation (Liu et al. 2022).
In the more facile cases (Liu et al. 2022), spectral changes are subtle: relative to
the pristine polymer, the MPL spectrum includes oxidized species and general
absorbance. One can still see the original spectrum, which allows the correct
identification. In more difficult cases (Dong et al. 2020), massive inclusions of
broad peaks in generally uncharacterizable areas of the spectra begin to obscure
the original peaks of the polymer, meaning corroborative analysis and more
complex techniques are needed. For example, general autofluorescence can be seen
for MPLs, which obscures the spectra beyond recognition (Rytelewska and
Dąbrowska 2022). Polymers like PVC undergo substantial changes to their
molecular structure due to polyene formation: under UV radiation, HCl is
eliminated, resulting in conjugated double bonds (Fernández-González et al. 2022;
Khoshnoud and Abu-Zahra 2018). This gives the polymer a spectrum resembling
that of polyethylene, with sp2/sp3 hybridized C-H bonds. Moreover (Decker 1984),
the process may make the polymer more susceptible to photodegradation since
those bond types are more active in this regime of EM radiation. The double bonds
can facilitate the scavenging of free radicals. Indeed, spectral changes could occur
for many different types of polymers owing to byproducts elucidated in other
studies (Wang et al. 2023; Qian et al. 2011; Yousif and Haddad 2013; Nakatani et al.
2021; Zhang et al. 2021; Wochnowski et al. 2000, 2005; Tuna and Benkreira 2018;
Gonçalves et al. 2007; Al-Abri et al. 2019). Note: Some studies may not
reflect environmental conditions.
In regard to MPL additive presence, spectra exhibit bands from multiple
chemical species. Referring back to the PVC report (Fernández-González et al. 2022),
one can see the massive difference between PVC spectra in the pristine and
industrially derived forms, where flame retardants, plasticizers, and UV stabilizers
are added to improve the lifetime and mechanical properties of the material.
Colorants and pigment additives within MPL matrices obscure spectra to
significant degrees (Nava et al. 2021). Most, if not all, thermoplastics and
thermosets possess an array of additives within their matrices. Moreover, further
complications may arise when these additives begin to degrade in addition to
the polymer. Indeed, microbial degradation appears to dominate on this front
owing to marine bacteria, like Mycobacterium houstonense or Halomonas
campaniensis, which produce a variety of metabolites for the β-ketoadipate
degradation pathway and glyoxylate cycle from PET-derived phthalic acid esters
(Wright et al. 2020). Other microbial genera, like Desulfovibrio, Devosia, and
Gordonia, possess potent potential to degrade PVC-derived phthalates (e.g., DEHP
and DINP) by way of nitrate reduction, sulfate reduction, fermentation, and
general aerobic degradation (Panthi et al. 2024). Compared to abiotic
mechanisms, the unreacted form was nearly depleted. This is plausible because
bacteria tend to colonize and form biofilms on MPL surfaces in the marine
environment (Halle and Ghiglione 2021), thus facilitating additive degradation.
Efforts to accommodate environmentally degraded MPLs from the
computational reconciliation standpoint aid the subfield. One such report
(Beck et al. 2023) utilized a Near-Infrared Hyperspectral Imaging technique,
which aids in differentiating the spectra of plastic polymers from those of
natural objects (organic detritus). Though the technique is limited to particles
300 µm and larger, the method is cost-effective relative to traditional IR, with
reliably discriminatory analysis. Focal plane array (FPA)-based results offer
mapping of particles in finer detail; however, the resolution may eclipse
particle sizes in the lower micrometer regime (Ivleva 2021). Other techniques
(Dong et al. 2022), such as LDIR-particle analysis, possess benefits in speed and
MPL size measurement, while Optical Photothermal Infrared (O-PTIR)
(Dong et al. 2022) is better in terms of detection limit and general identification
of MPLs. O-PTIR is an interesting spectroscopic mode that allows the
identification of MPLs on the smaller side of the micrometer regime
(limit: 0.5 µm); however, there is no automated process for particle
characterization en masse. The technique is illustrated in one report
(Böke et al. 2022): mid-IR radiation is pulsed onto a sample, producing
photothermal effects. Local heating and thermal expansion produce a change
in the refractive index that is measured via a visible probe laser, from which
the IR response is acquired.
Other computational efforts, such as the use of Deep Learning (DL) or
artificial intelligence (AI), were of interest to the work of the report. Machine
learning is a form of artificial intelligence in which an algorithm is trained on
existing datasets, influencing its decisions on predictions for new data (e.g.,
medical diagnostics for patients with suspected COVID-19 with a minimum
accuracy of ~80% (Panjeta et al. 2023)). DL is a set of various tools, chiefly the
CNN (Yamashita et al. 2018), which possesses a set of layers designed to process
2D data (images). Significant features are noted in these layers, whereby the
model can abstract the identification of a particular image from patterns
“imprinted” on the layers. The error rate is incorporated into the loss function,
which is backpropagated for the fine-tuning of the layers’ weights. One of the
last layers is the “fully connected” layer, which maps the feature
extractions for classification with remarkable swiftness in
identifying significant features in unsupervised learning.
Kernels in a given layer give the next layer information
on a significant feature, aiding in classification of
minute details. In this report, CNNs are applied towards the
identification of environmentally degraded MPLs.
Accuracy rates in this area of the literature appear to
be quite strong. One report, comparing a set of CNNs
to Subspace k-Nearest Neighbors (SubKNN), observed a high
accuracy rate for MPL polymers, ~96% to ~100%, with
the SubKNN producing the highest accuracies at a
sample size of 40. In contrast, the second strongest
model, “1D-CNN,” produced the maximum accuracy seen
in the study with a sample size of 100 per class. Thus,
SubKNN appears to be better suited for cost-effective
computation, while the NN may handle a larger dataset.
Potentially, SubKNN could accidentally overfit new
samples due to needing a lower number of samples per
class. A subsequent example report (Zhu et al. 2023)
designed its own CNN, “PlasticNet”, in the context of
environmentally degraded MPLs. Their CNN was trained
on over 8,000 pristine samples from 11 different plastic
polymer types using FPA-µFTIR imaging. The model was
further trained with a dataset of MPLs with additives
and weathering. They found that retraining the models
resulted in a net gain of accuracy for green, glass
fiber-reinforced, flame-retardant-containing, and calcium
carbonate PP MPLs. Red PP MPLs received reduced
accuracy in the model. Overall, the accuracy rates of the
model were above 92%.
In this report, GoogLeNet and ResNet-18 are used
to identify environmentally derived MPLs.
GoogLeNet (GoogLeNet and Explained 2024) is a
Google variant of the “Inception Network” that is 22
layers deep and uses a parallel convolution method that
creates a richer representation of the image. The model
also uses a 1 × 1 convolution, reducing the dimensionality
of the immediate input and reducing the parameters used.
Various other filters are used in the layers, aiding the
model in abstracting a wider array of features. ResNet-18
(Ramzan et al. 2019) is a CNN with 18 layers (hence its
suffix), the first 17 being the convolutional layers and
the last being a fully-connected layer tied to a softmax
(for classifying output). The layer handles input and
output of varying or similar dimensions between a few
layers for proper identity mapping. Throughout the
layers, the information is often downsampled, meaning
the information complexity is decreased to ease
computational effort and to detail minute features in
an image. In the context of transfer learning, the models
can be updated towards a new dataset that takes on
the characteristic features of the classes of the user’s
choice. The following methodology will outline the
procurement of MPL samples from the field study as
well as the generation of the training datasets, from spectral
data to pictorial data, to perform the transfer learning.
Immediately following, the makeup of the training
dataset will be illustrated as well as the hardware and
software supporting the analysis of the report. The goals
of this study are to select a champion model from an
array of 48 CNNs’ predictions against a 10% holdout and
external dataset via various combinations of learning
methods and learning rates. The model’s strength will go
on to be studied against OpenSpecy, which will be carried
out by calculating corroboration from a FEDS algorithm
with a standard set of polymers and biological materials
(potential contaminants). The results will indicate the
MPL polymer type distribution across various water
sources in the river basin as well as relative particle
concentrations that give specificity to the abundance and
ubiquity of the particles, thus redirecting remediation
efforts accordingly.
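The parameter saving attributed above to GoogLeNet's 1 × 1 convolution can be illustrated by counting convolution weights. The following is a minimal Python sketch rather than anything taken from the study's MATLAB scripts: the channel sizes (192 input, 16 reduced, 32 output) and the 5 × 5 kernel are illustrative assumptions, and biases are ignored.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


# Illustrative channel sizes (assumed for this sketch, not taken from the study).
C_IN, C_REDUCED, C_OUT = 192, 16, 32

# A 5 x 5 convolution applied directly to the input channels.
direct = conv_weights(5, C_IN, C_OUT)

# The same 5 x 5 convolution preceded by a 1 x 1 convolution that first
# reduces the number of channels.
bottleneck = conv_weights(1, C_IN, C_REDUCED) + conv_weights(5, C_REDUCED, C_OUT)

print("direct 5x5 weights:   ", direct)
print("1x1 then 5x5 weights: ", bottleneck)
```

With these assumed sizes, the path that goes through the 1 × 1 convolution needs roughly a tenth of the weights of the direct 5 × 5 convolution.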
Methodology
As stated before, this section covers how the study gathered
unidentified MPs. First, the acquisition of suspected MPL particles
from the Yadkin-Pee Dee River Basin is discussed. Next,
sample filtration and dropcasting for microspectroscopic
characterization of unknown particles are described. Lastly,
the makeup of the neural networks’ training data is discussed. The data,
scripts associated with this data, and workspaces used
to generate the results are available for download in the
Supplementary Information section in .txt and .m/.mat
format.
Field Study
Triplicates of 50 mL glass vials of surface water (each vial was
submerged slowly until its lip was level with the water surface) were
taken from the upper ~20% of the river basin before the
High Rock Lake reservoir (Note: triplicates were chosen over
other "n-tuplicates" to balance increasing the
reliability and power of the results and minimizing
random error against minimizing time of completion).
3 areas (Fig. 1) were chosen alongside the Yadkin River,
encapsulating the North Carolinian cities and towns
of Elkin, Winston-Salem, Clemmons, Bermuda Run,
Linwood, and Salisbury. Triplicate sampling (Table 1)
was executed at 4 sub-areas for drinking water (DW), tap
water (TW), river water (RW), and wastewater treatment
plant effluent (WW). Coordinates and elevation
measurements were taken for analysis, typically via
the Compass application in iOS v. 17.5.1. For
WW1, WW2, and TW2, geographic data were gathered from other
sources (Margulies 2024; Get Lat Long from Address 2024).
Fig. 1 Top: Locations of the field study conducted along the upper portion of the Yadkin-Pee Dee River Basin. Bottom: Photographs 1, 2, and
3 are the sampling locations for the RW where: 1. is the riverside near Crater Park in Elkin, NC; 2. is the riverside underneath the I-40 overpass
near Truist Soccer Park in Clemmons, NC; and 3. is the riverside fishing hole of Yadkin River Park in Linwood, NC (Reding 2021)
Separation Protocol
A total of 36 samples, 9 controls (CA, LDPE, HDPE,
PP, PS, PET, PMMA, PA, and PVC), and 8 controls
(QA after each session of separation) were processed to extract
microparticles, loosely following an oleophilic protocol
(Crichton et al. 2017). First, 50 mL of the sample was
added to a 250 mL Erlenmeyer flask (or 50 mL of diH2O
was added to the solid controls). For the solid controls,
~0.05 g of pristine MPL was added and the mixture vigorously
shaken. The samples were swirled in the Erlenmeyer flask
to break up aggregates, if any. Next, 5 mL of canola oil
(500 g, amber glass bottle, FisherScientific) was added
to the flask and swirled for approximately 30 seconds.
The oil fraction was allowed to settle for 2 minutes,
after which the mixture was decanted into a 500 mL
borosilicate glass separatory funnel (Figure S1). The
Erlenmeyer flask was rinsed 3 times with 50 mL of diH2O
and decanted into the separatory funnel. The funnel was
shaken vigorously for 30 sec and allowed to settle for
2 min (for silt/sand, allow for 20 min). The sediment
and water layer (bottom aqueous layer) was discarded
as waste. Subsequently, 200 mL of diH2O was added to
the separatory funnel. The solution was shaken again
for 30 sec and allowed to settle for 2 min. As
before, the aqueous layer was discarded. Next, 10 mL of
a mixed alcohol solution (90% EtOH, 5% IPA, and 5%
MeOH glacial) was added to the separatory funnel and
shaken for 30 sec. The fractions were allowed to settle
for 10 min due to alcohol’s miscibility in water (deviation
from Crichton starts here). Next, the alcohol layer was
removed into another 50 mL glass vial for spinning
and the oil layer was discarded. Using an Allegra®
X-15R Centrifuge, eight 15 mL polypropylene conical tubes
were prepared and used to concentrate the extracted
MPLs. Using glass Pasteur pipettes, to minimize plastic
introduction into the sample, the oil and alcohol layers
were separated (over the course of a week or so, the layers
become completely separated by density) into 2 vials
corresponding to a sampling area. 4 subsamples were
taken and spun at 4,500 RPM for 10 min. Afterwards,
the samples were deposited drop-wise (1–3 drops) on
an aluminum-coated glass slide for characterization (8
areas per slide) (Figure S2). Only the alcohol on top of
the oil/alcohol boundary and a portion of the bottom
of both vials were used for analysis (pelagic MPLs tend
to be denser (0.88–1.01 g·cm−3 (Borges-Ramírez et al.
2020)) than alcohol (EtOH: ~0.79 g·cm−3 (Ferner and
Chambers 2021)) but are sometimes more or less dense than
oil (~0.91 g·cm−3 (Eskin et al. 2003))).
Spectroscopic Characterization
Using the WiTec alpha300 R Confocal Raman Microscope
(µ-Raman), 424 particles were scanned sequentially.
Of these, 149 particles yielded spectra that were neither totally
obscured by fluorescence nor incredibly weak
(i.e., identical in appearance to the background
spectra taken for each particle under investigation).
Note: Images and spectral data for the unknown MPs
are found in the SI. The acquisition time was set
to 5 seconds with an accumulation of 5. A power of 2 to 5 mW
was used under 532 nm irradiation. A 600 g/mm grating
was used at a spectral center of 1,100 cm−1 in order to
capture a range of wavenumbers from 200 to 2,000 cm−1.
The characteristic dimensions of these particles were
generally below 20 µm, necessitating the use of µ-Raman
due to its superior resolution limit of ~1 µm (Nava et al.
2021). Raw, unprocessed MPL control spectra
(Fig. 2) were scanned as an elimination criterion
set outside of the controls. The images of these controls
were captured as a visual elimination criterion set (Figure
S3).
The 149 samples were background subtracted,
baseline corrected to polynomial order 8, and
linearly interpolated to a resolution of 8 cm−1.
Baseline correction zeroed negative intensities. The
ultimate range used in the study for Raman spans
224–1992 cm−1. Of the 149 MPs detected, 97 remained
after exclusion of controls and spikes in the sample. Relative
concentrations were calculated by dividing the number of “hits”, or polymer
matches, in a given area on the sample slide (8 areas per
slide) by the 8 particles scanned there (with credence
to uniform scanning). 53 areas were drawn, with 36 belonging to the field
study (4 water sources per area per triplicate sample), 8
belonging to the controls, and 9 belonging to the spikes.
The concentrations were averaged such that the hits
were divided by 24 for a given water source per area. The
relative concentrations were used to determine MPL type
by polymer and water source per area as well as general
relative concentrations determined by water source and
study area.

Table 1 Legend-Guide for Field Study with De-Identified Coordinates, Altitude, City, and Population Density of Sampling
Area and Sub-Area (Places, and Economy 2024) (Margulies 2024) (Get Lat Long from Address 2024)
Sub-Area        Coordinates          Altitude (ft)   City            Pop. Dens. (per sq. mi.)
DW1 (1,2,3)     36°15’N, 80°51’W     900             Elkin           607.07
TW1 (1,2,3)     36°15’N, 80°51’W     900             Elkin           607.07
RW1 (1,2,3)     36°15’N, 80°51’W     870             Elkin           607.07
WW1 (1,2,3)     36°15’N, 80°50’W     1031            Elkin           607.07
DW2 (1,2,3)     36°1’N, 80°24’W      750             Clemmons        1772.15
TW2 (1,2,3)     36°N, 80°26’W        720             Bermuda Run     1250.5
RW2 (1,2,3)     36°1’N, 80°25’W      640             Clemmons        1772.15
WW2 (1,2,3)     36°N, 80°20’W        787             Winston-Salem   1868.82
DW3 (1,2,3)     35°43’N, 80°26’W     650             Salisbury       1613.47
TW3 (1,2,3)     35°43’N, 80°26’W     651             Salisbury       1613.47
RW3 (1,2,3)     35°43’N, 80°23’W     600             Linwood         184.2
WW3 (1,2,3)     35°43’N, 80°26’W     630             Salisbury       1613.47

Lastly, pie chart distributions were generated
to better visualize the hits by polymer type in the report’s
champion neural network, OpenSpecy (Cowger et al.
2021), and a separate FEDS algorithm (Palencia 2020)
used to assess the validity of both models by recovering
“hidden” peaks of interest within broad bands. The FEDS
algorithm used a custom database consisting of Raman
spectra from various sources of polymers (Puchowicz
and Cieslak 2021; Katsara et al. 2021; Furukawa et al.
2006; Vieira et al. 2023; Chen et al. 2019; Al et al. 2019;
Sánchez-Márquez et al. 2015) as well as biological
samples (Miller et al. 2022), like biologically-related
cellulose, charcoal, bone, chitin, collagen, myofibrillar
protein, calcium carbonate, keratin, dentin, and algin (to
further control the study).
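The bookkeeping behind the relative concentrations described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own script: it assumes the denominator of 24 is the 8 particles scanned per slide area multiplied by the 3 replicate samples, and the hit counts passed in (1, 2, and 9) are illustrative inputs chosen because they reproduce the 4.17%, 8.33%, and 37.5% values quoted in the abstract.

```python
# Counts stated in the text: slide areas per group and particles scanned per area.
AREAS_FIELD_STUDY = 36
AREAS_QA_CONTROLS = 8
AREAS_SPIKES = 9
PARTICLES_PER_AREA = 8
REPLICATES = 3  # triplicate sampling


def relative_concentration(hits):
    """Percent of scanned particles matched to a polymer (MPL/MP) for one
    water source in one study area, pooled over the replicate samples."""
    scanned = PARTICLES_PER_AREA * REPLICATES  # assumed origin of the 24
    return 100.0 * hits / scanned


total_areas = AREAS_FIELD_STUDY + AREAS_QA_CONTROLS + AREAS_SPIKES
total_particles = total_areas * PARTICLES_PER_AREA
print("slide areas:", total_areas, "particles scanned:", total_particles)

for hits in (1, 2, 9):  # illustrative hit counts
    print(hits, "hit(s):", round(relative_concentration(hits), 2), "%")
```

The totals agree with the text: 53 slide areas at 8 particles each give the 424 particles scanned.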
Yadkin-Pee Dee river basin modeling
The abundance (relative MPL concentration) by area
was interpolated to 20 points
using a cubic spline over the mean altitude and
population density captured at each area in the field
study. The data were then used to form models via multiple
linear regression, multiple polynomial regression, and
genetic programming (via GPTIPS) (Searson 2009),
where the champion, multiple linear regression, was
used to ascertain the potential MPL abundance across
the entire river basin. Demographic and geographic data
provided by the 2020 U.S. Census (Places, and Economy
2024) and a free coordinate point tool (Get Lat Long from
Address 2024) were used to generate the predictions.
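The per-area predictors mentioned above (mean altitude and mean population density) can be derived directly from Table 1. The Python sketch below covers only that first step and assumes that each area's mean is taken over its four sub-areas (DW, TW, RW, WW); the spline interpolation to 20 points and the regression models themselves are not reproduced here.

```python
from statistics import mean

# (sub-area, altitude in ft, population density per sq. mi.) copied from Table 1.
TABLE_1 = {
    1: [("DW1", 900, 607.07), ("TW1", 900, 607.07),
        ("RW1", 870, 607.07), ("WW1", 1031, 607.07)],
    2: [("DW2", 750, 1772.15), ("TW2", 720, 1250.5),
        ("RW2", 640, 1772.15), ("WW2", 787, 1868.82)],
    3: [("DW3", 650, 1613.47), ("TW3", 651, 1613.47),
        ("RW3", 600, 184.2), ("WW3", 630, 1613.47)],
}


def area_predictors(rows):
    """Mean altitude and mean population density over an area's sub-areas."""
    altitude = mean(alt for _name, alt, _dens in rows)
    density = mean(dens for _name, _alt, dens in rows)
    return altitude, density


predictors = {area: area_predictors(rows) for area, rows in TABLE_1.items()}
for area, (altitude, density) in predictors.items():
    print(f"Area {area}: mean altitude {altitude:.2f} ft, "
          f"mean population density {density:.2f} per sq. mi.")
```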
Fig. 2 Collection of unprocessed controls as an exclusion criterion set. (Left to Right) First Row: CA, LDPE, HDPE, and PP. Second Row: PS, PET,
PMMA, PA. Third Row: PVC, Oil, Marker, and Alcohol. Fourth Row: Dust_AirborneParticle1 – 4. Fifth Row: Dust_AirborneParticle5 – 8
Data formatting/data procurement
The training dataset (Table 2), generated in Version 1
(V1), comprised 5,400 spectra (2,700 each from FTIR and Raman),
of which 66% were synthetically derived. The dataset
consisted of pristine FTIR/Raman spectra from 9 plastic
polymers, 100 spectra each: CA, PE (LDPE/HDPE), PP,
PS, PET, PMMA, PA, and PVC. In addition, MPLs of
heterogeneous origin and environmental origin were
included from FLOPP/FLOPP-E (Frond et al. 2021)
and SLOPP/SLOPP-E (Munno et al. 2020). Pristine
samples were deposited, sparingly, on gold slides.
For PS, a film was formed on the slide. The RaptIR
µ-FTIR system (ThermoFisher Scientific, Nicolet RaptIR + FTIR
Microscope) was used to facilitate the spectral
acquisition. For µ-Raman, the Horiba XploRA Raman
Confocal Microscope was used, with a glass slide instead.
Samples were irradiated with a 532 nm laser for an
acquisition time of 5 seconds with a 600T grating.
Accumulation, binning, FLAT correct, hole, and slit were
set to “5”, “1”, “ON”, “300”, and “100”, respectively. More
information on the product origins of the pristine MPLs
can be requested.
V2 is an amendment to V1 in that additional synthetic
spectra from environmentally weathered microplastics
were generated from Cowger and Davidson (FTIR)
(Cowger et al. 2020; Davidson et al. 2023) and Cowger
and Miller (Raman) (Cowger et al. 2020; Miller et al.
2022), totaling 9,000 spectra (90% synthetically derived).
The testing dataset for FTIR was from Primpke (99
samples) and, for Raman, from Dong/Miller (124 samples;
an excised portion of PS from Miller was used to represent
more classes in the external dataset test). Note: The
excised portion from Miller was excluded from the training of the
neural network. OpenSpecy processing was
done on all samples, with substitutions (S) possessing
no background correction. All FTIR and Raman spectra
were reduced to 680–4000 cm−1 and 224–1992 cm−1,
respectively, because interpolation shortened the original
domain of the data and because of the limits of the external
databases. A caveat in the training of models arose
when the MiniBatch size used in V2 was set to 32 instead of
the default, 128 (Note: Although the lower batch size
was chosen to stave off crashes, there may be an increase
in the variability of the calculated accuracies because highly
variable file sets affect the loss
function adjustment more noticeably than with the default). Because the
dataset for V2 is 9,000 images in size (4,050 processed
at a time, due to the 10% holdout), the memory of MATLAB
Online was exceeded (> 8 GB). This crashes the program;
therefore, processing 25% of the original batch at a time
for training purposes circumvents the issue. This issue
increased the training time ~2.5-fold
on the desktop hardware.
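The dataset sizes quoted above follow from simple arithmetic, sketched here in Python (not the study's MATLAB code). The sketch assumes each version's spectra are split evenly between FTIR and Raman and that 10% of each mode is held out; counting only complete mini-batches per pass is a further assumption made for illustration.

```python
def split_sizes(total_spectra, modes=2, holdout_percent=10):
    """Return (images per mode, training images, held-out images)."""
    per_mode = total_spectra // modes
    held_out = per_mode * holdout_percent // 100
    return per_mode, per_mode - held_out, held_out


def full_batches(n_train, batch_size):
    """Number of complete mini-batches in one pass over the training images."""
    return n_train // batch_size


v1 = split_sizes(5400)
v2 = split_sizes(9000)
print("V1 (per mode, train, holdout):", v1)
print("V2 (per mode, train, holdout):", v2)

DEFAULT_BATCH, REDUCED_BATCH = 128, 32
print("reduced batch as a fraction of the default:", REDUCED_BATCH / DEFAULT_BATCH)
print("complete mini-batches per pass (V2):",
      full_batches(v2[1], DEFAULT_BATCH), "at 128 vs",
      full_batches(v2[1], REDUCED_BATCH), "at 32")
```

The V2 training count of 4,050 images per mode and the 25% batch fraction both match the figures in the text.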
Table 2 The sample sizes of classes for training and the external testing dataset for Versions 1 and 2 (V1/V2) of GoogLeNet and
ResNet-18 (F: FTIR, R: Raman) (S: Substitution from FLOPP-E (FPE) and SLOPP-E (SPE)) (Frond et al. 2021) (Munno et al. 2020) (Cowger et al.
2020) (Davidson et al. 2023) (Primpke et al. 2020) (Dong et al. 2020) (Miller et al. 2022)
MPL Polymer   Pristine (F/R)   FP-SP   FPE-SPE   Cowger(F)-Cowger(R)   Davidson(F)-Miller(R)   Primpke(F)-DongMiller(R)
FTIR (F)
CA            100              14      14        100 (S)               100 (S)                 9
HDPE          100              19      49        17                    65                      10
LDPE          100              19      49        17                    65                      12
PP            100              19      66        6                     29                      12
PS            100              14      12        15                    84                      7
PET           100              25      22        1                     80                      12
PMMA          100              6       2         100 (S)               1                       7
PA            100              12      6         100 (S)               27                      23
PVC           100              9       9         1                     85                      7
Raman (R)
CA            100              20      8         100 (S)               2                       0
HDPE          100              19      13        19                    2                       31
LDPE          100              19      13        19                    1                       31
PP            100              17      16        14                    4                       44
PS            100              11      9         1                     4                       2 (Miller)
PET           100              19      13        100 (S)               1                       10
PMMA          100              11      3         100 (S)               1                       0
PA            100              7       4         100 (S)               3                       2
PVC           100              11      3         100 (S)               1                       4
Again, the pictorial spectra generated by the first script
are available in their entirety in the Supplementary Information
section. As an example (Fig. 3), a subset of images was
taken from PP’s class across the entire training and testing
dataset to demonstrate the diversity of the samples. One
can see that dimensions, axis labeling, color, and width
have been kept as similar as possible to focus the CNNs’
feature extraction on the spectra. For dissimilar photos,
it is thought that the CNNs deprioritize features outside
of the spectra’s morphology due to similarity across
classes. Nevertheless, all pictorial data is augmented for
a homogeneous resolution. Note: Details on the code
used to generate the data and train the neural networks are
available in the SI.
Results
10% holdout data trial
In order to judge the strength of the trained models,
a randomized 10% holdout of the training dataset for
both V1 and V2 was generated for comparison. Thus,
the batch prediction accuracies (Table 3), calculated
from the confusion matrices generated (SI), gave a sense
of how the models differ with respect to spectroscopic
mode, version, and learning parameters.
Fig. 3 Example set (PP) of pictorial data, generated from synthetic data used to train CNNs. A-E, Raman. A. Pristine µ-Raman samples, B. SLOPP, C.
SLOPP-E, D. Cowger, and E. Davidson. F-J, FTIR. F. Pristine µ-FTIR samples, G. FLOPP, H. FLOPP-E, I. Cowger, and J. Miller
Table 3 Accuracy rates of 48 CNN Types with respect to pre-trained network, learning optimizer method, and learning rate (Initial 10%
Testing Dataset). Random 24 CNN accuracies for comparison
V1(GN)  GN_SGDM_.0009     GN_SGDM_.0010     GN_SGDM_.0011     GN_ADAM_.0009     GN_ADAM_.0010     GN_ADAM_.0011
FTIR    94.40%            93.00%            93.70%            71.90%            37.80%            65.20%
Raman   90.00%            91.10%            91.10%            11.10%            91.50%            91.10%
V1(RN)  RN18_SGDM_.0009   RN18_SGDM_.0010   RN18_SGDM_.0011   RN18_ADAM_.0009   RN18_ADAM_.0010   RN18_ADAM_.0011
FTIR    94.80%            95.20%            92.20%            94.40%            93.00%            92.60%
Raman   94.80%            92.20%            91.10%            92.20%            92.20%            96.30%
V2(GN)  GN_SGDM_.0009     GN_SGDM_.0010     GN_SGDM_.0011     GN_ADAM_.0009     GN_ADAM_.0010     GN_ADAM_.0011
FTIR    90.90%            90.00%            90.90%            90.90%            91.30%            90.70%
Raman   91.80%            90.90%            92.20%            90.00%            90.20%            89.80%
V2(RN)  RN18_SGDM_.0009   RN18_SGDM_.0010   RN18_SGDM_.0011   RN18_ADAM_.0009   RN18_ADAM_.0010   RN18_ADAM_.0011
FTIR    93.80%            93.60%            92.40%            93.80%            92.90%            93.60%
Raman   92.90%            92.70%            94.00%            93.10%            92.90%            94.70%
Mode    NN1/NN13          NN2/NN14          NN3/NN15          NN4/NN16          NN5/NN17          NN6/NN18
FTIR    15.15%            9.09%             19.19%            12.12%            15.15%            10.10%
Raman   20.16%            9.68%             9.68%             16.94%            17.74%            17.74%
Mode    NN7/NN19          NN8/NN20          NN9/NN21          NN10/NN22         NN11/NN23         NN12/NN24
FTIR    12.12%            10.10%            11.11%            17.17%            13.13%            14.14%
Raman   16.13%            12.10%            18.55%            14.52%            17.74%            16.94%
Across the board, the FTIR-trained CNNs achieve accuracy rates above 90%. The exception is the V1 GN set trained with the ADAM optimizer, whose accuracies range from the mid-30s to the low-70s percent. For FTIR, the prime model appears to be RN18_SGDM_.0010 with an accuracy of 95.2%, which favors the default learning rate and further supports SGDM’s superiority. However, both this deviation and the increases in accuracy are not statistically significant (marginal) under a Mann–Whitney U-test. Across versions, the test compared the accuracy rates for GN and RN and returned p-values of 1 and 0.6753 at U-test statistics of 39 and 42, respectively. The Raman-trained CNNs are generally weaker at predicting the testing dataset; however, 2 models sit at the extreme ends of accuracy: GN_ADAM_.0009 (11.10%) and RN18_ADAM_.0011 (96.3%), from V1 and V2, respectively. The Raman-trained CNNs therefore show the most variable range in accuracy. The Mann–Whitney test reflects this wider margin, finding non-significance for the RN and GN CNNs in this mode with p-values of 0.9719 and 0.4978 at U-test statistics of 38.5 and 34.5, respectively. For the random NNs, a random number generator was used to determine whether the CNNs’ predictions truly deviate from random guessing. For the 10% holdout, this places GN_ADAM_.0009 within the range of random guessing.
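The random-guessing baseline can be sketched as follows. This is a minimal illustration and not the authors' MATLAB procedure: the number of polymer classes (`n_classes = 8`), the test-set size (99 spectra, suggested only by the random FTIR accuracies all being multiples of 1/99), the even class frequencies, and the seed are assumptions, and `random_baseline_accuracies` is a hypothetical helper.

```python
import random


def random_baseline_accuracies(true_labels, n_classes, n_models, seed=0):
    """Accuracy of `n_models` hypothetical classifiers that guess uniformly at random."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_models):
        # Each hypothetical network predicts one random class per sample.
        guesses = [rng.randrange(n_classes) for _ in true_labels]
        hits = sum(g == t for g, t in zip(guesses, true_labels))
        accuracies.append(hits / len(true_labels))
    return accuracies


# Hypothetical test set: 99 spectra spread evenly over 8 classes (both assumed).
n_classes = 8
true_labels = [i % n_classes for i in range(99)]
baseline = random_baseline_accuracies(true_labels, n_classes, n_models=12)

print([f"{a:.2%}" for a in baseline])
print(f"expected accuracy under uniform guessing: {1 / n_classes:.2%}")
```

A trained model whose accuracy falls inside the spread of `baseline` cannot be distinguished from guessing on that test set.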
External Data Holdout Trial
External datasets, Primpke (Primpke et al. 2020) and DongMiller (Dong et al. 2020; Miller et al. 2022), were used to determine whether the CNNs tested in this study could accurately predict real-world data outside of the initial training dataset. Both striking conservation of and changes in accuracy were seen (Table 4) in the FTIR and Raman models. For the FTIR-trained CNNs, marginal decreases in accuracy were seen in V1 across model types and learning parameters, with the RN sets conserving their accuracy rates from the initial 10% holdout trial better than the GN set. In V2, conservation appears to be more pronounced in both the GN and RN sets of CNNs, and this version contains the prime FTIR model at 91%: RN18_ADAM_.0009. A Mann–Whitney U-test was conducted to test for higher accuracies from V1 to V2. For the GN set, the difference in prediction levels was statistically significant for FTIR and not statistically significant for Raman, with p-values of 0.0022 and 0.3268 at U-test statistics of 21 and 45.5, respectively. For the RN set, the levels of prediction were only marginally affected, with p-values of 0.5476 and 0.1926 at U-test statistics of 35 and 30.5 for FTIR and Raman, respectively. For Raman, a notable decrease in accuracy rate was seen across all versions, all sets, and all learning parameters, with the highest accuracy rates produced by models GN_ADAM_.0009 (V1) and RN18_ADAM_.0011 (V2) at 49.19% and 48.38%, respectively.
Despite poor performance, it was investigated whether or
Table 4 Accuracy rates of 48 CNN Types (External Databases) with respect to pre-trained network, learning optimizer method, and
learning rate. Random 24 CNN accuracies for comparison
V1(GN) GN_SGDM_.0009 GN_SGDM_.0010 GN_SGDM_.0011 GN_ADAM_.0009 GN_ADAM_.0010 GN_ADAM_.0011
FTIR 61.62% 47.50% 65.66% 19.20% 22.22% 64.65%
Raman 20.16% 13.71% 16.94% 49.19% 16.13% 25.81%
V1(RN) RN18_SGDM_.0009 RN18_SGDM_.0010 RN18_SGDM_.0011 RN18_ADAM_.0009 RN18_ADAM_.0010 RN18_ADAM_.0011
FTIR 88.89% 83.84% 83.84% 90.91% 84.85% 87.88%
Raman 27.42% 33.06% 10.48% 22.58% 41.94% 25%
V2(GN) GN_SGDM_.0009 GN_SGDM_.0010 GN_SGDM_.0011 GN_ADAM_.0009 GN_ADAM_.0010 GN_ADAM_.0011
FTIR 81.82% 81.82% 77.78% 77.78% 84.85% 71.72%
Raman 17.74% 17.74% 14.52% 8.06% 17.74% 16.13%
V2(RN) RN18_SGDM_.0009 RN18_SGDM_.0010 RN18_SGDM_.0011 RN18_ADAM_.0009 RN18_ADAM_.0010 RN18_ADAM_.0011
FTIR 86.87% 86.87% 85.86% 91% 88% 86.87%
Raman 25.81% 22.58% 35.48% 43.55% 38.71% 48.38%
Random NN1/NN13 NN2/NN14 NN3/NN15 NN4/NN16 NN5/NN17 NN6/NN18
FTIR 15.15% 9.09% 19.19% 12.12% 15.15% 10.10%
Raman 20.16% 9.68% 9.68% 16.94% 17.74% 17.74%
Random NN7/NN19 NN8/NN20 NN9/NN21 NN10/NN22 NN11/NN23 NN12/NN24
FTIR 12.12% 10.10% 11.11% 17.17% 13.13% 14.14%
Raman 16.13% 12.10% 18.55% 14.52% 17.74% 16.94%
not there was any improvement across versions over a set of hypothetical CNNs with random predictions. For V1, sets GN(FTIR), GN(Raman), RN(FTIR), and RN(Raman) were tested via the Mann–Whitney U-test, which returned p-values of 0.0022, 0.3766, 0.0022, and 0.0649 at U-test statistics of 57, 45, 57, and 51, respectively. For V2, GN(FTIR), GN(Raman), RN(FTIR), and RN(Raman) were tested similarly, which returned p-values of 0.0022, 0.7835, 0.0022, and 0.0022 at U-test statistics of 57, 37, 57, and 57, respectively.
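The significance tests on the Table 4 accuracies can be illustrated with a small exact test. The sketch below is not the authors' code. It assumes that the reported "U-test statistic" is the rank sum of the first sample and that the p-value is the exact two-sided one, neither of which is stated in this section; the pairing of RN(Raman) with random networks NN19 to NN24 is likewise an assumption. `midranks` and `rank_sum_test` are hypothetical helper names.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Exact two-sided rank-sum (Mann-Whitney) test by enumerating all splits."""
    ranks = midranks(list(x) + list(y))
    n = len(x)
    w_obs = sum(ranks[:n])             # rank sum of the first sample
    centre = n * (len(ranks) + 1) / 2  # its expected value under the null
    extreme = total = 0
    for subset in combinations(ranks, n):
        total += 1
        if abs(sum(subset) - centre) >= abs(w_obs - centre) - 1e-9:
            extreme += 1
    return w_obs, extreme / total


# Table 4, external FTIR accuracies (%) of the six GN models, V1 vs V2.
gn_ftir_v1 = [61.62, 47.50, 65.66, 19.20, 22.22, 64.65]
gn_ftir_v2 = [81.82, 81.82, 77.78, 77.78, 84.85, 71.72]
w, p = rank_sum_test(gn_ftir_v1, gn_ftir_v2)
print(f"GN(FTIR) V1 vs V2: rank sum = {w}, p = {p:.4f}")

# Table 4, V1 RN(Raman) vs random networks NN19-NN24 (pairing assumed).
rn_raman_v1 = [27.42, 33.06, 10.48, 22.58, 41.94, 25.0]
random_raman = [16.13, 12.10, 18.55, 14.52, 17.74, 16.94]
w2, p2 = rank_sum_test(rn_raman_v1, random_raman)
print(f"RN(Raman) V1 vs random: rank sum = {w2}, p = {p2:.4f}")
```

Under these assumptions the two comparisons give rank sums of 21 and 51 with p-values of 0.0022 and 0.0649, matching the values reported for them. With six models per set, 0.0022 (2 of 924 possible splits) is the smallest two-sided p-value the test can produce, which is why it recurs whenever two sets do not overlap at all.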
Efforts to ensemble the models appear to recover more accuracy for the external dataset (Table 5). The ensembled RN sets appear to be higher in predictive capability regardless of version and spectroscopic mode. For the GN ensemble, accuracy increases favorably for FTIR from V1 to V2, while accuracy decreases for Raman. For RN, accuracy is generally higher than GN, most strikingly for RN (Raman). When all model types are ensembled together, above 90% accuracy is seen; however, the ensembled accuracy rates of the hypothetical CNNs are interestingly proximal: 79.8% and 86.29%. Given that 6 models are in each set, the ~60% accuracies of the random ensembles justify their usage as a baseline. Note: In the SI, a color-coding system is used to determine the calculations for Tables 2, 3, and 4.
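To see why an ensemble of random networks can reach ~60%, consider one possible scoring rule: a sample counts as correct when at least one member of the ensemble predicts it correctly. The ensembling rule is not stated in this section, so this rule and the independence of the members are assumptions, and `any_correct_accuracy` is a hypothetical helper. The individual accuracies are the random FTIR values from Table 4.

```python
from math import prod


def any_correct_accuracy(accuracies):
    """Expected accuracy when a sample counts as correct if at least one
    member is correct and members err independently (both assumptions)."""
    return 1.0 - prod(1.0 - a for a in accuracies)


# Random FTIR accuracies from Table 4, as fractions.
random_ftir_a = [0.1515, 0.0909, 0.1919, 0.1212, 0.1515, 0.1010]  # NN1-NN6
random_ftir_b = [0.1212, 0.1010, 0.1111, 0.1717, 0.1313, 0.1414]  # NN7-NN12

est_a = any_correct_accuracy(random_ftir_a)
est_b = any_correct_accuracy(random_ftir_b)
est_all = any_correct_accuracy(random_ftir_a + random_ftir_b)

print(f"Ensemble (A):   {est_a:.1%}  (Table 5: 59.60%)")
print(f"Ensemble (B):   {est_b:.1%}  (Table 5: 56.57%)")
print(f"Ensemble (All): {est_all:.1%}  (Table 5: 79.80%)")
```

Under this reading the estimates land within a few percentage points of the random FTIR ensembles in Table 5, which illustrates why ensemble accuracy should be judged against an ensembled random baseline rather than against single-model chance.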
Unknown MP trial (field study)
Because the study employs µ-Raman spectroscopy, the principal Raman model, RN18_ADAM_.0011 (Raman: Version 2), was selected. Although its accuracy against DongMiller was poor, that external dataset is cited as being quite difficult to predict without corroborating characterization modes (Dong et al. 2020). Moreover, the model performed well in the previous 10% holdout trial. In total, 149 out of 424 particles were analyzed and exported as pictorial data for the unknown trial. Of these 149, 97 particles remained after controls and spikes were considered for the champion CNN, OpenSpecy, and the FEDS corroboration.
According to the champion CNN (Fig. 4), a large share of the MP detected appears to be cellulosic in nature, with higher concentrations generally in Area 2 of the field study. The RW sources share the most cellulosic MPL content, with Area 3 possessing the highest abundance at a striking 37.5% of MPs. This is compared to WW’s nearly non-existent presence of cellulose. PET, PE, and PA appear in both the CNN’s and OS’ predictions, with the CNN revealing more PP and PMMA presence. With respect to water sources (Fig. 5), DW and RW are among the highest sources of MPLs in the study, with TW and WW among the lowest. The range of relative MPL concentration is wide: 2.78 to 27.79% (MPL/MP). Across areas, the CNN shows a peak around Area 2 for MPL concentration at ~33.34%, versus OS’ slowly declining trend from ~10 to 8%.
Table 5 Ensemble accuracies of GN, RN, and (GN + RN)’s
prediction against the external databases with respect to version
and spectroscopic mode
V1 Ensemble (GN) Ensemble (RN) Ensemble (All)
FTIR 84.85% 96.97% 98.00%
Raman 75.81% 77.87% 94.35%
V2 Ensemble (GN) Ensemble (RN) Ensemble (All)
FTIR 90.91% 93.94% 93.94%
Raman 36.29% 85.48% 90.32%
Random Ensemble (A) Ensemble (B) Ensemble (All)
FTIR 59.60% 56.57% 79.80%
Raman 56.45% 63.71% 86.29%
Fig. 4 Relative MPL concentration (MPL/MP) calculated for RW, TW,
WW, and DW per area with respect to polymer type based on CNN
(top) and OS (bottom).
Employment of a FEDS algorithm assessed the validity of the CNN’s and OS’ predictions while also generating a polymer distribution for comparison (Fig. 6). The CNN and OS (Fig. 6A, B) share a similar proportion of cellulosic (CA/C), PET, and PE MP samples. Compared to FEDS, the relationship appears to hold for CA/C, PE, and PET; however, FEDS corroborates PA and PS presence in the CNN (Fig. 6C). None of the 3 models corroborates the presence of silk, PVC, or PMMA. Most strikingly, almost half of the entries are not considered to be MP by OS and are instead attributed to a copious amount of minerals and minimal amounts of colorants. When comparing FEDS agreement with the CNN and OS separately, a similar number of samples matched, with 1/3 sharing corroboration amongst both models.
To assess application of the results to the entirety of the river basin, a potential relationship between the concentrations determined by area for both models and the areas’ respective altitude and population density was investigated via 3 models: multiple linear regression, multiple genetic regression, and genetic programming. The linear regression was found to have the lowest error and was used to generate predictions for the other cities in the river basin (Table 6, Fig. 7; details in SI). It appears that both models rely on both variables (altitude and population density), with the CNN aligning
Fig. 5 Determination of relative MPL concentration (MPL/MP) with respect to water source type across all areas (top) and areas in field study
(bottom)
Fig. 6 Distribution of MPL sample with respect to CNN (A), OS (B), and FEDS (C) predictions. D. Separate and simultaneous corroboration incidence
for CNN and OS with respect to FEDS
Table 6 City data and multiple linear regression prediction results (Get Lat Long from Address 2024; Measuring America’s People, Places, and Economy 2024). The altitude and population density of the original testing areas were averaged.
City Altitude (ft) Population density (per sq. mi.) Coordinates CNN (MPL/MP) OS (MPL/MP)
Mt. Airy 1102 911.15 36.499962,−80.605392 0.298292306 0.135641815
Wilkesboro 1043 573.76 36.145966,−81.160637 0.201245019 0.120875199
Statesville 896 1122.17 35.782799,−80.887367 0.272254731 0.109185677
Mocksville 856 759.53 35.894081,−80.561829 0.176305751 0.09675461
Albemarle 516 928.05 35.352280,−80.199539 0.092848339 0.049840494
Charlotte 696 2821.06 35.227085,−80.843124 0.58342464 0.110499326
Florence 143 1705.45 34.195435,−79.762566 0.134725649 0.009080783
Sumter 171 1322.59 33.930271,−80.367477 0.058504718 0.006276216
Georgetown 20 1201.80 33.378109,−79.297081 0 0
Elkin 935.25 607.07 36.348112,−80.8517 0.1667 0.104175
Clemmons 724.25 1665.91 36.030548,−80.3827 0.333375 0.0937775
Salisbury 632.75 1256.15 35.668331,−80.471 0.208425 0.072925
with higher population densities and OS aligning with altitude. The city of Georgetown falls outside the range of positive real numbers for the models due to its low altitude (the negative entry was nullified), regardless of its sizable population density.
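The extrapolation to other cities can be sketched with a plane fitted through the three averaged field areas listed at the bottom of Table 6 (Elkin, Clemmons, Salisbury). This is an illustration only: the authors' actual fit is detailed in the SI and may use different inputs, so the predictions below only approximate Table 6. `solve_3x3` and `predict` are hypothetical helpers, and clamping negatives to zero mirrors the nullified Georgetown entry.

```python
def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


# Averaged field areas from Table 6:
# (altitude in ft, population density per sq. mi., CNN MPL/MP)
areas = [
    (935.25, 607.07, 0.1667),     # Elkin
    (724.25, 1665.91, 0.333375),  # Clemmons
    (632.75, 1256.15, 0.208425),  # Salisbury
]
design = [[1.0, alt, pop] for alt, pop, _ in areas]
target = [y for _, _, y in areas]
b0, b_alt, b_pop = solve_3x3(design, target)


def predict(alt, pop):
    """Linear prediction of MPL/MP, with negative values set to zero."""
    return max(0.0, b0 + b_alt * alt + b_pop * pop)


print(f"Charlotte:  {predict(696, 2821.06):.3f}  (Table 6: 0.583)")
print(f"Georgetown: {predict(20, 1201.80):.3f}  (Table 6: 0)")
```

With three calibration points and three coefficients the plane passes through every field area exactly, so it carries no error estimate of its own. The positive altitude coefficient is what pushes the low-lying Georgetown below zero despite its sizable population density.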
Discussion
For the first time, the Yadkin-Pee Dee River Basin has been studied for MPL abundance as part of a growing body of terrestrial reports, as opposed to the general coastal and marine focus. Only one other study (Williams 2023), an unpublished thesis from Appalachian State University, investigated MPL abundance in the river basin, but it used freshwater mussels as its sample source. The present report sought to analyze samples of water in triplicate from four different sources: drinking water, tap water, wastewater treatment plant effluent, and river water. Samples were collected across 3 geographical areas spanning the western portion of the Piedmont Triad metropolitan area over the upper Yadkin-Pee Dee River Basin. Samples were carefully separated and drop-casted onto aluminum slides for µ-Raman characterization. A limitation of dividing the aluminum slides into areas was the proximity of each area to other areas and to the marking material, both potential sources of contamination. Although deposition was drop-wise, occasionally a trace amount of droplets would combine across areas. To mitigate this issue, triplicate samples were kept adjacent during deposition. Unfortunately, some droplets of oil and alcohol-based samples crossed over the marking on the slide’s surface. As a precaution, the spectra from these solvents as well as from the marker pigments were used as exclusion criteria. Exclusion was also based on airborne particles deposited on naked areas of the slide’s surface and on pristine samples used as spikes. Lastly, spectral exclusion criteria, revealing the majority of particles to be unidentifiable, were
Fig. 7 Top: Geographic density plot and surface plot of the predicted values from the multiple linear regression for the CNN and OS. Bottom:
Surface plots of the cubic interpolated mean altitude and population density values from Areas 1, 2, and 3 of the field study estimating relative MPL
concentrations over 20 data points for the CNN and OS.
due to fluorescence obscuring the spectra completely. With respect to µ-FTIR, because the characteristic dimension of the microfibers captured was generally smaller than ~20 µm, noisy and weak signals were observed, which led to the decision not to use that microspectroscopic mode.
One caveat of the extraction protocol was the varied distribution of recovered MPLs from the spiked areas (Table S1). Some preference towards hydrophobic polymers was expected; however, this appears not to be the case, as both models’ results confirmed the presence of PET, PMMA, and PA MPLs. Generally, both models corroborate a similar incidence of cellulosic and PE materials. PP diverged quite considerably, with OS picking up its presence 75% of the time and the CNN declaring it a null result. Due to this high variability, it is impossible to determine the extraction efficiency by polymer type with the modified oleophilic method.
Meanwhile, novel code was developed to generate pictorial data from online “open-source” databases of environmentally-derived MPLs for optimal training of CNNs from MATLAB’s suite of pre-trained CNNs. 2 of these CNNs, “GoogLeNet” and “ResNet-18”, underwent transfer learning on MATLAB Online, whereby the fully-formed models were tested via 3 trials: an initial 10% holdout of the training data, an external online dataset, and data from MPs captured across the field study. The training datasets for V1 and V2 contained an assortment of samples with various class-wise imbalances. It was suspected that the models would be biased towards PE, PP, and PET across both versions. Supplementation of missing classes for PET, PMMA, PA, PVC, and PS decreases diversity, thus potentially biasing the models against predicting diverse environmentally-derived MPLs from these classes. Nevertheless, this is expected and may be offset by the expectation that most plastics produced are PE and PP (Bråte et al. 2014). Moreover, the specific FLOPP-E/SLOPP-E supplementation may curtail this issue. Another limitation of the training dataset is the inconsistent x-axis marking across databases. However, CNNs are remarkable at ignoring non-unique features extracted from pictorial data.
For the first trial (10% holdout), models were strong in predictive capability across the board, with the RN sets being generally stronger than the GN sets. Across versions, marginal improvements were seen in accuracy rates for both spectroscopic modes and model types. However, lower accuracy was seen for the ADAM algorithm in V1 regardless of mode. In terms of the U-test, improvements were marginal and not statistically significant. Interestingly, the RN sets possessed the lowest p-values, indicating a stronger difference in accuracy, albeit marginal. For the second trial (external holdout), accuracy rates dipped considerably, especially in the Raman models’ case. This is understandable, as the DongMiller dataset (primarily Dong) is notoriously difficult to identify, necessitating corroborative characterization modes. Interestingly, from V1 to V2, only FTIR’s GN set was proven to be significantly better, with the next best-performing set being RN’s (Raman). This may indicate a limit to which the new databases can aid the CNNs’
generalizability. Despite the results, it was of interest to assess any improvement over randomness. When comparing the statistical significance of GN(FTIR), GN(Raman), RN(FTIR), and RN(Raman) against the random CNNs, the RN(Raman) set showed a more significant p-value in V2 than in V1 (0.0022 versus 0.0649), indicating the benefit of the new training datasets. Interestingly, GN(Raman) worsened (p-value of 0.7835 versus 0.3766) while the other sets showed no change. For DongMiller, this may allude to better generalizability in the Raman-based set of models.
Ensembling appears to have a generally positive effect, increasing the accuracy of the models considerably; however, this effect appears to fall short past 6 CNNs. Indeed, the RN(Raman) set proved more efficacious than GN’s, but this may be partially due to random chance. It is important to note that one can simply ensemble a set of hypothetically random models and generate higher accuracy rates. What is vital is that the classes represented in the prediction space are diverse. Unfortunately, many of the Raman-based models that did well with DongMiller disproportionately predicted PE, which happens to predominate in DongMiller. The correct combinations merit further study for other external datasets.
Despite its low performance, RN18_ADAM_.0011 possessed comparatively strong accuracy relative to its weak sister models (~4-fold) in Version 2. Its potential here, as well as in the initial 10% holdout trial, motivated its usage as the model for the unknown MP trial. Moreover, it may possess some ability to generalize to newer data despite DongMiller’s difficulty.
In the unknown trial, the majority of counts appear to be of cellulosic material. Taken together with the presence of PET or polyester material (as well as traces of other materials) (Preau 2020) in the relative concentrations calculated from the prime model’s and OS’ predictions, the main contributing factor may be laundering effluent (clothes-washing wastewater) entering the water systems of the Yadkin, which the literature suggests is a major source of MPLs in water supplies (Volgare et al. 2021; Falco et al. 2019). Indeed, after performing the regressions, it appears that anthropogenic effects (e.g. population density) have some contribution to
MPL concentration (as determined by the CNN’s results), potentially corroborating this conjecture. However, MPL concentration according to OS did not corroborate the population density metric as strongly. Instead, altitude appeared to have more of an effect. Moreover, FEDS corroboration indicated similar strength for both the CNN and OS, leading to the conjecture that effects from both variables in the regression are important for modeling concentration. A variety of confounding factors could influence the differences in concentrations, such as river widening/deepening diluting the particles, changes in flow rate (meteorological events and reservoirs), and different clarification methods at wastewater treatment plants. Indeed, another confounding factor may be present as a function of water source, with WW/TW possessing generally lower concentrations whereas DW/RW possess concentrations on the higher end of the range, irrespective of model type. DW sources were public water fountains, and RW sources are exposed to air, possibly suggesting atmospheric deposition of MPLs (Dris et al. 2016; Cai et al. 2017), which includes cellulosic and polyester material in the literature. As for TW’s low concentration, a closed indoor system may abate MPL contamination.
Because corroboration is quite low across model types, focusing on the distribution of MPL type may be the best pathway forward in characterizing abundance. The similarity in polymer-type presence across models suggests some agreement, meaning that some models may predict individual particles better than others. It is difficult to determine this for individual particles, but as a distribution, this may be informative enough to characterize a region’s MPL abundance by polymer type and thus inform remediation methods. Note: An anomaly occurred in the FEDS algorithm, which matched a majority of the unknowns as “BIOL002_Charcoal”, but this was removed from the result after visual inspection of the samples indicated no presence of charcoal.
Conclusion
MPL abundance is a growing problem in need of further elucidation in terms of its mechanisms of entry, deposition, and potential toxicological effects on the biosphere. This problem also extends to terrestrial water sources, as opposed to the oceanic or coastal environments typically reported. Reporting informs the foci of remediation efforts, which are contingent upon MPL modeling with respect not only to concentration but also to polymer type. In the case of this report’s laundry wastewater effluent conjecture, more efforts to contain these particles from residential sources should be made to ease remediation. Afterwards, microbial or enzymatic degradation capable of digesting cellulosic (Kalita and Hakkarainen 2023; Horn et al. 2012), PET (Sadler and Wallace 2021; Zurier and Goddard 2023), and PE (Temporiti et al. 2022) materials can be employed.
Without the sub-field’s push for open-source collaboration, this report would not have had the tools to develop computational reconciliatory characterization techniques. Any researcher with computer access, electricity, and curiosity can contribute, as this report endeavors to do. The promulgation of the results from this study is encouraged in support of a legislative initiative establishing new standards for MPL abundance within the Piedmont Triad Metropolitan Area’s water systems and for the greater Mid-Atlantic region of the U.S.
Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1186/s40068-024-00384-1.
Additional file 1.
Acknowledgements
We would like to thank Win Cowger for helping us locate relevant databases
for the study.
Author contributions
S.A. contributed to project administration. W.W. performed the field study,
acquired the data, generated the code, standardized the training datasets,
generated the data from the predictions, wrote, edited, and revised the
manuscript. A.A. performed transfer learning on 25% of the CNNs used in the
study from a training dataset generated from the databases mentioned in the
methodology.
Funding
This study was not performed under a funding source.
Availability of data and material
All data generated or analyzed during this study are included in this published
article and its supplementary information files.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Author details
1 Joint School of Nanoscience and Nanoengineering, Department
of Nanoengineering, North Carolina Agricultural and Technical State
University, Greensboro, NC 27411, USA. 2 North Carolina School of Science
and Mathematics – Durham Campus, Durham, NC 27705, USA.
Received: 11 September 2024 Accepted: 22 October 2024
Published: 20 November 2024
References
Al-Abri M, Al-Ghafri B, Bora T, Dobretsov S, Dutta J, Castelletto S, Rosa L, Boretti A (2019) Chlorination disadvantages and alternative routes for biofouling control in reverse osmosis desalination. Npj Clean Water 2:2. https://doi.org/10.1038/s41545-018-0024-8
Beck AJ, Kaandorp M, Hamm T, Bogner B, Kossel E, Lenz M, Haeckel M, Achterberg EP (2023) Rapid shipboard measurement of net-collected marine microplastic polymer types using near-infrared hyperspectral imaging. Anal Bioanal Chem 415:2989–2998. https://doi.org/10.1007/s00216-023-04634-6
Böke JS, Popp J, Krafft C (2022) Optical photothermal infrared spectroscopy with simultaneously acquired Raman spectroscopy for two-dimensional microplastic identification. Sci Rep 12:18785. https://doi.org/10.1038/s41598-022-23318-2
Borges-Ramírez MM, Mendoza-Franco EF, Escalona-Segura G, Osten JR (2020) Plastic density as a key factor in the presence of microplastic in the gastrointestinal tract of commercial fishes from Campeche Bay, Mexico. Environ Pollut 267:115659. https://doi.org/10.1016/j.envpol.2020.115659
Bråte IL, Halsband C, Allan I, Thomas KV. Report made for the Norwegian environment agency: microplastics in marine environments: occurrence, distribution and effects. 2014. Accessed 30 Aug 2024.
Boesch G. GoogLeNet Explained: the inception model that won ImageNet. VisoAi 2024. https://viso.ai/deep-learning/googlenet-explained-the-inception-model-that-won-imagenet/#:~:text=GoogLeNet%20model%20is%20particularly%20well,these%20filters%20are%20then%20concatenated. Accessed 12 Oct 2024.
Cai L, Wang J, Peng J, Tan Z, Zhan Z, Tan X, Chen Q (2017) Characteristic of microplastics in the atmospheric fallout from Dongguan city, China: preliminary research and first evidence. Environ Sci Pollut Res 24:24928–24935. https://doi.org/10.1007/s11356-017-0116-x
Chen J, Li J, Xu L, Hong W, Yang Y, Chen X (2019) The glass-transition temperature of supported PMMA thin films with hydrogen bond/Plasmonic interface. Polymers. https://doi.org/10.3390/polym11040601
Cowger W, Gray A, Tarby S, Hapich H (2020) Plastic Particle Data. OSF Home. https://doi.org/10.17605/OSF.IO/UNRZ7
Cowger W, Steinmetz Z, Gray A, Munno K, Lynch J, Hapich H, Primpke S, De Frond H, Rochman C, Herodotou O (2021) Microplastic spectral classification needs an open source community: open specy to the rescue! Anal Chem 93:7543–7548. https://doi.org/10.1021/acs.analchem.1c00123
Crichton EM, Noël M, Gies EA, Ross PS (2017) A novel, density-independent and FTIR-compatible approach for the rapid extraction of microplastics from aquatic sediments. Anal Methods 9:1419–1428. https://doi.org/10.1039/c6ay02733d
Davidson J, Arienzo MM, Harrold Z, West C, Bandala ER, Easler S, Senft K (2023) Polymer characterization of Submerged Plastic Litter from Lake Tahoe, United States. Appl Spectrosc 77:1240–1252. https://doi.org/10.1177/00037028231201174
Decker C (1984) Photodegradation of PVC. In: Owen ED (ed) Degrad Stabilisation PVC. Springer, Netherlands, Dordrecht, pp 81–136
Dong M, She Z, Xiong X, Ouyang G, Luo Z (2022) Automated analysis of microplastics based on vibrational spectroscopy: are we measuring the same metrics? Anal Bioanal Chem 414:3359–3372. https://doi.org/10.1007/s00216-022-03951-6
Dong M, Zhang Q, Xing X, Chen W, She Z, Luo Z (2020) Raman spectra and surface changes of microplastics weathered under natural environments. Sci Total Environ 739:139990. https://doi.org/10.1016/j.scitotenv.2020.139990
Dris R, Gasperi J, Saad M, Mirande C, Tassin B (2016) Synthetic fibers in atmospheric fallout: a source of microplastics in the environment? Mar Pollut Bull 104:290–293. https://doi.org/10.1016/j.marpolbul.2016.01.006
De Falco F, Di Pace E, Cocca M, Avella M (2019) The contribution of washing processes of synthetic clothes to microplastic pollution. Sci Rep 9:6633. https://doi.org/10.1038/s41598-019-43023-x
De Frond H, Rubinovitz R, Rochman CM (2021) μATR-FTIR spectral libraries of plastic particles (FLOPP and FLOPP-e) for the analysis of Microplastics. Anal Chem 93:15878–15885. https://doi.org/10.1021/acs.analchem.1c02549
Eskin NAM, Przybylski R (2003) Rape seed OIL/CANOLA. Encycl Food Sci Nutr. https://doi.org/10.1016/B0-12-227055-X/01349-3
Fernández-González V, Andrade-Garda JM, López-Mahía P, Muniategui-Lorenzo S (2022) Misidentification of PVC microplastics in marine environmental samples. TrAC - Trends Anal Chem 153:116649. https://doi.org/10.1016/j.trac.2022.116649
Ferner RE, Chambers J (2001) Alcohol intake: measure for measure. BMJ 323:1439–1440. https://doi.org/10.1136/bmj.323.7327.1439
Furukawa T, Sato H, Kita Y, Matsukawa K, Yamaguchi H, Ochiai S, Siesler H, Ozaki Y (2006) Molecular structure, crystallinity and morphology of polyethylene/polypropylene blends studied by raman mapping, scanning electron microscopy, wide angle x-ray diffraction, and differential scanning calorimetry. Polym J 38:1127–1136. https://doi.org/10.1295/polymj.PJ2006056
Get Lat long from address 2024. LatLong.net. Accessed 30 Aug 2024.
Gonçalves ES, Poulsen L, Ogilby PR (2007) Mechanism of the temperature-dependent degradation of polyamide 66 films exposed to water. Polym Degrad Stab 92:1977–1985. https://doi.org/10.1016/j.polymdegradstab.2007.08.007
Horn SJ, Vaaje-Kolstad G, Westereng B, Eijsink V (2012) Novel enzymes for the degradation of cellulose. Biotechnol Biofuels 5:45. https://doi.org/10.1186/1754-6834-5-45
Ivleva NP (2021) Chemical analysis of microplastics and nanoplastics: challenges, advanced methods, and perspectives. Chem Rev 121:11886–11936. https://doi.org/10.1021/acs.chemrev.1c00178
Kalita NK, Hakkarainen M (2023) Triggering degradation of cellulose acetate by embedded enzymes: accelerated enzymatic degradation and biodegradation under simulated composting conditions. Biomacromol 24:3290–3303. https://doi.org/10.1021/acs.biomac.3c00337
Katsara K, Kenanakis G, Viskadourakis Z, Papadakis V (2021) Polyethylene migration from food packaging on cheese detected by Raman and infrared (ATR/FT-IR) spectroscopy. Materials 14:3872. https://doi.org/10.3390/ma14143872
Khoshnoud P, Abu-Zahra N (2018) Kinetics of thermal decomposition of PVC/fly ash composites. Int J Polym Anal Charact 23:170–180. https://doi.org/10.1080/1023666X.2017.1404668
Liu L, Xu M, Ye Y, Zhang B (2022) On the degradation of (micro) plastics: degradation methods, influencing factors, environmental impacts. Sci Total Environ 806:151312. https://doi.org/10.1016/j.scitotenv.2021.151312
Measuring America’s People, Places, and Economy. US Census Bur 2024. https://www.census.gov/. Accessed 30 Aug 2024.
Margulies S. What is my elevation? n.d. https://whatismyelevation.com/. Accessed 30 Aug 2024.
Miller EA, Yamahara KM, French C, Spingarn N, Birch JM, Van Houtan KS (2022) A raman spectral reference library of potential anthropogenic and biological ocean polymers. Sci Data 9:780. https://doi.org/10.1038/s41597-022-01883-5
Munno K, De Frond H, O’Donnell B, Rochman CM (2020) Increasing the accessibility for characterizing microplastics: introducing new application-based and spectral libraries of plastic particles (SLoPP and SLoPP-E). Anal Chem 92:2443–2451. https://doi.org/10.1021/acs.analchem.9b03626
Naim Al AF, AlFannakh H, Arafat S, Ibrahim SS (2019) Characterization of PVC/MWCNTs nanocomposite: solvent blend. Sci Eng Compos Mater 27:55–64. https://doi.org/10.1515/secm-2020-0003
Nakatani H, Kyan T, Urakawa Y (2021) Novel recycling system of polystyrene water debris with polymer photocatalyst and thermal treatment. J Polym Environ 29:1467–1476. https://doi.org/10.1007/s10924-020-01976-5
Nava V, Frezzotti ML, Leoni B (2021) Raman spectroscopy for the analysis of microplastics in aquatic systems. Appl Spectrosc 75:1341–1357. https://doi.org/10.1177/00037028211043119
Palencia M (2020) Functionally-enhanced derivative spectroscopy (FEDS): a methodological approach. J Sci with Technol Appl 9:29–34. https://doi.org/10.34294/j.jsta.20.9.63
Panjeta M, Reddy A, Shah R, Shah J (2023) Artificial intelligence enabled COVID-19 detection: techniques, challenges and use cases. Multimed Tools Appl. https://doi.org/10.1007/s11042-023-15247-7
Panthi G, Bajagain R, Chaudhary DK, Kim P-G, Kwon J-H, Hong Y (2024) The
release, degradation, and distribution of PVC microplastic-originated
phthalate and non-phthalate plasticizers in sediments. J Hazard Mater
470:134167. https:// doi. org/ 10. 1016/j. jhazm at. 2024. 134167
Preau G (2020) Sustainability and Globalization in Fashion: can the fashion
industry become sustainable, while remaining globalized HEC Paris.
Accessed 30 Aug 2024.
Primpke S, Cross RK, Mintenig SM, Simon M, Vianello A, Gerdts G, Vollertsen
J (2020) Toward the systematic identification of microplastics in the
environment: evaluation of a new independent software tool (siMPle) for
spectroscopic analysis. Appl Spectrosc 74:1127–1138.
https://doi.org/10.1177/0003702820917760
Puchowicz D, Cieslak M (2021) Raman spectroscopy in the analysis of textile
structures. IntechOpen, Rijeka
Qian S, Igarashi T, Nitta K (2011) Thermal degradation behavior of
polypropylene in the melt state: molecular weight distribution changes
and chain scission mechanism. Polym Bull 67:1661–1670.
https://doi.org/10.1007/s00289-011-0560-6
Ramzan F, Khan MU, Rehmat A, Iqbal S, Saba T, Rehman A, Mehmood Z (2019)
A deep learning approach for automated diagnosis and multi-class
classification of Alzheimer’s disease stages using resting-state fMRI
and residual neural networks. J Med Syst.
https://doi.org/10.1007/s10916-019-1475-2
Reding T (2021) Yadkin–Pee Dee River Basin. Wikipedia.
https://en.wikipedia.org/w/index.php?title=Yadkin_Pee_Dee_River_Basin&oldid=1024949389.
Accessed 12 Oct 2024.
Rytelewska S, Dąbrowska A (2022) The Raman spectroscopy approach to
different freshwater microplastics and quantitative characterization of
polyethylene aged in the environment. Microplastics 1:263–281.
https://doi.org/10.3390/microplastics1020019
Sadler JC, Wallace S (2021) Microbial synthesis of vanillin from waste
poly(ethylene terephthalate). Green Chem 23:4665–4672.
https://doi.org/10.1039/D1GC00931A
Sánchez-Márquez JA, Fuentes-Ramírez R, Cano-Rodríguez I, Gamiño-Arroyo
Z, Rubio-Rosas E, Kenny JM, Rescignano N (2015) Membrane made of
cellulose acetate with polyacrylic acid reinforced with carbon nanotubes
and its applicability for chromium removal. Int J Polym Sci 2015:320631.
https://doi.org/10.1155/2015/320631
Searson D (2009) GPTIPS: genetic programming and symbolic regression for
Matlab. Newcastle Univ Libr. https://eprints.ncl.ac.uk/175261. Accessed
30 Aug 2024.
Ter Halle A, Ghiglione JF (2021) Nanoplastics: a complex, polluting terra
incognita. Environ Sci Technol 55:14466–14469.
https://doi.org/10.1021/ACS.EST.1C04142
Temporiti MEE, Nicola L, Nielsen E, Tosi S (2022) Fungal enzymes involved in
plastics biodegradation. Microorganisms.
https://doi.org/10.3390/microorganisms10061180
Tuna B, Benkreira H (2018) Chain extension of recycled PA6. Polym Eng Sci
58:1037–1042. https://doi.org/10.1002/pen.24663
Vieira MF, de Bovolato ALC, da Fonseca BG, Izumi CMS, Brolo AG (2023) A direct
immunoassay based on surface-enhanced spectroscopy using AuNP/
PS-b-P2VP nanocomposites. Sensors. https://doi.org/10.3390/s23104810
Volgare M, De Falco F, Avolio R, Castaldo R, Errico ME, Gentile G, Ambrogi V,
Cocca M (2021) Washing load influences the microplastic release from
polyester fabrics by affecting wettability and mechanical stress. Sci Rep
11:19479. https://doi.org/10.1038/s41598-021-98836-6
Wang Y, Feng G, Lin N, Lan H, Li Q, Yao D, Tang J (2023) A review of degradation
and life prediction of polyethylene. Appl Sci 13:3045.
https://doi.org/10.3390/app13053045
Williams JB (2023) Distribution of microplastics in freshwater mussels across a
watershed scale. Thesis, Appalachian State University
Wochnowski C, Metev S, Sepold G (2000) UV–laser-assisted modification
of the optical properties of polymethylmethacrylate. Appl Surf Sci
154–155:706–711. https://doi.org/10.1016/S0169-4332(99)00435-3
Wochnowski C, Shams Eldin MA, Metev S (2005) UV-laser-assisted degradation
of poly(methyl methacrylate). Polym Degrad Stab 89:252–264.
https://doi.org/10.1016/j.polymdegradstab.2004.11.024
Wright RJ, Bosch R, Gibson MI, Christie-Oleza JA (2020) Plasticizer degradation
by marine bacterial isolates: a proteogenomic and metabolomic
characterization. Environ Sci Technol 54:2244–2256.
https://doi.org/10.1021/acs.est.9b05228
Yamashita R, Nishio M, Do RKG, Togashi K (2018) Convolutional neural
networks: an overview and application in radiology. Insights Imaging
9:611–629. https://doi.org/10.1007/s13244-018-0639-9
Yousif E, Haddad R (2013) Photodegradation and photostabilization of
polymers, especially polystyrene: review. Springerplus 2:398.
https://doi.org/10.1186/2193-1801-2-398
Zhang K, Hamidian AH, Tubić A, Zhang Y, Fang JKH, Wu C, Lam PKS (2021)
Understanding plastic degradation and microplastic formation in the
environment: a review. Environ Pollut 274:116554.
https://doi.org/10.1016/j.envpol.2021.116554
Zhu Z, Parker W, Wong A (2023) Leveraging deep learning for automatic
recognition of microplastics (MPs) via focal plane array (FPA) micro-FT-IR
imaging. Environ Pollut 337:122548.
https://doi.org/10.1016/j.envpol.2023.122548
Zurier HS, Goddard JM (2023) PETase engineering for enhanced degradation
of microplastic fibers in simulated wastewater sludge processing
conditions. ACS ES&T Water 3:2210–2218.
https://doi.org/10.1021/acsestwater.3c00021
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.