Article

Prediction of algal bloom using a combination of sparse modeling and a machine learning algorithm: Automatic relevance determination and support vector machine


... They require large, high-quality datasets for training, which is also problematic in data-sparse regions (Abbas et al., 2023; Alarab et al., 2021). In contrast, ML algorithms such as RF, GB, XGB, and SVR are more flexible than deep learning algorithms because they are easier to interpret and perform well with smaller datasets (Miura et al., 2023; Yuan et al., 2020). In addition, there are numerous Chl-a retrieval models (band ratio models, vegetation indices, Fluorescence Line Height, three-band indices, etc.), with some designed for global use and others tailored to specific regions (Blondeau-Patissier et al., 2014). ...
... As a novel approach to data analysis and processing, machine learning has garnered widespread attention owing to its high precision, flexible customization, and convenient scalability (Zhu et al., 2022). With the rapid increase in the volume of aquatic environmental data and advances in algorithm development, computing power, sensor systems, and data availability, machine learning has become an important tool for data analysis, classification, and prediction (Friedman, 2006; Lap et al., 2023; Miura et al., 2023), which are focal points in surface water quality research (Tiyasha et al., 2020; Zhou et al., 2024). Machine learning algorithms can autonomously optimize and improve existing empirical models using sample data, uncover potential connections between data, and offer significant advantages in the regression analysis of less obvious correlations (Mamun et al., 2024; Messaoud et al., 2020). ...
Article
Lake eutrophication caused by nitrogen and phosphorus has led to frequent harmful algal blooms (HABs), especially under the unknown challenges of climate change, which have seriously damaged human life and property. In this study, a coupled SWAT-Bayesian Network (SWAT-BN) model framework was constructed to elucidate the mechanisms between non-point source nitrogen pollution in agricultural lake watersheds and algal activities. A typical agricultural shallow lake basin, the Taihu Basin (TB), China, was chosen to investigate the effectiveness of best management practices (BMPs) in controlling HABs risks. Modeling of the total nitrogen concentration of Taihu Lake from 2007 to 2022 under four BMPs (filter strips, grassed waterways, fertilizer application reduction and no-till agriculture) indicated that fertilizer application reduction was the most effective BMP, with a Harmful Algal Blooms Probability Reduction (HABs-PR) of 0.130 when fertilizer was reduced by 40%, followed by filter strips, with a HABs-PR of 0.01 when 4815 ha of filter strips were installed, while grassed waterways and no-till agriculture showed no significant effect on preventing HABs. Furthermore, combining the 40% fertilizer application reduction with the construction of 4815 ha of filter strips showed synergistic effects, with HABs-PR increasing to 0.171. Precipitation and temperature data were perturbed to model scenarios of extreme events. The combined approach outperformed any single BMP in terms of robustness under extreme climates. This research provides a watershed-level perspective on HABs risk mitigation and highlights strategies to address HABs under the influence of climate change.
Article
Based on National Oceanic and Atmospheric Administration/Advanced Very High-Resolution Radiometer (NOAA/AVHRR) remote sensing and Cross-Calibrated Multi-Platform (CCMP) wind field data from 2007 to 2019, oceanographic conditions are analysed, respectively, in the Source Area (SA) and Typical Bloom Area (TBA) of Ulva prolifera (U. prolifera) in the west of the Southern Yellow Sea (SYS) using Sea Surface Temperature (SST), Suspended Sediment Concentration (SSC) and Wind Speed over the years. The results indicate that the annual maximum SST Difference (SSTD) between the U. prolifera SA and TBA is strongly consistent with the intensity of U. prolifera, and a high SST Warming Rate (WR) from May to July may constrain U. prolifera blooms. The Taiwan Warm Current (TWC), crossing the Yangtze River Estuary northward from March to April, increases SST in the SA and becomes a key trigger for the growth of U. prolifera in the early period. The amount of U. prolifera may decrease in the early period because of the lower light intensity associated with high SSC and turbidity in the SA. The summer monsoon is one of the determinants of the spread of U. prolifera, and the distribution of U. prolifera reaches its highest point with a higher mean wind speed in the TBA.
Article
Water pollution has become one of the most serious issues threatening water environments, water as a resource and human health. The most urgent and effective measures rely on dynamic and accurate water quality monitoring on a large scale. Due to their temporal and spatial advantages, remote sensing technologies have been widely used to retrieve water quality data. With the development of hyper-spectral sensors, unmanned aerial vehicles (UAV) and artificial intelligence, there has been significant advancement in remotely sensed water quality retrieval owing to various data availabilities and retrieval methodologies. This article presents the application of remote sensing for water quality retrieval, and mainly discusses the research progress in terms of data sources and retrieval modes. In particular, we summarize some retrieval algorithms for several specific water quality variables, including total suspended matter (TSM), chlorophyll-a (Chl–a), colored dissolved organic matter (CDOM), chemical oxygen demand (COD), total nitrogen (TN) and total phosphorus (TP). We also discuss the significant challenges to atmospheric correction, remotely sensed data resolution, and retrieval model applicability in the domains of spatial, temporal and water complexity. Finally, we propose possible solutions to these challenges. The review can provide detailed references for future development and research in water quality retrieval.
Article
Harmful algal blooms (HABs) are among the most severe ecological marine problems worldwide. Under favorable climate and oceanographic conditions, toxin-producing microalgae species may proliferate, reach increasingly high cell concentrations in seawater, accumulate in shellfish, and threaten the health of seafood consumers. There is an urgent need for effective tools to help shellfish farmers anticipate and cope with HAB events and shellfish contamination, which frequently lead to significant negative economic impacts. Statistical and machine learning forecasting tools have been developed in an attempt to better inform the shellfish industry so that it can limit damages, improve mitigation measures and reduce production losses. This study presents a synoptic review covering the trends in machine learning methods for predicting HABs and shellfish biotoxin contamination, with a particular focus on autoregressive models, support vector machines, random forest, probabilistic graphical models, and artificial neural networks (ANN). Over the years, most efforts to forecast HABs have relied on models of increasing complexity, coupled with increased multi-source data availability, with ANN architectures at the forefront of modeling these events. The purpose of this review is to help define machine learning-based strategies that support the shellfish industry in managing harvesting and production, and that support decision making by governmental agencies with environmental responsibilities.
Article
In the last few decades, the eutrophication of lakes has been a serious issue in the middle and lower reaches of the Yangtze River watershed. To explore the relationship between lake systems and anthropogenic activities, sediments were collected from the Shuanglong reservoir in the Dianchi watershed in Southwest China. Total nitrogen (TN), total phosphorus (TP), total organic carbon (TOC), and the carbon isotopic ratio (δ¹³C) were analyzed in sediment cores to reconstruct the effects of natural succession and human activities on the past lacustrine environmental conditions. A reliable chronology of the sediment core was established by using the ²¹⁰Pb dating technique, which indicated that the age span of the 70-cm sediment core is from the years 1871 to 2011. Above 31 cm depth in the core, TN, TP, TOC, C/N, and δ¹³C increased significantly, indicating that eutrophication has occurred since the 1980s. By combining the indicators of δ¹³C and C/N, it was shown that terrestrial and lacustrine components were the main sources of organic matter (OM) in the reservoir, which was mostly controlled by terrestrial C3 plants and algae. Since the 1980s, increased sewage discharge, fish aquaculture, fertilizer application, population, and economic strength have sped up the eutrophication process, and the eutrophication was further intensified in 2001.
Article
Lake Taihu in China has suffered serious harmful cyanobacterial blooms for decades. The algal blooms threaten ecological sustainability, drinking water safety, and human health. Although the roles of abiotic factors (such as water temperature and nutrient loading) in promoting Microcystis blooms have been well studied, the importance of biotic factors (e.g. the bacterial community) in promoting and mediating Microcystis blooms remains unclear. In this study, we investigated the ecological dynamics of the bacterial community, the ratio of toxic Microcystis, and microcystin in Lake Taihu. High-throughput 16S rRNA sequencing and principal component analysis (PCA) revealed that the bacterial community compositions (BCCs) clustered into three groups, the partitioning of which corresponded to that of groups according to the toxic profiles (the ratio of toxic Microcystis to total Microcystis, and the microcystin concentrations) of the samples. A further Spearman's correlation network showed that the α-proteobacteria Phenylobacterium correlated strongly and positively with the toxic profiles. Subsequent laboratory chemostat experiments demonstrated that three Phenylobacterium strains promoted the dominance of the toxic Microcystis aeruginosa PCC7806 when it was co-cultured with the non-toxic PCC7806 mcyB⁻ mutant. Taken together, our data suggest that the α-proteobacteria Phenylobacterium may play a vital role in the maintenance of toxic Microcystis dominance in Lake Taihu.
Article
The shifts among bloom-forming cyanobacteria have attracted increasing attention due to the reductions in nitrogen and phosphorus during the eutrophication mitigation process. However, knowledge is limited regarding the pattern and drivers of the shifts among these cyanobacterial genera. In this study, we performed a 7-year, monthly investigation in Lake Chaohu to analyze the interannual and seasonal shifts between Microcystis and Dolichospermum. Our results showed that Microcystis was the dominant cyanobacterium in the western lake region in summer, whereas Dolichospermum was dominant in the other regions and seasons. The Microcystis biomass and ratio were driven primarily by total phosphorus and temperature. The sensitivity of Dolichospermum to nutrients and temperature was relatively weak compared to that of Microcystis. The shifts between Microcystis and Dolichospermum might be led by Microcystis. If the temperature and phosphorus level were relatively high, then Microcystis grew rapidly and competitively excluded Dolichospermum. If the nutrient level, especially the phosphorus level, was low, then the exclusive power of Microcystis was weak, and Dolichospermum maintained its dominance, even in summer. A key temperature (~17 °C) determined the dominance of the two cyanobacteria: below it, Microcystis never dominated, while Dolichospermum was always dominant. Microcystis and Dolichospermum responded differently to the interaction of temperature, nitrogen and phosphorus. The Dolichospermum biomass was sensitive to variation in the nitrogen level, and the sensitivity depended on temperature, whereas the Microcystis biomass was sensitive to variation in the phosphorus level, and the sensitivity depended on temperature and total nitrogen. These different responses might contribute to the succession of the two cyanobacteria. Our findings will be helpful for improving the understanding of the shift process between Microcystis and Dolichospermum.
Article
Prediction of river and lake water temperature plays an important role in hydrology, ecology, and water resources planning and management. Recently, machine learning approaches have been widely used for modelling water temperature, and the results obtained vary depending on the kind of model and the selection of appropriate predictors. In the present paper, a new family of machine learning models is proposed and compared to the well-known air2stream model, using a large data set collected at 25 lakes in the northern part of Poland. The proposed models were: (i) the extremely randomized trees (ERT), (ii) the multivariate adaptive regression splines (MARS), (iii) the M5 Model tree (M5Tree), (iv) the random forest (RF), and (v) the multilayer perceptron neural network (MLPNN). The models were developed using air temperature and the components of the Gregorian calendar (year, month and day number) as input variables. The results were evaluated using several statistical indices: the root mean square error (RMSE), the mean absolute error (MAE), the correlation coefficient (R) and the Nash-Sutcliffe efficiency coefficient (NSE). The results reveal that the air2stream model outperformed all the machine learning models and worked best, with high accuracy, at all 25 lakes; none of the ERT, MARS, M5Tree, RF and MLPNN models was able to improve the water temperature prediction compared to air2stream.
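The four evaluation indices named in this abstract (RMSE, MAE, R and NSE) have standard textbook definitions, and a minimal Python sketch may make them concrete. The function names and the toy temperature values below are illustrative assumptions; the abstract does not describe the authors' implementation.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def pearson_r(obs, pred):
    """Pearson correlation coefficient (R)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(ss_o * ss_p)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance.

    A value of 1 is a perfect fit; 0 means the model is no better than
    always predicting the observed mean.
    """
    mean_o = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical water temperatures in degrees Celsius (made-up values).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(rmse(observed, predicted), mae(observed, predicted))
print(pearson_r(observed, predicted), nse(observed, predicted))
```

RMSE and MAE are in the units of the predicted variable and are better when lower, whereas R and NSE are dimensionless and better when closer to 1.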
Article
Cyanobacteria harmful blooms (CyanoHABs) in lakes and reservoirs represent a major risk for water authorities globally due to their toxicity and economic impacts. Anticipating bloom occurrence and understanding the main drivers of CyanoHABs are needed to optimize water resources management. An extensive review of the application of CyanoHABs forecasting and predictive models was performed, and a summary of the current state of knowledge, limitations and research opportunities on this topic is provided through analysis of case studies. Two modelling approaches were used to anticipate CyanoHABs: process-based (PB) and data-driven (DD) models. The objective of the model was a determining factor in the choice of modelling approach. PB models were more frequently used to predict future scenarios, whereas DD models were employed for short-term forecasts. Each modelling approach presented multiple variations that may be applied for more specific, targeted purposes. Most models reviewed were site-specific. The monitoring methodologies, including data frequency, uncertainty and precision, were identified as a major limitation to improving model performance. A lack of standardization of both model output and performance metrics was observed. CyanoHAB modelling is an interdisciplinary topic, and communication between disciplines should be improved to facilitate model comparisons. These shortcomings can hinder the adoption of modelling tools by practitioners. We suggest that water managers should focus on generalising models for lakes with similar characteristics and, where possible, use high-frequency monitoring for model development and validation.
Article
The prediction of algal chlorophyll-a and water clarity in lentic ecosystems is a pressing issue due to rapid deterioration of drinking water quality and eutrophication processes. The key objectives of this study were to predict long-term algal chlorophyll-a and transparency (water clarity), measured as Secchi depth, in spatially heterogeneous and temporally dynamic reservoirs largely influenced by the Asian monsoon during 2000–2017, and then to determine the reservoir trophic state, using multiple linear regression (MLR), a support vector machine (SVM) and an artificial neural network (ANN). We tested the models to analyze the spatial patterns of the riverine zone (Rz), transitional zone (Tz) and lacustrine zone (Lz) and the temporal variations of the premonsoon, monsoon and postmonsoon periods. Monthly physicochemical parameters and precipitation data (2000–2017) were used to build the MLR, SVM and ANN models, which were then confirmed by cross-validation. The SVM model showed better predictive performance than the MLR and ANN models, both before and after validation: its root mean square error (RMSE) and mean absolute error (MAE) were lower, and its coefficient of determination was higher. The mean and maximum values of total suspended solids (TSS), nutrients (total nitrogen (TN) and total phosphorus (TP)), water temperature (WT), conductivity and algal chlorophyll (CHL-a) were higher in the riverine zone than in the transitional and lacustrine zones due to surface run-off from the watershed. During the premonsoon and postmonsoon periods, the average annual rainfall was 59.50 mm and 54.73 mm, respectively, whereas it was 236.66 mm during the monsoon period. From 2013 to 2017, the trophic state of the reservoir on the basis of CHL-a and SD ranged from mesotrophic to oligotrophic. Analysis of the importance of input variables indicated that WT, TP, TSS, TN, N:P ratios and rainfall directly influenced chlorophyll-a and transparency in the reservoir. These findings on algal chlorophyll-a and Secchi depth prediction may provide key clues for a better management strategy in the reservoir.
Article
Climate change is expected to impact the severity of harmful algal blooms in lakes and reservoirs through a number of mechanisms related to the influence of warming temperatures and changes to precipitation patterns. Evidence on the prevalence of individual mechanisms is lacking, however, with knowledge of many mechanisms restricted to studies of individual or small subsets of lakes. Here, we leverage over twelve hundred summertime lake observations from across the continental U.S. to explore evidence for the hypothesized risks from climate change attributable to specific mechanisms. Using a statistical model selection approach, we examine associations between temperature and precipitation variables and indicators of total phytoplankton abundance, species dominance, and toxicity. We find evidence in support of the hypotheses that summer temperatures drive total abundance, that the length of the summer drives cyanobacterial abundance, and that increased temperatures may reduce the observed toxicity of blooms in some cases. We find that nutrient concentrations are also likely to be impacted by lake warming, as increased temperatures are robustly associated with increased total phosphorus concentrations. Evidence for the impact of precipitation is mixed, however, as there is evidence to support that increased nutrient runoff from precipitation could support blooms but also that nutrient concentrations could be reduced through greater flushing due to precipitation. While statistical associations are not definitive evidence of formal mechanistic links, the geographic scale of the results is useful for identifying hypothesized mechanisms that are widespread across the continental U.S., and therefore for informing understanding of the influence of climate change.
Article
The study investigated the effects of cyanobacteria toxins such as microcystins in water sources and water stored in containers during the blooming and decaying seasons. Samples from water sources and containers near the Hartbeespoort Dam in South Africa were analysed using a microcystin ELISA test kit. Microcystins were present in water sources used by the community, with an average of 4.3 μg/L in communal tap water and 4.8 μg/L in the water stored in tanks. The concentration of microcystins was lower in groundwater in the decaying season (0.38 μg/L) than in the blooming season (1.4 μg/L). Although microcystins were present in the storage containers, the average levels in all water samples were below the acceptable limit of 1 μg/L. The present study confirmed the presence of microcystins in the water storage containers. Therefore, it is suggested that water used for drinking from community water sources should be treated before storage to eliminate microcystins.
Article
Marine and freshwater ecosystems are warming, acidifying, and deoxygenating as a consequence of climate change. In parallel, the impacts of harmful algal blooms (HABs) on these ecosystems are intensifying. Many eutrophic habitats that host recurring HABs already experience thermal extremes, low dissolved oxygen, and low pH, making these locations potential sentinel sites for conditions that will become more common in larger-scale systems as climate change accelerates. While studies of the effects of HABs or individual climate change stressors on aquatic organisms have been relatively common, studies assessing their combined impacts have been rare. Those doing so have reported strong species- and strain-specific interactions between HAB species and climate change co-stressors yielding outcomes for aquatic organisms that could not have been predicted based on investigations of these factors individually. This review provides an ecological and physiological framework for considering HABs as a climate change co-stressor and considers the consequences of their combined occurrence for coastal ecosystems. This review also highlights critical gaps in our understanding of HABs as a climate change co-stressor that must be addressed in order to develop management plans that adequately protect fisheries, aquaculture, aquatic ecosystems, and human health. Ultimately, incorporating HAB species into experiments and monitoring programs where the effects of multiple climate change stressors are considered will provide a more ecologically relevant perspective of the structure and function of marine ecosystems in future, climate-altered systems.
Article
Water temperature regulates many processes in lakes; therefore, evaluating it is essential to understand a lake's ecological status and functioning, and to comprehend the impact of climate change. Although a few studies have assessed the accuracy of individual sensors in estimating lake-surface-water temperature (LSWT), comparative analysis considering different sensors is still needed. This study evaluated the performance of two thermal sensors, MODIS and Landsat 7 ETM+, and used Landsat methods to estimate the SWT of a large subtropical lake. MODIS products MOD11 LST and MOD28 SST were used for comparison. For the Landsat images, the radiative transfer equation (RTE), using NASA's Atmospheric Correction Parameter Calculator (AtmCorr) parameters, was compared with the single-channel algorithm in different approaches. Our results showed that MOD11 obtained the highest accuracy (RMSE of 1.05 °C), and is the recommended product for LSWT studies. For Landsat-derived SWT, AtmCorr obtained the highest accuracy (RMSE of 1.07 °C) and is the recommended method for small lakes. Sensitivity analysis showed that Landsat-derived LSWT using the RTE is very sensitive to atmospheric parameters and emissivity. A discussion of the main error sources was conducted. We recommend that similar tests be applied to Landsat imagery on different lakes, along with further studies on algorithms to correct the cool-skin effect in inland waters and tests of different emissivity values to verify whether they can compensate for this effect, in an effort to improve the accuracy of these estimates.
Article
Kim D, Kim Y, Kim B. Simulation of eutrophication in a reservoir by CE-QUAL-W2 for the evaluation of the importance of point sources and summer monsoon. Lake Reserv Manage. 35:64–76. Water quality was modeled by a 2-dimensional model (CE-QUAL-W2) in a reservoir (Lake Uiam, Korea) receiving nutrients from both nonpoint sources and point sources. Due to the summer monsoon climate and the aggregated seasonal precipitation pattern, the phosphorus export from nonpoint sources is heavily concentrated in the rainy season. For several decades, a sewage treatment plant (STP) on the shore of the lake has been releasing effluent, which was suspected to be the major cause of eutrophication. However, the total annual phosphorus loading from the STP was smaller than the loadings from nonpoint sources, which aroused skepticism about the effectiveness of an additional chemical P-removal process in the STP as a eutrophication control strategy. The scenario simulations in this study showed that reducing P in the STP effluent from 0.9 mg/L to 0.1 mg/L would effectively reduce the chlorophyll a concentration in the lake by 62%. Based on the simulation results, the addition of a chemical P-removal process was suggested to the municipal government, and the process was installed in 2012. After installation, the chlorophyll a concentration in the lake decreased as predicted in the simulation. Phosphorus loading can have quite different effects on phytoplankton growth depending on the runoff pattern and hydrological characteristics of the receiving water bodies. Flow rate and nutrient loadings are very dynamic, far from a steady state, in the summer monsoon regions, which may be a unique limnological feature of East Asian countries.
Article
In this study, ensemble models using the Bates-Granger approach and least square method are developed to combine forecasts of multi-wavelet artificial neural network (ANN) models. This study originally aimed to investigate the proposed models for forecasting chlorophyll a concentration. However, the modeling procedure was repeated for water salinity forecasting to evaluate the generality of the approach. The ensemble models are employed for forecasting purposes in Hilo Bay, Hawaii. Moreover, the efficacy of the forecasting models for up to three days in advance is investigated. To predict chlorophyll a and salinity with different lead times, the previous daily time series up to three lags are decomposed via different wavelet functions and used as input parameters of the models. Further, outputs of the different wavelet-ANN models are combined using the least square boosting ensemble and Bates-Granger techniques to achieve more accurate and more reliable forecasts. To examine the efficiency and reliability of the proposed models for different lead times, uncertainty analysis is conducted for the best single wavelet-ANN and ensemble models as well. The results indicate that accurate forecasts of water temperature and salinity up to three days ahead can be achieved using the ensemble models. As the time horizon increases, the reliability and accuracy of the models decrease. Ensemble models are found to be superior to the best single models for both forecasting variables and for all three lead times. The results of this study are promising with respect to multi-step forecasting of water quality parameters such as chlorophyll a and salinity, important indicators of ecosystem status in coastal and ocean regions.
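To illustrate the kind of forecast combination this abstract refers to, the sketch below implements the commonly used simplified Bates-Granger scheme, in which each model's weight is proportional to the inverse of its mean squared error on past data and correlations between model errors are ignored. The abstract does not state which variant the authors used, so this is an assumption; the function names and numbers are hypothetical.

```python
def bates_granger_weights(actual, forecasts):
    """Return one weight per model, proportional to 1 / MSE on past data.

    actual    -- list of observed values
    forecasts -- list of forecast series, one list per model, aligned with actual
    """
    inverse_mse = []
    for series in forecasts:
        mse = sum((a - f) ** 2 for a, f in zip(actual, series)) / len(actual)
        if mse == 0:
            raise ValueError("a model with zero past error needs no combination")
        inverse_mse.append(1.0 / mse)
    total = sum(inverse_mse)
    # Normalise so the weights sum to 1.
    return [value / total for value in inverse_mse]


def combine(model_forecasts, weights):
    """Weighted average of the individual model forecasts for one time step."""
    return sum(w * f for w, f in zip(weights, model_forecasts))


# Hypothetical past observations and two models' past forecasts.
past_actual = [1.0, 2.0, 3.0, 4.0]
model_a = [2.0, 3.0, 4.0, 5.0]  # always off by 1, so MSE = 1
model_b = [3.0, 4.0, 5.0, 6.0]  # always off by 2, so MSE = 4

weights = bates_granger_weights(past_actual, [model_a, model_b])
print(weights)                        # the more accurate model gets more weight
print(combine([5.0, 10.0], weights))  # combined forecast for a new time step
```

Because the weights depend only on past errors, they can be recomputed on a rolling window as new observations arrive, which is one reason this scheme is popular for combining forecasts.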
Article
As a representative index of algal blooms, the concentration of chlorophyll-a (Chl-a) is a key parameter of concern for environmental managers. The relationships between environmental variables and Chl-a are complex and difficult to establish. Two machine learning methods, support vector machine for regression (SVR) and random forest (RF), were used in this study to predict Chl-a concentration based on multiple variables. To improve the model accuracy and reduce the number of inputs, two feature selection methods, the minimum redundancy and maximum relevance method (mRMR) and RF, were integrated with the regression models. The results showed that the RF model had a higher predictive ability than the SVR model. Furthermore, its lower computational cost and the absence of any need for prior data transformation also indicated a better applicability of the RF model. The comparison between the mRMR-RF and RF-RF ensemble models showed that RF-RF yielded better performance with fewer variables. Seven variables selected from the candidate predictors could explain most of the information, and their potential implications for Chl-a were discussed based on their level of importance. Overall, the RF-RF ensemble model can be considered a useful approach to determine the significant stressors and achieve satisfactory prediction of Chl-a concentration.
Article
Microcystis and Anabaena (Dolichospermum) are among the most toxic cyanobacterial genera and often succeed each other during harmful algal blooms. The role allelopathy plays in the succession of these genera is not fully understood. The allelopathic interactions of six strains of Microcystis and Anabaena under different nutrient conditions in co-culture and in culture-filtrate experiments were investigated. Microcystis strains significantly reduced the growth of Anabaena strains in mixed cultures with direct cell-to-cell contact and high nutrient levels. Cell-free filtrate from Microcystis cultures proved equally potent in suppressing the growth of nutrient replete Anabaena cultures while also significantly reducing anatoxin-a production. Allelopathic interactions between Microcystis and Anabaena were, however, partly dependent on ambient nutrient levels. Anabaena dominated under low N conditions and Microcystis dominated under nutrient replete and low P during which allelochemicals caused the complete suppression of nitrogen fixation by Anabaena and stimulated glutathione S-transferase activity. The microcystin content of Microcystis was lowered with decreasing N and the presence of Anabaena decreased it further under low P and high nutrient conditions. Collectively, these results indicate that strong allelopathic interactions between Microcystis and Anabaena are closely intertwined with the availability of nutrients and that allelopathy may contribute to the succession, nitrogen availability, and toxicity of cyanobacterial blooms.
Article
Lake Taihu is a large shallow eutrophic lake with frequently recurring cyanobacterial blooms whose distribution is highly variable in space and time. Based on field observations and remote sensing monitoring of cyanobacterial bloom occurrence, in conjunction with controlled laboratory experiments on the effects of mixing on large colony formation and measurements of the upward moving velocity of colonies, it is found that small or moderate wind-induced disturbance increases colony size and enables colonies to more easily overcome mixing and float rapidly to the water surface after the disturbance. The proposed mechanism of wind-induced mixing on cyanobacterial colony enlargement is associated with the presence of extracellular polysaccharide (EPS), which increases the size and buoyancy of cyanobacterial colonies and promotes their aggregation at the water surface to form blooms. Both the vertical movement and horizontal migration of cyanobacterial colonies were controlled by wind-induced hydrodynamics. The high variation of wind and current, coupled with large cyanobacterial colony formation, makes bloom occurrence highly mutable in space and time. This physical factor determining cyanobacterial bloom formation in the large shallow lake differs from the previously documented light-mediated bloom formation dynamics.
Article
Support vector machine (SVM) is a powerful machine learning technique relying on the structural risk minimization principle. SVM has been applied extensively in structural reliability analysis (SRA) in the recent past. Some review articles on machine learning-based methods partly discuss the development of SVM for SRA applications along with other machine learning methods. However, there is no dedicated review on SVM for SRA applications. Thus, a review article on the implementation of various SVM approaches for SRA applications will be useful. The present article provides a synthesis and roadmap to the growing and diverse literature, specifically the classification and regression-based support vector algorithms in SRA applications. In doing so, different advanced variants of SVM in SRA applications and hyperparameter tuning algorithms are also briefly discussed. Following the detailed review, future opportunities and challenges in this area of application are summarized. The review in general reveals that SVM is gaining momentum in SRA applications, as it has an excellent capability of handling high-dimensional problems using relatively little training data. The review article is expected to advance the state of the art of support vector algorithms for SRA applications.
Article
The increasing occurrence of harmful algae blooms globally poses significant challenges to water management. In water treatment utilities, coagulation is the first treatment process of the multi-barrier strategy designed to address algae-laden source water. Since the coagulation efficiency directly impacts all downstream treatment processes, it is critical to optimize coagulation conditions to remove algal cells to the extent possible without causing cell damage. Moreover, the importance of coagulation extends to source water management. Coagulation-based processes have demonstrated great potential as in-lake measures to mitigate eutrophication and control algal blooms. This review aims to serve as a holistic resource of coagulation-based techniques for algae removal. Studies focusing on the coagulation removal of algae in source management and treatment mitigation are critically reviewed with emphasis on the following aspects. Introductory sections present the common algae of interest to water management and outline the theoretical background of interfacial phenomena in the coagulation system. Commonly used experimental methods and emerging techniques in coagulation studies are summarized with representative results. In addition, inorganic and organic coagulant materials of synthetic or natural origins and their applications in algae removal are discussed in depth. The latest developments in enhanced algae removal, propelled by electrochemical processes and sonication, are presented with fundamental technical concepts. Furthermore, practical considerations for using coagulation to remove algae in both water treatment facilities and natural waterbodies are discussed in detail with results from pilot- and full-scale studies. Finally, the article concludes by identifying limitations and challenges associated with the current body of literature and proposing directions for future research.
Article
Many countries have attempted to monitor and predict harmful algal blooms to mitigate related problems and establish management practices. The current alert system-based sampling of cell density is used to intimate the bloom status and to inform rapid and adequate response from water-associated organizations. The objective of this study was to develop an early warning system for cyanobacterial blooms to allow for efficient decision making prior to the occurrence of algal blooms and to guide preemptive actions regarding management practices. In this study, two machine learning models: artificial neural network (ANN) and support vector machine (SVM), were constructed for the timely prediction of alert levels of algal bloom using eight years’ worth of meteorological, hydrodynamic, and water quality data in a reservoir where harmful cyanobacterial blooms frequently occur during summer. However, the proportion imbalance on all alert level data as the output variable leads to biased training of the data-driven model and degradation of model prediction performance. Therefore, the synthetic data generated by an adaptive synthetic (ADASYN) sampling method were used to resolve the imbalance of minority class data in the original data and to improve the prediction performance of the models. The results showed that the overall prediction performance yielded by the caution level (L1) and warning level (L2) in the models constructed using a combination of original and synthetic data was higher than the models constructed using original data only. In particular, the optimal ANN and SVM constructed using a combination of original and synthetic data during both training (including validation) and test generated distinctively improved recall and precision values of L1, which is a very critical alert level as it indicates a transition status from normalcy to bloom formation. 
In addition, both optimal models constructed using synthetic-added data exhibited improvement in recall and precision by more than 33.7% while predicting L1 and L2 during the test. Therefore, the application of synthetic data can improve the detection performance of machine learning models by resolving the imbalance of observed data. Reliable prediction by the improved models can be used to aid the design of management practices to mitigate algal blooms within a reservoir.
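The oversampling idea described in this abstract can be sketched in a few lines. The example below is a simplified, SMOTE-style interpolation between minority-class neighbours, not the ADASYN procedure used in the study: ADASYN additionally decides how many synthetic points to create around each minority sample according to how many majority-class neighbours it has. The function name, the toy two-feature samples, the assumed majority-class size, and the choice of k are all hypothetical.

```python
import random


def oversample_minority(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical (water temperature in deg C, chlorophyll-a in ug/L) samples
# for a rare alert level; the values are illustrative only.
warning_level = [(27.0, 35.0), (28.5, 41.0), (26.0, 30.0), (29.0, 44.0)]
n_majority = 10  # assumed size of the majority class
new_points = oversample_minority(warning_level, n_majority - len(warning_level))
balanced_minority = warning_level + new_points
print(len(balanced_minority))
```

Because every synthetic point lies on a segment between two observed minority samples, the augmented class stays inside the range of the original observations; a classifier such as an ANN or SVM would then be trained on the combined original and synthetic data.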
Article
Cyanobacterial blooms have become an urgent threat to aquatic ecosystems, but early warning of blooms remains challenging for the research community. In this paper, a method based on polarized light scattering and powered by machine learning is proposed for in-situ early warning of cyanobacterial blooms. Wild-type Microcystis cells are treated and individually measured to obtain their polarization parameters. The experimental results show that machine learning algorithms can reliably identify the states of the Microcystis cells and that the compositions of mixed samples can be effectively retrieved by this method. Subsequently, an application strategy for early warning of blooms is suggested, which shows potential for in-situ early warning of cyanobacterial blooms in the future.
Article
Reservoirs are an important type of drinking water source for megacities, yet many reservoirs are threatened by odor problems during certain seasons. The factors influencing odor compounds in reservoirs are still unclear. During August 2019, a nationwide survey investigating the distribution of odor compounds in reservoirs used as drinking water sources was conducted on seven reservoirs. 2-methylisoborneol (2-MIB) and geosmin were detected in almost every reservoir, and some odor compound concentrations even exceeded the odor threshold concentration. The average concentration of 2-MIB was 2.68 ng/L, and that of geosmin was 3.63 ng/L. The average chlorophyll a concentration was 8.25 μg/L. The dominant genera of phytoplankton in these reservoirs belonged to cyanobacteria and diatoms. Statistical analysis showed that odor compound concentration was significantly related to the chlorophyll a concentration and indicated that the odor compounds mainly came from phytoplankton. The concentration of odor compounds in the euphotic zone was significantly related to phytoplankton species and biomass. Therefore, the odor compound concentrations in the subsurface chlorophyll maxima layer were generally higher than in the surface layer. However, the odor compounds in the hypolimnion layer were related to the density current. This research suggests that both phytoplankton proliferation events and heavy storm events are important risk factors increasing odor compounds in reservoirs. Control of algal blooms, in-situ profile monitoring systems, and depth-adjustable pumping systems will greatly reduce the risk of odor problems in reservoirs used as water supplies for large cities.
Article
Understanding the dynamics of harmful algal blooms is important to protect the aquatic ecosystem in regulated rivers and secure human health. In this study, artificial neural network (ANN) and support vector machine (SVM) models were used to predict algae alert levels for the early warning of blooms in a freshwater reservoir. Intensive water-quality, hydrodynamic, and meteorological data were used to train and validate both ANN and SVM models. The Latin-hypercube one-factor-at-a-time (LH-OAT) method and a pattern search algorithm were applied to perform sensitivity analyses for the input variables and to optimize the parameters of the models, respectively. The results indicated that the two models reproduced the algae alert level well based on the time-lag input and output data. In particular, the ANN model showed a better performance than the SVM model, displaying a higher performance value in both training and validation steps. Furthermore, sampling frequencies of 6 and 7 days were determined to be efficient early-warning intervals for the freshwater reservoir. Therefore, this study presents an effective early-warning prediction method for algae alert level, which can improve the eutrophication management schemes for freshwater reservoirs.
Article
Understanding the climatic drivers of eutrophication is critical for lake management under the prism of the global change. Yet the complex interplay between climatic variables and lake processes makes prediction of phytoplankton biomass a rather difficult task. Quantifying the relative influence of climate-related variables on the regulation of phytoplankton biomass requires modelling approaches that use extensive field measurements paired with accurate meteorological observations. In this study we used climate and lake related variables obtained from the ERA5-Land reanalysis dataset combined with a large dataset of in-situ measurements of chlorophyll-a and phytoplankton biomass from 50 water bodies to develop models of phytoplankton related responses as functions of the climate reanalysis data. We used chlorophyll-a and phytoplankton biomass as response metrics of phytoplankton growth and we employed two different modelling techniques, boosted regression trees (BRT) and generalized additive models for location scale and shape (GAMLSS). According to our results, the fitted models had a relatively high explanatory power and predictive performance. Boosted regression trees had a high pseudo R² with the type of the lake, the total layer temperature, and the mix-layer depth being the three predictors with the higher relative influence. The best GAMLSS model retained mix-layer depth, mix-layer temperature, total layer temperature, total runoff and 10-m wind speed as significant predictors (p<0.001). Regarding the phytoplankton biomass both modelling approaches had less explanatory power than those for chlorophyll-a. Concerning the predictive performance of the models both the BRT and GAMLSS models for chlorophyll-a outperformed those for phytoplankton biomass. Overall, we consider these findings promising for future limnological studies as they bring forth new perspectives in modelling ecosystem responses to a wide range of climate and lake variables. 
As a concluding remark, climate reanalysis can be an extremely useful asset for lake research and management.
Article
Light availability is an important driver of algal growth and of the formation of surface blooms. The formation of Microcystis surface scum decreases the transparency of the water column and influences the vertical distribution of light intensity. Only a few studies have analysed the interactions between the dynamics of surface blooms and the light distribution in the water column. In particular, the effect of light attenuation caused by Microcystis colonies (self-shading) on the formation of surface scum has not been explored. In the present study, we simulate the effect of variable cell concentration of Microcystis colonies on the vertical distribution of light in the water column based on experimental estimates of the extinction coefficient of Microcystis colonies. The laboratory observations indicated that higher cell concentrations of Microcystis enhance light attenuation in the water column and promote surface scum formation. We extended an existing model for the light-driven migration of Microcystis by introducing the effect of self-shading and simulated the dynamics of vertical migration for different cell concentrations and different colonial morphologies. The simulation results show that high cell concentrations of Microcystis promote surface scum formation, as well as its persistence throughout diel photoperiods. Large and tight Microcystis colonies facilitate scum formation, while small and loose colonies increase scum stability and persistence. This study reveals a positive feedback regulation of Microcystis surface scum formation and stability by self-shading and provides novel insights into the underlying mechanisms.
Article
During the past three decades, harmful algal bloom (HAB) events have been frequently observed in marine waters around many coastal cities in the world, including Hong Kong. The increasing occurrence of HABs has caused acute damage to the water environment and marine aquaculture, with monetary losses in the millions. For example, Tolo Harbour is one of the most affected areas in Hong Kong, where more than 30% of HABs occurred. To forewarn of potential HAB incidents, machine learning (ML) methods have been increasingly used in modelling and forecasting water quality issues. In this study, two different ML methods – artificial neural networks (ANN) and support vector machine (SVM) – are implemented and improved by introducing different hybrid learning algorithms for the simulation and comparative analysis of more than 30 years of measured data, so as to accurately forecast algal growth and eutrophication in Tolo Harbour in Hong Kong. The application results show the good applicability and accuracy of these two ML methods for predicting both the trend and magnitude of algal growth. Specifically, the results reveal that ANN is preferable for achieving satisfactory results with quick response, while SVM is suitable for accurately identifying the optimal model but takes longer to train. Moreover, it is demonstrated that the ML methods used are robust in learning the complicated relationship between algal dynamics and different coastal environmental variables and can thereby identify significant variables accurately. The analysis and discussion of the results also indicate the potential and advantages of the applied ML models in providing useful information for understanding the mechanism and process of HAB outbreak and evolution, which helps improve water quality prediction for coastal hydro-environment management.
Article
Nutrient leakage due to modern agriculture and disposal of untreated urban wastewater results in the eutrophication of freshwater lakes and reservoirs, with elevated levels of cyanobacteria algal blooms often resulting in toxic conditions for animals and humans. A unique dataset of 22 coincident nutrient, phytoplankton, meteorological, and reservoir condition variables were recorded or calculated monthly for Hoover Reservoir near Columbus, Ohio, from February 1999 to December 2005. Network science was used in this study to visualize and differentiate selected nutrient and phytoplankton seasonality in Hoover Reservoir. Nutrient and phytoplankton concentrations in Hoover Reservoir respond independently to a number of external and internal environmental variables, but are also biochemically interdependent within the reservoir ecosystem. A number of forcing parameters can significantly alter seasonal phytoplankton growth and succession patterns. Chief among these are variability in the daily, seasonal, and/or yearly patterns and intensities of precipitation and the resultant surface runoff; duration and intensity of solar radiation; air and resultant water temperatures; and agricultural practices. The phytoplankton population in Hoover Reservoir is ecologically driven, with algal succession and nutrient levels constraining population concentrations. Complemented by selected statistical analyses, this limited network community detection study provides the hydrologist/ecologist with a dynamic and quantitative visualization of the complex hydrometeorological variable and biogeochemical process forcing parameters present in reservoirs and lakes.
Article
In aquatic ecosystems, anthropogenic activities disrupt nutrient fluxes, thereby promoting harmful algal blooms that could directly impact economies and human health. Within this framework, the forecasting of the proxy of chlorophyll a in coastal areas is the first step to managing these algal blooms. The primary goal was to analyze how phytoplankton bloom forecasts are impacted by different sampling frequencies, by using a machine learning model. The database used in this study was sourced from an automated system located in the English Channel. This device has a sampling frequency of 20 min. We considered 12 physicochemical parameters over a six-year period. Our forecast methodology is based on the random forest (RF) model and a sliding window strategy. The lag times for these sliding windows ranged from 12 h to 3 months with four different sampling times until 1 day. The results indicate that the optimal forecast was obtained for a 20 min time step, with an average R² of 0.62. Moreover, the highest values of fluorescence were predicted when the water temperature was approximately 11.8 °C. Consequently, we demonstrated that the sampling frequency directly impacts the forecast performance of an RF model. Furthermore, this kind of model can recreate interactions that closely resemble biological processes. Our study suggests that the RF model can utilize the additional information contained in high-frequency datasets. The methodology presented here lays the foundation for the development of a numerical decision-making tool that could help mitigate the impact of these algal blooms.
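The sliding window strategy mentioned in this abstract can be illustrated with a short sketch. The code below only builds the lagged input/target pairs that a regressor such as a random forest would be trained on; it does not fit a model. The function name, the window and horizon values, and the toy fluorescence series are hypothetical and are not taken from the study.

```python
def make_windows(series, window, horizon):
    """Pair each run of `window` consecutive values with the value
    `horizon` steps after the end of that run."""
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        features = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((features, target))
    return pairs


# Illustrative fluorescence readings at a fixed time step (made-up values).
fluorescence = [0.8, 0.9, 1.1, 1.4, 1.9, 2.3, 2.0, 1.6, 1.2, 1.0]
pairs = make_windows(fluorescence, window=3, horizon=2)
for features, target in pairs:
    print(features, "->", target)
```

Coarsening the sampling frequency amounts to subsampling the series before windowing, so fewer training pairs are available and short-lived changes between samples are lost, which is one plausible reason a high-frequency series can support better forecasts.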
Article
Harmful algal blooms are among the emerging threats to freshwater biodiversity that need to be studied further in the Anthropocene. Here, we studied freshwater plankton communities in ten tropical reservoirs to record the impact of algal blooms, comprising different phytoplankton taxa, on water quality, plankton biodiversity, and ecosystem functioning. We compared water quality parameters (water transparency, mixing depth, pH, electrical conductivity, dissolved inorganic nitrogen, total dissolved phosphorus, total phosphorus, chlorophyll-a, and trophic state), plankton structure (composition and biomass), biodiversity (species richness, diversity, and evenness), and ecosystem functioning (phytoplankton:phosphorus and zooplankton:phytoplankton ratios as a metric of resource use efficiency) through univariate and multivariate analysis of variance, and generalized additive mixed models in five different bloom categories. Most of the bloom events were composed of Cyanobacteria, followed by Dinophyta and Chlorophyta. Mixed blooms were composed of Cyanobacteria plus Bacillariophyta, Chlorophyta, and/or Dinophyta, while non-bloom communities presented phytoplankton biomass below the threshold for bloom development (10 mg L⁻¹). Higher phytoplankton biomasses were recorded during Cyanobacteria blooms (15.87-273.82 mg L⁻¹) followed by Dinophyta blooms (18.86-196.41 mg L⁻¹). An intense deterioration of water quality, including higher pH, eutrophication, stratification, and lower water transparency, was verified during Cyanobacteria and mixed blooms, while Chlorophyta and Dinophyta blooms presented lower pH, eutrophication, stratification, and higher water transparency. All bloom categories significantly impacted phytoplankton and zooplankton structure, changing the composition and dominance patterns. Bloom intensity positively influenced phytoplankton resource use efficiency (R² = 0.25; p < 0.001), while decreasing zooplankton resource acquisition (R² = 0.51; p < 0.001).
Moreover, Cyanobacteria and Chlorophyta blooms negatively impacted zooplankton species richness, while Dinophyta blooms decreased phytoplankton richness. In general, Cyanobacteria blooms presented low water quality and major threats to plankton biodiversity, and ecosystem functioning. Moreover, we demonstrated that biodiversity losses decrease ecosystem functioning, with cascading effects on plankton dynamics.
Article
Taste and odor (T&O) are an important issue in drinking water, aquaculture, recreation and a few other associated industries, and cyanobacteria-relevant geosmin and 2-methylisoborneol (2-MIB) are the two most commonly detected T&O compounds worldwide. A rise in cyanobacterial blooms and associated geosmin/2-MIB episodes due to anthropogenic activities as well as climate change has led to global concerns for drinking water quality. The increasing awareness of the need for safe drinking, aquaculture, and recreational water systems has boosted the demand for rapid, robust, on-site early detection and monitoring systems for cyanobacterial geosmin/2-MIB events. In past years, research has indicated quantitative PCR (qPCR) as one of the promising tools for detection of geosmin/2-MIB episodes. It offers the advantages of detecting the source organism even at very low concentrations, distinguishing odor-producing cyanobacterial strains from non-producers, and evaluating the odor-producing potential of the cyanobacteria at much faster rates than conventional techniques. The present review aims at examining the current status of developed qPCR primers and probes in identifying and detecting cyanobacterial blooms along with geosmin/2-MIB events. Among the more than 100 articles about cyanobacteria-associated geosmin/2-MIB in drinking water systems published after 1990, limited reports (approx. 10 each for geosmin and 2-MIB) focused on qPCR detection and its application in the field. Based on the review of literature, a comprehensive open access global cyanobacterial geosmin/2-MIB events database (CyanoGM Explorer) is curated. It acts as a single platform to access updated information related to origin and geographical distribution of geosmin/2-MIB events, cyanobacterial producers, frequency, and techniques associated with the monitoring of the events.
Although a total of 132 cyanobacterial strains from 21 genera and 72 cyanobacterial strains from 13 genera have been reported for geosmin and 2-MIB production, respectively, only 58 geosmin and 28 2-MIB synthesis regions have been assembled in the NCBI database. Based on the identity, geosmin sequences were found to be more diverse in the geosmin synthase conserved/primer design region, compared to 2-MIB synthesis region, hindering the design of universal primers/probes. Emerging technologies such as the bioelectronic nose, Surface Enhanced Raman Scattering (SERS), and nanopore sequencing are discussed for future applications in early on-site detection of geosmin/2-MIB and producers. In the end, the paper also highlights various challenges in applying qPCR as a universal system of monitoring and development of response system for geosmin/2-MIB episodes.
Article
In this paper, we applied support vector regression to predict the number of COVID-19 cases for the 12 most-affected countries, testing for different structures of nonlinearity using Kernel functions and analyzing the sensitivity of the models’ predictive performance to different hyperparameter settings using 3-D interpolated surfaces. In our experiment, the model that incorporates the highest degree of nonlinearity (Gaussian Kernel) had the best in-sample performance, but also yielded the worst out-of-sample predictions, a typical example of overfitting in a machine learning model. On the other hand, the linear Kernel function performed badly in-sample but generated the best out-of-sample forecasts. The findings of this paper provide an empirical assessment of fundamental concepts in data analysis and evidence the need for caution when applying machine learning models to support real-world decision making, notably with respect to the challenges arising from the COVID-19 pandemic.
Article
Wind strongly impacts the hydrodynamic and biogeochemical processes of large shallow lakes; wind stress therefore also plays an important role in modeling the hydrodynamics and water quality of shallow lakes. In large shallow lakes, it may be necessary to modify the empirical wind-drag coefficient formula derived from ocean surface experiments because lake current velocities may be seriously underestimated in inland waters. To resolve this limitation, we added a wind-drag multiplier (α) to the wind drag formula in a lake hydrodynamic model. We used the Environmental Fluid Dynamics Code (EFDC) to model the hydrodynamics of Upper Klamath Lake (UKL), Oregon. The moment-independent method for global sensitivity analysis (GSA) based on sampling of input parameters was utilized. We found that the original model underestimated lake current velocities when compared to field observations, so we developed a modified model with a wind-drag multiplier. This model was calibrated to observed data from June 21–September 12, 2005, and verified with data from May 24–September 25, 2006. The results showed the calibrated modified model resolved the underestimation problem, e.g., at three sites in UKL the water velocity increased by 59–85%, and the relative error for the model decreased by 15–32%. Sensitivity analysis showed the modeled current velocities were more sensitive to the α coefficient than to the bottom roughness height z0 and the coefficients in the original wind-drag formula. We believe the wind-drag multiplier affects wave propagation in the model and reconciles the mismatch between large shallow lake and open ocean conditions. Our results show that a relatively simple modification can alleviate the fundamental mismatch between modeling the hydrodynamics of the open ocean and large shallow lakes.
Article
Central-southern Chile is characterized by a series of large lakes that originate in the Andes Mountains. This region is facing increasing anthropogenic impact, which threatens the oligotrophic status of these lakes. While monitoring programs are often based on a limited spatial and temporal coverage, remote sensing offers promising tools for large-scale observations, improving our capacity to comprehensively study indicators of lake properties. Seasonal trends (long-term means) and intra-lake variation of surface water temperature (SWT), turbidity and chlorophyll a in Lake Panguipulli were studied through satellite imagery from Landsat 5 TM, 7 ETM+ and 8 OLI (1998–2018; SWT, turbidity), and Sentinel-2A/B MSI (2016–2017; chlorophyll). Remotely sensed data were validated against in situ data from a monitoring database. Satellite-derived SWT (representing the surface skin layer of water, so-called skin temperature) showed good similarity with in situ (bulk) temperature (RRMSD 0.17, R² = 0.86), although it was somewhat lower (RMSD of 2.77 °C; MBD of −2.10 °C). Seasonal long-term means of turbidity from satellite imagery corresponded to those from in situ data, while satellite-derived predictions (based on the OC2v2 algorithm) overestimated chlorophyll a levels slightly in summer-spring. SWT ranged from 8.0 °C in winter to 17.5 °C in summer. Mean turbidity (1.6 FNU) and chlorophyll a (1.1 μg L⁻¹) levels were at their lowest in summer. Spatial and seasonal patterns reflected the bathymetry and previously described mixing patterns of this monomictic lake: warming of shallow bays in spring extended to a wider area along with the summer stratification period, while mixing of the water column was reflected in spatially more homogeneous SWT in fall-winter. Spatial heterogeneity in summer was confirmed by a clear separation of different lake areas based on SWT, turbidity and chlorophyll a using a 3-D plot.
Mapping of spatial and seasonal variation using satellite imagery allowed identifying lake areas with different characteristics, improving strategies for water resource management.
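The validation statistics quoted in this abstract (RMSD and MBD) follow standard definitions, which the sketch below computes for satellite-derived versus in situ temperatures. The temperature values are made up for illustration and are not data from the study; the function and variable names are hypothetical. RRMSD is left out because its normalisation is not stated in the abstract.

```python
import math


def mean_bias_deviation(estimated, observed):
    """Mean of (estimated - observed); negative means underestimation."""
    diffs = [e - o for e, o in zip(estimated, observed)]
    return sum(diffs) / len(diffs)


def root_mean_square_deviation(estimated, observed):
    """Square root of the mean squared difference."""
    squared = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative skin (satellite) and bulk (in situ) temperatures in deg C.
satellite_swt = [10.0, 12.0, 15.0]
in_situ_swt = [12.0, 14.0, 18.0]

mbd = mean_bias_deviation(satellite_swt, in_situ_swt)
rmsd = root_mean_square_deviation(satellite_swt, in_situ_swt)
print(f"MBD = {mbd:.2f} deg C, RMSD = {rmsd:.2f} deg C")
```

A negative MBD alongside an RMSD of similar magnitude, as in the values reported above, indicates that most of the disagreement is a systematic offset rather than random scatter.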
Article
Cyanobacterial blooms (CBs) in eutrophic lakes can cause various harmful issues to both humans and animals, disturb drinking water supply, and devastate lake ecosystems. Although great progress has been made in many lakes in China and abroad on CBs prevention, mitigation and control, systematic research on the influencing factors of CBs in hypereutrophic plateau Lake Dianchi over a long time span is so far unavailable. This study comprehensively generalized both meteorological and water quality changes in Lake Dianchi during 1990–2015 on both yearly and monthly bases, separated Caohai from Waihai of Lake Dianchi regarding water quality variations, and investigated the individual and joint influence of meteorological and water quality factors on CBs using Spearman correlation, principal component analysis, and multivariate linear stepwise regression. Four specific lake regions, i.e. Caohai, northern Waihai, central Waihai, and southern Waihai, were analyzed separately due to significant water quality heterogeneity. Results indicated that mild temperatures, low wind velocities, and hypereutrophic water conditions all favor CBs in Lake Dianchi, and the significant rising trend in temperature may exacerbate CBs in the future. Despite configuration differences, the first principal components on CBs in the four sub-regions of Lake Dianchi all consisted of meteorological factors, while water quality parameters, especially total phosphorus concentrations, contributed to the second principal component. Quantification of the joint influence of meteorological and water quality factors on CBs needs further improvement, and largely relies on the accuracy of future weather forecasts, in order to set the goal of water quality improvement in each specific lake region for effective CBs management.
Article
We propose a new method for estimation in linear models. The ‘lasso’ minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree‐based models are briefly described.
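The property that the lasso sets some coefficients exactly to zero can be seen in a small worked example. The sketch below solves the penalised (Lagrangian) form of the problem, which is equivalent to the constrained form described above for a suitable penalty, using cyclic coordinate descent with soft-thresholding. This solver is a common textbook choice and is not the algorithm of the original paper; the function names, the toy data, and the penalty value are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n) * RSS + lam * sum(|beta_j|) by cyclic coordinate descent.

    X is a list of rows with centred columns; y is a centred response.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance of column j with the partial residual
            norm = 0.0  # squared length of column j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Toy design with two centred, orthogonal predictors. The response depends
# strongly on the first predictor and only weakly on the second.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0 * x1 + 0.1 * x2 for x1, x2 in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)
```

With this penalty the weak predictor is dropped entirely while the strong one is shrunk but retained, which is the behaviour that combines the interpretability of subset selection with the shrinkage of ridge regression.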
Article
The ability of Microcystis to form large colonies is a key trait that contributes to competition ability over other phytoplankton and facilitates the formation of surface scums in many freshwater systems. The effect of temperature and nutrients on this trait, however, is far from clear and needs further investigation, especially under a warmer climate and nutrient overloading in aquatic systems globally. In this study, two colonial strains of Microcystis (M. wesenbergii and M. ichthyoblabe) originally isolated from Lake Taihu in China, were used to investigate cyanobacterial aggregation under a range of temperatures (15–30 °C), phosphorus availability (0.004–8 mg P L−1), and nitrogen availability (0.04–40 mg N L−1). The mechanism of colony formation in Microcystis was determined based on growth rates and extracellular polysaccharide (EPS) contents. The colony size of both strains increased significantly when the temperature rose from 15 to 25 °C. A further increase in temperature from 25 to 30 °C, however, reduced the colony size of M. ichthyoblabe significantly, and, in contrast, increased the colony size of M. wesenbergii. Higher phosphorus availability promoted the formation of larger colonies in both strains. In comparison, nitrogen had no significant effect on the colony size. Furthermore, although EPS was a significant contributor to the formation of large colonies in colonial Microcystis, growth rate was a dominant driving factor in this process. The findings of this study highlight that warmer temperatures and phosphorus enrichment might enhance surface Microcystis scums directly through increasing the colony size. This study also provides new insights into the mechanism of colony formation in Microcystis.
Article
We performed different consensus methods by combining binary classifiers, mostly machine learning classifiers, with the aim to test their capability as predictive tools for the presence–absence of marine phytoplankton species. The consensus methods were constructed by considering a combination of four methods (i.e., generalized linear models, random forests, boosting and support vector machines). Six different consensus methods were analyzed by taking into account six different ways of combining single-model predictions. Some of these methods are presented here for the first time. To evaluate the performance of the models, we considered eight phytoplankton species presence–absence data sets and data related to environmental variables. Some of the analyzed species are toxic, whereas others provoke water discoloration, which can cause alarm in the population. Besides the phytoplankton data sets, we tested the models on 10 well-known open access data sets. We evaluated the models' performances over a test sample. For most (72%) of the data sets, a consensus method was the method with the lowest classification error. In particular, a consensus method that weighted single-model predictions in accordance with single-model performances (weighted average prediction error — WA-PE model) was the one that presented the lowest classification error most of the time. For the phytoplankton species, the errors of the WA-PE model were between 10% for the species Akashiwo sanguinea and 38% for Dinophysis acuminata. This study provides novel approaches to improve the prediction accuracy in species distribution studies and, in particular, in those concerning marine phytoplankton species.
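The idea of weighting single-model predictions by single-model performance can be sketched as follows. The abstract does not give the exact weighting formula, so this example assumes weights proportional to one minus each model's classification error. The function name `weighted_consensus`, the 0.5 threshold, and all numbers are hypothetical.

```python
def weighted_consensus(probabilities, errors, threshold=0.5):
    """Combine per-model presence probabilities into one consensus call.

    probabilities: predicted probability of presence from each model
    errors: each model's classification error on held-out data (0..1)
    Assumed weighting: weight_i = 1 - error_i, normalized to sum to 1.
    """
    weights = [1.0 - e for e in errors]
    total = sum(weights)
    score = sum(w * p for w, p in zip(weights, probabilities)) / total
    return score, score >= threshold


# Made-up outputs for four models (e.g. GLM, random forest, boosting, SVM).
probs = [0.9, 0.2, 0.6, 0.4]
errs = [0.1, 0.4, 0.2, 0.3]

score, present = weighted_consensus(probs, errs)
print(round(score, 3), present)
```

Under this assumed scheme the more accurate models pull the consensus toward their own predictions, whereas an unweighted average would treat all four models equally.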
Article
More rain means more pollution. Nitrogen input from river runoff is a major cause of eutrophication in estuaries and coastal waters. This is a serious problem that is widely expected to intensify as climate change strengthens the hydrological cycle. To address the current lack of adequate analysis, Sinha et al. present estimates of riverine nitrogen loading for the continental United States, based on projections of precipitation derived from climate models (see the Perspective by Seitzinger and Phillips). Anticipated changes in precipitation patterns are forecast to cause large and robust increases in nitrogen fluxes by the end of the century. Science, this issue p. 405; see also p. 350
Article
Nutrient enrichment is a major cause of water eutrophication, and variations in nutrient enrichment are influenced by environmental changes and anthropogenic activities. Accurately estimating nutrient concentrations and understanding their relationships with environmental factors are vital to developing nutrient management strategies to mitigate eutrophication. Landsat 8 Operational Land Imager (OLI) data are used to estimate nutrient concentrations and analyze their responses to hydrological and meteorological conditions. Two well-accepted empirical models are developed and validated to estimate the total nitrogen (TN) and total phosphorus (TP) concentrations (CTN and CTP) in the Xin'anjiang Reservoir using Landsat 8 OLI data from 2013 to 2016. Spatially, CTN decreased from the transition zone to the riverine zone and the lacustrine zone. On the other hand, CTP decreased from the riverine zone to the transition zone and the lacustrine zone. Temporally, CTN displayed elevated values during the late fall and winter and had lower values during the summer and early fall, whereas CTP was higher during the spring and lower during the winter. Among the environmental factors, the rainfall and the inflow rate have strong positive correlations with the nutrient concentrations. TN is more sensitive to meteorological factors (wind speed, temperature, sunshine duration), and the spatial driving forces vary among the different sections of the reservoir. However, TP is more easily influenced by human activities, such as fishery and agricultural activities. The results improve our understanding of the drivers of spatiotemporal nutrient variability, and the approach in this study is applicable to other similar reservoirs for developing strategies to mitigate eutrophication.
Article
This study proposes a method for estimating phytoplankton cell counts associated with an algal bloom, using satellite images coincident with in situ and meteorological parameters. Satellite images from Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), Operational Land Imager (OLI) and HJ-1 A/B Charge Couple Device (CCD) sensors were integrated with the meteorological observations to provide an estimate of phytoplankton cell counts. All images were atmospherically corrected using the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) atmospheric correction method with a possible error of 1.2%, 2.6%, 1.4% and 2.3% for blue (450–520 nm), green (520–600 nm), red (630–690 nm) and near infrared (NIR 760–900 nm) wavelengths, respectively. Results showed that the developed Artificial Neural Network (ANN) model yields a correlation coefficient (R) of 0.95 with the in situ validation data with Sum of Squared Error (SSE) of 0.34 cell/ml, Mean Relative Error (MRE) of 0.154 cells/ml and a bias of −504.87. The integration of the meteorological parameters with remote sensing observations provided a promising estimation of the algal scum as compared to previous studies. The applicability of the ANN model was tested over Hong Kong as well as over Lake Kasumigaura, Japan and Lake Okeechobee, Florida USA, where algal blooms were also reported. Further, a 40-year (1975–2014) red tide occurrence map was developed and revealed that the eastern and southern waters of Hong Kong are more vulnerable to red tides. Over the 40 years, 66% of red tide incidents were associated with the Dinoflagellates group, while the remainder were associated with the Diatom group (14%) and several other minor groups (20%). The developed technology can be applied to other similar environments in an efficient and cost-saving manner.
Article
Harris TD, Graham JL. 2016. Predicting cyanobacterial abundance, microcystin, and geosmin in a eutrophic drinking-water reservoir using a 14-year dataset. Lake Reserv Manage. 00:00-00. Cyanobacterial blooms degrade water quality in drinking water supply reservoirs by producing toxic and taste-and-odor causing secondary metabolites, which ultimately cause public health concerns and lead to increased treatment costs for water utilities. There have been numerous attempts to create models that predict cyanobacteria and their secondary metabolites, most using linear models; however, linear models are limited by assumptions about the data and have had limited success as predictive tools. Thus, lake and reservoir managers need improved modeling techniques that can accurately predict large bloom events that have the highest impact on recreational activities and drinking-water treatment processes. In this study, we compared 12 unique linear and nonlinear regression modeling techniques to predict cyanobacterial abundance and the cyanobacterial secondary metabolites microcystin and geosmin using 14 years of physicochemical water quality data collected from Cheney Reservoir, Kansas. Support vector machine (SVM), random forest (RF), boosted tree (BT), and Cubist modeling techniques were the most predictive of the compared modeling approaches. SVM, RF, and BT modeling techniques were able to successfully predict cyanobacterial abundance, microcystin, and geosmin concentrations <60,000 cells/mL, 2.5 µg/L, and 20 ng/L, respectively. Only Cubist modeling predicted maximum concentrations of cyanobacteria and geosmin; no modeling technique was able to predict maximum microcystin concentrations. Because maximum concentrations are a primary concern for lake and reservoir managers, Cubist modeling may help predict the largest and most noxious concentrations of cyanobacteria and their secondary metabolites.
Article
The dinoflagellate Karlodinium and the diatom Pseudo-nitzschia are bloom-forming genera frequently present in Alfacs Bay. Both microalgae are associated with toxic events. Therefore, understanding their population dynamics and predicting their occurrence in the short term is crucial for optimal management of toxic events for local shellfish production and ecosystem managers. Artificial neural networks have been successfully used to model the complex nonlinear dynamics of phytoplankton. In this study, this approach was applied to predict absence-presence and abundance of Karlodinium and Pseudo-nitzschia microalgae in Alfacs Bay (NW Mediterranean) using biological and/or environmental variables. Neural Interpretation Diagram (NID) and Connection Weight Approach (CWA) methodologies were applied to obtain ecological information from the models. The dataset used was a long-term (1990–2015) time series of environmental and phytoplankton variables from different monitoring stations established in Alfacs Bay (Ebre Delta), meteorological data and Ebre River flow rates. Several models were presented. The best ones were achieved for one-week-ahead procedures performed with environmental and biological variables using all the available data. A sensitivity analysis showed that the larger the data set used, the better the models obtained. However, Karlodinium absence-presence models developed with five years of data present high accuracy. The size of the neural networks denotes complex relationships between environmental and phytoplankton variables. The environmental variables had stronger influence on the abundance models while biological variables had more importance in the absence-presence models. These results highlight a complex ecosystem in Alfacs Bay involving anthropogenic, climatic and hydrologic factors forcing phytoplankton dynamics. In addition, a change in the ecosystem dynamics regarding Karlodinium is detected. The configuration and the accuracy achieved with the models allow their use in different real-world applications such as automated systems and/or monitoring programs.
Article
The frequency and intensity of potentially toxic cyanobacterial blooms in water sources are increasing. Currently, the water industry relies on laboratory analysis of cyanobacteria that can take two to five days; there is therefore a need to improve response time. Online fluorometric probes (also called “fluorescence probes” in some publications) are available for the rapid detection of cyanobacteria cells via measurement of specific pigmentation; however, water quality interferences with probe measurements in natural environments hinder their wider application. This review aims to investigate the sources of interference and bias, and assess the applicability of these probes for measurement of water supplies. Reported laboratory and field validation of these probes showed that their readings were sufficiently accurate. Correction procedures have been investigated for the identified sources of interference but require field validation. Fluorometric probes can help with decision making during plant operation and have the potential to be applied as a management technique; however, probe users should be fully aware of the sources of interference when interpreting the in situ probe measurements.
Article
There is a strong interest in developing a capacity to predict the occurrence of cyanobacteria blooms in lakes and to identify the measures to be taken to reduce water quality problems associated with the occurrence of potentially harmful taxa. Here we conducted a weekly to bi-weekly monitoring program on five shallow eutrophic lakes during two years, with the aim of gathering data on total cyanobacterial abundance, as estimated from marker pigments determined by HPLC analysis of phytoplankton extracts. We also determined bloom composition and measured weather and limnological variables. The most frequently identified taxa were Aphanizomenon flos-aquae, Microcystis aeruginosa, Planktothrix agardhii and Anabaena spp. We used the database, composed of a total of 306 observations, and an adaptive regression trees method, the boosted regression tree (BRT), to develop predictive models of bloom occurrence and composition, based on environmental conditions. Data processing with BRT enabled the design of satisfactory prediction models of cyanobacterial abundance and of the occurrence of the main taxa. Phosphorus (total and soluble reactive phosphate), dissolved inorganic nitrogen, epilimnion temperature, photoperiod and euphotic depth were among the best predictive variables, contributing at least 10% in the models, and their relative contribution varied in accordance with the ecological traits of the taxa considered. Meteorological factors (wind, rainfall, surface irradiance) had a significant role in species selection. Such results may contribute to designing measures for bloom management in shallow lakes.
Article
Harmful cyanobacterial blooms (cHABs) have significant socioeconomic and ecological costs, which impact drinking water, fisheries, agriculture, tourism, real estate, water quality, food web resilience and habitats, and contribute to anoxia and fish kills. Many of these costs are well described, but in fact are largely unmeasured. Worldwide cHABs can produce toxins (cyanotoxins), which cause acute or chronic health effects in mammals (including humans) and other organisms. There are few attempts to characterize the full health-related effects other than acute incidences, which may go unrecorded. At present these are difficult to access and evaluate and may be ascribed to other causes. Such information is fundamental to measure the full costs of cHABs and inform the need for often-costly management and remediation. This paper synthesizes information on cHABs occurrence, toxicology and health effects, and relates this to past and current conditions in the Great Lakes, a major global resource which supplies 84% of the surface water in North America. This geographic region has seen a significant resurgence of cHABs since the 1980s. In particular we focus on Lake Erie, where increased reporting of cHABs has occurred since the early 1990s. We evaluate available information and case reports of cHAB-related illness and death and show that cHABs occur throughout the basin, with reports of animal illness and death, especially dogs and livestock. Lake Erie has consistently experienced cHABs and cyanotoxins in the last decade with probable cases of human illness, while the other Great Lakes show intermittent cHABs and toxins, but no confirmed reports of illness or toxicity. The dominant toxigenic cyanobacterium is the genus Microcystis, known to produce microcystins. The presence of other cyanotoxins (anatoxin-a, paralytic shellfish toxins) implicates other toxigenic cyanobacteria such as Anabaena (Dolichospermum) and Lyngbya.