
Remote sensing-based water quality index estimation using data-driven approaches: a case study of the Kali River in Uttar Pradesh, India


Abstract and Figures

The present study evaluates the water quality status of a 6-km-long stretch of the Kali River that passes through the Aligarh district in Uttar Pradesh, India, using high-resolution IRS P6 LISS IV imagery. In situ river water samples collected at 40 random locations were analyzed for seven physicochemical parameters and four heavy metal concentrations, and the water quality index (WQI) was computed for each sampling location. A set of 11 spectral reflectance band combinations was formulated to identify the band combination most significantly related to the observed WQI at each sampling location. Three approaches, namely multiple linear regression (MLR), backpropagation neural network (BPNN) and gene expression programming (GEP), were employed to express WQI as a function of the most significant band combination. The three approaches were compared using quantitative indicators such as R², RMSE and MAE. Results revealed that WQI estimates ranged between 203.7 and 262.33, corresponding to a “very poor” status. Results further indicated that GEP performed better than the BPNN and MLR approaches, predicting WQI with high R² values (0.94 for calibration and 0.91 for validation data) and low RMSE and MAE values (2.49 and 2.16 for calibration and 4.45 and 3.53 for validation data). Moreover, both GEP and BPNN outperformed the MLR approach, which yielded WQI with R² ≈ 0.81 and 0.67 for calibration and validation data, respectively. WQI maps generated from the three approaches corroborate the existing pollution levels along the river stretch. To examine whether the WQI estimates from the three approaches differ significantly, a one-way ANOVA test was performed; the F-statistic (F = 0.01) and p-value (p = 0.994 > 0.05) showed the differences to be “not significant,” which was attributed to the small water sample size (N = 40). The study therefore recommends GEP as a more rational and better alternative for precise water quality monitoring of surface water bodies, since it produces simplified mathematical expressions.
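The three indicators named in the abstract can be computed as in the sketch below. It is a minimal illustration and not the authors' code: the function name `evaluate` and the WQI values are hypothetical, and R² is taken here as the coefficient of determination 1 − SS_res/SS_tot, which may differ from a squared correlation coefficient if that is what the paper reports.

```python
import math

def evaluate(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae

# Hypothetical WQI values, chosen only to exercise the function.
observed = [210.0, 225.0, 240.0, 255.0]
predicted = [212.0, 223.0, 243.0, 254.0]
r2, rmse, mae = evaluate(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.2f}, MAE = {mae:.2f}")
# prints: R2 = 0.984, RMSE = 2.12, MAE = 2.00
```

Running the same function separately on calibration and validation subsets would give the kind of paired figures quoted in the abstract.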
Environment, Development and Sustainability (2021) 23:18252–18277
https://doi.org/10.1007/s10668-021-01437-6
SaifSaid1 · ShadabAliKhan1
Received: 4 April 2020 / Accepted: 13 April 2021 / Published online: 21 April 2021
© The Author(s), under exclusive licence to Springer Nature B.V. 2021
Keywords: Kali River · WQI · Spectral reflectance · MLR · ANN · GEP
* Saif Said
saif_said@rediffmail.com
Shadab Ali Khan
shadab7856gc@gmail.com
¹ Civil Engineering Department, Aligarh Muslim University (AMU), Aligarh, India
... Remote sensing and artificial intelligence models have been applied to evaluate river water quality indices (Chebud et al. 2012; Najafzadeh and Basirian 2023). Data-driven models have been used to evaluate the reliability of groundwater quality indices (Najafzadeh et al. 2022; Said and Khan 2021). Additionally, a novel multiple-kernel support vector regression algorithm has been developed for estimating water quality parameters (Najafzadeh and Niazmardi 2021). ...
Article
The Ouislane sub-watershed is currently experiencing severe water shortages and is highly dependent on its water supply. The sub-watershed spans two communes: Meknes to the north and El Hajeb to the south. It serves as the primary source of irrigation and drinking water for the local population. Consequently, it is crucial to assess the spatio-temporal variations of water quality to identify and address potential gaps, with a focus on effective monitoring systems to detect contaminants, pollutants and health risks. This research applies self-organizing map (SOM) techniques combined with cluster analysis to classify the water quality of springs for drinking and irrigation purposes. The study evaluates water quality variations using physicochemical parameters of twelve water springs, sampled during the wet and dry seasons of 2022. The water quality index (WQI), self-organizing map (SOM), hierarchical cluster analysis (HCA), and principal component analysis (PCA) are used as evaluation and classification methods. A SOM of 5 × 5 units was identified as the most suitable based on the minimum quantization error (QE) and topographic error (TE), yielding a QE of 0.379 and a TE of 0.000. It grouped the water quality data into five distinct clusters: cluster I represented 37.5% of the total samples and cluster II 25%, clusters III and IV each accounted for 8.33%, and cluster V for 20.83%. Clusters I, II, and IV indicate good water suitable for drinking. However, cluster V had the highest WQI, suggesting very high contamination due to increased levels of the 10 studied physicochemical parameters.
The water quality in this region (cluster V) is influenced by natural processes, such as precipitation intensity, weathering and vegetation cover, as well as anthropogenic factors like agriculture and urban concentration. PCA confirmed the clustering results obtained by SOM. However, SOM provides a more detailed classification and additional insights into the dominant variables influencing the classification processes. The results suggest that SOM is an effective tool for understanding the patterns and processes driving water quality in the Ouislane sub-watershed and that it opens valuable avenues for further research on establishing and monitoring water quality for effective management of water resources.
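The two map-quality measures quoted above (QE and TE) can be illustrated with a short sketch. It assumes the usual definitions: QE is the mean distance from each sample to its best-matching unit, and TE is the share of samples whose best and second-best units are not neighbours on the grid. The neighbourhood rule (touching units, diagonals included), the function name `som_errors`, and the toy codebooks are assumptions made for illustration; the abstract does not describe how its values were computed.

```python
import math

def som_errors(samples, codebook):
    """Return (quantization error, topographic error) for a trained SOM.

    codebook maps each unit's (row, col) grid position to its weight vector.
    """
    qe_total = 0.0
    te_count = 0
    for x in samples:
        # Rank map units by Euclidean distance to the sample.
        ranked = sorted(codebook, key=lambda unit: math.dist(x, codebook[unit]))
        best, second = ranked[0], ranked[1]
        qe_total += math.dist(x, codebook[best])
        # Assumption: units are neighbours if they touch, diagonals included.
        if max(abs(best[0] - second[0]), abs(best[1] - second[1])) > 1:
            te_count += 1
    return qe_total / len(samples), te_count / len(samples)

# Hypothetical 1 x 3 maps: one topologically ordered, one "twisted".
samples = [[0.1, 0.0], [1.9, 0.0]]
ordered = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 0.0], (0, 2): [2.0, 0.0]}
twisted = {(0, 0): [0.0, 0.0], (0, 1): [2.0, 0.0], (0, 2): [1.0, 0.0]}

qe_ordered, te_ordered = som_errors(samples, ordered)
qe_twisted, te_twisted = som_errors(samples, twisted)
print(qe_ordered, te_ordered)
print(qe_twisted, te_twisted)
```

Both toy maps fit the samples equally closely, but only the ordered one preserves neighbourhood structure, which is why the two measures are normally reported together when choosing a map size.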
... In Table 5, the R² of TP is the lowest among the parameters. This is because remote sensing data are limited by atmospheric conditions, sensor noise, viewing angles, and other factors [32], which makes very high R² values difficult to achieve when using remote sensing to retrieve water parameters. However, remote sensing data provide broad and continuous observations, offering valuable information over large areas that traditional in situ measurements cannot reach. ...
Article
With rapid social and economic development, land use/land cover change (LUCC) has intensified with serious impacts on water quality in the watershed. In this study, we took Dongjiang Lake watershed as the study area and obtained measured data on water quality parameters from the watershed’s water quality monitoring stations. Based on Landsat-5, Landsat-8, or Sentinel-2 remote sensing data for multiple periods per year between 1992 and 2022, the sensitive satellite bands or band combinations of each water quality parameter were determined. The Random Forest method was used to classify the land use types in the watershed into six categories, and the area proportion of each type was calculated. We established machine learning regression models and polynomial regression models with WQI as the dependent variable and the area proportion of each land use type as the independent variable. Accuracy test results showed that, among them, the quadratic cubic polynomial regression model with grassland, forest land, construction land, and unused land as its independent variables was the best model for coupling watershed water quality with LUCC. This study’s results provide a scientific basis for monitoring spatial and temporal changes in water quality caused by LUCC in the Dongjiang Lake watershed.
... It is recommended to collect in situ samples within a day of the satellite image capture. This minimizes errors and allows the algorithms to be better calibrated (Brezonik et al., 2005; Said & Khan, 2021). If there is an extended time gap between the satellite image capture and in situ sample collection, the accuracy of the data may be adversely affected. ...
Article
The continuous availability of spatially and temporally distributed data from satellite sensors provides more accurate and timely information on surface water quality parameters. Remote sensing data have the potential to serve as an alternative to traditional on-site measurements, which are resource-intensive in time and labor. The present study explores and compares hyperspectral (PRISMA) and multispectral (Sentinel-2 MSI) imagery for accurate prediction of surface water quality parameters. The Muthupet estuary, situated on the south side of the Cauvery River delta on the Bay of Bengal, was selected as the study area. Remote sensing data were acquired from the PRISMA hyperspectral satellite and the Sentinel-2 multispectral instrument (MSI). In situ samples were collected in the study area and tested for different water quality parameters. Correlations between the water sample results and the satellite reflectance values were analyzed to generate appropriate algorithmic models. Data from both PRISMA and Sentinel-2 were used to develop models for water quality parameters such as total dissolved solids, chlorophyll, pH, and chlorides. The developed models demonstrated strong correlations, with R² values above 0.80 in the validation phase. PRISMA-based models for pH and chlorophyll were more accurate than the Sentinel-based models, with R² > 0.90.
Article
This study aims to evaluate the Water Quality Indices (WQIs) of the Tigris River in Wasit, Iraq, using the Arithmetic Weighted Water Quality Index (AW-WQI), Canadian Water Quality Index (CCME-WQI), Heavy Metal Pollution Index (HPI-WQI), National Sanitation Foundation Index (NSF-WQI), and Overall Index of Pollution (OIP-WQI). Twelve water samples were collected at different locations in the study area during the winter and spring of 2024. Each index evaluates the water quality in the study area based on specific criteria. For the two periods (winter and spring of 2024), we categorized the water quality in the research region according to each index: AW-WQI (70.517-102.611), CCME-WQI (39.763-47.1404), HPI-WQI (82.526-118.846), NSF-WQI (54.66-60.12), and OIP-WQI (1.9769-2.4686). We created twenty-six combinations of spectral reflectance bands: reflectance values of seven bands, band ratios for the first five bands, and nine spectral indices. Pearson correlation and multiple linear regression (MLR) model equations showed a significant correlation between the spectral reflectance data of the Landsat-9 OLI-2 bands and the WQIs. We evaluated the performance of the MLR model for the WQIs across seasons. The AW-WQI model showed a coefficient of determination (R²) of 84% in winter and 98% in spring, while the CCME-WQI recorded an R² of 97% in winter and 75% in spring. The HPI-WQI reached an R² of 93% in winter and 98% in spring, the NSF-WQI 62% in winter and 98% in spring, and the OIP-WQI 92% in winter and 99% in spring. These results highlight the seasonal variation in the predictive accuracy of the WQI models, with some minor differences between the experimental results and those obtained through remote sensing techniques. The WQIs showed that the water was not suitable for consumption, with levels above the permissible limit at most locations in the study area.
Multiple sources of pollution in the region discharge hazardous waste into the river, causing WQIs to exceed permissible limits in most of the study area.
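For readers unfamiliar with the first of these indices, the sketch below shows the weighted arithmetic WQI in its commonly used form: each parameter gets a quality rating q = 100 × (measured − ideal) / (standard − ideal) and a weight inversely proportional to its permissible standard, and the index is the weighted mean of the ratings. The abstract does not state which variant or which standards the study used, so the function name `weighted_arithmetic_wqi` and every number below are hypothetical.

```python
def weighted_arithmetic_wqi(parameters):
    """Weighted arithmetic WQI from (measured, standard, ideal) tuples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for measured, standard, ideal in parameters:
        # Quality rating: 0 at the ideal value, 100 at the permissible standard.
        quality_rating = 100.0 * (measured - ideal) / (standard - ideal)
        # Unit weight ~ 1/standard; the proportionality constant cancels
        # in the ratio below, so it is omitted here.
        weight = 1.0 / standard
        weighted_sum += weight * quality_rating
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical sample: (measured, permissible standard, ideal value).
sample = [
    (7.8, 8.5, 7.0),       # pH (ideal taken as 7)
    (600.0, 1000.0, 0.0),  # total dissolved solids, mg/L
    (4.0, 5.0, 0.0),       # turbidity, NTU
]
wqi = weighted_arithmetic_wqi(sample)
print(f"WQI = {wqi:.2f}")

# Sanity check: a sample sitting exactly on every standard scores 100.
at_standard = weighted_arithmetic_wqi([(8.5, 8.5, 7.0), (1000.0, 1000.0, 0.0)])
```

Because the weights are inversely proportional to the standards, parameters with tight limits (here turbidity and pH) dominate the index, which is one reason different WQI formulations can rank the same water differently.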
Chapter
Purpose: Water pollution is a major concern to human life, and periodic monitoring of water quality parameters across water reservoirs such as lakes, ponds, dams and rivers is necessary to examine the water contaminants. The present work focused on developing an Unmanned Surface Vehicle (USV) to collect water samples in remote water body locations. Design/Methodology/Approach: A solenoid-actuated water sampling system with an automatic cutoff of storage was devised. In addition, an onboard water quality monitoring system consisting of pH, turbidity, electrical conductivity, and dissolved oxygen sensors was integrated with USV. Findings: The USV mounted with a water sampler and sensor unit was tested in a lake near our institute. 1200 ml of water samples from the six polyethene terephthalate (PET) storage containers were collected and tested in a laboratory, and their contamination levels were estimated. The spatial distribution maps of water contaminants were generated based on the water quality analysis. Originality: The in-situ water quality measured using the Internet of Things (IoT) enabled USV is an efficient choice for online monitoring of water quality parameters. The USV was also tested for its stability on the water surface under various wind loads, and it was able to withstand the wind conditions for effective water sampling and in-situ water quality measurements.
Article
A significant problem in the sustainable management of water resources is the lack of funding and long-term monitoring. Today, innovative, adaptive, and sustainable learning methods have greatly reduced this problem. In this study, a sample river was selected, and 14 variables observed at 5 different points over 12 months were used to calculate reference water quality index (WQI) values by traditional multivariate statistical analysis methods. The WQI was then estimated using different algorithms, including multiple linear regression (MLR), multilayer perceptron artificial neural networks (MLP-ANN), and various machine learning algorithms: neural networks (NN), support vector machine (SVM), Gaussian process regression (GPR), ensemble, and decision tree approaches. The most appropriate method was selected by comparing the results. The root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination (R²) were used to determine the accuracy of the models. Water quality was best estimated by the MLR model, which achieved high prediction performance with R² = 1.0, RMSE = 0.0025, and MAPE = 0.0296. These results confirm that the MLR model can predict WQI with very high accuracy and can contribute to strengthening water quality management. Together with the results of the innovative approaches (MLR and MLP-ANN), the other assessments indicated intense anthropogenic pressure in the study area, and the current situation needs immediate remediation.
Article
Thanks to advanced sensor technology, satellites and unmanned aerial vehicles (UAVs) produce huge amounts of data, enabling advances in all kinds of earth observation applications. With this source of information, and driven by climate change concerns, renewable energy assessment has become an increasing necessity among researchers and companies. Solar power, from household rooftops to utility-scale farms, is reshaping energy markets around the globe. However, automatically identifying photovoltaic (PV) panels and the status of solar farms is still an open question that, if answered properly, would help gauge solar power development and meet energy demands. Deep learning (DL) methods have recently proved suitable for remotely sensed data, opening many opportunities to advance research on solar energy assessment. Combining the availability of remotely sensed data with the computer vision capabilities of deep learning has enabled researchers to propose solutions for the global mapping of solar farms and residential photovoltaic panels. However, the scores obtained by previous studies are questionable given the scarcity of photovoltaic systems. In this paper, we highlight and investigate the potential of remote sensing-driven DL approaches for solar energy assessment. Many works addressing this challenge have recently been released, and reviewing and discussing them is well motivated to sustain progress in future contributions. We then present a short study showing how semantic segmentation models can be biased and yield significantly higher scores when inference is not sufficient. We provide a simulation with U-Net, a leading semantic segmentation architecture, and achieve performance scores as high as 99.78%.
Nevertheless, further improvements are needed to increase the model's ability to detect real photovoltaic units.
Article
The Kali River is a significant source of surface water and the main tributary of the River Hindon that flows through major cities of western Uttar Pradesh, India. It flows through urban and industrial regions and hence carries varying amounts of pollutants. Therefore, a study was conducted to examine spatial–temporal variations in river water quality by determining physicochemical variables and heavy metal concentrations at seventeen sampling stations (S1–S17) along the river stretch. Various physicochemical variables, namely pH, EC, TDS, turbidity, BOD, COD, TH, TA, Ca, Mg, Na, K, HCO₃⁻, Cl⁻, SO₄²⁻, NO₃⁻, and PO₄³⁻, were higher in summer than in winter. The order of mean metal concentrations was Fe > Pb > Mn > Ni > Zn > Cu > Cr > Cd. The relationships among measured physicochemical variables and the pollution index were examined. Furthermore, multivariate statistical methods were used to assess spatial–temporal variation in water quality, identify current pollution sources, and validate results. The water quality index and comprehensive pollution index indicated that the Kali River was less polluted from S1 to S8, whereas downstream sampling sites were polluted. Pollution starts at S9 and increases drastically at and beyond S13 because of effluents from industries and sugar mills in Muzaffarnagar. The study suggests cleaning the downstream region of the river to restore human health and the flora and fauna of the river ecosystem.
Article
Remotely sensed data can reinforce the abilities of water resources researchers and decision-makers to monitor water quality more effectively. In the past few decades, remote sensing techniques have been widely used to measure water quality parameters. However, moderate-resolution sensors may not meet the requirements for monitoring small water bodies. Water quality in a small dam was assessed using high-resolution satellite data from RapidEye and in situ measurements collected a few days apart. The satellite carries a five-band multispectral optical imager with a ground sampling distance of 5 m at nadir and a swath width of 80 km. Several algorithms were evaluated using Pearson correlation coefficients for electrical conductivity (EC), total dissolved solids (TDS), water transparency, water turbidity, depth, suspended particulate matter (SPM), and chlorophyll-a. The results indicate strong correlation between the investigated parameters and RapidEye reflectance, especially in the red and red-edge portions, with the highest correlation between the red-edge band and water turbidity (r² = 0.92). Two of the investigated indices correlated well with almost all of the water quality parameters, with correlation higher than 0.80. The findings emphasize the value of both high-resolution remote sensing imagery and the red-edge portion of the electromagnetic spectrum for monitoring several water quality parameters in small water areas.
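The band-screening step described above, in which candidate bands are ranked by their Pearson correlation with a measured parameter, can be sketched as follows. The reflectance and turbidity values are invented for illustration and are not data from the study; `pearson_r` is a plain implementation of the standard formula.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical in situ turbidity (NTU) and band reflectance at four stations.
turbidity = [2.0, 4.0, 6.0, 8.0]
reflectance = {
    "blue": [0.05, 0.03, 0.06, 0.04],
    "red": [0.02, 0.04, 0.05, 0.08],
    "red_edge": [0.01, 0.02, 0.03, 0.04],
}

# Correlate every band with turbidity and keep the strongest one.
correlations = {band: pearson_r(values, turbidity)
                for band, values in reflectance.items()}
best_band = max(correlations, key=lambda band: abs(correlations[band]))
print(best_band, round(correlations[best_band] ** 2, 2))
```

Ranking by the absolute value keeps strongly negative relationships in play, which matters for parameters such as water transparency that fall as reflectance rises.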
Article
With the advancement of human civilization, the river cascading system has been converted into a control system in which man plays a significant role. This paper examines how the rivers flowing across the highly populated Ganga–Brahmaputra Delta are being obliterated by close contact with human civilization, taking as an example the Ichamati River, an important distributary channel in the district of North 24 Parganas, India. The Ichamati River drains the east and south sides of the North 24 Parganas district and is covered by deep Quaternary sediments produced under the tropical monsoon climate of India. The district is densely populated. GIS and a detailed field investigation, along with two case studies, were used to extract the relationship between man and river as a control system. The study draws attention to how the river has modified itself in response to imprudent human attitudes towards the environment in the absence of proper river management. The paper examines human interventions in the river as a control system and discusses the associated changes in river behaviour as responses, e.g. (1) the longitudinal profile has changed over time due to human impact; (2) the characteristics of the cross profiles have changed due to bridges and other human influences; (3) the tidal discharge of the river has changed from downstream upward due to water intake for different purposes. The primary objective of this article is to examine the role of man as an important controlling element of a river system.
Article
This study aimed to develop a reliable model for assessing reservoir turbidity from Landsat-8 satellite imagery. Models were established with multiple linear regression (MLR) and gene-expression programming (GEP) algorithms. In total, 55 and 18 measured turbidity values from the Tseng-Wen and Nan-Hwa reservoirs, respectively, were paired and screened with satellite imagery. Finally, MLR and GEP were applied to simulate 13 turbid-water data points for critical turbidity assessment. The coefficient of determination (R²), root mean squared error (RMSE), and relative RMSE (R-RMSE) were calculated to evaluate model performance. The results show that MLR and GEP performed similarly in model development. In model testing, however, the R², RMSE, and R-RMSE were 0.7277, 0.7248 NTU, and 22.26% for MLR and 0.8278, 0.5815 NTU, and 17.86% for GEP. The accuracy assessment shows that GEP is more reasonable than MLR and remains more convincing even in critical turbidity situations. In the model performance evaluation, MLR and GEP rated at the normal and good levels, and under critical turbidity conditions GEP even rated as outstanding. These results show that GEP is rational and has relatively good applicability for turbidity simulation. The study concludes that GEP is suitable for turbidity modeling and accurate enough for reservoir turbidity estimation.
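The abstract does not define R-RMSE. A common definition, assumed here, is RMSE divided by the mean of the observed values and expressed in percent. The sketch below implements that definition on hypothetical data and then checks it against the reported figures: dividing each reported RMSE by its reported R-RMSE gives roughly the same mean turbidity (about 3.26 NTU) for both models, which is consistent with the assumed definition but does not prove it.

```python
import math

def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)

# Hypothetical turbidity values (NTU), only to exercise the function.
rrmse_demo = relative_rmse([2.0, 4.0, 6.0], [3.0, 4.0, 5.0])

# Consistency check on the figures reported in the abstract:
# implied mean = RMSE / (R-RMSE / 100).
implied_mean_mlr = 0.7248 / (22.26 / 100.0)
implied_mean_gep = 0.5815 / (17.86 / 100.0)
print(round(implied_mean_mlr, 2), round(implied_mean_gep, 2))
```

Normalizing by the mean makes errors comparable across reservoirs with different turbidity levels, which a raw RMSE in NTU does not.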
Article
A methodology based on continuous turbidity values (T) and water quality modeling was proposed to estimate the phosphorus (P) flux into the sea through a river. The procedure has three steps: (1) analyzing the relationships between P and total suspended solids (TSS) and between TSS and T; (2) estimating P concentrations with high temporal and spatial density along the river; and (3) calculating the phosphorus flux into the sea. From September 2014 to December 2016, 224 data sets were collected at eight sites along the Tonglv River, which feeds into the Yellow Sea, China. The linear regression between TSS and T gave a fit with R² = 0.944, and total phosphorus (TP) and TSS showed a linear relationship with R² = 0.884. About 30,227 kg of TP was estimated to flow from the river into the sea from September 2014 to August 2015. Using T values and water quality modeling to estimate phosphorus flux is practicable and credible. The proposed procedure could also be applied to analyze the influence of pollution loading changes on TP fluxes. Thus, combining the regression models with water quality modeling might be an effective technique for long-term estimation of pollutant fluxes.
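The three steps can be sketched as a simple regression chain: fit TSS against T, fit TP against TSS, then convert continuous turbidity readings to TP concentration and integrate concentration times discharge over time. This is a simplified stand-in, since the study combines the regressions with water quality modeling that is not reproduced here. All calibration values, the discharge, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def tp_flux_kg(turbidity, discharge_m3s, dt_s, tss_model, tp_model):
    """Integrate TP flux over equally spaced turbidity/discharge readings."""
    total_g = 0.0
    for t, q in zip(turbidity, discharge_m3s):
        tss = tss_model[0] * t + tss_model[1]   # mg/L from turbidity
        tp = tp_model[0] * tss + tp_model[1]    # mg/L, numerically equal to g/m3
        total_g += tp * q * dt_s                # g/m3 * m3/s * s = g
    return total_g / 1000.0

# Step 1: hypothetical calibration data.
tss_model = fit_line([10.0, 20.0, 30.0], [12.0, 22.0, 32.0])  # T (NTU) -> TSS
tp_model = fit_line([12.0, 22.0, 32.0], [0.06, 0.11, 0.16])   # TSS -> TP

# Steps 2 and 3: one daily reading, hypothetical discharge of 10 m3/s.
flux = tp_flux_kg([18.0], [10.0], 86400.0, tss_model, tp_model)
print(f"{flux:.1f} kg of TP over one day")
```

With real data the readings would be far denser than one per day, and the discharge series would come from gauging records rather than a constant.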
Article
Remote sensing applications in water resources management are essential for watershed characterization, particularly when mega basins are under investigation. Water quality parameters support decisions on the further use of water based on its quality. In the current study, chlorophyll a concentration, nitrate concentration, and water turbidity were estimated for the dam lake of Wadi Baysh, Saudi Arabia. These parameters were measured daily over 2 years (2017–2018) at the water treatment station located in the dam vicinity and were tested against the corresponding remotely sensed water quality parameters. Remote sensing data were collected from the Sentinel-2 sensor of the European Space Agency (ESA) at the satellite's temporal resolution. Data were pre-processed and then processed to estimate the maximum chlorophyll index (MCI), green normalized difference vegetation index (GNDVI) and normalized difference turbidity index (NDTI). Zonal statistics were used to improve the regression analysis between the spatial data estimated from the remote sensing images and the nonspatial data collected from the water treatment plant. Results showed different correlation coefficients between the ground truth data and the corresponding indices derived from remote sensing data. Actual chlorophyll a concentration correlated highly with estimated MCI mean values (R² = 0.96), actual nitrate concentration with estimated GNDVI mean values (R² = 0.94), and actual water turbidity measurements with estimated NDTI mean values (R² = 0.94). The findings support the use of Sentinel-2 remote sensing data to estimate water quality parameters in arid environments.
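Two of the three indices are plain normalized differences and can be sketched directly: GNDVI = (NIR − Green) / (NIR + Green) and NDTI = (Red − Green) / (Red + Green). The sketch computes them per pixel and then averages over a zone, as a minimal stand-in for the zonal statistics step. The pixel reflectances are hypothetical, and MCI is left out because it depends on a red-edge baseline whose exact band choice the abstract does not give.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b); returns NaN when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else float("nan")

def gndvi(pixel):
    """Green NDVI from near-infrared and green reflectance."""
    return normalized_difference(pixel["nir"], pixel["green"])

def ndti(pixel):
    """Normalized difference turbidity index from red and green reflectance."""
    return normalized_difference(pixel["red"], pixel["green"])

def zonal_mean(pixels, index):
    """Mean of an index over all pixels of a zone (e.g. the lake polygon)."""
    values = [index(p) for p in pixels]
    return sum(values) / len(values)

# Hypothetical surface reflectance for two water pixels.
lake_pixels = [
    {"green": 0.06, "red": 0.04, "nir": 0.02},
    {"green": 0.08, "red": 0.06, "nir": 0.02},
]
mean_gndvi = zonal_mean(lake_pixels, gndvi)
mean_ndti = zonal_mean(lake_pixels, ndti)
print(round(mean_gndvi, 3), round(mean_ndti, 3))
```

The zonal means, one per image date, are what would then be regressed against the station measurements.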
Chapter
Computational modeling plays a central role in cognitive science. This book provides a comprehensive introduction to computational models of human cognition. It covers major approaches and architectures, both neural network and symbolic; major theoretical issues; and specific computational models of a variety of cognitive processes, ranging from low-level (e.g., attention and memory) to higher-level (e.g., language and reasoning). The articles included in the book provide original descriptions of developments in the field. The emphasis is on implemented computational models rather than on mathematical or nonformal approaches, and on modeling empirical data from human subjects. Bradford Books imprint
Article
Environmental properties of compounds provide significant information in treating organic pollutants, which drives the chemical process and environmental science toward eco-friendly technology. Traditional group contribution methods play an important role in property estimations, whereas various disadvantages emerge in their applications, such as scattered predicted values for certain groups of compounds. In order to address such issues, an extraction strategy for molecular features is proposed in this research, which is characterized by interpretability and discriminating power with regard to isomers. Based on the Henry's law constant data of organic compounds in water, we developed a hybrid predictive model that integrates the proposed strategy in conjunction with a neural network framework. The structure of the predictive model is optimized using cross-validation and grid search to improve its robustness. Moreover, the predictive model is improved by introducing the plane of best fit descriptor as input and adopting k-means clustering in sampling. In contrast with reported models in the literature, the developed predictive model demonstrates improved generality, higher accuracy, and fewer molecular features used in its development.
Article
Aquatic ecosystems cover over two thirds of our planet and play a pivotal role in stabilizing the global climate as well as providing a large array of services for a fast-growing human population. However, anthropogenic activities increasingly provoke deleterious impacts in aquatic ecosystems. In this paper we discuss five sources of anthropogenic pollution that affect marine and freshwater ecosystems: sewage, nutrients and terrigenous materials, crude oil, heavy metals and plastics. Using specific locations as examples, we show that land-based anthropogenic activities have repercussions in freshwater and marine environments, and we detail the direct and indirect effects that these pollutants have on a range of aquatic organisms, even when the pollutant source is distant from the sink. While the issues covered here do focus on specific locations, they exemplify emerging problems that are increasingly common around the world. All these issues are in dire need of stricter environmental policies and legislations particularly for pollution at industrial levels, as well as solutions to mitigate the effects of anthropogenic pollutants and restore the important services provided by aquatic ecosystems for future generations.
Article
As an essential environmental property, the octanol-water partition coefficient (KOW) quantifies the lipophilicity of a compound and can be further employed to predict toxicity. It is thus an indispensable factor that should be considered when screening and developing green solvents from unconventional and novel compounds. Herein, a deep-learning-assisted predictive model has been developed to calculate log KOW values for organic compounds accurately and reliably. An embedding algorithm was established to automatically generate signatures for molecular structures that express structural information and connectivity. The Tree-structured long short-term memory (Tree-LSTM) network was then used in conjunction with the signature descriptor for automatic feature selection and coupled with a back-propagation neural network to form a deep neural network (DNN), which models the quantitative structure-property relationship (QSPR) to predict log KOW. Compared with an authoritative estimation method, the proposed DNN-based QSPR model exhibited better predictive accuracy and greater discriminative power for structural isomers and stereoisomers. As such, the proposed deep learning approach can act as a promising and intelligent tool for developing environmental property prediction methods to guide the development or screening of green solvents.