Tomislav Hengl

OpenGeoHub foundation

PhD

About

190
Publications
148,241
Reads
14,318
Citations
Citations since 2016
62 Research Items
10,870 Citations
Additional affiliations
May 2018 - present
Envirometrix Ltd
Position
  • Senior Researcher
September 2010 - present
ISRIC - World Soil Information
Position
  • Senior Researcher
Description
  • www.isric.org
January 2007 - December 2010
Universiteit van Amsterdam

Publications (190)
Preprint
Full-text available
The paper describes production steps and accuracy assessment of an analysis-ready open environmental data cube (2000–2021+) for continental Europe, at working resolutions from 10 m to 30 m and with quarterly to annual estimates. The data cube is based on processing and harmonizing earth observation (EO) images: Landsat GLAD ARD (2000–2020+), Sen...
Article
Full-text available
Most agricultural soils have experienced substantial soil organic carbon losses in time. These losses motivate recent calls to restore organic carbon in agricultural lands to improve biogeochemical cycling and for climate change mitigation. Declines in organic carbon also reduce soil infiltration and water holding capacity, which may have important...
Data
The document provides information for transparency and reproducibility of the study according to the standard for species distribution modeling (ODMAP protocol) from Zurell et al. (2020). Additional information reported includes: (1) implementation strategy and results of spatial filtering operation, (2) hyperparameter space for model optimization...
Article
Full-text available
This paper describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., P...
Article
Full-text available
The representation of land surface processes in hydrological and climatic models critically depends on the soil water characteristics curve (SWCC) that defines the plant availability and water storage in the vadose zone. Despite the availability of SWCC datasets in the literature, significant efforts are required to harmonize reported data before S...
Article
Full-text available
A spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use/Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including five million harmonized LUCAS and CORINE Land Cover-derived training sa...
Preprint
Full-text available
This paper describes a data-driven framework based on spatiotemporal machine learning to produce distribution maps for 16 tree species (Abies alba Mill., Castanea sativa Mill., Corylus avellana L., Fagus sylvatica L., Olea europaea L., Picea abies L. H. Karst., Pinus halepensis Mill., Pinus nigra J. F. Arnold, Pinus pinea L., Pinus sylvestris L., P...
Poster
Full-text available
The poster describes a data-driven framework based on spatio-temporal ensemble machine learning to produce distribution maps for 16 tree species at high spatial resolution (30 m). Tree occurrence data totaling 3 million points were used to train different Machine Learning (ML) algorithms: random forest, gradient-boosted trees, generalized li...
Technical Report
As part of the environmental regulatory framework to minimize risk to receptors, concentrations of chemicals in soil or water exceeding regulatory guidelines that can be attributed to industrial activities at a site require remediation and/or monitoring. This process is complicated by the fact that various chemical parameters are naturally elevated...
Preprint
Full-text available
A seamless spatiotemporal machine learning framework for automated prediction, uncertainty assessment, and analysis of land use / land cover (LULC) dynamics is presented. The framework includes: (1) harmonization and preprocessing of high-resolution spatial and spatiotemporal covariate datasets (GLAD Landsat, NPP/VIIRS) including 5 million harmoniz...
Preprint
Full-text available
A seamless spatiotemporal machine learning framework for automated prediction, uncertainty assessment, and analysis of long-term LULC dynamics is presented. The framework includes: (1) harmonization and preprocessing of high-resolution spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including 5 million harmonized LUCAS and CORIN...
Preprint
Full-text available
A seamless spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use / Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of high-resolution spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including 5 million harmonized LUCAS and CORINE Land C...
Preprint
Full-text available
A spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use / Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including 5 million harmonized LUCAS and CORINE Land Cover-derived training sam...
Article
Full-text available
The saturated soil hydraulic conductivity (Ksat) is a key parameter in many hydrological and climate models. Ksat values are primarily determined from basic soil properties and may vary over several orders of magnitude. Despite the availability of Ksat datasets in the literature, significant efforts are required to combine the data before they can...
Article
Full-text available
Soil property and class maps for the continent of Africa were so far only available at very generalised scales, with many countries not mapped at all. Thanks to an increasing quantity and availability of soil samples collected at field point locations by various government and/or NGO funded projects, it is now possible to produce detailed pan-Afric...
Preprint
Full-text available
Soil property and class maps for the continent of Africa were so far only available at very generalised scales, with many countries not mapped at all. Thanks to an increasing quantity and availability of soil samples collected at field point locations by various government and/or NGO funded projects, it is now possible to produce detailed pan-African...
Preprint
Full-text available
Saturated soil hydraulic conductivity (Ksat) is a key parameter in many hydrological and climatic modeling applications, as it controls the partitioning between precipitation, infiltration and runoff. Ksat values are primarily determined from soil textural properties and soil forming processes, and may vary over several orders of magnitude. Despite...
Article
Full-text available
Soil organic carbon (SOC) information is fundamental for improving global carbon cycle modeling efforts, but discrepancies exist from country‐to‐global scales. We predicted the spatial distribution of SOC stocks (topsoil; 0–30 cm) and quantified modeling uncertainty across Mexico and the conterminous United States (CONUS). We used a multisource SOC...
Article
Full-text available
Most soil hydraulic information used in Earth System Models (ESMs) is derived from pedo-transfer functions that use easy-to-measure soil attributes to estimate hydraulic parameters. This parameterization relies heavily on soil texture, but overlooks the critical role of soil structure originated by soil biophysical activity. Soil structure omission...
Conference Paper
Full-text available
Rapid losses of mangroves over the past 50 years have had negative consequences on the environment, climate, and humanity, through diminished benefits such as carbon storage, coastal protection and fish production. Restoration of mangrove forests is possible, and has already been undertaken in many settings, but such efforts have been piecemeal, an...
Preprint
Using the term "Open data" has become a bit of a fashion, but using it without clear specifications is misleading i.e. it can be considered just an empty phrase. Probably even worse is the term "Open Science" — can science be NOT open at all? Are we reinventing something that should be obvious from start? This guide tries to clarify some key aspect...
Article
Full-text available
Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions are maybe biased, and this is suboptimal...
Data
RFsp—Random Forest for spatial data (R tutorial)
Article
Full-text available
Potential natural vegetation (PNV) is the vegetation cover in equilibrium with climate, that would exist at a given location if not impacted by human activities. PNV is useful for raising public awareness about land degradation and for estimating land potential. This paper presents results of assessing machine learning algorithms—neural networks (n...
Preprint
Full-text available
Using the term "Open data" has become a bit of a fashion, but using it without clear specifications is misleading i.e. it can be considered just an empty phrase. Probably even worse is the term "Open Science" — can science be NOT open at all? Are we reinventing something that should be obvious from start? This guide tries to clarify some key aspect...
Article
Full-text available
In rainfed crop production, root zone plant-available water holding capacity (RZ-PAWHC) of the soil has a large influence on crop growth and the yield response to management inputs such as improved seeds and fertilisers. However, data are lacking for this parameter in sub-Saharan Africa (SSA). This study produced the first spatially explicit, coher...
Preprint
Full-text available
Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions are maybe biased, and this is suboptimal...
Preprint
Full-text available
Potential Natural Vegetation (PNV) is the vegetation cover in equilibrium with climate, that would exist at a given location non-impacted by human activities. PNV is useful for raising public awareness about land degradation and for estimating land potential. This paper presents results of assessing Machine Learning Algorithms (MLA) for operational...
Preprint
Potential Natural Vegetation (PNV) is the vegetation cover in equilibrium with climate, that would exist at a given location non-impacted by human activities. PNV is useful for raising public awareness about land degradation and for estimating land potential. This paper presents results of assessing Machine Learning Algorithms (MLA) for operational...
Article
Full-text available
With the growing recognition that effective action on climate change will require a combination of emissions reductions and carbon sequestration, protecting, enhancing and restoring natural carbon sinks have become political priorities. Mangrove forests are considered some of the most carbon-dense ecosystems in the world with most of the carbon sto...
Article
An approach for using lasso (Least Absolute Shrinkage and Selection Operator) regression in creating sparse 3D models of soil properties for spatial prediction at multiple depths is presented. Modeling soil properties in 3D benefits from interactions of spatial predictors with soil depth and its polynomial expansion, which yields a large number of...
Conference Paper
Full-text available
This study evaluates SoilGrids as predictors, using point data from the Cameroon national soil profiles data compilation, together with a set of covariates representing soil forming factors. Much effort was placed on the preparation of the Cameroon soil database (Camsadat 0.1). We predicted Soil Organic Carbon and Clay content at 250 m resolution i...
Article
Full-text available
Past sea level fluctuations have shaped island area and archipelago configuration. The availability of global high-resolution data on bathymetry and past sea levels allows reconstruction of island palaeo-geography. Studies on the role of palaeo-area often consider only the Last Glacial Maximum, which neglects the dynamics of island fusion and fissi...
Article
Full-text available
With growing concern for the depletion of soil resources, conventional soil maps need to be updated and provided at finer and finer resolutions to be able to support spatially explicit human-landscape models. Three US soil point datasets – the National Cooperative Soil Survey Characterization Database, the National Soil Information System, and the Ra...
Article
Full-text available
Legacy soil data have been produced over 70 years in nearly all countries of the world. Unfortunately, data, information and knowledge are still currently fragmented and at risk of getting lost if they remain in a paper format. To process this legacy data into consistent, spatially explicit and continuous global soil information, data are being res...
Article
Full-text available
This paper describes a method to develop a soil bulk density pedotransfer function (PTF) using the Random Forest machine-learning algorithm with soil and environmental data for the conterminous United States. Complete data from 45,818 horizons were extracted from the National Cooperative Soil Survey (NCSS) soil characterization database and used to...
Article
Full-text available
Spatial predictions of soil macro and micro-nutrient content across Sub-Saharan Africa at 250 m spatial resolution and for 0–30 cm depth interval are presented. Predictions were produced for 15 target nutrients: organic carbon (C) and total (organic) nitrogen (N), total phosphorus (P), and extractable—phosphorus (P), potassium (K), calcium (Ca), ma...
Article
Full-text available
View article on-line at: https://theconversation.com/open-soil-science-technology-is-helping-us-discover-the-mysteries-under-our-feet-81727 (PDF available on request). This article is based on discussions during the 'International workshop on Open Land Data: Mobile Apps and Geo-services for Open Soil Data', ISRIC, 2-4 July 2017 (Wageningen): http...
Article
Full-text available
Significance: Land use and land cover change has resulted in substantial losses of carbon from soils globally, but credible estimates of how much soil carbon has been lost have been difficult to generate. Using a data-driven statistical model and the History Database of the Global Environment v3.2 historic land-use dataset, we estimated that agricul...
Conference Paper
Full-text available
Resulting from the GlobalSoilMap initiative and the Globally-integrated Africa Soil Information Service (AfSIS) project, soil property maps of the world were produced in 2014, following the maps of Sub-Saharan Africa produced in 2013. The two maps were fully compliant with the GlobalSoilMap specifications except for the spatial resolution of 1 km. T...
Article
Full-text available
Three national US soil point datasets: the National Cooperative Soil Survey (NCSS) Characterization Database, National Soil Information System (NASIS), and the Rapid Carbon Assessment (RaCA) dataset, were combined with a stack of over 200 environmental datasets to generate complete coverage gridded predictions at 100 m spatial resolution of soil pr...
Article
Full-text available
Soil hydraulic properties are required in various modelling schemes. We propose a consistent spatial soil hydraulic database at 7 soil depths up to 2 m calculated for Europe based on SoilGrids250m and 1 km datasets and pedotransfer functions trained on the European Hydropedological Data Inventory (EU-HYDI). Saturated water content, water content at...