European Journal of Soil Science, 2014 doi: 10.1111/ejss.12199
Predicting Scottish topsoil organic matter content
from colour and environmental factors
M. J. A, D. D, L. S, D. G. M, M. C. C
&H.I.J.B
The James Hutton Institute, Craigiebuckler, Aberdeen, AB15 8QH, UK
Summary
Assessment of soil organic matter content using laboratory analysis can be costly and time consuming, which limits
how often land managers assess this important property. This work demonstrates an ability to estimate topsoil
organic matter content from field observations alone and provides a method by which rapid and cost-effective
assessments of soil organic matter status may be made. Environmental factors from the National Soil
Inventory of Scotland (NSIS) dataset were used as inputs to a neural network model to predict loss on ignition
(LOI). Two models, one for all soils and one for soils with small organic matter contents (LOI < 20%), were
developed. The model developed for all soils produced reasonable predictive results across the
entire LOI range (R² = 0.877), although it was not as effective at predicting small LOI values (R² = 0.354) as the
small organic matter content model (R² = 0.674). Both models were tested with imagery and data from samples
outwith the NSIS dataset to validate the approach. Predictive results were less accurate than when using NSIS
data. Possible improvements to make the model useful for field observations of soils are discussed.
Introduction
Soil organic matter (SOM) controls a host of soil functions and
ecosystem services, and the development of effective policies and
monitoring tools for ensuring that SOM contents are maintained
or increased is a high priority (Orr et al., 2008). Much current
European soil policy-relevant research is focused on assessing and
improving SOM content (Glenk & Colombo, 2011), and future
policy objectives are likely to be even more concerned with this,
and encouraging farmers to manage their land in a manner that
will enhance ecosystem services. This is a complex and politically
sensitive topic that has received a great deal of attention in
recent years (Lal, 2009; Robbins, 2011). There are well-understood
agricultural management strategies, such as no-till, set-aside or
cover crops, that would improve soil agricultural productivity and
increase other ecosystem service provisions (Kassam et al., 2009;
Lal, 2010). However, providing incentives for these strategies
through farmer payment schemes requires careful auditing not only
of management strategies, but also of their impact on the soil.
One vital component of any auditing system will be accurate,
cost-effective and rapid monitoring of ecosystem service indicators
across managed land.
Correspondence: M. J. Aitkenhead. E-mail: matt.aitkenhead@hutton.ac.uk
Received 18 June 2013; revised version accepted 15 September 2014
Kibblewhite et al. (2008) argue that measurement of individual
soil properties does not provide an accurate indication of soil
health, because of the complexity and integrative nature of process
interactions within the soil. However, it is possible for specific
properties to be used as indicators of specific ecosystem services
(Haines-Young & Potschin, 2009; Maes et al., 2013) and SOM is a
particularly useful example. The measurement of SOM is relevant
to the determination not only of how much carbon is being stored,
but also how much farmers could be paid for keeping it stored.
Saby et al. (2008) and Aalders et al. (2009) emphasize the need for
effective national soil monitoring networks (SMNs) that are able to
monitor changes in SOM content at the regional and national scales.
Challenges to the measurement of SOM include: spatial variability
(Conant et al., 2011), which implies a need for intense spatial
measuring density; the influence of many different factors such as
land use and soil type (Martin et al., 2011; Van Wesemael et al.,
2011), which can be resolved with a stratified sampling approach;
measuring changes in SOM content over time (Chapman et al.,
2013); and the need to obtain measurements rapidly and cheaply
in order to make the monitoring cost-effective. This last challenge
has received a great deal of attention in recent years (see McHenry,
2009) and has seen some breakthroughs in monitoring soil spectral
properties, for example with remote sensing (Croft et al., 2012) or
field spectroscopy techniques (Stevens et al., 2008;
Bellon-Maurel & McBratney, 2011).
© 2014 British Society of Soil Science
Relationships between SOM and soil spectral characteristics have
been known to exist for some time (Barouchas & Moustakas, 2004).
Soil spectroscopy often uses wavelengths outside those visible to
humans, although visible wavelengths provide useful information
(La et al., 2008; Aitkenhead et al., 2012). While spectroscopic
tools (Vasques et al., 2010) may make it possible to measure
SOM more accurately than with basic colour descriptors such as
Munsell (Munsell Color Company, 1954) or RGB (the red, green
and blue values assigned to computer display pixels), their equipment
and staff costs make it difficult to carry out soil surveys with them. The aim
of this work is to demonstrate an approach that relies on a suite
of easily measured image colour properties to provide rapid and
accurate SOM assessment for land managers.
Improvements to SOM estimation can be made by including
information such as soil class or texture (Suuster et al., 2012),
topographic characteristics (Chaplot et al., 2001), climate or
vegetation (Zhang et al., 2011). Here we use a neural network model
to integrate a number of ancillary properties to make predictions
of SOM content. The resulting system, which has been
implemented within a mobile phone application, is cost-effective and
rapid in allowing land managers to assess one of the key
indicators of soil health, and requires little or no expert knowledge to use.
Neural networks are particularly effective in capturing the
relationships between environmental factors and SOM content, as long as
the model parameterization, architecture and training approach are
selected appropriately (Li et al., 2013).
Our aim was to develop and demonstrate a method for estimating
soil organic matter (SOM) values from information on soil colour
and site characteristics. While we acknowledge that from a
greenhouse gas or carbon budget perspective, soil organic carbon (SOC)
information is more useful than SOM, we have used SOM rather
than SOC in this work. The reasons for this are that (i) farmers and
other land managers are usually more familiar with the concept of
organic matter content and (ii) SOM is accommodated on a scale
between none (0%) and all (100%) within the soil, making it
easier to understand where on this universal scale a particular soil lies.
The SOC upper limits in soil are harder to pin down, which makes it
harder to describe the position of a soil in relation to others.
Work by De Vos et al. (2005) indicates that it is possible
to estimate SOC from loss on ignition (LOI) values, implying that in
situations where only LOI is known, it is still possible to estimate SOC
values if they are preferred.
Materials and methods
NSIS data preparation
A national grid-based survey of Scotland’s soils (National Soil
Inventory of Scotland, NSIS1) was rst carried out from 1978
to 1988 by the Soil Survey of Scotland. During this period, the
Macaulay Institute for Soil Research (now The James Hutton
Institute) was engaged in a programme to map the soils of Scotland
at 1:250 000 scale and NSIS1 was designed to create a dataset that
would hold environmental, morphological and analytical data on a
systematic grid basis and assist in ground-truthing for the mapping.
Later survey work to repeat some of this has been designated NSIS2,
but only the NSIS1 data have been used here. We refer to this
throughout as NSIS data.
Between 1978 and 1988, sample sites were located at every 5-km
intersection of the Ordnance Survey national grid (some sites were
not visited because of problems of access). Soil information was
recorded with a standard proforma to capture site and environmental
data. At each 10-km intersection, a full survey pit was also dug, and
the soil classied and described (Lilly et al., 2010). Samples were
taken from each morphological horizon using standard protocols
(Lilly et al., 2010) and returned to the laboratory for analysis. Each
sample was prepared by air drying and sieving (plus milling for
some analyses), then subjected to a standard list of physical, chemi-
cal and biological analyses. The following information derived from
this sampling and analysis was used in this work.
(1) Site and environmental data, including elevation, slope,
vegetation and climate (mean monthly temperature and rainfall
interpolated from UK Meteorological Office data). These data
were extracted from spatial layers by using the known location
of the samples.
(2) Horizon depth and colour (middle of horizon taken as depth,
and Munsell colour in the field).
(3) Loss on ignition (LOI; sample weighed, then dried at 105°C
for a minimum of 2 hours, cooled and weighed, then heated at
900°C for 2 hours, cooled again and re-weighed to determine
the loss on ignition).
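The weighing steps in item (3) can be turned into an LOI percentage as in the following minimal sketch. The text does not give the formula, so the conventional definition (mass lost on ignition as a percentage of the oven-dry mass) is assumed here, and the function name is hypothetical.

```python
def loss_on_ignition(dry_mass_g: float, ignited_mass_g: float) -> float:
    """Return LOI (%) from the oven-dry mass and the mass after ignition.

    Assumption: LOI is expressed relative to the oven-dry (105 degC) mass.
    """
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    if not 0 <= ignited_mass_g <= dry_mass_g:
        raise ValueError("ignited mass must lie between 0 and the dry mass")
    return 100.0 * (dry_mass_g - ignited_mass_g) / dry_mass_g


# Illustrative masses only: 10.00 g oven-dry, 9.20 g after ignition.
example_loi = loss_on_ignition(10.00, 9.20)
```

With these illustrative masses the result is about 8% LOI.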
Figure 1 shows the distribution of the NSIS sample points
across Scotland and the location and distribution of the Hartwood
sample points (see section ‘Additional testing and incorporation of
photoimagery data’ for further explanation), and Table 1 lists the
input variables used for development of the neural network model,
with the possible values or ranges of values associated with each.
Many of the variables in Table 1 were used directly as inputs
for the neural network models described below, with normalization
carried out to fit them into a standard range. These included the
colour properties (which were derived from original soil Munsell
colour using a conversion table developed specifically for this
purpose), temperature and rainfall. The conversion of Munsell
codes to RGB is not appropriate at certain large Value and Chroma
numbers, as the range of RGB values is in effect not large enough
to cover all of the possible Munsell values, but within the range of
colours found within Scottish soils this was not an issue. The same
issue does not exist for the CIELab colour coordinate system (CIE,
1932), which was considered as an alternative to RGB but gives only
minor improvements to the prediction of soil characteristics and has
additional processing costs (Aitkenhead et al., 2013). Most soils in
Scotland occupy the smaller value and chroma ranges of the hues Y
and YR, with a smaller number of samples being of hue R, GY,
G and BG. Soils in other parts of the world occupy a wider range
of Munsell colour codes than those in Scotland, but very few would
fall outside the range of colours possible using the RGB system,
so the methodology developed here would still be appropriate.
Problems with converting Munsell colour codes to the RGB system
occur generally with large value numbers such as ‘extremely pale’,
which are relatively rare although not unseen. Conversion tables
from Munsell colour codes to RGB were developed from a
combination of internet-based sources, many of which are relatively
old and contain only partial look-up tables. One of the most useful
sites found was that at http://ccc.orgfree.com/ (Boronkay, 2012),
which supplies a Microsoft Excel spreadsheet containing a number
of conversion utilities. It is possible to find other packages to convert
between Munsell and RGB values, for example within the statistical
package R. However, we did not have sufficient experience in the
use of these packages and so developed our own sub-routine, which
made use of the above look-up table.
Figure 1 Map showing the distribution of points in the National Soil
Inventory of Scotland and of the sample points used in the Hartwood dataset
(points in yellow are the 32 used from the overall distribution of samples in
Hartwood).
Other variables, by their nature or range of values, required
manipulation into a form more suitable for model input. These
included the following.
(1) Elevation: the majority of values were at relatively low
elevation, and it was thought that using a linear normalization would
result in the higher sampling points masking the effects of small
changes at low elevations. Therefore, the square root of the
elevation in metres was taken to reduce this effect.
(2) Slope: as for elevation, most slope values were relatively small
and the square root of each slope value in degrees was therefore
used.
(3) Aspect: the values for aspect were originally given as degrees
measured in an anticlockwise manner from the north. Using
a linear normalization therefore results in a discontinuity
between values that are slightly east and slightly west of
north. To correct for this, aspect has been given as two values,
absolute degrees from north and absolute degrees from east.
This removes any discontinuity, but makes it necessary to have
two values in order to identify properly the original aspect
value.
(4) Slope form: this is given as a descriptive term in the original
survey, with four possible terms described using four dummy
variable inputs (A, flat; B, concave; C, convex; D, straight).
(5) Slope type: for the same reason as for slope form, this property
was expressed as multiple inputs (three in this case). The same
reasoning was applied to site drainage, soil drainage, vegetation
and soil type.
Table 1 Input characteristics used for development of the neural network
model
Characteristic           Range/values
Topsoil R (red)          0–255
Topsoil G (green)        0–255
Topsoil B (blue)         0–255
Subsoil R (red)          0–255
Subsoil G (green)        0–255
Subsoil B (blue)         0–255
Elevation / m            1–1200
Slope / °                0–90
Aspect (north)           0–180
Aspect (east)            0–180
Slope form               Flat; Concave; Convex; Straight
Slope type               Flat; Complex; Simple
Site drainage            Normal; Receiving; Shedding
Mean temperature / °C    0–20
Mean rainfall / mm       0–4000
Soil drainage            Excessive; Free; Moderate; Imperfect; Poor; Very poor
Vegetation               Deciduous; Coniferous; Arable (crop); Grassland (improved); Grassland (rough); Heath; Bog
Soil type                Alluvial; Calcareous; Brown earth; Gley; Peat; Podzol; Ranker; Regosol
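The transformations listed above can be sketched as follows. This is an illustration rather than the authors' sub-routine: the function names are hypothetical, dividing by the square root of the Table 1 maximum is an assumed way of fitting the result into [0, 1], and east is placed at 270° because aspect is described as measured anticlockwise from north.

```python
import math

SLOPE_FORMS = ["Flat", "Concave", "Convex", "Straight"]  # categories from Table 1


def sqrt_scaled(value, maximum):
    """Square-root transform, then scale to [0, 1] (the scaling step is assumed)."""
    return math.sqrt(value) / math.sqrt(maximum)


def angular_distance(a_deg, b_deg):
    """Smallest absolute angle between two bearings, in the range [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def aspect_inputs(aspect_deg, east_bearing_deg=270.0):
    """Return (degrees from north, degrees from east), each scaled to [0, 1].

    east_bearing_deg=270 assumes anticlockwise measurement from north.
    """
    north = angular_distance(aspect_deg, 0.0)
    east = angular_distance(aspect_deg, east_bearing_deg)
    return north / 180.0, east / 180.0


def dummy_variables(category, categories):
    """One input per category: 1.0 for the observed category, 0.0 otherwise."""
    if category not in categories:
        raise ValueError(f"unknown category: {category}")
    return [1.0 if c == category else 0.0 for c in categories]


# Aspects of 10 and 350 degrees are equally far from north but differ in
# their distance from east, so the pair of values still identifies each one.
just_one_side = aspect_inputs(10.0)
just_other_side = aspect_inputs(350.0)
slope_form_inputs = dummy_variables("Concave", SLOPE_FORMS)
```

Counting one input per continuous variable and one dummy variable per category in Table 1 gives 10 + 2 + 4 + 3 + 3 + 6 + 7 + 8 = 43 inputs.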
An examination of the coefcient of determination between
individual predictors was carried out, to determine whether it
was possible to simplify the inputs meaningfully. Results showed
that when using an R2value of 0.9 as a threshold, there were
strong relationships between: elevation and some temperature
monthly variables; elevation and some rainfall monthly variables;
temperature monthly variables ‘near’ one another in time (for
© 2014 British Society of Soil Science, European Journal of Soil Science
4M. J. Aitkenhead et al.
Figure 2 Conceptual diagram of a feed-forward
fully-connected neural network as used in this work.
The number of input nodes equals the number of input
variables, while the number of nodes in the hidden layer
equals twice that number.
example between mean monthly temperatures for May and June,
but not between May and December); and also rainfall monthly
variables ‘near’ one another. Because climate variables showed
strong coefcients of determination with one another in many
cases there was an argument to be made for reducing the numbers
of variables used. However, uncertainty about whether or not
seasonality of climate was important in affecting SOM led to
the decision to retain all monthly climate variables. A further
investigation to nd out if SOM values were correlated strongly
with any input variables showed that for the full dataset with
LOI range 0–100%, the majority of the input variables had R2
values of less than 0.1. The exceptions to this were bog vegetation
(R2=0.161) and the presence or absence of peat soil (R2=0.171).
Similarly, there were small values for the dataset with the LOI range
of 0–20%. We therefore concluded that no single input variable
could be used to predict SOM content.
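A pairwise screening of predictors against the 0.9 threshold might look like the sketch below. It assumes that the coefficient of determination between two variables is the squared Pearson correlation; the column names and values are hypothetical and are not NSIS data.

```python
from itertools import combinations


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column has no linear relationship to report
    return (sxy * sxy) / (sxx * syy)


def strongly_related_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r2) for each pair with r2 at or above threshold."""
    pairs = []
    for a, b in combinations(columns, 2):
        r2 = r_squared(columns[a], columns[b])
        if r2 >= threshold:
            pairs.append((a, b, r2))
    return pairs


# Hypothetical values for illustration only.
example_columns = {
    "temp_may": [8.0, 9.0, 10.0, 11.0, 12.0],
    "temp_jun": [11.0, 12.0, 13.0, 14.0, 15.5],
    "rain_dec": [120.0, 80.0, 150.0, 90.0, 110.0],
}
related = strongly_related_pairs(example_columns)
```

In this made-up example only the two adjacent-month temperature columns exceed the threshold, mirroring the pattern described for the monthly climate variables.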
Neural network architecture and training
The neural network model used was a feed-forward
back-propagation network (Bishop, 1995; Goh, 1995) with one
hidden layer. The training algorithm for this network uses
incremental changes in the connection weights over many (usually
several thousand) training cycles to minimize the error between
actual and target outputs at the output nodes (and, by back-propagation,
at the nodes in the hidden layer between input and output layers; see
Figure 2, which gives a schematic of the architecture and connectivity
of a standard artificial neural network). Input nodes for a neural network of this
kind accept values in the range [0, 1], meaning that variables need
to be adjusted to fit. For continuous variables such as elevation or
RGB colour codes, the relevant values need only be normalized
within the possible range. For variables that have a number of
different categories, however, such as soil or land-cover type, we
have used dummy variables in the same manner as for slope form
described earlier.
The node response function used to determine the activation level
of all hidden and output nodes was that given in Equation (1):
\[ y = \frac{1}{1 + e^{-\beta x}}, \qquad (1) \]
where y is the output activation and lies in the range [0, 1], x is
the input activation [−∞, +∞], e is Euler’s number (approximately
2.71828) and β is the node response variable [0, +∞]. The number
of nodes in the hidden layer (86) was equal to twice the number
of input nodes (43), in accordance with Kolmogorov’s theory on
neural network architecture (Bishop, 1995). In order to optimize
the training rate α (which controls the rate at which connection
weights are adjusted) and node response variable β (which controls
the sensitivity of each node’s activation level to input values), all
combinations of the values (α = 0.0001, 0.001, 0.01, 0.02, 0.05, 0.1
and β = 0.1, 0.2, 0.5, 1.0, 2.0, 5.0) were used in training a network
for 10⁵ steps, which took approximately 1 hour on a standard
desktop PC using Microsoft VB6.0 to implement the NN models.
From the results of comparing R², RMSE (root mean square error)
and MAE (mean absolute error) using the cross-validation approach
described below, the values of α = 0.02 and β = 1 were used.
The MAE values varied only slightly across the different variable
combinations, while very slightly larger values of R² were obtained
with larger values of α, as was found for RMSE and smaller values
of β. The combination of values used was therefore considered to
be optimal to provide a balance of statistical evaluation variables.
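The node response function and the α–β search grid can be illustrated with a short sketch. This is not the VB6.0 implementation used in the work: the random weights, the omission of bias terms and all function names are assumptions made for illustration, and no training is performed.

```python
import math
import random
from itertools import product

N_INPUTS = 43
N_HIDDEN = 2 * N_INPUTS  # 86 hidden nodes, as stated in the text

ALPHAS = [0.0001, 0.001, 0.01, 0.02, 0.05, 0.1]  # training rates tried
BETAS = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]           # node response values tried


def node_response(x, beta=1.0):
    """Equation (1): y = 1 / (1 + exp(-beta * x)), written to avoid overflow."""
    z = beta * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(inputs, w_hidden, w_output, beta=1.0):
    """One forward pass through a single hidden layer (bias terms omitted)."""
    hidden = [
        node_response(sum(w * v for w, v in zip(row, inputs)), beta)
        for row in w_hidden
    ]
    return node_response(sum(w * h for w, h in zip(w_output, hidden)), beta)


# Untrained, randomly initialised weights purely to show the data flow.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
w_output = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
example_inputs = [rng.random() for _ in range(N_INPUTS)]
prediction = forward(example_inputs, w_hidden, w_output, beta=1.0)

# Every (alpha, beta) combination that would each need a full training run.
grid = list(product(ALPHAS, BETAS))
```

The grid contains 6 × 6 = 36 combinations, each of which would require its own training run.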
Training was carried out by splitting the dataset described in
the section ‘NSIS data preparation’ (2614 data points) into 10 subsets of
approximately equal size, by assigning each data point to a subset
at random. We then used tenfold cross-validation training, in which
10 models were each trained using nine out of the 10 subsets, with
the final subset in each case used for testing of that model. Each
model was tested using a different subset, to allow robust ‘blind’
validation of the models while at the same time making full use of
all available data points. In order to avoid the problems caused by
attempting to train the NN to give values of 0 and 1 for the smallest
and largest output values (which according to Equation (1) would
require inputs of −∞ and +∞, respectively), the output range was
adjusted to lie within the range [0.1, 0.9] by normalizing along a
linear scale within this range. The consequence of this is that output
values of the trained network tend to fall within the same range;
they were converted back to the range [0, 1] after output. After the
final training with the variable values given earlier, the network
was evaluated using the test dataset. Values of RMSE, R², MAE
and mean error (ME) for the actual and predicted values were calculated.
These values, when given for the cross-validation training, are across
the full dataset rather than one of the validation subsets.
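The fold assignment and the evaluation statistics can be sketched as below. Several details are assumptions rather than statements from the text: points are shuffled and dealt round-robin so that the subsets are of near-equal size, the mean error is taken as predicted minus actual, and R² is computed as the squared Pearson correlation between actual and predicted values.

```python
import math
import random


def assign_folds(n_points, n_folds=10, seed=0):
    """Randomly assign point indices to n_folds subsets of near-equal size."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    folds = [[] for _ in range(n_folds)]
    for position, index in enumerate(indices):
        folds[position % n_folds].append(index)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out each fold once."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def evaluate(actual, predicted):
    """Return ME, MAE, RMSE and R2 for paired actual and predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sxx = sum((a - mean_a) ** 2 for a in actual)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "R2": (sxy * sxy) / (sxx * syy) if sxx > 0 and syy > 0 else 0.0,
    }


folds = assign_folds(2614)          # dataset size quoted in the text
splits = list(train_test_splits(folds))
```

Pooling the held-out predictions from all ten test subsets before calling `evaluate` gives statistics across the full dataset, as described above.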
In addition to training a neural network model using the full
dataset containing LOI values between 0 and 100%, a secondary
dataset that contained LOI values between 0 and 20% (1665 data
points) was generated. This was done in order to determine whether
or not a model restricted to small LOI values would be more
accurate within this range than the model trained on the full range
of values. As a priority of this work is to produce a model that could
be used by agricultural land managers, it is important to have a
model that operates best within the small organic matter range most
commonly found on agricultural soils in Scotland. The secondary
dataset output values were adjusted to t in the range [0.1, 0.9]
as above, and training was once again carried out using tenfold
cross-validation. The same statistical evaluation as described earlier
was carried out on the network trained and tested using these
secondary datasets.
Additional testing and incorporation of photoimagery data
To carry out further validation of the two NN models (full range
and small organic matter content), a soil organic matter dataset was
used from field experiments at the Hartwood Research Station in
Lanarkshire, Scotland. This is an upland farming area of 350 ha,
between 150 and 300 m a.s.l. None of the data used for testing
from this area was used in the training of the neural network
models described above. Thirty-two observations were selected
from a total of 319 made at the site (these were the only ones for
which imagery was available; see later), with the selected points
distributed spatially over the whole study area and including all
of the possible soil and land-cover types. Of the 32 sample points
used, 20 had topsoil LOI values less than 20% and were used
for testing the ‘mineral soil’ network model. The GPS locations
for each observation allowed the relevant environmental variables
given in Table 1 to be determined from existing spatial datasets.
A Nikon E5000 (Nikon, Tokyo, Japan) mid-range compact camera
had been used to obtain digital photographs of the soil at each
observation site, allowing RGB values to be estimated directly
from the images. This estimate of colour was carried out after
adjustment of the colour values in each image with an automated
colour-correction method designed to balance the RGB values
of a white sheet of paper shown in each image. Optimal colour
correction using a standardized ‘colour card’ was not possible
because the imagery was obtained prior to the decision to use it for
organic matter content estimation, and so a standardized colour
correction card was not present in the images. However, visual
analysis of features in the corrected imagery such as the auger, grass
or the clipboard used indicated that the image colour, and therefore
presumably the natural soil colour, had been correctly restored by
the automated colour correction process. Figure 3 gives an example
of the imagery acquired during field sampling at Hartwood, from
which the relevant area of the image (the soil in the auger) was
cropped and the RGB values averaged over a window of size
10 × 10 pixels for both the topsoil and subsoil.
Figure 3 Example image taken at Hartwood of a soil core used for colour
evaluation.
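The colour extraction step can be sketched as follows. The details of the automated colour correction are not given in the text, so a simple per-channel gain that makes the white paper neutral grey is assumed here; the image representation (rows of (R, G, B) tuples) and all function names are likewise hypothetical.

```python
def channel_means(pixels):
    """Mean (R, G, B) of a non-empty list of (R, G, B) tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def white_balance_gains(paper_pixels):
    """Per-channel gains that make the white paper neutral (assumed method)."""
    r, g, b = channel_means(paper_pixels)
    if min(r, g, b) <= 0:
        raise ValueError("paper region must have non-zero values in every channel")
    grey = (r + g + b) / 3.0
    return grey / r, grey / g, grey / b


def apply_gains(pixel, gains):
    """Scale one pixel by the gains, capping each channel at 255."""
    return tuple(min(255.0, channel * gain) for channel, gain in zip(pixel, gains))


def window_mean_rgb(image, top, left, size=10):
    """Mean RGB over a size x size window of an image stored as rows of pixels."""
    window = [
        image[row][col]
        for row in range(top, top + size)
        for col in range(left, left + size)
    ]
    return channel_means(window)


# Synthetic example: bluish paper and a uniform 12 x 12 'soil' patch.
gains = white_balance_gains([(200.0, 220.0, 240.0)])
soil_image = [[(100.0, 80.0, 60.0)] * 12 for _ in range(12)]
corrected = [[apply_gains(p, gains) for p in row] for row in soil_image]
topsoil_rgb = window_mean_rgb(corrected, top=1, left=1, size=10)
```

A colour card with known reference patches would allow a better-constrained correction, which is the limitation noted above.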
Results
Full organic matter content range model
The neural network model trained with all 2614 data points with
organic matter values ranging from 0.81 to 98.7% predicted SOM
with an R² value of 0.877. The best-fitted straight line produced for
comparing real and predicted values had a gradient of 0.730 and an
intercept on the y axis of 5.34 (Figure 4). The RMSE value for the
entire test dataset was 11.13%, the ME was +2.15% and the MAE
was 5.87%. However, Figure 5 shows that when the RMSE values
were plotted against LOI values grouped within 1% LOI intervals
across the test data (as shown by Martin et al., 2011), this RMSE
value was not consistent across all LOI values. This curve can be
partially explained by the proportionally infrequent occurrence of
LOI values in the range between 20 and 90%, implying that the
network is less well trained on ‘organo-mineral’ soils than it is on
mineral or organic soils. For values greater than 93%, the RMSE
dropped once more as the number of samples increased. The fact
that some of the predicted SOM values were less than 0 or more than
100% in Figure 4 is explained by the fact that normalization was
used during training to fit the neural network output values between
0.1 and 0.9. If an output of less than 0.1 is given, for example, then
converting the outputs back to the range [0, 100%] will result in
negative values. In practice, values outwith the range of possible
values (less than 0 or more than 100%) should be rounded to the
nearest ‘possible’ value of 0 or 100%, respectively.
Figure 4 Actual plotted against predicted values of LOI for the testing
dataset and using all proportions of organic matter between 0 and 100%.
Some predicted values lie outwith the range [0, 100%] because of the
re-adjustment of NN output values caused by initial normalization of the
training values to make them lie in the range [0.1, 0.9].
Figure 5 RMSE values plotted against LOI for all LOI values, grouped
within intervals of 1% LOI.
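The origin of the out-of-range predictions, and the suggested rounding, can be shown with a small sketch. It assumes that LOI between 0 and `loi_max` was mapped linearly onto [0.1, 0.9]; the function names are hypothetical.

```python
LOW, HIGH = 0.1, 0.9  # target range used for the network outputs


def to_target(loi_percent, loi_max=100.0):
    """Map LOI in [0, loi_max] linearly onto [LOW, HIGH] for training."""
    return LOW + (HIGH - LOW) * loi_percent / loi_max


def to_loi(output, loi_max=100.0, clip=True):
    """Convert a network output back to LOI (%), optionally clipping."""
    loi = (output - LOW) / (HIGH - LOW) * loi_max
    if clip:
        loi = min(max(loi, 0.0), loi_max)  # round to nearest possible value
    return loi


raw = to_loi(0.05, clip=False)   # an output below 0.1 maps to a negative LOI
clipped = to_loi(0.05)           # clipped to 0
```

For the small organic matter content model, `loi_max=20.0` would be the corresponding assumption.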
The integration of colour and site descriptor data as inputs to the
model was assumed to provide a better model than using colour
or site descriptors alone. In order to test this assumption, the LOI
full-range dataset was used to develop models with (A) colour only
and (B) site descriptors only. The R² value obtained with colour
alone was 0.424, while that obtained using only site descriptors
was 0.605. Compared with the R² of 0.877 obtained with all data,
this gives a clear indication that using both types of information
produces a better model.
Figure 6 Actual plotted against predicted values of LOI for the testing
dataset with all proportions of organic matter between 0 and 20%. Some
predicted values lie outwith the range [0, 1] due to the readjustment of NN
output values caused by initial normalization of the training values to make
them lie in the range [0.1, 0.9].
Small organic matter content model
When tested on data containing LOI values of less than 20% alone,
the NN model trained on the full range of LOI values gave an
R² value of 0.354 and an RMSE of 7.31%. In addition, the MAE
was 4.10% and the ME was +1.84%. The RMSE, MAE and ME
values were improvements on those given for the dataset with the
full range of LOI values. However, this performance was still not
good enough to be applied in the field, based on knowledge of (1)
ranges of organic matter content in agricultural soil, (2) the impacts
of land management on the organic matter content of these soils, and
(3) the estimated accuracies available from traditional
laboratory-based LOI measurements. Nearly all cultivated land has LOI values
of less than 20%, so the first NN model would not only provide
poor predictions for farmers but is also not focused on the necessary
range of values. The second network, trained only on LOI values of
less than 20%, gave an R² value of 0.674, an RMSE of 1.842%,
a mean absolute error of 1.327% and a mean error of 0.938%.
These values are an improvement on the ‘full-range’ model, and
demonstrate the effectiveness of the approach. Figure 6 shows the
relationship between predicted and target values of LOI for the
points with values less than 20% from the small organic matter
content model, while Figure 7 shows the relationship between LOI
and RMSE for this model, as found by Martin et al. (2011). Figure 7
shows that the RMSE increases from a minimum near zero organic
matter content to between 2 and 2.5% at 20% LOI. This matches
what is seen in Figure 5 for the same LOI range, but with much
smaller RMSE values. With these levels of accuracy, predictions of
SOM are much more useful.
Figure 7 RMSE values plotted against LOI for all LOI values up to 20%,
grouped within intervals of 1% LOI.
Additional testing with Hartwood soils
The sample data from the Hartwood field station provided a further
test of the method, using field data obtained outwith the sampling
protocols of the NSIS survey, and with digital imagery used to
derive soil colour instead of Munsell estimates converted to RGB
values. Table 2 shows the regression gradient and intercept, R², RMSE,
mean absolute error and mean error values given by the two neural
network models for the Hartwood soils. For the ‘full range’ model, 32 sample values
were used, while for the ‘mineral soil’ model, 20 sample values
were used. As can be seen, the models gave predictive accuracies
somewhat lower than, but comparable to, those obtained with the NSIS
test data, showing that
the models can be applied effectively to the prediction of soil LOI
values for new sites.
Table 2 Statistical evaluation of the predictions made by the two neural
network LOI prediction models against actual LOI values for Hartwood soils
                           Full range (N = 32)   Small LOI (N = 20)
Regression gradient        0.715                 0.488
Regression intercept       5.78                  2.71
R²                         0.844                 0.626
RMSE / %                   13.92                 3.04
Mean absolute error / %    6.31                  1.97
Mean error / %             2.01                  1.22
Discussion
We have demonstrated the applicability of a neural network
modelling approach that can be used to predict soil LOI content from
observable environmental variables and soil colour. This approach
has been used to develop two models, one for soils with small
organic matter content and one for all soils. The small organic
matter content model is more accurate within its target range of LOI
proportion values (0–20%), while the ‘full range’ model is more
accurate at small (mineral soil) LOI values than in the
intermediate ‘organo-mineral’ range. Cultivated soils in the UK
commonly have small LOI values in comparison to forest or moorland
soils (although extensively grazed moorland soils can have organic
matter-rich layers), and the small-LOI neural network model
predicts LOI values with a degree of accuracy that allows soil organic
matter content to be rapidly estimated in the field. This therefore
indicates that the approach used here will be more applicable for
small LOI soils such as those under agriculture. While this approach
is not as accurate as laboratory analysis, it provides an assessment
of organic matter content that is potentially useful for mineral soils.
Although estimates of variation within LOI measurements vary for
control samples, normal figures quoted for accuracy are between 0.3
and 0.5% (Hoskins, 2002; Jason Owen, personal communication).
Recent work by Nocita et al. (2014) using Vis-NIR (visible-near
infrared) spectroscopy gives accuracy figures between 0.36 and
1.19% for soil organic carbon (SOC) at the European scale for mineral
soils. If this is multiplied by a factor of between 1.5 and 2 to
convert to SOM and assuming that this is similar to LOI, it shows
that spectroscopy possibly provides slightly better input data than
RGB values alone. However, this is based on laboratory measurements
that are neither as rapid nor as cost-effective as the field-based
assessment possible with colour alone. De Vos et al. (2005) showed
that LOI could be used to estimate TOC in soils with an R² of
0.98, even for soils with small organic matter contents, with the
traditional multiplication factor of 0.58 being a good match for the
relationship. If we assume that TOC and SOC are the same, then
it is therefore acceptable to assume that an estimate of LOI can be
used to produce an estimate of SOC that is useful for land managers
and scientists alike. Some error propagation between calculation of
LOI and SOC will reduce the accuracy of the prediction, however.
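
The two conversions described in this paragraph are simple multiplications, and a short Python sketch makes the arithmetic explicit. The function names are hypothetical, and the error handling is a simplifying assumption: the LOI error is scaled by the same 0.58 factor, which ignores any uncertainty in the factor itself and so understates the error propagation mentioned in the text.

```python
LOI_TO_SOC = 0.58  # traditional multiplication factor quoted in the text


def loi_to_soc(loi_percent, loi_error_percent=0.0):
    """Convert an LOI estimate (%) to an SOC estimate (%).

    The error is scaled linearly by the same factor (an assumption that
    treats the conversion factor as exact).
    """
    if loi_percent < 0 or loi_error_percent < 0:
        raise ValueError("LOI and its error must be non-negative")
    return LOI_TO_SOC * loi_percent, LOI_TO_SOC * loi_error_percent


def soc_error_to_som_range(soc_low, soc_high, factor_low=1.5, factor_high=2.0):
    """Scale an SOC accuracy range (%) to an approximate SOM range (%)."""
    return soc_low * factor_low, soc_high * factor_high


if __name__ == "__main__":
    # Vis-NIR SOC accuracy range from the text, converted to SOM terms.
    low, high = soc_error_to_som_range(0.36, 1.19)
    print(f"SOM-equivalent accuracy: {low:.2f} to {high:.2f} %")

    # Illustrative only: a 10% LOI estimate with the small-LOI RMSE of Table 2.
    soc, soc_error = loi_to_soc(10.0, 3.04)
    print(f"SOC estimate: {soc:.2f} % (error about {soc_error:.2f} %)")
```

With the figures quoted above, the Vis-NIR accuracy range of 0.36 to 1.19% SOC corresponds to roughly 0.54 to 2.38% in SOM terms.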
We have shown that it is possible to assess LOI content rapidly and
cheaply to within an acceptable degree of error, with the decrease
in accuracy balanced against an improvement in speed and cost.
If implemented within a software tool, this can be useful for land
managers in assessing soil fertility and health. Recent work at
the James Hutton Institute has resulted in an application (‘app’)
(SOCiT) for Android and Apple mobile phones that makes use
of this model for Scotland (Donnelly et al., 2013). This ‘app’ is
potentially useful for assessing soil carbon stocks and budgets
over time, and provides a novel method of rapidly monitoring the
distribution of soil organic matter at small spatial scales.
One potential issue with the use of the LOI models described here
is that of knowing which model to use. The NN model used
for predicting small (<20%) LOI values worked better than the ‘full
range’ model for mineral soils, but without already knowing that
the soil has a relatively poor organic matter status it is not possible
to decide when that model should be applied. Land managers will
usually know the approximate organic content of their soils, and so
should be able to make that decision successfully, while a surveyor
unfamiliar with a specic site should be able to judge whether a
soil has a ‘small’, ‘medium’ or ‘large’ organic matter status based
on the relative colour, texture and structure of topsoil and subsoil.
However, this judgement is likely to be error-prone particularly for
soils with LOI values approaching 20% and this is an acknowledged
weakness of the system. Once this judgement has been made, it
could be used to select the model to be used in the hypothetical
software tool mentioned above. An example of such a tool (which
is only applicable for mineral soils on agricultural and forested land
in Scotland) is the SOCiT app mentioned above (Donnelly et al.,
2013). Existing online information, such as SIFSS (soil information
for Scottish soils), can be used to indicate the range of values for
the soil series present at a specic location, and can also provide
information about the indicative soil type. Recent work at the James
Hutton Institute has produced an iPhone ‘app’ implementation of
the SIFSS web application, allowing it to be used in the field.
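
The model selection step described above could be sketched in Python as follows. This is a hypothetical illustration, not the logic of the SOCiT app: the function name, the model labels and the fallback to the ‘full range’ model when nothing is known about the soil are all assumptions. Only the 20% LOI boundary comes from the text.

```python
SMALL_LOI_LIMIT = 20.0  # % LOI, upper bound of the small-LOI model's range

VALID_STATUSES = ("small", "medium", "large")


def select_model(judged_status=None, approximate_loi=None):
    """Pick which LOI model to apply.

    judged_status: a surveyor's judgement, one of 'small', 'medium', 'large'.
    approximate_loi: a rough prior LOI value in %, if one is known.
    Returns the (hypothetical) label 'small_loi' or 'full_range'.
    """
    if approximate_loi is not None:
        if not 0.0 <= approximate_loi <= 100.0:
            raise ValueError("LOI must lie between 0 and 100 %")
        return "small_loi" if approximate_loi < SMALL_LOI_LIMIT else "full_range"

    if judged_status is None:
        # Assumed fallback: with no prior knowledge, use the model trained
        # on the whole LOI range.
        return "full_range"

    status = judged_status.strip().lower()
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}")
    return "small_loi" if status == "small" else "full_range"
```

As the text notes, any such rule is most likely to fail for soils with LOI values close to 20%, so a tool built this way might reasonably report the predictions of both models near that boundary.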
We have not evaluated the neural network models to determine
the relative or absolute sensitivity of OM predictions in relation
to individual inputs. Some of the input variables will be more
inuential than others, and it would be useful in future work to be
able to determine if there were some inputs that could be dropped
from the model without altering the overall accuracy of the system.
The model used here will not operate if any of the input values are
missing, and as some variables are harder to measure than others
in the eld, it might be possible to eliminate some from future
work and make the implementation and use of the model easier.
Reducing the number of input variables might also have an impact
on the accuracy of the system by eliminating some sources of error,
as each dataset used will have some degree of error associated
with it. There will also be natural variation in the system that is
not accounted for and which is caused by other environmental
factors not considered. There are doubtless also errors caused by
additional factors, such as the impact of soil moisture on colour and
the natural variation of soil colour caused by mineralogy. These
are sources of error that would be difficult to eliminate without
carrying out detailed analysis of the soil, and this would eliminate
any usefulness of the approach in terms of rapid field-based soil
assessment.
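
One possible way to carry out the sensitivity evaluation suggested above is permutation sensitivity: shuffle one input at a time and record how much the prediction error grows. The Python sketch below is an assumption about how such an analysis might be done, not a method used in this work. The `predict` argument is a hypothetical stand-in for a trained model that maps one row of inputs to a predicted LOI value.

```python
import math
import random


def rmse(predict, rows, targets):
    """Root mean squared error of predict() over paired rows and targets."""
    squared = [(predict(row) - t) ** 2 for row, t in zip(rows, targets)]
    return math.sqrt(sum(squared) / len(squared))


def permutation_sensitivity(predict, rows, targets, n_repeats=10, seed=0):
    """Mean increase in RMSE when each input column is shuffled in turn.

    A value near zero suggests the input could be dropped with little
    loss of accuracy; larger values indicate more influential inputs.
    """
    rng = random.Random(seed)
    baseline = rmse(predict, rows, targets)
    increases = []
    for col in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle only this column, leaving the other inputs intact.
            column = [row[col] for row in rows]
            rng.shuffle(column)
            shuffled_rows = []
            for row, value in zip(rows, column):
                new_row = list(row)
                new_row[col] = value
                shuffled_rows.append(new_row)
            total += rmse(predict, shuffled_rows, targets) - baseline
        increases.append(total / n_repeats)
    return increases
```

Because the approach only needs model predictions, it can be applied to an already trained network without retraining, although correlated inputs can make individual scores harder to interpret.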
The models demonstrated here have been applied solely for
prediction of soil organic matter content. However, recent work
has shown that the concept could also be applied to carbon
budgets, with the loss or increase of organic matter content in a
soil being predicted (Liles et al., 2013) for different management
and environmental conditions. This is arguably a more useful
application of the soil organic matter model concept, as it would
provide information about the changes in the SOC. However, it is
harder to obtain information about the rate of change of organic
matter in a soil than it is to get information about the current organic
matter status. Acquiring sufcient data to train a model that could be
accurately applied across a whole country would require additional
investment in long-term monitoring networks.
A comparison with the work of Liles et al. (2013) is useful as this
also aimed to predict soil carbon across a range of soil types and
environmental conditions. The samples in this case were prepared
in the laboratory (air-dried and sieved) and illuminated under
controlled conditions for colorimetry, and the statistical analysis
was carried out after grouping the samples, either into soil type or
parent material. This preparation of the samples resulted in smaller
RMSE values for Liles et al. (0.35–0.8% for soils with <4%
carbon, and 1.2% for soils with >4% carbon) than for the neural
network model trained across all soil types and with field data (1.8%
for testing within the standardized dataset used). This difference
in accuracy is to be expected given the variation in lighting, soil
moisture and other conditions, and the fact that the Munsell colours
provided for the NSIS data were evaluated by eye. However, it does
give an indication of the levels of accuracy that could be aimed for
in the future.
Conclusions
While we have developed an approach that is potentially useful
for assessing soil organic matter contents rapidly in the field,
improvements are required to the models developed here before
they can be used to detect changes to soil organic matter content
caused by land-management activities or some other environmental
driver. It is also necessary to improve the estimation accuracy
in order to make them more effective for soils with very little
organic matter. Improvements could be sought in three different
ways: (i) improving the modelling approach, through the use of a
more sophisticated neural network training algorithm (or another
modelling method entirely, if it is demonstrably superior); (ii)
improving the colour sensor information, using colorimetric sensors
with better spectral resolution or accuracy or by adding available
multispectral remote sensing data; or (iii) increasing the amount
of information available from site characterization. This could
include additional topographic features, more detailed geological
information or information from more detailed soil maps than the
one used.
We have shown that neural network modelling can be used
to predict soil LOI content based on easily obtained, in situ
observations including soil colour determined by imagery. We
have also demonstrated that using colour or site character alone
produces less accurate models with this neural network method.
The approach has been used to develop two models, one that can
be applied to soils with any organic matter content and one that
can be applied to soils with small LOI values. This has potential
for a number of applications, including rapid soil organic matter
estimation and, if the accuracy is improved, monitoring changes in
soil C and the efcacy of management to enhance C sequestration.
Acknowledgements
The authors would like to thank QMS (Quality Meat Scotland)
for providing the co-funding for this work as a grant in aid award
matched to funding from The Scottish Government’s Rural and
Environment Science and Analytical Services Division (RESAS).
We would also like to thank Dr Keith Matthews, Dr Allan Lilly and
Dr Steve Chapman of the James Hutton Institute for information
and assistance provided.
References
Aalders, I., Hough, R.L., Towers, W., Black, H.I.J., Ball, B.C., Griffiths, B.S. et al. 2009. Considerations for Scottish soil monitoring in the European context. European Journal of Soil Science, 60, 833–843.
Aitkenhead, M.J., Coull, M.C., Towers, W., Hudson, G. & Black, H.I.J. 2012. Predicting soil chemical composition and other soil parameters from field observations using a neural network. Computers & Electronics in Agriculture, 82, 108–116.
Aitkenhead, M.J., Coull, M., Towers, W., Hudson, G. & Black, H.I.J. 2013. Prediction of soil characteristics and colour using data from the National Soils Inventory of Scotland. Geoderma, 200-201, 99–107.
Barouchas, P.E. & Moustakas, N.K. 2004. Soil colour and spectral analysis employing linear regression models. I. Effect of organic matter. International Agrophysics, 18, 1–10.
Bellon-Maurel, V. & McBratney, A. 2011. Near-Infrared (NIR) and Mid-Infrared (MIR) spectroscopic techniques for assessing the amount of carbon stock in soils – critical review and research perspectives. Soil Biology & Biochemistry, 43, 1398–1410.
Bishop, C.M. 1995. Neural Networks for Pattern Recognition. Oxford University Press, Oxford.
Boronkay, G. 2012. Colour Conversion Centre [WWW document]. URL http://ccc.orgfree.com/ [accessed on 21 August 2014].
Chaplot, V., Bernoux, M., Walter, C., Curmi, P. & Herpin, U. 2001. Soil carbon storage prediction in temperate hydromorphic soils using a morphologic index and digital elevation model. Soil Science, 166, 48–60.
Chapman, S.J., Bell, J.S., Campbell, C.D., Hudson, G., Lilly, A., Nolan, A.J. et al. 2013. Comparison of soil carbon stocks in Scottish soils between 1978 and 2009. European Journal of Soil Science, 64, 455–465.
CIE 1932. Commission internationale de l’Eclairage proceedings. Cambridge University Press, Cambridge.
Conant, R.T., Ogle, S.M., Paul, E.A. & Paustian, K. 2011. Measuring and monitoring soil organic carbon stocks in agricultural lands for climate mitigation. Frontiers in Ecology & the Environment, 9, 169–173.
Croft, H., Kuhn, N.J. & Anderson, K. 2012. On the use of remote sensing techniques for monitoring spatio-temporal soil organic carbon dynamics in agricultural systems. Catena, 94, 64–74.
De Vos, B., Vandecasteele, B., Deckers, J. & Muys, B. 2005. Capability of loss-on-ignition as a predictor of total organic carbon in non-calcareous forest soils. Communications in Soil Science & Plant Analysis, 36, 2899–2921.
Donnelly, D., Aitkenhead, M.J. & Coull, M.C. 2013. SOCiT Soil Carbon App for iPhone/Android [WWW document]. URL http://www.hutton.ac.uk/research/groups/information-and-computational-sciences/esmart [accessed on 29 January 2013].
Glenk, K. & Colombo, S. 2011. Designing policies to mitigate the agricultural contribution to climate change: an assessment of soil based carbon sequestration and its ancillary effects. Climatic Change, 105, 43–66.
Goh, A.T.C. 1995. Back-propagation neural networks for modeling complex systems. Artificial Intelligence in Engineering, 9, 143–151.
Haines-Young, R.H. & Potschin, M.B. 2009. Methodologies for Defining and Assessing Ecosystem Services. Final Report, JNCC, Project Code C08-0170-0062. The University of Nottingham, Nottingham.
Hoskins, B. 2002. Organic Matter by Loss on Ignition [WWW document]. URL http://www.naptprogram.org/files/napt/publications/method-papers/2002-organic-matter-by-loss-on-ignition.pdf [accessed on 21 August 2014].
Kassam, A., Friedrich, T., Shaxson, F. & Pretty, J. 2009. The spread of conservation agriculture: justification, sustainability and uptake. International Journal of Agricultural Sustainability, 7, 292–320.
Kibblewhite, M.G., Ritz, K. & Swift, M.J. 2008. Soil health in agricultural systems. Philosophical Transactions of the Royal Society B: Biological Sciences, 363, 685–701.
La, W.J., Sudduth, K.A., Chung, S.-O. & Kim, H.-J. 2008. Spectral reflectance estimates of surface soil physical and chemical properties. American Society of Agricultural & Biological Engineers Annual International Meeting, 2008, 4159–4172.
Lal, R. 2009. Soils and food sufficiency. A review. Agronomy for Sustainable Development, 29, 113–133.
Lal, R. 2010. Beyond Copenhagen: mitigating climate change and achieving food security through soil carbon sequestration. Food Security, 2, 169–177.
Li, Q.Q., Yue, T.X., Wang, C.Q., Zhang, W.J., Yu, Y., Li, B. et al. 2013. Spatially distributed modeling of soil organic matter across China: an application of artificial neural network approach. Catena, 104, 210–218.
Liles, G.C., Beaudette, D.E., O’Geen, A.T. & Horwath, W.R. 2013. Developing predictive soil C models for soils using quantitative color measurements. Soil Science Society of America Journal, 77, 2173–2181.
Lilly, A., Bell, J.S., Hudson, G., Nolan, A.J. & Towers, W. (Compilers) 2010. National Soil Inventory of Scotland 1 (NSIS_1): Site Location, Sampling and Profile Description Protocols. (1975–1988). Technical Bulletin. Macaulay Institute, Aberdeen.
Maes, J., Hauck, J., Paracchini, M.L., Ratamaki, O., Hutchins, M., Termanen, M. et al. 2013. Mainstreaming ecosystem services in EU policy. Current Opinion in Environmental Sustainability, 5, 128–134.
Martin, M.P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L. et al. 2011. Spatial distribution of soil organic carbon stocks in France. Biogeosciences, 8, 1053–1065.
McHenry, M.P. 2009. Farm soil carbon monitoring developments and land use change: unearthing relationships between paddock carbon stocks, monitoring technology and new market options in Western Australia. Mitigation & Adaptation Strategies for Global Change, 14, 497–512.
Munsell Color Company 1954. Soil Color Charts. Munsell Color Company Inc., Baltimore, MD.
Nocita, M., Stevens, A., Toth, G., Panagos, P., van Wesemael, B. & Montanarella, L. 2014. Prediction of soil organic carbon content by diffuse reflectance spectroscopy using a local partial least square regression approach. Soil Biology & Biochemistry, 68, 337–347.
Orr, H.G., Wilby, R.L., Hedger, M.M. & Brown, I. 2008. Climate change in the uplands: a UK perspective on safeguarding regulatory ecosystem services. Climate Research, 37, 77–98.
Robbins, M. 2011. Crops and Carbon: Paying Farmers to Combat Climate Change. Routledge, Taylor & Francis, Abingdon.
Saby, N.P.A., Bellamy, P.H., Morvan, X., Arrouays, D., Jones, R.J.A., Verheijen, F.G.A. et al. 2008. Will European soil-monitoring networks be able to detect changes in topsoil organic carbon content? Global Change Biology, 14, 2432–2442.
Stevens, A., van Wesemael, B., Bartholomeus, H., Rosillon, D., Tychon, B. & Ben-Dor, E. 2008. Laboratory, field and airborne spectroscopy for monitoring organic carbon content in agricultural soils. Geoderma, 144, 395–404.
Suuster, E., Ritz, C., Roostalu, H., Kolli, R. & Astover, A. 2012. Modelling soil organic carbon concentration of mineral soils in arable land using legacy soil data. European Journal of Soil Science, 63, 351–359.
Van Wesemael, B., Paustian, K., Andren, O., Cerri, C.E.P., Dodd, M., Etchevers, J. et al. 2011. How can soil monitoring networks be used to improve predictions of organic carbon pool dynamics and CO2 fluxes in agricultural soils? Plant & Soil, 338, 247–259.
Vasques, G.M., Grunwald, S. & Harris, W.G. 2010. Spectroscopic models of soil organic carbon in Florida, USA. Journal of Environmental Quality, 39, 923–934.
Zhang, C.S., Tang, Y., Xu, X.L. & Kiely, G. 2011. Towards spatial geochemical modelling: use of geographically weighted regression for mapping soil organic carbon contents in Ireland. Applied Geochemistry, 26, 1239–1248.
© 2014 British Society of Soil Science, European Journal of Soil Science
... However, the correlation of the model was less than 0.5 [42]. A feedforward back-propagation network with one hidden layer was used by Aitkenhead et al. (2015) as a predictive model of soil organic matter [43]. They calculated different qualities of models depending on values of organic matter in the training data set. ...
... However, the correlation of the model was less than 0.5 [42]. A feedforward back-propagation network with one hidden layer was used by Aitkenhead et al. (2015) as a predictive model of soil organic matter [43]. They calculated different qualities of models depending on values of organic matter in the training data set. ...
Article
Full-text available
Soil spatial variability mapping allows the delimitation of the number of soil samples investigated to describe agricultural areas; it is crucial in precision agriculture. Electrical soil parameters are promising factors for the delimitation of management zones. One of the soil parameters that affects yield is soil compaction. The objective of this work was to indicate electrical parameters useful for the delimitation of management zones connected with soil compaction. For this purpose, the measurement of apparent soil electrical conductivity and magnetic susceptibility was conducted at two depths: 0.5 and 1 m. Soil compaction was measured for a soil layer at 0–0.5 m. Relationships between electrical soil parameters and soil compaction were modelled with the use of two types of neural networks—multilayer perceptron (MLP) and radial basis function (RBF). Better prediction quality was observed for RBF models. It can be stated that in the mathematical model, the apparent soil electrical conductivity affects soil compaction significantly more than magnetic susceptibility. However, magnetic susceptibility gives additional information about soil properties, and therefore, both electrical parameters should be used simultaneously for the delimitation of management zones.
... Therefore, in these sys tems, artificial neural networks (ANN) operate more efficiently than regression methods. Numerous stud ies have been carried out to estimate soil variables through artificial neural networks (Zhou et al., 2008;Bocco et al., 2010;Gago et al., 2010;Parvizi et al., 2010;Mokhtari Karchegani et al., 2011;Besalatpour et al., 2013;Dai et al., 2014;Moghimi et al., 2014;Aitkenhead et al., 2015;Marashi et al., 2017;Khanbabakhani et al., 2019;Marashi et al., 2019). Also, some studies have been conducted to predict crop yield by remote sensing, stochastic, artificial neural network (ANN) and simulation models (Bannayan and Crout, 1999;O'Neal et al., 2002;Bartoszek, 2014;Farjam et al., 2014;Domínguez et al., 2015;Emamgholizadeh et al., 2015;Dias and Sentelhas, 2017;Mohammadi Torkashvand et al., 2017;Niedbała, 2019; based on weather, soil and growth charac teristics as input data. ...
Article
Full-text available
The purpose of this study was to predict the percentage and yield of chamomile essential oils using the artificial neural network system based on some soil physicochemical properties. Several habitats of chamomile cultivation were investigated and 100 soil samples were shipped to the greenhouse. The maximum and minimum of pH, EC, K, OM (organic matter), CCE (calcium carbonate equivalent), and clay in soils were 8.75-7.94, 1.6-1.0, 381-135, 2.30-0.22, 69-16, and 55.6-32.0, respectively. Growth indices, essential oil percentage, and yield were measured. Artificial neural network modeling was carried out to predict the essential oil concentration and yield using three groups of soil properties as a predictor: 1- nitrogen (N), phosphorus (P), potassium (K), and clay; 2- pH, EC, organic matter (OM) and clay; 3- CCE, clay, silt, sand, N, P, K, OM, pH, and EC. So, three pedotransfer functions (PTFs) were developed using the multi-layer perceptron (MPL) with Levenberg-Marquardt training algorithm for estimating chamomile essential oil content. Results evaluation of the accuracy and reliability of showed that, the third PTF (PTF3) which developed by all independent variables had the highest accuracy and reliability. Results also showed that, it is possible to predict the concentration and yield of chamomile essential oil based on soil physicochemical properties. This issue is important in terms of land suitability, identify areas susceptible to chamomile cultivation and planning for essential oil yields.
... Therefore, LOI is considered an important parameter to characterize the content of organic matter [31]. Moreover, high-organic clay typically has an LOI value greater than 20% [32]. It is important to note that expandable clay minerals may release their adsorbed and structural water upon heating during the LOI test, leading to an apparent weight loss [33]. ...
Article
Full-text available
The effective and sustainable treatment of high-water-content waste dredged clay (WDC) remains a significant challenge in water conservancy engineering. In this study, we focused on the treatment of WDC produced by Kumamoto Ohkirihata Reservoir. The study examined the effect of two types of cement-based solidifiers, namely, ordinary Portland cement (OPC) and cement–fly ash agent (DF), on three clay samples collected from different locations. The cone index test was used to assess the samples’ properties. The dosage of cement required for effective improvement with DF was significantly reduced (by about 47–55%), compared to OPC. Moreover, the dewatering efficiency of WDC improved by the simple dewatering method of vertically placing environmental protection materials. Within seven days, the average water content of the WDC decreased to below the liquid limit compared with natural air drying. Finally, the dosage of DF required to stabilize the WDC under effective improvement conditions was reduced by 37–58%, which is higher than the dosage of OPC reduction (22–50%). The reduction in water content reduced the pore space of the soil particles, benefiting the internal bonding of DF-stabilized clay. Dewatering methods facilitate the use of DF solidifiers, facilitating sustainable and environmentally friendly improvement in WDC.
... A later study [30] predicted SOC in a comparative format by using different color spaces. Other various studies also considered stronger relation between color of the soil sample and SOM [32], [33], [28], [29], [42], [47]. Later on this relationship was tested on cell-phone application SOCIT [34]. ...
Preprint
Full-text available
– Soil organic matter (SOM) and soil moisture contents (SMC) are two main properties in defining soil health. It is a challenge to measure organic matter and moisture content in soil, as the conventional methods are time, labor and money consuming. In order to overcome these challenges, various image processing based models have been proposed to predict SOM and SMC. Proposed model uses a stepwise multiple linear regression (SMLR) method to predict these properties on basis of soil color features like color moments, GLCMs and different color models as well, since, soil moisture and organic content are influenced by soil color. Multiple soil samples from field are collected with a certain distance in order to simulate continuous variation in the contents. Loss of ignition method is used to generate the ground truth to feed the model. For produce and compare the result first 34 and then 6 optimal predictor variables are used in model. The output results in external validation for SOM prediction were: R 2 = 0.07, RMSE = 0.76, RPIQ = 1.00 and that for SMC were: R 2 = 0.77, RMSE = 0.55, RPIQ = 1.07.
... Loss-on-ignition (LOI) is a parameter that represents the organic matter content [31]. Generally, clays with an LOI of more than 20% are regarded as highly organic clay [32]. The LOI test was carried out on the sediments from areas № 1 to № 3. The results showed that LOI = 26.15% in area № 1, and that of the samples in areas № 2 and 3 exceeded 30%. ...
Article
Full-text available
The purpose of this study was to assess the performance of high water content clayey sediments at different liquid limits as the clays are treated with cement-based solidifying materials. Three clay samples are obtained from different locations in the Kumamoto Reservoir. Two types of cement-based solidifying agents, namely, ordinary Portland cement and a cement–fly ash binder, were used. Using the initial water content of clay and the mixing amount of the solidifying agent as experimental variables, a cone penetration test was performed on the solidifying agent-stabilized clays to obtain the cone index (qc). The results showed that when the water content to cementitious content ratio (w/AW) was used as a parameter for evaluating the improvement of solidifying agent-stabilized clay, different forms of improvements were observed when different water and solidifying agent contents were used. This implied that the parameter w/AW was not suitable for evaluating the improvement of such clay. A new parameter, K, representing the content of solidifying agent, was introduced to account for the water content. For all sampled clays, the correlation coefficients for the K–ln qc relationship exceeded 0.9. Considering the effect of the liquid limit of the samples, the modified content of the solidifying agent (KL) was introduced to evaluate the cone index of the stabilized soils. It was discovered that the proposed equation unified the assessment of the improvement of the three samples of Kumamoto clayey sediments owing to the new parameter, KL.
... Estimation of several soil properties including organic matter content has been demonstrated using soil colour and environmental covariates captured using smartphone camera and location (Aitkenhead et al. 2012;Aitkenhead et al. 2015). Links between soil colour and physicochemical properties can be captured using nonlinear modelling approaches such as neural networks running on a server-side processor (Aitkenhead et al. 2013;Aitkenhead et al. 2016a). ...
Article
Full-text available
The Scottish Government has recognised that soils perform many vital functions for the health of the environment and economy. In the last decade, there has been significant research output from several organisations across Scotland, in collaboration with partners in the rest of the UK and further afield. In this review, I highlight recent research focused on soil organic matter in the context of the main external drivers (land management and climate change). This review demonstrates the strengths and successes of the relatively tightly integrated policy-research-regulatory landscape in Scotland. It also highlights the need for more and greater impact through interdisciplinary and transdisciplinary research involving soil scientists, social scientists, policymakers and land managers. Evidence is presented that meaningful (rather than incremental) changes to climate change mitigation and adaptation policies and practices are necessary, with a further need for researchers and policymakers to consider both local conditions and global impacts of future climate on the practical implementation of soil-based climate change mitigation and adaptation strategies in Scotland. The role of environmental and social scientists through advocacy as well as research is explored and discussed.
... AI methods are mainly very effective in solving complex problems that empirical and/or quasi-empirical models may not solve with sufficient efficiency (Caudill 1987). In the same context, Aitkenhead et al. (2015) used environmental and color factors in a neural network model to predict surface soil organic matter content in Scotland. They proposed two models: one for soils with little organic ingredients (LOI < 20%) and the second for all types of soils. ...
Article
Full-text available
Soil organic carbon has favorable effects on the chemical, physical and thermal properties of soil, as well as its biological activities. Organic matter carbon is one of the important elements in soil, which plays a crucial role in soil quality of the forest ecosystems. In this research, to exactly estimate carbon sequestration (CS) according to the organic carbon and bulk density, we used RBF, MLP and multiple regression models. To do so, we took 60 soil samples from the depth of 0–15 cm of soil, across an altitudinal gradient of the forest, located at the Tarbiat Modares University Training Forest, and physicochemical soil properties (i.e., nitrogen, calcium, potassium, clay, silt, sand, organic carbon, pH, EC, bulk density and soil water content) as input variables for prediction of CS were measured. The results showed that CS of the study region was affected by soil physical and chemical characteristics. Furthermore, in all states, the RBF model statistically proved to have better prediction of CS compared to the MLP neural network and regression analysis, where the highest correlation between input variables and CS predicted with the least error was evident for RBF model followed by MLP and regression analysis, respectively. Moreover, the rate of carbon sequestration was not significantly affected by the amount of silt, whereas soil water content and soil electrical conductivity slightly affected the CS rate.
... Secondly, existing soil mapping of this region is still relatively spatially coarse (e.g. the FAO-UN-ESCO 1:5,000,000 soil map of the world), making it difficult to use such datasets for a study area approximately 20 km across. Aitkenhead et al. (2015) demonstrated a soil organic matter estimation RSQ value of 0.674 for mineral topsoils in Scotland using a similar approach, although for this the range of values was double that of the current work which achieved an RSQ of 0.500, and the number of samples available was higher. Aitkenhead et al. (2013) also showed that soil colour alone could be used to estimate organic matter content and calcium; they demonstrated comparable results with nitrogen and molybdenum although these two elements were less well estimated here using colour alone. ...
Article
The links between soil properties and smartphone imagery were investigated for 273 samples in the Halaba area of south-west Ethiopia. The aim of this was to explore the possibility of using a smartphone-based system to estimate soil properties in the field, without the need for sampling and laboratory analysis. This presents an opportunity to develop low cost soil assessment in remote locations. Imagery and associated site characteristics were captured using an ODK (Open Data Kit) interface developed specifically for the project. Two types of model linking image information to soil properties were explored, backpropagation neural networks (NN) and partial least squares (PLS). Models were generated with colour alone, spatial covariates alone and a combination of colour and spatial covariates. Two sets of data, for soil chemistry and soil physical properties, were modelled. For both NN and PLS models, estimation accuracy for chemical properties was consistently higher using colour and spatial covariate information together rather than colour or spatial covariates alone. For physical properties a similar pattern was seen but this was less clear, and estimation of physical properties was less successful based on statistical model validation.
Article
Forested wetland soils within the Piedmont and Coastal Plain physiographic provinces of Northern Virginia (NOVA) were investigated to determine the utility of a handheld colorimeter, the Nix Pro Color Sensor (“Nix”), for predicting carbon contents (TC) and stocks (TC stocks) from on-site color measurements. Both the color variables recorded with each Nix scan (“Nix color variables”; n = 15) and carbon contents significantly differed between sites, with redder soils (higher a and h) at Piedmont sites, and higher TC at sites with darker soils (lower values of L, or lightness; p < 0.05). Nix–carbon correlation analysis revealed strong relationships between L (lightness), X (a virtual spectral variable), R (additive red), and KK (black) and log-transformed TC (Ln[TC]; |r| = 0.70; p < 0.01 for all). Simple linear regressions were conducted to identify how well these four final Nix variables could predict soil carbon. Using all color measurements, about 50% of Ln(TC) variability could be explained by L, X, R, or KK (p < 0.01), yet with higher predictive power obtained for Coastal Plain soils (0.55 < R² < 0.65; p < 0.01). Regression model strength was maximized between Ln(TC) and the four final Nix variables using simple linear regressions when color measurements observed at a specific depth were first averaged (0.66 < R² < 0.70; p < 0.01). While further study is warranted to investigate Nix applicability within various soil settings, these results demonstrate potential for the Nix and its soil color measurements to assist with rapid field-based assessments of soil carbon in forested wetlands.
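The simple linear regressions described in this abstract, with log-transformed carbon regressed on a single colour variable, can be illustrated with a minimal sketch. The function names `fit_log_linear` and `predict_tc` are hypothetical, and the data used to exercise them would be synthetic; none of this reproduces the study's measurements or code.

```python
import math
from statistics import mean


def fit_log_linear(lightness, total_carbon):
    """Fit Ln(TC) = intercept + slope * L by ordinary least squares.

    Returns (slope, intercept, r_squared) for the log-transformed model.
    """
    log_tc = [math.log(tc) for tc in total_carbon]
    x_bar, y_bar = mean(lightness), mean(log_tc)
    sxx = sum((x - x_bar) ** 2 for x in lightness)
    syy = sum((y - y_bar) ** 2 for y in log_tc)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(lightness, log_tc))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For a single predictor, R² is the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def predict_tc(lightness_value, slope, intercept):
    """Back-transform a prediction from the log scale to carbon content."""
    return math.exp(intercept + slope * lightness_value)
```

A negative fitted slope corresponds to the pattern reported above, in which darker soils (lower L) have higher carbon contents.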
Article
Soil color is frequently used by researchers to determine soil properties such as soil organic matter (SOM). However, soil moisture can darken soil color, which seriously limits the accuracy of SOM estimates. This study focused on the influence of soil moisture on the SOM estimate by using our new moisture-based multicolor reconstruction (MMR) method. On the basis of RGB color and moisture values of moist soil samples, the optimal MMR (moist soil) model—root mean square error of validation = 4.537 g/kg, residual prediction deviation of validation = 1.681, ratio of performance to interquartile range of validation = 2.939—was obtained to develop the final model, which performed better than the RGB (moist soil) model. Our method can reduce the influence of soil moisture and increase accuracy for multicolor modeling, and explores a new way to accurately determine the SOM content.
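The three validation statistics quoted in this abstract are commonly defined as follows: RMSE is the root mean square error, residual prediction deviation (RPD) is the standard deviation of the observed values divided by RMSE, and ratio of performance to interquartile range (RPIQ) is the interquartile range of the observed values divided by RMSE. The sketch below assumes the sample standard deviation and inclusive quartiles; conventions vary between papers, and the abstract does not state which one the authors used. The function name `validation_metrics` is hypothetical.

```python
import math
from statistics import quantiles, stdev


def validation_metrics(observed, predicted):
    """Return RMSE, RPD and RPIQ for paired observed and predicted values."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    # Quartiles of the observed values (assumed convention: inclusive method).
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    return {
        "rmse": rmse,
        "rpd": stdev(observed) / rmse,   # sample SD / RMSE
        "rpiq": (q3 - q1) / rmse,        # interquartile range / RMSE
    }
```

Higher RPD and RPIQ values indicate that the prediction error is small relative to the spread of the observed data.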
Article
Soil organic carbon plays a major role in the global carbon budget, and can act as a source or a sink of atmospheric carbon, thereby possibly influencing the course of climate change. Changes in soil organic carbon (SOC) stocks are now taken into account in international negotiations regarding climate change. Consequently, developing sampling schemes and models for estimating the spatial distribution of SOC stocks is a priority. The French soil monitoring network has been established on a 16 km × 16 km grid and the first sampling campaign has recently been completed, providing around 2200 measurements of stocks of soil organic carbon, obtained through an in situ composite sampling, uniformly distributed over the French territory. We calibrated a boosted regression tree model on the observed stocks, modelling SOC stocks as a function of other variables such as climatic parameters, vegetation net primary productivity, soil properties and land use. The calibrated model was evaluated through cross-validation and eventually used for estimating SOC stocks for mainland France. Two other models were calibrated on forest and agricultural soils separately, in order to assess more precisely the influence of pedo-climatic variables on SOC for such soils. The boosted regression tree model showed good predictive ability, and enabled quantification of relationships between SOC stocks and pedo-climatic variables (plus their interactions) over the French territory. These relationships strongly depended on the land use, and more specifically, differed between forest soils and cultivated soil. The total estimate of SOC stocks in France was 3.260 ± 0.872 PgC for the first 30 cm. It was compared to another estimate, based on the previously published European soil organic carbon and bulk density maps, of 5.303 PgC. 
We demonstrate that the present estimate might better represent the actual SOC stock distributions of France, and consequently that the previously published approach at the European level greatly overestimates SOC stocks.
Article
As regional and continental carbon balances of terrestrial ecosystems become available, it becomes clear that the soils are the largest source of uncertainty. Repeated inventories of soil organic carbon (SOC) organized in soil monitoring networks (SMN) are being implemented in a number of countries. This paper reviews the concepts and design of SMNs in ten countries, and discusses the contri
Article
Rapid low-cost methods to quantify soil C concentrations are needed to support local through global resource inventories. Color is a key indicator of many soil properties, with a strong linkage between darkness and soil organic matter (SOM), making it an important indicator for soil taxonomy, soil quality, and fertility. This study investigated relationships between quantitative measurements of soil color and C in approximately 1900 forest soil samples, representing a wide range of soil development, parent material (PM), and C concentrations. Utilizing a handheld chromameter and the CIELAB color space, soil darkness (L) was employed as a continuous predictor of C in simple models (C ~ darkness) with a slope of approximately -0.1 across the entire population. Grouping samples by taxonomy and PM influenced model coefficients with strong correlation in weakly developed soil groups (Inceptisols r = -0.9) and more felsic PM. Soil redness (A) has a strong influence on model performance, altering slope and increasing data scatter. Including redness in multivariate relationships greatly increased fit, aligning model slopes at -0.1 for all PM. Ordinary least squares models reached predictive accuracy of <0.5% (RMSE) for specific soils, certain PM classes, and in samples with <4% C. These results demonstrate the utility of quantified soil color to drive predictive relationships and support data development to refine ecosystem C budgets and quantify soil C credits.
Conference Paper
Optical diffuse reflectance sensing in visible and near-infrared wavelength ranges is one approach to rapidly quantify soil properties for site-specific management. The objectives of this study were (1) to determine the accuracy of the reflectance approach for estimating physical and chemical properties of selected Missouri and Illinois surface soils, and (2) to compare the accuracies of soil P and K estimates from reflectance sensing, a prototype ion-selective electrode (ISE) system, and a combination of both reflectance and ISE sensing. Diffuse reflectance spectra of air-dried, sieved samples were obtained in the laboratory. Calibrations relating spectra to soil properties determined by standard methods were developed using partial least squares (PLS) regression. Good estimates (R² = 0.83 to 0.92) were obtained using spectral data for soil texture fractions, organic matter, and CEC. Estimates of pH, P, and K were not good (R² < 0.7), and P and K estimates were considerably worse than achieved by ISE. Including both spectral and ISE information in P and K calibrations provided very good results (R² = 0.93), which were a considerable improvement over ISE data alone. Further investigation of this combined approach is warranted.
Article
Due to the large spatial variation of soil organic carbon (SOC) content, assessing the current state of SOC for large areas is costly and time consuming. Visible and Near Infrared Diffuse Reflectance Spectroscopy (Vis-NIR DRS) is a fast and cheap tool for measuring SOC based on empirical equations and spectral libraries. While the approach has been demonstrated to yield accurate predictions for databases containing samples belonging to soils with similar characteristics such as mineralogy, texture, iron and CaCO3 content, spectroscopic calibrations have been less successful when applied to large and diverse soil spectral libraries. The scope of this study was to predict SOC using a local partial least squares regression approach. In total, 19,969 topsoil (0–20 cm) samples collected all over the European Union were analyzed for physical and chemical properties, and scanned with a Vis-NIR spectrometer in a single laboratory. The local regression method builds a different multivariate model for each sample to predict. Each local model is trained with neighbouring samples selected from a large spectral library, based on their spectral similarity with the sample to predict. We modified the local regression procedure by including other covariates (geographical and texture information) in the computation of the distance between samples. The results showed good prediction ability for mineral soils under cropland (RMSE = 3.6 g C kg−1) and grassland (RMSE = 7.2 g C kg−1). Predictions of mineral soils under woodland (RMSE = 11.9 g C kg−1) and organic soils (RMSE = 51.1 g C kg−1) were less accurate. The use of sand content in the computation of the sample similarities provided the most accurate SOC predictions due to its influence on light scattering properties of soils. In large datasets, using additional soil or environmental information makes it possible to select neighbours that have overall the same soil composition as the samples to predict, resulting in more accurate models.
This study shows that (i) it is possible to realize low-cost estimations of SOC at continental scale using large spectral libraries with a reasonable accuracy, and (ii) the local approach is a valuable tool to deal with large datasets, especially if existing soil property maps or soil legacy data could be used as covariates in the SOC prediction models.
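The neighbour-selection idea in this abstract, where a covariate such as sand content contributes to the distance between samples, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the distance is assumed to be a spectral Euclidean distance plus a weighted absolute difference in sand content, and an inverse-distance-weighted mean of the neighbours stands in for the local partial least squares fit. The names `combined_distance`, `predict_local`, `k` and `sand_weight` are hypothetical.

```python
import math


def combined_distance(spec_a, spec_b, sand_a, sand_b, sand_weight=1.0):
    """Spectral Euclidean distance plus a weighted sand-content difference."""
    spectral = math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))
    return spectral + sand_weight * abs(sand_a - sand_b)


def predict_local(library, target_spectrum, target_sand, k=3, sand_weight=1.0):
    """Predict SOC for one target sample from its k nearest library samples.

    library: list of (spectrum, sand_fraction, soc) tuples.
    A separate neighbourhood is selected for every target sample; the local
    model here is an inverse-distance-weighted mean (a stand-in for PLS).
    """
    ranked = sorted(
        (
            (combined_distance(spec, target_spectrum, sand, target_sand, sand_weight), soc)
            for spec, sand, soc in library
        ),
        key=lambda pair: pair[0],
    )[:k]
    # Small constant avoids division by zero for identical samples.
    weights = [1.0 / (dist + 1e-9) for dist, _ in ranked]
    return sum(w * soc for w, (_, soc) in zip(weights, ranked)) / sum(weights)
```

Setting `sand_weight` to zero reduces the selection to purely spectral similarity, which makes it easy to compare the two neighbourhoods for the same target sample.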