Apr. 2019, Vol. 9, No. 2
doi: 10.1093/af/vfz003
© Liebe and White.
This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/4.0/),
which permits unrestricted reuse, distribution, and reproduction in any
medium, provided the original work is properly cited.
Feature Article
Analytics in sustainable precision animal
nutrition
Douglas M. Liebe and Robin R. White
Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, VA
Key words: computer vision, data mining, Internet of things, machine
learning
Introduction
The global population, resource, and climate dynamics suggest we must improve sustainability of food production systems (Ohlsson, 2014; Kleinman et al., 2018). Improving livestock production sustainability is particularly important because a significant portion of the projected increases in global food demand is anticipated to come from livestock (Thornton, 2010). Improving sustainability of livestock production systems can be achieved through optimized reproductive, genetic, nutritional, and health management (White et al., 2014, 2015). Management decisions within livestock production can be thought of as two interleaved feedback loops. The first feedback loop is between the animal and the environment: the animal is influenced by its environment and, in turn, influences its environment. The second feedback loop is between the animal and the manager: the manager takes information about the animal's behavior and attempts to influence the environment to optimize the animal's performance (Figure 1). Managers make management decisions on timescales ranging from immediate to relaxed. An example of an immediate management decision would be a farmer identifying an animal as sick, isolating the animal, and treating the animal for the illness. We term this immediate because the farmer must identify the sick animal and react to the diagnosis as soon as possible. An example of a relaxed management decision would be the farmer electing to change the feed provided to the animals in response to something observed about their production (i.e., the cows are producing poorly, so the ration is changed to provide higher nutrient density to correct a nutrient shortfall). This decision is more relaxed because its formulation and response are subject to natural, biological delays (i.e., it may take days to weeks to see a production response to a new diet). Improving the precision of these decision-making processes and reducing the burden of decision making on farmers are two critical steps toward improving sustainability of livestock production. Precision agricultural technologies have been identified as one possible solution (Berckmans, 2014; Tullo et al., 2019).
Precision field crop agriculture has dramatically expanded and industrialized over the last several decades, demonstrating substantial opportunity for using precision technologies in agriculture (Thorp and Tian, 2004; Nash et al., 2009; Zhang and Kovacs, 2012). Such technologies include global positioning system (GPS) guided equipment, unmanned aerial vehicles, robotic harvesting and monitoring equipment, automated application of agrochemicals, and others. Precision animal agriculture, on the other hand, has had limited expansion. Although technologies such as temperature monitors, rumen sensors, and robotic milking machines exist, the uptake and industrialization of precision animal agriculture have not paralleled those in crop agriculture.
Implications
• The global population, resource, and climate dynamics suggest we must improve sustainability of food production systems; precision feeding of livestock may be one way to accomplish this goal.
• Analytics for precision management can be classified according to four levels: I) technique, II) data interpretation, III) integration of information, and IV) decision making. Most current animal agricultural analytics fall under categories I and II. Moving toward analytics that address integration of information and decision making is of critical importance.
• Data analytical techniques such as linear modeling and machine learning provide unique and important tools for interpreting data obtained from on-farm sensors. These techniques each apply to the different levels of precision management classification.
• Assessing adequacy and performance of analytics tools must, by default, depend on the objective of those tools and the type of response considered. As more advanced level III and IV systems are developed, integration of expert opinion into analytics may be essential to optimize performance and relevance on-farm.
There are several differences between crop and livestock management that may contribute to this difference in technology uptake. For example, management timescales for crop agriculture interventions, while highly profitable, are often measured in days or weeks; in animal agriculture, timescales for certain management can range from hours to days. For issues of nutrition, health, productivity, and efficiency, animal agriculture must treat both the individual and the collective, whereas crop agriculture focuses primarily on the field (rather than on individual plants). Animal losses are also perceived differently than crop losses, possibly imposing higher standards on animal-based decision technology. Collectively, these challenges mean that animal agriculture will likely require different types of technological interventions than have been pioneered in crop systems. Exploring opportunities for where precision technologies may be relevant in the livestock nutrition space exemplifies this need.
Management applications for precision animal
nutrition
Optimizing rumen fermentation. The idea that fermentation can be optimized if degradable carbohydrate sources and degradable protein sources are properly matched has been contemplated for decades (Sinclair, 1995). The theory behind optimizing nutrient synchrony suggests that fermentations will be optimized if they are never limited by energy or nitrogen (i.e., supplies are balanced). Despite this theory being sound, achieving nutrient synchrony within rumen fermentations is extremely difficult to accomplish with currently available technologies (Hall and Huntington, 2008). One potential reason for this challenge is the limited real-time data available on the fermentation environment. Several models attempt to account for nutrient degradation kinetics (Hanigan et al., 2013; Higgs et al., 2015; Van Amburgh et al., 2015; Li et al., 2018); however, obtaining data to construct and evaluate models of degradation kinetics in vivo often requires expensive experiments. The advent of technologies such as indwelling rumen sensors has enabled more precise understanding of how pH changes over the course of a day. Expanding these sensors to record other important metabolites could enable development of feeding recommendations that take fermentation profile into account more precisely.
Detection of metabolic diseases. It is possible to use analytics to identify risk of metabolic diseases. Existing efforts to identify other disease states (e.g., mastitis) have shown moderate promise. Much like metabolic diseases, mastitis is extremely costly to the dairy industry. Diseases are often difficult to predict due to the imbalance of positive results (disease cases) relative to the population. For example, the incidence rate of clinical mastitis varies among farms and depends on many factors such as housing or location. The national average is near 15 cases per 100 cow lactations, or 1 case per 2,033 cow days, assuming a 305-day lactation (McDougall et al., 2007). Put another way, a priori, a randomly selected lactating cow from a random herd is only approximately 0.05% likely to exhibit clinical mastitis on a given day. On some farms, this rate may be 0.1% or higher. The low density of positive test cases and the variation in the expected rate of positive test cases both cause challenges for developing robust predictions.
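The incidence arithmetic above can be checked directly; the following minimal sketch uses only the figures quoted above (15 cases per 100 lactations of 305 days), not new data.

```python
# Check of the quoted incidence figures; no new data are assumed.
cases_per_100_lactations = 15
cow_days = 100 * 305                        # 100 lactations of 305 days each
print(cow_days / cases_per_100_lactations)  # ≈ 2,033 cow days per case
print(f"{cases_per_100_lactations / cow_days:.2%} risk per cow day")  # ≈ 0.05%
```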
Sparse datasets, the analytical term for datasets in which positive test cases make up a disproportionately small share of observations, are a common problem in present-day analytics (Han et al., 2015; Greenland et al., 2016). However, due to the widespread nature of the issue, new analytical techniques such as modified tree-based algorithms can learn patterns while maintaining the underlying proportion of cases in the training data (Ushikubo et al., 2017). Alternatively, the collation of larger datasets is also advantageous for producing better metabolic disease predictions. There is a tendency to collect new data to train new models, but in cases with sparse data, the combination of past data and new data will lead to richer training sets. Consider that each additional positive training case will improve accuracy far more than each new negative case. In fact, removing negative cases to artificially improve the proportion of positive cases can help to train models. The caveat to training on stratified datasets is that they must be properly validated on datasets with the appropriate proportion of positive cases to determine real-world utility. By utilizing strategies designed for the problem of sparse data in machine learning, predicting metabolic disease will become easier and, most importantly, more accurate, with fewer false positives.
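As a concrete illustration of the train-on-balanced, validate-on-sparse workflow described above, the sketch below undersamples negative cases for training and validates on held-out data that keep the true class proportion. The simulated herd data, the 1:5 rebalancing ratio, and the random-forest learner are all illustrative assumptions, not a prescription.

```python
# A minimal sketch, assuming simulated herd data: train a tree-based model on
# a rebalanced (negative-undersampled) set, then validate on a held-out set
# that keeps the true, sparse proportion of positive cases.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)

# Simulated data: roughly 0.5% of rows are positive (disease) cases.
n = 20_000
X = rng.normal(size=(n, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 3.9).astype(int)
X_train, X_val, y_train, y_val = X[:15_000], X[15_000:], y[:15_000], y[15_000:]

# Undersample negatives so the training set is 1 positive : 5 negatives.
pos = np.flatnonzero(y_train == 1)
neg = rng.choice(np.flatnonzero(y_train == 0), size=5 * len(pos), replace=False)
idx = np.concatenate([pos, neg])
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train[idx], y_train[idx])

# Validation keeps the real-world proportion, as the caveat above requires.
pred = model.predict(X_val)
print("sensitivity:", recall_score(y_val, pred, zero_division=0))
print("positive predictive value:", precision_score(y_val, pred, zero_division=0))
```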
Response-based nutrient requirement recommendations. A major limitation of existing nutrient requirement systems, like the National Research Council requirements for dairy cattle (NRC, 2001), is the requirement-based nature of the recommendations. Maximizing production mass is often not the same as optimizing production efficiency. Multicriteria optimization has previously been used to formulate rations that simultaneously achieve multiple environmental goals (White et al., 2014, 2015). Optimizing productivity or economic parameters could also be accomplished with this technique if the underlying equations linked dietary inputs with productive outputs in a responsive way. A challenge with response-based nutrient requirement systems is that most of the current data that could be used to develop such a system rely on pen-fed cattle. Responses of individuals are likely unique, and such a response-based model would be more useful if feeding systems and nutrition models did a better job of representing the individual rather than the collective.
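As one hedged illustration of multicriteria ration formulation, the sketch below minimizes a weighted sum of feed cost and a methane proxy subject to energy, protein, and intake constraints. Every coefficient (feed prices, nutrient densities, emission factors, and requirement targets) is an invented placeholder chosen only to make the program feasible, not a nutritional recommendation.

```python
# Minimal sketch of multi-objective ration formulation via a weighted sum,
# assuming illustrative nutrient and emission coefficients per feed.
from scipy.optimize import linprog

# Feeds: corn silage, alfalfa hay, corn grain, soybean meal (kg DM/cow/day).
cost = [0.10, 0.18, 0.16, 0.35]       # $/kg DM (assumed)
methane = [12.0, 14.0, 8.0, 6.0]      # g CH4 per kg DM (assumed)
weight_cost, weight_ch4 = 1.0, 0.01   # relative weights on the two objectives
objective = [weight_cost * c + weight_ch4 * m for c, m in zip(cost, methane)]

# Constraints: meet minimum energy and protein supplies (illustrative values).
energy = [1.45, 1.30, 1.85, 1.95]     # Mcal NEl per kg DM
protein = [0.08, 0.18, 0.09, 0.49]    # kg CP per kg DM
A_ub = [[-e for e in energy], [-p for p in protein]]
b_ub = [-34.0, -3.4]                  # >= 34 Mcal NEl and >= 3.4 kg CP per day

# Total intake fixed at 23 kg DM per day.
res = linprog(objective, A_ub=A_ub, b_ub=b_ub,
              A_eq=[[1, 1, 1, 1]], b_eq=[23.0],
              bounds=[(0, None)] * 4, method="highs")
print(dict(zip(["silage", "hay", "grain", "soy"], res.x.round(2))))
```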
Figure 1. Depiction of the feedback loops between the farm manager, animal, and environment. The animal and environment influence each other, as do the animal and the manager's decisions about the animal. Additionally, the manager can make decisions about the environment that will influence the animal.
Precision nutrition research. In a wide variety of ruminant nutrition research, access to the rumen is obtained through rumen cannulae; however, sampling through this orifice is physically difficult and often results in mixing of naturally stratified (vertical and horizontal) rumen contents. The physical difficulty in sampling the rumen can impede precision monitoring of difficult-to-reach areas. Additionally, disrupting the rumen environment through sampling physically or chemically alters the unique microclimates that are thought to exist within the rumen, precluding accurate and representative sampling. Collectively, these issues make accessing unique microclimates within the rumen a challenge. The availability of a platform that can monitor rumen sensors would be valuable to the study of these unique rumen microclimates.
What limitations exist for current technologies? Rutten and colleagues summarized 126 publications describing 139 dairy sensor systems from the period 2002 to 2012 (Rutten et al., 2013). The systems were compared based on four levels: I) technique, II) data interpretation, III) integration of information, and IV) decision making. Systems that accomplish all four of these levels are often referred to as cyberphysical systems. These cyberphysical systems are often an automated network of sensors, networking technologies, analytics, and actuation technologies that work in combination with, or independent of, the farmer to effect management changes based on real-time sensed information on-farm. None of the 139 sensor systems evaluated by Rutten and colleagues included integration of sensed metrics with other information available on the farm to produce management advice or automated decision making (Rutten et al., 2013). Most sensor systems that were used in the farmer's decision process provided only the raw data measured by the sensor, or a probability (such as the probability of disease given the sensor data). In both cases, the farmer is left to their intuition to integrate the information and actually make a management decision. Although basic linear models or logit models produce predictions that are correct on average over a group, these models cannot account for increased variation in individuals. The models being used to interpret data, as referenced in level II of Rutten et al. (2013), can struggle under the complexity of decision making. For example, although there may be a manageable number of factors that affect the prediction of ketosis, the number of factors affecting the costs and benefits of treating that ketosis is surely greater. Put another way, knowing that a cow is 35% ± 2% likely to be ketotic tomorrow does not say anything about whether the farmer should check the cow, treat the cow, cull her, or something else. To properly assess the promise of analytics in creating cyberphysical systems capable of filling all four levels of the Rutten et al. (2013) summary of agricultural systems, we will present a common precision nutrition aim: automated individualized feeding of dairy cows. Using this example objective, we highlight several possible alternative analytical approaches and discuss their strengths and potential pitfalls relevant to this objective.
A nutrition analytics example: automated individual feeding
Automated individualized feeding. Given the variation among individual animals, it is reasonable to assume that by using data specific to each animal, we can make better decisions on what, and how much, to feed. As we have previously noted, because individuals likely have differing and unique requirements, model-based feeding can, at best, optimize productivity for the whole farm. Individual feeding requires the ability to collect data specific to each animal and analytics capable of estimating individual requirements from those data. Feeding individuals eliminates the need to over-feed some animals to avoid under-feeding others, likely leading to more targeted feeding practices. One does not necessarily need to feed each animal individually; this same reduction in over- and under-feeding can be accomplished simply by reducing the variation in the feeding group, either by feeding more like-animals together or by feeding animals in smaller groups. An example of variance reduction through smaller groupings of animals would be the use of different feeding groups by lactation number in dairy cows. Because nutrient requirements are vastly different for first and fourth lactation cows, separating them reduces the variance in feed requirement predictions. Another more targeted example of individualized feeding is concentrate supplement feeding, in which a larger group of animals receives the same basal diet and the supplement is provided separately to smaller groups (Dela Rue and Eastwood, 2017). However, this type of individualized feeding, as noted by Dela Rue and Eastwood (2017), has not been shown to provide marginal benefit to farmers. Multiple recent studies that evaluated individualized supplement feeding saw no improvement in milk production, body condition score, or body weight (Lawrence et al., 2015; Dale et al., 2016; Little et al., 2016). Although it seems intuitive that more individualized feeding regimens would lead to better performance, this is not always what occurs in practice. These limitations may be because of the aforementioned issues with requirement models, which are based on data from groups of animals, not individuals. Another limitation might be the simplicity of the analytics used for feeding recommendations. Of the three studies cited above that showed no increase in performance with individualized concentrate feeding, all used only one variable (milk yield) to inform the concentrate requirement. In one study, only two levels of concentrate based on milk yield were fed; in the other two, a linear multiplier of milk yield was used to determine concentrate. Such low-dimensionality models, using only one variable to predict a response, limit the robustness of the predictions and results. We will examine the potential of higher-level modeling approaches by examining the current infrastructure to support cyberphysical systems in the four levels described by Rutten et al. (2013).
Current cyberphysical systems infrastructure. Level I, the techniques for data collection, comprises technologies such as radio frequency identification (RFID) tags, accelerometers, and other output measurement software (e.g., inline
milking parlor sensors). These data, collected daily or even in real time, can be used to broadly evaluate the performance of animals. One of the issues with techniques for collecting raw data is interpretation. With only raw data, it is hard to determine the cause–effect relationship between feeding and performance. For example, the fact that the daily step count of an animal has increased on a new diet does not tell the farmer whether to continue feeding this diet or what needs to be changed. Rather, raw data must be interpreted before they can be used effectively to make diet decisions. Level II, the interpretation of sensor data, seeks to add context to sensor data with emphasis on explaining such relationships. Many models attempt to predict intake requirements of dairy cows using raw data as predictors (Jensen et al., 2015). Jensen et al. (2015) evaluated models that were used on a national scale in different countries. All models were fit to held-out intake data to determine the residual error in each prediction model. The root mean square prediction error for each model ranged between 1.2 and 3.2 kg dry matter per day (Jensen et al., 2015). The held-out data included 94 treatment means derived from 917 lactating dairy cows. A given model's average prediction was near 2.0 kg of dry matter greater or less than a cow's average intake. If these results were applied to individual cow days, the variance would necessarily be greater than the variance in predictions for a cow's average intake. Models predicting dry matter intake tend to be simple, lending themselves to being correct on average, which is less useful in individualized feeding because response variance increases. In a review of linear models predicting dry matter intake (Jensen et al., 2015), models referred to as "advanced" were those that incorporated interaction terms into the linear model, specifically the models "TDMI" and "NorFor" (Huhtanen et al., 2011; Volden et al., 2011). Many recent publications involve predicting intake using fewer than 10 total predictor variables and rely on basic linear regression (McParland et al., 2014; de Haas et al., 2015; Shetty et al., 2017; White et al., 2017). Most models attempt to find the few variables that will reduce the variance better than previous models. At some point, we will not be able to find a selection of 10 or fewer variables that continue to reduce variance in a meaningful way. One advance in data analytics is hierarchical modeling, which works well when there are many models using varying parameters to predict the same response. Making a "model of models" can improve accuracy beyond that of any one model in the group (Gelman, 2006). This is possible due to uncorrelated error structures in different submodels. To create an example hierarchical model for predicting dry matter intake in dairy cows, we could combine the outputs of models built on herd-level data with models built on different individual cow measurements to make a more accurate prediction of individual dry matter intake than any single model alone. Although hierarchical modeling is just a framework, there are many useful ways to combine existing models that can improve model accuracy. Models can be weighted based on accuracy in a test dataset, the variance of predictions, or even prior knowledge.
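A minimal sketch of such a "model of models" follows: two sub-models, one using herd-level predictors and one using cow-level predictors, are combined by a meta-model fit to held-out data. The simulated data and the linear forms of the sub-models are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a "model of models": sub-model predictions are combined
# by a meta-model. All data and variable names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 600
herd_X = rng.normal(size=(n, 3))   # herd-level predictors (diet, season, ...)
cow_X = rng.normal(size=(n, 4))    # cow-level predictors (milk, BW, activity, ...)
dmi = (22 + herd_X @ np.array([1.0, 0.5, -0.3])
       + cow_X @ np.array([0.8, 0.4, -0.2, 0.3])
       + rng.normal(scale=1.5, size=n))

# Sub-models: one per data source.
herd_model = LinearRegression().fit(herd_X[:400], dmi[:400])
cow_model = LinearRegression().fit(cow_X[:400], dmi[:400])

# Meta-model: weights the sub-model outputs; fit on held-out rows so the
# weights are not biased by the sub-models' own training data.
meta_X = np.column_stack([herd_model.predict(herd_X[400:]),
                          cow_model.predict(cow_X[400:])])
meta_model = LinearRegression().fit(meta_X, dmi[400:])
print("sub-model weights:", meta_model.coef_.round(2))
```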
With over 9 million dairy cows in the United States, it intuitively seems easy to collect sufficient data to predict intake; however, this is not necessarily the case (McParland et al., 2014). First, data must be collated, not dispersed, to create better-trained models. There are incentives now for farmers to continue to collect individual intake data and genetic data relating to intake to help inform farmers in the future (Berry et al., 2014). An estimated 89% of genetic variation in dry matter intake could be explained with only four common animal characteristics, according to one meta-analysis of genetic studies (Berry and Crowley, 2013). Although we have great amounts of data, there are near-infinite permutations of cow characteristics that would need to be predicted to improve dry matter intake prediction. Luckily, data analytics offers ways to reduce the dimensionality of problems and to group similar animals together to make the prediction space more manageable. Principal component analysis attempts to reduce dimensionality while maintaining maximal variance in the remaining dimensions using an orthogonal transformation (Pearson, 1901). Consider a three-dimensional set of data, shown in Figure 2. If we know the groupings ahead of time, we can find an angle, using all three factors, that maximizes variance in the dataset. This can be visualized by shining a flashlight through the data at different angles and observing the shadow cast along the "wall's" two axes. The angle of the flashlight that casts the shadow with the least variance within groups indicates the two-dimensional plane onto which to condense the data. By using all three factors but condensing the descriptors into two values for each point, we have reduced the dimensionality at minimal variance cost between groups. This is evident in the second image in Figure 2. Principal component analysis can also help discern groups, as the analysis is sensitive to scale changes and can be used to determine the distance between two multi-dimensional points in space. Traditionally, a machine learning technique like k-nearest neighbors (Altman, 1992) or k-means (Lloyd, 1982) is used to determine the similarity between points. In our example with a herd of cows that we need to predict and feed individually, a linear model trained on the entire herd will only be right on average. If we do not have sufficient data to make low-variance predictions for individual cows, we could employ principal component analysis on the individual cow data to determine which cows are most similar, combine their data, and train models on these smaller combined datasets of similar cows to achieve more accurate results. By using a fixed modeling procedure and measure of accuracy, we could iteratively test models using data from smaller groups until we no longer saw an improvement in accuracy. Consider the scenario outlined in Figure 3, which illustrates the framework for using principal component analysis to find the optimal groupings for a given model.
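A minimal sketch of the Figure 3 workflow, under the assumption of simulated cow data and an arbitrary choice of three clusters:

```python
# Project cow data with PCA, cluster similar cows, and fit the same model
# form within each cluster. Data and cluster count are illustrative.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 6))                  # six cow-level predictors
dmi = 20 + X @ rng.normal(size=6) + rng.normal(scale=1.0, size=300)

scores = PCA(n_components=2).fit_transform(X)  # condense six factors to two
groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)

# The model form never changes; only the training data for each fit does.
for g in range(3):
    mask = groups == g
    r2 = LinearRegression().fit(X[mask], dmi[mask]).score(X[mask], dmi[mask])
    print(f"group {g}: n = {mask.sum()}, in-sample R^2 = {r2:.2f}")
```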
It is important to note that although two-dimensional principal component analysis is easiest to visualize, results should be retained in the number of dimensions that explains a specified amount of variance. Figure 4 shows a plot of the variance explained as the number of dimensions included in principal component analysis is increased. With fewer dimensions, less variance is explained by the components, and the proportion of variance explained by each additional component is high. As we increase dimensions, the cumulative variance explained increases, but the proportion of variance explained by each additional component decreases. Humans tend to interpret best in two dimensions, but we can see that if we wanted our principal component analysis to explain at least 80% of the variance in our dataset, two dimensions would not be sufficient. Also keep in mind that not all datasets will produce such steady reductions in variance with each component, and there is no rule of thumb for how many components to retain. With principal component analysis, as with many algorithms in data analytics, we must trade off interpretability for accuracy.
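Choosing the number of components against a variance target, as in the 80% example above, can be automated; the sketch below assumes simulated correlated data rather than any real herd dataset.

```python
# Find the smallest number of components whose cumulative explained
# variance reaches an assumed 80% target.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))  # correlated features
cumvar = np.cumsum(PCA().fit(X).explained_variance_ratio_)
n_components = int(np.searchsorted(cumvar, 0.80)) + 1  # first index reaching 80%
print("components needed for 80% of variance:", n_components)
```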
Opportunities to leverage machine learning in
precision livestock nutrition
In level III, integration of information, the predictions made by models are used to create recommendations for the farmer. Level IV is the culmination of the prediction, leading to action, either by the system itself or by the farmer. A lack of level III and IV cyberphysical systems was noted in Rutten et al. (2013). We would expect that, by utilizing the most appropriate modeling techniques to generate predictions at levels I and II, appropriate decision-making models would be possible. However, this has not proven to be the case, as we see minimal examples of decision-making algorithms in the current animal nutrition literature. One capability that traditional modeling frameworks do not allow for is the ability to update based on feedback. If a level II model predicts dry matter intake at 50 kg, but the farmer continuously adjusts this to 45 kg based on his or her knowledge of something outside the model scope, a traditional model does not "learn." Here, neural networks and other recurrent machine learning algorithms provide a promising approach to decision-making frameworks by allowing predictions to be revised in practice. In a traditional individualized feeding modeling framework, a model is built for each cow and the model itself does not change, only the predictions. In a machine learning framework, the dry matter intake for a cow could be predicted each day and, using all available data along with the actual response of the animal, the algorithm may change the weights of certain factors in the model. This dynamic feedback loop allows the model to "learn" on-farm and produce more accurate predictions.
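A minimal sketch of this kind of on-farm updating uses an online regressor whose weights are nudged by each day's observed (or farmer-corrected) intake; the sensor features, targets, and learning rate are illustrative assumptions.

```python
# Online "learning" from daily feedback with incremental weight updates.
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(4)
model = SGDRegressor(learning_rate="constant", eta0=0.01)
# Warm start on historical data so the first predictions are sensible.
model.partial_fit(rng.normal(size=(50, 5)), 22 + rng.normal(size=50))

for day in range(5):
    x_today = rng.normal(size=(1, 5))       # today's sensor features
    predicted = model.predict(x_today)[0]
    observed = 22 + rng.normal()            # actual intake, or the farmer's correction
    model.partial_fit(x_today, [observed])  # nudge the weights with the feedback
    print(f"day {day}: predicted {predicted:.1f} kg, observed {observed:.1f} kg")
```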
Figure 2. Example of principal component analysis from three to two dimensions. Consider shining a light on a set of points in three dimensions and observing the shadows of the points in two dimensions on the wall. Shining the flashlight through the data represents the search for the plane that creates the greatest variance between groups in the data. The angle of the light in the bottom picture finds a better two-dimensional plane to project the points onto compared with the image above.
Neural networks, or artificial neural networks, are a combination of many algorithms in a network, where layers of nodes, each representing an algorithm, feed outputs from the previous layer as inputs to the next layer, until the final layer's output is used as the prediction (McCulloch and Pitts, 1943). Figure 5 shows a typical framework for a neural network, with raw information fed in on the left and predictions coming out on the right. Each node represents a generic function, typically one that makes small changes to its inputs, allowing better control at each node over the final prediction. The real power for a problem with the complexity of individualized feeding is backpropagation, in which the accuracy of a prediction is propagated back through the nodes of the network to re-weight the importance of each node, thereby ensuring better accuracy if the same example datum is presented again (Werbos, 1974). Put simply, backpropagation allows us to distribute error through the existing network. Neural networks have been shown to detect patterns in highly nonlinear data, which is nearly impossible for linear models (Fukushima, 1980).
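A numerical sketch of backpropagation in a minimal two-layer network follows, using only numpy; the simulated inputs and targets are illustrative, and the network is far smaller than anything used in practice.

```python
# Backpropagation in a tiny two-layer network: the output error is
# propagated back to re-weight each connection.
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 3))                          # inputs (sensor features)
y = np.tanh(X @ np.array([1.0, -0.5, 0.3]))[:, None]   # target to learn

W1 = rng.normal(scale=0.5, size=(3, 4))                # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(4, 1))                # hidden -> output weights
lr = 0.1
for _ in range(500):
    h = np.tanh(X @ W1)                                # hidden layer
    pred = h @ W2                                      # output layer
    err = pred - y                                     # prediction error
    # Backpropagation: compute both gradients, then re-weight each layer.
    grad_W2 = (h.T @ err) / len(X)
    grad_W1 = (X.T @ ((err @ W2.T) * (1 - h ** 2))) / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1
print("final mean squared error:", float(np.mean(err ** 2)))
```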
Reinforcement learning is another key concept in the field of machine learning and is crucial for problems where cost functions are not explicit, as in predicting feed intake. That is, we do not know the exact cost of overfeeding or underfeeding. Suppose we are training a model to tell a farmer how to feed each cow, but the farmer is well-informed and keeps adjusting the predictions. If we were trying to minimize the need for farmer intervention, our feedback loop would weight errors based on the farmer's adjustment to each prediction. That is to say, the recurrent neural network estimates the model that minimizes error under the unknown cost function. The framework starts with substantial uncertainty about the cost function and the network performs poorly; then, the network is trained and the model parameterized to decrease the cumulative costs. This updating procedure is called a Markov decision process (Howard, 1960).
Figure 4. An example plot of the proportion of variance explained by each
additional component in principal component analysis. Variance explained by
each additional component can vary considerably based on the data you are
working with (Shah et al., 2018).
Figure 3. Comparison of fitting models after grouping results from principal component analysis (PCA). In the ungrouped case, a single model, DMI = Xβ + ε, is fit to all of the data; after PCA-based grouping into clusters 1 through 3, the same model form is fit within each group, DMI_i = X_iβ_i + ε for i = 1, 2, 3. Grouping data based on a clustering algorithm allows the same model increased flexibility when making predictions. Notice that the model used does not change; only the data used to train the model are varied. DMI, dry matter intake.
In the real world, our farmers are likely not omniscient, but the ability to estimate models under cost uncertainty can still be utilized to choose better models for actual decision making, because the cost of feeding decisions is neither fixed nor known, yet predictions must be made every day for every cow. In fact, reinforcement models are used in many settings where decisions must be made despite uncertainty about their costs, such as game-playing algorithms and resource allocation problems (Damas et al., 2000).
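A minimal sketch of decision making under an unknown cost function, using an epsilon-greedy bandit rather than a full Markov decision process: the model picks among three hypothetical concentrate levels and refines its cost estimates from observed feedback. All costs are invented placeholders.

```python
# Epsilon-greedy learning of action costs that are never given explicitly.
import numpy as np

rng = np.random.default_rng(6)
true_cost = np.array([1.8, 1.2, 1.5])  # hidden cost of each feeding action
est_cost = np.zeros(3)                 # running cost estimates
counts = np.zeros(3)

for day in range(1000):
    if rng.random() < 0.1:                  # explore occasionally
        action = int(rng.integers(3))
    else:                                   # otherwise exploit the best estimate
        action = int(np.argmin(est_cost))
    observed = true_cost[action] + rng.normal(scale=0.3)  # noisy feedback
    counts[action] += 1
    est_cost[action] += (observed - est_cost[action]) / counts[action]

print("estimated costs:", est_cost.round(2))  # should approach true_cost
```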
Having to make predictions in the face of sometimes vast uncertainty can make prediction modeling more difficult and is surely a reason why reliable level III and IV cyberphysical systems are not seen in animal agriculture. For example, a model built to predict appropriate plane ticket costs will have a large amount of training data, because there have been many flights before. But how will a model predict demand for a plane ticket in the days after a terrorist attack? This is a mainstream example, but consider one in the context of feeding animals. Assume a scenario where predictions for a cow's intake have been very accurate; then she gets her foot caught in the parlor and is in a great deal of pain, but the injury is not caught immediately and will not be fed into the model as an explicit variable. Is it correct to punish the model for incorrectly predicting intake on this day? Likely not, because a known, but unanticipated, event explains the variation. This example points to a major challenge with deploying these modeling techniques on-farm. If allowed to iterate and update in an unrestricted manner, the model will try to assign weights to other factors to explain why the cow reduced intake the day she injured herself. For example, if activity data were included in the model, the weight on activity responses might be updated because we would anticipate activity to also change with the injured hoof. However, the model may take some time to recover from this prediction and correct the weight on activity under a noninjured scenario, resulting in a period of time where predictions are poor. One solution to this type of challenge would be to include an injury variable in the model to account for these cases; however, the point of the example is that there is always opportunity for factors exogenous to the model to influence the behavior of the response variable. When building and deploying these analytics, we must consider that reality. Another solution is to omit data from the day in question. However, that approach introduces the issue of human perception with respect to identifying exogenous causes and correctly differentiating them from endogenous causes. It is important to keep in mind that we cannot leave out predictions that are incorrect without reason, because every cow needs a prediction every day. A different solution might be found in the training of the model: instead of focusing on minimizing the average cost of a prediction, it is possible to train the model to minimize the maximum cost of a prediction. The measure of costs relates to a secondary problem plaguing models of all varieties today: how to choose the cost functions, or, equivalently, how to know which model is best.
Challenges with model selection and evaluation
There are a number of model evaluation statistics commonly used to assess the precision and accuracy of predictions; however, when models are applied as analytics in conjunction with sensors and in the context of cyberphysical systems, the system as a whole is often evaluated on the basis of sensitivity and specificity. Indeed, in an example outside nutrition, there are International Organization for Standardization standards for sensitivity and specificity of cyberphysical systems, formulated initially for automated detection of mastitis (Rutten et al., 2013). Sensitivity is a model's ability to detect positive cases, that is, the percentage of all true positives that are detected. Specificity is the same metric applied to negative cases, namely the percentage of total negative cases that the model detects correctly.
Figure 5. An example of a neural network framework. Circles represent individual equations which are fed data from all connected nodes. The lack of a 1:1 ratio of nodes in each layer of the network forces the model to condense information and leads to the most important information being determined iteratively through backpropagation of error (Ivezić et al., 2014).
High specificity and low sensitivity lead to models that rarely detect (predict) a positive case, while the opposite is true of high sensitivity, low specificity models. If detecting metabolic disease is an important attribute of the precision feeding system, a positive case might be an animal with metabolic disease, whereas a negative case might be an animal free from disease. Although both of these calculations are extremely important for a useful cyberphysical system model in animal agriculture, false alarms can become an issue, especially in cases where the proportion of positive to negative cases is skewed in the overall population. For models that detect animal conditions to alert farmers, the positive predictive value is a third measure of model accuracy that should be considered. The positive predictive value can be thought of as the probability that an alert (predicted positive case) actually is positive. Models with low positive predictive value will have more false alarms. Although positive predictive value would add little if the proportions of positive and negative cases in the population were equal, in many disease detection settings, less than 1% of cow days on a typical farm will be positive.
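The interplay of sensitivity, specificity, and positive predictive value under class imbalance can be made concrete; the sketch below assumes a detector with 90% sensitivity and 95% specificity and compares a balanced population with one where 1% of cow days are positive.

```python
# PPV for an assumed detector (90% sensitivity, 95% specificity) at two
# prevalences; only the prevalence changes between the two calls.
def positive_predictive_value(prevalence, sensitivity=0.90, specificity=0.95,
                              n=100_000):
    positives = prevalence * n
    negatives = n - positives
    true_pos = sensitivity * positives          # correctly flagged sick cows
    false_pos = (1 - specificity) * negatives   # healthy cows flagged anyway
    return true_pos / (true_pos + false_pos)

for prev in (0.50, 0.01):   # balanced vs. sparse (1% of cow days positive)
    print(f"prevalence {prev:.0%}: PPV = {positive_predictive_value(prev):.1%}")
# At 1% prevalence most alerts are false alarms, despite a "good" detector.
```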
When we consider the example of predicting intake, or designing an ideal supplementation strategy for a cow, the use of sensitivity and specificity for model evaluation becomes more nebulous. Undoubtedly, it is more important to know by how much you over- or under-predicted a response like intake or milk yield than it is to know the binary directionality of the residual. A number of statistics (root mean squared error, mean absolute error, etc.) are available to quantify fit in this manner. However, as discussed above, when making recommendations on-farm, incorporating the cost of these decisions is perhaps most important. Working more explicitly to tie performance predictions to economic data on-farm will be an important step in advancing analytics for precision feeding.
When are the analytics good enough?
As John von Neumann said, "truth … is much too complicated to allow anything but approximations" (Szász, 2011). Approximations are a necessary evil, particularly in the business of feeding animals. Livestock nutrition is a complex science, verging on an art form, and successful nutritionists combine analytics and exogenous information to optimize productivity on their farms. A cyberphysical system, almost by design, limits the opportunity for exogenous data or, at a minimum, changes the way that exogenous data will influence the system. To assess gold standards for when a cyberphysical system is good enough for deployment to farms, it may be useful to evaluate the standards professional nutritionists use for making feeding recommendations. Many nutritionists have a dollar value or a milk response cutoff that they believe a product, or feeding recommendation, must be expected to achieve before it should be recommended to a farmer. Gaining consensus on those cutoffs may be one way to evaluate the relevance of precision nutrition analytics from an industry context. Although it is possible to set more objective cutoffs, creating such a cutoff implies that a given model's knowledge completely covers that of the experts, which is very unlikely. Although models can help weigh options in complex environments, they are only as complex as the data they are trained on, and thus, by default, are less informed than an expert who has the opportunity to see exogenous and endogenous variables. Further work is needed to identify the best strategies to combine and incorporate expert opinion and knowledge into a cyberphysical system focused on animal feeding.
Literature Cited
Altman, N. S. 1992. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46:175–185. doi:10.1080/00031305.1992.10475879
Berckmans, D. 2014. Precision livestock farming technologies for welfare
management in intensive livestock systems. Rev. Sci. Tech. 33:189–196.
doi:10.20506/rst.33.1.2273
Berry, D. P., M. P. Coffey, J. E. Pryce, Y. de Haas, P. Løvendahl, N. Krattenmacher,
J. J. Crowley, Z. Wang, D. Spurlock, K. Weigel, et al. 2014. International
genetic evaluations for feed intake in dairy cattle through the collation of data
from multiple sources. J. Dairy Sci. 97:3894–3905. doi:10.3168/jds.2013-7548
Berry, D. P., and J. J. Crowley. 2013. Cell biology symposium: genetics of feed efficiency in dairy and beef cattle. J. Anim. Sci. 91:1594–1613. doi:10.2527/jas.2012-5862
Dale, A. J., S. McGettrick, A. W. Gordon, and C. P. Ferris. 2016. The effect
of two contrasting concentrate allocation strategies on the performance of
grazing dairy cows. Grass Forage Sci. 71:379–388. doi:10.1111/gfs.12185
Damas, M., M. Salmeron, A. Diaz, J. Ortega, A. Prieto, and G. Olivares. 2000.
Genetic algorithms and neuro-dynamic programming: application to water
supply networks. In: Proceedings of the 2000 Congress on Evolutionary
Computation. CEC00 (Cat. No.00TH8512), vol. 1. La Jolla (CA):
Institute of Electrical and Electronics Engineers; p. 7–14. doi:10.1109/
CEC.2000.870269
About the Authors
Douglas M. Liebe received his BS in Animal Sciences from the Ohio State University. He is currently a PhD student at Virginia Tech, focusing on the role of data analytics in agricultural management decisions. Liebe's previous work involved mathematical modeling and sustainability in animal production systems.
Robin R. White obtained a BS and PhD in Animal Sciences from Washington State University. Her doctoral work focused on mathematical modeling of sustainable beef production systems. White currently runs a three-tiered research program at Virginia Tech with basic research focused on optimizing rumen fermentation, applied research developing analytics for enhanced feed efficiency, and systems-oriented research focused on describing sustainability of livestock production systems.
Corresponding author: rrwhite@vt.edu
Dela Rue, B. T., and C. R. Eastwood. 2017. Individualised feeding of concentrate supplement in pasture-based dairy systems: practices and perceptions of New Zealand dairy farmers and their advisors. Anim. Produc. Sci. 57:1543–1549. doi:10.1071/AN16471
Fukushima, K. 1980. Neocognitron: a self organizing neural network model
for a mechanism of pattern recognition unaffected by shift in position.
Biol. Cybern. 36:193–202. doi:10.1007/BF00344251
Gelman, A. 2006. Multilevel (hierarchical) modeling: what it can and cannot
do. Technometrics. 48:432–435. doi:10.1198/004017005000000661
Greenland, S., M. A. Mansournia, and D. G. Altman. 2016. Sparse data bias: a
problem hiding in plain sight. BMJ. 352:i1981. doi:10.1136/bmj.i1981
de Haas, Y., J. E. Pryce, M. P. Calus, E. Wall, D. P. Berry, P. Løvendahl, N.
Krattenmacher, F. Miglior, K. Weigel, D. Spurlock, et al. 2015. Genomic
prediction of dry matter intake in dairy cattle from an international data set
consisting of research herds in Europe, North America, and Australasia. J.
Dairy Sci. 98:6522–6534. doi:10.3168/jds.2014-9257
Hall, M. B., and G. B. Huntington. 2008. Nutrient synchrony: sound in theory, elusive in practice. J. Anim. Sci. 86(14 suppl.):E287–E292. doi:10.2527/jas.2007-0516
Han, X., Z. Shen, W. X. Wang, and Z. Di. 2015. Robust reconstruction of complex networks from sparse data. Phys. Rev. Lett. 114:028701. doi:10.1103/PhysRevLett.114.028701
Hanigan, M. D., J. A. Appuhamy, and P. Gregorini. 2013. Revised digestive
parameter estimates for the Molly cow model. J. Dairy Sci. 96:3867–3885.
doi:10.3168/jds.2012-6183
Higgs, R. J., L. E. Chase, D. A. Ross, and M. E. Van Amburgh. 2015. Updating the Cornell Net Carbohydrate and Protein System feed library and analyzing model sensitivity to feed inputs. J. Dairy Sci. 98:6340–6360. doi:10.3168/jds.2015-9379
Howard, R. A. 1960. Dynamic programming and Markov processes. New York
(NY): Technology Press and Wiley.
Huhtanen, P., M. Rinne, P. Mäntysaari, and J. Nousiainen. 2011. Integration of the
effects of animal and dietary factors on total dry matter intake of dairy cows
fed silage-based diets. Animal. 5:691–702. doi:10.1017/S1751731110002363
Ivezić, Ž., A. J. Connolly, J. T. VanderPlas, and A. Gray. 2014. Statistics, data mining, and machine learning in astronomy: a practical Python guide for the analysis of survey data. Princeton (NJ): Princeton University Press.
Jensen, L. M., N. I. Nielsen, E. Nadeau, B. Markussen, and P. Nørgaard. 2015. Evaluation of five models predicting feed intake by dairy cows fed total mixed rations. Livest. Sci. 176:91–103. doi:10.1016/j.livsci.2015.03.026
Kleinman, P. J. A., S. Spiegal, J. R. Rigby, S. C. Goslee, J. M. Baker, B. T.
Bestelmeyer, R. K. Boughton, R. B. Bryant, M. A. Cavigelli, J. D. Derner, et
al. 2018. Advancing the sustainability of US agriculture through long-term
research. J. Environ. Qual. 47:1412–1425. doi:10.2134/jeq2018.05.0171
Lawrence, D. C., M. O’Donovan, T. M. Boland, E. Lewis, and E. Kennedy.
2015. The effect of concentrate feeding amount and feeding strategy on milk
production, dry matter intake, and energy partitioning of autumn-calving
Holstein-Friesian cows. J. Dairy Sci. 98:338–348. doi:10.3168/jds.2014-7905
Li, M. M., R. R. White, and M. D. Hanigan. 2018. An evaluation of Molly cow model predictions of ruminal metabolism and nutrient digestion for dairy and beef diets. J. Dairy Sci. 101:9747–9767. doi:10.3168/jds.2017-14182
Little, M. W., N. E. O’Connell, and C. P. Ferris. 2016. A comparison of individual
cow versus group concentrate allocation strategies on dry matter intake, milk
production, tissue changes, and fertility of Holstein-Friesian cows offered a
grass silage diet. J. Dairy Sci. 99:4360–4373. doi:10.3168/jds.2015-10441
Lloyd, S. 1982. Least squares quantization in PCM. IEEE Trans. Inf. Theory.
28:129–137.
McCulloch, W. S., and W. Pitts. 1943. A logical calculus of the ideas immanent in
nervous activity. Bull. Math. Biophys. 5:115–133. doi:10.1007/BF02478259
McDougall, S., K. E. Agnew, R. Cursons, X. X. Hou, and C. R. Compton. 2007. Parenteral treatment of clinical mastitis with tylosin base or penethamate hydriodide in dairy cattle. J. Dairy Sci. 90:779–789. doi:10.3168/jds.S0022-0302(07)71562-X
McParland, S., E. Lewis, E. Kennedy, S. G. Moore, B. McCarthy, M. O'Donovan, S. T. Butler, J. E. Pryce, and D. P. Berry. 2014. Mid-infrared spectrometry of milk as a predictor of energy intake and efficiency in lactating dairy cows. J. Dairy Sci. 97:5863–5871. doi:10.3168/jds.2014-8214
National Research Council, Board on Agriculture and Natural Resources, Committee on Animal Nutrition, and Subcommittee on Dairy Cattle Nutrition. 2001. Nutrient requirements of dairy cattle. 7th rev. edn. Washington (DC): National Academies Press.
Nash, E., P. Korduan, and R. Bill. 2009. Applications of open geospatial
web services in precision agriculture: a review. Precis. Agric. 10:546–560.
doi:10.1007/s11119-009-9134-0
Ohlsson, T. 2014. Sustainability and food production. In: Food safety management. Academic Press; p. 1085–1097.
Pearson, K. 1901. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dubl. Philos. Mag. J. Sci. 2:559–572. doi:10.1080/14786440109462720
Rutten, C. J., A. G. J. Velthuis, W. Steeneveld, and H. Hogeveen. 2013. Invited
review: sensors to support health management on dairy farms. J. Dairy Sci.
96:1928–1952. doi:10.3168/jds.2012-6107
Scholz, M. 2006. Approaches to analyse and interpret biological prole data.
Available from https://publishup.uni-potsdam.de/opus4-ubp/frontdoor/
index/index/docId/696
Shah, I. A., I. Khan, S. A. Mir, M. S. Pukhta, and A. A. Lone. 2018. Principal
component analysis utilizing R and SAS Software’s. Int. J. Curr. Microbiol.
Appl. Sci. 7:3794–3801. doi:10.20546/ijcmas.2018.705.441
Shetty, N., P. Løvendahl, M. S. Lund, and A. J. Buitenhuis. 2017. Prediction and validation of residual feed intake and dry matter intake in Danish lactating dairy cows using mid-infrared spectroscopy of milk. J. Dairy Sci. 100:253–264. doi:10.3168/jds.2016-11609
Sinclair, L. A. 1995. Effects of synchronizing the rate of dietary energy and
nitrogen release in diets with a similar carbohydrate composition on
rumen fermentation and microbial protein synthesis in sheep. J. Agric. Sci.
124:463–472. doi:10.1017/S0021859600073421
Szász, D. 2011. John von Neumann, the mathematician. Math. Intelligencer.
33:42–51. doi:10.1007/s00283-011-9223-6
Thornton, P. K. 2010. Livestock production: recent trends, future prospects.
Philos. Trans. R. Soc. Lond. B. Biol. Sci. 365:2853–2867. doi:10.1098/
rstb.2010.0134
Thorp, K. R., and L. F. Tian. 2004. A review on remote sensing of weeds in
agriculture. Precis. Agric. 5:477–508. doi:10.1007/s11119-004-5321-1
Tullo, E., A. Finzi, and M. Guarino. 2019. Review: environmental impact of livestock farming and precision livestock farming as a mitigation strategy. Sci. Total Environ. 650(Pt 2):2751–2760. doi:10.1016/j.scitotenv.2018.10.018
Ushikubo, S., C. Kubota, and H. Ohwada. 2017. The early detection of subclinical ketosis in dairy cows using machine learning methods. In: Proceedings of the 9th International Conference on Machine Learning and Computing. New York (NY): ACM; p. 38–42.
Van Amburgh, M. E., E. A. Collao-Saenz, R. J. Higgs, D. A. Ross, E. B. Recktenwald, E. Raffrenato, L. E. Chase, T. R. Overton, J. K. Mills, and A. Foskolos. 2015. The Cornell Net Carbohydrate and Protein System: updates to the model and evaluation of version 6.5. J. Dairy Sci. 98:6361–6380. doi:10.3168/jds.2015-9378
Volden, H., N. I. Nielsen, M. Åkerlind, M. Larsen, Ø. Havrevoll, and A. J.
Rygh. 2011. Prediction of voluntary feed intake. In: H. Volden, editor.
NorFor—the Nordic feed evaluation system. Wageningen: Wageningen
Academic Publishers. p. 113–126.
Werbos, P. 1974. Beyond regression: new tools for prediction and analysis in the behavioral sciences [PhD thesis]. Harvard University.
White, R. R., M. Brady, J. L. Capper, and K. A. Johnson. 2014. Optimizing diet and pasture management to improve sustainability of U.S. beef production. Agric. Syst. 130:1–12. doi:10.1016/j.agsy.2014.06.004
White, R. R., M. Brady, J. L. Capper, J. P. McNamara, and K. A. Johnson.
2015. Cow-calf reproductive, genetic, and nutritional management to
improve the sustainability of whole beef production systems. J. Anim. Sci.
93:3197–3211. doi:10.2527/jas.2014-8800
White, R. R., M. B. Hall, J. L. Firkins, and P. J. Kononoff. 2017. Physically adjusted neutral detergent fiber system for lactating dairy cow rations. I: deriving equations that identify factors that influence effectiveness of fiber. J. Dairy Sci. 100:9551–9568. doi:10.3168/jds.2017-12765
Zhang, C., and J. M. Kovacs. 2012. The application of small unmanned aerial
systems for precision agriculture: a review. Precis. Agric. 13:693–712.
doi:10.1007/s11119-012-9274-5