RESEARCH    Open Access

Improving short-term demand forecasting for short-lifecycle consumer products with data mining techniques

Dennis Maaß, Marco Spruit* and Peter de Waal

* Correspondence: m.r.spruit@uu.nl
Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands
Abstract
Today's economy is characterized by increased competition, faster product
development and increased product differentiation. As a consequence product
lifecycles become shorter and demand patterns become more volatile which
especially affects the retail industry. This new situation imposes stronger
requirements on demand forecasting methods. Due to shorter product lifecycles
historical sales information, which is the most important source of information used
for demand forecasts, becomes available only for short periods in time or is even
unavailable when new or modified products are introduced. Furthermore the
general trend of individualization leads to higher product differentiation and
specialization, which in itself leads to increased unpredictability and variance in
demand. At the same time companies want to increase accuracy and reliability of
demand forecasting systems in order to utilize the full demand potential and avoid
oversupply. This new situation calls for forecasting methods that can handle large
variance and complex relationships of demand factors.
This research investigates the potential of data mining techniques as well as alternative
approaches to improve the short-term forecasting method for short-lifecycle products
with high uncertainty in demand. We found that data mining techniques cannot unveil
their full potential to improve short-term forecasting in this case due to the high
demand uncertainty and the high variance of demand patterns. In fact we found that
the higher the variance in demand patterns the less complex a demand forecasting
method can be.
Forecasting can often be improved by data preparation. The right preparation method
can unveil important information hidden in the available data and decrease the
perceived variance and uncertainty. In this case data preparation did not lead to a
decrease in the perceived uncertainty to such an extent that a complex forecasting
method could be used. Rather than using a data mining approach we found that using
an alternative combined forecasting approach, incorporating judgmental adjustments
of statistical forecasts, led to significantly improved short-term forecasting accuracy. The
findings are validated on real world data in an extensive case study at a large retail
company in Western Europe.
Keywords: Demand forecasting; Sales forecasting; Consumer products; Fashion
products; Short life-cycle products; Data mining; Predictive modeling; Big data; Sales
forecast; Combined forecasting; Judgmental forecasting; Data preparation; Domain
knowledge; Contextual knowledge; Demand uncertainty; Retail; Retail testing; Demand
volatility; Impulsive buying
© 2014 Maaß et al.; licensee Springer. This is an open access article distributed under the terms of the Creative Commons Attribution
License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium,
provided the original work is properly cited.
Background
Consumer products can be segmented into two different types of products regarding
their demand patterns: basic or functional products and fashion or innovative products
(Fisher & Rajaram 2000). Basic products have a long life-cycle and stable demand,
which is easy to forecast with standard methods. Fashion products on the other hand
have a short life-cycle and highly unpredictable demand. Due to their short life-cycles
fashion products are often bought just once prior to a selling period (and not reordered
after demand occurred which is usually the case for basic products) which makes them
hard to forecast. Fashion products thus need different forecasting methods than basic
products.
The problem of demand forecasting of fashion type products is described as being a
problem of high uncertainty, high volatility and impulsive buying behavior (Christopher
et al. 2004). Furthermore, Fisher & Rajaram (2000) describe it as a problem that is
highly unpredictable. Several authors propose not to try to forecast demand for these
products, but instead build an agile supply chain that can satisfy demand as soon as it
occurs (e.g. Christopher et al. 2004). In practice this is a very expensive solution and for
our case even unfeasible due to the extremely short life-cycles.
Data mining and machine learning techniques have been shown to be more accurate
than statistical models in real world cases when relationships become more complex
and/or non-linear (Thomassey & Fiordaliso 2006). Classical models, like regression
models, time series models or neural networks, are also generally inappropriate when
only short historical data is available that is disturbed by explanatory variables (Kuo & Xue 1999).
Data mining techniques have already been successfully applied on demand forecasting
problems (Fisher & Rajaram 2000; Thomassey & Fiordaliso 2006). In this paper we
report on an analysis of demand forecasting improvements using data mining tech-
niques and alternative forecasting methods in the context of a large retail company in
Western Europe.
Problem description
The forecasting problem in this research is to predict the demand for each product in
each outlet of the case company. The short-term demand forecast is used for distribut-
ing the products from the central warehouse to the outlets in the most profitable way,
but not for determining the optimal buying quantity. In fact total product quantities
are assumed to be fixed for this problem since products are only bought once in a
single tranche prior to the selling period according to the outcome of a long-term fore-
casting process which is not discussed in this research.
The currently used forecasting method at the case company largely depends on retail
testing. Retail tests are experiments in a small subset of the available stores, in which
products are offered for sale under controlled conditions several weeks before the start
of the main selling period. In addition to demand, price elasticity is also tested during
the retail test. The measured price elasticity is then used in a dynamic pricing
approach to maximize profits, given that total product quantities are fixed. The
dynamic pricing approach optimizes the tradeoff between expected sales, already
ordered quantity and change of expected sales through price alteration. For this pur-
pose each product is presented at different prices to the customer. The allocation of
price level to each product-outlet combination is done randomly but there are
always a fixed number of outlets having the same price for a given product. The
random allocation scheme is used in order to minimize interaction effects between
the different price levels of the products (high prices for a certain product could
induce the customer to buy another cheaper product). The retail test is thus used to
determine the sales potential and the price elasticity of each product. After the retail
test the price for each product is set by a separate advisory board according to
profit maximization goals (selling most of the bought quantity at the highest price
possible within 4–6 weeks).
Literature review of existing forecasting methods and data mining techniques
Most of the standard forecasting methods for fashion type products are not able to
deal with complex demand patterns or uncertainty. In the following we will present,
next to data mining methods, those methods that have a potential to be useful for
forecasting of fashion type products. Furthermore we will introduce data preparation
methods which are especially important for this problem because they can transform
the input data in such a way that uncertainty and volatility is reduced. This enables
forecasting methods to deliver better results when they are applied on the transformed
input data.
Data mining methods
Definition of data mining
Hand (1998) defines data mining as the process of secondary analysis of databases
aimed at finding unsuspected relationships which are of interest or value to the data-
base owner. He states that data mining "[…] is entirely concerned with secondary data
analysis", i.e. the analysis of data that was collected for other purposes but not for the
questions to be answered through the data mining process. This is opposed to primary
data analysis where data is collected to test a certain hypothesis. According to Hand
(1998) data mining is a new discipline that arose as a consequence of the progress in
computer technology and electronic data acquisition, which led to the creation of
large databases in various fields. In this context data mining can be seen as a set of
tools to unveil valuable information from these databases. With secondary data analysis
there is the danger of sampling bias, which can lead to erroneous and inapplicable
models (Pyle 1999).
Simoudis (1996) views data mining as the process of extracting valid, previously
unknown, comprehensible, and actionable information from large databases and using
it to make crucial business decisions.
A similar definition is given by Fayyad et al. (1996) although they use the term knowledge
discovery in databases (KDD) instead of data mining. They use the term data
mining only to denote the step of applying algorithms on data. Thus, their definition of
knowledge discovery in databases is in fact also a definition of data mining: "KDD is
the process of using the database along with any required selection, preprocessing,
subsampling, and transformations of it; to apply data mining methods (algorithms) to
enumerate patterns from it; and to evaluate the products of data mining to identify the
subset of the enumerated patterns deemed 'knowledge'".
Weiss & Indurkhya (1998) state that data mining is "the search for valuable information
in large volumes of data". They also highlight that it is a cooperative effort of humans and
computers where humans describe the problem and set goals while computers sift through
the data, looking for patterns that match with the given goals.
As can be seen from Table 1, definitions of data mining are very similar. One perceivable
difference is that Hand (1998) sees relationships as the output of the data mining
process instead of information or knowledge as the other authors do. Although it appears
to be different from the other definitions at first view, both definitions can be seen
as equal because information is created from the interpretation of the relationships
between the variables (Pyle 1999). Overall we can say that there is no dispute or
misconception about the definition of the term data mining.

Table 1 Definitions of data mining
Fayyad et al. (1996): type: process (non-trivial, involves search or inference); input: database (larger data sets with rich data structures); output: knowledge (valid, novel, potentially useful, ultimately understandable).
Hand (1998): type: process (secondary data analysis); input: database (secondary data); output: relationships (unsuspected, of interest or value for the database owner).
Simoudis (1996): type: process (extraction process); input: database (large scale); output: information (valid, previously unknown, comprehensible, actionable, useful for making crucial business decisions).
Weiss & Indurkhya (1998): type: search (cooperative effort of humans and computers); input: data (large volume); output: information (valuable).
Despite the fact that data mining is seen as secondary data analysis (Hand 1998) the fore-
casting problem described in this case study is in fact (at least to a large part) a primary data
analysis, since the case company actively conducts an experiment (the retail test) in order
to determine the expected sales potential of their newly introduced products.
The application and success of the data mining (or knowledge discovery process) is
largely dependent on data preparation techniques. As Weiss & Indurkhya (1998) state:
"In many cases, there are transformations of the data that can have a surprisingly
strong impact on results for prediction methods. In this sense, the composition of the
features is a greater determining factor in the quality of results than the specific prediction
methods used to produce those results." Thus, we cannot split the application of machine
learning algorithms and the preceding data preparation tasks. Both processes are
dependent on each other.
There are two main challenges one has to cope with during a data mining project:
First, it is not known in the beginning of the data mining process what structure of the
data and what kind of model will lead to the desired results. As Hand (1998) states:
"The essence of data mining is that one does not know precisely what sort of structure
one is seeking." And second, the fact that "many patterns that are found by mining algorithms
will simply be a product of random fluctuations, and will not represent any
underlying structure" (Hand 1998).
Data mining process
Most authors describe the same general process of how to conduct a data mining task
or project. It can be described by the steps of understanding the problem, finding and
analyzing data that can be used for problem solution, preparing the data for modeling,
building models using machine learning algorithms, evaluating the quality of the models and
finally using the models to solve the problem. Of course this is not a linear process; many
steps have to be repeated and adapted when new insights are generated by another
step. As an example of the general process we will present the CRISP-DM method (see
Table 2), which was developed as a standard process model for data mining projects of
all kinds across industries.
Each activity listed in Table 2 is further split into sub-activities which we will not
present in detail here (for further information see www.crisp-dm.org).
Although the CRISP-DM method describes the general steps of a data mining project
it does not describe what to do for specific problem types and how exactly it should be
done. We will thus provide more details of the important steps of data mining in the
following section. These steps are data preparation/data transformation, data reduction
(called data selection in the CRISP-DM method) and modeling.
Data mining algorithms
For the discussed problem the specific characteristics of the data mining algorithm are
not essential. The complexity of the concepts that can potentially be learned can be
handled by almost all available algorithms. It is much more important to provide suffi-
ciently prepared data in this case.
Table 2 Steps and activities of the CRISP-DM method
1. Business understanding: determine business objectives; assess situation; determine data mining goals; produce project plan.
2. Data understanding: collect initial data; describe data; explore data; verify data quality.
3. Data preparation: select data; clean data; construct data; integrate data; format data.
4. Modeling: select modeling technique; generate test design; build model; assess model.
5. Evaluation: evaluate results; review process; determine next steps.
6. Deployment: plan deployment; plan monitoring and maintenance; produce final report; review project.
Data preparation methods
Data transformation
Many authors note the paramount importance of data preparation for the outcome of
the whole data mining process (Pyle 1999; Weiss & Indurkhya 1998; Witten & Frank
2005). The paramount importance of data preparation is due to the fact that "prediction
algorithms have no control over the quality of the features and must accept it as a
source of error; they are at the mercy of the original data descriptions that constrain
the potential quality of solutions" (Weiss & Indurkhya 1998). Pyle (1999) notes that
data preparation cannot be done in an automatic way (for example with an automatic
software tool). It involves human insight and domain knowledge to prepare the data in
the right way. The goal of data preparation is "to make the information which is enfolded
in the relations between the variables of the training set as accessible and available as
possible to the modeling tool" (Pyle 1999).
Possible data preparation techniques are normalization, transformation of data into
ratios or differences, data smoothing, feature enhancement, replacement of missing
values with surrogates and transformation of time-series data. There are no rules that
specify which techniques should be applied in a certain order given a specific problem
type. The process of finding the right techniques depends more on the insight and
knowledge that is created during the process of data preparation and subsequent
application of learning algorithms.
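To make a few of these techniques concrete, the following sketch (hypothetical Python code with made-up numbers, not taken from the case study) applies min-max normalization, a ratio transformation, moving-average smoothing and a simple feature enhancement to a small sales series:

```python
import numpy as np

# Hypothetical daily sales of one product in one outlet (illustrative values only).
sales = np.array([12.0, 15.0, 9.0, 30.0, 14.0, 11.0, 16.0])
price = np.array([9.99, 9.99, 9.99, 7.99, 9.99, 9.99, 9.99])

# Min-max normalization: rescale sales to the range [0, 1].
sales_norm = (sales - sales.min()) / (sales.max() - sales.min())

# Ratio transformation: express each day's sales relative to the series mean,
# which removes the absolute scale and keeps only the relative pattern.
sales_ratio = sales / sales.mean()

# Smoothing with a 3-day moving average to dampen day-to-day noise.
window = np.ones(3) / 3
sales_smooth = np.convolve(sales, window, mode="valid")

# Feature enhancement: derive a new feature (revenue) from two existing ones.
revenue = sales * price

print(sales_norm.round(2), sales_ratio.round(2), sales_smooth.round(2), revenue.round(2), sep="\n")
```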
Data reduction
There are two good reasons for data reduction: First, although adding more variables
to the data set potentially provides more information that can be exploited by a learn-
ing algorithm, it becomes, at the same time, more difficult for the algorithm to work
through all the additional information (relationships between variables). That is because
the number of possible combinations of relationships between variables increases expo-
nentially, also referred to as the "combinatorial explosion" (Pyle 1999). Thus it is wise
to reduce the number of variables as much as possible without losing valuable informa-
tion. Second, reducing the number of variables and thus complexity can be very helpful
to avoid overfitting of the learned solution to the training set.
There are three types of data reduction techniques: feature reduction, case reduction
and value reduction (see Figure 1 for an overview). Feature reduction reduces the num-
ber of features (columns) in the data set through selection of the most relevant features
or combination of two or more features into a single feature. Case reduction reduces
the number of cases in a data set (rows) which is usually achieved through specialized
sampling methods or sampling strategies. Value reduction means reducing the number
of different values a feature can take through grouping of values into a single category.
Figure 1 Three types of data reduction techniques.
Possible feature reduction techniques are techniques such as principal components,
heuristic feature selection with the wrapper method and feature selection with decision
trees. Examples for case reduction techniques are incremental samples, average samples,
increasing the sampling period and strategic sampling of key events. For value reduction
prominent techniques are rounding, using k-means clustering and discretization using
entropy minimization.
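As an illustration, the sketch below (hypothetical Python code using scikit-learn; the data is synthetic) shows feature reduction via principal components and value reduction via one-dimensional k-means clustering:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical data set: 200 cases described by 6 numeric features.
X = rng.normal(size=(200, 6))

# Feature reduction: project the 6 features onto the 2 principal components
# that explain most of the variance.
X_reduced = PCA(n_components=2).fit_transform(X)

# Value reduction: replace the continuous values of one feature by the
# centroid of the k-means cluster (here k = 3) they fall into.
feature = X[:, 0].reshape(-1, 1)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(feature)
feature_discretized = km.cluster_centers_[km.labels_].ravel()

print(X_reduced.shape)                 # (200, 2)
print(np.unique(feature_discretized))  # only 3 distinct values remain
```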
Forecasting methods for demand with high uncertainty and high volatility
Not many forecasting methods can be applied in situations of high uncertainty and
high volatility of demand. In the following we will thus give a short overview of methods
that are applicable in this type of situation.
Judgmental adjustment of statistical forecasts
Sanders & Ritzman (2001) propose to integrate two types of forecasting methods to
achieve higher accuracy: judgmental forecasts and statistical forecasts. They note that
each method has strengths and weaknesses that can lead to better forecasts when they
are combined. The advantage of judgmental forecasts is that they incorporate import-
ant domain knowledge into the forecasts. Domain knowledge in this context can be
seen as knowledge about the problem domain that practitioners gain through experi-
ence in the job. According to Sanders & Ritzman (2001) domain knowledge enables
the practitioner to evaluate the importance of specific contextual information. This type
of knowledge can usually not be accessed by statistical methods but can be of high
importance especially when environmental conditions are changing and when large uncer-
tainty is present. The drawback of judgmental methods is their high potential for bias, such
as "optimism, wishful thinking, lack of consistency and political manipulation" (Sanders &
Ritzman 2001). In contrast, statistical methods are relatively free from bias and can handle
large amounts of data. However, they are just as good as the data they are provided with.
Sanders & Ritzman (2001) propose the method "judgmental adjustment of statistical
forecasts" to integrate judgmental with statistical methods. However, they also state that
judgmental adjustment is actually "the least effective way to combine statistical and judgmental
forecasts" because it can introduce bias. Instead an automated integration of both
methods can provide a bias-free combination of the methods. Sanders & Ritzman (2001)
report that equal weighting of forecasts leads to excellent results. However, in situations of
very high uncertainty an overweighting of the judgmental method can lead to better results.
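A minimal sketch of such a combination rule is given below (hypothetical Python code; the weights follow the equal-weighting recommendation, with an optional overweighting of judgment for highly uncertain situations):

```python
def combine_forecasts(statistical, judgmental, judgment_weight=0.5):
    """Combine a statistical and a judgmental forecast for the same item.

    judgment_weight = 0.5 corresponds to the equal weighting recommended by
    Sanders & Ritzman (2001); a value above 0.5 overweights judgment, which
    they suggest only under very high uncertainty.
    """
    return judgment_weight * judgmental + (1.0 - judgment_weight) * statistical

# Example with made-up numbers: a statistical forecast of 120 units and a
# judgmental forecast of 150 units for the same product.
print(combine_forecasts(120, 150))                        # equal weighting -> 135.0
print(combine_forecasts(120, 150, judgment_weight=0.7))   # overweight judgment -> 141.0
```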
Transformation of time-series
Wedekind (1968) states that the type of time-series depends on the length of the time
interval and that one type of time-series can be transformed into another type of time-
series by changing the length of the considered time interval. We can thus transform a
time-series that has trend and seasonal characteristics (time interval: month) into a
time-series that has only trend characteristics by considering just intervals of annual
length.
We can thus achieve a smoothing effect simply by increasing the length of the time
interval, because we then do not forecast the occurrence of a single event but of multiple
events. The probability of the occurrence of a certain event is higher in a large time
interval than in a small time interval. If we predict the average number of events our
forecast then becomes more accurate (Nowack 2005).
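This smoothing effect can be illustrated with a short simulation (hypothetical Python code, not from the paper): aggregating a noisy daily demand series into weekly totals reduces its relative variability.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical daily demand for one year: a constant base level plus noise.
daily = rng.poisson(lam=5, size=364).astype(float)

# Transform the time series by enlarging the interval: 364 days -> 52 weeks.
weekly = daily.reshape(52, 7).sum(axis=1)

# Coefficient of variation (std / mean) as a simple measure of volatility.
cv_daily = daily.std() / daily.mean()
cv_weekly = weekly.std() / weekly.mean()

print(f"daily  CV: {cv_daily:.2f}")   # typically around 0.45 for lam=5
print(f"weekly CV: {cv_weekly:.2f}")  # noticeably smaller, roughly 1/sqrt(7) of it
```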
Demand forecasting with data mining techniques
Thomassey & Fiordaliso (2006) propose a forecasting method for sales profiles (relative
sales proportion of total sales over time) of new products based on clustering and deci-
sion trees. They cluster sales profiles of previously sold products and map new products
to the sales profile clusters via descriptor variables like price, start of selling period
and life span. The mapping from descriptor variables to the sales profile cluster is
learned using a decision tree. Although it is a useful approach, retail testing turns out
to be much more precise than the proposed approach for the discussed problem.
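The sketch below outlines the idea (a hypothetical, simplified re-implementation in Python using scikit-learn, not the authors' original system): historical sales profiles are clustered with k-means, a decision tree maps product descriptors to a cluster, and the centroid of the predicted cluster serves as the profile forecast for a new product.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

# Hypothetical historical products: weekly sales profiles (shares of total
# sales over a 12-week season, each row sums to 1) and three descriptors
# (price, start week of the selling period, life span in weeks).
profiles = rng.dirichlet(alpha=np.ones(12), size=300)
descriptors = np.column_stack([
    rng.uniform(5, 50, size=300),    # price
    rng.integers(1, 40, size=300),   # start of selling period
    rng.integers(4, 13, size=300),   # life span
])

# Step 1: cluster the historical sales profiles into a few prototype shapes.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(profiles)

# Step 2: learn the mapping from product descriptors to profile cluster.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(descriptors, km.labels_)

# Step 3: forecast the profile of a new product from its descriptors alone.
new_product = np.array([[19.99, 14, 8]])
cluster = tree.predict(new_product)[0]
forecast_profile = km.cluster_centers_[cluster]
print(forecast_profile.round(3))  # predicted share of season sales per week
```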
Retail tests
Retail tests are "experiments, called tests, in which products are offered for sale under
carefully controlled conditions in a small number of stores" (Fisher & Rajaram 2000).
Such a test is used to test customer reaction to variables such as price, product place-
ment or store design. If the test is used to predict season sales for a product it is called
a depth test (Fisher & Rajaram 2000). In a depth test the test outlets are usually oversupplied
in order to avoid stock-outs, which would otherwise distort the forecast. The forecast is
then used for the total season demand, which is ordered from a supplier before the
start of the selling period.
Fisher & Rajaram (2000) report that there exists no further academic or managerial
literature describing how to design retail tests. In order to achieve optimal results with a
retail test Fisher & Rajaram (2000) propose a clustering method to select test stores
based on past sales performance. They found that clustering based on sales figures out-
performs clustering on other store descriptor variables (average temperature, ethnicity,
store type) significantly.
Fisher & Rajaram (2000) assume that customers differ in their preferences for prod-
ucts according to differing preferences for specific product attributes (e.g. color, style).
Thus actual sales of a store can be thought of as a summary of product attribute prefer-
ences of the customers at that store. The clustering approach is thus based on the percentage
of total sales represented by each product attribute. Therefore stores are clustered
according to their similarity in the percentage mix along the product attributes. Then
one store from each cluster is selected as a test store to predict total season sales. The
inference from the sales in the test stores to the population of all stores is done using a
dynamic programming approach that determines the weights of a linear forecast
formula such that the trade-off between extra costs of the test sale and benefits from
increased accuracy is optimized.
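A simplified sketch of the store-selection step is shown below (hypothetical Python code on synthetic data; the dynamic programming step that weights the test stores is omitted): stores are clustered on their attribute sales-mix percentages and the store closest to each centroid is chosen as a test store.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Hypothetical attribute sales mix: for each of 80 stores, the percentage of
# total sales falling on each of 6 product attributes (rows sum to 1).
attribute_mix = rng.dirichlet(alpha=np.ones(6), size=80)

# Cluster stores by the similarity of their attribute mix.
n_test_stores = 4
km = KMeans(n_clusters=n_test_stores, n_init=10, random_state=0).fit(attribute_mix)

# From each cluster, select the store closest to the centroid as a test store.
test_stores = []
for c in range(n_test_stores):
    members = np.where(km.labels_ == c)[0]
    distances = np.linalg.norm(attribute_mix[members] - km.cluster_centers_[c], axis=1)
    test_stores.append(members[np.argmin(distances)])

print("selected test stores:", sorted(test_stores))
```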
Combined forecasting
The idea of combined forecasting is to apply several different forecasting methods (or
using several different data sources with the same forecasting method) on the same
problem. Improvement in accuracy is achieved when the component forecasts contain
useful and independent information (Armstrong 2001). Especially when forecast errors
are negatively correlated or uncorrelated the error might be canceled out or reduced
and thus improve accuracy (see also Figure 2 for illustration).
The more distinct the methods or data sources used for the component forecasts
are (the more independent they are from one another), the higher the expected
improvement in forecasting accuracy compared to the best individual forecasts
(Armstrong 2001).
It is a widely accepted and practiced method that very often leads to better results
than a single forecasting method that is based on a single model (or data source)
(Armstrong 2001). However, a prerequisite is that each component forecast is by
itself a reasonably accurate forecast. Armstrong (2001) also states that combining
forecasts can reduce errors caused by faulty assumptions, bias and mistakes in
data. Combining judgmental and statistical methods often leads to better results.
Armstrong (2001) quotes several studies that found that equal weighting of methods
should be used unless precise information on forecasting accuracy of the single
methods is available. Accuracy is also increased when additional methods are used
for combined forecasting. Armstrong (2001) suggests using at least five different
methods or data sources, provided this is comparatively inexpensive, to achieve optimal
results with combined forecasting. When more than five methods are combined,
accuracy is improved, but usually at a diminishing rate that becomes less and less notable.
Armstrong (2001) states that combined forecasts are especially useful in situations of
high uncertainty.
Figure 2 Negatively correlated and uncorrelated errors of two distinctive forecasting methods
(A and B) reduce forecast error.
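A small simulation (hypothetical Python code with made-up error distributions) illustrates why negatively correlated component errors help: the error of the equally weighted combination is much smaller than that of either component forecast.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical true demand for 1000 product-outlet combinations.
demand = rng.uniform(50, 150, size=1000)

# Two component forecasts whose errors are negatively correlated:
# method A tends to overshoot exactly when method B undershoots.
common = rng.normal(0, 10, size=1000)
forecast_a = demand + common + rng.normal(0, 3, size=1000)
forecast_b = demand - common + rng.normal(0, 3, size=1000)

# Equal-weight combination of the two component forecasts.
combined = 0.5 * (forecast_a + forecast_b)

def mae(forecast):
    return np.abs(forecast - demand).mean()

print(f"MAE method A : {mae(forecast_a):.2f}")   # roughly 8 to 9
print(f"MAE method B : {mae(forecast_b):.2f}")   # roughly 8 to 9
print(f"MAE combined : {mae(combined):.2f}")     # much smaller, roughly 1.7
```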
Methods
Data collection
The data used for our analysis originated from point of sale scanners at each outlet.
The scanner data is loaded each night into a central data warehouse and archived for
later analysis. Sales data is stored at the quantity per product per outlet per day granu-
larity. For the purpose of this research we computed the cumulated sales sum until day
7 in order to reduce variance and uncertainty. We also limited the forecast horizon to
the first seven days of the sales period in order to approximate a good measure for real
demand. If we extended the forecast horizon further, the proportion of stock-outs
would become too high and obscure real demand. During the first week stock-outs
occur in fewer than 5% of the cases, so we can assume that sales volumes for the first
seven days are a sufficiently accurate approximation of real demand.
In a subsequent step we cleaned the data for customer returns (negative sales num-
bers), oversized products that were delivered by an alternative logistic supplier (higher
chance of stock-out than normal), products that were planned to be sold just in a
subset of outlets and for products that were not tested in the retail test. The data set
contains all remaining sales cases of the year 2009. For the development of forecasting
models we limited the data set to weeks 14–51 because the case company used a different
demand forecasting method and other replenishment cycles before week 14. We also
excluded data from weeks 19 and 28 because here unsold products from earlier sales periods
were sold without conducting another pilot sale beforehand. The remaining weeks were
randomly split into two data sets. One was used for developing new forecasting methods
and the other one was used for testing.
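The preparation steps described above could look roughly like the following pandas sketch (hypothetical code; the file name and column names such as product_id, outlet_id, week, day_of_sale and quantity are assumptions, since the actual warehouse schema is not documented here):

```python
import pandas as pd

# Assumed input: one row per product, outlet and day with the sold quantity.
sales = pd.read_csv("scanner_sales_2009.csv")

# Remove customer returns (negative quantities) and keep only the first
# seven selling days of each product-outlet combination.
sales = sales[(sales["quantity"] >= 0) & (sales["day_of_sale"] <= 7)]

# Cumulated sales until day 7 per product, outlet and selling week
# (the quantity the short-term forecast is evaluated against).
week1 = (sales.groupby(["product_id", "outlet_id", "week"], as_index=False)
              ["quantity"].sum()
              .rename(columns={"quantity": "sales_first_week"}))

# Restrict to weeks 14-51 and drop weeks 19 and 28 (no preceding retail test).
week1 = week1[week1["week"].between(14, 51) & ~week1["week"].isin([19, 28])]

# Randomly split the remaining weeks into a training and a test set.
weeks = pd.Series(week1["week"].unique())
train_weeks = weeks.sample(frac=0.5, random_state=0)
train = week1[week1["week"].isin(train_weeks)]
test = week1[~week1["week"].isin(train_weeks)]
```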
Currently used forecasting method
The currently used forecasting method at the case company (see Figure 3) is based on
a calculation schema that consists of three components that are calculated separately.
The first component is a measure for the overall sales potential of a product derived
from the sales data of the retail test. It forecasts the total expected sales volume by
extrapolating from the sample outlets to the whole population of outlets. The second
component is a measure for the general (product independent) sales potential of each
individual outlet which is derived from historical sales data. It determines how the fore-
casted total sales volume for a product is distributed among outlets. The third compo-
nent is a measure for the sales curve over time which is calculated from historical sales
data as the average sales curve for all outlets and all products using the sales data from
several weeks. It determines how the forecasted total sales volume for a product in an
outlet is distributed over time.
Figure 3 Schema of the currently used forecasting method.
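In procedural terms the three components combine multiplicatively into a forecast per product, outlet and day. The sketch below (hypothetical Python code with made-up numbers) shows this structure schematically; it is not the company's actual implementation.

```python
import numpy as np

# Component 1: total expected sales volume per product, extrapolated from the
# retail test (made-up numbers for two products).
product_potential = np.array([10_000.0, 4_000.0])

# Component 2: product-independent share of total sales per outlet, derived
# from historical sales data (three outlets, shares sum to 1).
outlet_share = np.array([0.5, 0.3, 0.2])

# Component 3: average sales curve over the first seven selling days
# (shares of the weekly volume, summing to 1).
sales_curve = np.array([0.30, 0.20, 0.15, 0.12, 0.10, 0.08, 0.05])

# Forecast per product, outlet and day: potential x outlet share x day share.
forecast = (product_potential[:, None, None]
            * outlet_share[None, :, None]
            * sales_curve[None, None, :])

print(forecast.shape)   # (2 products, 3 outlets, 7 days)
print(forecast[0, 0])   # day-by-day forecast for product 0 in outlet 0
```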
The measure for the overall sales potential of a product is influenced by experts that
interpret the results of the retail test and adjust the product sales potential measure to
special circumstances (like marketing campaigns for certain products or changed wea-
ther conditions). They also estimate price elasticity from the three different pricings of
the retail test and adapt expected demand volumes to the sales price, which is set by a
separate committee. In general the forecasting method makes strong use of aggregation
in order to cope with high uncertainty and volatility in demand patterns. Sales are
aggregated over all products regardless of product groups and common product features. They
are also aggregated over time (averaged over several weeks) in order to reduce volatility.
A reduction of the aggregation level can lead to potentially more accurate forecasts
since more complex forecasting methods (e.g. data mining techniques) can be applied.
The question, however, is whether reducing the aggregation level is possible with the given
level of volatility in the data. If volatility is too high the underlying effect which we
want to measure is superimposed by noise and forecasting accuracy will decrease.
As it turns out, reducing the aggregation level on the product dimension (calculating
the sales potential for each product group separately instead of calculating the sales
potential for all products combined) leads to reduced forecasting accuracy in terms of
increased misallocation with the current forecasting method (see Figure 4).
Figure 4 Reduced aggregation level leads to increased misallocation.
Reducing the aggregation level on the time dimension would reveal seasonal fluctuations
in an outlet's sales proportion over the year, but such an effect does not exist (at least
no seasonality that is stronger than the general noise level) and would thus not lead to
increased accuracy. The seasonal fluctuations of the total sales quantity are already captured
in the sales forecast, since the retail test is conducted only several weeks before the selling
period.
Why data mining techniques are not applicable in this case
This decrease in forecasting accuracy when the level of aggregation is reduced is the
reason that data mining techniques are not applicable for the discussed problem. The
advantage of data mining techniques is that their algorithms can capture more complex
demand patterns compared to other forecasting methods. In this case however, more
complex patterns can only be revealed when the level of aggregation is reduced. As this
leads to lower forecasting accuracy (due to superimposition by noise) data mining tech-
niques cannot unveil their potential to increase forecasting accuracy in this case.
Improved method
A possible way to reduce noise and uncertainty is to use multiple forecasting methods
and combine their results. One promising approach is to combine judgmental forecast-
ing and statistical forecasting as proposed by Sanders & Ritzman (2001). This approach
also satisfies the condition proposed by Armstrong (2001) that only the combination of
distinct methods leads to improved results.
The forecasting method used at the case company can be seen as a method that
strongly involves judgmental adjustment of statistical forecasts. The result of the retail
test is always interpreted by experts and adjusted for special circumstances such as sup-
ply problems, weather conditions, competitor moves or special promotions. However,
the process is strongly biased because there is a strong motivation to overestimate fore-
casts when the purchased quantity is larger than the expected sales volume. Further-
more the process itself, as well as the adjustment of the product sales potential to price
changes, is unstructured which can lead to decreased accuracy as described by Sanders
& Ritzman (2001).
We propose to increase forecasting accuracy by combining the current forecasting
process at the case company with a purely procedural version (without involving
human judgment) of the current forecasting method. This eliminates bias but does not
take domain knowledge, contextual and environmental information into account.
Since the change in demand caused by an altered selling price is estimated by human
judgment in the current forecasting process we further create a pricing function that
estimates the pricing effect in a purely procedural manner. The product sales potential
is then directly derived from the weighted sales figures of the retail sale without adjust-
ing demand for the different (random) price settings in the test stores. Instead a linear
price function is equally applied to all products. The price function determines the
increase or decrease in demand as a function of the relative selling price change compared
to the planning price, which was decided on by the separate committee. The coefficients
of the linear price functions (formula 1) were estimated by regression on the
test data set such that the amount of misallocation in terms of oversupply and undersupply
was not higher than with the original forecasting method.

ChangeInDemand = x_0 + x_1 * RelativePriceChange    (1)
Two different price functions were estimated for each product this way: one price
function for all cases in which the selling price was decreased compared to the plan-
ning price through the committee and one price function for all cases in which the
selling price was increased compared to the planning price.
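A sketch of how such price functions might be fitted is given below (hypothetical Python code on simulated observations; the constraint on misallocation used in the actual estimation is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated observations: relative price change versus the planning price and
# the observed relative change in demand (made-up, roughly elastic behaviour).
price_change = rng.uniform(-0.3, 0.3, size=400)
demand_change = 1.0 - 1.5 * price_change + rng.normal(0, 0.1, size=400)

def fit_price_function(x, y):
    """Least-squares fit of ChangeInDemand = x0 + x1 * RelativePriceChange."""
    x1, x0 = np.polyfit(x, y, deg=1)   # polyfit returns the highest degree first
    return x0, x1

# Separate functions for markdowns and markups, as in formula (1).
down = price_change < 0
x0_down, x1_down = fit_price_function(price_change[down], demand_change[down])
x0_up, x1_up = fit_price_function(price_change[~down], demand_change[~down])

print(f"markdown: ChangeInDemand = {x0_down:.2f} + {x1_down:.2f} * RelativePriceChange")
print(f"markup  : ChangeInDemand = {x0_up:.2f} + {x1_up:.2f} * RelativePriceChange")
```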
A schema of the combined forecasting method is shown in Figure 5. Both methods
rely on the data of the retail test to estimate the product sales potential. But the retail
test data is processed in two distinct ways. The judgmentally adjusted method uses
extra information (domain knowledge, contextual and environmental information) but
is biased. The purely procedural method is unbiased and uses a general linear price
function. The results of each method are equally weighted with 50% as proposed by
Sanders & Ritzman (2001) and constitute the new product sales potential. A different
weighting (75% judgmental, 25% mechanized) was also tried but led to decreased forecasting
accuracy. This finding is supported by Armstrong (2001), who states that the
weighting of methods should only be different from an equal split if there is a plausible
reason to do so.
Figure 5 Schema of the combined forecasting method: combining two product sales forecasts A and B into a single forecast.
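Putting the pieces together, the combined product sales potential can be sketched as follows (hypothetical Python code; function names, coefficients and quantities are illustrative assumptions, not the company's system):

```python
def procedural_potential(test_sales_weighted, planned_price, selling_price,
                         x0=1.0, x1=-1.5):
    """Unbiased, purely procedural estimate of the product sales potential:
    weighted retail-test sales adjusted by the general linear price function."""
    relative_price_change = (selling_price - planned_price) / planned_price
    change_in_demand = x0 + x1 * relative_price_change
    return test_sales_weighted * change_in_demand

def combined_potential(judgmental_potential, test_sales_weighted,
                       planned_price, selling_price):
    """Equally weighted combination of the judgmentally adjusted potential (A)
    and the procedural potential (B), as in the schema of Figure 5."""
    b = procedural_potential(test_sales_weighted, planned_price, selling_price)
    return 0.5 * judgmental_potential + 0.5 * b

# Example with made-up numbers: experts expect 9,500 units, the weighted test
# sales extrapolate to 10,000 units, and the price was lowered from 10 to 9.
print(combined_potential(9_500, 10_000, planned_price=10.0, selling_price=9.0))
```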
Results
Evaluation method
In order to evaluate the used forecasting method and potential improvements a metric
that measures the distance of the forecast to the real value (in this case real demand) is
defined. As stated above we can assume that realized sales quantities during the first
seven days are sufficiently close to the real demand (sales that could have been realized
with a constant 0% stock-out rate). Thus the forecast error is the difference between the
forecast sales quantity and the realized sales during the first week (see Figure 6).
Figure 6 Evaluation metric.
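In code, this metric could be expressed roughly as follows (hypothetical Python sketch; oversupply and undersupply are computed per product-outlet combination and then summed):

```python
import numpy as np

def misallocation(forecast, realized_first_week):
    """Split the forecast error into oversupply and undersupply volumes.

    Both arguments are arrays with one entry per product-outlet combination;
    realized sales during the first week serve as the proxy for real demand.
    """
    error = forecast - realized_first_week
    oversupply = np.clip(error, 0, None).sum()    # units shipped but not needed
    undersupply = np.clip(-error, 0, None).sum()  # demand that could not be met
    return oversupply, undersupply

# Example with made-up numbers for four product-outlet combinations.
forecast = np.array([100, 80, 50, 120])
realized = np.array([90, 95, 50, 140])
print(misallocation(forecast, realized))  # oversupply = 10, undersupply = 35
```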
In order to evaluate the quality of the forecasting method we will rely on the most practical
and reasonable approach possible, that is to "test forecasting methods in situations
that resemble the actual situation" (Armstrong 2001). In our case we measured the outcome
of the forecasts in terms of oversupply and undersupply as it occurred in reality. We
compared it with the amount of misallocation (in terms of oversupply and undersupply)
that would have been generated had the company relied solely on the currently used forecasting
method. In the comparisons we assume that each store only receives one delivery
at the start of the selling period with the forecast quantity. In reality the case company is
restocking the products several times a week in order to minimize the stock-out rate.
The evaluation is conducted on the test data set (randomly selected weeks 16, 17, 21,
22, 23, 24, 26, 27, 29, 30, 31, 35, 38, 41, 43, 44, 45, 47 and 51), while the forecasting method
described in the previous section was developed on the training set to avoid overfitting.
Results
Using the new combined forecasting method the amount of misallocation can be
significantly reduced (as illustrated in Figure 7). Oversupply is reduced by 2.6% while
undersupply is reduced by 1.6%. The reduction in misallocation has a reasonable cost-saving
impact through reduced returning and restocking of unsold products.
Figure 7 Reduction of misallocation through the combined forecasting method (normalized numbers; weeks 16, 17, 21, 22, 23, 24, 26, 27, 29, 30, 31, 35, 38, 41, 43, 44, 45, 47, 51).
Discussion and conclusion
For the problem type described in this research it is important to find ways to reduce
noise in the data and to cope with volatility. We can derive three types of methods that
can be used to reduce noise and cope with volatility in the data: aggregation, using
domain knowledge and combined forecasting. Aggregation can be applied over the
three dimensions of the described problem: time, outlets and products. Aggregation is
heavily used in the currently used forecasting method at the case company. Domain
knowledge can be used in two ways: during model building and to adjust statistical
forecasts. Using domain knowledge during model building means to use domain know-
ledge about the structure and causal relationships of the problem to prescribe the elem-
entary building blocks of the model used for forecasting. There are in principle two
ways to model the underlying concepts: first, to know the structure and interrelation-
ships of the underlying concept through domain knowledge and theoretical knowledge
and second, to leave the detection of underlying concepts of the problem to the learn-
ing algorithm in a data mining approach. The learning algorithm in turn can only
detect those concepts that are not superimposed by noise. When the noise is large,
fewer concepts can be detected by the learning algorithm. Thus, if concepts are known
through domain knowledge they might be of more detail than any of the concepts a learning
algorithm could possibly learn when the noise level is large. Therefore the concepts known
already should be implemented in the forecasting model manually. An example for such a
concept known from domain knowledge is the concept of the price effect on demand. We
know from other research that demand is almost always increased when the price is low-
ered. There are only very few special cases in which this relationship does not hold. With
the domain knowledge about the products offered by the case company we can exclude
these special cases and come to the conclusion that in the problem domain the demand is
always increased or at least unchanged if the price is lowered and vice versa.
For the application of data mining algorithms it is essential that available domain
knowledge is incorporated into data preparation. The domain knowledge about which
concepts might actually be there has to be transformed into an appropriate data prep-
aration that makes the potential information accessible for the learning algorithm.
The third type of method is constituted by methods of combined forecasting. Com-
bined forecasting means to apply several different forecasting methods on the same
problem and use the average of the results as the forecast. Armstrong (2001) states that
the results usually become better when the combined methods use distinct forecasting
techniques or rely on distinct data sources.
One goal of this research was to examine if data mining techniques can be used to
improve demand forecasting for products with high uncertainty and very short selling
periods. We showed that in fact data mining algorithms can only be applied when noise
and uncertainty in the data are comparatively low. Because the data at the case com-
pany comes with very high uncertainty and noise, aggregation has to be applied on the
data to reduce the noise level so far that the data can be used for reliable forecasting.
The problem here is that the extent of aggregation needs to be so high that the number
of remaining relationships in the data shrinks to a complexity level at which data
mining algorithms no longer need to be applied. A single formula can be used to
model the remaining relationships in the data. In order to apply data mining algorithms
such that they can model more complex relationships the aggregation level has to be
reduced to reveal additional relationships in the data. But we showed that a reduction
of the aggregation level does not seem possible because in this case noise superimposes
the information contained in the data. Maybe a reduction of the aggregation level would
be possible with another product group feature (such as style, novelty or usefulness),
but it is questionable if such a feature can be found and it is also currently not captured
in the data warehouse of the case company.
We showed in this research that combined forecasting is a useful approach to achieve
better forecasting accuracy in situations of high uncertainty by developing an improved
forecasting method that significantly increased forecasting accuracy. Next to combined
forecasting judgmental adjustment of forecasts delivers a valuable source of informa-
tion about the environment and the problem domain that is not entailed in the data.
These findings encourage further research on how to integrate judgmental and contextual
information with information from databases. Especially in the field of data mining there is
almost no literature on a combined approach of data mining techniques with judgmental
techniques, which we believe will lead to much better results than relying on data mining
techniques alone.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
DM carried out the research and wrote the manuscript. MS supervised the research and reviewed the manuscript.
PW co-supervised the research and gave recommendations for improvements. All authors read and approved the
final manuscript.
Received: 2 September 2013 Accepted: 6 September 2013
Published: 19 February 2014
References
Armstrong, JS. (2001). Principles of forecasting: a handbook for researchers and practitioners. New York, Boston, Dordrecht,
London, Moscow: Kluwer Academic Publishers.
Christopher, M, Lowson, R, & Peck, H. (2004). Creating agile supply chains in the fashion industry. International Journal
of Retail & Distribution Management, 32(8), 367–376.
Fayyad, U, Piatetsky-Shapiro, G, & Smyth, P. (1996). Knowledge discovery and data mining: towards a unifying framework.
Menlo Park, CA: Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining (KDD-96). AAAI Press.
Fisher, M, & Rajaram, K. (2000). Accurate retail testing of fashion merchandise: methodology and application. Marketing
Science, 19(2), 266–278.
Hand, DJ. (1998). Data mining: statistics and more? The American Statistician, 52(2), 112–118.
Kuo & Xue. (1999). Fuzzy neural networks with application to sales forecasting. Fuzzy Sets and Systems, 108(2), 123–143.
Nowack, A. (2005). Prognose bei unregelmäßigem Bedarf. In P Mertens & S Rässler (Eds.), Prognoserechnung (pp. 61–72).
Heidelberg: Physica-Verlag.
Pyle, D. (1999). Data preparation for data mining. San Francisco, California: Morgan Kaufmann Publishers.
Sanders, NR, & Ritzman, LP. (2001). Judgmental adjustment of statistical forecasts. In JS Armstrong (Ed.), Principles of
forecasting: a handbook for researchers and practitioners (pp. 195–213). New York, Boston, Dordrecht, London,
Moscow: Kluwer Academic Publishers.
Simoudis, E. (1996). Reality check for data mining. IEEE Expert: Intelligent Systems and Their Applications, 11(5), 26–33.
Thomassey, S, & Fiordaliso, A. (2006). A hybrid sales forecasting system based on clustering and decision trees. Decision
Support Systems, 42, 408–421.
Wedekind, H. (1968). Ein Vorhersagemodell für sporadische Nachfragemengen bei der Lagerhaltung. Ablauf- und
Planungsforschung, 9, 1. et sqq.
Weiss, SM, & Indurkhya, N. (1998). Predictive data mining: a practical guide. San Francisco, California: Morgan Kaufmann
Publishers.
Witten, IH, & Frank, E. (2005). Data mining: practical machine learning tools and techniques (2nd ed.). San Francisco,
California: Morgan Kaufmann Publishers.
doi:10.1186/2193-8636-1-4
Cite this article as: Maaß et al.: Improving short-term demand forecasting for short-lifecycle consumer products
with data mining techniques. Decision Analytics 2014, 1:4.
... Exploring and investigating this data with the help of data mining leads to improved transparency and security in the supply chain (Eugene et al. 2017;Pappas et al. 2018) and to increasing understanding and knowledge about CE in the dairy supply chain (Choi et al. 2018;Geissdoerfer et al., 2018;Liu et al., 2020). Data mining techniques also aim to offer approaches to improve forecasts by dealing with the challenge of short-term forecasting for food products with high demand uncertainty and short shelf-life (Maaß et al, 2014). In addition, with an accurate forecast, the issues in establishing the balance between supply and demand can also be managed (Pappas et al. 2018;Blackburn et al., 2015;Arunachalam et al. 2018). ...
... In the food industry, data mining is an extremely useful method for dealing with food products with high demand uncertainty and short shelf life; dairy products are a prime example (Maaß et al, 2014). In this study, for "inadequacy of legal systems to build a circular system", "issues related to data security, integration and privacy", "technical infrastructure deficiency in CE adoption", "difficulties in establishing the balance between supply and demand", "little understanding and knowledge about CE in dairy supply chain", "lack of transparency across the value chain", "shorter product life cycle" and "lack of collaboration, coordination and cooperation among stakeholders" problems, data mining can be a solution to prevent losses caused by the lack of circular and sustainable processes. ...
Article
This study determines the potential barriers to achieving circularity in dairy supply chains; it proposes a framework which covers big data driven solutions to deal with the suggested barriers. The main contribution of the study is to propose a framework by making ideal matching and ranking of big data solutions to barriers to circularity in dairy supply chains. This framework further offers a specific roadmap as a practical contribution while investigating companies with restricted resources. In this study the main barriers are classified as ‘economic’, ‘environmental’, ‘social and legal’, technological’, ‘supply chain management’ and ‘strategic’ with twenty-seven sub-barriers. Various big data solutions such as machine learning, optimization, data mining, cloud computing, artificial neural network, statistical techniques and social network analysis have been suggested. Big data solutions are matched with circularity focused barriers to show which solutions succeed in overcoming barriers. A hybrid decision framework based on the fuzzy ANP and the fuzzy VIKOR is developed to find the weights of the barriers and to rank the big data driven solutions. The results indicate that among the main barriers, ‘economic’ was of the highest importance, followed by ‘technological’, ‘environmental’, ‘strategic’, ‘supply chain management’ then ‘social and legal barrier’ in dairy supply chains. In order to overcome circularity focused barriers, ‘optimization’ is determined to be the most important big data solutin. The other solutions to overcoming proposed challenges are ‘data mining’, ‘machine learning’, ‘statistical techniques’ and ‘artificial neural network’ respectively. The suggested big data solutions will be useful for policy makers and managers to deal with potential barriers in implementing circularity in the context of dairy supply chains.
... In practice, after a few weeks from the delivery on market, one has to sense how well a clothing item has performed and forecast its behavior in the coming weeks. This is crucial to improve restocking policies [27]: a clothing item with a rapid selling rate should be restocked to avoid stockouts. Two particular cases of the SO-fore problem will be taken into account: SO-fore 2−10 , in which the observed window is 2 weeks long and the forecasting horizon is 10 weeks long, required when a company wants to implement few restocks [14]; SO-fore 2−1 , where the forecasting horizon changes to a single week, and is instead re- quired when a company wants to take decisions on a weekly basis, as in the ultra-fast fashion supply chain [9,39]. ...
... • Demand forecasting. Forecasting demand is a crucial issue for driving efficient operations management plans [13,30,41].This is especially the case in the fast fashion industry,where demand uncertainty, lack of historical data, variable ultra-fast life-cycle of a product and seasonal trends usually coexist [20,27]. In rough terms demand forecasting outputs the amount of goods to buy from the suppliers. ...
Preprint
Full-text available
We present Visuelle 2.0, the first dataset useful for facing diverse prediction problems that a fast-fashion company has to manage routinely. Furthermore, we demonstrate how the use of computer vision is substantial in this scenario. Visuelle 2.0 contains data for 6 seasons / 5355 clothing products of Nuna Lie, a famous Italian company with hundreds of shops located in different areas within the country. In particular, we focus on a specific prediction problem, namely short-observation new product sale forecasting (SO-fore). SO-fore assumes that the season has started and a set of new products is on the shelves of the different stores. The goal is to forecast the sales for a particular horizon, given a short, available past (few weeks), since no earlier statistics are available. To be successful, SO-fore approaches should capture this short past and exploit other modalities or exogenous data. To these aims, Visuelle 2.0 is equipped with disaggregated data at the item-shop level and multi-modal information for each clothing item, allowing computer vision approaches to come into play. The main message that we deliver is that the use of image data with deep networks boosts performances obtained when using the time series in long-term forecasting scenarios, ameliorating the WAPE by 8.2% and the MAE by 7.7%. The dataset is available at: https://humaticslab.github.io/forecasting/visuelle.
... In practice, after a few weeks from the delivery on market, one has to sense how well a clothing item has performed and forecast its behavior in the coming weeks. This is crucial to improve restocking policies [23]: a clothing item with a rapid selling rate should be restocked to avoid stockouts. The same reasoning can be applied to the opposite case, where a product with a slow-selling rate might not necessarily be a focus point for restocking in the near future. ...
Preprint
Full-text available
The fashion industry is one of the most active and competitive markets in the world, manufacturing millions of products and reaching large audiences every year. A plethora of business processes are involved in this large-scale industry, but due to the generally short life-cycle of clothing items, supply-chain management and retailing strategies are crucial for good market performance. Correctly understanding the wants and needs of clients, managing logistic issues and marketing the correct products are high-level problems with a lot of uncertainty associated with them, given the number of influencing factors, but most importantly due to the unpredictability often associated with the future. It is therefore straightforward that forecasting methods, which generate predictions of the future, are indispensable in order to ameliorate all the various business processes that deal with the true purpose and meaning of fashion: having a lot of people wear a particular product or style, rendering these items, people and consequently brands fashionable. In this paper, we provide an overview of three concrete forecasting tasks that any fashion company can apply in order to improve their industrial and market impact. We underline advances and issues in all three tasks and argue about their importance and the impact they can have at an industrial level. Finally, we highlight issues and directions of future work, reflecting on how learning-based forecasting methods can further aid the fashion industry.
... Sales promotion has a short-term impact on the profitability of the brand, but consumer loyalty and long-term profitability are achieved only if consumers recollect the sales promotions [35]. Innovations in retail, in general, are becoming one of the crucial factors of success. Today they are closely connected with the services and the ambiance in a retail store (managing the interior, etc.), but are also closely linked to the techniques used in retail self-service and form part of a very efficient strategy of successful retailers. ...
... Their main goal was actively involving users in the process of model development and testing. Building on their earlier work on effective feature selection [14], they concluded that random forest models were the most promising for B2B sales forecasting. Here, we proposed an end-to-end cloud-based workflow to forecast the outcome of B2B sales opportunities by reframing this problem as a binary classification task. ...
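A hedged sketch of the reframing mentioned above, predicting whether a B2B sales opportunity is won as a binary classification problem with a random forest; the features, data and evaluation are synthetic stand-ins, not the cited cloud-based workflow.

```python
# Won/lost prediction for sales opportunities as binary classification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))        # placeholder features, e.g. deal size, stage age, touchpoints
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)  # won=1 / lost=0

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```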
... Data mining can greatly benefit industries such as retail, banking, and telecommunications; classification and clustering can be applied in these areas [68]. Retailers gather customer information, transaction information, and product information to considerably improve the accuracy of product demand forecasting, variety optimization, product recommendation, and ranking across retailers and manufacturers [69,70]. Researchers leverage SVM [71], support vector regression [72], or the Bass model [73] to forecast product demand. ...
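Of the techniques listed in this excerpt, support vector regression is easy to sketch on lagged demand features; the series, lag count and hyperparameters below are illustrative assumptions.

```python
# SVR on lagged weekly demand as a simple demand-forecasting example.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

weekly_demand = np.array([20, 23, 25, 24, 30, 34, 33, 38, 41, 40, 45, 47], dtype=float)
lags = 3
X = np.array([weekly_demand[i:i + lags] for i in range(len(weekly_demand) - lags)])
y = weekly_demand[lags:]                           # next-week demand for each lag window

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)).fit(X, y)
next_week = model.predict(weekly_demand[-lags:].reshape(1, -1))
print("Forecast for next week:", round(float(next_week[0]), 1))
```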
Article
The huge volumes of data generated by the Internet of Things (IoT) are considered to be of high business value, and data mining algorithms can be applied to the IoT to extract hidden information from these data. In this paper, we give a systematic review of data mining from the knowledge, technique and application perspectives, covering classification, clustering, association analysis, time series analysis and outlier analysis. The latest application cases are also surveyed. As more and more devices are connected to the IoT, huge volumes of data must be analyzed, and the latest algorithms must be adapted to big data. We review these algorithms and discuss challenges and open research issues. Finally, a suggested big data mining system is proposed.
... Estimating the creditworthiness of borrowers in advance during the credit assessment process is one of the principal factors in the performance of insurance companies and banks [49]. Retailers collect customer information, product information, and related business transactions to significantly boost the accuracy of product demand prediction, product recommendation, and product ranking for retailers and manufacturers [50]. The SVM [51] or the Bass model [52] aids researchers in predicting product demand [10,11]. ...
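The Bass model referenced here has a well-known closed form for expected adoptions over time; the sketch below uses illustrative innovation (p), imitation (q) and market-size (m) parameters rather than values from any cited study.

```python
# Expected sales under the closed-form Bass diffusion model.
import numpy as np

def bass_sales(t, p=0.03, q=0.38, m=10000):
    """Expected sales at time t: m * f(t), with f(t) the Bass adoption density."""
    e = np.exp(-(p + q) * t)
    f = ((p + q) ** 2 / p) * e / (1 + (q / p) * e) ** 2
    return m * f

weeks = np.arange(0, 52)
print(bass_sales(weeks)[:5].round(1))   # expected weekly sales early in the lifecycle
```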
Article
Full-text available
Data mining is one of the most popular analysis methods in medical research. It involves finding previously unknown patterns and correlations in datasets. Data mining encompasses various areas of biomedical research, including data collection, clinical decision support, illness or safety monitoring, public health, and inquiry research. Health analytics frequently uses computational methods for data mining, such as clustering, classification, and regression. Studies of large numbers of diverse, heterogeneous documents, including biological and electronic health information, have supported medical and health research.
... In fact, the current fashion system came under scrutiny, as practitioners suggested a new "decentralized" fashion calendar with only two seasons and the elimination of off-season production [119]. Moreover, the advent of big data analytics provides promising opportunities to invest in data mining, warehousing and artificial intelligence systems to increase accuracy in fashion sales forecasting [120,121]. ...
Article
Full-text available
This paper explores why and how dominant international social standards used in the fashion industry are prone to implementation failures. A qualitative multiple-case study method was conducted, using purposive sampling to select 13 apparel supply chain actors. Data were collected through on-site semi-structured face-to-face interviews. The findings of the study are interpreted by using core tenets of agency theory. The case study findings clearly highlight why and how multi-tier apparel supply chains fail to implement social standards effectively. As a consequence of substantial goal conflicts and information asymmetries, sourcing agents and suppliers are driven to perform opportunistic behaviors in form of hidden characteristics, hidden intentions, and hidden actions, which significantly harm social standards. Fashion retailers need to empower their corporate social responsibility (CSR) departments by awarding an integrative role to sourcing decisions. Moreover, accurate calculation of orders, risk sharing, cost sharing, price premiums, and especially guaranteed order continuity for social compliance are critical to reduce opportunistic behaviors upstream of the supply chain. The development of social standards is highly suggested, e.g., by including novel metrics such as the assessment of buying practices or the evaluation of capacity planning at factories and the strict inclusion of subcontractors’ social performances. This paper presents evidence from multiple Vietnamese and Indonesian cases involving sourcing agents as well as Tier 1 and Tier 2 suppliers on a highly sensitive topic. With the development of the conceptual framework and the formulation of seven related novel propositions, this paper unveils the ineffectiveness of social standards, offers guidance for practitioners, and contributes to the neglected social dimension in sustainable supply chain management research and accountability literature.
Conference Paper
Full-text available
In demand forecasting, which can depend on various internal and external factors, machine learning (ML) methods can capture complex patterns and enable precise forecasts. Accurate forecasts facilitate targeted, demand-oriented planning and control of production and underline the importance of this task. The implementation of ML algorithms requires knowledge of the specific domain as well as knowledge of data science and involves an elaborate set-up process. This often makes the application of ML to potential industrial problems economically unattractive. The major skills shortage in the field of data science further exacerbates this. Automation and better accessibility of ML methods are therefore a key prerequisite for widespread use. This is where the principle of automated ML (AutoML) comes in, automating large parts of an ML pipeline and thus reducing the required human labour. Therefore, the aim of this publication is to investigate the extent to which AutoML solutions can generate added value for demand planning in the context of production planning and control. For this purpose, publicly available datasets from Walmart as well as from an anonymised manufacturing company are used for short-term and long-term forecasting. The AutoML tools from Microsoft, Dataiku and Google conduct these forecasts. Statistical models serve as benchmarks. The results show that the forecasting quality varies depending on the software, the input data and their demand patterns. Overall, the prepared models from Microsoft show the most accurate results on average, and the potential of AutoML becomes particularly clear in the short-term forecast. This paper enriches the research field through its broad application, giving valuable insights into the use of AutoML tools for demand planning. The resulting understanding of the limitations and benefits of AutoML tools for the presented case studies fosters their suitable application in practice.
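The vendor AutoML APIs compared in this paper are not reproduced here; as a rough stand-in, the sketch below contrasts a seasonal-naive statistical benchmark with a small automated selection over candidate regressors on lag features, using synthetic monthly demand.

```python
# Seasonal-naive benchmark vs. a toy automated model search (stand-in for AutoML).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
demand = 50 + 10 * np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 3, 120)
train, test = demand[:108], demand[108:]            # last 12 periods held out

# Statistical benchmark: repeat the values from one season (12 periods) ago.
seasonal_naive = demand[96:108]
print("seasonal naive MAE:", round(mean_absolute_error(test, seasonal_naive), 2))

# Toy automated selection: pick the candidate with the best validation MAE on lag features.
lags = 12
X = np.array([demand[i:i + lags] for i in range(108 - lags)])
y = train[lags:]
split = 80                                           # small validation split
candidates = [Ridge(), GradientBoostingRegressor(random_state=0),
              RandomForestRegressor(random_state=0)]
best = min(candidates, key=lambda m: mean_absolute_error(
    y[split:], m.fit(X[:split], y[:split]).predict(X[split:])))
best.fit(X, y)

X_test = np.array([demand[96 + i:108 + i] for i in range(12)])  # one-step-ahead lags
print(type(best).__name__, "MAE:", round(mean_absolute_error(test, best.predict(X_test)), 2))
```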
Article
Full-text available
Judgmental and statistical forecasts can each bring advantages to the forecasting process. One way forecasters can integrate these methods is to adjust statistical forecasts based on judgment. However, judgmental adjustments can bias forecasts and harm accuracy. Forecasters should consider six principles in deciding when and how to use judgment in adjusting statistical forecasts: (1) adjust statistical forecasts if there is important domain knowledge; (2) adjust statistical forecasts in situations with a high degree of uncertainty; (3) adjust statistical forecasts when there are known changes in the environment; (4) structure the judgmental adjustment process; (5) document all judgmental adjustments made and periodically relate them to forecast accuracy; (6) consider mechanically integrating judgmental and statistical forecasts rather than adjusting.
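Principle (6) above, mechanical integration rather than manual adjustment, can be sketched as a simple weighted combination, with a log that also supports principle (5); the 50/50 weight and helper names are illustrative choices, not recommendations from the cited article.

```python
# Mechanically combine a statistical and an independent judgmental forecast.
def combine_forecasts(statistical, judgmental, w_statistical=0.5):
    return w_statistical * statistical + (1 - w_statistical) * judgmental

# Document each combination so it can later be compared to actuals (principle 5).
adjustment_log = []

def record_adjustment(item, statistical, judgmental, reason):
    combined = combine_forecasts(statistical, judgmental)
    adjustment_log.append({"item": item, "statistical": statistical,
                           "judgmental": judgmental, "combined": combined,
                           "reason": reason})
    return combined

print(record_adjustment("item_42", statistical=120.0, judgmental=150.0,
                        reason="upcoming promotion known to sales team"))
```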
Article
Full-text available
Fashion markets are synonymous with rapid change and, as a result, commercial success or failure is largely determined by the organisation's flexibility and responsiveness. Responsiveness is characterised by short time-to-market, the ability to scale up (or down) quickly and the rapid incorporation of consumer preferences into the design process. In this paper it is argued that conventional organisational structures and forecast-driven supply chains are not adequate to meet the challenges of volatile and turbulent demand which typify fashion markets. Instead, the requirement is for the creation of an agile organisation embedded within an agile supply chain.
Book
Principles of Forecasting: A Handbook for Researchers and Practitioners summarizes knowledge from experts and from empirical studies. It provides guidelines that can be applied in fields such as economics, sociology, and psychology. It applies to problems such as those in finance (How much is this company worth?), marketing (Will a new product be successful?), personnel (How can we identify the best job candidates?), and production (What level of inventories should be kept?). The book is edited by Professor J. Scott Armstrong of the Wharton School, University of Pennsylvania. Contributions were written by 40 leading experts in forecasting, and the 30 chapters cover all types of forecasting methods. There are judgmental methods such as Delphi, role-playing, and intentions studies. Quantitative methods include econometric methods, expert systems, and extrapolation. Some methods, such as conjoint analysis, analogies, and rule-based forecasting, integrate quantitative and judgmental procedures. In each area, the authors identify what is known in the form of `if-then principles', and they summarize evidence on these principles. The project, developed over a four-year period, represents the first book to summarize all that is known about forecasting and to present it so that it can be used by researchers and practitioners. To ensure that the principles are correct, the authors reviewed one another's papers. In addition, external reviews were provided by more than 120 experts, some of whom reviewed many of the papers. The book includes the first comprehensive forecasting dictionary.
Chapter
The exponential smoothing forecasting method, based on the work of Brown, assumes that demand arrives in every observed period.
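A minimal sketch of simple exponential smoothing in the spirit of Brown's method referenced in this chapter; the smoothing constant is an arbitrary illustrative choice.

```python
# Simple exponential smoothing: each forecast blends the latest observation
# with the previous forecast.
def exponential_smoothing(series, alpha=0.3):
    forecast = series[0]                   # initialize with the first observation
    for demand in series[1:]:
        forecast = alpha * demand + (1 - alpha) * forecast
    return forecast                        # one-step-ahead forecast for the next period

print(exponential_smoothing([30, 28, 35, 32, 31, 40]))
```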
Chapter
Sporadic demand occurs in many industries. One example is fashion articles with a very limited lifespan. Across industries, so-called slow movers pose particular forecasting problems, which can be traced back to their highly irregular demand [2].
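For sporadic "slow mover" demand like that discussed in this chapter, Croston's method is a commonly used approach (not necessarily the one the chapter proposes): it smooths the non-zero demand sizes and the intervals between them separately. A minimal sketch:

```python
# Croston-style forecast for intermittent demand: demand rate = smoothed size / smoothed interval.
def croston(series, alpha=0.1):
    size, interval, periods_since = None, None, 1
    for demand in series:
        if demand > 0:
            if size is None:                            # first non-zero observation
                size, interval = float(demand), float(periods_since)
            else:
                size = alpha * demand + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1
    return 0.0 if size is None else size / interval     # expected demand per period

print(croston([0, 0, 5, 0, 0, 0, 3, 0, 4, 0, 0, 6]))
```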
Article
Data mining is a new discipline lying at the interface of statistics, database technology, pattern recognition, machine learning, and other areas. It is concerned with the secondary analysis of large databases in order to find previously unsuspected relationships which are of interest or value to the database owners. New problems arise, partly as a consequence of the sheer size of the data sets involved, and partly because of issues of pattern matching. However, since statistics provides the intellectual glue underlying the effort, it is important for statisticians to become involved. There are very real opportunities for statisticians to make significant contributions.
Conference Paper
This paper presents a first step towards a unifying framework for Knowledge Discovery in Databases. We describe links between data mining, knowledge discovery, and other related fields. We then define the KDD process and basic data mining algorithms, discuss application issues and conclude with an analysis of challenges facing practitioners in the field.