Urban freight demand estimation: a probability distribution based method

Alexandra Lagorio 1, Jesus Gonzalez-Feliu 2, Roberto Pinto 3

1,3 Department of Management, Information and Industrial Engineering, University of Bergamo, via G. Marconi,

24044, Dalmine (BG) – Italy

2 Environnement, Ville et Société, Department of Environment and Organization Engineering, Institut Henri Fayol, Ecole des Mines de Saint-Etienne, 158 cours Fauriel, 42023 Saint-Etienne Cedex – France

{alexandra.lagorio@unibg.it, jesus.gonzalez-feliu@emse.fr, roberto.pinto@unibg.it}

Abstract. The lack of data is one of the most common problems in the design of solutions that optimize urban freight transport (City Logistics Projects). To design an effective City Logistics Project for a given area of a city, it is necessary to have data on the number of daily deliveries that each commercial activity in the area receives, with detailed information on the time of delivery, the type of vehicle used and the quantity delivered. Only in this way is it possible to dimension the freight demand correctly and realistically. In reality, these data are not always available for an adequate period of time. Starting from the existing literature on demand forecasting and from an analysis of real data, this paper proposes an alternative method for forecasting the demand for goods in a given area of the city when only the typology of commercial activities and a small amount of data are known.

Keywords: urban freight transport, demand estimation, city logistics, statistical distribution analysis.

1. Introduction

Many factors have contributed to transforming urban logistics in recent years: the fragmentation of the demand for goods, the ever-increasing frequency of deliveries, and the continuous spread of e-commerce, which has turned every home into a delivery point. All these factors have led to an increase in traffic congestion and a consequent increase in noise pollution and emissions of air pollutants, which have made urban centers less and less livable. To cope with these problems, city logistics projects aimed at optimizing the urban transport of goods have been developed.

However, implementing this type of solution requires careful planning, which is made complex by the presence of numerous stakeholders with often conflicting interests and by the scarcity, if not the absence, of the data needed to design the city logistics solution best suited to the context (Lagorio et al., 2016). In fact, the most widespread city logistics solutions, such as urban distribution centers, loading-unloading areas for delivery operations or the delivery of goods by vehicles with low environmental impact (e.g., cargo-bikes, e-vehicles), require correct information regarding the infrastructural network, the regulations, the location of the places of delivery (e.g., shops, private houses) and the demand for goods. While information about the first elements is relatively easy to obtain online (Golini et al., 2018), estimating the demand for goods of an urban center (or a part of it) is not a trivial problem.

To estimate the future demand for goods, researchers usually rely on historical series, but in this case they are difficult to obtain because, unlike the demand for goods in the industrial sector, we do not deal with one or a few suppliers and an exact number of customers. In urban freight transport, each shop carries out its operations independently of the others, as do suppliers; each one keeps track of its operations in a different way, and it is therefore difficult to reconstruct the number of deliveries received per store.

In particular, in order to implement the main city logistics solutions, it is necessary to know the number of

deliveries that each store receives in a week, the number of vehicles from which these deliveries are made,

the type of vehicle used for the delivery, the day of the week and the time of the day in which deliveries

took place. This data should be available for a sufficiently long period of time so that trends, peaks and

seasonality can be observed in the frequency of deliveries.

Nowadays, the following general modeling frameworks are applied to estimate the demand generation

patterns related to urban logistics (Gonzalez-Feliu and Peris-Pla, 2017):

7th International Conference on Information Systems, Logistics and Supply Chain

ILS Conference 2018, July 8-11, Lyon, France

- Deterministic freight trip generation (FTG) or freight generation (FG) models, derived from establishment-based surveys and mainly related to category classification datasets (Holguin-Veras et al., 2011). These models can be based on constant rates calculated for each category (Holguin-Veras et al., 2012; Gonzalez-Feliu et al., 2014b; Aditjandra et al., 2016) or relate, for each category, the number of trips (FTG) or the freight quantities (FG) to different variables, such as employment (Holguin-Veras et al., 2013; Gonzalez-Feliu et al., 2014a), area (Jaller et al., 2014; Alho and de Abreu e Silva, 2014) or income, resulting in a constant, linear or non-linear model for each category (Sanchez-Diaz et al., 2016b). However, models assigning a different functional form to each category seem to give better results than models with a unique functional form (Sanchez-Diaz et al., 2016a).

- Random-based approaches (Deflorio et al., 2012; Faure et al., 2016; Lagorio et al., 2016), used where, because of the lack of data or the need for dynamic information for simulation purposes, assigning a constant rate to each establishment is not pertinent. One possibility, when no other information is available, is to determine suitable average values (which can generally be estimated, at least roughly) and then generate a random value around that average. These approaches assume a uniform statistical distribution of the data, which is not always the case (Gonzalez-Feliu et al., 2014a).

- Probabilistic generation, i.e. random generation based not on a uniform distribution but on other probability distributions. Although this kind of generation is not yet widely deployed, two works apply it by comparing Normal and Rayleigh distributions (Gonzalez-Feliu et al., 2014a; Lopez et al., 2016), and the second distribution seems to be more pertinent than the first (Lopez et al., 2016).
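The three generation families listed above can be contrasted with a minimal sketch. Everything in it is illustrative: the function names, the `spread` parameter and the rate of 6.6 weekly deliveries are assumptions made for the example, not values or procedures prescribed by the cited works. The Rayleigh draw uses inverse-CDF sampling with the scale chosen so that the expected value equals the category average.

```python
import math
import random


def constant_rate(mean_rate):
    """Deterministic generation: every establishment gets the category rate."""
    return mean_rate


def uniform_around_mean(mean_rate, spread, rng):
    """Random-based generation: uniform draw around the average, floored at 0."""
    low = max(0.0, mean_rate - spread)
    return rng.uniform(low, mean_rate + spread)


def rayleigh_with_mean(mean_rate, rng):
    """Probabilistic generation: Rayleigh draw whose expected value is mean_rate."""
    sigma = mean_rate / math.sqrt(math.pi / 2.0)  # E[X] = sigma * sqrt(pi / 2)
    u = rng.random()  # uniform in [0, 1)
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))  # inverse-CDF sampling


if __name__ == "__main__":
    rng = random.Random(42)
    mean_rate = 6.6  # illustrative weekly deliveries for one category (assumption)
    print(constant_rate(mean_rate))
    print(uniform_around_mean(mean_rate, 3.0, rng))
    print(rayleigh_with_mean(mean_rate, rng))
```

Note that when the floor at zero is active, the uniform draw is no longer centered on the average; a real application would have to decide how to handle that case.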

However, since many other probability distributions can be defined, and analogously to the recent developments in deterministic generation (Gonzalez-Feliu et al., 2016; Sanchez-Diaz et al., 2016a), it seems interesting to explore the quality and relevance of using different probability distributions for random generation in FTG.

This paper provides a process that can be followed to estimate the demand for goods in urban areas when there is no exact data on delivery frequencies. The estimation procedure is illustrated, and the results of the first phases of the process are reported, as the research is still ongoing.

In the following section, we present the methodology that leads to the estimation of the demand for urban goods in the absence of extended datasets on delivery frequency. Then, the first results of applying the first phases of the estimation process are reported. Finally, the limitations, the future developments of the research and the final considerations on the work carried out so far are discussed.

2. Methodology

The methodology, which starts from data collection and leads to an urban freight demand estimation, is quite long and is summarized in Figure 1. The proposed method is a specific application, in the urban freight transport field, of the general method of generating demand from probability distributions (Kolassa, 2016; Petrik, Moura and Silva, 2016). Figure 1 reports all the steps implemented in this research to perform the demand estimation using probability distributions.

FIGURE 1 Research methodology streamline: (1) Data collection; (2) Categorization; (3) Preliminary statistical analysis; (4) Identification of probability distributions; (5) Test; (6) Application with Monte Carlo simulation; (7) Urban freight demand estimation

First of all, some data are necessary to have a starting point for the identification of the probability

distribution of the number of deliveries. In particular, the number of weekly deliveries for each shop in a

defined pilot area is necessary (1).

Then, the list of shops can be separated into categories based on the category of product sold. Thus, for each category, we have a list of shops with the average number of weekly deliveries (2). At this point it is possible to compute some classical statistical parameters (e.g., mean, variance, maximum, minimum, quartiles) and to perform some preliminary statistical analyses (e.g., correlation tests) in order to understand whether external factors such as the shop surface, the number of employees or the number of suppliers affect the average number of weekly deliveries (3).

After performing these analyses, it is possible to identify the probability distribution for each freight category (4). In this paper, the analysis stops at this point, but the process is not yet concluded. To guarantee adequate reliability and replicability of the method, it is still necessary to test the goodness of the distributions, for example by trying to define them starting from another pilot area with the same freight categories (5). If the probability distributions are the same for each freight category, it is possible to estimate the freight demand for a new area in which the shop categories are known but the average number of weekly deliveries is unknown. The idea is to assign the probability distribution of the freight category to every shop in the selected area and then perform a Monte Carlo simulation that simulates the number of deliveries for each shop in the area according to the probability distribution assigned to the shop category (6).

In this way it is possible to estimate the urban freight demand of every area of the city in which we know the category of freight sold by each shop (7).
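Steps (6) and (7) can be sketched as follows. This is only an illustration of the idea, not the simulation used in the research (which is still to be carried out): the area composition, the function names and the number of runs are hypothetical. The distribution parameters are those reported in Table 3 for three categories, under the assumption that they map directly onto the parameterization of Python's `random` module (lognormal μ and σ, gamma shape α and scale β, exponential rate λ).

```python
import random
import statistics

# Samplers keyed by category label, using the Table 3 parameters.
# Assumption: the fitted parameters follow the same parameterization
# as Python's random module (see the lead-in above).
SAMPLERS = {
    "BOUCHERIE": lambda rng: rng.lognormvariate(1.516, 0.73738),
    "EPICERIE,ALIM": lambda rng: rng.gammavariate(1.8445, 2.8525),
    "SUPERMARCHES": lambda rng: rng.expovariate(0.02175),
}


def simulate_area(shop_categories, samplers=SAMPLERS, n_runs=1000, seed=0):
    """Return n_runs simulated totals of weekly deliveries for the area."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        # One draw per shop, from the distribution of the shop's category.
        totals.append(sum(samplers[cat](rng) for cat in shop_categories))
    return totals


def summarize(totals):
    """Mean and empirical 5th/95th percentiles of the simulated totals."""
    ordered = sorted(totals)
    last = len(ordered) - 1
    return {
        "mean": statistics.fmean(ordered),
        "p05": ordered[int(0.05 * last)],
        "p95": ordered[int(0.95 * last)],
    }


if __name__ == "__main__":
    # Hypothetical area: 3 butcheries, 5 groceries, 1 supermarket.
    area = ["BOUCHERIE"] * 3 + ["EPICERIE,ALIM"] * 5 + ["SUPERMARCHES"]
    print(summarize(simulate_area(area)))
```

Reporting a range (here the 5th to 95th percentile) rather than a single figure is one way to expose the variability that a constant-rate model would hide.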

3. First Results

In this paper, the average number of deliveries per week for every single store was analyzed. The data come from the French Urban Goods Transport Surveys (Ambrosini et al., 2010), already used in Gonzalez-Feliu et al. (2016) and Sanchez-Diaz et al. (2016a) for deterministic models and in Lopez et al. (2016) for Rayleigh-based probabilistic ones. In particular, the data retrieved in the collection period (1996-1999) amount to 2,970 individual entries. Three cities were involved in the survey: Marseille, Dijon and Bordeaux. The data can be used jointly, since works using them show that the freight generator in all those cities is the single establishment, independently of its location (Gonzalez-Feliu et al., 2014b). Although more recent data have been collected (in Paris in 2012 and in Bordeaux in 2014), the two databases are not homogeneous (each of them includes 900 to 1100 valid entries, which does not lead to statistically relevant data for all categories if considered separately), so the joint use of the two surveys is not currently possible (whereas it is possible for the data used in this research). Moreover, the Paris data present some irregularities and gaps that have not yet been corrected, and the Bordeaux survey is not yet openly available. The purpose of this article is to understand whether it is possible, for each category of shops, to obtain a distribution that represents the probability of a given number of weekly deliveries, in order to make a demand prediction.

The first step is a descriptive statistical analysis.

3.1. Descriptive statistical analysis

The data have a very high variability, with numerous extreme values. This variability can be seen in Table 1, where for many categories the standard deviation is equal to or greater than the mean, and in the boxplots (Figure 2).

TABLE 1 Descriptive statistical analysis for each category of production / commercial activity

FIGURE 2 Boxplot related to the different categories of production / commercial activity

Given these characteristics, we can deduce the presence of large variability within certain categories and the consequent presence of many extreme values. Before analysing each category, outliers were eliminated because they could compromise the subsequent data analysis.

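Outliers have been identified by the formula (Dawson, 2011):

$$x > Q_3 + 3\,(Q_3 - Q_1) \qquad (1)$$

where $Q_3$ is the third quartile and $Q_1$ is the first quartile; the resulting thresholds are reported in the Outliers column of Table 1. Subsequent data analyses were then carried out on the data with outliers removed.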
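The outlier filter can be sketched in a few lines. The sketch assumes the upper fence Q3 + 3 (Q3 − Q1), which is consistent with the Outliers column of Table 1 (for category 2, Q1 = 1,0 and Q3 = 4,0 give 13,0). The function names are hypothetical, and the quartile convention of `statistics.quantiles` may differ slightly from the one used by the authors' software.

```python
import statistics


def fence_from_quartiles(q1, q3, k=3.0):
    """Upper fence Q3 + k * (Q3 - Q1); k = 3 is the assumed multiplier."""
    return q3 + k * (q3 - q1)


def upper_fence(values, k=3.0):
    """Compute the upper fence directly from a sample of weekly deliveries."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return fence_from_quartiles(q1, q3, k)


def remove_upper_outliers(values, k=3.0):
    """Keep only the observations that do not exceed the upper fence."""
    fence = upper_fence(values, k)
    return [v for v in values if v <= fence]


if __name__ == "__main__":
    # Quartiles of category 2 (ARTISANS) as listed in Table 1.
    print(fence_from_quartiles(1.0, 4.0))
    # Hypothetical sample with one extreme value.
    print(remove_upper_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```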

We can also observe that, with such high variability within each category, the mean cannot be used as a valid descriptive parameter of the data. The only categories where the mean can be significant are in fact Category 14 (Retail Clothing, Shoes, Accessories, Leather Goods), Category 15 (Butchery) and Category 22 (Bookshop), as can be seen from the boxplots of these three categories.

For the other categories, further investigations are required. If the mean cannot be used as a parameter to describe the sample, we cannot assume that the average number of weekly deliveries is a constant, except for Categories 14, 15 and 22. We must, therefore, make a further assumption. In this case, we can assume that the number of weekly deliveries depends on known internal process features, such as the number of employees (the only internal process feature available in the dataset used for this research).

To validate this hypothesis, a linear regression of the form given in Equation (2) was tested on the data.

| N. | Category | Min | 1st Qu | Median | Mean | 3rd Qu | Max | sd | Outliers |
|---|---|---|---|---|---|---|---|---|---|
| 1 | AGRICULTURE | 0,0 | 0,3 | 1,0 | 1,9 | 2,2 | 11,0 | 2,6 | 8,1 |
| 2 | ARTISANS | 0,0 | 1,0 | 2,0 | 3,6 | 4,0 | 46,3 | 6,0 | 13,0 |
| 3 | IND CHIMIQUE | 0,3 | 2,0 | 5,0 | 17,3 | 16,7 | 134,0 | 29,8 | 60,9 |
| 4 | IND BIENS PROD ET INTERM | 0,0 | 1,4 | 4,0 | 7,8 | 8,0 | 140,7 | 13,8 | 27,9 |
| 5 | IND BIENS CONSOMMATION | 0,4 | 1,0 | 3,0 | 7,1 | 6,0 | 118,0 | 13,3 | 21,0 |
| 6 | TRANSPORT | 0,0 | 0,3 | 1,5 | 4,3 | 3,8 | 59,0 | 10,7 | 14,1 |
| 7 | COM GROS PRODUC INTERMED | 0,0 | 2,0 | 4,0 | 15,2 | 10,0 | 655,0 | 66,9 | 34,0 |
| 8 | COM GROS CONSO NON ALIM | 0,0 | 1,2 | 3,0 | 9,0 | 10,0 | 212,0 | 22,0 | 36,5 |
| 9 | COM GROS BIENS CONS ALIM | 0,1 | 3,0 | 7,5 | 17,6 | 15,0 | 224,0 | 31,5 | 51,0 |
| 10 | SUPERMARCHES | 1,0 | 18,8 | 30,4 | 65,9 | 73,3 | 332,9 | 80,6 | 236,5 |
| 14 | COM DETAIL HAB,CHAUS,CUIR | 0,0 | 1,0 | 2,0 | 2,7 | 4,0 | 16,6 | 2,4 | 13,0 |
| 15 | BOUCHERIE | 1,0 | 3,0 | 5,0 | 6,6 | 8,0 | 36,0 | 6,1 | 23,0 |
| 16 | EPICERIE,ALIM | 0,0 | 2,5 | 5,0 | 6,7 | 8,0 | 49,0 | 7,9 | 24,6 |
| 17 | BOULANGERIES,PATISSERIES | 1,0 | 2,0 | 3,9 | 6,2 | 8,0 | 30,1 | 6,3 | 26,0 |
| 18 | CAFES,HOTELS,RESTAURANTS | 0,0 | 1,9 | 3,0 | 6,4 | 7,3 | 70,0 | 9,7 | 23,2 |
| 19 | PHARMACIES | 1,0 | 21,0 | 26,0 | 27,1 | 34,9 | 81,0 | 14,6 | 76,8 |
| 20 | QUINCAILLERIES | 0,2 | 1,0 | 3,0 | 3,7 | 5,2 | 19,5 | 3,7 | 17,9 |
| 21 | COM D' AMEUBLEMENT | 0,0 | 1,0 | 3,0 | 0,1 | 5,7 | 28,5 | 6,3 | 19,8 |
| 22 | LIBRAIRIE-PAPETERIE | 0,5 | 3,3 | 7,9 | 9,6 | 14,1 | 26,8 | 6,8 | 46,5 |
| 23 | AUTRE COMM DETAIL | 0,0 | 2,0 | 4,0 | 7,1 | 8,0 | 72,0 | 9,4 | 26,0 |
| 24 | SERVICES DIVERS | 0,0 | 2,0 | 3,0 | 8,9 | 11,6 | 74,0 | 1,4 | 40,2 |
| 25 | TERTIAIRE PUR | 0,0 | 0,8 | 2,0 | 3,3 | 3,3 | 49,2 | 5,7 | 10,8 |
| 26 | TERTIAIRE AUTRE | 0,0 | 0,5 | 1,0 | 3,4 | 2,1 | 180,2 | 15,4 | 7,1 |
| 27 | BUREAUX NON TERTIAIRE | 0,0 | 0,9 | 2,0 | 3,8 | 4,8 | 33,0 | 5,6 | 16,5 |
| 28 | ENTREPOTS | 1,0 | 3,3 | 10,0 | 52,8 | 41,5 | 486,0 | 107,3 | 156,2 |
| 29 | COM NON SEDENTAIRES | 0,1 | 1,0 | 2,0 | 3,8 | 5,0 | 17,0 | 4,3 | 17,0 |
| 34 | IND LOURDE (CONSTRUCTION) | 0,0 | 1,0 | 3,1 | 7,3 | 8,0 | 86,0 | 12,6 | 29,0 |

$$y = a\,x + b \qquad (2)$$

where $y$ represents the average number of weekly deliveries per activity, $x$ the actual number of employees working at the same activity, and $a$ and $b$ the slope and the intercept of the regression line, respectively. The results of the analysis are summarized in Table 2.

TABLE 2 Linear Regression Analysis

| N. | Category | Slope a | Intercept b | R^2 | F | p-value | Pearson | Spearman | Residuals mean | Shapiro test p-value | Polynomial R^2 | Anova test |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | AGRICULTURE | 0,1 | 1,5 | 0,1 | 2,2 | 0,1 | 0,27 | 0,53 | 5,75E-11 | 1 | 0,02 | |
| 2 | ARTISANS | 0,8 | 1,8 | 0,2 | 2,0 | 0,2 | 0,15 | 0,26 | -9,41E-11 | 1 | | |
| 3 | IND CHIMIQUE | 0,1 | 9,1 | 0,6 | 53,9 | 4.74e-09 | 0,75 | 0,55 | 1,63E-10 | 1 | | |
| 4 | IND BIENS PROD ET INTERM | 0,1 | 0,3 | 0,5 | 217,9 | 2.2e-16 | 0,70 | 0,50 | 5,89E-11 | 1 | | |
| 5 | IND BIENS CONSOMMATION | 0,2 | 4,1 | 0,5 | 189,9 | 2.2e-16 | 0,72 | 0,40 | -8,54E-11 | 1 | | |
| 6 | TRANSPORT | 0,0 | 3,9 | 0,0 | 0,2 | 0,6481 | 0,09 | 0,50 | 6,10E-10 | 1 | | |
| 7 | COM GROS PRODUC INTERMED | 0,6 | 8,6 | 0,0 | 2,6 | 0,1073 | 0,12 | 0,40 | 6,82E-11 | 1 | | |
| 8 | COM GROS CONSO NON ALIM | 0,4 | 5,2 | 0,1 | 6,1 | 0,01533 | 0,24 | 0,42 | -2,26E-11 | 1 | | |
| 9 | COM GROS BIENS CONS ALIM | 0,3 | 11,9 | 0,2 | 23,0 | 6,19E-06 | 0,45 | 0,35 | -1,21E-15 | 1 | | |
| 10 | SUPERMARCHES | 0,5 | 7,9 | 0,7 | 126,3 | 8,76E-12 | 0,86 | 0,82 | 2,20E-15 | 1 | 0,74 | Linear Regression |
| 14 | COM DETAIL HAB,CHAUS,CUIR | 0,2 | 1,9 | 0,3 | 34,4 | 9,95E-17 | 0,50 | 0,39 | 4,85E-16 | 1 | | |
| 15 | BOUCHERIE | 0,5 | 4,9 | 0,1 | 8,7 | 0,1 | 0,34 | 0,34 | 2,16E-16 | 1 | | |
| 16 | EPICERIE,ALIM | 0,3 | 5,8 | 0,1 | 5,0 | 0,0 | 0,23 | 0,17 | -1,63E-16 | 1 | | |
| 17 | BOULANGERIES,PATISSERIES | 0,1 | 5,9 | 0,0 | 0,7 | 0,4 | 0,08 | 0,01 | 2,72E-16 | 1 | | |
| 18 | CAFES,HOTELS,RESTAURANTS | 0,6 | 2,5 | 0,5 | 164,8 | 2.2e-16 | 0,73 | 0,46 | 3,63E-16 | 1 | | |
| 19 | PHARMACIES | 1,2 | 19,8 | 0,1 | 9,6 | 0,0 | 0,38 | 0,21 | 1,06E-15 | 1 | | |
| 20 | QUINCAILLERIES | 0,7 | 2,2 | 0,1 | 4,3 | 0,0 | 0,31 | 0,10 | 6,32E-17 | 1 | | |
| 21 | COM D' AMEUBLEMENT | 0,1 | 3,5 | 0,4 | 25,2 | 9,44E-03 | 0,61 | 0,54 | 8,60E-18 | 1 | 0,02 | |
| 22 | LIBRAIRIE-PAPETERIE | -0,1 | 10,1 | 0,0 | 2,2 | 0,1 | -0,16 | -0,11 | 1,44E-16 | 1 | | |
| 23 | AUTRE COMM DETAIL | 0,1 | 6,3 | 0,1 | 13,9 | 0,0 | 0,22 | 0,41 | -3,99E-16 | 1 | | |
| 24 | SERVICES DIVERS | 0,4 | 6,9 | 0,0 | 5,2 | 0,0 | 0,22 | 0,42 | 1,68E-16 | 1 | | |
| 25 | TERTIAIRE PUR | 0,0 | 3,1 | 0,0 | 4,1 | 0,0 | 0,13 | 0,45 | -4,46E-17 | 1 | | |
| 26 | TERTIAIRE AUTRE | 0,0 | 3,3 | 0,0 | 0,0 | 0,9 | 0,01 | 0,49 | -6,92E-17 | 1 | | |
| 27 | BUREAUX NON TERTIAIRE | 0,0 | 3,1 | 0,3 | 27,7 | 1,58E-03 | 0,54 | 0,58 | 1,36E-16 | 1 | 0,02 | |
| 28 | ENTREPOTS | 1,0 | 35,1 | 0,1 | 3,3 | 0,1 | 0,23 | 0,38 | -1,12E-15 | 1 | | |
| 29 | COM NON SEDENTAIRES | 0,3 | 3,3 | 0,0 | 0,2 | 0,7 | 0,09 | -0,08 | -2,60E-16 | 1 | | |
| 34 | IND LOURDE (CONSTRUCTION) | 0,2 | 3,3 | 0,3 | 40,8 | 5,44E-06 | 0,54 | 0,44 | -6,36E-16 | 1 | | |

To assess whether there is a dependency between the two variables, we relied on several evaluation tests:

1. Check R^2: the closer R^2 is to 1, the more effective the model is at interpreting the data, and the higher the probability that there is a linear dependence between the average number of weekly deliveries and the number of employees in the business (Yang, 2015). In this study, in particular, we consider that a dependence exists between the two variables when R^2 ≥ 0,5 (Gonzalez-Feliu et al., 2014a).

2. F-Test: “the significance test of the regression equation often uses F test by the method of variance analysis. F-statistics is defined as the average of regression sum of squares to the average of error sum of squares ratio.” (Yang, 2015). With the given significance level α and the degrees of freedom (m, n-m), we can find F_α. If |F| ≥ F_α, the regression has a significant linear relationship. Otherwise, the regression does not have a significant linear relationship, namely, the independent variables do not explain the dependent variable. The p-value associated with the F-value should be near zero (at the 0.0001 level) (Fite et al., 2002).

3. Pearson coefficient: another test that can help us determine whether there is a dependency between the two variables. In particular, if the Pearson coefficient is between 0 and 0,3 the dependency is weak, if it is between 0,3 and 0,7 the dependency is moderate, and if it is greater than 0,7 the dependency is strong. For this study, we have considered only strong dependencies, i.e. Pearson coefficients greater than 0,7.

4. Spearman coefficient: Spearman's coefficient instead measures the strength of a monotonic relationship between two variables, which need not be linear. Even in this case, the closer the coefficient is to 1, the stronger the (possibly nonlinear) relationship between the two measured variables. We have considered a possible nonlinear relationship when the Spearman coefficient is greater than 0,7.

5. Test of normality of the residuals: to evaluate the correctness of a regression model, one of the most common tests is to verify the normality of the residual errors. In this study, the normality of the residual errors was verified with the Shapiro-Wilk test.
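The first four checks listed above can be reproduced with a short sketch that uses only the Python standard library. The function names and the toy data are hypothetical; the thresholds (R^2 ≥ 0,5 and Pearson > 0,7) are the ones stated in the text. For a simple regression with one explanatory variable the F statistic is computed as R^2 / (1 − R^2) · (n − 2). The Shapiro-Wilk test is left out because it is not part of the standard library; a statistics package such as SciPy (`scipy.stats.shapiro`) would be needed for it.

```python
import math


def linear_fit(x, y):
    """Least-squares fit y = a*x + b with R^2, Pearson r and F statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    r = sxy / math.sqrt(sxx * syy)
    r2 = r * r
    f = r2 / (1.0 - r2) * (n - 2) if r2 < 1.0 else math.inf
    return {"a": a, "b": b, "r2": r2, "pearson": r, "F": f}


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return linear_fit(ranks(x), ranks(y))["pearson"]


def has_linear_dependence(x, y, r2_min=0.5, pearson_min=0.7):
    """Apply the R^2 and Pearson thresholds used in the text."""
    fit = linear_fit(x, y)
    return fit["r2"] >= r2_min and abs(fit["pearson"]) > pearson_min


if __name__ == "__main__":
    employees = [1, 2, 3, 4, 5]             # toy data, not from the survey
    deliveries = [2.1, 3.9, 6.2, 7.8, 10.1]  # toy data, not from the survey
    print(linear_fit(employees, deliveries))
    print(spearman(employees, deliveries))
    print(has_linear_dependence(employees, deliveries))
```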

The categories for which there is a relationship between the average number of weekly deliveries and the number of employees are: Category 3 (Chemical industry), 4 (Manufacturing industry, B2B goods), 5 (Manufacturing industry, B2C goods) and 18 (HoReCa: hotels, restaurants, cafés). Category 10 (Supermarkets) is the only one that presented both a high Pearson coefficient and a high Spearman coefficient. In this case, a nonlinear (quadratic) regression was also tested. The results of the two regressions were then compared with the ANOVA test, which showed that linear regression is the one that best describes the relationship between the average number of weekly deliveries and the number of employees for supermarkets. We have also tested polynomial regression for categories 1 (Agriculture), 21 (Furniture retail) and 27 (Non-tertiary offices), but the resulting R^2 value is far lower than 0,5, so we assumed that there are no polynomial dependencies between the two variables for these three categories.

At this point, we can say that for distributions that are not well represented by a constant, or by a regression

(linear or not), a different estimation method is needed.

3.2. The probability distribution for freight trip generation

For each category, we identified a probability distribution able to approximate the distribution of the delivery data, so that we can consider not only the most frequent values but also the variability of demand that seems to characterize the average number of deliveries per week in urban centers. To do this, we used the EasyFit software, a tool for fitting distributions to input data (in our case, the average number of weekly deliveries per single point of sale, per category).
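For readers without access to EasyFit, the general idea of fitting several candidate distributions and keeping the best one can be illustrated with the standard library alone. This is not the procedure implemented by EasyFit: the sketch is restricted to three candidates whose maximum-likelihood estimates have closed forms (Exponential, Lognormal, Normal), it ranks them by the Kolmogorov-Smirnov distance as an assumed criterion, and it requires strictly positive data because of the logarithm. All function names are hypothetical.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """CDF of the Normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_candidates(data):
    """Closed-form maximum-likelihood fits; data must be strictly positive."""
    logs = [math.log(v) for v in data]
    mean = statistics.fmean(data)
    mu, sigma = mean, statistics.pstdev(data)
    mu_log, sigma_log = statistics.fmean(logs), statistics.pstdev(logs)
    return {
        "Exponential": ({"lambda": 1.0 / mean},
                        lambda x: 1.0 - math.exp(-x / mean)),
        "Lognormal": ({"mu": mu_log, "sigma": sigma_log},
                      lambda x: normal_cdf(math.log(x), mu_log, sigma_log)),
        "Normal": ({"mu": mu, "sigma": sigma},
                   lambda x: normal_cdf(x, mu, sigma)),
    }


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    ordered = sorted(data)
    n = len(ordered)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(ordered)
    )


def best_fit(data):
    """Return (name, parameters, all scores) of the closest candidate."""
    fits = fit_candidates(data)
    scores = {name: ks_statistic(data, cdf) for name, (_, cdf) in fits.items()}
    best = min(scores, key=scores.get)
    return best, fits[best][0], scores


if __name__ == "__main__":
    rng = random.Random(7)
    # Synthetic sample, generated only to exercise the functions.
    sample = [rng.lognormvariate(1.5, 0.7) for _ in range(500)]
    print(best_fit(sample))
```

Gamma, Weibull and Burr fits, which appear in Table 3, need iterative estimation and are therefore outside the scope of this sketch.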

TABLE 3 Distributions fitting results

| N. | Category | Distribution | Parameters |
|---|---|---|---|
| 1 | AGRICULTURE | Weibull | α=0,79918 β=1,3039 |
| 2 | ARTISANS | Gamma | α=1,1847 β=2,2554 |
| 3 | IND CHIMIQUE | Lognormal | σ=1,3941 μ=1,5996 |
| 4 | IND BIENS PROD ET INTERM | Burr | k=4,979 α=1,0766 β=19,712 |
| 5 | IND BIENS CONSOMMATION | Burr | k=2,7957 α=1,2399 β=7,4653 |
| 6 | TRANSPORT | Gamma | α=0,79848 β=3,1054 |
| 7 | COM GROS PRODUC INTERMED | Gamma | α=0,95355 β=6,082 |
| 8 | COM GROS CONSO NON ALIM | Lognormal | σ=1,2977 μ=1,1959 |
| 9 | COM GROS BIENS CONS ALIM | Weibull | α=0,94319 β=9,7551 |
| 10 | SUPERMARCHES | Exponential | λ=0,02175 |
| 14 | COM DETAIL HAB,CHAUS,CUIR | Gamma | α=1,6536 β=1,5489 |
| 15 | BOUCHERIE | Lognormal | σ=0,73738 μ=1,516 |
| 16 | EPICERIE,ALIM | Gamma | α=1,8445 β=2,8525 |
| 17 | BOULANGERIES,PATISSERIES | Burr | k=0,56442 α=2,6012 β=2,5434 |
| 18 | CAFES,HOTELS,RESTAURANTS | Lognormal | σ=0,96563 μ=1,0974 |
| 19 | PHARMACIES | Normal | σ=12,821 μ=26,133 |
| 20 | QUINCAILLERIES | Weibull | α=1,2501 β=3,4116 |
| 21 | COM D' AMEUBLEMENT | Burr | k=2,0009 α=1,3589 β=5,0667 |
| 22 | LIBRAIRIE-PAPETERIE | Weibull | α=1,2112 β=10,334 |
| 23 | AUTRE COMM DETAIL | Lognormal | σ=1,0449 μ=1,2913 |
| 24 | SERVICES DIVERS | Lognormal | σ=1,3302 μ=1,3189 |
| 25 | TERTIAIRE PUR | Weibull | α=0,91997 β=2,1655 |
| 26 | TERTIAIRE AUTRE | Weibull | α=1,009 β=1,5379 |
| 27 | BUREAUX NON TERTIAIRE | Weibull | α=0,72468 β=2,6404 |
| 28 | ENTREPOTS | Lognormal | σ=1,4097 μ=2,0695 |
| 29 | COM NON SEDENTAIRES | Lognormal | σ=1,2284 μ=0,6825 |
| 34 | IND LOURDE (CONSTRUCTION) | Lognormal | σ=1,1576 μ=1,0569 |

Again, we can observe that there are some categories for which we get "abnormal" parameters, as in the case of categories 4 (Manufacturing industry, B2B goods), 5 (Manufacturing industry, B2C goods), 7 (Wholesale trade for B2B), 9 (Wholesale trade of food consumer goods) and 22 (Bookshops). These abnormal values, which are also visible in the descriptive statistics of each category in Table 1, are due to the fact that these categories are very heterogeneous. For example, category 9 includes fresh, frozen and canned foods, and each of these types of food requires different types of transportation and storage. So, for these categories it is better to find another estimation method, such as the random generation of values between the first and third quartiles.
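The fallback suggested for heterogeneous categories can be sketched as a draw between the first and third quartiles. The text does not specify the law of the draw, so the uniform distribution used here is an assumption, as are the function and variable names. The quartiles are those listed in Table 1 for categories 9 and 22.

```python
import random

# First and third quartiles of weekly deliveries, copied from Table 1.
QUARTILES = {
    9: (3.0, 15.0),   # COM GROS BIENS CONS ALIM
    22: (3.3, 14.1),  # LIBRAIRIE-PAPETERIE
}


def sample_between_quartiles(category, rng):
    """Uniform draw (assumed law) between Q1 and Q3 of the category."""
    q1, q3 = QUARTILES[category]
    return rng.uniform(q1, q3)


if __name__ == "__main__":
    rng = random.Random(3)
    print([round(sample_between_quartiles(9, rng), 1) for _ in range(5)])
```

By construction this ignores the tails of the data, so it should be read as a conservative placeholder rather than a fitted model.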

4. Conclusion

This paper proposes a first approach to selecting the probability distribution related to FTG rates in urban logistics, mainly focusing on retailing and service activities. It proposes a methodology to test and compare different probability distributions and to select the most relevant one for each retailer/service category on the basis of statistical indicators. The methodology is applied to a dataset extracted from a detailed establishment survey in France, from which the statistical distribution characteristics of each category can be studied. Results show that the normal approximation is not the most suitable for most categories, although the lognormal distribution is suitable for many of them. Each category results in a different probability distribution, and the parameters that characterize it are defined. When only small quantities of data are available (and, as shown in Gonzalez-Feliu et al., 2016, some categories have small samples that do not justify the Normal approximation), the proposed work makes it possible to feed simulation-based approaches (mainly Monte Carlo based), which can be a valid alternative to deterministic generation. Moreover, for other uses (mainly dynamic ones), those models give a more realistic view of data variation and dynamics. However, these results are preliminary and need to be completed, mainly with a pertinent simulation to assess them; they then need to be related to transport flow construction and scenario construction.

5. References

1. Aditjandra, P. T., Galatioto, F., Bell, M. C., and Zunder, T. H. (2016). Evaluating the impacts of

urban freight traffic: application of micro-simulation at a large establishment. European Journal of

Transport & Infrastructure Research, 16(1), 4-22.

2. Alho, A., and de Abreu e Silva, J. (2014). Freight-Trip Generation Model: Predicting Urban Freight

Weekly Parking Demand from Retail Establishment Characteristics. Transportation Research

Record: Journal of the Transportation Research Board, (2411), 45-54.

3. Dawson, R. (2011). How Significant Is A Boxplot Outlier? Journal of Statistics Education, 19(2).

Retrieved from www.amstat.org/publications/jse/v19n2/dawson.pdf

4. Deflorio, F.P., Gonzalez-Feliu, J., Perboli, G., and Tadei, R. (2012). The influence of time windows

of urban freight distribution services costs in City Logistics applications, European Journal of

Transport and Infrastructure Research, vol. 12, n. 3, pp. 256-274.

5. Faure, L., Burlat, P., and Marquès, G. (2016). Evaluate the viability of Urban Consolidation Centre

with regards to urban morphology. Transportation Research Procedia, 12, 348-356.

6. Fite, J. T. et al. (2002) ‘Forecasting freight demand using economic indices’, International Journal

of Physical Distribution & Logistics Management, 32(4), pp. 299–308. doi:

10.1108/09600030210430660.

7. Golini, R., Guerlain, C., Lagorio, A. & Pinto, R. (2018). An assessment framework to support

collective decision-making on urban freight transport. Transport, accepted for publication.

8. Gonzalez-Feliu, J., Peris-Pla, C. (2017), Impacts of retailing attractiveness on freight and shopping

trip attraction rates, Research in Transportation Business and Management, vol. 24, pp. 49-58, doi:

10.1016/j.rtbm.2017.07.004.

9. Gonzalez-Feliu, J., Cedillo-Campos, M.G., Garcia-Alcaraz, J.L. (2014a), An emission model as an

alternative to O-D matrix in urban goods transport modelling, Dyna. Journal of the Facultad de

Minas, Universidad Nacional de Colombia – sede Medellin, vol. 81, n. 187, pp. 249-256.

10. Gonzalez-Feliu, J., Toilier, F., Ambrosini, C., Routhier, J.L. (2014b), Estimated data production for

urban goods transport diagnosis. The Freturb methodology. In Gonzalez-Feliu J., Semet, F.,

Routhier, J.L. (eds.), Sustainable urban logistics: concepts, methods and information systems,

Springer, Heidelberg, pp. 113-144.

11. Gonzalez-Feliu, J., Sanchez-Diaz, I., Ambrosini, C. (2016), Aggregation level, variability and linear

hypotheses for urban delivery generation models, 95th Transportation Research Board Annual

Symposium Compendium of Proceedings, Washington, January 10-14, 2016, TRB, Washington.

12. Holguín-Veras, J., Jaller, M., Destro, L., Ban, X., Lawson, C., & Levinson, H. (2011). Freight

generation, freight trip generation, and perils of using constant trip rates. Transportation Research

Record: Journal of the Transportation Research Board, (2224), 68-81.

13. Holguín-Veras, J., Jaller, M., Sánchez-Díaz, I., Wojtowicz, J., Campbell, S., Levinson, H., ... &

Tavasszy, L. (2012). NCHRP Report 739/NCFRP Report 19: freight trip generation and land use.

Washington DC: Transportation Research Board of the National Academies.

14. Holguín-Veras, J., Sánchez-Díaz, I., Lawson, C., Jaller, M., Campbell, S., Levinson, H., & Shin, H.

S. (2013). Transferability of freight trip generation models. Transportation Research Record:

Journal of the Transportation Research Board, (2379), 1-8.

15. Jaller, M., Sanchez-Diaz, I., Holguin-Veras, J. and Lawson, C.T. (2014). Area Based Freight Trip Generation Models, Transportation Research Board 93rd Annual Meeting, n° 14-4908.

16. Kolassa, S. (2016). Evaluating predictive count data distributions in retail sales forecasting.

International Journal of Forecasting, 32(3), 788–803.

https://doi.org/10.1016/J.IJFORECAST.2015.12.004

17. Lagorio, A., Pinto, R. and Golini, R. (2016) ‘Research in urban logistics: a systematic literature

review’, International Journal of Physical Distribution & Logistics Management, 46(10), pp. 908–

931. doi: 10.1108/IJPDLM-01-2016-0008.

18. Lopez, C., Gonzalez-Feliu, J., Chiabaut, N., Leclercq, L. (2016), Assessing the impacts of goods

deliveries’ double line parking on the overall traffic under realistic conditions. In Proceedings of the

6th International Conference in Information Systems, Logistics and Supply Chain, Bordeaux, France,

June 1-4, 2016, Kedge Business School, Bordeaux, ISBN 978-2-9539787-3-5.

19. Petrik, O., Moura, F., & Silva, J. de A. e. (2016). Measuring uncertainty in discrete choice travel

demand forecasting models. Transportation Planning and Technology, 39(2), 218–237.

https://doi.org/10.1080/03081060.2015.1127542

20. Sanchez-Diaz, I., Gonzalez-Feliu, J., Ambrosini, C. (2016a), Assessing the implications of

aggregating data by activity-based categories for urban freight trip generation modeling. In

Proceedings of the 6th International Conference in Information Systems, Logistics and Supply Chain,

Bordeaux, France, June 1-4, 2016, Kedge Business School, Bordeaux, ISBN 978-2-9539787-3-5.

21. Sánchez-Díaz, I., Holguín-Veras, J., & Wang, X. (2016b). An exploratory analysis of spatial effects

on freight trip attraction. Transportation, 43(1), 177-196.

22. Yang, Y. (2015) ‘Development of the regional freight transportation demand prediction models

based on the regression analysis methods’, Neurocomputing. Elsevier, 158, pp. 42–47. doi:

10.1016/J.NEUCOM.2015.01.069.