All content in this area was uploaded by Muhammad Munum Masud on Aug 31, 2021
Estimation of Weigh-In-Motion System Accuracy from Axle Load Spectra Data
M. Munum Masud, S.M.ASCE [1], and Syed Waqar Haider, Ph.D., P.E., M.ASCE [2]
[1] Graduate Research Assistant, Department of Civil and Environmental Engineering, Michigan State University, 3546 Engineering Building, East Lansing, MI 48824; email: masudmuh@egr.msu.edu
[2] Associate Professor, Department of Civil and Environmental Engineering, Michigan State University, 3546 Engineering Building, East Lansing, MI 48824; PH (517) 353-9782; FAX (517) 432-1827; email: syedwaqa@egr.msu.edu
ABSTRACT
Inaccurate weigh-in-motion (WIM) data may result in significant over- or underestimation of the
pavement performance period, leading to over- or under-designed pavements. Therefore, the data
collected at WIM systems must be accurate and consistent. The paper presents an approach to
estimate WIM system accuracy based on axle load spectra attributes [Normalized Axle Load
Spectra (NALS) shape factors]. This alternative approach to assess WIM system accuracy is
needed to characterize temporal changes in WIM data consistency. The WIM error data collected
before and after calibration were related to NALS shape factors for Class 9 vehicles.
The main objective of this analysis is to estimate WIM system errors from axle loading data
without physically performing equipment calibration. This approach can help highway agencies
select optimum timings for routine maintenance and calibration of WIM equipment without
compromising its accuracy. The results show that the WIM accuracy for the tandem axle (TA)
can be estimated with TA NALS shape factors with an acceptable degree of error for bending
plate (BP) and quartz piezo (QP) sensors. Further, the results obtained using different statistical
methods for model development and validation show reasonable goodness of fit. The use of
NALS to estimate the TA WIM accuracy can save a significant amount of time and resources,
which are usually spent on equipment calibrations every year.
INTRODUCTION
Highway agencies collect WIM data for many reasons, including highway planning, pavement
and bridge design, freight movement studies, motor vehicle enforcement, and regulatory studies.
The new mechanistic-empirical pavement design guide (Pavement-ME) also requires WIM data
for predicting pavement distresses. Inappropriate WIM data may significantly over- or
underestimate the pavement performance period, leading to premature failure. Therefore, the
data collected at WIM systems must be accurate and consistent (Papagiannakis et al. 2001). The
pavement damage caused by one heavy vehicle is equivalent to that caused by tens of thousands of
passenger cars. Also, one overloaded heavy vehicle causes considerably more fatigue damage to the
pavement than a properly loaded one (Burnos et al. 2018). Weighing vehicles in motion is intended
to estimate static truck weight by axle [single (SA) or tandem (TA) axles] and/or by total vehicle
(gross vehicle weight, GVW) as vehicles drive over sensors installed in a roadway or under a bridge.
Several WIM technologies exist to capture the applied forces and predict static weight. The quality
and accuracy of the data largely depend on the
characteristics of the WIM equipment, calibration/validation technician expertise, and data
reporting (ASTM 2009).
The accuracy of WIM systems is a primary concern for their manufacturers and users.
Users require different levels of accuracy depending on the intended application (Jacob 2000).
WIM systems go out of calibration, and their accuracy deteriorates over time due to many
factors (Haider et al. 2020). These factors may include changes in measurement conditions (e.g.,
temperature and speed), pavement deflection, roughness caused by distresses, and fatigue of
WIM sensors. Other studies also reported that, regardless of the WIM system calibration, WIM
accuracy can deteriorate over time due to these factors (Burnos et al. 2018; Papagiannakis et
al. 2001).
OBJECTIVES
This paper addresses one core issue related to traffic loadings: how to obtain accurate and
reliable WIM data. The primary objectives of the paper are to provide (a) a review of
high-quality LTPP WIM data, (b) the relationship between WIM accuracy and NALS shape factors,
and (c) a statistical analysis to develop a predictive model for WIM accuracy. These objectives
were accomplished by synthesizing and analyzing the WIM and loading data available in the
Long-Term Pavement Performance (LTPP) database.
WIM ACCURACY PROTOCOLS
American Society for Testing and Materials International Standard ASTM E1318-09 (updated in
2017) is the broadly recognized WIM measurement protocol used within the United States (US)
(Haider and Masud 2020). For the current paper, ASTM Type I is considered the baseline
accuracy for WIM measurements after calibration, with tolerance limits for 95% compliance of
±10%, ±15%, and ±20% for GVW, TA, and SA, respectively. The WIM system accuracy can
be measured in terms of the relative difference between WIM and static weights. The relative
WIM error can be expressed by Equation (1). This relative error is commonly referred to as
measurement error for a WIM scale. Further, this accuracy will vary for different types of WIM
sensor technologies. For a well-calibrated WIM system, the measurement error typically follows
a normal distribution with zero mean (no bias) and standard deviation σ (Haider and
Harichandran 2007), as shown in Equation (1):
$$\frac{X' - X}{X} \sim N\left(0, \sigma^{2}\right) \qquad (1)$$
Where
$X'$ = load measured on a WIM scale for an axle configuration
$X$ = load measured on a static scale for the same axle configuration
$\sigma$ = standard deviation (SD) characterizing the accuracy of the WIM scale
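As a rough illustration of Equation (1), the sketch below computes relative WIM errors for a few paired WIM and static tandem-axle weights and then summarizes the bias, the SD, and the share of errors within the ±15% ASTM Type I tandem-axle tolerance. The weights and function names are hypothetical and are not taken from the LTPP data.

```python
from statistics import mean, stdev

def relative_errors(wim_loads, static_loads):
    """Relative WIM error (X' - X) / X for paired measurements, in percent."""
    return [100.0 * (w - s) / s for w, s in zip(wim_loads, static_loads)]

def summarize(errors, tolerance_pct):
    """Return bias (mean error), SD, and the fraction of errors within tolerance."""
    within = sum(1 for e in errors if abs(e) <= tolerance_pct) / len(errors)
    return mean(errors), stdev(errors), within

# Hypothetical tandem-axle weights in lb (illustrative values only).
static_ta = [30000, 32000, 28000, 34000, 31000]
wim_ta = [30600, 31360, 28840, 33320, 31930]

errors = relative_errors(wim_ta, static_ta)
bias, sd, within = summarize(errors, tolerance_pct=15.0)
print(f"bias = {bias:.2f}%, SD = {sd:.2f}%, within tolerance = {within:.0%}")
```

A real check against the 95% compliance requirement would use many more test-truck passes than this five-point example.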
DATABASE DEVELOPMENT AND DATA SELECTION CRITERIA
The required data were obtained from the LTPP database standard release 33.0 (July 2019). All
LTPP WIM sites were assigned with a unique ID by combining state code and WIM ID (Haider
and Masud 2020; Haider et al. 2020; Masud 2018). Only LTPP research quality WIM sites were
considered based on WIM data accuracy and consistency evaluated from calibration records.
The data used for the analysis were obtained from LTPP research quality data (RQD) WIM sites
equipped with QP and BP sensors. These sites represent the highest quality WIM data sets due to
more stringent LTPP WIM calibration protocol and daily WIM data review. These sites had
detailed WIM measurement accuracy data collected before and after each calibration event that
allowed the development of computational models. For the model validations, the additional data
from the Michigan Department of Transportation (MDOT) for the QP sites were used. Table 1
presents the summary of available WIM sites and records considered for the model development
and validation. It can be noted that the majority of the WIM accuracy data are available for the
sites located in a wet climate.
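Table 1 Distribution of WIM sites and records by sensor, climate, and pavement type

| Sensor | Pavement | Development: Dry | Development: Wet | Development: Total | Validation: Dry | Validation: Wet | Validation: Total |
|--------|----------|------------------|------------------|--------------------|-----------------|-----------------|-------------------|
| QP | AC | 2 (5) | 9 (25) | 11 (30) | - | 6 (9) | 6 (9) |
| QP | PCC | 1 (3) | 1 (4) | 2 (7) | - | 10 (14) | 10 (14) |
| QP | Total | 3 (8) | 10 (29) | 13 (37) | - | 16 (23) | 16 (23) |
| BP | AC | - | - | - | - | - | - |
| BP | PCC | 4 (11) | 7 (22) | 11 (33) | - | - | - |
| BP | Total | 4 (11) | 7 (22) | 11 (33) | - | - | - |

Note: Each entry gives the number of WIM sites, followed in parentheses by the number of WIM records (one record each for pre- and post-calibration).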
CONSISTENCY OF WIM MEASUREMENT ERROR USING AXLE LOAD SPECTRA
As highlighted previously (Gajda et al. 2007; Haider and Harichandran 2007; Haider et al. 2020),
some sensors may yield good results at the time of calibration, i.e., low errors based on GVW,
SA, and TA. However, the WIM data quality (increase in bias and SD) may deteriorate over time
due to various factors. The daily axle loading data available in the LTPP database is an excellent
source for assessing the consistency in WIM data for different sensor types. The DD-AX table in
the LTPP database contains axle load data by site, year, month, day of the month, day of the
week (DOW), lane, direction, vehicle class, axle group, and load bin. This table was created by
accumulating the axle repetitions by vehicle class in a calendar day. The data are grouped in
1,000-lb bins for single axles, 2,000-lb bins for tandem axles, and 3,000-lb bins for tridems and
quads. For this paper, the NALS for tandem axles of class 9 trucks were developed for the
available WIM sites equipped with BP and QP sensors. The NALS for the following periods
were considered:
• NALS based on 30 days of WIM data collected before a calibration event.
• NALS based on 30 days of WIM data collected after a successful calibration event.
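To make the normalization step concrete, the following minimal sketch converts axle counts per 2,000-lb bin into a NALS expressed in percent. The counts are hypothetical, and the function name is an assumption introduced for illustration; the paper does not describe its own implementation.

```python
def normalized_axle_load_spectrum(bin_counts):
    """Convert axle counts per load bin into percentages that sum to 100."""
    total = sum(bin_counts)
    if total == 0:
        raise ValueError("no axle counts in the selected period")
    return [100.0 * count / total for count in bin_counts]

BIN_WIDTH_LB = 2000  # tandem-axle bin width stated in the text

# Hypothetical 30-day Class 9 tandem-axle counts, one entry per 2,000-lb bin
# starting at 0 lb (illustrative values only).
counts_pre = [0, 2, 10, 30, 48, 40, 20, 10, 10, 30]

nals_pre = normalized_axle_load_spectrum(counts_pre)
for i, pct in enumerate(nals_pre):
    lower = i * BIN_WIDTH_LB
    print(f"{lower:>6}-{lower + BIN_WIDTH_LB - 1:<6} {pct:5.1f}%")
```

The same function would be applied separately to the 30 days before and the 30 days after a calibration event so that the two spectra can be compared.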
For the tandem axle, two peak loads are typically observed in a NALS. Figure 1 shows
examples of pre- and post-calibration NALS data for four WIM sites with positive, negative, or
negligible bias. Haider and Harichandran (2007) considered a mixture of statistical distributions
to characterize the predominantly bimodal axle load spectra. They showed that two or
more normal probability density functions (PDFs) can be added with appropriate weight
factors to obtain the PDF of the combined distribution, as shown by Equation (2):
$$f^{*} = \sum_{i=1}^{n} p_i f_i \qquad (2)$$
Where $f^{*}$ = PDF of the combined distribution, $p_i$ = proportions (weight factors) for each normal
PDF, and $f_i$ = PDFs for each normal distribution.
For a bimodal mixed normal distribution containing two normal PDFs, the two weight factors
are complementary (i.e., p2 = 1 – p1), as shown in Figure 2. Haider and Harichandran (2007)
determined that the bimodal shape of axle spectra could be effectively captured by using a
combination of two normal distributions:
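$$f^{*}(x; p_1, \mu_1, \sigma_1, \mu_2, \sigma_2) = p_1 \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}} + (1 - p_1)\, \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}} \qquad (3)$$
Where $\mu_1$ = the average of empty or partially loaded axle loads, $\sigma_1$ = the standard deviation of
empty or partially loaded axle loads, $\mu_2$ = the average of fully loaded axle loads, and $\sigma_2$ = the
standard deviation of fully loaded axle loads.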
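The bimodal mixture of two normal PDFs can be evaluated numerically as sketched below. The parameter values are hypothetical and chosen only to resemble a tandem axle spectrum with an unloaded and a loaded peak; the Riemann sum is a simple check that the mixture integrates to approximately one over the 0 to 60,000 lb range.

```python
import math

def normal_pdf(x, mu, sigma):
    """PDF of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def bimodal_pdf(x, p1, mu1, sigma1, mu2, sigma2):
    """Mixture of two normal PDFs with complementary weights p1 and 1 - p1."""
    return p1 * normal_pdf(x, mu1, sigma1) + (1.0 - p1) * normal_pdf(x, mu2, sigma2)

# Hypothetical parameters in lb (illustrative values only).
params = dict(p1=0.4, mu1=12000.0, sigma1=3000.0, mu2=31000.0, sigma2=3500.0)

# Riemann-sum check of the area under the mixture between 0 and 60,000 lb.
step = 100.0
area = sum(bimodal_pdf(k * step, **params) * step for k in range(601))

# The mean of a mixture is the weighted average of the component means.
mixture_mean = params["p1"] * params["mu1"] + (1.0 - params["p1"]) * params["mu2"]
print(f"area = {area:.4f}, mixture mean = {mixture_mean:.0f} lb")
```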
Figure 1 Tandem axle load spectra example for BP and QP WIM sites: (a) QP sensor with 12.7% positive bias, 53-0200 (2007); (b) QP sensor with 0.90% negative bias, 42-0600 (2008); (c) BP sensor with 6.3% negative bias, 17-0600 (2014); (d) BP sensor with 1.2% negative bias, 20-0200 (2006). Each panel plots Class 9 vehicles (%) against weight (lb) in 2,000-lb bins from 0 to 59,999 lb, with separate pre-calibration and post-calibration series.
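Figure 2 Tandem axle load spectra modeling using bimodal mixed normal distributions (relative frequency, %, versus axle load, kN; the mixture distribution f* = p1 f1 + p2 f2 combines a mode for empty or partially loaded trucks or axles and a mode for loaded trucks or axles)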
Initially, bimodal mixed normal distributions were fitted to obtain the TA shape factors. However,
the normal fit consistently underestimated the mean and SD of the TA NALS; therefore, a
log-normal distribution was fitted to obtain the TA NALS shape factors.
TANDEM AXLE NALS SHAPE FACTORS
This section presents the procedure used to relate differences in WIM measurement errors,
calculated based on pre- and post-calibration data, to the differences in NALS shape factors.
Table 2 presents the tandem axle NALS shape factors considered for the analyses, each based on
30 days of weight data collected before and after the calibration event.
Table 2 TA NALS shape factors

| Data | Tandem axle shape factors |
|------|---------------------------|
| Based on 30 days of weight data collected before and after the calibration event | Unloaded peak (TAPL1) |
| | Loaded peak (TAPL2) |
| | Overall mean of the TA distribution (TAOAM) |
| | TA NALS mean of the loaded axles, i.e., axles weighing >26,000 lb (TAmean>26,000) |
| | Ratio (pre/post) of the means of the second peaks of the TA NALS |
Equations (4) to (7) were used to obtain the differences in TA NALS shape factors, which were
used as potential predictors to estimate changes in TA bias. The ratio (pre/post) of the TA loaded
peaks was also obtained from the TA NALS. Equation (8) was used to calculate the difference in
TA bias from the WIM calibration data. The TA bias difference was used as the dependent variable
for model development.
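$$\mathrm{TAdiffM1} = \mathrm{TAPL1}_{(\mathrm{Pre})} - \mathrm{TAPL1}_{(\mathrm{Post})} \qquad (4)$$
where: TAdiffM1 = TA unloaded peak difference
$$\mathrm{TAdiffM2} = \mathrm{TAPL2}_{(\mathrm{Pre})} - \mathrm{TAPL2}_{(\mathrm{Post})} \qquad (5)$$
where: TAdiffM2 = TA loaded peak difference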
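$$\mathrm{TAdiffMean}_{>26{,}000} = \mathrm{TAmean}_{>26{,}000}^{\mathrm{Pre}} - \mathrm{TAmean}_{>26{,}000}^{\mathrm{Post}} \qquad (6)$$
where: TAdiffMean>26,000 = TA mean difference of the bins >26,000 lb
$$\mathrm{TAdiffOAM} = \mathrm{TAOAM}^{\mathrm{Pre}} - \mathrm{TAOAM}^{\mathrm{Post}} \qquad (7)$$
where: TAdiffOAM = TA overall mean difference
$$\mathrm{TABdiff} = \mathrm{TAbias}^{\mathrm{Pre}} - \mathrm{TAbias}^{\mathrm{Post}} \qquad (8)$$
where: TABdiff = TA bias difference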
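The shape-factor differences can be sketched in code as follows. This is only an illustration under stated assumptions: the means are taken as frequency-weighted averages of bin midpoints, the loaded-axle mean uses bins whose lower edge is at or above 26,000 lb, and the two spectra are hypothetical. The paper does not spell out these computational details.

```python
BIN_WIDTH_LB = 2000          # tandem-axle bin width from the DD-AX table description
LOADED_THRESHOLD_LB = 26000  # loaded-axle cutoff used for TAmean>26,000

def weighted_mean(nals, lower_edge_min=0):
    """Frequency-weighted mean of bin midpoints for bins at or above lower_edge_min.

    nals maps each bin's lower edge (lb) to its relative frequency (%).
    Using bin midpoints is an assumption made for this sketch.
    """
    selected = [(edge + BIN_WIDTH_LB / 2, pct)
                for edge, pct in nals.items() if edge >= lower_edge_min]
    total_pct = sum(pct for _, pct in selected)
    return sum(mid * pct for mid, pct in selected) / total_pct

# Hypothetical pre- and post-calibration TA NALS (illustrative values only).
pre = {10000: 20.0, 12000: 20.0, 28000: 10.0, 30000: 30.0, 32000: 20.0}
post = {10000: 20.0, 12000: 20.0, 28000: 20.0, 30000: 30.0, 32000: 10.0}

# Overall mean difference (TAdiffOAM): pre minus post.
ta_diff_oam = weighted_mean(pre) - weighted_mean(post)

# Loaded-axle mean difference (TAdiffMean>26,000): pre minus post.
ta_diff_mean_26k = (weighted_mean(pre, LOADED_THRESHOLD_LB)
                    - weighted_mean(post, LOADED_THRESHOLD_LB))

print(f"TAdiffOAM = {ta_diff_oam:.1f} lb, TAdiffMean>26,000 = {ta_diff_mean_26k:.1f} lb")
```

In this example the loaded peak sits higher before calibration than after, so both differences are positive.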
STATISTICAL ANALYSES AND RESULTS
The dependent and independent variables presented in Equations (4) to (8) were used to develop
a model that would allow assessing changes in WIM weight measurement errors over time for
TA. Different statistical techniques, including scatter plots, correlation, linear, non-linear, and
multiple regression, were used to identify the most significant variables. A strong correlation was
observed between several TA shape factors and the TA bias difference (see Table 3). The TA shape
factors were also highly correlated with each other. This high correlation among TA shape factors
could lead to multicollinearity. The next section presents the details of the final model
for TA bias (mean error) estimation.
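Table 3 Correlation between TA bias difference and TA NALS shape factors

| Variable | TAdiffM1 | TAdiffM2 | TAM2(Pre/Post) | TAdiffMean>26,000 | TAdiffOAM | TABdiff |
|----------|----------|----------|----------------|-------------------|-----------|---------|
| TAdiffM1 | 1 | | | | | |
| TAdiffM2 | 0.22 | 1 | | | | |
| TAM2(Pre/Post) | 0.21 | 1.00 | 1 | | | |
| TAdiffMean>26,000 | 0.17 | 0.86 | 0.85 | 1 | | |
| TAdiffOAM | 0.55 | 0.57 | 0.57 | 0.61 | 1 | |
| TABdiff | 0.22 | 0.75 | 0.74 | 0.89 | 0.60 | 1 |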
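The multicollinearity concern can be screened directly from the correlations in Table 3, as in the sketch below. The coefficients are copied from the table; the 0.8 screening cutoff is an assumption introduced here for illustration and is not a threshold stated in the paper.

```python
predictors = ["TAdiffM1", "TAdiffM2", "TAM2(Pre/Post)", "TAdiffMean>26,000", "TAdiffOAM"]

# Lower-triangular correlations among the predictors, copied from Table 3.
lower = [
    [1.00],
    [0.22, 1.00],
    [0.21, 1.00, 1.00],
    [0.17, 0.86, 0.85, 1.00],
    [0.55, 0.57, 0.57, 0.61, 1.00],
]

# Correlation of each predictor with the dependent variable TABdiff (Table 3, last row).
with_tabdiff = [0.22, 0.75, 0.74, 0.89, 0.60]

THRESHOLD = 0.8  # assumed screening cutoff for flagging collinear predictor pairs

# Collect predictor pairs whose mutual correlation meets the cutoff.
flagged = [
    (predictors[i], predictors[j], lower[i][j])
    for i in range(len(lower))
    for j in range(i)
    if abs(lower[i][j]) >= THRESHOLD
]

# Predictor with the strongest correlation to TABdiff.
best = predictors[with_tabdiff.index(max(with_tabdiff))]

for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.2f}")
print("Strongest single predictor of TABdiff:", best)
```

Under this cutoff, the loaded-peak variables are flagged as mutually collinear, which is consistent with retaining only one of them in the final model.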
Model for Estimating Bias in TA Weight Measurement
Equation (9) shows the final model developed for QP and BP sensors. The sensor type was also
considered as an independent variable, but it was not significant. The coefficient of
determination for the TA bias model is 0.8, showing that the independent variable can explain
80% of the dependent variable variance. Figure 3(a) shows the goodness-of-fit for the TA bias
model. This graph compares the model predicted and observed TA bias values for all the
available data for the QP and BP sensors.
$$\mathrm{TABdiff} = 0.0041 \times \mathrm{TAdiffMean}_{>26{,}000}, \qquad R^{2} = 0.80 \qquad (9)$$
Overall, the TA model fits the observed data well. The significant term, i.e., the difference
between the pre- and post-calibration TA mean >26,000 (TAdiffMean>26,000), can be used as a
predictor for assessing and quantifying TA bias changes in WIM systems. This parameter is based
on the mean load of tandem axles weighing more than 26,000 lb. In a bimodal tandem axle load
distribution, these are the loads in load bins greater than 26,000 lb. The model can be
improved further by adding more data in the future. The model should be used in
combination with a visual inspection of shifts in the location of the loaded TA peaks.
This analytical approach can help estimate changes in WIM measurement accuracy and
identify WIM calibration needs without field validation of WIM equipment performance
using calibration trucks, which can save a significant amount of time and resources.
Figure 3 Goodness-of-fit, validation, and simulations for TA bias model: (a) goodness of fit; (b) model validation; (c) model simulations
Validation of the Model
The WIM performance and axle loading data from pre- and post-calibration events were
obtained from MDOT and used for model validation. Figure 3(b) shows the goodness of
fit for the TA bias prediction model using the validation data. The TA bias predictions for the
validation data are reasonably accurate (R² = 0.82). These data were not used during
model development, and the prediction errors are reasonable given that the two data sets
reflect different loading patterns and conditions.
Finally, the TAdiffMean>26,000 data were simulated within the observed range to study the
model's sensitivity. Figure 3(c) shows the sensitivity of the model to the independent variable.
The model shows that when TAdiffMean>26,000 for Class 9 trucks exceeds approximately
1,250 lb, the TA bias difference exceeds 5%, indicating that the equipment would
require calibration.
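The sensitivity result follows directly from the slope in Equation (9), as the short sketch below shows. The function names and the use of an absolute value for the calibration check are assumptions made for illustration.

```python
SLOPE_PCT_PER_LB = 0.0041  # slope of the TA bias model in Equation (9)

def predict_ta_bias_diff(ta_diff_mean_26k_lb):
    """Predicted TA bias difference (%) from the loaded-axle mean difference (lb)."""
    return SLOPE_PCT_PER_LB * ta_diff_mean_26k_lb

def needs_calibration(ta_diff_mean_26k_lb, threshold_pct=5.0):
    """Flag the site when the predicted bias difference reaches the threshold."""
    return abs(predict_ta_bias_diff(ta_diff_mean_26k_lb)) >= threshold_pct

# Shift in the loaded-axle mean at which the prediction reaches exactly 5%.
breakeven_lb = 5.0 / SLOPE_PCT_PER_LB

for shift_lb in (500, 1000, 1250, 1500):
    print(shift_lb, round(predict_ta_bias_diff(shift_lb), 2), needs_calibration(shift_lb))
print(f"5% is reached at about {breakeven_lb:.0f} lb")
```

The break-even shift of about 1,220 lb is consistent with the approximately 1,250 lb read from Figure 3(c).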
Binary Logistic Regression Model
The continuous response variable (absolute difference between pre and post bias for the tandem
axle) was converted into a binary response to perform binary logistic regression by defining
failure at a 5% threshold:
• Absolute difference between pre and post bias for the tandem axle <5%: coded as 0
• Absolute difference between pre and post bias for the tandem axle ≥5%: coded as 1 (failure of
equipment, i.e., it requires calibration)
The binary logistic regression was performed with all predictors (TA shape factors).
The predictor that was significant in the binary logistic regression is the same one identified
in Equation (9) in the previous section. The p-value (0.00001) for the Wald test showed
that the logistic regression is significant at α = 0.05. The p-values for the goodness-of-fit tests are
higher than the chosen significance level (0.05), indicating that there is not enough evidence that the
predicted probabilities deviate from the observed probabilities in a way that the binomial
distribution does not predict. The odds ratio (1.0039) and its 95% CI (1.0020, 1.0058) showed
that the TA shape factor (TAdiffMean>26,000) is a significant predictor, since the interval excludes
one. An odds ratio greater than one is acceptable, but the higher, the better (Yoo and Kim 2016).
Equation (10) shows the binary logistic regression model.
$$P[(\mathrm{TAbias\ Diff}) \ge 5\%] = \frac{\exp(Y')}{1 + \exp(Y')}$$
$$Y' = -5.05 + 0.003916 \times \mathrm{TAdiffMean}_{>26{,}000}$$
$$R^{2} = 0.52 \qquad (10)$$
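The logistic model can be evaluated as sketched below. The coefficient magnitudes come from Equation (10); the negative sign on the intercept is an assumption, because the operators are not legible in the extracted equation. With that sign, the probability crosses 50% near 1,290 lb, which agrees with the thresholds discussed in the text, and the exponentiated slope reproduces the reported odds ratio of 1.0039.

```python
import math

INTERCEPT = -5.05        # magnitude from Equation (10); negative sign is assumed
SLOPE_PER_LB = 0.003916  # coefficient of TAdiffMean>26,000 in Equation (10)

def prob_bias_diff_at_least_5pct(ta_diff_mean_26k_lb):
    """Probability that the TA bias difference is 5% or more."""
    y = INTERCEPT + SLOPE_PER_LB * ta_diff_mean_26k_lb
    return math.exp(y) / (1.0 + math.exp(y))

# Odds ratio for a 1-lb increase in the predictor.
odds_ratio = math.exp(SLOPE_PER_LB)

# Predictor value at which the probability equals 0.5 (logit equal to zero).
crossover_lb = -INTERCEPT / SLOPE_PER_LB

p_1250 = prob_bias_diff_at_least_5pct(1250)
p_1500 = prob_bias_diff_at_least_5pct(1500)
print(f"odds ratio = {odds_ratio:.4f}")
print(f"P at 1,250 lb = {p_1250:.2f}, P at 1,500 lb = {p_1500:.2f}")
print(f"50% crossover at about {crossover_lb:.0f} lb")
```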
Figure 4 presents the conditional plot for logit(Y) against the independent variable and the receiver
operating characteristic (ROC) curve. The binary fitted line plot in Figure 4(a) requires careful
interpretation. This plot shows the predictions (fits) of the binary
logistic regression model plotted against the continuous independent variable
(TAdiffMean>26,000). The results show that as TAdiffMean>26,000 increases, the
likelihood that the bias difference is 5% or more (equipment out of calibration) increases. When
TAdiffMean>26,000 exceeds approximately 1,250 lb (represented by the green dotted
lines), the response becomes more likely to fall in the failure category (coded 1).
This likelihood increases further when TAdiffMean>26,000 exceeds 1,500 lb
(represented by the orange dotted line).
The average loaded peak for the TA is around 30,000 lb. Based on the previous literature, the
criterion for a significant difference in tandem axle loaded peaks before and after calibration is 5%
(30,000 × 0.05 = 1,500 lb). Therefore, if the tandem axle load spectra before and after calibration
differ by approximately 1,500 lb or more in TAdiffMean>26,000, the WIM equipment will likely
show a bias of 5% or more in tandem axle weights. These conclusions are based on approximations
and interpretation of the plot. The ROC plot supports the same finding [see Figure 4(b)]. The
area under the ROC curve (AUC) is a measure of discrimination; a high AUC suggests that the
model can accurately predict the observed outcomes (Davis and Goadrich 2006). The binary
logistic regression findings agree with the predictions made using the simple linear regression
model.
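For readers unfamiliar with the AUC, the sketch below computes it as the probability that a randomly chosen failure case has a higher score than a randomly chosen non-failure case. The data are hypothetical. Because the fitted probability increases monotonically with TAdiffMean>26,000, the predictor itself can serve as the score without changing the AUC.

```python
def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical TAdiffMean>26,000 values (lb) and binary outcomes
# (1 = bias difference of 5% or more); illustrative values only.
ta_diff_mean_26k = [200, 450, 800, 1100, 1300, 1400, 1700, 2100]
failure = [0, 0, 0, 1, 0, 1, 1, 1]

auc = roc_auc(ta_diff_mean_26k, failure)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to no discrimination, and a value of 1.0 corresponds to perfect separation of the two classes.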
Figure 4 Goodness-of-fit, validation, and simulations for TA bias model: (a) goodness of fit; (b) receiver operating characteristic (ROC) curve
KEY FINDINGS
The following are the key findings based on the analyses of NALS shape factors and WIM
performance parameters:
• The pre and post TA bias differences (TABdiff) can be accurately estimated using
changes in the TA mean value for the loaded (>26,000 lb) Class 9 trucks
(TAdiffMean>26,000), obtained from pre and post TA NALS. When
TAdiffMean>26,000 exceeds 1,250 lb, the TA bias difference exceeds 5%,
indicating the equipment requires calibration.
• The results of binary logistic regression also supported the above finding.
• The results obtained using different statistical methods for model development and
validation show reasonable goodness of fit.
• The models presented here should be used in combination with the visual inspection of
SA and TA peak loads along with the information about seasonal changes in traffic
loading of Class 9 trucks due to land use activities (such as major agricultural harvests,
if any).
• The use of NALS to estimate the TA WIM accuracy can save a significant amount of
time and resources, which are usually spent on equipment calibrations every year.
REFERENCES
ASTM (2009). "Standard Specification for Highway Weigh-In-Motion (WIM) Systems with
User Requirements and Test Methods E 1318-09." 2007 Annual Book of ASTM
Standards. Edited by ASTM Committee E17-52 on Traffic Monitoring. ASTM
International, USA.
Burnos, P., Gajda, J., and Sroka, R. (2018). "Accuracy criteria for evaluation of Weigh-in-
Motion Systems." Metrology and Measurement Systems, 25(4).
Davis, J., and Goadrich, M. (2006). "The relationship between Precision-Recall and ROC curves."
Proc., 23rd International Conference on Machine Learning, 233-240.
Gajda, J., Sroka, R., and Żegleń, T. (2007). "Accuracy analysis of WIM Systems Calibrated
Using Pre-Weighed Vehicles Method." Metrology and Measurement Systems, 14(4), 517-
527.
Haider, S. W., and Harichandran, R. S. (2007). "Relating Axle Load Spectra to Truck Gross
Vehicle Weights and Volumes." ASCE Journal of Transportation Engineering, 133(12),
696-705.
Haider, S. W., and Masud, M. M. "Accuracy Comparisons Between ASTM 1318-09 and COST-
323 (European) WIM Standards Using LTPP WIM Data." Proc., Proceedings of the 9th
International Conference on Maintenance and Rehabilitation of Pavements—Mairepav9,
Springer, 155-165.
Haider, S. W., and Masud, M. M. (2020). "Use of LTPP SMP Data to Quantify Moisture Impacts
on Fatigue Cracking in Flexible Pavements [summary report]." United States. Federal
Highway Administration. Office of Research ….
Haider, S. W., Masud, M. M., and Chatti, K. (2020). "Influence of moisture infiltration on
flexible pavement cracking and optimum timing for surface seals." Canadian Journal of
Civil Engineering, 47(5), 487-497.
Haider, S. W., Masud, M. M., Selezneva, O., and Wolf, D. J. (2020). "Assessment of Factors
Affecting Measurement Accuracy for High-Quality Weigh-in-Motion Sites in the Long-
Term Pavement Performance Database." Transportation Research Record, 2674(10),
269-284.
Jacob, B. (2000). "Assessment of the Accuracy and Classification of Weigh-in-Motion Systems
Part 1: Statistical Background." International Journal of Heavy Vehicle Systems, 7(2-3),
136-152.
Masud, M. M. (2018). Quantification of Moisture Related Damage in Flexible and Rigid
Pavements and Incorporation of Pavement Preservation Treatments in AASHTOWare
Pavement-ME Design and Analysis, Michigan State University.
Papagiannakis, A., Johnston, E., Alavi, S., and Mactutis, J. (2001). "Laboratory and field
evaluation of piezoelectric Weigh-in-Motion sensors." Journal of testing and evaluation,
29(6), 535-543.
Yoo, H.-S., and Kim, Y.-S. (2016). "Development of a crack recognition algorithm from non-
routed pavement images using artificial neural network and binary logistic regression."
KSCE Journal of Civil Engineering, 20(4), 1151-1162.