Chi-Square Distribution: New Derivations and Environmental Application

Journal of Applied Mathematics and Physics, 2019, 7, 1786-1799
http://www.scirp.org/journal/jamp
ISSN Online: 2327-4379
ISSN Print: 2327-4352
DOI: 10.4236/jamp.2019.78122
Thomas M. Semkow (1,2), Nicole Freeman (3), Umme-Farzana Syed (1), Douglas K. Haines (1), Abdul Bari (1), Abdul J. Khan (1), Kimi Nishikawa (1), Adil Khan (1), Adam G. Burn (1,2), Xin Li (1,2), Liang T. Chu (1,2)
(1) Wadsworth Center, New York State Department of Health, Albany, NY, USA
(2) Department of Environmental Health Sciences, University at Albany, State University of New York, Rensselaer, NY, USA
(3) Averill Park Central School District, Averill Park, NY, USA
Abstract
We describe two new derivations of the chi-square distribution. The first derivation uses the induction method, which requires only a single integral to calculate. The second derivation uses the Laplace transform and requires minimum assumptions. The new derivations are compared with the established derivations, such as by convolution, moment generating function, and Bayesian inference. The chi-square testing has seen many applications to physics and other fields. We describe a unique version of the chi-square test where both the variance and location are tested, which is then applied to environmental data. The chi-square test is used to make a judgment whether a laboratory method is capable of detection of gross alpha and beta radioactivity in drinking water for regulatory monitoring to protect health of population. A case of a failure of the chi-square test and its amelioration are described. The chi-square test is compared to and supplemented by the t-test.
Keywords
Mathematical Induction, Laplace Transform, Gamma Distribution,
Chi-Square Test, Gross Alpha-Beta, Drinking Water
How to cite this paper: Semkow, T.M., Freeman, N., Syed, U.-F., Haines, D.K., Bari, A., Khan, A.J., Nishikawa, K., Khan, A., Burn, A.G., Li, X. and Chu, L.T. (2019) Chi-Square Distribution: New Derivations and Environmental Application. Journal of Applied Mathematics and Physics, 7, 1786-1799. https://doi.org/10.4236/jamp.2019.78122
Received: July 19, 2019
Accepted: August 16, 2019
Published: August 19, 2019
Copyright © 2019 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/
Open Access
1. Introduction
The chi-square distribution (CSD) has been one of the most frequently used distributions in science. It is a special case of the gamma distribution (see Section 2). The latter has been an important distribution in fundamental physics, for example as kinetic energy distribution of particles in an ideal gas (Maxwell-Boltzmann) [1] or the kinetic energy distribution of particles emitted from excited nuclei in nuclear reactions [2]. A historical context for the development of the CSD is described in References [3] and [4]. Its first derivation is attributed to Bienaymé [5], who used multiple integrals over normal variables and substitutions. Abbe [6] used a method of integration in the complex plane to solve multiple integrals. The most general derivation is attributed to Helmert, who proposed a classic transformation to derive CSD, including calculation of the Jacobian determinant of transformation [7]. This transformation can be worked out into polar variables, which is described in statistical textbooks [4] [8].
The established fundamental derivations of the CSD described above require complicated handling of multiple integrals. In contrast, the simplified derivations use the fact that the CSD is a special case of the gamma distribution. Owing to the integrable and recursive properties of the gamma distribution, as well as its moment generating function (Mgf), simplified derivations of the CSD are described in the textbooks [9] [10]. Another simplified derivation uses Bayesian inference [11]. In Section 2, we refer to these methods for comparisons.
In this work, we present two new methods of derivation of the CSD. They are both within the simplified category. One of them is mathematical induction. The original derivation was done by Helmert [12] using a 2-step forward mathematical induction. We have elaborated on that and observed that the CSD has a certain recursive property, which enables its derivation using a single-step induction plus the well-known theorem for beta and gamma functions. Another derivation method we describe is by the Laplace transform. This method has some similarity to the Mgf and characteristic function methods, owing to the presence of exponentiation. It uses a complex-variable integration and it is free from many assumptions of the other methods. The two new derivations of the CSD by mathematical induction and Laplace transform are described in Section 2.
Chi-square testing (CST) is closely related to and based upon the CSD. It has its origins in the discovery of the goodness-of-fit test by Pearson [13]. In the goodness-of-fit, one calculates the test statistics as

$$\chi_\nu^2 = \sum_{i=1}^{m} \frac{(O_i - E_i)^2}{E_i}, \tag{1}$$

where $O_i$ is frequency of observation, $E_i$ is expected frequency based on an assumed model distribution, for category of type $i$, and $m$ is the number of categories. Both $O_i$ and $E_i$ are unitless. $\nu = m - p - 1$ is the number of degrees of freedom, where $p$ is number of parameters of the model distribution calculated from the data. For any model distribution, Equation (1) leads asymptotically to the CSD when the number of observations is large, which has been proved for the multinomial distribution by Pearson [13]. The goodness-of-fit CST has been extensively used in statistics and widely applied to many fields [3] [14]. It is worth noting that the interpretation of the degrees of freedom was provided by Fisher [15]. As an example in physics, CST goodness-of-fit has been used to verify Poisson fluctuations of radioactivity counter [14] [16].
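As a minimal sketch of Equation (1), the statistic and its degrees of freedom can be computed as follows. The function name `pearson_chi_square` and the category counts are hypothetical and serve only as an illustration; they are not data from this work.

```python
def pearson_chi_square(observed, expected, n_fitted_params=0):
    """Return (chi2, dof) for Equation (1): sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - n_fitted_params - 1  # nu = m - p - 1
    return chi2, dof


# Hypothetical counts in m = 4 categories (illustrative only, not from the paper).
observed = [18, 22, 29, 31]
expected = [25.0, 25.0, 25.0, 25.0]  # uniform model, no parameters fitted (p = 0)
chi2, dof = pearson_chi_square(observed, expected)
print(chi2, dof)
```

The observed value would then be compared with a critical value of the CSD for `dof` degrees of freedom.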
Another form of the chi-square variable from Equation (1) is written in the general form as

$$\chi_\nu^2 = \sum_{i=1}^{n} \left(\frac{x_i - \mu_i}{\sigma_i}\right)^2, \tag{2}$$

where $n$ is the number of observations, $x_i$ is the observed variable, $\mu_i$ is the expected value, $\sigma_i$ is the standard deviation, and $\nu \le n$. The variables in Equation (2) can be expressed in physical units. In the limit of large number of observations, the variable and parameters of Equation (2) are approximated by those of the normal variates, and the $\chi_\nu^2$ distributes as CSD. In this work, we generalize this CST test to a combined test for variance and location as well as verify it with the t-test [17]. The test statistics studied are described in Section 3.
Within the context of this work, we present a unique application of the CST to the detection of radioactive contaminants in drinking water required by the Safe Drinking Water Act (SDWA) in the US. The bulk of natural alpha and beta/gamma (photon) radioactivity in drinking water originates from the possible presence of 238U and 232Th natural radioactive-series progeny, 226,228Ra and their progeny, as well as 40K radionuclides [18]. The SDWA regulations [19] establish a Maximum Contaminant Level (MCL) of 15 pCi/L (555 mBq/L) for gross alpha (GA) radioactivity, excluding U and Rn. For gross beta (GB) radioactivity, the MCL is limited by the total body or any organ radiation dose of 4 mrem/y (40 μSv/y). For both GA and GB, the Maximum Contaminant Level Goal (MCLG) is zero. Furthermore, SDWA requires Detection Limits (DL) of 3 pCi/L (111 mBq/L) and 4 pCi/L (148 mBq/L) for GA and GB radioactivity, respectively. These DLs must be met by all public health laboratories accredited for monitoring of GA and GB radioactivity in drinking water in the US. In Section 4, we detail a CST procedure to verify if the required above-mentioned DLs are met [20]. We investigate the reasons and consequences of failed CST and ameliorate such cases.
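The paired units quoted above follow from the definitions 1 Ci = 3.7 × 10^10 Bq and 1 rem = 0.01 Sv. The short sketch below reproduces the conversions; the helper names and dictionary keys are illustrative choices, not part of any regulation.

```python
MBQ_PER_PCI = 37.0       # 1 pCi = 0.037 Bq = 37 mBq
MICROSV_PER_MREM = 10.0  # 1 mrem = 1e-5 Sv = 10 microsievert


def pci_to_mbq(activity_pci):
    """Convert an activity (or activity concentration) from pCi to mBq."""
    return activity_pci * MBQ_PER_PCI


def mrem_to_microsv(dose_mrem):
    """Convert a dose from mrem to microsievert."""
    return dose_mrem * MICROSV_PER_MREM


# Limits quoted in the text, in pCi/L.
limits_pci_per_l = {"GA MCL": 15.0, "GA DL": 3.0, "GB DL": 4.0}
limits_mbq_per_l = {name: pci_to_mbq(v) for name, v in limits_pci_per_l.items()}
gb_dose_limit_microsv = mrem_to_microsv(4.0)
print(limits_mbq_per_l, gb_dose_limit_microsv)
```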
2. Chi-Square Distribution
The probability density function (Pdf) of the CSD is given by

$$\mathrm{Pdf}\left(\chi_\nu^2 \mid \nu\right) = \frac{\left(\chi_\nu^2\right)^{\nu/2-1} e^{-\chi_\nu^2/2}}{2^{\nu/2}\,\Gamma(\nu/2)}, \tag{3}$$

where $\Gamma$ is the gamma function. The expectation value of CSD is $E[\chi_\nu^2] = \nu$, and the variance $\mathrm{Var}[\chi_\nu^2] = 2\nu$ [21]. The CSD is a special case of the gamma distribution abbreviated as $\mathrm{gamma}(\chi_\nu^2 \mid a, b)$ with the parameters $a = \nu/2$ and $b = 2$ [21].
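The statements around Equation (3) can be checked numerically. The sketch below is only an illustration: it assumes the gamma distribution is parameterized by shape $a$ and scale $b$, and it estimates the moments with a simple midpoint rule. The function names are hypothetical.

```python
import math


def chi2_pdf(x, nu):
    """Equation (3): chi-square density with nu degrees of freedom, for x > 0."""
    return x ** (nu / 2 - 1) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))


def gamma_pdf(x, a, b):
    """Gamma density with shape a and scale b (assumed parameterization), x > 0."""
    return x ** (a - 1) * math.exp(-x / b) / (b ** a * math.gamma(a))


def moment(nu, power, upper=100.0, steps=100_000):
    """Midpoint-rule estimate of E[x**power] under chi2_pdf on (0, upper)."""
    h = upper / steps
    return h * sum(
        ((k + 0.5) * h) ** power * chi2_pdf((k + 0.5) * h, nu) for k in range(steps)
    )


nu = 5
mean = moment(nu, 1)                   # expected to be close to nu
variance = moment(nu, 2) - mean ** 2   # expected to be close to 2 * nu
same_density = abs(chi2_pdf(3.7, nu) - gamma_pdf(3.7, nu / 2, 2.0))
print(mean, variance, same_density)
```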
To derive Equation (3), we start with the general definition of $\chi_\nu^2$ statistics given by Equation (2) assuming normal variates. For a single normal variable $x_1$ with $\mathrm{Pdf}(x_1)$, the probability of $x_1 \in [x_1, x_1 + dx_1]$ is given by

$$\mathrm{Pdf}(x_1)\,dx_1 = \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{1}{2}\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2} dx_1. \tag{4}$$
By substituting $\chi_1^2 = \left((x_1-\mu_1)/\sigma_1\right)^2$, we obtain from Equation (4)

$$\mathrm{Pdf}(\chi_1^2 \mid 1)\,d\chi_1^2 = \frac{2}{\sqrt{2\pi}\,\sigma_1}\, e^{-\chi_1^2/2} \left|\frac{dx_1}{d\chi_1^2}\right| d\chi_1^2 = \frac{(\chi_1^2)^{1/2-1}\, e^{-\chi_1^2/2}}{2^{1/2}\,\Gamma(1/2)}\, d\chi_1^2 = \mathrm{gamma}(\chi_1^2 \mid 1/2, 2)\, d\chi_1^2, \tag{5}$$

which has the Pdf given by Equation (3) for $\nu = 1$. In deriving Equation (5), we also used $\Gamma(1/2) = \sqrt{\pi}$, whereas the factor of 2 originated from the fact that the $x_1$ variable ranging from minus infinity to plus infinity has been substituted with the $\chi_1^2$ variable ranging from zero to plus infinity.
Let us assume that the $n+1$ term with the normal $x_{n+1}$ variable was added to Equation (2), and that this addition raised the number of degrees of freedom to $\nu+1$. Then,

$$\chi_{\nu+1}^2 = \chi_\nu^2 + \left(\frac{x_{n+1}-\mu_{n+1}}{\sigma_{n+1}}\right)^2. \tag{6}$$
Using the calculus for probability density functions [21],

$$\mathrm{Pdf}(\chi_{\nu+1}^2 \mid \nu+1)\,d\chi_{\nu+1}^2 = \int_{-\infty}^{+\infty} \mathrm{Pdf}(\chi_\nu^2 \mid \nu)\,d\chi_\nu^2\; \mathrm{Pdf}(x_{n+1})\,dx_{n+1}. \tag{7}$$
Let us define a new variable $z$, such as

$$\left(\frac{x_{n+1}-\mu_{n+1}}{\sigma_{n+1}}\right)^2 = (1-z)\,\chi_{\nu+1}^2. \tag{8}$$
By realizing that $d\chi_\nu^2 = d\chi_{\nu+1}^2$, and performing all substitutions, the right side of Equation (7) can be rewritten as

$$2\int_0^1 \mathrm{Pdf}(\chi_\nu^2 \mid \nu)\,\mathrm{Pdf}(x_{n+1}) \left|\frac{dx_{n+1}}{dz}\right| dz\; d\chi_{\nu+1}^2 = \frac{(\chi_{\nu+1}^2)^{(\nu+1)/2-1}\, e^{-\chi_{\nu+1}^2/2}\, d\chi_{\nu+1}^2}{2^{(\nu+1)/2}\,\Gamma(\nu/2)\,\Gamma(1/2)} \int_0^1 z^{\nu/2-1}(1-z)^{1/2-1}\,dz. \tag{9}$$
However, the integral on the right side of Equation (9) is the beta function, $B(\nu/2, 1/2)$, which is related to the gamma functions by [22],

$$B(\nu/2, 1/2) = \frac{\Gamma(\nu/2)\,\Gamma(1/2)}{\Gamma\left((\nu+1)/2\right)}. \tag{10}$$
By inserting Equation (10) into Equation (9), simplifying, and comparing with the left side of Equation (7), one obtains

$$\mathrm{Pdf}(\chi_{\nu+1}^2 \mid \nu+1) = \frac{(\chi_{\nu+1}^2)^{(\nu+1)/2-1}\, e^{-\chi_{\nu+1}^2/2}}{2^{(\nu+1)/2}\,\Gamma\left((\nu+1)/2\right)}, \tag{11}$$

which is the Pdf given by Equation (3) for $\nu+1$ degrees of freedom and it proves Equation (3) by induction.
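The induction step can be verified numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it evaluates the beta integral of Equation (9) with Simpson's rule after the substitution $z = \sin^2\theta$ (a choice made here to remove the endpoint singularities), and then compares the right side of Equation (9) with Equation (3) written for $\nu+1$ degrees of freedom. All function names are hypothetical.

```python
import math


def chi2_pdf(x, nu):
    """Equation (3): chi-square density with nu degrees of freedom, for x > 0."""
    return x ** (nu / 2 - 1) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))


def beta_half(nu, steps=2000):
    """B(nu/2, 1/2) for integer nu >= 1 by Simpson's rule.

    With z = sin(theta)**2 the integral of z**(nu/2 - 1) * (1 - z)**(-1/2) over (0, 1)
    becomes 2 * integral of sin(theta)**(nu - 1) over (0, pi/2), which is smooth.
    """
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.sin(k * h) ** (nu - 1)
    return 2 * total * h / 3


def induction_step(x, nu):
    """Right side of Equation (9) per unit d(chi2): density for nu + 1 built from nu."""
    prefactor = x ** ((nu + 1) / 2 - 1) * math.exp(-x / 2) / (
        2 ** ((nu + 1) / 2) * math.gamma(nu / 2) * math.gamma(0.5)
    )
    return prefactor * beta_half(nu)


# Largest deviations from Equation (10) and Equation (11) for nu = 1..6.
beta_error = max(
    abs(beta_half(nu) - math.gamma(nu / 2) * math.gamma(0.5) / math.gamma((nu + 1) / 2))
    for nu in range(1, 7)
)
pdf_error = max(abs(induction_step(2.5, nu) - chi2_pdf(2.5, nu + 1)) for nu in range(1, 7))
print(beta_error, pdf_error)
```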
By substituting $\varphi_i^2 = \left((x_i-\mu_i)/\sigma_i\right)^2$, Equation (2) becomes

$$\chi_\nu^2 = \sum_{i=1}^{n} \varphi_i^2. \tag{12}$$
The sum of independent random variables $\varphi_i^2$ is called a convolution and the joint distribution function for $\chi_\nu^2$ can be obtained by calculating an $n$-dimensional convolution integral. Exploring the properties of this convolution leads to simplifications, which have been used in the literature. By convoluting two gamma distributions $\chi_1^2 \equiv \varphi_i^2$ from Equation (5) and using the theorem that the convolution of two gammas is also a gamma, one obtains $\mathrm{gamma}(\chi_2^2 \mid 2/2, 2)$ [9]. By continuing this process of convoluting with $\chi_1^2$, it is easy to infer that the full convolution is equal to $\mathrm{gamma}(\chi_\nu^2 \mid \nu/2, 2)$, where $\nu = n$, which is the CSD given by Equation (3). This provides a simplified derivation of CSD using convolution.
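The convolution result can be illustrated with a small Monte Carlo experiment: sums of $n$ squared standard normal variates, as in Equation (12), should have a mean close to $\nu = n$ and a variance close to $2\nu$. This is only a sketch; the seed, the sample size, and the name `sample_chi2` are arbitrary choices made here.

```python
import random
import statistics


def sample_chi2(n, rng):
    """One draw of Equation (12): the sum of n squared standard normal variates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


rng = random.Random(12345)  # fixed seed so that the run is repeatable
n = 4
draws = [sample_chi2(n, rng) for _ in range(20_000)]
sample_mean = statistics.fmean(draws)    # expected near nu = 4
sample_var = statistics.variance(draws)  # expected near 2 * nu = 8
print(sample_mean, sample_var)
```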
Another simplified derivation of CSD uses the theorem that the Mgf of convolution is a product of individual Mgfs [10]. Thus, by calculating the Mgf of $\chi_1^2$ from Equation (5) and taking it to the $n$th power, one obtains the Mgf for $\chi_\nu^2$, where $\nu = n$. One can also calculate the Mgf of the gamma distribution and infer from a comparison that the CSD in Equation (3) is a special case of the gamma distribution [10].
In this work we provide yet another simplified derivation of the CSD using Laplace transform [23]. The Laplace transform of Equation (5) is equal to

$$\int_0^\infty e^{-s\chi_1^2}\, \frac{(\chi_1^2)^{1/2-1}\, e^{-\chi_1^2/2}}{2^{1/2}\,\Gamma(1/2)}\, d\chi_1^2 = \left(\frac{1/2}{s+1/2}\right)^{1/2}. \tag{13}$$
Subsequently, we use a theorem that the Laplace transform of an $n$th convolution is a product of the individual transforms, i.e. $\left(\frac{1/2}{s+1/2}\right)^{n/2}$. By abbreviating $u = \chi_n^2$, the inverse Laplace transform results in the Pdf of $u$,

$$\mathrm{Pdf}(u \mid n) = \frac{1}{2\pi i}\int \left(\frac{1/2}{s+1/2}\right)^{n/2} e^{su}\,ds = \frac{1}{2^{n/2}}\,\frac{1}{2\pi i}\int \frac{e^{su}}{(s+1/2)^{n/2}}\,ds. \tag{14}$$
To calculate the contour integral in Equation (14), we start with the Cauchy integration formula for an analytic function $f(s)$ of a complex variable $s$ having a simple pole at $s_0$ [24]:

$$f(s_0) = \frac{1}{2\pi i}\oint \frac{f(s)}{s-s_0}\,ds. \tag{15}$$

The $k-1$ times differentiation of Equation (15), where the differentiation can be of an integer or a fractional order [25], results in:

$$f^{(k-1)}(s_0) = \frac{\Gamma(k)}{2\pi i}\oint \frac{f(s)}{(s-s_0)^{k}}\,ds. \tag{16}$$
By comparing Equation (14) to Equation (16), we infer that $f(s) = e^{su}$, $s_0 = -1/2$, and $k = n/2$. By inserting these variables to Equation (16) and plugging it into Equation (14), we obtain:

$$\mathrm{Pdf}(u \mid n) = \frac{1}{2^{n/2}\,\Gamma(n/2)} \left[\frac{d^{\,n/2-1}}{ds^{\,n/2-1}}\, e^{su}\right]_{s=-1/2} = \frac{u^{n/2-1}\, e^{-u/2}}{2^{n/2}\,\Gamma(n/2)}, \tag{17}$$

which is the CSD given by Equation (3) for $\nu = n$ and $u = \chi_n^2$.
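As a worked illustration of Equation (17), consider even values of $n$, for which the derivative is of ordinary integer order. The cases below are chosen here only as a check and are not taken from the original derivation.

```latex
% n = 2: k = n/2 = 1, so the derivative is of order zero.
\mathrm{Pdf}(u \mid 2) = \frac{1}{2^{1}\,\Gamma(1)} \left[ e^{su} \right]_{s=-1/2}
                       = \frac{e^{-u/2}}{2}.

% n = 4: k = n/2 = 2, so one ordinary differentiation with respect to s is needed.
\mathrm{Pdf}(u \mid 4) = \frac{1}{2^{2}\,\Gamma(2)} \left[ \frac{d}{ds}\, e^{su} \right]_{s=-1/2}
                       = \frac{1}{4} \left[ u\, e^{su} \right]_{s=-1/2}
                       = \frac{u\, e^{-u/2}}{4}.
```

Both results agree with Equation (3) evaluated at $\nu = 2$ and $\nu = 4$, since $\Gamma(1) = \Gamma(2) = 1$.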
Another simplified derivation of the CSD uses the Bayesian inference and it is not related to the convolutions described above [11]. It uses a normal likelihood function for multiple samples. It also uses the transformational prior distributions: $1/\sigma$ for scale parameter $\sigma$ and a constant for translation parameter $\mu$ [26]. Marginalizing the joint distribution of $(\mu, \sigma)$ over $\mu$ results in the CSD, whereas marginalizing over $\sigma$ results in the t-distribution [27].
In Section 5, we summarize the advantages and disadvantages of the simplified derivation methods of CSD described in this section.
3. Test Statistics
Several models for the CST statistics can be derived from the general Equation (2). For the expected value, we can use either the sample mean $\bar{x}$ or the population mean $\mu$, whereas for the standard deviation we can use either individual standard deviations $\sigma_i$ or the sample standard deviation $\sigma_x$. We do not know the population standard deviation for the data described in Section 4. Model test statistics $\sum_i \left((x_i-\bar{x})/\sigma_x\right)^2$ is always equal to $n-1$ and thus not useful. However, the model test statistics $\sum_i \left((x_i-\bar{x})/\sigma_i\right)^2$ can be used to test the variance. Other possibilities are to test for both the variance and location by employing model test statistics $\sum_i \left((x_i-\mu)/\sigma_i\right)^2$ or $\sum_i \left((x_i-\mu)/\sigma_x\right)^2$, if the population mean is known, which is the case for the data in Section 4.
For the t-test we perform a standard one-sample test, where we calculate the $t$ variable as $(\bar{x}-\mu)/(\sigma_x/\sqrt{n})$. The t-test is the location test. The results of all these test models using radioactivity data are presented in Section 4.
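The model statistics described in this section can be sketched in a few lines. The measurements and individual uncertainties below are hypothetical placeholders (not the data of Section 4), and `cst_statistics` is an illustrative name. The sketch also shows two consequences of the definitions: the first statistic always equals $n-1$, and the statistic built from $\mu$ and $\sigma_x$ equals $(n-1) + t^2$.

```python
import math
import statistics


def cst_statistics(x, sigma_i, mu):
    """Return the model test statistics of Section 3 for measurements x."""
    n = len(x)
    x_bar = statistics.fmean(x)
    sigma_x = statistics.stdev(x)  # sample standard deviation (n - 1 in the denominator)
    return {
        # sum(((x_i - x_bar) / sigma_x)^2): always n - 1, hence not useful as a test
        "trivial": sum(((xi - x_bar) / sigma_x) ** 2 for xi in x),
        # variance test, nu = n - 1
        "variance": sum(((xi - x_bar) / si) ** 2 for xi, si in zip(x, sigma_i)),
        # combined variance/location tests, nu = n
        "var_loc_sigma_i": sum(((xi - mu) / si) ** 2 for xi, si in zip(x, sigma_i)),
        "var_loc_sigma_x": sum(((xi - mu) / sigma_x) ** 2 for xi in x),
        # one-sample t variable, nu = n - 1
        "t": (x_bar - mu) / (sigma_x / math.sqrt(n)),
    }


# Hypothetical activities and uncertainties in pCi/L (illustrative only).
x = [2.6, 3.4, 2.9, 3.8, 2.2, 3.1, 3.6, 2.7, 3.3]
sigma_i = [0.6, 0.7, 0.6, 0.8, 0.5, 0.7, 0.7, 0.6, 0.7]
stats = cst_statistics(x, sigma_i, mu=3.0)
print(stats)
```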
4. Chi-Square- and t-Test for Radioactivity Detection in Drinking Water
The most convenient method of measuring GA and GB radioactivity in drinking
water is by gas proportional counting [28]. In this method, a given quantity of
water is evaporated with nitric acid onto a stainless-steel planchet and dried,
leaving a residue containing any radioactivity. The planchet is then counted on a
gas proportional detector. Alpha and beta particles are counted simultaneously,
and they are differentiated by much larger ionization caused by the former.
As stated in Section 1, this method must be able to determine GA and GB at the DL, to be verified by the CST [20] using a minimum of seven samples. EPA recommends a right-tail (RT) CST at 99% Confidence Level (CL), or 0.01 significance. To accomplish this, $n = 9$ samples of community drinking water were spiked with 230Th and 90Sr/90Y radionuclides providing alpha and beta radioactivity, respectively. The spiking activities (i.e. the expected $\mu$) were: 2.9888 ± 0.0402 pCi/L for alpha and 4.1860 ± 0.0549 pCi/L for beta, close to the required DL values. The values of spiking activities and their uncertainties were obtained from the standards traceable to the National Institute of Standards and Technology (NIST). Then the experimental procedure was followed, and the measured GA and GB activities $x_i$ are depicted as points in Figure 1 and Figure 2, respectively.
Also shown in Figure 1 and Figure 2 are the individual standard deviations $\sigma_i$, depicted as vertical lines. These standard uncertainties are propagated, including the Poisson statistics of radioactivity counting and background subtraction, uncertainties of the detector efficiency, cross-talk between alpha and beta particles, as well as solution-pipetting uncertainties. Therefore, they are slightly different for different samples.
The GA results are described first. The sample average for GA is given by $\bar{x} = 3.0951$ pCi/L (red horizontal thick line) which is close to the expected $\mu$ (green horizontal thick line) as seen in Figure 1. The sample standard deviation is given by $\sigma_x = 0.7000$ pCi/L. The results of the variance test, as defined in Section 3, are given in column 3 of Table 1. The number of the degrees of freedom is $\nu = 8$ because one constraint is from calculating the mean. The observed $\chi^2$ statistics is equal to 14.0 for gross alpha. The right-tail (RT) and left-tail (LT) $\chi^2$ are calculated from the CSD at 0.01 significance each. Since $1.6 < 14.0 < 20.1$, each tail test passes at 0.01 significance and two-tail (2T) test passes at 0.02 significance. Then, the two combined variance/location tests, as defined in Section 3, are given in columns 4 and 5 using $\sigma_i$ and $\sigma_x$, respectively. $\nu = n = 9$ in these cases, because there are no constraints. They both pass for GA.
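The critical values quoted above can be reproduced from the CSD with the standard library alone. The sketch below is one possible approach, not necessarily the one used by the authors: it evaluates the cumulative distribution through the series for the regularized lower incomplete gamma function and finds the critical value by bisection. The function names are hypothetical.

```python
import math


def chi2_cdf(x, nu):
    """P(chi-square with nu degrees of freedom <= x), via the incomplete-gamma series."""
    if x <= 0:
        return 0.0
    a, y = nu / 2, x / 2
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-17 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(y) - y - math.lgamma(a))


def chi2_critical(prob, nu, lo=0.0, hi=200.0):
    """Bisection for the x at which chi2_cdf(x, nu) equals prob."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, nu) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Right-tail and left-tail critical values at 0.01 significance each.
rt8, lt8 = chi2_critical(0.99, 8), chi2_critical(0.01, 8)
rt9, lt9 = chi2_critical(0.99, 9), chi2_critical(0.01, 9)
ga_variance_passes = lt8 < 14.0 < rt8  # observed GA variance statistic from the text
print(round(rt8, 1), round(lt8, 1), round(rt9, 1), round(lt9, 1), ga_variance_passes)
```

Rounded to one decimal place, these values should correspond to the "Calc RT" and "Calc LT" rows of Table 1.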
The t-test statistics is calculated as described in Section 3 resulting in 0.45 for GA, as given in column 6 in Table 1. The RT probability of 0.33 and 2T probability of 0.66 are larger than 0.01 and 0.02, respectively, ensuring the passage of the location t-test.
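The tail probabilities for the GA t-test can be approximated by integrating the Student t density with $\nu = 8$. The sketch below uses Simpson's rule, which is an implementation choice made here; because the tail is obtained as 0.5 minus an integral, it is not suited to very small probabilities such as those of the failed GB test.

```python
import math


def t_pdf(t, nu):
    """Student t density with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)


def t_right_tail(t, nu, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of t_pdf over [0, t]."""
    h = t / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * t_pdf(k * h, nu)
    return 0.5 - total * h / 3


rt_prob = t_right_tail(0.45, 8)  # GA: t = 0.45 with nu = 8
two_tail_prob = 2 * rt_prob
print(round(rt_prob, 2), round(two_tail_prob, 2))
```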
The gross beta activities plotted in Figure 2, with the mean $\bar{x} = 5.1274$ pCi/L (red horizontal thick line) and $\sigma_x = 0.3050$ pCi/L differ significantly from the expected $\mu$ (green horizontal thick line) beyond the observed uncertainties. That fact did not affect the variance test which passed for GB (column 3 in Table 1). However, the observed $\chi^2$ of 43.1 and 93.7 exceed the calculated RT $\chi^2$ of 21.7 (columns 4 and 5 in Table 1), therefore the combined variance/location tests failed. This failure is supported by the t-test, where the high $t = 9.26$ (column 6) resulted in very low values of the RT and 2T probabilities (columns 7 and 8) and failures of the test for GB.
To elucidate the reasons for failure of the GB CST and t-test, fifteen non-spiked Method Blank (MB) community water samples were prepared and measured. The average GA activity was below detection; however, the average GB was 0.8121 ± 0.2801 pCi/L. This MB was then subtracted from the spiked GB results and the corrected GB activities are plotted in Figure 3. The mean of the corrected GB is $\bar{x} = 4.3153$ pCi/L ($\sigma_x = 0.3050$ pCi/L), very close to the value for spiked radioactivity. The corrected observed $\chi^2$ are now 2.7, 3.2 and 9.6 (columns 3, 4, and 5 in Table 1) ensuring the passage of the three CSTs. This is supported by the passage of the t-test also (columns 6, 7, and 8).
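The $t$ values and the combined variance/location statistic based on $\sigma_x$ can be recomputed from the summary values quoted in the text alone. The sketch below assumes that $\sigma_x$ is the sample standard deviation with $n-1$ in the denominator, in which case $\sum_i ((x_i-\mu)/\sigma_x)^2 = (n-1) + t^2$. The function name is hypothetical, and small differences in the last digit can arise because the inputs are rounded.

```python
import math


def t_and_chi2_from_summary(x_bar, sigma_x, mu, n):
    """Return (t, chi2), where chi2 = sum(((x_i - mu) / sigma_x)**2) = (n - 1) + t**2."""
    t = (x_bar - mu) / (sigma_x / math.sqrt(n))
    return t, (n - 1) + t * t


n = 9
# Summary values quoted in the text (pCi/L): mean, sample standard deviation, spike.
ga = t_and_chi2_from_summary(3.0951, 0.7000, 2.9888, n)
gb = t_and_chi2_from_summary(5.1274, 0.3050, 4.1860, n)
# Method-blank subtraction shifts the GB mean by 0.8121 and leaves sigma_x unchanged.
gb_corrected = t_and_chi2_from_summary(5.1274 - 0.8121, 0.3050, 4.1860, n)
print(ga, gb, gb_corrected)
```

The resulting statistics can be compared with column 5 (and the $t$ values with column 6) of Table 1.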
Table 1. The results of χ²- and t-tests. Abbreviations: RT right-tail, LT left-tail, 2T two-tail. Significance is 0.01 for each tail.

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Experiment, reference | Parameter | χ²-test: Variance, $\sigma_i$ | χ²-test: Variance and location, $\sigma_i$ | χ²-test: Variance and location, $\sigma_x$ | t-test: Location, $t$ | t-test: RT prob | t-test: 2T prob |
| | Deg free | 8 | 9 | 9 | 8 | | |
| | Calc RT | 20.1 | 21.7 | 21.7 | | | |
| | Calc LT | 1.6 | 2.1 | 2.1 | | | |
| Gross Alpha, Figure 1 | Observed | 14.0 | 13.4 | 8.2 | 0.45 | 0.33 | 0.66 |
| | Test result | Passed | Passed | Passed | | Passed | Passed |
| Gross Beta, Figure 2 | Observed | 3.8 | 43.1 | 93.7 | 9.26 | 7.5E-06 | 1.5E-05 |
| | Test result | Passed | Failed | Failed | | Failed | Failed |
| Gross Beta-MB subtracted, Figure 3 | Observed | 2.7 | 3.2 | 9.6 | 1.27 | 0.12 | 0.24 |
| | Test result | Passed | Passed | Passed | | Passed | Passed |

Figure 1. Gross alpha (points) ordered according to the increased activity.
Figure 2. Gross beta (points) ordered according to the increased activity.
Figure 3. Gross beta (points) corrected for method blank and ordered according to the increased activity.
The reasons for the elevated GB in MB of community drinking water were investigated. Ten L of water were evaporated to 50 mL and measured using precise gamma-ray spectrometry [29]. It was determined that the concentration of the beta/gamma emitter, 40K, was 0.6926 ± 0.0790 pCi/L. It was also possible to identify several beta/gamma progenies of the 238U series: 234Th, 214Pb, 214Bi, and 210Pb, as well as those from the 232Th series: 228Ac, 212Pb, and 208Tl. The combined activity of the beta/gamma progeny was 0.1513 ± 0.0672 pCi/L. Therefore, the sum of 40K and beta/gamma progeny was 0.8440 ± 0.1037 pCi/L. The latter is consistent with the GB activity of 0.8121 ± 0.2801 pCi/L from the MB measurement to within the measured uncertainties. Also associated with the decay of 238U and 232Th is their alpha activity plus alpha progeny of similar activity to that of the beta/gamma progeny. This alpha activity could not have been detected by gamma spectrometry and was below the detection by GA in the MB measurement. However, the fact that GA of 3.0951 pCi/L is slightly higher than the expected 2.9888 pCi/L is an indication of that. Unlike in the case of beta activity, the small alpha progeny activity did not affect the CST or t-test. It should be noted that this level of naturally present radioactivity in the community water is much below the MCL, and thus poses small risk to the population.
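The combination of the two gamma-spectrometry results can be sketched as follows, assuming that the two uncertainties are independent and therefore add in quadrature (an assumption made here; the text does not state the propagation rule). The helper name `add_independent` is hypothetical. With the rounded inputs the sum agrees with the reported value to within the last digit.

```python
import math


def add_independent(*terms):
    """Sum (value, uncertainty) pairs, combining the uncertainties in quadrature."""
    value = sum(v for v, _ in terms)
    uncertainty = math.sqrt(sum(u * u for _, u in terms))
    return value, uncertainty


k40 = (0.6926, 0.0790)      # 40K activity, pCi/L
progeny = (0.1513, 0.0672)  # beta/gamma progeny of the 238U and 232Th series, pCi/L
total, total_unc = add_independent(k40, progeny)

# Compare with the gross beta method-blank result of 0.8121 +/- 0.2801 pCi/L.
mb = (0.8121, 0.2801)
difference = total - mb[0]
difference_unc = math.hypot(total_unc, mb[1])
consistent = abs(difference) < difference_unc
print(round(total, 4), round(total_unc, 4), consistent)
```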
5. Summary and Conclusions
We have described five simplified methods of deriving the chi-square distribution. Three of them: by convolution, moment generating function, and Bayesian inference are described in the literature and have been outlined here for comparison. The simplest of them seems to be the convolution method. It only uses the substitution from the normal distribution to a chi-square variable and requires a calculation of a single convolution integral on the above. It infers the form of multiple convolution on gamma distribution leading to the chi-square distribution. The moment generating function method of derivation is more advanced as it requires the knowledge of the moment generating function and the gamma distribution. The Bayesian inference method requires the knowledge about likelihood function and prior probabilities but does not require the knowledge about the gamma distribution.
In this work, we have proposed two new methods for derivation of the chi-square distribution: by induction and by Laplace transform. The method of induction uses operational calculus with only a single integral leading to beta function. The proposed derivation applies modern formalism and seems to be simpler than the original derivation by Helmert, which dates from as early as 1876. A disadvantage of the induction method is that it requires a prior knowledge of the chi-square distribution to perform induction on it. There is a significant advantage, however. All other methods require either no constraints in the data, i.e. the number of degrees of freedom must be equal to the number of observations, or one constraint in case of Bayesian inference. The induction method leaves any constraints intact by adding one induction step to the existing number of degrees of freedom. The proposed derivation method by Laplace transform is more advanced because it uses integration in the complex plane. The significant advantage of the Laplace transform and the Bayesian inference methods is that they do not require prior knowledge about the gamma distribution.
We have also described a unique application of the chi-square test to environmental science. In chi-square testing, it is important to delineate systematic effects from the random uncertainties. In this work, a systematic natural contamination of laboratory method blank caused the chi-square test for combined variance/location to fail; however, it did not affect the chi-square test for variance alone. After subtracting the systematic method blank, the chi-square variance/location test was shown to have passed. This was confirmed by the location t-test. It is also imperative to perform analysis of uncertainty. In this work, using either individual or sample standard deviations did not affect the variance/location chi-square test. While the chi-square test provides verification if a laboratory test method is adequate to monitor gross alpha and gross beta radioactivity in drinking water, the test statistics combining variance and location is more useful than the one based on the variance alone because it can identify systematic bias.
Acknowledgements
N. F. acknowledges partial support by the Questar III STEM Research Institute for Teachers of Science, Engineering, Mathematics, and Technology. K. N. acknowledges partial support by the US Food and Drug Administration under Grant 5U18FD005514-04. Thanks are due to J. Witmer for his valuable comments.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.
T. M. Semkow et al.
DOI:
10.4236/jamp.2019.78122 1796 Journal of Applied Mathematics and Physics
References
[1] Hill, T.L. (1986) An Introduction to Statistical Thermodynamics. Dover Publica-
tions, New York, 122.
[2] Satchler, G.R. (1990) Introduction to Nuclear Reactions. Oxford U. P., New York,
248. https://doi.org/10.1007/978-1-349-20531-8
[3] Lancaster, H.O. (1969) The Chi-Squared Distribution. J. Wiley & Sons, New York,
Chap. 1.
[4] Gorroochurn, P. (2016) Classic Topics on the History of Modern Mathematical Sta-
tistics: From Laplace to More Recent Times. J. Wiley & Sons, Hoboken, Chap. 3.
https://doi.org/10.1002/9781119127963
[5] Bienaymé, I.-J. (1852) Sur la Probabilité des Erreurs d’Aprés la Méthode des Moin-
dres Carrés.
Liouville
s Journal de Mathématiques Poures et Appliquées
,
Séries
1,
17, 33-78.
[6] Abbe, D.E. (1863) Ueber die Gesetzmässigkeit in der Vertheilung der Fehler bei
Beobachtungsreihen. Dissertation, Jena.
[7] Helmert, F.R. (1876) Die Genauigkeit der Formel von Peters zur Berechnung des
wahrscheinlichen Beobachtungsfehlers directer Beobachtungen gleicher Genauig-
keit.
Astronomische Nachrichten
, 88, 113-132.
https://doi.org/10.1002/asna.18760880802
[8] Stuart, A. and Ord, K. (1994) Kendall’s Advanced Theory of Statistics, Vol. 1, Dis-
tribution Theory. Arnold Hodder Headline Group, London, Chap. 11.
[9] Ross, S. (2006) A First Course in Probability. Pearson Prentice Hall, Upper Saddle
River, Sec. 6.3.
[10] Berry, D.A. and Lindgren, B.W. (1996) Statistics: Theory and Methods. Wadsworth
Publishing, Belmont, Sec. 5.12, 6.4.
[11] Gull, S.F. (1988) Bayesian Inductive Inference and Maximum Entropy. In: Erickson,
G.J. and Smith, C.R., Eds.,
Maximum
-
Entropy and Bayesian Methods in Science
and Engineering
, Kluwer Academic, Dordrecht, 53-74.
https://doi.org/10.1007/978-94-009-3049-0_4
[12] Helmert, F.R. (1876) Ueber die Wahrscheinlichkeit der Potenzsummen der Beo-
bachtungsfehler und über einige damit im Zusammenhange stehende Fragen.
Zeit-
schrift für Mathematik und Physik
, 21, 192-218.
[13] Pearson, K. (1900) On the Criterion that a Given System of Deviations from the
Probable in the Case of Correlated System of Variables Is Such That It Can Be Rea-
sonably Supposed to Have Arisen from Random Sampling.
Philosophical Magazine
Series
5, 50, 157-175. https://doi.org/10.1080/14786440009463897
[14] Greenwood, P.E. and Nikulin, M.S. (1996) A Guide to Chi-Squared Testing. J. Wi-
ley & Sons, New York, Sec. 3.18.
[15] Fisher, R.A. (1922) On the Interpretation of
χ
2 from Contingency Tables and the
Calculation of P.
Journal of the Royal Statistical Society A
, 85, 87-94.
https://doi.org/10.2307/2340521
[16] Evans, R.D. (1985) The Atomic Nucleus. Krieger Publishing, Malabar, Chap. 27.
[17] Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distri-
butions, Vol. 2. J. Wiley & Sons, New York, Chap. 28.
[18] Eisenbud, M. and Gesell, T. (1997) Environmental Radioactivity from Natural, In-
dustrial, and Military Sources. Academic Press, San Diego, Chap. 6.
https://doi.org/10.1016/B978-012235154-9/50010-4
T. M. Semkow et al.
DOI:
10.4236/jamp.2019.78122 1797 Journal of Applied Mathematics and Physics
[19] Environmental Protection Agency (2000) 40 CFR Parts 9, 141, and 142 National
Primary Drinking Water Regulations; Radionuclides; Final Rule.
Federal Register
,
65, 76708-76752.
[20] EPA (2017) Procedure for Safe Drinking Water Act Program Detection Limits for
Radionuclides. Report EPA 815-B-17-003, Cincinnati.
[21] Johnson, N.L., Kotz, S. and Balakrishnan, N. (1994) Continuous Univariate Distri-
butions, Vol. 1. J. Wiley & Sons, New York, Chap. 12, 17, 18.
[22] Johnson, N.L., Kemp, A.W. and Kotz, S. (2005) Univariate Discrete Distributions. J.
Wiley & Sons, Hoboken, Chap. 1. https://doi.org/10.1002/0471715816
[23] Margenau, H. and Murphy, G.M. (1976) The Mathematics of Physics and Chemi-
stry. Krieger Publishing, Huntington, Sec. 8.5.
[24] Dettman, J.W. (1965) Applied Complex Variables. Dover Publications, New York,
Sec. 3.6.
[25] Oldham, K.B. and Spanier, J. (1974) The Fractional Calculus: Theory and Applica-
tions of Differentiation and Integration to Arbitrary Order. Dover Publications,
Mineola, Sec. 3.4.
[26] Jaynes, E.T. (2004) Probability Theory: The Logic of Science. Cambridge U. P.,
Cambridge, Chap. 12. https://doi.org/10.1017/CBO9780511790423
[27] Student (1908) The Probable Error of a Mean.
Biometrika
, 6, 1-25.
https://doi.org/10.1093/biomet/6.1.1
[28] Semkow, T.M. and Parekh, P.P. (2001) Principles of Gross Alpha and Beta Radioac-
tivity Detection in Water.
Health Physics
, 81, 567-574.
https://doi.org/10.1097/00004032-200111000-00011
[29] Khan, A.J., Semkow, T.M., Beach, S.E., Haines, D.K., Bradt, C.J., Bari, A., Syed, U.-F., Torres, M., Marrantino, J., Kitto, M.E., Menia, T. and Fielman, E. (2014) Application of Low-Background Gamma-Ray Spectrometry to Monitor Radioactivity in the Environment and Food. Applied Radiation and Isotopes, 90, 251-257.
https://doi.org/10.1016/j.apradiso.2014.04.011
Appendix
A.1. Glossary
CL: Confidence Level
CSD: Chi-Square Distribution
CST: Chi-Square Test
DL: Detection Limit for radionuclides
EPA: U.S. Environmental Protection Agency
GA: Gross Alpha Radioactivity
GB: Gross Beta Radioactivity
L: Liter
LT: Left Tail
MB: Method Blank
mBq: milli-Becquerel
MCL: Maximum Contaminant Level
MCLG: Maximum Contaminant Level Goal
Mgf: Moment generating function
mL: milli-Liter
mrem: milli-rem
NIST: National Institute of Standards and Technology
pCi: pico-Curie
Pdf: Probability density function
RT: Right Tail
SDWA: Safe Drinking Water Act
STEM: Science, Technology, Engineering and Mathematics
y: year
μSv: micro-Sievert
2T: Two Tail
A.2. Variables
$a$, $b$: parameters of the gamma distribution
$B$: beta function
$E$: expectation value
$E_i$: expected frequency
$f(s)$: analytic function
gamma: gamma distribution
$i$, $k$: indices
$m$: number of categories
$n$: number of observations
$O_i$: observed frequency
$p$: number of parameters for model distribution
$s$: complex variable
$s_0$: pole
$t$: $t$-test variable
Var: variance
$x_i$: normal random variable
$\bar{x}$: sample mean
$u$, $z$: substituted variables
$\Gamma$: gamma function
$\mu$, $\mu_i$: expected variable: population, individual
$\nu$: number of degrees of freedom
$\sigma$, $\sigma_i$, $\sigma_x$: standard deviation, individual, sample
$\phi_i^2$: individual chi-square
$\chi^2$, $\chi_i^2$, $\chi_n^2$, $\chi_\nu^2$: chi-square, for $i$, $n$ observations, $\nu$ degrees of freedom