QUALITY ENGINEERING, 2017, VOL. 29, NO. 3, 431-445
https://doi.org/10.1080/08982112.2017.1314493
Multi-state reliability demonstration tests
Suiyao Chen^a, Lu Lu^b, and Mingyang Li^a
^a Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, Florida; ^b Department of Mathematics and Statistics, University of South Florida, Tampa, Florida

KEYWORDS
binomial demonstration test; consumer's risk; minimum sample size; Monte Carlo integration
ABSTRACT
Reliability demonstration tests have important applications in reliability assurance activities to demonstrate product quality over time and safeguard companies' market positions and competitiveness. With greatly increasing global market competition, conventional binomial reliability demonstration tests based on binary test outcomes (success or failure) at a single time point become insufficient for meeting consumers' diverse requirements. This article proposes multi-state reliability demonstration tests (MSRDTs) for demonstrating reliability at multiple time periods or involving multiple failure modes. The design strategy for MSRDTs employs a Bayesian approach to allow incorporation of prior knowledge, which has the potential to reduce the test sample size. Simultaneous demonstration of multiple objectives can be achieved, and critical requirements specified to avoid early/critical failures can be explicitly demonstrated to ensure high customer satisfaction. Two case studies are explored to demonstrate the proposed test plans for different objectives.
Introduction
Reliability of a product is the probability that the product can perform its required function at a given time point. As a time-dependent characteristic, reliability is an important measure of product quality and safety over time, which has a great impact on the satisfaction of customers and can influence their purchase decisions, which in turn are linked with the revenue of manufacturers. In order to succeed in market competition, manufacturers need to produce products with high reliability over their expected lifetime. Reliability demonstration tests (RDTs) are often conducted by manufacturers to demonstrate the capability of their products to meet the requirements from customers for achieving good quality and performance over time. Given budget and time constraints, manufacturers need to determine the number of test units, the time duration of the test, and the maximum number of failures allowed to pass the test. These choices are usually made to ensure that the consumer's risk (CR) of having a product that has passed the test but fails to meet the reliability requirement is controlled. Controlling the CR at an acceptable level can take the burden off the customers of bearing a high risk of receiving inferior products which are claimed to
have met the requirements on reliability, and hence can
help improve customers’ satisfaction.
Dierent categories of RDTs have been studied in
the literature based on dierent types of reliability
data, such as failure counts data (Guo et al., 2011;Li
et al., 2016;Luetal.,2016), failure time data (Guo
et al., 2012;McKaneetal.,2005) and degradation data
(Yang, 2009). Failure counts data report the number of
failures that occur during a xed test period. The RDTs
based on failure counts data (Wasserman, 2002,pp.
208–210) are also called binomial RDTs (BRDTs) since
failure counts are modeled with binomial distributions.
In a BRDT, within a given testing period, if the num-
beroffailuresdoesnotexceedthemaximumnum-
ber of allowable failures, the test is passed. The maxi-
mum number of allowable failures cand the minimum
number of test units nare determined to ensure a cer-
tain minimum acceptable reliability requirement, R,is
met with the controlled CR at or below βby the end
of the test duration. The BRDTs are broadly applied
in reliability engineering practices because (i) they
require less monitoring eorts in the middle of the test
duration; and (ii) they are simple and straightforward
to be implemented and analyzed. However, with the
increasing needs from customers, the BRDTs are no longer able to meet all requirements in many applications. For example, customers may have varied requirements on reliability performance over different time periods. It is common that many customers have little tolerance of early failures and hence require high reliability during the early lifetime and lower reliability for later times. In this case, a BRDT for demonstrating reliability within a single time period is inadequate to meet all requirements.
Consider a scenario where two companies run BRDTs with the same testing period of 5 years and use the maximum number of allowable failures c = 5. Products from company I had 1 failure in the first two years and 3 failures in the last three years. Products from company II had 3 failures in the first two years and 1 failure in the last three years. Even though the products from both companies can pass the demonstration tests, their underlying reliability performance indicated by the failure counts data can be different. If a customer needs products with high reliability in early lifetime (corresponding to allowing no more than 2 failures during the first two years), the risk of the products from company II failing to meet the requirement can be much higher than that of the products from company I. A typical BRDT with a five-year testing period cannot demonstrate the performance over the early two years, and hence raises the CR of accepting an inferior product that fails to meet all requirements.
Another limitation of the BRDTs is that they are often used for pass/fail testing of a product without distinguishing the causes and consequences of different failure modes. A product with a complex system is often composed of multiple key components which may have different failure modes associated with varied consequences. Their failures can have different negative effects on the functionality of the entire product. For instance, the failure of the central processing unit (CPU) of a computer is much more crucial than the failure of a video card. Customers may also have different expectations for different components according to their values or costs of replacement. The cost of replacing a CPU or a motherboard is much higher than replacing a keyboard or a mouse. As a result, customers can have much higher expectations on the reliability of the more valuable and critical parts than on the reliability of other parts or accessories. A typical BRDT cannot demonstrate separate reliability requirements for multiple failure modes.
To meet the ever-increasing demands of customers, more versatile RDTs with more tailored plans for testing multiple reliability requirements can better serve the customers with enriched information on product reliability. This article proposes RDT strategies for two categories of reliability demonstration tests, over multiple time periods and for multiple failure modes, both of which are referred to as multi-state RDTs (MSRDTs) throughout the rest of the article. Alternative test plans within each category are also explored and compared with the conventional BRDTs for demonstrating multiple reliability requirements. Bayesian analysis is used for quantifying the CR associated with various test plans. The Bayesian method offers more flexibility in incorporating prior information on product reliability from either subject matter expertise or historical data (Pintar et al., 2012; Weaver et al., 2008; Wilson et al., 2016). The impacts of different test strategies and different prior elicitations on the minimum test sample size (i.e., the number of test units required) will be studied to provide more insights for guiding decisions on demonstration test plans. If there exist historical data which support higher reliabilities compared to the requirements, then using the Bayesian method to incorporate prior information has the potential to reduce the minimum test sample size required for the MSRDTs.
The remainder of the article is organized as follows. In the next section, the conventional BRDT plans are reviewed with discussions of their benefits and limitations. Then the new MSRDTs for demonstrating reliability requirements over multiple time periods are proposed. Two different design strategies are proposed and compared under different prior elicitation settings. In the following section, another category of new MSRDT designs for demonstrating reliability requirements involving multiple failure modes is proposed, and their performances are evaluated and compared with the conventional BRDTs. Case studies on the two categories of MSRDTs for multiple time periods and multiple failure modes are provided to illustrate the proposed test plans and demonstrate their performances. Conclusions and discussions are provided in the end.
Binomial RDTs
For many single-use or "one-shot" product units, the test procedure can be destructive. In this case, binomial RDTs (BRDTs) are the common choice to obtain
the failure counts data at the end of a predetermined test period (Kececioglu et al., 2002, pp. 759-768). Let π denote the probability of failure over the test period, and R denote the minimum acceptable reliability at the end of the test duration. In Bayesian analysis, for a chosen number of test units, n, and a maximum number of allowable failures, c, the CR is measured by the posterior probability of the product failing to meet the reliability requirement given that the product has passed the test, which can be calculated as

CR_{binomial} = P(\text{failure probability fails to meet the reliability requirement} \mid \text{test is passed})
            = P(\pi > 1 - R \mid y \le c)
            = 1 - P(\pi \le 1 - R \mid y \le c)
            = 1 - \frac{\int_0^{1-R} \left[ \sum_{y=0}^{c} \binom{n}{y} \pi^{y} (1-\pi)^{n-y} \right] p(\pi)\, d\pi}{\int_0^{1} \left[ \sum_{y=0}^{c} \binom{n}{y} \pi^{y} (1-\pi)^{n-y} \right] p(\pi)\, d\pi}.   [1]
Note that in Eq. [1], p(π) denotes the prior distribution of π, which can be specified based on subject matter expert knowledge or historical data, and y denotes the number of failures observed in the test period. Let β denote the maximum acceptable value for the CR; then a BRDT is determined by choosing the (n, c) combination such that the corresponding CR_{binomial} ≤ β. According to Lu et al. (2016), for any fixed choice of c, CR_{binomial} decreases as the test sample size n increases. We use n_b to denote the minimum test sample size that is required to control the CR within the acceptable range, CR_{binomial} ≤ β.
In Bayesian analysis, the CR_{binomial} in Eq. [1] can be calculated using Monte Carlo integration (Robert et al., 2004, pp. 71-131), where samples of π with a large size M = 15000 are generated from the specified prior distribution p(π), and CR_{binomial} is calculated approximately by

CR_{binomial} \approx 1 - \frac{\sum_{j=1}^{M} \sum_{y=0}^{c} \binom{n}{y} (\pi^{(j)})^{y} (1-\pi^{(j)})^{n-y}\, I(\pi^{(j)} \le 1 - R)}{\sum_{j=1}^{M} \sum_{y=0}^{c} \binom{n}{y} (\pi^{(j)})^{y} (1-\pi^{(j)})^{n-y}},   [2]

where π^{(j)} is the jth generated sample of the failure probability from the specified prior distribution.
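For readers who want to reproduce this calculation, the following Python sketch illustrates the Monte Carlo estimator in Eq. [2]; it is not the authors' code, and the function names, the random seed, and the search cap n_max are our own choices. It approximates CR_{binomial} under a Beta prior and then scans upward for the smallest n that controls the CR, which is how n_b is interpreted in Table 1.

```python
import numpy as np
from scipy.stats import beta, binom

def cr_binomial(n, c, R, prior_a, prior_b, M=15000, seed=0):
    """Monte Carlo estimate of the consumer's risk in Eq. [2] for a BRDT plan (n, c)."""
    rng = np.random.default_rng(seed)
    pi = beta.rvs(prior_a, prior_b, size=M, random_state=rng)  # samples from the prior p(pi)
    accept = binom.cdf(c, n, pi)                               # P(y <= c | pi): probability of passing the test
    meets = pi <= 1.0 - R                                      # indicator that the reliability requirement is met
    return 1.0 - np.sum(accept * meets) / np.sum(accept)

def min_sample_size(c, R, prior_a, prior_b, beta_max=0.05, n_max=500):
    """Smallest n with CR <= beta_max; for a fixed c the CR decreases as n grows."""
    for n in range(c + 1, n_max + 1):
        if cr_binomial(n, c, R, prior_a, prior_b) <= beta_max:
            return n
    return None

# Non-informative Beta(1,1) prior with R = 0.8 and c = 0 should reproduce n_b = 13,
# up to Monte Carlo noise, as reported in the text.
print(min_sample_size(c=0, R=0.8, prior_a=1, prior_b=1))
```

Because the acceptance probability P(y ≤ c | π) is available in closed form, the only Monte Carlo approximation is over the prior draws of π, exactly as in Eq. [2].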
Table 1 shows an example of BRDT plans with different choices of the prior distribution of π. The mean and standard deviation (i.e., the square root of the variance) are provided to give some intuition about the center and spread of the prior distributions.
Tab le . Minimum sample sizes required by BRDTs with different
choices on cand prior distributions of π.
πBeta (1,1)(2,18)(4,16)(10,15)(10,10)
Mean(π) . . . . .
SD(π) . . . . .
cn
b

    





Settings: M=15000,R=0.8=0.05
For example, π ~ Beta(1,1) is centered at 0.5 but has a large standard deviation of 0.2893, while π ~ Beta(2,18) has a mean failure probability of 0.1 but a much smaller standard deviation (0.0647) around its mean. The minimum acceptable reliability from the consumer's requirement was set at R = 0.8 and the maximum tolerable CR is chosen to be β = 0.05. When no historical data or prior information is available, a non-informative prior π ~ Beta(1,1) can be used. For any assumed prior distribution of π, manufacturers can choose a test plan determined by (n_b, c), using the minimum sample size n_b for any chosen maximum number of allowable failures c. For instance, when c = 0 and a non-informative prior π ~ Beta(1,1) is assumed, the minimum sample size which ensures that the CR calculated in Eq. [2] is no more than β = 0.05 is n_b = 13. Hence, at least 13 units need to be tested if the test can only be passed when no failure is observed. However, as a larger maximum number of allowable failures is set for passing the test, the CR increases because it becomes easier to pass the test for a given sample size n. Hence, to control the CR at or below β = 0.05, more units need to be tested as more failures are allowed to pass the test.

When more informative priors are available from historical data or expertise, they can affect the selection of test plans. Table 1 has explored the impacts of different prior distributions p(π) on the selected test plan for different tolerances on the maximum number of allowable failures, c. Figure 1 shows the five prior distributions explored in Table 1.
[Figure 1. Density curves of the different prior distributions explored in Table 1.]

The flat density curve corresponds to the non-informative prior Beta(1,1), which assumes that all possible values of π in (0,1) have equal probability. The other prior distributions, from Beta(10,10) to Beta(2,18), become more informative with reduced spread (corresponding to smaller standard deviations in Table 1) and provide stronger support for smaller failure probabilities π. For any given c, the minimum sample size required can be reduced if the prior distribution from historical data supports the reliability requirement. For example, when a prior distribution π ~ Beta(2,18) is used, which supports high reliability around 1 - 2/(2+18) = 0.9 > R = 0.8, fewer units need to be tested to demonstrate the reliability requirement (e.g., 4 < 13 when c = 0). However, if the specified prior distribution is not in favor of the reliability requirement, as illustrated with the prior distributions Beta(4,16), Beta(10,15), and Beta(10,10), which favor incrementally lower reliability, more units are required to be tested to demonstrate the same reliability requirement.
On the other hand, Table 2 demonstrates the impact of different requirements on reliability. For a given choice of prior distribution, as R decreases, corresponding to a reduced requirement on reliability, the minimum sample size n_b decreases for a fixed choice of c. This matches our intuition that fewer units need to be tested for demonstrating a lower requirement on reliability.
The BRDTs are useful for demonstrating reliability requirements for binary tests. For example, a test plan (n_b = 81, c = 5) for a predetermined test period of 5 years can demonstrate no less than 0.9 reliability in 5 years with the CR controlled at 0.05.
Tab le . Minimum sample sizes required by BRDTs with different
choices on cand reliability requirements.
nb
cR=0.9R=0.8R=0.6
 
 
 
  
  

  
Settings: M=15000=0.05
πBeta(1,1)
oers no capability of demonstrating reliability at any
time before the end of the test period. For example,
if the customers are particularly concerned about
the reliability in the rst two years in addition to the
reliability by the end of the ve years, the conventional
BRDTs are unable to demonstrate all requirements
over multiple time periods. In addition, BRDTs are
unable to dierentiate and demonstrate reliability
requirements involving multiple failure modes asso-
ciated with dierent consequences. In the next two
sections, two categories of new MSRDTs are pro-
posedtodemonstratereliabilityrequirementsover
multiple time periods and for multiple failure modes,
respectively. Alternative designs are also proposed and
their performances are evaluated and compared under
dierent prior elicitations.
MSRDTs over multiple time periods
Conventional BRDTs often demonstrate the product reliability within a single time period, such as during the mission time or the service life, to meet the customers' requirements. However, customers' satisfaction in different time periods may differ. For instance, upon the purchase of products, customers may expect higher reliability during the early lifetime. The occurrence of early failures may have a stronger negative impact on customers' satisfaction and a company's reputation than failures occurring in the later stage of the service period. To explicitly demonstrate different product reliability requirements over multiple time periods rather than a single time period, the strategies of MSRDTs, i.e., multi-state RDTs, are proposed in this section to meet customers' expectations on reliability over multiple time periods.
Consider a nite testing period with the start
time at t0and the end time at tK. The testing time
duration (t0,tK] is exclusively partitioned into K
[Figure 2. Illustration of the multiple time periods in a K-period MSRDT between (t_0, t_K].]
Let π_i and y_i denote the probability of failure and the number of observed failures within the ith time period (t_{i-1}, t_i], respectively. Then the number of units that survive the entire test duration (right-censored at the end of the test duration t_K) can be expressed as n - \sum_{i=1}^{K} y_i, where n is the total number of test units. The probability of surviving the test is given by \pi_{K+1} = 1 - \sum_{i=1}^{K} \pi_i. The objective of an MSRDT over multiple time periods is to simultaneously demonstrate the product reliability at multiple time points satisfying a set of lower reliability requirements, R_i, i = 1, ..., K, with the assurance level controlled at (1 - β). Here, R_i is the minimum acceptable reliability over the first i cumulative time periods, (t_0, t_i], β is the maximum acceptable consumer's risk, and the assurance level can be interpreted as the minimum probability that the reliability requirements are all met given that the test is passed (Hamada et al., 2008, pp. 343-347). Two different scenarios of acceptance criteria are proposed as follows.
Scenario I. The MSRDT is passed if the cumulative number of observed failures \sum_{k=1}^{i} y_k over each cumulative time period (t_0, t_i] is no more than its corresponding cumulative maximum number of allowable failures \sum_{k=1}^{i} c_k, for all cumulative time periods (t_0, t_i], i = 1, ..., K. For example, consider a two-period MSRDT with tests conducted at the end of the second and fifth year. For 100 test units, the MSRDT is passed if the number of observed failures in the first two years does not exceed 1 and the number of observed failures by the end of the fifth year does not exceed 5.
Scenario II. The MSRDT is passed if the number of observed failures y_i in each non-overlapping time period (t_{i-1}, t_i] is no greater than its corresponding maximum number of allowable failures c_i, for all time periods (t_{i-1}, t_i], i = 1, ..., K. For the same two-period test, the MSRDT is passed if the number of observed failures in the first two years does not exceed 1 and the number of observed failures in the next three years does not exceed 4. The major difference between the two scenarios is that Scenario II plans the tests for non-overlapping time periods while Scenario I considers the cumulative time periods instead.
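The two acceptance criteria differ only in how the observed failure counts are compared with the allowable failures. A minimal Python sketch of the pass/fail logic (the function names and example counts are our own, chosen to match the two-period example above) is:

```python
from itertools import accumulate

def passes_scenario_1(y, c):
    """Scenario I: cumulative failures may not exceed the cumulative allowances at any period."""
    return all(cum_y <= cum_c for cum_y, cum_c in zip(accumulate(y), accumulate(c)))

def passes_scenario_2(y, c):
    """Scenario II: per-period failures may not exceed the per-period allowances."""
    return all(y_i <= c_i for y_i, c_i in zip(y, c))

# Two-period example with allowances (c1, c2) = (1, 4): an outcome of 0 failures in
# period 1 and 5 failures in period 2 passes under Scenario I (0 <= 1 and 0 + 5 <= 5)
# but fails under Scenario II (5 > 4).
print(passes_scenario_1([0, 5], [1, 4]), passes_scenario_2([0, 5], [1, 4]))
```

This makes concrete the point, revisited later in the discussion of Figure 5 and in the Appendix, that any outcome passing Scenario II also passes Scenario I, but not vice versa.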
For each acceptance criterion, the design of an MSRDT over multiple time periods aims to determine (i) the minimum sample size, denoted by n_I and n_II for Scenarios I and II, respectively, and (ii) the cumulative maximum numbers of allowable failures at times t_i, \sum_{k=1}^{i} c_k, for Scenario I, or the maximum numbers of allowable failures within the ith time period, c_i, i = 1, ..., K, for Scenario II. For either scenario, the MSRDT is selected by choosing the test plans which control the CR at or below β. Note that the proposed MSRDT strategies are suitable for demonstration tests that generate failure counts data (Li et al., 2016; Guo et al., 2011) over multiple time periods, and do not make any assumptions on the failure time distribution. The advantages of the proposed methods are to fulfill the reliability requirements of customers over different testing periods (e.g., either the cumulative time periods from Scenario I or the non-overlapping periods from Scenario II) simultaneously and to provide different testing strategies that require different minimum test sample sizes based on different maximum numbers of allowable failures. Assuming a certain failure time distribution over multiple time periods or for multiple failure modes may limit the use of the proposed strategies, because the lifetime distribution assumption has to be valid for the whole test period and only the expected number of failures can be obtained, which is not commensurate with the objectives of the proposed strategies as mentioned above. Alternative RDT designs such as Weibull testing, which is more suitable for failure time data, are out of the scope of this article but are of interest for future work.
To illustrate the proposed MSRDTs over multiple time periods and further investigate the difference between the two scenarios of acceptance criteria, MSRDTs over two time periods (i.e., K = 2) are considered without loss of generality. Let R_1 and R_2 denote the minimum acceptable reliabilities over the time periods (t_0, t_1] and (t_0, t_2], with R_2 < R_1. The probabilities of failure for the cumulative time periods meet the requirements if \pi_1 \le 1 - R_1 and \pi_1 + \pi_2 \le 1 - R_2. For the acceptance criterion in Scenario I, the MSRDT design determines {n_I, c_1, c_1 + c_2}, and the probability of accepting the test for any given (\pi_1, \pi_2),
denoted by H_I(n, c_1, c_2), can be explicitly written as

H_I(n, c_1, c_2) = \sum_{y_1=0}^{c_1} \sum_{y_2=0}^{c_1+c_2-y_1} \frac{n!}{y_1!\, y_2!\, (n-y_1-y_2)!}\, \pi_1^{y_1} \pi_2^{y_2} (1-\pi_1-\pi_2)^{n-y_1-y_2},

and the corresponding CR_I is controlled at or below β by

CR_I = 1 - \frac{\int_0^{1-R_1} \int_0^{1-R_2-\pi_1} H_I(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1}{\int_0^{1} \int_0^{1-\pi_1} H_I(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1} \le \beta,   [3]

where p(\pi_1, \pi_2) denotes the joint prior distribution of (\pi_1, \pi_2, 1-\pi_1-\pi_2).
For the acceptance criterion in Scenario II, the MSRDT plan can be determined by specifying {n_II, c_1, c_2}, and the probability of accepting the test for any combination of (\pi_1, \pi_2), denoted by H_{II}(n, c_1, c_2), is given by

H_{II}(n, c_1, c_2) = \sum_{y_1=0}^{c_1} \sum_{y_2=0}^{c_2} \frac{n!}{y_1!\, y_2!\, (n-y_1-y_2)!}\, \pi_1^{y_1} \pi_2^{y_2} (1-\pi_1-\pi_2)^{n-y_1-y_2},

and the corresponding CR_II is controlled by

CR_{II} = 1 - \frac{\int_0^{1-R_1} \int_0^{1-R_2-\pi_1} H_{II}(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1}{\int_0^{1} \int_0^{1-\pi_1} H_{II}(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1} \le \beta.   [4]
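As with Eq. [2], the integrals in Eqs. [3] and [4] can be approximated by Monte Carlo sampling from a Dirichlet prior on (π_1, π_2, 1 - π_1 - π_2). The Python sketch below is our own illustrative implementation (not the authors' code; the function names, seed, and the example plan are assumptions) and evaluates the consumer's risk of a candidate two-period plan under either acceptance criterion.

```python
import numpy as np
from math import comb

def acceptance_prob(n, c1, c2, p1, p2, cumulative=True):
    """H_I (cumulative=True) or H_II (cumulative=False) for one (pi1, pi2) pair."""
    total = 0.0
    for y1 in range(c1 + 1):
        y2_max = c1 + c2 - y1 if cumulative else c2
        for y2 in range(y2_max + 1):
            coef = comb(n, y1) * comb(n - y1, y2)  # n! / (y1! y2! (n - y1 - y2)!)
            total += coef * p1**y1 * p2**y2 * (1.0 - p1 - p2)**(n - y1 - y2)
    return total

def cr_two_period(n, c1, c2, R1, R2, alpha=(1, 1, 1), cumulative=True, M=15000, seed=0):
    """Monte Carlo estimate of CR_I (cumulative=True) or CR_II (cumulative=False)."""
    rng = np.random.default_rng(seed)
    pis = rng.dirichlet(alpha, size=M)  # columns: pi1, pi2, 1 - pi1 - pi2
    accept = np.array([acceptance_prob(n, c1, c2, p1, p2, cumulative) for p1, p2, _ in pis])
    meets = (pis[:, 0] <= 1 - R1) & (pis[:, 0] + pis[:, 1] <= 1 - R2)
    return 1.0 - np.sum(accept * meets) / np.sum(accept)

# One candidate plan from the case study below (n = 20, c1 = 0, c1 + c2 = 5);
# per the discussion of Table 3, the estimate should land at or just below beta = 0.05.
print(cr_two_period(n=20, c1=0, c2=5, R1=0.8, R2=0.6, cumulative=True))
```

Searching over n for each (c_1, c_2) combination, exactly as in the binomial case, then yields the minimum sample sizes reported in Table 3.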
A case study is shown below to illustrate the proposed MSRDT strategies for a two-period test. The reliability requirements are set as R_1 = 0.8 and R_2 = 0.6 over the time periods (t_0, t_1] and (t_0, t_2], with t_2 < 2t_1, which indicates a longer time interval for (t_0, t_1] than for (t_1, t_2]. A higher reliability requirement R_1 is desired for the early cumulative time period (t_0, t_1] because the customers are averse to early failures. The CR is controlled at β = 0.05, indicating that the probability of accepting the test when the actual reliability requirements are not met is controlled at or below 0.05. To evaluate the complex integration in either Eq. [3] or Eq. [4], Monte Carlo sampling is performed with a sample size of M = 15,000 to maintain the evaluation accuracy. The Dirichlet distribution, denoted by Dirichlet(\alpha_1, \alpha_2, \alpha_3), is used as the prior distribution for (\pi_1, \pi_2, 1 - \pi_1 - \pi_2).
Tab le . Comparison between Scenarios I and II and BRDT, with
non-informative prior.
Scenario I Scenario II BRDT
c1c1+c2nIc1c2nII cn
b
  
   
   
   
   

    


   

   
    



  
  

Settings: p1
2)Dirichlet(1,1,1)
R1=0.8,R2=0.6,M=15000=0.05
Here, \alpha_1, \alpha_2, and \alpha_3 are hyper-parameters to be elicited based on the prior knowledge. The Dirichlet distribution is a family of continuous multivariate probability distributions parametrized by a vector of positive hyper-parameters \alpha_i, i = 1, ..., K, for K categories of outcomes. The advantage of using the Dirichlet distribution is twofold. First, it is the conjugate prior for the multinomial distribution, and hence allows an easy update of knowledge as new data are observed, because the posterior distribution of the failure probabilities also follows a Dirichlet distribution. Second, the hyper-parameters of the Dirichlet distribution have intuitive practical implications, as they are directly connected with the prior failure probabilities of the outcome categories in the form of \alpha_i / \sum_{i} \alpha_i. A few different settings of hyper-parameters will be explored later to investigate the impact of prior knowledge on the performance of the proposed test plans.
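As a quick worked illustration of this interpretation (using the Dirichlet(3, 3, 24) prior that appears later in Tables 4 and 5), the prior means of the failure probabilities are

E[\pi_1] = \frac{3}{3 + 3 + 24} = 0.1, \qquad E[\pi_1 + \pi_2] = \frac{3 + 3}{30} = 0.2,

so the implied prior mean reliabilities are 0.9 for the first cumulative period and 0.8 for the full test duration, both above the case-study requirements R_1 = 0.8 and R_2 = 0.6; this is the sense in which that prior is said below to support higher reliabilities than the requirements.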
When no prior information is available, a non-informative prior distribution, given by (\pi_1, \pi_2, 1-\pi_1-\pi_2) ~ Dirichlet(1,1,1), can be used to indicate the lack of prior knowledge. The selected test plans under the acceptance criteria of the two scenarios, with different choices of the maximum numbers of allowable failures, are illustrated in Table 3. The test plans are grouped based on the total number of failures allowed during the entire test duration.

[Figure 3. Comparison between Scenarios I and II based on the minimum sample size as c_2 increases for some fixed c_1 values.]

Several features are observed. First of all, under both Scenarios I and II, given a fixed choice of c_2, the minimum sample size n_I or n_II increases as c_1 increases. Similarly, given a fixed c_1, n_I and n_II also increase with c_2. For a given fixed number of test units, allowing more failures (i.e., increasing c) makes it easier to pass the test and thus increases the CR. Hence, more units must be tested to control the CR at a predetermined maximum acceptable level. The patterns of minimum sample sizes can be observed more clearly in Figures 3 and 4.
Figure 3 shows the change in the minimum sample size as c_2 increases for a few selected c_1 values under both scenarios. Solid lines are used for Scenario I and dashed lines for Scenario II. Different symbols are used for displaying different c_1 values. For a fixed c_1 value, the minimum sample sizes under both Scenarios I and II increase as c_2 increases. For example, when c_1 = 0, the two scenarios are essentially the same in terms of the acceptance criteria.
Hence, the same minimum sample size is required for both scenarios, which is shown with the solid line with the triangles and increases as c_2 increases.

[Figure 4. Comparison between Scenarios I and II based on the minimum sample size as c_1 increases for some fixed c_2 values.]
When c_1 > 0, the minimum sample size still generally increases as c_2 increases. However, the trend is slightly different between the two scenarios. The n_I increases monotonically with c_2, while n_II starts off with similar sample sizes for small c_2 values up to a certain point and then starts to increase more quickly as c_2 increases. For example, when c_1 = 4, the minimum sample size for Scenario II (shown with a dotted line with the open circles) is relatively flat for c_2 ≤ 4 and then increases for c_2 > 4. This is because, under Scenario II, the maximum numbers of allowable failures for the two non-overlapping periods determine their corresponding minimum required test units, which then jointly determine the overall minimum sample size for the entire test. Therefore, the overall sample size can be dominated by the maximum number of allowable failures for one of the test periods if one of the c_i is considerably larger compared to its failure probability under the reliability requirements to be demonstrated. Thus, when c_2 is small, c_1 plays a dominating role in determining the overall sample size for the entire test, which corresponds to the flat portion of the minimum sample size curve for c_1 = 4. However, as c_2 becomes larger than c_1, the overall minimum sample size is dominated by the requirement from period 2 and hence resumes an increasing pattern as c_2 increases. Comparing the two scenarios, it appears that n_I is usually larger than n_II for small c_2 values, but becomes smaller than n_II when c_2 becomes larger than a certain value. This is because, for the same required c_i values, the test plans in Scenario I generally allow a larger maximum number of allowable failures for period 2 (when the maximum number of allowable failures is not reached during period 1) and hence require testing more units when c_2 has a dominating impact on the overall minimum sample size.
Figure 4 shows how the minimum sample size changes with c_1 for fixed c_2 values under both scenarios. Generally, for any fixed c_2, the minimum sample size increases as c_1 increases under Scenario I. Also, a larger c_2 value requires more units to be tested, and the differences in n_I among different c_2 values are similar across different c_1 values, which is evidenced by the almost parallel lines observed for Scenario I in Figure 4. However, for Scenario II, even though n_II increases monotonically with c_1, there are diminishing differences in n_II at different c_2 values as c_1 increases. This is because under Scenario II, increasing c_1 will affect n_II by increasing the minimum sample size
needed to demonstrate the reliability requirement in period 1, and hence has a dominating effect on the size of n_II (which is equivalent to a diminishing impact of the difference in c_2 values). Under Scenario I, by contrast, increasing c_1 will result in increases in the minimum sample sizes needed for demonstrating both reliability requirements at the end of the two non-overlapping time periods, and hence has a consistent impact on the overall minimum sample size n_I.

[Figure 5. Comparison between Scenarios I and II based on the minimum sample size for fixed c_1 + c_2 values.]
It is also interesting to compare the two scenarios given the same total maximum number of allowable failures c_1 + c_2 over the entire test duration. Figure 5 compares the minimum sample sizes for both scenarios given a fixed total maximum number of allowable failures c_1 + c_2. Two cases, with c_1 + c_2 = 15 and c_1 + c_2 = 20, are investigated, which are shown in Figure 5 with the solid and dotted lines, respectively. The bottom and top axes display all combinations of c_1 and c_2 values. A few patterns can be observed. First, both n_I and n_II increase as c_1 + c_2 increases. This matches the pattern for the conventional BRDTs in that it generally requires testing more units to ensure the same assurance level if a more relaxed criterion is used for passing the test by allowing more failures to be observed during the entire test duration. Second, increasing c_1 (and at the same time reducing c_2) will consistently increase n_I, whereas n_II first decreases and then increases once c_2 drops below a certain value. Third, in terms of the relative performance of the two strategies, Scenario II is associated with a smaller overall minimum sample size for large c_1 and small c_2 values. As c_2 increases to about the same size as c_1, Scenario I starts to have a smaller minimum sample size, and the difference becomes larger as c_1 continues to increase. This is evidenced by the crossing pattern between the monotonically increasing line with the squares for Scenario I and the U-shaped curve with the open circles for Scenario II. Brief analytical explanations can also be found in the Appendix to improve the understanding of the observed differences between the two scenarios.
Under the same maximum number of allowable failures c_1 + c_2 for the entire test duration, Scenario II is expected to have more strict requirements (y_1 ≤ c_1, y_2 ≤ c_2) than Scenario I (y_1 ≤ c_1, y_1 + y_2 ≤ c_1 + c_2), meaning that any test that passes in Scenario II will also pass in Scenario I. Intuitively, Scenario II would be preferred if minimizing the CR were the only criterion of interest, but it generally requires a larger minimum sample size. However, a smaller test sample is also generally preferred in an RDT plan from the manufacturer's point of view. Hence, the tests with the minimum sample size after controlling the CR are generally preferred. As illustrated in Figure 5, the two test scenarios may have varied performance in the required minimum sample size for different settings, and Scenario II does not consistently outperform Scenario I based on the minimum sample size. It is also noticed in Table 3 that the difference between the two scenarios when c_1 is small becomes smaller for small c_1 + c_2 values, and is almost negligible for c_1 + c_2 ≤ 6. On the other hand, Scenario I can be preferred for relatively large c_1 values when c_1 + c_2 is large or when only a small c_1 + c_2 is allowed. It is also noticed that tests using more strict passing conditions are generally associated with smaller probabilities of passing the test (i.e., low acceptance probability) and often higher probabilities for manufacturers to reject products that actually meet the reliability requirements (Lu et al., 2016). Hence, a decision on the selection of scenarios should be tailored to a particular application to meet the objectives of a specific demonstration test.
In addition, Table 3 also shows the comparison of the MSRDT strategies over two time periods with the conventional BRDTs when the non-informative prior is used. The last two columns in Table 3 give the maximum number of allowable failures and the minimum sample size for demonstrating the reliability requirement at the end of the test duration (i.e., the end of period 2). For any given total maximum number of
allowable failures over the entire test duration, c = c_1 + c_2, the conventional BRDTs require fewer units to be tested for demonstrating only a single reliability requirement at the end of the test. The MSRDTs, on the other hand, gain the capability of demonstrating multiple reliability requirements at different time points at the expense of testing a few more units. However, as c = c_1 + c_2 increases, fewer extra units are required for demonstrating more reliability requirements at multiple time points. For example, for c = 5, the conventional BRDT requires 18 units to be tested to demonstrate a reliability of 0.6 at the end of the two-year period. To demonstrate an additional higher reliability of 0.8 at the end of the first year, both MSRDT strategies require testing at least 20 units with no failure allowed during the first year. More units need to be tested if more failures are allowed to be observed during the first year.
It is well known that incorporating different prior information may have large impacts on the results of a Bayesian analysis. Next, we explore the impact of different prior elicitations on the selected MSRDT plans under both scenarios. Tables 4 and 5 summarize the required minimum sample sizes for the MSRDT plans over two test periods with different choices of prior distributions for Scenarios I and II, respectively. Seven different prior distributions, including the non-informative prior Dirichlet(1,1,1), are explored. The patterns are rather consistent across Tables 4 and 5. Under both scenarios, when the prior distribution supports higher reliabilities than the minimum requirements, such as Dirichlet(3,3,24) shown in the fourth column of both tables, the minimum sample size can be substantially reduced for any given combination of c_1 and c_2 values compared to using the non-informative prior (shown in the third column of both tables). On the other hand, if the prior distribution supports reliabilities at or below the requirements, more units need to be tested to demonstrate the requirements than when using the non-informative prior. This can be observed in Figures 6 and 7, which show the minimum sample size for fixed c_1 + c_2 under Scenarios I and II, respectively. In both figures, the solid lines with triangles represent the sample sizes for different (c_1, c_2) combinations using the non-informative prior. The dashed lines with squares show the sample sizes for the prior distribution Dirichlet(3,3,24), which supports higher reliabilities than the requirements; these are consistently below the line of the non-informative prior. All other prior distributions support reliabilities at or below the requirements, and hence all require more units to be tested, with the corresponding lines located above the line of the non-informative prior. The farther the specified prior distribution is from the reliability requirements, the more test units are needed in the MSRDTs over multiple time periods.
Tab le . Minimum sample sizes required by the two-period MSRDT using the acceptance criterion in Scenario I for different prior
distributions.
Dirichlet (1,1,1)(3,3,24)(6,6,18)(12,3,15)(3,12,15)(6,12,12)(12,6,12)
c1c1+c2nI
 
 
 
 
 
 
 
 
 
 
        
        
 
 
 
 
        
        
      
Settings: M=15000,R1=0.8,R2=0.6=0.05
Tab le . Minimum sample sizes required by the two-period MSRDT using the acceptance criterion in Scenario II for different prior
distributions.
Dirichlet (1,1,1)(3,3,24 )(6,6,18)(12,3,15)(3,12,15)(6,12,12)(12,6,12)
c1c2nII
 
 
 
 
 
 
 
 
 
 
        
        
 
 
 
 
 
        
        
Settings: M=15000,R1=0.8,R2=0.6=0.05
One special case is the dashed line with open circles observed in Figure 7 for the prior distribution Dirichlet(3,12,15), which is consistently below the non-informative line, indicating that smaller minimum sample sizes are required for all (c_1, c_2) combinations. Since the prior distribution regarding period 1 supports higher reliabilities than the requirement, while the prior distribution regarding period 2 supports reliabilities below the requirement, the effects of sample size reduction from period 1 and sample size increase from period 2 jointly determine the overall minimum sample size, and hence lead to a slightly different pattern than what has been observed for the other prior distributions.
[Figure 6. Minimum sample sizes required in Scenario I with fixed c_1 + c_2 = 6 for different prior distributions.]

[Figure 7. Minimum sample sizes required in Scenario II with fixed c_1 + c_2 = 6 for different prior distributions.]
MSRDTs for multiple failure modes
In the previous section, the MSRDT strategies consider each time period as an individual state for demonstrating a specific reliability requirement within that time period. This section proposes a different category of MSRDTs which considers different failure modes as individual states, which are often associated with different consequences of failure and different costs of replacement. The conventional BRDTs report dichotomous outcomes (i.e., success and failure) for each test unit,
in which case different failure modes of the product are not differentiated and the severity levels of the different consequences associated with different failure modes are overlooked. In real applications, a product often has multiple failure modes with varied levels of severity, which can lead to different impacts on customers' dissatisfaction.
For instance, the failure of a CPU or a hard drive of a computer system is much more critical than the failures of some accessory parts such as a keyboard or a microphone, since the former can lead to a complete breakdown of the system, a loss of valuable information, and/or a high repair/replacement cost, while the latter usually only results in system under-performance and a low repair/replacement cost. Consequently, the failures of critical or valuable parts will lead to stronger dissatisfaction of customers, and hence result in higher expectations on reliability for these components. It is desirable to develop test strategies that allow demonstrating separate reliability requirements for multiple failure modes.
A product with J independent failure modes is considered. Each test unit will either have failed in mode j, j = 1, ..., J, or remain working by the end of the testing period. Let \pi_j and y_j denote the probability of failure and the number of observed failures in failure mode j within the test period (or an equivalent mission time period), respectively. Then \pi_{J+1} = 1 - \sum_{j=1}^{J} \pi_j and n - \sum_{j=1}^{J} y_j denote the probability of success and the number of surviving units by the end of the test. The MSRDTs for multiple failure modes aim to demonstrate, at an assurance level of (1 - β), that the product reliability will meet multiple minimum reliability requirements for each of the different failure modes, denoted by R_j, j = 1, ..., J. Here, β is the CR of having a product that has passed the demonstration test but fails to meet all reliability requirements for the different failure modes. Note that all failure modes are defined on the same test period. For any specified reliability requirements R_j and the maximum acceptable CR controlled at or below β, the MSRDTs for multiple failure modes are designed to determine the minimum sample size n_m as well as the maximum number of allowable failures c_j for the jth failure mode, j = 1, ..., J.
Without loss of generality, consider two failure modes (J = 2) to illustrate the proposed MSRDT strategy. Let R_1 and R_2 denote the minimum acceptable reliabilities for failure modes 1 and 2, respectively. The test is passed if the number of observed failures y_j is less than or equal to the maximum number of allowable failures c_j for both failure modes, and the test plan is to determine the choice of {n_m, c_1, c_2}. For independent failure modes, the acceptance probability H_m(n, c_1, c_2) for given (\pi_1, \pi_2) values can be written as
H_m(n, c_1, c_2) = \left[ \sum_{y_1=0}^{c_1} \frac{n!}{y_1!\,(n-y_1)!}\, \pi_1^{y_1} (1-\pi_1)^{n-y_1} \right] \times \left[ \sum_{y_2=0}^{c_2} \frac{n!}{y_2!\,(n-y_2)!}\, \pi_2^{y_2} (1-\pi_2)^{n-y_2} \right],

and the corresponding CR, denoted by CR_m, is calculated by

CR_m = 1 - \frac{\int_0^{1-R_1} \int_0^{1-R_2} H_m(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1}{\int_0^{1} \int_0^{1} H_m(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1},   [5]

where p(\pi_1, \pi_2) is the joint prior distribution of (\pi_1, \pi_2). For independent failure modes, p(\pi_1, \pi_2) = p(\pi_1)\, p(\pi_2). The minimum sample size is
determined by controlling the CR_m obtained in Eq. [5] to be at or below β. Simulation case studies are conducted to explore different reliability requirements, maximum numbers of allowable failures for the different failure modes, as well as different prior elicitations and their impacts on the required minimum sample size for MSRDTs with two failure modes. The results are summarized in Tables 6 and 7 for two cases with identical or different reliability requirements for the two failure modes. In Table 6, identical minimum reliability requirements are assumed for the two failure modes, where R_1 = R_2 = 0.8 indicates that the customers have the same expectation on reliability for both failure modes. Table 7 assumes different reliability requirements with R_1 = 0.8 and R_2 = 0.6. Here, failure mode 1 is considered more critical and/or to have more severe consequences associated with its failure, and hence is required to have a higher reliability. The CR_m is still controlled at β = 0.05 and the sample size for Monte Carlo sampling is chosen as M = 15000 to maintain the simulation accuracy. Beta distributions are used for specifying the prior distributions of both \pi_1 and \pi_2 for the two failure modes.
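Because the two failure modes are independent, the acceptance probability in Eq. [5] factors into two binomial terms, so the Monte Carlo evaluation only requires independent Beta draws for π_1 and π_2. The sketch below is our own illustrative Python implementation, not the authors' code; the function names, defaults, and the upward scan for the minimum sample size are assumptions.

```python
import numpy as np
from scipy.stats import beta, binom

def cr_failure_modes(n, c1, c2, R1, R2, prior1=(1, 1), prior2=(1, 1), M=15000, seed=0):
    """Monte Carlo estimate of CR_m in Eq. [5] for two independent failure modes."""
    rng = np.random.default_rng(seed)
    p1 = beta.rvs(*prior1, size=M, random_state=rng)       # prior draws for mode 1
    p2 = beta.rvs(*prior2, size=M, random_state=rng)       # prior draws for mode 2
    accept = binom.cdf(c1, n, p1) * binom.cdf(c2, n, p2)   # H_m(n, c1, c2)
    meets = (p1 <= 1 - R1) & (p2 <= 1 - R2)                # both requirements satisfied
    return 1.0 - np.sum(accept * meets) / np.sum(accept)

def min_sample_size_modes(c1, c2, R1, R2, beta_max=0.05, **kwargs):
    """Smallest n with CR_m <= beta_max, assuming the CR again decreases as n grows."""
    for n in range(max(c1, c2) + 1, 500):
        if cr_failure_modes(n, c1, c2, R1, R2, **kwargs) <= beta_max:
            return n
    return None

# Equal requirements R1 = R2 = 0.8 with non-informative Beta(1,1) priors (the Table 6 setting).
print(min_sample_size_modes(c1=0, c2=0, R1=0.8, R2=0.8))
```

Repeating this scan over (c_1, c_2) combinations and prior choices is how tables such as Tables 6 and 7 can be reproduced.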
When the two failure modes have the same reliability requirement, R_1 = R_2 = 0.8, Table 6 summarizes the minimum sample size for different choices of the maximum numbers of allowable failures and different prior settings.
Tab le . Multiple failure modes with the same reliability require-
ments for different prior distributions.
π1(1,1)(2,18)(4,16)(10,15 )(2,18)(2,18)(4,16)
Beta π2(1,1)(2,18)(4,16)(10,15 )(4,16)(10,15)(10,15 )
c1c2nm
  
     
    
        
      
       
        
      
        
        
        
        
        
      
        
       
        
      
        
Settings: M=15000,R1=0.8,R2=0.8=0.05
When no prior information is available, a non-informative Beta(1,1) prior distribution is assigned to both \pi_1 and \pi_2. Similar patterns can be observed as for the MSRDTs over multiple time periods. When c_1 is fixed, the minimum sample size n_m increases as c_2 increases; when c_2 is fixed, n_m increases with c_1. This is intuitive, as having more allowable failures makes it easier to pass the test and thus increases the CR.
Tab le . Multiple failure modes with different reliability require-
ments for different prior distributions.
π1(1,1)(2,18)(10,10)(4,16)(2,18)(10,10)
Beta π2(1,1)(2,18)(10,10)(10,15)(10,10)(2,18)
c1c2nm
      
      
      
  
       
       
  
   
      
       
       
      
  
  
       
       
       
  
       
Settings: M=15000,R1=0.8,R2=0.6=0.05
To control the CR at a reasonable level, a larger number of units need to be tested when more failures are allowed to be observed during the test. When c_1 + c_2 is fixed, the minimum sample size n_m exhibits a symmetric pattern under the non-informative prior setting, due to the identical reliability requirements for both failure modes. For example, when c_1 + c_2 = 6, the minimum sample sizes for (c_1 = 0, c_2 = 6) and (c_1 = 6, c_2 = 0) are identical. In addition, when c_1 and c_2 are more similar in size (e.g., c_1 = 2, c_2 = 4 compared to c_1 = 0, c_2 = 6), a smaller minimum sample size is required to maintain the same assurance level for demonstrating the requirements on both failure modes. This makes sense because, when the maximum number of allowable failures is considerably larger for one failure mode given the same reliability requirement, more units must be tested to demonstrate the requirement for that failure mode, which then inflates the overall minimum sample size needed in the MSRDT for demonstrating the reliability requirements for both failure modes.
Dierent prior elicitations also have large impacts
on the selected test plan, as shown in Ta b l e 6 .When
prior knowledge supports higher reliabilities than the
requirements to be demonstrated, fewer units need to
be tested and vice versa. For instance, the prior dis-
tributions of π1Beta(2,18)and π2Beta(2,18)
indicate that there is a strong belief of lower fail-
ure probabilities than the requirements within the test
period for both failure modes. Thus, the corresponding
minimum sample size is smaller than what is needed
for using the non-informative prior. On the other hand,
when the prior distributions of π1Beta(10,15)and
π2Beta(10,15)are used, which indicates a moder-
ately strong belief in larger failure probabilities than
the requirements for both failure modes, more units
need to be tested to demonstrate the higher reliabil-
ity requirements compared to what is needed when no
prior information is available.
When c_1 + c_2 is fixed, the required minimum sample size is also sensitive to the specified prior distribution. Figure 8 illustrates the change in n_m for different (c_1, c_2) combinations given fixed c_1 + c_2 = 6. When the non-informative priors are assumed, the curve for n_m (the solid line with the triangles) shows a symmetric pattern, with the minimum sample size achieved at c_1 = c_2 = 3. When informative priors indicating lower failure probabilities than the requirements for both failure modes (such as \pi_1 ~ Beta(2,18), \pi_2 ~
Beta(2,18), corresponding to the dashed line with the open circles) are assumed, the minimum sample size curve is below the non-informative curve. As the prior belief indicates a higher failure probability for at least one of the failure modes (such as \pi_1 ~ Beta(2,18), \pi_2 ~ Beta(10,15), corresponding to the dotted line with the solid circles, or \pi_1 ~ Beta(10,15), \pi_2 ~ Beta(10,15), corresponding to the dash-dotted line with the open circles), the corresponding minimum sample size curve shifts upwards on at least one side of the tails or on both sides.

[Figure 8. Multiple failure modes with the same reliability requirements for fixed c_1 + c_2 and different prior distributions.]
Table 7 shows the test plans when different reliability requirements are used for the two failure modes, with R_1 = 0.8 and R_2 = 0.6. When the non-informative priors are used, the symmetric pattern is no longer observed, due to the different requirements on reliability for the two failure modes. In particular, n_m is larger when c_1 is large, since more units need to be tested to demonstrate the higher reliability requirement for failure mode 1 while allowing more failures to be observed during the test period. Also, for the same c_1 and c_2 settings, the minimum sample size for demonstrating R_1 = 0.8 and R_2 = 0.6 is smaller than what is required for demonstrating R_1 = R_2 = 0.8, since fewer units suffice to demonstrate the lower reliability requirement for failure mode 2. When more informative priors are used, similar patterns are observed from both Table 7 and Figure 9. A potential sample size reduction can be achieved when the prior knowledge supports higher reliability than what is required to be demonstrated by the MSRDT.
[Figure 9. Multiple failure modes with different reliability requirements for fixed c_1 + c_2 and different prior distributions.]
Conclusions
Conventional binomial RDTs, which focus on demonstrating a single reliability requirement within a single test period, have limited use when multiple reliability requirements need to be met. This article proposes two types of RDTs for demonstrating reliabilities over multiple time periods and for multiple failure modes. These RDTs with multiple reliability requirements are all referred to as multi-state RDTs (MSRDTs).

In the MSRDTs over multiple time periods, every time period of interest is treated as a state, and the joint distribution of failure counts over the non-overlapping time periods can be modeled by a multinomial distribution. Two different test strategies are proposed for demonstrating multiple requirements over different time periods. One strategy uses the cumulative failure counts at the end of each cumulative time period as the criteria for passing the test, while the other uses separate failure counts over non-overlapping time intervals as the criteria for passing the test. Simulation studies were conducted to compare the two strategies by considering two-period MSRDTs. It was found that the strategy based on cumulative failure counts (Scenario I) is generally preferred for cases that allow fewer total failure counts over all time periods or when a larger maximum number of allowable failures is allowed for the early cumulative time period. The strategy using separate failure counts (Scenario II) is preferred, in the sense of requiring a smaller minimum sample size, only when a smaller maximum number of allowable failures is allowed for the early separate time period.
In the MSRDTs for multiple failure modes, each failure mode is treated as a state, and all reliability requirements for the multiple failure modes, which may be associated with different consequences of varied severity and/or costs of repair/replacement, can be simultaneously demonstrated. The required minimum sample size is usually determined mainly by the failure mode that has the highest reliability requirement and/or the least stringent criterion for passing the test (i.e., allowing a larger maximum number of allowable failures for a particular failure mode).
The impacts of incorporating different prior distributions are also explored for both categories of MSRDTs. The patterns are consistent regardless of which test strategy is considered. When the prior knowledge supports higher reliabilities than the requirements to be demonstrated, fewer units can be tested compared to using the non-informative priors for demonstrating the same reliability requirements. However, if the historical data support lower reliabilities than what are required to be demonstrated, then more units need to be tested to override the effects of the prior distribution for demonstrating higher reliabilities than what has been indicated by the existing data. For future work, we expect to develop thorough mathematical justifications with theoretical formulations and derivations to validate the discussed patterns using both non-informative and informative prior distributions.
About the authors
Suiyao Chen is a Ph.D. student in the Department of Industrial and Management Systems Engineering at the University of South Florida. He received his B.S. degree (2014) in Economics from Huazhong University of Science and Technology and his M.A. degree (2016) in Statistics from Columbia University. His research focus is on statistical reliability data analysis, demonstration test design, and data analytics.

Lu Lu is an Assistant Professor of Statistics in the Department of Mathematics and Statistics at the University of South Florida in Tampa. She was a postdoctoral research associate in the Statistical Sciences Group at Los Alamos National Laboratory. She earned a doctorate in statistics from Iowa State University in Ames, IA. Her research interests include reliability analysis, design of experiments, response surface methodology, survey sampling, and multiple objective/response optimization.

Mingyang Li is an Assistant Professor in the Department of Industrial & Management Systems Engineering at the University of South Florida. He received his Ph.D. in systems & industrial engineering and his M.S. in statistics from the University of Arizona in 2015 and 2013, respectively. He also received his M.S. in mechanical & industrial engineering from the University of Iowa in 2010 and his B.S. in control science & engineering from Huazhong University of Science and Technology in 2008. His research interests include reliability and quality assurance, Bayesian data analytics, and system informatics. Dr. Li is a member of INFORMS, IISE, and ASQ.
Funding
This work was supported in part by the National Science Foundation under Grant BCS-1638301 and in part by the University of South Florida Research & Innovation Internal Awards Program under Grant No. 0114783.
References
Guo, H., T. Jin, and A. Mettas. 2011. Designing reliability demonstration tests for one-shot systems under zero component failures. IEEE Transactions on Reliability 60 (1):286-294.
Guo, H., and H. Liao. 2012. Methods of reliability demonstration testing and their relationships. IEEE Transactions on Reliability 61 (1):231-237.
Hamada, M. S., A. Wilson, C. S. Reese, and H. Martz. 2008. Bayesian reliability. Springer Science & Business Media.
Kececioglu, D. 2002. Reliability and life testing handbook. Vol. 2. DEStech Publications, Inc.
Li, M., W. Zhang, Q. Hu, H. Guo, and J. Liu. 2016. Design and risk evaluation of reliability demonstration test for hierarchical systems with multilevel information aggregation. IEEE Transactions on Reliability 66 (1):135-147.
Lu, L., M. Li, and C. M. Anderson-Cook. 2016. Multiple objective optimization in reliability demonstration tests. Journal of Quality Technology 48 (4):303-326.
McKane, S. W., L. A. Escobar, and W. Q. Meeker. 2005. Sample size and number of failure requirements for demonstration tests with log-location-scale distributions and failure censoring. Technometrics 47 (2):182-190.
Pintar, A., L. Lu, C. M. Anderson-Cook, and G. L. Silver. 2012. Bayesian estimation of reliability for batches of high reliability single-use parts. Quality Engineering 24 (4):473-485.
Robert, C., and G. Casella. 2004. Monte Carlo statistical methods. 2nd ed. Springer Science & Business Media.
Wasserman, G. 2002. Reliability verification, testing, and analysis in engineering design. CRC Press.
Weaver, B. P., and M. S. Hamada. 2008. A Bayesian approach to the analysis of industrial experiments: An illustration with binomial count data. Quality Engineering 20 (3):269-280.
Wilson, A. G., and K. M. Fronczyk. 2016. Bayesian reliability: Combining information. Quality Engineering 29 (1):119-129.
Yang, G. 2009. Reliability demonstration through degradation bogey testing. IEEE Transactions on Reliability 58 (4):604-610.
Appendix
To analytically show the difference between Scenarios I and II in the proposed MSRDTs over multiple time periods when $c_1 + c_2$ is fixed, let $\Delta H(n, c_1, c_2) = H_{\mathrm{I}}(n, c_1, c_2) - H_{\mathrm{II}}(n, c_1, c_2)$, which can be explicitly written as
\[
\Delta H(n, c_1, c_2) = \sum_{y_1 = 0}^{c_1} \; \sum_{y_2 = c_2 + 1}^{c_1 - y_1 + c_2} \frac{n!}{y_1!\, y_2!\, (n - y_1 - y_2)!} \, \pi_1^{y_1} \pi_2^{y_2} (1 - \pi_1 - \pi_2)^{n - y_1 - y_2}.
\]
When $c_1 = 0$, $\Delta H(n, c_1, c_2) = 0$ and both scenarios become equivalent, as shown in Tables 3–5. When $c_1 > 0$, $\Delta H(n, c_1, c_2) > 0$, which indicates that the probability of accepting the test plan under Scenario II is always smaller than the probability calculated under Scenario I. However, this finding does not imply that, for a fixed $n$, one scenario will always give a consistently higher/lower CR than the other. To justify this, let
\[
A = \int_0^{1 - R_1} \!\! \int_0^{1 - R_2 - \pi_1} H_{\mathrm{II}}(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1
\quad \text{and} \quad
B = \int_0^{1} \!\! \int_0^{1} H_{\mathrm{II}}(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1.
\]
Then $CR_{\mathrm{II}}$ and $CR_{\mathrm{I}}$ can be written as
\[
CR_{\mathrm{II}} = 1 - \frac{A}{B}, \qquad
CR_{\mathrm{I}} = 1 - \frac{A + \Delta A}{B + \Delta B},
\]
where $\Delta A = \int_0^{1 - R_1} \int_0^{1 - R_2 - \pi_1} \Delta H(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1$ and $\Delta B = \int_0^{1} \int_0^{1} \Delta H(n, c_1, c_2)\, p(\pi_1, \pi_2)\, d\pi_2\, d\pi_1$. Then $CR_{\mathrm{II}} - CR_{\mathrm{I}}$ is given by
\[
CR_{\mathrm{II}} - CR_{\mathrm{I}} = \frac{B\, \Delta A - A\, \Delta B}{B\, (B + \Delta B)}.
\]
Although $B > A$, as $n$, $c_1$, and $c_2$ vary, $B\, \Delta A$ can be either larger or smaller than $A\, \Delta B$. Thus, for a fixed sample size $n$, neither $CR_{\mathrm{II}} > CR_{\mathrm{I}}$ nor $CR_{\mathrm{II}} < CR_{\mathrm{I}}$ will hold consistently. This also explains the results in Figure 5 and Tables 4 and 5: when controlling CR, one scenario cannot give a consistently larger/smaller minimum sample size than the other.
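As a numerical illustration of these quantities (a sketch under stated assumptions, not the authors' implementation), the code below evaluates the acceptance probability under each scenario from the trinomial model and approximates $CR_{\mathrm{I}}$ and $CR_{\mathrm{II}}$ by Monte Carlo integration, assuming a uniform Dirichlet(1, 1, 1) prior for $(\pi_1, \pi_2)$; the values of $n$, $c_1$, $c_2$, $R_1$, and $R_2$ are arbitrary choices for demonstration.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)

def accept_prob(n, c1, c2, pi1, pi2, scenario):
    """P(pass | pi1, pi2) for the trinomial outcome (y1, y2, n - y1 - y2).
    Scenario I (cumulative counts): y1 <= c1 and y1 + y2 <= c1 + c2.
    Scenario II (separate counts):  y1 <= c1 and y2 <= c2."""
    prob = 0.0
    for y1 in range(c1 + 1):
        y2_cap = c1 + c2 - y1 if scenario == "I" else c2
        for y2 in range(min(y2_cap, n - y1) + 1):
            prob += (comb(n, y1) * comb(n - y1, y2)
                     * pi1 ** y1 * pi2 ** y2
                     * (1 - pi1 - pi2) ** (n - y1 - y2))
    return prob

def consumer_risk(n, c1, c2, R1, R2, scenario, m=20000):
    """Posterior probability that the requirements are violated
    (pi1 > 1 - R1 or pi1 + pi2 > 1 - R2) given that the test is passed,
    estimated by Monte Carlo with a Dirichlet(1, 1, 1) prior on (pi1, pi2)."""
    pis = rng.dirichlet([1.0, 1.0, 1.0], size=m)[:, :2]   # uniform over the simplex
    h = np.array([accept_prob(n, c1, c2, p1, p2, scenario) for p1, p2 in pis])
    meets = (pis[:, 0] <= 1 - R1) & (pis[:, 0] + pis[:, 1] <= 1 - R2)
    return 1.0 - h[meets].sum() / h.sum()                  # CR = 1 - A / B

if __name__ == "__main__":
    n, c1, c2, R1, R2 = 40, 1, 1, 0.95, 0.90   # illustrative settings
    for s in ("I", "II"):
        print(f"Scenario {s}: CR ~ {consumer_risk(n, c1, c2, R1, R2, s):.4f}")
```

Varying $n$, $c_1$, and $c_2$ in this sketch provides a direct way to explore the non-dominance behavior noted above.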