
SN Computer Science (2022) 3: 374

https://doi.org/10.1007/s42979-022-01257-z


ORIGINAL RESEARCH

Fuzzy Confidence Intervals by the Likelihood Ratio: Testing Equality of Means—Application on Swiss SILC Data

Rédina Berkachy¹ · Laurent Donzé¹

Received: 20 July 2021 / Accepted: 20 June 2022 / Published online: 15 July 2022

© The Author(s) 2022

Abstract

We propose a practical procedure for constructing fuzzy confidence intervals by the likelihood method, where the observations and the hypotheses are considered to be fuzzy. We use the bootstrap technique to estimate the distribution of the likelihood ratio. The chosen bootstrap algorithm consists in randomly drawing observations while preserving the location and dispersion measures of the original fuzzy data set. A metric $d_{SGD}^{\theta^\star}$ based on the well-known signed distance measure is considered in this case. We present a simulation study to investigate the influence of the fuzziness of the computed maximum likelihood estimator on the constructed confidence intervals. Based on these intervals, we introduce a hypothesis test for the equality of means of two groups with its corresponding decision rule. The highlight of this paper is the application of the defended approach to the Swiss SILC Surveys. We empirically investigate the influence of the fuzziness vs. the randomness of the data as well as of the maximum likelihood estimator on the confidence intervals. In addition, we perform an empirical analysis where we compare the mean of the group “Swiss nationality” to the group “Other nationalities” for the variables Satisfaction of health situation and Satisfaction of financial situation.

Keywords Bootstrap technique · Likelihood ratio · Fuzzy confidence interval · Fuzzy statistics · Fuzzy hypotheses · Equality of means · Fuzzy data · Statistical inference · Fuzzy analysis of variance (FANOVA)

Introduction and Motivation

A typical hypothesis testing procedure can be accomplished by, for example, constructing confidence intervals for a particular parameter. This method is widely used in practice. However, once we consider the data and/or the hypotheses to be fuzzy, the corresponding statistical methods have to be updated. Some approaches already exist in the theory of fuzzy sets. For instance, Kruse and Meyer [17] presented a theoretical definition of fuzzy confidence intervals. Several researchers have afterwards proposed refined definitions of fuzzy confidence intervals. For instance, Viertl and Yeganeh [22] proposed a definition of the so-called confidence regions. Their main application was in the Bayesian context. Kahraman et al. [16] described some approaches to the construction of fuzzy confidence intervals, as well as the concept of hesitant fuzzy confidence intervals. Couso and Sanchez [9] provided an approach that considers the inner and outer approximations of confidence intervals in the context of fuzzy observations. Unfortunately, these various approaches are limited because they were all conceived to test a specific parameter with a pre-defined distribution. It would therefore be advantageous to develop a unified general approach to fuzzy confidence intervals.

In classical statistics, the likelihood ratio method is considered an alternative tool for the construction of confidence intervals. In the fuzzy environment, this method using uncertain data has multiple advantages.

Gil and Casals [13] used the likelihood ratio in a hypothesis testing procedure where fuzziness is contained in the data. In Berkachy and Donzé [5], we proposed a practical procedure to construct confidence intervals by the likelihood ratio method, which can be seen as general in some sense. The procedure can be easily adapted to specific cases. However, the distribution of the likelihood ratio is a priori unknown and has to be estimated or derived from strong assumptions. Under classical assumptions, we note that this ratio is known to be $\chi^2$-distributed with degrees of freedom corresponding to the number of constraints applied to the parameters. We propose to use the bootstrap technique extended to the fuzzy environment to estimate the distribution of the likelihood ratio. A main contribution of Berkachy and Donzé [7] is to provide two algorithms to constitute the bootstrapped samples, mainly using the location and dispersion characteristics calculated from a new version of the signed distance measure, written as the $d_{SGD}^{\theta^\star}$ metric and detailed in Berkachy [1]. We highlight that the Expectation-Maximization (EM) algorithm based on the fuzziness of data described by Denoeux [10] is used to calculate the maximum likelihood estimators (ML-estimators).

This article is part of the topical collection “Computational Intelligence” guest edited by Kurosh Madani, Kevin Warwick, Juan Julian Merelo, Thomas Bäck and Anna Kononova.

* Rédina Berkachy
  Redina.Berkachy@UniFR.CH

  Laurent Donzé
  Laurent.Donze@UniFR.CH

¹ Applied Statistics and Modelling, Department of Informatics, University of Fribourg, Boulevard de Pérolles 90, 1700 Fribourg, Switzerland

Content courtesy of Springer Nature, terms of use apply. Rights reserved.

The defended procedure is considered efficient and computationally light because we do not have to consider every single value of the support set of the involved fuzzy numbers, as in the traditional fuzzy method. Indeed, four conveniently chosen values are used in the construction process. The presented calculations are done using the R package FuzzySTs shown in [8]. We propose to use our fuzzy confidence interval to test the equality of means. We present a procedure and give the corresponding decision rule. An application on Swiss SILC data, described in [21], gives us the opportunity to apply and test our methods. Considerations on sensitivity and robustness are also shown.

The paper is organised as follows. We open the paper in “Definitions” with fundamental definitions of fuzziness. In “The signed distance”, we present the definition of the signed distance measure, followed by the definition of the $d_{SGD}^{\theta^\star}$ metric in “The $d_{SGD}^{\theta^\star}$ Metric”. “Traditional fuzzy confidence intervals” is devoted to the construction of the traditional fuzzy confidence intervals. In “Fuzzy confidence intervals by the likelihood method”, we discuss our concept of fuzzy confidence intervals constructed using the likelihood method and detail the bootstrap algorithms to approximate the distribution of the likelihood ratio. In addition, a simulation study illustrates the proposed algorithms. We end the paper with “Application on SILC 2017” by the application on the Swiss SILC data.

Definitions

Let us first present the basic definitions and concepts of fuzziness.

Definition 1 (Fuzzy set) If $A$ is a collection of objects denoted generically by $x$, then a fuzzy set or class $\tilde{X}$ in $A$ is a set of ordered pairs:

$$\tilde{X} = \{ (x, \mu_{\tilde{X}}(x)) \mid x \in A \}, \tag{1}$$

where the mapping $\mu_{\tilde{X}}$ representing the “grade of membership” is a crisp real-valued function such that

$$\mu_{\tilde{X}} : \mathbb{R} \to [0;1], \qquad x \mapsto \mu_{\tilde{X}}(x)$$

is called the membership function.

It is useful to show the support and the kernel of a given fuzzy set. They are given as follows:

Definition 2 (Support and kernel of a fuzzy set)
The support and the kernel of a fuzzy set $\tilde{X}$, denoted respectively by supp $\tilde{X}$ and core $\tilde{X}$, are given by:

$$\operatorname{supp} \tilde{X} = \{ x \in \mathbb{R} \mid \mu_{\tilde{X}}(x) > 0 \}, \tag{2}$$

$$\operatorname{core} \tilde{X} = \{ x \in \mathbb{R} \mid \mu_{\tilde{X}}(x) = 1 \}. \tag{3}$$

In other terms, the support of a fuzzy set $\tilde{X}$ is a crisp set containing all the elements whose membership degree is not zero. In the same manner, the core of the fuzzy set $\tilde{X}$ is a crisp set containing all elements with degree of membership equal to one.

We often characterize a given fuzzy set by a collection of crisp sets called the $\alpha$-level sets. They are given in the following manner:

Definition 3 ($\alpha$-level set or $\alpha$-cut)
An $\alpha$-level set $\tilde{X}_\alpha$ of the fuzzy set $\tilde{X}$ is the (crisp) set of elements such that:

$$\tilde{X}_\alpha = \{ x \in A \mid \mu_{\tilde{X}}(x) \geq \alpha \}. \tag{4}$$

The $\alpha$-level set is a closed, bounded and non-empty interval denoted generally by $[\tilde{X}_\alpha^L ; \tilde{X}_\alpha^R]$ where, for all $\alpha \in [0;1]$, $\tilde{X}_\alpha^L$ and $\tilde{X}_\alpha^R$ are the left and right hand sides of $\tilde{X}_\alpha$, called respectively the left and right $\alpha$-cuts, such that:

$$\tilde{X}_\alpha^L = \inf \{ x \in \mathbb{R} \mid \mu_{\tilde{X}}(x) \geq \alpha \} \quad \text{and} \quad \tilde{X}_\alpha^R = \sup \{ x \in \mathbb{R} \mid \mu_{\tilde{X}}(x) \geq \alpha \}. \tag{5}$$

Furthermore, a fuzzy number $\tilde{X}$, also called Left–Right (L–R) fuzzy number, can be represented by the family set of its $\alpha$-cuts $\{ \tilde{X}_\alpha \mid \alpha \in [0;1] \}$. This set is a union of finite compact and bounded intervals $[\tilde{X}_\alpha^L(\alpha) ; \tilde{X}_\alpha^R(\alpha)]$ such that, for all $\alpha \in [0;1]$,

$$\tilde{X} = \bigcup_{0 \leq \alpha \leq 1} [\tilde{X}_\alpha^L(\alpha) ; \tilde{X}_\alpha^R(\alpha)], \tag{6}$$

where $\tilde{X}_\alpha^L(\alpha)$ and $\tilde{X}_\alpha^R(\alpha)$ are the functions of the left and right hand sides of $\tilde{X}$.


Remark 1 For the sake of simplicity, common shapes of L–R fuzzy numbers are often used in practice. We particularly mention triangular fuzzy numbers, denoted by a triplet $\tilde{X} = (p, q, r)$ with $p, q$ and $r \in \mathbb{R}$, and trapezoidal fuzzy numbers, denoted by a quadruple $\tilde{X} = (p, q, r, s)$ with $p, q, r$ and $s \in \mathbb{R}$.
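As a small illustration of Definition 3 applied to the shapes of Remark 1, the following Python sketch computes the left and right $\alpha$-cuts of trapezoidal and triangular fuzzy numbers. It assumes the usual piecewise-linear sides with $p \leq q \leq r \leq s$; the function names are ours and are not part of any package cited in this paper.

```python
from typing import Tuple


def trapezoid_alpha_cut(p: float, q: float, r: float, s: float,
                        alpha: float) -> Tuple[float, float]:
    """Left and right alpha-cuts of the trapezoidal fuzzy number (p, q, r, s).

    Assumes linear sides, i.e. the membership rises linearly on [p, q],
    equals one on [q, r] and falls linearly on [r, s].
    """
    if not (p <= q <= r <= s):
        raise ValueError("expected p <= q <= r <= s")
    if not (0.0 <= alpha <= 1.0):
        raise ValueError("alpha must lie in [0, 1]")
    left = p + alpha * (q - p)    # runs from p (alpha = 0) to q (alpha = 1)
    right = s - alpha * (s - r)   # runs from s (alpha = 0) to r (alpha = 1)
    return left, right


def triangle_alpha_cut(p: float, q: float, r: float,
                       alpha: float) -> Tuple[float, float]:
    """A triangular number (p, q, r) is the trapezoid (p, q, q, r)."""
    return trapezoid_alpha_cut(p, q, q, r, alpha)
```

At $\alpha = 0$ the cut returns the bounds of the support, and at $\alpha = 1$ the bounds of the core, in line with Eqs. 2 and 3.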

The Signed Distance

The signed distance measure was first used in the context of ranking fuzzy numbers by Yao and Wu [23]. It has also served in other contexts: Berkachy and Donzé [3] used it in the assessment of linguistic questionnaires, and Berkachy and Donzé [6] used it in hypothesis testing. Although this measure is considered to be simple in terms of computations, it has interested specialists because of its directionality: it can be positive or negative, indicating the direction between two particular fuzzy numbers. In addition, Dubois and Prade [11] described it as the expected value of a given fuzzy number. This measure is briefly written as follows:

Definition 4 (Signed distance between two fuzzy sets)
Let $\tilde{X}$ and $\tilde{Y}$ be two sets of the class of fuzzy sets $\mathbb{F}(\mathbb{R})$. Their respective $\alpha$-cuts are written as $\tilde{X}_\alpha$ and $\tilde{Y}_\alpha$ such that their left and right $\alpha$-cuts, denoted respectively by $\tilde{X}_\alpha^L$, $\tilde{X}_\alpha^R$, $\tilde{Y}_\alpha^L$ and $\tilde{Y}_\alpha^R$, are integrable for all $\alpha \in [0;1]$. The signed distance $d_{SGD}$ between $\tilde{X}$ and $\tilde{Y}$ is the mapping

$$d_{SGD} : \mathbb{F}(\mathbb{R}) \times \mathbb{F}(\mathbb{R}) \to \mathbb{R}, \qquad \tilde{X} \times \tilde{Y} \mapsto d_{SGD}(\tilde{X}, \tilde{Y}),$$

such that

$$d_{SGD}(\tilde{X}, \tilde{Y}) = \frac{1}{2} \int_0^1 \left[ \tilde{X}_\alpha^L(\alpha) + \tilde{X}_\alpha^R(\alpha) - \tilde{Y}_\alpha^L(\alpha) - \tilde{Y}_\alpha^R(\alpha) \right] d\alpha. \tag{7}$$

We are often interested in the signed distance of a particular fuzzy number measured from the fuzzy origin $\tilde{0}$ as follows:

Definition 5 (Signed distance of a fuzzy set)
The signed distance of the fuzzy set $\tilde{X}$ measured from the fuzzy origin $\tilde{0}$ is given by:

$$d_{SGD}(\tilde{X}, \tilde{0}) = \frac{1}{2} \int_0^1 \left[ \tilde{X}_\alpha^L(\alpha) + \tilde{X}_\alpha^R(\alpha) \right] d\alpha. \tag{8}$$
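Equations 7 and 8 can be evaluated numerically once the left and right $\alpha$-cut functions are available. The sketch below is only an illustration under the assumption of trapezoidal fuzzy numbers with linear sides; it uses a midpoint rule, and the helper names are hypothetical. For a trapezoid $(p, q, r, s)$, integrating Eq. 8 by hand gives $d_{SGD}(\tilde{X}, \tilde{0}) = (p + q + r + s)/4$, which the code can be checked against.

```python
def trapezoid(p, q, r, s):
    """Left and right alpha-cut functions of the trapezoid (p, q, r, s)."""
    return (lambda a: p + a * (q - p)), (lambda a: s - a * (s - r))


def signed_distance_from_origin(left, right, n=1000):
    """Eq. 8: one half of the integral over [0, 1] of X^L(a) + X^R(a).

    Midpoint rule with n sub-intervals (exact for linear sides).
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        a = (k + 0.5) * h
        total += left(a) + right(a)
    return 0.5 * total * h


def signed_distance(x, y, n=1000):
    """Eq. 7, using its linearity: d(X, Y) = d(X, 0) - d(Y, 0)."""
    return (signed_distance_from_origin(*x, n=n)
            - signed_distance_from_origin(*y, n=n))


x = trapezoid(1.0, 2.0, 3.0, 5.0)
y = trapezoid(0.0, 1.0, 1.0, 2.0)   # a triangular number written as a trapezoid
d_xy = signed_distance(x, y)        # positive: X lies to the right of Y
d_yx = signed_distance(y, x)        # same magnitude, opposite sign
```

The opposite signs of `d_xy` and `d_yx` illustrate the directionality discussed above.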

The $d_{SGD}^{\theta^\star}$ Metric

Although the signed distance $d_{SGD}$ is seen as advantageous in terms of simplicity and accessibility, it also presents important drawbacks, as detailed in Berkachy [2]. The major ones are given as follows:

1. Mainly because of its directionality, this distance cannot be defined as a full metric. It lacks topological characteristics, such as separability and symmetry.
2. It coincides with a central location measure. Thus, this distance depends strongly on its extreme values. In other words, neither the inner points between the extreme values nor the shape of the fuzzy numbers could affect this measure.

For these reasons, we propose a new $L_2$ metric denoted by $d_{SGD}^{\theta^\star}$. It is seen as a generalisation of the signed distance $d_{SGD}$. This new metric depends on a weight parameter called $\theta^\star$. Using $d_{SGD}^{\theta^\star}$, we take into account the deviation in the shapes and its possible irregularities on the one hand, and the central location measure on the other hand. This measure satisfies the necessary and sufficient conditions to constitute a metric of fuzzy quantities, as proved in Berkachy [2]. Let us first define the so-called deviations of the shape of a given fuzzy number, written in terms of the distance $d_{SGD}$:

Definition 6 (Left and right deviations [2])
Consider $\tilde{X}$ to be a fuzzy number with its $\alpha$-level set $\tilde{X}_\alpha = [\tilde{X}_\alpha^L, \tilde{X}_\alpha^R]$, $\tilde{X} \in \mathbb{F}(\mathbb{R})$. The left and right deviations of the shape of $\tilde{X}$, denoted by $dev^L \tilde{X}$ and $dev^R \tilde{X}$, can be written as:

$$dev^L \tilde{X}(\alpha) = d_{SGD}(\tilde{X}, \tilde{0}) - \tilde{X}_\alpha^L, \tag{9}$$

$$dev^R \tilde{X}(\alpha) = \tilde{X}_\alpha^R - d_{SGD}(\tilde{X}, \tilde{0}), \tag{10}$$

where $d_{SGD}(\tilde{X}, \tilde{0})$ is the signed distance of $\tilde{X}$ measured from the fuzzy origin $\tilde{0}$.

We now define the new metric $d_{SGD}^{\theta^\star}$ as follows:

Definition 7 (The $d_{SGD}^{\theta^\star}$ distance [2])
Consider two fuzzy numbers $\tilde{X}$ and $\tilde{Y}$ of the class of non-empty compact and bounded fuzzy numbers. Let $\theta^\star$ be the weight chosen for the modelling of the shape of these fuzzy numbers such that $0 \leq \theta^\star \leq 1$. Based on the signed distance between $\tilde{X}$ and $\tilde{Y}$, the $L_2$ metric $d_{SGD}^{\theta^\star}$ is the mapping

$$d_{SGD}^{\theta^\star} : \mathbb{F}(\mathbb{R}) \times \mathbb{F}(\mathbb{R}) \to \mathbb{R}^+, \qquad \tilde{X} \times \tilde{Y} \mapsto d_{SGD}^{\theta^\star}(\tilde{X}, \tilde{Y}),$$

such that

$$d_{SGD}^{\theta^\star}(\tilde{X}, \tilde{Y}) = \left( \left( d_{SGD}(\tilde{X}, \tilde{Y}) \right)^2 + \theta^\star \left( \int_0^1 \max \left( dev^R \tilde{Y}(\alpha) - dev^L \tilde{X}(\alpha),\; dev^R \tilde{X}(\alpha) - dev^L \tilde{Y}(\alpha) \right) d\alpha \right)^2 \right)^{\frac{1}{2}}. \tag{11}$$

It is important at this stage to show a direct mathematical relationship between the $d_{SGD}^{\theta^\star}$ metric and the signed distance $d_{SGD}$. Therefore, let us recall the concept of the nearest trapezoidal symmetrical fuzzy number. Further information, in addition to the detailed proof, is given in [2]. Note that this concept will be used in the process of generating random samples in the forthcoming sections.

Definition 8 (Nearest trapezoidal fuzzy number [2])
The nearest symmetrical trapezoidal fuzzy number $\tilde{S}$, written by the quadruple $\tilde{S} = [s_0 - 2\epsilon, s_0 - \epsilon, s_0 + \epsilon, s_0 + 2\epsilon]$, to a fuzzy number $\tilde{X}$ with respect to the metric $d_{SGD}^{\theta^\star}$ is given such that

$$s_0 = d_{SGD}(\tilde{X}, \tilde{0}), \tag{12}$$

$$\epsilon = \frac{9}{14}\, d_{SGD}(\tilde{X}, \tilde{0}) - \frac{3}{7} \int_0^1 \tilde{X}_\alpha^L \,(2 - \alpha)\, d\alpha. \tag{13}$$

Fuzzy Confidence Intervals for a Pre-Defined Parameter

A fuzzy confidence interval is a very useful tool for statistical inference. We first present the definition of traditional fuzzy confidence intervals in “Traditional fuzzy confidence intervals”, followed by our procedure of estimation of intervals by the likelihood ratio method and using the bootstrap technique in “Fuzzy confidence intervals by the likelihood method”. Based on the designed confidence intervals, we also introduce a hypothesis test for the equality of means. A simulation study is finally provided in “Simulation study”.
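Since both the intervals and the resampling scheme discussed in this section rely on the $d_{SGD}^{\theta^\star}$ metric (Eq. 11) and on the characteristics $(s_0, \epsilon)$ of Definition 8, a numerical check of these two objects may be helpful. The following Python sketch is an illustration only: it assumes trapezoidal fuzzy numbers with linear sides, integrates with a midpoint rule, and uses helper names of our own choosing.

```python
import math


def integrate(f, n=2000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def trapezoid(p, q, r, s):
    """Left and right alpha-cut functions of the trapezoid (p, q, r, s)."""
    return (lambda a: p + a * (q - p)), (lambda a: s - a * (s - r))


def d_sgd0(x):
    """Signed distance from the fuzzy origin (Eq. 8)."""
    xl, xr = x
    return 0.5 * integrate(lambda a: xl(a) + xr(a))


def d_theta(x, y, theta_star):
    """d^{theta*}_SGD metric of Eq. 11."""
    (xl, xr), (yl, yr) = x, y
    cx, cy = d_sgd0(x), d_sgd0(y)

    def shape_term(a):
        dev_l_x, dev_r_x = cx - xl(a), xr(a) - cx   # Eqs. 9 and 10 for X
        dev_l_y, dev_r_y = cy - yl(a), yr(a) - cy   # Eqs. 9 and 10 for Y
        return max(dev_r_y - dev_l_x, dev_r_x - dev_l_y)

    shape = integrate(shape_term)
    # d_SGD(X, Y) = d_SGD(X, 0) - d_SGD(Y, 0) by linearity of Eq. 7
    return math.sqrt((cx - cy) ** 2 + theta_star * shape ** 2)


def nearest_symmetric_trapezoid(x):
    """Characteristics (s0, eps) of Definition 8 (Eqs. 12 and 13)."""
    xl, _ = x
    s0 = d_sgd0(x)
    eps = 9.0 / 14.0 * s0 - 3.0 / 7.0 * integrate(lambda a: xl(a) * (2.0 - a))
    return s0, eps
```

Two properties can be verified with this sketch. With $\theta^\star = 0$ the metric reduces to the absolute value of the signed distance, and a fuzzy number that is already a symmetrical trapezoid $[s_0 - 2\epsilon, s_0 - \epsilon, s_0 + \epsilon, s_0 + 2\epsilon]$ returns its own $(s_0, \epsilon)$.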

Traditional Fuzzy Confidence Intervals

A given confidence interval is often produced for a particular parameter denoted by $\theta$. In an epistemic approach, this interval is considered to be fuzzy. This fuzziness is a direct consequence of the fuzziness of the considered parameter. Kruse and Meyer [17] proposed a main approach to write a fuzzy confidence interval in such conditions. Many procedures have been derived to compute this interval. A known one relies on considering a pre-defined distribution, as seen in the following construction procedure:

First, let $X_1, \ldots, X_n$ be a random sample of size $n$. We consider this sample to be fuzzy, and we call $\tilde{X}_1, \ldots, \tilde{X}_n$ its fuzzy perception. For a particular parameter denoted by $\theta$, we are interested in testing the following hypotheses:

$$H_0 : \theta = \theta_0 \quad \text{against} \quad H_1 : \theta \neq \theta_0.$$

To accomplish this task, an idea could be to construct a fuzzy confidence interval for $\theta$ at a given significance level $\delta$. Based on [17], a two-sided fuzzy confidence interval $\tilde{\Pi}$ for $\theta$ is defined by:

Definition 9 (Fuzzy confidence interval [17])
Let $[\pi_1, \pi_2]$ be a symmetrical confidence interval for a particular parameter $\theta$ at the significance level $\delta$. A fuzzy confidence interval $\tilde{\Pi}$ is a convex and normal fuzzy set such that the left and right bounds of its $\alpha$-cuts $\tilde{\Pi}_\alpha = [\tilde{\Pi}_\alpha^L, \tilde{\Pi}_\alpha^R]$ are written in the following manner:

$$\tilde{\Pi}_\alpha^L = \inf \{ a \in \mathbb{R} : \exists x_i \in (\tilde{X}_i)_\alpha, \forall i = 1, \ldots, n, \text{ such that } \pi_1(x_1, \ldots, x_n) \leq a \},$$

$$\tilde{\Pi}_\alpha^R = \sup \{ a \in \mathbb{R} : \exists x_i \in (\tilde{X}_i)_\alpha, \forall i = 1, \ldots, n, \text{ such that } \pi_2(x_1, \ldots, x_n) \geq a \}.$$

The constructed fuzzy confidence interval has a confidence of $1 - \delta$ if for a parameter $\theta$, the equation

$$P\left( \tilde{\Pi}_\alpha^L \leq \theta \leq \tilde{\Pi}_\alpha^R \right) \geq 1 - \delta, \quad \forall \alpha \in [0;1] \tag{14}$$

is verified. A one-sided fuzzy confidence interval is likewise conceivable. A left one-sided fuzzy confidence interval at a confidence level $1 - \delta$, denoted by $\tilde{\Pi}_\alpha$, is written by its $\alpha$-level sets as follows:

$$\tilde{\Pi}_\alpha = [\tilde{\Pi}_\alpha^L, \infty].$$

In the same way, the $\alpha$-cuts of a right one-sided one are written by:

$$\tilde{\Pi}_\alpha = [-\infty, \tilde{\Pi}_\alpha^R].$$

Detailed examples illustrating this definition can be found in Berkachy [2] and Berkachy and Donzé [7].

Fuzzy Confidence Intervals by the Likelihood Method

We presented in Berkachy and Donzé [7] and Berkachy and Donzé [5] a generalisation of the traditional construction procedure. The aim was to provide a practical tool based on the likelihood ratio method to estimate fuzzy confidence intervals, in which the fuzziness contained in the variables is conveniently taken into consideration. We highlight that the likelihood ratio is a common tool in classical statistics as well. In the fuzzy environment, Gil and Casals [13], for instance, used it in the context of hypothesis testing. In this section, we briefly recall the defended procedure. Further detailed information can be found in Berkachy and Donzé [7] and Berkachy [2].

We first define the likelihood function of a fuzzy observation. Consider $X_i$ to be a fuzzy variable, with its fuzzy perception. Therefore, we denote by $\tilde{X}_i$ a fuzzy random variable (FRV) such that its corresponding fuzzy realisation $\tilde{x}_i$ is associated with a measurable membership function given by $\mu_{\tilde{x}_i}$ in the sense of Borel, i.e. $\mu_{\tilde{x}_i} : x \to [0;1]$. Based on the probability concepts proposed in Zadeh [24], the likelihood function described in the fuzzy context can be expressed by:

Definition 10 (Likelihood function of a fuzzy observation)
Let $\tilde{\theta}$ be a vector of fuzzy parameters in the parameter space $\Theta$. For a single fuzzy observation $\tilde{x}_i$, the likelihood function can be given by:

$$L(\tilde{\theta}; \tilde{x}_i) = P(\tilde{x}_i; \tilde{\theta}) = \int_{\mathbb{R}} \mu_{\tilde{x}_i}(x) f(x; \tilde{\theta})\, dx. \tag{15}$$

This probability can also be written using the $\alpha$-cuts of the involved fuzzy numbers.

Let now $\tilde{x}$ be a fuzzy sample composed of all the fuzzy realisations $\tilde{x}_i$ of the fuzzy random variables $\tilde{X}_i$, $i = 1, \ldots, n$. The corresponding likelihood function $L(\tilde{\theta}; \tilde{x})$ can then be given by:

$$L(\tilde{\theta}; \tilde{x}) = P(\tilde{x}; \tilde{\theta}) \tag{16}$$

$$= \int_{\mathbb{R}} \mu_{\tilde{x}_1}(x) f(x; \tilde{\theta})\, dx \cdot \ldots \cdot \int_{\mathbb{R}} \mu_{\tilde{x}_n}(x) f(x; \tilde{\theta})\, dx \tag{17}$$

$$= \prod_{i=1}^{n} \int_{\mathbb{R}} \mu_{\tilde{x}_i}(x) f(x; \tilde{\theta})\, dx. \tag{18}$$

It is then important to write the log-likelihood function $l(\tilde{\theta}; \tilde{x})$ as follows:

$$l(\tilde{\theta}; \tilde{x}) = \log L(\tilde{\theta}; \tilde{x}) = \log \int_{\mathbb{R}} \mu_{\tilde{x}_1}(x) f(x; \tilde{\theta})\, dx + \cdots + \log \int_{\mathbb{R}} \mu_{\tilde{x}_n}(x) f(x; \tilde{\theta})\, dx. \tag{19}$$

Now consider $\hat{\tilde{\theta}}$, the maximum likelihood estimator (ML-estimator) of the fuzzy parameter $\tilde{\theta}$. The likelihood ratio is written by:

$$\frac{L(\tilde{\theta}; \tilde{x})}{L(\hat{\tilde{\theta}}; \tilde{x})},$$

such that $L(\tilde{\theta}; \tilde{x})$ is the likelihood function related to the fuzzy parameter $\tilde{\theta}$, and $L(\hat{\tilde{\theta}}; \tilde{x})$ is the likelihood function evaluated at the estimator $\hat{\tilde{\theta}}$. It is essential at this stage to write the logarithm of this ratio, given also by the difference between the log-likelihood functions evaluated at $\tilde{\theta}$ and at $\hat{\tilde{\theta}}$. Therefore, the statistic LR can be given in the following manner:

$$LR = -2 \log \frac{L(\tilde{\theta}; \tilde{x})}{L(\hat{\tilde{\theta}}; \tilde{x})} = 2 \left[ l(\hat{\tilde{\theta}}; \tilde{x}) - l(\tilde{\theta}; \tilde{x}) \right], \tag{20}$$

such that $L(\tilde{\theta}; \tilde{x}) \neq 0$, $L(\hat{\tilde{\theta}}; \tilde{x}) \neq 0$ and both are finite.

Under classical statistical assumptions in the crisp case, the ratio LR is proven to be asymptotically $\chi^2$-distributed with a given number of degrees of freedom. In the fuzzy statistical theory, a main issue is that we do not have any proven asymptotic property for the distribution of this ratio. Hence, we propose to solve this problem using the so-called bootstrap techniques.

We recall that constructing a $100(1 - \delta)$% confidence interval amounts to determining, for every value of $\tilde{\theta}$, whether the null hypothesis $H_0$ is rejected or not, and retaining the values for which it is not rejected. For this construction, consider $\eta$ to be the $(1 - \delta)$-quantile of the distribution of the statistic LR. We could then write the confidence interval by:

$$2 \left[ l(\hat{\tilde{\theta}}; \tilde{x}) - l(\tilde{\theta}; \tilde{x}) \right] \leq \eta. \tag{21}$$

This latter is equivalent to

$$l(\tilde{\theta}; \tilde{x}) \geq l(\hat{\tilde{\theta}}; \tilde{x}) - \frac{\eta}{2}. \tag{22}$$

In other terms, the constructed interval has to be composed of all possible values of $\tilde{\theta}$ for which the log-likelihood differs from its maximum by $\frac{\eta}{2}$ at most. Based on the statistic LR, a mandatory condition for this construction is that for every value of the parameter $\tilde{\theta}$, the fuzzy confidence interval by the likelihood ratio $\tilde{\Pi}_{LR}$, given by its left and right $\alpha$-cuts $[(\tilde{\Pi}_{LR})_\alpha^L ; (\tilde{\Pi}_{LR})_\alpha^R]$, has to verify the following equation

$$P\left( (\tilde{\Pi}_{LR})_\alpha^L \leq \tilde{\theta} \leq (\tilde{\Pi}_{LR})_\alpha^R \right) \geq 1 - \delta, \quad \forall \alpha \in [0;1]. \tag{23}$$

To ensure the above-mentioned conditions, we propose the following procedure of construction of fuzzy confidence intervals as seen in [7] and [2].
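Before turning to the procedure, the likelihood of a fuzzy observation (Eq. 15) and the log-likelihood of a fuzzy sample (Eq. 19) can be made concrete with a short numerical sketch. The example assumes, purely for illustration, trapezoidal fuzzy observations $(p, q, r, s)$ with $p < s$, a normal density with crisp mean and known standard deviation, and a midpoint rule over the support; none of the function names below come from the packages cited in this paper.

```python
import math


def trapezoid_membership(x, p, q, r, s):
    """Membership function of the trapezoidal fuzzy number (p, q, r, s)."""
    if x < p or x > s:
        return 0.0
    if x < q:
        return (x - p) / (q - p)
    if x <= r:
        return 1.0
    return (s - x) / (s - r)


def normal_pdf(x, mean, sd):
    """Density f(x; theta) of the assumed normal model."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fuzzy_likelihood(obs, mean, sd, n=400):
    """Eq. 15: integral of membership times density, midpoint rule on [p, s].

    The membership is zero outside the support, so integrating over the
    support is enough. Requires a non-degenerate support (s > p).
    """
    p, q, r, s = obs
    h = (s - p) / n
    total = 0.0
    for k in range(n):
        x = p + (k + 0.5) * h
        total += trapezoid_membership(x, p, q, r, s) * normal_pdf(x, mean, sd)
    return total * h


def fuzzy_log_likelihood(sample, mean, sd):
    """Eq. 19: sum of the logarithms of the single-observation likelihoods."""
    return sum(math.log(fuzzy_likelihood(obs, mean, sd)) for obs in sample)


sample = [(-1.0, -0.5, 0.5, 1.0), (0.0, 0.5, 1.0, 1.5), (-2.0, -1.0, -1.0, 0.0)]
l_near = fuzzy_log_likelihood(sample, 0.0, 1.0)   # mean close to the data
l_far = fuzzy_log_likelihood(sample, 3.0, 1.0)    # mean far from the data
```

Twice the difference between the log-likelihood at its maximiser and at a candidate value is the statistic LR of Eq. 20, which is what the procedure below compares with $\eta$.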

Procedure

We propose a revisited approach to the construction of fuzzy confidence intervals by the likelihood ratio method, in which the data set is assumed to be vague. Consequently, the log-likelihood function becomes fuzzy-dependent, as well as the considered parameter. We can directly deduce that the needed ML-estimator also has to be fuzzy. Assume then that the calculated crisp ML-estimator is modelled by a well-chosen fuzzy number. The support set of this fuzzy number is a set of crisp elements. Therefore, every element of this set would have to be used in the calculation of the log-likelihood function. However, this task is very tedious because of its computational burden. We therefore suggest choosing specific values which trigger the calculation of the so-called threshold points. The process of calculating the fuzzy confidence interval is then based on the intersection between these threshold points and the log-likelihood curve. The complete description of the procedure is as follows:

First, let us present the so-called standardising function. It is intentionally proposed to preserve the [0;1]-interval identity as a basic property of $\alpha$-level sets. It is written as:

Definition 11 (Standardising function [1])
Consider a value $\theta$ contained in the support set of a fuzzy number $\tilde{\theta}$, i.e. $\theta \in \operatorname{supp}(\tilde{\theta})$. The standardising function $I_{stand}$ is given by:

$$I_{stand} : \mathbb{R} \to \mathbb{R}, \qquad l(\theta, \tilde{x}) \mapsto I_{stand}\left( l(\theta, \tilde{x}) \right) = \frac{l(\theta, \tilde{x}) - I_a}{I_b - I_a},$$

where $I_a$ and $I_b$ are arbitrary real values such that $I_a \leq l(\theta, \tilde{x}) \leq I_b$ and $I_a \neq I_b$. We have that $I_{stand}(l(\theta, \tilde{x}))$ is bounded and $0 \leq I_{stand}(l(\theta, \tilde{x})) \leq 1$.

The steps of the calculation process are written in the following manner:

1. Let $\tilde{\theta}$ be a fuzzy parameter. We first have to calculate the log-likelihood function $l(\tilde{\theta}; \tilde{x})$ shown in Eq. 19.

2. The support and the core sets defining the fuzzy number modelling the ML-estimator are composed of an infinity of values. We choose the lower and upper bounds of these sets only, so that a reduced number of elements is considered. Therefore, let $p$, $q$, $r$ and $s$, $p \leq q \leq r \leq s$, be the considered four elements, and supp($\hat{\tilde{\theta}}$) and core($\hat{\tilde{\theta}}$) be respectively the support and the core sets of $\hat{\tilde{\theta}}$. The four values $p$, $q$, $r$ and $s$ are then:

$$p = \min(\operatorname{supp}(\hat{\tilde{\theta}})); \quad q = \min(\operatorname{core}(\hat{\tilde{\theta}})); \tag{24}$$

$$r = \max(\operatorname{core}(\hat{\tilde{\theta}})) \quad \text{and} \quad s = \max(\operatorname{supp}(\hat{\tilde{\theta}})). \tag{25}$$

In addition, the fuzzy parameter is bounded and the sets supp($\hat{\tilde{\theta}}$) and core($\hat{\tilde{\theta}}$) are not empty. This leads to the conclusion that the four values $p$, $q$, $r$ and $s$ always exist. We highlight that our intentional choice of elements is in some sense evident. Note that, assuming the symmetry of the probability function, the left and right-hand sides of a log-likelihood function are monotonic and continuous.

3. Next, $\eta$ has to be estimated. The bootstrap technique is suggested as described in the next section.

4. Based on the estimated parameter $\eta$, we construct the threshold values denoted by $I_1$, $I_2$, $I_3$ and $I_4$ corresponding to the chosen values $p$, $q$, $r$ and $s$, respectively. The idea is to evaluate $\hat{\tilde{\theta}}$ for each of the four values on the right-hand side of Eq. 22. The threshold values are then calculated as follows:

$$I_1 = l(p; \tilde{x}) - \frac{\eta}{2}; \quad I_2 = l(q; \tilde{x}) - \frac{\eta}{2}; \tag{26}$$

$$I_3 = l(r; \tilde{x}) - \frac{\eta}{2} \quad \text{and} \quad I_4 = l(s; \tilde{x}) - \frac{\eta}{2}. \tag{27}$$

5. Next, we calculate $I_{min}$ and $I_{max}$, the minimum and maximum thresholds, written as:

$$I_{min} = \min(I_1, I_2, I_3, I_4), \tag{28}$$

$$\text{and} \quad I_{max} = \max(I_1, I_2, I_3, I_4). \tag{29}$$

Computing $I_{min}$ and $I_{max}$ and including them in the calculation process are essential at this stage, because we want to cover the entire interval of the possible values verifying Eq. 22.

6. We now find the intersection between the log-likelihood function and the threshold values $I_1$, $I_2$, $I_3$ and $I_4$. Consider $\theta_1^{\star L}$, $\theta_2^{\star L}$, $\theta_3^{\star L}$, $\theta_4^{\star L}$ and $\theta_1^{\star R}$, $\theta_2^{\star R}$, $\theta_3^{\star R}$, $\theta_4^{\star R}$ to be the intersection abscissas. Note that the letters “L” and “R” refer to the left and right sides of a particular entity. We calculate these abscissas by solving the following equations:

$$l^L(\theta_1^{\star L}; \tilde{x}) = I_1 \quad \text{and} \quad l^R(\theta_1^{\star R}; \tilde{x}) = I_1, \tag{30}$$

$$l^L(\theta_2^{\star L}; \tilde{x}) = I_2 \quad \text{and} \quad l^R(\theta_2^{\star R}; \tilde{x}) = I_2, \tag{31}$$

$$l^L(\theta_3^{\star L}; \tilde{x}) = I_3 \quad \text{and} \quad l^R(\theta_3^{\star R}; \tilde{x}) = I_3, \tag{32}$$

$$l^L(\theta_4^{\star L}; \tilde{x}) = I_4 \quad \text{and} \quad l^R(\theta_4^{\star R}; \tilde{x}) = I_4. \tag{33}$$

7. Next, we compute the minimum and maximum left intersection abscissas given by

$$\theta_{inf}^{\star L} = \inf(\theta_1^{\star L}, \theta_2^{\star L}, \theta_3^{\star L}, \theta_4^{\star L}), \tag{34}$$

$$\text{and} \quad \theta_{sup}^{\star L} = \sup(\theta_1^{\star L}, \theta_2^{\star L}, \theta_3^{\star L}, \theta_4^{\star L}). \tag{35}$$

The minimum and maximum right intersection abscissas are similarly written as:

$$\theta_{inf}^{\star R} = \inf(\theta_1^{\star R}, \theta_2^{\star R}, \theta_3^{\star R}, \theta_4^{\star R}), \tag{36}$$

$$\text{and} \quad \theta_{sup}^{\star R} = \sup(\theta_1^{\star R}, \theta_2^{\star R}, \theta_3^{\star R}, \theta_4^{\star R}). \tag{37}$$

We remark that the left and right side intersection abscissas are single and real values.

8. The previously calculated entities are accordingly used to construct the $\alpha$-cuts of the fuzzy confidence interval using the likelihood ratio method $\tilde{\Pi}_{LR}$. We define the left and right $\alpha$-cuts $(\tilde{\Pi}_{LR})_\alpha = [(\tilde{\Pi}_{LR})_\alpha^L ; (\tilde{\Pi}_{LR})_\alpha^R]$ in the following manner:

$$(\tilde{\Pi}_{LR})_\alpha^L = \left\{ \theta \in \mathbb{R} \mid \theta_{inf}^{\star L} \leq \theta \leq \theta_{sup}^{\star L} \text{ and } \alpha = I_{stand}\left( l(\theta, \tilde{x}) \right) = \frac{l(\theta, \tilde{x}) - I_{min}}{I_{max} - I_{min}} \right\}, \tag{38}$$

$$(\tilde{\Pi}_{LR})_\alpha^R = \left\{ \theta \in \mathbb{R} \mid \theta_{inf}^{\star R} \leq \theta \leq \theta_{sup}^{\star R} \text{ and } \alpha = I_{stand}\left( l(\theta, \tilde{x}) \right) = \frac{l(\theta, \tilde{x}) - I_{min}}{I_{max} - I_{min}} \right\}. \tag{39}$$

Note that Berkachy [2] gives the complete proof that the defended fuzzy confidence interval $\tilde{\Pi}_{LR}$ verifies Definition 9. Concerning the coverage rate, it is also proven that Eq. 23, i.e. the requirement that the $\alpha$-cuts of $\tilde{\Pi}_{LR}$ cover the parameter with probability at least $1 - \delta$ for all $\alpha \in [0;1]$, theoretically holds.

Bootstrap Technique for the Approximation of the Likelihood Ratio and its Distribution

The bootstrap technique formally described by Efron [12] is a powerful tool to empirically estimate a specific sampling distribution using observed data. This technique is based on drawing a large number of samples from a primary random sample taken from an unknown distribution. This operation leads to the construction of a so-called bootstrap distribution of the statistic of interest. To sum up, this approach estimates such distributions using random simulation-based calculation processes. The bootstrap technique has also served in fuzzy statistics. As such, Gonzalez-Rodriguez et al. [15] used it in the hypothesis testing procedure for the mean of fuzzy random variables. In the same direction, Montenegro et al. [18] concluded that a bootstrap process is considered to be computationally lighter than asymptotic ones.

In our strategy, we propose to use a bootstrap methodology to empirically estimate the distribution of the likelihood ratio LR given in Eq. 20, i.e. the difference of the log-likelihood function evaluated at $\hat{\tilde{\theta}}$ compared to the one evaluated at $\tilde{\theta}$. Berkachy and Donzé [7] introduced two approaches to construct the bootstrap imprecise samples as follows:

1. The first one is based on simply generating with replacement $D$ bootstrap samples. For each sample, we then calculate the needed deviance.
2. The second one is based on generating $D$ samples by preserving the couple of location and dispersion characteristics, respectively denoted by $(s_0, \epsilon)$, of the nearest symmetrical trapezoidal fuzzy numbers described in Definition 8. Note that these fuzzy numbers are calculated from the primary data set.

Further description of both approaches can be found in [7] and [2]. From [2], we can clearly see that no notable differences exist between the two algorithms. Although their designs differ, the obtained results are very similar. For this reason, we detail hereafter only the second approach, considered to be more complicated than the first one, but conceptually very attractive. The algorithm based on the second bootstrap approach using the characteristics $(s_0, \epsilon)$ is then given by the following steps:

Algorithm:

1. Consider a primary sample. For each observation of this sample, calculate the set of characteristics $(s_0, \epsilon)$.
2. From the calculated set of characteristics $(s_0, \epsilon)$, randomly draw with replacement and with equal probabilities a new set of characteristics $(s_0, \epsilon)$. Construct a bootstrap sample based on this set.
3. Calculate the deviance $2 \left[ l(\hat{\tilde{\theta}}; \tilde{x}) - l(\tilde{\theta}; \tilde{x}) \right]_{boot}$ for each bootstrap sample.
4. Repeat Steps 2 and 3 a large number $D$ of times. A bootstrap distribution composed of $D$ values is thereby constructed.
5. Calculate $\eta$, the $(1 - \delta)$-quantile of the bootstrap distribution of the statistic LR.
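The five steps of the algorithm can be sketched in Python as follows. This is a simplified illustration and not the implementation of the FuzzySTs package: it assumes trapezoidal observations, a normal model with known standard deviation whose mean is the parameter of interest, a plain golden-section search in place of the fuzzy EM algorithm, and the original-sample estimate as the reference value in the bootstrap deviance. All function names are hypothetical.

```python
import math
import random


def char_s0_eps(obs):
    """(s0, eps) of Definition 8 for a trapezoidal observation (p, q, r, s)."""
    p, q, r, s = obs
    s0 = (p + q + r + s) / 4.0   # Eq. 12 evaluated for linear sides
    # Eq. 13 with X^L(a) = p + a (q - p): the integral equals 1.5 p + (2/3)(q - p)
    eps = 9.0 / 14.0 * s0 - 3.0 / 7.0 * (1.5 * p + 2.0 / 3.0 * (q - p))
    return s0, eps


def likelihood(s0, eps, mean, sd, n=60):
    """Eq. 15 for the trapezoid [s0-2e, s0-e, s0+e, s0+2e] and a normal density."""
    lo, h = s0 - 2.0 * eps, 4.0 * eps / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        mu = min(1.0, (2.0 * eps - abs(x - s0)) / eps)   # trapezoidal membership
        z = (x - mean) / sd
        total += mu * math.exp(-0.5 * z * z)
    return total * h / (sd * math.sqrt(2.0 * math.pi))


def loglik(chars, mean, sd):
    """Eq. 19 for a sample given by its (s0, eps) characteristics."""
    return sum(math.log(likelihood(s0, e, mean, sd)) for s0, e in chars)


def ml_mean(chars, sd, iters=30):
    """Golden-section search for the mean maximising the log-likelihood.

    Stand-in for the fuzzy EM algorithm (assumption of this sketch).
    """
    lo = min(s0 - 2.0 * e for s0, e in chars)
    hi = max(s0 + 2.0 * e for s0, e in chars)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        a, b = hi - g * (hi - lo), lo + g * (hi - lo)
        if loglik(chars, a, sd) < loglik(chars, b, sd):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)


def bootstrap_eta(sample, sd, delta=0.05, D=200, seed=1):
    """Steps 1 to 5: (1 - delta)-quantile of the bootstrap deviances."""
    rng = random.Random(seed)
    chars = [char_s0_eps(obs) for obs in sample]       # Step 1
    theta_hat = ml_mean(chars, sd)                     # reference value (assumption)
    deviances = []
    for _ in range(D):                                 # Step 4
        boot = [rng.choice(chars) for _ in chars]      # Step 2
        theta_boot = ml_mean(boot, sd)
        dev = 2.0 * (loglik(boot, theta_boot, sd) - loglik(boot, theta_hat, sd))
        deviances.append(max(dev, 0.0))                # Step 3
    deviances.sort()
    idx = min(D - 1, math.ceil((1.0 - delta) * D) - 1)
    return deviances[idx]                              # Step 5
```

Resampling the pairs $(s_0, \epsilon)$ rather than the raw observations is what preserves the location and dispersion characteristics of the primary sample. The pure-Python integration is slow for large $D$, so a vectorised implementation would be preferable in practice.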


This algorithm requires the calculation of a maximum likelihood estimator. For this task, Denoeux [10] proposed a tool based on the fuzzy EM algorithm, which can be computed using the R package EM.Fuzzy described in [19]. Nevertheless, this methodology presents a drawback since it produces a crisp estimator instead of a fuzzy one. Unfortunately, methods for calculating a fuzzy maximum likelihood estimator in such contexts are not yet established. For this reason, we model the obtained crisp EM-based estimator using, for instance, a triangular symmetrical fuzzy number. The calculated crisp element is chosen to be the core of the modelling fuzzy number. Note that the choice of the symmetrical shape of the fuzzy number is intentional, since the purpose is to reduce as much as possible the complexity related to this choice.

We finally highlight that all the previously described steps of the calculation procedure can be easily computed using our R package FuzzySTs described in [8]. This package is complete and user-friendly, and was developed for application purposes. In addition, a detailed numerical example of the defended procedure with its interpretation can be found in [7].

Inference: Comparison of Means

Fuzzy confidence intervals are very useful in statistical inference. We propose to use these intervals in a more or less complex testing situation. Indeed, we introduce a pragmatic approach to perform a hypothesis test for comparing the means of groups using the constructed fuzzy confidence intervals. The fuzzy analysis of variance (FANOVA) is often used for this purpose, as seen in Berkachy and Donzé [4], Parchami et al. [20], and Gonzalez-Rodriguez et al. [14]. We complete this analysis by proposing a test based on fuzzy confidence intervals. Our approach is as follows:

Similarly to the approach of a classical analysis of variance (ANOVA), we define the null hypothesis $H_0$ that the means related to the two groups are equal, against the alternative one $H_1$ that the pair of means is not equal, at a significance level $\delta$. The null and alternative hypotheses $H_0$ and $H_1$ can then be written as follows:

$$H_0: \mu_1 = \mu_2, \quad \text{against} \quad H_1: \mu_1 \neq \mu_2,$$

where $\mu_1$ and $\mu_2$ are the means of the groups 1 and 2 respectively.

For the groups 1 and 2, we first construct the fuzzy confidence intervals by the likelihood ratio method, denoted by $\tilde{\Pi}_{LR_1}$ and $\tilde{\Pi}_{LR_2}$. Our strategy of hypothesis testing is to analyse the overlapping between the fuzzy confidence intervals for each group mean. The aim is to be able to identify whether the means of groups are potentially equal or not, using these intervals. In case of perfect overlapping, we could infer that there is no difference between the means.

Since the metrics described in “The $d_{SGD}^{\theta^\star}$ Metric” are seen as a powerful alternative for the difference between two fuzzy sets, we propose to calculate the $d_{SGD}^{\theta^\star}$ metric of both constructed intervals, denoted by $d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_1}, \tilde{\Pi}_{LR_2})$. The objective is then to quantify the overlapping between them. We highlight that the choice of the $d_{SGD}^{\theta^\star}$ metric is intentional, since taking into consideration the shape of the fuzzy confidence intervals and their possible irregularities is crucial in this situation. In addition, a mapping into $\mathbb{R}^+$ is important, so that the distance can be read in absolute terms.

Next, we would like to define a decision rule according to the obtained overlap between both fuzzy sets. As such, we propose to “normalise” the calculated distance in order to obtain a relative ratio. Thus, by translating one set while the other remains fixed, we calculate an optimal distance of rejection between the fuzzy confidence intervals, i.e. the distance at the position where both intervals become tangent. This distance is denoted by $d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_1}, \tilde{\Pi}_{LR_2})_{opt}$. The following statistic $R$, given by:

$$R = \frac{d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_1}, \tilde{\Pi}_{LR_2})}{d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_1}, \tilde{\Pi}_{LR_2})_{opt}}, \tag{40}$$

will help us to reject or not the null hypothesis. The decision rule can then be written as follows:

Decision rule: The statistic $R$ belongs to the interval [0; 1]. The rules are then:

– The closer the statistic $R$ is to the value 0, the more strongly we do not reject the null hypothesis $H_0$.

– The closer the statistic $R$ is to the value 1, the more strongly we reject the null hypothesis $H_0$.

Simulation Study

In [7], we showed a simulation study illustrating the use of the two defended bootstrap algorithms in the process of calculation of the fuzzy confidence intervals. This study is based on randomly generating data sets taken from a normal distribution $N(5, 1)$ and composed of $N = 50$, 100 and 500 observations. The observations are then modelled by triangular symmetrical fuzzy numbers of spread 2.

Following the described procedure, the fuzzy confidence intervals by the likelihood ratio for the theoretical mean of the constructed data sets were computed at the confidence level $1 - \delta = 1 - 0.05$. The algorithm presented in “Bootstrap technique for the approximation of the likelihood ratio and its distribution” has been used to estimate the bootstrapped quantile $\eta$.

In addition, since the number of iterations did not really influence the outcome of the calculations, $D = 1000$ iterations were considered for all our calculations. Concerning the crisp-based estimators calculated using the fuzzy EM algorithm, we have considered the following two fuzzy numbers to model the estimators:

– the first one is a triangular symmetrical fuzzy number of spread 2;

– the second one is a triangular symmetrical fuzzy number of spread 1.

We would like to explore the influence of the degree of fuzziness in the modelling procedure of the estimators on the constructed fuzzy confidence intervals. For the sake of comparison, we additionally use the fuzzy sample mean as a fuzzy estimator.

We show in Table 1 the 95%-quantiles of the bootstrapped distribution of the likelihood ratio, where data sets of size 50, 100 and 500 are considered. The quantiles corresponding to the considered sample sizes are reasonably close to each other. We can also directly remark that modelling the ML-estimator using less fuzziness (fuzzy number with spread 1) leads to a lower quantile, compared to modelling the ML-estimator using greater fuzziness (fuzzy number with spread 2).

Based on the bootstrapped quantiles shown in Table 1, we now calculate the fuzzy confidence intervals using the likelihood ratio method following the instructions given in “Procedure”. For the sake of simplicity, we develop only the case with $N = 500$ observations for the construction of confidence intervals. Table 1 gives the lower and upper bounds of the support and the core sets of the calculated fuzzy confidence intervals.

Concerning the interpretation of the choice of degree of fuzziness related to the ML-estimators, less fuzziness clearly leads to a smaller support set of the calculated confidence interval. In other terms, this choice affects the obtained fuzzy confidence interval. Therefore, carefully modelling the ML-estimator is crucial.

By traditional fuzzy tools, a fuzzy confidence interval defined in the same settings is given by $\Pi = (3.907, 4.907, 5.080, 6.080)$. An important conclusion from the difference between the traditional and the defended fuzzy confidence intervals is that the core sets are slightly larger in the case of bootstrap intervals using the ML-estimators. Note that for these intervals, interpreting the spread of the support sets is somewhat difficult, since they are affected by the degree of fuzziness of the ML-estimator. In case the fuzzy sample mean is used as an estimator, the obtained fuzzy confidence interval has tighter support and core sets compared to the traditional fuzzy confidence interval.

Simulation Study on Coverage Rates

In [7], we conducted a simulation study on coverage rates corresponding to the fuzzy confidence intervals calculated using the likelihood ratio method. A large number of data sets composed of $N = 100$, 500 and 1000 observations are generated. We consider these data sets to be uncertain and we model each observation by a triangular symmetrical fuzzy number with spread 2. The objective of this study is to estimate, for the mean, fuzzy confidence intervals by the likelihood ratio method on the one hand, and the traditional fuzzy ones on the other hand, at the confidence level $1 - \delta = 1 - 0.05$, and consequently to calculate the coverage rates of these intervals in order to compare them. We note that the ML-estimators were modelled by fuzzy numbers of spreads 1 and 2. Similarly to the previous study, the fuzzy sample mean was also used for the sake of comparison only.

Table 1 The 95%-quantiles of the bootstrapped distribution of LR and the corresponding fuzzy confidence intervals by the likelihood ratio (data set of 500 observations): case of a data set taken from a normal distribution N(5, 1) modelled using triangular symmetrical fuzzy numbers at 1000 iterations

Bootstrap quantiles

| Sample size | N = 50 | N = 100 | N = 500 |
|---|---|---|---|
| Bootstrap quantile using the sample mean | 1.802 | 1.845 | 2.118 |
| Bootstrap quantile using the ML-estimator (spread 2) | 1.854 | 1.971 | 2.201 |
| Bootstrap quantile using the ML-estimator (spread 1) | 1.563 | 1.671 | 1.864 |

Fuzzy confidence intervals

| | Support set, lower | Support set, upper | Core set, lower | Core set, upper |
|---|---|---|---|---|
| Fci using the sample mean | 3.991 | 5.996 | 4.945 | 5.042 |
| Fci using the ML estimator (spread 2) | 3.803 | 6.184 | 4.795 | 5.193 |
| Fci using the ML estimator (spread 1) | 4.303 | 5.685 | 4.797 | 5.191 |


Some important conclusions of the comparison of coverage rates in this study are as follows:

– The difference between the coverage rates of the bootstrap fuzzy confidence intervals and the ones of the traditional fuzzy interval is very slight.

– The coverage rate of the fuzzy confidence intervals by the likelihood method, where the fuzzy sample mean is used as an estimator, is the same as the rate for the traditional fuzzy one.

– All the coverage rates of the LR fuzzy confidence intervals guarantee the required theoretical 95% confidence level.

– The fuzziness of the ML-estimators does not influence the coverage rates of the calculated fuzzy confidence intervals.

Application on SILC 2017

The Swiss SILC surveys are large and complex surveys conducted every year by the Swiss Federal Statistical Office [21] on Statistics on Income and Living Conditions in Switzerland. In our application, we use the data from the 2017 edition. After selecting only the active population, i.e. persons whose age is greater than or equal to 18, and persons who are respondents for the whole household, we get a sample of 8120 observations. In this study, we do not apply any weighting scheme. We consider in particular two variables describing respectively the health situation (PW5020) and the financial condition (HQ5010). Both variables are coded on a Likert scale from 0 to 10, where the value 0 means not satisfied. We are mainly interested in the difference between the group of Swiss nationality on the one hand, and the group of other nationalities on the other hand. As such, we perform the defended calculation approaches for the variables PW5020 and HQ5010 for each of the two groups.

First of all, we would like to construct the fuzzy confidence intervals by the likelihood ratio method at the confidence level 95% for the variables PW5020 and HQ5010, as previously discussed. Each modality of these variables is modelled by a triangular symmetrical fuzzy number of spread 2. In other words, the value 3 is for example modelled by the triangular fuzzy number (2, 3, 4). The same holds for all the other modalities of both variables. The obtained intervals for both variables are shown in Fig. 1. The support and the core sets of these intervals are also given in Table 2. We highlight that the bootstrap algorithm presented in “Bootstrap technique for the approximation of the likelihood ratio and its distribution” is used.
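The fuzzification of the Likert modalities can be sketched as follows. The function names and the sample answers are hypothetical (they are not SILC data). The sketch assumes that every modality, including the end points 0 and 10, receives the same symmetrical triangle without truncation, and that the fuzzy sample mean of triangular numbers is taken component-wise, which is the usual result of fuzzy arithmetic for triangular numbers.

```python
from typing import List, Sequence, Tuple

Triangle = Tuple[float, float, float]  # (left, core, right)


def fuzzify_likert(values: Sequence[float], spread: float = 2.0) -> List[Triangle]:
    """Model each crisp answer v by the symmetrical triangle of total width `spread`."""
    half = spread / 2.0
    return [(v - half, float(v), v + half) for v in values]


def fuzzy_sample_mean(triangles: Sequence[Triangle]) -> Triangle:
    """Component-wise mean of triangular fuzzy numbers."""
    n = len(triangles)
    if n == 0:
        raise ValueError("at least one fuzzy observation is required")
    return tuple(sum(t[i] for t in triangles) / n for i in range(3))


answers = [3, 8, 10, 7]                 # hypothetical responses on the 0 to 10 scale
fuzzy_answers = fuzzify_likert(answers)  # first element is (2.0, 3.0, 4.0)
mean_triangle = fuzzy_sample_mean(fuzzy_answers)
```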

We focus the analyses of this section on two main axes, described as follows:

1. The influence of the variation in the fuzziness of the ML-estimator on the one hand, and the influence of the variation of the confidence level on the other hand, on the constructed FCI. These analyses are performed based on the variable PW5020;

2. The hypothesis test on the means of the groups Swiss vs. Other nationalities using the constructed fuzzy confidence intervals, as proposed in “Inference: comparison of means”. This analysis is based on the variable PW5020 and on the variable HQ5010.

Fuzziness vs. Randomness

As previously mentioned, the objective of a first set of analyses is to investigate the influence of the variation of fuzziness related to the ML-estimator on the one hand, and of the variation of the confidence level on the other hand.

In our context, the ML-estimator is calculated by the EM-algorithm defined in the fuzzy environment, as seen in [10]. This approach unfortunately leads to a crisp-based estimator, and we consequently need to re-fuzzify it. For this reason, we have proposed to model this estimator by a triangular symmetrical fuzzy number of spread 2. Such a construction is in some sense natural. We are interested in investigating the influence of this choice on the constructed confidence intervals. For this reason, we propose in a second stage to model these estimators by triangular symmetrical fuzzy numbers of spread 1, in order to understand the influence of such a variation.

Fig. 1 Fuzzy confidence intervals (FCI) of the variables Health situation (PW5020) (a) and Satisfaction of financial situation (HQ5010) (b) at the confidence level 95%, each panel showing the FCI for Swiss nationality and the FCI for Other nationalities (horizontal axis: θ, from 6 to 10; vertical axis: α, from 0 to 1)

In Tables 3 and 4, we show the support and the core sets of the confidence intervals obtained by the described procedure. These tables clearly confirm the conclusion of the simulation study performed in “Simulation study”. In other words, independently of the chosen group of the variable PW5020, the degree of fuzziness of the fuzzy confidence interval is strongly affected by the choice of fuzziness of the ML-estimator. For the same confidence level, more fuzziness of the ML-estimator leads to a greater support set of the constructed confidence interval. Therefore, it would be ideal if a complete fuzzy-based approach to the calculation of the ML-estimator existed.

In terms of confidence levels, Table 3 shows the fuzzy confidence intervals at the confidence level 95%, while Table 4 gives the ones at the confidence level 70%. By comparing both tables, one can clearly remark that no important variation in terms of spread of the constructed fuzzy intervals is depicted; only a very small fluctuation is seen. A general conclusion that we could propose is that the randomness in the data influences the constructed fuzzy confidence intervals less than the fuzziness in the data and in the ML-estimator.
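This comparison can be checked directly on the reported bounds. The short Python sketch below uses the support sets of the group “Swiss” copied from Tables 3 and 4; the dictionary layout and the variable names are only illustrative choices.

```python
# (confidence level, spread of the ML-estimator) -> (support lower, support upper),
# group "Swiss", variable PW5020, values copied from Tables 3 and 4.
support = {
    (0.95, 2): (6.705, 9.068),
    (0.95, 1): (7.205, 8.568),
    (0.70, 2): (6.715, 9.058),
    (0.70, 1): (7.214, 8.559),
}


def width(interval):
    """Length of the support set."""
    lower, upper = interval
    return upper - lower


# Changing the fuzziness of the estimator at a fixed level (95%).
effect_of_fuzziness = width(support[(0.95, 2)]) - width(support[(0.95, 1)])
# Changing the confidence level at a fixed fuzziness (spread 2).
effect_of_level = width(support[(0.95, 2)]) - width(support[(0.70, 2)])

print(round(effect_of_fuzziness, 3), round(effect_of_level, 3))  # 1.0 0.02
```

Halving the spread of the ML-estimator shrinks the support by about 1.0, whereas lowering the confidence level from 95% to 70% shrinks it by only about 0.02.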

Test on Equality of Means

We are now interested in comparing the means of groups based on the interpretation of fuzzy confidence intervals. We would then like to determine whether the means of the two groups “Swiss” versus “Other” nationality of the variables PW5020 and HQ5010 are equal to each other or not. Such hypothesis tests are usually made by the well-known analysis of variance (ANOVA) in the classical statistical theory, or by the fuzzy analysis of variance (FANOVA) in the fuzzy set theory. For this reason, we perform the ANOVA as well as the FANOVA for the variables PW5020 and HQ5010 by nationality. The summaries are given in Tables 5 and 6 respectively.

Table 2 The fuzzy confidence intervals by the likelihood ratio for the groups “Swiss” and “Other” nationalities of the variables Health situation (PW5020) and Satisfaction of financial situation (HQ5010) at the confidence level 95% using triangular symmetrical fuzzy numbers of spread 2

| Variable | Nationality | Support set, lower | Support set, upper | Core set, lower | Core set, upper |
|---|---|---|---|---|---|
| Health situation (PW5020) | Swiss | 6.705 | 9.068 | 7.703 | 8.069 |
| Health situation (PW5020) | Other | 6.549 | 9.161 | 7.544 | 8.166 |
| Financial situation (HQ5010) | Swiss | 6.742 | 8.581 | 7.589 | 7.732 |
| Financial situation (HQ5010) | Other | 6.078 | 7.942 | 6.951 | 7.072 |

Table 3 The fuzzy confidence intervals by the likelihood ratio for the groups “Swiss” and “Other” nationalities of the variable Health situation (PW5020) at the confidence level 95% using triangular symmetrical fuzzy numbers of spread 2

| Nationality | Interval | Support set, lower | Support set, upper | Core set, lower | Core set, upper |
|---|---|---|---|---|---|
| Swiss | Fci using the ML estimator (spread 2) | 6.705 | 9.068 | 7.703 | 8.069 |
| Swiss | Fci using the ML estimator (spread 1) | 7.205 | 8.568 | 7.703 | 8.069 |
| Other | Fci using the ML estimator (spread 2) | 6.549 | 9.161 | 7.544 | 8.166 |
| Other | Fci using the ML estimator (spread 1) | 7.048 | 8.662 | 7.649 | 8.161 |

Table 4 The fuzzy confidence intervals by the likelihood ratio for the groups “Swiss” and “Other” nationalities of the variable Health situation (PW5020) at the confidence level 70% using triangular symmetrical fuzzy numbers of spread 2

| Nationality | Interval | Support set, lower | Support set, upper | Core set, lower | Core set, upper |
|---|---|---|---|---|---|
| Swiss | Fci using the ML estimator (spread 2) | 6.715 | 9.058 | 7.713 | 8.059 |
| Swiss | Fci using the ML estimator (spread 1) | 7.214 | 8.559 | 7.713 | 8.060 |
| Other | Fci using the ML estimator (spread 2) | 6.570 | 9.139 | 7.565 | 8.145 |
| Other | Fci using the ML estimator (spread 1) | 7.069 | 8.641 | 7.629 | 8.081 |

Table 5 The results of the classical analysis of variance (ANOVA) and the fuzzy analysis of variance (FANOVA) for the variable Health situation (PW5020)

| Method | Source | Df | Sum Sq | F value | P value |
|---|---|---|---|---|---|
| ANOVA | Factor | 1 | 1 | 0.267 | 0.605 |
| ANOVA | Residuals | 7423 | 24437 | | |
| FANOVA | Factor | 1 | 0.854 | 0.279 | 0.597 |
| FANOVA | Residuals | 7423 | 22740.454 | | |

Table 6 The results of the classical analysis of variance (ANOVA) and the fuzzy analysis of variance (FANOVA) for the variable Satisfaction of financial situation (HQ5010)

| Method | Source | Df | Sum Sq | F value | P value |
|---|---|---|---|---|---|
| ANOVA | Factor | 1 | 484 | 111.7 | 0 *** |
| ANOVA | Residuals | 8084 | 35052 | | |
| FANOVA | Factor | 1 | 451.175 | 112.049 | 0 *** |
| FANOVA | Residuals | 8084 | 32551.067 | | |

Table 5 shows that for the variable PW5020, the null hypothesis of equality of means is not rejected, with a p-value of 0.605 by ANOVA and 0.597 by FANOVA. Let us now perform the same hypothesis test but using the fuzzy confidence intervals. First of all, Fig. 1a is a useful tool to visualise the potential equality of means. From this figure, we can clearly see that the support and the core sets of both intervals overlap completely. In other terms, we have a strong suspicion of equality between the perception of the health situation by Swiss nationals and by other nationalities. We now apply the testing procedure described in “Inference: Comparison of means”. Thus, we calculate the actual distance between the fuzzy confidence interval by the likelihood ratio for the group “Swiss”, i.e. $\tilde{\Pi}_{LR_{CH}}$, and the one for the group “Other”, i.e. $\tilde{\Pi}_{LR_{OT}}$, using the metric $d_{SGD}^{\theta^\star}$. This distance is given by

$$d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_{CH}}, \tilde{\Pi}_{LR_{OT}}) = 0.111,$$

for which the optimal distance between these two intervals can be found by the translation technique such that

$$d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_{CH}}, \tilde{\Pi}_{LR_{OT}})_{opt} = 2.489.$$

The statistic $R$ can then be computed using Eq. 40. We get:

$$R = 0.045.$$

Decision rule: Since the statistic $R$ is very close to the value 0, we strongly do not reject the null hypothesis of equality of means between both groups. This conclusion confirms our ANOVA and FANOVA results.

We perform the same analysis for the variable HQ5010. By Table 6, it is evident that the null hypothesis of equality of means is strongly rejected in the ANOVA and the FANOVA, with a reported P value of 0. If we consider the fuzzy confidence intervals by the likelihood ratio for both studied groups shown in Fig. 1b, we remark that both intervals partially overlap. In this case, the overlapping is exclusively a matter of the support sets of the intervals. The core sets do not overlap. By the hypothesis testing procedure presented in “Inference: Comparison of means”, we calculate the distance $d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_{CH}}, \tilde{\Pi}_{LR_{OT}})$ and we get:

$$d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_{CH}}, \tilde{\Pi}_{LR_{OT}}) = 0.652.$$

In this case, the maximal distance between the fuzzy confidence interval by the likelihood ratio for the group “Swiss”, i.e. $\tilde{\Pi}_{LR_{CH}}$, and the one for the group “Other”, i.e. $\tilde{\Pi}_{LR_{OT}}$, is

$$d_{SGD}^{\theta^\star}(\tilde{\Pi}_{LR_{CH}}, \tilde{\Pi}_{LR_{OT}})_{opt} = 1.852.$$

Therefore, the statistic $R$ is calculated as:

$$R = 0.352.$$

Decision rule: Since the statistic $R$ is relatively far from the value 0, we tend to strongly reject the null hypothesis of equality of means between both groups, confirming our ANOVA and FANOVA results.

To sum up, a clear conclusion of the use of such a decision rule is that the closer the statistic R is to the value 0, the more strongly we do not reject the hypothesis of equality of means. However, once we move away from the neighbourhood of 0, we strongly enter into the rejection region. By our decision rule, one could reject or not the hypothesis of equality of means with confidence. Accordingly, a final remark is that the use of the statistic R has to be prudent, since R is in some sense an indicator and not a threshold of rejection. The adopted criterion in our methodology is mainly related to the shape and the position of the fuzzy confidence interval.
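The two values of the statistic R reported in this section can be reproduced from the stated distances with the ratio of Eq. 40. The Python sketch below is a plain arithmetic check; the function name `overlap_ratio` is hypothetical, and the distances themselves are taken as given rather than recomputed from the intervals.

```python
def overlap_ratio(distance: float, optimal_distance: float) -> float:
    """Statistic R of Eq. 40: actual distance divided by the optimal (tangency) distance."""
    if optimal_distance <= 0:
        raise ValueError("the optimal distance must be positive")
    return distance / optimal_distance


# Distances reported in the text for the groups "Swiss" and "Other".
r_health = overlap_ratio(0.111, 2.489)     # variable PW5020
r_financial = overlap_ratio(0.652, 1.852)  # variable HQ5010

print(round(r_health, 3), round(r_financial, 3))  # 0.045 0.352
```

Since R is an indicator and not a threshold, the sketch deliberately returns the ratio only and does not attach a reject or not-reject label to it.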

Conclusion

This study proposed a complete practical procedure of estimation of fuzzy confidence intervals using the likelihood ratio method. The bootstrap technique was also used. A complex bootstrap algorithm based on preserving the location and dispersion measures related to the $d_{SGD}^{\theta^\star}$ metric is presented. Such calculations are often seen as expensive in terms of computational burden. Our procedure is designed to avoid such complexities. Based on the developed fuzzy confidence intervals, we introduced a hypothesis test for the equality of means of two groups with its corresponding decision rule.

Our approaches have been applied on the Swiss SILC surveys, where two main axes of analyses were developed: on the one hand, the influence of the fuzziness versus the randomness of the data and of the maximum likelihood estimator on the confidence intervals, and on the other hand, the application of the hypothesis testing procedure for comparing the mean of the group “Swiss nationality” to that of the group “Other nationalities” for the variables Satisfaction of health situation and Satisfaction of financial situation.

The main conclusions of these analyses are that the randomness in the data influences the fuzzy confidence intervals less than the fuzziness in the data and of the ML-estimator. Furthermore, one could clearly see that fuzzy confidence intervals can be considered as a proper tool to perform a preliminary analysis for the comparison of means. Thus, the introduced statistic is in some sense an indicator of rejection of the hypothesis of equality of means of two groups.

Furthermore, we have to mention that it is often difficult to conduct a hypothesis test based on fuzzy confidence intervals. It requires deeper knowledge of the hypotheses, the considered estimators and the distributions, for instance. Based on our constructed procedure, we managed to propose a conceptually different way of comparing two means. This method also provides an indicator of overlapping, which can be very helpful in the decision making process. On the other hand, the defended hypothesis test by the likelihood ratio method is an interesting alternative to the classical test. Our approach is practicable, direct, and easily reproducible using our R toolbox given in Berkachy and Donzé [8]. The latter is conceived and implemented so as to perform well in terms of the time and computational burden needed for numerical purposes. Multiple benchmark tests have been previously performed to ensure its performance. Thus, this R package can easily be used to carry out such calculations of fuzzy confidence intervals in a user-friendly environment. We highlight that fuzzy hypotheses can also be taken into consideration in the calculation of these fuzzy intervals by our toolbox.

Finally, the encountered problem lies mainly in the fuzzification of the crisp-based maximum likelihood estimator. An approach leading to a fuzzy-based one would be very welcome. In addition, refining our decision rule regarding testing the hypotheses of equality of means could also be interesting to investigate in future research.

Funding Open access funding provided by University of Fribourg.

Declarations

Conflict of Interest The authors declare that they have no conﬂict of

interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Berkachy R. The signed distance measure in fuzzy statistical analysis. Some theoretical, empirical and programming advances. Ph.D. thesis, University of Fribourg, Switzerland, 2020.

2. Berkachy R. The signed distance measure in fuzzy statistical analysis. In: Theoretical, empirical and programming advances fuzzy management methods. Berlin: Springer International Publishing; 2021. https://doi.org/10.1007/978-3-030-76916-1.

3. Berkachy R, Donzé L. Individual and global assessments with signed distance defuzzification, and characteristics of the output distributions based on an empirical analysis. In: Proceedings of the 8th International Joint Conference on Computational Intelligence, volume 1: FCTA, pp. 75–82, 2016. https://doi.org/10.5220/0006036500750082.

4. Berkachy R, Donzé L. Fuzzy one-way ANOVA using the signed distance method to approximate the fuzzy product. LFA 2018, Rencontres francophones sur la Logique Floue et ses Applications, Cépaduès, 2018, pp 253–264.

5. Berkachy R, Donzé L. Fuzzy confidence interval estimation by likelihood ratio. In: Proceedings of the 2019 Conference of the International Fuzzy Systems Association and the European Society for Fuzzy Logic and Technology (EUSFLAT 2019), 2019.

6. Berkachy R, Donzé L. Testing hypotheses by fuzzy methods: a comparison with the classical approach. Cham: Springer International Publishing; 2019. p. 1–22.

7. Berkachy R, Donzé L. Fuzzy confidence intervals by the likelihood ratio with bootstrapped distribution. In: Proceedings of the 12th International Joint Conference on Computational Intelligence - FCTA. INSTICC, SciTePress; 2020. pp 231–242. https://doi.org/10.5220/0010023602310242.

8. Berkachy R, Donzé L. FuzzySTs: Fuzzy statistical tools, R package (2020). https://CRAN.R-project.org/package=FuzzySTs.

9. Couso I, Sanchez L. Inner and outer fuzzy approximations of confidence intervals. Fuzzy Sets Syst. 2011;184(1):68–83. https://doi.org/10.1016/j.fss.2010.11.004. http://www.sciencedirect.com/science/article/pii/S0165011410004550. Preference Modelling and Decision Analysis (Selected Papers from EUROFUSE 2009).

10. Denoeux T. Maximum likelihood estimation from fuzzy data using the EM algorithm. Fuzzy Sets Syst. 2011;183(1):72–91. https://doi.org/10.1016/j.fss.2011.05.022 (Theme: information processing).

11. Dubois D, Prade H. The mean value of a fuzzy number. Fuzzy Sets Syst. 1987;24(3):279–300. https://doi.org/10.1016/0165-0114(87)90028-5. http://www.sciencedirect.com/science/article/pii/0165011487900285. Fuzzy numbers.

12. Efron B. Bootstrap methods: another look at the jackknife. Ann Stat. 1979;7(1):1–26. http://www.jstor.org/stable/2958830.

13. Gil MA, Casals MR. An operative extension of the likelihood ratio test from fuzzy data. Stat Pap. 1988;29(1):191–203. https://doi.org/10.1007/BF02924524.

14. Gonzalez-Rodriguez G, Colubi A, Gil MA. Fuzzy data treated as functional data: a one-way ANOVA test approach. Comput Stat Data Anal. 2012;56(4):943–55. https://doi.org/10.1016/j.csda.2010.06.01.

15. Gonzalez-Rodriguez G, Montenegro M, Colubi A, Gil MA. Bootstrap techniques and fuzzy random variables: synergy in hypothesis testing with fuzzy data. Fuzzy Sets Syst. 2006;157(19):2608–2613. https://doi.org/10.1016/j.fss.2003.11.021. http://www.sciencedirect.com/science/article/pii/S0165011406002089. Fuzzy sets and probability/statistics theories.

16. Kahraman C, Otay I, Öztayşi B. Fuzzy extensions of confidence intervals: estimation for $\mu$, $\sigma^2$, and p. Cham: Springer International Publishing; 2016. p. 129–54.

17. Kruse R, Meyer KD. Statistics with vague data, vol. 6. Netherlands: Springer; 1987.

18. Montenegro M, Colubi A, Rosa Casals M, Ángeles Gil M. Asymptotic and bootstrap techniques for testing the expected value of a fuzzy random variable. Metrika. 2004;59(1):31–49. https://doi.org/10.1007/s001840300270.

19. Parchami A. EM.Fuzzy: EM algorithm for maximum likelihood estimation by non-precise information, R package (2018). https://CRAN.R-project.org/package=EM.Fuzzy.

20. Parchami A, Nourbakhsh M, Mashinchi M. Analysis of variance in uncertain environments. Complex Intell Syst. 2017;3(3):189–96. https://doi.org/10.1007/s40747-017-0046-8.

21. Swiss Federal Statistical Office: Statistics on income and living conditions (SILC). Surveys 2015–2017 (2017). https://www.bfs.admin.ch/bfs/en/home/statistics/economic-social-situation-population/surveys/silc.assetdetail.1822744.html.

22. Viertl R, Yeganeh SM. Fuzzy confidence regions. Cham: Springer International Publishing; 2016. p. 119–27.

23. Yao JS, Wu K. Ranking fuzzy numbers based on decomposition principle and signed distance. Fuzzy Sets Syst. 2000;116(2):275–88.

24. Zadeh L. Probability measures of fuzzy events. J Math Anal Appl. 1968;23(2):421–7.

Publisher's Note Springer Nature remains neutral with regard to

jurisdictional claims in published maps and institutional aﬃliations.


royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal

content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any

other, institutional repository.

These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or

content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature

may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.

To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied

with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,

including merchantability or fitness for any particular purpose.

Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed

from third parties.

If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not

expressly permitted by these Terms, please contact Springer Nature at

onlineservice@springernature.com