Available via license: CC BY-NC 4.0
Using Rough Set Theory to Achieve Reliable
Chemical Solvents Selection: A Multi-attribute
Decision Case Study
Chaozhe Jiang1, Pei Hu2, Dun Liu2, Fang Xu3, Jixue Yuan4
1College of Traffic and Transportation, Southwest Jiaotong University, Chengdu 610031, P. R. China
2School of Economics & Management, Southwest Jiaotong University, Chengdu 610031, P. R. China
3Sichuan Tourism University, Chengdu 610031, P. R. China
4School of Physical Education, Yunnan Normal University, Kunming 650092, P. R. China
Abstract
The basic concepts of rough set theory (RST) are
introduced, and an application of RST to the chemical
solvents selection problem (CSSP) is presented. With
RST, we obtain good multi-attribute decision-making
results in chemical solvents selection. This work
demonstrates how RST can be used in chemical process
development to allow efficient and reliable
improvement of a given synthetic step. Numerous earlier
applications of rough set theory in various
scientific domains suggest that it can also be a useful
tool for the analysis of inexact, uncertain, or vague
chemical data.
Keywords: RST (Rough Set Theory), Chemical
solvents selection problem (CSSP), Multi-attribute
Decision Making (MADM), RIDAS (a Rough Set
Based Intelligent Data Analysis System)
1. Introduction
The rough set theory, introduced by Pawlak in 1982 [1],
although popular in many other disciplines, is nearly
unknown in chemistry. This paper therefore presents
a research communication on the application of rough
set theory to the chemical field.
The rough set approach can be considered as a
formal framework for discovering facts from imperfect
data. The results of the rough set approach are
presented in the form of classification or decision rules
derived from a set of examples or cases.
The aim of this paper is to introduce the basic
concepts of the Rough Set Theory (RST) and also to
show its possible applications in the field of chemical
industry.
2. Rough Set Theory
2.1 Basic concepts of the RST
2.1.1. Information system and indiscernibility relation
Formally, an information system can be seen as a
system IS=(U, A), where U is the universe (a finite set
of objects, U={x1,x2,…,xm}) and A is the set of
attributes. Each attribute a∈A defines an information
function fa: U→Va, where Va is the set of values of a,
called the domain of attribute a.
For every set of attributes B⊂A, an
indiscernibility relation Ind(B) is defined in the
following way: two objects, $x_i$ and $x_j$, are
indiscernible by the set of attributes B in A if
$b(x_i) = b(x_j)$ for every b∈B. An equivalence class
of Ind(B) is called an elementary set in B because it
represents the smallest discernible group of objects.
For any element $x_i$ of U, the equivalence class of
$x_i$ in relation Ind(B) is represented as
$[x_i]_{Ind(B)}$. The construction of elementary sets
is the first step in classification with rough sets.
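As a minimal sketch of this first step, the Python snippet below groups objects that share the same values on a chosen attribute set B into elementary sets. The table, the object names and the attribute names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical information system: object -> attribute values.
TABLE = {
    "x1": {"a": 1, "b": 0, "c": 2},
    "x2": {"a": 1, "b": 0, "c": 1},
    "x3": {"a": 0, "b": 1, "c": 1},
    "x4": {"a": 1, "b": 0, "c": 0},
    "x5": {"a": 0, "b": 1, "c": 2},
}


def elementary_sets(table, attrs):
    """Partition the universe into the equivalence classes of Ind(attrs)."""
    classes = defaultdict(set)
    for obj, values in table.items():
        # Objects with the same value tuple on attrs are indiscernible.
        key = tuple(values[a] for a in attrs)
        classes[key].add(obj)
    return list(classes.values())


sets_ab = elementary_sets(TABLE, ["a", "b"])
sets_abc = elementary_sets(TABLE, ["a", "b", "c"])
```

With B = {a, b} the five objects fall into two elementary sets, whereas the full attribute set separates every object from every other.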
2.1.2. Approximations of sets
The lower and the upper approximations of a
set (Fig.1) are defined as follows.
Fig.1. Schematic demonstration of the upper and lower
approximation of set X.
Let X denote a subset of elements of the
universe U (X⊂U). The lower approximation of X in
B (B⊆A) is denoted as:
$\underline{B}X = \{x_i \in U \mid [x_i]_{Ind(B)} \subseteq X\}$
The upper approximation of the set X is denoted
as:
$\overline{B}X = \{x_i \in U \mid [x_i]_{Ind(B)} \cap X \neq \emptyset\}$
For any object $x_i$ of the lower approximation of
X (i.e., $x_i \in \underline{B}X$), it is certain that it belongs to X.
For any object $x_i$ of the upper approximation of X
(i.e., $x_i \in \overline{B}X$), we can only say that $x_i$ may belong
to X. The difference $BNX = \overline{B}X - \underline{B}X$ is called the
boundary of X in U.
If the lower and upper approximations are
identical (i.e., $\underline{B}X = \overline{B}X$), then set X is definable;
otherwise, set X is undefinable in U. If
$\underline{B}X \neq \emptyset$ and $\overline{B}X \neq U$, X is called
roughly definable in U, where $\emptyset$ (also written Φ)
denotes the empty set.
$POS_B(X) = \underline{B}X$ is called the B-positive
region of X, and $NEG_B(X) = U - \overline{B}X$ is called the
B-negative region of X.
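The definitions above translate directly into set operations. The following Python sketch assumes that the elementary sets of Ind(B) are already known; the universe, the elementary sets and the target set X are hypothetical.

```python
def approximations(classes, X):
    """Return the (lower, upper) approximation of X from the elementary sets."""
    lower, upper = set(), set()
    for c in classes:
        if c <= X:      # elementary set lies entirely inside X
            lower |= c
        if c & X:       # elementary set shares at least one object with X
            upper |= c
    return lower, upper


# Hypothetical universe, elementary sets of Ind(B), and target set X.
U = {"x1", "x2", "x3", "x4", "x5", "x6"}
classes = [{"x1", "x2"}, {"x3"}, {"x4", "x5"}, {"x6"}]
X = {"x1", "x2", "x4"}

lower, upper = approximations(classes, X)
boundary = upper - lower   # BNX
positive = lower           # POS_B(X)
negative = U - upper       # NEG_B(X)
```

Here x4 cannot be separated from x5, so both objects end up in the boundary and X is only roughly definable.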
2.1.3. Accuracy of approximation of sets
An accuracy measure of the set X in B⊆A is defined
as:
$\mu_B(X) = \operatorname{card}(\underline{B}X) / \operatorname{card}(\overline{B}X)$
where card(.) means the cardinality of a set. As one
can notice, $0 \le \mu_B(X) \le 1$. If X is definable in U,
then $\mu_B(X) = 1$; if X is undefinable in U, then
$\mu_B(X) < 1$.
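A short Python sketch of the accuracy measure follows. The elementary sets are hypothetical, and returning 1 for an empty upper approximation is an assumed convention that the text does not state.

```python
from fractions import Fraction


def accuracy(classes, X):
    """Accuracy of approximation: card(lower) / card(upper)."""
    lower = set().union(*(c for c in classes if c <= X))
    upper = set().union(*(c for c in classes if c & X))
    if not upper:
        # Empty X: treated as exactly definable (assumed convention).
        return Fraction(1)
    return Fraction(len(lower), len(upper))


# Hypothetical elementary sets of Ind(B).
classes = [{"x1", "x2"}, {"x3"}, {"x4", "x5"}, {"x6"}]

rough = accuracy(classes, {"x1", "x2", "x4"})   # boundary is not empty
exact = accuracy(classes, {"x1", "x2", "x3"})   # a union of elementary sets
```

A set that is a union of elementary sets has accuracy 1; any set with a non-empty boundary scores below 1.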
2.1.4. Reduction and Independence of attributes
If Ind(A)=Ind(A-ai), then the attribute ai is called
superfluous. Otherwise, the attribute ai is indispensable
in A.
If the set of attributes is dependent, one may be
interested in finding all possible minimal subsets of
attributes that lead to the same number of
elementary sets as the whole set of attributes (reducts)
and in finding the set of all indispensable attributes
(core).
The concepts of core and reduct are two
fundamental concepts of the rough sets theory. The
reduct is the essential part of an IS, which can discern
all objects discernible by the original IS. The core is
the common part of all reducts. To compute reducts
and core, the discernibility matrix is used. The
discernibility matrix has the dimension n×n, where n
denotes the number of elementary sets, and its
elements are defined as the sets of all attributes which
discern elementary sets $[x]_i$ and $[x]_j$.
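The role of the discernibility matrix can be illustrated with a small Python sketch. It assumes a hypothetical table in which every object forms its own elementary set, and it finds reducts by brute-force enumeration of attribute subsets, which is practical only for a handful of attributes and is not meant to represent how any particular rough set software works.

```python
from itertools import combinations

# Hypothetical information system; each object is its own elementary set.
TABLE = {
    "x1": {"a": 1, "b": 0, "c": 2},
    "x2": {"a": 1, "b": 0, "c": 1},
    "x3": {"a": 0, "b": 1, "c": 1},
    "x4": {"a": 1, "b": 0, "c": 0},
    "x5": {"a": 0, "b": 1, "c": 2},
}
ATTRS = ["a", "b", "c"]


def discernibility_matrix(table, attrs):
    """For each pair of objects, the set of attributes that discern them."""
    return {
        (x, y): {a for a in attrs if table[x][a] != table[y][a]}
        for x, y in combinations(sorted(table), 2)
    }


def core(matrix):
    """Attributes that appear as single-element entries of the matrix."""
    return {next(iter(e)) for e in matrix.values() if len(e) == 1}


def reducts(matrix, attrs):
    """Minimal attribute subsets that intersect every non-empty entry."""
    entries = [e for e in matrix.values() if e]
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            s = set(subset)
            is_minimal = not any(r <= s for r in found)
            if is_minimal and all(s & e for e in entries):
                found.append(s)
    return found


matrix = discernibility_matrix(TABLE, ATTRS)
core_attrs = core(matrix)
all_reducts = reducts(matrix, ATTRS)
```

In this toy table attributes a and b carry the same information, so there are two reducts, {a, c} and {b, c}, and the core {c} is their common part.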
Simplification of the IS can be achieved by
dropping certain values of attributes, which are
unnecessary for the system, i.e., by eliminating some
of these values in such a way that we are still able to
discern all elementary sets in the system. The
procedure of finding core and reducts of the attribute
values is similar to that of finding core and reducts of
the attributes. All computations are performed based
on the discernibility matrix, but the definition of the
discernibility function is now slightly different.
Instead of one discernibility function, we have to
construct as many discernibility functions, as there are
elementary sets in the IS.
2.1.5. Classification
Let F={X1,X2,…,Xn}, Xi⊂U, be a family of subsets
of the universe U. If the subsets in F do not overlap,
i.e., Xi∩Xj=Φ for i≠j, and together they contain all
elementary sets, i.e., ∪Xi=U for i=1,…,n, then F
is called a classification of U, and the Xi are called
classes.
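The lower and upper approximations of F in B⊆A
are defined as:
$\underline{B}(F) = \{\underline{B}(X_1), \underline{B}(X_2), \ldots, \underline{B}(X_n)\}$
$\overline{B}(F) = \{\overline{B}(X_1), \overline{B}(X_2), \ldots, \overline{B}(X_n)\}$
respectively. The quality of classification is defined
as:
$\eta_B(F) = \sum_i \operatorname{card}(\underline{B}(X_i)) / \operatorname{card}(U)$
and the accuracy of classification F in B can be calculated
according to the following formula:
$\beta_B(F) = \sum_i \operatorname{card}(\underline{B}(X_i)) / \sum_i \operatorname{card}(\overline{B}(X_i))$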
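The two measures can be computed as in the Python sketch below. It reads both formulas as sums of cardinalities over the classes of F, which is an assumption about the intended notation, and it uses hypothetical elementary sets and classes.

```python
from fractions import Fraction


def approximations(classes, X):
    """Return the (lower, upper) approximation of X from the elementary sets."""
    lower = set().union(*(c for c in classes if c <= X))
    upper = set().union(*(c for c in classes if c & X))
    return lower, upper


def quality_and_accuracy(classes, family, universe):
    """Quality and accuracy of the classification `family` (sum-based reading)."""
    pairs = [approximations(classes, X) for X in family]
    n_lower = sum(len(lower) for lower, _ in pairs)
    n_upper = sum(len(upper) for _, upper in pairs)
    return Fraction(n_lower, len(universe)), Fraction(n_lower, n_upper)


# Hypothetical elementary sets of Ind(B) and a two-class classification F.
U = {"x1", "x2", "x3", "x4", "x5", "x6"}
classes = [{"x1", "x2"}, {"x3"}, {"x4", "x5"}, {"x6"}]
F = [{"x1", "x2", "x4"}, {"x3", "x5", "x6"}]

quality, acc = quality_and_accuracy(classes, F, U)
```

Four of the six objects can be assigned to a class with certainty, giving a quality of 2/3. The elementary set {x4, x5} is counted in both upper approximations, which lowers the accuracy to 1/2.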
2.2 Decision table
A knowledge representation system containing the set
of attributes A (now called condition attributes) and the
set of decision attributes D is called a decision table.
As we will show further, decision tables are also
useful for classification.
2.2.1. D-superfluous attributes
An attribute ai belonging to the condition set of
attributes B (where B⊆A) is D-superfluous if it exerts
no influence on the lower approximation of D, i.e., if
$POS_B(D) = POS_{B-\{a_i\}}(D)$. Otherwise, attribute ai is
D-indispensable in A.
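A minimal Python sketch of this test follows. It computes the positive region as the union of all elementary sets whose objects share one decision value, and then drops each condition attribute in turn. The decision table and its attribute names are hypothetical.

```python
from collections import defaultdict

# Hypothetical decision table: condition attributes a, b, c and decision d.
TABLE = {
    "x1": {"a": 1, "b": 0, "c": 2, "d": 1},
    "x2": {"a": 1, "b": 0, "c": 1, "d": 1},
    "x3": {"a": 0, "b": 1, "c": 1, "d": 0},
    "x4": {"a": 1, "b": 0, "c": 0, "d": 0},
    "x5": {"a": 0, "b": 1, "c": 2, "d": 0},
}
CONDITIONS = ["a", "b", "c"]


def positive_region(table, cond, dec):
    """Objects whose elementary set under `cond` has a single decision value."""
    groups = defaultdict(list)
    for obj, row in table.items():
        groups[tuple(row[a] for a in cond)].append(obj)
    pos = set()
    for members in groups.values():
        if len({table[o][dec] for o in members}) == 1:
            pos.update(members)
    return pos


full = positive_region(TABLE, CONDITIONS, "d")
# An attribute is D-indispensable if removing it changes the positive region.
d_core = [
    a for a in CONDITIONS
    if positive_region(TABLE, [b for b in CONDITIONS if b != a], "d") != full
]
```

Dropping a or b leaves the positive region unchanged, so each is D-superfluous on its own, while dropping c merges x1, x2 and x4 into one inconsistent elementary set.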
2.2.2. Relative core and relative reducts of attributes
The set of all D-indispensable attributes in A is called
the D-core of A, whereas the minimal subsets of
condition attributes that discern all equivalence classes
of the relation Ind(D) discernible by the entire set of
attributes are called D-reducts.
Relative reducts can be computed using a slightly
modified discernibility matrix. An element of the
D-discernibility matrix of A is defined as the set of all
condition attributes that discern objects which do not
belong to the same equivalence class of the
relation Ind(D), i.e., to the same class. The D-core is
the set of all single elements of the D-discernibility
matrix of A.
2.3 Main steps of decision table analysis
• construction of elementary sets in D-space,
• calculation of upper and lower approximations of
the elementary sets in D,
• finding the D-core and D-reducts of the A attributes,
• finding the D-core and D-reducts of the A attribute
values.
2.3.1. Decision rules
The above described decision table can also be
regarded as a set of decision (classification) rules of the
form $a_k^i \Rightarrow d_j$, where $a_k^i$ means that “attribute $a_k$
has value i” and the symbol “⇒” denotes propositional
implication. In the decision rule $\theta \Rightarrow \varphi$, formulas
$\theta$ and $\varphi$ are called condition and decision, respectively.
Minimization of a set of attributes and values of
attributes with respect to another set of attributes
simply means a reduction of unnecessary conditions in
the decision rules, which is also known as the
generation of decision rules from the data.
2.3.2. New decisions
Logical rules derived from experimental data may be
used to support new decisions. New objects can be
classified by matching their description to one of the
logical rules. The matching procedure
may lead to one of four situations.
(a) the new object matches exactly one of the
deterministic logical rules;
(b) the new object matches exactly one of the non-
deterministic logical rules;
(c) the new object matches no logical rules;
(d) the new object matches more than one logical rule.
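The four situations can be made concrete with a small Python sketch. The rule base, the attribute names and the rule representation (a condition dictionary paired with a set of possible decisions, where more than one decision marks a non-deterministic rule) are all hypothetical.

```python
# Hypothetical rule base: (conditions, possible decisions).
RULES = [
    ({"polarity": "high", "volatile": "no"}, {"strong"}),   # deterministic
    ({"polarity": "medium"}, {"strong", "weak"}),           # non-deterministic
    ({"polarity": "low"}, {"weak"}),                        # deterministic
    ({"volatile": "yes"}, {"weak"}),                        # deterministic
]


def match(obj, rules):
    """Return the situation label ('a'..'d') and the supported decisions."""
    hits = [
        (cond, dec) for cond, dec in rules
        if all(obj.get(k) == v for k, v in cond.items())
    ]
    if not hits:
        return "c", set()                       # no rule matches
    if len(hits) > 1:
        merged = set().union(*(dec for _, dec in hits))
        return "d", merged                      # several rules match
    _, dec = hits[0]
    return ("a" if len(dec) == 1 else "b"), dec


case_a = match({"polarity": "high", "volatile": "no"}, RULES)
case_b = match({"polarity": "medium", "volatile": "no"}, RULES)
case_c = match({"polarity": "high", "volatile": "unknown"}, RULES)
case_d = match({"polarity": "low", "volatile": "yes"}, RULES)
```

Only situation (a) yields an unambiguous classification. In situation (d) the decisions of all matching rules are collected here, and how to resolve a conflict among them is left open.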
2.4 Types of attributes
There are different types of attributes.
Quantitative attributes represent measurable
properties of objects. Their values are ordered by
definition. Examples: temperature, pH, concentration.
Qualitative attributes are expressed in linguistic
terms. They can be divided into two classes. (1)
Ordered qualitative attributes. The values of these
attributes can be ordered along an axis of significance.
The order of the linguistic values can be represented
by a sequence of increasing or decreasing numbers
encoding them. Example: polarity=(low, medium,
high) can be coded as low=1, medium=2 and high=3.
(2) Unordered qualitative attributes (nominal attributes).
The linguistic values of these attributes cannot be
ordered; in other words, it is impossible to arrange
them along any axis of significance.
Application of RST to qualitative attributes is
straightforward. For nominal attributes, RST offers
evident advantages when compared with other
classifiers. To use this type of attribute as an input to
classical classifiers, one has to code it in a special way.
Each linguistic value is represented by a separate
input (variable, feature). This encoding, called
one-from-k, creates a binary vector, the elements of which
correspond to new inputs. When an attribute takes a
particular value, the corresponding vector element is
set to 1, while the others are set to 0. This type of coding
causes a drastic extension of data dimensionality.
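The growth in dimensionality is easy to see in a short Python sketch of one-from-k coding. The nominal attribute and its values are hypothetical.

```python
def one_from_k(values):
    """Encode a nominal attribute as one binary input per distinct value."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, vectors


# Hypothetical nominal attribute observed for four objects.
solvent_type = ["protic", "aprotic", "protic", "nonpolar"]
categories, vectors = one_from_k(solvent_type)
```

A single nominal attribute with three values becomes three binary inputs, whereas RST can use the original attribute column as it stands.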
Continuous condition attributes present a problem,
as in this case a discretization is required. Both the
number of subranges and their intervals have to be
optimized. The number of subranges determines the
number of logical rules considered. The number of
rules is not given in advance, but is limited by the
general requirement that the learning objects should
confirm the rules.
There are two possible approaches to the
discretization problem. One can optimize coding
taking into account only the similarities of the objects
in the attributes’ space or one can maximize the
predictive properties of the information system in the
stage of coding. Although the second approach seems
to be more interesting it has its limitations. As pointed
out by Ziarko et al. in ref. a low roughness setting, i.e.,
many small subranges, will lead to weak rules, i.e.,
rules not supported by many examples, which may be
in contradiction to the expert’s experience. When the
roughness parameter is set high, generalized rules are
produced with many supporting cases which lead to
strong rules.
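The trade-off between the number of subranges and the support of the resulting rules can be illustrated with a Python sketch. Equal-width binning is used here only as a simple stand-in for the first approach; it is an assumption and not a method named in the text, and the attribute values are hypothetical.

```python
from collections import Counter


def equal_width_codes(values, n_bins):
    """Code each value by the index of its equal-width subrange."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    codes = []
    for v in values:
        k = int((v - lo) / width) if width > 0 else 0
        codes.append(min(k, n_bins - 1))   # the maximum falls into the last bin
    return codes


# Hypothetical values of one continuous condition attribute.
values = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

coarse = equal_width_codes(values, 3)   # few wide subranges
fine = equal_width_codes(values, 7)     # many small subranges

support_coarse = Counter(coarse)        # objects per subrange
support_fine = Counter(fine)
```

With seven subranges every code is backed by a single object, so any rule built on it has minimal support, while three subranges leave at least two objects behind each code.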
3. RST application to chemical solvents selection: a case study
In order to illustrate RST, we have chosen to apply it
to a chemical solvents selection problem. A small data
set taken from the experimental data record of Asta
Company is used (see Table 3). The example is chosen
to be simple for tutorial reasons but it illustrates most
aspects of the RST approach.
γ-Lactam (pyrrolidinone) derivatives are a very
important class of bioactive compounds that widely
exist in natural products, and also serve as versatile
building blocks for the synthesis of drug materials and
natural products [1]. In our search for a practical process
to manufacture optically pure γ-lactam, we investigated
all factors that affect the formation of chiral γ-lactam.
In this communication, we report an interesting
approach to obtain high optical purity of γ-lactam by
using molecular sieves (MS) as catalyst together with
microwave irradiation. Most interestingly, rough set
theory (RST), a soft computing method introduced
previously, was successfully used in solvent selection
for this reaction.
[Scheme 1: cyclization of a G-substituted amino acid (H2N, COOH) to the corresponding γ-lactam under microwave irradiation with MS]
3.1. Molecular Sieves Promoting Formation of γ-lactam
γ-Lactams can be synthesized from amino acids [3]
(Scheme 1) under traditional conditions by using water
as solvent. However, this traditional method not only
requires a long reaction time, but also gives a certain
percentage of racemized products for chiral amino
acids. For example, producing (D)- or (L)-pyroglutamic
acid from (D)- or (L)-glutamic acid gives
15% racemized product.
In order to avoid racemization, we tried various
reaction conditions and found that reaction time is key
to optical selectivity. To speed up the reaction, we added
4 Å molecular sieves (10 wt%) to the above reaction
mixture. As expected, the chemical yield increased from
45% to 58% (Table 1). Meanwhile, the racemized product
decreased from 15% to 10%.
Table 1: Effect of molecular sieves on the yield in the
formation of (D)- and (L)-pyroglutamic acid.
3.2. Microwave Irradiation in Formation of γ-lactam
Although the yield increased by 14% and the optical purity
increased by 5% with molecular sieves, the process is still not
suited for scale-up.
The use of microwave technology in chemical
reactions has become more and more popular in recent
years [4]. Microwaves can promote reactions. With this
idea, we carried out a set of comparison experiments under
900 W microwave irradiation. The result showed that
the reaction time of pyroglutamic acid formation in
water decreased from 48 hrs to 2 hrs and the yield also
was improved. Combining MS and microwave
irradiation, the reaction time further decreased 30
minutes and the yield was improved. Only 2 to 3%
racemized product was detected (Table 2).
Table 2: Effect of microwave irradiation on the formation of (D)-
and (L)-pyroglutamic acid.
3.3. Rough Set Theory in Solvent Selection
Rough set theory may predict the suitable solvents.
Following RST, we created a rough decision table
(Table 3).
Table 3: The physical constants of the selected solvents.
We chose only 8 attributes to describe the
solvents and only 20 objects. The final decision rules
can be computed with the rough set method introduced
previously or with the software RIDAS.
We obtained several rules; for the real
environment we chose one, which reads as follows: if
ε≥35∧Cp≥35∧μ≥3, then the microwave effect of the
corresponding solvents should be strong.
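Applying the selected rule to candidate solvents is a simple conjunction test, sketched below in Python. The solvent names and attribute values are placeholders and are not taken from Table 3; the argument names eps, cp and mu stand for ε, Cp and μ.

```python
def microwave_effect_is_strong(eps, cp, mu):
    """Selected rule: eps >= 35 AND Cp >= 35 AND mu >= 3 implies a strong effect."""
    return eps >= 35 and cp >= 35 and mu >= 3


# Placeholder attribute values, NOT the values of Table 3.
candidates = {
    "solvent A": {"eps": 40.0, "cp": 50.0, "mu": 3.5},
    "solvent B": {"eps": 80.0, "cp": 75.0, "mu": 1.8},
    "solvent C": {"eps": 2.0, "cp": 150.0, "mu": 0.4},
}

# The rule only asserts a strong effect when all three conditions hold;
# it makes no statement about solvents that fail the test.
strong = [
    name for name, attrs in candidates.items()
    if microwave_effect_is_strong(**attrs)
]
```

Only the placeholder solvent that meets all three thresholds is selected; failing a single condition is enough for the rule not to fire.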
We assumed: the stronger the microwave effect is,
the higher the yield is. The microwave effect of
thirteen solvents was compared. Based on RST
calculation, DMSO and DMF should display high
microwave effect. The following order was obtained for
the selected solvents: DMSO > DMF > CH3CN > H2O
> toluene > dioxane.
To verify the calculation, we repeated the
cyclization of glutamic acid under microwave
irradiation with a catalytic amount of molecular sieves,
using DMSO and DMF as solvents. The result is
encouraging: over 90% yield was obtained and no
racemized product was detected in DMSO (Table 4).
Product                  Yield / Time (in DMF)   Yield / Time (in DMSO)
(D)-pyroglutamic acid    86% / 30 min            95% / 5 min
(L)-pyroglutamic acid    87% / 30 min            95% / 5 min
Table 4: Effect of the solvent on the formation of (D)- and (L)-pyroglutamic acid.
The synthesis of a series of functionalized γ-lactams from substituted
4-aminobutyric acids in DMSO as solvent was studied. (The
experimental results are shown in Table 5.)
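Entry   Substrate                       Product                      Yield (%)   ee (%)
1       H2N…COOH                        γ-lactam                     96          racemic
2       H2N…COOH, COOH-substituted      γ-lactam, COOH-substituted   95          99
3       H2N…COOH, COOH-substituted      γ-lactam, COOH-substituted   95          98
4       H2N…COOH, OH-substituted        γ-lactam, OH-substituted     87          99
5       H2N…COOH, OH-substituted        γ-lactam, OH-substituted     85          racemic
(Structure drawings are not reproduced.)
Table 5: Substituted 4-aminobutyric acid to γ-lactam under microwave radiation in DMSO promoted by molecular sieves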
The yields for the above cyclization under combined
microwave irradiation and molecular sieves are very
good [6]. Most importantly, no racemization was
observed for entries 2, 3 and 4.
In conclusion, γ-lactams were synthesized directly
from substituted 4-aminobutyric acids under
microwave irradiation in high yield and without
racemization. The method has the advantages of a short
reaction time and of being more economical and more
environmentally friendly than
traditional methods. Two solvents were selected by
RST and showed good activity in the synthesis of
γ-lactams.
4. Conclusions
RST is a methodology which has demonstrated its
usefulness in multi-attribute decision making, and
the above case study has shown that solvent selection
based on rough set theory can optimize the reaction
process and reduce the cost of chemical research and
production.
Acknowledgement
This work is partially supported by the SWJTU
Doctoral Innovation Fund(2005JCZ), the Sichuan
Province Education Dept. Nature Science Youth
Fund(2006B090), and the MEPRC Doctoral Science
Research Fund(20060613019).
References
[1] Z. Pawlak, Rough sets, Int. J. Inf. Comput. Sci., 11:
341-356, 1982.
[2] Z. Pawlak, Rough Sets: Theoretical Aspects of
Reasoning about Data, Kluwer Academic
Publishers, 1991.
[3] G. M. B. Anthony, H. John, L. S. Marie, S. S. Nicholas,
A. J. P. White, and D. J. Williams, J. Org. Chem., 64:
600-605, 1999.
[4] Jiang Chaozhe, Hupei, Xufang, Survey of Rough
Set theory and application, Computer Science and
Practice, 180: 160-170, 2004.
[5] Turui, Jiang Chaozhe, Wurui, The application of
Rough set theory in chemical synthesis, Computer
Science, 31: 124-127, 2004.
[6] M. Mary and H. Paul, Tetrahedron Letters, 29:
3049-3056, 1998.
[7] P. T. Anastas, J. C. Warner, Green Chemistry:
Theory and Practice, Oxford University Press,
1998.