Big Data Analyses for Collective Opinion Elicitation in Social Networks
Yingxu Wang
Dept. of Electrical and Computer Engineering
Schulich School of Engineering, Univ. of Calgary
Calgary, Alberta, Canada T2N 1N4
e-mail: yingxu@ucalgary.ca
Victor J. Wiebe
Dept. of Electrical and Computer Engineering
Schulich School of Engineering, Univ. of Calgary
Calgary, Alberta, Canada T2N 1N4
e-mail: victor_mx@shaw.ca
Abstract—Big data are extremely large-scaled data in terms of
quantity, complexity, semantics, distribution, and processing
costs in computer science, cognitive informatics, web-based
computing, cloud computing, and computational intelligence.
Censuses and elections are a typical paradigm of big data
engineering in modern digital democracy and social networks.
This paper analyzes the mechanisms of voting systems
and collective opinions using big data analysis
technologies. A set of numerical and fuzzy models for
collective opinion analyses is presented for applications
in social networks, online voting, and general elections.
A fundamental insight on the collective opinion
equilibrium is revealed among electoral distributions
and in voting systems. Fuzzy analysis methods for
collective opinions are rigorously developed and applied
in poll data mining, collective opinion determination,
and quantitative electoral data processing.
Keywords-Big data; big data engineering; numerical
methods; fuzzy big data; social networks; voting; opinion
poll; collective opinion; quantitative analyses
I. INTRODUCTION
Big data is one of the representative phenomena of the
information era of human societies [8, 16]. Almost all fields
and hierarchical levels of human activities generate
exponentially increasing data, information, and knowledge.
Therefore, big data engineering has become one of the
fundamental approaches to embody the essences of the
abstraction and induction principles in rational inferences
where discrete data represent continuous mechanisms and
semantics.
A field of big data applications is human memory and
DNA analyses in neuroinformatics, cognitive biology, and
brain science, where huge amounts of data and information
have been obtained and are pending efficient processing [1,
3, 10, 17]. For instance, the biological information contained
in a DNA molecule is identified as up to 33 Petabits, i.e.,
32,985,348,833,280,000 bits or about 32,985,348,833 Gigabits, of
genetic information according to a formal neuroinformatics
model [33].
Another paradigm of big data generated in computing is
Internet traffic, as shown in Table I based on statistics from
2012 [30]. The big data over the Internet indicate human
communication and information searching demands via
digital devices, such as over 4.6 billion mobile phones and
an equivalent number of tablets and portable computers. The
big data in this domain have pushed the daily traffic from the
rate of Terabytes (10^12 bytes) to that of Petabytes (10^15 bytes).
TABLE I. THE BIG DATA TRAFFIC ON THE INTERNET IN 2012

Data hub                 Data traffic rate/day
NYSE                     1.0 Terabytes
Twitter                  7.0 Terabytes
Facebook                 10.0 Terabytes
Google                   24.0 Petabytes
Total Internet traffic   667.0 Exabytes (10^18 bytes)
Censuses and general elections are traditional and
typical domains that demand efficient big data analysis
theories and methodologies beyond number counting [5, 13].
In modern digital societies and social networks, popular
opinion collection via online polls and voting systems
has become necessary for policy confirmation and general
elections.
One of the central sociological principles adopted in
popular elections and voting systems is the majority rule,
where each vote is treated with an equal weight [2, 4, 7]. The
conventional methods for embodying the majority rule may be
divided into two categories known as the methods of max
counting and average weighted sum. The former is the most
widely used technology, which determines the simple majority
by the greatest number of votes on a certain opinion among
multiple or binary options. The latter assigns various weights
to optional opinions, which extends the binary selection to a
wide range of weighted ratings. Classic implementations of
these voting methods were proposed by Borda, Condorcet, and
others [5, 11, 12]. Borda introduced a scale-based system
where each cast vote is attached a rank that represents an
individual's preferences [5]. Condorcet developed a voting
technology that determines the winner of an election as the
individual who is paired against all alternatives in a run-off
vote [11]. However, formal voting and general elections
mainly adopt the mechanism that implements a selection of
only one out of n options without any preassigned weight.
In this practice of applying the majority rule in societies, the
average weighted sum method is impractical.
This paper analyzes the formal mechanisms of voting
systems and collective opinion elicitation in the big data
engineering approach. The cognitive and computing
properties of big data in general, and of the electoral big data
Proc. of 2014 IEEE International Conference on Big Data Science and Engineering, Tsinghua Univ., Beijing, China
978-1-4799-6513-7/14 $31.00 © 2014 IEEE
DOI 10.1109/TrustCom/BDSE.2014.81
in particular, are explored in Section II. A set of
mathematical models and numerical algorithms for collective
opinion analyses is developed in Section III and illustrated in
Section IV. Fuzzy models for collective opinion elicitation
and aggregation are rigorously described in Section V. A set
of real-world case studies on applications of the formal
methodologies is demonstrated in big poll data mining,
collective opinion determination, and quantitative electoral
data processing.
II. PROPERTIES OF DATA IN BIG DATA ENGINEERING
This section explores the intensions and extensions of
big data as a term. The sources of big data generation are
analyzed. Special properties of big data are elaborated in
computer science, cognitive informatics, web-based
computing, and computational intelligence.
A. The Computational Properties of Big Data
Definition 1. Data, D, are an abstract representation of
the quantity Q of real-world entities or mental objects by a
quantification mapping fq, i.e.:

    D ≜ fq(Q)                                        (1)
Although decimal numbers and systems are mainly
adopted in human civilization, the basic unit of data is the bit
[9, 15], which forms the converged foundation of computer
and information sciences. Therefore, the most fundamental
form of information that can be represented and processed is
binary data. Based on the bit, complex data representations can
be aggregated into higher-level structures such as bytes, natural
numbers (ℕ), real numbers (ℝ), structured data, and
databases.
The physical model of data and data storage in computing
and the IT industry is the container metaphor, where each
bit of data requires a bit of physical memory.
Definition 2. Big data are extremely large-scaled data
across all aspects of data properties such as quantity,
complexity, semantics, distribution, and processing costs.
The basic properties of big data are that they are unstructured,
heterogeneous, monotonically growing, mostly nonverbal, and
subject to decay in information consistency or increase of entropy
over time [20]. The inherent complexity and exponentially
increasing demands create unprecedented problems in all
aspects of big data engineering, such as big data
representation, acquisition, storage, searching, retrieval,
distribution, standardization, consistency, and security.
The sources of big data are human collective
intelligence. Typical mathematical and computing activities
that generate big data are Cartesian products (O(n^2)), sorting
(O(n log n)), searching (exhaustive, O(n^2)), knowledge base
updates (O(n^2)), as well as permutation and NP problems
with O(2^n), O(n!), or even higher orders [9]. Typical human
activities that produce big data include many-to-many
communications, massive downloads of data replications,
digital image collections, and networked opinion forming.
Although the syntax of data is concrete based on
computation and type theories, the semantics of data is
fuzzy [24, 25, 27, 32, 33]. The analysis and interpretation of
big data may easily exceed the capacity of conventional
counting and statistics technologies.
B. The Cognitive Properties of Big Data
The neurophysiological metaphor of data as factual
information and knowledge in human memory is a
relational network [10, 17, 19, 26], which can be
represented by the Object-Attribute-Relation (OAR) model
[19, 29] as shown in Figure 1.
Figure 1. The OAR model of data and knowledge in memory
Definition 3. The OAR model of data and knowledge as
retained in long-term memory of the brain is a triple, i.e.:

    OAR ≜ (O, A, R)                                  (2)

where O is a finite set of objects denoting the extension of a
data concept, A is a finite set of attributes for characterizing
the data concept, and R is a set of relations between the
objects and attributes.
The OAR model enables the estimation of the memory
capacity of humans, which reveals the nature of big data as
cognitive and semantic entities in the brain. In cognitive
neurology, it is observed that there are about 10^11 neurons in
the brain, each of them with 10^3 synaptic connections on
average [3, 10]. According to the OAR model, the
estimation of the capacity of human memory for big data
representation can be reduced to a classical combinatorial
problem as follows.
Definition 4. The capacity of human memory Cm is
determined by the total potential relational combinations,
C(n, s), among all neurons n = 10^11 and their average synaptic
connections s = 10^3 to various related subsets of the entire
neurons, i.e.:

    Cm = C(10^11, 10^3) = 10^11! / (10^3! (10^11 - 10^3)!) ≈ 10^8,432 [bit]    (3)

Eq. 3 provides an analytic explanation of the upper limit
of the potential number of synaptic connections among
neurons in the brain. The model reveals that the brain does
not create new neurons to represent new information;
instead, it generates new synapses between existing neurons
in order to represent the newly acquired information.
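The magnitude of Cm in Eq. 3 can be cross-checked numerically; the following short Python sketch (a verification aid, not part of the original derivation) evaluates the base-10 logarithm of the binomial coefficient via the log-gamma function:

```python
import math

# Neurons and average synaptic connections per neuron (Definition 4)
n, s = 10**11, 10**3

# log10 of C(n, s) = n! / (s! (n - s)!), computed via the log-gamma function
# to avoid forming the astronomically large factorials directly
log10_C = (math.lgamma(n + 1) - math.lgamma(s + 1)
           - math.lgamma(n - s + 1)) / math.log(10)

print(round(log10_C))  # ≈ 8432, i.e., Cm ≈ 10^8,432 [bit]
```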
Both cognitive and computational foundations of data
explored in this section explain the nature of big data and
the need for big data engineering. The notion of big data
engineering is perceived as a field that studies the
properties, theories, and methodologies of big data as well
as efficient technologies for big data representation,
organization, manipulations, and applications in industries
and everyday life. It is noteworthy that, although the
appearance of data is discrete, the semantics and
mechanisms behind them are mainly continuous. This is the
essence of the abstraction and induction principles of natural
intelligence.
III. METHODS FOR BIG ELECTORAL DATA ANALYSES
Mathematical models and numerical methods for
rigorous voting data processing and representation are sought
in this section in order to reveal the nature of big data in
voting and collective opinions. This leads to a set of novel
methods beyond traditional counting technologies such as
regressions of opinion spectrums, adaptive integrations of
collective opinions, and allocation of the opinion
equilibrium.
A. Big Data Interpretation for Embodying the Majority
Rule in Sociology
As reviewed in Section I, the typical method for
implementing the majority rule via voting has been the
max-finding method.
Definition 5. The max function elicits the greatest
number of votes on a certain opinion, Oi, 1 ≤ i ≤ n, as the
voting result V among a set of n options, i.e.:

    V ≜ max(NO1, NO2, ..., NOn)                       (4)

where NOi is the number of votes cast for opinion Oi.
When there are only two options for the voting, Eq. 4
reduces to a binary selection.
Although the conventional max-finding method is
widely adopted in almost all kinds of voting systems, it is an
oversimplified method for accurate opinion collection. Its
major disadvantage is that the implied philosophy, the
winner takes all, often overlooks the entire spectrum
of distributed opinions. This leads to a pseudo majority
dilemma [13, 20], which is analyzed as follows.
Definition 6. The pseudo majority dilemma states that
the result of a voting based on the simple max mechanism
may not represent the majority opinion distribution cast in
the voting, i.e.:

    V ≜ max(NO0, NO1, NO2, NO3, NO4) = NOmax | Σ (i=1, i≠max, n) NOi > NOmax,
    where v̄ = (Σ (i=1, n) NOi) - NOmax                (5)
A typical case of the pseudo majority dilemma in voting
can be elaborated in the following example.
Example 1. A voting with a distributed political
spectrum from far right (NO0), right (NO1), neutral (NO2), left
(NO3), to far left (NO4) is shown in Figure 2, where
the vote distribution is X = [NO0, NO1, NO2, NO3, NO4] =
[4000, 2500, 2600, 1200, 1100]. According to the max-finding
method given in Eq. 4, the voting result is:

    V = max(NO0, NO1, NO2, NO3, NO4)
      = max(4000, 2500, 2600, 1200, 1100)
      = 4000 ⇒ O0
Figure 2. Distribution of collective opinions and their votes
The result indicates that opinion O0 is the winner and
the other votes would be ignored. However, the sum
of the rest of the opinions, Σ (i=1, i≠max) NOi = 2500 + 2600 +
1200 + 1100 = 7300, is significantly greater than NO0
according to Eq. 5. Although the maximum vote appears at 0
over the opinion spectrum, the real representative centroid of
the collective opinion is actually at about 1.3 on the spectrum.
In other words, the mean of the entire set of votes indicates
an equilibrium point of the collective opinion between those
of NO1 and NO2 rather than at NO0. Therefore, in order to
rationally analyze popular opinion distributions and the
representative collective opinion on an opinion spectrum,
advanced mathematical models, numerical methods, and fuzzy
analyses [6, 21, 32, 33] are yet to be rigorously studied for
voting data processing and representation.
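The centroid claimed in Example 1 can be verified with a short computation; the positions 0 to 4 encode the five opinions on the spectrum as in Figure 2 (a minimal Python sketch for checking, not from the original text):

```python
# Weighted centroid (equilibrium point) of the opinion distribution in Example 1
votes = [4000, 2500, 2600, 1200, 1100]   # N_O0 .. N_O4
positions = [0, 1, 2, 3, 4]              # opinion spectrum x

total = sum(votes)                       # 11,400 votes in total
centroid = sum(x * n for x, n in zip(positions, votes)) / total

print(round(centroid, 2))  # 1.38, i.e., between O1 and O2 rather than at O0
```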
B. Numerical Regression for Analyzing Opinion
Spectrum Distributions beyond Counting
On the basis of the analyses in the preceding subsection, an
overall perspective on the collective opinions cast in a
voting can be rigorously modeled as a nonlinear function
over the opinion spectrum. In order to implement a complex
polynomial regression, a numerical algorithm is developed
in MATLAB as shown in Figure 3, which can be applied to
analyze any popular opinion distribution against a certain
political spectrum represented by big voting data. In the
analysis program, a 3rd-order polynomial is adopted for
curve fitting, while other orders may be chosen where
appropriate. The general rule is that the order of the
polynomial regression m must be less than the number of
collected data points n. Data interpolation technologies may be
adopted to improve the smoothness or fill missing points of raw
data in numerical technologies [6, 28, 32].
Figure 3. Algorithm of polynomial regression for opinion distributions
Applying the algorithm VoteRegressionAnalysis(X, Y), a
specific polynomial function and a visualized perception of
the entire opinion distribution can be rigorously obtained.
Example 2. The seat distribution of Canadian parties in
the House of Commons is given in Table II [30], where
the relative position of each party on the political spectrum
is obtained based on statistics of historical data such as their
manifestos, policies, and common public perspectives [14, 18].
TABLE II. VOTING DATA DISTRIBUTION BY SEATS IN PARLIAMENT

Political party   Seats occupied   Relative position on the spectrum
New Democrats     100              -100
Bloc Quebecois    4                -71
Green             1                -43
Liberals          34               -14
Conservatives     160              50
According to the data in Table II, i.e., X = [-100, -71,
-43, -14, 0, 50] and Y = [100, 4, 1, 34, 4, 160], the voting
results can be rigorously represented by the following
function, f(x), as a result of the polynomial regression
implemented in Figure 3:

    f(x) = 0.0001x^3 + 0.005x^2 + 2.175x + 65.1182    (6)

where m = 3 and n = 6.
The above regression analysis results are visually plotted
in Figure 4. Because the polynomial characteristic function
is continuous, it can be easily processed for multiple
applications, such as opinion spectrum representation,
equilibrium determination, and analyses of policy gains
based on the equilibrium benchmark as described in the
following subsection.
Figure 4. House seats of parties on the political spectrum of Canada
C. The Collective Opinion Equilibrium Elicited from a
Spectrum of Opinion Distributions
It is recognized that the representative collective opinion
on a spectrum of opinion distributions cast in an election
is not a simple weighted average as conventionally
perceived. Instead, it is the centroid covered by the curve of
the characteristic regression function, as marked by the red
sign in Figure 4.
Definition 7. The opinion equilibrium Ξ is the natural
centroid in a given weighted opinion distribution where the
total votes of the left and right wings reach a balance at
the point k, i.e.:

    Ξ ≜ (x | x = k), ∫[-n, k] v(x)dx = ∫[k, n] v(x)dx, x, k ∈ [-n, n]    (7)

where

    ∫[-n, k] v(x)dx = (1/2) ∫[-n, n] v(x)dx = (1/2) (∫[-n, k] v(x)dx + ∫[k, n] v(x)dx)
The integration of distributed opinions based on the
regression function can be carried out using any numerical
integration technology. For instance, the iterative
Simpson's integration method for an arbitrary continuous
function f(x) over [a, b] can be described as follows:

    I = ∫[a, b] f(x)dx ≈ (h/3) Σ (p=0, n/2-1) [f(x_2p) + 4f(x_2p+1) + f(x_2p+2)],
    h = (b - a) / n                                   (8)

The collective opinion equilibrium method as modeled in
Eq. 7 is implemented in the algorithm shown in Figure 5.
The core integration method adopted in the algorithm is
based on the built-in function quad() in MATLAB [6], which
implements Eq. 8.
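The equilibrium search of Eq. 7 and Figure 5 amounts to scanning for the point where the cumulative integral reaches half of the total. A Python sketch with a hand-rolled composite Simpson rule (standing in for MATLAB's quad(); the test density v is an illustrative placeholder) shows the idea:

```python
def simpson(f, a, b, n=200):
    """Composite Simpson integration of f over [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * p + 1) * h) for p in range(n // 2))
    s += 2 * sum(f(a + 2 * p * h) for p in range(1, n // 2))
    return s * h / 3


def equilibrium(f, lo, hi, step=0.01):
    """Scan for the point k where the left integral reaches half of the total (Eq. 7)."""
    total = simpson(f, lo, hi)
    k = lo
    while k < hi and simpson(f, lo, k) < total / 2:
        k += step
    return k


# Sanity check: a uniform density must balance at the center of its spectrum
v = lambda x: 1.0
k = equilibrium(v, -10.0, 10.0)
print(abs(k) < 0.05)  # True
```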
The regression and plotting procedure of Figure 3 is
implemented in MATLAB as follows:

function VoteRegressionAnalysis(X, Y)
% Curve fitting by polynomial regression (Figure 3)
format long;
m = 3;                        % order of the polynomial regression
n = length(X);                % number of data points
p = polyfit(X, Y, m);         % regression coefficients p(1..m+1)
% Vote distribution regression
f = @(x) polyval(p, x);       % the fitted characteristic function f(x)
fprintf('f(x) = %10.4f*x^3 + (%10.4f)*x^2 + (%10.4f)*x + (%10.4f)\n', ...
        p(1), p(2), p(3), p(4));
plot(X, Y, '*r-'); hold on;
fplot(f, [X(1) X(n)]);
xlabel('Opinion spectrum (x)'); ylabel('Vote count f(x)');
legend('Projected votes', 'PolyRegression');
% Find the opinion equilibrium
[TotalOpinionIntegration, Equilibrium] = ...
    VoteEquilibriumAnalysis(f, X(1), X(n));   % call subfunction (Figure 5)
plot(Equilibrium, 0, '+r');

Example 3. Applying the collective opinion equilibrium
determination algorithm to the opinion distribution data of
the Canadian general election as given in Figure 4, the
opinion equilibrium is obtained as Ξ1 = 20.3. The result
indicates that the overall national opinion equilibrium was at
the mid-right as cast in 2011.
Figure 5. Algorithm of collective opinion equilibrium determination
Because the collective opinion equilibrium Ξ is the
centroid of the opinion integration as defined in Eq. 7, the
equilibrium cannot be simply determined or empirically
allocated without the computational algorithm (Figure 5) as
demonstrated in Example 3.
IV. BIG ELECTORAL DATA PROCESSING
Using the methodologies developed in Section III, useful
applications will be demonstrated in this section with real-
world data. The case studies encompass the analysis of a
series of general elections in order to find out the dynamic
equilibrium shifts and the extrapolation of potential policy
gains based on the historical electoral data.
A. Analysis of a Series of Historical Elections Based
on Equilibrium Benchmarking
A benchmark of opinion equilibrium can be established
on the basis of a series of historical electoral data. Based
on it, trends of the political equilibria can be rigorously
analyzed in order to explain: a) What was the extent of
serial shifts as cast in the general elections? and b) Which
party was closer to the political equilibrium represented by
the collective opinions cast in the general elections?
Example 4. The trend in Canadian popular votes over
time can be benchmarked by the results of the last four
general elections as given in Table III. Applying the opinion
equilibrium determination algorithm VoteEquilibriumAnalysis
as given in Figure 5, the collective opinions distributed in
Figure 6 can be rigorously elicited, which indicates a
dynamic shifting pattern of the collective opinion
equilibriums, i.e., 5.0 → 7.4 → 7.0 → 10.9, on the political
spectrum between [-100, 100] during 2004 to 2011.
The opinion equilibrium determination method provides
insight for revealing the implied trends and the entire
collective opinions distributed on the political spectrum. An
interesting finding in Example 4 is that, although several
parties on the left of the spectrum, -100 ≤ x < 0, had won
significant numbers of votes as shown in Table III and Figure
6, the collective opinion equilibrium had mainly remained
unchanged in the area of the central-right, where Ξ = 7.6 on
average.
TABLE III. HISTORICAL ELECTORAL DATA DISTRIBUTIONS OF
CANADIAN GENERAL ELECTIONS
Figure 6. Polynomial regressions for federal elections during 2004 to 2011
B. Extrapolation for Potential Policy Gains Based on
Benchmarked Collective Opinion Equilibrium
The key objective of a party in a general election is to
rationally predict what the potential gain would be for a
certain policy making or shifting. The theory of the
collective opinion equilibrium as developed in preceding
sections suggests that this target can be reached by adapting
current policies towards the equilibrium benchmark.
Definition 8. A target gain in elections can be
extrapolatively projected via a necessary shift of policy Δx
= n' - n that satisfies the equilibrium benchmark Ξ by an
updated regression of expected opinion distributions v'(x):

    a) At the right end: Δx = n' - n,
       when ∫[k, n'] v'(x)dx = Io / 2, n' ∈ [k, n]
    b) At the left end: Δx = n' - n,
       when ∫[n', k] v'(x)dx = Io / 2, n' ∈ [-n, k]
    c) In between: Δx = a' - a, a, a' ∈ (-n, n),
       when ∫[a, a'] v'(x)dx = 1.1 v(a), a' ≥ a          (9)

where the original position n, the point of equilibrium Ξ (k |
x = k), and the total integrated votes Io are known as results
of the analyses in Section III.
Political party       Relative position   2004        2006        2008        2011
Conservatives         50                  4,019,498   5,374,071   5,209,069   5,832,401
Liberals              -14                 4,982,220   4,479,415   3,633,185   2,783,175
Green                 -43                 582,247     664,068     937,613     576,221
Bloc Quebecois        -71                 1,680,109   1,553,201   1,379,991   889,788
New Democrats         -100                2,127,403   2,589,597   2,515,288   4,508,474
Opinion equilibrium                       5.0         7.4         7.0         10.9
function [TotalOpinionIntegration, Equilibrium] = ...
    VoteEquilibriumAnalysis(f, Xl, Xu)
% The integration of total opinion in the voting (Figure 5)
TotalOpinionIntegration = quad(f, Xl, Xu);   % Simpson integration
% Find the opinion equilibrium by iterative integration
h = 0.1;
for MidPoint = Xl : h : Xu
    IGi = quad(f, Xl, MidPoint);   % cumulative Simpson integration
    if IGi >= TotalOpinionIntegration / 2
        break
    end
end
Equilibrium = MidPoint;
The extrapolation method is divided into three cases
according to the position of the party on the spectrum,
which can be at either end or in the middle of the spectrum.
Example 5. Given a target of a 10% gain in terms of the
number of votes in a future election for a party at n =
50 on the spectrum, what kind of policy adjustments may
be needed to contribute towards the expected objective,
based on the equilibrium benchmark cast in the latest
election as obtained in Example 4?
Based on the historical data provided in Table III,
i.e., X = [-100, -71, -43, -14, 50], Y2011 = [4508474, 889788,
576221, 2783175, 5832401], and the total opinion integration
Io = 441854649, the projected electoral improvement
problem can be represented as follows:

    X' = [-100, -71, -43, -14, n']
    Y' = [4508474, 889788, 576221, 2783175, 5832401 × 1.1 = 6415641]
    Io = 441854649
Solving the problem according to Eq. 9(a), the following
results are obtained: n' = 48.2 and Δx = n' - n = 48.2 - 50 =
-1.8. That is, in order to gain 10% more votes, the party
would need to shift its policy 1.8 steps toward the collective
equilibrium Ξ = 10.9, where the negative sign indicates a
move toward the middle. In cases where other factors
change as well, the problem becomes a multi-party
gaming system. However, at any given moment, the
system is still determinable based on the same analysis
method and algorithms as presented in Sections III and IV.
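Numerically, a case such as Eq. 9(a) reduces to a one-dimensional root search: find the boundary n' at which an integral of the updated distribution v'(x) meets its target level. The Python sketch below illustrates this with a bisection search; the density and the target value are illustrative placeholders, not the paper's regression:

```python
def bisect_upper_bound(integral, lo, hi, target, tol=1e-8):
    """Find u in [lo, hi] with integral(lo, u) == target,
    assuming integral(lo, u) is increasing in u."""
    a, b = lo, hi
    while b - a > tol:
        mid = (a + b) / 2
        if integral(lo, mid) < target:
            a = mid
        else:
            b = mid
    return (a + b) / 2


# Illustrative density v'(x) = 2x on [0, 10], whose integral from 0 to u is u^2
integral = lambda lo, u: u * u - lo * lo
n_prime = bisect_upper_bound(integral, 0.0, 10.0, target=25.0)
print(round(n_prime, 3))  # 5.0, since u^2 = 25 at u = 5
```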
V. FUZZY METHODS FOR COLLECTIVE OPINION
ELICITATION AND ANALYSIS BASED ON BIG POLL DATA
Big data analysis technologies for collective opinion
elicitation based on historical data have been demonstrated
in preceding sections, which reveal that a party may gain
more votes by adapting its policy towards the political
equilibrium established in past elections. It is recognized
that a social system is conservative and does not change
rapidly over time, because of its huge population base and
human cognitive tendencies, according to the long-lifespan
system theory [23]. However, the collective opinion
equilibriums do shift dynamically. Therefore, an advanced
technology for enhancing potential policy gains is to
calibrate the current collective opinion equilibrium by polls
in order to support up-to-date analysis and prediction.
A. Fuzzy Elicitation of Collective Opinion from Big
Poll Data Samples
The typical technology for detecting the current collective
opinion equilibrium is polling. A poll may be designed to
test the impact of a potential policy in order to establish a
newly projected equilibrium. The projected equilibrium will
be used to update and adjust the historical benchmark. In
this approach, rational predictions of policy gains towards a
general election or a social network voting can be obtained
in a series of analytic regressions as formally described in
the remainder of this subsection.
Definition 9. An opinion oi on a given policy pi is a
fuzzy set of degrees of weights ω_ij^k expressed by j, 1 ≤ j ≤ m,
groups in the uniform scale I, i.e.:

    oi ≜ f: pi → I, I = [0, 1]
       = {(pi, ω_i1), (pi, ω_i2), ..., (pi, ω_im)}
       = R (j=1, m) {(pi, ω_ij)}                       (10)
where the big-R notation represents recurring entities or
repetitive functions indexed by the subscript [20].
The normalized scale for fuzzy analyses is a universal
one because any other scale can be mapped into it.
Definition 10. A collective opinion OP on a set of n
policies pi, 1 ≤ i ≤ n, is a compound opinion as a fuzzy set of
average weights ω_ij = (1/Nq) Σ (k=1, Nq) ω_ij^k on each policy, i.e.:

    OP ≜ R (i=1, n) R (j=1, m) {o_ij}
       = R (i=1, n) R (j=1, m) {(pi, ω_ij)}, ω_ij = (1/Nq) Σ (k=1, Nq) ω_ij^k ∈ [0, 1]

         | (p1, ω_11)  (p1, ω_12)  ...  (p1, ω_1m) |
       = | (p2, ω_21)  (p2, ω_22)  ...  (p2, ω_2m) |   (11)
         | ...         ...         ...  ...        |
         | (pn, ω_n1)  (pn, ω_n2)  ...  (pn, ω_nm) |

where OP may be aggregated against the averages of each
row or column, which indicate the collective opinions on a
certain policy cast by all groups or those of all policies of a
certain group, respectively, as illustrated in Table IV.
Definition 11. The effect E of a set of policies is a
fuzzy matrix of the average weighted differences between
the current opinions ω_ij and the historical ones ω'_ij for the
ith policy on the collective opinion of the jth group, i.e.:

    E ≜ R (i=1, n) R (j=1, m) {(pi, ω_ij - ω'_ij)}, 1 ≤ i ≤ n, 1 ≤ j ≤ m    (12)
Definition 12. The impact I of a policy is a fuzzy
matrix of products of effects E and the corresponding group
sizes N_gj, i.e.:

    I ≜ R (i=1, n) R (j=1, m) {N_gj × E_ij}
      = R (i=1, n) R (j=1, m) {(pi, N_gj (ω_ij - ω'_ij))}    (13)

where the ± sign indicates a positive or negative impact on a
target group, respectively.
Definition 13. The gain of policy impacts, G, is a fuzzy
set of the mathematical means of the cumulative impacts
that each group obtains as a result of the series of
aggregations from the initial poll data, i.e.:

    G ≜ R (j=1, m) {(1/n) Σ (i=1, n) I_ij}              (14)
B. Fuzzy Analyses of Potential Policy Impacts in Votes
The fuzzy methodologies for collective opinion
elicitation and analysis from big poll data as developed in
Section V.A are illustrated in application case studies in the
following examples.
Example 6. The collective opinion OP on a set of 3
testing policies against 5 groups on the political spectrum
can be elicited based on a set of large-sample poll data as
summarized in Table IV. The current average weights of
opinions ω_ij and the historical ones ω'_ij are aggregated
from the sample data of individual opinions according to
Eqs. 10 and 11.
TABLE IV. SAMPLE POLL DATA OF COLLECTIVE OPINIONS

(ω_ij, ω'_ij)   G1         G2         G3         G4         G5
p1              0.2, 0.1   0.4, 0.4   0.9, 0.7   0.7, 0.6   0.5, 0.9
p2              1.0, 0.9   0.5, 0.7   0.8, 0.9   0.5, 0.4   0.6, 0.5
p3              0.5, 0.2   0.6, 0.3   0.3, 0.5   0.7, 0.6   1.0, 0.8
Definition 14. The complexity or size of poll data, C_P,
is proportional to the number of testing policies |P|, the
number of groups on the spectrum |G|, and the number of
sampled individuals Nq, i.e.:

    C_P = |P| × |G| × Nq                              (15)

where 2,000 tests in a poll will result in 30,000 raw
individual opinions in the settings of Example 6.
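The arithmetic of Eq. 15 for the settings of Example 6 can be checked directly (a trivial Python sketch):

```python
# Eq. 15: complexity of the poll data in the settings of Example 6
P, G, Nq = 3, 5, 2000   # testing policies, spectrum groups, sampled individuals
C_P = P * G * Nq

print(C_P)  # 30000 raw individual opinions
```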
Example 7. Based on the summarized poll data as given
in Table IV with the average collective opinions, the fuzzy
set of effects E of the ith policy on the collective opinion of
the jth group can be quantitatively determined according to
Eq. 12 as follows:

    E = R (i=1, 3) R (j=1, 5) {(pi, ω_ij - ω'_ij)}, n = 3, m = 5

        | 0.2  0.4  0.9  0.7  0.5 |   | 0.1  0.4  0.7  0.6  0.9 |
      = | 1.0  0.5  0.8  0.5  0.6 | - | 0.9  0.7  0.9  0.4  0.5 |
        | 0.5  0.6  0.3  0.7  1.0 |   | 0.2  0.3  0.5  0.6  0.8 |

        |  0.1   0     0.2   0.1  -0.4 |
      = |  0.1  -0.2  -0.1   0.1   0.1 |
        |  0.3   0.3  -0.2   0.1   0.2 |

where the most effective policy is p3 → {G1, G2} with a 30%
improvement, while the most negatively effective policy is
p1 → G5 with a 40% loss.
Example 8. On the basis of Table IV and Example 7, the impact I of each tested policy is a fuzzy matrix of the products of individual group size and effect, which projects the ith policy onto the jth group of size N_gj, i.e.:

    I = {(p_i, N_gj (W'_ij − W_ij))}, i = 1..n, j = 1..m, n = 3, m = 5

        [ 0.1×4508474    0×889788     0.2×576221   0.1×2783175  −0.4×5832401 ]
      = [ 0.1×4508474  −0.2×889788   −0.1×576221   0.1×2783175   0.1×5832401 ]
        [ 0.3×4508474   0.3×889788   −0.2×576221   0.1×2783175   0.2×5832401 ]

        [  450847        0    115244   278318  −2332960 ]
      = [  450847  −177958   −57622   278318    583240 ]
        [ 1352542   266936  −115244   278318   1166480 ]

where N_g = [4508474, 889788, 576221, 2783175, 5832401] according to the 2011 data in Table III.
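The impact matrix of Example 8 is a column-wise scaling of the effects by the group populations. A sketch assuming the 2011 populations quoted from Table III; rounding to whole votes reproduces the figures above:

```python
# Group populations (2011 data quoted from Table III), G1..G5
N_g = [4508474, 889788, 576221, 2783175, 5832401]

# Effect matrix from Example 7 (policies p1..p3 over groups G1..G5)
E = [[0.1, 0.0, 0.2, 0.1, -0.4],
     [0.1, -0.2, -0.1, 0.1, 0.1],
     [0.3, 0.3, -0.2, 0.1, 0.2]]

# Impact I_ij = N_gj * E_ij, rounded to whole votes
I = [[round(n * e) for n, e in zip(N_g, row)] for row in E]
for row in I:
    print(row)
```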
Example 9. The potential average gain of policy impacts G_j can be derived according to Eq. 14 based on the results in Example 8 as follows:

    G_j = (1/n) Σ_{i=1..n} I_ij, j = 1..m, n = 3, m = 5

              [  450847        0    115244   278318  −2332960 ]
      = (1/3) [  450847  −177958   −57622   278318    583240 ]   (summed per column)
              [ 1352542   266936  −115244   278318   1166480 ]

      = {751412, 29660, −19207, 278317, −194413}
Figure 7. Polynomial regressions of projected voting gains (x-axis: opinion spectrum x, −100 to 50; y-axis: votes f(x), ×10^6; series: projected votes, historical votes, projected regression)
The projected gains or losses, G, over the political spectrum produce a new set of estimated electoral distributions Y = Y' + G = [4508474, 889788, 576221, 2783175, 5832401] + [751412, 29660, −19207, 278317, −194413] = [5259886, 919448, 557014, 3061492, 5637988].
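The averaging and projection steps (Example 9 and Y = Y' + G) can be sketched as below. Note that because this sketch starts from the impact matrix already rounded to whole votes, two of the computed means differ from the paper's quoted gains by one vote (29659 vs. 29660 and 278318 vs. 278317), and the projected totals inherit that difference:

```python
# Impact matrix from Example 8 (whole votes), policies p1..p3 over G1..G5
I = [[450847, 0, 115244, 278318, -2332960],
     [450847, -177958, -57622, 278318, 583240],
     [1352542, 266936, -115244, 278318, 1166480]]
n = len(I)  # number of tested policies

# Average gain per group: G_j = (1/n) * sum_i I_ij
G = [round(sum(col) / n) for col in zip(*I)]

# Projected electoral distribution: Y = Y' + G
Y_hist = [4508474, 889788, 576221, 2783175, 5832401]
Y_proj = [y + g for y, g in zip(Y_hist, G)]
print(G)
print(Y_proj)
```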
On the basis of the projected gains derived from current polls of collective opinions, the potential shift of the collective opinion equilibrium on the political spectrum can be predicted using the algorithm in Figure 5. The regression result is plotted in Figure 7, which indicates that the collective opinion equilibrium shifts slightly toward the middle, i.e., Δx̄ = x̄2 − x̄1 = 9.8 − 10.9 = −1.1, by contrast to that of the historical vote distributions.
VI. CONCLUSIONS
Big data engineering has been introduced into the field of
sociology for collective opinion elicitation and analyses.
Numerical models and fuzzy methodologies have been
developed for rigorously analyzing voting and electoral data.
This approach has revealed deep implications, complex equilibria, and dynamic trends represented by popular opinion distributions on a political spectrum. A key finding of this work is the existence of the collective opinion equilibrium over a spectrum of opinion distributions in big poll data, which is not simply a weighted average but rather the natural centroid of the integrated areas of the opinion distributions. Adaptive policy gains based
on historical and current poll data have been formally
derived from fuzzy collective opinion aggregation, effect
analyses, and quantitative impact estimations. A set of
interesting insights has been demonstrated on the nature of
large-scale collective opinions in poll data mining, collective
opinion equilibrium determination, and quantitative electoral
data processing in big data engineering.
ACKNOWLEDGMENT
This work is supported in part by a discovery fund granted by the Natural Sciences and Engineering Research
Council of Canada (NSERC). We would like to thank the
anonymous reviewers for their valuable suggestions and
comments on the previous version of this paper.
Article
In a Borda count, bc, M. de Borda suggested the last preference cast should receive 1 point, the voter’s penultimate ranking should get 2 points, and so on. Today, however, points are often awarded to (first, second,..., last) preferences cast as per (n, n−1, ..., 1) or more frequently, (n −1, n−2,..., 0). If partial voting is allowed, and if a first preference is to be given n or n − 1 points regardless of how many preferences the voter casts, he/she will be incentivised to rank only one option/candidate. If everyone acts in this way, the bc metamorphoses into a plurality vote... which de Borda criticized at length. If all the voters submit full ballots, the outcome—social choice or ranking—will be the same under any of the above three counting procedures. In the event of one or more persons submitting a partial vote, however, outcomes may vary considerably. This preliminary paper suggests research should consider partial voting. The author examines the consequences of the various rules so far advocated and then purports that M. de Borda, in using his formula, was perhaps more astute than the science has hitherto recognised.