Conference Paper

Geopolitical interactions from reduced Google matrix analysis of Wikipedia

Samer El Zant
Institut de Recherche
en Informatique de Toulouse
Université de Toulouse, INPT
Email: samer.elzant@enseeiht.fr
Katia Jaffrès-Runser
Institut de Recherche
en Informatique de Toulouse
Université de Toulouse, INPT
Email: kjr@enseeiht.fr
Dima L. Shepelyansky
Laboratoire de Physique Théorique
du CNRS, IRSAMC
Université de Toulouse, INPT
Email: dima@irsamc.ups-tlse.fr
Abstract—Interactions between countries originate from diverse
aspects such as geographic proximity, trade, socio-cultural habits,
language, religion, etc. Geopolitics studies the influence of a
country’s geographic space on its political power and its
relationships with other countries. This work reveals the potential
of Wikipedia mining for geopolitical study. Wikipedia offers solid
knowledge and strong correlations among countries by linking web
pages together for different types of information (e.g. economic,
historical, political, and many others). The major finding of this
paper is that meaningful results on the influence of country ties
can be extracted from the hyperlinked structure of Wikipedia. We
leverage a novel stochastic matrix representation of Markov chains
of complex directed networks called the reduced Google matrix
theory. For a selected small set of nodes, the reduced Google
matrix concentrates direct and indirect links of the million-node
Wikipedia network into a small Perron-Frobenius matrix that
preserves the PageRank probabilities of the global Wikipedia
network. We perform a novel sensitivity analysis that leverages
this reduced Google matrix to characterize the influence of
relationships between countries from the global network. We apply
this analysis to the set of 27 European Union countries. We show
that our sensitivity analysis readily exhibits meaningful
information on geopolitics from five different Wikipedia editions
(English, Arabic, Russian, French and German).
I. INTRODUCTION
Relationships between countries have always been of utmost
interest to the countries themselves, as they have to be accounted
for in any country’s strategic and diplomatic plan. Studies are
driven by observing the influence of a relationship between two
countries on other countries from different perspectives, including
economic exchanges, social changes, history, politics, religion,
military and regional aspects, as seen in [1]. The major finding of
this paper is that meaningful results on geopolitical interactions
can be extracted from Wikipedia for a given selection of countries.
Wikipedia can therefore be leveraged to provide a picture of
country relationships, offering a new framework for long-term
geopolitical studies. In [2], S. Javanmardi et al. show that even
though anyone can edit a Wikipedia entry at any time, the average
article quality increases as it goes through various edits. The
accuracy of Wikipedia’s scientific entries has been supported by
comparisons with Encyclopedia Britannica and with PDQ, NCI’s
Comprehensive Database, in [3], [4]. To sum up, Wikipedia has
become the largest accurate, reliable, free online open source of
knowledge.
TABLE I
List of EU countries.
Country          CC  Color  K (English)  K (French)  K (German)
France*          FR  BL          1            1           2
United Kingdom*  GB  GN          2            4          24
Germany          DE  BL          3            2           1
Italy            IT  BL          4            3           4
Spain*           ES  OR          5            5           5
Poland*          PL  RD          6            8           6
Netherlands      NL  BL          7            7           7
Sweden*          SE  PK          8           11           8
Romania          RO  RD          9           18          17
Belgium          BE  BL         10            6           9
Austria          AT  PK         11            9           3
Greece           GR  OR         12           13          14
Portugal         PT  OR         13           12          11
Ireland          IE  GN         14           19          16
Denmark          DK  GN         15           14          10
Finland          FI  PK         16           17          15
Hungary          HU  RD         17           10          13
Czech Republic   CZ  RD         18           15          12
Bulgaria         BG  RD         19           20          20
Estonia          EE  RD         20           24          22
Slovenia         SI  RD         21           23          23
Slovakia         SK  RD         22           16          18
Lithuania        LT  RD         23           22          21
Cyprus           CY  RD         24           27          27
Latvia           LV  RD         25           25          25
Luxembourg       LU  BL         26           21          19
Malta            MT  RD         27           26          26
PageRank K for EnWiki, FrWiki and DeWiki. Color code groups EU
countries into 5 subsets: Blue (BL) for Founders, Green (GN) for 1973 new
member states, Orange (OR) for 1981 to 1986 new member states, Pink
(PK) for 1995 new member states and Red (RD) for 2004 to 2007 new
member states. Standard country codes (CC) are given as well. Countries
marked with an asterisk (*) are the selected ones for each group.
Unique to Wikipedia is that articles cite each other, providing a
direct relationship between webpages. As such, Wikipedia generates
a large directed network of article titles with a rather clear
meaning. For these reasons, it is interesting to apply algorithms
developed for World Wide Web search engines, such as the PageRank
algorithm [5], to analyze the ranking properties and relations
between Wikipedia articles. For various language editions of
Wikipedia it was shown that the PageRank vector produces a reliable
ranking of historical figures over 35 centuries of human history
[6]–[9] and a solid Wikipedia ranking of world universities
(WRWU) [6], [11]. It has been shown that the Wikipedia ranking of
historical figures is in good agreement with the well-known Hart
ranking [12], while the WRWU is in good agreement with the Shanghai
Academic ranking of world universities [13].
This paper analyses the networks of articles extracted from
5 language editions of Wikipedia to study the influence of
countries on each other. Previous work [16] has identified
the strongest ties between countries, whereas the present work
focuses on capturing the impact of a change in the strength of a
relationship between two countries on the overall network
interactions of selected countries via the global network. The
impact on the overall network structure is measured by calculating
the variation of importance of the nodes in the network. We
show that this sensitivity analysis renders a reasonable and
meaningful idea of the influence of a given bilateral tie on the
whole network.
We have conducted our geopolitics study for the target set of
27 European Union member states. From the global network of
Wikipedia articles we have derived the reduced Google matrix $G_R$
for these 27 EU states. Thus, $G_R$ reflects in a 27-by-27 matrix
the complete (direct and indirect) relationships between countries.
To quantify the relative influence of one relationship between two
nations on all other nations, we propose in this paper to compute a
logarithmic derivative of the PageRank probabilities calculated
from $G_R$ and $\tilde{G}_R$. PageRank probabilities are derived
from $G_R$ as explained later. They represent the importance of a
node in the global network of articles. $\tilde{G}_R$ is almost
equal to $G_R$: it differs only by the values of one column. If the
relationship going from nation $j$ to nation $i$ is modified, only
the values of column $j$ are changed, so as to relatively inflate
the probability $\tilde{G}_R(i,j)$ of going from nation $j$ to
nation $i$ compared to the other ones. This is done in practice by
modifying $\tilde{G}_R(i,j)$ and then normalizing the column again
to unity to enforce the column normalization property of Google
matrices. Results are derived for 5 different Wikipedia editions
(data collected in February 2013) from the set of 24 analyzed in
[9]: EnWiki, ArWiki, RuWiki, DeWiki and FrWiki, which contain
4.212, 0.203, 0.966, 1.533 and 1.353 million articles, respectively.
The selected countries are the 27 EU countries as of February
2013 (Croatia joined in July 2013), as listed in Table I.
The paper is organized as follows. We first introduce
the reduced Google matrix theory, together with a primer
on Google matrix and PageRank calculations. The reduced
Google matrix is illustrated for the set of 27 EU states.
Next, the methodology for our link sensitivity analysis is
presented. A detailed analysis of EU countries is given in
the Results section, with a special focus on the sensitivity
analysis of important relationships among member states.
Finally, conclusions are drawn in the last section.
II. REDUCED GOOGLE MATRIX THEORY
It is convenient to describe the network of $N$ Wikipedia articles
by the Google matrix $G$ constructed from the adjacency matrix
$A_{ij}$, with elements 1 if article (node) $j$ points to article
(node) $i$ and zero otherwise. Elements of the Google matrix
take the standard form $G_{ij} = \alpha S_{ij} + (1-\alpha)/N$ [5], [10],
where $S$ is the matrix of Markov transitions with elements
$S_{ij} = A_{ij}/k_{out}(j)$, $k_{out}(j) = \sum_{i=1}^{N} A_{ij} \neq 0$
being the out-degree of node $j$ (number of outgoing links), and
with $S_{ij} = 1/N$ if $j$ has no outgoing links. The damping factor
$0 < \alpha < 1$ determines, for a random surfer, the probability
$(1-\alpha)$ to jump to any node; below we use the standard value
$\alpha = 0.85$. The right eigenvector of $G$ with unit eigenvalue
gives the PageRank probabilities $P(j)$ to find a random surfer on
node $j$. We order nodes by decreasing $P$, which gives the PageRank
index $K = 1, 2, \ldots, N$ with maximal probability at $K = 1$.
From this global ranking we extract the local PageRank order of the
27 countries given in Table I.
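The construction above can be sketched in a few lines of Python with NumPy. This is a minimal illustration on a hypothetical 4-node network, not the code used for the Wikipedia data; the function names `google_matrix` and `pagerank` and the toy adjacency matrix are assumptions made for the example, and plain power iteration is only one possible way to obtain the eigenvector.

```python
import numpy as np

def google_matrix(A, alpha=0.85):
    """Build G from an adjacency matrix A, where A[i, j] = 1 if node j links to node i."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    k_out = A.sum(axis=0)                 # out-degree of node j = sum of column j
    S = np.full((N, N), 1.0 / N)          # dangling nodes keep a uniform column 1/N
    linked = k_out > 0
    S[:, linked] = A[:, linked] / k_out[linked]
    return alpha * S + (1.0 - alpha) / N

def pagerank(G, tol=1e-12, max_iter=10_000):
    """Power iteration for the right eigenvector of G with unit eigenvalue."""
    P = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(max_iter):
        P_next = G @ P
        P_next /= P_next.sum()
        converged = np.abs(P_next - P).sum() < tol
        P = P_next
        if converged:
            break
    return P

# Hypothetical toy network; node 3 has no outgoing links (dangling node).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 0]])
G = google_matrix(A)
P = pagerank(G)

# PageRank index K: K = 1 for the node with the largest probability.
order = np.argsort(-P)
K = np.empty(len(P), dtype=int)
K[order] = np.arange(1, len(P) + 1)
print("P =", P)
print("K =", K)
```

For the million-node Wikipedia networks a sparse representation of $S$ would be needed; the dense arrays here are only meant to make the definitions concrete.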
The reduced Google matrix is constructed for a selected subset
of nodes (articles) following the method described in [14]–[16],
which is based on concepts of scattering theory used in different
fields of mesoscopic and nuclear physics or quantum chaos.
It captures in an $N_r$-by-$N_r$ Perron-Frobenius matrix the full
contribution of direct and indirect interactions happening in
the full Google matrix between the $N_r$ nodes of interest.
In addition, the PageRank probabilities of the selected $N_r$ nodes
are the same as for the global network with $N$ nodes, up
to a constant multiplicative factor taking into account that
the sum of PageRank probabilities over the $N_r$ nodes is unity.
Elements $G_R(i,j)$ of the reduced matrix can be interpreted as the
probability for a random surfer starting at web-page $j$ to arrive
at web-page $i$ using direct and indirect interactions. Indirect
interactions refer to paths composed in part of web-pages
different from the $N_r$ ones of interest. Even more interesting
and unique to reduced Google matrix theory, we show here that
intermediate computation steps of $G_R$ offer a decomposition
of $G_R$ into matrices that clearly distinguish direct from indirect
interactions: $G_R = G_{rr} + G_{pr} + G_{qr}$ [15]. Here, $G_{rr}$ is
the sub-matrix of $G$ representing the original direct links between
the selected $N_r$ nodes. Fig. 1 shows that $G_{pr}$ is rather close
to the matrix in which each column is given by the PageRank vector
$P_r$, ensuring that PageRank probabilities of $G_R$ are the same
as for $G$ (up to a constant multiplier). As such, $G_{pr}$ does not
provide relevant information to characterize the importance
of links between the selected nodes. The component playing an
interesting role is $G_{qr}$, which captures the effect of all
indirect paths connecting the selected $N_r$ nodes in the full
network of $N$ nodes (see [14]–[16]). The matrix
$G_{qr} = G_{qrd} + G_{qrnd}$ has diagonal ($G_{qrd}$) and
non-diagonal ($G_{qrnd}$) parts. $G_{qrnd}$ is leveraged for the
studies of Section III-B. Results of Sections IV and V are based on
$G_R$ only. The complete theoretical background can be found in
[14]–[16].
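To make the PageRank-preserving property concrete, the following Python sketch computes a reduced matrix for a small random network and checks that its stationary vector matches the rescaled global PageRank. The closed form $G_R = G_{rr} + G_{rs}(1 - G_{ss})^{-1} G_{sr}$ is not stated in this text; it is taken here as an assumption from the reduced Google matrix literature [14], with $s$ denoting the complementary nodes. The toy network, the node selection and the function names are hypothetical, and the sketch does not separate $G_{pr}$ from $G_{qr}$.

```python
import numpy as np

def reduced_google_matrix(G, reduced_nodes):
    """Return (G_R, G_rr), with G_R = G_rr + G_rs (1 - G_ss)^(-1) G_sr (form assumed from [14])."""
    N = G.shape[0]
    r = np.asarray(reduced_nodes)
    s = np.setdiff1d(np.arange(N), r)          # complementary nodes
    G_rr = G[np.ix_(r, r)]
    G_rs = G[np.ix_(r, s)]
    G_sr = G[np.ix_(s, r)]
    G_ss = G[np.ix_(s, s)]
    # Solve (1 - G_ss) X = G_sr rather than forming the inverse explicitly.
    X = np.linalg.solve(np.eye(len(s)) - G_ss, G_sr)
    return G_rr + G_rs @ X, G_rr

def pagerank(M, n_iter=2000):
    """Plain power iteration on a column-stochastic matrix."""
    P = np.full(M.shape[0], 1.0 / M.shape[0])
    for _ in range(n_iter):
        P = M @ P
        P /= P.sum()
    return P

# Hypothetical random directed network with N = 50 nodes.
rng = np.random.default_rng(0)
N = 50
A = (rng.random((N, N)) < 0.1).astype(float)
np.fill_diagonal(A, 0.0)
k_out = A.sum(axis=0)
S = np.where(k_out > 0, A / np.where(k_out > 0, k_out, 1.0), 1.0 / N)
G = 0.85 * S + 0.15 / N

r = [0, 3, 7, 12, 20]                          # arbitrary selection of N_r = 5 nodes
G_R, G_rr = reduced_google_matrix(G, r)

P = pagerank(G)
P_r = P[r] / P[r].sum()                        # global PageRank rescaled to unit sum
print("max |G_R P_r - P_r| =", np.abs(G_R @ P_r - P_r).max())
print("indirect part G_R - G_rr:\n", G_R - G_rr)
```

A dense solve of this kind is only feasible for toy sizes; for the full Wikipedia networks the numerical procedure of [14] would be required.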
III. RESULTS: $G_R$ PROPERTIES
A. Reduced Google matrix of the 27 EU set
As an example, we have picked the EnWiki edition to plot
the matrices $G_R$, $G_{pr}$, $G_{rr}$ and $G_{qrnd}$ in Fig. 1.
$G_R$ is per-column normalized and dominated by the projector
$G_{pr}$ contribution, which is proportional to the global PageRank
probabilities (more details in [14], [15]); this prevents a
meaningful per-line analysis. $G_{rr}$ provides information only
on direct links between countries, as it lists the genuine Google
matrix probability for a random surfer to jump from node $j$ to
$i$. On the contrary, $G_{qrnd}$ offers a much more unified view of
country interactions, as it highlights more general indirect (or
hidden) interactions via the rest of the nodes. It captures the
contribution of all indirect paths connecting two nodes $i$ and $j$
in the full network of Wikipedia articles. For the three selected
language editions, we have identified very strong hidden links
connecting Finland to Sweden. Other interesting hidden links
are those between Ireland and the United Kingdom in DeWiki or, in
EnWiki, the hidden links connecting Luxembourg to France.
Fig. 1. Density plots of $G_R$ and its decomposition for the 27 EU
countries extracted from EnWiki: $G_R$ (top left), $G_{pr}$ (top
right), $G_{rr}$ (bottom left) and $G_{qrnd}$ (bottom right). Max
values in red (0.14 in top panels; 0.003 in bottom left; 0.008 in
bottom right), intermediate in green and min (0) in blue.
B. Networks of friends
As proposed in [16], it is possible to extract from $G_{qrnd}$ a
network of friendships to easily illustrate hidden links in the
network. For the sake of simplicity, we refer next to $G_{qrnd}$
using the $G_{qr}$ notation. To create these networks of friends, we
divide the set of $N_r$ nodes into representative groups as shown
in Table I. EU countries are grouped by their accession date
to the union. One leading country per group of EU member states
has been selected as well.
For each leading country $j$, we extract from $G_{qr}$ the top 4
friends given by the 4 largest elements of column $j$. In other
words, they correspond to the destinations of the 4 strongest
outgoing links of $j$. These networks of top 4 friends have been
calculated for the five editions of Wikipedia. The top 4 friends of
the EU leading countries are plotted in the graphs of Fig. 2. Results
for EnWiki, FrWiki and DeWiki are presented here. The thick black
arrows identify the top 4 friends interactions. Red arrows
represent the friends-of-friends interactions, which are computed
recursively until no new edge is added to the graph. All graphs
are visualized with the Yifan Hu layout [17] using [18].
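The friend-extraction procedure can be illustrated with a short Python sketch. The 5-by-5 matrix below is invented for the example and stands in for $G_{qr}$; the labels, the function names and the use of $k = 2$ instead of 4 are assumptions, and the breadth-first expansion is one plausible reading of "computed recursively until no new edge is added".

```python
import numpy as np

def top_friends(M, labels, leader, k=4):
    """Labels of the k largest entries in the column of `leader`, excluding the leader itself."""
    j = labels.index(leader)
    column = np.array(M[:, j], dtype=float)
    column[j] = -np.inf                       # never list a node as its own friend
    best = np.argsort(-column)[:k]
    return [labels[i] for i in best]

def friends_network(M, labels, leaders, k=4):
    """Directed edges (node, friend): leaders first, then friends of friends until closure."""
    edges = set()
    visited = set()
    queue = list(leaders)
    while queue:
        node = queue.pop(0)
        if node in visited:
            continue
        visited.add(node)
        for friend in top_friends(M, labels, node, k):
            edges.add((node, friend))
            queue.append(friend)
    return edges

# Hypothetical matrix: column j holds the link strengths from node j to the others.
labels = ["A", "B", "C", "D", "E"]
M = np.array([[0.00, 0.50, 0.10, 0.20, 0.10],
              [0.40, 0.00, 0.30, 0.10, 0.20],
              [0.30, 0.25, 0.00, 0.40, 0.30],
              [0.20, 0.15, 0.40, 0.00, 0.40],
              [0.10, 0.10, 0.20, 0.30, 0.00]])

print(top_friends(M, labels, "A", k=2))
print(sorted(friends_network(M, labels, ["A"], k=2)))
```

In the figure, edges leaving a leading country would be drawn in black and the remaining edges in red; the sketch returns both kinds in one set.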
It can be noticed in Fig. 2 that the order of arrival of member
states is meaningful. Indeed, nodes of the same color are
closely interconnected. It is worth noting as well that Germany,
as one of the Founders, bridges the group of Founders to
Sweden (the leader of the countries that joined the EU in
1995) and Poland (the leader of the countries that joined the
EU between 2004 and 2007) in FrWiki and EnWiki. From
EnWiki and DeWiki, strong ties are seen between Italy and
France, while this is not the case for FrWiki authors. This is
an example of cultural bias. However, many links are seen
in all three editions: GB-IE, SE-FI, ES-PT, PL-LT, IT-GR
and many others. In all editions, Benelux and Nordic
countries form densely interconnected clusters. To underline
this constant presence of links, we give in Table II the list of
friends that are among the top 4 ones in all 5 editions, in 4
out of 5 and in 3 out of 5. For each leading country, around
2 to 3 top friends are present across all editions.
Fig. 2. Relationship structure extracted from $G_{qrnd}$ for the
network of EU countries: friends induced by the leading countries
(FR, GB, ES, SE, PL) of each group. Results are plotted for EnWiki
(left), FrWiki (middle) and DeWiki (right). Node colors represent
membership in a group of countries (cf. Table I for details). Each
selected country points with a bold black arrow to its top 4
friends. Red arrows show friends-of-friends interactions computed
until no new edges are added to the graph.
TABLE II
Cross-edition friends extracted from $G_{qr}$ of EU countries.
Top $G_{qr}$ Wiki friends present in
country all 5 editions 4 out of 5 editions 3 out of 5 editions
FR BE - ES IT
GB IE DK - FR
ES IT - PT FR BE
SE DK - FI EE
PL CZ DE - HU - LT - SK
IV. LINK SENSITIVITY ANALYSIS
A. Influence analysis of geopolitical ties using $G_R$
Previous developments in this article show that $G_R$ captures
essential interactions between countries. These interactions are
extracted from Wikipedia and thus stem from all links covering
this very rich network of webpages.
The point is now to see how some ties between countries
influence the whole network structure. More specifically, we
focus here on capturing the impact of a change in the strength
of a relationship between two countries on the importance
of the nodes in the network. We have therefore designed a
sensitivity analysis that measures a logarithmic derivative of
the PageRank probability when the transition probability of
only one selected link is increased for a specific pair of
nodes in $G_R$, relative to the other nodes.
Our sensitivity analysis is performed for a directed link
where the relationship going from country $i$ to $j$ is increased.
We investigate in the last part of this Section the imbalance
between the influence of the two opposite directions of an
interaction. In other words, we conduct the aforementioned
sensitivity analysis for the link going from country $i$ to $j$,
and for the link going in the opposite direction from $j$ to $i$.
For each pair of countries, we derive from this two-way sensitivity
the relationship imbalance to identify the most important player
in the relationship.
B. Sensitivity analysis
We define $\delta$ as the relative fraction to be added to the
relationship from nation $j$ to nation $i$ in $G_R$. Knowing
$\delta$, a new modified matrix $\tilde{G}_R$ is calculated in two
steps. First, element $\tilde{G}_R(i,j)$ is set to
$(1+\delta) \cdot G_R(i,j)$. Second, all elements of column $j$ of
$\tilde{G}_R$ (including element $i$) are renormalized so that the
column sums to 1, to preserve the unit column-normalization
property of the Google matrix. Now $\tilde{G}_R$ reflects an
increased probability for going from nation $j$ to nation $i$.
It is now possible to calculate the modified PageRank
eigenvector $\tilde{P}$ from $\tilde{G}_R$ using the standard
relation $\tilde{G}_R \tilde{P} = \tilde{P}$ and compare it to the
original PageRank probabilities $P$ calculated with $G_R$ using
$G_R P = P$. Due to the relative change of the transition
probability between nodes $i$ and $j$, steady-state PageRank and
CheiRank probabilities are modified. This reflects a structural
modification of the network and entails a change of importance of
nodes in the network. These changes are measured by a logarithmic
derivative of the PageRank probability of node $a$ given by:
$$D_{(j \to i)}(a) = \frac{dP_a/d\delta_{ij}}{P_a} = \frac{\tilde{P}_a - P_a}{\delta_{ij} P_a} \qquad (1)$$
Notation $(j \to i)$ indicates that the link from node $j$ to node
$i$ has been modified. Element $D_{(j \to i)}(a)$ gives the
logarithmic variation of PageRank probability for country $a$ if
the link from $j$ to $i$ has been modified. We will refer to this
variation as the sensitivity of nation $a$ to the relationship from
nation $j$ to nation $i$. If this sensitivity is negative, country
$a$ has lost importance in the network. Conversely, a positive
sensitivity expresses a gain in importance. The computation has
been tested for values of $\delta = \pm 0.01, \pm 0.03, \pm 0.05$.
The result is not sensitive to $\delta$, and the following results
are given for $\delta = 0.03$.
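The two-step modification and the finite-difference form of Eq. (1) can be sketched in Python as follows. The 6-by-6 random column-stochastic matrix is a hypothetical stand-in for the 27-by-27 $G_R$, and the function names are assumptions made for the example; the PageRank vectors are obtained here by simple power iteration.

```python
import numpy as np

def pagerank(M, n_iter=1000):
    """Stationary vector of a column-stochastic matrix by power iteration."""
    P = np.full(M.shape[0], 1.0 / M.shape[0])
    for _ in range(n_iter):
        P = M @ P
        P /= P.sum()
    return P

def perturb_link(G_R, i, j, delta):
    """Scale element (i, j) by (1 + delta), then renormalize column j to unit sum."""
    G_mod = np.array(G_R, dtype=float)
    G_mod[i, j] *= 1.0 + delta
    G_mod[:, j] /= G_mod[:, j].sum()
    return G_mod

def sensitivity(G_R, i, j, delta=0.03):
    """D_(j->i)(a) = (P~_a - P_a) / (delta * P_a), returned for every node a."""
    P = pagerank(G_R)
    P_mod = pagerank(perturb_link(G_R, i, j, delta))
    return (P_mod - P) / (delta * P)

# Hypothetical stand-in for G_R: random positive matrix with unit column sums.
rng = np.random.default_rng(1)
M = rng.random((6, 6))
G_R = M / M.sum(axis=0)

i, j = 2, 4                                   # modify the link from node j to node i
G_mod = perturb_link(G_R, i, j, delta=0.03)
P = pagerank(G_R)
D = sensitivity(G_R, i, j, delta=0.03)
print("D =", D)
```

Because both $P$ and $\tilde{P}$ sum to one, the sensitivities weighted by $P_a$ sum to zero, which is a convenient sanity check for any implementation.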
C. Relationship imbalance analysis
As introduced earlier, the sensitivity $D_{(j \to i)}(a)$ of Eq. (1)
measures the change of importance of country $a$ if the link from
nation $j$ to $i$ has been changed. The sensitivity of node $a$ to
a change in one direction is not necessarily the same as its
sensitivity to the change in the opposite direction. We therefore
define the 2-way sensitivity of node $a$, which is simply the
sum of the sensitivities calculated for both directions:
$$D_{(i \leftrightarrow j)}(a) = D_{(i \to j)}(a) + D_{(j \to i)}(a) \qquad (2)$$
The two-way sensitivity can be leveraged to find out, for a
pair of countries $a$ and $b$, which one has the most influence
on the other one. Therefore, we define the following metric:
$$F(a,b) = D_{(a \leftrightarrow b)}(a) - D_{(a \leftrightarrow b)}(b) \qquad (3)$$
Here, we measure the 2-way sensitivity for nodes $a$ and $b$
when the link between them is modified both ways in $G_R$.
If $F(a,b)$ is positive, it means that the 2-way sensitivity of
$a$ is larger than the 2-way sensitivity of $b$. In this case, $a$
is more influenced by $b$ than $b$ by $a$. We can say that $b$ is
the stronger country. If $F(a,b)$ is negative, we can say that $a$
is the stronger country.
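A short consequence of these definitions, written out as a worked step: since the two-way sensitivity of Eq. (2) does not depend on the order in which the pair is written, exchanging the roles of $a$ and $b$ in Eq. (3) only flips the sign of the metric.

```latex
\begin{align*}
D_{(b \leftrightarrow a)}(c) &= D_{(b \to a)}(c) + D_{(a \to b)}(c)
                              = D_{(a \leftrightarrow b)}(c)
                              \quad \text{for any node } c, \\
F(b,a) &= D_{(b \leftrightarrow a)}(b) - D_{(b \leftrightarrow a)}(a)
        = -\left[ D_{(a \leftrightarrow b)}(a) - D_{(a \leftrightarrow b)}(b) \right]
        = -F(a,b).
\end{align*}
```

Under this reading, $F$ is antisymmetric, so only one ordering of each pair of countries needs to be computed and the sign alone identifies the stronger partner.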
V. SENSITIVITY ANALYSIS RESULTS
The sensitivity analysis presented above has been performed
for the 27 EU reduced network with 3 Wikipedia
editions: EnWiki, FrWiki and DeWiki. This analysis calculates,
for each directed link $j \to i$ of the reduced 27 EU network,
the sensitivity $D_{(j \to i)}(a)$ of each country $a$. From this,
the relationship imbalance analysis has been calculated as well for
each pair of nations. Note that if the modified link is clearly
identified in the following, we will drop the index $i \to j$ in
our sensitivity measure notation for clarity.
In order to better capture the countries’ sensitivities from
a multicultural perspective, all sensitivity results are averaged
over the three editions using $\bar{D} = \frac{1}{3} \sum_{i=1}^{3} D_i$,
with $i$ the index of a Wikipedia edition.
A. Sensitivity analysis
We start this analysis by introducing a first simple example
where Italy increases its relationship with France. Then, we
analyze the impact on the EU countries of Great Britain’s
exit (i.e. Brexit) from the European Union. Next, we highlight
the sensitivity of Luxembourg to the increase of Germany’s
and France’s cooperation with other member states. Finally,
we present the results that underline the strong ties that exist
between groups of countries that function together in Europe.
For each sensitivity analysis, we show an axial representation
of the sensitivity $\bar{D}$ (cf. Fig. 3, Fig. 4, Fig. 5, Fig. 6).
Each axis represents the sensitivity values obtained for a given
link variation.
1) Great Britain ties to France and Germany: The United
Kingdom triggered Article 50 on March 29, 2017 to leave
the European Union as a consequence of the referendum
of June 23, 2016 [23]. To understand its impact on EU
countries with our dataset, we have reduced (and not increased
as done in other studies) the $G_R$ transition probability from the
UK towards France or Germany. We recall that our network
dates from 2013, but it captures the strong UK influence.
Results are shown in Fig. 3 and indicate that Ireland and
Cyprus are by far the most negatively affected countries in
both cases. Moreover, the sensitivity of the UK is negative, as it
benefits less from France’s or Germany’s influence. These facts
have been recently backed up by specialists. In [24], a study
delivered by the London School of Economics discussing the
consequences of Brexit forecasts that the UK will lose 2.8% of
its Gross domestic product (GDP) (see footnote 1). Similarly, [24]
shows that Ireland will lose 2.3% of its GDP, which is the largest
proportional loss caused by Brexit. Cyprus-UK relations are
strong, as stated by the official website of the Ministry of
Foreign Affairs of Cyprus [26]. Referring to [20], the UK is the
4th top export destination for Cyprus with $242M and the 2nd
import origin with $508M. As such, this clear bond of the UK with
Cyprus explains that if GB suffers from Brexit, Cyprus will
do so as well. Our data strikingly exhibits the same conclusion,
as shown in Fig. 3.
Fig. 3. Axial representation of $\bar{D}$ for link modifications
from {GB} to {FR or DE}. (A): GB to FR. (B): GB to DE.
Fig. 4. Axial representation of $\bar{D}$ for link modifications
from {FR or DE} to {GB or IT}. (A): FR to GB. (B): DE to GB.
(C): FR to IT.
2) Luxembourg’s sensitivity to Germany and France:
Luxembourg shares its borders with Belgium, Germany and
France, with which it has strong and diverse relationships.
Luxembourg has an open economy. Together with Belgium,
it ranks as the 12th largest economy in the
world. Two of the top three export and import countries
of Belgium-Luxembourg are Germany ($44.6B, $50.4B) and
France ($43.8B, $36.8B) [20]. Official languages in Luxembourg
are Luxembourgish, French and German. Luxembourg
has robust relationships with France [27], [29] and Germany [30]
in various areas such as finance, culture, science,
security or nuclear power. It is clear that Luxembourg will
suffer if one of these EU countries reduces its exchanges
with it. In Fig. 4, we clearly show that Luxembourg is
strongly influenced by France and Germany. If France or
Germany increases its relationships with Italy or Great Britain,
Luxembourg is by far the most impacted country.
3) Clusters of countries: By analyzing the sensitivity of
countries to various 2-nation relationships, we have noticed
that several groups of nations function together. These groups
are strongly interconnected, and if any one of these group
members increases its relationship strength with a country
outside of the group, all group members lose importance in
the network. We highlight two meaningful examples next: the
cluster of Nordic countries and the Austro-Hungarian
cluster. Other clusters we have identified in our network are
for instance the cluster of Benelux countries (i.e. Belgium,
the Netherlands and Luxembourg) or the cluster of the Iberian
peninsula (i.e. Portugal and Spain).
Footnote 1. GDP: monetary value of all the finished goods and
services produced within a country’s borders in a specific time
period [25].
Fig. 5. Axial representation of $\bar{D}$ for link modifications
from Nordic countries to {FR or DE}. (A): DK to DE. (B): SE to DE.
(C): FI to DE. (D): DK to FR. (E): SE to FR.
For both investigated groups, we test the influence of an
increase in collaboration from one member of the group to
France or to Germany. France and Germany have been chosen
as they are central members of the European Union.
The Nordic countries Denmark, Finland, and Sweden have
much in common: their way of life, history, language and
social structure [19]. After World War II, the first concrete
step towards unity was the introduction of a Nordic Passport Union
in 1952. Nordic countries co-operate in the Nordic Council, a
geopolitical forum. In the Nordic Statistical Yearbook [19],
Klaus Munch illustrates that “The Nordic economies are
among the countries in the Western World with the best
macroeconomic performance in the recent ten years”. Nordic
countries should keep cooperating to stay strong. Thus, if
any Nordic country attempts to abandon these relationships
in favor of other countries, it will negatively impact the
remaining Nordic countries. Our sensitivity analysis illustrates
this impact in Fig. 5. In this figure, we show how a relationship
increase from any Nordic country towards France or
Germany induces a drop in sensitivity for Nordic countries.
Referring to [21], relations between Slovenia, Hungary
and Austria are tight. Hungary has supported Slovenia in
its NATO membership application and Austria has assisted
Slovenia in entering the European Union. Relationships between
Austria and Hungary are important for both countries in
the economic, political and cultural fields [22]. Concerning
the economy [20], Austria is one of the top import origins for
Hungary and Slovenia with $5.54B and $2.37B respectively.
Similarly to the Nordic group of countries, if Austria, Slovenia
or Hungary increases its relationships with another European
country, the other two will be affected. The sensitivity analysis
backs up this statement, as seen in Fig. 6.
B. Relationship imbalance analysis
Relationship imbalance analysis has been derived for all
pairs of European countries following Eq. (3). Fig. 7 shows a
density plot of $F(a,b)$. We recall that if $F(a,b)$ is negative,
nation $a$ has more influence on nation $b$ than $b$ on $a$. If
$F(a,b)$ is positive, nation $b$ dominates nation $a$. According to
The Globe of Economic Complexity [28], and consistent with our
results in Fig. 7, Germany and France are the two largest economies
in Europe. From $G_R$ we can clearly see the dominance of France
and Germany over other EU countries. Another interesting result
of Fig. 7 is the equal influence between all pairs of countries
formed by one member of {GR, PT, IE, DK, FI, HU} and
another of {BG, EE, SI, SK, LT, CY, LV, LU, MT}. These
pairs have $F(a,b)$ close to zero and are plotted in orange
in Fig. 7.
Fig. 6. Axial representation of $\bar{D}$ for link modifications
from {AT, HU and SI} to {FR or DE}. (A): AT to FR. (B): HU to FR.
(C): SI to FR. (D): AT to DE. (E): HU to DE. (F): SI to DE.
Fig. 7. Relationship imbalance analysis: F-representation for the
27 EU network. X-axis and Y-axis represent $a$ and $b$ respectively.
VI. CONCLUSION
This work offers a new perspective for future geopolitics
studies. It is possible to extract from multi-cultural Wikipedia
networks a global understanding of the interactions between
countries at a regional scale. Reduced Google matrix theory
has been shown to exhibit hidden interactions among
countries, resulting in new knowledge on geopolitics. Results
show that our sensitivity analysis captures the importance of
relationships on network structure. This analysis relies on
the reduced Google matrix and leverages its capability of
concentrating all Wikipedia knowledge in a small stochastic
matrix. We stress that the sensitivity of geopolitical
relations between two countries and its influence on other
countries is obtained from a purely mathematical and statistical
analysis without any direct appeal to political, economic and
social sciences.
REFERENCES
[1] E Jones. The European Miracle: Environments, Economies and
Geopolitics in the History of Europe and Asia. Cambridge University, 2003.
[2] S Javanmardi, C Lopes. Statistical Measure of Quality in Wikipedia.
Proceedings of the First Workshop on Social Media Analytics, 2010
July.
[3] J Giles. Internet encyclopedias go head to head. Nature 438, 900 (2005)
[4] M S Rajagopalan, V Khanna, M. Stott, Y. Leiter, T Showalter, A
P Dicker et al. Accuracy of Cancer Information on the Internet:
A Comparison of a Wiki with a Professionally Maintained Database.
Journal of Clinical Oncology, 2010 May.
[5] S Brin, L Page. The anatomy of a large-scale hypertextual Web search
engine. Computer Networks and ISDN Systems, 1998, 30, 107.
[6] AO Zhirov, OV Zhirov, DL Shepelyansky. Two-dimensional ranking of
Wikipedia articles. Eur. Phys. J. B, 2010, 77, 523.
[7] Y-H Eom, KM Frahm, A Benczur, DL Shepelyansky. Time evolution
of Wikipedia network ranking. Eur. Phys. J. B, 2013, 86, 492.
[8] Y-H Eom, DL Shepelyansky. Highlighting entanglement of cultures
via ranking of multilingual Wikipedia articles. PLoS One 2013;
8(10):e74554.
[9] Y-H Eom, P Aragon, D Laniado. Interactions of cultures and top people
of Wikipedia from ranking of 24 language editions. PLoS One 2015;
10(3):e0114825.
[10] L Ermann, KM Frahm, DL Shepelyansky. Google matrix analysis of
directed networks. Rev. Mod. Phys. 2015, 87, 1261.
[11] J Lages, A Patt, DL Shepelyansky. Wikipedia ranking of world
universities. Eur. Phys. J. B, 2016, 89, 69.
[12] MH Hart. The 100: ranking of the most influential persons in history.
Citadel Press, 1992, N.Y.
[13] http://www.shanghairanking.com/ Accessed Aug. 2016.
[14] KM Frahm, DL Shepelyansky. Reduced Google matrix.
arXiv:1602.02394[physics.soc] (2016).
[15] KM Frahm, K Jaffrès-Runser, DL Shepelyansky. Wikipedia mining of
hidden links between political leaders. Eur. Phys. J. B 89, 269 (2016)
[16] KM Frahm, S El Zant, K Jaffrès-Runser, DL Shepelyansky. Multi-
cultural Wikipedia mining of geopolitics interactions leveraging reduced
Google matrix analysis. Phys. Lett. A 381, 2677 (2017)
[17] Yifan Hu. Efficient, High-Quality Force-Directed Graph Drawing. The
Mathematica Journal, 10:1, 2006 Wolfram Media, Inc.
[18] M Bastian, S Heymann, M Jacomy. Gephi: An Open Source Software
for Exploring and Manipulating Networks. Proc. of International AAAI
Conference on Weblogs and Social Media, 2009.
[19] K M Haagensen. Nordic Statistical Yearbook 2014. Nordic Council of
Ministers, Copenhagen 2014.
[20] The Observatory of Economic Complexity official Website.
http://atlas.media.mit.edu
[21] Slovenia Business Law Handbook: Strategic Information and Laws.
International Business Publications, 2013, US.
[22] I Kőrösi. Austrian and Hungarian Relations since 1989, the Current
Situation and Future Perspectives. Centre for Economic and Regional
Studies HAS Institute of World Economics, August 01, 2013.
[23] A Hunt, B Wheeler. Brexit: All you need to know about the UK leaving
the EU. British Broadcasting Corporation (BBC), 27 March 2017.
[24] S Dhingra, G Ottaviano, T Sampson, J Van Reenen. The consequences
of Brexit for UK trade and living standards. Centre for Economic
Performance (CEP), London School of Economics and Political Science,
UK.
[25] Investopedia website. Gross Domestic Product - GDP.
[26] Cyprus Ministry of Foreign Affairs. Cyprus - UK Relations.
[27] French Ministry of Economy - Treasurer. Les échanges commerciaux
entre la France et le Luxembourg en 2014.
[28] The Globe of Economic Complexity. http://globe.cid.harvard.edu
[29] French Ministry of Foreign Affairs and International Development. La
France et le Luxembourg.
[30] Federal Foreign Office. Bilateral relations - Luxembourg.
http://www.auswaertiges-amt.de/