What's trending in the Chinese Google Books corpus? A Google Ngram analysis of the Chinese language area (1950-2008)

Abstract

What’s trending? Google Trends tracks search trends every day, throughout the day. Social media, smartphone notifications, and unwanted pop-ups keep us abreast of current trends even when we have no interest in what’s trending on a particular day or in a particular hour. But we might not realize, or might forget, that those surface trends often have very deep, ancient roots. This chapter considers trends from 1950-2008 in China and compares them to trends found in other parts of the world. This research offers insight into whether Chinese society between 1950 and 2008 (not just the People’s Republic of China but Chinese as a language) experienced significant trends in discourse related to politics, law, economy, education, science, mass media, religion, health, art, or sport, the ten function systems covered in Roth and Schütz (2015).
Carlton Clark, University of Wisconsin, La Crosse
Lei Zhang, University of Wisconsin, La Crosse
Steffen Roth, Kazimieras Simonavičius University, Vilnius, Lithuania
Introduction
What’s trending? Google Trends tracks search trends every day, throughout the day.
Social media, smartphone notifications, and unwanted pop-ups keep us abreast of current trends
even when we have no interest in what’s trending on a particular day or in a particular hour. But
we might not realize, or might forget, that those surface trends often have very deep, ancient
roots. This chapter considers trends from 1950-2008 in China and compares them to trends found
in other parts of the world. We begin with remarks on the political theories and assumptions
traceable to Periclean Athens and republican Rome. We then compare this history to that of
China.
A fundamental tenet of constitutionalism, as shaped by the political writings of Aristotle,
Cicero, Spinoza, Locke, Montesquieu, Rousseau, the authors of The Federalist Papers, and of
course many others, is that the best way to prevent tyranny is through a separation of powers.
Constitutionalism, which is evident in the political systems of Periclean Athens and republican
Rome, does not depend on a written constitution and is not limited to constitutional democracy,
but it recognizes that “power can only be controlled by power” (Gordon, 1999, p. 15). As
Montesquieu (1748/1955) wrote, “There is as yet no liberty if the power of judging be not
separated from legislative power and the executrix.” Similarly, in Federalist 47, James Madison
(1788) wrote, “The accumulation of all powers, legislative, executive, and judiciary, in the same
hands, whether of one, a few, or many, and whether hereditary, self-appointed, or elective, may
justly be pronounced the very definition of tyranny.” Constitutionalism, in other words, calls for
the differentiation of political functions, that is, functional differentiation, which is a
key concept in this essay.
Political separation of powers is just one of innumerable kinds of functional
differentiation. The human brain, for instance, has through the long course of evolution
differentiated itself according to various functions. Organizations of all types also internally
differentiate themselves according to various functions or responsibilities. Moving on to consider
society in general, Niklas Luhmann (2005, 2012, 2013) argues that modern globalized society
has differentiated itself into several autonomous function systems.[1] While there are debates about
what “counts” as a function system, the existence of at least a core set of such systems is
regarded as a key concept of modernity (Leydesdorff, 2002; Berger, 2003; Vanderstraeten, 2005;
Brier, 2006; Ward, 2006; Kjaer, 2010; Bergthaller & Schinko, 2011; Jönhill, 2012; Schirmer &
Hamadek, 2007; Buzon & Albert, 2010). Roth and Schütz (2015) make the case for ten function
systems: economy,[2] politics, law, science, religion, education, health, art, mass media, and sport.
Contemporary globalized society protects the operational autonomy of the function
systems. When we speak of separation of church and state, freedom of the press, or academic
freedom, or when we promote universal basic education or healthcare, we are speaking of
functional differentiation. Thus, we invoke a “semantics of corruption” (Newbury, 2016) when
one area of society unduly influences another area, as when money corrupts politics, when
religion infringes on science or education, or when the mass media and politics seem to
become one. We may judge the perceived importance of functional differentiation by surveying
the literature on functional de-differentiation. To cite a small sample of these “izations,” scholars
have analyzed the politicization of art and the aestheticization of politics (Benjamin, 1936),[3] the
politicization of education (Roper, 2005; Wirth, Whiddon & Manson, 2008), the
economization/commercialization of education (Spring, 2015; Fazl-E-Haider, 2012), healthcare
(Ewert, 2009), and culture and the liberal arts (Fludernik, 2005; Eikhof & Haunschild, 2007; De
Valick, 2014), the mediatization of politics (Kepplinger, 2002; Esser & Strömbäck, 2014), the
politicization, economization, and mediatization of religion (Thompson, 2006; Robertson, 1992;
Thomas, 2016), the politicization of science (Bolsen & Druckman, 2015), the mediatization of
culture and society (McLuhan, 1964; Hjarvard, 2013, 2013; Mazzoleni, 2008), and, of course,
the economization of everything (Lamont & McGuirk, 2017).
[1] The German Funktionssystem is translatable as function system or functional system, and both terms appear in English translations of Luhmann’s work.
[2] Arguably, in this context a better word than economy is commerce, a usage that avoids associations with thrift or the wise management of limited resources. The word commercialization is also preferable to economization because economization already has an established definition: “the action or process of economizing” (OED). In addition to commerce, another option is market economy, a usage adopted by some researchers (e.g., Valentinov, 2015). But in the publications cited in this section, economization signifies the economy’s take-over of traditionally noncommercial areas of society.
In modern China functional differentiation has unfolded differently than it has in other
parts of the world. A thorough analysis of the factors that have led to functional differentiation
developing differently in China is well beyond the scope of this chapter. Briefly, however, we
can observe that the ancient Chinese dynasties established a pyramidal social structure, “a
massive base, with higher layers of strongly diminishing size, and an apex of one single person”
(Gassman, 2003, p. 533), which is still recognizable in contemporary China. China also has a
philosophical/intellectual heritage based in the teachings of Confucius (or Kong Fuzi), Mozi,
Mencius, Zhuangzi, and Laozi (the reputed author of the Dao De Jing), along with the Legalist
thinkers. These “schools of thought,” which only became schools of thought retrospectively,
share a few core assumptions: “they regarded disunity and chaos as a problem to be solved, they
assumed that patriarchal hierarchy is natural to families and for states, and there is a natural
‘Way’ (dao) of Heaven, and that those who discover and follow the Heavenly Way will be
successful” (Tanner, 2009, p. 67). A key principle of Daoism is that change is natural and
inevitable but, ideally, orderly. Thus, good leadership, whether of the family or the state, calls for
the orderly management of change, an ideal embraced by the Chinese Communist Party.
[3] Walter Benjamin (1936) declared, “All efforts to render politics aesthetic culminate in one thing: war.”
In other words, while thinkers such as James Madison (Federalist 47) saw competing
interests and even a degree of disorder as the antidote to tyranny, Chinese thinkers and leaders
for millennia have promoted orderly, hierarchically-managed change.[4] Thus, while functional
differentiation, which generates autonomous, self-reproducing social systems with no central
steering mechanism, is happening in contemporary China, it goes against the grain of an ancient
cultural heritage. This bias for orderly change and top-down social control is observable in the
modern Chinese subcorpus of the Google Books Ngram Viewer, a free online graphing tool
which charts annual counts of words and word sequences as found in the largest available corpus
of digitized books.
Although this study uses the Google Books Ngram Viewer to analyze trends in ten
function systems, it is essentially a study of just one system, the mass media system, because
books (print and digitized) represent a large part of the mass media. Indeed, the mass media
system arguably began with the fifteenth-century invention of the printing press (McLuhan,
1962; Luhmann, 2007). According to Luhmann (2007), everything contemporary society knows
derives from the mass media.
[4] This attitude toward change is captured in this quote from the Zhuangzi, the second most important text in Daoism. In explaining his craft, Cook Ding says, “A good cook changes his knife once a year, because he cuts. A mediocre cook changes his knife once a month, because he hacks. I've had this knife of mine for nineteen years and I've cut up thousands of oxen with it, and yet the blade is as good as though it had just come from the grindstone” (Watson, 1968).
The Google Books Ngram Viewer
One way to measure the evolution of functional differentiation, as well as movements
toward de-differentiation, is by analyzing the contents of a large corpus of digitized books. Roth,
Clark, Trofimov, et al. (2017) used the Google Books Ngram Viewer to engage in a form of
computational sociology known as culturomics, “a field of study that uses numerical analysis of
large volumes of data to investigate culture” (Matthews, 2014). Michel, Shen, Aiden, et al.
(2011) explain their creation of the Google Books Ngram Viewer as follows:
We constructed a corpus of digitized texts containing 4% of all books ever
printed. Analysis of this corpus enables us to investigate cultural trends
quantitatively. [. . .] [T]his approach can provide insights about fields as diverse as
lexicography, the evolution of grammar, collective memory, the adoption of
technology, the pursuit of fame, censorship, and historical epidemiology.
Culturomics extends the boundaries of rigorous quantitative inquiry to a wide
array of new phenomena spanning the social sciences and the humanities.
This tool allows users to search a large corpus of books and obtain a visual representation of
cultural trends, the Ngram. While Google Books contains over 15 million digitized books, the
creators of the Ngram Viewer selected 5,195,769, or approximately 4% of all books ever
published. These books contain well over 500 billion words, including English (361 billion),
French (45 billion), Spanish (45 billion), German (37 billion), Chinese (13 billion), Russian (35
billion), and Hebrew (2 billion).
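As a quick arithmetic check on these figures, the per-language word counts can be summed to see how small the Chinese subcorpus is relative to the others. The sketch below uses only the rounded counts quoted above; the dictionary and function names are ours and purely illustrative.

```python
# Word counts in billions, as quoted above from Michel, Shen, Aiden, et al. (2011).
WORDS_BILLIONS = {
    "English": 361,
    "French": 45,
    "Spanish": 45,
    "German": 37,
    "Russian": 35,
    "Chinese": 13,
    "Hebrew": 2,
}

# Total of the listed (rounded) counts.
total = sum(WORDS_BILLIONS.values())


def share(language: str) -> float:
    """Return a language's share of the listed word total, in percent."""
    return 100.0 * WORDS_BILLIONS[language] / total


if __name__ == "__main__":
    print(f"listed total: {total} billion words")
    for language in sorted(WORDS_BILLIONS, key=WORDS_BILLIONS.get, reverse=True):
        print(f"{language:8s} {share(language):5.1f}%")
```

On these rounded figures the listed languages sum to 538 billion words, of which Chinese accounts for roughly 2.4%.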
The Ngram Viewer produces time series graphs, also known as time plots, which are data
visualizations that illustrate data points at successive intervals of time. Each point on the
chart corresponds to both a time and a quantity that is being measured. These graphs are used to
identify, model, and forecast patterns and behaviors in data that is sampled over discrete time
intervals. The Ngram Viewer allows us to compare the levels of communication on economic
topics, political topics, legal topics, educational topics, scientific topics, artistic topics, religious
topics, and so forth. Using this tool, Roth, Clark, Trofimov, et al. (2017) analyzed key concepts
of functional differentiation in the English, Spanish, French, German, Italian, and Russian
subcorpora. Each of the ten function systems was represented by five keywords. According to
Roth, Clark, Trofimov, et al. (2017),
The selection of the five most important keywords per function system was a
multistep mix-methods process. First, we relied on a small collection of Python
scripts that generate word frequency lists based on the Google Ngram dataset. In
our case, we created lists of the 10,000 most frequent words per investigated
language area. We then manually scanned these lists for words that refer to one
and only one of the 10 function systems, whereby each list was screened by at
least two colleagues. The major challenge in this context was to identify n-grams
that unambiguously refer to not more than one function system. [. . .] We then
picked the five most frequent keywords per function system and combined them
to strings such as (business+economic+money+company+cost). If entered into the
Google Ngram Viewer, each such string creates one single graph that represents
the combined performance of all keywords, which in this case presents the
combined performance of the five strongest indicators for the economy. (pp. 7-8)
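The combination step described in this passage can be sketched in a few lines of Python. This is a minimal illustration, not the scripts used in Roth, Clark, Trofimov, et al. (2017): the function names are hypothetical, and `combined_series` assumes the per-keyword yearly frequencies have already been obtained by some other means.

```python
from typing import Dict, Iterable


def composite_query(keywords: Iterable[str]) -> str:
    """Join keywords with '+' and wrap them in parentheses, as in the quoted example."""
    return "(" + "+".join(keywords) + ")"


def combined_series(series_by_keyword: Dict[str, Dict[int, float]]) -> Dict[int, float]:
    """Sum per-keyword yearly frequencies into one composite series.

    A year missing for a keyword is treated as contributing zero.
    """
    combined: Dict[int, float] = {}
    for yearly in series_by_keyword.values():
        for year, freq in yearly.items():
            combined[year] = combined.get(year, 0.0) + freq
    return dict(sorted(combined.items()))


if __name__ == "__main__":
    economy = ["business", "economic", "money", "company", "cost"]
    print(composite_query(economy))
```

Summing the individual series is what makes the composite line represent the combined performance of all keywords rather than any single one.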
Roth, Clark, Trofimov, et al. (2017) limited the Google Books sample period to the years 1800-2000
because the data is most reliable after 1800 and because the research concerned
macrotrends in modern societies. This chapter covers the simplified Chinese subcorpus for
the years 1950-2008.
This research offers insight into whether Chinese society between 1950 and 2008 (not
just the People’s Republic of China but Chinese as a language) experienced significant trends in
discourse related to politics, law, economy, education, science, mass media, religion, health,
art, or sport, the ten function systems covered in Roth and Schütz (2015). Our contention is that
the Google Books ngrams discussed below reveal traces of social change in modern China. We
will first discuss a few of the problems we encountered in charting the Chinese-language Google
Books subcorpus. Next, we will offer interpretations of the observed trends. Finally, we discuss a
number of questions that arise from the ngrams and that may be explored in further research.
Methods
The first methodological problem arose from the fact that following the founding of the
People’s Republic of China (PRC) in 1949, the Chinese government, building on earlier efforts,
revised the traditional Chinese script to create simplified Chinese. Google has predominately
scanned books written in simplified Chinese. Therefore, rather than starting at the year 1800 as in
Roth, Clark, Trofimov, et al. (2017), this project considered only books published from 1950 to
2008. The second problem was that it is very difficult to create a reliable Chinese word-
frequency list. To help people learn entry-level Chinese, attempts have been made to create such
lists. For example, in 1988, the National Working Committee on Languages and Writing
Systems (国家语言文字工作委员会) and the Chinese Ministry of Education created a List of
Frequently Used Characters in Modern Chinese (现代汉语常用字表), which includes 3,500
characters. These characters are divided into groups based on the number of strokes each
character comprises. Yet it is not known how these characters are ranked among themselves.
Another list, the Modern Chinese Character Frequency List (现代汉语单字频率列表), has been
compiled by Professor Jun Da. This list contains 9,933 unique characters. These two lists,
unfortunately, are not useful for our purposes because single characters can take on completely
different meanings when joined to one or more other characters to form words or phrases.
Compared to a character-frequency list, Chinese word-frequency information is even
more difficult to compile because, unlike Western languages/writing systems, Chinese words are
not segmented by white spaces. A single Chinese word or phrase may contain up to five
characters. Researchers have not yet developed software that can reliably segment Chinese
words and phrases. The software might misread a meaningless combination of characters as a
word: a problem we encountered when using the Google Books Ngram Viewer. Consequently,
we could not use an existing word-frequency list or character-frequency list.
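The segmentation problem can be illustrated with a toy example. The sketch below is not the software used by Google or by this study; it implements two textbook heuristics, forward and backward maximum matching, against a tiny hypothetical lexicon, with a maximum lexeme length of five characters assumed from the discussion above.

```python
from typing import List, Set

# Hypothetical toy lexicon, for illustration only.
LEXICON: Set[str] = {"市场", "经济", "市场经济", "经济发展", "发展"}


def forward_max_match(text: str, lexicon: Set[str], max_len: int = 5) -> List[str]:
    """Greedy left-to-right segmentation: always take the longest lexicon match."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing in the lexicon matches.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def backward_max_match(text: str, lexicon: Set[str], max_len: int = 5) -> List[str]:
    """Greedy right-to-left segmentation: always take the longest lexicon match."""
    tokens: List[str] = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in lexicon:
                tokens.insert(0, candidate)
                j -= size
                break
    return tokens


if __name__ == "__main__":
    phrase = "市场经济发展"
    print(phrase.split())                       # whitespace splitting finds no boundaries
    print(forward_max_match(phrase, LEXICON))   # one plausible segmentation
    print(backward_max_match(phrase, LEXICON))  # a different, equally plausible one
```

The two heuristics disagree on the same six characters (市场经济 + 发展 versus 市场 + 经济发展), which is the kind of ambiguity that makes automatically generated Chinese word-frequency lists unreliable.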
Roth, Clark, Trofimov, et al. (2017) started with word-frequency lists and then used a multistep,
mixed-method process to find the top five keywords for each function system in each language
area. When it came to Chinese, without a reliable word-frequency list, one option could have
been to look for the best translations of the key terms used in the previous study. But that method
was unworkable because each language has its own keywords for each function system; in other
words, the function systems are the same but the keywords are different. To be more precise, we
should speak of lexemes rather than words. Although there is some debate among linguists
regarding the meaning or significance of this concept (Bonami, 2018), lexeme may be defined as
the unit of a language “intermediate between morpheme and utterance” (OED). It is a single word,
a part of a word, or a chain of words (e.g., dog, stop sign, by the way, raining cats and dogs) that
forms a basic, meaningful element of the lexicon of a language. Thus, to solve our
methodological problem, one option was to settle on a set of Chinese lexemes that seem, in the
judgment of well-educated native speakers, to fit the function systems; but such an effort would
still be too arbitrary.
After casting about for solutions, we decided to start with the best translations for the
function systems themselves: economy, politics, law, education, science, mass media, art, etc.
But a single Chinese lexeme, even when it consists of two, three, or four characters, can have
many different meanings when combined with adjacent characters, and there is no white space
between these adjacent lexemes, or lexical units, of a sentence. A further difficulty is that to aid
in the processing of characters, the Ngram Viewer inserts white space where it does not actually
exist in the scanned texts. In fact, sometimes the user must insert white space between lexemes in
order to make the Ngram Viewer accept a legitimate lexeme, a time-consuming, frustrating
process.
Confirmation that single-character Chinese lexemes, or 1-grams, are not useful for
Google Ngram searches comes from Yang et al. (2007), researchers at Google. They compared
English and Chinese n-grams and found that while 1-grams in Chinese are very different from
1-grams in English, multigrams (3-grams and up) share nearly identical frequency distributions in
English and Chinese. For example, Jun Da translates the character 学 as learn/study/science/-ology.
If we try to compare the frequencies of this character and the English lexeme science, we
will not get a useful result. We get a more useful result if we compare a three- or four-character
Chinese lexeme with a three- or four-word English lexeme. This finding suggests that we should
look at longer strings of Chinese characters. As shown in Table 2, many of the Chinese lexemes
we used consisted of four characters. For example, one of the lexemes we used under Economy
was 市场经济, which translates as market economy.
Knowing, then, that translations of single words (1-grams) like politics, economy, and
law alone would be inadequate, we started experimenting with Google’s wildcard search
function. The wildcard search works as follows: When one puts an asterisk in place of a word,
the Google Ngram Viewer displays the top ten substitutions for the asterisk. To illustrate this
function, we performed the following wildcard search for the English word science from 1900 to
2008. When we place the asterisk in front of science, we get the following chart:
Figure 1. Wildcard term preceding science.
And when we put the asterisk after science, we get this chart:
Figure 2. Wildcard term following science.
We may discount the determiners, prepositions, conjunctions, and verbs because they do not
produce significant 2-grams, leaving the top four 2-grams from Figures 1 and 2 in Table 1:
2-gram               Percentage of 2-grams in Google Books English language corpus for the year 2008
social science       0.0003221600
science fiction      0.0002422460
political science    0.0002263571
modern science       0.0001327003
Table 1: Top four 2-grams produced from science wildcard searches.
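The filtering step that yields Table 1 can be sketched as follows. The stop list and the candidate 2-grams are hypothetical stand-ins for the kind of determiner, preposition, conjunction, and verb hits a wildcard search returns; only the four percentages are taken from Table 1.

```python
from typing import List

# Hypothetical stop list of determiners, prepositions, conjunctions, and common verbs.
STOP_WORDS = {"of", "the", "and", "in", "to", "a", "is", "as", "for", "that"}

# Percentages for the year 2008, copied from Table 1.
TABLE_1 = {
    "social science": 0.0003221600,
    "science fiction": 0.0002422460,
    "political science": 0.0002263571,
    "modern science": 0.0001327003,
}

# Hypothetical wildcard hits, mixing stop-word pairs with the Table 1 entries.
CANDIDATES = [
    "of science", "social science", "the science", "political science",
    "modern science", "science and", "science fiction", "science is",
]


def is_content_bigram(bigram: str, keyword: str = "science") -> bool:
    """Keep a 2-gram only if the word paired with the keyword is not a stop word."""
    words = bigram.lower().split()
    if len(words) != 2 or keyword not in words:
        return False
    other = words[0] if words[1] == keyword else words[1]
    return other not in STOP_WORDS


def rank_content_bigrams(candidates: List[str]) -> List[str]:
    """Filter out stop-word pairs and rank the survivors by their Table 1 percentage."""
    kept = [b for b in candidates if is_content_bigram(b)]
    return sorted(kept, key=lambda b: TABLE_1[b], reverse=True)


if __name__ == "__main__":
    ranked = rank_content_bigrams(CANDIDATES)
    print(ranked)
    print(f"combined percentage: {sum(TABLE_1[b] for b in ranked):.10f}")
```

Summing the four retained percentages gives the combined value that a plus-sign query for these 2-grams would plot for 2008.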
In Roth, Clark, Trofimov, et al. (2017), we found (through the method described previously) that
in English, the top five keywords for the science system were system, method, theory, research,
and analysis. When the user inserts plus signs between lexemes, the Ngram Viewer combines all
the terms into a single time-series plotline. In Figures 3 and 4, we compare the method used in
Roth, Clark, Trofimov, et al. (2017) with the wildcard method.
Figure 3. Ngram for the combined keywords for the science system (in the English language
area) used in Roth, Clark, Trofimov, et al. (2017).
Figure 4. Science-associated lexemes found through wildcard search.
Because single-word lexemes (1-grams or unigrams) will occur more frequently than
two-word lexemes (2-grams or bigrams), the frequency percentages are over 100 times greater
in Figure 3 than in Figure 4; for example, theory (one of the science keywords used in
Roth et al., 2017) occurs over 45 times more frequently than social science (see Figure 5). Consequently, we
could not fit both plotlines in the same chart. But if we ignore the difference in magnitude and
focus on the trends, which are more important, we see a striking similarity. Both lines rise
steadily from 1800-1900, and then more sharply (i.e., the rising trend accelerates) through most
of the 20th century. In Figure 4, the slope of the rising trend steepens in 1950, while in Figure
3, the trend continues to rise during the same years. In Figure 3, the trend peaks in 1978, and in
Figure 4 it peaks in 1984. Both lines then decline (for unknown reasons) until the 2008 cutoff.[5]
Based on the similar shape of the plotlines in Figures 3 and 4, we concluded that the wildcard
search method would be a valid substitute for the method used in Roth, Clark, Trofimov, et al.
(2017), which, as explained above, would not work at all for Chinese.
[5] The reader may notice that in both charts we set the smoothing number at ten rather than the default of three. Smoothing can be useful because if a lexeme occurs in a single book in one year but not in the preceding or following years, that occurrence creates a spike, and the spike is taller in early years than it would be in later years. In other words, tall spikes would present a distorted view of the actual trend. No smoothing, or a smoothing of zero, would reflect raw data, i.e., no visual patterns at all (Michel, Shen, Aiden, et al., 2011).
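To make the effect of the smoothing setting concrete, the sketch below implements a centered moving average of the kind we understand the Ngram Viewer to apply, in which a smoothing of n averages each year's value with up to n values on either side. How the Viewer treats the ends of a series is an assumption here (the window is simply truncated), and the function name is ours.

```python
from typing import List


def smooth(values: List[float], smoothing: int) -> List[float]:
    """Centered moving average.

    Each point is averaged with up to `smoothing` neighbours on either side;
    the window is truncated at the ends of the series (an assumption of this sketch).
    """
    if smoothing < 0:
        raise ValueError("smoothing must be non-negative")
    smoothed: List[float] = []
    for i in range(len(values)):
        lo = max(0, i - smoothing)
        hi = min(len(values), i + smoothing + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # Toy series: a lexeme that appears in a single year only.
    spike = [0, 0, 0, 9, 0, 0, 0]
    print(smooth(spike, 0))  # raw data: one isolated spike
    print(smooth(spike, 1))  # the spike is spread over three years
    print(smooth(spike, 3))  # wider window, flatter curve
```

With a smoothing of zero the isolated spike is reproduced unchanged, while larger settings spread it over neighbouring years, which is why a setting of ten produces smoother trend lines than the default of three.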
Figure 5. Ngram for theory and social science, 1950-2008: theory (blue), social science (red)[6]
Now for the Chinese language area. As stated above, we found that there was no perfect
method, but we determined the most promising method was as follows: First, a native speaker
determined the best Chinese translations for the ten English words for the function system
names: economy, politics, law, science, education, health, sport, art, mass media, and religion.
Secondly, because, as discussed above, we knew these translations alone would be inadequate,
we used the wildcard search method. Additionally, given the processing limitations of the
Google Ngram Viewer, we decided that a string of four Chinese lexemes, rather than five as used
in Roth, Clark, Trofimov, et al. (2017), would be adequate.
[6] The colors will only appear in an online version of this chapter.
Results:
Table 2 lists the ten function systems with their lexeme combinations in Chinese and
English, ranked by percentage of occurrence in the Google Books simplified Chinese
subcorpus. Figures 6 and 7 are the Chinese ngrams for the years 1950-2008 for the top five and
bottom five function systems. We also compare the year 2000 rankings of the English, Spanish, and
French language areas from Roth, Clark, Trofimov, et al. (2017) with the Chinese rankings for the
same year. The reader may refer to that article for the rankings of the Italian, German, and
Russian language areas.
Function System | Lexeme Combination (Chinese) | Lexeme Combination (English) | Percentage in 2008 among all lexemes in Google Ngram Viewer Simplified Chinese Corpus (rounded to five decimal places)
Economy (经济) | 市场 经济 + 国民 经济 + 经济发展 + 经济增长 + 经济 增长 | market economy, national economy, economic development, economic growth | 0.07038
Science (科学) | 社会 科学 + 科学 发展 + 科学 技术 + 科学 研究 | social science, scientific development, science & technology, scientific research | 0.02663
Education (教育) | 教育 + 高等教育 + 教育部 + 义务教育 | educational, higher education, Ministry of Education, compulsory education | 0.01516
Law (法律) | 法律 + 法律 法规 + 法律 制度 + 有关 法律 | legal, legal regulations, legal system, relevant law | 0.01192
Politics (政治) | 中央 政治 + 政治 体系 + 民主 政治 + 政治家 | central politics, political system, democratic politics, politician | 0.00761
Health (医疗) | 医疗 机构 + 医疗卫生 + 合作医疗 + 基本医疗 | medical institutions, medical hygiene, cooperative medical treatment, basic medical care | 0.00560
Art (艺术) | 文化艺术 + 文学艺术 + 艺术作品 + 艺术形式 | culture & art, literature & art, art work, forms of art | 0.00402
Religion (宗教) | 宗教 信仰 + 宗教 活动 + 民族宗教 + 宗教事务 | religion, religious activities, national religion, religious affairs | 0.00310
Mass Media (媒体) | 新闻媒体 + 新媒体 + 网络媒体 + 传统媒体 | news media, new media, Internet media, traditional media | 0.00234
Sports (体育) | 体育活动 + 体育 运动 + 体育事业 + 国家体育 | sports activities, sports, sports career, national sports | 0.00140
Table 2. Ten function systems in English and simplified Chinese, year 2008
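The ranking in Table 2 follows directly from the 2008 percentages, and the relative size of the gaps can be checked with simple ratios. The sketch below copies the percentages from the table; the function names are ours.

```python
from typing import Dict, List

# 2008 percentages copied from Table 2.
TABLE_2: Dict[str, float] = {
    "Economy": 0.07038,
    "Science": 0.02663,
    "Education": 0.01516,
    "Law": 0.01192,
    "Politics": 0.00761,
    "Health": 0.00560,
    "Art": 0.00402,
    "Religion": 0.00310,
    "Mass Media": 0.00234,
    "Sports": 0.00140,
}


def ranking(percentages: Dict[str, float]) -> List[str]:
    """Return function systems ordered from highest to lowest percentage."""
    return sorted(percentages, key=percentages.get, reverse=True)


def ratio(system_a: str, system_b: str) -> float:
    """Return how many times larger system_a's percentage is than system_b's."""
    return TABLE_2[system_a] / TABLE_2[system_b]


if __name__ == "__main__":
    for rank, system in enumerate(ranking(TABLE_2), start=1):
        print(f"{rank:2d}. {system:10s} {TABLE_2[system]:.5f}")
    print(f"Economy / Science: {ratio('Economy', 'Science'):.1f}")
    print(f"Economy / Sports:  {ratio('Economy', 'Sports'):.1f}")
```

On these figures the Economy combination occurs roughly 2.6 times as often as the Science combination and about 50 times as often as the Sports combination.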
Figure 6: Google Books Ngram 1950-2008 for economy (blue), science (red), education (green),
law (orange), politics (purple)
Figure 7: Google Books Ngram 1950-2008 for health (blue), art (red), religion (green), mass
media (orange), sports (purple)
Rank  English      Spanish      French       Chinese
1     Politics     Politics     Politics     Economy
2     Science      Law          Law          Science
3     Mass Media   Religion     Religion     Education
4     Religion     Science      Economy      Politics
5     Economy      Economy      Science      Law
6     Law          Mass Media   Art          Health
7     Education    Education    Mass Media   Art
8     Health       Art          Education    Religion
9     Art          Health       Health       Sports
10    Sports       Sports       Sports       Mass Media
Table 3. Comparative ranking of ten function systems for the year 2000 in English, Spanish,
French and Chinese. For Chinese, the only change between 2000 and 2008 is that mass media
overtakes sports.
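The differences between the language areas in Table 3 can be read off as rank shifts. The sketch below encodes the four columns of the table and reports how far each function system moves between a reference language and Chinese; the helper names are ours, and the calculation is offered only as an aid to reading the table.

```python
from typing import Dict, List

# Year-2000 rankings copied from Table 3 (position 1 = most frequent).
RANKINGS_2000: Dict[str, List[str]] = {
    "English": ["Politics", "Science", "Mass Media", "Religion", "Economy",
                "Law", "Education", "Health", "Art", "Sports"],
    "Spanish": ["Politics", "Law", "Religion", "Science", "Economy",
                "Mass Media", "Education", "Art", "Health", "Sports"],
    "French": ["Politics", "Law", "Religion", "Economy", "Science",
               "Art", "Mass Media", "Education", "Health", "Sports"],
    "Chinese": ["Economy", "Science", "Education", "Politics", "Law",
                "Health", "Art", "Religion", "Sports", "Mass Media"],
}


def rank_of(language: str, system: str) -> int:
    """Return the 1-based rank of a function system in a language area."""
    return RANKINGS_2000[language].index(system) + 1


def rank_shift(system: str, reference: str = "English", target: str = "Chinese") -> int:
    """Positive values mean the system ranks lower in `target` than in `reference`."""
    return rank_of(target, system) - rank_of(reference, system)


if __name__ == "__main__":
    for system in RANKINGS_2000["Chinese"]:
        print(f"{system:10s} Chinese rank {rank_of('Chinese', system):2d}, "
              f"shift vs. English {rank_shift(system):+d}")
```

Read this way, Politics sits three places lower in Chinese than in English and Mass Media seven places lower, while Economy and Education each sit four places higher.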
Discussion:
The most noticeable line in Figure 6 is the Economy line,[7] which indicates a sustained
rise of economically (or commercially) oriented communication from 1970-2003. In 1950,
Economy is in third place. In 1967 it enters second place, and in 1979 it overtakes Science to rise to
first place. It drops in 2003 but remains well above the other four lines in 2008.[8] In
trying to interpret this trend, we note that from 1978 until his retirement in 1992, Deng Xiaoping,
as Paramount Leader, oversaw major economic policy reforms, which remained in place under
the PRC presidencies of Jiang Zemin (1993-2003) and Hu Jintao (2003-13). The decline from
2003-2008 may be related to the fact that Hu Jintao, after taking note of the increasing gap
between rich and poor, de-emphasized wealth creation and shifted to the Harmonious Society
theme. Another possible reason for the decline of the Economy line from 2003-2008 is that the
economic reforms were no longer news.
[7] In this section, we capitalized the function system names to distinguish them from other uses of the words economy, politics, law, etc.
[8] Some histories date the Chinese Cultural Revolution from 1966-76; however, it formally ended in 1969 when “the Ninth Party Congress declared that the Cultural Revolution had come to a victorious conclusion” (Tanner, 2009, p. 530). The year 1976 is often cited because that was the year of Mao’s death and the end of ten years of factional struggle within the CCP leadership.
Moving on to Science, this line is steady from 1950-71; then it rises until the 1990s; it
then drops slightly but in 2008 remains well above its 1971 level. This trend is consistent with
the historical record. One of the four lexemes for science was science & technology. Clearly,
with the exception of most of the 1960s, the CCP saw science and technology as essential to
China’s emergence as a world power.
The Education and Law lines don’t give us much to go on. These two lines remain
parallel to each other from 1950 to 2008. Law drops slightly in 1966, the beginning of the
Cultural Revolution, which was a particularly lawless time.[9] Education has always been
important in China, and law has gained in importance as China’s economy has developed. But
the fact that Politics is in fifth position and never rises higher than fourth is worth considering.
The Education line rises modestly but steadily from 1983-98. Conventional academic
education has always, with the exception of the Cultural Revolution years, been a high priority
in China, thus accounting for its ranking of third behind Economy and Science in 2008.
[9] Some histories date the Chinese Cultural Revolution from 1966-76; however, it formally ended in 1969 when “the Ninth Party Congress declared that the Cultural Revolution had come to a victorious conclusion” (Tanner, 2009, p. 530). The year 1976 is often cited because that was the year of Mao’s death and the end of ten years of factional struggle within the CCP leadership.
The Law line rises modestly but consistently from 1975-2008. Law is the only line that
never drops after 1975. This fact is likely associated with the movement to strengthen the
Chinese legal system following the chaos of the Cultural Revolution. Deng Xiaoping and other
reformists recognized that a stronger legal system was necessary for continued economic growth
and social stability. As Tanner (2009) writes,
One of their first priorities in 1979 was to have the National People’s Congress
approve China’s first post-1949 Criminal Law and Criminal Procedure Law. [. . .]
A number of other laws followed during the 1980s, many of them designed to
deal with the emerging market economy: the Inheritance Law (1985), Civil Law
(1986), and Bankruptcy Law (1986), laws concerning private enterprises, joint
enterprises, foreign investment, business taxes, and so on. (p. 551)
Politics sits in fifth position. This finding contrasts with the prominent position of Politics
in previous Google Ngram studies. For instance, in Roth, Clark, Trofimov, et al. (2017) the
political function system has ranked first in the English, Spanish, French, German, Italian, and
Russian language areas since at least the early twentieth century. It has ranked first in the
Spanish language area since 1840 and first in the English area since 1880. But given that the
CCP, like Chinese rulers in the past, does not encourage the general Chinese public to take an
active interest in politics (Tanner, 2009), the relatively low position of the Politics line is
unsurprising. Nonetheless, the stark contrast between the positions of Politics in the modern
Chinese Google Books corpus and the English, Spanish, French, German, Italian, and Russian
language areas warrants further study.
In Figure 7, the most prominent line is that of the Art system, which resembles a mountain range
with peaks in 1962 and 1983, intervening valleys, and finally a plateau from 2000-2008. The
sharp decline from 1962-72 occurs following the Great Leap Forward (1957-60) and the famine
years (1960-62) and during the Cultural Revolution (1966-69) and its aftermath. Most artworks
produced during the Cultural Revolution were propaganda pieces created anonymously and
collectively, and most of them were destroyed or thrown away after Mao’s death in September
1976 (Zheng, 2010). However, some of the artwork was preserved, and in 2002 an exhibition
titled Art of the Great Proletarian Cultural Revolution, 1966-1976, appeared in Vancouver,
Canada (King, Croizier, Watson & Zheng, 2010). Although this exhibition covered the years
1966-76, the fact that the Art line starts to rise in 1973 indicates that the art function system had
started to recover its autonomy at least three years before 1976, the year some histories mark as
the end of the Cultural Revolution. But, as mentioned previously, the Cultural Revolution
officially ended in 1969.
The second interesting line in Figure 7 is Health, which rises from 1963-74, then falls for
a decade before rising again in 1984. The rise from 1963-73 is likely related to the Barefoot
Doctors initiative, which, as part of the Cultural Revolution, brought affordable healthcare to the
countryside. Until the mid-1960s Chinese healthcare resources had been directed mostly to urban
areas. But, according to Zhang & Unschuld (2008),
Mao Zedong criticised the urban bias of medical services and pointed out the
stress placed on rural areas in 1965, [and] mobile teams of doctors from urban
hospitals were sent to deliver health care and train indigenous paramedics. [. . .]
[T]he barefoot doctor programme effectively reduced costs and provided timely
treatment to the rural people. [. . .] Reforms in the health-care system in the early
1980s [. . .] resulted in the collapse of the cooperative medical system to a
payment-based system of medical care in rural areas. The percentage of villages
with a cooperative medical system fell from 90% in the 1960s to 5% by 1985.
It is noteworthy that the third and fourth lexemes for the Health function system are medical
hygiene and cooperative medical care, terms clearly associated with the rural healthcare
initiative.
As for Religion, this trend line dropped overall between 1950 and the early 1970s,
but it has risen consistently since 1975. This rise likely relates to the general opening-up of Chinese
society in the post-Mao decades.
The Mass Media line is interesting because, although the system does register very
slightly in the early 1950s, the line only really appears in 1991 and then rises steadily up to 2008.
The rise of mass-media-oriented communication may be related to a loosening of CCP
oversight that has allowed a relatively independent print news industry to develop along
with the economic reforms (Tanner, 2009; Clark & Zhang, 2017). But the rise of the Mass Media
line is also clearly related to the emergence of the Internet, as indicated by three of the lexemes under
Mass Media: new media, Internet media, and traditional media. With strong backing from the
government, on April 20, 1994, a 64K international dedicated line to the Internet with full
functional Internet accessibility was set up in China (Weishan, Hongjun & Zhangmin, 2018). In
July 2008, China surpassed the United States in total Internet users (Barboza, 2008).
In tenth position is Sport, which, in comparison to Economy, Politics, and Law, is a minor
function system; it ranks last among the ten function systems in each of the language areas
covered in Roth, Clark, Trofimov, et al. (2017).
We close this section with a few comments on Table 3. The comparison of the English,
Spanish, French, and Chinese language areas as represented by Google Books Ngram Viewer
shows that Politics and Economy basically exchanged positions. Economy and Politics rank first
and fourth, respectively, in the Chinese ngram. In contrast, in English, Spanish, and French,
Politics ranks first and Economy ranks no higher than fourth. Although to save space we did not
include German, Italian, and Russian, in Roth, Clark, Trofimov, et al. (2017), Politics ranks first
in each of those languages and Economy ranks no higher than third (Russian). These results
indicate that economically or commercially related topics are discussed far more frequently in
books published in modern Chinese in comparison to other major languages, whereas politically
related topics appear far less frequently. Another area of contrast is education-related
communication. In Table 3, Education ranks second on the Chinese list but no higher than
seventh for English, Spanish, and French. The rankings for Education are similar in German,
Italian, and Russian as shown in Roth, Clark, Trofimov, et al. (2017). We don’t know, however,
how much education-related discourse in China is patriotic education and how much is
conventional academic education.
Conclusions
This chapter has analyzed trends in communication, specifically in the medium of books,
related to ten function systems identified in social systems theory (politics, law, economy,
education, science, mass media, religion, health, art, and sport) in post-1949 China. By charting
these trends with the aid of the Google Books Ngram Viewer, we have been able to track social
change in modern China. This study builds on previous research on the English, Spanish, French,
German, Italian, and Russian subcorpora of Google Books, and it makes a unique
contribution by using the wildcard search function to get around the difficulties presented by the
Chinese writing system.
The Google Books Ngram tool offers suggestive results that invite further, more fine-
grained research. That is to say, rather than being an end in itself, this kind of big data research
generates new research questions that may be explored through other methods. For example,
why does political communication (the Politics trendline) rank significantly lower on the list of
function systems for the modern Chinese language area in comparison with the English, Spanish,
French, German, Italian, and Russian language areas, as found in Roth, Clark, Trofimov, et al. (2017)? This
study offers evidence that books published in Chinese from 1950-2008 discuss political topics
less frequently than do books published in other major world languages. But books, of course,
are only one medium, and a very small percentage of people actually publish books. Are political
topics actually discussed less frequently outside of books among Chinese speakers in comparison
with speakers of these other languages? Researchers might also investigate the following
questions: Is commercially oriented communication (the economy function system) actually
more common or more valued among Chinese speakers in comparison to speakers of other major
world languages? Has modern China devoted a large part of its educational resources to patriotic
education in comparison to conventional education? What factors are behind the increasing
importance of a standardized, less politicized legal system in contemporary China? Did the
resources devoted to affordable healthcare actually rise sharply from 1963-71 and then decline
significantly from 1974-83? And if so, why? These are just a few of the questions provoked by
this kind of big-data research.
References
Barboza, D. (July 26, 2008). China Surpasses U.S. in Number of Internet Users. The New York
Times. Retrieved from https://advance-lexis-
com.libweb.uwlax.edu/api/document?collection=news&id=urn:contentItem:4T2R-W960-
TW8F-G1FB-00000-00&context=1516831.
Benjamin, W. (1936/1968). Illuminations. H. Arendt (Ed.), H. Zohn, Trans. New York:
Harcourt, Brace & World.
Bergthaller, H., & Schinko, C. (2011). Introduction: From National Cultures to the Semantics of
Modern Society. In H. Bergthaller & C. Schinko (Eds.), Addressing Modernity. Social
Systems Theory and U.S. Cultures. Amsterdam: Edition Rodopi.
Bolsen, T. & Druckman, J. (2015). Counteracting the Politicization of Science. Journal of
Communication, 65(5), 745-769. https://doi-
org.libweb.uwlax.edu/10.1111/jcom.12171
Bonami, O. (Ed.). (2018). The Lexeme in Descriptive and Theoretical Morphology (Empirically
oriented theoretical morphology and syntax, 4). Berlin, Germany: Language Science
Press.
Brier, S. (2006). Construction of Knowledge in the Mass Media. Systemic Problems in the Post-
Modern Power-Struggle between the Symbolic Generalized Media in the Agora: The
Lomborg Case of Environmental Science and Politics. Systems Research and Behavioral
Science, 23(5), 667-684. doi:10.1002/sres.793
Clark, C. & Zhang, L. (2017). “Grass Mud Horse: Luhmannian Systems Theory and Internet
Censorship in China.” Kybernetes: The international journal of cybernetics, systems and
management sciences. 46.5 (May): DOI: 10.1108/K-02-2017-0056
Da, Jun. Modern Chinese Character Frequency List. http://lingua.mtsu.edu/chinese-
computing/statistics/char/list.php?Which=MO
De Valck, M. (2014). Film Festivals, Bourdieu, and the Economization of Culture. Revue
Canadienne D’études Cinématographiques / Canadian Journal of Film Studies, 23(1),
74-89.
Eikhof, D. R., & Haunschild, A. (2007). For Art’s Sake! Artistic and Economic Logics in
Creative Production. Journal of Organizational Behavior, 28(5), 523-538.
doi:10.1002/job.462
Esser, F., & Strömbäck, J. (Eds.). (2014). Mediatization of Politics: Understanding the
Transformation of Western Democracies. Palgrave Macmillan.
Ewert, B. (2009). Economization and Marketization in the German Healthcare System: How Do
Users Respond? German Policy Studies/Politikfeldanalyse, 5(1), 21-44.
Fazl-E-Haider, S. (2012). Commercialization of Education. Pakistan & Gulf Economist, 31(28),
1-2. Retrieved from https://libweb.uwlax.edu/
Fludernik, M. (2005). Threatening the University: The Liberal Arts and the Economization of
Culture. New Literary History, 36(1), 57-70. doi:10.1353/nlh.2005.0019
Google Books Ngram Viewer. https://books.google.com/ngrams/info
Gordon, S. (1999). Controlling the State: Constitutionalism from Ancient Athens to Today.
Cambridge, MA: Harvard University Press.
Hjarvard, S. (2013). The Mediatization of Culture and Society. New York: Routledge.
Jönhill, J. I. (2012). Inclusion and Exclusion: A Guiding Distinction to the Understanding of
Issues of Cultural Background. Systems Research and Behavioral Science, 29(4),
387-401. doi:10.1002/sres.1140
Kepplinger, H. (2002). Mediatization of Politics: Theory and Data. Journal of
Communication, 52(4), 972-986.
Kjaer, P. F. (2010). The Metamorphosis of the Functional Synthesis: A Continental European
Perspective on Governance, Law, and the Political in the Transnational Space. Wisconsin
Law Review, 2, 4891555.
Lamont, V. & McGuirk, K. (2017). "Introduction: Culture and the Economization of
Everything." Canadian Review of American Studies, vol. 47 no. 2, 161-170. Project
MUSE, muse.jhu.edu/article/666650.
Leydesdorff, L. (2002). The Communication Turn in the Theory of Social Systems. Systems
Research and Behavioral Science, 19(2), 129-136. doi:10.1002/sres.453
Luhmann, N., & Bednarz, J. (2005). Social Systems (Reprinted ed., Writing science). Stanford,
Calif.: Stanford University Press.
Luhmann, N. (2007). The Reality of the Mass Media (Reprinted ed.). Cambridge: Polity Press.
Luhmann, N. (2012). Theory of Society, Vol 1. Rhodes Barrett, trans. Palo Alto, CA: Stanford
University Press.
Luhmann, N. (2013). Theory of Society, Vol 2. Rhodes Barrett, trans. Palo Alto, CA: Stanford
University Press.
King, R., Croizier, R., Watson, S., & Zheng, S. (2010). Art in Turmoil: The Chinese Cultural
Revolution, 1966-76 (Contemporary Chinese studies). Vancouver: UBC Press.
Madison, J. (1788) Federalist 47. The Particular Structure of the New Government and the
Distribution of Power Among Its Different Parts.
https://www.congress.gov/resources/display/content/The+Federalist+Papers#TheFederali
stPapers-47. Accessed July 21, 2019.
Matthews, J. (2014). Encyclopedia of Environmental Change. Thousand Oaks, California: SAGE
Publications.
Mazzoleni, G. (2008). Mediatization of Society. In W. Donsbach (Ed.), The International
Encyclopedia of Communication. Hoboken, NJ: Wiley.
McLuhan, M. (1962). The Gutenberg Galaxy: The Making of Typographic Man. Toronto:
University of Toronto Press.
Michel, J.-B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Pickett, J. P., & Aiden, E. L. et
al. (2011). Quantitative Analysis of Culture Using Millions of Digitized Books. Science,
331(6014), 176-182. doi:10.1126/science.1199644 PMID:21163965
Modern Chinese Word List. 2500 Words. http://www.zdic.net/z/zb/cc1.htm
Montesquieu, C., Baron de, & Rousseau, J. (1955). The Spirit of Laws (Great books of the
western world, v. 38) (T. Nugent & G. Cole, Trans.; J. Prichard, Ed.). Chicago:
Encyclopædia Britannica.
Newbury, C. (2016). The Semantics of Corruption: Political Science Perspectives on Imperial
and Post-Imperial Methods of State-Building. Journal of Imperial & Commonwealth
History, 44(1), 163-194. https://doi-
org.libweb.uwlax.edu/10.1080/03086534.2015.1123544
Robertson, R. (1992). The Economization of Religion? Reflections on the Promise and
Limitations of the Economic Approach. Social Compass, 39(1), 147-157. doi:
10.1177/003776892039001014
Roper, S. (2005). The Politicization of Education: Identity formation in Moldova and
Transnistria. Communist and Post-Communist Studies, 38(4), 501-514.
Roth, S. & Schütz, A. (2015). Ten systems. Toward a canon of function systems. Cybernetics &
Human Knowing, 22(4), 11-31.
Roth, S., Clark, C., Trofimov, N., Mkrtichyan, A., Heidingsfelder, M., Appignanesi, L., Pérez-
Valls, M, Berkel, J., Kaivo-oja, J. (2017). Futures of a distributed memory: A global
brain wave measurement (1800-2000). Technological Forecasting and Social Change.
doi: 10.1016/j.techfore.2017.02.031
Schirmer, W., & Hadamek, C. (2007). Steering as Paradox: The Ambiguous Role of the Political
System in Modern Society. Cybernetics & Human Knowing, 14(2-3), 133-150.
Spring, J. (2015). The Economization of Education: Human Capital, Global Corporations, and
Skills-Based Schooling. New York: Routledge.
Tanner, H. (2009). China: A History. Indianapolis/Cambridge: Hackett Publishing Company.
Thomas, G. (2016). "The Mediatization of Religion as Temptation, Seduction, and
Illusion." Media, Culture & Society, 38(1): 37-47.
Valentinov, V. (2015). From equilibrium to autopoiesis: A Luhmannian reading of Veblenian
evolutionary economics. Economic Systems, 39(1), 143-155.
Vanderstraeten, R. (2005). System and Environment: Notes on the Autopoiesis of Modern
Society. Systems Research and Behavioral Science, 22(6), 471-481. doi:10.1002/sres.662
Wang, Z. (2008). National Humiliation, History Education, and the Politics of Historical
Memory: Patriotic Education Campaign in China. International Studies Quarterly, 52(4),
783-806. Retrieved from http://www.jstor.org.nuls.idm.oclc.org/stable/29734264
Ward, S. (2006). Functional Differentiation and the Crisis in Early Modern upper-class
Conversation: The Second Madame, Interaction, and Isolation, Seventeenth-Century
French Studies, 28:1, 235-247, DOI: 10.1179/c17.2006.28.1.235
Watson, B., & Columbia College (Columbia University). (1968). The Complete Works of
Chuang Tzu (UNESCO Collection of Representative Works. Chinese Series). New York:
Columbia University Press.
Weishan, M., Hongjun Z., & Zhangmin C. (2018). Who’s in charge of regulating the Internet in
China: The history and evolution of China’s Internet regulatory agencies. China Media
Research, 14(3), 17. Retrieved from https://search-ebscohost-com.libweb.uwlax.edu
Wirth, R., Whiddon, T., & Manson, T. (2008). What is wrong with academia today? : Essays on
the politicization of American education. Lewiston, N.Y.: Edwin Mellen Press.
Yang, S., Zhu, H., Apostoli, A., & Cao, P. (2007). N-gram statistics in English and Chinese:
similarities and differences. Paper presented at the International Conference on Semantic
Computing, 2007. ICSC 2007.
Zhang, D., & Unschuld, P. (2008). China’s barefoot doctor: Past, present, and future. The
Lancet, 372(9653), 1865-1867.
Zheng, S. (2010). “Brushes Are Weapons: An Art School and Its Artists,” in Art in Turmoil,
King, Croizier, Watson & Zheng,
Carlton Clark is a lecturer in the English Department at the University of Wisconsin, La
Crosse, where he teaches College Writing and American Literature. He earned his Master’s in
English and PhD in Rhetoric from Texas Woman’s University. With Lei Zhang, he co-edited
Affect, Emotion, and Rhetorical Persuasion in Mass Communication (Routledge, 2019). His
research has been published in Technological Forecasting and Social Change, Kybernetes, the
Journal of Organizational Change Management, and other journals. His primary research
interest is social systems theory.
Lei Zhang is Associate Professor of English and Journalism at the University of Wisconsin, La
Crosse, where she teaches rhetoric, journalism, and new media studies. She received her
Master’s degree in Journalism from the University of North Texas and her PhD in Rhetoric from
Texas Woman’s University. Her research interests include intercultural rhetorics, new media
studies, and discourse analysis. She co-edited Affect, Emotion, and Rhetorical Persuasion in
Mass Communication (Routledge, 2019). Her research has been published in Rhetoric
Review and Kybernetes, among other venues.
Steffen Roth is Full Professor of Management at the La Rochelle Business School, France, and
Research Professor of Digital Sociology at the Kazimieras Simonavičius University in Vilnius,
Lithuania. He holds a Habilitation in Economic and Environmental Sociology awarded by the
Italian Ministry of Education, University, and Research; a PhD in Sociology from the University
of Geneva; and a PhD in Management from the Chemnitz University of Technology. He is an
associate editor of Kybernetes and the field editor for social systems theory of Systems Research
and Behavioral Science. The journals his research has been published in include Journal of
Business Ethics, Journal of Cleaner Production, Administration and Society, Technological
Forecasting and Social Change, Journal of Organizational Change Management, European
Management Journal, and Futures. His ORCID profile is available at orcid.org/0000-0002-8502-
601X.
Acknowledgement: One author gratefully acknowledges financial support from the Research
Council of Lithuania and the European Regional Development Fund-ERDF/FEDER (National
R&D Project 01.2.2-LMT- K-718-02- 0019 “Platforms of Big Data Foresight
(PLATBIDAFO)”).