WHAT’S TRENDING IN THE CHINESE GOOGLE BOOKS CORPUS? A GOOGLE
NGRAM ANALYSIS OF THE CHINESE LANGUAGE AREA (1950-2008)
Carlton Clark, University of Wisconsin, La Crosse
Lei Zhang, University of Wisconsin, La Crosse
Steffen Roth, Kazimieras Simonavičius University, Vilnius, Lithuania
What’s trending? Google Trends tracks search trends every day, throughout the day.
Social media, smartphone notifications, and unwanted pop-ups keep us abreast of current trends
even when we have no interest in what’s trending on a particular day or in a particular hour. But
we might not realize, or might forget, that those surface trends often have very deep, ancient
roots. This chapter considers trends from 1950-2008 in China and compares them to trends found
in other parts of the world. We begin with remarks on the political theories and assumptions
traceable to Periclean Athens and republican Rome. We then compare this history to that of
A fundamental tenet of constitutionalism, as shaped by the political writings of Aristotle,
Cicero, Spinoza, Locke, Montesquieu, Rousseau, the authors of The Federalist Papers, and of
course many others, is that the best way to prevent tyranny is through a separation of powers.
Constitutionalism, which is evident in the political systems of Periclean Athens and republican
Rome, does not depend on a written constitution and is not limited to constitutional democracy,
but it recognizes that “power can only be controlled by power” (Gordon, 1999, p. 15). As
Montesquieu (1748/1955) wrote, “There is as yet no liberty if the power of judging be not
separated from legislative power and the executrix.” Similarly, in Federalist 47, James Madison
(1788) wrote, “The accumulation of all powers, legislative, executive, and judiciary, in the same
hands, whether of one, a few, or many, and whether hereditary, self-appointed, or elective, may
justly be pronounced the very definition of tyranny.” Constitutionalism, in other words, calls for
the differentiation of political functions, that is, functional differentiation, which is a
key concept in this essay.
Political separation of powers is just one of innumerable kinds of functional
differentiation. The human brain, for instance, has through the long course of evolution
differentiated itself according to various functions. Organizations of all types also internally
differentiate themselves according to various functions or responsibilities. Moving on to consider
society in general, Niklas Luhmann (2005, 2012, 2013) argues that modern globalized society
has differentiated itself into several autonomous function systems.
While there are debates about
what “counts” as a function system, the existence of at least a core set of such systems is
regarded as a key concept of modernity (Leydesdorff, 2002; Berger, 2003; Vanderstraeten, 2005;
Brier, 2006; Ward, 2006; Kjaer, 2010; Bergthaller & Schinko, 2011; Jönhill, 2012; Schirmer &
Hamadek, 2007; Buzon & Albert, 2010). Roth and Schütz (2015) make the case for ten function
systems: economy, politics, law, science, religion, education, health, art, mass media, and sport.
Contemporary globalized society protects the operational autonomy of the function
systems. When we speak of separation of church and state, freedom of the press, or academic
freedom, or when we promote universal basic education or healthcare, we are speaking of
functional differentiation. Thus, we invoke a “semantics of corruption” (Newbury, 2016) when
one area of society unduly influences another area, as when money corrupts politics, when
religion “infringes” on science or education, or when the mass media and politics seem to
become one. We may judge the perceived importance of functional differentiation by surveying
the literature on functional de-differentiation. To cite a small sample of these “izations,” scholars
have analyzed the politicization of art and the aestheticization of politics (Benjamin, 1936), the
politicization of education (Roper, 2005; Wirth, Whiddon & Manson, 2008), the
economization/commercialization of education (Spring, 2015; Fazl-E-Haider, 2012), healthcare
(Ewert, 2009), and culture and the liberal arts (Fludernik, 2005; Eikhof & Haunschild, 2007; De
Valick, 2014), the mediatization of politics (Kepplinger, 2002; Esser & Strömbäck, 2014), the
politicization, economization, and mediatization of religion (Thompson, 2006; Robertson, 1992;
Thomas, 2016), the politicization of science (Bolsen & Druckman, 2015), the mediatization of
culture and society (McLuhan, 1964; Hjarvard, 2013, 2013; Mazzoleni, 2008), and, of course,
the economization of everything (Lamont & McGuirk, 2017).
The German Funktionssystem is translatable as function system or functional system, and both terms appear in
English translations of Luhmann’s work.
Arguably, in this context a better word than economy is commerce, a usage that avoids associations with thrift or
the wise management of limited resources. The word commercialization is also preferable to economization because
economization already has an established definition: “the action or process of economizing” (OED). In addition to
commerce, another option is market economy, a usage adopted by some researchers (e.g., Valentinov, 2015). But in
the publications cited in this section, economization signifies the economy’s take-over of traditionally
noncommercial areas of society.
In modern China functional differentiation has unfolded differently than it has in other
parts of the world. A thorough analysis of the factors that have led to functional differentiation
developing differently in China is well beyond the scope of this chapter. Briefly, however, we
can observe that the ancient Chinese dynasties established a pyramidal social structure, “a
massive base, with higher layers of strongly diminishing size, and an apex of one single person”
(Gassman, 2003, p. 533), which is still recognizable in contemporary China. China also has a
philosophical/intellectual heritage based in the teachings of Confucius (or Kong Fuzi), Mozi,
Mencius, Zhuangzi, and Laozi (the reputed author of the Dao De Jing), along with the Legalist
thinkers. These “schools of thought,” which only became schools of thought retrospectively,
share a few core assumptions: “they regarded disunity and chaos as a problem to be solved, they
assumed that patriarchal hierarchy is natural to families and for states, and there is a natural
‘Way’ (dao) of Heaven, and that those who discover and follow the Heavenly Way will be
successful” (Tanner, 2009, p. 67). A key principle of Daoism is that change is natural and
inevitable but, ideally, orderly. Thus, good leadership, whether of the family or the state, calls for
the orderly management of change, an ideal embraced by the Chinese Communist Party.
Walter Benjamin (1936) declared, “All efforts to render politics aesthetic culminate in one thing: war.”
In other words, while thinkers such as James Madison (Federalist 47) saw competing
interests and even a degree of disorder as the antidote to tyranny, Chinese thinkers and leaders
for millennia have promoted orderly, hierarchically managed change.
Thus, while functional
differentiation, which generates autonomous, self-reproducing social systems with no central
steering mechanism, is happening in contemporary China, it goes against the grain of an ancient
cultural heritage. This bias for orderly change and top-down social control is observable in the
modern Chinese subcorpus of the Google Books Ngram Viewer, a free online graphing tool
which charts annual counts of words and word sequences as found in the largest available corpus
of digitized books.
Although this study uses the Google Books Ngram Viewer to analyze trends in ten
function systems, it is essentially a study of just one system—the mass media system—because
books (print and digitized) represent a large part of the mass media. Indeed, the mass media
system arguably began with the fifteenth-century invention of the printing press (McLuhan,
1962; Luhmann, 2007). According to Luhmann (2007), everything contemporary society knows
derives from the mass media.
The Google Books Ngram Viewer
This attitude toward change is captured in this quote from the Zhuangzi, the second most important text in Daoism.
In explaining his craft, Cook Ding says, “A good cook changes his knife once a year—because he cuts. A mediocre
cook changes his knife once a month—because he hacks. I've had this knife of mine for nineteen years and I've cut
up thousands of oxen with it, and yet the blade is as good as though it had just come from the grindstone” (Watson,
One way to measure the evolution of functional differentiation, as well as movements
toward de-differentiation, is by analyzing the contents of a large corpus of digitized books. Roth,
Clark, Trofimov, et al. (2017) used the Google Books Ngram Viewer to engage in a form of
computational sociology known as culturomics, “a field of study that uses numerical analysis of
large volumes of data to investigate culture” (Matthews, 2014). Michel, Shen, Aiden, et al.
(2011) explain their creation of the Google Books Ngram Viewer as follows:
We constructed a corpus of digitized texts containing 4% of all books ever
printed. Analysis of this corpus enables us to investigate cultural trends
quantitatively. [. . .] [T]his approach can provide insights about fields as diverse as
lexicography, the evolution of grammar, collective memory, the adoption of
technology, the pursuit of fame, censorship, and historical epidemiology.
Culturomics extends the boundaries of rigorous quantitative inquiry to a wide
array of new phenomena spanning the social sciences and the humanities.
This tool allows users to search a large corpus of books and obtain a visual representation of
cultural trends, the Ngram. While Google Books contains over 15 million digitized books, the
creators of the Ngram Viewer selected 5,195,769, or approximately 4% of all books ever
published. These books contain well over 500 billion words, including English (361 billion),
French (45 billion), Spanish (45 billion), German (37 billion), Chinese (13 billion), Russian (35
billion), and Hebrew (2 billion).
The Ngram Viewer produces time series graphs, also known as time plots, which are data
visualizations that show data points at successive intervals of time. Each point on the
chart corresponds to both a time and a quantity that is being measured. These graphs are used to
identify, model, and forecast patterns and behaviors in data that is sampled over discrete time
intervals. The Ngram Viewer allows us to compare the levels of communication on economic
topics, political topics, legal topics, educational topics, scientific topics, artistic topics, religious
topics, and so forth. Using this tool, Roth, Clark, Trofimov, et al. (2017) analyzed key concepts
of functional differentiation in the English, Spanish, French, German, Italian, and Russian
subcorpora. Each of the ten function systems was represented by five keywords. According to
Roth, Clark, Trofimov, et al. (2017),
The selection of the five most important keywords per function system was a
multistep mix-methods process. First, we relied on a small collection of Python
scripts that generate word frequency lists based on the Google Ngram dataset. In
our case, we created lists of the 10,000 most frequent words per investigated
language area. We then manually scanned these lists for words that refer to one
and only one of the 10 function systems, whereby each list was screened by at
least two colleagues. The major challenge in this context was to identify n-grams
that unambiguously refer to not more than one function system. [. . .] We then
picked the five most frequent keywords per function system and combined them
to strings such as (business+economic+money+company+cost). If entered into the
Google Ngram Viewer, each such string creates one single graph that represents
the combined performance of all keywords, which in this case presents the
combined performance of the five strongest indicators for the economy. (pp. 7-8)
Roth, Clark, Trofimov, et al. (2017) limited the Google Books sample period to the years
1800-2000 because the data is most reliable after 1800 and because the research concerned
macrotrends in modern societies. This chapter covers the simplified Chinese subcorpus for
the years 1950-2008.
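The combination of keywords with plus signs, as described in the quoted passage, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `combine_series` is hypothetical, the per-year values are made-up placeholders rather than Google Books data, and the sketch assumes that the plus operator simply sums the per-year relative frequencies of its operands.

```python
from collections import defaultdict


def combine_series(series_by_keyword):
    """Sum per-year relative frequencies across keywords into one series.

    `series_by_keyword` maps each keyword to a {year: frequency} dict.
    A keyword with no value for a year contributes nothing to that year.
    """
    combined = defaultdict(float)
    for yearly in series_by_keyword.values():
        for year, freq in yearly.items():
            combined[year] += freq
    # Return a plain dict ordered by year for easy plotting or printing.
    return dict(sorted(combined.items()))


# Placeholder values for illustration only (not real corpus frequencies).
toy = {
    "business": {1999: 0.010, 2000: 0.012},
    "money": {1999: 0.020, 2000: 0.018},
}
combined = combine_series(toy)
```

Because the combined line is a sum, a single very frequent keyword can dominate it, which is one reason the choice of keywords per function system matters.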
This research offers insight into whether Chinese society between 1950 and 2008 (not
just the People’s Republic of China but the Chinese language area) experienced significant trends in
discourse related to politics, law, economy, education, science, mass media, religion, health,
art, or sport, the ten function systems covered in Roth and Schütz (2015). Our contention is that
the Google Books ngrams discussed below reveal traces of social change in modern China. We
will first discuss a few of the problems we encountered in charting the Chinese-language Google
Books subcorpus. Next, we will offer interpretations of the observed trends. Finally, we discuss a
number of questions that arise from the ngrams and that may be explored in further research.
The first methodological problem arose from the fact that following the founding of the
People’s Republic of China (PRC) in 1949, the Chinese government, building on earlier efforts,
revised the traditional Chinese script to create simplified Chinese. Google has predominately
scanned books written in simplified Chinese. Therefore, rather than starting at the year 1800 as in
Roth, Clark, Trofimov, et al. (2017), this project considered only books published from 1950 to
2008. The second problem was that it is very difficult to create a reliable Chinese
word-frequency list. To help people learn entry-level Chinese, attempts have been made to create such
lists. For example, in 1988, the National Working Committee on Languages and Writing
Systems (国家语言文字工作委员会) and the Chinese Ministry of Education created a List of
Frequently Used Characters in Modern Chinese (现代汉语常用字表), which includes 3,500
characters. These characters are divided into groups based on the number of strokes each
character comprises. Yet it is not known how these characters are ranked among themselves.
Another list, the Modern Chinese Character Frequency List (现代汉语单字频率列表), has been
compiled by Professor Jun Da. This list contains 9,933 unique characters. These two lists,
unfortunately, are not useful for our purposes because single characters can take on completely
different meanings when joined to one or more other characters to form words or phrases.
Compared to a character-frequency list, Chinese word-frequency information is even
more difficult to compile because, unlike words in Western writing systems, Chinese words are
not separated by white space. A single Chinese word or phrase may contain up to five
characters. Researchers have not yet developed software that can reliably segment Chinese
words and phrases. The software might misread a meaningless combination of characters as a
word: a problem we encountered when using the Google Books Ngram Viewer. Consequently,
we could not use an existing word-frequency list or character-frequency list.
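The segmentation problem described above can be made concrete with a small sketch. The function `segmentations` and the five-entry lexicon are hypothetical and serve only as an illustration; this is not the procedure used by the Ngram Viewer or by any particular segmentation tool. The sketch enumerates every way of splitting an unspaced string into entries of a given lexicon, showing that the same characters can yield more than one valid reading.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into consecutive lexicon entries."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep this prefix and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([head] + rest)
    return results


# Toy lexicon, assumed for illustration only.
toy_lexicon = {"研究", "研究生", "生命", "命", "起源"}
candidates = segmentations("研究生命起源", toy_lexicon)
for words in candidates:
    print(" / ".join(words))
```

With this toy lexicon the string has two segmentations, one beginning with 研究 (research) and one beginning with 研究生 (graduate student), so a frequency count depends on which split is chosen.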
Roth, Clark, Trofimov, et al. (2017) started with word-frequency lists and then used a multistep,
mixed-method process to find the top five keywords for each function system in each language
area. When it came to Chinese, without a reliable word-frequency list, one option could have
been to look for the best translations of the key terms used in the previous study. But that method
was unworkable because each language has its own keywords for each function system; in other
words, the function systems are the same but the keywords are different. To be more precise, we
should speak of lexemes rather than words. Although there is some debate among linguists
regarding the meaning or significance of this concept (Bonami, 2018), lexeme may be defined as
the unit of a language “intermediate between morpheme and utterance” (OED). It is a single word,
a part of a word, or a chain of words (e.g., dog, stop sign, by the way, raining cats and dogs) that
forms a basic, meaningful element of the lexicon of a language. Thus, to solve our
methodological problem, one option was to settle on a set of Chinese lexemes that seem, in the
judgment of well-educated native speakers, to fit the function systems; but such an effort would
still be too arbitrary.
After casting about for solutions, we decided to start with the best translations for the
function systems themselves: economy, politics, law, education, science, mass media, art, etc.
But a single Chinese lexeme, even when it consists of two, three, or four characters, can have
many different meanings when combined with adjacent characters, and there is no white space
between these adjacent lexemes, or lexical units, of a sentence. A further difficulty is that to aid
in the processing of characters, the Ngram Viewer inserts white space where it does not actually
exist in the scanned texts. In fact, sometimes the user must insert white space between lexemes in
order to make the Ngram Viewer accept a legitimate lexeme, a time-consuming, frustrating process.
Confirmation that single-character Chinese lexemes, or 1-grams, are not useful for
Google Ngram searches comes from Yang et al. (2007), researchers at Google. They compared
English and Chinese Ngrams and found that while 1-grams in Chinese are very different from
1-grams in English, multigrams (3-grams and up) share nearly identical frequency distributions in
English and Chinese. For example, Jun Da translates the character 学 as learn/study/science/-ology.
If we try to compare the frequencies of this character and the English lexeme science, we
will not get a useful result. We get a more useful result if we compare a three- or four-character
Chinese lexeme with a three- or four-word English lexeme. This finding suggests that we should
look at longer strings of Chinese characters. As shown in Table 2, many of the Chinese lexemes
we used consisted of four characters. For example, one of the lexemes we used under Economy
was 市场经济, which translates as market economy.
Knowing, then, that translations of single words (1-grams) like politics, economy, and
law alone would be inadequate, we started experimenting with Google’s wildcard search
function. The wildcard search works as follows: When one puts an asterisk in place of a word,
the Google Ngram Viewer displays the top ten substitutions for the asterisk. To illustrate this
function, we performed the following wildcard search for the English word science from 1900 to
2008. When we place the asterisk in front of science, we get the following chart:
Figure 1. Wildcard term preceding science.
And when we put the asterisk after science, we get this chart:
Figure 2. Wildcard term following science.
We may discount the determiners, prepositions, conjunctions, and verbs because they do not
produce significant 2-grams, leaving the top four two-grams from Figures 1 and 2 in Table 1:
Table 1: Top four 2-grams produced from science wildcard searches (percentage of 2-grams in the Google Books English language corpus for the year 2008).
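The logic of the wildcard search, including the step of discounting function words, can be sketched as follows. The function `wildcard_top`, its parameters, and the bigram counts are all hypothetical and for illustration only; they are not Ngram Viewer internals or actual corpus counts.

```python
def wildcard_top(bigram_counts, anchor, position="before", k=10, exclude=()):
    """Return the k most frequent fillers for the wildcard slot.

    position="before" models the query "* anchor";
    position="after" models the query "anchor *".
    Words listed in `exclude` (e.g. prepositions) are skipped.
    """
    matches = {}
    for (first, second), count in bigram_counts.items():
        if position == "before" and second == anchor:
            filler = first
        elif position == "after" and first == anchor:
            filler = second
        else:
            continue
        if filler not in exclude:
            matches[filler] = count
    ranked = sorted(matches.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up counts for illustration only.
toy_counts = {
    ("social", "science"): 50,
    ("of", "science"): 80,
    ("political", "science"): 30,
    ("science", "and"): 90,
    ("science", "fiction"): 20,
}
before = wildcard_top(toy_counts, "science", "before", k=2, exclude={"of", "and"})
after = wildcard_top(toy_counts, "science", "after", k=2, exclude={"of", "and"})
```

In this toy example the exclusion list plays the role of the manual step of setting aside determiners, prepositions, conjunctions, and verbs.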
In Roth, Clark, Trofimov, et al. (2017), we found (through the method described previously) that
in English, the top five keywords for the science system were system, method, theory, research,
and analysis. When the user inserts plus signs between lexemes, the Ngram Viewer combines all
the terms into a single time-series plotline. In Figures 3 and 4, we compare the method used in
Roth, Clark, Trofimov, et al. (2017) with the wildcard method.
Figure 3. Ngram for the combined keywords for the science system (in the English language
area) used in Roth, Clark, Trofimov, et al. (2017).
Figure 4. Science-associated lexemes found through wildcard search.
Because single-word lexemes (1-grams or unigrams) will occur more frequently than
two-word lexemes (2-grams or bigrams), the frequency percentages are over 100 times
greater in Figure 3 than in Figure 4; for example, theory (one of the science keywords used in
Roth et al., 2017) occurs over 45 times more frequently than social science (see Figure 5). Consequently, we
could not fit both plotlines in the same chart. But if we ignore the difference in magnitude and
focus on the trends, which are more important, we see a striking similarity. Both lines rise
steadily from 1800 to 1900, and then more sharply (i.e., the rising trend accelerates) through most
of the 20th century. In Figure 4, the rising trend steepens in 1950, while in Figure
3, the trend continues to rise during the same years. In Figure 3, the trend peaks in 1978, and in
Figure 4 it peaks in 1984. Both lines then decline (for unknown reasons) until the 2008 cutoff.
Based on the similar shape of the plotlines in Figures 3 and 4, we concluded that the wildcard
search method would be a valid substitute for the method used in Roth, Clark, Trofimov, et al.
(2017), which, as explained above, would not work at all for Chinese.
The reader may notice that in both charts we set the smoothing number at ten rather than the default of three.
Smoothing can be useful because if a lexeme occurs in a single book in one year but not in the preceding or
following years, that creates a spike, and the spike is taller in early years, when fewer books were published, than
it would be in later years. In other words, tall spikes would present a distorted view of the actual trend. No
smoothing, or a smoothing of zero, would reflect raw data, i.e., no visual patterns at all (Michel, Shen, Aiden, et al., 2011).
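The effect of the smoothing setting mentioned in the note above can be illustrated with a centered moving average. This is a minimal sketch: the function `smooth` is hypothetical, the input values are placeholders, and the handling of the first and last years (averaging over whatever neighbours exist) is an assumption rather than a documented detail of the Ngram Viewer.

```python
def smooth(values, n):
    """Centered moving average with up to n neighbours on each side.

    With n = 0 the raw values are returned unchanged. Near the ends of
    the series the window shrinks to the values that are available.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - n)
        hi = min(len(values), i + n + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Placeholder series: a lexeme that appears in one year only.
raw = [0.0, 0.0, 9.0, 0.0, 0.0]
smoothed = smooth(raw, 1)
print(smoothed)  # [0.0, 3.0, 3.0, 3.0, 0.0]
```

The single-year spike of 9.0 is spread over three years at a height of 3.0, which is why a larger smoothing value makes long-run trends easier to read at the cost of blurring the exact year of a peak.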
Figure 5. Ngram for theory and social science, 1950-2008: theory (blue), social science (red)
Now for the Chinese language area. As stated above, we found that there was no perfect
method, but we determined the most promising method was as follows: First, a native speaker
determined the best Chinese translations for the ten English words for the function system
names: economy, politics, law, science, education, health, sport, art, mass media, and religion.
Second, because, as discussed above, we knew these translations alone would be inadequate,
we used the wildcard search method. Additionally, given the processing limitations of the
Google Ngram Viewer, we decided that a string of four Chinese lexemes, rather than five as used
in Roth, Clark, Trofimov, et al. (2017), would be adequate.
The colors will only appear in an online version of this chapter.
Table 2 lists the ten function systems with their lexeme combinations in Chinese and
English ranked based on the percentage of occurrence in the Google Books simplified Chinese
subcorpus. Figures 6 and 7 are the Chinese ngrams for the years 1950-2008 for the top five and
bottom five function systems. We also compare the year 2000 rankings of English, Spanish, and
French languages from Roth, Clark, Trofimov, et al. (2017) with the Chinese rankings for the
same year. The reader may refer to that article for the rankings of the Italian, German, and
Russian language areas.
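The ranking step behind such comparisons can be sketched as a sort of the combined frequencies for a chosen year. The function `rank_systems`, the labels, and the values below are hypothetical placeholders and do not reproduce any figure from the tables in this chapter.

```python
def rank_systems(freq_by_system, year):
    """Rank systems by their combined frequency in `year`, highest first.

    Systems with no value for that year are left out of the ranking.
    """
    scored = [
        (name, series[year])
        for name, series in freq_by_system.items()
        if year in series
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored]


# Placeholder series for three unnamed systems (illustration only).
toy = {
    "system_a": {2000: 0.004, 2008: 0.009},
    "system_b": {2000: 0.006, 2008: 0.005},
    "system_c": {2000: 0.001, 2008: 0.002},
}
ranking_2000 = rank_systems(toy, 2000)
ranking_2008 = rank_systems(toy, 2008)
```

Running the same function for two different years shows how a system can change rank over time even when every series is rising.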
Percentage in 2008 among all lexemes in Google Ngram Viewer Simplified Chinese Corpus (rounded to the
Function system | Chinese lexemes | English translations
Economy | 市场 经济 + 国民 经济 + 经济发展 + 经济增长 + 经济 | market economy, national economy, economic development, economic
Science | 社会 科学 + 科学 发展 + 科学 技术 + 科学 研究 | social science + scientific development + science & technology + scientific research
Education | 教育 的 + 高等教育 + 教育部 + | educational, higher education, Ministry of Education +compulsory
Law | 法律 的 + 法律 法规 + 法律 制度 + 有关 法律 | legal, legal regulations, legal system,
Politics | 中央 政治 + 政治 体系 + 民主 政治 + 政治 家 | Central politics, political system, democratic politics, politician
Health | 医疗 机构 + 医疗卫生+ 合作医 疗 | medical institutions, medical hygiene, cooperative medical treatment, basic medical care
Art | 文化艺术 + 文学艺术 + 艺术作品 | Culture & art, literature & art, art work, forms of art
Religion | 宗教 信仰 + 宗教 活动 + 民族宗教 | religion, religious activities, national religion, religious affairs
Mass media | 新闻媒体 + 新媒体 + 网络媒体 + | news media, new media, Internet media, traditional media
Sport | 体育活动 + 体育 运动 +体育事业 + | sports activities, sports, sports career,
Table 2. Ten function systems in English and simplified Chinese, year 2008
Figure 6: Google Books Ngram 1950-2008 for economy (blue), science (red), education (green),
law (orange), politics (purple)
Figure 7: Google Books Ngram 1950-2008 for health (blue), art (red), religion (green), mass
media (orange), sports (purple)
Table 3. Comparative ranking of ten function systems for the year 2000 in English, Spanish,
French and Chinese. For Chinese, the only change between 2000 and 2008 is that mass media
The most noticeable line in Figure 6 is the Economy line,
which indicates a sustained
rise of economically (or commercially) oriented communication from 1970 to 2003. In 1950,
Economy is in third place. In 1967 it enters second place, and in 1979 it overtakes Science to rise to
first place. It drops in 2003 but remains well above the other four lines in 2008.
In this section, we capitalized the function system names to distinguish them from other uses of the words
economy, politics, law, etc.
In trying to interpret this trend, we note that from 1978 until his retirement in 1992, Deng Xiaoping,
as Paramount Leader, oversaw major economic policy reforms, which remained in place under
the PRC presidencies of Jiang Zemin (1993-2003) and Hu Jintao (2003-13). The decline from
2003 to 2008 may be related to the fact that Hu Jintao, after taking note of the increasing gap
between rich and poor, de-emphasized wealth creation and shifted to the Harmonious Society
theme. Another possible reason for the decline of the Economy line from 2003 to 2008 is that the
economic reforms were no longer news.
Moving on to Science, this line is steady from 1950 to 1971; it then rises until the 1990s and
drops slightly, but in 2008 it remains well above its 1971 level. This trend is consistent with
the historical record. One of the four lexemes for science was science & technology. Clearly,
with the exception of most of the 1960s, the CCP saw science and technology as essential to
China’s emergence as a world power.
The Education and Law lines don’t give us much to go on. These two lines remain
parallel to each other from 1950 to 2008. Law drops slightly in 1966, the beginning of the
Cultural Revolution, which was a particularly lawless time.
Education has always been
important in China, and law has gained in importance as China’s economy has developed. But
the fact that Politics is in fifth position and never rises higher than fourth is worth considering.
The Education line rises modestly but steadily from 1983 to 1998. Conventional academic
education has always, with the exception of the Cultural Revolution years, been a high priority
in China, thus accounting for its ranking of third behind Economy and Science in 2008.
Some histories date the Chinese Cultural Revolution from 1966-76; however, it formally ended in 1969 when “the
Ninth Party Congress declared that the Cultural Revolution had come to a victorious conclusion” (Tanner, 2009, p.
530). The year 1976 is often cited because that was the year of Mao’s death and the end of ten years of factional
struggle within the CCP leadership.
The Law line rises modestly but consistently from 1975-2008. Law is the only line that
never drops after 1975. This fact is likely associated with the movement to strengthen the
Chinese legal system following the chaos of the Cultural Revolution. Deng Xiaoping and other
reformists recognized that a stronger legal system was necessary for continued economic growth
and social stability. As Tanner (2009) writes,
One of their first priorities in 1979 was to have the National People’s Congress
approve China’s first post-1949 Criminal Law and Criminal Procedure Law. [. . .]
A number of other laws followed during the 1980s, many of them designed to
deal with the emerging market economy: the Inheritance Law (1985), Civil Law
(1986), and Bankruptcy Law (1986), laws concerning private enterprises, joint
enterprises, foreign investment, business taxes, and so on. (p. 551)
Politics sits in fifth position. This finding contrasts with the prominent position of Politics
in previous Google Ngram studies. For instance, in Roth, Clark, Trofimov, et al. (2017), the
political function system has ranked first in the English, Spanish, French, German, Italian, and
Russian language areas since at least the early twentieth century. It has ranked first in the
Spanish language area since 1840 and first in the English area since 1880. But given that the
CCP, like Chinese rulers in the past, does not encourage the general Chinese public to take an
active interest in politics (Tanner, 2009), the relatively low position of the Politics line is
unsurprising. Nonetheless, the stark contrast between the positions of Politics in the modern
Chinese Google Books corpus and the English, Spanish, French, German, Italian, and Russian
language areas warrants further study.
In Figure 7, the most prominent line is for the Art system, which resembles a mountain range
with peaks in 1962 and 1983, intervening valleys, and finally a plateau from 2000 to 2008. The
sharp decline from 1962 to 1972 occurs following the Great Leap Forward (1957-60) and the famine
years (1960-62) and during the Cultural Revolution (1966-69) and its aftermath. Most artworks
produced during the Cultural Revolution were propaganda pieces created anonymously and
collectively, and most of them were destroyed or thrown away after Mao’s death in September
1976 (Zheng, 2010). However, some of the artwork was preserved, and in 2002 an exhibition
titled Art of the Great Proletarian Cultural Revolution, 1966-1976, appeared in Vancouver,
Canada (King, Croizier, Watson & Zheng, 2010). Although this exhibition covered the years
1966-76, the fact that the Art line starts to rise in 1973 indicates that the art function system had
started to recover its autonomy at least three years before 1976, the year some histories mark as
the end of the Cultural Revolution. But, as mentioned previously, the Cultural Revolution
officially ended in 1969.
The second interesting line in Figure 7 is Health, which rises from 1963-74, then falls for
a decade before rising again in 1984. The rise from 1963-73 is likely related to the Barefoot
Doctors initiative, which, as part of the Cultural Revolution, brought affordable healthcare to the
countryside. Until the mid-1960s Chinese healthcare resources had been directed mostly to urban
areas. But, according to Zhang & Unschuld (2008),
Mao Zedong criticised the urban bias of medical services and pointed out the
stress placed on rural areas in 1965, [and] mobile teams of doctors from urban
hospitals were sent to deliver health care and train indigenous paramedics. [. . .]
[T]he barefoot doctor programme effectively reduced costs and provided timely
treatment to the rural people. [. . .] Reforms in the health-care system in the early
1980s [. . .] resulted in the collapse of the cooperative medical system to a
payment-based system of medical care in rural areas. The percentage of villages
with a cooperative medical system fell from 90% in the 1960s to 5% by 1985.
It is noteworthy that the third and fourth lexemes for the Health function system are medical
hygiene and cooperative medical care, terms clearly associated with the rural healthcare
As for Religion, this trend line dropped overall between 1950 and the early 1970s,
but it has risen consistently since 1975. This rise likely relates to the general opening-up of Chinese
society in the post-Mao decades.
The Mass Media line is interesting because, although the system does register very
slightly in the early 1950s, the line only really appears in 1991 and then rises steadily up to 2008.
The rise of mass-media-oriented communication may be related to a loosening of CCP
oversight and the allowance of a relatively independent print news industry, which have occurred along
with the economic reforms (Tanner, 2009; Clark & Zhang, 2017). But the rise of the Mass Media
line is also clearly related to the emergence of the Internet, as indicated by three of the lexemes under
Mass Media: new media, Internet media, and traditional media. With strong backing from the
government, on April 20, 1994, a 64K international dedicated line to the Internet with full
functional Internet accessibility was set up in China (Weishan, Hongjun & Zhangmin, 2018). In
July 2008, China surpassed the United States in total Internet users (Barboza, 2008).
In tenth position is Sport, which, in comparison to Economy, Politics, and Law, is a minor
function system; it ranks last among the ten function systems in each of the language areas
covered in Roth, Clark, Trofimov, et al. (2017).
We close this section with a few comments on Table 3. The comparison of the English,
Spanish, French, and Chinese language areas as represented by Google Books Ngram Viewer
shows that Politics and Economy basically exchanged positions. Economy and Politics rank first
and fourth, respectively, in the Chinese ngram. In contrast, in English, Spanish, and French,
Politics ranks first and Economy ranks no higher than fourth. Although to save space we did not
include German, Italian, and Russian, in Roth, Clark, Trofimov, et al. (2017), Politics ranks first
in each of those languages and Economy ranks no higher than third (Russian). These results
indicate that economically or commercially related topics are discussed far more frequently in
books published in modern Chinese than in books published in other major languages, whereas
politically related topics appear far less frequently. Another area of contrast is education-related
communication. In Table 3, Education ranks second on the Chinese list but no higher than
seventh for English, Spanish, and French. The rankings for Education are similar in German,
Italian, and Russian, as shown in Roth, Clark, Trofimov, et al. (2017). We don’t know, however,
how much education-related discourse in China is patriotic education and how much is
conventional academic education.
This chapter has analyzed trends in communication, specifically in the medium of books,
related to ten function systems identified in social systems theory—politics, law, economy,
education, science, mass media, religion, health, art, and sport—in post-1949 China. By charting
these trends with the aid of the Google Books Ngram Viewer, we have been able to track social
change in modern China. This study builds on previous research on the English, Spanish, French,
German, Italian, and Russian subcorpora of Google Books, and it makes a unique
contribution by using the wildcard search function to get around the difficulties presented by the
Chinese writing system.
The Google Books Ngram tool offers suggestive results that invite further, more fine-
grained research. That is to say, rather than being an end in itself, this kind of big data research
generates new research questions that may be explored through other methods. For example,
why does political communication (the Politics trendline) rank significantly lower on the list of
function systems for the modern Chinese language area than for the English, Spanish,
French, German, Italian, and Russian language areas, as found in Roth, Clark, Trofimov, et al. (2017)? This
study offers evidence that books published in Chinese from 1950-2008 discuss political topics
less frequently than do books published in other major world languages. But books, of course,
are only one medium, and a very small percentage of people actually publish books. Are political
topics actually discussed less frequently outside of books among Chinese speakers in comparison
with speakers of these other languages? Researchers might also investigate the following
questions: Is commercially oriented communication (the economy function system) actually
more common or more valued among Chinese speakers in comparison to speakers of other major
world languages? Has modern China devoted a large part of its educational resources to patriotic
education in comparison to conventional education? What factors are behind the increasing
importance of a standardized, less politicized legal system in contemporary China? Did the
resources devoted to affordable healthcare actually rise sharply from 1963-71 and then decline
significantly from 1974-83? And if so, why? These are just a few of the questions provoked by
this kind of big-data research.
Barboza, D. (July 26, 2008). China Surpasses U.S. in Number of Internet Users. The New York
Times. Retrieved from https://advance-lexis-
Benjamin, W. (1936/1968). Illuminations (H. Arendt, Ed.; H. Zohn, Trans.). New York:
Harcourt, Brace & World.
Bergthaller, H., & Schinko, C. (2011). Introduction: From National Cultures to the Semantics of
Modern Society. In H. Bergthaller & C. Schinko (Eds.), Addressing Modernity. Social
Systems Theory and U.S. Cultures. Amsterdam: Edition Rodopi.
Bolsen, T. & Druckman, J. (2015). Counteracting the Politicization of Science. Journal of
Communication, 65(5), 745–769. https://doi-
Bonami, O. (Ed.). (2018). The Lexeme in Descriptive and Theoretical Morphology (Empirically
oriented theoretical morphology and syntax, 4). Berlin, Germany: Language Science
Brier, S. (2006). Construction of Knowledge in the Mass Media. Systemic Problems in the Post-
Modern Power-Struggle between the Symbolic Generalized Media in the Agora: The
Lomborg Case of Environmental Science and Politics. Systems Research and Behavioral
Science, 23(5), 667–684. doi:10.1002/sres.793
Clark, C. & Zhang, L. (2017). Grass Mud Horse: Luhmannian Systems Theory and Internet
Censorship in China. Kybernetes: The International Journal of Cybernetics, Systems and
Management Sciences, 46(5). doi:10.1108/K-02-2017-0056
Da, Jun. Modern Chinese Character Frequency List. http://lingua.mtsu.edu/chinese-
De Valick, M. (2014). Film Festivals, Bourdieu, and the Economization of Culture. Revue
Canadienne D’études Cinématographiques / Canadian Journal of Film Studies, 23(1),
Eikhof, D. R., & Haunschild, A. (2007). For Art’s Sake! Artistic and Economic Logics in
Creative Production. Journal of Organizational Behavior, 28(5), 523–538.
Esser, F., & Strömbäck, J. (Eds.). (2014). Mediatization of Politics: Understanding the
Transformation of Western Democracies. Palgrave Macmillan.
Ewert, B. (2009). Economization and Marketization in the German Healthcare System: How Do
Users Respond? German Policy Studies/Politikfeldanalyse, 5(1), 21-44.
Fazl-E-Haider, S. (2012). Commercialization of Education. Pakistan & Gulf Economist, 31(28),
1-2. Retrieved from https://libweb.uwlax.edu/
Fludernik, M. (2005). Threatening the University—The Liberal Arts and the Economization of
Culture. New Literary History, 36(1), 57–70. doi:10.1353/nlh.2005.0019
Google Books Ngram Viewer. https://books.google.com/ngrams/info
Gordon, S. (1999). Controlling the State: Constitutionalism from Ancient Athens to Today.
Cambridge, MA: Harvard University Press.
Hjarvard, S. (2013). The Mediatization of Culture and Society. New York: Routledge.
Jönhill, J. I. (2012). Inclusion and Exclusion—A Guiding Distinction to the Understanding of
Issues of Cultural Background. Systems Research and Behavioral Science, 29(4), 387–
Kepplinger, H. (2002). Mediatization of Politics: Theory and Data. Journal of
Communication, 52(4), 972-986.
Kjaer, P. F. (2010). The Metamorphosis of the Functional Synthesis: A Continental European
Perspective on Governance, Law, and the Political in the Transnational Space. Wisconsin
Law Review, 2, 489–1555.
Lamont, V. & McGuirk, K. (2017). Introduction: Culture and the Economization of
Everything. Canadian Review of American Studies, 47(2), 161–170. Project
Leydesdorff, L. (2002). The Communication Turn in the Theory of Social Systems. Systems
Research and Behavioral Science, 19(2), 129-136. doi:10.1002/sres.453
Luhmann, N., & Bednarz, J. (2005). Social Systems (Reprinted ed., Writing science). Stanford,
Calif.: Stanford University Press.
Luhmann, N. (2007). The Reality of the Mass Media (Reprinted ed.). Cambridge: Polity Press.
Luhmann, N. (2012). Theory of Society, Vol 1. Rhodes Barrett, trans. Palo Alto, CA: Stanford
Luhmann, N. (2013). Theory of Society, Vol 2. Rhodes Barrett, trans. Palo Alto, CA: Stanford
King, R., Croizier, R., Watson, S., & Zheng, S. (2010). Art in Turmoil: The Chinese Cultural
Revolution, 1966-76 (Contemporary Chinese studies). Vancouver: UBC Press.
Madison, J. (1788) Federalist 47. The Particular Structure of the New Government and the
Distribution of Power Among Its Different Parts.
stPapers-47. Accessed July 21, 2019.
Matthews, J. (2014). Encyclopedia of Environmental Change. Thousand Oaks, California: SAGE
Mazzoleni, G. (2008). Mediatization of Society. In W. Donsbach (Ed.), The International
Encyclopedia of Communication. Hoboken, NJ: Wiley.
McLuhan, M. (1962). The Gutenberg Galaxy: The Making of Typographic Man. Toronto:
University of Toronto Press.
Michel, J.-B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Pickett, J. P., & Aiden, E. L. et
al. (2011). Quantitative Analysis of Culture Using Millions of Digitized Books. Science,
331(6014), 176–182. doi:10.1126/science.1199644 PMID:21163965
Modern Chinese Word List. 2500 Words. http://www.zdic.net/z/zb/cc1.htm
Montesquieu, C., Baron de, & Rousseau, J. (1955). The Spirit of Laws (Great books of the
western world, v. 38) (T. Nugent & G. Cole, Trans.; J. Prichard, Ed.). Chicago:
Newbury, C. (2016). The Semantics of Corruption: Political Science Perspectives on Imperial
and Post-Imperial Methods of State-Building. Journal of Imperial & Commonwealth
History, 44(1), 163–194. https://doi-
Robertson, R. (1992). The Economization of Religion? Reflections on the Promise and
Limitations of the Economic Approach. Social Compass, 39(1), 147–157. doi:
Roper, S. (2005). The Politicization of Education: Identity formation in Moldova and
Transnistria. Communist and Post-Communist Studies, 38(4), 501-514.
Roth, S. & Schütz, A. (2015). Ten systems. Toward a canon of function systems. Cybernetics &
Human Knowing, 22(4), 11–31.
Roth, S., Clark, C., Trofimov, N., Mkrtichyan, A., Heidingsfelder, M., Appignanesi, L., Pérez-
Valls, M, Berkel, J., Kaivo-oja, J. (2017). Futures of a distributed memory: A global
brain wave measurement (1800–2000). Technological Forecasting and Social Change.
Schirmer, W., & Hadamek, C. (2007). Steering as Paradox: The Ambiguous Role of the Political
System in Modern Society. Cybernetics & Human Knowing, 14(2-3), 133-150.
Spring, J. (2015). The Economization of Education: Human Capital, Global Corporations, and
Skills-Based Schooling. New York: Routledge.
Tanner, H. (2009). China: A History. Indianapolis/Cambridge: Hackett Publishing Company.
Thomas, G. (2016). The Mediatization of Religion – as Temptation, Seduction, and
Illusion. Media, Culture & Society, 38(1), 37–47.
Valentinov, V. (2015). From equilibrium to autopoiesis: A Luhmannian reading of Veblenian
evolutionary economics. Economic Systems, 39(1), 143-155.
Vanderstraeten, R. (2005). System and Environment: Notes on the Autopoiesis of Modern
Society. Systems Research and Behavioral Science, 22(6), 471–481. doi:10.1002/sres.662
Wang, Z. (2008). National Humiliation, History Education, and the Politics of Historical
Memory: Patriotic Education Campaign in China. International Studies Quarterly, 52(4),
783-806. Retrieved from http://www.jstor.org.nuls.idm.oclc.org/stable/29734264
Ward, S. (2006). Functional Differentiation and the Crisis in Early Modern Upper-Class
Conversation: The Second Madame, Interaction, and Isolation. Seventeenth-Century
French Studies, 28(1), 235–247. doi:10.1179/c17.2006.28.1.235
Watson, B., & Columbia College (Columbia University). (1968). The Complete Works of
Chuang Tzu (UNESCO Collection of Representative Works. Chinese Series). New York:
Columbia University Press.
Weishan, M., Hongjun, Z., & Zhangmin, C. (2018). Who’s in charge of regulating the Internet in
China: The history and evolution of China’s Internet regulatory agencies. China Media
Research, 14(3), 1–7. Retrieved from https://search-ebscohost-com.libweb.uwlax.edu
Wirth, R., Whiddon, T., & Manson, T. (2008). What is wrong with academia today?: Essays on
the politicization of American education. Lewiston, N.Y.: Edwin Mellen Press.
Yang, S., Zhu, H., Apostoli, A., & Cao, P. (2007). N-gram statistics in English and Chinese:
similarities and differences. Paper presented at the International Conference on Semantic
Computing, 2007. ICSC 2007.
Zhang, D., & Unschuld, P. (2008). China’s barefoot doctor: Past, present, and future. The
Lancet, 372(9653), 1865-1867.
Zheng, S. (2010). “Brushes Are Weapons: An Art School and Its Artists,” in Art in Turmoil,
King, Croizier, Watson & Zheng,
Carlton Clark is a lecturer in the English Department at the University of Wisconsin, La
Crosse, where he teaches College Writing and American Literature. He earned his Master’s in
English and PhD in Rhetoric from Texas Woman’s University. With Lei Zhang, he co-edited
Affect, Emotion, and Rhetorical Persuasion in Mass Communication (Routledge, 2019). His
research has been published in Technological Forecasting and Social Change, Kybernetes, the
Journal of Organizational Change Management, and other journals. His primary research
interest is social systems theory.
Lei Zhang is Associate Professor of English and Journalism at the University of Wisconsin, La
Crosse, where she teaches rhetoric, journalism, and new media studies. She received her
Master’s degree in Journalism from the University of North Texas and her PhD in Rhetoric from
Texas Woman’s University. Her research interests include intercultural rhetorics, new media
studies, and discourse analysis. She co-edited Affect, Emotion, and Rhetorical Persuasion in
Mass Communication (Routledge, 2019). Her research has been published in Rhetoric
Review and Kybernetes, among other venues.
Steffen Roth is Full Professor of Management at the La Rochelle Business School, France, and
Research Professor of Digital Sociology at the Kazimieras Simonavičius University in Vilnius,
Lithuania. He holds a Habilitation in Economic and Environmental Sociology awarded by the
Italian Ministry of Education, University, and Research; a PhD in Sociology from the University
of Geneva; and a PhD in Management from the Chemnitz University of Technology. He is an
associate editor of Kybernetes and the field editor for social systems theory of Systems Research
and Behavioral Science. His research has been published in journals including Journal of
Business Ethics, Journal of Cleaner Production, Administration and Society, Technological
Forecasting and Social Change, Journal of Organizational Change Management, European
Management Journal, and Futures. His ORCID profile is available at orcid.org/0000-0002-8502-
Acknowledgement: One author gratefully acknowledges financial support from the Research
Council of Lithuania and the European Regional Development Fund-ERDF/FEDER (National
R&D Project 01.2.2-LMT-K-718-02-0019 “Platforms of Big Data Foresight