Human Dynamics in Smart Cities
Series Editors: Shih‐Lung Shaw · Daniel Sui
Xinyue Ye · Hui Lin, Editors
Spatial Synthesis
Computational Social Science and Humanities
Human Dynamics in Smart Cities
Series Editors
Shih-Lung Shaw, Department of Geography, University of Tennessee, Knoxville,
TN, USA
Daniel Sui, Department of Geography, Ohio State University, Columbus, OH, USA
This series covers advances in information and communication technology (ICT),
mobile technology, and location-aware technology and ways in which they have
fundamentally changed how social, political, economic and transportation systems
work in today’s globally connected world. These changes have raised many
exciting research questions related to human dynamics at both disaggregate and
aggregate levels that have attracted the attention of researchers from a wide range of
disciplines. This book series aims to capture this emerging dynamic interdisciplinary
field of research as a one-stop depository of our cumulative knowledge on
this topic that will have profound implications for future human life in general and
urban life in particular. The series covers topics ranging from theoretical perspectives,
space-time analytics, modeling human dynamics, urban analytics, social media and big data,
and travel dynamics to privacy issues, development of smart cities, and problems and
prospects of human dynamics research. It will include contributions from the
participants of the past and future Symposium on Human Dynamics Research held
at the American Association of Geographers annual meeting as well as other
researchers with research interests related to human dynamics via open submissions.
The series invites contributions of theoretical, technical, or application
aspects of human dynamics research from a global and interdisciplinary audience.
More information about this series at http://www.springer.com/series/15897
Xinyue Ye • Hui Lin
Editors
Spatial Synthesis
Computational Social Science and Humanities
Editors
Xinyue Ye
Department of Landscape
Architecture and Urban Planning
Texas A&M University
College Station, TX, USA
Hui Lin
School of Geography and Environment
Jiangxi Normal University
Nanchang, China
ISSN 2523-7780 ISSN 2523-7799 (electronic)
Human Dynamics in Smart Cities
ISBN 978-3-030-52733-4 ISBN 978-3-030-52734-1 (eBook)
https://doi.org/10.1007/978-3-030-52734-1
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2020
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Introduction: Spatial Synthesis in Computational
Social Science and Humanities
1. Towards Computational Spatial Social Science
and Humanities
As Goodchild et al. (2000) illustrate, “changes in the space and place of peoples and
nations have profoundly affected the spatial organization of the social, the economic,
the political, and the cultural—the key domains of focus of the social
sciences.” Space and place are central across social science and humanities disciplines,
serving as both inputs and outputs of empirical and theoretical investigations
(Dezzani 2010). Goodchild (2020) also states that “in essence the particular form of
integration that is so central to GIS practice is what we might term spatial integration.”
Unlike traditional social science and humanities, computational
social science and humanities adopt computation as the vital enabling methodological
foundation and platform (Cioffi‐Revilla 2014). The Internet and cellular
data networks have significantly changed our mode of communication and reshaped the
formation of networked groups, which were previously strongly constrained by
distance and location, releasing the power of social interactions and group assembly
across a much larger territory. Furthermore, Lazer et al. (2009) announce the
coming age of computational social science, because life in the network is now
digitally captured to form comprehensive pictures of both individuals and
communities.
The past decade has witnessed the dramatic growth of computational social
science and humanities research spurred by the increasingly available fine-scale and
human-centered spatial (spatiotemporal) data. The computing environment for
geo-visualization, geo-simulation, geo-collaboration, and human participation has
also been developed to assist various computer-aided research tasks (Lin et al.
2013). The following major trends have been identified:
1. Spatial social science and humanities research has shifted from a data-scarce to a
near-real-time, data-rich environment. The availability of unprecedented data
sources over space, time, and social networks facilitates the modeling of
individuals’ behavior and the outcomes of such models across spatiotemporal
scales, deeply rooted in both the geographic landscape and social networks. For
instance, Ye et al. (2019) integrate spatial methods and social network analytics
to model the scope and sources of online transactions and quantify the
driving forces, based on online transactions at the city level. A powerful analytical
framework for identifying space-time research gaps and frontiers is
fundamental to the comparative study of spatiotemporal phenomena under
various configurations of intertwined relationships. For example, novel research
questions can be generated when we can systematically query the dynamic
virtual and physical dimensions across multiple scales in socioeconomic modeling,
transportation analysis, and disaster response (Ye and Rey 2013; Li et al.
2017; Wang and Ye 2017).
2. As space-time data accumulate, the rich details of spatiotemporal dynamics
in computational modeling remain largely unexplored because of many binding
constraints on scientific advancement, such as the challenges of data-intensive
computing and very large geo-referenced dynamic databases (Shaw and Ye
2019). In addition, such 24/7 unstructured social data need special methods
(Batty 2020). The revolution in computing and information technology has
further blurred the boundaries and definitions of disciplines and applications. The
falling cost of computing and the gentler learning curve have also
accelerated big spatiotemporal analytics studies.
3. Integration has gradually been realized among and across conceptualizations,
analytical methods, and open-source software environments across disciplines
of social science and humanities. Such integration is needed to
respond to the new data and computing environment (Liu et al. 2019). Human
dynamics has been emphasized from the geospatial dimension within the context
of the mobile and big data era (Shaw et al. 2016). A virtual geographic environment
has also been proposed as a computer-aided workspace for geographic
experiments and analyses involving both the physical and human dimensions
(Lin et al. 2013). By integrating environmental psychology theory and
geospatial artificial intelligence, a framework of virtual geographic cognition
experiments has been further developed to model and simulate human activity
and urban context data (Zhang et al. 2018).
2. Synthesis and Convergence
Themes of social science and humanities are increasingly relevant to convergence
and synthesis across multiple disciplines as well as the data, computing, interactive,
and collaborative environments. The Annual International Symposium of Spatially
Integrated Social Science and Humanities has been held ten times to promote such
practice. This book is born from the most exciting and dominant themes in Spatial
Synthesis: Computational Social Science and Humanities research in China. As a
first English book of its kind, it spans most social science and humanities
disciplines as well as computational science. This book is a comprehensive text on
spatial and computational social science and humanities research. The development
of more powerful computing technology, emerging big and open data sources, and
theoretical perspectives on Spatial Synthesis has revolutionized the way in which
we investigate social science and humanities. Given the pace of change and
prominence of human-centered computing and spatial social science/humanities
research, a summary of the principles and applications of such research is urgently
required and will be of great value. In the foreword of this book, Batty (2020) notes
that “the continued miniaturization of computers to the point where we are now
using them personally in real time to organize our lives has led to many new ways
of sensing and delivering data about our social behaviours”, while Goodchild
(2020) highlights core reasons supporting convergence and synthesis: “the pressing
challenges faced by the earth cannot be solved by one single discipline, and hence
need the collaborative work between computing experts and domains scientists for
broader perspectives”.
This book contains research and contributions from scholars across China and
the world. The main principles and applications of spatial social science and
humanities over the past decade in China are reviewed. The book provides fundamental
information that will help to shape future research. This book will allow
researchers, students, and policy-makers worldwide to learn about the significant
achievements and applications of spatial social science and humanities research
within China.
3. Spatial Synthesis in Humanities, Regional Science,
and Urban Science
This volume is part of the Human Dynamics in Smart Cities book series and is composed
of 25 chapters. After the forewords by Academicians Michael Batty and Michael
Goodchild, the following chapters cover a variety of interesting and timely topics
on Spatial Synthesis for Computational Social Science and Humanities. The
chapters focus on three aspects (humanities, regional science, and urban science)
according to their different roles, pertinent issues, and corresponding solutions, as
follows:
Spatial Synthesis in Humanities: According to Hu et al. (2020), the official
history, chorography, and family trees form the memory of China as a nation. They
construct a multilevel architecture for a Family Tree Geographical Information System
(FTGIS) by incorporating modern geospatial information technologies into the
research on family trees. Lu and Zhang (2020) build a Historical Geographic
Information System to promote research on Chinese history, including literature,
maps, remote-sensing images, and archeological relics. A Digital Historical Yellow
River system is also developed, containing: (1) high-precision three-dimensional
micro-geomorphology; (2) a fusion scheme for historical hydraulic engineering and
the terrain model; (3) restoration of the three-dimensional shape of the river channel;
(4) simulation and demonstration of the motion of surface water in historical
periods; (5) reconstruction of rainfall characteristics in historical periods; and (6)
river-water management methods in historical periods (Pan et al. 2020). Based on
CHGIS (China Historical Geographic Information System), CBDB (China
Biographical Database), and mapping tools, Xu Y (2020) visualizes the trajectory,
activities, and social networks of Tang Xianzu, a Chinese playwright of the Ming
Dynasty. Measuring cultural effects on demographic behaviors and outcomes is
difficult because such influences are challenging to quantify. Xu H (2020) integrates
biomarker data and small area estimation techniques to identify the spatial
variation of cultural tolerance. How to protect traditional cave-dwelling villages
effectively and efficiently is crucial for cultural heritage conservation. Dang
et al. (2020) adopt Cultural Landscape Gene theory to analyze the landscape features of
cave-dwelling villages in the Wudinghe River Basin and examine their cultural values.
From the perspective of cultural space, Shi (2020) explores the popularity
of the Taiwanese ballad music form, characterized by mixed-race influences
from Japan, by integrating geographic information systems and qualitative
interviews.
Spatial Synthesis in Regional Science: Gu et al. (2020) systematically review
recent advances in Spatial Demography from the angles of differentiation and
isolation, birth and death, migration and urbanization, regional population forecast,
population and the environment, as well as analytical methods and application.
Yang and Li (2020) document previous studies of air and high-speed
railway networks at different spatial and temporal scales, based on the various
configurations of complex networks in the weighted network. Quite a few USA-based
residential property owners have received financial support for
energy-efficient products and services through programs such as Property
Assessed Clean Energy. Pan (2020) computes the economic impacts of
residential energy efficiency programs based on a metropolitan input–output model.
Zhang et al. (2020) estimate the influences of changes in industrial structure, energy
total factor efficiency, and energy structure on changes in carbon emissions (CO₂) at
the provincial level in China, using exploratory spatial data analysis and spatial
panel econometric models. Gui et al. (2020) conduct a visual analysis of smart cities
and big data management in the Yangtze River Delta region, based on company
registration information for 30 years. Using both ordinary least squares and
geographically weighted regression, Gao and Chen (2020) analyze the driving forces of
land urbanization in China at the county level in 2000 and 2015, finding that land
urbanization increased by an average of 2.77% annually during this period,
with an obvious north–south disparity. Population growth, economic development,
industrial structure, city/county features, and geographical location are found to be
significant factors shaping the geographical disparities of land urbanization. Qin
et al. (2020) use the Chinese General Social Survey data to explore the geographical
patterns and driving forces of intergenerational education mobility via intergenerational
mobility indices and a geographically weighted regression model. Adopting a
representative sample of volunteered geographic information crawled from Sina
Weibo and Baby Back Home, Yang and Sui (2020) analyze the spatial distribution
of child beggars and missing children in China, respectively.
Spatial Synthesis in Urban Science: To illustrate how geospatial data might be
influenced by the rapid advance of artificial intelligence, Zhao et al. (2020) examine
three geospatial spoofing cases: game player trajectories generated by a bot,
tweeted fake locational information, and a simulated image of a place made by a
deep learning algorithm. Jiang (2020) promotes a complex-network perspective on
wholeness to better understand the nature of order or beauty for sustainable design,
which helps to reduce the mystery of wholeness and enables us to appreciate
Alexander’s wholeness philosophy in fine and deep structure. Zhou and Peng
(2020) develop an analytical framework for behavior research in China, fundamental
to comparative study as well as dynamic and predictive research. Gao et al.
(2020) capture travelers’ perceptions of city space through blogs, tweets,
pictures, and videos, examining tourists’ perceived images of the city center, a historical
community, and a traditional water town in Shanghai. Shen (2020) investigates the
patterns of distance-based accessibility for various housing types associated with
surrounding community facilities for four counties in North Carolina, U.S.A. In
addition, taking the GPS trajectories of transportation network company vehicles in
Shenzhen as a case, Tu et al. (2020) design a data-driven framework to uncover
on-demand shared mobility patterns. Zheng et al. (2020) review the applications of
eye movement experiments in humanities, social science, and geospatial cognition.
Furthermore, two experiments are conducted based on goal searching strategy and
indoor wayfinding.
4. Conclusion
Academia, decision-makers, and citizens have progressively realized the necessity
of modeling, simulating, and analyzing social phenomena based on large-scale
computing, in order to transform our understanding of our lives, organizations, and
societies at this point in human history (Lazer et al. 2009). Since the chapters
in this book do not cover the full scale of computational spatial social science and
humanities research, the following research avenues are also noteworthy. First,
there is a need to develop a systematic and theoretical framework to characterize
such methodological integration in reflecting the multifaceted nature of human
dynamics and social complexity. Second, data-challenged, depressed communities
deserve an emerging strand of study, because ignoring the coexistence of data-rich
and data-poor environments could lead to biased model results that are
meaningless and harmful in policy implementation. Addressing fairness in big data
analytics of social science and humanities offers us new opportunities for understanding
the world and harnessing data science for social good. While we may not
be able to mention all relevant studies in this short introductory piece, this edited
volume is among the efforts to promote spatial synthesis in human and social
dynamics studies, towards a new generation of research environments and tools to
contribute to a deeper understanding of the geographic world.
Xinyue Ye
Hui Lin
Acknowledgement The work has been supported by School of Geography and Environment,
Jiangxi Normal University.
References
Batty, M. (2020). Foreword I: Charting computational social science from a spatial perspective.
(This volume).
Cioffi‐Revilla, C. (2014). Introduction to computational social science: Principles and
applications. New York: Springer.
Dang, A., Zhao, D., Chen, Y., & Wang, C. (2020). Conservation of cave-dwelling village using
cultural landscape gene theory. (This volume).
Dezzani, R. (2010). Spatially integrated social science. In B. Warf (Ed.), Encyclopedia of
geography. doi: 10.4135/9781412939591.n1070
Gao, J. & Chen, J. (2020). Demystifying the inequality in urbanization in China through the lens
of land use. (This volume).
Gao, J., Ma, J., Li, J., & Wang, L. (2020). Studies on tourists’city space images. (This volume).
Goodchild, M. F. (2020). Foreword II: Convergence and synthesis. (This volume).
Goodchild, M. F., Anselin, L., Appelbaum, R. P., & Harthorn, B. H. (2000). Toward spatially
integrated social science. International Regional Science Review, 23(2), 139–159.
Gu, H., Lao, X., & Shen, T. (2020). Research progress on Spatial Demography. (This volume).
Gui, R., Chen, T., & Wu, Z. (2020). Spatial visualization and analysis of the development of
high-paid enterprises in the Yangtze River Delta. (This volume).
Gui, Z., Wang, Y., Li, F., Tian, S., Peng, D., & Cui, Z. (2020). High performance spatiotemporal
visual analytics technologies and its applications in big socioeconomic data analysis. (This
volume).
Hu, D., Cheng, X., Lu, G., Wen, Y. & Chen, M. (2020). The China family tree geographic
information system. (This volume).
Jiang, B. (2020). A complex-network perspective on Alexander’s wholeness. (This volume).
Lazer, D., Pentland, A. S., Adamic, L., Aral, S., Barabasi, A. L., Brewer, D., & Jebara, T. (2009).
Life in the network: The coming age of computational social science. Science,
323(5915), 721.
Li, M., Ye, X., Zhang, S., Tang, X., & Shen, Z. (2017). A framework of comparative urban
trajectory analysis. Environment and Planning B. doi: 10.1177/2399808317710023
Lin, H., Chen, M., & Lu, G. (2013). Virtual geographic environment: a workspace for
computer-aided geographic experiments. Annals of the Association of American Geographers,
103(3), 465–482.
Liu, X., Xu, Y., & Ye, X. (2019). Outlook and next steps: Integrating social network and spatial
analyses for urban research in the new data environment. In Cities as spatial and social
networks (pp. 227–238). Springer, Cham.
Lu, Y & Zhang, P. (2020). GIS for Chinese history research. (This volume).
Pan, Q. (2020). Economic impact analysis for an energy efficient home improvement program.
(This volume).
Pan, W., Su, R., Man, Z., Zhang, L., He, M., & Han, L. (2020). Digital historical Yellow River.
(This volume).
Qin, K., Luo, P., Lu, B., & Lin, Z. (2020). Analysing spatial patterns of intergenerational
education mobility in China. (This volume).
Shaw, S., Tsou, M., & Ye, X. (2016). Human dynamics in the mobile and big data era.
International Journal of Geographical Information Science, 30(9), 1687–1693.
Shen, G. (2020). Accessibility of residential houses to community facilities. (This volume).
Shi, C. (2020). Digitalized enka-style Taipei. (This volume).
Tu, W., Wei, C., Zhao, T., Li, Q., Zhong, C., & Li, Q. (2020). Uncovering online sharing vehicle
mobility patterns from massive GPS trajectories. (This volume).
Wang, Z. & Ye, X. (2017). Social media analytics for natural disaster management. International
Journal of Geographical Information Science. doi: 10.1080/13658816.2017.1367003
Xu, H. (2020). Quantifying spatial variation in aggregate cultural tolerance. (This volume).
Xu, Y. (2020). Visualizing classic Chinese literature. (This volume).
Yang, H. & Li, Y. (2020). Complex network theory on high-speed transportation systems. (This
volume).
Yang, X. & Sui, D. (2020). Can social media rescue child beggars? (This volume).
Ye, X. & Rey, S. J. (2013). A framework for exploratory space-time analysis of economic data.
Annals of Regional Science, 50(1), 315–339.
Ye, X., Lian, Z., She, B., & Kudva, S. (2019). Spatial and big data analytics of E-market
transaction in China. GeoJournal, 1–13.
Zhang, F., Hu, M., & Lin, H. (2018). Virtual geographic cognition experiment in big data era. Acta
Geodaetica et Cartographica Sinica, 47(8), 1043.
Zhang, J., Li, J., & Wang, X. (2020). Exploring the dynamics of carbon emissions in China via
spatial-temporal analysis. (This volume).
Zhao, B., Zhang, S., Xu, C., & Liu, X. (2020). Spoofing in geography: Can we trust artificial
intelligence to manage geospatial data? (This volume).
Zheng, S., Chen, Y. & Wang, C. (2020). Application of eye-tracking technology in humanities,
social sciences and geospatial cognition. (This volume).
Zhou, S. & Peng, Y. (2020). Spatial-temporal behavior analysis in Urban China. (This volume).
Contents
Part I Foreword
1 Foreword I: Charting Computational Social Science from a Spatial Perspective
Michael Batty
2 Foreword II: Convergence and Synthesis
Michael F. Goodchild
Part II Spatial Synthesis in Humanities
3 The China Family Tree Geographic Information System
Di Hu, Xinghua Cheng, Guonian Lü, Yongning Wen, and Min Chen
4 GIS for Chinese History Research
Yifan Lu and Ping Zhang
5 Digital Historical Yellow River
Wei Pan, Rao-rao Su, Zhi-min Man, Li-jie Zhang, Mi-mi He, and Li-kun Han
6 Visualizing Classic Chinese Literature
Yongming Xu
7 Quantifying Spatial Variation in Aggregate Cultural Tolerance
Hongwei Xu
8 Conservation of Cave-dwelling Village using Cultural Landscape Gene Theory
Anrong Dang, Dongmei Zhao, Yang Chen, and Congwei Wang
9 Digitalized Enka-Style Taipei
C. S. Stone Shih
Part III Spatial Synthesis in Regional Science
10 Research Progress on Spatial Demography
Hengyu Gu, Xin Lao, and Tiyan Shen
11 Complex Network Theory on High-Speed Transportation Systems
Haoran Yang and Yongling Li
12 Economic Impact Analysis for an Energy Efficient Home Improvement Program
Qisheng Pan
13 Exploring the Dynamics of Carbon Emission in China via Spatial-Temporal Analysis
Jin Zhang, Jinkai Li, and Xiaotian Wang
14 Spatial Visualization and Analysis of the Development of High-Paid Enterprises in the Yangtze River Delta
RenZhou Gui, Tongjie Chen, and Zhiqiang Wu
15 High Performance Spatiotemporal Visual Analytics Technologies and Its Applications in Big Socioeconomic Data Analysis
Zhipeng Gui, Yuan Wang, Fa Li, Siyu Tian, Dehua Peng, and Zousen Cui
16 Demystifying the Inequality in Urbanization in China Through the Lens of Land Use
Jinlong Gao and Jianglong Chen
17 Analyzing Spatial Patterns of Intergenerational Education Mobility in China
Kun Qin, Ping Luo, Binbin Lu, and Zeng Lin
18 Can Social Media Rescue Child Beggars?
Xining Yang and Daniel Z. Sui
Part IV Spatial Synthesis in Urban Science
19 Spoofing in Geography: Can We Trust Artificial Intelligence to Manage Geospatial Data?
Bo Zhao, Shaozeng Zhang, Chunxu Xu, and Xiaobai Liu
20 A Complex-Network Perspective on Alexander’s Wholeness
Bin Jiang
21 Spatial-Temporal Behavior Analysis in Urban China
Suhong Zhou and Yinong Peng
22 Studies on Tourists’ City Space Images
Jun Gao, Jianyu Ma, Jie Li, and Liangxu Wang
23 Accessibility of Residential Houses to Community Facilities
Guoqiang Shen
24 Uncovering Online Sharing Vehicle Mobility Patterns from Massive GPS Trajectories
Wei Tu, Cui Wei, Tianhong Zhao, Qiuping Li, Chen Zhong, and Qingquan Li
25 Application of Eye-Tracking Technology in Humanities, Social Sciences and Geospatial Cognition
Shulei Zheng, Yufen Chen, and Chengshun Wang
Part V Afterword
26 Prospects of Spatial Synthesis in Computational Social Science and Humanities: Towards a Spatial Synthetics and Synthetic Geography
Daniel Z. Sui
Part I
Foreword
Chapter 1
Foreword I: Charting Computational
Social Science from a Spatial Perspective
Michael Batty
Although digital computers emerged in the first half of the 20th century, the idea of
computation had been deeply embedded in science and philosophy from the Enlightenment,
certainly from the Renaissance on, and indeed as far back as the Greeks.
When computers were invented, however, it took on a new meaning in that everything
which computers were able to do first depended upon reducing a problem to
its digital fundamentals and then combining and recombining its elements using
logics, arithmetic, and algebras. This is the contemporary notion of ‘computation’
in contrast to the term ‘computer’ which is reserved for the hardware on which such
computation takes place. In fact computation has come to dominate the myriad of
applications that define the scope that computers can address, and slowly but
surely over the last 80 years, the term ‘computational’ has been appended to many
areas as computers increasingly penetrated social and economic life, well beyond
their original applications in science. In the late 20th century, the term began to be
applied to various of the social sciences. For example, 25 years ago, it was used by
Hummon and Fararo (1995) in their paper on computational sociology. In the late
1980s, David Mark and his colleagues at the National Center for Geographic Information
and Analysis used the term in many conversations and in 1994, the Centre
for Computational Geography was set up by Stan Openshaw at the University of
Leeds (http://www.ccg.leeds.ac.uk/). This led directly to the term Geocomputation
which is still widely used to this day and whose history I recalled in an editorial to
mark the 21st anniversary of the first conference (Batty, 2017).
M. Batty
CASA, University College London, London, UK
e-mail: m.batty@ucl.ac.uk
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_1
But back to social science. The publication of the path-breaking book by Epstein
and Axtell (1996), Growing Artificial Societies: Social Science from the Bottom Up,
was a wonderful demonstration of how computation could be employed to simulate
many features of contemporary communities and their histories, showing all
the features of complexity science (segregation, agglomeration, income distribution,
spatial clustering) and reflecting new methods and ideas ranging from emergence,
fractals, positive feedback, power laws, historical accident, path dependence and
so on. It introduced agent-based modelling in contrast to much of the aggregative
modelling in the social sciences that had preceded it. Thus computational social
science came to define the use of computers to enable simulations to be extended
to quite large systems but more specifically to methods that enabled many different
kinds of logics other than classical algebras to be applied to diverse problems in
social and economic domains. A decade after the millennium, the field had matured
to the point where representation as well as simulation set the boundaries on its scope,
reflected in Claudio Cioffi-Revilla’s (2010) review of the field where he states that “the
main computational social science (CSS) areas are automated information extraction
systems, social network analysis, social geographic information systems (GIS),
complexity modelling, and social simulation models.” What marks this definition is
that CSS is methodologically self-conscious in that although it deals with the social
and economic domains extending as far as the behavioural sciences of individual
economic and social decision-making, it does not presume to extend our knowledge
of these substantive systems. CSS is not focussed on developing new social science
theory per se although it may be based on demonstrating how we can develop new
methods for validating theory, and in this process, there is often some focus on
articulating ways of measurement and simulation which may eventually lead to new and
better theory.
Over the last 10 years, however, there has been a sea change with respect to CSS. The
continued miniaturisation of computers to the point where we are now using them
personally in real time to organise our lives has led to many new ways of sensing and
delivering data about our social behaviours. This is data that is captured and then often
delivered, analysed and acted upon in near real time. It is data that is ‘big’ in
the sense that individual behaviours are being captured 24/7 and it is voluminous in
size. It often requires special and very different multivariate techniques and methods
to even represent, store and access it as this is data that is largely unstructured.
Unlike official Census data, it is not made to measure and often requires the powerful
tools coming from what is now called data science for exploring whether significant
patterns exist within it. In this sense, the focus changes within computational social
science to developing much more inductive methods, methods that seek to extract
patterns which ultimately build up to new hypotheses, rather than develop simulations
which seek to test these hypotheses. Of course, the scientific methods in CSS are no
different from any other positive philosophies which always depend on a fusion of
inductive and deductive perspectives.
In this book, Hui Lin and Xinyue Ye have put together an interesting collection
of papers that deal with a very wide range of computational approaches to not only
the social sciences but also the humanities. What distinguishes this set of papers
other than its extent is the fact that the expertise of the editors in geospatial analysis
is brought to bear on the various papers. Computational geography as we alluded
to above is an additional theme that runs throughout the collection and this serves
to ground the various chapters in quite well-developed GIS technologies. In fact as
Cioffi-Revilla (2014) notes, social GIS (geographic information systems/science) is
key to his more catholic definition of CSS and this is certainly the stance taken by
the editors.
The book is divided into three parts, all dealing with communicating a synthetic
knowledge of computation in the humanities, regional science, and urban science in
that order. The first part deals with the humanities covering the structure of geographic
information using ideas about hierarchy, the use of GIS in Chinese historical research,
the visualisation of Chinese literature, cultural landscapes, conservation, and archae-
ological perspectives. The second and third parts deal with changes in scale, to some
extent from national concerns to the regional and then the urban. Spatial demography,
network theory as in transportation, economic impact analyses, carbon emissions, the
locations of firms, analysis of social and economic structure through visualisation,
inequalities, educational mobility and poverty are all key dimensions in the papers
developed in this section. There is a stronger quantitative dimension to the papers
here where spatiotemporal modelling and visualisation are widely developed.
The book then changes tack to deal with cities at the urban scale. Illusions
and twists in geographic analysis introduce this focus and then the tenor changes
to complex networks, spatiotemporal behaviour, imageability in cities, community
facilities, mobility, and tracking. All of these papers are written using spatial tools
which emphasise visualisation of complex data sets. In fact most of the data intro-
duced in what are a set of strongly empirical papers do not really fall into the class
of big data per se. But the tools of simulation and visualisation in computational
social science are well developed here and potential readers will be able to gain a
real sense of how geospatial analysis can be used in CSS to great advantage. Many of
the examples relate to different spatial scales in Chinese cities and regions and this
provides a fascinating explanation of how wide such science is and how it is being
developed for important advances in our understanding of explanation and prediction
in social systems from a spatial perspective.
References
Batty, M. (2017). Geocomputation. Environment and Planning B: Urban Analytics and City Science,
44, 595–597.
Cioffi-Revilla, C. (2010). Computational social science. Wiley Interdisciplinary Reviews: Computational
Statistics, 2(3), 259–271.
Cioffi-Revilla, C. (2014). Introduction to computational social science: Principles and applications.
New York: Springer.
Epstein, J. M., & Axtell, R. L. (1996). Growing artificial societies: Social science from the bottom
up. Cambridge, MA: The MIT Press.
Hummon, N. P., & Fararo, T. J. (1995). The emergence of computational sociology. Journal of
Mathematical Sociology, 20(2–3), 79–87.
Chapter 2
Foreword II: Convergence and Synthesis
Michael F. Goodchild
In their book Convergence of Knowledge, Technology, and Society Roco et al. (2013;
see also NRC 2014) argued that the history of science has been one of swings between
divergence and convergence. In the divergence phase specialization flourishes, with
limited interaction between specialties, while in the convergence phase the barriers
between the specialties begin to weaken, and science advances through the sharing of
expertise and interest between specialties. The US National Science Foundation has
recognized the importance of convergence in today’s scientific enterprise, defining
it as “integrating knowledge, methods, and expertise from different disciplines and
forming novel frameworks to catalyze scientific discovery and innovation”
(https://www.nsf.gov/od/oia/convergence/index.jsp).
There are several reasons for believing in the importance of convergence and
synthesis at this point in the history of science. First, the problems faced by humanity
are arguably more challenging than they have ever been, as the planet becomes more
crowded and its ability to sustain life is under increasing threat. Second, science today
is by nature collaborative, requiring specialists in statistics, computing, and other
cross-cutting disciplines in addition to the expertise in particular domain sciences
that is required by the problem at hand; the days when a lone investigator working in a
single discipline could isolate and study a problem and derive significant knowledge
from it are probably gone. Third, many of the practices of academia are centripetal,
drawing a scientist into a real or imagined core of his or her discipline; it follows
that positive effort is required to encourage and reward broader perspectives.
Yet anyone who has followed the history of geographic information systems (GIS)
since their inception in the mid 1960s will be familiar with an earlier version of the
convergence argument. In building his school of landscape architecture at the Univer-
sity of Pennsylvania in the 1950s and 1960s, Ian McHarg argued that expertise in a
M. F. Goodchild (✉)
University of California, Santa Barbara, CA, USA
e-mail: good@geog.ucsb.edu
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_2
number of disciplines—in ecology, hydrology, climatology, geology, soil science—
was essential to good landscape architecture. Thus in staffing the school he made sure
to hire experts in each of these areas, and to insist that each of them be dedicated to
synthesis, to making the school more than the sum of its disciplinary parts (McHarg
1969, 1996). In developing a plan, each discipline’s contribution could be visualized
as a layer of knowledge, one of the stack of layers that now graces the front cover
of many GIS textbooks and the left margin of many GIS software products. In short,
GIS has always claimed a role in integration and in supporting interactions across
the boundaries of disciplines.
In essence the particular form of integration that is so central to GIS practice is
what we might term spatial integration. Driving a spike through all of the layers
will intersect the same location on each layer, and have the effect of integrating
each discipline’s data for the point in space that is represented by the spike. The
ecologist’s information about that point can now be coupled with information from
the geologist, the hydrologist, the climatologist, and the soil scientist. This argu-
ment raises space to a central role in integration or convergence. In the late 1990s
I proposed that the National Science Foundation fund a Center for Spatially Inte-
grated Social Science, an investment in the infrastructure of the social sciences that
would explore and demonstrate the value of location in enabling conversations and
synthesis between the social sciences. The center was established at UCSB in 1999,
and for five years it organized a series of programs: workshops, software develop-
ment, learning resources, examples of best practice, improving access to tools, and
search for data based on geographic location (csiss.org; Goodchild and Janelle 2004;
Goodchild et al. 2000). The work of the center continues today in UCSB’s Center
for Spatial Studies (spatial.ucsb.edu).
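The spike metaphor described above can be made concrete with a small sketch. The Python below is illustrative only: the layer names, the cell values, and the helper names cell_index and spike are assumptions introduced here, and the layers are assumed to be co-registered grids that share one origin and one cell size. It is not drawn from any particular GIS product.

```python
from typing import Dict, List, Tuple

Grid = List[List[str]]  # row 0 is the northernmost row


def cell_index(x: float, y: float, origin: Tuple[float, float],
               cell_size: float, n_rows: int, n_cols: int) -> Tuple[int, int]:
    """Convert map coordinates to a (row, col) index shared by all layers."""
    x0, y0 = origin  # lower-left corner of the grid
    col = int((x - x0) // cell_size)
    row_from_bottom = int((y - y0) // cell_size)
    if not (0 <= col < n_cols and 0 <= row_from_bottom < n_rows):
        raise ValueError("point lies outside the layer stack")
    return n_rows - 1 - row_from_bottom, col


def spike(layers: Dict[str, Grid], x: float, y: float,
          origin: Tuple[float, float], cell_size: float) -> Dict[str, str]:
    """Return every layer's value at one location (the 'spike')."""
    first = next(iter(layers.values()))
    row, col = cell_index(x, y, origin, cell_size, len(first), len(first[0]))
    return {name: grid[row][col] for name, grid in layers.items()}


# Hypothetical 2 x 2 layers, 20 map units per cell.
layers = {
    "geology": [["granite", "shale"], ["shale", "shale"]],
    "soil": [["loam", "clay"], ["sand", "clay"]],
    "hydrology": [["upland", "upland"], ["floodplain", "upland"]],
}
profile = spike(layers, x=15.0, y=25.0, origin=(0.0, 0.0), cell_size=20.0)
print(profile)
```

Because every layer is indexed by the same (row, col) pair, location alone is enough to join the disciplines' data; no shared attribute or key is needed.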
It is easy to see how this argument for the role of geographic space in integration
can be extended to time, and to any discipline that deals with phenomena distributed
in space and time. A compelling argument can even be made that space and time are
unique in this respect; that the processes studied largely independently in the domain
sciences need be integrated only when it is necessary to study their joint impacts on
a location at a specific time.
Early progress on the development of GIS was slow, due at least in part to the heavy
demands that it placed on very limited computing resources. Combining vector data
required that intersections be computed between the lines and areas depicted on each
layer, and it was not until the late 1970s that algorithms, methods of indexing, and
computing resources had advanced to the point where this was feasible and reliable
(Esri’s PIOS and Harvard’s ODYSSEY both emerged at about this time). Instead,
a common work-around was to represent each layer in raster, using a common, co-
registered raster for each layer, despite the old adage that “raster is faster but vector
is correcter”. Several systems emerged in the early 1970s to implement what was in
practice a very simple raster overlay operation. Today vector overlay algorithms are
fast and reliable, and many of the early raster-overlay systems disappeared or were
absorbed by the industry leaders.
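The simplicity of raster overlay, compared with computing intersections between vector lines and areas, can be seen in a minimal sketch. The function name overlay and the two example layers are hypothetical, and the code is not a reconstruction of any of the early systems mentioned above; it assumes only that the rasters are co-registered and of identical shape.

```python
from typing import Callable, List

Raster = List[List[int]]


def overlay(a: Raster, b: Raster, combine: Callable[[int, int], int]) -> Raster:
    """Combine two co-registered rasters cell by cell."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("rasters must be co-registered with identical shape")
    return [[combine(va, vb) for va, vb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


# Hypothetical binary layers: 1 means the condition holds in that cell.
suitable_soil = [[1, 0, 1],
                 [1, 1, 0]]
outside_floodplain = [[1, 1, 0],
                      [0, 1, 1]]

# Logical AND: a cell is buildable only if both conditions hold.
buildable = overlay(suitable_soil, outside_floodplain, lambda s, f: s & f)
print(buildable)
```

No geometry is computed at all: the overlay is a loop over matching cells, which is why it was feasible on very limited hardware.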
But another twist to this argument has emerged in recent years. Raster systems
were always seen as single-scale, working at a fixed spatial resolution, yet today we
have access to a great variety of raster data at a wide range of resolutions, and inter-
esting advances have been made recently in multi-scale analysis, combining layers
at different resolutions. The technology of discrete global grid systems (DGGS; Sahr
et al. 2003) allows the planet’s surface to be divided into tiles that are approximately
equal in size and shape, at a hierarchy of levels of resolution, with each level nesting
within the level above. DGGS are superbly elegant ways of integrating multi-scale
data.
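The nesting property can be illustrated with a deliberately simplified sketch. The code below is a planar quadtree on the unit square, not a geodesic DGGS of the kind described by Sahr et al. (2003): its cells are not equal-area on a sphere. The function names cell_id and parent are assumptions made for illustration. It shows only the hierarchical idea, in which each cell at one level lies entirely within a single cell at the level above.

```python
def cell_id(x: float, y: float, level: int) -> str:
    """Return a quadtree cell identifier for a point in the unit square.

    Each digit (0-3) selects one quadrant of the parent cell, so an
    identifier at level n is a prefix of the identifier at level n + 1.
    """
    if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
        raise ValueError("point must lie in [0, 1) x [0, 1)")
    digits = []
    for _ in range(level):
        x *= 2
        y *= 2
        qx, qy = int(x), int(y)  # which half in each direction
        digits.append(str(qx + 2 * qy))
        x -= qx
        y -= qy
    return "".join(digits)


def parent(cid: str) -> str:
    """The enclosing cell one level up is found by dropping the last digit."""
    return cid[:-1]


coarse = cell_id(0.75, 0.25, 2)
fine = cell_id(0.75, 0.25, 3)
print(coarse, fine, parent(fine) == coarse)
```

Data recorded at a fine level can therefore be aggregated to a coarser level by truncating identifiers, which is one reason nested grids suit multi-scale integration.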
This new book on spatial synthesis is one more proof of the value of this approach.
Each chapter takes one area of the social sciences and humanities and shows how a
spatial approach can result in significant advances in knowledge. It should be of great
value to anyone interested in pursuing this approach, or in developing new tools to
support it, or in developing courses that can empower students. Congratulations to
the organizers and editors; I look forward very much to seeing it in print.
References
Goodchild, M. F., & Janelle, D. G. (2004). Spatially integrated social science. New York: Oxford
University Press.
Goodchild, M. F., Anselin, L., Appelbaum, R. P., & Harthorn, B. H. (2000). Toward spatially
integrated social science. International Regional Science Review, 23(2), 139–159.
McHarg, I. (1969). Design with nature. Garden City, NY: Natural History Press.
McHarg, I. (1996). A quest for life. New York: Wiley.
NRC (National Research Council). (2014). Convergence: Facilitating transdisciplinary integration
of life sciences, physical sciences, engineering, and beyond. Washington, DC: The National
Academies Press.
Roco, M. C., Bainbridge, W. S., Tonn, B., & Whitesides, G. (Eds.). (2013). Convergence of knowledge,
technology and society: Beyond convergence of nano-bio-info-cognitive technologies. New
York: Springer.
Sahr, K., White, D., & Kimerling, A. J. (2003). Geodesic discrete global grid systems. Cartography
and Geographic Information Science, 30(2), 121–134.
Part II
Spatial Synthesis in Humanities
Chapter 3
The China Family Tree Geographic
Information System
Di Hu, Xinghua Cheng, Guonian Lü, Yongning Wen, and Min Chen
3.1 Family Tree and GIS
3.1.1 Family Tree
A family tree (also called a genealogy) is an important historical source. Official
histories, chorographies (local gazetteers) and family trees together constitute China’s national history. A family
tree systematically documents a clan with the same ancestor. A large amount of
historical information about individuals, families, clans, society, ethnology, customs,
economy, peoples, geography, population and culture is contained in a family tree
(Ge 1996; Wang 2006).
Family trees have great value that is mainly reflected in four aspects: cultural
relics, literature, education and rooting (Ge 1996; Wang 2006). First, a family tree
is a cultural relic; some family trees have existed for more than 1,000 years. Some
family trees have been edited or commented on by celebrities.
Second, the family tree is an important form of literature that can provide ample
and important data for many research fields, including studies of the family, history,
humankind and surnames. The family tree is a kind of useful material for researchers
D. Hu · X. Cheng · G. Lü (✉) · Y. Wen · M. Chen
School of Geography Science, Nanjing Normal University, Nanjing 210023, China
e-mail: gnlu@njnu.edu.cn
D. Hu · G. Lü · Y. Wen · M. Chen
Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal
University, Nanjing 210023, China
Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development
and Application, Nanjing 210023, China
X. Cheng
Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University,
Kowloon, Hong Kong, China
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_3
who wish to investigate feudal thought and the family system of China. Furthermore,
national historical events, the lives of celebrities, and histories of minorities and
families are recorded in family trees with different levels of detail. These records are
important reference materials for historical research.
Third, family trees have educational value. Generally, family tree records include
parental instructions, the regulations of a clan and the laws of a family. These records
reflect traditional Chinese virtues. Investigating a family tree enables a researcher to
trace a family’s heritage and development and reproduce a glorious history that can
greatly inspire the descendants of the family.
Fourth, family trees have rooting value. An increasing number of overseas Chinese
people are interested in identifying their ancestors. In this sense, family trees can be
regarded as an important reference that provides important evidence of family history.
The origin and lineage of a family are important content that makes a family tree
essential material for such genealogical research efforts.
Chinese family trees offer several advantages as source material. First, Chinese family
trees have a long history, originating from the pre-Qin period and continuing to the
present. Family trees have existed over several centuries, and some have existed for
nearly 1,000 years. Second, numerous family tree records exist in China, providing
researchers with important source material. Some family trees have been well
preserved, which enables researchers to extract useful information from them. Third,
with increasing awareness of tracing roots and ancestors, family trees are continu-
ally consulted by both the public and research institutes. Therefore, it is important
to investigate and mine the information contained in Chinese family trees.
3.1.2 GIS and Family Tree Research
A geographic information system (GIS) is a computer system that stores, manages,
analyzes, expresses and displays geographic information about geolocation-related
phenomena (Goodchild 2009). In the several decades since its initial development,
GIS has been applied in many fields, including environmental protection (Goodchild
1993), hydrologic modeling (Devantier and Feldman 1993), land use analysis,
agriculture, public health (Nykiforuk and Flaman 2011; Higgs 2004), transportation
and urban planning (Harris and Elmes 1993). Current research topics in GIS
include three-dimensional GIS, service-oriented GIS, the digital globe (Goodchild 2018)
and smart cities (Roche 2014; Degbelo et al. 2016).
Over recent decades, GIS has focused on the organization, management and spatial analysis of
geographic information about natural phenomena, and until recently few
studies addressed geographic information about the humanities
and social sciences. In recent years, applying GIS to solve problems related to the
humanities and social sciences has become increasingly popular, and GIS has been
widely used in fields such as history, linguistics, criminology and economics. The
concept of spatially integrated humanities and social sciences has been proposed
(Harris 2009; Rumsey 2009; Goodchild and Janelle 2010). Many research institutions
related to GIS and the humanities and social sciences have been established, and
corresponding conferences have been held successfully. Moreover, many databases
and information systems have been established, such as Chinese Civilization in Time
and Space (CCTS) (Liao and Fan 2012; Academia Sinica 2002), the China Historical
Geographic Information System (CHGIS) (Harvard CGA 2001), the Historical GIS
Database of Cotton Textile Industry on the Songjiang Region in Late Ming China
(Billy 2011) and the Spatial History Project at Stanford University (White 2010).
The family tree can be considered a new source of geographic information because
of its typical spatial-temporal characteristics. Family trees contain extensive infor-
mation about individual, family and clan activities that occur in the context of specific
spatial-temporal scenarios. The spatial information includes the birthplace and death
place of an individual, the location of a grave and the site of an event. The temporal
information includes the times of births, deaths, migrations and other events.
GIS has a variety of functions, including spatial-temporal information modeling,
analysis, expression and display. These functions are highly useful for family
tree research and use. Family tree information can be stored, managed, analyzed,
expressed and displayed from spatial and temporal perspectives using GIS.
3.1.3 Concept and Objectives of Family Tree GIS
The concept of the Family Tree Geographical Information System (FTGIS) was
proposed by Lü et al. (2009). The FTGIS emphasizes the importance of obtaining and
mining family tree information about the spatial-temporal distribution and migration
of clans and then expressing the spatial-temporal clan pedigree visually. The FTGIS
is dedicated to digitally storing, analyzing and expressing the spatial-temporal
information in a family tree. Furthermore, FTGIS aims to construct a visible spatial-temporal
network of family trees and to express spatial and genealogical relationships
clearly and understandably. In dealing with Chinese family tree sources, mining the
driving mechanisms of family inheritance and development to reproduce the history
of Chinese civilization is the ultimate goal of the FTGIS. Specifically, the main
objectives of the FTGIS are as follows:
(1) To digitize the full texts of family trees. Most family trees are stored in libraries
and private homes in the form of printed text, which makes it difficult to analyze
and share family tree information. Existing family tree information systems
mainly support bibliographic search rather than full-text search. The primary
objective of the FTGIS is to digitize the full text of family trees and then build
family tree databases and establish a foundation for the construction of the
FTGIS platform.
(2) To make the temporal information contained in family trees comparable. The
expression of the temporal information is mainly based on the Chinese tradi-
tional calendar, supplemented by the Christian era. These two ways of indicating
time are different in terms of benchmarks. Hence, the FTGIS is dedicated to
ensuring that the temporal information can be located by using a time conversion
engine.
(3) To make the spatial information contained in family trees locatable. Place names
are the main spatial information in family trees. These place names are not
associated with longitude and latitude coordinates, which makes them difficult
to locate. In addition, as time passes, place names, locations, and regions often
change. The FTGIS can map ancient place names to specific spatial locations
or regions using an encoding technology based on ancient and modern place
names.
(4) To build a platform for family tree information collection and sharing. In China,
many family trees have not been publicly published, making information collec-
tion and sharing an issue. The FTGIS platform aims to provide an efficient and
convenient way to collect and share family tree information.
(5) To express the family tree information in a dynamic and visual way. Family tree
information is mostly recorded as plain text, which makes it difficult
to convey intuitively, especially its spatial-temporal
information. Expressing family tree information in a dynamic and visual way
not only helps the public gain an understanding of family trees but also helps
researchers to analyze family tree information. The FTGIS aims to provide
various ways of displaying family tree information directly and vividly.
(6) To promote family tree information analysis with the aid of GIS. As a tool, GIS
has a variety of functions, including spatial analysis, spatial positioning, and
multidimensional visualization. Of these, spatial analysis is the core function.
Therefore, expressing and analyzing family tree information by leveraging GIS
is both possible and convenient. Through the FTGIS, information on individuals,
families and clans can be mined in addition to the path of the development and
heritage of any clan.
3.2 A Unified Spatial-Temporal Framework for Family
Trees
3.2.1 Why Is a Unified Spatial-Temporal Framework
Needed?
A unified spatial-temporal framework is essential for constructing the FTGIS. A large
amount of spatial-temporal information is contained in the preface, personal biographies
and other content of a family tree, and this information is essential for studying clan
lineages, migrations and the spatial distribution of families and for building the spatial-temporal
pedigrees of families and clans. However, such information is recorded
with different spatial-temporal benchmarks. Hence, constructing a unified spatial-
temporal framework is extremely important for mapping the spatial-temporal infor-
mation from different family trees for further analysis. GIS technologies can be used
to conduct a spatial analysis of family trees in various time periods and geographic
regions.
Time is one basic type of information contained in family trees. Generally, time
is expressed in two forms, namely, the Chinese traditional calendar and the Gregorian
calendar. These two forms differ in their datum. Specifically, the
Chinese traditional calendar records years in two ways: by the regnal year
number within a dynasty and by the stem-branch (sexagenary) year designation. Ancient
Chinese family trees mainly employ the Chinese traditional calendar, while modern
family trees mostly use the Gregorian calendar. However, entries in the Chinese traditional
calendar often omit the name of the dynasty and the regnal year number; therefore, the
time must be estimated by interpreting the context. It is difficult for a computer to
directly compare the different types of temporal information, and issues such as clan
lineages, population ages and life regulations cannot be definitively resolved. Thus,
constructing a unified temporal datum and unifying the time expression methods are
essential.
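The two traditional year notations, and the ambiguity that arises when the dynasty or reign is omitted, can be sketched as follows. This is a minimal illustration rather than the conversion engine of the FTGIS: the function names and the three-entry reign table are assumptions introduced here, the stem-branch names are given in toneless pinyin, and the conversion works at year level only. Because a lunar year does not begin on 1 January, the last weeks of a traditional year fall in the following Gregorian year, so a real engine would need month-level and day-level reference tables.

```python
from typing import List

STEMS = ["jia", "yi", "bing", "ding", "wu", "ji", "geng", "xin", "ren", "gui"]
BRANCHES = ["zi", "chou", "yin", "mao", "chen", "si",
            "wu", "wei", "shen", "you", "xu", "hai"]

# First Gregorian year of three Qing reign titles (year-level approximation).
REIGN_FIRST_YEAR = {"Kangxi": 1662, "Qianlong": 1736, "Guangxu": 1875}


def regnal_to_gregorian(reign: str, regnal_year: int) -> int:
    """Map 'reign title + year number' to an approximate Gregorian year."""
    if regnal_year < 1:
        raise ValueError("regnal years are counted from 1")
    return REIGN_FIRST_YEAR[reign] + regnal_year - 1


def stem_branch(year: int) -> str:
    """Stem-branch name of a Gregorian (CE) year; 1984 is a jiazi year."""
    offset = (year - 4) % 60
    return STEMS[offset % 10] + BRANCHES[offset % 12]


def candidate_years(name: str, earliest: int, latest: int) -> List[int]:
    """All years in a window that carry the given stem-branch name."""
    return [y for y in range(earliest, latest + 1) if stem_branch(y) == name]


print(regnal_to_gregorian("Qianlong", 10))
print(stem_branch(2020))
print(candidate_years("gengzi", 1700, 1800))
```

A stem-branch name recurs every 60 years, so an entry that omits the reign matches several candidate years. Contextual evidence, such as the lifetimes of relatives recorded nearby, is what narrows the window.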
Spatial information in family trees is mainly expressed by place names and simple
maps that lack accurate descriptions of specific locations and regions. Ancient place
names are mostly expressed as lower-level place names and tend to omit higher-level
administrative divisions. This makes it difficult for modern people to locate ancient
place names. In addition, place names change frequently over time. One place may
have different names in different time periods, and different places may have the
same name. Therefore, locating positions of place names correctly is important for
further study of family trees.
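The two ambiguities just described (one place with several names over time, and one name shared by several places) suggest that a place name can only be resolved together with a date and, where available, a higher-level division. The sketch below illustrates that idea. The record layout, the function name locate, and every entry in the table are invented for illustration; none of it is historical data or part of the FTGIS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PlaceNameRecord:
    name: str
    valid_from: int  # first year (CE) in which the name was in use
    valid_to: int    # last year (CE) in which the name was in use
    parent: str      # higher-level administrative division
    lon: float
    lat: float


# Invented entries: two different places called "Anping", and one place
# whose name changes from "Anping" to "Yongning".
GAZETTEER = [
    PlaceNameRecord("Anping", 1400, 1650, "Prefecture X", 118.0, 32.0),
    PlaceNameRecord("Yongning", 1651, 1900, "Prefecture X", 118.0, 32.0),
    PlaceNameRecord("Anping", 1700, 1900, "Prefecture Y", 110.0, 25.0),
]


def locate(name: str, year: int,
           parent: Optional[str] = None) -> List[PlaceNameRecord]:
    """Return all records matching a name at a given year.

    An empty list means the name was not in use that year; more than one
    match means the higher-level division is needed to disambiguate.
    """
    return [r for r in GAZETTEER
            if r.name == name
            and r.valid_from <= year <= r.valid_to
            and (parent is None or r.parent == parent)]


print(locate("Anping", 1500))
print(locate("Anping", 1800))
print(locate("Anping", 1680))
```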
3.2.2 How Can a Unified Spatial-Temporal Framework Be
Constructed?
Positioning time and place names correctly is the core of building a unified spatial-
temporal framework. Therefore, models for time and place names should be built
and then combined into a unified spatial-temporal framework. First, to address the
time expression issues mentioned above, time must be located based on a unified
temporal datum. More importantly, the time model should be able to cover the entire
process of the spatial-temporal evolution of Chinese civilization. Second, the types of
place names to be modeled should be chosen carefully. Ancient and modern place names are taken
into consideration because these two types of place names exist at specific time
points or periods. Positioning place names requires building relationships between
ancient place names and modern place names and spatially orienting them based on
administrative divisions. Then, the current locations of ancient place names can be
identified based on this relationship. Moreover, place names can be abstracted as
geographic regions or entities with specific spatial locations, shapes, and ranges. In
addition, many place names in family trees lack longitude and latitude coordinates.
Therefore, a specific spatial-temporal symbol for expressing place name is needed.
A unified spatial-temporal framework for family trees is proposed in this study.
This framework uses the Christian era year, Julian date and time as the temporal datum
and Chinese historical administrative divisions and ancient and modern place names
as the spatial datum. Through spatial-temporal database technology, the framework
converts Chinese traditional time to the Christian era and Julian date and time using
a time conversion engine. Thus, time information can be positioned. The framework
maps ancient place names to a specific location or extent using an ancient and modern
place name encoding engine. Thus, spatial information can also be positioned. Then,
the family tree information can be expressed in a unified spatial-temporal frame-
work, and researchers can use spatial-temporal data analysis and mining methods to
investigate family tree information.
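Once times and places have been positioned, events from different family trees can be placed on common axes. The sketch below assumes that "Julian date" refers to the astronomical Julian Day count, which assigns one integer to each day; the class name SpaceTimeStamp, the helper names, and the three events are hypothetical. Python's datetime.date uses the proleptic Gregorian calendar, so dates before the 1582 reform would need separate handling in a real system.

```python
from dataclasses import dataclass
from datetime import date

# date(1, 1, 1).toordinal() == 1, and that day has Julian Day Number 1721426.
JDN_OFFSET = 1_721_425


def julian_day_number(d: date) -> int:
    """Julian Day Number of a proleptic Gregorian calendar date."""
    return d.toordinal() + JDN_OFFSET


@dataclass(frozen=True)
class SpaceTimeStamp:
    label: str
    jdn: int     # position on the unified time axis
    lon: float   # position in the unified spatial frame
    lat: float


def stamp(label: str, d: date, lon: float, lat: float) -> SpaceTimeStamp:
    return SpaceTimeStamp(label, julian_day_number(d), lon, lat)


# Hypothetical events, already converted to Gregorian dates and coordinates.
events = [
    stamp("death (hypothetical)", date(1790, 3, 2), 118.8, 32.1),
    stamp("birth (hypothetical)", date(1720, 6, 15), 118.8, 32.1),
    stamp("migration (hypothetical)", date(1745, 9, 1), 120.6, 31.3),
]

# A single integer per day makes ordering and interval arithmetic trivial.
timeline = sorted(events, key=lambda e: e.jdn)
print([e.label for e in timeline])
print(julian_day_number(date(2000, 1, 1)))
```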
3.3 FTGIS Data Model
3.3.1 Content and Information of Family Trees
Family trees are rich in content. Generally, a family tree contains a cover, preface,
commentary, legend, catalog, compiler, details of origin, lineage chart, Zibei, honor
record, biographies, clan rule and domestic discipline, and information about ances-
tral temples, tombs, clan property, contracts, writings and serial numbers (Ge 1996;
Wang 2006). Zibei is a character used in a given name to indicate the bearer’s generation within a clan. Modern
family trees usually contain attached demographic charts, time comparison tables,
and ancient and modern place name references. Some modern family trees even
contain audio and video materials. There is no standard for the content of family
trees. Some contain more information, and some contain less; some are brief, and
some are detailed.
Family tree information can be divided into three parts: basic information, core
information and other information. Time and place are the basic information of a
family tree. The birth and death information of family members constitutes the main
content of a family tree together with their activities at specific time periods and
places. Time and place frequently appear in family trees. Based on a family tree,
we can know when and where the ancestor of a branch clan migrated; networks
of blood relationships, which imply the order of birth; when and where individ-
uals were born and died; where their graves are located; and when and where they
lived, studied, worked and had experiences. Extensive time and place information
expresses the inheritance relationship of a family in every generation from the temporal
perspective and convey the distribution and migration information of a family from
the spatial perspective.
The core information of the family tree is clan information and individual infor-
mation and relationships. Clan information includes compiling information, the clan
branch and migration. This information shows detailed migration information for
a clan and its main migrators, such as the time when a migration occurred and the
Fig. 3.1 Relationships in a family tree
origin and destination of the clan or migrators. Compiling information includes the
preface, commentary, autobiographies, honor records, biographies, stylistic rules,
clan rules and collection records. Individual information includes name, Zi, Hao,
nation, generation, rank, birth time and place, experiences, death time and place
and grave location. Zi (courtesy name) and Hao (art name) are respectful names for a person used in ancient
Chinese society. Experiences include when and where the individual studied, lived
and worked. Relationships include family and clan relationships. As illustrated in
Fig. 3.1, these relationships include father, mother, spouse, stepfather, stepmother,
adopted father and mother, and main member or clan member.
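Individual records and typed relationships of this kind map naturally onto a small graph structure. The sketch below is one possible representation and not the FTGIS lineage record model: the class Individual, its reduced set of fields, the helper names, and the three people are all hypothetical. Relationship types are kept as free-text labels so that the kinds listed above (father, mother, spouse, stepfather, adopted father and so on) can coexist.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Individual:
    id: str
    name: str
    generation: int
    # relation type -> ids of related individuals
    relations: Dict[str, List[str]] = field(default_factory=dict)


def add_relation(people: Dict[str, Individual], subject: str,
                 relation: str, target: str) -> None:
    """Record that `target` is the `relation` of `subject`."""
    people[subject].relations.setdefault(relation, []).append(target)


def paternal_line(people: Dict[str, Individual], person_id: str) -> List[str]:
    """Follow 'father' links upward until no father is recorded."""
    line = [person_id]
    seen = {person_id}
    while True:
        fathers = people[line[-1]].relations.get("father", [])
        if not fathers or fathers[0] in seen:  # stop at the top or on a cycle
            return line
        line.append(fathers[0])
        seen.add(fathers[0])


people = {
    "p1": Individual("p1", "Founder (hypothetical)", 1),
    "p2": Individual("p2", "Son (hypothetical)", 2),
    "p3": Individual("p3", "Grandson (hypothetical)", 3),
}
add_relation(people, "p2", "father", "p1")
add_relation(people, "p3", "father", "p2")
add_relation(people, "p3", "adopted father", "p1")

lineage = paternal_line(people, "p3")
print(lineage)
```

Keeping adoptive and step relationships as separate labels means the biological line and the adoptive line can each be traced without overwriting the other.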
Other family tree information, also called bibliographic information, includes
genealogy place, genealogy name, compiler, compilation mode, version, carrier form,
binding form, annotation, abstract and collection unit.
3.3.2 Overview of the Models
Based on the above analysis, this study proposes the FTGIS data model, which
is composed of five data models. Figure 3.2 shows the components of the FTGIS
data model and their relationships. These five data models can be divided into two
categories: the spatial-temporal framework data model and the family tree spatial-
temporal data model. The time data model and place name data model constitute
the data model for the unified spatial-temporal framework. The family tree spatial-
temporal data model includes the family tree bibliographic model, family tree item
content model and family tree lineage record model.
Figure 3.3 shows the time data model, which is divided into two parts. One part
contains the entities HistoricalStage, Dynasty, DynastyStage, Emperor and
EmperorReignTitle. The other part is the time reference, including YearRef and DateRef
entities. The HistoricalStage entity contains the id, name, start and end date attributes.
The Dynasty entity contains the id, id of HistoricalStage entity, name, start and end
date attributes. The DynastyStage entity contains the id, the title of the emperor’s
reign, name, creator, start and end date attributes. The Emperor entity contains the
id, id of DynastyStage entity, name, historical name, temple name, posthumous title,
start and end date attributes. The EmperorReignTitle entity contains the id, id of the
Emperor entity, name, start and end date attributes.
Fig. 3.2 Components of the FTGIS data model
Fig. 3.3 Time data model
The YearRef entity contains id, gregorian_calendar_year,
traditional_calendar_year, lunar_year, gregorian_calendar_start_date,
gregorian_calendar_end_date and remarks attributes. The DateRef entity contains
id, lunar_year, lunar_month, month_type, gregorian_calendar_start_date,
gregorian_calendar_end_date and remarks attributes, as shown in Fig. 3.4.
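The role of the year reference can be illustrated with a minimal Python sketch. It
is not the FTGIS implementation: the class layouts are reduced versions of the
EmperorReignTitle and YearRef entities, the helper names build_year_refs and
to_gregorian are hypothetical, and the sketch assumes that each reign year maps to
exactly one Gregorian year, which ignores the offset between the lunar and Gregorian
new year.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EmperorReignTitle:
    """Reduced reign-title entity: Gregorian start and end years are inclusive."""
    id: int
    emperor_id: int
    name: str
    start_year: int
    end_year: int


@dataclass(frozen=True)
class YearRef:
    """Reduced year reference: links a reign-title year to a Gregorian year."""
    id: int
    gregorian_calendar_year: int
    traditional_calendar_year: str
    remarks: str = ""


def build_year_refs(titles: List[EmperorReignTitle]) -> List[YearRef]:
    """Expand every reign title into one YearRef row per Gregorian year."""
    refs: List[YearRef] = []
    for title in titles:
        years = range(title.start_year, title.end_year + 1)
        for ordinal, year in enumerate(years, start=1):
            refs.append(YearRef(len(refs) + 1, year, f"{title.name} {ordinal}"))
    return refs


def to_gregorian(refs: List[YearRef], traditional_year: str) -> Optional[int]:
    """Return the Gregorian year for a label such as 'Kangxi 1', or None."""
    for ref in refs:
        if ref.traditional_calendar_year == traditional_year:
            return ref.gregorian_calendar_year
    return None


# Illustrative sample data: the Kangxi (1662-1722) and Yongzheng (1723-1735) reigns.
SAMPLE_TITLES = [
    EmperorReignTitle(1, 1, "Kangxi", 1662, 1722),
    EmperorReignTitle(2, 2, "Yongzheng", 1723, 1735),
]
YEAR_REFS = build_year_refs(SAMPLE_TITLES)
```

With this sample, to_gregorian(YEAR_REFS, "Kangxi 1") returns 1662, and a label that
is not in the table returns None.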
Figure 3.5 displays the place name data model. The AncientModernPlaceName
entity is composed of entities including PlaceName, Type, SubordinateRelationship and
SpacePos. The SpacePosRef entity is related to the SpacePos entity, and the Time
entity is related to the Type and SubordinateRelationship entities. Furthermore, the
Time entity includes the start and end time and comprises the time description and
standard time attributes.
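A minimal Python sketch of a time-dependent place name lookup is given here as an
illustration of the place name data model. The classes are simplified stand-ins for
the PlaceName, SpacePos and Time entities, the function name_in_year is hypothetical,
and validity is reduced to whole years with an exclusive end year. The coordinates
and the two sample names for the same city are approximate illustrative values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimeSpan:
    """Validity interval in years: start inclusive, end exclusive (None = open)."""
    start_year: int
    end_year: Optional[int] = None

    def contains(self, year: int) -> bool:
        return self.start_year <= year and (self.end_year is None or year < self.end_year)


@dataclass(frozen=True)
class SpacePos:
    """Point position of a place (approximate longitude and latitude in degrees)."""
    longitude: float
    latitude: float


@dataclass(frozen=True)
class PlaceName:
    """A name that was valid for one position during one time span."""
    name: str
    valid: TimeSpan
    position: SpacePos


def name_in_year(records: List[PlaceName], year: int) -> Optional[str]:
    """Return the name that was valid in the given year, or None if unknown."""
    for record in records:
        if record.valid.contains(year):
            return record.name
    return None


# Illustrative sample: two names used for the same city at different times.
CITY = SpacePos(116.4, 39.9)
RECORDS = [
    PlaceName("Beiping", TimeSpan(1928, 1949), CITY),
    PlaceName("Beijing", TimeSpan(1949, None), CITY),
]
```

Calling name_in_year(RECORDS, 1930) yields "Beiping", whereas a year outside every
recorded span yields None.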
Fig. 3.4 Year and date references in the time data model
Fig. 3.5 Place name data model
Figure 3.6 shows the family tree bibliographic model, which describes the
bibliographic information of sets of family trees, the relations between a set of
family trees and a volume of family trees, and the relation between bibliographic
information and the clan. The FamilyTree entity contains the head_info and content
attributes. The Keywords entity contains the keyword, version, introduction,
create_info, modify_info, publicate_info and data_store_access attributes. The Store
entity and Access entity comprise the StoreAccess entity. The Store entity contains the
file_name, file_type, store_location and process_environment attributes. The
Modification entity contains the modifier, modify_time, and modify_place attributes. The
Publication entity contains publicator, publicate_time, and publicate_place attributes.
The Creation entity includes the creator, create_time and create_place attributes. The
Keyword entity contains the content and type attributes. The KeywordType entity
contains the family_name, celebrity, generation_extent and living_place attributes.
Figure 3.7 shows the family tree item content model, which expresses the item
content, excluding the lineage record and relations between item content and clan.
Fig. 3.6 Family tree bibliographic model
Fig. 3.7 Family tree content data model
The Clan entity contains the id, ft_name, family_name, ft_item, ft_edit, clan_event,
origin_text, other_text, and ft_multimedia attributes. The FTItem entity contains
the type, author, time, item_text, and item_multimedia attributes. The FTItemType
entity contains the type attributes of the FTItem entity, including preface,
genealogical_comment, genealogical_style, honor_record, biography, and clan_rules. The
FamilytreeEditInfo entity contains types of ft_edit, editor and edit_time attributes.
The MultiMedia entity contains id, title, type, format, description and url attributes.
The MultiMediaType entity is related to the Table of MultiMediaType entity and
contains the picture_type, audio_type and video_type attributes. The MultiMedia
entity description attribute is related to the SubmitInfo entity and the TextInfo entity.
The SubmitInfo entity is related to the TextInfo entity and includes the submittor and
submit_time attributes. The TextInfo entity contains the title and content attributes.
As shown in Fig. 3.8, the family tree lineage record model is the core of the
family tree spatial-temporal data model. This data model expresses information about
individuals, families and clans, events related to them, and relations among them.
There are two types of families: the main family and affiliated families. The Family
entity is composed of husband-and-wife attributes, and the affiliated family is related
to the MainFamily entity and records the second husband and wife. The Clan entity
includes the clan name and totem attributes. The ClanObjRelation entity indicates
relations among individuals, clans and families. The RelationType entity indicates
individual-individual, individual-family, individual-clan, family-family, family-clan,
clan-clan and other relations. The Event entity contains the name, subject, time,
place, type, and description attributes. The subject attribute is associated with the
individual, family and clan, and the type attribute covers events such as individual
births, deaths, family construction, migrations and clan sacrifices.
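The lineage record model can be sketched in Python as follows. This is a simplified
illustration under stated assumptions, not the FTGIS schema: identifiers such as
"I1", "F1" and "C1", the field names source_id, target_id, subject_id and event_type,
the integer year field, and the helpers members_of and timeline are all hypothetical.
Only the relation types and the event attributes are taken from the description
above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RelationType(Enum):
    """Relation types named in the lineage record model."""
    INDIVIDUAL_INDIVIDUAL = "individual-individual"
    INDIVIDUAL_FAMILY = "individual-family"
    INDIVIDUAL_CLAN = "individual-clan"
    FAMILY_FAMILY = "family-family"
    FAMILY_CLAN = "family-clan"
    CLAN_CLAN = "clan-clan"
    OTHER = "other"


@dataclass(frozen=True)
class ClanObjRelation:
    """A typed relation between two objects (individual, family or clan)."""
    source_id: str
    target_id: str
    relation_type: RelationType
    role: str


@dataclass(frozen=True)
class Event:
    """An event whose subject is an individual, a family or a clan."""
    name: str
    subject_id: str
    year: int          # assumption: time reduced to a Gregorian year
    place: str
    event_type: str
    description: str = ""


def members_of(family_id: str, relations: List[ClanObjRelation]) -> List[str]:
    """List the individuals linked to a family by an individual-family relation."""
    return [r.source_id for r in relations
            if r.relation_type is RelationType.INDIVIDUAL_FAMILY
            and r.target_id == family_id]


def timeline(subject_id: str, events: List[Event]) -> List[Event]:
    """Return the events of one subject in chronological order."""
    return sorted((e for e in events if e.subject_id == subject_id),
                  key=lambda e: e.year)


# Hypothetical sample records.
RELATIONS = [
    ClanObjRelation("I1", "F1", RelationType.INDIVIDUAL_FAMILY, "husband"),
    ClanObjRelation("I2", "F1", RelationType.INDIVIDUAL_FAMILY, "wife"),
    ClanObjRelation("F1", "C1", RelationType.FAMILY_CLAN, "main family"),
]
EVENTS = [
    Event("migration", "F1", 1850, "place B", "migration"),
    Event("family construction", "F1", 1845, "place A", "family construction"),
    Event("birth", "I1", 1820, "place A", "birth"),
]
```

Keeping relations in a separate ClanObjRelation record, rather than as fields of each
individual, lets the same structure express all of the relation types listed above.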
Fig. 3.8 Family tree lineage record model
3.4 Family Tree Information Specification and Sharing
3.4.1 Existing Specifications Associated with Family Trees
Sharing family tree information is difficult. Family trees arise from different time
periods, nations and families and involve multiple levels of information. Furthermore,
family trees in different periods have different compilation features with rich content
and various forms of expression and preservation. This complexity causes difficulty
in collecting and sharing family tree information. Although massive family tree
catalog databases have been established, these systems have distinct data collection,
processing, and querying procedures, which makes it difficult to share family tree
information.
Information specification is an essential and effective way to digitize and share
family tree information. An ideal specification should fully describe the appearance,
structure and content of a family tree so that information can be exchanged and
shared adequately and easily by leveraging it. Such a specification provides a method
for implementing the standardized expression and sharing of family tree information.
The design of a specification takes into consideration not only all of the content
but also its different attributes.
As an effective way to implement sharing information, family tree information
specification has attracted considerable attention. Several description specifications
for family tree information have been developed. The Family History Department of
the Church of Jesus Christ of Latter-day Saints proposed GEDCOM (GEnealogical
Data COMmunication), which is dedicated to providing a flexible, unified family tree
data interchange and presentation format that can be processed directly by computers.
Currently, the widely used version is GEDCOM 5.5 (GEDCOM Team 1996), and
GEDCOM 6.0, which stores data in XML format, has been released (GEDCOM
Team 2001). Based on the lineage-linked data model, GEDCOM records informa-
tion on nuclear families and individuals. In general, a GEDCOM file is plain text
using either ANSEL (American National Standard for Extended Latin Alphabet
Coded Character Set for Bibliographic Use) or ASCII (American Standard Code for
Information Interchange) and is composed of three sections: the header, records and
trailer. The header section defines the metadata, such as the genealogy name, founder,
source and collector. A series of modified specifications based on GEDCOM have
emerged, such as GeniML (Genealogical Information Markup Language), GedML
(Genealogical Data in XML), and GenXML (Genealogy XML) (GEDCOM Team
2001). GEDCOM has facilitated the sharing and expression of Euro-American family
tree information. However, problems still exist in describing the relationships among
individuals, families and clans, events, spatial-temporal information and individual
information. The Shanghai Library, China, has established the Genealogy Description
Metadata Specification (Shanghai Library 2005), which normatively defines and
describes bibliographic resources. This specification is designed for sharing and
interoperating family tree resources among digital libraries. The Genealogy Description
Metadata Specification consists of core elements of ancient literature and individual
elements. This specification is designed mainly to standardize information, such
as the publisher, date, source, creator, and description. Moreover, it extracts only
information about ancestors, first migrated ancestors and notable ancestors from the
family tree as a content abstract. Thus, it is merely a metadata specification of
bibliographic information, not a comprehensive description of the family tree content.
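The three-section layout of a GEDCOM file described above (header, records and
trailer) can be illustrated with a short Python sketch. It relies only on the
GEDCOM 5.5 line form of a level number, an optional cross-reference identifier such
as @I1@, a tag and an optional value. The sample file content and the function name
split_sections are illustrative assumptions, and the sketch does not validate tags or
handle character encodings.

```python
from typing import Dict, List

# Illustrative sample; the individual and the source name are invented.
SAMPLE = """\
0 HEAD
1 SOUR ExampleApp
1 GEDC
2 VERS 5.5
1 CHAR ASCII
0 @I1@ INDI
1 NAME Wen /Zhang/
1 SEX M
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I1@
0 TRLR
"""


def split_sections(text: str) -> Dict[str, list]:
    """Group GEDCOM lines into the header, a list of records, and the trailer."""
    sections: Dict[str, list] = {"header": [], "records": [], "trailer": []}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tokens = line.split()
        if tokens[0] == "0":
            # A level-0 line starts a new top-level structure. The tag is the
            # second token unless a cross-reference id such as @I1@ precedes it.
            tag = tokens[2] if tokens[1].startswith("@") else tokens[1]
            if tag == "HEAD":
                current = "header"
            elif tag == "TRLR":
                current = "trailer"
            else:
                current = "records"
                sections["records"].append([])
        if current is None:
            raise ValueError("file does not start with a level-0 line")
        if current == "records":
            sections["records"][-1].append(line)
        else:
            sections[current].append(line)
    return sections
```

For the sample above, the header holds five lines, the records section holds one
individual (INDI) record and one family (FAM) record, and the trailer holds the
single TRLR line.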
For Chinese family trees, none of the specifications mentioned above are able to
express their unique content and elements and to perfectly implement the sharing
and exchange of information. This is because Chinese family trees are different from
those of other countries and have many features (i.e., a long history, clear lineage,
well-developed family system and complicated family structure). Therefore, there is
an urgent need to develop a new specification or standard for Chinese family trees.
3.4.2 Family Tree Information Specification
This study proposes a specification of Chinese family tree information based on the
FTGIS data model. The specification includes two parts, namely, a metadata
specification and a content specification. The proposed specification is implemented
in XML. Two types of elements for family tree information, simple elements and
composite elements, are defined by XML Schema. The simple elements are used to
describe atomic information items of the family tree, family members, and
supplementary materials. These atomic information items do not include
sub-information items; they are expressed by the XML element type and correspond to
the leaf nodes of the XML document. A composite element is composed of two or more
simple elements. In contrast to simple elements, composite elements are expressed by
the XML Schema complexType element and correspond to non-leaf nodes.
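The distinction between simple and composite elements can be shown with a small
Python sketch that uses the standard xml.etree.ElementTree module. The element names
FTMember, Name, Surname, GivenName and Sex are hypothetical placeholders chosen for
illustration and are not taken from the schema in the specification; the sketch
builds an instance document rather than the XML Schema itself.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_member(member_id: str, surname: str, given_name: str, sex: str) -> ET.Element:
    """Build a small member document with one nested (composite) Name element."""
    member = ET.Element("FTMember", {"id": member_id})
    name = ET.SubElement(member, "Name")                 # composite: has children
    ET.SubElement(name, "Surname").text = surname        # simple: leaf node
    ET.SubElement(name, "GivenName").text = given_name   # simple: leaf node
    ET.SubElement(member, "Sex").text = sex              # simple: leaf node
    return member


def classify(root: ET.Element) -> Dict[str, str]:
    """Label every element as 'composite' (non-leaf) or 'simple' (leaf)."""
    return {el.tag: ("composite" if len(el) else "simple") for el in root.iter()}


MEMBER = build_member("m1", "Zhang", "Wen", "male")
KINDS = classify(MEMBER)
```

Here KINDS maps FTMember and Name to "composite" and Surname, GivenName and Sex to
"simple", which mirrors the non-leaf and leaf roles described above.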
Fig. 3.9 Entity group of family tree metadata elements
As shown in Fig. 3.9, the designed elements for family tree metadata are divided
into four core entity sets and three accessorial entity sets. The core entity sets include
the header information set, bibliographic management information set, stylistic rules
information set and clan information set. The accessorial entity sets include the time
information set, spatial information set, and individual information set.
The header information set defines the description of the family tree metadata file.
The bibliographic management information set is based on the Dublin Core Metadata
and family tree description metadata specifications. Structural and nonstructural
records are applied to record time information in family trees. Structural records
mainly record the precise types and timing methods. More specifically, time
information is organized at four levels: year, month, day and hour. The spatial
information structure contains existence time, administrative division, geolocation,
and present contrast. In this specification, the new concepts of the double time tag
and period place name are used to completely and explicitly express the
spatial-temporal information (Hu et al. 2010, 2011). The stylistic rules information
set is important for directory navigation and provides instructions for locating
family tree content. In addition, it provides useful instructions for modifying
family tree content. The clan information set is the core of family tree metadata. It
is designed to describe and extract the core and specific information of a family
that is used to distinguish different families and clans. The core element of the
metadata specification, the ClanInfo element defined by XML Schema, is shown in
Fig. 3.10.
The designed elements for family tree content are divided into the basic information
set, member information set, time information set, site information set and other
information set. Specifically, as the core element, FTBasicInfo includes the id, name,
surname, entry, modification, event, other text, multi-media and original text elements
or attributes. Among these elements, the entry element, modification element, event
element, and other text elements are composite elements. XML Schema definitions
of the FTBasicInfo element, FTItem element and Event element are illustrated in
Figs. 3.11, 3.12 and 3.13.
The FTMember element is composed of the id, name, sex, nation, generation,
seniority, clan branch, current state, event, other text, member multi-media and
original text elements or attributes. Four elements, namely name, event, other text,
and member multi-media, are composed of simple elements. The XML Schema definition
of the FTMember element is shown in Fig. 3.14. More details of the XML elements
for family tree information specification can be found in Feng (2011).
3.5 Mass Family Tree Information Collection
The volunteered geographic information (VGI) approach advocates collecting and
organizing geographic information through cooperation between users and
information collectors (Goodchild 2008). After geographic information is organized and
arranged, it becomes the basic data for the public to share and apply. As historical
material for civilians, family trees have a broad public base. Many family tree
software systems have a large number of users, and many family trees have been created
or integrated through these systems. Thus, to take advantage of the public’s
enthusiasm for sharing genealogical information and tracing ancestors, this study
employs VGI and explores a new mode of mass family tree information collection. Many
factors, such as the diverse age groups, education levels, and computer skills of
users, are considered.
Fig. 3.10 XML Schema definition of the ClanInfo element
Fig. 3.11 XML Schema definition of the FTBasicInfo element
Fig. 3.12 XML Schema definition of the FTItem element
Fig. 3.13 XML Schema definition of the Event element
Fig. 3.14 XML Schema definition of the FTMember element
There are three ways to collect family tree information, which are introduced
below.
(1) Collecting information manually in a variety of ways.
First, users collect family tree information from the FTGIS website, surname
websites, family websites and personal blogs. Second, a stand-alone version based on
Microsoft Excel is available, in which users can input family tree information into
an Excel spreadsheet. Third, users can input data through Microsoft Word to create a
paper version, which is designed for people who cannot access the web or are not
familiar with computers. This project established a Word document form to help users
who need to input family tree information manually.
(2) Generating family trees semi-automatically and quickly.
This study proposes a family tree collection system that provides an interface for
generating family trees semi-automatically. The print version of the family tree is
scanned as images. Then, users edit the scanned images and collect information by
manually processing them.
(3) Converting the data formats of family trees.
To support the popular family tree data format GEDCOM, the FTGIS platform
enables users to organize family tree data in the GEDCOM format and convert it
into other formats, such as XML.
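A format conversion of this kind can be sketched in Python as follows. The sketch is
a generic tag-for-tag mapping from GEDCOM lines to nested XML elements; it is neither
the GEDCOM 6.0 XML format nor the converter of the FTGIS platform. The function name
gedcom_to_xml, the root element name gedcom and the id and ref attribute names are
assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative sample; names and dates are invented.
SAMPLE = """\
0 HEAD
1 CHAR ASCII
0 @I1@ INDI
1 NAME Wen /Zhang/
1 BIRT
2 DATE 1850
0 @F1@ FAM
1 HUSB @I1@
0 TRLR
"""


def gedcom_to_xml(text: str) -> ET.Element:
    """Convert GEDCOM lines into nested XML elements, one element per line."""
    root = ET.Element("gedcom")
    stack = [root]  # stack[n] is the parent of a line at level n
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tokens = line.split(" ")
        level, rest = int(tokens[0]), tokens[1:]
        if level >= len(stack):
            raise ValueError(f"level jumps too far in line: {line!r}")
        xref = None
        if len(rest) > 1 and rest[0].startswith("@") and rest[0].endswith("@"):
            xref, rest = rest[0], rest[1:]
        tag, value = rest[0], " ".join(rest[1:])
        del stack[level + 1:]                  # close deeper elements
        element = ET.SubElement(stack[level], tag)
        if xref:
            element.set("id", xref.strip("@"))
        if value.startswith("@") and value.endswith("@") and len(value) > 2:
            element.set("ref", value.strip("@"))   # pointer to another record
        elif value:
            element.text = value
        stack.append(element)
    return root


XML_ROOT = gedcom_to_xml(SAMPLE)
```

The resulting tree can be serialized with ET.tostring(XML_ROOT, encoding="unicode").
Mapping the generic elements onto a target schema would require an additional,
schema-specific step that is not shown here.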
3.6 FTGIS Platform
3.6.1 Architecture of the FTGIS Platform
This study proposes the architecture for the FTGIS platform shown in Fig. 3.15,
which provides an overview of the key components. The platform adopts a three-
tiered architecture: data layer, service layer and application layer. The data layer
contains the family tree index database, family tree metadata database, family tree
webpage database, family tree image database, family tree full-text database, service
metadata database, time database, ancient and modern place name database and
Chinese historical administrative division database. These databases are the main
data carriers for the FTGIS platform. Based on the data layer, the service layer
provides users with two types of web services: family tree data services and family
tree function services. The application layer is the access interface provided by the
general platform. Users connected to the web can access family tree web services
provided by the platform.
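The separation of the three layers can be sketched in a few lines of Python. This is
only a schematic illustration of the layering, assuming an in-memory dictionary in
place of the full-text database; the class names DataLayer and RetrievalService, the
function render_results and the sample documents are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


class DataLayer:
    """Data layer: stands in for the family tree full-text database."""

    def __init__(self, full_text: Dict[str, str]) -> None:
        self._full_text = dict(full_text)

    def documents(self) -> Iterable[Tuple[str, str]]:
        return self._full_text.items()


class RetrievalService:
    """Service layer: a function service built on top of the data layer."""

    def __init__(self, data: DataLayer) -> None:
        self._data = data

    def search(self, keyword: str) -> List[str]:
        needle = keyword.lower()
        return sorted(doc_id for doc_id, text in self._data.documents()
                      if needle in text.lower())


def render_results(service: RetrievalService, keyword: str) -> str:
    """Application layer: formats what the service returns for the user."""
    hits = service.search(keyword)
    return f"{len(hits)} result(s) for '{keyword}': " + ", ".join(hits)


# Hypothetical sample content.
SERVICE = RetrievalService(DataLayer({
    "ft-001": "Genealogy of the Zhang clan",
    "ft-002": "Genealogy of the Li clan",
    "ft-003": "Migration record of the Zhang family",
}))
```

Because the application layer only calls the service, the storage behind the data
layer can change without affecting the interface that users see.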
Fig. 3.15 Architecture of the FTGIS platform
Fig. 3.16 Architecture of the FTGIS website group
To collaboratively construct the FTGIS and share family tree information, the
FTGIS website group is proposed. As shown in Fig. 3.16, the basic three-tier archi-
tecture integrates surname websites, clan websites and individual homepages. Users
can create family trees and edit and share information through a consistent interface.
Any individual or group can construct a surname website or individual homepage
using the web services provided by the platform. In addition, the platform provides
news services using web crawler technology to retrieve and parse family tree news
from the internet. By combining the in-depth information collected from the internet,
this platform significantly promotes the social sharing of family tree information.
3.6.2 Functions of the FTGIS Platform
The FTGIS web services can be classified into two categories: data services and
function services, which are described in detail in Table 3.1. The FTGIS data services
include a perpetual calendar data service, place name data service and genealogical
data service. As shown in Figs. 3.17 and 3.18, the FTGIS data services support creating
and editing individual information online and editing time and place information.
The FTGIS function services include a full-text genealogy retrieval service, place
name encoding service, spatial-temporal analysis service, spatial-temporal spectrum
visualization service, and statistical analysis service. Figure 3.19 shows the query
of present and past place names. As illustrated in Fig. 3.20, the GIS function service
supports visualizing genealogical lineage information as a tree structure. Moreover,
users can view information for individuals, families, and surnames with the help of
the web map. Based on data provided by the FTGIS data services, statistical charts and
graphs are available online, as shown in Fig. 3.21.
3.7 Conclusions and Future Research
To make full use of family trees and to help resolve the related problems in the
humanities and social sciences, this study proposes a strategy to construct the FTGIS
by incorporating modern information technologies, such as database, GIS and web
technologies, into research on family trees. In this way, family tree information can be
systematically collected, arranged, analyzed, and integrated. The key FTGIS issues
were discussed in detail: (1) a unified spatial-temporal framework; (2) the FTGIS data
model; (3) family tree information specification and sharing; and (4) mass family tree
information collection. Finally, a multilevel architecture of the FTGIS platform was
proposed, and a prototype of the FTGIS platform was developed. The proposal and
implementation of the FTGIS dramatically promote family tree information analysis
and sharing. Furthermore, the FTGIS provides an accurate tool for integrating and
visually expressing potential information in family trees, building spatial-temporal
spectra and revealing the evolutionary process of families and clans from diverse
perspectives.
Table 3.1 Family Tree Platform Service Description
Service Type | Service Title | Description
FTGIS Data Services | Perpetual calendar data service | Supports expressing temporal information in different ways
FTGIS Data Services | Place name data service | Supports querying past and present place names
FTGIS Data Services | Family tree full-text data service | Supports creating and editing family tree items and individual information online. Multiple ways of querying and browsing family tree information
FTGIS Data Services | Historical geography fundamental data service | Provides web maps digitized from Chinese historical atlases, past and present place names extracted from past and present Chinese place names dictionary
FTGIS Data Services | Maps and images service | Provides web maps and images
FTGIS Data Services | Family tree news data service | Provides family tree news from internet using web crawler
FTGIS Data Services | Family tree metadata service | Provides family tree metadata
FTGIS Function Services | Full-text family tree retrieval service | Retrieves family tree information including individuals, families, clans, and surnames
FTGIS Function Services | Family tree data transformation service | Converts time data to different record formats automatically
FTGIS Function Services | Place name encoding service | Supports encoding of 6,000,000 modern place names and 100,000 ancient place names as well as fuzzy encoding
FTGIS Function Services | Time encoding service | Supports converting time expressed in different ways with precision of Chinese traditional time as day
FTGIS Function Services | Spatial-temporal analysis service | Supports spatial-temporal analysis of family tree information
FTGIS Function Services | Spatial-temporal spectrum visualization service | Tree structure visualization. Web map visualization
FTGIS Function Services | Statistical analysis service | Provides online statistical charts and graphs
Fig. 3.17 Creating and editing individual information online
Fig. 3.18 Editing time and place information
Although this study proposes the FTGIS and achieves many of its goals, some
limitations and issues remain to be addressed. In the future, some improvements will
be made on the FTGIS platform, case studies will be strengthened, and services for
historical scholars who use family trees as data sources will be provided. From the
perspective of historical GIS, some ideas for further research are suggested below.
(1) Building a novel and unified spatial-temporal framework and then applying it
to a data platform designed for spatial-temporal analysis and visual expression
as part of a historical humanities knowledge system.
(2) Designing a GIS data model and fundamental historical GIS software based on
time, sites, individuals, events, and scenes.
(3) Exploring the thematic mapping method and spatial-temporal analysis methods
as well as mining potential information and patterns.
Fig. 3.19 Querying past and present place names
Fig. 3.20 Visualizing tree structure
Fig. 3.21 Online statistical charts and graphs
References
Academia Sinica. (2002). Chinese Civilization in Time and Space. Retrieved May 5, 2019, from
http://ccts.ascc.net.
Billy, K. L. (2011). GIS Database of Cotton Textile Industry of the Greater Songjiang Region from
the Late Ming to the mid-Qing. Retrieved May 5, 2019, from
http://www.iseis.cuhk.edu.hk/songjiang/.
Devantier, B. A., & Feldman, A. D. (1993). Review of GIS applications in hydrologic modeling.
Journal of Water Resources Planning and Management, 119(2), 246–261.
Degbelo, A., Granell, C., Trilles, S., et al. (2016). Opening up smart cities: Citizen-centric challenges
and opportunities from GIScience. ISPRS International Journal of Geo-Information, 5(2), 16.
Feng, Y. R. (2011). The designation of Family Tree Metadata Specification and its implementation
by XML. Dissertation, Nanjing Normal University.
GEDCOM Team. (1996). The GEDCOM standard release 5.5. Family and Church History
Department, The Church of Jesus Christ of Latter-day Saints.
GEDCOM Team. (2001). The GEDCOM standard release 6.0. Family and Church History
Department, The Church of Jesus Christ of Latter-day Saints.
Ge, J. X. (1996). The value and limitation of genealogy as historical article. History Teaching and
Research, 6, 3–6.
Goodchild, M. F. (1993). The state of GIS for environmental problem-solving. Environmental
modeling with GIS, 8–15.
Goodchild, M. F. (2008). Virtual geographic environments as collective constructions. Acta
Geodaetica et Cartographica Sinica, 31(1), 1–6.
Goodchild, M. F. (2009). Geographic information system. Encyclopedia of Database Systems
(pp. 1231–1236). Boston, MA: Springer.
Goodchild, M. F., & Janelle, D. G. (2010). Toward critical spatial thinking in the social sciences
and humanities. GeoJournal, 75(1), 3–13.
Goodchild, M. F. (2018). Reimagining the history of GIS. Annals of GIS, 1–8.
Harris, T. M., & Elmes, G. A. (1993). The application of GIS in urban and regional planning: a
review of the North American experience. Applied Geography, 13(1), 9–27.
Harris, T. (2009). Conceptualizing the spatial humanities and humanities GIS. In Keynote
presentation at the GIS in the humanities and social sciences international conference.
Harvard CGA. (2001). China Historical GIS. Retrieved May 5, 2019, from http://sites.fas.harvard.
edu/~chgis/.
Higgs, G. (2004). A literature review of the use of GIS-based measures of access to health care
services. Health Services and Outcomes Research Methodology, 5(2), 119–139.
Hu, D., Lü, G. N., Wen, Y. N., et al. (2010). GIS-based family tree information sharing and
service. International Conference on Geoinformatics. IEEE.
Hu, D., Lü, G. N., Wen, Y. N., et al. (2011). GIS-based family tree system integration. In
International Conference on Spatial Data Mining and Geographical Knowledge Services. IEEE.
Lü, G. N., Chen, M., Wen, Y. N., et al. (2009). Research on constructing the Family Tree GIS.
In Proceedings of the 1st Spatially Integrated Humanities and Social Science Forum, Hong Kong,
China, 114–127.
Liao, H. M., & Fan, I. C. (2012). Chinese civilization in time and space: The design and application
of China historical geographic information system. E-science Technology & Application, 3(4),
17–27.
Nykiforuk, C. I., & Flaman, L. M. (2011). Geographic information systems (GIS) for health
promotion and public health: a review. Health Promotion Practice, 12(1), 63–73.
Rumsey, A. S. (2009). Scholarly communication institute 7: spatial technologies and the human-
ities, a conference hosted by the Scholarly Communication Institute, University of Virginia,
Charlottesville, VA: June 28–30.
Roche, S. (2014). Geographic Information Science I: Why does a smart city need to be spatially
enabled? Progress in Human Geography, 38(5), 703–711.
Shanghai Library. (2005). Genealogy Description Metadata Specification. 2003 DEA4T035:
CDLS-S05-015. Shanghai: Shanghai Library.
Wang, H. M. (2006). The value and abuse of genealogy. Shanghai Education Research, 6, 63.
White, R. (2010). What is spatial history. Spatial History Lab: Working paper [online]. Retrieved
May 5, 2019, from http://www.stanford.edu/group/spatialhistory/cgi-bin/site/pub.php.
Chapter 4
GIS for Chinese History Research
Yifan Lu and Ping Zhang
Geographic Information Systems (GIS) have supported historical research worldwide
since the 1980s and have been widely applied in Chinese history research since the
early 21st century. Recently, with the rapid development of information technology,
the use of GIS has led to many breakthroughs in historical studies, especially in
research on environmental change, rivers and geomorphology, climate change, water
conservancy projects, rural settlements, urban growth, the spread of diseases and
old maps. In addition, some previously unsolved problems have been tackled with GIS
technology, and historians are therefore paying more attention to developing
Geographic Information Systems and Science as a new approach for making significant
progress in history. On this basis, this paper reviews how GIS has been introduced
into research in China studies and summarizes, in the following six parts, the
changes and advances that this technology has brought to traditional Chinese
history research.
Y. Lu
College of Foreign Languages, Capital Normal University, Beijing 100089, China
e-mail: yifan.lu0607@outlook.com
P. Zhang (✉)
School of History, Capital Normal University, Beijing 100089, China
e-mail: zhangping029@126.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_4
4.1 The Construction of Typical Geographic Information
Systems for China Study
In recent years, several representative Geographic Information Systems have been
established, such as the Chinese Historical Geographic Information System (CHGIS),
which is developed cooperatively by Harvard University and Fudan University,1
the Chinese Civilization in Time and Space (CCTS), which is built by “Academia
Sinica”,2 A Historical GIS Dataset of Urban Cultures in Republican Beijing, which is
constructed by the Institute of Space and Earth Information Science of The Chinese
University of Hong Kong,3 and the Silk Road Historical Geography Information
Platform, which is developed collaboratively by the Center for Historical Geography of
Capital Normal University and the General Publishing House of Shaanxi Normal
University.
These databases represent the major progress that GIS technology has achieved in
Chinese history studies in recent years. In these datasets, changing boundaries are
linked to various statistical information, and historical administrative maps have
been enriched with a vast amount of data on historical events and geographical
elements. Moreover, they offer a dynamic method of expression for historical
maps and textual descriptions, so that spatial analysis and map expression have been
greatly improved in social science. In addition, these datasets serve as platforms
for querying, integrating and exchanging information, which promotes cooperation
among historians.
In 2001, the Chinese Historical GIS (CHGIS) was developed for the purpose of
establishing a comprehensive geographic dataset for exploring the spatial patterns of
the past and for further study of history. On the basis of historical mapping and
statistics showing the administrative divisions in each period of history, this dataset
goes beyond simple mapping to more complex electronic visualizations. It reflects
the continuous changes of historical divisions and place names. In addition, large
amounts of relevant information are displayed, such as the changing boundaries of
districts and administrative divisions. Therefore, CHGIS provides researchers with
many more functions than electronic maps. By combining mapping with the time
dimension, it gives users easy access to data querying and acquisition, timeline data
and statistics, information retrieval tools and spatial analysis. The platform is
intended to cover the full period from 221 BC, when the Qin Dynasty was established,
to 1911, when the Qing Dynasty fell, with changes in administrative regions recorded
to the year.4 Most of the digital maps have now been published and can be downloaded
from the website.
1 http://yugong.fudan.edu.cn/views/chgis_index.php?list=Y&tpid=700.
2 http://ccts.ascc.net/intro.php?lang=zh-tw.
3 http://www.iseis.cuhk.edu.hk/history/beijing/index.htm.
4 https://sites.fas.harvard.edu/~chgis/.
Another national historical GIS system, Chinese Civilization in Time and Space
(CCTS), provides a platform with precise spatial positioning, data querying and
acquisition, and integrated temporal and spatial attributes that experts can use for
further spatial analysis. First released in 2002, CCTS offers rich basic historical
geographic data and thematic data drawn from various old map resources. Based on the
Historical Atlas of China edited by Tan Qixiang, which provides maps of each dynasty,
and the 1930 Shenbao Map of China edited by Ding Wenjiang, CCTS collects integrated
historical maps covering more than 2000 years and organizes them by dynasty.
Furthermore, Academia Sinica has linked a large number of thematic data resources to
the historical maps, such as the digital literature system of Chinese works, the
database of grain prices in the Qing Dynasty developed by the Institute of Modern
History, and the union catalog database of Ming and Qing chorographies. Using these
data resources and the ArcChina base map, drawn in the 1990s at a scale of
1:1,000,000, the system converts traditional paper maps into a new form of
visualization, so that the spatial relations among historical elements are shown on
the electronic maps. Users obtain the information they search for together with the
spatial relations of the search results. Unresolved problems in historical studies can
thus be discussed from a new perspective, and new questions can be put forward.
A Historical GIS Dataset of Urban Cultures in Republican Beijing, developed by
The Chinese University of Hong Kong, aims to examine the spatial patterns of
modern urban cultural change in China through historical geographic information
system and science. The project takes as its object of study the city of Beijing from
the advent of the Republican era in 1912 to 1937. To present urban cultural change,
data on six cultural spheres (urban morphology, market culture, education culture,
public health and medical culture, legal culture and religious culture) have been added
to the GIS for spatial analysis and data comparison, so as to explore their
implications. Users can also browse the digital maps online.
The Silk Road Historical Geography Information Platform, developed jointly by
the Historical Geography Center of Capital Normal University and the General
Publishing House of Shaanxi Normal University, was launched in June 2017.5 The
system aims to explore the major changes in the natural environment and cultural
landscapes along the Silk Road. It takes several elements that have persisted for over
2000 years as representative objects whose changes can be observed, such as the
eco-environment, heritage sites, ethnic groups and religions, traffic and trade, and
cultural transmission. At present the platform holds rich thematic information,
including 300,000 records of place names along the Silk Road. By extracting these
data, experts can explore the spatial patterns of the past, investigate geographic
entities and phenomena in both the spatial and temporal dimensions, and analyze the
spatiotemporal relations of geographic entities on the platform. The system makes up
for the shortcomings of historical maps, which are weak in expressing social, economic
and cultural factors and spatiotemporal phenomena, and focuses on a dynamic
expression of the evolution of thematic history in particular fields, such as traffic,
ethnic groups, regimes and cities from the 2nd century BC to the early 20th century,
as well as hydrology and river changes over the past 300 years. Through a
three-dimensional time-place-event visualization of this thematic information, the
subject-based modeling method offers users new ways and perspectives for
understanding past phenomena in their own studies. In addition, several analysis tools
are available online, including 3D analysis, kernel analysis, buffer analysis and
tracking analysis.
5http://www.srhgis.com/homePage.
4.2 The Research Regarding Climate, Rivers, Hydrology
and Geomorphology Through the Application
of the GIS Technology
Research on historical climate, rivers, hydrology and landforms belongs to historical
geography. Because accurate and suitable geographical data were lacking,
quantitative analysis was rarely used in the past. With the introduction of
Geographic Information System and Science, however, a variety of analytical methods
and rich integrated information have advanced historical geographic research on
numerous issues.
4.2.1 The Historical Climate Research with the GIS
Some historians have tried to use GIS technology to study historical climate.
Man (2000) combined textual historical documents with GIS technology to examine
the severe drought of 1877. The historical records most readily available to the
researcher were the official documents on the drought of 1877 submitted by the
governors of Shanxi and Hebei Province.
GIS technology provides a new way for researchers to extract data from written
texts and to reconfigure them spatially, in order to analyze spatial relations among
historical geographic factors that cannot be found directly in the documents. By
organizing the existing data on disaster-affected villages, a uniform drought index
was obtained from the number of villages struck by drought of different degrees in
each division recorded in the written documents. The Kriging interpolation method
was then applied to the drought index to obtain optimal estimates for villages that
are not mentioned in the historical records but were still hit by the disaster, so as
to make up for the missing data in the disaster areas.
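The two steps described above, deriving an index from village counts and then
estimating values at unrecorded locations, can be sketched as follows. This is a
minimal illustration under stated assumptions, not the procedure used by Man (2000):
the severity weights and the index formula are hypothetical, and inverse-distance
weighting (IDW) is used here as a simpler stand-in for Kriging, which additionally fits
a variogram to model spatial autocorrelation.

```python
import math

# Hypothetical severity weights; the source does not state the actual grading scheme.
SEVERITY_WEIGHTS = {"light": 1, "moderate": 2, "severe": 3}


def drought_index(villages_by_grade, total_villages):
    """Weighted share of affected villages in one division, scaled to the range 0..1."""
    if total_villages <= 0:
        raise ValueError("total_villages must be positive")
    weighted = sum(SEVERITY_WEIGHTS[grade] * count
                   for grade, count in villages_by_grade.items())
    return weighted / (max(SEVERITY_WEIGHTS.values()) * total_villages)


def idw(known, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (xi, yi, value) samples."""
    if not known:
        raise ValueError("at least one known sample is required")
    numerator = denominator = 0.0
    for xi, yi, value in known:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the query point coincides with a sample
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    # Two invented divisions at invented coordinates.
    known = [
        (0.0, 0.0, drought_index({"severe": 30, "moderate": 10}, 50)),
        (10.0, 0.0, drought_index({"light": 20}, 40)),
    ]
    print(idw(known, 5.0, 0.0))
```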
On the basis of this method, Zhimin Man drew a thematic distribution map of
the drought in 1877, which presents directly and continuously the spatial differences
in drought severity across Shanxi and Hebei Province. By showing the drought grades
of different places, the map allowed the author to identify the locations of three
drought centers as well as their respective durations. Other thematic maps, such as
the boundary map of the different divisions, were also produced, which helped to
locate the drought centers directly and accurately.
Furthermore, from the disaster intensity index and its movement in space and
time, he inferred the movement of the rainfall zone and the summer monsoon in
North China in that year. In addition, the study confirmed that the severe drought of
1877 in northern China was affected by the strong worldwide ENSO event: as the
monsoon rain weakened in Asia, the course and character of the rainfall changed.
More recently, researchers such as Wei Pan and Zhimin Man have continued to use
this method. By making greater use of GIS technology and building datasets, they
have explored similar problems, such as the frequency of floods and droughts and
the changes in runoff volume along the Yellow River (Pan 2011, 2013) and on the
Loess Plateau, have illustrated the relationships among these disasters, landscape
change and the movement of the summer monsoon (Liu and Pan 2014), and have
obtained abundant results (Pan 2014).
4.2.2 The Research of Rivers and Hydrology in History
Through the GIS
Introducing Historical GIS into historical studies is of great help in creating new
insight into the geographies of the past and in reconstructing vanished landscapes
and rivers. Using the high-resolution topographic data generated by NASA’s Shuttle
Radar Topography Mission (SRTM), which aimed to obtain a digital elevation model
on a near-global scale, Zhimin Man (2006) identified several courses of the Yellow
River in different periods from remote-sensing images. By referring to records in
historical texts and comparing ancient and modern place names, the author confirmed
one of the ancient courses of the Yellow River, in use from the Eastern Han Dynasty
to 1034 A.D., which flowed through Henlong and then into the sea in present-day
Shandong. This course was the Jingdonggudao, whose name is recorded in historical
documents but whose route and direction had never been mapped before. Drawing the
ancient courses of the Yellow River precisely on geomorphological maps has thus
become a breakthrough in the study of historical fluvial geomorphology.
4.2.3 The Geomorphology and the Research
of Environmental Changes Through the GIS
Historical GIS extends the expression of geographic patterns to their processes of
change, and helps historians analyze the interactions between historical phenomena
and geographical elements.
Some researchers, such as Deng Hui, have used digital spatial simulation, an
extension of GIS technology, to study the process of desertification in the Mu Us
Desert (Deng 2007), mapping the amount of reclamation and the land-use pattern in
the Mu Us area in the Ming and Qing Dynasties, especially the amount of garrison
reclamation in the Yulin area in the Ming Dynasty. Digital maps thus make it possible
to visualize the changes in the southern edge of the desert (Wu 2014). Together with
the historical documents, they support the conclusion that garrison reclamation did
not cause the desert to expand southward, since the southern edge of the Mu Us Desert
in the Ming Dynasty was nearly the same as it is today. In addition, the land-use
pattern of northern Shaanxi in the Qing Dynasty, with grain farming in the south and
grazing in the north, had a positive effect on the ecological system, and this pattern
remains in effect today (Shu 2016).
Environmental change in the Yangtze River Delta is an emerging field of research.
A large number of historical documents show that the main cause of environmental
change in the Jiangnan region is the change in river networks brought about by the
expansion of polders and towns. Using several large-scale maps and charts,
researchers such as Zhimin Man and Wei Pan described the erosion and deposition
intensity of the south branch of the Yangtze Estuary from 1861 to 1953 (Pan 2009)
and the channel density and its changes in the Qingpu area of Shanghai from 1918
to 1978 (Pan 2010).
In these two studies, grid systems were constructed with GIS technology to
reorganize the data on channel density and river length recorded in the 1918 and
1978 military maps. With this method, the map data can be extracted and assigned to
the grid cells in order to compare densities across equal-area units of the delta or
channel network. These comparisons make it possible to analyze the position of
deltas, the length of rivers, the area of river networks and their changes in different
periods, and furthermore to estimate the impact of the river networks on the
environment of the whole Yangtze River Delta.
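The grid method described above can be sketched roughly as follows, assuming that
the channels of each map have already been digitized as polylines in a planar
coordinate system. Each segment is cut into short pieces and each piece is credited to
the cell that contains its midpoint, which approximates the channel length per cell;
dividing by the cell area gives a density that can be compared between two dates. The
function names, the cell size and the coordinates are hypothetical, and the published
studies may have computed density differently.

```python
import math
from collections import defaultdict


def channel_length_by_cell(polylines, cell_size, step):
    """Approximate channel length per grid cell.

    Every segment is cut into pieces no longer than `step`; each piece is
    credited to the cell that contains the midpoint of the piece.
    """
    lengths = defaultdict(float)
    for line in polylines:
        for (x0, y0), (x1, y1) in zip(line, line[1:]):
            segment_length = math.hypot(x1 - x0, y1 - y0)
            if segment_length == 0.0:
                continue
            pieces = max(1, math.ceil(segment_length / step))
            piece_length = segment_length / pieces
            for i in range(pieces):
                t = (i + 0.5) / pieces
                mid_x = x0 + t * (x1 - x0)
                mid_y = y0 + t * (y1 - y0)
                cell = (math.floor(mid_x / cell_size), math.floor(mid_y / cell_size))
                lengths[cell] += piece_length
    return dict(lengths)


def density_change(lengths_early, lengths_late, cell_size):
    """Change in channel density (length per unit area) for every cell seen at either date."""
    area = cell_size ** 2
    cells = set(lengths_early) | set(lengths_late)
    return {cell: (lengths_late.get(cell, 0.0) - lengths_early.get(cell, 0.0)) / area
            for cell in cells}


if __name__ == "__main__":
    # Invented channels: one channel is shorter at the later date.
    early = channel_length_by_cell([[(0.0, 0.5), (2.0, 0.5)]], cell_size=1.0, step=0.25)
    late = channel_length_by_cell([[(0.0, 0.5), (1.0, 0.5)]], cell_size=1.0, step=0.25)
    print(density_change(early, late, cell_size=1.0))
```

A smaller step gives a closer approximation near cell boundaries at the cost of more
iterations; an exact alternative would clip each segment against the cell edges.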
4.3 The Research of Towns and Villages in History
via the Application of the GIS
4.3.1 The Urban History and the Research of Urban
Historical Geography Under the GIS Platforms
In the 1990s, researchers represented by Li Xiaocong and Wu Honglin began to read
remote-sensing aerial photographs as a supplementary source for the study of urban
history. From these photographs and historical records, they traced the relocation
of three cities along the Yangtze River since the Ming Dynasty, namely Jiujiang,
Anqing and Wuhu. Furthermore, by showing the geomorphic conditions and urban forms
of these cities and their changes, they clarified the relations between the spatial
expansion of the urban areas since the Ming Dynasty and the changes in the course
of the Yangtze River (Li 1992).
By examining and comparing large-scale maps of Shanghai drawn between 1855 and
1990, Xiaohong Zhang (2013) presented the spatial patterns of the city and their
changes, so as to observe the relocation of different cultural spheres in Shanghai.
GIS analysis shows how the modern spatial pattern of Shanghai took shape and how
administration developed in the concessions and in the rest of the city. The
conclusions of the author were then confirmed with the help of historical documents.
Historical GIS offers a way of expressing the spatiotemporal change of historical
geographic information, so that a large number of intuitive visual images of the
process of change can be obtained for dynamic demonstration and exploration. This
makes up for the limitations of relying on textual description alone.
Wu (2008) focused on the issue of river reclamation. GIS technology makes it
possible to show the evolution of Shanghai and, with historical records as a
supplement, to see from various angles how large areas of farmland were transformed
into an urban road system.
Drawing on information about roads, Chen (2010a) compared the urban landscape of
Shanghai before and after the establishment of the British concession by analyzing
documents on the transfer of land usufruct. Using GIS technology, Mou (2012)
likewise presented the transformation of the landscape of the French concession of
Shanghai from polders to downtown.
More recently, using city maps from the late Qing Dynasty and the Republic of China
era, researchers have analyzed the social geography of cities (Wang and Zhu 1999),
class divisions and the spatiotemporal features of urban crime, and have reached a
range of conclusions about the adjustment of the internal forms and spaces of cities
(Zhang and Sun 2011).
4.3.2 The Research of Town Economy of Jiangnan Region
in Ming and Qing Dynasty by the GIS
Researchers represented by I-chun Fan have published numerous works on the
economic development of the Yangtze River Delta in the Ming and Qing Dynasties.
I-chun Fan marked all towns in the Taihu area in different periods of the Ming and
Qing Dynasties on a digital map at a scale of 1:50,000, and studied the relations
between the rise and decline of towns of different levels and the development of the
whole Delta region with the help of GIS statistical analysis. These maps suggest
that, except for a few large towns that maintained sustained growth, most towns in
this area grew and declined unsteadily during the Ming and Qing Dynasties. In the
opinion of the author, the increase in the number of towns over the past 600 years
has various causes, the most important of which is not the development of a
capitalist economy or urbanization but regional development (Fan 2002, 2004).
4.3.3 The Research on the Rural Settlements in History
Through the GIS
Historical GIS advances historiography by supporting revisionist studies that
challenge existing opinions. I-chun Fan (2008) digitized two valuable sets of village
maps from the Hebei area compiled in the late Qing Dynasty, those of Qingxian and
Shenzhou, and then plotted the villages on large-scale GIS maps. Using spatial layer
analysis, several factors mentioned in the historical records were classified and
overlaid for data integration, producing a new layer that combines the selected
factors. From this layer the author derived a number of characteristics of these
northern Chinese villages, including the patterns of land allocation, settlement,
population, education, elites, religion and markets. On the basis of this
comprehensive GIS analysis, the author questioned the traditional view that villages
in northern China in the late Qing Dynasty were usually distributed according to the
location of periodic markets. Comparing the maps of town markets with those of the
villages, in combination with charts of population density extracted from the layer
analysis, shows that villages and population are linked through various types of
markets. It thus becomes possible to say more about the internal logic among
markets, villages and population size in northern China.
4.4 The Research of the Economy and Society in History
via the GIS
Irrigation has been a hotly debated issue in social history in recent years.
Li (2012) discussed the history of conflicts over water resources from 1763 to
1945 in an irrigated area named Houcunzhen, which extends from the Danshui River,
in the west of the Taipei Basin, to the west bank of the Dahanxi River. The author
marked the positions of the irrigated area and of the irrigation canals in the
farmland in different periods on a digital map, which makes it possible to find the
connection between the distribution of canal intakes and the stream segments in each
period. On this basis, the locations of other factors, such as places of residence,
ancestral homes, and the intakes and pump stations of the water conservancy project
along the Daxi River, were also marked on the maps and then overlaid on the terrain
imagery of Google Earth.
In this way, the author analyzed the characteristics and interrelations of the
locations of these factors, and the relations between these locations and the areas
in which water conflicts took place. He further discussed the differences between
traditional water development and modern water conservancy projects. In the opinion
of the author, traditional water facilities were very crude. As an irrigated area
located at the end of the waterway, Houcunzhen received the least water in the area,
so that disputes over water among villages were most severe in the dry season. The
method of water supply and the direction of water flow also contributed to the
conflicts. In modern times, however, large water conservancy facilities came into
use. The government, as the provider of these projects and the protector of the
irrigated areas, became the focal point of the water conflicts, and the water dispute
consequently turned into a problem between the authorities and the residents.
4.5 The Research of Ancient Maps and Their Digitization
by the GIS Technology
In 2004, researchers from Taiwan, including Jinn-Guey Lay, investigated the Taiwan
Qian-Hou-Shan Map by means of spatial analysis (Lay 2004). The Taiwan Qian-Hou-Shan
Map, released in 1878, is one of the most important old maps of that era. Drawn on
the basis of a scientific survey, it carries latitude and longitude coordinates, which
came into use in the later years of the Qing Dynasty.
The researchers scanned this old map, georeferenced and digitized it, carried out a
coordinate transformation of the spatial data, and overlaid it on the modern map of
Taiwan province, so as to reconstruct a unified spatiotemporal framework containing
the historical elements. Old maps had seldom been analyzed in this way, because
researchers lacked tools for converting them into digital maps that conform to
modern mapping standards. With the aid of GIS technology, different types of maps
can now be transformed into GIS maps under the same standard for quantitative
spatial analysis, which makes it possible to assess the spatial cognition of people
in the past, to investigate the quality of maps, to extract useful information and to
provide new kinds of data and models in support of historical research. Lay (2004)
thus developed a new approach to quantitative geometric analysis based on a
geographic information system, and concluded that historical spatial cognition can
be effectively interpreted and that this interpretation can enhance research in
historical geography.
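One common way to carry out such a coordinate transformation and to quantify the
geometric quality of an old map is to fit a transformation to control points that can be
identified on both the old and the modern map, and then to examine the residuals. The
sketch below fits a least-squares two-dimensional similarity (Helmert) transformation
and reports the root-mean-square error of the residuals. It is only an illustration:
the source does not state which transformation Lay (2004) used, and the function
names and control points are hypothetical.

```python
import math


def fit_similarity(src, dst):
    """Least-squares 2D similarity transform mapping src points onto dst points.

    The model is u = a*x - b*y + tx and v = b*x + a*y + ty.
    Returns the parameters (a, b, tx, ty).
    """
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two matching control points")
    mean_x = sum(x for x, _ in src) / n
    mean_y = sum(y for _, y in src) / n
    mean_u = sum(u for u, _ in dst) / n
    mean_v = sum(v for _, v in dst) / n
    spread = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in src)
    if spread == 0.0:
        raise ValueError("control points must not all coincide")
    a = sum((x - mean_x) * (u - mean_u) + (y - mean_y) * (v - mean_v)
            for (x, y), (u, v) in zip(src, dst)) / spread
    b = sum((x - mean_x) * (v - mean_v) - (y - mean_y) * (u - mean_u)
            for (x, y), (u, v) in zip(src, dst)) / spread
    tx = mean_u - (a * mean_x - b * mean_y)
    ty = mean_v - (b * mean_x + a * mean_y)
    return a, b, tx, ty


def apply_similarity(params, point):
    """Transform one (x, y) point with the fitted parameters."""
    a, b, tx, ty = params
    x, y = point
    return (a * x - b * y + tx, b * x + a * y + ty)


def rmse(params, src, dst):
    """Root-mean-square distance between transformed src points and dst points."""
    squared = []
    for point, (u, v) in zip(src, dst):
        px, py = apply_similarity(params, point)
        squared.append((px - u) ** 2 + (py - v) ** 2)
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Invented control points: positions on the scanned map and on the modern map.
    old_map = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    modern = [(10.0, 20.0), (10.0, 22.0), (8.0, 20.0)]
    params = fit_similarity(old_map, modern)
    print(params, rmse(params, old_map, modern))
```

A large residual at a particular control point indicates local distortion in the old
map, which is one way to express map quality in quantitative terms.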
Researchers have since extended this work to ancient unified maps, town maps and
cadastral maps. Moreover, a large number of maps, such as the surveyed maps of the
Qing Dynasty at scales of 1:50,000 or 1:100,000, have been digitized and used for
more in-depth research.
4.6 The Exploratory Research for the Methodology
for Digitizing the Historical Geographic Information
Digitized information in historical geography covers research topics including
administrative divisions, population, economy, land use, ethnic groups, religion and
culture. The information is generally extracted from traditional Chinese records and
documents. Most of these sources, however, are qualitative descriptions and far from
systematic. How to transform such scattered information into usable geographic
information has therefore become an important methodological problem for
historians. A series of new approaches has been tried, and some methods have proved
useful (Jiang 2015).
GIS technology makes it possible for historians to make full and simultaneous use
of the three components of digital maps (space, attributes and time) and to analyze
them in combination with other historical elements.
The grid system put forward by Zhimin Man is a method that deserves particular
recommendation. The grid system is one of the principal methods for standardizing
spatial data in geography. It divides the geographical surface into grid cells of a
size chosen as needed, so that researchers obtain the quantity and density of given
factors within equal areas. They can then compare the densities of each factor across
these equal-area cells and evaluate the degree and efficiency of land use in
different areas.
Zhimin Man has given an example. Researchers who need to analyze land use and its
changes in Shanghai over a period of time must compare changes in three sets of
data: data reflecting the features of the hydrographic network, the spatial changes
of settlements, and the changes of the city. These data are generally classified as
river networks, settlements, and urban districts and blocks. In a GIS spatial
database, however, the three sets are represented separately as points, lines and
polygons, three types of representation that were previously difficult to show on
the same chart. A key problem is therefore how to compare these data on one chart
using the same standard, which is also regarded as a problem of data standardization.
In the opinion of Man (2008), the grid system has the advantage of being able to
hold and standardize different types of data. By transforming the various types of
historical data and records spatially, it becomes possible to present them on the
same plane within the grid. It is therefore convenient to present the spatial patterns
of land cover, and also the human-land relationship within a small area. In brief,
the grid system addresses the problem of accurate calculation in regional studies, so
that researchers can develop quantitative spatial analyses of the ways and processes
by which humans use resources. The accuracy of research on the human-land
relationship has also been greatly improved.
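The idea of standardizing different geometry types on one grid can be sketched as
follows. Settlements (points) are counted per cell, and urban blocks (polygons) are
assigned to the cells whose centers fall inside them, so that both themes end up as
attributes of the same cells and can be compared directly. This is a minimal
illustration with hypothetical names and data, not the implementation described by
Man (2008); the cell-center test is a coarse approximation of polygon coverage.

```python
import math
from collections import defaultdict


def cell_of(x, y, cell_size):
    """Index of the grid cell that contains the point (x, y)."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def count_points(points, cell_size):
    """Number of points (for example settlements) in each cell."""
    counts = defaultdict(int)
    for x, y in points:
        counts[cell_of(x, y, cell_size)] += 1
    return dict(counts)


def point_in_polygon(x, y, polygon):
    """Ray-casting test for a simple polygon given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def cells_covered(polygon, cell_size):
    """Cells whose center lies inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    cx0, cy0 = cell_of(min(xs), min(ys), cell_size)
    cx1, cy1 = cell_of(max(xs), max(ys), cell_size)
    covered = set()
    for cx in range(cx0, cx1 + 1):
        for cy in range(cy0, cy1 + 1):
            center_x = (cx + 0.5) * cell_size
            center_y = (cy + 0.5) * cell_size
            if point_in_polygon(center_x, center_y, polygon):
                covered.add((cx, cy))
    return covered


def grid_table(settlements, urban_blocks, cell_size):
    """One record per cell, holding both themes on the same spatial unit."""
    counts = count_points(settlements, cell_size)
    urban = set()
    for block in urban_blocks:
        urban |= cells_covered(block, cell_size)
    return {cell: {"settlements": counts.get(cell, 0), "urban": cell in urban}
            for cell in set(counts) | urban}


if __name__ == "__main__":
    # Invented settlements and one invented urban block.
    settlements = [(0.2, 0.3), (0.8, 0.1), (2.5, 0.5)]
    blocks = [[(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]]
    for cell, record in sorted(grid_table(settlements, blocks, 1.0).items()):
        print(cell, record)
```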
GIS technology can also be applied to many other aspects of Chinese history, such
as the positioning of archaeological sites by Nishimura (2016), the manuscripts
from the Dunhuang Grottoes and their geographic information studied by Rong (2016),
historical population studies by Lu (2012, 2014, 2015), historical disease studies
by Gong (1993, 2014), disaster research by Kong (2017), research on religion by
Chen (2010b), and research on land use and cadastral management by Zhao (2005).
Geographic Information System and Science has become an important analytical
instrument for Chinese history research and is widely used in archaeology,
historical geography and regional social history. Researchers in these fields have
persistently promoted the use of GIS technology in their studies, and GIS has now
been applied to research on Chinese history for over 20 years. As a tool for
mapping, querying, analyzing and researching, it has already achieved initial
successes. In our view, the more complicated a historical issue is, the more GIS
technology can contribute to solving it. With the continuous development of
historical spatial databases, GIS technology will play an increasingly important
role in historical research in the future.
References
Chen, L. (2010a). The changes of urban and rural landscapes in modern time of Shanghai
(1843–1863): According to the analysis of the data of the road system in Shanghai. The
Doctoral Thesis of Fudan University.
Chen, Q. (2010b). Landlord, religious organization and dispersion of lower Danshui tribe in Wandan
Region of Pingtung Plains (1720–1900). Research of Taiwan History, 3, 1–37.
Deng, H. et al. (2007). The changes of the south side of the Mu Us Desert since Ming Dynasty.
Chinese Science Bulletin, 52(21), 2556–2563.
Fan, I.-C. (2002). The nature of the expansion of the Jiangnan market towns in the Ming-Qing
dynasty. Institute of History and Philology Bulletin of Academia Sinica, 73(3), 443.
Fan, I.-C. (2004). Market towns and regional development to the east of Lake Tai during the
mid-Ming dynasty. Institute of History and Philology Bulletin of Academia Sinica, 75(1), 149–221.
Fan, I.-C. (2008). Local society in late Qing Hebei as seen in village maps from two counties.
Journal of New History, 19(1), 51–104.
Gong, S. (1993). A preliminary study on variations of the distribution of zhang-disease for the past
2000 years in China. Acta Geographica Sinica, 48(4).
Gong, S. et al. (2014). A geographic study of epidemic disasters of Jiangnan area in Ming Dynasty
(1638–1644). Geographical Research, 33(8), 1569–1578.
Jiang, W. (2015). Research on urban population of Jiangnan in the Republic of China: Based on GIS
and the data of topographic maps. Researches in Chinese Economic History, 4, 39–56.
Kong, D. et al. (2017). Spatial-temporal characteristics and environmental background of locust
plague in Beijing-Tianjin-Hebei region during Ming and Qing Dynasties. Journal of
Palaeogeography, 19(2), 383–392.
Lay, J.-G. et al. (2004). Quantitative analysis of 1878 Taiwan Qian-Hou-Shan Map. Symposium of
the 1st Seminar of the Toponymy in Taiwan, 253–271.
Li, C.-Y. (2012). Inside-out: The historical changes of the dispute of water conservancy in the
irrigated area of Houcunzhen. Journal of Baisha Historical Geography, 2, 65–169.
Li, X. (1992). Using the remote sensing images for urban historical geography research: Taking
example of the relationship between the changes of cultural landscapes and that of river courses
in three cities, Jiujiang, Wuhu, Anqing. Journal of Beijing University, 37–41.
Liu, H., & Pan, W. (2014). The initial time of the rainy season on the Loess Plateau during
1766–1950 and its response to the summer monsoon. Journal of Earth Environment, 5(6),
378–382.
Lu, W. (2012). The distribution of Hui people’s settlements in Shaanxi-Gansu areas and the related
database construction before the Tongzhi reign of the Qing Dynasty. N.W. Journal of Ethnology,
4, Total No. 75, 37–45.
Lu, W. (2014). A research on the small probability events with tiny population base under GIS: Case
study on scale and spatial distribution of Chin-shihs from Hui ethnic group in Qing Dynasty. The
Journal of Hui, 2, 54–61.
Lu, W. (2015). Spatial distribution of Hui Muslim Chin-shihs and population during Ming and Qing
Dynasty. Journal of Beifang University of Nationalities, 2, Total No. 122, 99–105.
Man, Z. (2000). Climatic background of the severe drought in 1877. Fudan Journal (Social Science),
6, 28–35.
Man, Z. (2008). Spacial and temporal data structure for local study. Journal of Chinese Historical
Geography, 23(2), 5–11.
Mou, Z. (2012). From the ancient water town to the eastern Paris: The research on the urban space
changing process in the French concession of modern Shanghai. Shanghai Bookstore Publishing
House, Aug. 2012.
Nishimura, Y., & Kitamoto, A. (2016). Re-identify ancient ruins on the Silk Road and establish ruins
database. Journal of Shaanxi Normal University (Philosophy and Social Sciences Edition), 2,
75–85.
Pan, W. (2009). Reconstruction of erosion-deposition in Yangtze River Estuary south branch and
related problem study, 1861–1953. Journal of Chinese Historical Geography, 24(1), 22–29.
Pan, W. et al. (2010). The grid methods of drainage density data reconstruction in big river
delta: Based on the case of Qingpu, Shanghai, 1918–1978 A.D. Journal of Chinese Historical
Geography, 25(2), 5–14.
Pan, W. et al. (2011). Reconstruction of the precipitation (May–Oct) in the upper and middle reaches
of the Yellow River (1766–1911). Journal of Earth Environment, 6(1), 285–290.
Pan, W. et al. (2013). The study for relationship between PDO and the streamflow of Yongdinghe
River (Lugouqiao) since 1766 AD. Journal of Chinese Historical Geography, 28(1), 127–133.
Pan, W. et al. (2014). The changing of Chinese coastal typhoon frequency based on historical
documents, 1644–1911 AD. Geographical Research, 33(11), 2196–2204.
Rong, X. (2016). A view of the historical-geographic information of the ancient Gaochang in terms
of the unearthed documents in Turpan. Journal of Shaanxi Normal University (Philosophy and
Social Sciences Edition), (1), 12–24.
Shu, S. et al. (2016). The temporal and spatial distribution of settlements in the area along the Great
Wall in Yansui Town during the late Ming Dynasty. Geographical Research, 35(4), 790–802.
Wang, J., & Zhu, G. (1999). A preliminary study of the social geography of Beijing during the late
Qing and the early Republican period. Acta Geographica Sinica, 54(1), 69–76.
Wu, J. (2008). From ancient water town to metropolis: The changes of the urban road system in
modern Shanghai (1843–1863). The Doctoral Thesis of Fudan University.
Wu, C. et al. (2014). The study on the land development process in the border area between Shaanxi
and Inner Mongolia. Geographical Research, 33(8), 1579–1592.
Xiaohong, Z. (2013). The maps of modern cities and the research of the urban spatial pattern in
the British concession area of Shanghai since it opened to foreign traders. Journal of Historical
Geography, 28(2).
Zhang, X., & Sun, T. (2011). Urban space production: Urbanization of Wujiaochang Area in
Jiangwan Town of Shanghai in 1900–1949. Acta Geographica Sinica, 31(10), 1181–1188.
Zhao, Y. (2005). The land utility in the area of Sunwan and its motivations. The Doctoral Thesis of
Fudan University.
Zhimin, M. (2006). The research on the flowing route of Jingdonggudao of Yellow River in the
Northern Song Dynasty. Journal of Historical Geography, 21(2006), 1–9.
Chapter 5
Digital Historical Yellow River
Wei Pan, Rao-rao Su, Zhi-min Man, Li-jie Zhang, Mi-mi He,
and Li-kun Han
5.1 Digital Historical Yellow River
The development of historical geographic information requires new ideas and new
methods. The “Digital Historical Yellow River” (DHY) is a project started in 2017
by Yunnan University and Shaanxi University. It is an example of the “Digital Historical
River” concept, which consists of six aspects: (1) high-precision three-dimensional
micro-geomorphology; (2) a scheme for fusing historical hydraulic engineering with the
terrain model; (3) restoration of the three-dimensional shape of the river channel;
(4) simulation and demonstration of surface-water movement in historical periods;
(5) reconstruction of rainfall characteristics in historical periods; and (6) river-water
management methods in historical periods. As a realization of the “Digital Historical
River” concept, DHY is not merely a visualization of the temporal and spatial changes
of the Yellow River channel in the historical period. It combines a professional
historical data management platform, a special data set, and a series of historical
information analysis and display features (Pan et al. 2012).
The basic components of DHY are a materials database, a 3-D terrain module, a
Yellow River water environment event information management module, a data
management platform, and an analysis-simulation function module. DHY currently
focuses on Yellow River-related information from the Qing Dynasty and the Republican
period. Of these components, we have completed the design and construction of the
database, the Yellow River basic hydrological information database, and the financial
management information database.
W. Pan (✉) · L. Zhang
Institute of Historical Geography, Yunnan University, Kunming 650091, Yunnan, China
e-mail: panwei@ynu.edu.cn
R. Su · M. He · L. Han
Northwest Institute of Historical Environment & Socio-Economic Development,
Shaanxi Normal University, Xian 710119, Shaanxi, China
Z. Man
Center for Historical Geography, Fudan University, Shanghai 200433, China
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_5
The materials database is a historical information platform for the Yellow River
with query, download, online browsing, annotation, and data association functions.
The data are divided into the rivers, the Qing Dynasty river archives, the Republican
archives, and the folk literature. The database can currently hold files in formats
such as DOC, PDF, PPT, EXCEL, and JPG, and we plan to extend the supported data
types so that the library can also manage video and audio files.
The 3-D terrain module simulates terrain at multiple scales in order to show the
direction, speed, and scale of water movement in a virtual environment. The Yellow
River water environment event information management module manages data on
historical hydrological changes, flood disasters, and so on. The data management
platform and the analysis-simulation function module help users observe and analyze
data, whether held in the system or supplied by the users themselves. These two
modules generate various kinds of charts and provide “deep mapping” to users
(Schreibman et al. 2004).
Through the work on DHY, we have made initial attempts at three-dimensional, dynamic
historical hydrological simulation and historical water-scenario simulation within
historical geographic information 2.0. We hope that this work can raise the level of
historical geographic information (Chen 2014; Tu 2014), combine information-processing
methods with actual research, and prepare fertile ground for the development of
historical geographic information, so that this direction can have long-term vitality.
5.2 The Relationship Between Qing Government Finance
and the Yellow River Management
Yellow River management was one of the main political affairs of the Qing Dynasty.
The empire and its emperors needed the Yellow River to remain peaceful, so that the
people along the river could feel the greatness of the emperors. However, such political
affairs required financial power to pay for the huge engineering works (Pan et al. 2020).
Based on DHY, a huge number of files, archives, and old maps can be used as
research materials. The software “Voyant” can be used to analyze the structural
characteristics of historical documents. The analysis showed that silver supply
became the most important theme from 1740 AD onward. Yellow River management
became a financial problem in about the 1740s and remained one until the collapse of
the Qing Dynasty. Government officers had to provide large amounts of silver for the
projects and engineering works carried out along the river. Figure 5.1 shows the scale
of silver spending in different counties along the lower reaches of the Yellow River in
the 1750s. It is only an example; in fact, throughout the Qing Dynasty the distribution
of silver spent on the Yellow River changed every year.
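The kind of term trend that a text-analysis tool reports can be illustrated with a few lines of Python. This is only a minimal sketch and not the authors' workflow or Voyant's interface: the function name `term_share_by_decade`, the whitespace tokenization, and the sample snippets are all hypothetical, and real Qing documents would first need transcription and word segmentation.

```python
from collections import defaultdict


def term_share_by_decade(docs, term):
    """Return {decade: share of tokens equal to `term`} for dated documents.

    docs: iterable of (year, text) pairs; text is split on whitespace.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in docs:
        decade = (year // 10) * 10
        tokens = text.lower().split()
        totals[decade] += len(tokens)
        hits[decade] += sum(1 for token in tokens if token == term)
    # Skip decades without any tokens to avoid dividing by zero.
    return {d: hits[d] / totals[d] for d in sorted(totals) if totals[d]}


# Invented snippets standing in for translated memorials (not real data).
sample_docs = [
    (1725, "dyke repair labor dispatched along the river"),
    (1738, "dyke repair requires straw and labor"),
    (1742, "silver requested for dyke repair silver shortfall"),
    (1755, "silver silver quota for straw purchase"),
]
shares = term_share_by_decade(sample_docs, "silver")
```

A rising share for a term such as "silver" across decades is the sort of signal from which the shift toward financial themes would be read.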
To keep the management of Yellow River affairs within the control of the central
government, the general structure of Yellow River management had been determined,
especially after the maintenance work by Jin Fu in the Kangxi period. A stable financial
system became increasingly important for the government in maintaining the
Yellow River (Pan 2019). The quota river maintenance fund system developed by the
Qing Dynasty should not be interpreted as fixing only the amount of the river
maintenance fund; it also fixed its source, management departments, expenditure items,
and so on. However, this quota scheme of the central government was too difficult to
realize, and it was never actually implemented in river maintenance practice.
Fig. 5.1 The silver distribution over the Yellow River in 1750s
The Qing government attempted to control the cost of river maintenance through the
quota management system, but actual spending was still hard to restrain. The repeated
discussions of supplementary funds from the Qianlong to the Jiaqing reign show that
the purpose of the quota management system was hard to realize. In practice, the
system was continually adjusted, and even the source of the fund was not stable. The
quota management system envisaged by the Qing government rested on a financial
system based on the agricultural economy, and it was nearly impossible for it to be
stable. The supplementary funds of the Qianlong and Jiaqing reigns, by contrast, rested
on an inchoate financial system, such as the salt administration in Henan province and
civil exchangers in Shandong province.
Many inherent conflicts can be found through the study of the quota management
system. To show its spatial features better, thematic cartography was applied to
visualize the specific condition of quota river maintenance fund collection in Henan
and Shandong. The county-level administrative region data for 1820 from CHGIS are
used. With the GIS software ArcGIS and with DHY, the records of the quota river
maintenance fund are converted into geographic data, which are easier to understand
and analyze (Man 2002).
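Converting tabular fund records into geographic data amounts to joining each record to a county geometry. The following is a minimal sketch of such a join written with GeoJSON-style dictionaries; it does not reproduce the ArcGIS or DHY workflow, and the county names, coordinates, amounts, and the `join_records_to_counties` function are hypothetical placeholders rather than CHGIS data.

```python
def join_records_to_counties(counties, records):
    """Attach quota amounts to county geometries by matching county names.

    counties: {name: GeoJSON geometry dict}
    records:  {name: quota amount}
    Returns a GeoJSON FeatureCollection and the list of unmatched names.
    """
    features = []
    unmatched = []
    for name, amount in records.items():
        geometry = counties.get(name)
        if geometry is None:
            # Historical place-names often fail to match; keep them for review.
            unmatched.append(name)
            continue
        features.append({
            "type": "Feature",
            "geometry": geometry,
            "properties": {"county": name, "quota_liang": amount},
        })
    return {"type": "FeatureCollection", "features": features}, unmatched


# Placeholder inputs (invented, for illustration only).
counties = {
    "County A": {"type": "Point", "coordinates": [114.0, 35.0]},
    "County B": {"type": "Point", "coordinates": [115.0, 35.5]},
}
records = {"County A": 1200, "County C": 300}
collection, unmatched = join_records_to_counties(counties, records)
```

Keeping the unmatched names separate is a deliberate choice here, since variant spellings and changed boundaries are common in historical gazetteers and usually need manual checking.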
Fewer than 50% of the counties in Henan fulfilled their fund collection tasks, and
the figure for Shandong was only about 65% (Pan 2014). This shows that the collection
of the quota river maintenance fund was unsatisfactory and that the land tax with
apportioned poll tax could not reliably support, year by year, the quota cost of east
river maintenance. The quota river maintenance fund system was unstable from the
collection stage onward. This was related not only to the unbalanced distribution of the
quota, most of which was borne by a minority of counties, but also to the spatial
pattern of the system. For example, the states and counties in Shandong that undertook
larger quota tasks were concentrated mainly in the corner zone formed by the Yellow
River, Henan, and Hebei. This area lies among the Grand Canal, the Nansi Lakes, and
the Yellow River, and floods occurred there frequently because of its low, flat terrain.
The consequences were difficulty in collecting the fund and arrears during collection.
The quota collection of the river maintenance fund became more and more difficult
(Zhang 2018).
The statistics of extreme disasters in Anyang, Luoyang, Zhengzhou, Nanyang, and
Xinyang in Henan province during the Qianlong reign, based on DHY, give the
frequency of grade 1 (the most severe waterlogging) and grade 5 (the driest) on the
drought-flood scale. Figure 5.2 shows clearly that the regions with heavier disasters
bore more responsibility for river maintenance fund collection. To express the degree
of contradiction between quota river maintenance fund collection and disasters, the
concept of a contradiction index is introduced. The ideal arrangement is taken to be
one in which the region suffering the most severe disasters bears the least
fund-collection responsibility and, conversely, the region suffering the least severe
disasters bears the most.
Fig. 5.2 The relationship between disasters and hydrological quota funding during 1736–1795AD
(Qianlong Emperor period)
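The chapter does not give a formula for the contradiction index. As one possible way to make the idea concrete, the sketch below uses the Spearman rank correlation between disaster frequency and quota burden as a stand-in: a value near +1 means the heaviest-hit regions also carry the heaviest quotas (maximal contradiction), and a value near -1 corresponds to the ideal arrangement. The choice of statistic, the function names, and the five unnamed regions with their values are all assumptions made for illustration, not the authors' definition or data.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented values for five unnamed regions (not taken from Fig. 5.2).
disaster_counts = [12, 9, 7, 4, 2]
quota_shares = [0.30, 0.25, 0.20, 0.15, 0.10]
index = rank_correlation(disaster_counts, quota_shares)
```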
In the flood and drought distribution maps of the atlas, each site represents one or
two administrative regions (or one or two prefectures in the historical period). Each
dot in the atlas therefore stands for the site and its surrounding zone, and the value at
the dot can be taken as the condition of the site and its surrounding zone.
There was a great conflict between the increase in projected Yellow River maintenance
expenditure and the ability of the counties to pay. Some counties along the river,
which were more exposed to disasters, undertook higher fund collection quotas,
especially the dyke-building areas on the south bank from Yingze to Yucheng and on
the north bank from Wuling to Kaocheng in Henan. Yet these were precisely the areas
frequently struck by Yellow River floods: in the 350 km reach from Wuling to Xuzhou
city, more than 140 levee failures occurred between the early Qing dynasty and the
levee failure at Tongwaxiang (Pan 2014).
The counties with heavier disasters bore heavier financial burdens. The spatial
pattern of quota river maintenance fund collection was seriously mismatched with the
distribution of disasters. This mismatch to some extent threatened the stability and
effective operation of the Qing government's quota river maintenance fund system.
Counties along the river were usually under greater threat from Yellow River droughts
and floods, especially floods. With the increasing number of Yellow River maintenance
projects and the growing projected expenditure on river works, river affairs
management also became more and more difficult (Piao et al. 2010).
River maintenance expenditure on the Yellow River increased in the late Qianlong
reign, but the existing quota river maintenance fund system could not be implemented
as intended or meet the requirements. The quota system in Henan was restructured
between the middle and late Qianlong reign and the Jiaqing and Daoguang reigns. A
supplementary fund appeared in the middle of the Qianlong reign, was abolished late in
that reign, and was proposed again in the early Jiaqing reign; fundraising with
interest-bearing deposits was proposed in the middle-to-late Jiaqing reign. The
supplementary fund was at first collected by the counties, later donated from the
officials' “nourishing honesty” salary supplements, and finally stabilized by setting
aside money obtained from fundraising and interest.
Based on DHY and GIS, we conclude that the collection of the quota river maintenance
fund relied to a great extent on a minority of counties, while the other counties
undertook only a small proportion. More importantly, the counties with heavier
disasters bore heavier financial burdens. These spatial features directly affected the
punctual and full payment of the quota river maintenance fund. Because of these
institutional problems, the quota river maintenance fund system lacked sustainability,
and river maintenance projects faced increasingly severe challenges as material prices
rose and hired labor replaced dispatched labor.
5.3 The Collapse of the Yellow River Finance in 1820–1840
5.3.1 The Changing of the Hydrological Environment Over
the Yellow River in 1820–1840
In the early 18th century, the Qing government began to set up water level observation
stations on rivers in China. The water level observation station at Wanjintan is on
the Yellow River, north of Laoxiancheng, Shanxian county, Sanmenxia city, Henan
province. It is an important data source for monitoring water conditions in the
middle reaches of the Yellow River. Similarly, there were observation stations on
the Qinhe River and the Yongding River, located at Muluandian in Wuzhi County and at
Shijingshan-Lugouqiao, respectively. According to Qing Dynasty rules, when the water
level rose by 2 Chi (Chi is a Chinese unit of length; 1 Chi ≈ 0.32 m) or more, the
date and height had to be reported to the imperial government (Zhuang and Pan 2016).
At present, these reports are scattered through sources such as ‘Extracts of
the water condition of historical floods in Qing Dynasty at Wanjintan and Xiakou
on the Yellow River; Muluandian on the Qinhe River; and Gongxian on the Yiluo
River’, edited by the Yellow River Conservancy Commission in the 1980s as internal
documents. The stations are shown in Fig. 5.3 (Liu et al. 2012; Wei et al. 2013).
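The reporting rule and the unit conversion stated above are simple enough to express directly. The sketch below applies the 2 Chi threshold and the approximate 0.32 m conversion to a list of water-level rises; the function names and the sample values are hypothetical and serve only to illustrate the rule.

```python
CHI_IN_METRES = 0.32        # approximate conversion given in the text
REPORT_THRESHOLD_CHI = 2.0  # rises of 2 Chi or more had to be reported


def chi_to_metres(chi):
    """Convert a length in Chi to metres using the approximate factor."""
    return chi * CHI_IN_METRES


def reportable(rises_chi):
    """Keep (label, rise in Chi, rise in metres) for rises meeting the threshold."""
    return [
        (label, rise, round(chi_to_metres(rise), 2))
        for label, rise in rises_chi
        if rise >= REPORT_THRESHOLD_CHI
    ]


# Invented observations, not taken from the archival reports.
sample_rises = [("day 1", 1.5), ("day 2", 2.0), ("day 3", 3.5)]
reports = reportable(sample_rises)
```

Because only rises at or above the threshold were recorded, any series built from these reports is censored below 2 Chi (about 0.64 m), which matters when the records are later used for reconstruction.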
On average, the reconstructed flood seasons of both the Yellow River and the Qinhe
River begin between July 6 and July 10, while the flood season of the Yongding River
begins a little later, between July 16 and July 20. Here, we reconstruct the chronology
of the beginning of the flood season of the three rivers on a pentad scale (Mantua
et al. 1997).
Fig. 5.3 The Water Level spots in the basin of the Yellow River during 1766–1911AD
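A pentad is a five-day period. The chapter does not state which pentad convention it uses; the sketch below assumes the month-based convention with six pentads per month (the sixth absorbing any days after the 25th), because that convention matches the July 6–10 and July 16–20 windows quoted above. The function name `month_pentad` is hypothetical.

```python
from datetime import date


def month_pentad(d):
    """Return the pentad index (1..72) of a date, with six pentads per month.

    Days 1-5 form the first pentad of the month, 6-10 the second, and so on;
    the sixth pentad covers day 26 to the end of the month.
    """
    pentad_in_month = min((d.day - 1) // 5, 5) + 1
    return (d.month - 1) * 6 + pentad_in_month


# The year is arbitrary; only month and day matter for the index.
yellow_river_start = month_pentad(date(1800, 7, 6))
yongding_start = month_pentad(date(1800, 7, 16))
```

Under this convention the two windows fall in the second and fourth pentads of July, so the onset on the Yongding River is two pentads later than on the Yellow and Qinhe Rivers.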
The flood season and its fluctuation at Sanmenxia on the Yellow River during
1766–1911 AD, on the Qinhe River during 1761–1911 AD, and on the Yongding River
during 1736–1911 AD were reconstructed from the water level observation reports of
the Qing Dynasty. The 5-point smoothed chronologies show that the flood season was
advanced during the 1820s–1860s and delayed during the 1870s–1880s, which correlates
negatively with the temperature change of the Loess Plateau. This phenomenon is
especially apparent in the 1880s (Zheng et al. 2005).
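The text does not specify how the 5-point smoothing was done. A common choice, assumed here, is an unweighted centered moving average over five consecutive values; the sketch below leaves the two values at each end undefined rather than guessing an edge treatment. The function name and the sample series are hypothetical.

```python
def five_point_smooth(series):
    """Centered, unweighted 5-point moving average.

    Positions without a full 5-value window (the first and last two) are None.
    """
    n = len(series)
    smoothed = [None] * n
    for i in range(2, n - 2):
        smoothed[i] = sum(series[i - 2:i + 3]) / 5
    return smoothed


# Invented onset pentads for seven consecutive years.
onset_pentads = [1, 2, 3, 4, 5, 6, 7]
smoothed_onsets = five_point_smooth(onset_pentads)
```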
By establishing a regression model, the study inverts the annual runoff for 1766–
1911 AD and builds and improves the flood-season annual runoff series for 1766–2000 AD
at Lanzhou, Qingtongxia, and Sanmenxia, using the water-level stake records of the
Sanmenxia stations in the Upper-Middle Yellow River (UMYR) in the Qing Dynasty.
Combined with the annual runoff for 1766–1911 AD at Tangnaihai Station in the
riverhead reach, the study builds runoff series for four stations in the riverhead
and UMYR, which is at present the clearest runoff curve of the Yellow River derived
from historical records. According to the research, the heavy “river disaster” that
appeared in the lower Yellow River in the mid-19th century was caused by sudden
changes in runoff in the Qingtongxia-Sanmenxia section. The drought period of the
1920s extended from the riverhead to the middle reach, but it was not caused by sudden
changes. The study also reveals that the Pacific Decadal Oscillation (PDO) and the
runoff of the UMYR had a periodic inverse-phase relationship on the inter-decadal
scale. In the early and mid-20th century, the runoff of the four stations had an
inverse-phase relationship on the scale of 8–16 years. In the 1830s–1850s, the
inverse-phase relationship between PDO and flow on the scale of 4–6 years was more
obvious in the Lanzhou-Sanmenxia section. According to the interactive wavelet
analysis, there is a significant inverse correlation between PDO and the amount of
water in the UMYR on a scale of 8–16 years, but only in the Sanmenxia-Lanzhou section,
suggesting that the relationship between summer rainfall in the UMYR and PDO had
obvious temporal and spatial differences.
(1) During the Qing Dynasty, runoff changes in the UMYR differed markedly between
reaches. In the natural state, there was no obvious consistency in the flow changes of
the UMYR, and abrupt changes did not occur synchronously. Over the long term, the
simultaneous reduction of flow in every reach since the 1970s is a special phenomenon;
at least, it is the only such case within the time range discussed in this study.
(2) The correlation between PDO and runoff in the UMYR is periodic. There is no
particularly obvious linear relationship, but regional differences are more obvious.
The inverse correlation between PDO and runoff in the study reaches is mainly on a
decadal scale. The Lanzhou-Sanmenxia section is relatively sensitive to changes in PDO
on the decadal scale. When formulating the water resources strategy of the Yellow
River, we should take note of the differences in the response of different sections to
the same environmental background.
(3) In the mid-19th century, many large-scale floods in the lower reach resulted from
the sudden increase of runoff in the middle reach. In the reign of Emperor Daoguang
in the mid-19th century, the Qing Dynasty declined rapidly. During this period,
large-scale flood disasters occurred in many parts of eastern China, especially in the
populous North China Plain and Taihu Basin, and brought huge financial and social
losses. Among them, eastern Henan on the North China Plain suffered floods in
successive years from breaches of the Yellow River in the 1840s, and the central
government spent a huge amount of money to deal with the river, which greatly
aggravated the financial difficulties of that period. The large-scale floods in the
lower Yellow River correspond to the period of sudden change in runoff flow at the
Sanmenxia section revealed by this research, which indicates a sudden increase of
rainfall on the Loess Plateau. Climate change was deeply involved in China's decline
and depression during the reign of the Daoguang Emperor.
(4) Although some progress has been made in reconstructing multi-site, long-term
runoff series of the Yellow River from different materials, further work is needed in
data analysis to clarify the uncertainty of the sequences (Shi et al. 1990), so that
future data integration can provide basic data for further research on long-term
spatial and temporal changes in the runoff of the Yellow River.
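The regression-based inversion mentioned at the start of this discussion can be illustrated in its simplest form. The chapter does not give the form of its regression model, so the sketch below assumes an ordinary least-squares line relating a water-level index to runoff, fitted on a calibration sample and then applied to historical water levels. The function names and all numbers are invented for illustration and are in arbitrary units.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def invert_runoff(levels, intercept, slope):
    """Estimate runoff for each water-level value from the fitted line."""
    return [intercept + slope * level for level in levels]


# Invented calibration pairs (water-level index, runoff), arbitrary units.
calibration_levels = [1.0, 2.0, 3.0, 4.0]
calibration_runoff = [12.0, 14.0, 16.0, 18.0]
intercept, slope = fit_line(calibration_levels, calibration_runoff)

# Invented historical water-level values to be converted to runoff.
estimated_runoff = invert_runoff([2.5, 5.0], intercept, slope)
```

In practice the uncertainty of such estimates depends on how well the calibration sample covers the historical range of water levels, which is one reason the uncertainty of the sequences is flagged above as needing further work.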
The colder Northern Hemisphere period is not consistent with the warm summer
temperature on the Loess Plateau. It is also inconsistent with the advance of the flood
season revealed in this research. The results show that the beginning of the flood
season of rivers in this area is related to the temperature fluctuation of the Northern
Hemisphere, but it should be even more closely correlated with the summer
temperature change on the Loess Plateau. This phenomenon could be a multiyear
response to changes in the intensity of the East Asian monsoon.
The maximum Yellow River runoff of the past 300 years appeared in the 1820s–1840s,
and the flood season was earliest during the runoff peak. The sudden climatic change
that occurred in the 1820s–1840s led to hydrological variation along the Yellow
River.
5.3.2 The Hydrological Challenge of Daoguang Period
The Yellow River finance system was already malfunctioning before the Daoguang
period (1820–1840s). The huge amount of silver needed every year put the Qing
government under heavy financial pressure. In the Daoguang period, there was less and
less silver in the emperor's economic and financial departments. Given the structural
faults of Yellow River management finance introduced above, silver collection became
more and more difficult as flood disasters grew more serious from the 1810s. Silver
could not be obtained from the towns, villages, and cities along the Yellow River.
First of all, the Hedaoku (河道库) originally held a certain amount of deposits for
emergencies. During the Daoguang period, however, there was a shortage of deposits
in the Hedaoku. On June 18 of the seventh year of Daoguang, Yan Liang, the Governor
of Hedong River Road, mentioned in a memorial that a certain amount of the deposits
of the Kailuan and Hebei Hedaoku could be used for emergency purposes, but that the
storage of these two Hedaoku had been significantly reduced in recent years. This
shows that the storage of the quota river funds was not ideal in the Daoguang period.
During the Daoguang period, although the emperor repeatedly warned the minister
of river affairs that state funds had their own management system, a stable supply of
river funds still could not be guaranteed, and the quota system was difficult to
sustain. In the eleventh year of Daoguang, the Bangjiayin (帮价银) was reduced from
300,000 Liang to 250,000 Liang. However, this amount was never seriously implemented,
and river officials often avoided it when budgeting. The river funds used for
purchasing materials in Henan province far exceeded 300,000 Liang. This is an
important manifestation of the non-binding nature of the quota system.
Most importantly, some changes can be seen in the way river officials applied for
silver. As mentioned above, Yan Liang asked for an extra two thousand duo of straw to
meet the need for materials during 1820–1836 AD. The emperor's reply accorded with the
application. The practice of citing earlier precedents when asking for more river
funds persisted throughout the Daoguang period. It is important to note that the
governor of Henan Province and the emperor had discussed the issue of “increasing the
river funds” several times at the turn of the Qianlong and Jiaqing reigns, and a new
quota was finally set. Although the scale of the river funds was expanded, the quota
system itself was retained, and it was still possible to restrict the increase of
river-related expenditure.
However, during the Daoguang period, discussions between the emperor and his
ministers about increasing Yellow River expenditure were no longer carried out under
the premise of a quota system. This had much to do with the attitude of the Daoguang
Emperor towards the quota system. In the early days of his administration, he
resolutely kept the quota unchanged and did not allow increases in the river funds.
On September 11 of the seventh year of Daoguang (1827 AD), there is a memorial from
Emperor Daoguang to Cheng Zuluo, the governor of Henan province, whose general
meaning is as follows: I have reviewed the documents on the increase of the river funds
exchanged between the governor of Henan province and the central government in the
57th year of Qianlong (1792 AD) (Ren et al. 2008). The Qianlong Emperor clearly
opposed the increase of the river funds. Such an increase can only be used as a
temporary measure under special circumstances, and it must not be mistaken for a new
quota standard for materials and quota river funds set by the central government. Yan
Liang's request for an extra two thousand duo of straw is an act of arbitrarily
increasing materials, which affects the people's daily life (Verdon and Franks 2006).
In short, the quota river funds system, which was closely related to the Diding tax,
had been operating poorly since the late Qianlong period. During the Jiaqing period, a
new quota was set, and interest income was also used to make up the arrears of the
river funds. However, river affairs expenditure was extremely large, requiring
millions in silver each year in ordinary circumstances. A financial system still in
its infancy could not afford such a huge amount (Lingling et al. 2007; Xiaohua et al.
2010). In the Daoguang period, the quota system became ineffective. The main reason
is that the supply of the fixed amount of river funds had become increasingly
dependent on items outside the country's normal fiscal system, such as donations
(捐纳). In the 1840s, the hydrological environment of the Yellow River changed
abruptly, causing major disasters in the south of Henan Province. The temporary
large-scale engineering works to deal with these disasters were not subject to quota
control, which further led to unrestricted expenses. However, from the perspective of
the operation of the quota system, as early as the 1840s the quota system of river
funds had already broken down and could not play its role of “limitation”.
5.4 Conclusion
The flood-season runoff of the upper and middle reaches of the Yellow River from
AD 1766 to AD 2000 clearly shows the changes in the water environment of the Yellow
River in the Qing Dynasty. On this basis, it can be found that the direct cause of the
year-after-year breach floods in eastern Henan in the 1840s was the sudden change of
runoff in the Qingtongxia-Sanmenxia reach, and that the lower reaches of the Yellow
River were at their highest water level since the 30th year of Qianlong. In this
context, a detailed study of the Qing Dynasty's management system for river works
funds shows that river works expenditure in the Daoguang period increased further on
top of the rapid expansion since the Qianlong and Jiaqing reigns, but the efficiency of
its use did not increase significantly, and flooding in the lower reaches of the Yellow
River exceeded that of previous reigns. During the Daoguang period, the river
administration problems were only triggered by the sudden change of the water
environment; the essential reason was the change in the management mode of river works
funds. This management mode, whose main feature was the quota, had been shaken since
the late Jiaqing period. During the Daoguang period, river affairs became the most
important financial burden of the Qing Dynasty. Faced with high river works
expenditure, the Daoguang Emperor himself repeatedly stressed that it needed to be
controlled, but he never raised the quota standard for river affairs. River officials
continuously increased actual river affairs expenditure by selectively citing
precedents from the Qianlong and Jiaqing reigns (Qianjia cases). In the Daoguang
period, the quota system for river works funds lost its ability to restrict the rapid
increase of expenditure.
References
Verdon, D. C., & Franks, S. W. (2006). Long-term behavior of ENSO: Interaction with the PDO
over the past 400 years inferred from paleo-climate records. Geophysical Research Letters, 33,
L06712, 5 pp.
Gang, C. (2014). The study for DH and HGIS. Social Sciences in Nanjing, 3, 136–142.
Hong-zhong, Z., & Wei, P. (2016). The research of the flood-height recording by Qing government:
based on Wanjintan Henan. The Qing History Journal, 2, 87–99.
Jingyun, Z., Zhixin, H., & Quansheng, G. (2005). The changes of precipitation over the middle and
lower reaches of the Yellow River during the past 300 years. Science in China: Series D, 35(8),
765–774.
Lingling, K., Yuexian, N., & Jinhua, W. (2007). Rebuilding the natural runoff series in the nearly
500 years at the Lanzhou Station in up stream of Yellow River. Journal of Water Resources and
Water Engineering, 18(4), 5–8.
Liu, F., Shengliang, C., & Ping, D. (2012). Spatial and temporal variability of water discharge in
the Yellow River Basin over the past 60 years. Journal Geography Science, 22(6), 1013–1033.
Man Zhi-min. (2002). Entered into digital era: methods and conceptions of GIS. Historical
Geography,18, 12–22.
Mantua, N. J., Hare, S. R., & Zhang, Y. (1997). A Pacific inter-decadal climate oscillation with
impacts on salmon production. Bull. Amer. Meteor. Soc., 78, 753–1069.
Pan, W. (2014). A preliminary study for the yellow river finance during late Qianlong reign in
Shandong. Journal of Chinese Historical Geography,29(4), 5–12.
Pan, W. (2019). The creation of annual renovation of flood-prevention work of the yellow river in
Shunzhi Reign. Shanghai Academy of Social Science, 180, 77–87.
Piao, S., Ciais, P., & Huang, Y. (2010). The impacts of climate change on water resources and
agriculture in China. Nature, 467, 43–51.
Ren, G.-y., Jiang, T., Li, W.-j. (2008). An Integrated assessment of climate change impacts on
China’s water resources. Advances in Water Science,19(6), 772–779.
Shi Fu-cheng, M., & Ping, G Z.-d. (1990). The reconstruction of run-offs in Qingtongxia Gorge
during 1736–1912AD. Yellow River, 4, 27–29.
Schreibman, S., Siemens, R., & Unsworth, J. (2004). A companion to digital humanities.
Blackwell Pub.
Tu Zi-pei. (2014). The Peak of Data Science. Science Press.
Wei, P., Tao, S., & Zhi-min, M. (2012). The review of GIS entered into Chinese historical geography
since 2000 and outlook. Journal of Chinese Historical Geography, 27 (1), 11–18.
Wei, P., Jing-yun, Z., & Ling-bo, X. (2013). The relationship of nature run-off changes in flood-
season of middle Yellow River & Yongding River, 1766–2004. Acta Geographica Sinica, 68(7),
975–982.
Wei, P., Zhe, W., & Zhi-min, M. (2020). The achievement of historical GIS since 1990. Journal of
Chinese Historical Geography, 35(1), 25–35.
Xiaohua, G., Yang, D., & Fahu, C. (2010). The reconstruction based on tree-rings and analysis of
runoffs in the upper reaches of the Yellow River during the past 1234 years. Chinese Science
Bulletin, 55, 3236–3243.
Zhang, P. (2018). The application of the geographic information system in the study of chinese
history. Historiography Quarterly, 2, 35–47.
Chapter 6
Visualizing Classic Chinese Literature
Yongming Xu
In the eyes of data scientists, classical Chinese literature, including original texts
and research outcomes, has great potential for analytics. Big data of this sort can
be molded into databases of various types, and some data can be visually
represented. The author is not a database or computing expert; yet during his scholarly
visits to Western universities, the author has witnessed the visualization of classic
Chinese literature through relevant software and databases by scholars and graduate
students alike, which is explicit, refreshing, and new. The author finds it possible
to employ these databases and visualization methods for the study and teaching of
classical Chinese literature, further facilitating its development in the big data era.
Therefore, the author presents here some relevant databases and software as well as
their operational procedures, based on the case study of Tang Xianzu, a famous play-
wright of the Ming Dynasty, in the hope that it may help illustrate such geospatial
humanities procedures to many non-technical readers.
6.1 Visualization of Writers’ Trajectory and Activity
Distribution
ArcGIS is a product of Esri. It is a powerful analytic software package that can be used
to create maps of anything geographical and spatial. Harvard University
holds a license for ArcGIS products so that its faculty and students can
install and use the software on campus. In China, however, very few universities or
research institutes have the license. Consequently, the use of ArcGIS in China is
quite limited.
Y. Xu (✉)
School of Humanities, Zhejiang University, Hangzhou 310028, China
e-mail: Yongmingxu1967@zju.edu.cn
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_6
QGIS is short for “Quantum GIS”. It is an open source geographical information
system software package developed by the QGIS Development Team. Users can download
the latest version of QGIS free of charge from the website (http://www.qgis.org).
Like ArcGIS, it is analytic map-creating software concerning geography and
space.
The China Historical Geographic Information System, CHGIS, is a project led by
Prof. Peter K. Bol from the Department of East Asian Languages and Civilizations
at Harvard University, with Lex Berman as project manager. It is an open source
Chinese geographical information system website.¹ The CHGIS project cooperates
with Fudan University’s Center for Historical Geography; it vectorizes Chinese
historical place-names and maps, and records the hierarchy and the evolutionary
information of place-names in the form of relational databases. So if there is any
Chinese historical place-name, the digital CHGIS can render it into a visual
representation. The website provides the geographic coordinates of Chinese historical
place-names. However, only the Qing dynasty vector historical map can be downloaded.
For the Ming dynasty and earlier ones, the coordinates of some place-names
can only be looked up, without the vectorized map of the administrative units.
CartoDB is a geographic space database on the Cloud. Users can upload acquired
longitude and latitude data in batches onto the CartoDB website, quickly creating a
visual effect based on maps. The maps created in this way can be saved online or
published for public access. It is also an open source website.
Worldmap is a platform for publishing and sharing the results of a global
geographical information study; it was developed by Harvard University’s Center for
Geographic Analysis. For the China component, it contains geographical information
and maps of numerous areas, such as demographics, religion, traffic, urban study,
ethnic minorities and languages, energy, environment, education, climate, public
health, economy, and history. For example, in the literature-related area, there are
the Imperial Examination distribution maps of the Song/Yuan/Ming/Qing dynasties,
as well as the courier station roadmaps of the Ming and Qing dynasties.
After a brief introduction to geographical information systems and spatial map
generating software, the author will take Tang Xianzu as an example, presenting his
trajectory and activities distribution on a map using QGIS. But first let’s take a look
at the rendering after production (Fig. 6.1).
The place-names in red indicate the distribution of Tang’s trajectory and activities.
So then how does the map come into being? The steps and methods are as follows
(Sheet 6.1):
(1) Install QGIS software.
(2) Look up Tang’s trajectory and activities distribution (based on A Chronological
Biography of Tang Xianzu, written by Mr. Xu Shuofang).
(3) Look up the longitude and latitude of Tang’s trajectory and activities distribution.
This step involves the use of CHGIS, namely the China Historical Geographic
Information System website (http://www.fas.harvard.edu/~chgis/). In addition, users
can resort to the search interface developed by Lex Berman, project manager
of CHGIS. The website address is http://maps.cga.harvard.edu/tgaz/, and users
can copy the searched coordinate information into an Excel worksheet, with
the field names as follows: name, X, Y. It should be noted that, as Tang Xianzu
is from the Ming dynasty, the place-names to look up should be under the
administrative units of the Ming dynasty, since the corresponding location of
some place-names changes over the course of history. The coordinate information
of Tang’s trajectory and activities distribution that I found is presented as
follows:
Sheet 6.1 Tang Xianzu’s trajectory and activities (parts)
ID   Place (Chinese)   Place (romanized)   X (longitude)   Y (latitude)
1    臨川              Lin Chuan           116.3513        27.98478
21   南昌              Nan Chang           115.8977        28.6749
22   新建              Xin Jian            115.8977        28.6749
23   臨川              Lin Chuan           116.3513        27.98478
24   滕州              Teng Zhou           117.0657        35.06738
25   宣城              Xuan Cheng          118.7425        30.94694
26   順天府            Shuntian Fu         116.368         39.93143
Fig. 6.1 Tang Xianzu’s trajectory and activity map 1
¹ Its address is: http://www.fas.harvard.edu/~chgis/.
(4) Save the Excel worksheet as a CSV file and load it into QGIS. To do so,
locate the large comma icon (the button for adding a delimited text layer) on the
left side of the QGIS window. Click OK, enter ‘Xian 1980’ in the filter bar, and
double-click the ‘Xian 1980’ entry below.
(5) Go to the CHGIS website http://www.fas.harvard.edu/~chgis/ and download
‘v4_citas90_cnty_pgn_utf_stats’. The path is: DATA > China Historical GIS >
Version 4 Datasets (with descriptions) > CITAS-1990-Counties (polygons) >
Data Archive > 1990 CITAS Counties (With Stats, UTF-8) > Dataset.
(6) Decompress the downloaded ‘v4_citas90_cnty_pgn_utf_stats’, go back to the
QGIS interface, click the icon on the left, load the files with the ‘.shp’ suffix
from the decompressed v4_citas90_cnty_pgn_utf_stats, and drag the CSV layer
above the ‘.shp’ layer.
(7) Open the properties of the CSV layer; under ‘Labels’, tick ‘Label this layer
with’, then choose ‘name’ from the pull-down, and set the colors and font size
below.
(8) Add a Google or Bing base map through the map link in the QGIS menu.
The path is: Plugins > Manage and Install Plugins > OpenLayers; then Web >
OpenLayers plugin > Google Maps > Google Physical.
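The worksheet of step (3) and the CSV export of step (4) can also be prepared with a short script instead of Excel. The following is only a minimal sketch under stated assumptions: the rows are copied from Sheet 6.1, the field names name, X, Y follow the text, and the helper names `check_row` and `places_to_csv` are illustrative and not part of QGIS or CHGIS. The range check simply guards against swapped longitude and latitude values.

```python
import csv
import io

# Rows copied from Sheet 6.1: (name, X = longitude, Y = latitude).
PLACES = [
    ("Lin Chuan", 116.3513, 27.98478),
    ("Nan Chang", 115.8977, 28.6749),
    ("Teng Zhou", 117.0657, 35.06738),
    ("Xuan Cheng", 118.7425, 30.94694),
    ("Shuntian Fu", 116.368, 39.93143),
]


def check_row(name, x, y):
    """Raise ValueError if X/Y are not plausible longitude/latitude values."""
    if not -180.0 <= x <= 180.0:
        raise ValueError(f"{name}: X={x} is not a valid longitude")
    if not -90.0 <= y <= 90.0:
        raise ValueError(f"{name}: Y={y} is not a valid latitude")


def places_to_csv(places):
    """Return CSV text with the header name,X,Y followed by one line per place."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "X", "Y"])
    for name, x, y in places:
        check_row(name, x, y)
        writer.writerow([name, x, y])
    return buffer.getvalue()


csv_text = places_to_csv(PLACES)
print(csv_text)
```

The resulting text can be saved to a file with a .csv suffix and loaded as in step (4); the same file should also suit the CartoDB upload, which expects the same three fields.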
If the base map is a satellite map, then the visual effect will be different (Fig. 6.2).
Fig. 6.2 Tang Xianzu’s trajectory and activity map 2
Apart from QGIS, mappers can use the CartoDB website to create the map of
a writer’s trajectory and activities distribution for free. The steps and methods are:
(1) Register on https://cartodb.com/.
(2) Click the red light after log-in, choose ‘your dashboard’, and then choose ‘new
map’.
(3) Click ‘connect dataset’, and upload the Excel worksheet with the fields ‘name,
X, Y’.
(4) Click ‘the_geom GEO’ in the data view, and choose the X and Y coordinate columns.
The map view is then ready (for preview). Parameters can be set up in the option
box on the right.
(5) The map one has created can be saved online, published, or saved locally to one’s
computer.
Below is a section of the effect map made with CartoDB (Fig. 6.3).
Fig. 6.3 Tang Xianzu’s trajectory and activity map 3
6.2 Visualization of the Geographical Distribution
of Writers’ Social Relations with CBDB
and the Aforementioned GIS Software
CBDB is short for the China Biographical Database project, with the website
http://isites.harvard.edu/icb/icb.do?keyword=k16229. This project is led by Prof. Peter
K. Bol from the Department of East Asian Languages and Civilizations at Harvard
University, in collaboration with the Center for Research on Ancient Chinese
History at Peking University and the Institute of History and Philology of “Academia
Sinica”. CBDB is by far the most comprehensive database for Chinese biographical
materials and analyses, with as many as 360,000 individuals recorded throughout
the dynasties. Almost 500,000 individuals from Chinese local gazetteers are
also covered. With this database, one can look up an individual’s basic biographical
information, such as birth year, nickname and alternate name, affiliation, and his/her
result in the imperial examination, as well as his/her kinship and social relations.
Coordinate data for historical place-names, such as affiliation, are also available.
Contents of the database are accessible free of charge. Users can search online
or download the database to a local computer. For instance, to ascertain
Tang Xianzu’s kinship and social relations, we could acquire the relevant data by
searching for kinship and social relations on CBDB. The figure below is the offline
search interface of CBDB (Fig. 6.4).
Fig. 6.4 The Access version of CBDB 1
For example, a search of one’s social relations network returns relations of various
categories. ‘Academic’ relations cover teacher/student relationships, academic
exchanges, subject appropriation, academic committees, academic patronage,
literature and art exchanges, and academic attacks. ‘Political’ relations cover
officialdom equality, officialdom subordinate/superior relations, officialdom support,
recommendations, and political confrontation. These relations are
data captured by the computer from massive text data on the basis of predetermined
relation keywords; therefore, the computer may surface valuable relations that a
human reader would overlook. However, the data captured by the computer may sometimes fail
to present an individual’s practical social relations network. Say A’s anthology gets
circulated into place B, person C from place B comes across A’s anthology, and
then C may comment on the reading of A’s anthology in his writings. The computer
would naturally capture the A/C relationship. While the A/C relationship exists to
some degree in real life, there may not be any interaction between the two.
Hence, not all social relations found in the search are real-life social relations, and
users must distinguish among the search results. The best solution is to combine
the results of the search with an author’s chronological biography to single out the
more intimate and significant social relations with practical interactions. Below is
the social relation search interface on CBDB (Fig. 6.5).
Fig. 6.5 The Access version of CBDB 2
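The screening described above, in which search results are checked against a chronological biography, can be sketched in a few lines. This is an illustration only: the person names are placeholders rather than actual CBDB records, the set of people attested in the biography is assumed to have been compiled by hand, and the function name `screen_relations` is hypothetical.

```python
# Hypothetical search results: (person, relation type) pairs returned for one writer.
# All names are placeholders, not actual CBDB records.
search_results = [
    ("Person B", "teacher/student relationship"),
    ("Person C", "academic exchange"),
    ("Person D", "recommendation"),
]

# Hypothetical set of people with whom the chronological biography
# records direct interaction (compiled by hand from the biography).
attested_in_biography = {"Person B", "Person D"}


def screen_relations(results, attested):
    """Split results into relations confirmed by the biography and those needing review."""
    confirmed = [r for r in results if r[0] in attested]
    to_review = [r for r in results if r[0] not in attested]
    return confirmed, to_review


confirmed, to_review = screen_relations(search_results, attested_in_biography)
print("confirmed:", confirmed)
print("to review:", to_review)
```

Relations left in the review list are not necessarily spurious; like the A/C case above, they call for a closer reading of the sources before being kept or dropped.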
The table below shows Tang Xianzu’s social associations. It is the result of
combining CBDB search results with A Chronological Biography of Tang Xianzu,
written by Mr. Xu Shuofang (徐朔方). Some of the longitude (X) and
latitude (Y) values were auto-generated by CBDB; some are my additions based on
CHGIS search results (Sheet 6.2).
With the coordinate data registered, the geographical distribution
maps of social relations can be readily made with the help of software and websites
such as ArcGIS, QGIS, and CartoDB. The creation method is similar to that of the
trajectory and activities distribution map, so the procedure is not repeated here.
6.3 The Point-Line Visualization of Social Relations
with Databases and Software Such as CBDB
and GEPHI
After some editing, the social relations data acquired from CBDB can be visualized
using GEPHI. GEPHI is another free open source network analysis software package. It
needs a Java 1.7 runtime environment, so Java must be installed on the computer
beforehand.
Two tables are needed in order for GEPHI to display an individual’s social
relations: one is ‘Nodes’, and the other ‘Edges’. ‘Nodes’ contains two fields, ID
and Label, while ‘Edges’ contains Source and Target, which hold the IDs of the two
individuals that each relation connects; one individual can thus appear in many rows
(a one-to-many relationship). Taking Tang Xianzu as an example, ‘Nodes’ and
‘Edges’ look like this (Fig. 6.6):
Sheet 6.2 Tang Xianzu’s social relations (parts)
Nodes
Id   Label
1    Tang Xianzu
2    Chen Yubi
3    Dai Xun
4    Feng Mengzhen
5    Gu Xiancheng
6    Gu Yuncheng
7    Hu Guifang
8    Hu Yinglin
9    Jiang Shichang
10   Li Weizhen
11   Li Zhi
Edges
Source   Target
68       6
68       9
68       12
68       17
68       18
68       22
68       23
68       26
68       31
68       33
68       38
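The two tables can be written out as CSV text with the column headers named in the text (Id and Label for nodes, Source and Target for edges). The sketch below is a minimal illustration, not the author's procedure: it reuses a few rows from the excerpts above, which are partial and therefore do not share IDs, and the helper name `to_csv` is hypothetical.

```python
import csv
import io

# A few rows from the Nodes and Edges excerpts above (both excerpts are partial,
# so the IDs in the two lists do not line up).
NODES = [(1, "Tang Xianzu"), (2, "Chen Yubi"), (3, "Dai Xun")]
EDGES = [(68, 6), (68, 9), (68, 12)]


def to_csv(header, rows):
    """Return the rows as CSV text preceded by the given header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


nodes_csv = to_csv(["Id", "Label"], NODES)
edges_csv = to_csv(["Source", "Target"], EDGES)
print(nodes_csv)
print(edges_csv)
```

Each string can be saved under a name such as nodes.csv or edges.csv (illustrative file names) before import. In a complete data set, every Source and Target value should match an Id in the nodes table.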
Once the two tables are imported into GEPHI, a point-line map of Tang’s social
relations is generated. The rendering appears as follows (Fig. 6.7).
Fig. 6.6 Tang Xianzu’s social relations map
Fig. 6.7 Tang Xianzu’s social relations
GEPHI can generate not only the point-line social relations map of an individual,
but also the map of two or more individuals and groups. Below is a point-line
relation map of Tang Xianzu and Tu Long, another playwright from the Ming Dynasty
(Fig. 6.8).
Fig. 6.8 Tang Xianzu & Tu Long’s social relations
The following is the social relations network constructed among Tang Xianzu, Tu
Long and Wang Daokun (Fig. 6.9).
The point-line representation of the writers’ social relations network is a
very straightforward way to unveil each writer’s social relations and the shared
acquaintances among these writers.
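The shared acquaintances that such a map makes visible can also be computed directly from an edge list by intersecting each writer's set of neighbors. The sketch below uses a hypothetical edge list: the three writers are those discussed above, but the acquaintances are placeholders and do not reflect actual CBDB data. The function names are likewise illustrative.

```python
from collections import defaultdict

# Hypothetical edge list: (writer, acquaintance). The acquaintances are placeholders.
EDGES = [
    ("Tang Xianzu", "Person A"),
    ("Tang Xianzu", "Person B"),
    ("Tu Long", "Person B"),
    ("Tu Long", "Person C"),
    ("Wang Daokun", "Person B"),
    ("Wang Daokun", "Person C"),
]


def neighbors(edges):
    """Map each person to the set of people directly linked to them."""
    table = defaultdict(set)
    for source, target in edges:
        table[source].add(target)
        table[target].add(source)
    return table


def shared_acquaintances(edges, *writers):
    """Return the people linked to every one of the given writers (at least one writer)."""
    table = neighbors(edges)
    return set.intersection(*(table[w] for w in writers))


print(shared_acquaintances(EDGES, "Tang Xianzu", "Tu Long"))
print(shared_acquaintances(EDGES, "Tu Long", "Wang Daokun"))
```

With real data, the edge list would come from the same Source and Target table that is imported into GEPHI, with IDs mapped back to names through the nodes table.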
Software such as UCINET, NodeXL, and Pajek can all represent the correlation
among data in a point-line manner. Owing to space constraints, they are not
illustrated here.
Fig. 6.9 Tang Xianzu, Tu Long and Wang Daokun’s social relations
6.4 Conclusion
This introduction to the aforementioned databases and visualization software
leads to the conclusion that the visualization of literary study requires, on the
one hand, the support of a database, and on the other hand, high-quality software.
The construction of a database regarding literature and history is perspective driven,
and calls for computer specialists as well as long-term funding input. The ‘CHGIS’ and
‘CBDB’ projects, funded by Prof. Peter K. Bol from the Department of East Asian
Languages and Civilizations at Harvard University, are even more significant after
over a decade of development. As open source databases, they have great prospects,
and we are confident that they will continue to improve. For one thing, we are looking
forward to the vectorized historical maps of China for the periods before the Ming
dynasty. Thus, when it comes to the making of the map of the writers of a certain
dynasty, the geographical maps of this dynasty, as base maps, would make it more
reliable. Moreover, we hope that domestic academia will make an effort towards the
construction of databases regarding literature and history, and appeal to concerned
administrative departments to increase funding investments in the construction of
databases, instead of waiting to exploit the inherited “Big Data” someday, only to
find that all of the valuable databases have been developed by foreigners. Classical
Chinese literary works include much that can be visualized, such as personal names,
place-names, goods, utensils, clothing, flora and fauna, and so on. How to visualize
such objects encountered during the reading of texts deserves close attention and
serious study. In terms of software, the above-mentioned packages were all developed
by Westerners, and we might experience some restrictions during usage. For instance,
there are very limited fonts available to choose from; only Bing maps and Google
maps are available as contemporary maps for the QGIS map link, and the Baidu map
is absent. Our hope is that Chinese software developers will eventually produce
visualization software optimized for Chinese users.
Chapter 7
Quantifying Spatial Variation
in Aggregate Cultural Tolerance
Hongwei Xu
7.1 Introduction
Social scientists frequently acknowledge the significant role of cultural force in
shaping human behaviors and performance with respect to cognition (DiMaggio
1997), academic achievement (Hsin and Xie 2014; Yamamoto and Sonnenschein
2016), labor force participation (Antecol 2000) and entrepreneurship (Guiso et al.
2006), marriage and family (Thornton 2005), emotional self-regulation (Varnum
and Hampton 2016), and mental health (Chen et al. 2003), to name just a few
areas. However, research efforts to quantify exogenous cultural influence remain
very limited because components of culture such as social norms, values, and beliefs
are difficult to measure and difficult to isolate from other institutional, social, and
economic confounders (Bachrach 2014). Current quantitative measures of social
norms rely heavily on survey questions about respondents’ individual attitudes and
beliefs (Thornton and Achen 2010; Thornton and Binstock 2012). This approach
restricts researchers’ capacity to correct for measurement error and reporting bias
in respondents’ self-reports and to infer causal influence of culture when individ-
uals’ ideations and behaviors are contemporaneously measured in a cross-sectional
survey. Even with longitudinal data, the individual-level approach can still be prob-
lematic because a person’s cultural values and beliefs affect how he/she interacts
with the world, from which the acquired new life experiences may further modify
his/her prior cultural schemas. In addition, developing effective survey measures
of culture generally requires extensive qualitative investigation, including the use of
ethnographic observation, semi-structured interview, and focus group discussion with
socio-demographically diverse people, as well as pilot surveys to validate the
instrument (Thornton et al. 2010, 2012). Developing valid instruments that are suitable for
international comparative studies can be even more challenging and costly.
H. Xu (✉)
Department of Sociology, Queens College - CUNY, 65-30 Kissena Blvd., Powdermaker Hall 252,
Queens, NY 11367, USA
e-mail: hongwei.xu@qc.cuny.edu
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_7
This study seeks to develop new behavioral measures of the cultural tenets that
value individual autonomy and freedom of choice over conformity and deference
to authority. For simplicity, these tenets are referred to as cultural tolerance in
this study. Cultural tolerance has been theorized as the ideational origin of the
second demographic transition in the Western societies (Lesthaeghe 2010) and of
similar societal shifts in East Asia (Raymo et al. 2015). It also correlates closely
with the contrast between the Eastern collectivistic cultures and Western individualistic
cultures (Triandis 2001). Psychological research has documented that people’s
collectivistic versus individualistic cultural orientation affects their social cognition,
that is, the skills that relate to interactions with other people and the social
environment; further, social cognition deficits can contribute to various mental disorders
(Koelkebeck and Uwatoko 2016). Developing new measures of cultural tolerance
can help enrich the research into cultural influence on social cognition and mental
disorders beyond the collectivistic-individualistic dichotomy.
This study constructs new measures of cultural tolerance by studying the distribu-
tion of human handedness in general populations. The basis of my approach hinges on
the observation that, aside from potential genetic and pathological factors, the popu-
lation distribution of left versus right human handedness is affected by cultural and
environmental pressures against left-handedness (Porac and Coren 1981). Sociolo-
gists, economists, and demographers are interested in examining cultural influences
on demographic behaviors and socioeconomic outcomes, but they rely heavily on
attitudinal surveys to measure cultural traits (Guiso et al. 2006; Fernández and Fogli
2009; Thornton et al. 2012; Polavieja 2015). Social psychologists are interested in
cultural difference in cognition and personality, but they tend to use a person’s country
of origin as a crude proxy of his/her cultural background (Markus and Kitayama 1991;
Koelkebeck and Uwatoko 2016), or design experimental tasks to tap into specific
cultural traits in a laboratory setting (Masuda and Nisbett 2001; Masuda et al. 2008).
On the other hand, both psychologists and epidemiologists have frequently treated
handedness as a personal trait and examined its implications for an individual’s cogni-
tion, health, and socioeconomic achievement (Annett and Kilshaw 1982; Coren and
Halpern 1991; Halpern and Coren 1991; Annett 1993; Bryden et al. 2005; Johnston
et al. 2013). Some behavioral psychologists and neuroscientists have examined cross-
country differences in the prevalence of left-handedness, as well as within-country
temporal trends (Brackenridge 1981; Raymond and Pontier 2004; McManus 2009).
However, the generalizability of their findings is questionable given inconsisten-
cies in measurement of handedness across studies and the use of non-representative
samples, including clinical patients, college students, magazine subscribers (Gilbert
and Wysocki 1992), and even people depicted in artworks (Porac and Coren 1981).
Meanwhile, epidemiologists and gerontologists have embarked on collecting nation-
ally representative data on grip strength as a biomarker to study population aging and
health around the world. My analysis bridges these research strands to address the
challenge of quantifying cultural context in the study of exogenous cultural effects
on social and behavioral outcomes.
My research approach is two-fold: (1) estimate geographic variation in left-hander
prevalence as a proxy for between-area variation in cultural tolerance, and (2) assess
the construct validity of the handedness-based measure of cultural tolerance. Drawing
on data from the China Health and Retirement Longitudinal Study (CHARLS), this
study applies small area estimation (SAE) methods to estimate left-hander prevalence
at the provincial level (equivalent to the state level in the U.S.). This study then
tests whether or not the small area estimates of left-hander prevalence at the provincial
level (obtained from the CHARLS sample) predict individual-level cultural attitudes
in an independent sample—the World Values Survey (WVS).
China is selected as the research setting for both substantive and analytical reasons.
First, China is a country known to be conservative, traditional, collectivistic, and
culturally exclusive (Nisbett et al. 2001). Developing and demonstrating subnational
geographic variation in the new cultural measure in China will help establish its broad
utility for other countries. Second, China is also known for its vast geography and
large population size, which imply substantial within-country cultural variation for
us to exploit. Third, although comparable survey data on individuals’ handedness are
available for many other countries, CHARLS is one of the few sources that permit
access to participants’ geographic information at a fine subnational level, without
any restriction.
7.2 Conceptual Background
Various theories have been proposed to explain the existence of a left-handed minority
across space and time. These theories can be broadly grouped into three categories:
genetic, pathological, and cultural factors. Many twin studies have been conducted
to estimate the relative importance of genetic and environmental influences on
handedness. For example, Medland and colleagues (2006, 2009), who analyzed large
twin samples (N > 20,000), found that about 25% of the variance in handedness
was explained by an additive genetic effect and about 75% by non-shared
environmental effects. However, recent genome-wide association studies
have been unsuccessful in detecting any genetic variant associated with handedness
(Brandler et al. 2013; Armour et al. 2014). As for pathological factors, a variety
of birth traumas and brain injuries have been theorized to affect handedness with
the presumption that left-handedness results from certain physiological or neurolog-
ical insults that disrupt the normal developmental processes. However, the empirical
evidence remains inconclusive (Bakan 1971; Satz 1972; Hicks et al. 1978a,b; Chay-
atte et al. 1979). Even if some pathological factors are at work, they only pertain to
certain special populations, exerting little influence on the general population.
On the other hand, social pressure, stigma, and discrimination have been associ-
ated with left-handedness in many cultures. Bias against left-handedness is evident in
negative connotations associated with the word “left” in many languages (Beidelman
1973). For example, the Latin word “sinister” means “evil” as well as “left”. The
English word “left” comes from the Celtic “lyft” which means “weak” or “broken”;
whereas the word “right” means “correct.” The French word “gauche” means both
“left” and “awkward” or “impolite.” The German word “links” or “linkisch” means
both “left” and “clumsy” or “inapt.” The so-called Right-Sided World Hypothesis
argues that hand preference results from a learning process influenced by social pres-
sure and cultural bias built into the environment (Porac and Coren 1981; Porac et al.
1986). Hand preference has some plasticity whereby an individual’s initial biological
inclination can be modified through a learning process. In a right-sided world, regard-
less of its underlying mechanisms, the population composition of handedness varies
as a function of both the amount of cultural and environmental pressure applied to
left- or mixed-handed persons to conform to the right-handed norm and the resistance
of those persons to that pressure (Porac and Coren 1981). Where pressure is substan-
tial, people born left-handed may effectively be forced to switch to a dominant right
hand orientation, which would presumably reduce the prevalence of left-handers in
the population. And where pressure is low and cultural tolerance of left-handedness
is high, we should expect an increased prevalence of left-handers in the population,
although the upper bound of the increase may be constrained by biological factors
(Porac and Coren 1986; McManus 2009).
Historically, in both Western and non-Western countries, using the left hand was
forbidden or strongly discouraged for certain socially relevant activities such as
writing and eating. Throughout the literate world, children were trained by parents
and school teachers to write with their right hands, with natural left-handers coerced,
sometimes punitively, to switch hands (Harris 1990). In the West, school teachers
were still allowed to apply physical punishment to enforce right-handed writing at
the turn of the 20th century (Harris 1983). It was not until the early to mid-1900s
that several U.S. and U.K. psychologists began to question the practice of forcing
right-handed writing, linking it to interfering with speech development, and through
to the 1950s, U.S. psychologists, pediatricians, and educators continued to debate its
use (Harris 1990).
Previous cohort studies have documented growing tolerance for left-handedness in
“liberal” (non-traditional) Western countries. Smart and colleagues (1980) reported
an increase in the percentage of left-handedness from 6.2% among grandparents to
10% among parents and 17.5% among the children in Britain. Tambs and colleagues
(1987) found an increase in the percentage of left-handedness from 1.2% in the 1895–
1905 cohort to 8.7% in the 1975–1985 cohort in Norway. The tolerance towards left-
handed writing began to increase in the U.S. from 1930 onwards, but it took some
40 years to complete the liberalization (Levy 1974), as opposed to only 20 years in the
Netherlands, where the shift started after 1945 (Beukelaar and Kroonenberg 1986).
In contrast, the practice of forced hand switching is sustained in many “conservative”
non-Western countries today, even though the rationale for forbidding left hand use
may no longer apply. Iwasaki and colleagues (1995) found no declining cultural
censorship of left-handedness in Japan, with 15.5% of adult respondents reporting
correction of hand use for writing and eating at an early age. In China the left hand
is restricted for writing and eating and childhood intervention has been so pervasive
and effective that prevalence of left-handedness is extremely low—at just 0.23% for
children and adults in the mainland (Li 1983), 0.7% for grade-school and college
students in Taiwan (Teng et al. 1976), and 1.6% for college students in Hong Kong
(Hoosain 1990). By comparison, the prevalence of left-handed writing is 6.5% among
Asian American school children in California (Hardyck et al. 1975).
Even in the absence of overt cultural pressure, covert environmental pressure
against left-handedness may persist. Since the Industrial Revolution, or even earlier,
many factory machines (e.g., lathes and presses), everyday tools (e.g., scissors and
can openers), musical instruments (e.g., violins and guitars), sporting gear (e.g.,
fishing reels and bowling balls), and other equipment such as cameras and computer
mice have been designed and produced for right-handed usage (Porac and Coren
1981; McManus 2009). In many cases, left-handed people have adapted by learning
to operate equipment with their right hand (Coren 1989).
Although biological factors may play a role in the distribution of handedness
across populations, this does not preclude a significant role for cultural pressure.
Porac and colleagues (1990) conducted a meta-analysis of 55 studies across 20
countries and found that environmental pressure accounts for about 8% of the within-
cultural variation in adult handedness score and 23.5% of the cross-cultural variations
in prevalence of left-handedness, whereas biological factors (using race as a proxy)
explain only 1.9% of the variability.
In conclusion, assuming that the genetic and pathological effects have remained
relatively stable, we expect the prevalence of left-handedness and/or mixed-
handedness to increase in a population as the overall cultural tolerance grows
(assuming resultant decreases in both overt anti-left-hander social pressure and indi-
rect environmental pressure). This expectation allows us to infer the degree of cultural
tolerance in a society from estimating its rates of left-handedness.
7.3 Data and Measures
There are two sources of data in this study. Data from the 2011 national baseline
of the China Health and Retirement Longitudinal Study (CHARLS) were used for
SAE of left-hander prevalence. Modeled after the Health and Retirement Study in
the U.S., CHARLS is a biennial survey of a nationally representative sample of
Chinese residents ages 45 and older, and their spouses if available. The 2011 national
baseline of CHARLS surveyed 10,287 households and 17,708 individuals living in
150 counties across 28 out of 31 provinces in mainland China, with a response rate
of 80.5% (Zhao et al. 2014).
Individual-level handedness was measured in two ways. First, a subjective
measure of individual-level handedness is based on survey participants’ responses
to the global question, “Which is your dominant hand?” (choices: right hand, left
hand, or both hands equally dominant). A dichotomous variable was coded 1 for
left-handed, and 0 for right-handed or ambidextrous. Self-reported handedness is
subject to reporting error. If left-handed people are more likely to falsely report their
true handedness in areas where cultural and environmental pressures are higher, the
sample estimates of left-handedness rates will be systematically biased downward.
To address this problem, the second measure of individual-level handedness incor-
porated objective information on hand grip strength as measured with a hand-held
dynamometer in CHARLS (Zhao et al. 2013). Two measurements were taken for
each hand, with the final grip strength of each hand determined by the average value
of the two measurements. Each CHARLS respondent was classified as left-handed if
he/she self-reported as left-handed, or his/her grip strength was greater in the left than
the right hand. As a sensitivity check, a one-kilogram (or two-kilogram) threshold
was applied in determining handedness, such that respondents were categorized as
left-handed if their grip strength was at least one (or two) kilograms greater in the
left hand than in the right hand (Siengthai et al. 2008).
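The four classification rules described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name `is_left_handed`, the `margin_kg` argument, and the example readings are hypothetical, and the labels "Measure 1" to "Measure 4" follow the notes to the tables later in the chapter.

```python
from statistics import mean

def is_left_handed(self_report, left_kg, right_kg, margin_kg=None):
    """Classify one respondent as left-handed (True) or not (False).

    self_report: "left", "right", or "both" (answer to the dominant-hand question)
    left_kg, right_kg: the two dynamometer readings taken for each hand
    margin_kg: None -> self-report only (Measure 1)
               0    -> left grip strictly greater than right (Measure 2)
               1, 2 -> left grip at least that many kg greater (Measures 3, 4)
    """
    if self_report == "left":
        return True
    if margin_kg is None:
        return False
    # Final grip strength of each hand is the average of its two readings.
    diff = mean(left_kg) - mean(right_kg)
    if margin_kg == 0:
        return diff > 0
    return diff >= margin_kg

# Hypothetical respondent who self-reports as right-handed but whose
# left hand is 1 kg stronger on average.
respondent = {"self_report": "right", "left_kg": [30.0, 31.0], "right_kg": [29.0, 30.0]}
measures = {"Measure 1": None, "Measure 2": 0, "Measure 3": 1, "Measure 4": 2}
flags = {name: is_left_handed(margin_kg=m, **respondent) for name, m in measures.items()}
print(flags)
```

In this toy case the respondent counts as left-handed under Measures 2 and 3 but not under Measures 1 and 4, which shows how the threshold changes the classification.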
To apply poststratification weights from China’s 2010 Population Census Data to
model-based SAE (see the Methods section below), a set of individual-level covari-
ates in CHARLS were coded into the same categories as the cross-tabulated census
statistics at the provincial level. These individual covariates include age in 2010
(45–49, 50–54, 55–59, 60–64, 65–69, 70–74, 75–79, 80–84, or over 85), sex (men or
women), educational attainment (none, primary school, middle school, high school,
college or above), and residence (rural or urban).
Data from wave 6 of World Values Survey (WVS)-China were used for the anal-
ysis of the association between left-hander prevalence and cultural values. WVS
is a repeated cross-sectional survey of people’s values around the world. Wave 6,
conducted in 2012, surveyed 2,300 adult respondents recruited in 24 provinces.
The measures of individuals’ cultural values and attitudes are: (1) self-perception
of individual autonomy; and (2) the overall emancipative values index and its four
sub-indices—autonomy, equality, choice, and voice.
Self-perceived autonomy is based on survey participants’ ratings of the statement,
“I see myself as an autonomous individual” on a four-point Likert scale. The emanci-
pative values index and its sub-indices were originally constructed by Welzel (2013)
and publicly released as part of the WVS data product. All of them are normal-
ized continuous scores, ranging from 0 to 1, based on survey participants’ responses
to multiple questions about their attitudes and beliefs in each area. The autonomy
sub-index measures views on individual autonomy versus obedience to authority.
The equality sub-index measures views on gender equality with respect to educa-
tion, employment, and political leadership. The choice sub-index measures views on
personal freedom in reproductive choices. The voice sub-index measures views on
the role of the voice of the people as a societal influence. The overall emancipative
values index is a summary score of the four sub-indices. The control variables include
age, sex, and provincial fixed effects.
7 Quantifying Spatial Variation in Aggregate Cultural Tolerance 83
7.4 Methods
CHARLS samples respondents within a province to create an overall sample that is
nationally representative, not representative of the province in which the sampled
respondents are located. A model-based SAE strategy, known as multilevel regression
with poststratification weighting (MRP) (Gelman and Little 1997; Park et al.
2004; Zhang et al. 2014), was employed to obtain reliable estimates of left-hander
prevalence at the provincial level. The MRP method consists of three steps. The first
step is to fit a multilevel logistic model, in which individual-level left-handedness
(1 = yes; 0 = no) is regressed on individual demographic and socioeconomic
characteristics and provincial random intercepts. The second step is to apply coefficient
estimates from the multilevel logistic model to calculate the probability of being
left-handed for each of the 5,040 age × sex × education × residence × province
cross-tabulated categories (i.e., 9 age categories, 2 sex categories, 5 education categories,
2 residence categories, and 28 provincial intercepts). The third step is to calculate
provincial left-hander prevalence by summing the predicted individual probabilities
of being left-handed over all the cross-tabulated categories in a given province,
weighted by the categories’ corresponding population size in that province (also
known as poststratification weighting).
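Steps two and three can be sketched in a few lines of Python. This is a hedged illustration rather than the estimation code used in the study: the helper names are hypothetical, the province intercept is set to zero because the fitted provincial effects are not reported, the coefficients are the Measure 1 values from Table 7.2, and dividing the weighted sum by the provincial population total is the usual MRP normalization, assumed here. The two-cell populations are made up.

```python
import math

# Category definitions taken from the text; the labels themselves are illustrative.
AGE = ["45-49", "50-54", "55-59", "60-64", "65-69", "70-74", "75-79", "80-84", "85+"]
SEX = ["women", "men"]
EDUCATION = ["none", "primary", "middle", "high", "college+"]
RESIDENCE = ["urban", "rural"]
N_PROVINCES = 28

# Number of poststratification cells: 9 x 2 x 5 x 2 x 28.
n_cells = len(AGE) * len(SEX) * len(EDUCATION) * len(RESIDENCE) * N_PROVINCES

def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def cell_probability(constant, coefs, levels, province_intercept=0.0):
    """Step 2: predicted probability of left-handedness for one cell.
    Reference categories have no entry in `coefs` and contribute zero."""
    eta = constant + province_intercept + sum(coefs.get(level, 0.0) for level in levels)
    return inv_logit(eta)

def poststratify(cell_probs, cell_pops):
    """Step 3: population-weighted average of cell probabilities in one province."""
    total = sum(cell_pops)
    return sum(p * n for p, n in zip(cell_probs, cell_pops)) / total

# A few Measure 1 coefficients from Table 7.2 (constant = -2.247).
coefs = {"men": 0.346, "primary": -0.334, "rural": -0.105}
p_men = cell_probability(-2.247, coefs, ["45-49", "men", "none", "urban"])

# Toy province with two cells (hypothetical probabilities and census counts).
toy_prevalence = poststratify([0.10, 0.30], [300, 100])
print(n_cells, round(p_men, 3), round(toy_prevalence, 3))
```

The cell count reproduces the 5,040 categories mentioned above, and the toy province shows how a small cell with a high predicted probability is down-weighted by its census population.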
To evaluate the construct validity of the handedness-based cultural measures, the
small area estimates of provincial left-hander prevalence derived from the CHARLS
data are used to predict individual-level cultural values from the WVS-China data.
The WVS-China respondents (level 1) who lived in the same provinces (level
2) were first merged with the same rates of left-handers at the provincial level
estimated from the CHARLS data. Then the WVS-China participants’ scores on
self-perceived autonomy, the emancipative values index, and its four sub-indices
(autonomy, equality, choice, and voice) were regressed on provincial left-hander
prevalence while controlling for age, sex, and provincial fixed effects. Given
assumptions regarding the role of cultural tolerance in moderating overt social
pressure and indirect environmental pressure exerted on handedness, a positive
correlation between provincial left-hander prevalence and residents’ scores on these
indices is expected (Thornton 2001; Lesthaeghe 2010; Kavas 2015).
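The data linkage described above can be illustrated with a small sketch. It is deliberately simplified and is not the model reported in this chapter: it attaches a provincial prevalence to each respondent and computes a bivariate least-squares slope, omitting the age, sex, and provincial controls. All names, provinces, and values are hypothetical.

```python
def merge_prevalence(respondents, prevalence_by_province):
    """Attach the provincial left-hander prevalence (level 2) to each
    respondent (level 1); respondents in provinces without an estimate are dropped."""
    merged = []
    for r in respondents:
        if r["province"] in prevalence_by_province:
            merged.append({**r, "left_pct": prevalence_by_province[r["province"]]})
    return merged

def ols_slope(x, y):
    """Bivariate ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical small area estimates (%) and survey respondents.
prevalence = {"A": 20.0, "B": 30.0}
respondents = [
    {"province": "A", "score": 0.30},
    {"province": "A", "score": 0.40},
    {"province": "B", "score": 0.45},
    {"province": "B", "score": 0.55},
    {"province": "C", "score": 0.90},  # no prevalence estimate, so dropped
]
merged = merge_prevalence(respondents, prevalence)
slope = ols_slope([r["left_pct"] for r in merged], [r["score"] for r in merged])
print(len(merged), round(slope, 4))
```

A positive slope in such a regression is what the theoretical expectation predicts; the full analysis additionally adjusts for covariates and clusters standard errors within provinces.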
7.5 Results
7.5.1 Descriptive Statistics
Table 7.1 presents the frequency distributions of individual-level left-handedness
and the independent variables used in SAE in the CHARLS sample. The unweighted
sample size is 13,022 and it represents 506,547,019 Chinese middle-aged and older
adults after weighting. In unweighted and weighted samples, the prevalence rate of
left-handers is much lower on the basis of self-reported dominant hand (Measure
Table 7.1 Summary statistics for the variables used in the small area estimation of left-hander prevalence in Chinese middle-aged and older adults (>=45 years)
Unweighted % Weighted %
Left-handedness
Measure 1 7.4 7.9
Measure 2 30.5 30.3
Measure 3 22.4 22.3
Measure 4 16.8 17.1
Covariates
Age (years) in 2010
45–49 19.5 21.5
50–54 16.5 15.7
55–59 21.5 21.6
60–64 16.5 14.5
65–69 11.2 10.4
70–74 7.4 7.6
75–79 4.7 5.1
80–84 2.0 2.5
>=85 0.7 1.1
Sex
Women 52.1 51.7
Men 47.9 48.3
Educational attainment
None 29.1 26.0
Primary school 40.3 38.2
Middle school 19.9 22.0
High school 9.3 11.4
College or above 1.5 2.4
Residence
Urban 36.9 49.4
Rural 63.1 50.6
N of observations 13,022 506,547,019
Note Individual’s left-handedness is determined by self-reported
dominant hand only in Measure 1; by both self-report and grip
strength in Measure 2; by both self-report and grip strength with
1 kg margin of error in Measure 3; and by both self-report and grip
strength with 2 kg margin of error in Measure 4
1; about 7.4–7.9%) than for self-report combined with grip strength (Measure 2;
about 30%), suggesting that estimation of left-hander prevalence is sensitive to the
measurement of individual handedness. As the margin of error in comparing grip
strength between left and right hands is allowed to increase to 1 and 2 kg, the preva-
lence rate of left-handers drops to about 22% (Measure 3) and 17% (Measure 4),
respectively.
In terms of covariates, overall the CHARLS sample is sex-balanced (approxi-
mately 52% women and 48% men) and consists of a large proportion of middle-
aged (roughly 73–74% of ages 45–64), poorly educated (about two thirds completed
primary school or less) respondents. The unweighted statistics are similar to the
weighted statistics with one exception. The unweighted sample is dominated by
rural respondents (63.1%), whereas the weighted sample is evenly split between
rural (50.6%) and urban (49.4%) respondents.
7.5.2 Small Area Estimation Results
Table 7.2 reports regression coefficients estimated from the multilevel logistic models
of being left-handed in the CHARLS sample. Regardless of how individual handed-
ness is determined, men were more likely than women to be left-handed. There was a
negative educational gradient in left-handedness—better educated respondents were
less likely to be left-handed. This pattern also holds irrespective of the measurement
choice of individual handedness. Rural-urban residence was not predictive of indi-
vidual left-handedness, whereas age difference varied by the measurement of indi-
vidual handedness. When using self-reported dominant hand to determine individual
handedness (Measure 1), respondents in certain oldest age groups (70–74 and over
85 years) were less likely than the youngest group (45–49 years) to be left-handed; but
those of 75–79 years old were more likely to be left-handed (marginally significant).
After incorporating grip strength to determine individual handedness (Measures
2–4), the negative associations between the oldest age groups and left-handedness
lost statistical significance, while there was some evidence that respondents of
55–59 years old were more likely to be left-handed than the youngest reference
group. In short, gender and educational attainment were consistently predictive of
individual left-handedness, whereas age and rural-urban residence were not.
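Because Table 7.2 reports coefficients on the logit scale, it can help to convert a few of them to odds ratios. The snippet below is a reading aid under stated assumptions: it uses the Measure 1 coefficients from Table 7.2, and the baseline probability refers to the reference categories (women, ages 45–49, no schooling, urban) with a provincial random intercept of zero, which is an illustrative simplification.

```python
import math

# Measure 1 coefficients from Table 7.2 (logit scale).
coefs = {"men": 0.346, "primary school": -0.334, "college or above": -1.204}
constant = -2.247

# Odds ratio = exp(coefficient): multiplicative change in the odds of
# being left-handed relative to the reference category.
odds_ratios = {name: math.exp(b) for name, b in coefs.items()}

# Predicted probability for the reference profile (all covariates at their
# reference levels, provincial random intercept assumed to be zero).
baseline = 1.0 / (1.0 + math.exp(-constant))

for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.2f}")
print(f"baseline probability = {baseline:.3f}")
```

Read this way, the coefficient for men corresponds to roughly 40% higher odds of self-reported left-handedness than for women, and college education to roughly 70% lower odds than no schooling.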
Model-based small area estimates of left-hander prevalence in the 28 provinces
surveyed in CHARLS were obtained after calculating predicted probabilities of being
left-handed for all the cross-tabulated demographic categories and applying post-
stratification weights. Figure 7.1 compares the kernel density distributions across
four sets of estimates using different measures of individual handedness. Recall that
Measure 1 refers to individual left-handedness purely determined by self-reported
dominant hand; Measure 2 extends Measure 1 by categorizing a respondent whose
grip strength on left hand is stronger than that on right hand as left-handed even if he
or she self-reports to be right-handed or ambidextrous; Measures 3 and 4 are similar
Table 7.2 Regression coefficients from the multilevel logistic models of being left-handed in
Chinese middle-aged and older adults (>=45 years)
                                Measure 1    Measure 2    Measure 3    Measure 4
Men (ref: women)                0.346***     0.147*       0.299***     0.369***
Age group (ref: 45–49)
50–54                           −0.108       0.040        0.014        0.052
55–59                           −0.057       0.113†       0.132*       0.102
60–64                           0.003        0.028        −0.010       −0.005
65–69                           −0.087       0.055        −0.005       0.026
70–74                           −0.268*      0.102        0.033        −0.075
75–79                           0.326†       0.183        0.193†       0.199
80–84                           −0.217       −0.028       −0.304       −0.319
>=85                            −1.855**     0.164        0.035        −0.312
Education (ref: no schooling)
Primary school                  −0.334**     −0.188**     −0.216**     −0.263**
Middle school                   −0.437**     −0.217**     −0.244*      −0.313**
High school                     −0.484*      −0.331***    −0.387***    −0.452***
>=College                       −1.204**     −0.395†      −0.276       −0.571*
Rural area (ref: urban)         −0.105       0.076        0.091        0.044
Constant                        −2.247***    −0.867***    −1.330***    −1.637***
Variance component
Province level                  0.075***     0.064***     0.053***     0.057***
Note ref = reference category. Individual’s left-handedness is determined by self-reported dominant
hand only in Measure 1; by both self-report and grip strength in Measure 2; by both self-report and
grip strength with 1 kg margin of error in Measure 3; and by both self-report and grip strength with
2 kg margin of error in Measure 4
† p < 0.1; * p < 0.05; ** p < 0.01; *** p < 0.001
to Measure 2 except that they allow a margin of error (1 and 2 kg, respectively) when
comparing grip strength between two hands.
Two findings stand out in Fig. 7.1. First, consistent with the descriptive statistics
reported in the previous section, combining self-reported dominant hand and grip
strength to classify individual handedness leads to much higher estimates of left-
hander prevalence than using self-report alone. The entire distribution of small area
estimates is concentrated below 7% when using self-reported dominant hand alone
(Measure 1), whereas the corresponding distributions combining self-report and grip
strength are centered above 10%, suggesting that subjective measure of individual
handedness may lead to underestimates of left-hander prevalence. Among the three
measures that combine self-reported dominant hand and grip strength, the average
estimate of provincial left-handed prevalence rate is highest when no margin of error
is allowed.
Fig. 7.1 Kernel densities of the small area estimates of left-hander prevalence at the provincial
level in Chinese middle-aged and older adults (>=45 years) Note Individual’s left-handedness is
determined by self-reported dominant hand only in Measure 1; by both self-report and grip strength
in Measure 2; by both self-report and grip strength with 1 kg margin of error in Measure 3; and by
both self-report and grip strength with 2 kg margin of error in Measure 4
Second, the small area estimates are more smoothed when grip strength is incor-
porated to determine individual handedness than relying on self-reported dominant
hand alone. As shown in Table 7.3, when using self-reported dominant hand alone
(Measure 1), the standard deviation of the small area estimates is 0.9%, which is
only about a fifth of that for the estimates using Measure 2 (4.4%). Similarly, the
interquartile range of the small area estimates is 1.2 when using Measure 1, which is
about one fourth of that for the estimates using Measure 2. After taking into account
Table 7.3 Summary statistics for different types of small area estimates of left-hander prevalence
in Chinese middle-aged and older adults (>=45 years) at the provincial level
Mean SD Min Median Max IR N
Model-based estimates of left-hander prevalence at the provincial level (%)
Measure 1 5.3 0.9 3.1 5.3 7.3 1.2 28
Measure 2 26.3 4.4 12.7 27.6 35.6 5.1 28
Measure 3 17.5 2.8 9.4 18.0 23.9 3.6 28
Measure 4 11.7 2.2 6.1 12.2 16.4 2.3 28
Note SD = standard deviation; Min = minimum; Max = maximum; IR = interquartile range
possible margin of error in comparing grip strength between two hands, the small area
estimates are shrunk towards the center of the distribution. Specifically, the standard
deviation of the small area estimates drops from 4.4% for Measure 2 to 2.8% and
2.2% for Measures 3 and 4, respectively, and the corresponding interquartile range
decreases from 5.1% to 3.6% and 2.3%.
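The dispersion comparisons above can be checked directly against Table 7.3. The short sketch below only re-expresses the published standard deviations and interquartile ranges as ratios relative to Measure 2; the dictionary layout and function name are illustrative choices.

```python
# Standard deviation (SD) and interquartile range (IR) of the provincial
# small area estimates, in percentage points, copied from Table 7.3.
table_7_3 = {
    "Measure 1": {"sd": 0.9, "ir": 1.2},
    "Measure 2": {"sd": 4.4, "ir": 5.1},
    "Measure 3": {"sd": 2.8, "ir": 3.6},
    "Measure 4": {"sd": 2.2, "ir": 2.3},
}

def ratio_to_measure_2(stat):
    """Express each measure's dispersion as a fraction of Measure 2's."""
    base = table_7_3["Measure 2"][stat]
    return {name: row[stat] / base for name, row in table_7_3.items()}

sd_ratio = ratio_to_measure_2("sd")
ir_ratio = ratio_to_measure_2("ir")
for name in table_7_3:
    print(f"{name}: SD ratio = {sd_ratio[name]:.2f}, IR ratio = {ir_ratio[name]:.2f}")
```

The Measure 1 ratios come out at roughly one fifth (SD) and one fourth (IR) of the Measure 2 values, and the ratios for Measures 3 and 4 fall in between, in line with the shrinkage described in the text.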
Figure 7.2 depicts the spatial distributions of small area estimates of left-handedness
prevalence at the provincial level. Darker colors represent higher prevalence
rates of left-handers. Although the estimates vary somewhat across the
different measures of individual handedness, two clusters of high left-hander prevalence
rates can be observed. The first cluster is located in Southwest China,
consisting of Sichuan, Yunnan, and Guizhou Provinces. The second cluster is located
in Northwest China, consisting of Gansu, Qinghai, and Xinjiang Provinces.
Fig. 7.2 Spatial distributions of small area estimates of left-handedness prevalence at the provincial
level
Table 7.4 Summary statistics for the variables in the 2012 World Values Survey-China
Variable                                     Definition                                                            Mean    SD      Min   Max    N
Outcome variables
Self-perceived as an autonomous individual   4-point Likert scale on “I see myself as an autonomous individual”    3.16    0.58    1     4      1,970
Emancipative values index                    Multi-point index from 0 to 1 based on 12 items                       0.39    0.12    0     0.86   2,159
Autonomy sub-index                           4-point index from 0 to 1                                             0.60    0.23    0     1      2,300
Equality sub-index                           12-point index from 0 to 1                                            0.53    0.23    0     1      2,134
Choice sub-index                             30-point index from 0 to 1                                            0.21    0.22    0     1      1,885
Voice sub-index                              6-point index from 0 to 1                                             0.18    0.24    0     1      2,098
Control variables
Age                                          Years                                                                 43.92   14.95   18    75     2,300
Sex                                          1 if male; 0 if female                                                0.49    0.50    0     1      2,300
Note SD = standard deviation; Min = minimum; Max = maximum
7.5.3 Predicting Individual Cultural Values
Table 7.4 reports the descriptive statistics for the variables in WVS-China. No missing
data occur in the control variables. The full sample is evenly split between men and
women, with an average age of about 44 years. The average score of self-perceived
autonomy is 3.16 on a 4-point Likert scale, which suggests that on average the respon-
dents ‘agree’ with the statement that, “I see myself as an autonomous individual.”
All the other attitudinal variables are normalized on a continuous scale ranging from
0 to 1, with a higher score representing a stronger emphasis on freedom of choice
and equality of opportunities (Welzel 2013), although the maximum value of the
emancipative values index in my analytical sample reaches only 0.86. The amount
of missing data varies across the attitudinal measures, but even in the worst case of
the choice sub-index, more than 80% of the sample have valid responses. To maxi-
mize statistical power, the analytical sample size is allowed to vary depending on the
number of valid responses for each attitudinal measure.
Table 7.5 presents the coefficients from regressing attitudinal measures on
different estimates of left-hander prevalence, adjusting for age, sex, and provin-
cial fixed effects. A significantly positive coefficient lends support to the theoret-
ical expectation that left-hander prevalence is a valid measure for cultural tolerance.
Across different small area estimations, respondents living in provinces with a higher
prevalence rate of left-handers consistently rated themselves as more autonomous.
For the emancipative values index and its sub-indices, the association
between left-hander prevalence at the provincial level and cultural attitude at the indi-
vidual level differs by which measure of individual handedness is used in small area
estimation. When using self-reported dominant hand alone to determine individual
Table 7.5 Regression estimates of the associations between left-hander prevalence in the 2011 China Health and Retirement Longitudinal Study and Chinese adults’ attitudes in the 2012 World Values Survey
Columns: (1) Self-perceived autonomous individual; Emancipative values: (2) Overall index; (3) Autonomy sub-index; (4) Equality sub-index; (5) Choice sub-index; (6) Voice sub-index
% Left-handers at the provincial level
               (1)          (2)           (3)           (4)           (5)           (6)
Measure 1      0.0707***    −0.0199***    −0.0205*      −0.0038***    −0.0214***    −0.0107***
               (0.0002)     (0.0000)      (0.0000)      (0.0002)      (0.0001)      (0.0001)
Measure 2      0.0210***    0.0015***     0.0015***     0.0046***     −0.0017***    0.0046***
               (0.0003)     (0.0001)      (0.0001)      (0.0001)      (0.0001)      (0.0001)
Measure 3      0.0478***    0.0033***     0.0034***     0.0106***     −0.0039***    0.0106***
               (0.0006)     (0.0001)      (0.0001)      (0.0002)      (0.0002)      (0.0003)
Measure 4      0.0424***    0.0030***     0.0030***     0.0094***     −0.0035***    0.0094***
               (0.0005)     (0.0001)      (0.0001)      (0.0002)      (0.0002)      (0.0002)
N of observations
Note Individual’s left-handedness is determined by self-reported dominant hand only in Measure 1; by both self-report and grip strength in Measure 2; by
both self-report and grip strength with 1 kg margin of error in Measure 3; and by both self-report and grip strength with 2 kg margin of error in Measure 4.
All the models control for age, gender, and provincial fixed-effects. Robust standard errors that adjust for individuals clustered within provinces are shown in
parentheses
* p < 0.05; ** p < 0.01; *** p < 0.001
handedness (Measure 1), a higher level of left-hander prevalence at the provincial
level was negatively associated with the individual-level attitudinal score on the
overall emancipative values index, as well as all four sub-indices. In contrast,
when using both self-reported dominant hand and grip strength to classify individual
handedness (Measures 2–4), a higher level of left-hander prevalence at the provincial
level was positively associated with the individual-level attitudinal score on
the overall emancipative values index, as well as the autonomy, equality, and voice
sub-indices, but negatively associated with the score on the choice sub-index. In addition,
these associations appeared to be stronger when a margin of error
was allowed in comparing grip strength between the two hands (Measures 3 and 4), as
evidenced by the larger coefficient sizes compared with Measure 2, which did
not incorporate a margin of error. In short, using self-reported dominant hand alone to
classify individual handedness leads to results that run largely counter to the theoretical
expectation, whereas combining self-report and grip strength to measure handedness
produces results that are largely consistent with the theoretical expectation.
7.6 Discussion
Focusing on one aspect of culture—cultural tolerance—this study conceptualizes
aggregate-level handedness as a contextual indicator and infers the amount of social
pressure against left-handedness in an area from the left-hander prevalence in that
area. Drawing on nationally representative data in China, this study applied a model-
based SAE method to subjective (self-reported dominant hand) and objective (hand
grip strength) measures of individuals’ handedness, yielding four sets of estimates of
left-hander prevalence at the provincial level for 28 out of 31 provinces. This study
found that the four sets of estimates generally agree with one another in relative
terms, though not in absolute values.
This study tested the construct validity of this new measure of cultural tolerance
(population prevalence of left handedness) by using the SAEs of left-hander preva-
lence obtained from the 2011 CHARLS sample to predict individual-level cultural
attitudes gleaned from the 2012 WVS-China sample. The two data sources are
independent from each other. The estimates of left-hander prevalence presumably
reflected the cultural environment during the early childhood of the CHARLS respon-
dents, who were 45 years and older in 2011, while the attitudinal responses in the
WVS-China sample reflected the respondents’ cultural values at the time of the inter-
view. These analytic features help alleviate the concern about potential endogeneity
in the regression model.
This study found strong evidence for construct validity when grip strength was
combined with self-reported dominant hand to classify individual handedness. After
controlling for age, sex, and provincial fixed effects, a higher prevalence rate of left-
handers at the provincial level predicted significantly higher scores at the individual
level of self-perceived autonomy, the emancipative values index, and its three sub-
indices—autonomy, equality, and voice. These indices capture favorable attitudes
toward freedom of choice and equality of opportunities. This finding is robust against
different margins of error used to classify individual handedness. The only exception
is that a higher prevalence rate of grip-strength left-handers at the provincial level
predicted a significantly lower score on the choice sub-index. One possible expla-
nation is that the choice sub-index consists of three items that summarize respon-
dents’ attitudes toward homosexuality, abortion, and divorce, and these domains do
not overlap well, at least in the Chinese context, with the cultural value placed on
tolerance of left-handedness.
On the other hand, using self-reported dominant hand to classify individual hand-
edness led to generally poor performance of SAEs of left-hander prevalence in
predicting individuals’ attitudes toward freedom of choice and equality of oppor-
tunities. The difference in predicting individuals’ attitudes suggests that the self-
reported dominant hand measure may be subject to reporting bias and less reliable
than objectively measured grip strength. Future research should avoid sole reliance
on self-reported handedness, especially in contexts where cultural pressure against
left-handedness is strong.
Despite these limitations, my proposed cultural measures have several potential
methodological merits. First, they are behavioral in nature and hence are less subject
to measurement error and reporting bias, which are common in survey measures of
attitudes, beliefs, and values. Even if some naturally left-handed respondents underreport
their dominant handedness because of cultural stigma or because they were
pressured to switch hands during childhood, we can alleviate the problem of misclas-
sifying handedness by using objectively measured grip strength data (Siengthai et al.
2008). Second, despite being collected at middle age or older, the handedness data
actually reflect the cultural pressure experienced by the adult survey respondents
during their early childhood, when handedness is typically established (Raymond
and Pontier 2004). The new handedness-based cultural measures are retrospective
in nature and yet robust against respondents’ recall bias because they are objective,
behavioral measures rather than subjective, attitudinal measures. The retrospective
or time-lagged feature of the new measure also implies that it is exogenous to
the sociocultural context at the time of survey, which provides an opportunity for
future researchers to investigate the long-term causal impact of early-life cultural
environment on later-life outcomes with cross-sectional data.
Third, developing attitudinal questions in surveys, especially comparable instru-
ments used in different countries, can be costly. Similarly, experimental tasks devel-
oped by social psychologists to assess cultural attitudes and values in a laboratory
environment are hardly applicable to large-scale data collection in a representative
sample of the general population. In contrast, thanks to the growing efforts of health
and aging studies, data on self-reported dominant hand and grip strength have already
been collected in many countries across the globe with consistent instruments and
procedures, as shown in Table 7.6. Together, these surveys cover 30-plus countries
in North America, Europe, Asia, and Africa. The new measure can be applied to
quantify cultural tolerance in many countries around the world where comparable
Table 7.6 Selected longitudinal surveys and the corresponding waves in which both self-reported
dominant hand and grip strength have been measured using the same instruments
Survey: HRS; CHARLS; SHARE; MHAS; KLoSA; JSTAR; WHO-SAGE
Coverage: U.S.; China; 20+ European countries; Mexico; Korea; Japan; China, Ghana, India, Mexico, Russia, and South Africa
2004–05 W7 W1
2006–07 W8 W2 W1 W1
2008–09 W9 W2 W2
2010–11 W10 W1 W4 W3 W3 W1
2012–13 W11 W2 W5 W3 W4 W4
2014–15 W12
Note HRS = Health and Retirement Study; CHARLS = China Health and Retirement Longitudinal
Study; SHARE = Survey of Health, Ageing and Retirement in Europe; MHAS = Mexican Health
and Aging Study; KLoSA = Korean Longitudinal Study of Aging; JSTAR = Japanese Study on
Aging and Retirement; WHO-SAGE = World Health Organization Study on global AGEing and
adult health
handedness data are available. As additional waves of these longitudinal surveys
are conducted and new birth cohorts are recruited, the new cultural measure can be
updated periodically at relatively low cost.
In a broader sense, this study highlights the progress in spatial data analysis
in social science research. In the past, such research topics as cultural values and
social norms were predominantly examined by a qualitative approach due to lack of
population-based data, let alone research into spatial dynamics of cultural values and
social norms. In addition to increased data availability, advancements in statistical
methods and computing power have considerably reduced the computational costs
of managing large data and estimating complex models. For example, hierarchical
generalized linear models, such as the one used in this study, were developed in the 1990s,
but their applications in social science research remained rare in the early 2000s. Now,
procedures of estimating hierarchical generalized linear models are a routine part of
most statistical software packages.
This study demonstrates how to draw on biomarker data for cultural research, but
the source of data is still a traditional survey. Future research on SAE in general
will benefit from integrating data from different sources (e.g., survey, administrative
records, Internet searches, and social media) or in different forms (e.g., spreadsheets,
text, and images). Such data fusion remains a challenging task but new techniques
are emerging, especially in the field of spatial-temporal analysis.
Acknowledgements An earlier version of this article was presented at the 2017 annual meeting
of the Population Association of America in Chicago, IL. The author thanks session participants
for useful comments. The author also thanks Arland Thornton for his helpful feedback and N.
E. Barr for her assistance in copy-editing. This study was supported by the National Institutes of
Health under a center grant to the Population Studies Center at the University of Michigan (R24
HD041028) and a Mellon Diversity Fellowship awarded to the author at Queens College.
References
Annett, M. (1993). The disadvantages of dextrality for intelligence—corrected findings. British
Journal of Psychology, 84(4), 511–516.
Annett, M., & Kilshaw, D. (1982). Mathematical ability and lateral asymmetry. Cortex, 18(4),
547–568.
Antecol, H. (2000). An examination of cross-country differences in the gender gap in labor force
participation rates. Labour Economics, 7(4), 409–426.
Armour, J. A., Davison, A., et al. (2014). Genome-wide association study of handedness excludes
simple genetic models. Heredity, 112(3), 221–225.
Bachrach, C. A. (2014). Culture and demography: From reluctant bedfellows to committed partners.
Demography, 51(1), 3–25.
Bakan, P. (1971). Handedness and birth order. Nature, 229(5281), 195.
Beidelman, T. O. (1973). Kaguru symbolic classification. In R. Needham (Ed.), Right and left:
Essays on dual symbolic classification (pp. 128–166). Chicago: University of Chicago Press.
Beukelaar, L. J., & Kroonenberg, P. M. (1986). Changes over time in the relationship between hand
preference and writing hand among left-handers. Neuropsychologia, 24(2), 301–303.
Brackenridge, C. J. (1981). Secular variation in handedness over ninety years. Neuropsychologia,
19(3), 459–462.
Brandler, W. M., Morris, A. P., et al. (2013). Common variants in left/right asymmetry genes and
pathways are associated with relative hand skill. PLoS Genetics, 9(9), e1003751.
Bryden, P. J., Bruyn, J., et al. (2005). Handedness and health: An examination of the association
between different handedness classifications and health disorders. Laterality: Asymmetries of
Body, Brain and Cognition, 10(5), 429–440.
Chayatte, C., Abern, S. B., et al. (1979). Left handed people. Irish Medical Journal, 72, 511.
Chen, H., Guarnaccia, P. J., et al. (2003). Self-attention as a mediator of cultural influences on
depression. International Journal of Social Psychiatry, 49(3), 192–203.
Coren, S. (1989). Left-handedness and accident-related injury risk. American Journal of Public
Health, 79(8), 1040–1041.
Coren, S., & Halpern, D. F. (1991). Left-handedness: A marker for decreased survival fitness.
Psychological Bulletin, 109(1), 90–106.
DiMaggio, P. (1997). Culture and cognition. Annual Review of Sociology, 23(1), 263–287.
Fernández, R., & Fogli, A. (2009). Culture: An empirical investigation of beliefs, work, and fertility.
American Economic Journal: Macroeconomics, 1(1), 146–177.
Gelman, A., & Little, T. C. (1997). Poststratification into many categories using hierarchical logistic
regression. Survey Methodology, 23, 127–135.
Gilbert, A. N., & Wysocki, C. J. (1992). Hand preference and age in the United States.
Neuropsychologia, 30(7), 601–608.
Guiso, L., Sapienza, P., et al. (2006). Does culture affect economic outcomes? Journal of Economic
Perspectives, 20(2), 23–48.
Halpern, D. F., & Coren, S. (1991). Handedness and life span. New England Journal of Medicine,
324(14), 998.
Hardyck, C., Goldman, R., et al. (1975). Handedness and sex, race, and age. Human Biology, 47(3),
369–375.
Harris, L. J. (1983). Laterality of function in the infant: historical and contemporary trends in
theory and research. In G. Young, S. J. Segalowitz, C. M. Corter, & S. E. Trehub (Eds.), Manual
specialization and the developing brain (pp. 177–247). New York: Academic Press.
Harris, L. J. (1990). Cultural influences on handedness: Historical and contemporary theory
and evidence. In S. Coren (Ed.), Left-handedness: Behavioral implications and anomalies
(pp. 195–258). Amsterdam: Elsevier Science.
Hicks, R. A., Evans, E. A., et al. (1978a). Correlation between handedness and birth order:
Compilation of five studies. Perceptual and Motor Skills, 46(1), 53–54.
Hicks, R. A., Pellegrini, R. J., et al. (1978b). Handedness and birth risk. Neuropsychologia, 16(2),
243–245.
Hoosain, R. (1990). Left handedness and handedness switch amongst the Chinese. Cortex, 26(3),
451–454.
Hsin, A., & Xie, Y. (2014). Explaining Asian Americans’ academic advantage over whites.
Proceedings of the National Academy of Sciences, 111(23), 8416–8421.
Iwasaki, S., Kaiho, T., et al. (1995). Handedness trends across age groups in a Japanese sample of
2316. Perceptual and Motor Skills, 80(3), 979–994.
Johnston, D. W., Nicholls, M. E. R., et al. (2013). Handedness, health and cognitive development:
Evidence from children in the National Longitudinal Survey of Youth. Journal of the Royal
Statistical Society: Series A (Statistics in Society), 176(4), 841–860.
Kavas, S. (2015). ‘Wardrobe modernity’: Western attire as a tool of modernization in Turkey. Middle
Eastern Studies, 51(4), 515–539.
Koelkebeck, K., & Uwatoko, T. et al. (2016). How culture shapes social cognition deficits in mental
disorders—a review. Social Neuroscience: null-null.
Lesthaeghe, R. J. (2010). The unfolding story of the second demographic transition. Population and
Development Review, 36(2), 211–251.
Levy, J. (1974). Psychobiological implications of bilateral asymmetry. In S. J. Dimond & J. G.
Beaumont (Eds.), Hemisphere function in the human brain (pp. 121–183). Oxford, England:
John Wiley.
Li, X.-T. (1983). The distribution of left and right handedness in Chinese people. Acta Psychologica
Sinica, 3, 268–276.
Markus, H. R., & Kitayama, S. (1991). Culture and the self: Implications for cognition, emotion,
and motivation. Psychological Review, 98(2), 224–253.
Masuda, T., Ellsworth, P. C., et al. (2008). Placing the face in context: Cultural differences in the
perception of facial emotion. Journal of Personality and Social Psychology, 94(3), 365–381.
Masuda, T., & Nisbett, R. E. (2001). Attending holistically versus analytically: Comparing the
context sensitivity of Japanese and Americans. Journal of Personality and Social Psychology,
81(5), 922–934.
McManus, I. C. (2009). The history and geography of human handedness. In I. E. C. Sommer & R.
S. Kahn (Eds.), Language lateralization and psychosis (pp. 37–58). New York: Cambridge
University Press.
Medland, S. E., Duffy, D. L., et al. (2009). Genetic influences on handedness: Data from 25,732
Australian and Dutch twin families. Neuropsychologia, 47(2), 330–337.
Medland, S. E., Duffy, D. L., et al. (2006). Handedness in twins: Joint analysis of data from 35
samples. Twin Research and Human Genetics, 9(01), 46–53.
Nisbett, R. E., Peng, K., et al. (2001). Culture and systems of thought: Holistic versus analytic
cognition. Psychological Review, 108(2), 291–310.
Park, D. K., Gelman, A., et al. (2004). Bayesian multilevel estimation with poststratification: State-
level estimates from national polls. Political Analysis, 12(4), 375–385.
Polavieja, J. G. (2015). Capturing culture: A new method to estimate exogenous cultural effects
using migrant populations. American Sociological Review, 80(1), 166–191.
Porac, C., & Coren, S. (1981). Lateral preferences and human behavior. New York: Springer.
Porac, C., Coren, S., et al. (1986). Environmental factors in hand preference formation: Evidence
from attempts to switch the preferred hand. Behavior Genetics, 16(2), 251–261.
Porac, C., & Rees, L. et al. (1990). Switching hands: A place for left hand use in a right hand
world. In S. Coren (Ed.), Left-handedness: Behavioral implications and anomalies (pp. 259–290).
Amsterdam: Elsevier Science.
Raymo, J. M., Park, H., et al. (2015). Marriage and family in East Asia: Continuity and change.
Annual Review of Sociology, 41(1), 471–492.
Raymond, M., & Pontier, D. (2004). Is there geographical variation in human handedness?
Laterality: Asymmetries of Body, Brain and Cognition, 9(1), 35–51.
Satz, P. (1972). Pathological left-handedness: An explanatory model. Cortex, 8(2), 121–135.
Siengthai, B., Kritz-Silverstein, D., et al. (2008). Handedness and cognitive function in older men
and women: A comparison of methods. The Journal of Nutrition, Health & Aging, 12(9), 641–647.
Smart, J. L., Jeffery, C., et al. (1980). A retrospective study of the relationship between birth history
and handedness at six years. Early Human Development, 4(1), 79–88.
Tambs, K., Magnus, P., et al. (1987). Left-handedness in twin families: Support of an environmental
hypothesis. Perceptual and Motor Skills, 64(1), 155–170.
Teng, E. L., Lee, P.-H., et al. (1976). Handedness in a Chinese population: Biological, social, and
pathological factors. Science, 193(4258), 1148–1150.
Thornton, A. (2001). The developmental paradigm, reading history sideways, and family change.
Demography, 38(4), 449–465.
Thornton, A. (2005). Reading history sideways: The fallacy and enduring impact of the develop-
mental paradigm on family life. Chicago: University of Chicago Press.
Thornton, A., & Achen, A. et al. (2010). Creating questions and protocols for an international study
of ideas about development and family life. Survey Methods in Multinational, Multiregional, and
Multicultural Contexts, Wiley, 59–74.
Thornton, A., Binstock, G., et al. (2012a). International fertility change: New data and insights from
the developmental idealism framework. Demography, 49(2), 677–698.
Thornton, A., Ghimire, D. J., et al. (2012b). The measurement and prevalence of an ideational model
of family and economic development in Nepal. Population Studies, 66(3), 329–345.
Triandis, H. C. (2001). Individualism-collectivism and personality. Journal of Personality, 69(6),
907–924.
Varnum, M. E. W., & Hampton, R. S. (2016). Cultures differ in the ability to enhance affective
neural responses. Social Neuroscience, 1–10.
Welzel, C. (2013). Freedom rising: Human empowerment and the quest for emancipation. New
York: Cambridge University Press.
Yamamoto, Y., & Sonnenschein, S. (2016). Family contexts of academic socialization: The role of
culture, ethnicity, and socioeconomic status. Research in Human Development, 13(3), 183–190.
Zhang, X., Holt, J. B., et al. (2014). Multilevel regression and poststratification for small-area estima-
tion of population health outcomes: A case study of chronic obstructive pulmonary disease preva-
lence using the behavioral risk factor surveillance system. American Journal of Epidemiology,
179(8), 1025–1033.
Zhao, Y., Hu, Y., et al. (2014). Cohort profile: The China Health and Retirement Longitudinal Study
(CHARLS). International Journal of Epidemiology, 43(1), 61–68.
Zhao, Y., Strauss, J., et al. (2013). The China health and retirement longitudinal study (CHARLS)—
Users’ Guide for the 2011–2012 national baseline survey. Beijing, China: National School of
Development, Peking University.
Chapter 8
Conservation of Cave-dwelling Village
using Cultural Landscape Gene Theory
Anrong Dang, Dongmei Zhao, Yang Chen, and Congwei Wang
8.1 Introduction
The village cultural landscape is the product of a long interaction between the natural
environment and folk culture. With its strong regional and national character, it is both
the crystallization of and a witness to agricultural civilization and a carrier of regional
culture (Dong et al. 2019; Chen et al. 2014; Yang et al. 2013; Yang and Dang 2012; Li et al.
2010). Under the influence of the environmental concept known as “oneness of nature and
man”, the cave dwelling, regarded as a building rooted in the earth, became a principal
architectural form in the Wudinghe river basin (Dang et al. 2013, 2012a, b; Ma 2012). The
Wudinghe river, located in the northern Loess Plateau of China, is one of the main
tributaries in the middle reaches of the Yellow River. Numerous Neolithic Longshan culture
sites are distributed along both of its banks, together with various types of cave village
cultural landscape that constitute an integrated system; all of these have great academic
value and significance for cultural inheritance (Zhao 2010; Dang et al. 2009; Li 2007; Qin
et al. 2008).
According to cultural landscape gene theory, each cultural landscape contains a cultural
factor, its gene, that both distinguishes it from other cultural landscapes and can be
inherited from generation to generation (Huo and Liu 2005; Liu 2004). Effective
A. Dang (✉) · D. Zhao · Y. Chen · C. Wang
School of Architecture, Tsinghua University, Beijing 100084, People’s Republic of China
e-mail: danganrong@126.com
D. Zhao
e-mail: 81864349@qq.com
Y. Chen
e-mail: c_yang2012@126.com
C. Wang
e-mail: 124252113@qq.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_8
identification of a landscape can be achieved only by grasping this gene at a fundamental
level. In addition, cultural landscape gene theory proposes that settlement landscape genes
can be identified from five aspects: environmental factors, layout forms, totem signs,
principal public buildings, and dwelling characteristics (Liu 2004). In this chapter, the
authors take the concepts of cultural landscape gene theory as a starting point to analyze
five genetic characteristics of the cave-dwelling village cultural landscape (CDVCL) in the
Wudinghe river basin, namely the natural gene, cultural gene, spatial gene, material gene,
and intangible gene, and explore their cultural values in order to conserve the CDVCL.
8.2 The First Genetic Characteristic of CDVCL: Natural
Gene
8.2.1 The Loess Landform
The landform of the Wudinghe river basin can be roughly divided into three parts: the
hinterland of the Maowusu desert in the northwest, a large area with low population
density; the river source area in the southwest; and the loess hilly-gully region in the
southeast, where residents have mainly settled. Covered with thick loess and cut into
numerous fragments by the long-term erosion of the Wudinghe river tributaries, the region
is characterized by interlaced ravines and by landforms such as tableland, loess hill,
hillock and groove. With a depth of 100–200 m, low permeability and a strong ability to
stand vertically, the loess provides a very good precondition for the development of
cave dwelling.
8.2.2 Semi-arid Climate
With an annual average temperature of 7.9–11.2 °C and an annual rainfall of 300–550 mm,
the Wuding river basin has a temperate continental monsoon climate. Precipitation is
unevenly distributed and is concentrated mainly in summer (from June to September), with
frequent rainstorms and torrential downpours. In sum, the environmental factors of the
basin (a dry climate with little rain, cold winters, hot summers, and a lack of forest)
form the foundation for the formation and development of the cave dwelling, which is warm
in winter and cool in summer, green and economical, and requires no wood.
8.2.3 Dry Farming
Because the ecological environment is generally poor and the landscape complex, most of
the Wudinghe river basin has to rely on dry farming, which depends mostly on natural
precipitation. Moreover, large-scale irrigation and water conservancy facilities are hard
to construct because flat land is scarce. As a result, the efficiency of agricultural
production is fairly low, and residents have therefore always settled close to the arable
land. The dry farming mode of the basin determines not only the layout of the villages but
also the choice of the most suitable building type there, the cave dwelling. Adapted to
the sloping topography, gaining interior space horizontally, and making the most of
undisturbed earth as its walls and roof, the cave dwelling helps to protect the limited
arable land. In particular, the earth dug out during the construction of cliff-side caves
can be used to fill and extend the slope, thereby increasing the farmland area.
8.3 The Second Genetic Characteristic of CDVCL:
Cultural Gene
Rooted in deep loess, the cave dwellings are mostly built along rivers, cliffs or slopes,
forming one or more layers of tiered cave zones along the contour lines, or sunken caves,
which can be described as “in the village without seeing it; only the tops of the tree
crowns are visible”.
8.3.1 Village Pattern Along the River
Influenced by traditional Chinese “geomancy” culture, the site selection of cave dwellings
generally followed a rule: “face the water with the back to the mountain, carry Yin and
embrace Yang”. To avoid natural disasters such as floods, debris flows, landslides and
oblique slips, villages are always distributed beside the river and extend along the
concave and convex folds of the hills and valleys, achieving a balance between nature and
humanity while making full use of nature. The spatial layout of a village is therefore
mostly determined by the river morphology. More specifically, there are three layout
types: a dendritic structure following the directions of the valley, a parallel structure
perpendicular to the valley, and a scattered structure spreading along the branches of
the valley.
8.3.2 Village Pattern Along the Cliff
A cave dwelling that is built along a cliff on the loess slope and uses the open space in
front of it is called a “backer cave”; such caves normally show a curved or broken-line
distribution along the contour line. This type of cave dwelling has good daylight, but it
lies a certain slope distance from the river and the road, so it is not very convenient
for residents to access, transport materials and fetch water.
8.3.3 Village Pattern Along the Slope
This type of cave dwelling is normally built on the sunny side of a loess slope beside the
river or on the upper part of a rock wall, both of which are usually too steep to
cultivate. The main body across the courtyard consists of 2–5 backer caves, which together
with a pigsty or sheepfold, toilet, walls and gates form a basic building unit. The caves
extend along the slope in compliance with the topography and usually show a linear
distribution. Seen from a distance, layers of circular arches scattered at different
elevations outline the whole village. The advantages of this type are convenient
transportation and water access, as well as shelter from sand, while its weakness is a
poorer view than that of the backer cave.
8.4 The Third Genetic Characteristic of CDVCL: Spatial
Gene
8.4.1 Production Space
Most land in the Loess Plateau is extremely barren, so low crop yields can feed only a
limited population. Therefore, in addition to the restrictions of landform, the size of a
village is determined to a great extent by the quantity and quality of the arable land
around it. Because of the gully topography, arable land in the Wuding River Basin is
mostly small and scattered rather than large and flat, and the distance between the
farming area and the residential space is quite short. As a result, most cave villages
are small in scale.
8.4.2 Living Space
Generally, one courtyard houses one family. Comprising main caves, an outdoor hearth or
lean-to, a livestock shed or corn storehouse, a millstone, a pigsty or henhouse, a seepage
pit or water cellar, a toilet, fences and gates, the courtyard serves both living and
production functions.
8.4.3 Mental Space
Because of their belief in the Mountain-god, villagers regard the local temple as a place
of spiritual sustenance, so the temple is typically the main place for public activities.
On the first day, the ceremony of “invitation of the Mountain-god” is usually held, and
then, in the name of the god, public activities such as Yangko, the serpentine maze and
drama shows are organized. In terms of spatial form, therefore, the temple, outdoor stage
and square are adjacent to one another so that all the activities can proceed smoothly.
8.5 The Fourth Genetic Characteristic of CDVCL:
Material Gene
8.5.1 Construction Materials
According to structure and material, cave dwellings can be divided into the following
types: loess cave, interface cave, brick cave and stone cave, as well as derived types
such as the adobe cave, thin shell cave and brick-stone cave. More specifically, the adobe
cave and the thin shell cave are derived from the loess cave and the brick cave
respectively, and the brick-stone cave uses both kinds of building material.
(1) Loess cave. It is the most primitive form of cave, derived from the ancient habits of
cave dwellers. The loess cave is generally 3 m wide, 3 m high and 7 to 8 m deep, while
the deepest can reach 20 m. The advantages of caves, namely being warm in winter and
cool in summer, saving cost and material, and being easy to build, are fully embodied
in the loess cave. Its main weaknesses, however, are poor daylight and air circulation,
windows and walls that are hard to paint, front walls that are easily weathered and
eroded by rain, and the risk of collapse caused by landslides. With the gradual
improvement of residents’ living standards after the founding of new China, the loess
caves have been largely abandoned.
(2) Interface cave. Building on the loess cave, several improvements are made: widening
the original opening, deepening the cave by 1 or 2 m, using bricks or stones to build
the front wall (mostly between 1.5–m), and adding a newly made round window and wooden
door. The junction between the loess and stone (brick) parts is hidden with screening
so that they appear as one. Bigger doors and windows mean a larger lighting area and
more sunlight, and the other improvements bring better heat preservation, firmness
and appearance.
(3) Brick cave. It is a kind of arched cave built with clay bricks and mortar. It has the
advantage of a beautiful and neat appearance, but the weaknesses of poor heat
preservation and rapid aging.
(4) Stone cave. It is a kind of arched cave built with rocks and dust. Specifically, the
front wall is assembled from square or arc-shaped rocks cut to accurate size, the
internal wall is painted with white stucco, the stove is made of polished stone slabs,
and the floor and the Kang are covered with tiles. All these treatments indicate that
the stone cave is a new generation of cave.
8.5.2 Facade Forms
Cave dwellings take different forms, including the backer cave, sunken cave and detached
cave. First, the backer cave is the most widely used, with a terraced distribution along
the slope or the edge of the tableland. Second, the sunken cave is dug into the inner
walls of a square pit, forming an underground courtyard. Third, the detached cave stands
independent from its surroundings and is also known as the “head cave”.
The most important facade of a cave dwelling is the front one, which faces the courtyard
and is often referred to as “the cave face”, meaning that it is as important as a person’s
face; it is always decorated delicately by the local residents. From top to bottom, the
facade comprises the parapet, eaves, arch head line, doors, windows, etc. The doors and
windows occupy most of the “face” and are located at its center, so they become the most
carefully decorated parts.
The “face” of the cave dwellings in the Wuding river basin is mostly in the form of the
“open and full arch window”: the area from the arch line to the middle part of the face
is entirely taken up by windows, while the doors are on the other side, and brick, rock
or adobe is used only below the windowsill. There are also some numerical conventions;
for example, the number of layers in the windowsill must be odd, and is 17 if the sill is
of brick or adobe. If several caves are arranged in a line, the patterns of their window
lattices should not be identical. Moreover, the arches are normally double, triple or
concentric in shape, so the overall form creates an atmosphere of clarity, symmetry
and grandeur.
8.5.3 Flat Pattern
The plan layout and structural pattern of cave dwellings mostly follow the traditional
Chinese residential courtyard, that is, the enclosed courtyard, whose main forms are the
three-section courtyard, the quadrangle courtyard, and a combination of the two. In
addition, some poor families build only one regular room and enclose the courtyard with
walls. Because the cave dwellings are located in the Loess Plateau, with its crisscrossing
ravines and complicated topography, the terrain always imposes different requirements on
the layout and structure of the courtyard.
The layout of cave dwellings usually has to abide by rules of etiquette, such as facing
south and placing the living space in the southeast. Within a complete cave courtyard,
only three cave rooms are dug on the frontage. In keeping with the concept of “virtues
between father and son”, the middle cave, which is the largest, is the elders’ house; the
adjacent east and west rooms are assigned to the eldest and the second son respectively;
and the descendants and servants live in the caves (similar to wing-rooms) on both sides.
Furthermore, the middle cave is often used as a central hall where the spirit tablets of
ancestors are placed for worship.
8.5.4 Partial Adornment
The predominant colors of cave dwellings are yellow and greenish gray. The main external
decorations of caves include the material, style, craft and color of the windows, door
curtains, top bars, etc., which show whether the master is hardworking as well as the
family’s wealth. Above all, the arch curves of the doors, windows and entrance are the
most critical decorative parts, and traditional Chinese culture and regional folk ideas
are embodied in the decoration patterns.
The internal decoration of cave dwellings includes the inner shape, the wicket (that is,
a small door between two caves) and the curtain covering it, as well as paintings and
coverings for furniture and household appliances, which are mostly the handiwork of local
residents, especially the housewives. The tank and Kang surrounding paintings are the most
representative, and the latter was included in “the list of second batch of national
intangible cultural heritage” in 2008.
According to material, cave dwelling decoration can be divided into rock-made, brick-made,
wood-made and paper-made. Rock and brick are usually used for carved lions, drums,
foundations, screen walls, and arch head lines on the facade, as well as overhangs,
parapets and so on, with auspicious patterns of “happiness”, “affluence” and “longevity”.
Wood is mainly used for the carvings of gate raising and window lattices. Paper refers to
window and roof paper-cuts, Kang surrounding paintings, hanging curtains, door gods, etc.,
which can be replaced from time to time.
8.6 The Fifth Genetic Characteristic of CDVCL: Intangible
Gene
8.6.1 Religion
In the Wuding river basin, a variety of religious beliefs coexist, including not only
Buddhism, Taoism and Confucianism, which are widespread in China, but also Catholicism,
Christianity, and Islam. The religious beliefs of the Wuding River Basin are characterized
by primitiveness, practicality and diversity. The system of the Three Wise Kings and Five
August Emperors, closely connected with agricultural civilization, was the object of
religious worship at the early stage. With the rise of the combined culture of
Confucianism, Buddhism and Taoism, temples were gradually constructed. In the Ming and
Qing dynasties, the temples and folk meetings became connected, forming a strong
grassroots social organization in which entertainment activities promoted public
conservation efficiently, and the ingenious union of folk art and sacrificial activities
brought more vigor to the temple fairs, strongly attracting the masses to participate in
religious cultural activities. Numerous temples of different religious beliefs existed in
each village of the area, and almost every family has its own god. This is mainly due to
the special location of the Wuding river basin, a crucial battlefield for the Hans, Huns
and other minorities in the transitional zone between the farming and nomadic areas. In
ancient times, frequent wars and natural disasters left people struggling on the edge of
starvation and in bitter pain. People therefore longed for a peaceful life and turned to
the gods for blessing; at the same time, because war promoted cultural communication and
integration, religious beliefs became diverse.
8.6.2 Traditional Customs
As most residents in the area hold a pantheistic faith and believe that “the gods are
everywhere”, traditional folk customs there usually show a fear of the gods. Taking cave
building as an example, from the very beginning, when the location is decided, the
geomancer plays an important part in selecting the terrain, orientation and propitious
timing. First, the geomancer helps the master to decide the orientation of the new
building by means of the “compass” and tells him which day is good for beginning
construction. On the ground-breaking day, the master holds a series of worship
activities, including offering food and drink, lighting incense and kowtowing to the local
soil god, announcing that construction is about to start and praying that the god will
bless the whole family in the coming days; only then can construction start. On the day
of completion, another ceremony, named ‘closing dragon’s mouth’, should be held.
Moreover, before the move-in date, there are other activities such as window setting,
god placing and cave warming.
8.7 Conclusions
Embodying the environmental concept of “the unity of nature and man”, the cave dwelling
village in the Wuding river basin is a typical representative of the human settlement
environment and an important part of human cultural heritage, showing distinctive local
characteristics. By analyzing the form and other aspects of the settlement, this chapter
concludes that, throughout the process from the siting and formation of the settlement to
its development, its form is the result of the joint interaction of history, the natural
environment and the human environment. Conserving, developing and passing on traditional
settlements of historical, cultural and artistic value by conserving their cultural
landscape genes not only helps to maintain their regional characteristics and cultural
continuity, but also offers reference and practical value for modern architectural design.
References
Chen, Y., Dang, A., et al. (2014). Building a cultural heritage corridor based on geodesign theory
and methodology. Journal of Urban Management, 2014(1–2), 121–141.
Dang, A., Zhao, D., & Cheng, Y. (2012a). Characteristics of traditional cave dwelling village cultural
landscape at Yulin Prefecture. Traditional Village Conservation, 10, 128–133.
Dang, A., Ma, Q., & Lv, J. (2009). Conservation of Chinese traditional culture based on information
technology. The 14th Inter-University Seminar on Asian Mega-Cities (IUSAM), Taipei City,
Taiwan, China. March 12–15, 2009.
Dang, A., Ma, Q., & Zhao, J. (2012b). Conservation study on village traditional culture based on
geo-information technology. Urban Flux, 1, 26–29.
Dang, A., Zhang, Y., & Chen, Y. (2013). Sustainable-oriented study on conservation planning of
cave-dwelling village culture landscape. In M. Kawakami et al. (Eds.), Spatial planning and
sustainable development. Springer.
Dong, Y., Fei, Y., & Dong, Y. (2019). Analysis of the cultural landscape characteristics of Hezhe
traditional settlements based on the genetic method of cultural landscape. Development of Small
Cities and Towns, (03), 98–105.
Huo, Y., & Liu, P. (2005). The town form and the landscape of the Loess Plateau. Journal of
Architecture, 12, 42–44.
Li, Y., & Dang, A. (2010). Application of spatial statistical for village system planning. In Spatial
Integrated Humanities and Social Science. China Science Press, (2), 257–269.
Li, J. (2007). Research on the ancient settlement of Mizhi cave in northern Shaanxi. Xi’an: Xi’an
Academy of Fine Arts.
Liu, P. (2004). The gene expression and recognition of ancient village cultural landscape. Journal
of Hengyang Normal University, 24(4), 1–8.
Ma, W. (2012). The planning and design of traditional cave village landscape in northern Shaanxi—
based on the principle of cultural ecology. Beijing: Tsinghua University.
Qin, Y., & Dong, J. (2008). The villages and natural environment of the Loess Plateau in northern
Shaanxi in the late Qing and early republic. Gansu Social Science, 2008, 210–213.
Yang, Y., & Dang, A. (2012). The cross-disciplinary research methods on village cultural landscape,
taking Nuodeng village as an example. Urban Flux, 2012(1), 18–22.
Yang, Y., Zhang, D., & Dang, A. (2013). Discuss on the spatial and timing characteristics of
the forming mechanism of village cultural landscape—taking Nuodeng village as an example.
Chinese Garden, 3, 60–65.
Zhao, J. (2010). The conservation planning and design of cave village landscape in northern
Shaanxi—taking Dangjiashan village as an example. Beijing: Tsinghua University.
Chapter 9
Digitalized Enka-Style Taipei
C. S. Stone Shih
9.1 Introduction
This study explores the popularity of Taiwanese ballads that reflect influences from
Japan. These ballads evolved in Taipei’s old-town districts over the course of history.
As a pioneer of the geographic information system (GIS), Goodchild (2004) proposed a
spatially integrated approach to the humanities and social science. The integrity of
this approach relies largely on the technological progress of the GIS, such as virtual
reality, the Internet, and wireless terminals. Progress in GIS information processing
technology has led to the regular use of time–space-oriented information in the
humanities, where a cultural–spatial–analytical perspective is employed (Bodenhamer et al.
2010). Since 2001, sociology digital mapping has been developed as a separate discipline
in Taiwan. By rethinking the social space and the integrity problem, Shih proposed the
“GIS bridging method”, which uses digital drafting for multidimensional adjustment,
interpretation, and reconstruction.
The bridging method is a human-centered study that includes multiple methods and
data for conducting comprehensive quantification and GIS integration (Shih et al.
2010). A researcher maps the cultural space with the social space and physical space
(Liechty 2003). The differentiation within a class and among different classes causes
the appropriation and redefinition of any space by the ascendant group according to
cultural logic (Bridge 2002; Podmore 1998). The social space may be defined as a
cultural site not only selected as the geosocial locale of ethnographic gaze but also
as a centralized location within a cultural community that serves as the confluence of
banal ritualized activity and exchanges of cultural currency (Alexander 2003). The
main factors of social space discussed in this study are as follows: class and ethnic
C. S. Stone Shih (✉)
Department of Sociology, Soochow University, Taipei 111, Taiwan
e-mail: post.cstone@gmail.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3- 030-52734- 1_9
107
108 C. S. Stone Shih
groups and dynamic change in Taipei’s historical landscape via musical genres, espe-
cially Taiwanese ballad. In this study, we explored the cultural space of Taiwanese
ballad related to Enka sources by bridging qualitative interviewing methods with the
GIS. All these cultural imagery and historical discourses are intertwined with cultural
products and imbued with social meaning, thus providing a critical understanding of
the relevance of Enka in Taipei.
9.2 Taiwanese Enka-Style Ballad Performers in the 1960s:
Two Cases¹
Enka has considerably influenced many post-war popular Taiwanese songs, such as
songs of the singing diva Lu-Shyia Chi (紀露霞) and primo singer Yi-Feng Hung
(洪一峰) in the 1960s.
Enka originated in the 1880s during the Meiji period (1868–1912), at the time of the
jiyu minken undo (freedom and civil rights movement) of 1874–1890. The freedom
of public expression was restricted by the Meiji government. Thus, to avoid government
restrictions and police interference, Japanese intellectuals used a speech–song
form to express their ideas at public gatherings. The most primitive form of Enka,
half sung and half spoken, was performed by Enkashi (Enka singers). The musical
style of Enka changed again in the early post-war period (i.e., the 1950s). The largest
influence on Enka in this period was not Naniwa bushi, a genre of traditional Japanese
narrative singing, but the American jazz bands that thrived in major cities and military
bases. However, Japanese burusu
became popular again in the 1960s. The songs composed in Japanese burusu were
known as mudo-kayo (mood songs). A slow-to-medium tempo, the topic of lost love,
and the use of Western instruments such as the saxophone and guitar are the signature
musical traits of Japanese burusu.
To distinguish between the Japanese and Western popular music styles, the
music industry in the 1970s used the label Enka to represent and emphasize
the popular Japanese musical styles. Enka was derived from popular songs, and its
independence as a genre can be traced back to the 1960s. Since the establishment of the
special department of the Columbia Record Company (コロムビア販賣株式會社)
in 1963, Enka has been considered a unique genre of popular Japanese music
(Kishi 2013). The typical accompanying instruments for Japanese-style Enka are
the alto and tenor saxophones, trumpet, electric bass, piano, and string instruments.
The differences between Enka and other popular music became more obvious between the 1970s and 1980s because of
the emergence of nyu myujikku (new music), which included brighter and lighter
lyrics and modern music styles. Compared with the older audience who preferred
the sad melodies of Enka, the audience of nyu myujikku is younger. Currently, Enka
has become a distinct genre, and its subgenres have been developed on the basis
of the musical style and lyrical subjects. According to Yano, Enka is currently less
popular than before and has been replaced by other forms of popular music (Yano
et al. 2003). Because popular Japanese music frequently incorporates elements of
Western music to produce new genres and subgenres, more varieties of popular
music exist than ever before. Thus, the history, musical style, and audience of Enka
are not fixed under different cultural contexts. People still sing and listen to Enka on
different occasions in their daily life.
¹ All the interviewees were interviewed by the author between 2009 and 2011.
The two singers, Yi-Feng Hung (the primo singer) and Lu-Shyia Chi (the singing
diva), are representative performers of Enka-style Taiwanese ballad in the 1960s.
Hung covered most of the musical repertoire of the Enka bass singers
of the time. For example, he covered 15 musical pieces of Frank Nagai (フランク永井),
such as “Let’s Meet at Yurakucho” (有樂町で逢いましょう). He also
performed nine musical pieces of Ishihara Yuujirou (石原裕次郎), such as “Rusty
Knife” (錆びたナイフ); two musical pieces of Ichiro Kanbe (神戶一郎); and several
pieces of Mifune Hiroshi (三船浩), such as “Man’s Blues” (男のブルース). These
musical pieces were performed by well-known bass singers. Moreover, the selected
songs were related to Japanese culture. Owing to his popularity, Hung traveled to Japan and
performed at the Nihon Gekijou theater in Tokyo in 1962.
Due to his inclusion in the Nihon Gekijou program and because he was
considered a bass singer as good as Frank Nagai, Hung was the male representative
of Taiwanese ballad. He recorded many albums and more than 100 individual pieces
of music. He also played the leading role in Taiwanese movies of the 1960s. His
vocal performance exhibits the same bass and aura as that of Frank Nagai, which
helped popularize his songs. After Hung performed at the Nihon Gekijou theater, he
returned to Japan eight times and taught Chinese folk songs in Shinjuku.
He divided his time among the major cities in Japan, including Tokyo, Nagoya,
Sendai, Yokohama, Osaka, and Kyoto. Lu-Shyia Chi adapted songs from
music-school-trained Japanese divas. For example, Misora Hibari’s (美空ひばり) musical
repertoire was the primary source of Chi’s adaptations, such as “Hill at Dusk”
(夕やけ峠) and “Hibari the Flower Girl” (ひばりの花売娘). In contrast to the musical
sources of Hung, who adapted songs from a limited selection of singers, Chi’s sources
were more varied. For example, her other sources were the musical pieces of Awaya
Noriko (淡谷のり子), Hirano Aiko (平野愛子), Takamine Mieko (高峰三枝子),
and tenor singers such as Haida Katsuhiko (灰田勝彦). Despite adapting songs from
various Japanese singers, Chi displayed her unique singing skill by effectively using
the resonance of her head and thoracic cavity for creating a soft and mellow tone,
which she developed from her bel canto learning experience. Chi’s personal
singing style earned her the title of “Taiwanese Misora Hibari.” In contrast to Chi,
who sang both Taiwanese ballad and Mandarin songs, Hung sang only Taiwanese
ballad. However, similarities exist in the content of their work. The songs selected
by them were generally well-known classics in their country of origin, and their
songs differed from the majority of Enka songs adapted to Taiwanese because the
adaptations became classics in Taiwanese rather than in Japanese.
Although Lu-Shyia Chi was honored with the title of “Taiwanese Misora Hibari”
due to the popularity of her hit song “Hill at Dusk,” her song differed from
Misora Hibari’s original version and could not compare with Hibari’s classic.
The original Japanese version of “Hill at Dusk” was composed in 1957 during the
second rural-to-urban population migration. Due to the rise of the cultural industry
driven by broadcasting and television, Japan entered the “era of singer adoration
and popular music fervor” (愛唱歌時代) after the 1950s. Meanwhile, Taiwan
had experienced an ethnic conflict in 1947: the massacre of the 228 incident led the
Taiwanese to long for a way out of trauma. Taiwan also entered the era of
singer adoration and popular music fervor in the 1960s (Shih 2016). Compared with
the social background in Japan, the post-war rural-to-urban migration in Taiwan
occurred from the 1950s to the 1960s. Chi’s version of the song was a response to
the urbanized social background. For example, consider the country girl described
in the lyrics of “Hill at Dusk,” who moved to a factory in the metropolitan area
of Taipei to find a job. While working day and night, she missed her mother
in a village in southern Taiwan and imagined standing on the hills in the
evening, gazing toward her hometown. This scenario illustrates a social–political
process of how Japanese urban songs were transformed into Taiwanese ballad. The
Taiwanese people localized the lyrics that contained Enka. This often occurred in the
early Showa period in the historical stage and echoed the social situation of Taiwan
in the 1960s. As an urban popular song genre, Enka emerged through the stories
of misfortune that reflected the backwardness of Taiwan’s industrialization and the
distress in the political environment.
9.3 Digitalized Enka Pertaining to Taipei that Reflected
a Mixed-Race Cultural Space
The Japanese songs that Lu-Shyia Chi covered in Taiwanese could be defined
as “mixed-race songs.” After the 228 incident, in the 1960s, Taiwan’s economy
did not reach mature global circulation, political control of popular music was not
strict, and relevant legal norms were not yet institutionalized. In this situation, local
lyricists added lyrics to the existing songs, reinterpreted foreign songs, and localized
foreign songs in a multidisciplinary manner to transform them to the Taiwanese style.
I consider this process as “quasi-globalization” (Shih 2014), which was a cultural
rather than economic globalization. Quasi-globalization illustrated the success of
Taiwanese ballad in the 1960s. The incorporation of Enka into Taiwanese songs
by singers such as Lu-Shyia Chi and Yi-Feng Hung was only one form of
mixed-race songs. The singers incorporated pieces not only from Japan but also from
China (especially Shanghai), Hong Kong, Vietnam, South Korea, the United States,
and Italy.
The mixed-race songs studied in this research, especially those drawing on Enka, are
analyzed in terms of historical time, from the Japanese colonial era (1895–1945)
to Kuomintang governance in the early post-war period, as
well as in terms of the distribution of musical media venues in the cultural space. The
venues for popular music performance were mainly in the old district of Taipei, the
western district near the Tamsui River. Before Japanese colonization in Taiwan, the
Chinese Qing government had already established Taipei City. The main administrative
center was later renamed “inner city” (城內) by the Japanese. Traditionally,
inner city, Dadaocheng (大稻埕), and Monga (艋舺) in Taipei’s western district
were together known as the “three prosperous streets” (三市街). After occupying
Taiwan, the Japanese established Seimonchō (西門町) close to inner
city. The region resembled Asakusa in Tokyo, a center of recreation and business
in the Japanese metropolis. As displayed in Fig. 9.1, four districts of Taipei
constitute the main cultural space in this study: Inner city, Dadaocheng, Monga,
and Seimonchō. Regarding the space of popular songs, Jones (2004) proposed the
concept of “media loop” to explain pop music circulation in Shanghai in the 1930s.
The work of Jing-hui Li (黎錦暉), who is the founding father of Chinese yellow
music (modern song), was taken as an example. In a sense, the film, Peach Blossom
Dream (1935), represented the creation of a new media loop at that time. The sort
of urban milieu in which Li’s yellow music had first gained popularity became the
object of filmic representation in movies pertaining to the lives of sing-song girls who
performed the music. The screen songs from the movie were in turn published in
collections of sheet music and film magazines, made into gramophone records,
broadcast, and ultimately emulated by sing-song girls in the dance halls.
Figure 9.1 displays the cultural space of Taipei’s musical genres and
media (1930–1970). The venues presented through digital mapping, such as
cinemas, dance halls, radio stations, and record shops, were used for
playing music and staging dramas in Taipei. These venues could be classified
into six categories according to musical style from the construction
of the city to the initial post-war period: (1) Beijing opera/nanguan/stage play,
(2) Taiwanese puppet show/Taiwanese opera, (3) popular music from
Shanghai and Japanese movies, (4) popular music from Taiwanese movies and
ballads, (5) occidental movies, and (6) outdoor cabaret. Some of the venues were
assigned to several categories. This indicates that a venue could serve diverse
genres and could be regarded as a mixed hall. Consider the
following examples: Eirakuza (永樂座, #13), where Beijing operas, stage plays, and
Japanese and Shanghai movies were presented before the war and Taiwanese movies
were shown after the war; The First Theater (第一劇場, #12), where screenings
of occidental movies, Beijing opera performances, Japanese movies, and Taiwanese
movies occurred after the war; and Yoshino Kan in Seimonchō (芳乃館, #17), where
Beijing opera performances occurred before the war and screenings of Japanese,
occidental, and Chinese movies occurred after the war. The varied performances
in the mixed halls indicate that the musical space in Taipei from the Japanese
colonial period to the initial post-war period was hybrid and complex.
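The classification described above can be sketched as a small data structure. The following Python fragment is only an illustration under stated assumptions: the `Venue` class, the helper functions, and the category numbers assigned to each venue are hypothetical readings of the prose, not the author's GIS schema. The Da Guang Ming Theater (#9) is included as a single-category contrast, on the assumption that its program consisted of Taiwanese movies alone.

```python
from dataclasses import dataclass, field

# The six venue categories listed in the text, keyed by their number.
CATEGORIES = {
    1: "Beijing opera/nanguan/stage play",
    2: "Taiwanese puppet show/Taiwanese opera",
    3: "Popular music from Shanghai and Japanese movies",
    4: "Popular music from Taiwanese movies and ballads",
    5: "Occidental movies",
    6: "Outdoor cabaret",
}


@dataclass
class Venue:
    """Hypothetical record for one mapped venue (not the author's schema)."""
    map_id: int          # the "#" number used in Fig. 9.1
    name: str
    district: str
    categories: set = field(default_factory=set)

    def is_mixed_hall(self) -> bool:
        # A venue assigned to several categories is treated as a mixed hall.
        return len(self.categories) > 1


# Category assignments are an assumed reading of the chapter's examples.
VENUES = [
    Venue(13, "Eirakuza", "Dadaocheng", {1, 3, 4}),
    Venue(12, "The First Theater", "Dadaocheng", {1, 3, 4, 5}),
    Venue(17, "Yoshino Kan", "Seimonchō", {1, 3, 5}),
    Venue(9, "Da Guang Ming Theater", "Dadaocheng", {4}),
]


def mixed_halls(venues):
    """Return the venues that fall into more than one category."""
    return [v for v in venues if v.is_mixed_hall()]


def venues_in_category(venues, category):
    """Return the venues assigned to a given category number."""
    return [v for v in venues if category in v.categories]


for v in mixed_halls(VENUES):
    labels = "; ".join(CATEGORIES[c] for c in sorted(v.categories))
    print(f"#{v.map_id} {v.name} ({v.district}): {labels}")
```

Storing the categories as a set per venue, rather than one category per venue, keeps the overlap that the chapter emphasizes: a mixed hall is simply a venue whose set has more than one member.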
Several venues managed by the Japanese people, such as Taipeiza (台北座#29),
Niitaka Kan (新高館#30), and Yoshiaki Kan (芳明館#28), hosted traditional
Japanese songs and dance forms, such as noh (能劇). Taiwanese people made up the
majority of spectators for the Taiwanese puppet show (budaisi, 布袋戲) and Taiwanese
opera.
Fig. 9.1 Cultural Space of Taipei’s Musical Genres and Media Venues (1930–1960)
During the Qing Dynasty, immigrants from Changchou (漳州) and Quanzhou (泉州)
in Fukien, China, sailed across the sea to Taiwan and initially resided
in Monga. They then moved to Dadaocheng because of mob violence and cultivated
the Tamsui River bank. Because of its dense population, the area was
named the “Taiwanese street” (臺灣人市街). Before 1920, Dadaocheng, Monga,
and Dalongdong (大龍峒) belonged to the Taipei Prefecture (台北州). These regions
were known as “Taipei” according to the local government system after the
administrative area was restructured by the Japanese government. Monga was occupied
by both the Taiwanese and Japanese people and contained the “Wan Hua Hooker
Street” (萬華遊廓), which offered Japanese prostitution and entertainment services
during the Japanese colonial period. Seimonchō was designated the downtown area
by Japanese administrators. The Japanese government decided to follow the model
of the Asakusa district in Tokyo and filled soil into the Monga depression in October
1914. After the construction was completed, the Japanese people were the principal
residents in Seimonchō and formed the so-called “Inlander’s street” (Japanese street,
內地人市街), which became a recreation and business area that flourished until the
1960s under the Kuomintang government (Gao 2004). Because of class and ethnic
group divisions, Hsin Minpao (臺灣新民報) presented a report titled “Three Main
Problems of Taipei” on August 2, 1930. The report stated that “In the 5th year of the
Showa period (1930), the total population of Taipei City was 233,340, the number
of Japanese was 64,800, and the number of Taiwanese was 164,400. The Japanese
and the Taiwanese inhabited areas were clearly divided. The Japanese mostly lived
in the Inner city, and some lived in Monga. The Taiwanese lived in Dadaocheng and
Monga. Therefore, the Japanese and Taiwanese people were divided in space.”
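The figures quoted from the report can be turned into population shares with a few lines of arithmetic. The sketch below is illustrative only: the variable names and the `share` helper are hypothetical, and it assumes the three quoted numbers are exact. The report as quoted does not say who made up the part of the total that is neither Japanese nor Taiwanese, so that remainder is simply labeled unattributed.

```python
# Figures as quoted from the 1930 newspaper report in the text.
total_population = 233_340
japanese = 64_800
taiwanese = 164_400


def share(part, whole):
    """Percentage of the whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Residents counted in the total but in neither quoted group.
unattributed = total_population - japanese - taiwanese

print("Japanese share (%):", share(japanese, total_population))
print("Taiwanese share (%):", share(taiwanese, total_population))
print("Unattributed residents:", unattributed)
```

On these numbers, the Japanese made up a little over a quarter of the city and the Taiwanese about seven tenths, which is the demographic background to the spatial division the report describes.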
The scenario continued from the 1930s to the late 1960s. Differences existed
in the language and reading habits of different ethnic groups. Figure 9.1
illustrates the aggregation of music genre venues. The ethnic groups who lived in Monga
were in a mixed state. Thus, the songs heard and movies watched by the residents were
mixed as well. As displayed in Fig. 9.1, the audience who were attracted to the Monga
theater (#25) included Taiwanese and Japanese people. The content of the perfor-
mances included Taiwanese puppet shows (budaisi); Taiwanese operas (歌仔戲);
stage plays (新劇); and Japanese, Taiwanese, and Chinese movies. In Dadaocheng,
music and drama, such as Taiwanese ballad, Taiwanese movies, Taiwanese operas,
Taiwanese puppet shows, stage plays, Shanghai movies, Beijing operas, nanguan
(南管), beiguan (北管), and Xiao-Qu (小曲), were mainly performed in the theaters.
To stage plays, the Kosei Theater Society (厚生演劇研究會) was established
in 1943 by intellectuals who were highly interested in drama, such as
Chuan-Sheng Lu, San-Lang Yang, and Tuan-Chiu Lin. They held their first performance at
Eirakuza (#13) in Taipei on September 3rd, and their program included “A Capon,”
“Takasago Kan,” “Terrestrial Heat,” and “The City Lights We Look Down Above from
the Mountain” in Japanese, which were written and directed by Lin Po Chiu. These
plays resulted in Eirakuza being unprecedentedly packed (Yeh 1990; Shih 2011).
After the war, the new venues in Dadaocheng included Greater China Theater
(#14), Da Qiao Theater (#11), Guosheng Cinema (#2), Golden Dragon Hall (#5),
and Xiaokilin (#4). Before the war, the general public watched Taiwanese puppet
shows and listened to Taiwanese ballad, whereas the intellectuals favored nanguan,
beiguan, and Beijing operas. Because Dadaocheng was a “Taiwanese street,” in addi-
tion to speaking Taiwanese, the residents yearned for Chinese culture and favored
Chinese opera and music. Therefore, during the Japanese colonial rule, Dadaocheng
was not close to Japan but was rather close to China. However, the post-war 228
incident occurred exactly in Dadaocheng. This event completely changed the story.
The Kuomintang government deliberately suppressed Taiwanese ballad. Eventually,
the Chinese musical genre was rejected by the Taiwanese people and was replaced by
Japanese songs and movies. After the war, the Kuomintang controlled the inner city.
Formerly a Japanese entertainment center, Seimonchō was transformed to serve
mainlanders. Japanese pop culture was directly expelled to Dadaocheng. In the early 1950s,
the Mandarin-oriented genres of music and movies favored by Taiwanese audiences
indicated that they had culturally identified themselves as Chinese under Japanese
colonial rule. The 228 incident completely disintegrated the identification of
the Taiwanese in Dadaocheng with cultural China after the war. Dadaocheng continued to be a center for
the performance of Taiwanese ballad and Taiwanese films transformed from Xiao-Qu
and the Taiwanese opera.
Before the war, Seimonchō was the entertainment center of the Japanese ruling
class. Many types of movies were exhibited in Seimonchō from places such as
Japan, China, the United States, and Europe. All the exhibited movies were
first-round movies. After the war, the Kuomintang government directly took over the
entertainment space of the Japanese people. People in Seimonchō continued to watch
first-round movies and listen to pop songs; however, the language was changed from
Japanese to Mandarin. After being defeated by the communist party in China, the
Kuomintang government retreated to Taiwan with two million soldiers and civilians. These
people were historically known as “mainlanders” (外省人). An owner who originally
operated a dance hall and was engaged in the entertainment industry in Shanghai
reopened a similar hall in Seimonchō and banned Japanese songs there. Thus, the
cultural landscape of the district changed completely.
The locations of the performances were Sekai Kan (#18), Shinsekai Kan (#19),
and Daisekai Kan (#20). Before the war, Japanese movies, noh dramas, and stage
plays were mainly presented, and Chinese operas were sometimes shown. The Inner
city area adjacent to Seimonchō was one of the places where senior Japanese officials
congregated. The main performance venues in Inner city were Taihoku City Public
Auditorium (#21) and Kikumoto Department Store (#15). The films shown at these
locations included movies from Japan as well as first-round movies from Europe and
America. After the war, the entire situation changed. The Chinese people replaced
the Japanese people as the masters of the presidential palace at Inner city, secured
political power, and became the main consumers of entertainment at Seimonchō and
Inner city.
9.4 Interviews and Digital Mapping Interpretation
During the Japanese colonial era, the Shanghai Beijing opera troupe conducted a
program at the Eirakuza (永樂座) in Dadaocheng in 1923. Before 1937, Shanghai,
Japanese, and occidental movies were often screened, whereas Taiwanese operas
and new dramas were occasionally performed at Eirakuza. The films presented in
Mandarin were mostly produced by the Lian Hua (聯華) and Mingxing (明星影業)
film companies in Shanghai. The movies had plots adapted from romances
set in both new and old Chinese societies. Popular examples of this
trend include “The Broken Zither Loft” and “Peach Blossom Village” by Hu Die
(胡蝶) and “Love and Duty” by Ling-Yu Ruan (阮玲玉). Moreover, occidental movies
were screened at The First Theater (第一劇場) in Dadaocheng, which was built for
“The Taiwan Exhibition of the Fortieth Anniversary of Japanese Colonial Governance”
in 1935 and patronized by the renowned tea merchant Tian-Lai Chen
(陳天來). Dong-Cheng Wong (翁東成), a 90-year-old elder who once lived in Dadaocheng,
noted:
My wife was recruited to be an accountant at The First Theater by Lin-Qiu Li.
That’s why I often went to the movies…(What was The First Theater’s program?)
Movies, it was entirely movies. (Where were those movies from?) Most of them were
from America. (Were there any Japanese movies?) Yes, but there were fewer of them than
American movies. (Were there any Taiwanese movies?) No! At that time, I went to
the theater but I never watched Taiwanese movies. Nobody wanted to see Taiwanese
movies. We all loved watching American and Japanese movies.
Gradually, the center of Beijing opera performances switched to Eirakuza in
Dadaocheng, which opened in February 1924. Although initially prosperous, the
popularity of Beijing opera quickly declined. Except for the performance of Feng Yi’s
Beijing opera troupe (鳳儀京班) at The First Theater in 1935 and the performance
of the Shanghai Tian Chan Big Beijing opera troupe (天蟾大京班) at Eirakuza, the
performance of Beijing opera ceased entirely until the conclusion of World War II.
When describing the programs of the New Stage and The First Theater, Yun-Diao
Wang (王雲雕), a resident of Dadaocheng and a well-educated 80-year-old elder,
recalled that
At that time, Taiwanese opera was usually performed on the New Stage (#26).
I didn’t want to see Taiwanese operas at all, but my oldest sister often saw the
plays…The First Theater was opened on October 10. (What was its first program?)
The Beijing opera from Shanghai!
During the post-war period, The First Theater became the most important space
for Enka as well as the screening of Japanese movies. Famous Taiwanese ballad
singers, such as Yi-Feng Hung and Lu-Shyia Chi, performed at this theater.
Jing-Shang Li (李錦祥), the manager of the First Record Store (第一唱片行), stated
that
In the late 1960s, it was good business to sell records. Because my company
was located in front of The First Theater, audiences constantly came to my place to
purchase records, including Enka movie theme songs such as “Love in May Flower”
(愛染かつらを), “My Darling on the Bridge” (あの橋の畔で), and “Where My
Darling Is” (あの波の果てまで).
“The Third Sekai Kan” theater, which was renamed the “Da Guang Ming Theater”
(#9) after the war, was also located in Dadaocheng. The main program at this theater
was Taiwanese movies. Notably, as a “Taiwanese street,” Dadaocheng embodied not
only Taiwanese culture but also Chinese opera and music. In summary, the culture
in Dadaocheng was overall much more similar to Chinese culture than to Japanese
culture. For instance, intellectuals such as Wei-Shuei Jiang (蔣渭水), who established
the Taiwan People’s Party in 1927, frequently attended Jiang Shan Dinners (#3),
Peng Lai Pavilion (#27), and Dong Hui Fang (#8). These theaters retained Beijing
opera programs, including the nanguan and beiguan styles. The programs at Chun
Feng De Yi Dinners (#10), which was opened by Jiang, were similar to those of the
aforementioned three theaters. Chun Feng De Yi Dinners showcased operas such as
the “Baffling Case in Fuzhou,” “Nine Interlocking Rings,” and the “Crab Song.” This
prosperous performance of Beijing opera indicates the strong relationship between
Dadaocheng and Chinese culture in the Japanese colonial period.
The Taiwanese Xiao-Qu style included contributions from artists such as Jun-Yu
Chen (陳君玉) and offered a motive for pursuing a traditional Taiwanese identity.
This musical style became increasingly important when the relationship between
Dadaocheng and China collapsed following the decline of Beijing opera and the
Beiguan style, particularly after the 228 incident. Taiwanese identity persisted
throughout the creation of Taiwanese ballad and movies, which were heavily derived
from Xiao-Qu pop and Taiwanese opera. Moreover, the Kuomintang government’s
outright disregard for Taiwanese ballad triggered a marked change among
popular music audiences. Taiwanese movies and songs began circulating in movie
theaters, cabarets, and dance halls, including the Da Qiao Theater (#11) and the
Mayflower land (originally, the Chun Feng De Yi Dinners). By contrast, Chinese
dramas and music faded from Dadaocheng and were only circulated in venues
patronized by the ruling class in the region of Seimonchō. The example of Dadaocheng
indicated that Japanese and Taiwanese music audiences were separated spatially
before the war, which was reflected in the cultural separation of highly and less
educated Taiwanese people.
Notably, the establishment and group distribution in Seimonchō had their own
complicated separation. Seimonchō was the main Japanese community and
recreational area in Taipei during the Japanese colonial period. The programs offered
in this area were mainly Japanese film screenings, noh performances, and stage
play performances. However, “sometimes, they also rented out to Chinese troupes
performing Beijing operas” (Ye 1997). Furthermore, Seimonchō was connected to
the performance venues located in Inner city, such as The Taihoku City Public
Auditorium (#21) and Kikumoto Department Store (#15). A variety of Japanese and popular
international films were screened at Seimonchō’s theaters, including the Daisekai
Kan (#20), Sekai Kan (#18), Shinsekai Kan (#19), and Yoshino Kan (#28). Although
the audiences were primarily Japanese people, Taiwanese people also watched these
movies. Rong-Liu Yan (顏榮柳), a 90-year-old member of the Monga gentry, recalled:
When I was at the Kai-Nan High School of Commerce and Industry, I always
skipped class to go to the Daisekai Kan and watch movies. They were all in Japanese,
and the audiences were mainly Japanese people. The tickets were 3 or 5 dollars. I
bought the half-priced student ticket, and went to see the occidental movies. I can
remember watching “Giant,” “High Noon,” “The Bark of the Gun,” and “Bump,
Bump!” It was so exciting, and I still remember it.
The interaction between Shanghai and Taiwan resumed when the mainlanders
migrated to Taipei in the post-war period during the rule of the Kuomintang
government. The government inherited the tradition initiated by the Japanese people and
designated Seimonchō as the major entertainment space after the population had
migrated from China to Taiwan. The popular entertainment programs then comprised
Mandarin movies from Shanghai; new “national language” popular music, which was
sung by singers such as Xuan Zhou (周璇) and Guang Bai (白光); and Beijing opera
and music, which originated from intellectuals in Dadaocheng. This modification of
entertainment habits reflected the transformation of political power in Taiwan and
led to different audiences and various performance locations in the Taipei music
scene. Jing-Mei Li (李靜美), a Taiwanese ballad singer, spoke vividly of her
experience meeting the diverse ethnic groups that appeared in the musical space. However,
she also revealed that Mandarin songs became mainstream during this period, as
evidenced by her own experiences when touring cabarets in the late 1960s. She
stated:
I gained my fame in Guo Sheng Cinema (#2), Zhenshanmei Hall (#1), and
Xiaokirin (#4). I always wanted to sing in Seimonchō because you can record if
you sing in Seimonchō. At that time, singing Mandarin songs was the mainstream
performance as a result of the decline in Taiwanese songs. Other singers, such as Ni
Zhen (甄妮) and Ya You (尤雅),… we were all of the same generation.
According to Li’s memory, the programs sung in the cabarets in Dadaocheng
during the 1970s primarily comprised Taiwanese and Japanese songs. She wanted
to sing in Seimonchō for two reasons: (1) due to the higher audience standards,
she would have been able to achieve fame more easily, and (2) singing in a cabaret
there presented opportunities for recording and television appearances. Several theaters
and cabarets changed their names after the war. For example, Sakaeza (#16, built in
1900) was renamed the Wan Guo theater and then Cinema. Yoshino Kan (#28, built in
1908) was renamed first the Mei Du Li theater and later the Ambassador theater.
Daisekai Kan (#20, built in 1935) was renamed the Da Shi Jie theater. Kokusai Kan
(#22, built in 1935) was renamed the Wan Nian International Commercial building.
The Taiwan Theater (#23) was renamed The China Theater, and Kikumoto Department
Store One was later renamed Seventh Heaven. Notably, Seventh Heaven
was the main location for popular Mandarin music performances. Lu-Shyia Chi, a
singer whose musical pieces are analyzed in this study, recalled that she made a stage
appearance at this venue. Her good friends Hua-Shi Guan and Zhi Shen (who was once
the producer and host of the first Mandarin singing television program on Taiwan
Television in 1962) also made appearances at Seventh Heaven.
In addition to the location of the performance, record stores were essential for the
circulation of Taiwanese ballad. The record stores that existed in the four districts
of Taipei gradually joined the “media loop” circulation, which originated from the
regions that were under Japanese colonial power during the post-war period. As a
base for selling records, these stores were the source through which Taiwanese ballad
entered the media loop. Specifically, these stores even functioned as a major hub for
records. The production of records using Taiwanese materials and recording
techniques was not revived until 1952, when Shi Xu (許石), the pioneer of Taiwanese
ballad, established the China Record Company. The factory of this company was
situated in Sanchong district (三重縣), adjacent to Taipei City across the Tamsui River,
which was the base for prosperous record companies in the 1960s and then became
an extension of Taipei’s media loop (Shih 2009). Sanchong was thus the
headquarters of record companies. Among the 72 record companies in Taiwan in 1967, more
than 40 were located in Sanchong district. The records manufactured in factories
in this area were distributed throughout Taiwan. Notably, the Zhong Hua
merchandise market (中華商場) in Seimonchō contained several recording warehouses, such
as Universal, Metro–Goldwyn–Mayer, Columbia, Xin Xing, Nana, Jinmen, and Ge
Wang (Fig. 9.1). These seven recording warehouses were crucial for sellers from
the middle and southern regions of Taiwan because they obtained their money from
wholesale and then retail sales. Jing-Shang Li, the manager of the First Record
Company, stated:
In the late 1960s, if you wanted to listen to Enka records, the best place to buy
them was at my store. You could also find records at the Zhong Hua merchandise
market in Seimonchō. Enka movies, like Love in May Flower (愛染かつらを), were
also very popular at that time. The movies were all screened first in Seimonchō and
then in Dadaocheng. Taiwanese people, especially the working class, liked to watch
movies at The First Theater and then come to my store to buy the records because it
was cheaper. You could also find the Enka-style records of Lu-Shyia Chi, Shia Wun,
and Yi-Feng Hung at that time, which were very, very popular…
In conclusion, we can reasonably estimate that half of all the record stores in
Taiwan during the post-war period were located in Taipei. In the 1960s, Enka-style
mixed-race pop was still one of the principal memories in the cultural imagination of
Taiwanese people due to the popularity of Japanese movies and Enka-style records
and songs. Lu-Shyia Chi and Yi-Feng Hung were the two main singers who combined
Enka and Taiwanese music, which made them very popular at that time. The locations
of Enka-style performance halls and theaters shifted from the inner city and Seimonchō
to Dadaocheng after the war.
9.5 Concluding Remarks
In the Japanese colonial era, the center of Beijing opera performance gradually
switched to Eirakuza, Dadaocheng. Although initially prosperous, the popularity
of Beijing opera quickly declined. Japanese people were the primary residents in
Seimonchō and formed the so-called “Inlander’s street,” which became a recreation
and business area that flourished until the 1960s under the rule of the Kuomintang
government. Beijing opera and music faded from Dadaocheng and were only
circulated in venues patronized by the ruling class in Seimonchō. The example of
Dadaocheng indicated the formation of an ethnic–class cultural space. Japanese
and Taiwanese song audiences were separated spatially before the war, which was
reflected in the cultural separation of highly and less educated Taiwanese people.
During the post-war period, Dadaocheng’s First Theater became the most important
space for Enka as well as the screening of Japanese movies. The Kuomintang
government inherited the tradition initiated by the Japanese people and designated
Seimonchō as the major entertainment space after the Chinese people migrated to
Taiwan. Beijing opera and music had originally been favored by Dadaocheng’s intellectuals.
This modification of entertainment habits reflected the transformation of
political power in Taiwan, led to different audiences and various performance loca-
tions in the Taipei music scene, and gave rise to diverse ethnic groups that appeared
in different musical spaces. In the 1960s, Enka-style music was still in the forefront
of the cultural imagination of Taiwanese people due to the popularity of Japanese
movies and Enka-style records. Lu-Shyia Chi and Yi-Feng Hung were the two main
singers who combined Enka and Taiwanese music to form mixed-race songs, which
made them very popular in the 1960s.
By focusing on the cultural space, this study describes the popularity of the
Taiwanese ballad music form. The compositions of Lu-Shyia Chi and Yi-Fong Hung
were taken as examples. Their compositions feature mixed-race influences from
Japan. The music evolved from Taipei’s four districts, namely, inner city, Monga,
Dadaocheng, and Seimonchō, from 1930 to 1960. All the cultural imagery and historical
discourses entangled with cultural products and imbued with social meaning
provided a critical understanding of the implications of Enka in the lives of its
listeners. As indicated by the bridging interviewing method and GIS digital mapping,
the music-language differences throughout history eventually caused racial geospa-
tial division in the examined districts. This historical study is in agreement with
the suggestion of Brown and Knopp (2008), who state that an epistemologically
plural approach is possible from a perspective that embraces tensions and conflicts
as opportunities to advance knowledge rather than viewing them as obstacles.
References
Alexander, B.K. (2003). Fading, twisting, and weaving: an interpretive ethnography of the black
barbershop as cultural space. Qualitative Inquiry, 9, 105–125.
Bodenhamer, D. J., Corrigan, J., & Harris, T. M. (2010). The spatial humanities: GIS and the future of
humanities scholarship. Indiana University Press.
Bridge, G. (2002). Bourdieu, Rational action and time-space strategy of gentrification. Transactions
of Institute of British Geographers, New Series, 205–216.
Brown, M., & Knopp, L. (2008). Queering the map: The productive tensions of colliding
epistemologies. Annals of the Association of American Geographers, 98(1), 40–58.
Gao, T. C. (2004). Watching Taipei through time and space: The 120th anniversary of city
founding: ancient maps and old image literature and cultural relics exhibition. Taipei: Taipei
City Government Press.
Goodchild, M. F. (2004). GIScience, geography, form, and process. Annals of the Association of
American Geographers, 94(4), 709–714.
Jones, A. F. (2004). Yellow music: media culture and colonial modernity in the chinese jazz age.
Durham and London: Duke University.
Kishi, Toshihiko 貴志俊彥(2013). 東アジア流行歌アワー越境する音交錯する音樂人,
Tokyo: 岩波書店.
Liechty, M. (2003). Suitably modern: Making middle-class culture in a new consumer society.
Princeton: Princeton University Press.
Podmore, J. (1998). (Re)reading the ‘loft living’ habitus in Montreal’s inner city. International
Journal of Urban and Regional Research. 283–301.
Shih, C. S. Stone, Chi, C. L., & Huang, Y. L. (2009). A spatial excavation on the medial loop of
Taiwan’s folk song in the greater Taipei area, 1960–80. In: Jinn- Guey Lay et al. (eds.), Digital
Archives GIScience. pp. 1–22, Taipei, Department of Geography, National Taiwan University.
Shih, C. S. Stone, Chi, C. L., & Huang, Y. L. (2010). Representation, bridging and interpretation:
on the realization of social geographic information systems. In: L. Hue et al. (eds.), Spatially
Integrated Humanities and Social Sciences. pp. 17–32, Beijing, Science Publication.
Shih, C. S. Stone (2011). Taiwan’s Ballad as a mainstream song of the period: the Shanghai and
other mixed-blood influence of music Taipei, 1930–1960. Taiwanese Journal of Sociology, 47,
91–141.
Shih, C. S. Stone (2014). Modern Song: Lu-Shyia Chi and Taiwanese Ballad’s Era. Taipei: Tonshan
Publication Inc.
Shih, C. S. Stone (2016). 「歌謡、歌謡曲集と雑誌の流通:中野忠晴、「日本歌謡学院」の
戦後初期台日に対する文化を越えた影響」, pp. 101–132, 『台湾のなかの日本記憶』,
三元社, 日本.
Yano, K., Nakaya, T., & Isoda, Y. (2007). Virtual Kyoto: Exploring the past. Nakanishiya: Present
and Future of Kyoto.
Yano, K., Nakaya, T., Kawasumi, T., & Tanaka, S. (2011). Historical GIS of Kyoto. Nakanishiya.
Ye, L. Y. (1997). Taiwan’s earliest theaters and movies. Taipei Literature, Straight (122).
Yeh, C. F. (1990). Taiwan’s early post-war drama. Taipei: Taiyuan.
Part III
Spatial Synthesis in Regional Science
Chapter 10
Research Progress on Spatial
Demography
Hengyu Gu, Xin Lao, and Tiyan Shen
10.1 Introduction
The healthy development of all human beings has always been an important standard
for measuring the sustainable development level of a region or a country. The Human
Development Report 1990, released by the United Nations, put forward the HDI (Human
Development Index), which seeks to assess the sustainable development level of a region
from three aspects: life expectancy, knowledge, and living standard. Population
change is therefore closely related to the sustainable development of a country or a region.
Population change has three components: birth, death, and migration. With the
improvement in healthcare and the sharp decline in birth rates, population migration
has directly led to changes in the total population and the structure of the
population. According to the International Migration Report 2017
issued by the UN Population Division, the world international migrant stock reached
258 million, accounting for 3.4% of the world population. Among all these international
migrants, 57% migrated to developed countries, while the remaining 43% moved to
developing countries. In developed countries, 61% of immigrants come from developing
countries, whereas in developing countries, immigrants from developed countries account
for only 13% of all immigrants. It can be concluded that international population
migration exacerbates the imbalance of the international population distribution and
further influences cultural exchange, trade, and resource allocation between various
H. Gu · T. Shen (✉)
School of Government, Peking University, Beijing 100871, China
e-mail: tyshen@pku.edu.cn
H. Gu
e-mail: henry.gu@pku.edu.cn
X. Lao
School of Economics and Management, China University of Geosciences, Beijing 100083, China
e-mail: laoxin2017@cugb.edu.cn
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_10
126 H. Gu et al.
regions. In spite of its contribution to the flow of factors, population migration has
widened the regional imbalance, which poses a threat to the order and development
of the regions. In other words, an uneven population pattern that partly resulted from
migrations exacerbates the imbalanced regional development, which will manifest
as spatial heterogeneity of social characteristics, such as urbanization and public
health.
Space is an inherent dimension of demographic
research and a core concept in demographic application analysis. In demography,
population refers to a group with certain characteristics in a geographic region and
at a particular time (Zeng et al. 2011). Population data therefore has almost the
same characteristics as spatial data. Spatial synthesis and computational methods are
crucial to advancing social science and the humanities, especially demography. Spatial
methods can help to better visualize, analyze, and predict the spatial distribution of
demographic characteristics at each geographic unit. Had the concept of space been
ignored, much demographic research could hardly have been carried out.
In the second half of the twentieth century, social scientists began to
pay attention to the spatial problems of social science. Giddens (1984) pointed out
that spatial factors, which are instrumental in building a reasonable social theory, were
ignored in traditional theories. Additionally, with the growing maturity of GIS
technology, the techniques, models, and theories of spatial analysis have developed
rapidly, which has accelerated the process of “Space Transfer” in social
science. Against this background, Spatial Demography has emerged
as an interdisciplinary academic field and has received increasing attention from
scholars in demography, geography, and regional science.
Compared with other existing methods of demographic study, spatial demography
can, to some degree, better handle demographic issues related to space, such as the
distribution of urbanization rates and the characteristics of interprovincial migration.
Meanwhile, with the support of spatial methods, some traditional
theories of demography can be better validated. Furthermore, spatial demographic
analysis methods can display population data (e.g., distribution, migration)
more intuitively, thus providing scientific evidence for urban population management
and governance.
In recent years, research on Spatial Demography has been thriving with the application
of many emerging techniques and methods of spatial analysis. However, few
papers have systematically reviewed the research themes and methodology of previous
studies. Moreover, Spatial Demography, as an emerging subject, still faces
many open questions. For example, how should “Space” in Spatial Demography be
understood systematically? Is Spatial Demography
equivalent to the spatial analysis of demographic issues? This paper clarifies the core
concept of Spatial Demography, traces its course of development, and introduces recent
advances in Spatial Demography research, in the hope of promoting the development of
the discipline and of some related fields.
10 Research Progress on Spatial Demography 127
10.2 Core Concept of Spatial Demography
10.2.1 The Definition of Spatial Demography
The concept of Spatial Demography first emerged as studies on population
migration multiplied (Clarke 1984). According to the book Spatial Demography written
by Suzuki Keisuke (1980), a Japanese demographer, the issue of population can be
explained in terms of population size and population distribution, which
depend not only on the rates of birth and death but also on interregional migration.
The American demographer Voss (2007), the founder of Spatial Demography, published
the paper “Demography as a Spatial Social Science”, which has been highly influential
in this field. He regarded Spatial Demography as a new discipline that offers a
regional perspective on the study of traditional demography. Other demographers,
such as Matthews and Parker (2013), define Spatial Demography as “the spatial
analysis on the issues and process of population”, emphasizing the significance of
spatial analysis to the development of Spatial Demography. In summary, existing
definitions of Spatial Demography share two features. In terms of
scale, Spatial Demography focuses on the overall presentation of population
phenomena at the regional level rather than on individual behaviors. In terms
of method, Spatial Demography is inseparable from spatial analysis techniques.
Some scholars even contend that Spatial Demography is the use of spatial analysis
techniques to solve traditional demographic issues (Matthews and Parker 2013).
In this paper, Spatial Demography is defined as a discipline that studies population
(birth, death, migration) at the regional level by using spatial data and techniques
(cartography, visualization, pattern recognition, and mechanism analysis). In general,
Spatial Demography has the following four characteristics:
(1) Collection of spatial population data. Access to spatial population data is
the prerequisite for analyzing problems in Spatial Demography. The data used
should contain not only demographic characteristics but also spatial information
such as latitude and longitude.
(2) Application of spatial analysis methods. In addition to traditional
spatial analysis based on geographical cognition, such as buffer analysis and
overlay analysis, spatial relationship modeling and spatial statistical analysis
methods are emphasized.
(3) Model spatialization. The concept of space should be integrated into the analytical
model of Spatial Demography. Space should be embedded in the model
through a transport cost (New Economic Geography) or a distance parameter
(gravity model), or the model should reflect the process of spatial interaction
(multiregional population projection).
(4) Regionalization of the analytical perspective. Spatial Demography focuses on
demographic phenomena in a particular region and explores demographic issues
at the macro level, which places it to some extent within the category of
Macro-demography (Voss 2007).
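As a minimal sketch of how a distance parameter embeds space in a model (characteristic 3), the basic gravity model can be written as M_ij = k · P_i · P_j / d_ij^β and evaluated as follows. The function name gravity_flows, the constant k, the exponent beta = 2, and the three example regions are illustrative assumptions, not values taken from the literature reviewed here.

```python
import numpy as np

def gravity_flows(pop, coords, k=1.0, beta=2.0):
    """Predicted interaction M_ij = k * P_i * P_j / d_ij**beta.

    pop    : population of each region (length n)
    coords : (n, 2) array of region centroids in a planar coordinate system
    k, beta: hypothetical scaling constant and distance-decay exponent
    """
    pop = np.asarray(pop, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between region centroids.
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    flows = np.zeros_like(d)
    off_diagonal = d > 0          # no flow from a region to itself
    flows[off_diagonal] = k * np.outer(pop, pop)[off_diagonal] / d[off_diagonal] ** beta
    return flows

if __name__ == "__main__":
    # Three hypothetical regions lying on a straight line.
    flows = gravity_flows([100, 200, 50], [(0, 0), (3, 4), (6, 8)])
    print(flows)
```

In applied work k and beta would be estimated from observed migration flows rather than fixed in advance; the sketch only shows where distance enters the model.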
10.2.2 Space and Spatial Analysis in Spatial Demography
Space is an essential element in demographic research. When space is studied as the
background and object behind certain demographic phenomena, we should direct our
attention to the spatial distribution pattern of such demographic phenomena
and the mechanisms that shape it, for example, the spatial characteristics of a country’s
labor market and the reasons for their formation. When space is regarded as the subject
that affects demographic phenomena, or as the underpinning driver
of such phenomena, we should pay attention to the effect of space on
them; for instance, the effect of the urban built environment
on residents’ trips shapes the overall pattern of population distribution within the
city. Understanding the relationship between space and demographic phenomena
from the perspectives of both object and subject is the logical starting point for
analyzing demographic issues in Spatial Demography.
The idea of spatial analysis is grounded in the first law of geography,
which describes the interrelation of geographic objects: the closer they are
to each other, the more connected they are (Tobler 2004). The advancement of
spatial analysis techniques is an important driving force for the development of Spatial
Demography. As these techniques advance, more detailed
geo-spatial data and question-oriented GIS analysis methods become available, which
has considerably raised attention to space in recent years. Techniques such as Spatial
Econometrics, Geographically Weighted Regression (GWR), Multilevel Modeling,
and Spatial Pattern Analysis are all expected to shape the future development of Spatial
Demography (Matthews and Parker 2013). However, as Goodchild and Janelle
(2004) state, the theoretical interpretation of space in related
models is lacking. For example, in the spatial interaction model, space can be interpreted
as either the transport cost or the correlation between group communication and
distance. Thus, Spatial Demography cannot simply be equated with the spatial analysis
of demographic issues. The most urgent need for Spatial Demography is
to combine advanced spatial analysis techniques with
space-based demographic theories.
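To make the distance-decay idea behind the first law of geography concrete, the following sketch shows the core step of a basic Geographically Weighted Regression: one weighted least-squares fit per location, with nearer observations weighted more heavily. The Gaussian kernel, the fixed bandwidth, and the function name gwr_local_coefficients are assumptions chosen for illustration; production GWR software also selects the bandwidth and reports diagnostics, which are omitted here.

```python
import numpy as np

def gwr_local_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per location (basic GWR idea).

    coords    : (n, 2) array of observation locations
    X         : (n,) or (n, k) array of explanatory variables
    y         : (n,) array of the dependent variable
    bandwidth : kernel bandwidth in the same units as coords (assumed fixed)
    Returns an (n, k + 1) array: local intercept and slopes at each location.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        # Distance from location i to every observation.
        d = np.linalg.norm(coords - coords[i], axis=1)
        # Gaussian distance-decay kernel: closer observations count more.
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = Xd.T * w                           # X' W without forming W explicitly
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas
```

Mapping the columns of the returned array shows how an estimated relationship varies over space, which is the kind of spatial heterogeneity the techniques listed above are meant to reveal.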
10.2.3 The Relations and Differences Between Spatial
Demography and Related Disciplines
To distinguish the concepts and meanings of related theories
and lay a qualitative foundation for technical analysis, this paper compares
the focuses of demographic research (shown in Table 10.1) in the fields of
Spatial Demography, population geography and regional science: ➀Compared with
population geography, Spatial Demography is more concerned with demographic
issues viewed from the perspective of geographic space. ➁Compared with regional science,
Spatial Demography focuses on demographic and economic issues in a specific
Table 10.1 The comparison of population study between Spatial Demography and related disciplines

Discipline | Population studies in the related discipline | Population studies in Spatial Demography
Population geography | Study the geographic distribution of population as well as its relationship with the environment, belonging in the category of geography | Study the law of population development combining geographical theories with techniques such as spatial statistics, belonging in the category of demography
Regional science | Study the economic population structure in abstract geographical units, and emphasize establishing an explanatory economic model, with the population as one of the important elements in a region | Study the economic population structure in specific geographical units, focusing on the description and the mechanism of spatial economic population phenomena with population as the main subject
Regional demography | Study the spatial change of regional (multi-regional) demographic phenomena, focus on the comparative study from the regional (multi-regional) perspective and the synthesis of those analytical perspectives | Study the demographic phenomena at a technical level from the aspects of visual representation, pattern recognition, analysis and modeling of driving forces, focusing on the application of quantitative analysis
spatial unit, to which descriptive analysis is applied instead of explanatory modeling.
➂Compared with regional demography, Spatial Demography places more emphasis
on the application of spatial analysis techniques. However, the difference between
Spatial Demography and regional demography is so small that Spatial Demography
can, to some degree, be regarded as the dominant theory and method of regional
demography (Wang 2017).
10.3 The Course of Development in Spatial Demography
Early signs of Spatial Demography can be found in some early studies in demography. It
can be dated back to 1855, when Snow (1855) analyzed the causes of cholera
deaths in London, England, with cartographic methods. Spatial Demography is
characterized by the spatial character of its population data and the regional perspective
of its research. By this criterion, most demographic research could be classified
as Spatial Demography, because the census data used in such research are aggregated
to a certain geographical level or unit for the analysis of demographic change (Voss
2007). Although Voss (2007) admits that this classification may be groundless, it
at least shows an early understanding of Spatial Demography in demographic
circles: in traditional demographic studies, all demographic analyses
with a spatial character can be classified as Spatial Demography. In fact, before
the middle of the last century, this space-based analytical model was applied in much
demographic research. Meanwhile, in disciplines related to demography,
data aggregated to geospatial units were also widely used (Theodorson
1961).
Since the middle of the last century, with the recognition of the “Ecological
Fallacy” and the mass popularization of statistical surveys in micro-demography,
Western scholars have turned their attention to micro-demographic research, in
which social demography focusing on families and individuals is emphasized. At the
same time, some demographers continued to pay consistent attention to
spatial demographic issues, with subjects including urban demography, population
migration and population prediction. In urban demography, some scholars
focused on urban function, urban hierarchy, urban structure and the spatial distribution
of ethnic groups within a city, and extended the measurement methods of early studies
on residential differentiation. The primary concerns in population migration are the
measurement of interregional migration and its influencing mechanisms (Shryock
and Eldridge 1947). In population prediction, the main research task is to
estimate and predict the population of a particular region, which is an important
component of Applied Demography.
At the beginning of the 1980s, with the rapid advancement of GIS spatial analysis
technology, Spatial Demography began to receive great attention from the academic
community. Population geographers such as Rees, Congdon and Batey were
the first group to apply spatial analysis technology to demographic
research. Rees and Wilson (1977) emphasized that the main direction of population
geography is to analyze population issues with methods of spatial analysis
and demographic statistics. As a result, the literature on Spatial Demography
has been growing. Congdon and Batey (1989) published a volume on
Spatial Demography and grouped the demographic papers under four
themes: spatial planning based on population information, residence and
redistribution, population migration, and population forecasting. During this period,
the achievements of related disciplines such as geography, regional
science and spatial econometrics were valued and absorbed by social science fields
represented by demography. Compared with traditional spatial demographic
research, the Spatial Demography of this period is spatial in
its models, data and analysis. Research topics also gradually
diversified, and comprehensive interdisciplinary research
in demography began to emerge. This tendency continues today (Table 10.2).
10.4 The Trans-Century Research Focuses in Spatial
Demography
Spatial Demography did not enter a period of great development until the turn of
the century. Many papers published in top international demographic journals have
Table 10.2 The development phases, characteristics and causes of Spatial Demography

Development phase | Time | Characteristics of the research | Causes
Phase 1: Origin stage | Before the 1950s | A general and macro demographic interpretation based on demographic data at the geographic level | Space is an important element of demographic research; demographic data are integrated at a certain geographical level
Phase 2: Slow development stage | The 1950s–1990s | Despite the shift to micro-demography, some scholars still paid consistent attention to spatial demographic issues concerning urban demography, population migration and population prediction | The popularization of micro-demographic data; ecological fallacy
Phase 3: Leaping development stage | The 1990s till today | Applying spatial analytical techniques to demographic research, stressing the spatial interpretation of demographic issues | The rapid development of GIS spatial techniques
pointed out promising directions in which Spatial Demography can proceed, both
in theory and in applications (Voss 2007; Matthews and Parker 2013). The journal Spatial
Demography was first published in 2013, marking the maturity and systematization of
the field. Paying special attention to advanced achievements
in Spatial Demography, this part reviews the related literature since 2000, based
on a classification system consisting of “differentiation and isolation”, “birth and
death”, “migration and urbanization”, “population and the environment”, “regional
population forecasting” and “methodology research”. This classification tries
to integrate traditional demographic topics (birth, death and migration) with new
topics (differentiation and isolation, urbanization, population and environment, and
population forecasting) that have emerged with the development of interdisciplinary
research and technology.
10.4.1 Differentiation and Isolation
Spatial differentiation and isolation of demographic characteristics such as
ethnicity, social stratum, income and educational level have always been important
research topics in demography. Such differentiation often leads to the imbalanced
development of regional demographic and economic factors. Demographic and economic
variables are typically characterized by spatial heterogeneity and spatial dependence (spatial
autocorrelation). For instance, the poverty rate of a region is influenced not only by
economic and environmental variables in that region but also by relevant
variables in adjacent regions. However, the spatial spillover effects of population
variables are difficult to estimate with traditional regression models, which calls
for more advanced tools and techniques. With the development of spatial econometrics,
spatial regression models, which introduce a spatial weight matrix, have become useful
for dealing with spatial heterogeneity and spatial dependence. When it comes to the
mechanisms behind population variables, spatial econometric models with a spatial weight
matrix can quantify spatial interaction effects and thus provide a more accurate
evaluation.
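The role of the spatial weight matrix can be illustrated with a small sketch that computes a spatial lag (the weighted average of neighboring values) and global Moran's I, a standard measure of spatial autocorrelation. The four-region contiguity matrix and the rates below are hypothetical; the function names are chosen here for illustration, and significance testing is deliberately left out.

```python
import numpy as np

def row_standardize(W):
    """Scale each row of a weight matrix to sum to 1.

    Assumes every region has at least one neighbor (no all-zero rows).
    """
    W = np.asarray(W, dtype=float)
    return W / W.sum(axis=1, keepdims=True)

def spatial_lag(x, W):
    """Weighted average of neighboring values for each region (W is row-standardized here)."""
    return row_standardize(W) @ np.asarray(x, dtype=float)

def morans_i(x, W):
    """Global Moran's I: I = (n / S0) * (z' W z) / (z' z), with z = x - mean(x)."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

if __name__ == "__main__":
    # Hypothetical example: four regions arranged in a line (1-2, 2-3, 3-4 are neighbors).
    W = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
    poverty_rate = np.array([1.0, 2.0, 3.0, 4.0])   # made-up values
    print(spatial_lag(poverty_rate, W))
    print(morans_i(poverty_rate, W))
```

A positive value of Moran's I indicates that similar values cluster in neighboring regions, while a negative value indicates that neighbors tend to differ; spatial regression models use the same weight matrix to introduce lagged terms such as the spatial lag computed above.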
Previous demographic studies on spatial differentiation and isolation mainly focus
on the following three aspects. ➀Residential differentiation: these studies stress
the spatial differentiation of residents in a region in terms of
ethnicity, educational level and age structure, and the relationship between the
spatial differentiation of the residential environment and the demographic
characteristics of residential zones. From a spatial demographic perspective, Harris et al.
(2007) employed a multilevel modeling technique to study whether students in
Birmingham, UK, prefer to choose a state-funded secondary school closer
to them, and found that the ethnic composition of the residential area
has an obvious influence on students’ selection of schools. From the perspective
of age differentiation in communities, using census data for American counties in
1990, 2000 and 2010, Winkler (2013) discussed the residential segregation of older
people (60 and above) from young people (20–34), and found that age differentiation
among residential communities is widespread in America, especially for
Hispanics and non-Hispanic whites. Duncan et al. (2012), from the perspective
of the spatial differentiation of community walkability, measured the spatial variance
of walkability in communities in Boston and its relation to spatial demographic
variables (such as the proportion of ethnic minority population and the proportion
of poor households), using Moran’s I, OLS regression and spatial
autoregression. It turned out that although residential segregation exists in
Boston, spatial demographic variables have no significant effects on residential
walkability. ➁Regional income disparity: these studies emphasize regional spatial
differences in both economic development and poverty level, and the causes of such
differences, with the child poverty rate as the main focus. Beginning in 2006, Voss et al.
computed the spatial autocorrelation and the spatial spillover effects of the child poverty
rate in American counties with exploratory spatial data analysis and spatial regression
analysis (Voss et al. 2006). More recently, this research team applied spatial econometrics
to research on the child poverty rate and took into account the effects of
more independent variables, such as ethnicity and regional economic composition,
revealing that regional ethnic agglomeration and industry restructuring both have
a marked impact on the regional poverty level (Curtis et al. 2012). Laurini (2016)
employed DMSP-OLS nighttime light data to estimate the spatial and temporal disparity
of resident income levels in Brazil, demonstrating the feasibility of such a method in the
absence of census information. ➂Spatial variation of other social problems: these studies
focus on the spatial variance of social problems such as the crime rate,
the unemployment rate and the causes of both. Arnio and Baumer (2012)
studied the spatial variance of crime rates in Chicago based on OLS and GWR
and found that the spatial patterns of crime rates were significantly influenced by
variables such as the proportion of Black residents in the community, the concentration
of immigrants and foreclosures. Based on the Theil index, Thiede and Monnat (2016)
evaluated the spatial difference of unemployment rates at the county level and state level
during the American financial crisis (2007–2009). After identifying clusters of areas
with similar patterns of change using spatial statistics, they used
a spatial regression model to assess the factors influencing the unemployment
rate. The results show that the imbalances in labor markets at the county level have
been exacerbated, and that some counties whose unemployment rate surged during the
financial crisis are affected by similar factors such as lower educational investment.
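For readers unfamiliar with the Theil index mentioned above, the sketch below computes the unweighted Theil T index, T = (1/n) Σ (x_i/μ) ln(x_i/μ), and splits it into within-group and between-group parts, for example counties grouped by state. It is an illustration under simplifying assumptions (every unit weighted equally, strictly positive values, made-up rates) and is not a reproduction of the procedure used in the cited study.

```python
import numpy as np

def theil_t(x):
    """Unweighted Theil T index; requires strictly positive values."""
    x = np.asarray(x, dtype=float)
    r = x / x.mean()
    return float(np.mean(r * np.log(r)))

def theil_decomposition(x, groups):
    """Split Theil T into (within-group, between-group) components.

    Each group's contribution is weighted by its share of the total of x,
    so that within + between equals theil_t(x).
    """
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    total, mu = x.sum(), x.mean()
    within = 0.0
    between = 0.0
    for g in np.unique(groups):
        xg = x[groups == g]
        share = xg.sum() / total
        within += share * theil_t(xg)
        between += share * np.log(xg.mean() / mu)
    return within, between

if __name__ == "__main__":
    # Hypothetical county unemployment rates (%) in two hypothetical states.
    rates = [4.0, 6.0, 5.0, 9.0, 10.0, 8.0]
    states = ["A", "A", "A", "B", "B", "B"]
    print(theil_t(rates))
    print(theil_decomposition(rates, states))
```

A large between-group share would indicate that differences between states dominate, whereas a large within-group share would point to disparities among counties inside the same state.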
10.4.2 Birth and Death
Birth and death are traditional topics in demographic studies and hot issues in
Spatial Demography. These traditional topics have been discussed in depth
using spatial analytical techniques. Most research on fertility rates is
conducted against the background of fertility decline across the world. Based on
economic and demographic data at the county level in the USA, Porter (2017)
studied the relationship between regional economic development and the fertility rate.
Meanwhile, topics such as childbearing within cohabitation have attracted the
attention of spatial demographers. Vitali et al. (2015) studied the spatial distribution
of childbearing within cohabitation in Norway from 1988 to 2011. Their
findings show spatial heterogeneity and autocorrelation and, based on the results
of a spatial panel model, indicate that the unemployment rate and the female
educational level are the main causes of childbearing within cohabitation.
In addition, antenatal care has become a hot issue in recent years. Gayawan (2014)
investigated the explanatory factors and spatial effects of antenatal care services in
Nigeria with a Poisson regression model, finding that antenatal care services
exhibit spatial heterogeneity and that factors such as childbearing age, spouse
age, and length of marriage have a great influence on them.
Research on mortality focuses mainly on developing countries, where the prediction
and estimation of child mortality is an important topic. Balk et al. (2004) paid
early attention to child mortality in developing countries; they studied the
determinants of child mortality in 10 West African countries and found that child
mortality is closely correlated with the geographic environment. Later, Storeygard
et al. (2008) explored the spatial distribution of global child mortality and
spatialized child mortality data using gridded statistics. Yang et al. (2015)
employed a Spatial Durbin Model to analyze the spatial distribution of mortality at
the county level in the USA and found noticeable spatial spillover effects.
Jankowska et al. (2013) constructed an estimation model based on mortality data for
children under 5 years old from 1987 to 2006 in Accra, Ghana, and revealed spatial
heterogeneity in child mortality in that region, which is also significantly
affected by the environment. Roy et al. (2017) predicted infant mortality in 64
districts of Bangladesh and found that estimates revised with a spatial regression
model are more accurate than those of traditional estimation models. Population
aging has also attracted considerable attention among scholars. Against the
background of an aging American population, some scholars have estimated the number
of medical insurance beneficiaries at different spatial scales, which could serve
as a reference for the government in formulating health policies (Siordia 2013).
10.4.3 Migration and Urbanization
With the further progress of globalization and the constant improvement of
communications, the volume and frequency of migration have begun to exceed those of
earlier development phases, producing new migration patterns and structures. Given
low natural growth rates in many countries and regions, the mechanical growth rate
generated by inter-regional migration has become a principal factor in changes in
regional population size and structure. The literature on population migration in
Spatial Demography mainly revolves around the following three aspects: ➀ The
spatial characteristics of population distribution within cities and across regions
that result from population migration. Kalogirou (2005) mapped the spatial trends
of population migration in England and Wales (including the immigration rate,
emigration rate, net migration rate and migration flows). Mazza and Punzo (2016)
used Ripley's K function to measure the spatial distribution of the foreign
immigrant population in Catania, Italy. They found evident spatial concentration
among immigrants from Sri Lanka, Mauritius, Senegal and China, whereas immigrants
from Tunisia and Morocco were distributed more randomly. They also found that the
attractiveness of urban areas to immigrants varies from one area to another. ➁ The
determinants of population migration. Fernandez et al. (2007) investigated the
effect of educational level on emigration from El Paso, USA, to examine whether the
low proportion of college students in this city is related to the emigration of the
highly educated population. The results show that Mexicans and Mexican Americans
are more willing to emigrate from this area than the highly educated population.
Withers and Clark (2006) assessed the costs and rewards of American family
migration from the perspective of economic geography and argued that the geography
of family migration is interrelated with wives' moves in and out of the labor
market. Shen (2015, 2017) conducted a set of studies on the pattern of
inter-provincial migration in China and its causes, including improving the
traditional gravity model with a Poisson migration model. ➂ Distance features of
population migration. Stillwell and Thomas (2016) calculated population migration
distances with an improved spatial interaction model and tested it on population
survey data collected from 2001 to 2010 in England; the model proved to have greater
explanatory power than traditional models.
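The traditional gravity model that these spatial interaction studies extend can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the parameter values (k, alpha, beta, gamma) and the three-region data are hypothetical, and the Poisson estimation step used by Shen (2015, 2017) is not shown.

```python
def gravity_flows(populations, distances, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Predicted migration flow from each origin i to each destination j.

    flow[i][j] = k * P_i**alpha * P_j**beta / d_ij**gamma
    Flows from a region to itself are left at zero.
    """
    n = len(populations)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                flows[i][j] = (k * populations[i] ** alpha * populations[j] ** beta
                               / distances[i][j] ** gamma)
    return flows


# Hypothetical populations and pairwise distances for three regions.
pops = [1000.0, 500.0, 200.0]
dist = [[0.0, 10.0, 20.0],
        [10.0, 0.0, 15.0],
        [20.0, 15.0, 0.0]]
flows = gravity_flows(pops, dist, k=0.01)
for row in flows:
    print(row)
```

In applied work the parameters are estimated from observed flows rather than fixed in advance; a Poisson specification is one common choice because flows are non-negative counts.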
Urbanization is also a prominent issue in Spatial Demography. Net migration is an
important component of urban population change, and the spatial features of
population migration lead to different levels of regional urbanization and
different city sizes. Ortega (2014) analyzed the expansion and changing trends of
urban agglomerations from the perspective of Spatial Demography by studying the
spatial pattern of population and its change at the county level in Manila, the
Philippines. The study found that spatial morphological change in the region
follows a pattern of growth-point expansion. Chandrasekhar et al. examined the
spatial pattern of inter-city population migration in India and the effect of
population migration on Indian urbanization. Strozza et al. (2016) studied the
changing trends of internal and international migrants in 8 urban agglomerations of
Italy from 2001 to 2010 and their impact on regional population change. They found
that population growth in the central and northern urban agglomerations of Italy
relies heavily on international immigration.
10.4.4 Regional Population Forecast
Regional population forecasting is of great importance to demography. Accurate
prediction of regional population size and structure in different periods can
provide important references for regional development policies. Most studies have
conducted single-regional population forecasts using a mathematical prediction
model (such as linear or quadratic models), a cohort prediction model based on
demographic theory, or a socioeconomic model. With single-regional methods, the
estimated numbers of men and women, the population at different ages and the total
population are often inconsistent with the corresponding figures for the upper-level
region (or the overall region). In addition, single-regional population models are
unable to estimate the scale, direction and distribution of future migration among
regions (Wang 2000). Multiregional population forecasting differs fundamentally
from traditional single-regional forecasting in that it introduces spatial effects,
which also marks a shift of focus from traditional demographic research to Spatial
Demography. The Markov chain model is a basic model of multi-regional population
forecasting: using probability theory and matrix methods, it calculates the
population size of each region at time t + 1 from the population at time t. This
method can be applied to the population forecast of specific regions in combination
with prediction techniques such as the Grey Model (Jiang et al. 2014). Rogers
(2015) improved on the Markov chain model and proposed a multi-regional population
prediction method that is more meaningful in demographic terms. By taking account
not only of population migration but also of other elements such as birth and
death, this model can even be used to predict the sex and age structure of the
population in each region. Rogers' book Applied Multiregional Demography: Migration
and Population Redistribution gives a full account of recent progress in
multi-regional population forecasting (Rogers 2015).
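The basic Markov chain step described above can be sketched as follows. This is a minimal illustration, assuming a closed system with no births or deaths; the transition probabilities and populations are hypothetical, and the fuller multiregional models of Rogers (2015) add fertility, mortality and age structure on top of this step.

```python
def project(population, transition, steps=1):
    """Project regional populations forward with a Markov chain.

    transition[i][j] is the probability that a resident of region i lives in
    region j one period later, so every row must sum to 1.
    Births and deaths are ignored (closed-system assumption).
    """
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    current = list(population)
    k = len(current)
    for _ in range(steps):
        # Population of region j at t + 1 is the sum of inflows from every region i.
        current = [sum(current[i] * transition[i][j] for i in range(k))
                   for j in range(k)]
    return current


# Hypothetical populations at time t and one-period transition probabilities.
pop_t = [1000.0, 2000.0, 1500.0]
P = [[0.90, 0.06, 0.04],
     [0.03, 0.95, 0.02],
     [0.05, 0.05, 0.90]]
pop_t1 = project(pop_t, P)
print(pop_t1)
```

Because every row of the matrix sums to 1, the total population is conserved and only its regional distribution changes, which is why the regional figures remain consistent with the total for the overall region.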
As more channels for obtaining spatial population data become available, small-area
population forecasting has begun to draw the attention of spatial demographers.
Scholars rely on various forecasting methods to predict population characteristics
at the county level or at finer scales. Based on a spatial regression model, Chi
and Voss (2011) put forward a small-area population prediction method and conducted
an empirical study using data for Wisconsin and Milwaukee in the USA. By
integrating the historical growth characteristics of a region and their effects on
neighboring regions, this model can produce more accurate predictions than
traditional methods.
10.4.5 Population and the Environment
The study of the relationship between population and environment is of great
significance to sustainable development in many countries and regions. Research on
population and environment, together with advances in spatial data collection and
analysis, has deepened scholars' understanding of social and ecological processes
and drawn the attention of spatial demographers to population and environment
issues. On the one hand, human production and daily life affect the geographic
environment, and how to evaluate such impacts from a spatial perspective is a
long-term concern. For example, Linderman and Lepczyk (2013) pointed out a close
relationship between vegetation dynamics and human settlement in the USA using
MODIS satellite data from 2001 to 2011. Duncan et al. (2014) examined the
relationship between vegetation density and socio-demographic variables (ethnic
composition and income level) in Boston communities using Moran's I and OLS; they
found no significant link between the two and suggested adopting spatial regression
models in future work on this issue. On the other hand, the impact of environmental
disasters on demographic behaviors such as birth, death and migration is also a
research focus. Taking the coastal zone of the state of Georgia, USA, as an
example, Hauer et al. (2015) discussed the influence of a rising shoreline on human
settlements along the coast in the context of global warming. They overlaid the
predicted population of the region on the predicted future shoreline and concluded
that a population of 60,000–160,000 would be threatened by the rising shoreline in
2100.
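Global Moran's I, the spatial autocorrelation statistic mentioned above, can be sketched as follows. This is a minimal illustration, assuming a binary contiguity weight matrix for four areal units arranged in a line; the values are hypothetical, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for areal values and a spatial weight matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Four units in a row; neighbors share a border (binary contiguity weights).
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

clustered = morans_i([1.0, 2.0, 3.0, 4.0], W)    # similar values are adjacent
alternating = morans_i([1.0, 4.0, 1.0, 4.0], W)  # dissimilar values are adjacent
print(clustered, alternating)
```

Positive values indicate that similar values cluster in space and negative values indicate that neighbors tend to differ. Significant autocorrelation in OLS residuals is a common reason for moving to a spatial regression model.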
10.5 Research Technique in Spatial Demography
To date, a large number of spatial analytical methods have been used in Spatial
Demography. This section summarizes the techniques and methods applied in the
field. As shown in Table 10.3, the research methods used in Spatial Demography fall
into four categories: GIS atlas analysis, geostatistical analysis, spatial
statistics analysis (including pattern analysis and model analysis) and spatial
complex models. GIS atlas analysis mainly revolves around traditional GIS analysis
methods based on spatial position, while geostatistical analysis revolves around
the semivariogram function and Kriging interpolation. Spatial statistics analysis
uses pattern-analysis methods such as the Standard Deviation Ellipse, together with
spatial regression models, spatial filtering models and the GWR model, to explain
demographic phenomena. Spatial complex models are used to simulate spatio-temporal
patterns and provide decision support based on geographical simulation technologies
such as cellular automata and agent-based models. Spatial analysis methods serve as
the foundation for the development of Spatial Demography. Besides using existing
models and methods, some scholars are committed to studying the theoretical models
and technical methods of Spatial Demography itself. Research on the methodology of
Spatial Demography is reviewed in this section.

Table 10.3 Conceptual meaning of spatial analysis and its application in Spatial Demography

| Concept | Meaning | Specific method | Cases of spatial demography |
|---|---|---|---|
| GIS atlas analysis | Spatial proximity measure based on the distance and interaction effects between geographic data | Buffer analysis, overlay analysis | Spatial visualization of demographic variables, thematic cartography |
| GIS atlas analysis | Travel path based on topological network and analysis of distance characteristics | Network analysis | Path analysis of population migration and disease transmission |
| Geostatistical analysis | Spatial interpolation analysis: the unbiased optimal estimation of unknown regional data on the basis of geographic sample points using the variogram function | Kriging interpolation | Spatial heterogeneity detection of demographic variables, prediction of data in unknown regions, discretization of aggregate data |
| Spatial statistics analysis: pattern analysis | Spatial pattern analysis of case data based on position characteristics of point data such as distance and density, and spatial pattern analysis of discrete raster or grid data | Standard deviation ellipse, average nearest neighbor method, Ripley's K function method | Distribution law of individual cases (such as disease sample points in a city) or of aggregated demographic variables (birth rate and death rate) |
| Spatial statistics analysis: model analysis | Analysis and identification of spatial data combining the principles of statistics with chart data visualization | Exploratory spatial data analysis (ESDA) | Spatial autocorrelation, anisotropy, hotspot analysis and local spatial clustering of population variables |
| Spatial statistics analysis: model analysis | Estimation of the factors affecting geographic data with a spatial regression model of explicit format | Spatial autoregression model | The factors affecting demographic variables (the migration flow between two regions) and their spatial spillover effects |
| Spatial statistics analysis: model analysis | Estimation of the factors affecting geographic data with the spatial filter technique of implicit format | Spatial filtering model | The factors affecting demographic variables (the migration flow between two regions) after filtering the spatial autocorrelation effects |
| Spatial statistics analysis: model analysis | Analysis of the spatial difference of influencing factors at each geographic sample point | Geographically weighted regression model | Spatial variation in the factors influencing regional demographic variables (such as the birth rate) |
| Spatial complex model: random model | Simulation of the behaviors and characteristics of study subjects through a conditional constraint mechanism | Cellular automata, multi-agent model | Temporal and spatial simulation analysis of demographic variables (such as the birth rate) |
| Spatial complex model: dynamic model | Simulation analysis considering spatial nonlinearity, multiple levels and complex networks with a system feedback mechanism | Complex network model, system dynamics model | Evolution analysis of the complex network features of the inter-regional migration network |
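The semivariogram on which geostatistical analysis and Kriging interpolation rest can be estimated empirically from sample points. The following is a minimal sketch, assuming equal-width distance bins and hypothetical sample values along a transect; it computes only the empirical semivariance and does not fit a variogram model or perform Kriging.

```python
import math


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Empirical semivariance per distance bin.

    gamma(h) = sum of (z_i - z_j)^2 over pairs in bin h, divided by 2 * N(h),
    where N(h) is the number of pairs whose distance falls in bin h.
    Bins with no pairs are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical sample points along a transect with a steady trend in the values.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
gamma = empirical_semivariogram(pts, vals, bin_width=1.5, n_bins=3)
print(gamma)
```

Semivariance that rises with distance, as in this example, indicates that nearby observations are more alike than distant ones, which is the spatial structure that Kriging exploits when interpolating values at unsampled locations.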
Methodological research can be classified into two categories: theoretical
construction of Spatial Demography and application of demographic spatial analysis
technology. In terms of theoretical construction, studies of this kind clarify the
concept of Spatial Demography, analyze the relations and distinctions between
Spatial Demography and related disciplines, trace its origin and evolution, and
suggest directions for future research. Spatial Demography was long neglected but
has regained considerable attention since the beginning of this century. Some
scholars have systematically analyzed the reasons and argue that factors such as
the rapid progress of computer processing power, GIS technology and visualization
technology in the second half of the last century greatly promoted the development
of Spatial Demography (Howell and Porter 2013). Balk and Montgomery (2015)
discussed the prospects for applying Spatial Demography in urban research, stating
that, given rising urbanization rates and population migration and the limitations
of traditional demographic methods, urban issues need to be analyzed from a spatial
perspective. In 2016, Recapturing Space: New Middle-Range Theory in Spatial
Demography was published, which places great emphasis on the theoretical
construction of Spatial Demography (Howell et al. 2016).
In terms of technology application, the extant literature mainly emphasizes the
spatialization of population data, the application of spatial analysis technology
in population research, and the management of population data in Spatial
Demography. First, the spatialization and visualization of population data are the
basis of Spatial Demography, and how to spatialize population data collected from
social surveys is a focus of scholars. Reibel and Agrawal (2007) discussed the
errors that arise when sociodemographic data are aggregated to incompatible spatial
units and argued that spatial interpolation is a better approach. Walker (2016) and
Sun et al. (2013) implemented the visual mapping of spatial population data and
improved the precision of demographic data mapping. Based on historical
spatio-temporal population data for the 20 counties of metropolitan Atlanta from
1940 to 2009, Hauer (2013) elaborated on the technical details of the 3D
spatio-temporal visualization of housing density. Second, how to apply frontier
spatial analysis methods to the study of Spatial Demography is another prominent
topic, since the development of Spatial Demography is closely related to spatial
analysis technology. Reibel (2007) systematically reviewed the literature on the
application of GIS and spatial data in demography and pointed out that the various
channels for accessing data (such as remote sensing data and cellular signaling
data) support population data acquisition and analysis. Patel et al. (2015)
illustrated the application of Landsat satellite data to the extraction of urban
population data. Spatial econometric models, such as the GWR model, are also used
in Spatial Demography. For example, when spatial population data are analyzed with
GWR, the regression coefficient and the t-test value of each sample point can be
recorded on the same map, which strengthens the capacity to present the data
(Matthews and Yang 2012). Golgher and Voss (2015) discussed the application of
spatial regression models in Spatial Demography. Finally, spatial population data
management is of great significance to both researchers and governments. For
instance, Kugler et al. (2017) put forward a regional data management strategy that
meets the needs of researchers and also satisfies the data management requirements
of government departments.
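The idea behind GWR, estimating a separate set of coefficients at each location, can be sketched as follows. This is a minimal illustration, assuming one predictor, a Gaussian distance kernel and a fixed bandwidth; the function name and data are hypothetical, and local t-values and bandwidth selection are omitted.

```python
import math


def gwr_local_fits(coords, x, y, bandwidth):
    """Fit y = a + b * x separately at every location by weighted least squares.

    Observations are weighted by a Gaussian kernel of their distance to the
    focal location. Returns one (intercept, slope) pair per location.
    """
    fits = []
    for focal in coords:
        w = [math.exp(-0.5 * (math.dist(focal, c) / bandwidth) ** 2)
             for c in coords]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        fits.append((my - slope * mx, slope))
    return fits


# Hypothetical data: the effect of x on y is 1 in the west and 3 in the east.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
          (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

local = gwr_local_fits(coords, x, y, bandwidth=2.0)
pooled = gwr_local_fits(coords, x, y, bandwidth=1e6)  # near-equal weights
print([round(slope, 3) for _, slope in local])
```

With a very large bandwidth the weights become nearly equal and every local fit approaches the single pooled regression, so the bandwidth controls how local the estimates are. Mapping the local slopes is what reveals the spatial variation that one global coefficient hides.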
10.6 Prospect of Future Development of Spatial
Demography
Spatial Demography entered a phase of rapid development at the end of the 20th
century. A substantial body of research has accumulated, which has had a
considerable impact on demography, geography and other social sciences and can
serve as a reference for the future development of demography and related fields.
Based on the analysis of the core concepts, development trajectory and important
research areas of Spatial Demography, this paper provides an outlook for its
further development from three perspectives: data acquisition, theoretical analysis
and practical application. In the future, Spatial Demography will draw on more
theories from related disciplines, use more advanced spatial analysis technology
and become more problem-oriented, providing scientific evidence for policy making
on sustainable population development across the world.
(1) Data acquisition: ➀ Increased access to population data. Given the abundance
of geographic and location information, the combination of spatial population
analysis technology with traditional analysis technology is inevitable. The
growing number of channels for obtaining spatial population data will extend the
data sources of demographic research. For example, although remote sensing
technology cannot extract population information directly, population can still
be estimated from the spatial footprint of human activity, which not only reduces
the cost of data acquisition but also supports research on Spatial Demography in
countries with limited data availability. At present, the wide application of big
data makes it possible to use data that directly reflect population activities
(such as mobile phone signaling data) in related research, thus widening the
channels for accessing population data. ➁ Spatialization of population data.
Population data spatialization refers to the process of disaggregating population
statistics into discrete spatial units with a computer algorithm and mining the
spatial information they contain. The advantages of spatialized population data
include better depiction of population characteristics, multi-source and
multi-scale data fusion, and access to population data at lower cost and in less
time. The spatialization of population data will be one of the trends in the
future development of Spatial Demography, and discrete population models will
provide more support for future research in the field.
(2) Theoretical analysis: ➀ Theoretical construction in Spatial Demography. Since
the turn of the century, the renaissance of Spatial Demography has largely
depended on the development of spatial analysis technology. In recent years, more
and more scholars have begun to pay attention to demographic theory. For example,
in deriving multi-regional population forecast models, prediction models based on
mathematical statistics are combined with demographic theory, not only to forecast
the regional population but also to calculate other variables more closely related
to demographic theory, such as sex or population structure. This reflects a future
trend toward combining spatial demographic models with demographic theory. ➁ The
integration of macro and micro perspectives. There has always been a divergence
between macro and micro perspectives in demography. Macro-demography relies more
on the analysis of geographically aggregated population statistics, while
micro-demography relies more on the study of individuals. In the future, the
development of Spatial Demography will be characterized by the combination of
macro and micro perspectives, and the application of multilevel models reflects
this combination. By means of stratified statistical analysis, such models can be
used to study demographic phenomena at different spatial scales, which is highly
regarded by many scholars. ➂ The spatial turn of demographic models. Traditional
Spatial Demography usually focuses on the cartographic visualization of population
data and basic map analysis (layer overlay, spatial proximity, buffer analysis,
etc.). At the end of the last century, with the rapid development of spatial
analysis, a range of methods were applied in empirical studies of Spatial
Demography, such as spatial pattern recognition based on spatial relationships and
spatial autocorrelation, exploratory spatial data analysis (ESDA), spatial
regression analysis, spatial filtering models and GWR. In recent years, spatial
complex models such as cellular automata, multi-agent models and neural networks
have begun to appear in Spatial Demography, which enhances its ability to simulate
demographic phenomena realistically (e.g., the spatial distribution of
population). Spatial Demography therefore pays more and more attention to the
direct integration of space into the analysis model.
(3) Practical application: ➀ Migration and urbanization. Internal and
international migration is expected to exert a substantial and far-reaching
influence on the urbanization process in the future. In this context, migration
and urbanization studies will become an important field for the practical
application of Spatial Demography, in which the applicability and validity of
existing demographic theories (e.g., the push and pull theory of migration) and
models (e.g., Poisson gravity models) will be further tested and advanced spatial
analysis methods and technologies will be employed. The main research topics
include the spatial patterns of migration or urbanization in different periods and
their evolution, the mechanisms and determinants of migration or urbanization, the
interaction between migration and urbanization, and the spatial and social effects
of migration or urbanization. ➁ Multi-regional population forecast. We consider
multi-regional population forecasting to be one of the most promising branches of
Spatial Demography, which will provide reference and inspiration not only for
demographic and population research but also for regional planning and urban
governance practice. Because it is based on gross flow data instead of net rates,
multi-regional population forecasting can improve the accuracy of population
prediction (Rogers 2015), and its prediction results, aggregated to geographical
regions, are probably more compatible with GIS software. Recent years have
witnessed improvements in this field, such as a new type of multi-regional
forecast model using place of birth data (Abel 2013) and the visualization of
population prediction results with chord diagram plots (Abel and Sander 2014; Qi
et al. 2017). ➂ Population and environment. The relationship between human
activities and the environment is another application field of future Spatial
Demography research, which is closely tied to the sustainable development of
human beings. In this field, empirical studies will focus on the following
aspects: (i) analyzing the spatial distribution of human activities or natural
resources at different stages in order to explore the co-evolution of population
and environment; (ii) evaluating the impacts of human settlement on natural
resources and optimizing regional population structure and distribution, for
instance by calculating and mapping the environmental population carrying
capacities and reasonable population capacities of different regions. Such
research can provide more comprehensive policy references for the government.
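The spatialization of population data described under data acquisition above can be sketched as a simple proportional allocation. This is a minimal illustration, assuming that each grid cell carries an auxiliary weight such as built-up area or nighttime light intensity; the weights and the census total are hypothetical, and operational methods use more elaborate models.

```python
def disaggregate(total_population, cell_weights):
    """Allocate an areal population total to grid cells in proportion to weights.

    cell_weights holds one non-negative auxiliary value per grid cell
    (for example built-up area); cells with zero weight receive no population.
    """
    weight_sum = sum(cell_weights)
    if weight_sum <= 0:
        raise ValueError("at least one cell must have a positive weight")
    return [total_population * w / weight_sum for w in cell_weights]


# Hypothetical census unit of 10,000 people covering four grid cells.
cells = disaggregate(10000, [0.0, 2.0, 5.0, 3.0])
print(cells)
```

The allocation preserves the original total for the census unit, so the gridded surface stays consistent with the statistics it was derived from.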
Acknowledgements This research was supported by the National Social Science Foundation of
China (grant number 17ZDA055) and the National Academy of Innovation Strategy (grant number
CXY-ZKQN-2019-041).
References
Abel, G. J., & Sander, N. (2014). Quantifying global international migration flows. Science, 343,
1520–1522.
Abel, G. J. (2013). Estimating global migration flow tables using place of birth data. Demographic
Research, 28, 505–546.
Arnio, A. N., & Baumer, E. P. (2012). Demography, foreclosure, and crime: Assessing spatial
heterogeneity in contemporary models of neighborhood crime rates. Demographic Research, 26,
449–488.
Balk, D. L., Pullum, T., Storeygard, A., Greenwell, F., & Neuman, M. (2004). A spatial analysis of
childhood mortality in West Africa. Population, Space and Place, 10, 175–216.
Balk, D. L., & Montgomery, M. R. (2015). Guest Editorial: “Spatializing Demography for the Urban
Future”. Spatial Demography, 3, 59–62.
Chi, G., & Voss, P. R. (2011). Small-area population forecasting: Borrowing strength across space
and time. Population, Space and Place, 17(5), 505–520.
Clarke, J. I. (1984). Geography and population: Approaches and applications. Oxford: Pergamon
Press.
Congdon, P., & Batey, P. (1989). Advances in regional demography: Information, forecasts,
models. New York: Pinter Pub Ltd.
Curtis, K. J., Voss, P. R., & Long, D. D. (2012). Spatial variation in poverty-generating processes:
Child poverty in the United States. Social Science Research, 41, 146–159.
Duncan, D. T., Aldstadt, J., Whalen, J., White, K., Castro, M. C., & Williams, D. R. (2012).
Space, race, and poverty: Spatial inequalities in walkable neighborhood amenities? Demographic
Research, 26, 409–448.
Duncan, D. T., Kawachi, I., Kum, S., Aldstadt, J., Piras, G., Matthews, S. A., et al. (2014). A spatially
explicit approach to the study of socio-demographic inequality in the spatial distribution of trees
across Boston neighborhoods. Spatial Demography, 2, 1–29.
Fernandez, L., Howard, C., & Amastae, J. (2007). Education, race/ethnicity and out-migration from
a border city. Population Research and Policy Review, 26, 103–124.
Gayawan, E. (2014). A Poisson regression model to examine spatial patterns in antenatal care
utilisation in Nigeria. Population, Space and Place, 20, 485–497.
Giddens, A. (1984). The constitution of society. Cambridge: Polity Press.
Golgher, A. B., & Voss, P. R. (2015). How to interpret the coefficients of spatial models: spillovers,
direct and indirect effects. Spatial Demography, 4, 175–205.
Goodchild, M. F., & Janelle, D. G. (2004). Spatially integrated social science. New York: Oxford
University Press.
Harris, R., Johnston, R., & Burgess, S. (2007). Neighborhoods, ethnicity and school choice: Devel-
oping a statistical framework for Geodemographic analysis. Population Research and Policy
Review, 26, 553–579.
Hauer, M. E. (2013). A 3D Spatio-Temporal Geovisualization of Subcounty estimates of historic
housing density in Metro Atlanta, 1940–2009. Spatial Demography, 1, 146–161.
Hauer, M. E., Evans, J. M., & Alexander, C. R. (2015). Sea-level rise and sub-county population
projections in coastal Georgia. Population and Environment, 37, 44–62.
Howell, F. M., & Porter, J. R. (2013). Editorial welcome: Why spatial demography? Spatial
Demography, 1, 1–2.
Howell, F. M., Porter, J. R., & Matthews, S. A. (2016). Recapturing space: New middle-range
theory in spatial demography. Berlin: Springer.
Jankowska, M. M., Benza, M., & Weeks, J. R. (2013). Estimating spatial inequalities of urban child
mortality. Demographic Research, 28, 33–62.
Jiang, C., Wang, J., & Shi, Y. (2014). Forecasting spatial migration tendency with FGM (1, 1) and
hidden Markov model. Indonesian Journal of Electrical Engineering and Computer Science, 12,
3348–3356.
Kalogirou, S. (2005). Examining and presenting trends of internal migration flows within England
and Wales. Population, Space and Place, 11, 283–297.
Keisuke, S. (1980). Spatial demography. Tokyo: Daimyo Publishing House.
Kugler, T. A., Manson, S. M., & Donato, J. R. (2017). Spatiotemporal aggregation for temporally
extensive international microdata. Computers, Environment and Urban Systems, 63, 26–37.
Laurini, M. P. (2016). Income estimation using night luminosity: A continuous spatial model. Spatial
Demography, 4, 83–115.
Linderman, M. A., & Lepczyk, C. A. (2013). Vegetation dynamics and human settlement across
the conterminous United States. Journal of Maps, 9, 198–202.
Matthews, S. A., & Parker, D. M. (2013). Progress in spatial demography. Demographic Research,
28, 271–312.
Matthews, S. A., & Yang, T. C. (2012). Mapping the results of local statistics: Using geographically
weighted regression. Demographic Research, 26, 151–166.
Mazza, A., & Punzo, A. (2016). Spatial attraction in migrants’ settlement patterns in the city of
Catania. Demographic Research, 35, 117–138.
Ortega, A. A. C. (2014). Mapping Manila’s Mega-Urban region. Asian Population Studies, 10,
208–235.
Patel, N. N., Angiuli, E., Gamba, P., Gaughan, A., Lisini, G., Stevens, F. R., et al. (2015). Multitem-
poral settlement and population mapping from Landsat using Google Earth Engine. International
Journal of Applied Earth Observation and Geoinformation, 35, 199–208.
Porter, J. R. (2017). Human development and the fertility reversal: A spatially centered sub-national
examination in the US. Spatial Demography, 5, 43–72.
Qi, W., Abel, G. J., Muttarak, R., & Liu, S. (2017). Circular visualization of China’s internal
migration flows 2010–2015. Environment & Planning A, 49, 2432–2436.
Rees, P. H., & Wilson, A. G. (1977). Spatial population analysis. London: E. Arnold.
Reibel, M., & Agrawal, A. (2007). Areal interpolation of population counts using pre-classified
land cover data. Population Research and Policy Review, 26, 619–633.
Reibel, M. (2007). Geographic information systems and spatial data processing in demography: A
review. Population Research and Policy Review, 26, 601–618.
Rogers, A. (2015). Applied multiregional demography: Migration and population redistribution.
Berlin: Springer.
Roy, P. K., Gulshan, J., & Hossain, S. S. (2017). Spatially revised estimation of infant mortality in
Bangladesh. Spatial Demography, 5, 25–72.
Shen, J. (2015). Explaining interregional migration changes in China, 1985–2000, using a
decomposition approach. Regional Studies, 49, 1176–1192.
Shen, J. (2017). Modelling interregional migration in China in 2005–2010: The roles of regional
attributes and spatial interaction effects in modelling error. Population, Space and Place, 23,
1–14.
Shryock, H. S., & Eldridge, H. T. (1947). Internal migration in peace and war. American Sociological
Review, 12, 27–39.
Siordia, C. (2013). Benefits of small area measurements: A spatial clustering analysis on medicare
beneficiaries in the USA. Human Geographies, 7, 53–59.
Snow, J. (1855). On the mode of communication of cholera. London: John Churchill.
Stillwell, L. J., & Thomas, M. (2016). How far do internal migrants really move? Demonstrating
a new method for the estimation of intra-zonal distance. Regional Studies Regional Science, 3,
28–47.
Storeygard, A., Balk, D. L., Levy, M., & Deane, G. (2008). The global distribution of infant mortality:
A subnational spatial view. Population, Space and Place, 14, 209–229.
Strozza, S., Benassi, F., Ferrara, R., & Gallo, G. (2016). Recent demographic trends in the major
Italian urban agglomerations: The role of foreigners. Spatial Demography, 4, 39–70.
Sun, M., Kronenfeld, B. J., & Wong, D. W. (2013). Cartographic techniques for communicating
class separability: Enhanced choropleth maps of median household income, Iowa. Journal of
Maps, 9, 43–49.
Theodorson, G. A. (1961). Studies in human ecology. Evanston, IL: Row Peterson and Company.
Thiede, B. C., & Monnat, S. M. (2016). The Great Recession and America’s geography of
unemployment. Demographic Research, 35, 891–928.
Tobler, W. (2004). On the first law of geography: A reply. Annals of the Association of American
Geographers, 94, 304–310.
Vitali, A., Aassve, A., & Lappegard, T. (2015). Diffusion of childbearing within cohabitation.
Demography, 52, 355–377.
Voss, P. R. (2007). Demography as a spatial social science. Population Research and Policy Review,
26, 457–476.
Voss, P. R., Long, D. D., Hammer, R. B., & Friedman, S. (2006). County child poverty rates in the
US: A spatial regression approach. Population Research and Policy Review, 25, 369–391.
Walker, K. E. (2016). Tools for interactive visualization of global demographic concepts in R.
Spatial Demography, 4, 207–220.
Wang, G. (2000). Regional population forecast methods and applications. Shanghai: East China
Normal University Press.
Wang, X. (2017). On regional demography. Population Research, 41, 59–69.
Winkler, R. (2013). Research note: Segregated by age: Are we becoming more divided? Population
Research and Policy Review, 32, 717–727.
Withers, S. D., & Clark, W. A. V. (2006). Housing costs and the geography of family migration
outcomes. Population, Space and Place, 12, 273–289.
Yang, T. C., Noah, A., & Shoff, C. (2015). Exploring geographic variation in US mortality rates
using a spatial Durbin approach. Population, Space and Place, 21, 18–37.
Zeng, Y., Zhang, Z., & Gu, D. (2011). Demographic analysis: methods and applications. Beijing:
Peking University Press.
Chapter 11
Complex Network Theory on High-Speed
Transportation Systems
Haoran Yang and Yongling Li
11.1 Introduction
Physical transport systems, along with virtual infrastructures such as ICT, are
fundamental elements of our societies and economies. Historically, technological
change in transportation has led to societal change and brought socio-economic
growth and job creation by reducing the spatial and temporal separation of
people in transportation networks. Even though ICT greatly facilitates
instant communication, face-to-face interactions remain important in the contemporary
world (Bertolini and Dijst 2003). High-speed physical transportation such as air
and high-speed railway (HSR) can dramatically reduce the geographic and temporal
constraints on moving people for business transactions, tourism, post-migratory
travel to maintain social links with friends and relatives, academic collaborations and
political activities (Hall and Pain 2006). Given the
important role of high-speed physical transportation networks in linking urban areas,
the development of airlines and HSR has been supported by substantial capital and
infrastructure investment around the world. Since DELAG (Deutsche
Luftschiffahrts-Aktiengesellschaft), the world’s first airline, was founded in 1909, airline
networks have expanded to cover almost every country in the world. In 2015, the airline industry’s
total global economic impact was 2.7 trillion, including direct, indirect, induced and
H. Yang
The Center for Modern Chinese City Studies, East China Normal University, Shanghai 200062,
China
e-mail: hryang@re.ecnu.edu.cn
School of Urban and Regional Science, East China Normal University, Shanghai 200241, China
Y. Li (✉)
Department of Human Geography and Planning, Utrecht University, 3584 CB Utrecht, The
Netherlands
e-mail: y.li2@uu.nl
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_11
the catalytic effects of tourism, and the industry transported approximately 3.6 billion passengers
(IATA 2017). Meanwhile, HSR networks have been constructed largely in Europe
and China. The first HSR corridor, Tokyo-Osaka, was inaugurated in Japan in 1964.
After that, other typical HSR networks, such as the TGV in France and the ICE
in Germany (Givoni 2006; Hall 2013), were developed in Europe. It is estimated
that the EU HSR networks will contribute 0.23 percent of GDP growth for
Europe (Banister and Givoni 2013). HSR has also experienced exponential growth
in China, from 477 km in 2008 to 19,000 km at the end of 2016, a length that
accounts for over 60% of the global figure (NDRC 2016).
Notably, the complex networks approach has explained the appearance of
emergent phenomena in many systems composed of a large set of interacting
elements (Zanin and Lillo 2013). Methodologies for analyzing different topologies
have supported a better understanding of the structure and dynamics of many
real-world systems. While complex network theory can be traced back to physics in
the natural sciences, it has been widely applied in the social sciences over the past
decade to understand the structure of
urban systems (Derudder et al. 2013; Derudder and Taylor 2005), the internet (Choi
et al. 2006; Devriendt et al. 2010), social media (Wang et al. 2018), etc.
The components of transportation networks interact strongly with each other
in daily life and can therefore also be studied from the perspective of
complex networks. Although complex network methodology has been used in earlier
transportation network research, including urban streets (Jiang 2007; Jiang et al.
2014), conventional railways (Guo and Cai 2008; Li and Cai 2007) and urban public
transportation networks (von Ferber et al. 2007, 2009), the absence of a systematic
theoretical scheme of complex network theory for high-speed transportation networks
remains a prominent issue. Although airline and HSR systems are larger in spatial
scale and more complex than conventional transportation systems,
it is still possible to obtain a reasonably comprehensive overview of
them, because they operate through a finite set of major locations, cities and hubs.
At present, most complex network studies focus on the application
of the model rather than on the differences in spatial characteristics between the
two transportation systems. Since both air and HSR networks serve
transportation at least at the scale of the regional urban system, how previous studies were
conducted at different spatial scales needs to be explored. Moreover, the temporal
evolution of the two networks differs, which needs to be further reviewed
from the perspective of longitudinal research. In this paper we present a review of the
literature on the application of complex network theory to the air and HSR
transportation systems at different spatial and temporal scales. Moreover, the two
high-speed transportation networks present different configurations regarding the weights
of the edges in a weighted network. Therefore, the differences between weighted
networks also need to be clarified.
This review is organized as follows. Section 11.2 describes the main compo-
nents involved in the complex network theory on the air and HSR transportation
systems, which are the basis for the construction of different network representa-
tions. Section 11.3 reports the previous research for the air and HSR networks at
different spatial and temporal scales. Section 11.4 reviews the different configurations
of weighted and non-weighted networks. Finally, Sect. 11.5 draws some
conclusions and presents some open lines of research.
11.2 Methodologies Used in Complex Network Theory
on Quantifying the Air and HSR Systems
11.2.1 Centrality Measures
Centrality measures the relative importance of a node within a transportation network.
Three measures, degree centrality, closeness centrality, and betweenness centrality, are used to
capture a node’s importance in terms of being directly connected to others, being accessible
to others, and being the intermediary between others in the transportation network.
11.2.1.1 Degree Centrality
Degree centrality is the number of edges connected to a specific node and indicates
the individual importance of the node in the network. It can be defined as:
\[ C_D(i) = \sum_{j \in N,\; j \neq i} a_{ij} \tag{11.1} \]
Here \(a_{ij}\) is the entry of the adjacency matrix: \(a_{ij} = 1\) when a direct link exists
between nodes i and j, and \(a_{ij} = 0\) otherwise. If a directed network is considered,
the degree can be extended to in-degree and out-degree centrality, which reflect the number
of edges ending in or starting from node i, respectively. If the in-degree centrality
is to a large extent correlated with the out-degree centrality in the transportation network,
the network is symmetric and can be regarded as an undirected network.
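As a rough illustration of Eq. (11.1), the sketch below computes in-degree and out-degree centrality from an adjacency matrix. The four-node matrix is a made-up toy network, not data from the studies reviewed here, and the function names are chosen only for this example.

```python
# Toy directed network of four hypothetical cities (illustrative only).
# a[i][j] = 1 when a direct link runs from node i to node j, else 0.
a = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]


def out_degree(a, i):
    """Number of edges starting from node i (row sum, skipping j == i)."""
    return sum(a[i][j] for j in range(len(a)) if j != i)


def in_degree(a, i):
    """Number of edges ending in node i (column sum, skipping j == i)."""
    return sum(a[j][i] for j in range(len(a)) if j != i)


out_deg = [out_degree(a, i) for i in range(len(a))]
in_deg = [in_degree(a, i) for i in range(len(a))]
print("out-degree:", out_deg)
print("in-degree: ", in_deg)
```

In this toy case node 3 receives a link but offers none, so its in-degree and out-degree differ and the network cannot be treated as undirected.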
11.2.1.2 Betweenness Centrality
As described by Freeman (1977, 1978), betweenness centrality quantifies the
intermediary importance of a node in the interactions between other nodes. The
node betweenness can be defined as:
\[ C_B(i) = \sum_{k \neq i \neq j \in N} \frac{\sigma_{kj}(i)}{\sigma_{kj}} \tag{11.2} \]
where \(\sigma_{kj}(i)\) is the number of shortest paths between nodes k and j that
pass through node i, while \(\sigma_{kj}\) is the total number of shortest paths between them.
The betweenness of a node therefore reflects the share of shortest paths
passing through it and its role as a transit point. It is worth noting that in real
transportation systems, nodes with a high betweenness have a critical impact
on network security (Barthélemy 2004, 2011) because they may be in a position to
broker or mediate connections between these pairs.
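A minimal sketch of Eq. (11.2) for an unweighted network is given below. It uses a breadth-first search to count shortest paths and is written for clarity rather than speed. The hub-and-spoke toy network and all function names are assumptions made for illustration. Because the sum runs over ordered pairs (k, j), values for an undirected network are twice those obtained when each unordered pair is counted once.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """BFS from source; returns hop distances and numbers of shortest paths."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:           # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """C_B(i): sum over ordered pairs (k, j) of sigma_kj(i) / sigma_kj."""
    bfs = {s: shortest_path_counts(adj, s) for s in adj}
    scores = {}
    for i in adj:
        dist_i, sigma_i = bfs[i]
        total = 0.0
        for k in adj:
            dist_k, sigma_k = bfs[k]
            for j in adj:
                if i == k or i == j or k == j:
                    continue
                if i not in dist_k or j not in dist_k or j not in dist_i:
                    continue  # an unreachable pair contributes nothing
                # i is on a shortest k-j path only if the distances add up
                if dist_k[i] + dist_i[j] == dist_k[j]:
                    total += sigma_k[i] * sigma_i[j] / sigma_k[j]
        scores[i] = total
    return scores


# Hypothetical hub-and-spoke network: hub H with spokes A, B and C.
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
scores = betweenness(star)
print(scores)
```

In the toy network every path between two spokes passes through the hub, so the hub carries all of the betweenness and the spokes carry none.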
11.2.1.3 Closeness Centrality
Closeness centrality indicates how close a given node is to other nodes along the
shortest paths and reflects the node’s accessibility in a transportation network. The node
closeness can be defined as:
\[ C_C(i) = \frac{1}{\sum_{j \neq i} d_{ij}} \tag{11.3} \]
Here \(d_{ij}\) is the shortest distance from node i to node j, and the sum runs over all
other nodes in a given network. The larger the closeness value, the more accessible the node.
For a non-weighted network, \(d_{ij}\) reduces to the geodesic distance, i.e. the number of
links on the shortest path.
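The following sketch evaluates Eq. (11.3) on a small unweighted network, using hop counts as distances. The three-node line network and the function names are hypothetical, and the code assumes a connected network with at least two nodes (otherwise the sum of distances is incomplete or zero).

```python
from collections import deque


def hop_distances(adj, source):
    """Geodesic (hop) distance from source to every reachable node, via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, i):
    """C_C(i) = 1 / (sum of shortest distances from i to all other nodes)."""
    dist = hop_distances(adj, i)
    total = sum(d for node, d in dist.items() if node != i)
    return 1.0 / total


# Hypothetical line network A - B - C.
line = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
for node in line:
    print(node, closeness(line, node))
```

The middle node B has the highest closeness because it is one hop from both ends.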
11.2.1.4 Weighted Network Analysis
The indices above are all based on unweighted projections of the transportation
system.
For real transportation networks, the weight can be the number of flights, the
number of seats offered, or the actual number of passengers. When a link
between two nodes i and j has a weight \(w_{ij}\), the weighted degree of a node, called
its strength, is defined as:
\[ s_i = \sum_{j \in V} w_{ij} \tag{11.4} \]
where V is the set of neighbors of node i. For a directed network, the in-strength and
out-strength of a node sum the weights of the links that arrive at or depart from it.
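Equation (11.4) can be sketched as follows for a directed weighted network. The weights stand for, say, seats offered per day between three hypothetical cities. Both the numbers and the function names are invented for illustration.

```python
# w[i][j] = weight of the directed link from i to j (hypothetical seat counts).
w = {
    "A": {"B": 120, "C": 80},
    "B": {"A": 100},
    "C": {"A": 90, "B": 30},
}


def out_strength(w, i):
    """Sum of the weights of links departing from node i."""
    return sum(w[i].values())


def in_strength(w, i):
    """Sum of the weights of links arriving at node i."""
    return sum(links.get(i, 0) for links in w.values())


for node in w:
    print(node, "out:", out_strength(w, node), "in:", in_strength(w, node))
```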
11.2.2 General Characteristics About the Complex Networks
A number of concepts have emerged to describe complex networks:
clustering coefficient, degree distribution, characteristic path length, degree-degree
correlation, and ideas such as preferential attachment (Lin and Ban 2013; O’Kelly
2016; Wang et al. 2011).
11.2.2.1 Clustering Coefficient
The clustering coefficient is normally used to quantify the degree of clustering of
a network. It indicates the probability that two neighbors of a node are
themselves connected, and can be defined as:
\[ C_i = \frac{E_i}{k_i (k_i - 1)/2} \tag{11.5} \]
The clustering coefficient \(C_i\) is the actual number of links \(E_i\)
between the \(k_i\) nodes in the neighborhood of node i, divided by the maximal number of
possible links \(k_i(k_i - 1)/2\) between them (Watts and Strogatz 1998). Normally the
neighborhood of node i includes all the nodes directly connected to it, excluding node i itself.
In a fully connected transportation system, \(C_i\) equals 1 for all nodes. If \(k_i = 1\), \(C_i\)
is taken to be 0.
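A small sketch of Eq. (11.5) for an undirected network follows. The four-node network and the function name are hypothetical. Following the convention stated above, nodes with fewer than two neighbors are given a coefficient of 0.

```python
from itertools import combinations


def clustering(adj, i):
    """C_i = E_i / (k_i (k_i - 1) / 2) for node i of an undirected network."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention used in the text for k_i = 1
    # E_i: number of links that actually exist between the neighbours of i
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


# Hypothetical network: triangle A-B-C plus a pendant node D attached to A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
for node in sorted(adj):
    print(node, clustering(adj, node))
```

Node A has three neighbors, of which only B and C are linked, so one of the three possible links exists.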
11.2.2.2 Degree Distribution
The degree distribution \(p(k)\) is defined as the proportion of nodes with degree k
in a transportation network with n nodes, i.e. \(n_k/n\), where \(n_k\) is the number of such
nodes. The cumulative degree distribution \(P(k)\) is then defined as:
\[ P(k) = \sum_{k'=k}^{\infty} p(k') \tag{11.6} \]
In general, the cumulative degree distribution \(P(k)\) reflects the fraction of nodes
with degrees greater than or equal to k. Moreover, the average degree of a network, denoted
as \(\tilde{k}\), reflects the average number of directly connected nodes that a node has in a
transportation network.
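The definitions of \(p(k)\), \(P(k)\) in Eq. (11.6) and the average degree can be sketched as below. The list of node degrees is a made-up example and the function names are assumptions for this illustration.

```python
from collections import Counter


def degree_distribution(degrees):
    """p(k) = n_k / n: share of nodes that have degree k."""
    n = len(degrees)
    counts = Counter(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


def cumulative_distribution(p):
    """P(k): share of nodes with degree greater than or equal to k."""
    return {k: sum(pk for kk, pk in p.items() if kk >= k) for k in p}


# Hypothetical degrees of the four nodes of a small network.
degrees = [3, 2, 2, 1]
p = degree_distribution(degrees)
P = cumulative_distribution(p)
mean_degree = sum(degrees) / len(degrees)
print(p)
print(P)
print(mean_degree)
```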
11.2.2.3 Characteristic Path Length (Average Path Length)
Average path length is an important property for the communication between nodes
in networks. It reflects the average number of links along the shortest paths for all
possible node pairs in the network and is written as:
\[ L = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij} \tag{11.7} \]
where N is the number of nodes in the network, and \(d_{ij}\) is the number of links on the
shortest path from i to j. Moreover, the inverse of the path distance between nodes can be
used to calculate the efficiency of a transportation network. The efficiency
index reflects the performance of information exchange in the network and
can be defined as:
\[ E = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}} \tag{11.8} \]
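Equations (11.7) and (11.8) can be sketched together, since both are built from the same shortest-path distances. The example below uses hop counts on a hypothetical three-node line network. It assumes a connected network: unreachable pairs are simply skipped, which matches the usual treatment of \(1/d_{ij}\) as 0 in Eq. (11.8) but would understate L in Eq. (11.7).

```python
from collections import deque


def hop_distances(adj, source):
    """Number of links on the shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_and_efficiency(adj):
    """Return (L, E): average path length and efficiency over ordered pairs."""
    n = len(adj)
    total_d = 0
    total_inv = 0.0
    for i in adj:
        for j, d in hop_distances(adj, i).items():
            if j != i:
                total_d += d
                total_inv += 1.0 / d
    pairs = n * (n - 1)
    return total_d / pairs, total_inv / pairs


# Hypothetical line network A - B - C.
line = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
L, E = path_length_and_efficiency(line)
print(L, E)
```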
11.2.2.4 Degree-Degree Correlation
The degree-degree correlation \(K(i)\) relates a node’s degree
to the average degree of its nearest neighbors in a transportation network,
\[ K(i) = \frac{1}{k_i} \sum_{v_j \in N_i} k_j \tag{11.9} \]
where \(k_i\) is the degree of node i, \(N_i\) is the set of its \(k_i\) neighbors, and \(k_j\)
is the degree of neighbor \(v_j \in N_i\). The degree correlation reflects a node’s connection
preference. If high-degree nodes tend to link with each other, this tendency is referred to
as assortativity. Otherwise, if high-degree and low-degree nodes tend to link with
each other, this tendency is referred to as disassortativity (Newman 2003). Moreover,
the average of \(K(i)\) over all nodes with degree k is defined
as:
\[ K(k) = \frac{1}{N(k)} \sum_{i:\; k_i = k} K(i) \tag{11.10} \]
where N(k)equals the number of k-degree nodes. Degree correlation refers to the
relationship between k and K(k).
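The sketch below computes the average neighbor degree of each node and then averages it over all nodes of the same degree, which is one common reading of Eqs. (11.9) and (11.10). The hub-and-spoke toy network and the function names are assumptions, and the code assumes that no node is isolated.

```python
def neighbour_degree(adj, i):
    """K(i): average degree of the neighbours of node i."""
    return sum(len(adj[j]) for j in adj[i]) / len(adj[i])


def degree_correlation(adj):
    """K(k): mean of K(i) over all nodes whose degree equals k."""
    by_degree = {}
    for i in adj:
        by_degree.setdefault(len(adj[i]), []).append(neighbour_degree(adj, i))
    return {k: sum(vals) / len(vals) for k, vals in sorted(by_degree.items())}


# Hypothetical hub-and-spoke network: hub H with spokes A, B and C.
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
K = degree_correlation(star)
print(K)
```

Here K(k) falls as k rises: low-degree spokes attach to the high-degree hub, which is the disassortative pattern described above.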
11.2.2.5 The Categories of Complex Networks
According to the aforementioned indices (average path length, clustering coefficient and
degree distribution), transportation networks can in general be categorized into the following types:
(1) Regular network
A network with a long average path length and a high clustering coefficient can be
defined as a regular network, in which each vertex is connected in the same way as
its neighboring vertices. That is, every node has the same degree, so the node degree
distribution of the network is concentrated at a single point.
(2) Random network
A network with a short average path length and a low clustering coefficient can be
defined as a random network, built from a set of n vertices by adding edges between them at
random. In this situation the node degree distribution of the network follows a binomial
distribution \(p(k) = \binom{n}{k} p^k (1-p)^{n-k}\) or approximately a Poisson distribution
\(p(k) \approx \tilde{k}^{k} e^{-\tilde{k}} / k!\), where \(\tilde{k}\) is the average degree.
(3) Small world network
A network with a short average path length and a high clustering coefficient can be
defined as a small world network. In this situation the node degree distribution of
the network follows an exponential or power-law distribution, the latter being \(p(k) \propto 1/k^{\mu}\).
In a small world network, most nodes are not neighbors of one another, yet most pairs of nodes
in transportation networks tend to be linked by no more than six hops (O’Kelly 2016).
(4) Scale free network
If the node degree distribution of a network with a short average path length and a
high clustering coefficient follows a power-law distribution at least asymptotically,
then this network can be characterized as scale-free. This indicates strong
heterogeneity in the network: a few nodes are highly connected to other nodes,
while most of the remaining nodes are very poorly connected.
11.3 Empirical Analysis for HSR and Airline Networks
11.3.1 HSR Networks
11.3.1.1 The Spatial Scale of HSR Networks
In general, research has confirmed that HSR networks share some
characteristics of a small-world network, a trend similar to that of airline
networks (Jiao et al. 2017). Most airline network research
has been conducted at the national or international scale. Compared to airline
networks, HSR network research at the national or international
scale is quite limited, because the reasonable travel distance of HSR suits
the regional scale and because a large-scale HSR network covers a huge ground area and
involves a large number of stations and routes; moreover, HSR systems were constructed
much later and have only recently been built in Europe and China. Therefore, it is very difficult to
collect high-quality research data at such a large scale, and for this reason
there are not many studies on real HSR networks (Lin and Ban 2013).
Studies on HSR infrastructure began with Hall and Pain (2006), who aimed
to identify polycentric urban regions connected by HSR networks in Europe and
found that multiple hub cities have formed in the European HSR networks.
Compared to other HSR countries, HSR network studies in China were prompted by
the country’s fast development of HSR networks. Based on time schedule data from
2015, Zhang et al. (2016) found that the national HSR network still has the characteristics
of a tree network and has the properties of scale-free and small-world
networks. Yang et al. (2018b) analysed the spatial configurations of 99 HSR cities at
the national scale in China and further compared the spatial configurations of three
regional HSR networks in the Pearl River Delta, the Yangzi River Delta and the Bohai
Rim, indicating that the Chinese HSR networks as a whole are largely polycentric,
especially in the central and eastern regions. Zhang et al. (2016) evaluated the
spatial structures of HSR networks in both Japan and China and found that the Japanese
HSR network had the best national connectivity, whereas the Chinese HSR network had
the best local connectivity and the greatest transport capacity. At the regional scale,
most studies use time schedule data and follow the complex network analysis
of functional polycentricity developed by Green (2007), who uncovered the
polycentricity of functional urban regions connected by HSR.
Luo (2010, 2011) used high-speed train flow data
to measure the polycentricity of HSR and urban networks in the Yangzi River Delta
(YRD) region, discovering a growing integration of HSR cities within the
YRD region; Nanjing and Shanghai in particular had the strongest HSR connection
and became the hub HSR cities. Feng et al. (2014) used the same approach as Luo
(2010) to measure the polycentricity of HSR networks in the Pearl River Delta (PRD)
region and found that the HSR network in the PRD is more polycentric than
that in the YRD region. Zhang et al. (2016) used HSR time schedule data to approximate
actual passenger flows and uncover the HSR network and city structure in the
YRD region in China.
11.3.1.2 The Evolution of HSR Networks
Compared to cross-sectional research on HSR networks, longitudinal
research at the relevant spatial scale is rather limited, largely due to the lack of a
publicly accessible database of when and for how long a country’s HSR services (stations
and links) have been inaugurated and in operation (Ato et al. 2018). Recently, with
the help of data mining technology, some researchers have used the time schedules of
HSR networks to try to overcome this deficiency. Based on the construction scale
of China’s HSR networks between 2007 and 2017, Ato et al. (2018) found an
increasing degree and eccentricity but a decreasing PageRank for first-tier cities
from 2007–2017 to 2018–2030, whereas in 2018–2030 the clustering coefficients
of some second-tier and third-tier cities would be greater than those of first-tier
cities.
Based on HSR time schedule data from 2003 to 2014, Jiao et al. (2017) found
that the HSR network to a large extent increased the overall connectivity, the average
centrality of the city network, and the inequality of weighted closeness centrality,
whereas it decreased the inequality of weighted degree centrality and
betweenness centrality. Meanwhile, as the HSR network grew, centrality tended
to concentrate in large cities in terms of weighted closeness centrality, but
in small cities in terms of weighted degree centrality and weighted betweenness
centrality.
11.3.2 Airline Networks
11.3.2.1 The Spatial Scale of Airline Networks
In general, given that air travel is best suited to medium- and
long-haul distances, most airline network studies are conducted at least at the national
scale.
At the national scale, many local researchers have analyzed
the structure of domestic airline networks and the related urban networks
using complex network theory. For instance, Chinese scholars such as Wang et al.
(2011) and Lin (2012) have identified the importance of air cities and the structure of
airline networks. Bagler (2008) found that Indian airline networks have a strong hierarchical
structure. Hossain and Alam (2017) applied the complex network approach to the
case of Australia to uncover the spatial structure of its airport network. Guida and
Maria (2007) analyzed the structure of the Italian airport network for the period
2005–2006, using data derived from the OAG database. The authors found
that the network is a scale-free, small-world network with a fractal structure.
At the global scale, Guimerà et al. (2004, 2005) conducted a structural analysis
of the worldwide air transportation network, characterizing the number
of non-stop connections from a given city and the number of shortest paths going
through a given city. Lordan et al.
(2014) identified the topology and robustness of complex airline networks in
Europe. O’Kelly (2016) also identified the global pattern of airline networks by using
betweenness centrality and found significant differences between different airline
networks. Dai et al. (2018) also used the complex network approach to examine the
Southeast Asian air transportation networks.
11.3.2.2 The Evolution of Airline Networks
In addition to cross-sectional research, researchers have also used longitudinal
data sets to understand the evolution of air transportation networks at the relevant
spatial scales. Wandelt and Sun (2015) conducted
a temporal analysis of the evolution of airline networks at the international scale,
indicating that international airline networks, which have stable scale-free
and small-world characteristics, are increasingly developing towards symmetry and
transitive closure because of
increased airline connections between neighboring countries. Moreover, in Europe,
Burghouwt and Hakfoort (2001) identified the formation of hub-and-spoke structures
in the evolution of the aviation network between 1990 and 1998. Dai et al. (2018)
recently investigated the evolving structure of the Southeast Asian air transportation
networks and found that the network combines a relatively stable topological
structure with a changing multilayer geographical structure.
Compared to the international scale, there is a large body of temporal complex network
research on national airline networks. At the national scale, most research
has been conducted in the context of China and the US, both of which cover a large
spatial scale. Zhang
et al. (2010) first investigated the evolution of the Chinese airport network from 1950
to 2008, indicating that the network is dynamic and that air traffic has grown
exponentially with seasonal fluctuations, although the
topology of the Chinese airport network has changed little. Wang et al. (2014) further examined the
evolution of the Chinese air transportation network from 1930 to 2012, indicating
that airline networks experienced a significant improvement in connectivity and a
gradual expansion of core cities.
In the US, Jia et al. (2014) examined the evolution of the US airline network from
1990 to 2010, finding that the network of stable cities and new air cities
has scale-free, small-world and disassortative mixing
properties. Based on the same data set, Lin and Ban (2014) studied the evolution of the
topological and spatial structure of the US airline network. They found that the US
airline network is stable, although it has become less efficient as the
network and flight distances have grown, and this inefficiency worsens over time.
Some researchers have also examined national airline networks in countries
other than the US and China. Jimenez et al. (2012) identified a less
concentrated network in the evolution of the Portuguese airport network between
2001 and 2010 and showed that low-cost carriers exerted a significant influence on
the evolution of the national airline network. Papatheodorou and Arvanitis (2009)
explored the evolution of the airport network in Greece from 1978 to 2006, indicating
that the Greek airline network has a high spatial concentration and asymmetry
because of a shortage of traffic supply by low-cost carriers. Rocha (2009) analyzed
the evolution of the Brazilian airport network between 1995 and 2006 and found that
the Brazilian airline network experienced a decreasing number of airports, a slightly
decreasing average shortest path length, and a shrinking number of airline routes, even
though the number of passengers more than doubled over the period.
11.4 The Differences Between Weighted and Non-weighted
High-Speed Transportation Networks
It is common to consider supply-related data (typically the number of seats offered
between two cities, or sometimes train frequencies or seat-km). The rationale for
supply-side data is that it illustrates carriers’ strategies, which are expected
to map networks according to existing and potential interactions between the places
served. However, such data have limitations. First, supply is by definition larger than or
equal to demand, so at best
it can be considered a proxy for actual flows of people (Neal 2014). Second, supply
or demand data are usually given for the individual legs of trips rather than for the trip as
a whole. For instance, if air or rail passengers travel from A to B, where they connect
to C, the usual figures count the number of (airline or HSR) seats or passengers
between A and B and between B and C, but not between A and C via B. As a result,
transfers distort the picture of actual intercity relationships (Derudder et al. 2010;
Derudder and Witlox 2005, 2008). As far as air travel is concerned, some researchers
have addressed this issue by using the so-called MIDT dataset, which is based on the
actual origins and destinations that air travellers flew from and to (Derudder et al. 2007).
However, this information is based on bookings made through global distribution
systems (GDS), which means that travellers who book directly on airlines’ websites
are not included. This could arguably lead to biases, for instance an underestimation
of people flying with low-cost airlines.1
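The difference between leg-based counts and true origin-destination (OD) counts described above can be sketched as follows. The itineraries and passenger numbers are invented for illustration and do not come from MIDT or any other dataset.

```python
from collections import Counter

# Hypothetical itineraries: (sequence of stops, number of passengers).
itineraries = [
    (("A", "B", "C"), 100),  # A to C with a connection at B
    (("A", "B"), 50),        # direct trip from A to B
]

leg_counts = Counter()  # what leg-based supply or demand figures record
od_counts = Counter()   # what true origin-destination data would record

for stops, passengers in itineraries:
    # every consecutive pair of stops is one leg of the trip
    for origin, destination in zip(stops, stops[1:]):
        leg_counts[(origin, destination)] += passengers
    # the true trip runs from the first stop to the last
    od_counts[(stops[0], stops[-1])] += passengers

print(dict(leg_counts))
print(dict(od_counts))
```

In this toy case the leg data show no A to C relationship at all and overstate the A to B and B to C links, which is the distortion caused by transfers.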
Finally, HSR timetables are difficult to convert into numbers of seats for two
reasons. First, many HSR routes are served by heterogeneous rolling stock (e.g.,
shorter vs. longer trains or single- vs. double-deck trains). This means that if a train
operator pursues a high-frequency strategy (that is, frequent
services but likely with less capacity per train), the interactions between cities
inferred from HSR frequency will be biased. Second, even if the number of HSR
seats is available (e.g., from the train operator or thanks to homogeneous
rolling stock), one still needs to consider that most high-speed trains call at several
intermediate stations. This creates uncertainty in the allocation of seats between
different pairs of cities. For instance, if a Beijing (A) to Shanghai (D) HSR service
calls at Jinan (B) and Nanjing (C), then seats are potentially sold for the A-B, A-C, A-D,
B-C, B-D and C-D city pairs. Either the train operator pre-allocates seats to all pairs,
or actual bookings change the split in real time. In both cases, this
information is usually not available to researchers. It is thus not surprising that Yang
et al. (2018a) found that scheduled train flows can underestimate the
positions of major cities in the transportation networks to a large extent, especially
in China, where trains running to and from these major cities have a
larger-than-average capacity to satisfy passenger demand.
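The seat-allocation problem in the Beijing to Shanghai example can be sketched as below. The stop sequence follows the example in the text, while the bookings per city pair and the function name are hypothetical.

```python
from itertools import combinations

# Stops of the example service, in running order.
stops = ["Beijing", "Jinan", "Nanjing", "Shanghai"]

# Every ordered-by-route pair of stops is a city pair that seats can be sold for.
city_pairs = list(combinations(stops, 2))
print(len(city_pairs), city_pairs)


def segment_loads(stops, bookings):
    """Seats occupied on each consecutive segment, given bookings per city pair."""
    index = {stop: position for position, stop in enumerate(stops)}
    loads = [0] * (len(stops) - 1)
    for (origin, destination), seats in bookings.items():
        # a booking occupies a seat on every segment between its two stops
        for segment in range(index[origin], index[destination]):
            loads[segment] += seats
    return loads


# Hypothetical bookings (not real data) for some of the six city pairs.
bookings = {
    ("Beijing", "Shanghai"): 300,
    ("Beijing", "Jinan"): 100,
    ("Jinan", "Nanjing"): 80,
    ("Nanjing", "Shanghai"): 120,
}
print(segment_loads(stops, bookings))
```

Different splits across the six city pairs can produce similar segment loads, so the number of seats on the train alone does not reveal how many people travel between any two cities.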
As a result of all these limitations, there is a strong rationale for investigating
both HSR and airline networks (1) through demand-related data, which (2) are based
on true origins and destinations (Neal 2014). Of course, such data are usually not
fully available (or even not available at all) for scholars. Commercial privacy and
confidentiality dominate academic purposes, even in the strictly controlled railway
sector in China (Liu et al. 2015). However, spatial social science and humanities
research is shifting from a data-scarce to a data-rich environment. With
the emergence of big data (especially big spatiotemporal data) in recent years, many
changes will take place in the future. For instance, mobile devices have created a
platform on which every aspect of life can be digitally imprinted (Arribas-Bel 2014).
1 In Europe, for instance, low-cost airlines have long stayed out of GDS to avoid extra costs.
The inclusion of global positioning system (GPS) receivers in mobile devices enables these
digital traces to incorporate the geographical coordinates of the location where the
event occurs, producing massive data that are highly detailed in space and time. Although
space-time data are accumulating, the rich details of spatiotemporal dynamics remain
largely unexplored in computational modelling, especially among urban and regional
researchers, because of binding constraints such as the computational intensity of
processing very large georeferenced dynamic databases. It is noteworthy that the first
explorations of the potential of big data came from computer science, in particular
“urban computing” (Ratti et al. 2010; Zheng 2014). To identify true origins and
destinations for both HSR and airline networks, computational methods are crucial for
handling high volumes of data. On the one hand, target groups must be identified within
massive amounts of irrelevant data. On the other hand, calculating transfer points is
itself computationally demanding. Thus, combining expertise from computer science and
transportation network analysis will enable researchers to conduct demand-related
analysis by tracking the mobility of passengers.
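As an illustration of what identifying true origins and destinations might involve, the following sketch collapses one traveller's consecutive legs into journeys and records the transfer points. It is not a method taken from the literature reviewed here: the function name `true_od_pairs`, the leg format, and the four-hour layover threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def true_od_pairs(legs, max_layover=timedelta(hours=4)):
    """Collapse one traveller's legs into (origin, destination, transfers).

    Each leg is (origin, destination, departure, arrival). Two consecutive
    legs are merged into one journey when the second leg starts where the
    first ended and departs within `max_layover` of the first leg's arrival.
    The threshold is a hypothetical choice, not an empirical value.
    """
    journeys = []
    for origin, dest, dep, arr in sorted(legs, key=lambda leg: leg[2]):
        if journeys:
            j_origin, j_dest, j_arr, transfers = journeys[-1]
            if j_dest == origin and timedelta(0) <= dep - j_arr <= max_layover:
                # Same traveller continues onward: extend the current journey.
                journeys[-1] = (j_origin, dest, arr, transfers + [origin])
                continue
        journeys.append((origin, dest, arr, []))
    return [(o, d, t) for o, d, _, t in journeys]


legs = [
    ("Xi'an", "Beijing", datetime(2019, 5, 1, 8, 0), datetime(2019, 5, 1, 12, 30)),
    ("Beijing", "Shanghai", datetime(2019, 5, 1, 14, 0), datetime(2019, 5, 1, 18, 30)),
    ("Shanghai", "Hangzhou", datetime(2019, 5, 2, 9, 0), datetime(2019, 5, 2, 10, 0)),
]
journeys = true_od_pairs(legs)
print(journeys)
```

In this toy record the first two legs form a single Xi'an to Shanghai journey with a transfer in Beijing, whereas supply-side data would count them as two separate city-pair flows.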
11.5 Conclusion
HSR and airline networks have developed at unprecedented speed to satisfy
people's increasing demand for travel for business transactions, tourism,
post-migration visits to maintain social links with friends and relatives, academic
collaboration and political activities in the contemporary world. Meanwhile,
complex network theory has been widely used in recent decades in the natural and
social sciences, and airline and HSR networks are central subjects of
complex network analysis.
This paper reviewed the methodologies that complex network theory uses to quantify
air and HSR systems in terms of centrality measures and the general characteristics
of complex networks. The spatial and temporal configuration of HSR
and airline networks can be investigated in depth from the perspective of network
research. Drawing on a large number of empirical studies of HSR and airline
networks, the review shows that complex network theory reveals many more
similarities than expected between high-speed rail and airline networks, both of which
exhibit scale-free and small-world characteristics. Moreover, introducing weights
can yield a different configuration of a transportation network than its unweighted
topology suggests. Researchers need to bear in mind the differences between
supply-side and demand-side data sets when applying complex network analysis
to HSR and airline networks. Regional clustering and global measures were also
discussed at a larger scale, and the last part of the review was devoted to
cellular structure. However, complex network theory describes only the configuration
of the networks, not the causal relationships behind their evolution.
It is therefore necessary for researchers to explore not only the underlying topological
structure of transportation systems but also the determinants of its formation, which is much
more challenging.
Moreover, with the availability of GPS and smartphones, big data
are emerging that could link complex network
theory more closely to the tracking of passenger mobility in transportation
network analysis. Emerging big spatiotemporal socioeconomic data would facilitate
the modelling of individuals’ economic behaviour in space and time, and the outcomes
of such models can reveal information about economic trends across spatial scales.
Dynamic and predictive research on transportation networks driven by these
data can better reflect future urban development
trends, avoid excessive construction in low-demand areas, and achieve a rational
allocation of resources, especially as cities become increasingly dependent on each
other. At present, however, the preliminary stage of urban planning mainly studies the
economic and traffic relationships between the target city and surrounding cities rather
than the relationships between the target city and all cities, which reflect the actual interaction.
Complex network analysis of HSR and airline networks helps to improve planning
strategies in real time and has practical significance for determining urban
function and population size. A powerful analytical framework for
identifying space-time research gaps and frontiers is fundamental to the comparative
study of spatiotemporal phenomena, and more effort should be made to take into
account the practical application of HSR and air networks at different spatial
and temporal scales.
Acknowledgements We would like to thank the editors and reviewers for their valuable comments
and suggestions. This research is sponsored by the Shanghai Pujiang Program (2019PJC034). We
would also like to acknowledge financial support from the Shanghai Philosophy and Social Science Project
(2019ECK011).
References
Arribas-Bel, D. (2014). Accidental, open and everywhere: Emerging data sources for the
understanding of cities. Applied Geography, 49, 45–53. https://doi.org/10.1016/j.apgeog.2013.
09.012.
Ato, W., Zhou, J., Yang, L., & Li, L. (2018). The implications of high-speed rail for Chinese cities:
Connectivity and accessibility. Transportation Research Part A, 116, 308–326. https://doi.org/
10.1016/j.tra.2018.06.023.
Bagler, G. (2008). Analysis of the airport network of India as a complex weighted network. Physica
A: Statistical Mechanics and its Applications, 387, 2972–2980. https://doi.org/10.1016/J.PHYSA.
2008.01.077.
Banister, D., & Givoni, M. (2013). High-Speed Rail in the EU27: Trends, Time, Accessibility and
Principles. Built Environment, 39, 324–338.
Barthélemy, M. (2011). Spatial networks. Physics Reports, 499, 1–101. https://doi.org/10.1016/j.physrep.2010.11.002.
Barthélemy, M. (2004). Betweenness centrality in large complex networks. European Physical
Journal B, 38, 163–168. https://doi.org/10.1140/epjb/e2004-00111-4.
Bertolini, L., & Dijst, M. (2003). Mobility environments and network cities. Journal of Urban
Design, 8, 27–43. https://doi.org/10.1080/1357480032000064755.
Burghouwt, G., & Hakfoort, J. (2001). The evolution of the European aviation network, 1990–1998.
Journal of Air Transport Management, 7, 311–318. https://doi.org/10.1016/S0969-6997(01)000
24-2.
Choi, J. H., Barnett, G. A., & Chon, B.-S. (2006). Comparing world city networks: a network
analysis of Internet backbone and air transport intercity linkages. Global Networks, 6, 81–99.
https://doi.org/10.1111/j.1471-0374.2006.00134.x.
Dai, L., Derudder, B., & Liu, X. (2018). The evolving structure of the Southeast Asian air transport
network through the lens of complex networks, 1979–2012. Journal of Transport Geography, 68,
67–77. https://doi.org/10.1016/j.jtrangeo.2018.02.010.
Derudder, B., Devriendt, L., & Witlox, F. (2007). Flying where you don’t want to go: An empirical
analysis of hubs in the global airline network. Tijdschrift voor economische en sociale geografie,
98, 307–324. https://doi.org/10.1111/j.1467-9663.2007.00399.x.
Derudder, B., & Taylor, P. (2005). The cliquishness of world cities. Global Networks, 5, 71–91.
https://doi.org/10.1111/j.1471-0374.2005.00108.x.
Derudder, B., Timberlake, M., & Witlox, F. (2010). Introduction: Mapping changes in urban systems.
Urban Studies, 47, 1835–1841. https://doi.org/10.1177/0042098010373504.
Derudder, B., & Witlox, F. (2008). Mapping world city networks through airline flows: Context,
relevance, and problems. Journal of Transport Geography, 16, 305–312. https://doi.org/10.1016/
j.jtrangeo.2007.12.005.
Derudder, B., & Witlox, F. (2005). An appraisal of the use of airline data in assessing the world
city network: A research note on data. Urban Studies, 42, 2371–2388. https://doi.org/10.1080/
00420980500379503.
Derudder, B., Witlox, F., & Taylor, P. J. (2013). U.S. Cities in the World City Network: Comparing
their positions using global origins and destinations of airline passengers. Urban Geography, 28,
74–91. https://doi.org/10.2747/0272-3638.28.1.74.
Devriendt, L., Derudder, B., & Witlox, F. (2010). Conceptualizing digital and physical connectivity:
The position of European cities in Internet backbone and air traffic flows. Telecommunication
Policy, 34, 417–429. https://doi.org/10.1016/j.telpol.2010.05.009.
Feng, C., Xie, D., Ma, X., & Cai, L. (2014). Functional Polycentricity of the urban region in the
Zhujiang river delta based on intercity rail traffic flow. Scientia Geographica Sinica, 34, 648–655.
Freeman, L. C. (1978). Centrality in social networks conceptual clarification. Social Networks.
https://doi.org/10.1016/0378-8733(78)90021-7.
Freeman, L. C. (1977). A set of measures of centrality based on betweenness. Sociometry.
https://doi.org/10.2307/3033543.
Givoni, M. (2006). Development and impact of the modern high-speed train: A review. Transport
Reviews, 26, 593–611. https://doi.org/10.1080/01441640600589319.
Green, N. (2007). Functional Polycentricity: A formal definition in terms of social network analysis.
Urban Studies, 44, 2077–2103. https://doi.org/10.1080/00420980701518941.
Guida, M., & Maria, F. (2007). Topology of the Italian airport network: A scale-free small-world
network with a fractal structure? Chaos, Solitons & Fractals, 31, 527–536. https://doi.org/10.
1016/J.CHAOS.2006.02.007.
Guimerà, R., & Amaral, L. A. N. (2004). Modeling the world-wide airport network. European
Physical Journal B: Condensed Matter and Complex Systems, 38, 381–385. https://doi.org/10.
1140/epjb/e2004-00131-0.
Guimerà, R., Mossa, S., Turtschi, A., & Amaral, L. A. N. (2005). The worldwide air transportation
network: Anomalous centrality, community structure, and cities’ global roles. Proceedings of
National Academy of Sciences, 102, 7794–7799.
Guo, L., & Cai, X. (2008). Degree and weighted properties of the directed China Railway Network.
International Journal of Modern Physics C, 19, 1909–1918. https://doi.org/10.1142/S01291831
0801331X.
Hall, P. (2013). High Speed Two: The Great Divide. Built Environment, 39, 339–354.
Hall, P., & Pain, K. (2006). The Polycentric Metropolis: Learning from Mega-City Regions in
Europe. Routledge.
Hossain, M. M., & Alam, S. (2017). A complex network approach towards modeling and analysis
of the Australian Airport Network. Journal of Air Transport Management, 60, 1–9. https://doi.
org/10.1016/J.JAIRTRAMAN.2016.12.008.
IATA. (2017). Aviation Benefits Beyond Borders, International Air Transport Association.
Jia, T., Qin, K., & Shan, J. (2014). An exploratory analysis on the evolution of the US airport
network. Physica A: Statistical Mechanics and its Applications, 413, 266–279. https://doi.org/
10.1016/J.PHYSA.2014.06.067.
Jiang, B. (2007). A topological pattern of urban street networks: Universality and peculiarity.
Physica A: Statistical Mechanics and its Applications, 384. https://doi.org/10.1016/j.physa.2007.05.064.
Jiang, B., Duan, Y., Lu, F., Yang, T., & Zhao, J. (2014). Topological structure of urban street
networks from the perspective of degree correlations. Environment and Planning B: Planning
and Design, 41, 813–828. https://doi.org/10.1068/b39110.
Jiao, J., Wang, J., & Jin, F. (2017). Impacts of high-speed rail lines on the city network in China.
Journal of Transport Geography, 60, 1–17. https://doi.org/10.1016/j.jtrangeo.2017.03.010.
Jimenez, E., Claro, J., & Pinho de Sousa, J. (2012). Spatial and commercial evolution of aviation
networks: a case study in mainland Portugal. Journal of Transport Geography, 24, 383–395.
https://doi.org/10.1016/J.JTRANGEO.2012.04.011.
Li, W., & Cai, X. (2007). Empirical analysis of a scale-free railway network in China. Physica A:
Statistical Mechanics and its Applications, 382, 693–703. https://doi.org/10.1016/j.physa.2007.04.031.
Lin, J. (2012). Network analysis of China’s aviation system, statistical and spatial structure. Journal
of Transport Geography, 22, 109–117. https://doi.org/10.1016/j.jtrangeo.2011.12.002.
Lin, J., & Ban, Y. (2014). The evolving network structure of US airline system during 1990–2010.
Physica A: Statistical Mechanics and its Applications, 410, 302–312. https://doi.org/10.1016/J.
PHYSA.2014.05.040.
Lin, J., & Ban, Y. (2013). Complex Network Topology of Transportation Systems. Transport
Reviews, 33, 658–685. https://doi.org/10.1080/01441647.2013.848955.
Liu, X., Song, Y., Wu, K., Wang, J., Li, D., & Long, Y. (2015). Understanding urban China with
open data. Cities, 47, 53–61. https://doi.org/10.1016/j.cities.2015.03.006.
Lordan, O., Sallan, J. M., & Simo, P. (2014). Study of the topology and robustness of airline
route networks from the complex network approach: A survey and research agenda. Journal of
Transport Geography, 37, 112–120. https://doi.org/10.1016/j.jtrangeo.2014.04.015.
Luo, Z. (2010). Study on the Functional Polycentricity of Yangtze River Delta. Urban Planning
International, 25, 60–65.
Luo, Z., He, H., & Geng, L. (2011). Analysis of the polycentric structure of Yangtze River Delta
based on passenger traffic flow. Urban Planning Forum, 2, 16–23 (in Chinese).
Neal, Z. (2014). The devil is in the details: Differences in air traffic networks by scale, species, and
season. Social Networks, 38, 63–73. https://doi.org/10.1016/j.socnet.2014.03.003.
Newman, M. E. J. (2003). The Structure and Function of Complex Networks. SIAM Review, 45,
167–256. https://doi.org/10.1137/s003614450342480.
O’Kelly, M. E. (2016). Global airline networks: Comparative nodal access measures. Spatial
Economic Analysis, 11, 253–275. https://doi.org/10.1080/17421772.2016.1177262.
Papatheodorou, A., & Arvanitis, P. (2009). Spatial evolution of airport traffic and air transport
liberalisation: The case of Greece. Journal of Transport Geography, 17, 402–412. https://doi.org/
10.1016/J.JTRANGEO.2008.08.004.
Ratti, C., Sobolevsky, S., Calabrese, F., Andris, C., Reades, J., Martino, M., Claxton, R., & Strogatz,
S. H. (2010). Redrawing the map of Great Britain from a network of human interactions. PLoS
One, 5. https://doi.org/10.1371/journal.pone.0014248.
Rocha, L. E. C. (2009). Structural evolution of the Brazilian airport network. Journal of Statistical
Mechanics: Theory and Experiment, 2009, P04020. https://doi.org/10.1088/1742-5468/2009/04/
P04020.
von Ferber, C., Holovatch, T., Holovatch, Y., & Palchykov, V. (2009). Public transport networks:
Empirical analysis and modeling. The European Physical Journal B, 68, 261–275. https://doi.
org/10.1140/epjb/e2009-00090-x.
von Ferber, C., Holovatch, T., Holovatch, Y., & Palchykov, V. (2007). Network harness: Metropolis
public transport. Physica A: Statistical Mechanics and its Applications, 380, 585–591. https://
doi.org/10.1016/j.physa.2007.02.101.
Wandelt, S., & Sun, X. (2015). Evolution of the international air transportation country network
from 2002 to 2013. Transportation Research Part E: Logistics and Transportation Review, 82,
55–78. https://doi.org/10.1016/j.tre.2015.08.002.
Wang, J., Mo, H., & Wang, F. (2014). Evolution of air transport network of China 1930–2012.
Journal of Transport Geography, 40, 145–158. https://doi.org/10.1016/j.jtrangeo.2014.02.002.
Wang, J., Mo, H., Wang, F., & Jin, F. (2011). Exploring the network structure and nodal centrality
of China’s air transport network: A complex network approach. Journal of Transport Geography,
19, 712–721. https://doi.org/10.1016/j.jtrangeo.2010.08.012.
Wang, Z., Ye, X., Lee, J., Chang, X., Liu, H., & Li, Q. (2018). A spatial econometric modeling
of online social interactions using microblogs. Computers, Environment and Urban Systems, 70,
53–58. https://doi.org/10.1016/j.compenvurbsys.2018.02.001.
Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of “small-world” networks. Nature, 393,
440–442. https://doi.org/10.1038/30918.
Yang, H., Dijst, M., Witte, P., van Ginkel, H., & Wang, J. (2018a). Comparing
passenger flow and time schedule data to analyse High-Speed Railways and urban networks in
China. Urban Studies, 1–21. https://doi.org/10.1177/0042098018761498.
Yang, H., Dijst, M., Witte, P., Van Ginkel, H., & Yang, W. (2018b). The Spatial Structure of High
Speed Railways and Urban Networks in China: A Flow Approach. Tijdschrift voor economische
en sociale geografie, 108, 109–128. https://doi.org/10.1111/tesg.12269.
Zanin, M., & Lillo, F. (2013). Modelling the air transport with complex networks: A short review.
European Physical Journal-Special Topics, 215, 5–21. https://doi.org/10.1140/epjst/e2013-017
11-9.
Zhang, J., Cao, X.-B., Du, W.-B., & Cai, K.-Q. (2010). Evolution of Chinese airport network.
Physica A: Statistical Mechanics and its Applications, 389, 3922–3931. https://doi.org/10.1016/
J.PHYSA.2010.05.042.
Zhang, J., Hu, F., Wang, S., Dai, Y., & Wang, Y. (2016a). Structural vulnerability and intervention
of high speed railway networks. Physica A: Statistical Mechanics and its Applications, 462,
743–751. https://doi.org/10.1016/j.physa.2016.06.132.
Zhang, L., Qin, Y., & Wang, L. (2016b). Statistical analysis of weighted complex network in Chinese
high-speed railway. Journal Railway Engineering Science, 13, 201–209. (in Chinese)
Zheng, Y. (2014). Urban Computing: Concepts, Methodologies, and Applications. ACM
Transactions on Intelligent Systems and Technology, 5, 1–55. https://doi.org/10.1145/2629592.
Chapter 12
Economic Impact Analysis for an Energy
Efficient Home Improvement Program
Qisheng Pan
12.1 Introduction
Cost-effective energy efficiency technologies have become available for households
in both developed and developing countries. In the U.S., improvements in energy
efficiency have kept per-capita residential energy use relatively constant since the
1970s even though demand for energy has increased significantly (Crandall-Hollick and
Sherlock 2018). Many federal, state, and local agencies offer financial incentives and
financing programs to homeowners for energy-efficient purchases and improvements
in the form of tax credits or rebates, energy savings, and energy-efficient financing,
which significantly offset the costs of implementing an energy efficiency improvement
plan or installing a renewable energy system. The owners of new and existing
properties are able to apply for energy-efficient mortgages through government-insured
programs or conventional loan programs.
A well-known nation-wide energy efficiency financing program in the US is called
Property Assessed Clean Energy (PACE), which provides financial assistance to
residential, commercial and industrial property owners for energy-efficient, water-
efficient, and renewable energy products and installation services. The purpose of
PACE programs is to reduce greenhouse gas (GHG) emissions, increase energy
independence, and stimulate the local economy. The concept of PACE was initially
proposed by the Association of Monterey Bay Area Governments (AMBAG) in
2006 as a region-wide Special Energy Financing District (EFD) that provides on-tax-bill
financing of energy efficiency and renewable energy projects (Kammerer 2006).
It was also proposed by the City of Berkeley, California in 2007 as an EFD and
implemented in the city with grants from the U.S. Environmental Protection Agency
Q. Pan (✉)
Department of Public Affairs and Planning, University of Texas at Arlington, Arlington, TX
76019, USA
e-mail: qisheng.pan@uta.edu
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_12
(EPA) and the Bay Area Air Quality Management District (BAAQMD) (Kammen
et al. 2009). Since then, it has received growing attention as a feasible mechanism for
financing clean energy and energy efficiency projects on residential or commercial properties,
such as the installation of air sealing and ventilation, insulation, space heating and
cooling, water heating, solar photovoltaic, and solar thermal systems.
As a national initiative, PACE programs are formed locally and adapted to meet
community needs. They are usually established and implemented by local
municipalities authorized by state legislation. The key features of PACE programs
include voluntary participation of all parties involved, coverage of all costs,
long-term financial assistance of up to 20 years, possible combination with other federal
and local energy efficiency incentives, permanent affixation to the property, and
assessment as a lien on the property. Although Fannie Mae, Freddie Mac, the Federal
Housing Finance Agency (FHFA), and other financial regulators suggested that
PACE programs violated standard mortgage provisions because PACE assessments
were actually loans rather than assessments, the growth of PACE programs has not been
halted. Today PACE programs are operating in 20 states
plus Washington, D.C. Most of them are commercial programs, while
residential programs are available in three states: California, Florida, and
Missouri.1
The most widely adopted residential PACE program in the U.S. is Renovate
America’s Home Energy Retrofit Opportunity (HERO) Program. Like other
PACE programs, the HERO program provides homeowners with financing that is
repaid through annual property tax payments. The program was initially developed in
2010 for communities in Western Riverside County and is now available
to over 85% of the residents of the state of California. It has been used by
more than 110,000 households in over 500 communities. By July 2014, it was estimated
to have created 2,400 new jobs in the construction sector and funded $250 million in
projects on 12,500 homes in California.2 However, it is unclear how these jobs
and outputs were estimated and where in the region they are actually located.
This research aims to fill this gap in the literature by investigating the economic
impacts of an energy efficient home improvement program on the regional economy.
It selects Renovate America’s HERO program as an empirical case to explore its
economic impacts in Southern California. It addresses two research questions:
what are the total impacts of the energy efficient home improvement program,
measured in jobs or dollar values, and how are these impacts distributed in the
region?
The rest of the paper is organized as follows. Part 2 reviews the relevant literature;
Part 3 describes the methodology; Part 4 explains the empirical study; and Part 5
discusses findings and draws conclusions.
1 See http://pacenation.us/pace-programs/.
2 See https://www.marketwatch.com/press-release/24-additional-california-jurisdictions-launch-hero-program-to-help-homeowners-conserve-water-and-energy-2014-07-09.
12.2 Literature Review
Stakeholders and decision makers are interested in the effect of energy efficient home
improvement programs on their local economy. However, there are few academic
studies addressing this issue.
As senior economists at ECONorthwest, the largest economic consulting
firm in the Pacific Northwest, Pozdena and Josephson (2011) conducted a study to
assess the economic impacts of PACE programs for PACENow, a national
non-profit foundation advocating for PACE financing. They examined the local and
national impacts of a hypothetical purchase activity with the same economic
composition as PACE projects in four cities: Santa Barbara in California, San
Antonio in Texas, Columbus in Ohio, and Long Island in New York. Each city was
assigned $1 million in spending on PACE projects. The IMPLAN input-output model and
data were employed in their study. The measured impacts of the hypothetical purchases
included the direct, indirect, and induced impacts of spending on the energy
efficiency improvements of the PACE program, the possible increase
in household spending on non-energy goods and services due to the
reduction of utility costs, and the economic effects of the tax-revenue changes due
to the spending on energy efficiency improvements and household expenditures. The
impacts in each city were listed by type at the federal, state, and local levels. However,
the spatial distribution of impacts in small areas of the city was not discussed or
presented.
Goldberg et al. (2011) examined the employment and other economic impacts
of the ClimateSmart Loan Program (CSLP), which is considered the first test of
the effects of a PACE financing program at a multi-jurisdictional level, involving both
the municipalities and the county government in Boulder County, Colorado. They
employed an input-output model to explore how homeowners’ spending on
energy efficiency or renewable energy products, such as attic insulation or solar panels,
stimulates local job creation among vendors, contractors, suppliers, and
manufacturers. They found that the newly created jobs, earnings, and outputs of the
CSLP were more than enough to justify the county’s investment in the
program. They also conducted a qualitative assessment by interviewing contractors
and program participants to learn about influences and outcomes that are usually
missing from quantitative analysis.
Kieper and Bicknell (2018) of the Cadmus company estimated the net
economic development impacts of the Focus on Energy program, the statewide
energy efficiency and renewable energy program in Wisconsin. They employed
the Policy Insight+ (PI+) model from Regional Economic Models, Inc. (REMI),
a dynamic economic forecasting model that combines input-output,
general equilibrium, econometric, and economic geography components. They measured both
the employment and economic benefit impacts in 2015–2016. They found that the Focus
on Energy program has positive net impacts on the state economy and increases
in-state spending.
In addition to the energy efficiency programs in the US, a few
studies have examined similar programs in other countries. In Ireland, Scheer and
Motherway (2011) employed cost-benefit analysis (CBA) to evaluate the Home
Energy Saving (HES) scheme for residential energy efficiency upgrades and to
explore the effects of the Small and Medium Enterprise (SME) program, both of which
aim to achieve energy efficiency improvements and emissions reductions through
Sustainable Energy Authority of Ireland (SEAI) programs. They found that both the
HES scheme and the SME program achieved a positive net present value (NPV)
by saving more than they cost over the program duration.
Washan et al. (2014) utilized an analytical model called the Cambridge Econometrics
model to examine the UK economy and energy system. They constructed a baseline
scenario using data from the Office for National Statistics (ONS) in 2012 and
compared it against an alternative investment policy scenario. They found that every
pound invested by the government in energy efficiency improvements would generate
a three-fold return in GDP, that government investment in funding and incentives
for energy efficiency measures could be self-financing, and that energy efficiency
investment would generate a significant net increase in employment, mostly in the
services and construction sectors.
Dunsky Energy Consulting (2018), a leading clean energy advisory firm in North America,
modeled the combined net macroeconomic effects of the improvements to
energy efficiency under the Canadian government’s Pan-Canadian Framework on
Clean Growth and Climate Change (“PCF”). Their models assessed both positive
and negative effects of the PCF by examining the increased demand for
efficiency-related goods and services, the redistribution of savings, and the reduced energy
sales. They found that improvements to energy efficiency can lead to significant cost
savings and that investing in PCF programs can bring significant net benefits to the
Canadian economy in terms of increased jobs and GDP. The net changes in GDP
and employment from energy efficiency investments were reported by province and
by industry sector.
Most relevant studies in the literature focused on the economic impacts of energy
efficiency programs at city, state or province, and federal levels. Few of them exam-
ined the impacts in small areas within a city or county. For instance, the impacts of
the PACE programs reported by Pozdena and Josephson (2011) were listed by city
and by type at federal, state, and local level. However, it is crucial for planners to
learn the spatial distribution of those impacts in small areas. Pan et al. (2009) esti-
mated economic impacts of special events in the small geographic zones of Southern
California. But the empirical cases of their studies were terrorist attacks or natural
disasters rather than energy efficiency.
This research focuses on the impacts of energy efficient improvement programs,
which can be examined with an economic input-output model, such as the IMPLAN
input-output model employed by Pozdena and Josephson (2011) for the PACE projects in
four US cities or REMI’s Policy Insight+ model for Wisconsin’s Focus on Energy
program. However, traditional input-output models ignore the spatial distribution
of impacts in small areas. To fill this gap, this study extends the model
developed by Pan et al. (2009) to examine the economic impacts of energy efficient
improvement programs in the small geographic areas of a large metropolitan region.
12.3 Methodology
Most economic impact studies employ an input-output model to estimate the economic
effects of a plan, project or policy. Input-output modeling for inter-industry
analysis originated in the work of Leontief (1928). Leontief’s input-output
model divides an economic system into a number of industries, or
industrial sectors. Each sector takes inputs from itself and other industrial sectors to
produce a product that also serves as an input to itself and to the others.
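The inter-industry logic can be made concrete with a small numerical sketch. It uses the standard Leontief relation x = Ax + f, which the text above does not write out, and a three-sector coefficient matrix whose values are invented purely for illustration.

```python
import numpy as np

# Hypothetical technical-coefficient matrix for three sectors:
# A[i, j] is the input required from sector i per dollar of sector j's output.
A = np.array([
    [0.10, 0.20, 0.05],
    [0.15, 0.05, 0.10],
    [0.05, 0.10, 0.15],
])

# Hypothetical final demand: $100 spent on sector 0 only.
f = np.array([100.0, 0.0, 0.0])

# Total output x must cover intermediate use A @ x plus final demand f,
# so x solves (I - A) x = f.
x = np.linalg.solve(np.eye(3) - A, f)

print("total output by sector:", np.round(x, 2))
print("output multiplier:", round(x.sum() / f.sum(), 3))
```

Because each sector buys from the others, total output exceeds the initial $100 of final demand, and sectors that receive no final demand still produce output.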
This research focuses on the impacts of energy efficient improvement programs,
which can be examined with a model such as the Southern California Planning Model
(SCPM). SCPM is a metropolitan input-output model that interacts with a spatial
allocation model in the Garin-Lowry tradition to estimate economic impacts,
including direct, indirect, and induced effects, and assign them to small-area zones.
Like inter-industry input-output models based on the transaction flows
between intermediate suppliers and end producers, SCPM is demand driven and
accounts for effects primarily via backward and forward linkages between economic
sectors. Unlike many other inter-industry models, however, SCPM allocates
regional economic impacts to geographic zones, such as those defined by political boundaries.
SCPM has two components. The first model component of SCPM is built upon the
input-output model. The estimate of total indirect and induced effects is described
in Richardson et al. (2015) as follows,
V(c) = V(d) + V(i) + V(u)    (12.1)
V(i) = V(o) − V(d)    (12.2)
V(u) = V(c) − V(o)    (12.3)
where V(c) represents the vector of total output from the closed model for all but
the household sector; V(d), V(i), and V(u) are the vectors of direct, indirect, and
induced effects, respectively; and V(o) is the vector of total outputs.
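The decomposition above can be illustrated with a short Python sketch. It assumes that V(o) is the total output of the open model (households excluded from the inter-industry matrix) and that V(c) comes from the closed model; all numbers and the function name decompose_effects are hypothetical.

```python
def decompose_effects(v_d, v_o, v_c):
    """Split closed-model output into indirect and induced parts, sector by sector."""
    v_i = [o - d for o, d in zip(v_o, v_d)]  # indirect = V(o) - V(d)
    v_u = [c - o for c, o in zip(v_c, v_o)]  # induced  = V(c) - V(o)
    return v_i, v_u

# Hypothetical values for three sectors (e.g. in $1,000s)
v_d = [100.0, 0.0, 20.0]   # direct effects (the initial shock)
v_o = [130.0, 25.0, 32.0]  # total output, open model (assumed meaning of V(o))
v_c = [150.0, 40.0, 45.0]  # total output, closed model

v_i, v_u = decompose_effects(v_d, v_o, v_c)

# Adding the three effects back together recovers the closed-model output, Eq. (12.1)
recombined = [d + i + u for d, i, u in zip(v_d, v_i, v_u)]
print(v_i, v_u, recombined)
```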
The second basic model component involves the adaptation of a Garin-Lowry-type
model (Garin 1966) for spatially allocating the economic impacts generated by the
input-output model. The spatial counterpart to Eq. (12.1) is as follows,
Z(c) = Z(d) + Z(i) + Z(u)    (12.4)
where Z(c) is a matrix of impacts by spatial unit (zone) and by sector, and
Z(d), Z(i), and Z(u) are matrices of direct, indirect, and induced impacts by spatial
unit (zone) and by sector, all derived or specified in different ways.
The spatial allocation of the direct outputs Z(d) to the impacted areas is
based on practical experience and is defined exogenously in SCPM.
The indirect outputs are allocated in SCPM according to the proportion of
employees in each sector by zone, described as follows,
Z(i) = P · diag[F(i)]    (12.5)
where P is a matrix indicating the proportion of each sector’s employees in each
zone, and the operator ‘diag’ diagonalizes the final demand vector F(i).
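A minimal Python sketch of the employment-share allocation in Eq. (12.5) follows. It assumes that P has one row per zone and one column per sector, with each column summing to one; the three zones, two sectors, and all values are hypothetical, as is the function name allocate_indirect.

```python
def allocate_indirect(P, F):
    """Return Z = P * diag(F), i.e. Z[z][s] = P[z][s] * F[s]."""
    return [[P[z][s] * F[s] for s in range(len(F))] for z in range(len(P))]

# Hypothetical employment shares: rows are zones, columns are sectors.
# Each column sums to 1, so a sector's impact is fully distributed over zones.
P = [
    [0.50, 0.10],
    [0.30, 0.60],
    [0.20, 0.30],
]
F = [200.0, 50.0]  # hypothetical sector-level impact to be allocated

Z_i = allocate_indirect(P, F)
sector_totals = [sum(row[s] for row in Z_i) for s in range(len(F))]
print(Z_i, sector_totals)
```

Because every column of P sums to one, the zone-level figures add back up to the sector totals, so the allocation redistributes impacts without creating or losing any.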
The induced impacts, i.e., the effects resulting from household expenditure changes,
are allocated spatially throughout the entire region by tracing the induced outputs
via household expenditure patterns using two trip origin-destination (O-D) matrices.
A journey home-to-work O-D matrix JHW traces activities from the workplace back to
the home, and a journey services-to-home O-D matrix JSH traces household
expenditures between the home and the retail stores or service establishments. The
allocation can be described as follows,
Z(u) = JSH · JHW · P · diag[F(i)]    (12.6)
where JSH is the journey from services to home matrix, JHW is the journey from home
to work matrix, and P and F(i) are the same as those in Eq. (12.5).
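The chained allocation in Eq. (12.6) can be sketched in Python as a sequence of matrix products. The orientation of each matrix is an assumption made here for illustration: P maps sectors to workplace zones, JHW maps workplace zones to home zones, and JSH maps home zones to service zones, with every column summing to one. All values are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def diag(v):
    """Build a diagonal matrix from a vector."""
    return [[v[i] if i == j else 0.0 for j in range(len(v))] for i in range(len(v))]

# All values are hypothetical; every column sums to 1.
JSH = [[0.7, 0.2], [0.3, 0.8]]   # service zone (row) share of spending by home zone (column)
JHW = [[0.6, 0.1], [0.4, 0.9]]   # home zone (row) share of workers by work zone (column)
P = [[0.5, 0.25], [0.5, 0.75]]   # work zone (row) share of employment by sector (column)
F = [100.0, 40.0]                # sector-level impact to be allocated

# Work zones -> home zones -> service zones, one sector per column
Z = matmul(JSH, matmul(JHW, matmul(P, diag(F))))
column_totals = [sum(row[s] for row in Z) for s in range(len(F))]
print(Z, column_totals)
```

With column-stochastic matrices the sector totals are preserved at every step; only their distribution across zones changes.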
The results generated by SCPM are detailed economic impacts in terms of jobs
or dollar values of output by sector and by sub-regional zone. The latter are typically
local cities and communities. The model is intended to achieve operationality and a
level of spatial disaggregation not otherwise achieved in empirical research.
SCPM has been steadily updated over the years as new and revised data sources
became available. The building blocks of the simple versions of SCPM were the
metropolitan input-output model, a journey from home to work matrix, and a journey
from services to home matrix. Incorporating the Garin-Lowry approach into
spatial allocation makes the transportation flows in SCPM exogenous. This study
employed the advanced version of SCPM, which has an explicit representation of the
transportation network and modules to account for the economic impact of changes
in transportation supply.
12.4 Empirical Study
This study applies a metropolitan input-output model to examine the economic
impact of an energy efficient home improvement program. Since the HERO Program
was developed by Renovate America, Inc. via a partnership with the Western
Riverside Council of Governments in 2010, it has quickly become accessible to
residents in other parts of California and has also been made available to
homeowners in the states of Missouri and Florida. It has become one of the most
widely adopted residential PACE programs for energy efficient home financing in the
United States.
12.4.1 Data
Renovate America, Inc. provided us with data about residents and contractors who had
projects funded by the HERO Program between December 2011 and April 1, 2013.
The data set records information on all participants during the period, including
the property address and its exact location (longitude and latitude); the name,
address, and exact location (longitude and latitude) of the contractor company; the
amount of loan requested and approved; the energy saving product installed; and
NAICS codes for the installer, wholesaler, and manufacturer.
The data sets provided by Renovate America were preprocessed step by step.
The longitude and latitude data were used to pinpoint the 1,936 properties and 200
contractors in the region (see Fig. 12.1). All the properties are located in Riverside
County, while most contractors are spread across four counties of Southern
California: Riverside, San Bernardino, Orange, and Los Angeles.
According to Renovate America, Inc., a total of 3,091 separate product installation
requests were submitted by the 1,936 properties in 2011–2013. A total of 2,895, or
93.66%, of these requests were 100% funded by the HERO program. The minimum
percentage of funding received is 43.76%, while the maximum is 200%. The total
amount of initial loan requests is $30,743,229.90, while the total amount of approved
loan requests is $30,501,750.99, indicating that 99.21% of the total amount initially
requested was approved by the program.
The total of $30,501,750.99 funded for the 3,091 product installations at the
1,936 properties was aggregated by the NAICS (North American Industry Classification
System) code of the manufacturer, which was then converted to an IMPLAN code to
facilitate the economic impact analysis in the following steps. The conversion
between NAICS codes and IMPLAN codes was based on a bridge table provided by IMPLAN.
The sector of semiconductor and related device manufacturing (NAICS 334413 or
IMPLAN 243) has the highest dollar amount, $12,045,273.58.
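The aggregation step can be sketched as follows. The three NAICS-to-IMPLAN pairs are the ones quoted in this chapter; the installation records, the dollar amounts, and the function name aggregate_by_implan are hypothetical, and the dictionary is only a stand-in for the full bridge table provided by IMPLAN.

```python
from collections import defaultdict

# Bridge entries taken from the codes quoted in this chapter (NAICS -> IMPLAN)
NAICS_TO_IMPLAN = {"334413": 243, "321911": 99, "333415": 216}

# Hypothetical installation records: (manufacturer NAICS code, approved amount in $)
records = [
    ("334413", 21000.00),
    ("321911", 6000.00),
    ("334413", 19500.00),
    ("333415", 9400.00),
]

def aggregate_by_implan(records, bridge):
    """Sum approved amounts by the IMPLAN sector mapped from each NAICS code."""
    totals = defaultdict(float)
    for naics, amount in records:
        totals[bridge[naics]] += amount
    return dict(totals)

totals = aggregate_by_implan(records, NAICS_TO_IMPLAN)
print(totals)
```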
Fig. 12.1 Location of properties and contractors participating in the HERO Program.
Source: Author preparation
The average funded financing amount per product installation in the HERO
program is $9,867.92, i.e. $30,501,750.99 divided by 3,091, while the average
financing amount funded for each property is $15,755.04, i.e. $30,501,750.99 divided
by 1,936. The highest average cost per installation is $21,623.69 for Solar
Photovoltaic (PV), which accounts for 39.49% of the total dollar amount and 18.02%
of total product installations. Windows and Doors is the largest category by count,
with 1,235 installations or 39.95% of the total, but its total value and average cost
per installation are far lower than those of Solar PV. Heating, Ventilation, and Air
Conditioning (HVAC) also has a large number of installations with high values. It
has 906 installations, more than Solar PV, and a total value of about $8.5 million,
which is higher than that of Windows and Doors. More details about the costs by
product category are shown in Table 12.1.
According to a report from GreenHomes America, a leading nationwide residential
energy services company, materials account on average for 45% and labor for 55% of
the total costs of energy efficiency measures, which include wood windows and
doors; insulation; sealing; indoor and outdoor water efficient devices; and energy
efficient HVAC systems (Pozdena and Josephson 2011). According to a report by the
Lawrence Berkeley National Lab in 2010, materials account for 52% and labor costs
represent 48% of the total costs of renewable energy projects, which include Solar
PV. These ratios are used to split materials and labor costs in energy efficient
and renewable energy projects, as shown in Table 12.2.
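The split described above is simple arithmetic, sketched here in Python. The 45% and 52% material ratios and the two approved amounts (wood windows and doors, and semiconductor and related devices) are taken from this chapter; the dictionary keys and the function name split_costs are hypothetical.

```python
# Share of total cost attributed to materials, by project type
MATERIAL_RATIO = {"energy_efficiency": 0.45, "renewable": 0.52}

def split_costs(approved, project_type):
    """Return (material, labor) costs for an approved financing amount."""
    material = round(approved * MATERIAL_RATIO[project_type], 2)
    labor = round(approved - material, 2)  # labor is the remainder
    return material, labor

wood = split_costs(7_459_159.34, "energy_efficiency")   # wood windows and doors
solar = split_costs(12_045_273.58, "renewable")         # semiconductor and related devices
print(wood, solar)
```

Computing labor as the remainder rather than as a separate percentage keeps material plus labor equal to the approved amount after rounding.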
Table 12.1 Summary of the costs by product category
| Product category | Number of installations | Percentage of total installations (%) | Total category dollars | Percentage of total dollars (%) | Average cost per installation |
|---|---|---|---|---|---|
| Air sealing and weatherization | 16 | 0.52 | $18,422.34 | 0.06 | $1,151.40 |
| Cool wall and roof installations | 111 | 3.59 | $1,520,292.73 | 4.98 | $13,696.33 |
| Water efficiency | 6 | 0.19 | $8,778.70 | 0.03 | $1,463.12 |
| Insulation | 160 | 5.18 | $554,633.20 | 1.82 | $3,466.46 |
| Lighting | 7 | 0.23 | $8,259.00 | 0.03 | $1,179.86 |
| Pool equipment | 50 | 1.62 | $120,929.19 | 0.40 | $2,418.58 |
| Solar PV | 557 | 18.02 | $12,044,393.58 | 39.49 | $21,623.69 |
| Solar thermal | 13 | 0.42 | $93,338.00 | 0.31 | $7,179.85 |
| HVAC | 906 | 29.31 | $8,497,722.93 | 27.86 | $9,379.39 |
| Water heating | 27 | 0.87 | $165,411.98 | 0.54 | $6,126.37 |
| Windows and doors | 1,235 | 39.95 | $7,459,159.34 | 24.45 | $6,039.81 |
| Custom products | 3 | 0.10 | $10,410.00 | 0.03 | $3,470.00 |
| SUM | 3,091 | 100.00 | $30,501,750.99 | 100.00 | $9,867.92 |

Table 12.2 lists the installed products by IMPLAN code and description,
corresponding NAICS code and description, count or number of installations, initial
financing request, approved fund, ratio of material costs, and the amounts of material
and labor costs. Materials account for 52% of the total costs for Solar PV, while
they account for 45% of the total costs for all the other products. The total material
costs are $14,568,957.10 and the total labor costs are $15,932,793.89.
Semiconductor and related device manufacturing has about $6.3 million in material
costs and $5.8 million in labor costs, which are the highest among all the products.
Air-conditioning and warm air heating equipment, and commercial and industrial
refrigeration equipment manufacturing has about $3.5 million in material costs and
$4.3 million in labor costs, which are the second highest among all the products. Wood
window and door manufacturing has the third highest material costs and labor costs,
which are $3.4 million and $4.1 million, respectively. The three products categorized
by NAICS or IMPLAN code with the highest labor costs and material costs are
consistent with the three product categories in the HERO program with the largest
number of installations and the highest installation costs, i.e. Solar PV, HVAC, and
Windows and Doors.
Table 12.2 Summary of the labor costs and material costs for product installation
| IMPLAN code | IMPLAN description | NAICS code (manufacturer) | NAICS description | Count | Initial request | Approved | Material ratio (%) | Material costs | Labor costs |
|---|---|---|---|---|---|---|---|---|---|
| 99 | Wood windows and doors and millwork manufacturing | 321911 | Wood Window and Door Manufacturing | 1235 | $7,471,939.24 | $7,459,159.34 | 45 | $3,356,621.70 | $4,102,537.64 |
| 117 | Asphalt shingle and coating materials manufacturing | 324122 | Asphalt Shingle and Coating Materials Manufacturing | 111 | $1,526,179.73 | $1,520,292.73 | 45 | $684,131.73 | $836,161.00 |
| 128 | Synthetic rubber manufacturing | 325220 | Artificial and Synthetic Fibers and Filaments Manufacturing | 89 | $372,198.64 | $371,598.64 | 45 | $167,219.39 | $204,379.25 |
| 137 | Adhesive manufacturing | 325520 | Adhesive manufacturing | 16 | $18,422.34 | $18,422.34 | 45 | $8,290.05 | $10,132.29 |
| 147 | Urethane and other foam product (except polystyrene) manufacturing | 326150 | Urethane and Other Foam Product (except Polystyrene) Manufacturing | 4 | $10,340.00 | $10,340.00 | 45 | $4,653.00 | $5,687.00 |
| 149 | Other plastics product manufacturing | 326191 | Plastics Plumbing Fixture Manufacturing | 4 | $4,053.70 | $4,053.70 | 45 | $1,824.17 | $2,229.54 |
| 168 | Mineral wool manufacturing | 327993 | Mineral Wool Manufacturing | 34 | $79,717.56 | $79,962.56 | 45 | $35,983.15 | $43,979.41 |
| 198 | Valve and fittings other than plumbing manufacturing | 332919 | Other Metal Valve and Pipe Fitting Manufacturing | 3 | $12,655.00 | $12,655.00 | 45 | $5,694.75 | $6,960.25 |
| 202 | Other fabricated metal manufacturing | 332999 | All Other Miscellaneous Fabricated Metal Product Manufacturing | 34 | $94,332.00 | $94,332.00 | 45 | $42,449.40 | $51,882.60 |
| 214 | Air purification and ventilation equipment manufacturing | 333413 | Industrial and Commercial Fan and Blower and Air Purification Equipment Manufacturing | 92 | $565,414.66 | $565,414.66 | 45 | $254,436.60 | $310,978.06 |
| 215 | Heating equipment (except warm air furnaces) manufacturing | 333414 | Heating Equipment (except Warm Air Furnaces) Manufacturing | 63 | $214,268.11 | $214,267.19 | 45 | $96,420.24 | $117,846.95 |
| 216 | Air conditioning, refrigeration, and warm air heating equipment manufacturing | 333415 | Air-Conditioning and Warm Air Heating Equipment and Commercial and Industrial Refrigeration Equipment Manufacturing | 738 | $7,884,600.25 | $7,867,754.25 | 45 | $3,540,489.41 | $4,327,264.84 |
| 243 | Semiconductor and related device manufacturing | 334413 | Semiconductor and Related Device Manufacturing | 558 | $12,249,923.30 | $12,045,273.58 | 52 | $6,263,542.26 | $5,781,731.32 |
| 260 | Lighting fixture manufacturing | 335121 | Residential Electric Lighting Fixture Manufacturing | 7 | $8,259.00 | $8,259.00 | 45 | $3,716.55 | $4,542.45 |
| 261 | Small electrical appliance manufacturing | 335210 | Small Electrical Appliance Manufacturing | 103 | $230,926.37 | $229,966.00 | 45 | $103,484.70 | $126,481.30 |
| Sum | | | | 3091 | $30,743,229.90 | $30,501,750.99 | | $14,568,957.10 | $15,932,793.89 |
12.4.2 Economic Impact Analysis
The Southern California Planning Model (SCPM) was utilized to trace all the regional
economic impacts at a high level of sectoral and spatial disaggregation. The model
was initially developed for the five-county Los Angeles metropolitan region and has
the unique capability to allocate all impacts, in terms of jobs or the dollar value of
output, to sub-regional zones, mainly individual municipalities. This capability
results from an integrated modeling approach that incorporates two fundamental
components: input-output and spatial allocation. The approach allows the
representation of estimated spatial and sectoral impacts corresponding to the vector
of changes in final demand that come from the installations of the energy efficient
products. The exogenous shocks, treated as changes in final demand, are fed through
an input-output model to generate sectoral impacts that are then introduced into the
spatial allocation model. The first model component of SCPM is built upon the
well-known IMPLAN input-output model,³ which has a high degree of sectoral
disaggregation, i.e. 440 sectors in IMPLAN 3.0. The second basic model component is
used for allocating sectoral impacts across the 308 municipalities or 3,191 traffic
analysis zones in Southern California (Pan et al. 2009).
In this study, IMPLAN 3.0 was utilized for the economic impact analysis of the
HERO program. Unlike the previous version, IMPLAN 3.0 has fewer industrial
sectors: 440 versus 509 in version 2.0. The input-output analysis was conducted in
three steps. The first step is to construct a new model for the five-county Southern
California Association of Governments (SCAG) region using the IMPLAN 2011 data,
which match the data obtained from Renovate America, Inc. for the residents and
contractors participating in the HERO program between 2011 and 2013. The second
step is to set up activities in IMPLAN for the economic impact analysis (EIA) of the
HERO program and to create events for the EIA activities using the labor costs and
material costs for the installation of energy efficiency products summarized in
Table 12.2. The third step is to create a scenario with the EIA activities and
analyze the scenario. The results of the impact analysis are reported in IMPLAN as
direct, indirect, and induced effects.
The IMPLAN impact analysis results are comprehensive, including output and
employment by IMPLAN sector. Two sets of results are produced under two
assumptions. One assumes that 100% of final demand is met by local supplies, while
the other, more realistic one assumes that not all local demand is purchased from
local producers. The portion of local demand imported from producers outside
the region is called “leakage”. The difference between the two sets of IMPLAN
results is used to estimate regional leakages.
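The leakage estimate is a difference between two model runs, as the following Python sketch illustrates. It assumes that the "Total" row of Table 12.3 corresponds to the run with 100% local supply and that the "Sum of Five Counties" row corresponds to the run with regional purchase shares; the figures are output impacts in $1,000s read from that table, and the function name leakage is hypothetical.

```python
EFFECTS = ("direct", "indirect", "induced")

def leakage(all_local, regional):
    """Per-effect difference between the 100%-local run and the regional-purchase run."""
    return {k: all_local[k] - regional[k] for k in EFFECTS}

# Output impacts in $1,000s (assumed mapping of Table 12.3 rows to the two runs)
all_local = {"direct": 30502, "indirect": 10970, "induced": 11354}
regional = {"direct": 19351, "indirect": 5774, "induced": 7619}

leak = leakage(all_local, regional)
leak_total = sum(leak.values())
print(leak, leak_total)
```

Because the published figures are rounded to the nearest thousand dollars, an individual difference computed this way can deviate from the tabulated leakage by one unit.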
The second basic model component is used for allocating the sectoral impacts
across the geographic zones in Southern California. The key is to adapt a Garin-
Lowry style model for spatially allocating the indirect and induced impacts generated
by the input-output model.
³ Made available by http://www.implan.com/.
Table 12.3 Economic impact of HERO program in the five-county SCAG region (output in $1,000s)
| Area | Output: Direct | Output: Indirect | Output: Induced | Output: Total* | Jobs: Direct | Jobs: Indirect | Jobs: Induced | Jobs: Total* |
|---|---|---|---|---|---|---|---|---|
| City of Riverside | 4,362 | 102 | 155 | 4,618 | 22 | 1 | 1 | 24 |
| County of Los Angeles | 1,673 | 3,388 | 4,399 | 9,460 | 7 | 19 | 30 | 55 |
| County of Orange | 5,288 | 1,201 | 1,589 | 8,079 | 24 | 7 | 11 | 41 |
| County of Ventura | 0 | 242 | 357 | 599 | 0 | 1 | 2 | 4 |
| County of Riverside | 11,007 | 434 | 626 | 12,067 | 52 | 3 | 4 | 59 |
| County of San Bernardino | 1,383 | 508 | 648 | 2,539 | 6 | 3 | 5 | 14 |
| Sum of Five Counties | 19,351 | 5,774 | 7,619 | 32,744 | 89 | 33 | 52 | 174 |
| Regional Leakages | 11,151 | 5,196 | 3,736 | 20,082 | 34 | 27 | 25 | 86 |
| Total | 30,502 | 10,970 | 11,354 | 52,826 | 123 | 60 | 77 | 259 |
The modeling results are summarized by county in Table 12.3. The total output
impacts are $52.8 million, with 259 jobs created. However, $20.1 million of output
and 86 jobs are located outside the SCAG region. Within the five-county SCAG
region, the total output impacts are $32.7 million, with 174 jobs created. The
locations of the impacts in dollar value and jobs are shown in Figs. 12.2 and 12.3.
The results of the economic impact analysis show that most of the impacts of Renovate
America Inc.’s HERO program are concentrated in Riverside County, especially
the City of Riverside. Riverside County has received about $12.1 million in output
and 59 jobs, which account for about 23% of the total output and jobs generated by
the HERO program. It has also received $11.0 million in output and 52 jobs as direct
impacts, which are more than one third of the direct output and over 42% of the
directly impacted jobs. However, the contractors and manufacturers are not limited
to Riverside County. They are spread across the five-county Los Angeles area and
even beyond the region. The other four counties in the Los Angeles region have
received about $20.7 million in output and 115 jobs, while the impacts beyond the
region, the so-called regional leakages, are estimated at $20.1 million in output
and 86 jobs.
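The shares quoted in this paragraph can be re-derived from Table 12.3 with a few lines of Python. The figures are copied from the table (output in $1,000s); the helper name share is hypothetical.

```python
def share(part, whole):
    """Percentage share, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Riverside County versus program-wide totals (Table 12.3)
riverside_output_share = share(12067, 52826)         # total output
riverside_jobs_share = share(59, 259)                # total jobs
riverside_direct_output_share = share(11007, 30502)  # direct output
riverside_direct_jobs_share = share(52, 123)         # direct jobs

# Remaining four counties inside the region
other_counties_output = 32744 - 12067                # $1,000s
other_counties_jobs = 174 - 59

print(riverside_output_share, riverside_jobs_share,
      riverside_direct_output_share, riverside_direct_jobs_share,
      other_counties_output, other_counties_jobs)
```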
Fig. 12.2 Dollar value of total impact by Traffic Analysis Zone (TAZ). Source: Author preparation
Fig. 12.3 Job impact by TAZ. Source: Author preparation
12.5 Conclusions and Discussions
This study selected the Home Energy Retrofit Opportunity (HERO) Program of
Renovate America, Inc. as an empirical case to demonstrate economic impact
analysis for an energy efficient home improvement program in a large metropolitan
area. It adopted the Southern California Planning Model (SCPM) to trace all the
regional economic impacts of the HERO program in the five-county Los Angeles
region at a high level of sectoral and spatial disaggregation. As an integrated modeling
approach, the SCPM incorporates two components, i.e. an input-output model and a
spatial allocation model.
We obtained data about residents and contractors who participated in the
HERO program between 2011 and 2013. The IMPLAN input-output model was employed
as the first step to estimate the indirect and induced effects of the HERO program,
and a spatial interaction model was utilized to allocate the impacts by sector to small
geographic areas, i.e. municipalities or traffic analysis zones. The results show that the
impacts are not limited to the directly impacted areas, in this case Riverside County
or the City of Riverside. They spread to the other parts of the Los Angeles region
and even beyond the region. The spatial distribution of the impacts, including direct,
indirect, and induced impacts, provides stakeholders and policy makers with a more
comprehensive understanding of energy efficient programs.
References
Crandall-Hollick, L. M., & Sherlock, M. F. (2018). Residential Energy Tax Credits: Overview and
Analysis, Congressional Research Service (CRS) Report R42089, Washington D.C.
Dunsky Energy Consulting. (2018). The Economic Impact of Improved Energy Efficiency in
Canada: Employment and Other Economic Outcomes from the Pan-Canadian Framework’s
Energy Efficiency Measures, Report Submitted to Clean Energy Canada, Vancouver, Canada.
Garin, R. A. (1966). A matrix formulation for the Lowry model for intrametropolitan activity
location. Journal of the American Institute of Planners, 32, 361–364.
Goldberg, M., Cliburn, J. K., & Coughlin, J. (2011). Economic Impacts from the Boulder
County, Colorado, ClimateSmart Loan Program: Using Property-Assessed Clean Energy (PACE)
Financing, Technical Report NREL/TP-7A20-52231, National Renewable Energy Laboratory,
Office of Energy Efficiency & Renewable Energy, U.S. Department of Energy, Golden, Colorado.
Pozdena, R., & Josephson, A. (2011). Economic Impact Analysis of Property Assessed Clean Energy
Programs (PACE), Research performed by ECONorthwest for PACENow.
Kammen, D. M., Kunkel, C., & Fuller, M. (2009). Guide to energy efficiency and renewable energy
financing districts for local governments, Report prepared by Renewable and Appropriate Energy
Laboratory (RAEL) to the University of California, Berkeley.
Kammerer, K. J. (2006). Monterey Bay regional energy plan, report prepared to the Association of
Monterey Bay Area Governments (AMBAG). Marina, California.
Kieper, T., & Bicknell, C. (2018). Focus on energy economic impacts 2015–2016. Madison,
Wisconsin: Report Prepared to the Public Service Commission of Wisconsin.
Pan, Q., Richardson, H. W., Gordon, P., & Moore, J. (2009). The economic impacts of a terrorist
attack on the downtown Los Angeles financial district. Spatial Economic Analysis, 4(2), 213–239.
Richardson, H. W., Pan, Q., Park, J., & Moore, J. E. (2015). Regional economic impacts of terrorist
attacks, natural disasters and metropolitan policies, Advances in Spatial Science series, Springer.
Scheer, J., & Motherway, B. (2011). Economic analysis of residential and small-business energy
efficiency improvements. Dublin, Ireland: Sustainable Energy Authority of Ireland.
Washan, P., Stenning, J., & Goodman, M. (2014). Building the future: Economic and fiscal impacts
of making homes energy efficient. London, UK: Report to Energy Bill Revolution.
Chapter 13
Exploring the Dynamics of Carbon
Emission in China via Spatial-Temporal
Analysis
Jin Zhang, Jinkai Li, and Xiaotian Wang
13.1 Introduction
The relationship between energy, environment and economic development has
become a major global issue. It is projected that energy use will increase by
around 56% by 2040 compared to 2010 (Azad et al. 2015; Çetin and Ecevit 2015).
The rise in energy consumption has been responsible for the increase in the intensity
of carbon dioxide (CO2) emissions, which has caused severe environmental and
ecological challenges. CO2 emissions are a major part of overall greenhouse gas (GHG)
emissions, constituting almost two-thirds of the total. Global CO2
emissions increased by 144% between 1970 and 2014 (World Bank 2018).
Since the start of the 21st century, China’s economy has developed rapidly. China’s
accession to the WTO in 2001 led to a sharp increase in domestic resource and
energy consumption. This is because industrial production, the service industry, trade,
construction, transportation, urbanization and many other sectors depend profoundly
on energy consumption. During the 10th, 11th and 12th Five-Year Plans (2000–2015),
China attained the position of the world’s second largest economy. However,
environmental issues, especially carbon dioxide emissions, have also drawn global
attention. As reported in IEA 2017, China accounted for 28% of world emissions
in 2015 and is the world’s biggest CO2 emitter. A few studies on the calculation of
carbon emissions conclude that China’s CO2 emissions have increased nearly
seven-fold, from 1538.313 Mton CO2eq/yr in 1970 to 10,433 Mton CO2eq/yr in 2016
(Janssens-Maenhout et al. 2017).
J. Zhang
School of Public Policy and Management, Tsinghua University, Beijing 100084, China
e-mail: zhangj17@mails.tsinghua.edu.cn
J. Li (✉)
Center for Energy Environment and Economy Research, Zhengzhou University,
Henan 450001, China
e-mail: lijinkai@sina.com
X. Wang
Business School, Henan University of Engineering, Henan 451191, China
e-mail: wxtcome@163.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_13
The Chinese government has given much attention to environmental issues, and
energy conservation and emission reduction have always been prioritised in policy
documents. In August 2002, China participated in the World Conference on
Environment and Development of the United Nations and affirmed its determination to
unswervingly follow the path of sustainable development. Industrial restructuring,
energy conservation and structure adjustment, and energy efficiency improvements
have all been subsequently included in national policy guidance documents during the
10th, 11th and 12th Five-Year Plan periods.
However, given the size of the country, related policies proposed by the central
authorities may not always be effective at the local level in the thirty-one (31)
provinces of China. The regions have performed differently in carbon abatement. This
may suggest the presence of spatial or regional disparity in China’s carbon
emissions. The development of spatial analysis theories has produced several
analytical techniques for examining such disparity, such as exploratory spatial data
analysis (ESDA) and GIS.
Spatial analysis is important both in theory and in statistical analysis. It can
improve the accuracy of analysis and reduce estimation bias by considering
spatial proximity and dependence (Heraux 2007; Ye and Wu 2011). This paper
conducts a spatial-temporal analysis of China’s carbon emissions. To date, very few
papers have clearly illustrated the spatial effects, and this is the gap this
study fills.
The remainder of the paper is arranged as follows. The next section presents a
review of the related literature. Then ESDA is used to conduct an exploratory
analysis of the data sample. This helps to detect the problem of carbon abatement
in China, simultaneously considering the individual, spatial and time effects. Lastly,
spatial econometric analysis is used to analyze the influence of different variables
on carbon emissions.
13.2 Literature Review
The relationship between energy, economy and environment has been widely
discussed by scholars across the world (Islam et al. 2012; Ozturk and Acaravci
2010; Uddin et al. 2016). In the literature on the determinants of pollution (CO2
emissions), several factors such as growth, financial development, industrial
production, energy consumption, trade openness, urbanization, services sector
development and agricultural production have been examined (Li et al. 2016). For
example, Rahman et al. (2017) studied the dynamic relationship between carbon
emissions, energy consumption and structural change or industry growth in China
using the ARDL bounds test and the Granger causality test. The results showed that
both industrial production and energy consumption contributed significantly to
carbon emissions in the short and long term. Abdul et al. (2018) followed this study
to test the relation between structural change and carbon emissions during
1978–2016. The results also show that industry growth significantly contributes to
carbon emissions. These macro indicators are always included in the model as factors
affecting carbon emissions. The main implicit assumption is that these factors affect
the energy consumption of a country. However, when focusing on a single country,
such as China, whose provinces are used as observation units, macro indicators have
less explanatory power than mesoscopic indicators. The logic behind the quantitative
analysis in such studies is summarized in Fig. 13.1. Since China’s carbon emissions
mainly come from energy consumption, studies have examined China’s carbon
emissions from the perspective of industrial production and energy (Zhang and Zhang
2018). This means that all factors affecting energy consumption will affect carbon
emissions. Given the industrial adjustment and energy transition strategy launched in
recent years, most scholars who study China’s carbon abatement consider industry
adjustment, energy efficiency, energy transition, institutions, the market
environment, etc. as key factors (Li and Cheng 2009; Neng 2010; Li et al. 2012, 2013).
Fig. 13.1 2000–2015 CO2 in China
However, there are still some disputed problems. Some studies showed that a high
secondary industry value added is associated with lower carbon emissions. They argue
that a high proportion of industry in GDP means a more developed economy, and a more
developed economy has higher production efficiency, which helps to reduce
carbon emissions by using energy effectively. Yet other studies find the opposite
and argue that a more developed economy produces more carbon dioxide because
of higher energy consumption. Some researchers believe that energy efficiency
improvements can help to reduce carbon emissions through energy savings. Other
results point the other way: the higher the energy efficiency, the greater the energy
consumption, a trend that can be observed clearly in most countries in the
world.
Such a paradox is called the Jevons paradox. Some studies showed a positive
influence of state-owned companies on carbon emissions. They argue that state-owned
companies have abundant resources that they can allocate more effectively. But others
argue that a lack of innovation incentives leaves state-owned companies with lower
production efficiency, which can distort resource allocation.
The dynamics of carbon emissions in China are still worth investigating, as many
paradoxes remain. China is a big country with a very intriguing market and social
environment. From the 10th to the 12th Five-Year Plan period, the Chinese government
took various measures to promote industrial and energy transformation and
upgrading. Different provinces have performed differently in response
to central policies. So, what is the main factor affecting carbon emissions? Which
hypothesis is valid to explain the evolution of carbon emissions in China? Does
the spillover effect of energy policies exist among provinces? What is the temporal
and spatial evolution of China’s carbon emissions in the sample interval? These
issues are very important, but they cannot be resolved by the traditional analytical
methods used in the literature reviewed above. Spatial analysis is necessary to
identify and examine them. Therefore, this paper applies spatial-temporal analysis
to the energy-environment-economy (3E) field to explore the dynamics of China’s
carbon emissions.
13.3 Data
13.3.1 Unit of Analysis
This paper takes China’s 31 provinces as observations, which means each variable is
measured at the provincial level. It uses the period from 2000 to 2015 because
it spans the three important national Five-Year Plans for industrial upgrading and
carbon emission reduction of the Chinese government. All observation units are
regionally related even though they are administratively divided by the state. The
importance of the regional perspective is gradually demonstrated below.
13.3.2 Dependent Variable
Carbon dioxide is widely recognized as an important proxy variable for environmental pollution. In environmental economics, the amount of carbon dioxide is usually not observed directly but converted from energy consumption. The key inputs are the carbon dioxide conversion coefficients for different energy types (e.g., coal, oil, gas) published by the IPCC. Many methods have been used in the literature to calculate carbon dioxide emissions. The matrix of conversion coefficients used in this paper is adopted from Li et al. (2016). The consumption of different types of energy comes from the China Energy Statistical Yearbook, 2001 to 2016.
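The conversion step described above can be illustrated with a minimal sketch. It assumes emissions are the sum of consumption times a coefficient over energy types; the coefficient values, the consumption figures, and the function name `co2_emissions` are hypothetical placeholders, not the IPCC values or the coefficient matrix of Li et al. (2016).

```python
# Hypothetical conversion coefficients (tons of CO2 per unit of energy consumed).
# These numbers are placeholders for illustration only, NOT the IPCC values
# and NOT the coefficients of Li et al. (2016).
PLACEHOLDER_COEFFICIENTS = {"coal": 3.0, "oil": 2.0, "gas": 1.5}


def co2_emissions(consumption, coefficients):
    """Return the sum of consumption * coefficient over all energy types."""
    missing = set(consumption) - set(coefficients)
    if missing:
        raise KeyError(f"no coefficient for: {sorted(missing)}")
    return sum(amount * coefficients[fuel] for fuel, amount in consumption.items())


# Made-up consumption figures for one province in one year.
example_province = {"coal": 100.0, "oil": 40.0, "gas": 10.0}
total = co2_emissions(example_province, PLACEHOLDER_COEFFICIENTS)
print(total)
```

In a provincial panel, the same function would be applied to every province-year record, with the real coefficient matrix substituted for the placeholder dictionary.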
13.3.3 Independent Variables
As mentioned in the literature review, the factors affecting carbon emission reduction
mainly include industrial structure, energy consumption structure, and energy effi-
ciency. Since the “10th Five-Year” development plan, the Chinese government has
always regarded industrial restructuring, energy consumption structure adjustment
and energy efficiency development as the focus of energy conservation and emission
reduction.
In general, the industrial structure is expressed by the proportion of the added value of the secondary industry in total GDP. The impact of industrial structure on carbon emissions is mixed, as shown in the literature. On the one hand, the optimization of the industrial structure, reflected in the expansion of output scale, will lead to more energy consumption. With coal being the main energy source in China, this expansion will inevitably lead to more carbon emissions. That is, the greater the proportion of the secondary industry in the industrial structure, the higher the carbon emissions will be. This is called the scale hypothesis. On the other hand, the adjustment of the industrial structure, reflected in the improvement of technology, will allow less energy to produce more output, which will reduce carbon emissions. That is, the greater the proportion of the secondary industry in the industrial structure, the lower the carbon emissions. This is called the efficiency hypothesis.
The energy consumption structure is mainly expressed by the proportion of coal consumption in total energy consumption. Coal consumption is the main source of carbon dioxide emissions. The higher the proportion of coal in the energy consumption structure, the more carbon dioxide is emitted. Therefore, the indicator of energy consumption structure is expected to be positively correlated with the carbon emission indicator.
Energy efficiency is the reciprocal of the energy intensity indicator used in the literature. The greater the energy intensity, the more energy is needed per unit of GDP output, and the lower the energy efficiency. It may seem that improving energy efficiency reduces carbon emissions because less energy is used. But this is not necessarily true. The well-known Jevons paradox argues that
186 J. Zhang et al.
improvement in energy efficiency will lead to more energy consumption. So, the effects of energy efficiency are still debated, and they depend on the measurement method. In recent years, as the idea of input-output optimization from systems engineering has met energy economics, calculating total factor energy efficiency with the DEA method has become very popular (Mardani et al. 2017). Data Envelopment Analysis (DEA) is a well-known technique for evaluating the efficiency of peer units against the best-practice frontier, and it has been widely used to analyze energy efficiency (Feng and Wang 2017; Rebolledo-Leiva et al. 2017). Total factor energy efficiency is a value around 1 that represents the distance of the observed unit's energy input and output efficiency from the optimal frontier. The specific methods were introduced in our previous papers (Li et al. 2015a, b, 2016). In summary, energy efficiency in this paper is measured by the total factor energy efficiency from a super-efficiency DEA, which better reflects the essential meaning of energy efficiency. The GDP output, capital investment, human input, and energy input indicators required for the measurement are obtained from the China Statistical Yearbook.
13.3.4 Other Control Variables
To avoid endogeneity in the econometric model, it is necessary to add other factors that affect carbon emissions. A review of the literature shows that the factors affecting China's provincial carbon emissions include institutional factors (i.e., the ratio of state-owned enterprises' output to local GDP), government intervention factors (i.e., the ratio of government fiscal expenditure to local GDP), and energy prices. According to some previous studies, production by enterprises under public ownership distorts the market and creates unfair competition, which hinders efficiency improvement. Therefore, the greater the proportion of state-owned enterprises in the economy, the more resources are wasted and the higher the carbon emissions. Government fiscal expenditures are mainly used to provide public services. According to Western economic theory, as with the institutional mechanism, it is generally believed that excessive fiscal expenditure impairs market efficiency, resulting in wasted resources and more carbon emissions. Energy is a commodity, and its price follows the laws of the market economy. That is, the higher the price, the less energy companies will use and the higher the energy efficiency, and hence the lower the carbon emissions.
The data are obtained from the China Statistical Yearbook. Data for Hong Kong, Macao, and Taiwan are not included. All variables are summarized in Table 13.1.
Table 13.1 Description of variables
| Variables | Description | Abbreviation |
|---|---|---|
| CO2 | Measured by conversion coefficients (tons) | CO2 |
| Energy efficiency | Energy total factor efficiency, measured by DEA method (around 1) | En_TFP |
| Energy structure | Coal consumption/Energy consumption (%) | En_str |
| Industry structure | Secondary industry added value/GDP (%) | Ind_str |
| Energy price | Industrial producer purchase price index divided by production price index (%) | En_price |
| Institution | State-owned company added value/GDP (%) | Institution |
| Government interference | Fiscal expenditure/GDP (%) | Gov_int |
Figure 13.1 shows the spatial structure of China's carbon dioxide emissions from 2000 to 2015. Overall, carbon dioxide emissions increased, and their spatial distribution spread outward from the North China region. Hebei, Shanxi, Shandong, and Henan had far higher carbon emissions than the national average during the 10th Five-Year Plan period. During the 12th Five-Year Plan period, emissions in Hebei, Shandong, Shanxi, Inner Mongolia, Henan, and Guangdong were far above the national average. As a whole, total carbon dioxide emissions increased over time.
According to Fig. 13.2, the industrial structure changed significantly between 2000 and 2015, showing a distinct pattern of industrial transfer from east to west. Areas with a high share of secondary industry gradually shifted from the eastern coastal cities to the central and western regions. During the 12th Five-Year Plan period, the proportion of the secondary industry in the central region increased significantly.
Fig. 13.2 2000–2015 Industry Structure in China
On the whole, provinces with a relatively high coal share are concentrated in China's northern coal-producing areas and in the coal-consuming provinces of the economically developed eastern coast. The proportion of coal in the energy consumption structure of the eastern coastal areas has declined over time, while coal has emerged as a major energy source in the central region. This trend appears consistent with the changes in industrial structure.
In numerical terms, overall energy efficiency first decreases and then increases. According to the spatial distribution of China's energy efficiency shown in Fig. 13.3, energy efficiency in 2000 declined markedly from the eastern to the central to the western regions. By 2015, the gap between the eastern and central regions had narrowed significantly.
In general, carbon dioxide, industrial structure, energy structure, and energy efficiency show obvious spatial agglomeration, and the spatial distributions of industrial structure, energy structure, and energy efficiency change visibly over time. This gives a preliminary indication that spatial spillover effects exist. A more accurate assessment requires a more rigorous approach, which follows.
Fig. 13.3 2000–2015 Energy Efficiency in China
13.4 Exploratory Spatial Data Analysis (ESDA)
According to the first law of geography (Tobler's first law), "everything is related to everything else, but near things are more related than distant things". Anselin (1995) pointed out that spatial association mainly refers to spatial autocorrelation. As long as the data are organized by spatial units, spatial autocorrelation is possible.
Exploratory spatial data analysis (ESDA) aims to reveal the characteristics of spatial data through visualization, identify abnormal points or regions, explore spatial connection patterns, agglomerations, or hotspots, and explain spatial dependence or spatial interaction (spillover effects).
Anselin (2005) suggested that ESDA should be treated as a descriptive diagnostic step that precedes the regression, before factors are used to explain the spatial patterns in a study and before more complex regression models are evaluated and tested. Because ESDA can reveal complex spatial phenomena that would otherwise go unrecognized, it forms the basis for developing new research questions (Ye and Wu 2011). The development of new ESDA methods has stimulated many research efforts (Goodchild 2006; Ye and Rey 2013; Ye and Wu 2011).
When the ESDA method is applied to common economic problems, it is mainly used for spatial autocorrelation analysis, including the global autocorrelation test and the local spatial autocorrelation test.
The global autocorrelation test explores the overall spatial distribution of the data across the whole sample and is used to study whether agglomeration exists and what overall form it takes. It uses the global Moran index I (formula 13.1) to test whether adjacent regions in the entire study area are spatially positively correlated, spatially negatively correlated, or independent of each other.
$$I = \frac{\sum_{i=1}^{n}\sum_{j \neq i}^{n} w_{ij}(x_i-\bar{x})(x_j-\bar{x})}{S^{2}\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \tag{13.1}$$
The Moran index I typically lies in [−1, 1]. If I > 0, adjacent regions are positively correlated, which means high values are adjacent to high values and low values are adjacent to low values. If I < 0, adjacent regions are negatively correlated, which means high values are adjacent to low values and low values are adjacent to high values. If I → 0, the sample is randomly distributed, that is, there is no spatial autocorrelation.
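To make the sign interpretation concrete, the following is a minimal sketch of the global Moran index with a permutation-based pseudo p-value. It uses the population variance S² = (1/n) Σ (x_i − x̄)² and a binary neighbour matrix; the four-region map, the data values, and the function names are hypothetical, and the one-sided p-value convention is an assumption rather than the authors' exact procedure.

```python
import random


def morans_i(values, weights):
    """Global Moran's I: weighted cross-products of deviations over S^2 * sum(w)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s2 = sum(d * d for d in dev) / n  # population variance of the sample
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n) if j != i)
    w_sum = sum(sum(row) for row in weights)
    return cross / (s2 * w_sum)


def permutation_p_value(values, weights, permutations=999, seed=0):
    """One-sided pseudo p-value: share of shuffled maps with I >= observed I."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # randomly reassign values to regions
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


# Four hypothetical regions on a line, 0-1-2-3; 1 marks a shared border.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = [10.0, 9.0, 2.0, 1.0]     # high next to high, low next to low
alternating = [10.0, 1.0, 10.0, 1.0]  # high next to low

i_clustered = morans_i(clustered, W)      # positive
i_alternating = morans_i(alternating, W)  # negative
p_clustered = permutation_p_value(clustered, W)
print(i_clustered, i_alternating, p_clustered)
```

With only four regions the permutation test has very little power; the point of the sketch is the sign of I, not the significance level.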
Local autocorrelation mainly explores the distribution characteristics of local
subsystems of spatial data, and is used to study spatial local agglomeration patterns
and spillover effects. It uses the local Moran index I (formula 13.2), also known
as LISA (local indicator of spatial association), to test whether similar or different
observations are concentrated in local regions.
Table 13.2 Global Moran's I of dependent and independent variables
| Variables | 2000 | 2005 | 2010 | 2015 |
|---|---|---|---|---|
| CO2 | 0.236 (0.02)** | 0.295 (0.005)** | 0.272 (0.009)** | 0.246 (0.015)** |
| Ind_str | −0.016 (0.365) | −0.0093 (0.335) | −0.016 (0.365) | −0.0317 (0.418) |
| En_str | 0.182 (0.03)** | 0.254 (0.014)** | 0.164 (0.047)** | 0.187 (0.036)** |
| En_TFP | 0.115 (0.09)* | 0.175 (0.052)* | 0.216 (0.033)** | 0.172 (0.061)* |
Notes: Results are based on 999 random permutations; p-values are in parentheses; * p < 0.1, ** p < 0.05, *** p < 0.01
$$I_i = \frac{(x_i-\bar{x})}{S^{2}}\sum_{j \neq i} w_{ij}(x_j-\bar{x}) \tag{13.2}$$
where $S^{2} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^{2}$ represents the variance of the sample data; $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ represents the sample mean; i and j denote different regions; n represents the total number of regions in the study; and $w_{ij}$ represents the elements of the spatial weight matrix.
If the LISA value is greater than 0, the high-value region is surrounded by high-value regions (H-H), or the low-value region is surrounded by low-value regions (L-L). If the LISA value is less than 0, the high (low) value region is surrounded by low (high) value regions (H-L or L-H).
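The local statistic and the four cluster types can be sketched as follows. This is an illustrative implementation of formula (13.2) with a row-normalised neighbour matrix; the toy map, the data, and the function names are hypothetical, and the sketch omits the permutation-based significance filter that a LISA cluster map normally applies.

```python
def local_morans_i(values, weights):
    """Local Moran's I_i = (x_i - mean) / S^2 * sum_j w_ij (x_j - mean)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s2 = sum(d * d for d in dev) / n
    return [dev[i] / s2 * sum(weights[i][j] * dev[j] for j in range(n) if j != i)
            for i in range(n)]


def quadrants(values, weights):
    """Label each region by its own level and its neighbours' weighted level."""
    n = len(values)
    mean = sum(values) / n
    labels = []
    for i in range(n):
        lag = sum(weights[i][j] * (values[j] - mean) for j in range(n) if j != i)
        own = "H" if values[i] >= mean else "L"
        neighbours = "H" if lag >= 0 else "L"
        labels.append(f"{own}-{neighbours}")
    return labels


# Row-normalised weights for four hypothetical regions on a line, 0-1-2-3.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

clustered = [10.0, 9.0, 2.0, 1.0]  # two high regions, then two low regions
outlier = [10.0, 1.0, 1.0, 1.0]    # one high region among low regions

lisa_clustered = local_morans_i(clustered, W)
lisa_outlier = local_morans_i(outlier, W)
labels_clustered = quadrants(clustered, W)
labels_outlier = quadrants(outlier, W)
print(labels_clustered, labels_outlier)
```

In the clustered map every local value is positive (H-H or L-L), while in the outlier map the first region has a negative local value and is labelled H-L.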
This paper builds the spatial weight matrix using the Rook adjacency rule.¹
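As an illustration of the Rook rule, the sketch below builds a row-normalised weight matrix from a list of border-sharing neighbours. The four-unit map and the function name `rook_weights` are hypothetical and do not reproduce the real provincial adjacency; row normalisation is assumed here because the model section later refers to normalised weights.

```python
def rook_weights(neighbours):
    """Build a row-normalised weight matrix from a dict of border-sharing neighbours."""
    names = sorted(neighbours)
    index = {name: k for k, name in enumerate(names)}
    n = len(names)
    w = [[0.0] * n for _ in range(n)]
    for name, adjacent in neighbours.items():
        for other in adjacent:
            w[index[name]][index[other]] = 1.0  # 1 if the two units share a border
    for row in w:
        total = sum(row)
        if total > 0:  # units without neighbours keep an all-zero row
            for j in range(n):
                row[j] /= total
    return names, w


# Hypothetical toy map: A borders B, B borders C, C borders D.
toy_map = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
names, W = rook_weights(toy_map)
for name, row in zip(names, W):
    print(name, row)
```

After normalisation each row sums to one, so the spatial lag of a variable is simply the average of its value over a unit's neighbours.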
Table 13.2 shows the results of the global autocorrelation test for the dependent and independent variables. The global Moran index of carbon dioxide is significantly positive throughout the sample period, indicating significant spatial autocorrelation in China's carbon emissions. Overall, carbon emission hotspots tend to cluster with other carbon emission hotspots, which supports the pattern observed in Fig. 13.1. Energy efficiency and energy structure exhibit similar spatial agglomeration characteristics.
However, the results of the global spatial autocorrelation test for industrial structure are not significant. This indicates that the spatial agglomeration of industrial structure is strongly dynamic and does not form a stable, significant cluster pattern in any given cross-section. This is consistent with the evolution of the industrial structure in Fig. 13.2 and also reveals the complexity of the industrial structure variable.
¹ Although there are different ways of constructing a weight matrix, such as economic distance or political connection, this paper adopts the classical Rook adjacency to allow comparison with previous studies.
The global Moran index can only test whether agglomeration exists in space; it does not determine where the agglomeration occurs. In other words, the global Moran's I only answers yes or no, whereas the local Moran's I answers where. If the global autocorrelation is significant, further local autocorrelation testing is required.
In general, LISA cluster maps² are used to visualize the imbalance of the spatial distribution and the pattern of local agglomeration. According to the LISA analysis, the local spatial agglomeration characteristics of energy efficiency and energy structure have not changed much. In both 2000 and 2015, both variables exhibit high-value agglomeration in the eastern region and low-value clustering in the western region.
The local autocorrelation characteristics of carbon emissions are also relatively stable. From 2000 to 2015, they are characterized by high-value clustering in Northern China and by high-low clustering in Sichuan. The local autocorrelation pattern of the industrial structure remains complex.
These results show that spatial differences are very significant in this analysis and cannot be ignored. In particular, if a traditional regression analysis were performed, the current sample would not conform to the classical regression assumptions and the model could not be estimated effectively.
13.5 Spatial Econometrics Analysis
The panel data regression model combines information from the time dimension and the cross-sectional units, so it contains more variability and usually suffers less from collinearity between variables. It can therefore reflect the relationship between variables more reliably. The traditional panel data model assumes that the observed samples are obtained by random sampling. However, as indicated in the exploratory spatial analysis section, the units of analysis show significant spatial dependence and spatial heterogeneity. The traditional panel regression model is therefore inappropriate, and a spatial panel regression model is needed. In recent years, spatial econometric models have become increasingly mature in the processing, specification, and estimation of panel data, which provides a solid foundation for the analysis in this paper.
The spatial econometric model considers three kinds of interaction effects: the endogenous interaction among the dependent variables, the exogenous interaction among the explanatory variables, and the interaction among the error terms. The commonly used spatial panel models are the Spatial Lag Model (also known as the spatial autoregressive model, SAR), the Spatial Error Model (SEM), and the Spatial Durbin Model (SDM).
The spatial lag model (SAR) contains endogenous interaction effects. Endogenous interaction means that the dependent variable of a particular unit depends on the dependent variables of other units. When spillovers between spatial units arise from technology diffusion, resource flows, and similar channels, a spatial lag model is appropriate. The model is specified as follows:
² Due to space constraints, the LISA map results are not shown; they are available from the authors on request.
$$Y_{it} = \alpha + \rho\sum_{j=1}^{30} W_{ij}Y_{jt} + X_{it}\beta_{it} + \gamma_i + \delta_t + \mu_{it}$$
where ρ is the spatial lag regression coefficient; Y_it is the explained (dependent) variable; X_it is the 1 × m vector of explanatory variables; β_it is the corresponding m × 1 coefficient vector; W_ij is an element of the normalized weight matrix; μ_it is the independently and identically distributed error term; and γ_i and δ_t indicate regional and temporal effects respectively. If these two effects are ignored, the estimates will be inaccurate.
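Why a spatial lag produces spillovers can be seen from the reduced form of the model. The derivation below is a sketch for one period t, written in stacked (matrix) notation that is not used in the chapter itself: it assumes a common coefficient vector β, a vector of ones ι, and a row-normalised W with |ρ| < 1.

```latex
% Stacked cross-section for period t (N regions); \iota is an N x 1 vector of ones.
Y_t = \alpha\iota + \rho W Y_t + X_t\beta + \gamma + \delta_t\iota + \mu_t
\;\Longrightarrow\;
Y_t = (I - \rho W)^{-1}\left(\alpha\iota + X_t\beta + \gamma + \delta_t\iota + \mu_t\right)

% Series expansion, valid for |\rho| < 1 when the rows of W sum to one.
(I - \rho W)^{-1} = I + \rho W + \rho^{2} W^{2} + \rho^{3} W^{3} + \cdots
```

The term ρW carries a change in one region to its neighbours, ρ²W² carries it to the neighbours of neighbours (and back to the origin), and so on with geometrically decaying weight.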
The spatial error model (SEM) contains interaction effects among the error terms. Such interaction means that variables omitted from the model are spatially correlated, or that unobservable shocks follow a spatial pattern. The SEM assumes that the effects between regions are generated by unknown variables, so that disturbances in one region affect disturbances in another. The model is as follows:
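$$Y_{it} = \alpha + X_{it}\beta_{it} + \gamma_i + \delta_t + \varepsilon_{it}$$
$$\varepsilon_{it} = \rho\sum_{j=1}^{30} W_{ij}\varepsilon_{jt} + \varphi_{it}$$
where ρ is the spatial autoregressive coefficient of the error term (reported as lambda in Table 13.3); γ_i and δ_t represent the regional and temporal effects respectively; ε_it are the spatial error terms; and φ_it are the random components of the error term, with φ_it ∼ N(0, σ²I_it).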
The spatial Durbin model (SDM) includes both endogenous and exogenous interaction effects. Exogenous interaction means that the dependent variable of a particular unit depends on the explanatory variables of other units. The model is as follows:
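$$Y_{it} = \alpha + \eta\sum_{j=1}^{30} W_{ij}Y_{jt} + X_{it}\beta_{it} + \sum_{j=1}^{30} W_{ij}X_{jt}\theta + \gamma_i + \delta_t + \mu_{it}$$
where η (reported as rho in Table 13.3) and θ are parameters to be estimated.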
Model selection has been discussed by Anselin (2005) and LeSage and Pace (2010). It is mainly judged on the basis of Lagrange multiplier (LM) statistics.
The dependent variable, independent variables, and control variables are described above. Table 13.3 shows the results of the spatial econometric analysis.
The signs and significance of the independent variables are consistent across the three models. This indicates that industrial structure, energy efficiency, and energy structure had a significant positive impact on carbon emissions during the sample period. Energy prices have a weakly significant positive correlation with carbon emissions. The institutional factor has a significant negative relationship with carbon emissions; the coefficient of government intervention is also negative, but it is not statistically significant in Table 13.3.
Table 13.3 Results of spatial panel models
| Variables | SDM | SAR | SEM |
|---|---|---|---|
| En_TFP | 0.694*** (7.04) | 0.689*** (7.58) | 0.666*** (7.50) |
| En_str | 0.833*** (11.92) | 0.855*** (12.14) | 0.827*** (11.73) |
| En_price | 0.457* (2.42) | 0.577** (3.00) | 0.582** (3.04) |
| Ind_str | 0.642*** (8.58) | 0.671*** (9.47) | 0.642*** (8.74) |
| Institution | −0.203*** (−5.23) | −0.110** (−3.13) | −0.126*** (−3.32) |
| Gov_int | −0.057 (−0.85) | −0.057 (−0.86) | −0.044 (−0.66) |
| Spatial: rho | 0.058 (0.89) | 0.147** (2.84) | |
| Spatial: lambda | | | 0.120 (1.73) |
| sigma2_e | 0.011*** (15.48) | 0.011*** (15.46) | 0.012*** (15.46) |
| N | 480 | 480 | 480 |
| LM | 412.17 | 391.15 | 388.7 |
Notes: t statistics in parentheses; * p < 0.05, ** p < 0.01, *** p < 0.001. All indicators except energy efficiency (En_TFP) are linearized. Because energy efficiency is an evaluation index and has no economic meaning after linearization, the raw measured data are used in the regression model.
Specifically, the impact of industrial structure on carbon emissions is in line with the scale effect hypothesis. That is, during the three Five-Year Plan periods from 2000 to 2015, China's total production scale expanded through industrial transfer. This made the secondary industry consume more energy, which on average increased carbon emissions. The impact of total factor energy efficiency on carbon emissions is in line with the Jevons paradox. That is, the higher the energy efficiency, the greater the energy consumption of the economy (coal is dominant in China), which implies that energy efficiency has a positive impact on carbon emissions on average. The coefficient of the energy consumption structure is positive, indicating that the higher the proportion of coal in the energy consumption structure, the higher the carbon emissions. During the sample period of this paper, the marginal impact of changes in the energy consumption structure on carbon emissions is about 0.83–0.86.
The coefficient of energy prices is positive, which is somewhat counterintuitive. However, the coefficient is less significant than the others. This reflects the fact that China's energy market is not completely liberalized, and its prices are not truly determined by the forces of energy supply and demand. Therefore, the logic that energy prices affect carbon emissions through energy supply and demand is not validated in our study. Both the institutional factor and the government intervention factor have negative coefficients, although only the institutional factor is statistically significant in Table 13.3. This suggests that the higher the proportion of state-owned enterprises and government expenditure, the greater the reduction in carbon emissions.
This conclusion is also consistent with China's social practice. China is a socialist country, and the influence of the government on the market and society differs from that described in Western economics. China's practice shows that concentrating resources on major undertakings is a reliable approach. From 2000 to 2015, China's state-owned enterprises made great contributions to energy conservation and emission reduction. Local governments invested heavily in public services and energy conservation, all of which had a significant impact on the reduction of carbon emissions.
As shown in Table 13.3, the variance estimates (sigma2_e) of the three models are all significant at the 1% level, which suggests that both endogenous interaction and error-term interaction effects exist. Otherwise, the choice of model would be between SAR and SDM. Following Anselin's (2005) suggestion of choosing based on the LM statistic, the SDM would seem preferable because its LM value is the highest (412.17 in Table 13.3). Judging by the spatial parameters, however, the spatial coefficient of the SAR model is more significant than that of the SDM. Thus, the results of both the SAR and SDM models deserve attention.
There are many discussions about direct and indirect effects, and we follow the definition of LeSage and Pace (2010) in this respect. A change in the explanatory variable X_i of a unit affects its own explained variable Y_i, which is the direct effect; a change in X_i can also affect the explained variable Y_j of other units, which is the indirect effect. The indirect effect is also known as the spillover effect. Table 13.4 lists the direct, indirect, and total effects based on the SDM and SAR. The total effect equals the direct effect plus the indirect effect and represents the spatial marginal effect of an independent variable on the dependent variable. The total effect is not the same as the coefficients listed in Table 13.3. The estimated coefficients in Table 13.3 measure the marginal influence on the dependent variable when all other independent variables are fixed, and they are approximately equal to the direct effect. The total effect, by contrast, is not conditioned on holding the other variables fixed, and so includes the interaction influence of each variable.
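The decomposition can be sketched numerically for the SAR case using the summary measures of LeSage and Pace: with S = (I − ρW)⁻¹β, the average direct effect is the mean diagonal element of S, the average total effect is the mean row sum, and the indirect effect is the difference. The code below is an illustration under stated assumptions, not the authors' estimation routine: the four-unit weight matrix is hypothetical, the inverse is approximated by a truncated power series, and ρ = 0.147 and β = 0.855 are the SAR estimates for rho and En_str from Table 13.3.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sar_effects(rho, beta, w, terms=200):
    """Average direct, indirect and total effects of one regressor in a SAR model.

    Approximates (I - rho*W)^-1 by the series I + rho*W + rho^2*W^2 + ...
    """
    n = len(w)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    multiplier = [row[:] for row in identity]  # running sum of the series
    power = [row[:] for row in identity]       # current term rho^k * W^k
    for _ in range(terms):
        power = [[rho * x for x in row] for row in mat_mul(power, w)]
        multiplier = [[multiplier[i][j] + power[i][j] for j in range(n)]
                      for i in range(n)]
    direct = beta * sum(multiplier[i][i] for i in range(n)) / n
    total = beta * sum(sum(row) for row in multiplier) / n
    return direct, total - direct, total


# Hypothetical row-normalised weights for four units on a line (not the provincial matrix).
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

direct, indirect, total = sar_effects(rho=0.147, beta=0.855, w=W)
print(direct, indirect, total)
```

When every row of W sums to one, the average total effect reduces to β/(1 − ρ) regardless of the map, here about 1.002, which is close to the SAR total effect of 1.004 reported for En_str in Table 13.4. The split between direct and indirect effects does depend on W, so the toy values for those two components are not comparable with the table.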
Statistically, the spatial spillover effects of most variables are only weakly significant, except for industrial structure. As shown in Table 13.4, industrial structure has a clear indirect effect on changes in CO2, estimated at 0.613 and 0.115 by the SDM and SAR models respectively. This means that a change in industrial structure in region i also influences the change in CO2 in region j. The spillover effect of industrial structure is significant. According to the SAR model, changes in energy efficiency and energy structure in region i can also influence the change in CO2 in region j, so energy efficiency and energy structure also have spillover effects. Energy price, the institutional factor, and government intervention have no obvious spillover effects under the SAR model. This can be attributed to the authoritarian nature of the Chinese state, which leaves less flexibility in markets and institutions across provinces.
Table 13.4 Direct, indirect and total effects
| Variables | LR_Direct (SDM) | LR_Direct (SAR) | LR_Indirect (SDM) | LR_Indirect (SAR) | LR_Total (SDM) | LR_Total (SAR) |
|---|---|---|---|---|---|---|
| En_TFP | 0.698*** (6.76) | 0.697*** (7.37) | 0.034 (0.13) | 0.120* (2.11) | 0.731* (2.22) | 0.817*** (6.18) |
| En_str | 0.832*** (12.35) | 0.858*** (12.46) | 0.133 (0.96) | 0.147* (2.26) | 0.965*** (6.04) | 1.004*** (9.03) |
| En_price | 0.477** (2.61) | 0.602** (3.25) | −0.072 (−0.17) | 0.103 (1.85) | 0.405 (0.82) | 0.705** (3.15) |
| Ind_str | 0.649*** (8.77) | 0.675*** (9.72) | 0.613*** (3.71) | 0.115* (2.31) | 1.262*** (6.29) | 0.789*** (8.55) |
| Institution | −0.199*** (−5.28) | −0.110** (−3.23) | 0.264*** (4.22) | −0.019 (−1.81) | 0.065 (1.02) | −0.129** (−3.12) |
| Gov_int | −0.058 (−0.90) | −0.054 (−0.82) | −0.416** (−3.23) | −0.009 (−0.72) | −0.475** (−2.96) | −0.063 (−0.82) |
| N | 480 | 480 | 480 | 480 | 480 | 480 |
Notes: t statistics in parentheses; * p < 0.05, ** p < 0.01, *** p < 0.001
13.6 Conclusion and Discussion
This paper applies spatial analysis methods to energy, environmental, and economic research and yields several interesting findings. Most previous research on the same topic relied on traditional econometric methods, emphasizing only time and individual effects and ignoring spatial effects. The analysis in this paper shows that spatial effects need to be taken into account in related research.
The article first conducts a spatial analysis of the main variables, namely carbon emissions (CO2), industrial structure, energy structure, and energy efficiency, based on ESDA. The global spatial autocorrelation test shows significant positive global spatial autocorrelation in carbon emissions, energy structure, and energy efficiency, and the global agglomeration characteristics evolve in a relatively stable way. The local spatial autocorrelation (LISA) test shows significant differences in the local agglomeration characteristics of carbon dioxide, energy structure, and energy efficiency, which further reveals regional differences and spatial autocorrelation. The relevant variables also show significant spatial spillover effects. In summary, exploratory spatial data analysis identifies significant spatial dependence and heterogeneity in the research object, providing a clear basis for the subsequent spatial econometric analysis.
In the spatial econometric analysis section, the paper estimates the effects of various factors on carbon emissions. It also focuses on the scale effect hypothesis of industrial structure and the Jevons paradox hypothesis of energy efficiency. The results show that, in the sample of this paper, the impact of industrial structure on carbon emissions is in line with the scale effect hypothesis, and the impact of energy efficiency on carbon emissions is in line with the Jevons paradox. The regression results also show that energy structure has a significant positive impact on carbon emissions. The institutional factor has a significantly negative impact on carbon emissions; government intervention also enters negatively, although its regression coefficient in Table 13.3 is not statistically significant. Energy prices have a slight positive impact on carbon emissions. These results are consistent with China's economic and social practices. At the same time, the article focuses on detecting the source of the spatial interaction effects of the factors affecting carbon emissions during the sample period.
The results and diagnostics of the SDM, SAR, and SEM regressions show that spatial interaction effects mainly manifest as error-term interaction effects and endogenous interaction effects. Based on the significance of the spatial regression coefficient, the paper further analyzes the direct and indirect (spillover) effects estimated by the SDM and SAR models. The results show that the spatial spillover effect of the industrial structure is obvious. The adjustment of the industrial structure of one region affects not only the carbon emissions of that region but also the carbon emissions of adjacent regions. Energy structure and energy efficiency have weakly significant spatial spillover effects, whereas energy prices, institutional factors, and government intervention do not have statistically significant spatial spillover effects.
Since the purpose of this paper is only to apply the ideas and methods of spatial analysis to classical energy, environment, and economic (3E) research, it does not offer much methodological innovation. A basic lesson from applying spatial econometrics in social science research is that the better the spatial weight matrix is understood, the more innovative the applications can be. Different weight matrices were not considered comprehensively in the spatial regression part. This will be looked into in future research.
Acknowledgements We sincerely thank Professor Xinyue Ye and other anonymous reviewers. We also thank Yamin Tang, a master's student at Henan University of Finance and Economics, who helped with the data. This manuscript is also a product of the National Natural Science Foundation of China (Grant No. 71173006 and 71473070), the Humanities and Social Sciences Youth Project of the Ministry of Education of China (Grant No. 18YJC790216), and the New-Century Training Program Foundation for Talents from the Ministry of Education of China (Grant No. NCET-2012–0691). We especially appreciate the support of the 3E Group of IRTSTHN and the Technology Project of the Headquarters of China's State Grid Co., Ltd (Contract No. SGFJJY00GHWT800059).
References
Abdul, R., Jin, Z., Jinkai, L., & Waqas, A. (2018). Structural changes, energy consumption and carbon emissions in China: Empirical evidence from ARDL bound testing model. Structural Change and Economic Dynamics, 47, 194–206. https://doi.org/10.1016/j.strueco.2018.08.010.
Anselin, L. (1995). Local indicators of spatial association–LISA. Geographical Analysis, 27, 93–115.
Anselin, L. (2005). Exploring spatial data with GeoDa: A workbook. Center for Spatially Integrated
Social Science.
Azad, A. K., Rasul, M. G., Khan, M. M. K., Sharma, S. C., & Bhuiya, M. M. K. (2015). Study on Australian energy policy, socio-economic, and environment issues. Journal of Renewable and Sustainable Energy. https://doi.org/10.1063/1.4938227.
Çetin, A. P. M., & Ecevit, A. P. E. (2015). Urbanization, energy consumption and CO2 emissions in sub-Saharan countries: A panel cointegration and causality analysis. Journal of Economics and Development Studies, 3, 66–76. https://doi.org/10.15640/jeds.v3n2a7.
Feng, C., & Wang, M. (2017). Analysis of energy efficiency and energy savings potential in China’s
provincial industrial sectors. Journal of Cleaner Production, 164, 1531–1541.
Goodchild, M. (2006). Geographical information science: Fifteen years later. In P. Fisher (Ed.),
Classics from IJGIS: Twenty years of the International Journal of Geographical Information
Science and Systems, Vol. 2 (pp. 107–133). Boca Raton:CRC Press.
Heraux. (2007). Spatial data analysis of crime. Social Science Computer Review,25(2), 259–264.
Janssens-Maenhout, G., Crippa, M., Guizzardi, D., Muntean, M., Schaaf, E., Olivier, J. G. J., Peters,
J. A. H. W., & Schure, K. M. (2017). Fossil CO2& GHG emissions of all world countries.
Islam, F., Shahbaz, M., & Butt, M. S. (2013). Is There an Environmental Kuznets Curve for
Bangladesh? Evidence from ARDL Bounds Testing Approach. Bangladesh Development Studies
XXXVI, 1–23.
LeSage, J. P., & Pace, R. K. (2010). Spatial Econometric Models. In: Fischer M., & Getis A. (eds)
Handbook of Applied Spatial Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-
3-642-03647-7_18.
Li, S., & Cheng, J. (2009). The characteristics of energy efficiency in china’s industrial sector and its
influencing factors—an empirical analysis based on nonparametric frontiers. Journal of Finance
and Economics, 2009(07), 134–143.
Li, J., Shen, B., Han, Y., & Zhang, M. (2012). Comparison of regional energy efficiency in China—
based on DEA-Malmquist and cluster analysis. Journal of Beijing Institute of Technology(Social
Sciences Edition),6, 1–6.
Li, J., Zhang, J., Gong, L., & Miao, P. (2015a). Research on the total factor productivity and
decomposition of chinese coastal marine resource—based on DEA-Malmquist Index[J]. Journal
of Coastal Research, 2015(73), 283–289.
Li, J., Zhang, J. Gong, L., & Miao, P. (2015). The spatial and temporal distribution of coal resource
and its utilization in China—based on space exploration analysis technique ESDA[J]. Energy &
Environment,26(6+7).
Li, J., Gong, L., Chen, Z., Zeng, L., & Zhang, J. (2016). The hierarchy and transition of China’s
urban energy efficiency. Energy Procedia, 2016, 110–117.
Li, J., Shen, B., Miao, P., Han, Y., & Zhang, J. (2013). China’s industrial energy intensity: Regional
differences and influencing factors. Journal of Applied Sciences, 13, 3604–3607. https://doi.org/
10.3923/jas.2013.3604.3607.
Mardani, A., Zavadskas, E., Streimikiene, D., Jusoh, A., & Khoshnoudi, M. (2017). A comprehen-
sive review of data envelopment analysis (DEA) approach in energy efficiency. Renewable and
Sustainable Energy Reviews,70, 1298–1322.
Ozturk, I., & Acaravci, A. (2010). The causal relationship between energy consumption and GDP
in Albania, Bulgaria, Hungary and Romania: Evidence from ARDL bound testing approach.
Applied Energy, 87, 1938–1943. https://doi.org/10.1016/j.apenergy.2009.10.010.
Rahman, M. M., Kashem, M. A., Bank, B., et al. (2017). Carbon emissions, energy consumption
and industrial growth in Bangladesh: empirical evidence from ARDL cointegration and granger
causality analysis. Energy Policy, 110, 600–608.
Rebolledo-Leiva, R., Angulo-Meza, L., Iriarte, A., & Gonzalez-Araya, M. C. (2017). Joint carbon
footprint assessment and data envelopment analysis for the reduction of greenhouse gas emissions
in agriculture production. Science of the Total Environment, 593, 36–46.
Shen, N. (2010). Regional investment in energy input, pollution emission and China’s energy
economic efficiency. Journal of Finance and Economics, 2010(01), 107–113.
198 J. Zhang et al.
Uddin, M. G. S., Bidisha, S. H., & Ozturk, I. (2016). Carbon emissions, energy consumption, and
economic growth relationship in Sri Lanka. Energy Sources, Part B: Economics, Planning and
Policy, 11, 282–287. https://doi.org/10.1080/15567249.2012.694577.
World Bank. 201(8). Doing Business 2018. Reforming to create jobs Economy profile, Zmabia, A
World Bank Group Flagship Report. https://doi.org/10.1596/978-1-4648-1146-3.
Ye, X., & Wu, L. (2011). Analyzing the dynamics of homicide patterns in Chicago: ESDA and
spatial panel approaches. Applied Geography, 31, 800–807. https://doi.org/10.1016/j.apgeog.
2010.08.006
Ye, X., & Rey, S. (2013). A framework for exploratory space-time analysis of economic data. The
Annals of Regional Science, 2013(50), 315–339. https://doi.org/10.1007/s00168-011-0470-4.
Zhang, Y., & Zhang, S. (2018). The impacts of GDP, trade structure, exchange rate and FDI inflows
on T China’s carbon emissions. Energy Policy, 120(2018), 347–353.
Chapter 14
Spatial Visualization and Analysis
of the Development of High-Paid
Enterprises in the Yangtze River Delta
RenZhou Gui, Tongjie Chen, and Zhiqiang Wu
14.1 Introduction
Big data is booming and brings considerable value. Large volumes of data make it
possible to analyze human behavior, urban form and lifestyle more thoroughly, and to
provide better services for smart city applications (Deren et al. 2015; Hashem et al.
2016; Pan et al. 2016). Spatial data visualization plays an important role in the
development of smart cities. It can break through the restrictions of data in time,
space and coverage, support an overall analysis and evaluation of the urban layout,
and reveal the relevance between regions and between each region and the whole. By
mining data on the current economic development of a city, it can also help us
understand the operation and layout of urban space, provide more accurate and more
intuitive statistics and analysis results, and predict the direction of urban
development and future models (Wu et al. 2016).
The effective use of big data can optimize a city's regional industrial layout, its
functional layout and the distribution of employment. Early work on urban functional
area identification could only distinguish urban areas roughly, dividing them into
residential areas, industrial and commercial areas, public institution areas and so
on. Such simple functional regions work well for cities with a single structure, a
short history and a simple design. However, rough regional division can hardly meet
actual demand in the face of problems common in Chinese cities, such as complex
functions of urban areas, a high degree of mixing, and even planning that comes after
construction. In particular, new flow-based urban planning requires a wide range of
data, which must be rapidly integrated and analyzed from a large number of dispersed
data sources, and thus places higher requirements on big data platforms (Table 14.1).
Table 14.1 Differences between traditional and smart city planning
| Traditional urban planning | Smart city planning based on big data |
| --- | --- |
| The data are not continuous, are limited by geographic location, and have low coverage | It breaks the restrictions of data in time, space and coverage and solves the problem of efficient storage and query of large amounts of data |
| It is impossible to judge the relationship between the whole and its parts, or even to grasp the rules of city life and see clearly the future development of the city | Based on big data, the overall analysis and evaluation of the urban layout are conducted to find the correlation between regions and between each region and the whole |
| The division of urban functional zones is too simple and relies on the conceptions of professional planners and the views of experts | It can predict the future model of the city, display the data results more quickly, accurately and intuitively, and rationally and scientifically determine the development direction and measures of the intelligent city in all respects |
R. Gui (✉) · T. Chen
The Department of Information and Communication Engineering, Tongji University, Shanghai
201804, China
e-mail: rzgui@tongji.edu.cn
T. Chen
e-mail: chentongjie@tongji.edu.cn
Z. Wu
The Department of Urban Planning, Tongji University, Shanghai, China
e-mail: prof.wus@gmail.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_14
At the same time, smart city research based on big data is an important means of
meeting current needs. Big spatiotemporal socioeconomic data facilitate the modelling
of individuals' economic behaviour in space and time, and the outcomes of such models
can reveal information about economic trends across spatial scales. In the Yangtze
River Delta in China in particular, a spatiotemporal analysis and preliminary
exploration of enterprises is necessary. As this field matures, we will gain a clearer
understanding of the future development and trends of companies in the Yangtze River
Delta, which will help its cities develop better. However, although spatiotemporal
data keep accumulating, the rich details of spatiotemporal dynamics in computational
modelling remain largely unexplored because of binding constraints on scientific
advancement, such as data-intensive computing and very large georeferenced dynamic
databases. Therefore, it is necessary to develop a powerful, high-performance platform
to meet these challenges.
A Geographic Information System (GIS) is an important spatial information system and
an effective way to realize visualization. Focusing on the urban heat island caused by
high-rise buildings obstructing the sea breeze in coastal areas, Yamamoto (2015)
presented an example of visualizing GIS analytics for open big data in environmental
science. Lv et al. (2016) proposed a smart city visualization application using Web
virtual reality (WebVR), the Internet of Things (IoT), and a three-dimensional (3-D)
geographical information system (3-DGIS) with a peer-to-peer (P2P) network. ArcGIS
Engine is a set of embeddable map component libraries introduced by ESRI for the
custom development of GIS applications. Its GIS functions can be used in C++, COM,
.NET and Java environments to construct specialized GIS application solutions. Ma
et al. (2013) proposed a novel quantitative risk analysis method for urban natural gas
pipeline networks based on ArcEngine and C# programming techniques. Guo and Liu (2011)
designed and implemented a forest multiple-resources management system based on the C#
development platform, ArcGIS Engine components, forest resources survey data of the
Maoershan Experimental Forestry Center, and GIS data, with the database using SQL.
Faced with massive, diversified and volatile data, traditional storage methods are
greatly limited. The requirements that big data place on storage, namely real-time
access, intelligence, security, reliability, integrity, low power consumption, high
concurrency and high efficiency, give distributed file systems an important role
(Zhang et al. 2013). NoSQL (Not Only SQL) (Han et al. 2011; Kang et al. 2015) refers
to non-relational databases, which break with the relational model of traditional
databases. Data are stored more freely, without relying on a fixed table structure or
on relationships between data, so data documents can be read and written quickly and
efficiently. Yunus et al. (2017) proposed a web-based RESTful architecture for a smart
city application using NoSQL JSON data, in which the Map-Reduce process is performed
using the MongoDB Aggregation framework.
In this chapter, we present a complete solution for the acquisition, organization,
storage, visualization and analysis of Internet big data. We first acquire the data
with a web crawler and store them in MongoDB, a NoSQL database. We then develop a
spatial visualization platform for smart cities using ArcGIS Engine and C#, which
shows the distribution characteristics of companies in time and space in the Yangtze
River Delta in China. Finally, we present an analysis of the visualization, which can
provide some reference for urban structure development.
14.2 Method
The solution has two main parts: data acquisition and data visualization. We first use
a web crawler written in Python to obtain company information from the Internet, and
store the data in MongoDB. We then achieve spatial data visualization using ArcGIS
Engine and C#. The overall design is shown in Fig. 14.1.
Fig. 14.1 The whole frame of platform
14.2.1 Data Acquisition
Web data acquisition is the process of using web information location and a web
crawler to collect the information and content of web pages, as shown in Fig. 14.2.
Before collecting, we need to analyze where the information is located in the page
source. Data preprocessing then reprocesses and supplements the content and format of
the data from the website so that they better meet actual needs.
It should be noted that data acquisition is an integral part of the entire solution.
Through this process, we gain a clearer understanding of the characteristics and
structure of big data on the Internet. However, because the location of the data
differs from one website to another, this chapter does not provide a unified interface
or a visual operation interface. Instead, by learning and using advanced tools, we can
obtain the Internet data we want more freely, quickly and efficiently.
In this section, we mainly use the Scrapy framework (Myers et al. 2015) to acquire
data in Python, together with XPath.
Fig. 14.2 Web data acquisition and processing
14.2.1.1 Scrapy
Scrapy is a fast, high-level web scraping framework written in Python that can extract
structured data from web pages. It is simple and handy to use, and provides several
crawler types, such as BaseSpider and the sitemap crawler. It is widely used in data
acquisition, data mining, monitoring and automated testing. In addition, Scrapy uses
Twisted, a non-blocking asynchronous networking library, to handle network traffic,
and can use the thread pool provided by Twisted (10 threads by default).
Scrapy has a clear architecture and offers a variety of middleware interfaces, which
can be implemented flexibly according to actual requirements. The “scrapy
startproject” command generates the project skeleton of a crawler. Finally, the data
are saved as JSON, a lightweight data-interchange format that is easy to parse and
generate.
14.2.1.2 XPath
XPath is the XML path language, used to locate parts of an XML document (XML being a
subset of the Standard Generalized Markup Language). It treats the document as a tree
and provides the ability to find nodes in that tree. A path expression in XPath
describes the sequence of steps from one XML node to another node or to a set of
nodes. The approach has a low barrier to entry and is easy to extend and maintain, and
it can be combined with multi-threading to improve the speed of the crawler. Its
weakness is that the web page must first be analyzed and its tags sorted out, which
makes it inefficient for simple web pages.
In practice, an XPath expression can be found by examining the relationships between
page tags. Figure 14.3 shows the source code of a very simple web page. To get the
title string, use “/html/head/title/text()”, which returns the string “Welcome to
Tongji”. The expression starts from the root node “html” and descends step by step to
“title”, and “text()” then picks up the target string. This method is very effective
in complex web pages.
Fig. 14.3 A simple web source code
<html>
<head>
<title>Welcome to Tongji</title>
</head>
<body>
<div id="Tongji">
<h1>This is my data1</h1>
<h1>This is my data2</h1>
</div>
</body>
</html>
Fig. 14.4 Crawler process of company data
Another way of writing XPath is to make a quick query based on the “features” of
each tag. For example, still in Fig. 14.3, the string “This is my data1” can be found
with “/html/body/div/h1[1]/text()” or “//div[@id="Tongji"]/h1[1]/text()”.
This method is very effective for finding data far from the root node in complex web
pages, and can retrieve data of the same type in batches. “text()” is only one way to
extract text. To extract an attribute, use @ followed by the attribute name. For
example, the link in <a href="www.Tongji.edu.cn"> can be extracted with “a/@href”.
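The two styles of path can be tried on the page of Fig. 14.3 without installing a crawler. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module rather than the Scrapy selectors used in this chapter. ElementTree supports only a subset of XPath, so paths are written relative to the root element, `.text` stands in for `text()`, and `.get()` stands in for `@href`. The `<a>` element is added to the page here for illustration and is not part of Fig. 14.3.

```python
import xml.etree.ElementTree as ET

# The page of Fig. 14.3, plus one <a> element added for illustration.
PAGE = """<html>
<head>
<title>Welcome to Tongji</title>
</head>
<body>
<div id="Tongji">
<h1>This is my data1</h1>
<h1>This is my data2</h1>
<a href="www.Tongji.edu.cn">Tongji</a>
</div>
</body>
</html>"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Step-by-step path from the root: html -> head -> title, then read the text.
title = root.find("./head/title").text

# Feature-based path: the div whose id is "Tongji", then its first h1 child.
first = root.find(".//div[@id='Tongji']/h1[1]").text

# Batch extraction: every h1 under that div.
all_h1 = [h.text for h in root.findall(".//div[@id='Tongji']/h1")]

# Attribute extraction: .get("href") plays the role of "@href".
link = root.find(".//a").get("href")

print(title, first, all_h1, link)
```

Real web pages are often not well-formed XML, which is one reason a crawler would normally rely on an HTML-aware selector instead of this module.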
14.2.1.3 MongoDB
With the rapid development of data computing technology, high-performance hardware
and software have made it possible to process large amounts of data efficiently. At
the same time, this has created opportunities for low-cost, high-performance storage
technology with fast queries. Traditional relational databases struggle to respond
quickly when serving large, highly concurrent dynamic websites.
To address the low speed, low efficiency and high requirements of traditional
relational databases, big data processing methods based on NoSQL databases have been
proposed, which break with the traditional relational database model. Data are stored
more freely, without relying on a fixed table structure or on relationships between
data, so data documents can be read, written and queried quickly and efficiently.
NoSQL databases are non-relational databases, which are divided into key-value,
column, document and graph databases. In this platform, we adopt the document-oriented
NoSQL database MongoDB as the data storage, which is used not only to store the
collected data but also to support the subsequent data visualization. MongoDB is based
on distributed file storage, has a very powerful query language, and can store complex
data types as documents in BSON (Binary JSON). MongoDB is lightweight, efficient,
portable, high-performing and convenient to use.
14.2.1.4 The Method of Crawling Data in This Chapter
As shown in Fig. 14.4, we collected registration information for all companies in the
Yangtze River Delta of China. The company types we classify include electronic
computers, real estate, high and new technology, Internet, finance, tourism, software
development and communication. The command that starts the crawler is shown in
Fig. 14.5.
The configuration code of settings.py is as follows:
BOT_NAME = 'myspider'  # The name of the crawler project
SPIDER_MODULES = ['myspider.spiders']
NEWSPIDER_MODULE = 'myspider.spiders'
# Randomize the wait between requests around DOWNLOAD_DELAY
RANDOMIZE_DOWNLOAD_DELAY = True
# Delay of 0.5 s
DOWNLOAD_DELAY = 0.5
# The crawl order is breadth first
SCHEDULER_ORDER = 'BFO'
# Configure browser information in the request header
USER_AGENT = ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36')
Next, the configuration code of items.py for the data class is as follows:
Fig. 14.5 Crawler command and the data display during acquisition. (a) Start crawler
command. (b) Real-time display of data acquisition
import scrapy


class MyspiderItem(scrapy.Item):
    Name = scrapy.Field()        # company name
    Type = scrapy.Field()        # company type
    SearchType = scrapy.Field()  # product type
    EcoScale = scrapy.Field()    # registered capital scale
    Produce = scrapy.Field()     # product
    Address = scrapy.Field()     # company address
    lat = scrapy.Field()         # company latitude
    lnt = scrapy.Field()         # company longitude
    Province = scrapy.Field()    # province
    FoundYear = scrapy.Field()   # registration time
The code that extracts the URLs of the company pages from the website is:
from scrapy.http import Request
from scrapy.selector import HtmlXPathSelector

# Method of the spider class (the method name is assumed here)
def parse(self, response):
    hxs = HtmlXPathSelector(response)
    # Extract the URLs with XPath
    findedUrlfield = hxs.select('//div[@class="itemblocks"]/h3/a/@href').extract()
    for item_url in findedUrlfield:
        # Request each page and pass its processing to the new callback function
        yield Request(url=item_url, callback=self.parse_item)
Part of the code that extracts and processes the information on a company's page:
hxs2 = HtmlXPathSelector(response)
name = hxs2.select('//div[@class="dn_more"]/table/tbody/tr[1]/td/text()')[0].extract()
type = hxs2.select('//div[@class="dn_more"]/table/tbody/tr[2]/td/text()')[0].extract()
ecoscale = hxs2.select('//div[@id="busDetail"]/table/tbody/tr[5]/td/text()')[0].extract()
produce = hxs2.select('//div[@class="dn_more"]/table/tbody/tr[4]/td/text()')[0].extract()
address = hxs2.select('//div[@class="dn_more"]/table/tbody/tr[5]/td/text()')[0].extract()
year = hxs2.select('//div[@id="busDetail"]/table/tbody/tr[8]/td/text()')[0].extract()
# Encode the address as GBK
address2 = address.encode('gbk')
# Call the longitude and latitude query function, with the target province
# as the approximate location
latlnt = map.getLocation(address2, 'XXProvince')
We then integrate the data, convert the addresses into precise coordinates through the
Baidu Map API, and save the results in JSON format. JSON is a lightweight
data-interchange format that is easy to parse and generate. After crawling, we store
all data from the JSON files in MongoDB.
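The last step, moving crawled JSON into a document store, can be sketched as follows. This is only an illustration under stated assumptions: the crawler output is assumed to hold one JSON object per line, the sample records and their values are hypothetical, and `FakeCollection` is a stand-in defined here so that the sketch runs without a database server. With MongoDB, the `collection` argument would be a pymongo collection, whose `insert_many` method is the only call the sketch relies on.

```python
import json


def load_records(lines):
    """Parse one JSON object per line, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def store_records(collection, records):
    """Insert the records; `collection` only needs an insert_many() method."""
    if records:  # pymongo rejects an empty document list
        collection.insert_many(records)
    return len(records)


class FakeCollection:
    """Stand-in for a pymongo collection so the sketch runs without a server."""

    def __init__(self):
        self.docs = []

    def insert_many(self, docs):
        self.docs.extend(docs)


# Hypothetical crawler output: one JSON object per line.
sample_lines = [
    '{"Name": "Example Co.", "SearchType": "software", '
    '"EcoScale": "500 units", "lat": 31.28, "lnt": 121.21, "FoundYear": "2012"}',
    "",
    '{"Name": "Sample Ltd.", "SearchType": "finance", '
    '"EcoScale": "1200 units", "lat": 30.27, "lnt": 120.15, "FoundYear": "2015"}',
]

collection = FakeCollection()
stored = store_records(collection, load_records(sample_lines))
print(stored, collection.docs[0]["Name"])
```

Because each record is already a JSON object, it maps directly onto one MongoDB document, with no table schema to define beforehand.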
14.2.2 Data Visualization
Visual analysis is an important method of big data analysis, and data visualization
is technically one of its more advanced techniques. It uses graphics, image
processing, computer vision and user interfaces to build visual representations of
data.
Generally, data visualization can be combined with data reporting, mapping, live
display or statistical processing of data. About 80% of the information humans obtain
from the outside world comes through the visual system. Data visualization therefore
enables better human-computer interaction.
In this section, data visualization is implemented through embedded development with
ArcGIS Engine in C#, connected to MongoDB.
14.2.2.1 ArcGIS
A Geographic Information System (GIS) is an important spatial information system
that, supported by computer hardware and software, can collect, store, manage,
compute, analyse, display and describe geographically distributed data for all or part
of the Earth's surface (including the atmosphere), as shown in Fig. 14.6. Its data
share a common geographical reference and can be displayed through coordinate
transformation. It can also output various kinds of geospatial information. The system
has a strong capacity for comprehensive spatial analysis and dynamic prediction, and
can generate high-level geographic information. For geographic research and geographic
decision making, it is a good human-computer interactive spatial decision support
system. It can therefore be applied to the development of smart cities to observe how
data change across time and space.
ArcGIS Engine is an embeddable development component for the custom development of
GIS applications. Through simple interfaces, it provides arbitrary combinations of GIS
functions in C++, COM, .NET and Java environments, allowing specialized GIS
application solutions to be built.
Fig. 14.6 The Yangtze River Delta region displayed using ArcGIS
14.2.2.2 The Method of Data Visualization
We first use the engine's axMapControl to load the map file into the system, and then
run a preliminary query for the required data in MongoDB. The main code for loading
the map file is as follows:
// Open the file dialog box
OpenFileDialog openFileDialog = new OpenFileDialog();
openFileDialog.Title = "Open the map document";
// Show only .mxd map documents
openFileDialog.Filter = "map documents(*.mxd)|*.mxd";
openFileDialog.ShowDialog();
string filePath = openFileDialog.FileName; // Get the map path
if (axMapControl.CheckMxFile(filePath))
{
    axMapControl.MousePointer = esriControlsMousePointer.esriPointerHourglass;
    // Import the map
    axMapControl.LoadMxFile(filePath, 0, Type.Missing);
    axMapControl.MousePointer = esriControlsMousePointer.esriPointerDefault;
}
else
{
    MessageBox.Show(filePath + " is not a valid map document");
}
Once the data are successfully imported into the visualization platform, they can be
displayed on the map control. However, because of the format of the data in the
database, we need to further extract and process the data in order to compare them
against filter criteria such as time, registered capital and company type. At the same
time, we apply a coordinate transformation to the filtered data, transforming the
longitude and latitude of each data position into geodetic coordinates, as shown in
Fig. 14.7. Finally, the data are displayed at the corresponding positions on the map
at different times.
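The filtering step described above can be illustrated with a short Python sketch (the platform itself is written in C#). The field names follow the item class defined earlier, but the value formats are assumptions: `FoundYear` is taken to be a four-digit year string and `EcoScale` a number followed by a space and a unit. The function names `parse_capital` and `meets_filter` are hypothetical and are not the authors' implementation.

```python
def parse_capital(eco_scale):
    """Return the leading integer of a string such as '500 units'."""
    return int(eco_scale.split(" ", 1)[0])


def meets_filter(record, year, types=None, capital_range=None):
    """Return True if the record passes the time, type and capital filters.

    A filter set to None is not applied.
    """
    if int(record["FoundYear"]) > year:
        return False  # not yet registered at the selected time
    if types is not None and record["SearchType"] not in types:
        return False
    if capital_range is not None:
        low, high = capital_range
        if not low <= parse_capital(record["EcoScale"]) <= high:
            return False
    return True


# Hypothetical records for illustration.
records = [
    {"FoundYear": "2010", "SearchType": "software", "EcoScale": "500 units"},
    {"FoundYear": "2016", "SearchType": "software", "EcoScale": "800 units"},
    {"FoundYear": "2008", "SearchType": "finance", "EcoScale": "3000 units"},
]
shown = [
    r for r in records
    if meets_filter(r, 2015, types={"software", "finance"}, capital_range=(0, 1000))
]
print(shown)
```

Moving the selected year forward along the time axis and re-running the filter yields the set of companies to draw at each step of the dynamic display.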
The code to convert latitude and longitude to geodetic coordinates is as follows:
Fig. 14.7 Data processing before visualization
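// Signature inferred from the call GetProject(axMapControl.ActiveView, lnt, lat)
// used in the density code; the parameter types are assumed
private IPoint GetProject(IActiveView pActiveView, double x, double y)
{
    IMap pMap = pActiveView.FocusMap;
    IPoint pt = new PointClass();
    ISpatialReferenceFactory pfactory = new SpatialReferenceEnvironmentClass();
    ISpatialReference flatref = pMap.SpatialReference;
    // Adopt the Beijing 1954 geodetic coordinate system
    ISpatialReference earthref = pfactory.CreateGeographicCoordinateSystem(
        (int)esriSRGeoCSType.esriSRGeoCS_Beijing1954);
    pt.PutCoords(x, y);
    IGeometry geo = (IGeometry)pt;
    geo.SpatialReference = earthref;
    // Project the point into the spatial reference of the map
    geo.Project(flatref);
    return pt; // Returns the converted coordinate point
}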
After the map is loaded, the coordinate points need to be displayed on it, and the
displayed points still need to be styled. In this chapter, a red point with a black
outline is chosen for clear display, and each point is added to the map as an element.
Note that the map should be refreshed only after all point elements have been added.
Refreshing the map after every addition would slow the display considerably when there
are many data points. Part of the code is as follows:
IPoint point = new ESRI.ArcGIS.Geometry.Point();
IMap map = axMapControl.Map;
activeView = map as IActiveView;
// Load the coordinates
point.PutCoords(pointX, pointY);
// Set the shape and color style of the coordinate point
ISimpleMarkerSymbol simpleMark = new SimpleMarkerSymbol();
simpleMark.Size = 3;
simpleMark.Color = getRGB(255, 0, 0);
simpleMark.Color.Transparency = 150;
simpleMark.Style = esriSimpleMarkerStyle.esriSMSCircle;
simpleMark.Outline = true;
simpleMark.OutlineColor = getRGB(0, 0, 0);
simpleMark.OutlineColor.Transparency = 80;
simpleMark.OutlineSize = 1;
graphicContainer = map as IGraphicsContainer;
IMarkerElement markElement = new ESRI.ArcGIS.Carto.MarkerElement() as IMarkerElement;
markElement.Symbol = simpleMark;
IElement element = markElement as IElement;
element.Geometry = point;
// Add the point to the map container
graphicContainer.AddElement(element, 0);
// Refresh the map once, after all points have been added
activeView.PartialRefresh(esriViewDrawPhase.esriViewGraphics, null, null);
In addition, we developed functions for regional statistics and quantitative ranking.
When a rectangular area is selected with the mouse, the statistics area on the right
of the software shows the number of companies, the company density, the total
registered capital, and a ranking of company types by number.
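The regional statistics described above can be sketched in Python as follows. This is an illustration rather than the platform's C# code: the function name `region_statistics`, the sample records and the `EcoScale` format (a number followed by a space and a unit) are assumptions. For simplicity the rectangle is given in longitude and latitude, so the density here is per square degree; the platform would use the projected map coordinates.

```python
from collections import Counter


def region_statistics(records, x_min, y_min, x_max, y_max):
    """Summarize the records whose (lnt, lat) lies inside the rectangle."""
    inside = [
        r for r in records
        if x_min <= r["lnt"] <= x_max and y_min <= r["lat"] <= y_max
    ]
    area = (x_max - x_min) * (y_max - y_min)  # in squared map units
    capital = sum(int(r["EcoScale"].split(" ", 1)[0]) for r in inside)
    by_type = Counter(r["SearchType"] for r in inside)
    return {
        "count": len(inside),
        "density": len(inside) / area if area > 0 else 0.0,
        "total_capital": capital,
        "ranking": by_type.most_common(),  # most frequent type first
    }


# Hypothetical records for illustration.
records = [
    {"lnt": 121.4, "lat": 31.2, "SearchType": "software", "EcoScale": "500 units"},
    {"lnt": 121.5, "lat": 31.3, "SearchType": "software", "EcoScale": "300 units"},
    {"lnt": 121.6, "lat": 31.1, "SearchType": "finance", "EcoScale": "900 units"},
    {"lnt": 120.2, "lat": 30.3, "SearchType": "tourism", "EcoScale": "100 units"},
]
stats = region_statistics(records, 121.0, 31.0, 122.0, 32.0)
print(stats)
```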
Displaying the data as points clearly shows the spatial distribution of companies, but
it does not show the density of an area well. When the points in an area are too
concentrated, they overlap, and we can no longer read the density of the area
(Fig. 14.8).
A density map, also known as a heat map, shows the density of a statistic in each
region and helps people understand the distribution characteristics. Moreover, it
solves the overlap problem. In this chapter, the map area is divided into many small
cells. In each cell, we count the companies or sum their registered capital, and then
divide by the area of the cell to obtain the corresponding density. After calculating
the density of every cell and the corresponding color range, we display them on the
map. The code for calculating density is as follows:
Fig. 14.8 Spatial visualization system
// Query all the results
foreach (Record record in result)
{
    // Check whether the longitude and latitude of each result are within the
    // current display area and whether the record meets the time, capital scale
    // and company type requirements. tempPoint1 and tempPoint2 are the lower-left
    // and upper-right corners of the current area, respectively.
    if (!(record.lnt < tempPoint1.X || record.lat < tempPoint1.Y ||
          record.lnt > tempPoint2.X || record.lat > tempPoint2.Y) &&
        ifMeetDrawMaprequirement(record.FoundYear, record.EcoScale,
                                 record.SearchType, cYear, cMonth))
    {
        // Longitude and latitude conversion
        tempPoint3 = GetProject(axMapControl.ActiveView, record.lnt, record.lat);
        // Get the position in the matrix
        MatrixX = (int)((tempPoint3.X - XMin) / DensityXUnit);
        MatrixY = (int)((tempPoint3.Y - YMin) / DensityYUnit);
        // Extract the number from the capital scale
        int tempindex = record.EcoScale.IndexOf(" ");
        int tempscale = Convert.ToInt32(record.EcoScale.Substring(0, tempindex));
        DensityMatrix[MatrixX, MatrixY] += 1;
        DensityScaleMatrix[MatrixX, MatrixY] += tempscale;
    }
}
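The listing above accumulates counts and capital per cell, while the division by cell area is described only in the text. Both steps together can be sketched in Python as follows. The function name `density_grid` and the sample points are hypothetical, and the clamping of points that lie exactly on the upper edge is a design choice of this sketch, not something stated in the chapter.

```python
def density_grid(points, x_min, y_min, x_max, y_max, nx, ny):
    """Count points per cell of an nx-by-ny grid and divide by the cell area."""
    cell_w = (x_max - x_min) / nx
    cell_h = (y_max - y_min) / ny
    counts = [[0] * ny for _ in range(nx)]  # indexed [column][row], like the matrix above
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # outside the displayed area
        # Clamp so that points on the upper edge fall into the last cell.
        i = min(int((x - x_min) / cell_w), nx - 1)
        j = min(int((y - y_min) / cell_h), ny - 1)
        counts[i][j] += 1
    cell_area = cell_w * cell_h
    return [[c / cell_area for c in column] for column in counts]


# Hypothetical projected coordinates; the last point lies outside the area.
points = [(0.5, 0.5), (0.6, 0.4), (1.5, 1.5), (2.0, 2.0), (3.0, 1.0)]
grid = density_grid(points, 0.0, 0.0, 2.0, 2.0, 2, 2)
print(grid)
```

Summing the registered capital instead of counting works the same way, with the increment of 1 replaced by the capital of each record.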
After the density of each cell of the matrix has been calculated, a “color template”
can be defined that runs from dark green for the lowest density, through light yellow,
to dark red for the highest density. Each density range is assigned a color, and the
colored squares are drawn on the screen in turn to form a density map.
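One possible form of such a color template is sketched below in Python. The exact RGB values of the three stops, the use of equal-width density ranges and the default of five classes are all assumptions made for illustration; the chapter does not specify them.

```python
# Colour stops as (R, G, B): dark green, light yellow, dark red (shades assumed).
STOPS = [(0, 100, 0), (255, 255, 153), (139, 0, 0)]


def ramp(t):
    """Colour at position t in [0, 1] along STOPS, by linear interpolation."""
    pos = t * (len(STOPS) - 1)
    i = min(int(pos), len(STOPS) - 2)
    frac = pos - i
    a, b = STOPS[i], STOPS[i + 1]
    return tuple(round(a[k] + (b[k] - a[k]) * frac) for k in range(3))


def density_color(value, max_value, classes=5):
    """Assign value to one of `classes` equal-width ranges and return its colour.

    `classes` must be at least 2.
    """
    if max_value <= 0 or value <= 0:
        return STOPS[0]
    k = min(int(value / max_value * classes), classes - 1)
    return ramp(k / (classes - 1))


print(density_color(0, 10), density_color(5, 10), density_color(10, 10))
```

Every cell whose density falls in the same range receives the same color, which matches the idea of one color per density range.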
14.3 Realization
With the platform built on ArcGIS Engine and C# and the data obtained from the
Internet, we obtain the results shown in Fig. 14.9. In the UI design, the large
central area displays the map control and the data visualization. The left and bottom
hold the filter controls for the display conditions, including the company type, the
company's registered capital, and the time axis. The right side holds the map control
area and the visual function area, which operate on the central display control and
provide density display and statistical processing. In addition to the time axis, the
geodetic coordinates and the longitude and latitude of the mouse position are shown at
the bottom. The dynamic display shows the distribution of data points on the map
control as it changes along the time axis.
We chose electronics companies and all companies in the Yangtze River Delta as
examples to demonstrate the visualization. Electronics companies include computer,
high technology, Internet, software and communications companies. All ranges of
registered capital are included.
Fig. 14.9 UI interface of the platform
We can control the display area of the map and use the distribution of points to
observe the data from a spatial perspective. As Fig. 14.10a shows, Shanghai's
electronics companies are mainly located in the center of Shanghai, forming a slanted
rectangular distribution that radiates outward like a shining sun. Company development
in this area is very prosperous. In Fig. 14.10b, most companies in Zhejiang province
are distributed in coastal areas in a semicircular pattern. For example, companies in
Hangzhou, Ningbo, Shaoxing, Jiaxing, Wenzhou, Jinhua and other cities are relatively
dense, especially around Hangzhou, and spread outward from these cities, while inland
areas are relatively sparse. In Fig. 14.10c, most companies in Jiangsu province are
located close to the Yangtze River, all the way to Shanghai, in cities such as
Nanjing, Yangzhou, Changzhou, Wuxi and Suzhou. Fig. 14.10d shows that the overall
distribution in Anhui province is relatively sparse at present, resembling several
separate stars. The denser places are mostly Hefei and several areas along the Yangtze
River, and they do not spread outward well.
Figure 14.11 shows that across the whole Yangtze River Delta, for both electronics
companies and all types of companies, companies are mostly distributed in the coastal
areas and along the Yangtze River, especially in Shanghai. The distribution can be
divided into a small-triangle and a big-triangle structure. The small triangle in the
Shanghai area drives the development of the whole triangle. Conversely, companies across
the whole Yangtze River Delta continuously gather toward the small triangle, forming
company communities that better promote regional interaction within the Yangtze River
Delta. There is also a trend of further outward spread. It can be said that Shanghai is
the development center of the Yangtze River Delta (Cheng and LeGates 2018; Lin et al. 2015).
Fig. 14.10 Characteristics of electronics company distribution of each province in 2015:
a Shanghai, b Zhejiang, c Jiangsu, d Anhui
By selecting the column function on the right side of the UI, the number of companies
in different rectangular areas can be depicted, as shown in Fig. 14.12. The
number marked on each blue bar is the number of companies in the selected rectangular
area.
The spatial distribution of companies not only shows the characteristics
of regional economic development, but can also be combined with other fields. For
example, we can observe how companies interact with transportation development.
As shown in Fig. 14.13, the distribution of all companies in Shanghai changed
in different time periods before and after the completion of Shanghai metro lines 1,
4 and 11. It can be clearly seen that before the completion of each metro
line, there were already dense company distributions near the line,
which indicates that the development of the regional economy influences the
construction of subway lines. After the completion of each subway line, the number
of companies near the line continued to grow rapidly and the pattern remained
roughly unchanged, which indicates that the subway construction also had a positive
impact on the distribution of companies. Transportation and economic distribution
complement and reinforce each other.
Fig. 14.11 Company distribution in the Yangtze River Delta region in 2015. a Electronics
companies. b All types of companies
Fig. 14.12 The number distribution of electronics companies in some cities in the Yangtze
River Delta in 2015
Fig. 14.13 The interplay between company distribution and metro distribution in Shanghai (the
company data are all companies in Shanghai), taking metro line 1 (built in 1993), metro line 4
(built in 2005) and metro line 11 (built in 2009) as examples. a Distribution map of metro line 1
in Shanghai. b Company registration distribution map of Shanghai in 1990–1993. c Company
registration distribution map of Shanghai in 1994–1996. d Company registration distribution map
of Shanghai in 1994–1996. e Distribution map of metro line 4 in Shanghai. f Company registration
distribution map of Shanghai during 2000–2005. g Company registration distribution map of
Shanghai during 2006–2010. h Company registration distribution map of Shanghai during
2011–2015. i Distribution map of metro line 11 in Shanghai. j Company registration distribution
map of Shanghai during 2004–2009. k Company registration distribution map of Shanghai during
2010–2014. l Company registration distribution map of Shanghai during 2015–2018
As the leading city in the Yangtze River Delta, what other economic characteristics
of Shanghai itself can be observed? Figure 14.14 offers a view over time.
Comparing the distributions at different times shows that
the electronics companies remain mainly concentrated in the center and become
more and more dense from 2000 to 2015.
Figure 14.15 shows that electronics companies in the center of Shanghai
are very densely distributed: they account for 49.91% of the Shanghai total
in the statistics shown in the figure (4624 of 9265). Among all types of companies,
communications companies are the most numerous, with computer and software development
companies second.
Fig. 14.14 Data space point display in different years (the data are electronics companies in
Shanghai): a 2000. b 2005. c 2010. d 2015
Fig. 14.15 Quantitative statistics: a 4624 in the center. b 9265 in the whole of Shanghai
The density maps are shown in Fig. 14.16. Figure 14.16a, c show the density maps
of company quantity: within a unit area, the larger the number of companies, the redder
the color, and the smaller the number, the closer the color is to green. Figure 14.16b,
d show the registered capital density. These four figures show that the closer to
Shanghai, the denser the companies, and that the center of Shanghai is the most
economically developed area. The results
show that the distribution center of companies in the Yangtze River Delta is still
Shanghai, which can lead the development of the whole Yangtze River Delta and
the coordinated development of industry. The color ranges of the density classes are
shown in Table 14.2.
14.4 Conclusion
With the arrival of the era of big data, the requirements for data collection,
storage and visual analysis are getting higher and higher. The rise of the smart city
cannot be separated from the development of big data. This chapter aims to provide
spatiotemporal visualization and analysis of enterprises in the Yangtze River Delta
region, and to provide a data and image base for the development of the Yangtze
River Delta region. Through data and visual analysis, some practical conclusions can
be drawn.
Fig. 14.16 Data density map display in 2015: a Quantitative density in the Yangtze River Delta
region. b Registered capital density in the Yangtze River Delta region. c Quantitative density in
Shanghai. d Registered capital density in Shanghai
Through the development of a big data acquisition and visualization platform, this
chapter realized effective data acquisition, storage and visual analysis. The main
work is summarized as follows:
(1) A web crawler written in Python is used to obtain information from the network.
Within the XPath method, an algorithm based on tree-node text extraction is combined
with effective countermeasures against anti-crawler mechanisms and parallel
acquisition with multiple threads. This method is efficient and flexible, and gives
access to comprehensive data.
(2) For data management, the NoSQL-based MongoDB database is used
to solve the traditional storage bottleneck problem. The collected information
is sorted and stored in the database, and efficient data query is achieved.
(3) A visual interface is implemented through ArcGIS development in C#. The mined
data are further analyzed from the perspective of time and space to obtain the details
of the statistics, distribution and changes of the data, and to explore and analyze the
development changes.
Table 14.2 Density map color range of quantity
RGB color        Quantity density range    Registered capital density range    Remarks
                 (unit: companies/km²)     (unit: RMB thousand/km²)
(0, 150, 20)     0–0.02                    0–2                                 Aqua
(0, 180, 10)     0.02–0.1                  2–300
(0, 210, 0)      0.1–0.2                   300–600
(0, 240, 0)      0.2–0.3                   600–900
(30, 250, 0)     0.3–0.4                   900–1200
(60, 250, 0)     0.4–0.5                   1200–1500
(90, 255, 0)     0.5–0.6                   1500–1800                           Yellow green
(120, 255, 0)    0.6–0.7                   1800–2100
(150, 250, 0)    0.7–0.8                   2100–2400
(200, 250, 0)    0.8–0.9                   2400–2700
(250, 250, 0)    0.9–1                     2700–3000                           Yellow
(250, 200, 0)    1–1.1                     3000–3300
(250, 150, 0)    1.1–1.2                   3300–3600
(250, 100, 0)    1.2–1.3                   3600–3900
(250, 50, 0)     1.3–1.4                   3900–4200
(255, 0, 0)      More than 1.4             More than 4200                      Red
The combination of big data and smart city development will attract growing attention,
but the development of big data is still in its infancy, and people have more and
more demands for the visualization and analysis of big data. Especially in urban
development, by analyzing and mining the spatio-temporal view of data between
regions, we can find more useful value and overcome the drawbacks of
traditional urban planning. This method can not only help the development
of the Yangtze River Delta region, but can also be extended to the Pearl River Delta and
other cities, providing better tools and means for national development and
construction. It also lays the foundation for future functions and development.
References
Cheng, Y., & LeGates, R. (2018). China’s hybrid global city region pathway: Evidence from the
Yangtze River Delta. Cities, 77, 81–91.
Deren, L., Yuan, Y., & Zhenfeng, S. (2015). Big data in smart city. Geomatics and Information
Science of Wuhan University, 58(10), 1–12.
Guo, X., & Liu, Z. G. (2011). Researching and implementing of multiple resources management
information system of Maoershan experimental forestry based on ArcGIS engine. Advanced
Materials Research, 268–270, 1360–1366.
Han, J., E, H., Le, G., & Du, J. (2011). Survey on NoSQL database. 2011 6th International Conference
on Pervasive Computing and Applications, Port Elizabeth, 363–366.
Hashem, I. A. T., Chang, V., Anuar, N. B., Adewole, K., Yaqoob, I., Gani, A., et al. (2016). The role
of big data in smart city. International Journal of Information Management, 36(5), 748–758.
Kang, Y. S., Park, I. H., Rhee, J., & Lee, Y. H. (2015). MongoDB-based repository design for
IoT-generated RFID/sensor big data. IEEE Sensors Journal, 1.
Lin, X., Quan, H., Zhang, H., & Huang, Y. (2015). The 5I model of smart city: A case of Shanghai,
China. IEEE First International Conference on Big Data Computing Service and Applications,
2015, 329–332.
Lv, Z., Yin, T., Zhang, X., Song, H., & Chen, G. (2016). Virtual reality smart city based on
WebVRGIS. IEEE Internet of Things Journal, 3, 1015–1024.
Ma, L., Cheng, L., & Li, M. (2013). Quantitative risk analysis of urban natural gas pipeline networks
using geographical information systems. Journal of Loss Prevention in the Process Industries,
26(6), 1183–1192.
Myers, D., & McGuffee, J. W. (2015). Choosing Scrapy. Journal of Computing Sciences in Colleges,
31(1), 83–89.
Pan, Y., Tian, Y., Liu, X., Gu, D., & Hua, G. (2016). Urban big data and the development of city
intelligence. Engineering, 2(2), 171–178, 185–192.
Wu, Z., & Ye, Z. (2016). Research on urban spatial structure based on Baidu map thermal map: A
case study of central urban area of Shanghai. Urban Planning, 40, 33–40.
Yamamoto, K. (2015). Visualization of GIS analytic for open big data in environmental science.
International Conference on Cloud Computing and Big Data (CCBD), 2015, 201–208.
Yunus, S., Sinem, G. S., & Mehmet, K. (2017). Big data and RESTful based web API for smart health
application in smart cities. International Conference on Advanced Technology & Sciences.
Zhang, X., & Xu, F. (2013). Survey of research on big data storage. 2013 12th International
Symposium on Distributed Computing and Applications to Business, Engineering & Science,
Kingston upon Thames, Surrey, UK, 76–80.
Chapter 15
High Performance Spatiotemporal Visual
Analytics Technologies and Its
Applications in Big Socioeconomic Data
Analysis
Zhipeng Gui, Yuan Wang, Fa Li, Siyu Tian, Dehua Peng, and Zousen Cui
15.1 Introduction
Spatial computing methods are crucial to advancing social science and humanities
research. Nowadays, with the development of Volunteered Geographic Information
(VGI) and the Internet of Things (IoT), spatial social science and humanities research
has shifted from a data-scarce to a data-rich environment. The boom in big
spatiotemporal socioeconomic data facilitates the modelling of macro-level
as well as individual socioeconomic behavior in space and time. Such models
can quantitatively analyze socioeconomic activities using full samples,
and also create new opportunities for revealing socioeconomic trends across spatial
scales. To study such spatiotemporal phenomena comparatively, a powerful visual
analytical framework for effectively identifying interesting events and discovering
hidden patterns, anomalies, and relations from datasets is fundamental. However,
building such a visual analytics framework faces tough technical challenges in the era
of big data. As space-time data accumulate, the rich details of spatiotemporal
dynamics remain largely unexplored because of binding constraints on data management,
computing and visualization. The heterogeneity and streaming nature of multi-source
data, extremely large data volumes and intensive computation impede the efficiency
and computability of visual analytics. Therefore, a novel analytical framework
with a flexible system architecture for integrating the latest computing technologies is
fundamental and highly desired.
Z. Gui (✉) · S. Tian · D. Peng
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079,
China
e-mail: zhipeng.gui@whu.edu.cn
S. Tian
e-mail: wilhelm_tian@whu.edu.cn
D. Peng
e-mail: pengdh@whu.edu.cn
Y. Wang · F. Li · Z. Cui
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
(LIESMARS), Wuhan University, Wuhan 430079, China
e-mail: yuan.wang@whu.edu.cn
F. Li
e-mail: lifa@whu.edu.cn
Z. Cui
e-mail: zousen_cui@whu.edu.cn
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_15
In this chapter, we introduce a multi-tier computing framework for supporting
web-based visual analytics of big socioeconomic data. In this framework, the
latest enabling technologies that cover the major steps of the data analysis workflow
are considered throughout as a full-stack solution, including storage, pre-processing,
computing, transmission and visualization. The architecture of the
proposed framework is illustrated in Fig. 15.1. It is composed of a storage layer, a
computing layer and a web visualization layer for big heterogeneous data management,
high-performance-supported data analysis and interactive visualization, respectively.
Fig. 15.1 Generic computing framework for web-based high-performance big data visual analytics
(diagram labels: Storage Layer with SQL databases, NoSQL databases, data cubes, distributed
file system and spatial indexes; Computing Layer with HPC frameworks and technologies such as
CUDA, Hadoop/Spark/Flink and MPI/OpenMP; communication components and Web API; Web
Visualization Layer with web frameworks such as React, Django, Vue.js and Angular, and
visualization libraries such as ECharts, D3.js and Kepler.gl)
Data storage is critical for the subsequent data analysis and visualization operations.
Cutting-edge data storage technologies, such as NoSQL databases, spatial or full-text
indexes and data cubes, provide approaches to handle data heterogeneity and improve
the I/O performance of data query and access. High performance computing (HPC)
technologies in the computing layer, such as Apache Spark and CUDA, enable
data transformation, processing and analysis in a streaming and real-time fashion,
making interactive analysis possible. HPC accelerates computing by decomposing
data and scheduling computing tasks in parallel or in distributed computing environments.
Using state-of-the-art HPC frameworks, advanced analysis and computing
functions, such as spatial autocorrelation and machine learning algorithms, can be
easily implemented or integrated. The communication components and Web APIs
between the client side (visualization layer) and the server side (computing and storage
layers) handle data transmission and communication optimization. Web frameworks
and visualization libraries are essential for building the web client; they provide the
capabilities and flexibility to support web Graphical User Interface (GUI) design and
rich visualization effects.
The rest of the chapter is organized as follows: Sect. 15.2 describes spatial index
and storage mechanisms for efficient spatial data access. Section 15.3 presents HPC
methods and frameworks for accelerating spatial processing. Section 15.4 introduces
web-based visualization technologies that provide interactive visual spatial analytics
functions. Section 15.5 demonstrates an HPC-supported visual analytics application
by using big enterprise registration data as an example. Section 15.6 concludes the
chapter with discussions.
15.2 Spatial Index and Storage Mechanisms
With the advances in the Internet of Things, the amount of socioeconomic and social
media data increases explosively on a daily basis. In contrast to traditional spatial
data, these data have larger volume, higher complexity, and heterogeneity in data
modes and relations. These innate characteristics of big data challenge traditional data
storage methods, thus limiting the capacity for efficient analysis and visualization.
We introduce the fundamental technologies and state-of-the-art databases for spatial
big data storage that tackle such problems.
15.2.1 Spatial Indexing
Due to the complexity of spatial operations like spatial queries, spatial data need
a sophisticated index mechanism for accurate retrieval and efficient processing. A
spatial index is a data structure arranged in a certain order according to the spatial
distribution of the data. Based on the construction principle, spatial indexes can be
categorized into space-driven and data-driven structures. Space-driven structures are
based on partitioning the space into rectangular cells, independently of the distribution
of the spatial objects, while data-driven structures are organized by partitioning the
spatial objects, and so adapt to the objects’ distribution (Rigaux et al. 2002).
15.2.1.1 Space-Driven Structures
Space-driven structures partition the embedding space into cells and map Minimum
Bounding Rectangles (MBRs) to the cells according to spatial relationships, e.g., overlap
or intersection. Based on the mechanism used to divide space, space-driven
structures include grid indexing, quadtree and geohash techniques, as shown in
Fig. 15.2.
Grid indexes divide space into an array of cells; intersecting or overlapping spatial
objects are associated with each cell. A geohash is a geocoding system that divides
space into longitude-latitude rectangles, each encoded as a binary string. A quadtree
is a tree-like structure in which each node represents a bounding box
covering a certain part of the space. The concepts, partitioning and encoding methods
of space-driven structures are straightforward, and they have become building
blocks of GIS data structures. Databases like IBM DB2, Microsoft SQL Server and
ESRI geodatabases adopt this indexing method.
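Two of the space-driven structures described above, the grid index and the geohash, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the dictionary-based cell store and the longitude-first bit order are assumptions made for the example, not the layout used by any particular database.

```python
from collections import defaultdict
from math import floor


def grid_cells(mbr, origin, cell_size):
    """Return the (col, row) keys of every grid cell that an MBR overlaps."""
    min_x, min_y, max_x, max_y = mbr
    ox, oy = origin
    c0, c1 = floor((min_x - ox) / cell_size), floor((max_x - ox) / cell_size)
    r0, r1 = floor((min_y - oy) / cell_size), floor((max_y - oy) / cell_size)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]


def build_grid_index(objects, origin, cell_size):
    """objects: dict of id -> MBR. Each cell lists the objects overlapping it."""
    index = defaultdict(list)
    for obj_id, mbr in objects.items():
        for cell in grid_cells(mbr, origin, cell_size):
            index[cell].append(obj_id)
    return index


def geohash_bits(lon, lat, n_bits):
    """Encode a point as a binary string by alternately bisecting the
    longitude and latitude ranges (longitude first)."""
    lon_rng, lat_rng = [-180.0, 180.0], [-90.0, 90.0]
    bits = []
    for i in range(n_bits):
        rng, value = (lon_rng, lon) if i % 2 == 0 else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append("1")
            rng[0] = mid  # keep the upper half
        else:
            bits.append("0")
            rng[1] = mid  # keep the lower half
    return "".join(bits)


# Hypothetical objects: "a" spans two cells, "b" lies in one
index = build_grid_index({"a": (0.5, 0.5, 1.5, 0.7), "b": (1.2, 0.1, 1.4, 0.3)},
                         origin=(0.0, 0.0), cell_size=1.0)
print(sorted(index[(1, 0)]))
print(geohash_bits(121.47, 31.23, 6))  # a point in Shanghai
```

Points that share a long common prefix of bits fall in the same rectangle, which is why prefix matching on the encoded string can serve as a coarse proximity filter.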
Fig. 15.2 Examples of Grid, Geohash and Quad-Tree Index: (a) Grid Index, (b) Geohash Index,
(c) Quad-Tree Index
Fig. 15.3 An example of R-tree Index
15.2.1.2 Data-Driven Structures
Data-driven structures rely on the spatial containment relationship rather than on the
order of the index. These structures, such as the R-tree, adapt themselves to the MBRs
of the spatial objects (Zhang et al. 2017). An R-tree consists of a hierarchical index on
the MBRs of the geometries, as shown in Fig. 15.3. This hierarchical structure is based on
heuristic optimization of the area of the MBRs in each node in order to improve
access efficiency (Theodoridis et al. 2000). Advanced data-driven indexing structures
have been developed and widely adopted in GIS tools and spatial databases for efficient
data query, as discussed in the following section.
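The hierarchy of MBRs can be illustrated with a small static tree. The sketch below is a simplification and an assumption of this example: it bulk-loads ("packs") the tree by sorting entries on their lower-left corner, instead of the dynamic insertion with heuristic area minimization that a full R-tree performs. The search, however, follows the same idea of pruning every subtree whose MBR does not intersect the query window.

```python
def union(boxes):
    """MBR that encloses all given boxes (min_x, min_y, max_x, max_y)."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build(entries, capacity=3):
    """Pack (mbr, payload) entries bottom-up into nodes of at most
    `capacity` children. A node is a tuple (mbr, children, is_leaf)."""
    if not entries:
        raise ValueError("cannot build a tree from an empty entry list")
    level = sorted(entries, key=lambda e: (e[0][0], e[0][1]))
    is_leaf = True
    while True:
        nodes = []
        for i in range(0, len(level), capacity):
            group = level[i:i + capacity]
            nodes.append((union([g[0] for g in group]), group, is_leaf))
        if len(nodes) == 1:
            return nodes[0]
        level, is_leaf = nodes, False


def search(node, query):
    """Return payloads of all entries whose MBR intersects the query box."""
    mbr, children, is_leaf = node
    if not intersects(mbr, query):
        return []  # prune the whole subtree
    if is_leaf:
        return [payload for box, payload in children if intersects(box, query)]
    hits = []
    for child in children:
        hits.extend(search(child, query))
    return hits


# Twelve hypothetical objects m1..m12, as in an R-tree with leaf capacity 3
entries = [((i, i % 3, i + 0.5, i % 3 + 0.5), f"m{i + 1}") for i in range(12)]
root = build(entries, capacity=3)
print(sorted(search(root, (2.2, 0.0, 5.1, 3.0))))
```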
15.2.1.3 Advanced Spatial Indexes
In recent decades, advanced spatial indexes have been developed, including R-tree
variants (Balasubramanian and Sugumaran 2013) and dynamic indexes (Kamel et al.
2017). R-tree variants, as shown in Fig. 15.4, improve the original R-tree by changing
the indexing process, by combining it with other methods, or by enhancing it with
extensions. Process changes in the construction of the R-tree usually aim at minimizing
the overlap of tree nodes. Hybrid indexes combine the R-tree with other spatial indexes,
such as the Hilbert curve, k-d tree, and hash, to support advanced capabilities.
Extensions store additional information in the R-tree to
process particular types of queries more effectively, so that extended applications can
be supported.
Fig. 15.4 The family of R-Tree variants (groups: process change, hybrid and extension; variants
shown include R+ Tree, R* Tree, Packed R-Tree, Buffer R-Tree, Priority R-Tree, X Tree, Multi
Small Index, RT Tree, DR Tree, Hilbert R-Tree, R k-d Tree, HR Tree, R*Q Tree, Q+R Tree,
VoR-Tree, 3D R-Tree, Historical R-Tree, Partially Persistent R-Tree, Parametric R-Tree and
FNR R-Tree)
15.2.2 Spatial Databases
A spatial database provides an “all-in-one” solution for spatial data storage
and access. In addition to the well-known relational databases, NoSQL databases
have become increasingly popular.
15.2.2.1 Relational Databases
Relational databases, or SQL databases, use the relational model to store spatial data.
In SQL spatial databases, geometric features are represented in records by multiple
key values, including coordinates and associated attributes. The mainstream SQL
spatial databases fall into two groups: traditional SQL databases with spatial
extensions, and SQL databases with stock support for spatial data types. The first
group includes IBM DB2 with
Spatial Extender, Oracle Database with Oracle Spatial and Graph, PostgreSQL with
the PostGIS extension, and SQLite with SpatiaLite. The second group
includes Microsoft SQL Server, MySQL, Teradata Geospatial,
and Boeing’s Spatial Query Server. In addition, open-source or free-license databases
such as PostGIS and MySQL are widely applied.
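As a minimal illustration of representing features as records with coordinates and attributes, the following sketch uses Python's built-in `sqlite3` module. The table layout, the sample rows and the helper name are hypothetical. The example deliberately uses plain SQL with a bounding-box filter; a production system would normally rely on the geometry types and spatial indexes of an extension such as PostGIS or SpatiaLite, which are not used here.

```python
import sqlite3

# In-memory database with one record per company (hypothetical schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE company (
           id INTEGER PRIMARY KEY,
           name TEXT, category TEXT,
           lon REAL, lat REAL)"""
)
rows = [  # hypothetical sample records
    ("Alpha Electronics", "electronics", 121.47, 31.23),
    ("Beta Software", "software", 120.15, 30.28),
    ("Gamma Telecom", "communications", 118.80, 32.06),
]
conn.executemany(
    "INSERT INTO company (name, category, lon, lat) VALUES (?, ?, ?, ?)", rows
)
# An ordinary B-tree index on the coordinates; not a true spatial index
conn.execute("CREATE INDEX idx_company_lon_lat ON company (lon, lat)")


def companies_in_bbox(conn, min_lon, min_lat, max_lon, max_lat):
    """Return the names of companies inside a longitude-latitude window."""
    cur = conn.execute(
        "SELECT name FROM company "
        "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? ORDER BY name",
        (min_lon, max_lon, min_lat, max_lat),
    )
    return [row[0] for row in cur]


print(companies_in_bbox(conn, 120.8, 30.6, 122.0, 31.9))
```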
15.2.2.2 NoSQL Databases
NoSQL databases stem from the unsatisfactory performance of relational databases
in scenarios where instant queries and fast updates over huge volumes of data matter
more than strict validity. Originally, NoSQL meant “non-SQL” or
“non-relational”. The term has since been extended to “Not Only SQL”, which
denotes a new generation of database designs that emphasize scalability and
availability for emerging applications. To improve performance, NoSQL databases
relax consistency in favor of eventual consistency, which means that
queries might not return updated data immediately or might read data
inconsistent with their real status. To deal with different application scenarios and
address the heterogeneity of data, NoSQL databases with different data models have
been developed, including key-value databases like Aerospike, MemcacheDB and Dynamo,
column databases like HBase and Cassandra (which has been used by Facebook
to store social media data), graph databases like Neo4j, and document databases like
CouchDB. Another important issue is how to store huge
datasets with increasing data volume on commodity devices while guaranteeing high
availability. The next section introduces distributed databases as a solution to
this issue.
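The effect of eventual consistency on reads can be shown with a toy model. The class below is purely illustrative and hypothetical: it keeps a few in-memory replicas, applies each write to the first replica immediately, and propagates it to the others only when `sync()` is called. Real NoSQL systems use far more elaborate replication protocols.

```python
class EventuallyConsistentStore:
    """Toy key-value store with lazily synchronized replicas."""

    def __init__(self, n_replicas=2):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # (replica index, key, value) not yet applied

    def write(self, key, value):
        # The write is acknowledged after reaching replica 0 only
        self.replicas[0][key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, key, replica=0):
        # May return stale data if the replica has not been synchronized
        return self.replicas[replica].get(key)

    def sync(self):
        # Apply outstanding updates in order, so replicas converge
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("company:1", "registered")
stale = store.read("company:1", replica=1)  # update not yet visible here
store.sync()
fresh = store.read("company:1", replica=1)  # replicas have converged
print(stale, fresh)
```

The write returns quickly because it does not wait for all replicas, which is the performance gain the text refers to; the price is the window in which a read from another replica sees the old state.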
15.2.2.3 Distributed Databases
Distributed database technologies were developed to handle the growing data volume
by making database systems distributed across physically dispersed hardware.
Distributed databases can be regarded as a collection of separated database systems
that communicate with each other (Fig. 15.5).
The major advantages of a distributed database system are flexibility and scalability,
while the trade-off is the extra communication and computation cost of data
synchronization and validation, since all the storage nodes need to keep their local
data up to date. Currently, both mainstream SQL and NoSQL databases, such as Oracle,
DB2, MongoDB and Cassandra, highlight their ability to support distributed storage.
Research on distributed spatial indexes is also emerging, e.g., using a distributed index
to manage geospatial data in IoT (Fox et al. 2013; Fathy et al. 2017; Zhang et al.
2016). While a distributed database seems to be a good solution for big data storage,
in certain scenarios many operations need to be performed directly against the raw
data. Compared with a database, a file system is a simpler way to
store datasets directly in their original raw data structure.
Fig. 15.5 An exemplary physical architecture of a Distributed Database System (databases with
memory at locations 1, 2 and 3, connected through a communication channel)
15.2.3 Distributed File System
Distributed file systems¹ (DFSs) are also widely used for big data management, especially
in high I/O and high-performance computing applications. Like a distributed
database, a DFS does not share block-level access details and thus is “transparent”
and “invisible” to end users. The major difference between a DFS and a distributed
database is that the latter uses different APIs to extract datasets or views with different
semantics. A DFS, in contrast, allows files to be accessed using the same interfaces and
semantics.
The Google File System (GFS) and Parallel Virtual File System (PVFS) are
two representatives of distributed file systems. The Hadoop Distributed File System
(HDFS) maintained by Apache is the most popular open-source implementation of
GFS. HDFS is widely used in both academia and industry to store many types of data,
such as imagery, user profiles, web pages and web logs. There are many other kinds
of DFS, such as Lustre, the Andrew File System (OpenAFS) and Microsoft DFS.
Using DFSs to manage big and streaming geospatial data has become more and more
popular, and some relevant studies are worth attention (Zhang et al. 2016; Fahmy et al.
2017; Hu et al. 2018). DFSs such as HDFS are well supported by high-performance
computing frameworks as the basic data storage mode for conducting distributed
computing. In the next section, we introduce the latest high-performance computing
technologies and frameworks.
¹ Distributed file systems. In Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/w/index.php?title=Distributed_file_systems&oldid=869574529. Accessed 21 Feb 2019.
15.3 High Performance Computing Technologies
For the last few decades, massive volumes of disparate, dynamic and distributed
spatiotemporal data, generated by ubiquitous earth observation systems and the IoT,
have become significant research material in the spatial humanities and social
sciences. However, the limitations of traditional computing methods impede the
exploration of spatiotemporal patterns and dynamics hidden behind these massive
datasets. In this context, high performance computing technologies have made great
progress in tackling these challenges. This section introduces the basic computing
paradigms in high-performance technologies, the mainstream frameworks, and their
application in the spatial humanities and social sciences.
15.3.1 Computing Paradigms
Big data can be both computing-intensive and data-intensive. Usually, big data
computation requires massive computing resources such as CPU and memory, with
long processing times due to the complexity of the analytical algorithms. A computation
process may also involve intensive I/O tasks and large network communication
overheads because of the massive data volume. To address these demands, computing
task decomposition and scheduling mechanisms divide the processing operations into
pieces and allocate them to different processing units. In turn, large datasets
are subdivided into many segments, then cached and processed on nearby processing
units to avoid frequent data transmission and reduce I/O workloads. According to
the differences in computing environments and modes, high-performance computing
technologies can in general be categorized into parallel computing and distributed
computing.
Parallel computing handles intensive computing tasks by dividing them into a set
of sub-tasks that can be solved concurrently. The procedures in parallel
computing vary with the features of the hardware, and are accordingly divided into
CPU-based parallel computing and General-Purpose computing on GPU (GPGPU).
CPU-based parallel computing is a classical parallel computing method,
covering multi-core CPU and multi-CPU configurations. The major goal is to maximize
the speedup ratio with high parallel efficiency, which is influenced by the relative
proportions of parallelizable and serial code and by the communication overhead
between cores or CPUs.
Data and control flow synchronization on distributed memory must be properly
handled; otherwise, the modifications for parallel processing may introduce
inconsistencies and fatal errors. In general, CPU-based parallel computing has
low hardware requirements, but the acceleration effect is significantly constrained by
the computing capacity of a single machine.
Fig. 15.6 The GPGPU pipeline in four steps
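The dependence of the speedup ratio on the parallelizable fraction of the code, noted above for CPU-based parallel computing, is commonly expressed by Amdahl's law: with a parallelizable fraction p and n processing units, the speedup is 1 / ((1 - p) + p / n). The sketch below evaluates this formula. The optional linear overhead term is an assumption added here to mimic communication cost, and is not part of Amdahl's law itself.

```python
def speedup(parallel_fraction, n_units, overhead_per_unit=0.0):
    """Speedup of a task on n_units processors (runtime normalized to 1).

    overhead_per_unit is a hypothetical communication cost added for each
    extra unit; with the default of 0.0 this is plain Amdahl's law.
    """
    serial = 1.0 - parallel_fraction
    runtime = (serial
               + parallel_fraction / n_units
               + overhead_per_unit * (n_units - 1))
    return 1.0 / runtime


def efficiency(parallel_fraction, n_units, overhead_per_unit=0.0):
    """Parallel efficiency: speedup divided by the number of units."""
    return speedup(parallel_fraction, n_units, overhead_per_unit) / n_units


for n in (1, 2, 4, 8, 16):
    print(n, round(speedup(0.9, n), 2), round(efficiency(0.9, n), 2))
```

With 90% parallelizable code the speedup can never exceed 10, however many cores are added, which is why reducing the serial portion and the communication overhead matters more than adding hardware beyond a certain point.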
GPGPU refers to the use of GPUs for general-purpose processing rather than
graphics processing, made possible by advances in GPU hardware technologies.
A GPU usually contains hundreds or even thousands of tightly coupled cores and
achieves significant gains in performance. In general, a GPGPU pipeline runs
between CPUs and GPUs: CPUs move the data to GPUs, and the GPUs analyze the
data simultaneously on a large number of cores. As shown in Fig. 15.6, the pipeline
generally consists of four steps.
First, the CPU initializes the environment and allocates memory for the input datasets.
Second, the input datasets are transferred to a GPU. Third, the cores of the GPU
execute the iterations simultaneously. Last, the results are retrieved from the GPU.
For simple but strongly repetitive operations such as matrix computations, GPGPU
offers advantages over CPU-based computing. However, the computing capacity and
applications of parallel computing technologies are limited by the constrained
computing resources of a single machine.
15.3.1.2 Distributed Computing
As data volumes increase, a single machine can no longer store and process them
efficiently; hence, distributed computing on computer clusters has developed.
Usually in distributed computing, instructions are sent to the computing nodes where
the data are located, rather than transferring the datasets to distinct nodes.
Therefore, every node focuses on processing its local datasets. As shown in Fig. 15.7,
batch processing and stream processing are the two major types, distinguished by the
pace at which data are processed.
Fig. 15.7 Distributed computing paradigm: (a) batch processing, (b) stream processing
In batch processing, datasets are collected over time and fed to the processing
engine as a batch (Fig. 15.7a). A batch can be triggered in diverse ways, such as by a
fixed time interval or by data size. Since batch processing can be carried out on
distributed computers, fault tolerance mechanisms must be established to avoid single
points of failure, in addition to data partition strategies, task schedulers, and
communications. In scenarios like offline geospatial data analysis, batch processing
methods are the right choice.
In stream processing, datasets are fed directly to the engine piece by piece as
they arrive (Fig. 15.7b). Ideally, the time when datasets are produced and the time
when they are processed should be equal. In reality, however, there is a highly
variable skew between the two, caused by input sources, stream processing engines, and
hardware. The skew introduces latency into the processing pipeline and, in turn,
affects the correctness and completeness of the computation. To deal with the problem,
native stream processing systems offer diverse mechanisms, including windowing,
watermarks, triggers and accumulation. For applications like real-time data analysis,
warning systems, and financial transactions, stream processing outperforms other
methods.
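How windowing, watermarks, triggers and accumulation interact can be illustrated with a
small simulation. The sketch below is not the API of any particular stream engine: the
function name, the tumbling-window policy and the rule that events arriving after their
window has closed are dropped are all assumptions chosen for illustration.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_size, allowed_lateness):
    """Sum values per event-time window, tolerating out-of-order arrival.

    events: iterable of (event_time, value) pairs in ARRIVAL order.
    Returns (emitted, dropped): emitted is a list of (window_start, total) in the
    order the windows were closed; dropped lists events that arrived too late.
    """
    open_windows = defaultdict(int)   # accumulation: running sum per open window
    emitted, dropped = [], []
    watermark = float("-inf")         # estimate of how far event time has progressed

    for event_time, value in events:
        window_start = (event_time // window_size) * window_size
        if window_start + window_size <= watermark:
            dropped.append((event_time, value))   # its window was already emitted
            continue
        open_windows[window_start] += value
        watermark = max(watermark, event_time - allowed_lateness)
        # Trigger: close every window whose end lies at or before the watermark.
        ready = sorted(w for w in open_windows if w + window_size <= watermark)
        for start in ready:
            emitted.append((start, open_windows.pop(start)))

    # End of stream: flush the windows that are still open.
    for start in sorted(open_windows):
        emitted.append((start, open_windows.pop(start)))
    return emitted, dropped
```

A larger `allowed_lateness` keeps windows open longer, which improves completeness at the
price of higher latency; this is the trade-off the skew between production time and
processing time forces on a stream engine.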
15.3.2 Mainstream Frameworks
Due to the vast demand for parallel/distributed computing technologies, many
commercial and open source HPC frameworks have been developed. Using these frameworks,
developers can implement parallel programs much more easily than by coding them from
scratch. In this section, we introduce some widely used frameworks corresponding to the
computing paradigms explained above.
15.3.2.1 OpenMP
Open Multi-Processing (OpenMP²) is an API for CPU-based parallel computing
on standalone computers. It is based on the fork/join programming model: a program
starts on a single master thread and forks additional threads where operations must be
executed in parallel. When the parallel operations are finished and synchronized, the
threads are joined back together.
The advantages of OpenMP include: (1) parallel programs are easy to implement with
a few code modifications; (2) an OpenMP program can also be run as serial code; (3) the
code is easy to understand and maintain. However, there are also disadvantages:
(1) programs can only run on shared memory computers; (2) compiler support is
required; (3) application scenarios are limited to a few program structures, e.g., loops.
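The fork/join pattern described above can be sketched in Python with a thread pool. This
is an analogy rather than OpenMP itself (which targets C, C++ and Fortran through
compiler directives such as `#pragma omp parallel for`); the function name and the
chunking scheme are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum_of_squares(values, num_threads=4):
    """Fork worker threads over chunks of `values`, then join and reduce."""
    # Split the loop iterations into one interleaved chunk per thread.
    chunks = [values[i::num_threads] for i in range(num_threads)]

    def work(chunk):
        return sum(v * v for v in chunk)

    # Fork: the master thread hands the chunks to the worker threads.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partial_sums = list(pool.map(work, chunks))
    # Join: leaving the `with` block waits for all workers; the master then
    # combines the partial results, similar to an OpenMP reduction.
    return sum(partial_sums)
```

Note that CPython's global interpreter lock prevents these threads from running
CPU-bound work truly in parallel, so the sketch shows the structure of fork/join, not
its speedup.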
15.3.2.2 MPI
Message Passing Interface (MPI) is one of the major programming models and
specifications for CPU-based parallel computing on computer clusters. A widely
used implementation of MPI is Open MPI,³ which was derived from several earlier
projects, such as FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI. MPI hides differences in
hardware architecture, performs the necessary data conversion and switches
communication protocols automatically. Therefore, MPI makes it possible to run programs
on heterogeneous systems and groups of processors with distinct architectures.
MPI can be run on both shared and distributed memory architectures and has
wider applications than OpenMP. However, MPI programs incur relatively higher
programming and debugging costs, and their performance is limited by the
communication network between nodes.
² The OpenMP API specification for parallel programming. https://www.openmp.org. Accessed 21
Feb 2019.
³ Open MPI: Open Source High Performance Computing. https://www.open-mpi.org. Accessed 21
Feb 2019.
Fig. 15.8 Programming model of CUDA
15.3.2.3 CUDA
Compute Unified Device Architecture⁴ (CUDA) is a GPGPU API based on the C
programming language, exclusively for NVIDIA GPUs. Figure 15.8 demonstrates
the programming model of CUDA. In this model, the host refers to the CPU that
controls the computing procedure, while the device refers to the GPU that executes
the computing tasks. Different tasks are expressed as different kernel functions and
assigned to different grids. For each grid, the tasks are divided into several thread
blocks, each of which is handled by a streaming multiprocessor (SM). Then, the threads
in the blocks physically execute the tasks on the processing cores. Synchronization
and communication between threads in the same block are supported by shared
memory. The CUDA model thus mirrors the hardware design of NVIDIA GPUs.
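The grid/block/thread hierarchy can be illustrated by emulating a one-dimensional
kernel launch sequentially in Python. This is not CUDA code and runs on the CPU; the
helper names are hypothetical, and only the index arithmetic (global index = block
index × block size + thread index) follows the usual CUDA convention.

```python
def launch_kernel(kernel, grid_dim, block_dim, *args):
    """Sequentially emulate a 1-D launch of `kernel` over grid_dim blocks.

    On a real GPU the blocks are distributed over streaming multiprocessors
    and the threads run concurrently; here they simply run in a loop.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """Kernel body: each (block, thread) pair handles one array element."""
    i = block_idx * block_dim + thread_idx   # global thread index
    if i < len(out):                         # the last block may have surplus threads
        out[i] = a[i] + b[i]


def vector_add(a, b, block_dim=4):
    """Host code: choose the grid size, allocate the output, launch the kernel."""
    n = len(a)
    grid_dim = (n + block_dim - 1) // block_dim   # ceiling division
    out = [0] * n
    launch_kernel(vector_add_kernel, grid_dim, block_dim, a, b, out)
    return out
```

For five elements and a block size of four, two blocks are launched and three threads
of the second block are idle, which is why the bounds check in the kernel is needed.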
CUDA can dramatically speed up massively parallel jobs, especially for applications
like image processing, model simulations, and machine learning. Nevertheless,
CUDA has high hardware requirements for intensive computation. Moreover, GPUs are
weak at branch prediction: if a program contains very irregular instruction flows,
the GPU can become slower than the CPU.
⁴ NVIDIA Corporation. (2019). Develop, Optimize and Deploy GPU-accelerated Apps.
https://developer.nvidia.com/cuda-toolkit. Accessed 21 Feb 2019.
Fig. 15.9 MapReduce process taking Word Count as an example. Stages shown, from left
to right: Input (K1,V1), Splitting, Mapping (List(K2,V2)), Shuffling (K2,List(V2)),
Reducing (List(K3,V3)), Final Result. The input lines are "Dear Bear River", "Car Car
River" and "Dear Car Bear"; the final result is Bear,2; Car,3; Dear,2; River,2.
15.3.2.4 Hadoop
Apache Hadoop⁵ is an open-source framework for scalable, reliable and distributed
computing. The core computing mechanism of Hadoop is MapReduce, which
consists of two functions: a map function that applies a specific operation to input
datasets and produces a set of key/value pairs on distributed nodes, and a reduce
function that merges all intermediate values associated with the same key on the nodes.
An example is word count, which involves map and reduce as illustrated in Fig. 15.9.
Hadoop takes care of data partitioning, scheduling, load balancing, fault tolerance,
and network communications, so programmers can focus on the design of distributed
applications.
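The word count example of Fig. 15.9 can be reproduced on a single machine to show what
each stage contributes. The sketch below is plain Python rather than Hadoop code; the
function names are chosen for illustration, and on a real cluster the map and reduce
calls would run on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(key, values):
    """Reduce: merge the values that share a key."""
    return key, sum(values)


def word_count(text):
    splits = text.splitlines()                                       # Splitting
    mapped = [pair for line in splits for pair in map_phase(line)]   # Mapping
    grouped = shuffle_phase(mapped)                                  # Shuffling
    return dict(reduce_phase(k, v) for k, v in grouped.items())      # Reducing


counts = word_count("Dear Bear River\nCar Car River\nDear Car Bear")
# counts == {"Bear": 2, "Car": 3, "Dear": 2, "River": 2}
```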
Several spatial extension libraries exist for Hadoop. For example,
SpatialHadoop (Eldawy 2014; Eldawy and Mokbel 2015) adapted Grid File, R-tree
and R+-tree indexes to partition data across nodes and to organize local records inside
each node. These enhancements accelerate typical spatial and geometric
operations, including range queries, k-nearest-neighbor (KNN) queries and spatial
joins. SpatialHadoop has been applied in traffic data processing, as well as in map and
satellite data analysis.
15.3.2.5 Spark
Apache Spark⁶ is a unified computing engine for distributed clusters, which offers
libraries for SQL, stream computation, graph computation and machine learning. Its
core programming model, the Resilient Distributed Dataset (RDD), provides
transformations (e.g., distinct, filter, map and sort) and actions (e.g., reduce,
count, first and take) to break through the limitations of Hadoop. In Spark, a job
consists of multiple transformations and an action: the transformations build up a
Directed Acyclic Graph (DAG) of instructions, and the action triggers the execution of
the graph. Spark keeps job results in memory by default, so the I/O time cost of
writing and reading data is much lower than in Hadoop jobs. Furthermore, high-level
APIs like DataFrames and Datasets have also been developed, making it more convenient
to process and manipulate big data.
⁵ The Apache Software Foundation. (2018). Apache Hadoop. https://hadoop.apache.org. Accessed
21 Feb 2019.
⁶ Apache Spark. https://spark.apache.org. Accessed 21 Feb 2019.
The high speed, ease of use, generality, and compatibility of Spark have attracted
large numbers of developers who use it to solve geospatial problems. Frameworks
and libraries like GeoSpark (Yu et al. 2018) and GeoTrellis (Kini et al. 2014) provide
geospatial dataset I/O, geospatial indexes, and geospatial operations, and are
applied to process big geospatial datasets for humanities and social sciences research.
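The division of a Spark job into lazy transformations and an action that triggers
execution can be mimicked with a toy class. The sketch below is not the Spark API:
`LazyDataset` is a hypothetical stand-in whose method names merely echo RDD operations,
and its plan is a simple linear chain, the simplest case of a DAG.

```python
from functools import reduce as fold


class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self, source, plan=()):
        self._source = source   # any re-iterable collection
        self._plan = plan       # recorded (kind, function) steps

    # Transformations: return a new dataset with a longer plan, compute nothing.
    def map(self, fn):
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, predicate):
        return LazyDataset(self._source, self._plan + (("filter", predicate),))

    def _execute(self):
        """Run every recorded step over the source, one item at a time."""
        for item in self._source:
            keep = True
            for kind, fn in self._plan:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):      # a filter step rejected the item
                    keep = False
                    break
            if keep:
                yield item

    # Actions: these are the only methods that trigger execution.
    def collect(self):
        return list(self._execute())

    def count(self):
        return sum(1 for _ in self._execute())

    def reduce(self, fn):
        return fold(fn, self._execute())
```

Chaining `map` and `filter` on a `LazyDataset` does no work until `collect`, `count` or
`reduce` is called, which is what allows an engine to see and optimize the whole plan
before running it.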
15.3.2.6 Flink
Apache Flink⁷ is also an open source unified computing framework. It is based on
Google's DataFlow model and processes datasets as native streams rather than in
micro-batches as Spark Streaming does. Flink uses custom memory management and
serialization methods to avoid the costs of garbage collection in the JVM, which makes
it easier to achieve lower latency and higher throughput than micro-batch processing
(Karimov et al. 2018). The core component of Flink is a distributed system that accepts
stream programs and executes them on a cluster with fault tolerance, i.e., the "Flink
runtime" in Fig. 15.10. Flink also provides a wide range of high-level and
user-friendly APIs for developing programs flexibly, including the DataStream API, the
DataSet API and the Table API. These factors have attracted a broader community for
Flink than for other stream processing frameworks.
Flink is still an emerging framework, and its geospatial extensions are not yet
mature. Nevertheless, it has been applied to real-time transport analytics and
spatial semantics processing (Hennig et al. 2016). As demand for low latency grows,
Flink will play a key role in real-time data processing for the foreseeable future.
15.3.3 Applications in Spatial Humanities and Social
Sciences
High performance computing technologies have attracted many researchers seeking
to handle massive geospatial datasets. Here we will introduce several applications in
the spatial humanities and social sciences.
⁷ Apache Flink: Stateful Computations over Data Streams. https://flink.apache.org. Accessed 21
Feb 2019.
Fig. 15.10 Key components of Flink Stack (Friedman and Tzoumas, 2016)
15.3.3.1 Social Phenomenon Analytics
Investigating the spatiotemporal distribution of social events and the spatial
interaction between them can benefit governmental policy making and individual-level
activity planning. However, computing multi-level space-time interaction is
time-consuming and hinders research progress. For instance, computing a series of
space-time interactions for 32,505 crime event records takes around 48 min for
1,000 runs on a desktop GPU. In contrast, the Keeneland system, powered by MPI
and GPGPU technologies, needs only 264 s to finish the same task, roughly 11 times
faster (Ye et al. 2017).
15.3.3.2 Urban Mobility Simulation
Cities are complex systems, and agent-based models (ABMs) that simulate urban
mobility are an effective approach to discovering patterns in cities. Nevertheless,
intensive computing and extremely long running times create challenges for
researchers. HPC technologies such as MPI have been used to implement ABMs, and
simulation frameworks like Repast HPC speed up agent-based geo-simulation.
However, experiments show that approximating Point of Attraction (PoA) information
boosts efficiency effectively but reduces simulation accuracy (Zia et al. 2013), so
the tradeoff between efficiency and accuracy should be considered carefully.
15.3.3.3 Social Sensing
Many spatiotemporal data sources now capture daily human behaviors, giving rise to
the new field of social sensing. However, the large volume and dynamic attributes
of spatiotemporal data bring challenges to efficient and scalable spatial
operations. MPI and Spark have been adopted to accelerate spatial join processing
over large trajectory datasets and road network data for map matching (Stojanovic and
Stojanovic 2013). However, different data splitting strategies result in different I/O
and computing workloads, and therefore in different performance. The efficiency of
uniform splitting by fixed grid size decreased rapidly as the number of processors
increased (only 18% with 16 processors). When the splitting takes the spatial
distribution of the data into account, the efficiency remains high (about 95%) even
with 16 processors. Therefore, the skewness of spatial data must be properly
handled when using HPC technologies.
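Why skewed data penalizes a fixed grid can be seen from a small load-balance
calculation. The sketch below is a simplified one-dimensional illustration and does not
reproduce the cited experiment: the synthetic coordinates, the function names and the
use of mean load divided by maximum load as a proxy for parallel efficiency (ignoring
I/O and communication) are all assumptions.

```python
def uniform_grid_split(xs, parts):
    """Count points per partition when the extent is cut into equal-width cells."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / parts or 1.0
    counts = [0] * parts
    for x in xs:
        idx = min(int((x - lo) / width), parts - 1)   # clamp the maximum value
        counts[idx] += 1
    return counts


def data_aware_split(xs, parts):
    """Count points per partition when cuts follow the data (equal-count cells)."""
    n = len(xs)
    cuts = [n * i // parts for i in range(parts + 1)]   # positions in sorted order
    return [cuts[i + 1] - cuts[i] for i in range(parts)]


def balance_efficiency(counts):
    """Mean load over maximum load: 1.0 means perfectly balanced partitions."""
    return (sum(counts) / len(counts)) / max(counts)


# Synthetic skewed coordinates: squares are dense near zero and sparse far away.
xs = [i * i for i in range(100)]
uniform = uniform_grid_split(xs, 4)      # [50, 21, 15, 14]
adaptive = data_aware_split(xs, 4)       # [25, 25, 25, 25]
```

In this toy case the busiest uniform cell holds half of all points, so the other three
processors would sit idle for much of the run, whereas the data-aware cuts give every
processor the same load.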
15.4 Web-Based Visualization
Visualization helps people make sense of information much more rapidly (Wang et al.
2020). For socioeconomic data, effective visual analytics can better assist analysts in
exploring spatiotemporal relationships and in mining the influencing factors and
potential rules of human social phenomena, or the processes hidden behind the data.
Today, the Internet is the most important channel for acquiring information. Web
visualization has become more popular than traditional stand-alone or
client/server-based visualization (Bender et al. 2000). The wide adoption of cloud
computing has triggered a trend toward data processing and management on the cloud
or server side. Cloud computing resources accomplish analytics tasks for massive
data, thus avoiding the risks associated with transferring sensitive or large-volume
raw data to the client side. In addition, with the pervasive use of mobile devices and
heterogeneous terminals, loosely coupled software architectures with good portability
and scalability are highly desirable. To create sophisticated visual analytics
applications with such web-based architectures, developers utilize open-source
visualization tools and web application frameworks to make programming more succinct
and efficient. Optimized data models and communication technologies can improve the
user experience by mitigating the problems of transmitting large data volumes.
15.4.1 JavaScript-Based Visualization Libraries
In recent years, various dynamic web page and visualization technologies have
emerged. These include JavaScript, Java Applet, Java Server Page (JSP), Flash, Flex,
Silverlight and Web Graphics Library (WebGL). With the trend toward standardization,
major Internet vendors have gradually reached a consensus on web standards, and
HTML5 has become the de facto standard. In this context, JavaScript-based
visualization has become a trend, as it is compatible with various browsers without
the installation of any plugins.
JavaScript-based visualization technologies and libraries have become pervasive.
WebGL is a 3D drawing protocol that combines JavaScript and OpenGL ES 2.0.
Developers can use WebGL to drive graphics processing units (GPUs) to render 3D
scenes smoothly in the browser, and to achieve complex scene navigation and
interaction for data visualization. Advanced graphics libraries, such as D3.js,
Deck.gl, Kepler.gl, Plotly.js, Three.js, and Leaflet, provide plenty of visualization
forms, not only making programming much easier but also making the results more
professional and eye-catching.
15.4.1.1 D3.js
D3.js⁸ is a data-driven JavaScript library that binds arbitrary data to the Document
Object Model (DOM) to achieve data-driven transformations. Unlike most
open source chart libraries, it allows users to customize the style of charts. D3
provides a convenient mechanism called selections, based on the W3C Selectors API
rather than the traditional W3C DOM API, for setting the attributes or styles of nodes.
Because it keeps the workload of DOM manipulation small, D3 has been widely used to
visualize the distribution patterns of geographical phenomena.
D3 provides a variety of visualization methods to present the patterns of
geographical phenomena. Figure 15.11a is a choropleth map of US unemployment
rates in August 2016, based on data from the Bureau of Labor Statistics and the Census
Bureau. Figure 15.11b uses histograms to present the medical cost of hip replacement
by state. Figure 15.11c presents a hexagonal heatmap, and Fig. 15.11d illustrates the
topography of Maungawhau with d3-contour and d3-hsv. D3.js provides many chart
types for two-dimensional visualization, while some other libraries, e.g., Deck.gl,
are dedicated to map visualization and provide professional scenario-related
visualization effects.
15.4.1.2 Deck.gl
Deck.gl⁹ is a WebGL-powered dataset visualization framework. The predecessor of
Deck.gl was an Uber project to better understand human travel behavior, which
used maps to present big data on passenger and driver movements, from pick-up to
drop-off locations, in order to further optimize service quality. Deck.gl is not only
good at the static presentation of different types of maps, such as 3D histograms and
migration maps, but also supports state-of-the-art animations that reveal the
spatiotemporal dynamics behind datasets, such as huge numbers of taxi tracks.
Using Deck.gl, users can easily design a variety of 3D histograms, scatter
plots and dynamic trajectories. Figure 15.12a reveals road safety in the UK by
counting personal injury road accidents in Great Britain from 1979 to 2017.
⁸ D3: Data-Driven Documents. https://d3js.org. Accessed 21 Feb 2019.
⁹ Deck.gl. http://deck.gl. Accessed 21 Feb 2019.
Fig. 15.11 Four examples of D3.js: (a) unemployment rates of the US (Mike Bostock. (2017). D3
Choropleth Unemployment rate by county. https://beta.observablehq.com/@mbostock/d3-choropleth.
Accessed 21 Feb 2019), (b) medical cost of hip replacement in US (Phuoc Do. (2015). Medical
Cost of Hip Replacement by State. https://vida.io/documents/s5qo5Gwrct5HNxAD2. Accessed 21
Feb 2019), (c) hexagonal heatmap (D3: Data Driven Document.
https://www.visualcinnamon.com/2013/07/self-organizing-maps-creating-hexagonal.html),
(d) the topography of Maungawhau (D3: Data Driven Document.
https://observablehq.com/@d3/volcano-contours)
Figure 15.12b shows the flight paths of London Heathrow Airport in a 6-hour
window. Figure 15.12c shows yellow cab and green cab trips in Manhattan,
New York City, using a dynamic track display. Figure 15.12d demonstrates
highway safety in the US. Owing to the success of Deck.gl, many libraries have been
built on top of it, such as Kepler.gl, which provides more user customization
functions.
15.4.1.3 Kepler.gl
Kepler.gl¹⁰ is a powerful open source graphics library based on Deck.gl. Unlike
Deck.gl, it provides various built-in visualization GUIs and data analysis functions,
and lets users upload their own data for visualization, which dramatically
reduces the programming workload. Meanwhile, Kepler.gl can be easily
embedded into users' web applications by using web frameworks, such as React
and Redux. Furthermore, Kepler.gl can display complex 3D scenes smoothly since it
is WebGL-based, which provides hardware-accelerated 3D rendering for an HTML5
canvas.
¹⁰ Kepler.gl. https://kepler.gl. Accessed 21 Feb 2019.
Fig. 15.12 Four examples of Deck.gl: (a) road safety in UK, (b) flight paths of Heathrow
Airport, (c) cab trips in Manhattan, (d) highway safety in the US (Deck.gl examples overview.
(2018). http://deck.gl/#/examples/overview. Accessed 21 Feb 2019)
Figure 15.13a shows a small sample of taxi trip records in New York City. Although
this sample contains 100,000 rows of data, Kepler.gl still renders the layer
quickly. Figure 15.13b shows the elevation contours of the San Francisco mainland
and Treasure Island/Yerba Buena Island. Figure 15.13c shows the congestion of every
single street in San Francisco by using a 3D density map. Figure 15.13d is an
origin-destination map, which shows the commuting patterns of residential areas in
England and Wales using 3D arcs. Libraries such as Kepler.gl and Deck.gl provide
powerful functionalities to support map-based big data visualization, but statistical
charts are also indispensable, and for these Plotly.js is a powerful option.
15.4.1.4 Plotly.js
Plotly.js¹¹ is based on D3.js and stack.gl. It uses JavaScript to implement on the web
graphical presentations like those of MATLAB and Python matplotlib, and supports
more than 20 graphic styles, including 2D and 3D visualizations. It is dedicated to
the visualization of statistical charts. The interactive effects are rich enough to
meet the needs of statistical analysis for many data types. Plotly.js is not only used
in web development projects, but also supports other languages such as R, Python
and MATLAB, making code integration across different programming languages more
convenient.
¹¹ Plot.ly. https://plot.ly/javascript. Accessed 21 Feb 2019.
Fig. 15.13 Four examples of Kepler.gl: (a) taxi trip in NYC, (b) elevation contours of San
Francisco, (c) street congestion in San Francisco, (d) commute patterns of England and Wales
residence (Kepler.gl demo. https://kepler.gl/demo. Accessed 21 Feb 2019)
Plotly.js supports not only map-based visualization but also 3D statistical
charts. Figure 15.14a is a bubble map in which the size of each bubble reveals the
population of a US city. Figure 15.14b–c are maps in which the depth of color encodes
the magnitude of two indicators, agriculture exports and North America precipitation,
respectively. Figure 15.14d is a 3D scatter plot for visualizing high-dimensional
data. Plotly.js provides statistical charts to meet the visualization requirements of
users, while Three.js is more effective for creating 3D visualizations.
Fig. 15.14 Four examples of Plotly.js: (a) 2014 US city populations (Bubble Maps.
https://plot.ly/javascript/bubble-maps. Accessed 21 Feb 2019), (b) 2011 US agriculture exports
by state (USA Choropleth Map. https://plot.ly/javascript/choropleth-maps/#usa-choropleth-map.
Accessed 21 Feb 2019), (c) North America precipitation (North America Precipitation.
https://plot.ly/javascript/scatter-plots-on-maps/), (d) a 3D scatter plot (3d Scatter Plots.
https://plot.ly/javascript/3d-scatter-plots. Accessed 21 Feb 2019)
15.4.1.5 Three.js
Three.js¹² is a JavaScript 3D library based on WebGL. Using Three.js, developers
can implement 3D visualizations that run smoothly in the browser without
writing C++ programs. Three.js provides strong capabilities for creating a variety of
3D scenes, including cameras, lights and materials. However, the engine is still under
development, and its API documentation is incomplete, which creates a steep learning
curve for beginners. Even so, many excellent applications are emerging. For instance,
a web globe has been built by Owen Cornec to show the scope, variety, and
inequality of world economies. It is featured in the Best American Infographics 2016
edition, and won the IEEE Vis and IIB awards.¹³
As Fig. 15.15 shows, it supports several forms of visualization,
including globe view, map view, country stacks and 3D product space. The dynamic
and interactive effect is intuitive and helpful for detecting economic growth and
exploring the economic differences between countries. With Three.js, it is
much easier for users to build 3D visualization scenes. In addition to the
visualization libraries on the front-end, the communication technologies between
front-end and back-end also play an important role.
¹² Three.js. https://threejs.org. Accessed 21 Feb 2019.
¹³ Mariner Books, Houghton Mifflin Harcourt. (2016). The Best American Infographics 2016.
https://sfpl.bibliocommons.com/item/show/3273821093_the_best_american_infographics_2016.
Accessed 21 Feb 2019.
Fig. 15.15 The globe of economic complexity: (a) 3D version product stacks, (b) all products
by category, (c) 3D version of product space, (d) stacks products by category (Center for
International Development, (2016). The globe of economic complexity.
http://globe.cid.harvard.edu. Accessed 21 Feb 2019)
15.4.2 Web Framework and Communication Technologies
A systematic web visualization solution should consider not only graphic rendering
with visualization libraries on the front-end, but also front-end and back-end
communication issues, such as optimized data transfer, network session and transaction
management, security, and authentication. Therefore, sophisticated web application
frameworks, with flexible embedding mechanisms to integrate function modules and
technologies for handling such issues, are widely adopted.
A web application framework is essential for web application development. It can
provide self-contained templates and methods for developers to create web pages
and deploy web servers. Especially in web applications with complex functions
and abundant interaction, web application frameworks can package complicated
operations in encapsulated functions, thus alleviating the workload of development
and deployment. With web application frameworks, developers can easily implement
functional requirements like database access, data validation and dynamic interaction
by using library functions (Sun et al. 2005). Web application frameworks can also
optimize system architecture and achieve loose coupling between the front-end and
back-end, in turn improving development efficiency and increasing portability (Deeb
et al. 2015). There are many JavaScript-based web development frameworks and tools,
such as Node.js, Vue.js, React.js, jQuery and Express.js.
Network transport protocols and data format standards are critical for data
transmission across different platforms and programming languages. Web service
technology is a standardized, platform- and language-independent way of
integrating web applications (Mockford et al. 2004). Conventionally, the Simple Object
Access Protocol (SOAP) or HTTP-based REpresentational State Transfer (REST) is used
for data transmission, with human- and machine-readable formats such as XML, JSON,
GeoJSON, WKT, GML and other formats that comply with web service standards.
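As an illustration of such a format, the sketch below builds a small GeoJSON payload of
the kind a REST endpoint might return. The helper names, the enterprise name and the
property fields are hypothetical; only the overall GeoJSON structure (a
FeatureCollection of Features with Point geometries, coordinates ordered as longitude,
latitude) follows the GeoJSON standard.

```python
import json


def point_feature(name, lon, lat, **properties):
    """Build a GeoJSON Feature with a Point geometry."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},  # lon first
        "properties": {"name": name, **properties},
    }


def feature_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


# Hypothetical record serialized as the body of an HTTP response.
payload = json.dumps(
    feature_collection([point_feature("Example Co.", 114.3, 30.6, industry="retail")])
)
```

Because the payload is plain JSON, any client language can parse it, which is the
platform and language independence the text refers to.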
Optimized data transmission mechanisms reduce transmission latency
and improve the user experience when fetching a large amount of data. A widely
used method is progressive transmission by window scope or levels of detail
(LOD) (Levenberg et al. 2002). To implement progressive transmission, specific data
structure models are needed to organize data by blocks or levels. Asynchronous
data prefetching and caching in the front-end are also important for reducing
data latency and network load. For example, Google Earth uses a quad-tree
encoding method to organize map data, and prefetches and caches the necessary data on
the client to make interaction smooth. Moreover, data compression can also reduce the
data volume in network transmission. The encoding and decoding algorithms must
be efficient enough to avoid overlong computing time. For example, Deck.gl loads
trip data by compressing the series consisting of longitude, latitude and timestamps
into an encoded polyline format that includes arrays of turn points.¹⁴
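To give a sense of how such compression works, the sketch below implements the widely
used Encoded Polyline Algorithm Format published by Google. It is assumed here to be
representative of the scheme mentioned above; the exact encoding used in the Deck.gl
example, including how timestamps are packed, is not described in the text, so this
sketch handles latitude/longitude pairs only.

```python
def _encode_value(delta):
    """Encode one signed integer as printable ASCII in 5-bit chunks."""
    value = delta << 1
    if delta < 0:
        value = ~value                       # fold the sign into the lowest bit
    chars = []
    while value >= 0x20:
        chars.append(chr((0x20 | (value & 0x1F)) + 63))   # continuation bit set
        value >>= 5
    chars.append(chr(value + 63))
    return "".join(chars)


def encode_polyline(points, precision=5):
    """Encode (lat, lng) pairs; each point is stored as a delta from the previous one."""
    factor = 10 ** precision
    out, prev_lat, prev_lng = [], 0, 0
    for lat, lng in points:
        lat_i, lng_i = round(lat * factor), round(lng * factor)
        out.append(_encode_value(lat_i - prev_lat))
        out.append(_encode_value(lng_i - prev_lng))
        prev_lat, prev_lng = lat_i, lng_i
    return "".join(out)


def decode_polyline(encoded, precision=5):
    """Inverse of encode_polyline."""
    factor = 10 ** precision
    points, index, lat, lng = [], 0, 0, 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):                   # one latitude delta, one longitude delta
            shift = result = 0
            while True:
                chunk = ord(encoded[index]) - 63
                index += 1
                result |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:             # no continuation bit: value complete
                    break
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lng += deltas[1]
        points.append((lat / factor, lng / factor))
    return points
```

Storing small deltas instead of full coordinates is what makes the string compact:
consecutive points on a trajectory are usually close together, so most deltas fit in
two or three characters.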
15.5 Enterprise Registration Data Visual Analytics as a Use
Case
To demonstrate the power of the aforementioned technologies in big data visual
analytics, we take big enterprise registration data as an example in this section.
Enterprise registration data provide fine-grained registration information about each
individual enterprise, offering a promising solution for economic geography and
regional studies (Duranton and Overman 2005; Marcon and Puech 2010). By analyzing the
spatiotemporal distribution of industries at multiple scales, studies of urban spatial
structure, urban agglomerations, industrial aggregations, and socioeconomic activities
can be furthered (Li et al. 2018). As shown in Fig. 15.16, an HPC-supported web visual
analytics framework was developed to store, preprocess, and visually analyze big
enterprise registration data by using the proposed technologies and tools (Li et al.
2018; Gui et al. 2020b; Wang et al. 2020).
¹⁴ ibgreen. (2018). building-apps.md.
https://github.com/uber/deck.gl/blob/master/docs/developer-guide/building-apps.md.
Accessed 21 Feb 2019.
Fig. 15.16 High-performance visual analytics of big enterprise registration data for economic
geographic studies. Labels recoverable from the figure: big enterprise registration data
management (location, time, industrial types, other attributes); data preprocessing; storage
layer (MySQL, Neo4J, Nanocubes, HDFS, spatial and full-text indexes); computing layer (HPC
framework: Apache Spark; big data computing; spatial statistics and analysis; data query
services; computation services; spatial query and analysis components); web visualization
layer (web frameworks: React, Django, Vue.js, Angular; visualization libraries: Echart, D3.js,
Kepler.gl, ...); web-based visual analytics (spatiotemporal distribution trend, network
analysis, spatial clustering patterns, correlation analysis).
15.5.1 HPC-Accelerated Data Preprocessing
Big, fine-grained enterprise registration data that include time and location
information enable us to quantitatively analyze, visualize, and understand the patterns
of industries at multiple scales across time and space. However, data quality issues
like non-standardization, duplication, incompleteness, and ambiguity hinder such
analysis. Data preprocessing becomes challenging when the volume of data is immense
and constantly growing, and may result in out-of-memory errors and intolerable
computation times. HPC technologies can be used to tackle these big data computational
issues.
We use HPC technologies to impute the missing industry category and location
attributes of enterprise registration data. Industry categorization is imperative
for analyzing the development of different industrial categories, while location
accuracy is critical for spatial analysis. In the dataset we collected, which contains
about 17 million enterprise registration records of mainland China, 43.64% of the
records have no industrial category value. Approximately 30% of the records only
have a street-level address and do not include the province or city to which it
belongs. This address ambiguity seriously impedes effective geocoding. A big data
imputation workflow based on cluster computing technologies is utilized to impute
enterprise registration data (Li et al. 2018).
The proposed imputation workflow is illustrated in Fig. 15.17. Industrial category
imputation is treated as a short text classification problem consisting of input
vector construction and classification, and is solved in Apache Spark. Location
imputation uses a bare-metal computing cluster and contains three steps, i.e., postcode
imputation, Administrative Division (AD) imputation, and geocoding. Experiments
demonstrate the feasibility and efficiency of the proposed imputation framework for
big geotagged text data. Industrial category imputation took about 1,600 s and
achieved about 77.4% accuracy on average using Logistic Regression; 97% of all
records were geocoded using the proposed location imputation method, while the
geocoding rate was only 53% without location imputation. The HPC-based framework
efficiently handles the data incompleteness and location ambiguity problems,
and makes further spatiotemporal visual analysis of industries possible.
Fig. 15.17 Workflow and computational framework for imputation of incomplete enterprise
registration data. Labels recoverable from the figure: incomplete enterprise registration
data; filtering; data used for industrial category imputation; data used for location
imputation; industrial category imputation (input vector construction, classification methods
filling), supported by an Apache Spark cluster; location imputation (postcode imputation, AD
imputation, geocoding), supported by a bare-metal cluster.
15.5.2 HPC-Enabled Dynamic Visual Analytics
Based on the imputed data, we developed a web-based visual analytics system that
integrates the aforementioned data storage, indexing, computing, and web-based
visualization technologies under the proposed computing framework. Supported by
these enabling technologies, the system can provide four types of analysis on the
fly: spatiotemporal distribution trends, clustering patterns, spatial correlations,
and network relations.
Fig. 15.18 The spatiotemporal distribution of industries in selected cities of China
15.5.2.1 Spatiotemporal Distribution Trend
Analysis of the spatiotemporal distribution of industries has been highlighted in
economic geography, studies of urban spatial structure, and regional policy studies
(Li et al. 2015; Parr 2014). To visually analyze the overall spatiotemporal density
distribution trends of different kinds of industries, Nanocubes are applied to store
and index the big enterprise registration data. Through data-cube mechanisms,
Nanocubes slice and dice data with respect to space, time, or other attributes,
supporting real-time viewing of tens of billions of points in a web browser as
heatmaps over Leaflet maps. In Fig. 15.18, the spatial distribution of all industries
in China in 2015 is visualized along with the spatiotemporal distribution of
industries in different cities. The yellow line is Hu’s (Heihe-Tengchong) line
(Hu 1935), which is often used in population geography. About 94.4% of China’s
population was distributed southeast of this line in 2010 (Chen et al. 2016). About
92% of enterprises were also distributed southeast of this line, revealing a spatial
correlation between population and economic activity, as well as the sharp east-west
gap in industrial development in China.
The movement of the economic barycenter and changes in industrial distributions
reflect the rise and spatial expansion of industries (Wang et al. 2006). To explore
the spatiotemporal movement of the economic barycenter, standard deviation ellipses
and centroid trajectories are visualized (Fig. 15.19). Computing centroids and the
parameters of standard deviation ellipses within arbitrary regions containing
millions of points is computationally intensive. Apache Spark is used to accomplish
such computation tasks on the fly to support real-time visualization, and a RESTful
API is used to query the computed results (Song et al. 2017). Trajectories and
ellipses are visualized using Leaflet and Baidu ECharts (see footnote 15).
15 ECharts. https://echarts.baidu.com. Accessed 21 Feb 2019.
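The computation behind Fig. 15.19 can be illustrated at small scale. The following is a minimal single-machine sketch in Python, not the Spark implementation used in the chapter: it computes the mean center of a point set and derives the semi-axes and orientation of a standard deviation ellipse from the 2×2 covariance matrix of the coordinates. The function names, the sample data, and the choice of one-standard-deviation axes (some GIS packages scale the axes differently) are assumptions made for illustration.

```python
import math


def standard_deviational_ellipse(points):
    """Return (center, major, minor, theta) for a list of (x, y) points.

    major/minor are one-standard-deviation semi-axes; theta is the angle of
    the major axis, in radians, counterclockwise from the x-axis.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Population (co)variances of the coordinates around the mean center.
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Principal-axis orientation of the covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    # Eigenvalues of the covariance matrix give the squared semi-axes.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = math.sqrt(half_trace + root)
    minor = math.sqrt(max(half_trace - root, 0.0))
    return (cx, cy), major, minor, theta


def centroid_trajectory(points_by_year):
    """Return [(year, center), ...] ordered by year (hypothetical helper)."""
    return [
        (year, standard_deviational_ellipse(pts)[0])
        for year, pts in sorted(points_by_year.items())
    ]
```

Applying `standard_deviational_ellipse` to the enterprises registered in each year and connecting the successive centers, as `centroid_trajectory` does, gives the kind of trajectory drawn in the figure; at the data volumes described above, the sums would be computed in a distributed fashion.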
Fig. 15.19 Standard deviation ellipses and centroid trajectory of industries in Chongqing, China
15.5.2.2 Spatial Clustering and Aggregation Pattern
Spatial concentration of industries may affect the competitive advantage of regional
economies in studies of agglomeration economies (Porter 2014; Tian et al. 2017).
To explore the clustering patterns of millions of enterprises, we used HPC-supported
spatial clustering technologies. Apache Spark graph computing and a KD-tree have
been used to accelerate DBSCAN (Gao et al. 2017); nevertheless, further progress is
needed to support real-time clustering. Therefore, we developed a grid-based
multiscale clustering method (Gui et al. 2020a) because of its lower computational
complexity and its advantages in analyzing clustered patterns across multiple scales
(Dan et al. 2006). The grid-based multiscale clustering can be accomplished in near
real time. Rendering millions of clusters on the client side is also a nontrivial
challenge. As shown in Fig. 15.20, we used the Kepler library to visualize the
clustering results for the enterprise registration data. Figure 15.20a shows
hundreds of thousands of clusters across China, while Fig. 15.20b illustrates
the coexistence of clusters in the Yangtze River Delta area.
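To make the grid idea concrete, the sketch below bins points into square cells at several cell sizes and keeps the cells whose point count reaches a threshold. It is a deliberately simplified illustration of why grid-based approaches are cheap (one pass over the points per scale), not the multiscale method of Gui et al. (2020a); the function names, cell sizes, and threshold are hypothetical.

```python
import math
from collections import Counter


def grid_counts(points, cell_size):
    """Count points per square grid cell; cells are keyed by (column, row)."""
    counts = Counter()
    for x, y in points:
        counts[(math.floor(x / cell_size), math.floor(y / cell_size))] += 1
    return counts


def multiscale_dense_cells(points, cell_sizes, min_points):
    """For each cell size, keep only cells holding at least min_points points."""
    return {
        size: {
            cell: count
            for cell, count in grid_counts(points, size).items()
            if count >= min_points
        }
        for size in cell_sizes
    }


# Hypothetical planar coordinates (e.g., projected kilometres).
sample = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (5.5, 5.5), (5.6, 5.7), (9.0, 1.0)]
dense = multiscale_dense_cells(sample, cell_sizes=[1.0, 10.0], min_points=2)
```

With the sample above, the fine grid separates two dense cells while the coarse grid merges all points into one, which mirrors how cluster structure changes with the scale of observation.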
We also used Nanocubes and Apache Spark to analyze industrial spatial agglomeration,
both qualitatively and quantitatively (Wang et al. 2020). As shown in Fig. 15.21,
Nanocubes depict the spatial distribution of different industries in 2013. We found
that different industries have different spatial clustering patterns. The industries
shown in Fig. 15.21b are geographically concentrated in the capital or large cities,
while the industries in Fig. 15.21a are more evenly dispersed, which reveals the
patterns of industrial aggregation visually. To reveal the geographic concentration
of economic activities quantitatively, we compared the aggregation of these
industries in Guangdong, China. We developed distributed Ripley’s K functions based
on Apache Spark to accelerate spatial point pattern analysis (Gui et al. 2020b). As
shown in Fig. 15.21c, d, industries in the left panel (Fig. 15.21c) are
geographically concentrated within
(a) Country-level clustering in mainland China
(b) Regional-level clustering in Yangtze River delta area
Fig. 15.20 Visualization of spatial clustering of enterprises in mainland China using Kepler
(a) Visualization of AFAHF, Guangdong, China
(b) Visualization of Social service, Guangdong, China
(c) Ripley’s K result of AFAHF, Guangdong, China
(d) Ripley’s K result of Social service, Guangdong, China
Fig. 15.21 Visualization and quantitative analysis of enterprise spatial cluster patterns using
Ripley’s K (AFAHF denotes the industries of Agriculture, Forestry, Animal Husbandry, and Fishery)
a larger space (within a distance of 520 km) than industries in the right panel
(Fig. 15.21d) in Guangdong (170 km). This result verifies the visualization result in
Fig. 15.21a, b: social service industries are more geographically concentrated than
agriculture, forestry, animal husbandry, and fishery industries.
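For readers unfamiliar with Ripley’s K, the following naive sketch shows the quantity being estimated: for each distance d, the number of ordered point pairs no farther apart than d, scaled by the study area and the number of pairs. It is an illustrative O(n²) single-machine version without edge correction, not the distributed Spark implementation of Gui et al. (2020b); the function name, the sample points, and the study area are assumptions.

```python
import math


def ripley_k(points, distances, area):
    """Naive Ripley's K estimate without edge correction.

    K(d) = area / (n * (n - 1)) * #{ordered pairs (i, j), i != j, dist <= d}
    """
    n = len(points)
    result = []
    for d in distances:
        pairs = 0
        for i in range(n):
            for j in range(n):
                if i != j and math.dist(points[i], points[j]) <= d:
                    pairs += 1
        result.append(area * pairs / (n * (n - 1)))
    return result


# Hypothetical points in a 10 x 10 study area.
sample = [(0, 0), (1, 0), (0, 1), (10, 10)]
k_values = ripley_k(sample, distances=[1.0, 1.5, 20.0], area=100.0)
```

Under complete spatial randomness K(d) is close to πd², so values well above πd² at a given distance suggest clustering at that scale. The quadratic pair count is what makes a distributed implementation attractive for millions of points.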
Visual analysis of enterprise clustering patterns helps us explore the aggregation
of industries at multiple scales. In the following sub-section, we demonstrate how
to use visualization to facilitate the study of spatial correlation among different
industries across space and time.
15.5.2.3 Spatial Autocorrelation and Correlation
Spatial autocorrelation of industries indicates how industries in the studied area
are related to industries in neighboring regions (Cui et al. 2017). To reveal the
spatial autocorrelation of different categories of enterprises, we calculated the
z-values of the Getis-Ord General G (Dubin 1998) for different industries, based on
spatial gridding using Apache Spark. Hot-spot distributions of different kinds of
industries were visualized using Leaflet and ECharts. As shown in Fig. 15.22a, b,
different kinds of industries display different spatial patterns. Primary industries
are more dispersed in space, while secondary industries are geographically
concentrated in the main urban areas. The z-values on the right part of each figure
illustrate how spatial autocorrelation varies across different spatial scales.
Figure 15.22c, d show that the spatial distributions of industries and their spatial
autocorrelation change over the years, and the
(a) Primary industry during 2011-2015 (b) Secondary industry during 2011-2015
(c) Primary industry during 2001-2005 (d) Primary industry during 2006-2010
Fig. 15.22 Spatial autocorrelation and hot-spots visualization of enterprises in Chongqing, China
Fig. 15.23 Province level correlation exploration between number of industries, population, GDP,
and GDP per capita by associating the map with scatter diagrams
z-scores in figures indicate the degree of aggregation for different industry categories
in different periods.
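The General G statistic mentioned above can be sketched for gridded counts as follows. The code computes the observed G and its expectation under no spatial association; the z-value reported in the figures additionally requires the variance of G, which is omitted here for brevity. The function name, the four-cell example, and the binary adjacency weights are assumptions for illustration, and the sketch runs on a single machine rather than on Spark.

```python
def general_g(values, weights):
    """Getis-Ord General G and its expected value.

    values  : non-negative attribute per spatial unit (e.g., enterprises per cell)
    weights : square matrix, weights[i][j] = spatial weight between units i and j
    """
    n = len(values)
    numerator = denominator = weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the statistic excludes self-pairs
            numerator += weights[i][j] * values[i] * values[j]
            denominator += values[i] * values[j]
            weight_sum += weights[i][j]
    observed = numerator / denominator
    expected = weight_sum / (n * (n - 1))
    return observed, expected


# Four grid cells in a row; neighbouring cells have weight 1 (hypothetical data).
cell_counts = [10, 10, 1, 1]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
g_observed, g_expected = general_g(cell_counts, adjacency)
```

An observed G above its expectation, as in this example where the two high-count cells are adjacent, indicates that high values tend to cluster; repeating the calculation for several grid sizes gives scale-dependent profiles of the kind described for Fig. 15.22.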
In addition to spatial autocorrelation analysis, we also explored correlations
between industries and other statistical indexes using visual analytics. As shown
in Fig. 15.23, the correlations among the number of industries, population, GDP, and
GDP per capita were visualized using a choropleth map and scatter diagrams from
D3.js. By linking the map and the scatter diagrams together, correlations between
different indexes and regions can be compared and analyzed interactively. The
aforementioned cases of visual analysis are conducted in physical or geographical
space. In the following sub-section, we further investigate enterprise relations in
network space.
15.5.2.4 Enterprise Network Relation
Network visual analysis can be used to explore relations such as supply chains
between upstream and downstream industries, and cooperation and competition between
enterprises. As shown in Fig. 15.24, we visualized enterprises as nodes of networks
and constructed an edge between two enterprises if they cooperate. We applied a
community detection algorithm to the constructed networks and uncovered the most
influential enterprise communities for different industrial categories. In this
exploration, Neo4J is used to store and manage the enterprise relation data as a
big graph. Graph algorithms included in Neo4J are used to help detect hard-to-find
patterns and structures in the connected data. A force-directed graph from D3.js was
used to show the discovered patterns.
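The chapter does not state which community detection algorithm was run in Neo4J. As a stand-in, the sketch below implements label propagation, one simple and widely used community detection heuristic, in plain Python on an adjacency dictionary. The function name, the deterministic tie-breaking rule, and the toy cooperation network are assumptions for illustration only.

```python
def label_propagation(adjacency, max_iter=20):
    """Assign community labels by iterative majority vote among neighbours.

    adjacency: dict mapping each node to a list of its neighbours.
    Ties keep the current label if it is among the most frequent labels,
    otherwise the largest candidate label is chosen (deterministic choice).
    """
    labels = {node: node for node in adjacency}
    for _ in range(max_iter):
        changed = False
        for node in sorted(adjacency):
            counts = {}
            for neighbour in adjacency[node]:
                lab = labels[neighbour]
                counts[lab] = counts.get(lab, 0) + 1
            if not counts:
                continue  # isolated node keeps its own label
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            new_label = labels[node] if labels[node] in candidates else max(candidates)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


# Toy network: two triangles of cooperating enterprises joined by one edge.
network = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
communities = label_propagation(network)
```

On this toy network the two triangles end up with different labels. In a production setting the same idea would run inside the graph database on the full enterprise relation graph, and the resulting labels could be used to color nodes in the force-directed view.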
Fig. 15.24 Exploring network relation of enterprises by using force-directed graph and Neo4J
graph database
15.6 Conclusions
In this chapter, we introduce high-performance visual analytics technologies for
enabling big socioeconomic data analysis from the perspective of system architecture.
The technologies and application demonstration described here might benefit
researchers and developers who struggle with big data computing and analysis issues
in policy making and interdisciplinary research. They might also give insight into
how to use the latest technologies, software packages, and frameworks for data
storage, computing, and web visualization to build such a visual analytics system.
We take enterprise registration data as an example to demonstrate the capabilities
and potential applications of the proposed high-performance visual analytics
framework. HPC-based data imputation methods show their power in enabling and
accelerating data preprocessing for large data volumes. The feasibility of web-based
dynamic visual analytics is verified by four exemplary case studies: (1)
spatiotemporal distribution trend analysis using heatmaps and histograms provided by
Nanocubes, as well as economic barycenter trajectories and standard deviation
ellipses; (2) spatial clustering pattern analysis using grid-based multiscale
clustering and aggregation pattern analysis using Ripley’s K function; (3) spatial
autocorrelation analysis using the Getis-Ord General G function and correlation
exploration among different statistical indexes by associating maps with statistical
plots; and (4) network relation analysis using a community detection algorithm with
the support of a graph database and a force-directed graph.
The introduced technologies and the proposed framework are not limited to the
illustrated case study; they can be adapted to visual analytics scenarios for other
big socioeconomic data. To fully utilize the power of the introduced technologies,
the entire technology stack needs to be designed thoroughly. The data indexing and
storage methods need to be carefully investigated in terms of the concrete data
models and application
scenarios. HPC-supported analysis algorithms can leverage cutting-edge computing
technologies for computing-intensive applications. The application framework,
visualization libraries, and data transmission and rendering strategies are
indispensable for developing intuitive visualization effects and user-friendly
interaction functions that better support exploratory visual analytics. With the
wide adoption of cloud computing technologies and increasing demand for online
fusion and analysis of big data, web-based visual analytics functions might evolve
into cloud services in the near future, i.e., Visual Analytics as a Service (VAaaS).
To achieve this goal, generic visual analytics capacity must be carefully divided
into fine-grained, application- and domain-specific cloud service components or
functions. Meanwhile, issues relevant to service delivery, computing resource
provisioning, payment, and the protection of data privacy also need to be carefully
designed.
References
Balasubramanian, L., & Sugumaran, M. (2013). A state-of-art in r-tree variants for spatial indexing.
International Journal of Computer Applications, 42(20), 35–41.
Bender, M., Klein, R., Disch, A., & Ebert, A. (2000). A functional framework for web-based
information visualization systems. IEEE Transactions on Visualization & Computer Graphics,
6(1), 8–23.
Chen, M., Gong, Y., Li, Y., Lu, D., & Zhang, H. (2016). Population distribution and urbanization on
both sides of the Hu Huanyong Line: Answering the Premier’s question. Journal of Geographical
Sciences, 26(11), 1593–1610.
Cui, Z., Xie, G., Gui, Z., & Wu, H. (2017). Analyzing the spatiotemporal distribution of different
industries in wuhan city using enterprise registration data. Int. Arch. Photogramm. Remote Sens.
Spatial Inf. Sci., XLII-2/W7, 5–10.
Dan, K., Galun, M., & Brandt, A. (2006). Fast multiscale clustering and manifold identification.
Pattern Recognition, 39(10), 1876–1891.
Deeb, R., Ooms, K., Brychtová, A., Van Eetvelde, V., & De Maeyer, P. (2015). Background and
foreground interaction: influence of complementary colors on the search task. Color Research
& Application, 40(5), 437–445.
Dubin, R. A. (1998). Spatial autocorrelation: A primer. Journal of Housing Economics, 7(4), 304–
327.
Duranton, G., & Overman, H. G. (2005). Testing for localization using micro-geographic data.
Review of Economic Studies, 72(4), 1077–1106.
Eldawy, A. (2014, June). SpatialHadoop: towards flexible and scalable spatial processing using
mapreduce. In Proceedings of the 2014 SIGMOD PhD symposium (pp. 46–50). ACM.
Eldawy, A., & Mokbel, M. F. (2015, April). Spatialhadoop: A mapreduce framework for spatial
data. In Data Engineering (ICDE), 2015 IEEE 31st International Conference on (pp. 1352–1363).
IEEE.
Fahmy, M. M., Elghandour, I., & Nagi, M. (2017). CoS-HDFS: co-locating geo-distributed spatial
data in hadoop distributed file system. IEEE/ACM International Conference on Big Data Computing
Applications and Technologies (pp. 123–132). IEEE.
Fathy, Y., Barnaghi, P., & Tafazolli, R. (2017). Distributed spatial indexing for the Internet of Things
data management. IEEE: Integrated Network and Service Management.
Fox, A., Eichelberger, C., Hughes, J., & Lyon, S. (2013, October). Spatio-temporal indexing in non-
relational distributed databases. In Big Data, 2013 IEEE International Conference on (pp. 291–
299). IEEE.
Friedman, E., & Tzoumas, K. (2016). Introduction to Apache Flink: Stream Processing for Real
Time and Beyond. O’Reilly Media, Inc.
Gao, X., Gui, Z., Long, X., Li, F., Wu, H., & Qin, K. (2017). KDSG-DBSCAN: A High Performance
DBSCAN Algorithm Based on KD-Tree and Spark GraphX. Geography and Geo-Information
Science, 33(6), 1–7.
Gui, Z., Peng, D., Wu, H., Long, X. (2020a). MSGC: Multi-Scale Grid Clustering via Analytical
Granularity and Visual Cognition for Detecting Hierarchical Spatial Patterns. Future Generation
Computer Systems, 112, 1038–1056.
Gui, Z., Wang, Y., Cui, Z., Peng, D., Wu, J., Ma, Z., Luo, S., Wu, H. (2020b). Developing Apache
Spark based Ripley’s K Functions for Accelerating Spatiotemporal Point Pattern Analysis. Int.
Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B4-2020, 545–552.
Hennig, L., Thomas, P., Ai, R., Kirschnick, J., Wang, H., Pannier, J., … & Uszkoreit, H. (2016).
Real-Time Discovery and Geospatial Visualization of Mobility and Industry Events from Large-Scale,
Heterogeneous Data Streams. Proceedings of ACL-2016 System Demonstrations, 37–42.
Hu, H. Y. (1935). The distribution of population in china, with statistics and maps. Acta Geographica
Sinica, 15(2), 1–24.
Hu, F., Yang, C., Jiang, Y., Song, W., Duffy, D., Schnase, J., & Lee, T. (2018). A hierarchical
indexing strategy for optimizing Apache Spark with HDFS to efficiently query big geospatial
raster data. International Journal of Digital Earth.
Kamel, I., Talha, A. M., & Aghbari, Z. A. (2017). Dynamic spatial index for efficient query
processing on the cloud. Journal of Cloud Computing, 6(1), 5.
Karimov, J., Rabl, T., Katsifodimos, A., Samarev, R., Heiskanen, H., & Markl, V. (2018).
Benchmarking Distributed Stream Processing Engines. arXiv preprint arXiv:1802.08496.
Kini, A., & Emanuele, R. (2014). Geotrellis: Adding geospatial capabilities to spark. Spark Summit.
Levenberg, J. (2002). Fast view-dependent level-of-detail rendering using cached geometry.
Visualization, 2002. Vis (pp. 259–266). IEEE.
Li, F., Gui, Z., Wu, H., Gong, J., Wang, Y., Tian, S., et al. (2018). Big enterprise registration data
imputation: supporting spatiotemporal analysis of industries in china. Computers, Environment
and Urban Systems, 70, 9–23.
Li, J., Zhang, W., Chen, H., & Yu, J. (2015). The spatial distribution of industries in transitional
China: A study of Beijing. Habitat International, 49, 33–44.
Marcon, E., & Puech, F. (2010). Measures of the geographic concentration of industries: improving
distance-based methods. Journal of Economic Geography, 10(5), 745–762.
Mockford, K. (2004). Web services architecture. BT Technology Journal, 22(1), 19–26.
Parr, J. B. (2014). The regional economy, spatial structure and regional urban systems. Regional
Studies, 48(12), 1926–1938.
Porter, M. E. (2014). Competitive advantage, agglomeration economies, and regional policy.
International Regional Science Review, 19(1), 85–90.
Rigaux, P., Scholl, M., & Voisard, A. (2002). Spatial Databases: with application to GIS (p. 410).
San Francisco: Morgan Kaufmann.
Song, Y., Gui, Z., Wu, H., & Wei, Y. (2017). A web-based framework for visualizing industrial
spatiotemporal distribution using standard deviational ellipse and shifting routes of gravity
centers. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-2/W7, 129–135.
Stojanovic, N., & Stojanovic, D. (2013). High–performance computing in GIS: techniques and
applications. International Journal of Reasoning-based Intelligent Systems, 5(1), 42–49.
Sun, L., Lu, B., & Sun, J. (2005). Design and study of web application framework based on struts.
Computer Engineering, 31(8), 57–60.
Theodoridis, Y., Stefanakis, E., & Sellis, T. (2000). Efficient cost models for spatial queries using
r-trees. Knowledge & Data Engineering IEEE Transactions on, 12(1), 19–32.
Tian, S., Wang, J., Gui, Z., Wu, H., & Wang, Y. (2017). A case study: exploring industrial
agglomeration of manufacturing industries in Shanghai using Duranton and Overman’s K-density
function. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-2/W7, 149–154.
Wang, X., Dian-Ting, W. U., & Xiao, M. (2006). Industrial development and moving of Chinese
economic barycenter. Economic Geography.
Wang, Y., Gui, Z., Wu, H., Peng, D., Wu, J., Cui, Z. (2020). Optimizing and Accelerating Space-
Time Ripley’s K Function based on Apache Spark for Distributed Spatiotemporal Point Pattern
Analysis. Future Generation Computer Systems, 105, 96-118.
Ye, X., Shi, X., & Chen, Z. (2017). Scalable near-repeat and event chain calculations over
heterogeneous computer architecture and systems. Big Earth Data, 1(1–2), 191–203.
Yu, J., Zhang, Z., & Sarwat, M. (2018). Spatial data management in apache spark: the GeoSpark
perspective and beyond. GeoInformatica, 1–42.
Zhang, F., Zheng, Y., Xu, D., Du, Z., Wang, Y., Liu, R., et al. (2016). Real-time spatial queries for
moving objects using storm topology. ISPRS International Journal of Geo-Information, 5(10),
178.
Zhang, X., & Du, Z. (2017). Spatial Indexing. The Geographic Information Science & Technology
Body of Knowledge (4th Quarter 2017 Edition), John P. Wilson (ed).
https://doi.org/10.22224/gistbok/2017.4.12.
Zia, K., Farrahi, K., Riener, A., & Ferscha, A. (2013). An agent-based parallel geo-simulation of
urban mobility during city-scale evacuation. Simulation, 89(10), 1184–1214.
Chapter 16
Demystifying the Inequality in Urbanization in China Through the Lens of Land Use
Jinlong Gao and Jianglong Chen
16.1 Introduction
Regional inequality is an important subject of academic inquiry and one of the major
concerns facing governments, as it may threaten national unity and social stability
(Ravallion 2014; Wei 2015; Iammarino et al. 2018). The trends and driving forces
underlying regional inequality have been the subject of heated debate, especially
since the late 1980s (e.g., Liu 2006; Florida and Mellander 2016; Paredes et al. 2016;
Lee et al. 2018). The neoclassical growth model predicts that poor nations/regions
tend to catch up with rich ones in terms of per capita product or income because of
the relative homogeneity in technology, preferences, and institutions (Martin and
Sunley 1998; Scott 2000; Wei 2000; Rey and Janikas 2005). While some studies support
the theory of neoclassical convergence, others find a lack of convergence and even
increasing regional inequality in developing economies such as China and India
(Liao and Wei 2012; Ravallion 2014; Xie and Zhou 2014; Wei 2017; Yenneti et al. 2017).
From a methodological perspective, the commonly used Gini coefficient, Theil index,
and coefficient of variation, which can well examine the temporal variation of
regional inequality based on socioeconomic data, have been challenged for ignoring
geographical space. Specifically, one can hardly figure out exactly where the gap
is, but merely knows that there is a gap (Li et al. 2015; Gao et al. 2019a). As Li
and Gibson (2013) argued, much of the apparent increase in inter-provincial
inequality was a statistical artifact caused by the distortion of non-hukou migrants
J. Gao · J. Chen (✉)
Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing 210008,
China
e-mail: jlchen@niglas.ac.cn
J. Gao
e-mail: jlgao@nilgals.ac.cn
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
X. Ye and H. Lin (eds.), Spatial Synthesis, Human Dynamics in Smart Cities,
https://doi.org/10.1007/978-3-030-52734-1_16
in China. Others have argued that gross domestic product (GDP) statistics are
distorted under the pressure of competition over political achievements among local
government officials in China (Liu et al. 2013), and some have even suggested
abandoning GDP as a measure of national success (Costanza et al. 2014). Fortunately,
relatively stable urban land use statistics can be acquired from remote sensing
images. Sawyer (1975) pointed out four decades ago that urban form flowed out of,
and must remain consistent with, the basic economic structure of the society of
which it was a part. Urban land can thus be employed as an important indicator of
regional/urban development, particularly against the background of new urbanization
(Lin 2014; Gao et al. 2015; Lee et al. 2016; Li et al. 2018). Specifically, the
accretion and replacement of urban land can well characterize the flow/transformation
of population and energy, and can provide clues for further gauging the evolution of
inequality (Bai et al. 2014; Ding and Zhao 2014; Gao et al. 2017, 2020a).
Scholars have argued that a polarized pattern of demographic urbanization has been
forming around the mega-city regions in China (Fang et al. 2015), and have claimed
that there should be a generally universal framework of urbanization across the
country (Wang and Liu 2015; Chen et al. 2018). From the perspective of land
urbanization, however, these claims are far from the truth (Li et al. 2017; Wei et al.
2017; Li et al. 2018). Moreover, relatively little is known about the disparity among
regions in China in terms of land use, and more effort is still needed to examine
urban inequality and differentiation (Lang et al. 2018; Zeng et al. 2018; Gao et al.
2019a). At the same time, the unfolding complexity of land urbanization makes it
clear that the quantitative understanding, optimization, and adjustment of urban
land use patterns are major issues for sustainable land use (Verburg et al. 2004).
Nevertheless, the lack of a solid understanding of these patterns makes it difficult
to address the ongoing challenges posed by the volatility and complexity of land use
policy in China (Long 2014). The situation consequently poses a number of challenging
questions for the country: (1) How can the general situation of land urbanization and
its consequent spatial phenomena in China be depicted correctly? (2) How can a
quantitative approach be adapted to address the distinctive patterns of land
urbanization? (3) What are the underlying drivers of land urbanization patterns?
With these introductory remarks in mind, we analyze the patterns of inequality in
urbanization for the period 2000–2015 from a land use perspective, and proceed as
follows. The next section presents a brief discussion of data and methodology. We
then examine the pattern and evolution of land urbanization at the county level.
Thereafter, we model the determinants of land urbanization under the given analytic
framework. Finally, we conclude with major findings and policy implications.
16.2 Data and Methodology
16.2.1 Data Sources
Employing data on urban and rural construction land (including urban, industrial
and mining, rural residential, and transportation land) acquired from remote sensing
images, this study analyzes and discusses the spatial differentiation of land
urbanization in China at the county level since 2000. Socioeconomic data required
for the influencing factors are extracted from the Statistical Yearbook of Social
Economy of Counties (cities) in China in 2001 and the Statistical Yearbook of
Counties in China (for counties and cities) in 2016. For spatial representation,
data on traffic networks, terrain conditions, precipitation, and the vector
boundaries of administrative divisions are provided by the Resource and Environment
Data Cloud Platform of the Chinese Academy of Sciences (http://www.resdc.cn/).
Considering the particularity of land urbanization in developing countries and data
availability, the land urbanization rate (LUR) was employed as the dependent
variable. Independent variables were selected in terms of population size, economic
level, industrial structure, urban characteristics, and geographical location
(see Table 16.1). In this chapter, specific variable selection was based on the
following assumptions: (1) Urban population growth is the main source of demand for
urban land (Wu et al. 2015; Chen et al. 2016), and the larger the population size,
the higher the corresponding level of land urbanization (Deng et al. 2008; Wu and
Zhang 2012; Gao et al. 2015). (2) Economic development can effectively increase the
income of urban residents, improve living conditions in cities and towns, and
stimulate the transfer of agricultural populations to cities and towns, thus
increasing the demand for land for housing, industry, and transportation (Deng et al.
2010; Chen et al. 2016; Li et al. 2018). (3) Land urbanization inevitably promotes
the transformation of industrial structures by reducing the share of agricultural
sectors (Liu et al. 2014; Chen et al. 2016). The development of the service industry
and the increasing intensity of the manufacturing sector may have a negative impact
on regional land urbanization. (4) Urban characteristics, including administrative
level and population density, have also been recognized as having a significant
impact on the expansion of urban land (Gao et al. 2014; Li et al. 2015). The higher
the administrative level or density of a city, the stronger its ability to
agglomerate resources and the higher its level of development, which is expected to
accelerate land urbanization to a certain extent. (5) Favorable physical conditions
(i.e., geographical location and terrain) can better meet the requirements of urban
land expansion (Liao and Wei 2014; Chen et al. 2016), which is conducive to land
urbanization.
Table 16.1 Influencing factors of land urbanization at county level

| Categories | Variables | Definitions |
|---|---|---|
| Population growth | Ratio of demographic urbanization (DUrban) | Urban population/permanent population |
| Economic development | Per capita GDP (PGDP) | Gross domestic product (GDP)/permanent population |
| | Fixed investment (FInvest) | Total amount of fixed investment/GDP |
| | Fiscal revenue (Finance) | Budget revenue/GDP |
| Industrial structure | Industrialization (Indust) | Non-agricultural value added/GDP |
| | Intensification (ADSIndust) | Gross industrial output value above designated size/GDP |
| | Development of service (Service) | Added value of tertiary industry/non-farming gross product |
| Urban characteristics | Administrative hierarchy (Admin) | Districts in province-level municipality, sub-provincial city, provincial capital city, general city, and the county (county-level city) take values of 5 to 1, respectively |
| | Population density (PDen) | Permanent population/area of district or counties |
| Geographical features | Topographic relief (Terrain) | Stemming from Feng et al. (2007) |
| | Annual precipitation (Precipit) | County average annual precipitation |
| | Density of roads (Roads) | Total road mileage/area of district or counties |
| | Central region (Central) | Dummy variable, counties in central are 1, others are 0 |
| | Western region (West) | Dummy variable, counties in west are 1, others are 0 |
| | Northeastern region (NEast) | Dummy variable, counties in northeast are 1, others are 0 |
16.2.2 Methodology
Drawing on the land conversion index (Lin et al. 2018) and land urbanization quality
(Zhang and Wang 2018), we calculate an index of LUR for counties, that is, the
proportion of urban, industrial and mining, and transportation land used in cities
and towns relative to the total urban and rural construction land (Yang et al.
2018). This index not only describes the level of land urbanization, but also
reflects the changes in land use in the urbanization process. The formula for LUR
is as follows:
$$\mathrm{LUR} = \frac{ul + il + tl}{ul + il + tl + rl}$$
where ul denotes the scale of urban land, il denotes the scale of industrial and
mining land, tl denotes the scale of transportation land, and rl denotes the scale
of land for rural residents. The land urbanization pattern in China at the county
level during 2000–2015 was mapped using the Spatial Analyst tools in ArcGIS 10.2.
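As a quick worked instance of the LUR formula, the following sketch computes the index for one county from the four land areas. The function name and the sample areas are hypothetical; any consistent area unit can be used because the index is a ratio.

```python
def land_urbanization_rate(ul, il, tl, rl):
    """LUR = (ul + il + tl) / (ul + il + tl + rl).

    ul: urban land, il: industrial and mining land,
    tl: transportation land, rl: rural residential land (same area unit).
    """
    total = ul + il + tl + rl
    if total <= 0:
        raise ValueError("total construction land must be positive")
    return (ul + il + tl) / total


# Hypothetical county: 20 + 5 + 5 units of urban-type land, 70 units of rural land.
example_lur = land_urbanization_rate(ul=20, il=5, tl=5, rl=70)
```

Here 30 of the 100 units of construction land are urban, industrial and mining, or transportation land, so the county's LUR is 30%.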
In this chapter, the county was taken as the basic research unit. Because a linear
regression model (LRM) yields only “global” estimates and cannot capture the spatial
variation in the effects of independent variables, this chapter combined ordinary
least squares (OLS) and geographically weighted regression (GWR) models to measure
the influence of the above-mentioned factors on land urbanization. Given a series of
observed values of the explanatory variables $x_{ij}$ and the explained variable
$y_i$, with i = 1, 2, …, m and j = 1, 2, …, n, the classical global regression model
is as follows:
$$y_i = \beta_0 + \sum_{j=1}^{n} \beta_j x_{ij} + \varepsilon_i, \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$$
where $\varepsilon_i$ denotes the error term of the regression model, and each
regression coefficient $\beta_j$ is assumed to be a constant. OLS is generally used
to estimate the model parameters, and GWR extends the OLS model. In GWR, the
regression coefficients are no longer constants estimated from global information;
instead, they are obtained by local regression on a subset of the data close to each
observation, so they vary with geographic location. The GWR model can be written as
follows:
$$y_i = \beta_0(m_i, n_i) + \sum_{j=1}^{n} \beta_j(m_i, n_i)\, x_{ij} + \varepsilon_i$$
where (m_i, n_i) denotes the central geographic coordinates of the ith county unit, and
β_j(m_i, n_i) denotes the value of the continuous function β_j(m, n) for variable x_ij
at the ith county unit.
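The local estimation behind GWR can be sketched as one weighted least squares fit per county. The chapter does not state which kernel or software was used for the GWR estimation, so the fixed Gaussian kernel, the function name `gwr_fit`, and the `bandwidth` parameter below are assumptions for illustration only.

```python
import numpy as np


def gwr_fit(coords, X, y, bandwidth):
    """Estimate local coefficients beta(m_i, n_i) at every observation.

    coords: (m, 2) array of county-center coordinates
    X:      (m, n) array of explanatory variables
    y:      (m,) array of the explained variable
    bandwidth: kernel bandwidth in the same unit as coords (assumed fixed)
    Returns an (m, n + 1) array; column 0 holds the local intercepts.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m = X.shape[0]
    design = np.column_stack([np.ones(m), X])  # add the intercept column
    betas = np.empty((m, design.shape[1]))
    for i in range(m):
        # Distance from county i to every county, turned into Gaussian weights
        # (the kernel choice is an assumption, not taken from the chapter).
        dist = np.linalg.norm(coords - coords[i], axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        # Weighted least squares: solve (X' W X) beta = X' W y.
        xtw = design.T * weights
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas
```

With a very large bandwidth all weights approach 1 and every row of the result approaches the global OLS estimate, which is one way to see GWR as an extension of OLS.
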
16.3 Spatial Inequality in Land Urbanization
16.3.1 Land Urbanization Patterns by County in 2000
According to the remote sensing data, LUR in China was 26.33% in 2000. The eastern
coastal regions and the old industrial bases in northeast China had the highest land
urbanization levels, 31.26% and 26.03%, respectively, while the central and western
regions had relatively low levels of 20.74% and 23.64%. This is basically consistent
with the regional pattern of population urbanization (Fang et al. 2015). At the county
level, land urbanization is generally lower: LUR is below 50% in over 75% of the 4342
counties (Fig. 16.1). With reference to the stage of total urbanization in China,
land urbanization levels were classified into 5 grades, namely low (≤10%), medium-low
(10%–30%), medium (30%–50%), medium-high (50%–70%), and high (>70%).
As Fig. 16.1 indicates, the number of districts/counties with low land urbanization is
roughly equal to the number with medium-low land urbanization, each accounting for
about 30% of the total. By contrast, districts/counties at the medium level or above
are relatively few: the medium, medium-high, and high grades account for 17.87%,
10.69%, and 14.03%, respectively.
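The five-grade classification described above can be expressed as a simple binning step. The sketch below assumes that each grade includes its upper bound (for example, exactly 10% counts as low); the names `classify_lur` and `grade_shares` are hypothetical.

```python
import numpy as np

GRADE_LABELS = ["low", "medium-low", "medium", "medium-high", "high"]
GRADE_BOUNDS = [0.10, 0.30, 0.50, 0.70]  # upper bounds of the first four grades


def classify_lur(lur_values):
    """Map LUR values (as fractions, e.g. 0.26) to the five grades."""
    # right=True makes each interval closed on the right: (lower, upper].
    idx = np.digitize(lur_values, GRADE_BOUNDS, right=True)
    return [GRADE_LABELS[i] for i in np.atleast_1d(idx)]


def grade_shares(lur_values):
    """Return the share of counties falling in each grade."""
    idx = np.digitize(np.asarray(lur_values, dtype=float), GRADE_BOUNDS, right=True)
    counts = np.bincount(idx, minlength=len(GRADE_LABELS))
    return dict(zip(GRADE_LABELS, counts / counts.sum()))
```
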
Geographically, the north-south differentiation of land urbanization levels in
Chinese counties is more apparent than the east-west or coast-inland differentiation.
Taking the Qinling Mountains-Huaihe River line as the boundary, levels of land
urbanization in southern counties are clearly higher than those in northern counties.
Urban agglomeration areas such as the Yangtze River Delta, Pearl River Delta, West
Coast of Taiwan Straits, Chengdu-Chongqing, and Middle Reaches of Yangtze River stand
out as areas with high land urbanization levels, followed by the East Liaoning
Peninsula and Shandong Peninsula. In contrast, traditional agricultural areas such as
Huang-Huai-Hai, Northeast China, and Shaanxi-Gansu-Ningxia have low levels of
land urbanization (Fig. 16.2).
Fig. 16.1 Lorenz curve of land urbanization at the county level in 2000 (axes: ratio of land urbanization (LUR); accumulation of counties)
Fig. 16.2 Spatial patterns of land urbanization at the county level in 2000
16.3.2 Land Urbanization Patterns by County in 2015
With further acceleration of population urbanization, the level of land urbanization
increased correspondingly. East China had the highest level in 2015, 43.29%, followed
by its western and central counterparts with 42.22% and 34.57%, respectively. However,
owing to continuous population shrinkage and economic decline, land urbanization in
northeast China was relatively slow, with LUR increasing by only 4.56 percentage
points in 15 years. During this period, the nation proposed the new-type urbanization
strategy of “coordinating urban and rural development, and promoting urbanization
actively and steadily,” to slow rapid urbanization and narrow the differences in land
urbanization level among regions. The coefficient of variation of county-level LUR
decreased from 0.775 in 2000 to 0.584 in 2015, and regional differences in land
urbanization tended to converge (Fig. 16.3).
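The dispersion measure quoted above (the variable coefficient, i.e. the coefficient of variation) is the standard deviation divided by the mean. The chapter does not say whether the population or the sample standard deviation was used; the sketch below assumes the population form, and the function name is hypothetical.

```python
import numpy as np


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form, ddof=0 assumed)."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=0) / values.mean()


# A falling value between two years means county-level LUR has become less unequal.
print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))
```
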
Fig. 16.3 Lorenz curve of land urbanization at the county level in 2015 (axes: ratio of land urbanization (LUR); accumulation of counties)
As Fig. 16.4 shows, the number of districts/counties with low or medium-low land
urbanization fell from 2493 in 2000 to 1632 in 2015, a decrease equal to nearly 20%
of all study units. Districts/counties at the medium-high level or above accounted
for over 40% of all study units. The number of districts/counties in which LUR was
over 70% increased by 11.32% compared with that in 2000, and overall LUR in counties
improved significantly. Regions with high land urbanization levels include the region
south of the Qinling Mountains-Huaihe River line, which expanded, and the southeastern
coastal areas, such as the Pearl River Delta and the west coast of the Taiwan Straits.
The main urban agglomerations along the Yangtze River Economic Belt became growth
poles of land urbanization. Moreover, there were large areas with high land
urbanization in the northwestern regions and the Inner Mongolia-Shanxi area. A
probable reason is that many petroleum- and coal-based resource cities are
concentrated in these areas, where large-scale resource exploitation increased
industrial and mining land use.
Fig. 16.4 Spatial patterns of land urbanization at the county level in 2015
16.3.3 Evolution of Land Urbanization Patterns in Chinese
Counties
From 2000 to 2015, LUR in China increased from 26.33% to 39.63%, with an average
annual growth rate of 2.77%. Because of their lower base, counties in central and
western China witnessed the most rapid growth, whereas growth in the northeast was
the slowest, which might be a result of economic recession and population decline
(Table 16.2). Based on the growth rate of demographic urbanization, and taking into
account the conclusions of existing studies on the coupling relationship between
population and land urbanization, we divide the annual growth rate of land
urbanization into 5 groups, namely decrease (≤0), slowly increase (0–1%), increase
(1–3%), rapid increase (3–5%), and fantastically increase (>5%). Results of spatial
statistics and linear interpolation show that the annual average growth rate of land
urbanization exceeds 5% in over 20% of counties, followed by counties with an annual
average growth rate of 3–5%, which account for 18.42%. Counties with an annual
average growth rate under 1% account for 14.03%, and these counties have a relatively
high level of land urbanization on the whole.

Table 16.2 Regional difference of land urbanization, 2000–2015
Region         2000 (%)   2015 (%)   Change, 2000–2015 (%)   Average growth rate (%)
Eastern        31.26      43.25      11.99                   2.19
Central        20.74      34.57      13.83                   3.47
Western        23.64      42.22      18.58                   3.94
Northeastern   25.03      30.59       5.56                   1.35
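The chapter does not spell out how the average annual growth rate is computed. A compound annual rate over the 15 years is assumed in the sketch below; under that assumption the start and end values in Table 16.2 give rates that agree with the tabulated ones to within about 0.01 percentage points. The function name is hypothetical.

```python
def average_annual_growth(start, end, years):
    """Compound annual growth rate between two LUR values (assumed definition)."""
    return (end / start) ** (1.0 / years) - 1.0


# Start and end values (%) taken from Table 16.2.
REGIONS = {
    "Eastern": (31.26, 43.25),
    "Central": (20.74, 34.57),
    "Western": (23.64, 42.22),
    "Northeastern": (25.03, 30.59),
}

for name, (lur_2000, lur_2015) in REGIONS.items():
    rate = average_annual_growth(lur_2000, lur_2015, years=15)
    print(f"{name}: {100 * rate:.2f}% per year")
```
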
Dividing the country along the “Hu Line”, LUR in the northwestern part increased from
26.23% in 2000 to 38.39% in 2015, an annual growth rate of 2.57%, while LUR in the
southeastern part increased from 21.33% to 40.77%, an annual growth rate of 4.41%.
As Fig. 16.5 shows, counties with higher LUR are primarily concentrated in
the middle reaches of the Yangtze River, the Wanjiang region along the Yangtze River
in Anhui province, Nanchang-Jiujiang in Jiangxi province, the central region of Yunnan
province, the Gansu-Ningxia region, the central region of Inner Mongolia, and the
central region of Xinjiang. Regions surrounding provincial capitals such as Nanjing,
Ji’nan, Hefei, Nanchang, Taiyuan, Hohhot, and Guiyang emerged as hotspots of land
urbanization over the past decade and a half.
Fig. 16.5 Changing patterns of land urbanization at the county level, 2000–2015
16.3.4 Land Urbanization Types in Chinese Counties
By overlaying the map of land urbanization in 2000 (grouped into 3 basic levels: low
and medium-low, medium, and medium-high and high) with the map of land urbanization
change from 2000 to 2015 (5 growth-rate groups), we obtain 15 types of land
urbanization. As Fig. 16.6 shows, counties with both a medium basic level and high
growth rates (i.e., increase, rapid increase, and fantastically increase) account for
as much as 44.29%, and are mainly distributed on the peripheries of the Yangtze River
Delta, the Pearl River Delta, and other urban agglomerations mentioned in the National
New-type Urbanization Plan (Fang et al. 2015). A further 715 counties (about 16.45% of
the total) with low LUR in 2000 experienced land urbanization growth of over 1% up to
2015. In particular, counties in the Huang-Huai-Hai plain and the central region of
Inner Mongolia show an obvious catching-up trend in the rate of land urbanization. In
addition, about 16% of all counties witnessed a gradual decrease of LUR during
2000–2015, the majority of which had a high basic level and were concentrated in
northeastern China. On the whole, land urbanization in Chinese counties shows a
convergent catching-up trend of “the lower, the faster and the higher, the slower”.
Fig. 16.6 Development types of land urbanization at the county level. Note: L, M, and H denote
the low, medium, and high levels of land urbanization in 2000; D, S, I, R, and F denote decrease,
slowly increase, increase, rapid increase, and fantastically increase during 2000–2015
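The overlay of 3 base levels and 5 growth groups can be sketched as a lookup that assigns each county a two-letter type. The letter codes follow the note to Fig. 16.6; the function name, the concatenated code format (e.g. "LR"), and the choice to include each upper bound in the lower class are assumptions.

```python
from bisect import bisect_left

BASE_BOUNDS = [0.30, 0.50]               # L: <=30%, M: 30-50%, H: >50% (LUR in 2000)
BASE_CODES = ["L", "M", "H"]
GROWTH_BOUNDS = [0.0, 0.01, 0.03, 0.05]  # annual growth rate of LUR, 2000-2015
GROWTH_CODES = ["D", "S", "I", "R", "F"]


def development_type(lur_2000, annual_growth):
    """Combine the 2000 base level with the 2000-2015 growth group."""
    # bisect_left places a value equal to a bound in the lower class.
    base = BASE_CODES[bisect_left(BASE_BOUNDS, lur_2000)]
    growth = GROWTH_CODES[bisect_left(GROWTH_BOUNDS, annual_growth)]
    return base + growth


# All 15 possible types, from "LD" to "HF".
ALL_TYPES = [b + g for b in BASE_CODES for g in GROWTH_CODES]
```
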
16.4 Determinants of Spatial Inequality in Land
Urbanization
16.4.1 Comprehensive Analysis of Elements Based
on the OLS Model
Employing the Z-score method, we standardized the 15 index variables listed in
Table 16.1 with the help of SPSS software. Thereafter, the variance inflation factor
(VIF) was applied to test for multicollinearity. The VIFs of all variables in
2000 and 2015 are smaller than 3, indicating no serious multicollinearity among the
variables (Table 16.3). According to the fitting results of the OLS model, the
above-mentioned variables explain the inequality patterns of land urbanization in
counties well in both years. Both models are highly significant overall (p < 0.01),
with a goodness of fit (adjusted R²) of 40.5% in 2000 and 53.6% in 2015, indicating
that the variables’ ability to explain land urbanization in Chinese counties
increased over time.
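The preprocessing steps described above (Z-score standardization, then a VIF check) can be sketched with NumPy. The chapter used SPSS, so this is not the authors' procedure; the function names are hypothetical, and the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np


def zscore(X):
    """Standardize each column to mean 0 and unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the rest."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        _, r2 = ols(others, X[:, j])
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)
```

A VIF close to 1 means a variable is nearly uncorrelated with the others; the chapter treats values below 3 as showing no multicollinearity problem.
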
In 2000, fixed asset investment (FInvest), industrial intensification (ADSIndust),
and the dummy variable for the central region fail the significance test at the 95%
confidence level, whereas the other 12 variables all have a significant impact on land
urbanization, and their effects are consistent with theoretical expectations.
Specifically, counties with higher levels of population urbanization generally have
higher LUR. Economic development factors such as per capita GDP and local fiscal
revenue also have a positive effect on land urbanization.
In terms of local industrial structure, a higher industrialization level corresponds
to higher demand for urban construction land and thus to a higher level of land
urbanization; service industry development, however, promotes the intensive
use of urban land and thereby restrains the unreasonable development of land
urbanization. This is similar to the conclusions drawn by Chen et al. (2016b). During
the acceleration of urbanization, cities with higher urbanization levels have a
greater intensity of land expansion (Li et al. 2015b), and the negative feedback of
high-density population agglomeration on land urbanization is also significant. The
expectation that counties with smaller terrain relief, higher average annual
precipitation, and denser road networks have higher LUR is likewise supported.
With regional economic and social conditions held equal, the level of land
urbanization in northeastern China is 10.3% lower than that in eastern China, while
that in the central and western regions is higher than that in eastern China,
reflecting the complex influence of regional differences in development path and
regional policy on land urbanization.
Table 16.3 Result of OLS model for land urbanization
Variables      2000                                   2015
               Coeff.   Std coeff.   Sig.    VIF      Coeff.   Std coeff.   Sig.    VIF
C              −0.397                0.069            −0.356                0.047
DUrban          0.052    0.062       0.000   1.376     0.035    0.036       0.117   1.008
PGDP            0.069    0.078       0.004   1.990     0.108    0.121       0.000   1.446
FInvest         0.057    0.062       0.266   1.204     0.098    0.099       0.047   1.219
Finance         0.098    0.099       0.001   1.200     0.063    0.067       0.000   1.112
Indust          0.263    0.147       0.001   1.793     0.144    0.154       0.000   1.859
ADSIndust      −0.011   −0.014       0.668   1.306    −0.098   −0.104       0.000   1.720
Service        −0.060   −0.045       0.029   1.240    −0.021   −0.021       0.401   1.508
Admin           0.104    0.010       0.000   1.065     0.206    0.062       0.030   1.131
PDen           −0.098   −0.122       0.000   1.417    −0.110   −0.120       0.000   1.700
Terrain        −0.124   −0.168       0.000   1.488    −0.129   −0.153       0.000   1.410
Precipit        0.190    0.241       0.000   1.116     0.240    0.278       0.000   1.086
Roads           0.002    0.007       0.036   1.005     0.047    0.047       0.052   1.335
Central         0.063    0.028       0.358   1.798     0.038    0.019       0.040   1.926
West            0.044    0.021       0.044   2.590     0.021    0.014       0.002   2.527
NEast          −0.369   −0.103       0.000   1.572    −0.339   −0.103       0.000   1.498
R²              0.412                                  0.542
Adjusted R²     0.405                                  0.536
F-statistics   14.357                                 21.951
Sig.            0.000                                  0.000
The model results for 2015 show that, except for population urbanization and
development of the service industry, the coefficients of all variables pass the
significance test at the 90% confidence level. As industrialization proceeds, the
larger the enterprise, the higher its level of land intensification, which restrains
the demand for land to a certain extent and thus has a negative effect on the
level of land urbanization. The effect of economic development, especially fixed asset
investment, on land urbanization is further highlighted in 2015. The coefficient of
administrative level is still positive but less significant, indicating a weakening
intervention of local government in land urbanization. Although disparities in
development levels among counties and policy differences in land urbanization are
long-standing, the differences have converged to some extent.
16.4.2 Spatial Heterogeneity Analysis of Elements Based
on GWR Model
This section analyzes how the drivers of land urbanization vary with geographical
location. Applying the GWR model to 2000 and 2015, we further produce, in a more
rigorous multivariate setting, a series of coefficients for counties located in
different regions. The spatial heterogeneity of industrialization level (Indust) and
urban administration (Admin) is not significant, so these two variables are excluded
from the model. The goodness of fit of the model in 2000 and 2015 is 67.4% and 76.5%,
respectively, which is significantly superior to the OLS model (Table 16.4). In order
to compare the influence of each variable on land urbanization,
statistical analysis was conducted on the fitted coefficient values of the sample
points in these two years.
As Table 16.5 reports, all explanatory variables reach a significance level of 99.9%
in 2000. In terms of the direction of influence, the share of positive coefficients
for population urbanization, per capita GDP, and fixed asset investment is over 70%,
while the share of negative coefficients for industrial intensification, service
industry, and population density is over 70%. By comparison, fixed asset investment
fails the significance test at the 95% level in 2015 (Table 16.6). Furthermore,
Table 16.4 Test result of GWR model for land urbanization
Parameter      2000           2015
Bandwidth      409698.3931    632626.0808
AICc           330.593928     10.434024
R²             0.713215       0.783086
Adjusted R²    0.674371       0.764637
Table 16.5 Statistical results of GWR model coefficient values in 2000
Variables      Min      L-quantile   Median   U-quantile   Max     Mean     Positive (%)   Negative (%)
DUrban***      −0.398    0.029        0.137    0.287       1.079    0.190   79.71          20.30
PGDP***        −0.594   −0.018        0.107    0.155       0.635    0.074   72.06          27.94
FInvest***     −0.563   −0.025        0.117    0.230       0.761    0.107   70.94          29.06
Finance***     −4.794   −1.031       −0.047    0.482       3.870   −0.234   48.39          51.61
ADSIndust***   −0.864   −0.283       −0.165   −0.066       0.457   −0.173   10.21          89.79
Service***     −0.918   −0.299       −0.190   −0.068       0.267   −0.187   11.90          88.10
PDen***        −0.895   −0.271       −0.119    0.021       1.039   −0.116   29.96          70.04
Terrain***     −0.240   −0.051        0.020    0.203       0.393    0.071   57.46          42.54
Roads***       −0.129   −0.003        0.001    0.006       0.041    0.002   59.69          40.31
Precipit***    −0.240   −0.051        0.020    0.203       0.393    0.071   56.82          43.18
Note: *, **, and *** denote significance at the 90%, 95%, and 99% levels, respectively
Table 16.6 Statistical results of GWR model coefficient values in 2015
Variables      Min      L-quantile   Median   U-quantile   Max     Mean     Positive (%)   Negative (%)
DUrban***      −0.264   −0.091        0.010    0.020       0.244    0.012   58.31          41.69
PGDP***        −0.242    0.088        0.155    0.199       0.506    0.138   90.70           9.30
FInvest        −0.083   −0.030       −0.001    0.031       0.137    0.001   48.61          51.39
Finance***     −0.056    0.021        0.064    0.114       0.310    0.073   87.00          13.00
ADSIndust***   −0.750   −0.320       −0.247   −0.133       0.474   −0.226    5.67          94.33
Service***     −0.878   −0.234       −0.171   −0.105       0.348   −0.179    6.42          93.58
PDen***        −0.526   −0.290       −0.130   −0.018       0.422   −0.147   21.32          78.68
Terrain***     −0.131   −0.064       −0.020    0.125       0.419    0.037   46.86          53.14
Roads***       −0.370   −0.022        0.019    0.061       1.003    0.026   62.38          37.62
Precipit***    −0.483   −0.008        0.018    0.063       0.187    0.024   65.51          34.49
Note: *, **, and *** denote significance at the 90%, 95%, and 99% levels, respectively
a comparison of the direction of influence of each factor shows that the positive
effect of population urbanization weakens as urbanization proceeds, while the
heterogeneity of its effect among regions increases. Economic development and fiscal
revenue play a positive role over a wider range of counties. The effects of the other
factors on land urbanization are slightly strengthened.
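The summary columns in Tables 16.5 and 16.6 can be computed from the vector of local coefficients that GWR returns for one variable. The sketch below assumes that "L-quantile" and "U-quantile" mean the lower and upper quartiles; the function name and the dictionary keys are hypothetical.

```python
import numpy as np


def summarize_local_coefficients(coefs):
    """Summarize one variable's local GWR coefficients across all counties."""
    c = np.asarray(coefs, dtype=float)
    lower_q, median, upper_q = np.percentile(c, [25, 50, 75])
    return {
        "min": float(c.min()),
        "lower_quartile": float(lower_q),  # assumed meaning of "L-quantile"
        "median": float(median),
        "upper_quartile": float(upper_q),  # assumed meaning of "U-quantile"
        "max": float(c.max()),
        "mean": float(c.mean()),
        "positive_pct": float(100.0 * np.mean(c > 0)),
        "negative_pct": float(100.0 * np.mean(c < 0)),
    }
```

Reading the positive and negative shares side by side shows whether a factor pushes land urbanization in the same direction almost everywhere or changes sign across regions.
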
To show the spatial heterogeneity of each factor’s effect on land urbanization, and
taking into account the significance and strength of effect of all variables in these
two years, five indices (population urbanization, per capita GDP, service industry,
density of road networks, and terrain) were selected to describe the spatial patterns
of their effects on land urbanization (Fig. 16.7).
(1) Population aggregation. Overall, the influence of population on land
urbanization increases from the coast to inland areas, indicating that the farther
inland a county is, the stronger the effect of population aggregation on land
urbanization. As urbanization in the eastern regions accelerates, the area influenced
by population aggregation continually retreats inland. In particular, unlike in
central and western counties, levels of population urbanization in urban
agglomeration areas such as the Yangtze River Delta, the Pearl River Delta, and the
Beijing-Tianjin-Hebei region are already relatively high, so urban population
aggregation there does not necessarily cause synchronous land use growth.
(2) Economic development. The influence of PGDP on land urbanization differs
significantly between the eastern and western regions. Economic development in
eastern counties accelerates the process of land urbanization, while its effect in
western regions is relatively weak. In 2000, regression coefficients gradually
decrease outwards from the Bohai Rim, northern Jiangsu, and areas along the Yangtze
River in Anhui, indicating that economic development in these areas is mainly driven
by land. By 2015, the regions where economic development is sensitive to land
urbanization have shrunk toward inland areas, indicating that economic development in
coastal areas has become less dependent on land resources to some extent. Meanwhile,
the pattern of south-north inequality does not change, and the positive effect of
economic development on land urbanization is still stronger in the north than in the
south.
Fig. 16.7 Spatial distribution of the coefficients of GWR models in 2000 and 2015
(3) Industrial structure. The effect of the service industry on land urbanization is
basically negative. In 2000, the areas significantly influenced by the service
industry are mainly concentrated in the northwest and northeast, followed by the lower
reaches of the Pearl River Basin and the Huang-Huai-Hai plain. In 2015, the further
development of the service industry changes the traditional urbanization mode of
extensive land consumption. Apart from the middle reaches of the Yangtze River, the
effect increases from the coast to inland areas, indicating that the adjustment of
industrial structure in inland areas can reduce the demand for urban land.
(4) Traffic and location. In 2000, the areas with high values for the influence of
traffic location were concentrated in the coastal and northwestern regions, and the
spatial heterogeneity of its overall intensity and direction was not significant.
In 2015, the influence of traffic location conditions expands rapidly. In particular,
the improvement of traffic conditions in the central and western counties
significantly promotes land urbanization, forming a “V-shaped” pattern spreading from
the eastern coastal areas to the central and western areas. Traffic conditions in the
northeastern and southwestern regions do not correspond with synchronous land
urbanization, possibly because large cities and urban agglomerations exert a siphon
effect: improved external traffic conditions make it easier for labor and other
resources to flow out, which has a negative effect on land urbanization in these
areas.
(5) Terrain condition. Terrain relief has a spatially regular impact on land
urbanization, which decreases from the coastal areas to the northwestern hinterland.
In both years, the areas with high values for the influence of terrain condition are
mainly concentrated in the coastal areas, and their extent shows an obvious
decreasing trend. The reasons might be twofold. On the one hand, under the rigid
constraints of cultivated land and ecological protection in coastal counties, greater
terrain relief leaves less land available for construction and thus leads to a
correspondingly larger proportion of urban land. On the other hand, the relatively
high level of economic and social development in coastal China makes the expansion of
urban land easier, regardless of terrain relief.
16.5 Discussion
Since the reform and opening up, China has witnessed unprecedented urbanization.
The number and scale of cities have increased rapidly (Bai et al. 2014; Ding and
Zhao 2014). As one of the core supporting elements of urbanization, the large-scale
supply of land has promoted the rapid accumulation of material capital, which
is an indispensable factor in urban economic growth (Wu et al. 2015;
Chen et al. 2018). Since the early 1990s, “land-centered” urbanization in China has
accelerated, with the model of urbanization transforming from “rural industrialization
driven” to “giant cities led” (Lin 2007; Gao et al. 2014; Lin 2014). However, unlike
the constant polarization of population urbanization, land urbanization tends
to spread spatially, and differences among regions in land urbanization level tend
to converge. Overall, the pattern of land urbanization in Chinese counties is
gradually transforming from a “core-periphery” structure to a contiguous “group-type”
one, and shows a catching-up trend around the main urban agglomerations in the manner
of “the lower, the faster and the higher, the slower”.
In general, land urbanization shows a convergence trend as the number of urban
dwellers continues to increase, indicating that population agglomeration is not an
important driving factor of land urbanization at the present stage, or at least not
the most important one. Land urbanization is generally more rapid than population
urbanization, particularly under rigorous control of the total quantity of
construction land. The intensity of land development in urban agglomerations is
already close to the upper limit (Yue et al. 2016), so further population aggregation
cannot easily bring about synchronous land use growth of a similar scale. However,
there are signs that land urbanization accelerates while population continues to
decline in central and western counties, which deepens the contradictions of
urban-rural division and human-land separation (Liu et al. 2014; Gao et al. 2015).
Moreover, the results of the OLS model indicate that population increase, economic
growth, industrial structure, urban characteristics, and geographical location all
have significant influences on land urbanization. The results of the GWR model,
however, show that these influences have significant spatial heterogeneity: the
intensity of the effect of a given factor differs significantly across types of
regions. For example, population aggregation has the strongest positive effect on
land urbanization in central and western regions, while economic development and
industrial structure might be
effective in the northeastern regions. This not only verifies the