All content in this area was uploaded by Eniko Kelly-Voicu on Dec 07, 2019
Exploratory Data Analysis of the California Wildfires
Space-Time Pattern
Eniko Kelly-Voicu
Funding Information: An early version of this paper was submitted as a graduation project to Hunter
College at the City University of New York.
* Correspondence: ekelly@ciesin.columbia.edu
Abstract: Spatio-temporal visualization can be useful in finding new geographical patterns, compared
to analysis relying solely on physical locations. But while researchers have long recognized the need to
integrate time into the spatial representation of maps, a common platform for space-time analysis is still
missing. This paper uses a new method of spatio-temporal visualization that explores California wildfire
data to show the evolution of wildfire incidents over space and time. The method demonstrates how the
most basic representation of a geographic entity, the point, can reveal otherwise hidden space-time
patterns using a visualization method suitable to the purpose. A graph of space-time dimensions was
plotted using polar coordinates that integrate time, in conjunction with a map representing the locations
of wildfires as points. The visualization was enabled by a Python script that can be customized to user
needs. An ArcGIS tool for general users was also developed. My analysis searched for a relationship
between space-time patterns of forest fires and demographic, land cover, and hydrographic data, which
are generally acknowledged to be associated with the ignition and spread of wildfires. Two main spatio-
temporal patterns were observed. Local patterns may be of interest for fire- and forest management,
whereas regional patterns of 40 to 50 years may benefit from further research on the underlying
processes. These calculations and tool for spatio-temporal representation can lead to regional- and local-
scale space and time analyses that may provide new insights into wildfire management, as well as support
further understanding of how to more effectively incorporate time and space into maps.
Key terms: Concepts and reasoning about time in geography. Exploratory analysis of the space-time
pattern in GIS. California’s wildfire history. Time series.
1. Introduction: Maps help us to understand the nature and causes of phenomena and features
observed on the Earth’s surface, regardless of whether they are natural or human-caused. These
processes, which define geographic features — their location, attributes, all their characteristics
and patterns — are dynamic, their course depending on one main factor: time.
The search for solutions to the challenge of how to include time in maps, which are static by
nature, has often resulted in spatio-temporal models that represent dynamics. Visualization of the
evolution of geographic features is typically implemented in the form of time slices or models of
space-time paths. Representations of time in cartographic visualization require specific solutions,
or “transformations,” suitable to the object of study and applicable to the available data. Maps that
allow us a glimpse into the realm of the processes that shape geographical entities have their
limitations. Researchers have long recognized the need to integrate time as a new dimension into
map representations. Solutions for specific cases have been elaborated by Batty and Jiang (1999);
Batty (2005) and Loidl et al. (2016). To represent evolution in time and space, agent-based
modeling and cellular automata have been applied by O’Sullivan and Perry (2013). The existing
representations are mostly unique and were developed for specific cases, but a common platform
for spatio-temporal relations is still missing. By integrating the individual solutions into a
common platform, tools and methods for further research and future applications could be shared.
The collection of time representations could then become a library of specific visualizations, or
“cartographic transformations”.
A cartographic transformation defines how geographic features are placed and represented on the
map. It expresses the accuracy and detail of the map. This step is inseparable from the process of
map making. The long tradition of cartographic transformation is based on formulas developed by
mathematicians and geographers. The great geographic discoveries and navigation overseas would
not have been possible without those calculations. And these computations and formulas could
never have successfully defined longitudes correctly without the element of exact time, a historical
struggle which was solved in the first half of the eighteenth century, after the first accurate
chronometer was developed. To define the correct position of an object on the Earth’s surface, the
exact time must be known.
Figure 1. Tobler’s transformational matrix.
Cartographic transformations, reprised by Waldo Tobler (1979) in a 4 x 4 table (Figure 1), have the
point, line, and area as basic units, and transitions between them build the column and rows of the
table. Tobler sees the dimension of the cartographic object as characteristic of its states over time.
He distinguishes between transformations applied to the locative geographic data; transformations
between points, lines and areas; and substantive transformations, which occur during interpolation,
filtering, and map reading. Transitions between the geometries of point, line and area allow
complex relationships, such as seen on a street where crime locations are represented by a point,
the street by a line and police precinct by an area. Tobler (1979) describes the entire process of
making and using a map as a sequence of transformations. Forms of cartographic transformations
refer to the geometry, attribute, symbolization, scale, data structure and data model, and map type.
Isopleth maps are one such form of transformation, as are isochrone maps (Galton, 1881), which
express time as distance traveled. Time was added as a new dimension or new parameter to the
cartographic transformations on the map.
This paper will present a transformation of wildfire coordinates by integrating time into one of
the entity’s locational parameters. The space-time representation will be attached to the map as an
illustration of the time dimension. The positions of the points representing wildfire occurrences
will be calculated in polar coordinates: angle and radius. The radius value comprises the radius
length and the date (year) of the occurrence, thus creating a space-time representation of the forest
fire’s position.
2. Background and objectives in spatio-temporal process integration and visualization.
General concepts, models and analytical methods in the GIS literature.
General concepts of space and time
Nunes (1995) approaches the topic of spatio-temporal relationships by describing it from a
science-historical point of view, considering the theme as a central subject of study from the
beginning of human cognition and scientific research. The way we understand and represent this
relationship reflects our knowledge about the facts and processes of the world. Facts and objects
(ontologies), events and processes are real, spatially and temporally determined. In GIS, the results
are visualized in an accurate representation of the real world if the system correctly defines the
parameters of the ontologies and processes. The operations applied to the data and data
manipulations have a meaningful outcome only if the data and operations reflect the nature of the
real situation they are trying to represent. Nunes concludes that each theory has its own
representation; the researcher must define his/her own position and the influence of other theories
on the subject matter.
A conceptual model for the representation of spatio-temporal dynamics
Peuquet (1994) presents a model that incorporates time, location, and object-based views. Those
different views are combined in a “Triad framework.” The model allows queries about object
location, spatial distribution, and temporal relationship between multiple spatial and temporal
phenomena. Once integrated into GIS, this model can identify and examine relationships,
causalities, and effects, and it allows inquiries related to exploration, explanation, prediction and
planning. The Triad framework creates a platform which aims to approach the spatio-temporal
dimension the same way as humans perceive their environment and changes in this environment.
In the Triad model the information is stored as location-based view (where), as the object-based
view (what), and as time-based view (when). Peuquet (1994) underlines that knowledge of spatio-
temporal relationships and the patterns must be based on a set of elemental rules which take into
consideration the intrinsic behavior of spatio-temporal distributions. Peuquet describes four main
temporal patterns: steady state, oscillating (cycles and rhythms), chaotic, and random. In her model
the temporal distributions are described by the following characteristics:
- Cohesiveness, when the cohesiveness of individual objects (e.g., a town or a population)
over time results in smooth transitions from one type of feature to another.
- Temporal similarity, which causes objects and locations to reveal similar rates of change.
- Temporal continuity, which affects events through time that are influenced by a single
process and tend to exhibit an organized temporal pattern.
- Hierarchical organization of events through time generated by different processes which
operate at different temporal scale and can be related in a hierarchical structure.
- Incompleteness due to our partial knowledge.
Spatio-temporal relationships and representations
Andrienko et al. (2010) approach the topic of spatio-temporal representation through the method of
visual analytics. The main idea of visual analytics is to develop knowledge, methods, technologies
and practice that help to exploit and combine the strengths of human vision and electronic data
processing. The main components of visual analytics research are data analysis, problem solving,
and decision-making using automated data processing techniques such as knowledge discovery
algorithms, interactive visual interfaces, and new geovisualization methods for explanation and
communication of the analytical results. Andrienko et al. present the inherent structure of time, time
units and natural cycles. Some of these cycles, such as seasons, are regular and predictable; others
are less regular, for example social, economic and natural cycles. The social cycles are defined by
the presence or absence of social problems such as unemployment, crime, drug consumption,
racism. The economic cycle is the fluctuation of the economy between periods of expansion and
recession. Social and economic cycles, and natural events such as earthquakes, volcanic activity, and
tsunamis, are not regular and not fully predictable despite the efforts invested in the analysis and
forecasting of such events. Holidays, school breaks, and weekends, by contrast, are a recurring part of the
timeline. The temporal dimension can be viewed as composed of time points or time intervals. The
structural organization of the temporal dimension mentioned by Peuquet (1994) is defined by
Andrienko et al. as three different types of temporal structures: ordered time, branching time, and
multiple perspectives. Ordered time can be subdivided into two further subcategories: linear and
cyclic time. Linear time corresponds to our natural perception of time as a continuous sequence of
temporal units, that is, time that proceeds from the past to the future. A cyclic time axis is
composed of a set of recurring temporal primitives (e.g. time of day, seasons of the year, high
unemployment in construction during the cold season). Branching time is a representation that
simplifies the description and comparison of alternative scenarios, which is widely used and is of
great importance in planning or prediction. Time with multiple perspectives represents more than
one point of view related to the observed phenomena or entity. This type of time-related data with
multiple perspectives may be generated when people describe their observations about hazard
events via blogs and other online forums: each reporting person may have a distinct perspective on
the events. The processing, integration, and analysis of spatial data are all constrained by the
fundamental concept of spatial dependence, which is often referred to as “the first law of
geography” or “Tobler’s first law”: “Everything is related to everything else, but near things are
more related than distant things” (Tobler 1970). According to this law, characteristics at proximal
locations and neighborhoods tend to be correlated, either positively or negatively. In statistical
terms, this interdependence is called spatial autocorrelation. Similar concepts of temporal
dependence and temporal autocorrelation can exist for relationships in time, if no disturbance
event occurs (Peuquet, 1994). Spatial and temporal dependencies exclude the use of standard
techniques of statistical analysis, which assume independence among observations, and require
specific techniques, such as spatial regression models, that take dependencies into account.
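Spatial autocorrelation can be made concrete with a small numerical example. The following Python sketch computes Moran's I, one commonly used measure of spatial autocorrelation; it is given only as an illustration and is not part of the analysis in this paper. The function names and the simple chain-shaped neighborhood (each location is a neighbor of the previous and next one) are assumptions made for the example.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square matrix of spatial weights.

    Values near +1 suggest that neighbors tend to be similar, values near -1
    that neighbors tend to differ, and values near 0 little spatial structure.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / w_sum) * (cross / variance_sum)


def chain_weights(n):
    """Hypothetical neighborhood: locations in a row, adjacent ones are neighbors."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


# A smooth gradient along the row gives positive autocorrelation,
# an alternating pattern gives negative autocorrelation.
gradient = morans_i([1.0, 2.0, 3.0, 4.0], chain_weights(4))
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain_weights(4))
```

With these toy inputs the gradient yields a positive value and the alternating series a negative one, which mirrors the positive and negative correlation between proximal locations described above.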
3. Spatio-temporal process integration of the California wildfires.
Because multipurpose spatio-temporal models are not available, each space-time representation in
GIS will need a specific representation, a “cartographic transformation” that allows combining the
GIS map view, which is static, precise, and two-dimensional (Couclelis, 1999), with the dynamism
of the phenomena that shape the entities. The evolution of a point or of multiple points in time
and space constitutes the simplest element of a space-time study. Wildfires are physical processes
with well-defined spatio-temporal attributes. Areas affected by wildfires can be depicted by area
centroids, which are points. The point is the simplest geometry, and in our case, it represents a
short-duration event defined by the locational data (the coordinates) and the wildfire’s date of
occurrence. This representation allows the query described by Peuquet (1994) in reference to what
(wildfires), where (coordinates), and when (date and time when the incident was recorded).
In this paper the spatio-temporal representation shows the position of the wildfire area centroids
over the state of California, with the year as a time unit. This representation allows a further
breakdown to other space-time scales, to finer-grained spatial extents and shorter time units such
as months or days, allowing the analysis of spatio-temporal details. The method presented in
Section 6 of this paper allows for new insights into space-time characteristics of the wildfire
incidents. It enhances the discovery of spatio-temporal patterns, which can be a basis for future
model building, and it allows possible cycles of events and spatial dependencies to be identified. The
aim is to integrate all available data in an objective way that is not influenced by existing
knowledge about the processes (Nunes, 1995), spatial interdependencies, or structure of the time
dimension (Andrienko et al., 2010), in order to reflect the real situation.
4. Recent studies concerning historical wildfires, their causes and methods of prevention.
Demographic, hydrologic and ecologic data.
A short overview of historical wildfires and their causes helps to understand the data used and the
results of this analysis. Fire regimes characterize the spatial and temporal pattern of fires, and the
impact they have on the landscape (Keeley et al. 2009). The weather and climate pattern and the
type of vegetation that defines the ecosystem are the main factors that generate differences in
the fire regimes. Additional factors are the fuel structure of the vegetation, the type of past fire
management, succession after past fires or disturbance in vegetation cover, and forest
management. The main characteristics of the vegetation are its height, age, stand density,
vegetation patches, moisture content, and the health of vegetation. Topography, seasons,
temperature and major winds, precipitation, consecutive periods of unfavorable weather conditions
such as drought, fire weather regime, ownership, and human activity are all factors that
demonstrably affect the ignition and spread of wildfires (Keeley et al. 2009). Fires are
classified as crown fire, ground fire, and surface fire (Agee, 2005; Cleland and Dickmann, 2002),
depending on which part of the vegetation has been affected. Forest fires are also classified by
severity (impact on vegetation) and intensity (energy released). The fire return intervals define not
only the time between fires in an area, but also refer to the fire severity of an area which is
periodically exposed to wildfires. The wildfire severity can be of low or mixed severity; and the
fire is called replacement fire if it is of maximal severity (Keeley et al. 2009). Replacement fires
kill all or most of the living overstory trees in a forest and initiate forest succession or regrowth.
The wildfire severity influences how easily the fire can be contained. Young chaparral in Southern
California is easily controlled as it has a higher fuel moisture, but conifer fuel leads to severe fires
due to low fuel moisture content (Keeley et al., 2009). However, under severe fire weather the
young, humid vegetation will not stop massive fires (Keeley and Fotheringham, 2001). High stand
density generates dangerous forest fuel accumulation. Steep slopes favor fire spread, especially if
the vegetation is old, dry, or infested, and therefore has a lower moisture content. Fuel breaks, such as
understory vegetation separated from the tree canopy or a trimmed tree canopy, control the fire’s
behavior. Aged trees are taller and more resilient to understory fires, but at the same time, they have a
lower fuel moisture content (Keeley et al. 2009). Large catastrophic fires are often associated with
effectiveness of the fire containment and forest management in the US. However, it has been
observed that in regions where systematic fire suppression is not practiced, such as Baja
California, large fires were absent as far back as 1920. This suggests that under the natural fire
system, small fires generated a fragmented landscape and created vegetation patches of different
ages, preventing large fires. Lightning is most common at higher elevations in the interior of
California. The frequent lightning-ignited fires in the reserves at high altitude burned small tracts,
supporting the theory that fine-grained vegetation patches having different ages can prevent
catastrophic fires (Keeley et al. 2001).
Human activities are the main ignition cause of forest fires. In addition, human activities largely
shape the environment, influencing the state of vegetation and of the terrain
sustaining the forest. Deforestation usually increases overland runoff, resulting in more erosion of
the land surface and concurrently reducing the amount of water that soaks into the ground to
recharge underlying aquifers. Degradation of the water table affects the moisture content of
vegetation and makes it more vulnerable to ignition. Wildfires increase the severity of soil water
repellency (Versini et al. 2012), and combustion of the leaf layer on the ground reduces infiltration
by sealing the surface and reducing its permeability, increasing runoff and flood frequency as well.
However, an increased ground water recharge was observed in watersheds where up to 1/3 of the
watershed may be affected by a wildfire. This is a long-term effect and is the result of reduced
evapotranspiration as large areas lose their vegetation cover. As permeability recovers and the burned
leaf litter no longer retains moisture, precipitation can infiltrate directly to the water
table and increase the water supply, if the burned areas are large enough compared with the
watershed extent and if high rainfall amounts are measured in the region (Wine and Cadol,
2016).
The construction of private properties deep in forested areas has a negative effect. Houses built of
flammable materials intensify the spread of wildfires in the surrounding forest. The expanding urban-
wildland interface creates mosaics of urban-wildland areas, making them highly vulnerable
(Bradshaw, 1987). Studies show a significant relationship between wildfire rate and human
density, which is spatially variable and most prominent in wildland areas adjacent to coastal
population centers (Keeley and Fotheringham, 2001). The correlation between wildfire
occurrences and human presence in California is decreasing in intensity in the interior and at
higher elevations (Keeley, 1982).
Most wildfires are seasonal, and usually occur during fall, which is the driest season. Large fires
are usually associated with the Santa Ana hot winds (Southern California) and with foehn winds
(Keeley et al., 2001). The National Weather Service (NWS) Red Flag Warning is based on the
correlation observed between fire occurrences and high temperature periods, relative humidity
(RH) below the climatological average, strong winds, and adiabatic compression associated with
inhibited radiative cooling and nocturnal recovery, the latter being strongly influenced by urban
heat islands. Also, a weak positive correlation was observed between early winter
precipitation, which favors vegetation growth, and burned areas during the subsequent fire season
(Nauslar et al. 2018).
Burned areas of historically large wildfires were not always recorded in the past. Existing records
regarding the extent of the areas affected by forest fires are not reliable. However, the
approximate position, date of occurrence, and documented burned areas are available in wildfire
databases. Comparing the historical data and new records, an intensification of the fires was
observed. The main causes of intensified fire activity and increase in areas affected by wildfire are
considered to be global warming and increase in population. A study of large wildfires in the
Western US (Dennison et al. 2014) uses satellite remotely-sensed data to map burned areas. The
measurements are based on pre- and post-fire values of the normalized burn ratio, data being
available since 1984. Total fire areas are increasing in all ecoregions of the Western United States,
despite differences between fire regimes defined by fuel type, fire season, fire frequency, and fire
intensity. Increasing fire activity was observed during all seasons: winter (8%), spring (17%), and
summer (18%), especially during periods with very low precipitation and high temperatures.
Under normal climate conditions, EVI (Enhanced Vegetation Index) recovers six years after major
fires, and more rapidly under very favorable humid conditions, generating plenty of fuel for the
next major wildfire as temperatures are rising again.
Water and moisture define the vulnerability of flora to the wildfires. The water content of plants is
explained by temperature, humidity, and rainfall, and by the state of ground water in the area
(Evaristo and McDonnell, 2017). California aquifers are often subject to overdraft, aquifers being
used more rapidly than they are naturally replenished, especially in the San Joaquin and
Sacramento Valleys where agriculture consumes 80% of the groundwater. Geodetic measurements
in the San Joaquin Valley revealed a subsidence of 9 m between 1920 and 1970. Due to excessive
water extraction, the porous sand and gravel deposits in the aquifer are under increased lithostatic
stress. The response to this mass deficit is a regional crustal uplift observed by GPS measurements.
After recharge the deformation can be recovered, but there can be inelastic, irreversible
deformations due to compaction of fine-grain sediments in the aquitard layer. Those deformations
can be observed in the regional structural change of the aquifer and groundwater flows, and cause
continuous redistribution of the water inside the aquifer, subsidence, and faults; and decreased
aquifer storage capacity and degradation of wetlands. Overall reduced water thickness was
measured in the Central Valley; however, the water thickness is partially maintained during all
seasons in the Sierra Nevada (Argus, 2014).
5. Data sources and description.
CAL FIRE (California Department of Forestry and Fire Protection) manages the records of all
historical wildfires going back to the 1900s, and the database is continually updated. Historical
records have been digitized and the areas affected by fires are stored as shapefiles. The database
contains records of locations, areas burned, ignition cause, start and containment date of the fires,
containment methods, and all available additional information. This database has been used in this
paper to represent the spatial and temporal positions of the fires. In order to observe possible
correlation between the areas most exposed to wildfires and the factors which enhance ignition and
spread of wildfires, additional data described in the previous section has been analyzed, such as the
demographic history of the counties (US Census), land cover data (National Land Cover Database)
and ecological regions (Calveg Zones & Ecoregions). Ecoregions were used to capture areas of
similar climate conditions and vegetation types, variability within an ecoregion being correlated to
the hydrological regime (CalWater watershed layer). Region 5 Calveg Zones-Ecoregions consists
of ecological tile unit boundaries and Calveg (Classification and Assessment with Landsat of
Visible Ecological Groupings) zone units used to tile the EVEG (existing vegetation) dataset.
Selected lines were added from the CalWater watershed layer. Additionally, attributes from
Ecological Units of California (Ecological Domain, Division, Province, Section and Subsection)
have been incorporated into this layer.
Studies described in the previous section show that the vegetation lifecycle generates cyclical
spatio-temporal patterns, and that wildfire occurrences are related to these patterns. Visualization
of observed wildfires should reveal details related to the factors that influenced them and expose
some of the associated processes.
The wildfire data analysis consists of several steps, starting with exploratory visual analysis of
wildfire data at different scales, after the transformation presented in the following section has
been applied to the data. The exploratory visual analysis at local and regional scale is associated
with long-term and short-term analysis, followed by a search for geographical zone units that are
correlated to the wildfire occurrences and patterns. The geographical units are demographic,
ecologic, hydrologic, topographic, or combined, such as the Calveg zones.
This analytical method observes changes in the time-space pattern in the polar plot generated with
the Python tool, and identifies the associated time periods and affected areas. The areas of interest
are selected and translated into geographical data. Data analysis is enhanced by the zonal statistics
with representations of wildfire occurrence dates in the context of selected geographical units. The
data analysis is performed at a regional (state-wide) scale, and for selected locations at a higher
resolution.
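As an illustration of how a point selected in the polar plot might be translated back into geographical data, the following minimal Python sketch inverts the relation R1 = (Rmax * year_ix) + R given in the Method section. The function name, its parameters, and the angle convention (radians, measured counter-clockwise from the positive x axis) are assumptions made for this example and are not taken from the tool itself.

```python
import math


def ring_point_to_map(r1, angle, r_max, center, first_year):
    """Recover (x, y, year) from a space-time polar coordinate.

    r1         -- distance of the plotted point from the plot's center
    angle      -- polar angle in radians (assumed counter-clockwise from east)
    r_max      -- ring thickness, i.e. the largest geographic radius R
    center     -- (x0, y0) of the central reference point
    first_year -- calendar year that corresponds to year index zero
    """
    year_ix = int(r1 // r_max)      # which ring the point falls in
    r = r1 - year_ix * r_max        # geographic distance from the center
    x0, y0 = center
    x = x0 + r * math.cos(angle)
    y = y0 + r * math.sin(angle)
    return x, y, first_year + year_ix


# Hypothetical point: 5 units from the center, plotted in the third ring.
x, y, year = ring_point_to_map(
    r1=25.0, angle=math.atan2(4.0, 3.0), r_max=10.0,
    center=(0.0, 0.0), first_year=1900,
)
```

A point lying exactly on the outer circle of its ring (R equal to Rmax) would be assigned to the following year by this sketch, so such boundary cases would need separate handling.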
6. Method.
The custom tool, written in Python, visualizes the wildfire area centroids,
transforming the locational and time parameters and creating a new image of the space-time
dimensions. The locational x, y coordinates are transformed into polar coordinates of radius R
and angle, with a central reference point of x, y coordinates. After the radius and angle values
have been calculated for the entire dataset, the longest radius value is retained as Rmax. The first
year’s dataset is drawn as a usual polar scatterplot, measuring the angle and distance of each point
(wildfire location) from the central reference point. After all points of the first year have been
plotted, a circle with radius Rmax is drawn. The second year’s points will be plotted in the following
step, but now the radii are measured from the previously drawn Rmax radius circle. This is the
second time sequence. Each year is represented by a ring and successive years by additional external
rings, with the inner radius of the ring increasing from Rmax for the second year to 2 * Rmax in the third
year, 3 * Rmax for the fourth year and so on. Rmax is multiplied by a year index, “year_ix,”
generating concentric rings and assuring that the relative position of the points inside the rings is
the same for similar x, y coordinates. If wildfires were recorded at the same x, y coordinate in two
consecutive years, the point’s angle and the distance from the inner circle of the ring will be the
same in the two successive rings. Rmax is the largest R value calculated, and it represents the ring’s
thickness, the distance between the inner and external circles of the ring. Based on the value of
Rmax a new radius R1 of the points is calculated, which measures the distance of the points from the
plot’s center, and it is defined as R1 = (Rmax * year_ix) + R. The new R1 coordinate combines the
geographical polar coordinate R and the date of the incident as the year index “year_ix,” thereby
becoming a spatio-temporal coordinate. It is measured in distance units, the “year_ix” being a
whole number that always starts with zero. If records are not available for the entire period and the
researcher wishes to visualize the years having no incidents, blank rings can be drawn by adding
new rows to the file. For each such row, the user enters the year value and sets the x, y coordinates
to zero.
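The transformation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script from the Appendix: the function name, the tuple-based input, and the angle convention (radians, counter-clockwise from the positive x axis) are choices made for the example, and the year index is taken as the difference between each record's year and the first year.

```python
import math


def to_space_time_polar(records, center, first_year=None):
    """Convert (x, y, year) records to space-time polar coordinates.

    Returns Rmax and one dict per record with the keys
    "R", "angle", "year_ix" and "R1", where R1 = (Rmax * year_ix) + R.
    """
    x0, y0 = center
    if first_year is None:
        first_year = min(year for _, _, year in records)

    polar = []
    for x, y, year in records:
        r = math.hypot(x - x0, y - y0)        # distance from the central point
        angle = math.atan2(y - y0, x - x0)    # assumed angle convention
        polar.append((r, angle, year - first_year))

    # The longest radius in the whole dataset sets the ring thickness.
    r_max = max(r for r, _, _ in polar)

    rows = [
        {"R": r, "angle": angle, "year_ix": year_ix,
         "R1": r_max * year_ix + r}
        for r, angle, year_ix in polar
    ]
    return r_max, rows


# Hypothetical records: the same location burns in 1900 and again in 1902.
records = [(3.0, 4.0, 1900), (0.0, 10.0, 1901), (3.0, 4.0, 1902)]
r_max, rows = to_space_time_polar(records, center=(0.0, 0.0))
```

In this toy dataset the first and third records share the same angle and the same distance from the inner circle of their rings, and differ only in the ring they fall in, which is the property the concentric layout is meant to preserve.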
Transformation of the spatio-temporal data generates a new image, a set of concentrically ordered
rings. Only wildfire occurrences with unknown cause of ignition are selected and analyzed, as the
method aims to reveal relationships and patterns that are not related to well-known human causes
such as ignition by campfire or by faulty transmission lines. Human causes are strongly connected
to geographical locations of settlements, roads, or electric networks, and induce a spatial bias in
the data distribution. However, the human factor and the random process of lightning cannot be
completely eliminated, as the causes of ignition are not always precisely determined.
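The selection of records with unknown ignition cause could be sketched as follows. The field name "CAUSE" and the code 14 used for an unknown cause are placeholders assumed for this example; the actual field name and coding should be checked against the data dictionary of the wildfire database being used.

```python
def select_unknown_cause(records, cause_field="CAUSE", unknown_values=(14,)):
    """Keep only the records whose ignition cause is coded as unknown.

    records        -- list of dicts, one per wildfire
    cause_field    -- name of the cause attribute (assumed, hypothetical)
    unknown_values -- codes that stand for an unknown cause (assumed)
    """
    return [rec for rec in records if rec.get(cause_field) in unknown_values]


# Hypothetical sample: only the second record has an unknown cause.
sample = [
    {"YEAR_": 1950, "CAUSE": 1},
    {"YEAR_": 1951, "CAUSE": 14},
    {"YEAR_": 1952, "CAUSE": 4},
]
unknown_only = select_unknown_cause(sample)
```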
The Python tool can be easily used for any MS Excel dataset by just updating the file name and
path in the Python script, and by changing the column titles of the date and coordinate fields in the
MS Excel table to be analyzed. In addition, an ArcGIS tool has been built, which can be used in
parallel. The scripts and instructions for the ArcGIS toolbox developed for this purpose can be
found in the Appendix.
The file name, file directory, and the column names used in the script are the user-specific entries. The
script column names are “centroid_x” and “centroid_y” for the x, y coordinates of the points and
“YEAR_” for the years. The MS Excel file used as input data for the Python script is expected to use
these column names. The user can modify the script parameters and adapt the script to user-
specific needs. The “YEAR_” data must be sorted in ascending order before plotting.
The Python script will automatically sort the dates, but if the user is working with the ArcPy tool,
the ArcGIS sort tool must be run first. In the first step, the plot’s centroid will be calculated based
on the selected area. This centroid will be the central point of the study area as well. This central
point and the boundaries of the study area build the spatial reference frame and allow comparison
of the data subsets (Figures 3-8).
The data to be plotted are specified in the second step. As already mentioned, the whole study
area is selected first, and the central point is calculated based on its extent. This is the
absolute central point. In the second step, the same data are selected again if the plot is to be
drawn for the whole area and the entire time period; alternatively, a temporal subset of the data
can be entered. The time period is selected by the first and last year of the period of interest.
A spatial subset can also be selected in the second step instead of the entire study area, and
this data subset will use the central point of the main study area. In this case the data will
mostly be plotted as a sector of the circle representing the primary study area. A relative
central point is generated if the tool is run for the spatial subset separately, creating a new
central point for the spatial subset (Figure 17). Using the absolute center, the geographic
location of the subset within the main study area can be easily identified. Using the relative
center, the points are balanced over the entire plot and the exploration of the details is
enhanced, but the position of the points cannot be compared with the original plot drawn for the
entire area, which has a different (the absolute) central point. Plots of the same region drawn
for different time periods should use the same central point so that all space-time patterns can
be compared visually. Figure 17 shows the scatterplot for Los Angeles County calculated with
absolute and relative central points.
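The difference between the absolute and the relative central point can be sketched as follows. This is only an illustration under simplifying assumptions: the coordinates are hypothetical, all values are positive, and the helper names bbox_center and polar_angles are not part of the published tool.

```python
import math


def bbox_center(points):
    """Midpoint of the bounding box (assumes all coordinates are positive)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0


def polar_angles(points, center):
    """Polar angle of each point around `center`, in degrees within [0, 360)."""
    cx, cy = center
    return [math.degrees(math.atan2(y - cy, x - cx)) % 360.0 for x, y in points]


# Hypothetical main study area and a spatial subset in its north-east corner.
study_area = [(10.0, 10.0), (110.0, 10.0), (10.0, 110.0), (110.0, 110.0)]
subset = [(90.0, 90.0), (110.0, 90.0), (90.0, 110.0), (110.0, 110.0)]

absolute_center = bbox_center(study_area)  # centre of the whole study area
relative_center = bbox_center(subset)      # centre of the subset only

absolute_angles = polar_angles(subset, absolute_center)
relative_angles = polar_angles(subset, relative_center)
print(absolute_angles)  # angles confined to a narrow sector
print(relative_angles)  # angles spread around the full circle
```

With the absolute center the subset occupies one sector of the plot, which shows where it lies within the main study area; with the relative center the same points are spread over all sectors.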
The script adds new columns to the data frame and fills them with the results of the
calculations. The year index takes values between 0 and 117 for the time period 1900–2017. All
rows dated 1900 get the year index zero, all 1901 entries get the index 1, and so on through 117
for the year 2017. In the next step the script calculates the distance “R” from the central point
and the angle value for each point of the map. Based on the year index and the R coordinate, a new
R1 coordinate is calculated as R1 = R + Rmax × year_ix, where Rmax is the largest R value in the
plotted data, and it is added as a new column in the data frame.
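As a small worked illustration of this transformation, separate from the full script, the sketch below computes the year index, R, the angle, and R1 for four hypothetical records using only the standard library. The function name space_time_coordinates and the sample values are assumptions made for the example. As in the script, the index increases by one for each distinct year present in the sorted data.

```python
import math


def space_time_coordinates(records, center):
    """Return year index, R, angle and R1 for (year, x, y) records."""
    cx, cy = center
    ordered = sorted(records, key=lambda rec: rec[0])  # sort by year
    radii = [math.hypot(x - cx, y - cy) for _, x, y in ordered]
    r_max = max(radii)  # width of one ring

    result = []
    year_ix = -1
    previous_year = None
    for (year, x, y), r in zip(ordered, radii):
        if year != previous_year:  # a new year starts a new ring
            year_ix += 1
            previous_year = year
        result.append({
            "year": year,
            "year_ix": year_ix,
            "R": r,
            "angle_rad": math.atan2(y - cy, x - cx),
            "R1": r + r_max * year_ix,  # radius shifted outward by the ring number
        })
    return result


# Hypothetical records: (year, x, y), already expressed relative to the centre.
records = [(1901, 3.0, 4.0), (1900, 0.0, 5.0), (1900, 3.0, 0.0), (1902, 0.0, -2.5)]
rows = space_time_coordinates(records, center=(0.0, 0.0))
for row in rows:
    print(row["year"], row["year_ix"], row["R"], row["R1"])
```

In this example Rmax is 5, so the 1901 point with R = 5 is drawn at R1 = 10 and the 1902 point with R = 2.5 at R1 = 12.5.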
Below is the Python script used for the calculations. Data path and parameters are user-specific
entries and are colored green.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import xlsxwriter  # engine used by pandas to write the Excel output

# This script works with the following column titles: "centroid_x" for the x coordinates of
# the point, "centroid_y" for the y coordinates.
# "YEAR_" is the title of the column of the time periods used in the calculation.
# FOR A TIME SUBSET THE FIRST AND LAST YEAR OF THE PERIOD are defined. Output PATH and
# title of the results (resulting image and Excel file) are at the end of the script.
# The central point (origin of the coordinate system) is calculated based on the first file
# selected. The data subset to be plotted is entered after the central point has been
# calculated.
# If the dataset does not have records for some of the years, the user can add rows for
# the missing years and blank rings will be drawn. In the new rows YEAR_ has to be
# filled out and the x, y coordinates set to zero.

# THE CENTRAL POINT OF THE PLOT WILL BE CALCULATED BASED ON THIS FILE.
mydf = pd.read_excel('C:/MY_FOLDER/MyData.xls', sheet_name='Sheet1')

# Assign a new polar coordinate system based on the existing x, y Cartesian coordinate
# values.
Xmax = mydf['centroid_x'].max()
Xmin = mydf['centroid_x'].min()
Ymax = mydf['centroid_y'].max()
Ymin = mydf['centroid_y'].min()

# Calculate the central point of the plot.
if Xmax < 0:
    Xcentr = Xmax - ((Xmax - Xmin) / 2)
elif Xmin > 0:
    Xcentr = Xmin + ((Xmax - Xmin) / 2)
else:
    Xcentr = 0
if Ymax < 0:
    Ycentr = Ymax - ((Ymax - Ymin) / 2)
elif Ymin > 0:
    Ycentr = Ymin + ((Ymax - Ymin) / 2)
else:
    Ycentr = 0
print("Study Area Central Point Coordinates X, Y:")
print(Xcentr)
print(Ycentr)

# Choose the data subset to be plotted.
mydf = pd.read_excel('C:/MY_FOLDER/MyData.xls', sheet_name='Sheet1')
mydf.sort_values("YEAR_", inplace=True)
rows_no = mydf.shape[0]  # number of rows in mydf
cols_no = mydf.shape[1]  # number of columns in mydf

# Insert new (placeholder) columns for the coordinates that are calculated below and
# added to the dataframe.
mydf.insert(int(cols_no), "x", value=np.nan)
mydf.insert(int(cols_no + 1), "y", value=np.nan)
mydf.insert(int(cols_no + 2), "R", value=np.nan)
mydf.insert(int(cols_no + 3), "Angle_rad", value=np.nan)
mydf.insert(int(cols_no + 4), "R1", value=np.nan)
mydf['x'] = mydf['centroid_x'] - Xcentr
mydf['y'] = mydf['centroid_y'] - Ycentr

# THE FIRST AND LAST YEAR OF THE PERIOD CAN BE SELECTED:
# Select the time period to be plotted (both bounds are included).
mydf = mydf[mydf['YEAR_'].between(1900, 2017)].copy()
YEAR_list = mydf['YEAR_'].tolist()

# Create the year index. The index value changes only if the calendar year changes.
# The first year in the sorted list gets index 0, and each new (different) calendar year
# gets the next index: 0, 1, 2, etc.
year_ix = []
for i in range(len(YEAR_list)):
    if i == 0:
        year_ix.append(0)
    elif YEAR_list[i] == YEAR_list[i - 1]:
        year_ix.append(year_ix[i - 1])
    else:
        year_ix.append(year_ix[i - 1] + 1)
mydf['year_ix'] = year_ix

# Calculate the new polar coordinates and add them to the dataframe.
# R is the distance from the central point, which has the coordinates (0, 0).
mydf["R"] = np.sqrt(np.power(mydf["x"], 2) + np.power(mydf["y"], 2)).astype(np.float32)
Rmax = mydf["R"].max()
print(Rmax)
# Angle in radians.
mydf["Angle_rad"] = np.arctan2(mydf["y"], mydf["x"]).astype(np.float32)
mydf["R1"] = mydf['R'] + (Rmax * mydf['year_ix'])
theta = mydf['Angle_rad'].tolist()  # angle in radians
# r is the transformed radius containing the time information and the radius as polar
# coordinate.
r = mydf['R1'].tolist()
R1max = mydf["R1"].max()

# Draw the polar scatterplot.
year_set = set(year_ix)
my_index = sorted(year_set)
area = 2  # this value defines the area of the points in the scatterplot.
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], polar=True)
ax.set_ylim(0, R1max)  # maximum value of R1
# If the rings representing the years are drawn every 10th year, use
# np.arange(0, R1max, 10*Rmax); otherwise np.arange(0, R1max, Rmax).
ring_ticks = np.arange(Rmax, R1max, Rmax)
ax.set_yticks(ring_ticks)
# One label per ring tick; fontsize can be modified.
ax.set_yticklabels(my_index[:len(ring_ticks)], fontsize=4)
ax.scatter(theta, r, c='b', s=area)
# ax.set_rmin(-R1max/3)  # set offset. The value can be changed by the user as needed.
plt.show()
fig.savefig('C:/MY_FOLDER/MyImage.jpg')  # save figure to your work folder.

# The results of the above calculations can be written to an Excel file.
# Create a pandas Excel writer using XlsxWriter as the engine (save to your work folder).
writer = pd.ExcelWriter('C:/MY_FOLDER/MyData2.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
mydf.to_excel(writer, sheet_name='Sheet1')
# Close the pandas Excel writer, which writes the file to disk.
writer.close()
7. Results
The following parameters describe the position of the incidents in space and time:
- The ring’s position relative to the center: each subsequent ring represents the same area but
at a different time, in this case in a different year. The first (oldest) events are drawn in the
innermost ring; additional outer rings are added for successive years, and the most recent
events are drawn in the outermost ring. The time successions (the plot’s rings) are labeled
with the year index “year_ix.”
- The point’s position within the ring: as each ring represents the same area, the ring’s inner
circle embodies the expanded central point, and the area’s boundary is depicted by the
ring’s outer circle if the area is circular. If the area has a different shape, its boundary is
close to the ring’s outer circle but will not touch it (Figures 3–8 and 9–12). If a point is
close to the ring’s inner circle, it is close to the area’s centroid on the map. A point close
to the outer circle of the ring depicts an event that occurred close to the study area’s
boundary. On the map, the radial coordinate (radius) of a forest fire is measured between
the central point and the point representing the fire; on the plot, it is the radial distance
between the point and the inner circle of its ring. Unique colors can be assigned to different
radius lengths in the plot for additional color-coded visualization (Figure 21 in Appendix).
- The angle value is the second parameter that defines the point’s geographical position. It
is the angular coordinate of the point (the polar angle), measured between the horizontal
line representing 0° on the circular scatterplot and the point’s radius. It is measured in the
same way on the map (Figures 3–8). The radial sectors on the map are the same subdivisions
as on the scatterplot: a point at 45° on the map will be at 45° on the scatterplot too. The
angle value is very sensitive to the smallest changes in the x, y coordinates of the point,
and changes are immediately visible on the plot.
- The point density: increasing point density in the outer rings reveals an intensification of
the phenomenon over time. As the small radial sectors close to the map’s centroid cannot
contain a high number of events, such events must have occurred at a distance from the
centroid.
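The way a plotted point is read back can be sketched in a few lines, assuming the script's relation R1 = R + Rmax × year_ix. The function name locate_in_plot and the sample values are hypothetical. A point whose radius equals Rmax exactly lies on the boundary between two rings and is assigned to the outer one by this sketch.

```python
import math


def locate_in_plot(r1, angle_rad, r_max):
    """Split a plotted point into ring index, in-ring radius and angle in degrees."""
    ring = int(r1 // r_max)          # ring number, equal to the year index
    radius = r1 - ring * r_max       # distance from the ring's inner circle
    angle_deg = math.degrees(angle_rad) % 360.0  # 0 degrees on the horizontal line
    return ring, radius, angle_deg


# Hypothetical plotted point: R1 = 12.5 with a ring width (Rmax) of 5.
ring, radius, angle_deg = locate_in_plot(r1=12.5, angle_rad=math.pi / 2, r_max=5.0)
print(ring, radius, angle_deg)
```

Here the point falls in the third ring (index 2), halfway between the ring's inner and outer circles, and directly above the center.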
The changes in the space-time pattern are well represented in the plot, and all of the above
parameters matter for the interpretation of the resulting image. Two parameters in particular
support a quick visual exploration: the angle values, which are very sensitive to the smallest
positional change of the point, and the point density, which reveals an intensification of the
phenomenon and, where the density is very high, also indicates distance from the central point of
the study area.
An example is presented below, in Figures 3–8. The polar coordinates of the point define its
position relative to the map’s center. The same parameters define the position of the points within
the concentric rings of the plot, but the map’s center is represented by the inner circle of the ring.
The plot’s resulting image reveals the hotspots of the time-space series, patterns of the space-time
distribution, major changes in the geographic distribution, and the related time period. The
smallest change in the x, y coordinates of the incidents (points) will result in a change in the angle
and a change in the radius. If the incident occurred in the same location every year, the position of
the point remains unchanged within consecutive rings. As the radius values are small in a plot
which encompasses numerous rings and represents a long time period, modification of the angle
values will be more visible than changes in the radius.
An application of the polar space-time scatterplot on a circular study area in Central California
(Figure 2) with wildfires of unknown ignition occurring between 1900 and 2017 is presented in the
following section. The incidents recorded between 1930 and 1931 are plotted in Figure 3. The
position of points in the plot and on the map are compared in Figures 3–8.
Wildfires in the study area (Figure 2) occurred around the area centroid and along the NNW-SSE
direction (135°–315° axis) through year 1930 (Plot A in Figure 3). Plot B in Figure 3 displays
the period between 1930 and 1931. The points in the exterior ring’s 45°–180° sector are very
close to the interior ring; consequently, they are grouped around the area’s centroid. The pattern
did not change during 1931; the wildfires are still close to the area’s centroid, and they spread
SSE. The plots in Figures 4–7 show wildfires recorded during longer time periods: 1900–1940,
1940–1960, 1960–2000, and 2000–2017. The four time periods were selected based on the changes
observed in the pattern on the plot drawn for the period 1900–2017 (Figure 8). The points of the
scatterplot are aligned NNW-SSE between 1900 and 1940; the pattern changes during the 1950s. The
1940–1960 period is the transition period. Between 1960 and 2000, the points cover the SW
semicircle. The pattern changes after 2000, and the points now cover almost all radial sectors of
the plot.
Figure 2. Circular study area in Central California.
The map and the plot in Figure 4 display the NNW-SSE spread of the wildfire occurrences between
1900 and 1940. A different pattern is observed in Figure 6 for the period between 1960 and 2000.
The wildfires are considerably closer to the San Joaquin River, and incidents were registered both
east and west of the river. The points on the plot are drawn mostly in the SW semicircle. The
period between 1940 and 1960 (Figure 5) can be considered a transition period between the two main
fire regimes, 1900–1940 and 1960–2000. The pattern changes after 2000, when incidents are recorded
within the 0°–45° sector, where no fires were observed before.
Figure 3. Wildfires within the study area of Central California, 1930–1931.
Figure 4. Wildfires within the study area of Central California, 1900–1940.
Figure 5. Wildfires within the study area of Central California, 1940–1960.
Figure 6. Wildfires within the study area of Central California, 1960–2000.
Figure 7. Wildfires within the study area of Central California, 2000–2017.
Figure 8. Wildfires within the study area of Central California, 1900–2017.
The study area was extended by selecting a region east of the Central Valley (Figures 9–12).
This area has an irregular shape, which makes the interpretation of the relationship between
the points on the map and the plot more demanding. However, the main purpose of this
transformation is to identify a pattern in the spatio-temporal events and observe changes of this
pattern; thus the results are not influenced by the irregular shape of the area. The shape of the
selected study region is close to an ellipse oriented NNW-SSE. The lateral boundaries of the
selection will not touch the plot’s rings but will be close to them. The distortion of the
non-circular study area is preserved within all rings. The region east of the Central Valley is a
wildfire “hotspot.” It was selected by drawing a polygon around the area of interest, and the
scatterplot was calculated for the Calveg ecoregions circumscribed by the polygon. The first plot
was drawn for the entire period, 1900–2017. The changes of the angles are obvious. The position of
the points within the ring is determined by their geographical locations only. The point positions
can be distinguished better if the rings are larger; therefore a breakdown into shorter time
periods is helpful. For the Central Valley East region three distinct periods could be observed in
the plot in Figure 12: 1900–1950, 1950–2000, and 2000–2017.
The geographic map suggests that the wildfires “moved” toward the valley. The fires recorded
during the early 1900s were observed at high altitudes, whereas those after 1950 occurred at lower
altitudes. Between 2000 and 2017 the spatio-temporal pattern changed again, and wildfires spread
over the entire region, to low, high, and even very high altitudes.
Figure 9. Wildfires within the study area of the Central Valley East region, 1900–1950.
Figure 10. Wildfires within the study area of Central Valley East region, 1950–2000.
Figure 11. Wildfires within the study area of Central Valley East region, 2000–2017.
Figure 12. Wildfires within the study area of Central Valley East region, 1900–2017.
Figure 13. Altitude (meter) of wildfire occurrences and year of incident in the region between
Central Valley and the eastern border of California, 1900–2017. The LOESS regression line was
calculated in SPSS. The main time periods are marked with different colors.
LOESS regression uses locally weighted regression to fit a smooth curve through the points of a
scatterplot. It is a nonparametric technique that can reveal trends and cycles in data that might
be difficult to model with a parametric curve. In Figure 13 abrupt changes are noticeable during
the periods 1940–1950 and 2015–2017. Before the 1940s the areas below 500 m were not affected by
fires; in the 1940s just a few occurrences were recorded at this altitude. After 1950 most of the
incidents were observed at low altitudes. The trend changes after 2000, but the LOESS regression
line does not yet show it. The period is too short; more data are needed to perform a
statistically significant analysis for the period after 2015.
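The idea of locally weighted regression can be illustrated with a minimal sketch in plain Python. It is not the SPSS procedure used for Figure 13: the function name loess, the tricube weighting, the window fraction frac, and the sample values are all assumptions chosen for illustration. For each year, a straight line is fitted to the nearest points, with closer points weighted more heavily.

```python
def loess(xs, ys, frac=0.5):
    """Local linear regression with tricube weights; returns one fitted value per x."""
    n = len(xs)
    k = min(n, max(2, int(round(frac * n))))  # number of neighbours in each window
    fitted = []
    for x0 in xs:
        distances = sorted(abs(x - x0) for x in xs)
        h = distances[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        # Tricube weights: 1 at x0, falling to 0 at the edge of the window.
        weights = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        total = sum(weights)
        mean_x = sum(w * x for w, x in zip(weights, xs)) / total
        mean_y = sum(w * y for w, y in zip(weights, ys)) / total
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0  # weighted least-squares slope
        fitted.append(mean_y + slope * (x0 - mean_x))
    return fitted


# Hypothetical altitudes (m) of wildfire occurrences by year, with a drop after 1945.
years = [1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949]
altitudes = [1500.0, 1450.0, 1550.0, 1480.0, 1520.0, 600.0, 450.0, 500.0, 420.0, 480.0]
smoothed = loess(years, altitudes, frac=0.6)
for year, value in zip(years, smoothed):
    print(year, round(value, 1))
```

A smaller frac follows abrupt changes more closely, while a larger frac yields a smoother curve that may hide short transitions.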
Figure 14. Treemaps a), b), c) showing the distribution of wildfires across Calveg ecoregions in
Central Valley East during different time periods. The Calveg areas in the treemap are proportional
to the number of the wildfires within the area.
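The proportionality behind the treemaps can be sketched as a simple count per ecoregion and period. The function name treemap_shares, the record layout (year, ecoregion), and the zone labels are hypothetical placeholders, not actual Calveg names or data from this study.

```python
from collections import Counter


def treemap_shares(records, first_year, last_year):
    """Share of wildfire records per ecoregion within [first_year, last_year]."""
    counts = Counter(
        region for year, region in records if first_year <= year <= last_year
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Each share corresponds to the relative area of the region in the treemap.
    return {region: count / total for region, count in counts.items()}


# Hypothetical wildfire records: (year, ecoregion label).
records = [
    (1910, "zone_A"), (1925, "zone_A"), (1930, "zone_B"),
    (1960, "zone_B"), (1975, "zone_C"),
]
early_shares = treemap_shares(records, 1900, 1950)
late_shares = treemap_shares(records, 1951, 2000)
print(early_shares)
print(late_shares)
```

Comparing the shares of two periods shows whether the records have shifted to other ecoregions.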
The data analysis distinguished a regional pattern of 40–50 years. Local patterns exist within the
main regional pattern, and these local patterns are of interest for fire- and forest management.
The regional pattern needs further study. The changes of the regional spatio-temporal pattern are
strongly related to the Calveg ecoregions. This relationship is depicted on the maps and in the
treemaps of Figure 14, which represent the Calveg ecoregions affected by forest fires over
different time periods. The Calveg areas in the treemap are proportional to the number of
wildfires recorded within the area during the selected time period. The treemaps illustrate that
the forest fire hotspots moved to new ecoregions between the main time periods. The correlation
between the Calveg ecoregions and the space-time patterns of the wildfires can be followed on the
maps. According to Figure 13, the changes occurred suddenly, within a few years between 1940 and
1950. The Calveg ecoregions are drawn based on ecological and hydrological characteristics,
which are strongly related to each other; thus ecological and hydrological causes could explain
the spatio-temporal patterns observed in the region. The changes are regional, and they affected
the entire eastern part of California. The process that caused these changes happened suddenly,
and since it affected large areas it can be assumed that it was intensive. The answer might be
found in the existing ecological, hydrological, and hydrogeological databases. The changes are not
related to the vegetation cycles, as the vegetation recovery time is approximately seven years,
whereas the pattern shows two periods of approximately 50 years with a short 10-year transition
between them. Overall, the forest fire pattern is not related to population growth (Figure 15). A
correlation between population and fire records is observed in areas with very high population
such as Los Angeles and San Diego, but the correlation is only spatial, not spatio-temporal.
Figure 15. a) County population over time, 1940–2016. b) Yearly total number of wildfires per
county during the same period.
Figure 16. Scatterplots, State of California. a) Points
represent centroids of the areas affected by wildfires
occurring 1900–2017; b) Scatterplot of the period 1900–
1940; c) 1940–1980; d) 1980–2017; e) Position of the
central point calculated for California. The scatterplots
were drawn based on the position of this central point.
The spatial data subsets for Central Valley East and Central California, discussed previously, were
selected based on the plot in Figure 16 a) drawn for the entire state of California. The irregular
form of California makes the interpretation of observed patterns more challenging. The position of
the central point within the state explains the low point density observed on the right side of the
plot in Figure 16.
The regional space-time pattern distinguished in the central and eastern region of California can be
discerned across the entire state (Figure 16). Between 1900 and 1940 points are mostly spread
NNW-SSE. Then the pattern suddenly changes, and points fill the 135°–225° sector where no
wildfires were previously observed. An overall intensification of the wildfires is suggested by the
plot drawn for the period 1980–2017.
The ecological, hydrological, topographic, and population diversity across California results in a
huge variety of all parameters. The possible causes and correlations are multiple, and each region
should be studied separately. The pattern shows sudden changes between main periods of relative
stability, the same as we have seen in the central and eastern regions of California.
Figure 17. Scatterplots, Los Angeles County, 1900–2017, with absolute (a) and relative (b)
central points. Plot (a) depicts the position of the county’s scatterplot within the State; plot (b)
draws a balanced image which is more suitable for analysis of the local space-time pattern. This
plot was mentioned in a previous section (see “Methods,” page 8, in connection with the absolute
and relative central points of the polar scatterplots.)
8. Conclusions
The plots resulting from the calculations presented in this paper reveal spatio-temporal patterns
which are otherwise difficult to observe. Two main spatio-temporal patterns can be observed in
California. The data analysis differentiates a regional pattern of 40–50 years, and local patterns
within the main regional pattern. The local patterns are of interest for fire- and forest management.
In the local patterns, incidents occurred within regions with relatively stable boundaries, namely
the Calveg ecoregions. Dispersion and locations of the incidents within the Calveg ecoregions can
be associated with activities of fire and forest management and the cycles of resident vegetation.
Most of the factors that influence wildfire ignition and spread, presented in section 4 on page 5, are
related to local patterns. The regional pattern needs further study. The changes in the regional
spatio-temporal pattern are strongly related to the Calveg ecoregions. The transition between the
spatio-temporal states of the wildfire regimes takes place during a relatively short transition period.
The regions affected by wildfires and the dates and time periods related to the wildfire activity are
discernible on the polar space-time plot and can be translated into geographical locations.
The state of groundwater in an area (Evaristo and McDonnell, 2017) influences the water content
of plants and their vulnerability to fires. California aquifers are mostly depleted very rapidly
(Argus, 2014), and their natural replenishment is not always possible, especially in the San
Joaquin and Sacramento Valleys, where agriculture consumes 80% of the groundwater. If porous
sand and gravel deposits in the aquifer are under substantial lithostatic pressure, they can become
inelastic, with the parameters changing irreversibly (Murray et al., 2018). Such deformations can
already be observed, and can cause redistribution of water in the aquifers in the Sierra Nevada.
Climate change has a serious impact on the entire environment, and could explain such regional
changes; however, the short transition periods are not characteristic of climate change (yet).
The explanation of the spatio-temporal processes that shaped the distribution of forest fires is
connected to the ecological, hydrological, and hydrogeological parameters that define the Calveg
zones.
The tool presented in this paper may be used for exploratory space-time visualization of local and
regional processes. It can considerably enhance the work of both researchers and public decision
makers, including planners at the local and state government level and resource and risk managers,
and help them observe potential patterns. Exploratory data analysis is the first step in observing
correlations and causalities of processes in any area, and this tool enables new insights into the
main spatio-temporal events.
Acknowledgments
The first draft of this paper was submitted as a graduation project to Hunter College at the City
University of New York. I thank Professor Jochen Albrecht for sharing expertise, and for the
sincere and valuable guidance and encouragement extended to me, and Professor Douglas
Williamson for comments that greatly improved the manuscript. I am immensely grateful to
Elisabeth Sydor, CIESIN at the Earth Institute of Columbia University, for her critical comments
on the manuscript, and for her participation in the final editing of the paper. I also thank one
anonymous reviewer for thoughtful suggestions.
Map 1. Wildfires across California, 1900–2017.
References
Andrienko, G. et al. 2010. Space, Time and Visual Analytics. International Journal of
Geographical Information Science 24 (10), 1577–1600
Agee, J. K. 2005. The complex nature of mixed severity fire regimes. USDA Forest Service Joint
Venture Agreement PNW-03-JV-11261927-354, CROP Forest Ecology.
Allen, J. F. 1984. Towards a general theory of action and time. Artificial Intelligence 23:123–154.
Argus, D. F. 2014. Seasonal variation in total water storage in California inferred from GPS
observations of vertical land motion. AGU, Geophysical Research Letters 10.1002/2014GL059570
Batty, M. and Jiang, B. 1999. Multi-agent simulation: new approaches to exploring space-time
dynamics in GIS. (CASA Working Papers 10). Centre for Advanced Spatial Analysis (UCL):
London, UK.
Batty, M. 2005. Approaches to Modeling in GIS: Spatial Representation and Temporal Dynamics.
In GIS, Spatial Analysis and Modeling, eds. D. J. Maguire, pp. 41–61. Redlands, CA: ESRI Press.
Bendix, J. and M.G. Commons. 2017. Distribution and frequency of wildfire in California riparian
ecosystems. Env. Res. Lett. 12 (2017) 075008.
Bradshaw, G. W. 1987. Fire Protection in the Urban/Wildland Interface. Who plays what role?
Fire Technology. August 1988, Volume 24, Issue 3, pp 195–203
Cleland, D.T. and Dickman D. I. 2002. Fire Return Intervals and Fire Cycles for Historic Fire
Regimes in the Great Lakes. Researchgate.
Couclelis, H. 1999. Space, Time, Geography. Geographical Information Systems, 1, 29–38.
Dennison, P.E. et al. 2014. Large wildfire trends in the western United States 1984–2011.
Geophysical Research Letters 10.1002/2014GL059576
Evaristo, J. and McDonnell, J. J. 2017. Prevalence and magnitude of groundwater use by
vegetation: a global stable isotope meta-analysis. Scientific Reports, March 2017, Nature
Publishing Group, DOI: 10.1038/srep44110.
Galton, A. & Mizoguchi, R. 2009. The Water Falls but the Waterfall does not Fall–New
perspectives on Objects, Processes and Events, Applied Ontology, 4(2): 71–107.
Galton, F. 1881. Isochronic Passage Chart. Proceedings of the Royal Geographical Society,
London.
Hägerstrand, T. 1967. Innovation Diffusion as a Spatial Process. Chicago, Illinois: The
University of Chicago Press.
Hägerstrand, T. 1970. What About People in Regional Science? Papers of the Regional Science
Association, 24(1): 7–24.
Hornsby, K. and Egenhofer, M. J. 2000. Identity-based change: a foundation for spatio-temporal
knowledge representation. INT. J. Geographical Information Science, 14(3),207–224.
Keeley, J. E. and Fotheringham, C.J. 2001. Historic Fire Regime in Southern California
Shrublands. Conservation Biology, 15(6):1536–1548
Keeley, J. E. Fotheringham, C.J. et al., 2009. The 2007 Southern California Wildfires: Lessons in
Complexity. Journal of Forestry, 107(6): 287–296
Langran, G. 1992. Time in Geographic Information Systems, Taylor and Francis, London,
Washington DC
Loidl, M. Wallentin, G. Cyganski, R. Graser, A. Scholz, J. and Haslauer, 2016. GIS and Transport
Modeling—Strengthening the Spatial Perspective. International Journal of Geo-Information, 5(6),
doi.org/10.3390/ijgi5060084.
Mark, D. et al. 1999. Cognitive models of geographical space. IJGIS, 13(8):747–774.
Murray, K. D. et al. 2018. Short-lived pause in Central California subsidence after heavy winter
precipitation of 2017. Science Advances, Geology; August 29, 2018.
Nauslar, N.J. et al. 2018. The North Bay and Southern California Fires: A Case Study. Fire 2018,
1, 18: doi:10.3390/fire1010018.
Nunes, J. 1995. General concepts of space and time. In Frank, A.U. (Eds.), Geographic
Information Systems, Material for a Post-Graduate Course, Vienna: Department of
Geoinformation of TU Vienna, pp. 7–34
Peuquet, D. J. 1994. It's About Time: A Conceptual Framework for the Representation of
Temporal Dynamics in Geographic Information Systems, Annals of the Association of American
Geographers, 84:3, 441-461, DOI: 10.1111/j.1467–8306.1994.tb01869.x
Tobler, W. R. 1970. A Computer Movie Simulating Urban Growth in the Detroit Region.
Economic Geography, 46: 234–240
Tobler, W. R. 1979. A transformational view of cartography. American Cartographer, 6(2): 101–
106
O’Sullivan, D. and Perry, G. L. W. 2013. Spatial simulation exploring pattern and processes. John
Wiley & Sons, Ltd.
Versini, et al. 2012. Hydrological impact of forest fires and climate change in a Mediterranean
basin. Nat. Hazards 66:609–628. DOI 10.1007/s11069-012-0503-z
Wine, M. L. and Cadol, D. 2016. Hydrologic effects of large southwestern USA wildfires
significantly increase regional water supply: fact or fiction? Environ. Res. Lett. 11 (2016) 085006
LIST OF FIGURES
Figure 1. Tobler’s transformational matrix
Figure 2. Circular study area in Central California.
Figure 3. Wildfires within the study area of Central California, 1930–1931.
Figure 4. Wildfires within the study area of Central California, 1900–1940.
Figure 5. Wildfires within the study area of Central California, 1940–1960.
Figure 6. Wildfires within the study area of Central California, 1960–2000.
Figure 7. Wildfires within the study area of Central California, 2000–2017.
Figure 8. Wildfires within the study area of Central California, 1900–2017.
Figure 9. Wildfires within the study area of Central Valley East region, 1900–1950.
Figure 10. Wildfires within the study area of Central Valley East region, 1950–2000.
Figure 11. Wildfires within the study area of Central Valley East region, 2000–2017.
Figure 12. Wildfires within the study area of Central Valley East region, 1900–2017.
Figure 13. Altitude (meter) of wildfire occurrences and year of incidence in the region between
Central Valley and the eastern border of California, 1900–2017.
Figure 14. Treemaps.
Figure 15. a) County population over time, 1940–2016. b) Yearly total number of wildfires per
county during the same period.
Figure 16. Scatterplots, State of California.
Figure 17. Scatterplots, Los Angeles County.
Figure 18. ArcPy tool, which calculates the space-time coordinates.
Figure 19. ArcGIS Create Graph Wizard.
Figure 20. California wildfires, 1900–2017. The color represents the size of the burned areas and
the central circle represents the area’s center.
Figure 21. Scatterplot of wildfires within the study area of Central Valley East region, 1900–2017.
The colors represent the radius lengths.
APPENDIX
ARCPY SCRIPT TO UPDATE DATAFRAME WITH NEW COORDINATES AND YEAR INDEX.
import arcpy
from arcpy import env
from arcpy.sa import *
from arcpy.da import *
arcpy.env.overwriteoutput=True
import math
inFeatures = arcpy.GetParameterAsText(0) #input: shapefile to calculate origin of the
coordinate system
inFeatures1 = arcpy.GetParameterAsText(1) # input: x coordinate to calculate origin of
the coordinate system
inFeatures2 = arcpy.GetParameterAsText(2) #input: y coordinate to calculate origin of
the coordinate system
inFeatures3 = arcpy.GetParameterAsText(3) # input: Date or time periods, such as
occurrence year, month, week, day etc. from inFeatures4
inFeatures4 = arcpy.GetParameterAsText(4) #input: shapefile to be plotted
inFeatures5 = arcpy.GetParameterAsText(5) # input: x coordinate to be plotted
inFeatures6 = arcpy.GetParameterAsText(6) #input: y coordinate to be plotted
field1 = inFeatures1  # x coordinate field of the reference points
field2 = inFeatures2  # y coordinate field of the reference points
# extent of the reference shapefile
x_val = [row[0] for row in arcpy.da.SearchCursor(inFeatures, field1)]
Xmax = max(x_val)
Xmin = min(x_val)
y_val = [row[0] for row in arcpy.da.SearchCursor(inFeatures, field2)]
Ymax = max(y_val)
Ymin = min(y_val)
# origin of the new coordinate system: midpoint of the extent,
# or 0 when the extent straddles the axis
if Xmax < 0:
    Xcentr = Xmax - ((Xmax - Xmin) / 2.0)
elif Xmin > 0:
    Xcentr = Xmin + ((Xmax - Xmin) / 2.0)
else:
    Xcentr = 0
if Ymax < 0:
    Ycentr = Ymax - ((Ymax - Ymin) / 2.0)
elif Ymin > 0:
    Ycentr = Ymin + ((Ymax - Ymin) / 2.0)
else:
    Ycentr = 0
field3 = "R"
field4 = "Radian"
field5 = "year_ix"
field6 = "Rindex"
field7 = "R1"
field10 = "x"
field11 = "y"
field12 = "Degree"
arcpy.AddField_management(inFeatures4, "R", "DOUBLE", "", "", "25", "", "NULLABLE",
"REQUIRED", "")
arcpy.AddField_management(inFeatures4, "Radian", "DOUBLE", "", "", "25", "", "NULLABLE",
"REQUIRED", "")
arcpy.AddField_management(inFeatures4, "year_ix", "SHORT", "", "", "4", "", "NULLABLE",
"REQUIRED", "")
arcpy.AddField_management(inFeatures4, "R1", "DOUBLE", "", "", "25", "", "NULLABLE",
"REQUIRED", "")
arcpy.AddField_management(inFeatures4, "x", "DOUBLE", "", "", "25", "", "NULLABLE",
"REQUIRED", "")
arcpy.AddField_management(inFeatures4, "y", "DOUBLE", "", "", "25", "", "NULLABLE",
"REQUIRED", "")
arcpy.AddField_management(inFeatures4, "Degree", "DOUBLE", "", "", "25", "", "NULLABLE",
"REQUIRED", "")
field13 = inFeatures5 #coordinate x of the point
field14 = inFeatures6 #coordinate y of the point
# shift each point to the new origin and derive its polar coordinates
cursor = arcpy.UpdateCursor(inFeatures4)
for row in cursor:
    dx = row.getValue(field13) - Xcentr
    dy = row.getValue(field14) - Ycentr
    angle = math.atan2(dy, dx)
    row.setValue(field10, dx)                           # x relative to the origin
    row.setValue(field11, dy)                           # y relative to the origin
    row.setValue(field3, math.sqrt(dx * dx + dy * dy))  # radius R
    row.setValue(field4, angle)                         # angle in radians
    row.setValue(field12, angle * 180 / math.pi)        # angle in degrees
    cursor.updateRow(row)
del cursor
fld_in = inFeatures3  # date or time of occurrence
fld_out = "year_ix"   # field added above; it will be filled with the time index
# The rows must be sorted by fld_in in ascending order. The index starts at 0
# and increases by 1 each time the time value differs from that of the
# previous row, so all rows with the same time value share one index.
# use da module for faster access
time_index = -1
previous = None
with arcpy.da.UpdateCursor(inFeatures4, [fld_in, fld_out]) as curs:
    for row in curs:
        if time_index < 0 or row[0] != previous:
            time_index += 1
        previous = row[0]
        # update the values
        row[1] = time_index
        curs.updateRow(row)
# search for the maximum value of the radius; each time step gets a ring of this width
r = [row[0] for row in arcpy.da.SearchCursor(inFeatures4, field3)]
Rmax = max(r)
arcpy.AddField_management(inFeatures4, "Rindex", "DOUBLE", "", "", "25", "", "NULLABLE",
                          "REQUIRED", "")
# Rindex = ring offset of the time step; R1 = Rindex + R (field "R1" was added earlier)
cursor = arcpy.UpdateCursor(inFeatures4)
for row in cursor:
    ring_offset = Rmax * row.getValue(field5)
    row.setValue(field6, ring_offset)
    row.setValue(field7, ring_offset + row.getValue(field3))
    cursor.updateRow(row)
del cursor
arcpy.AddMessage("Script finished. To see the result, check the updated columns of the input feature class.")
The ArcPy tool (Figure 18) calculates the transformed polar coordinates and saves them
in the same shapefile. In the data entry fields, the user picks the previously sorted shapefile
(entries in the time column must be sorted in ascending order), then selects the x and y coordinate
columns of the main study area, the x and y coordinate columns of the data subset to be plotted,
and finally the time/date column. After the tool has run, the graph is drawn with the ArcGIS
graph builder tool (View/Graphs/Create Graphs). The Graph Manager can then be activated;
right-clicking in the Graph Manager opens the “Advanced Properties,” where additional changes
can be made.
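The transformation performed by the tool can also be sketched without ArcGIS. The following is a minimal illustration in plain Python, not the tool itself: the function name `space_time_polar` and its arguments are hypothetical, the origin is always taken as the midpoint of the bounding box (the ArcPy script falls back to 0 when the extent straddles an axis), and the time index is derived by ranking the distinct time values, so the input does not need to be sorted.

```python
import math


def space_time_polar(points, times, origin_points=None):
    """Return one (angle_rad, r1, time_index) tuple per input point.

    points        -- list of (x, y) coordinates to be plotted
    times         -- one sortable time value (e.g. a year) per point
    origin_points -- points whose bounding-box midpoint defines the origin
                     (hypothetical argument; defaults to `points`)
    """
    ref = points if origin_points is None else origin_points
    xs = [p[0] for p in ref]
    ys = [p[1] for p in ref]
    x0 = (min(xs) + max(xs)) / 2.0
    y0 = (min(ys) + max(ys)) / 2.0

    # Time index: 0 for the earliest distinct time value, 1 for the next, ...
    rank = {t: i for i, t in enumerate(sorted(set(times)))}

    shifted = [(x - x0, y - y0) for x, y in points]
    radii = [math.hypot(dx, dy) for dx, dy in shifted]
    r_max = max(radii)

    result = []
    for (dx, dy), radius, t in zip(shifted, radii, times):
        angle = math.atan2(dy, dx)        # direction from the origin, in radians
        r1 = radius + r_max * rank[t]     # radius pushed outward by one ring per time step
        result.append((angle, r1, rank[t]))
    return result
```

Each time step occupies its own ring of width `r_max`, so a point keeps its direction from the origin while its distance from the plot center encodes both time (which ring) and space (the position within the ring).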
The ArcPy tool can be downloaded from the following link:
https://www.arcgis.com/home/item.html?id=980cbf370eb74586a636b9da9c74f43d
Figure 18. ArcPy tool, which calculates the space-time coordinates.
The ArcPy Toolbox can be accessed through the Catalog working folder. The user should save the
Python scripts in the same folder as the Toolbox that contains the new tools.
Figure 19a.
Figure 19b.
Figure 19c.
Figure 19d.
Figure 19. a) ArcGIS Create Graph Wizard. b) The radius value is Rmax. c), d) Script
parameters. The Python file is stored in the folder that was connected in the ArcGIS Catalog.
The following script creates a heat map of population density per county for each year (Figure 15 a).
The same script was used to draw the heat map of fire counts per county over time (Figure 15 b).
# plot the total number of fires per county as a heat map (Hovmoeller plot)
# Excel sheet: columns = counties, rows = years, cell values = count of fires per county in one year
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

mydf = pd.read_excel('POPULATION1940-2016.xls', sheet_name='Sheet1')
# set the figure size
plt.subplots(figsize=(20, 15))
# create the heat map
sns.heatmap(mydf, cmap="plasma")
# label every row; set after the heat map is drawn so that the ticks are not overwritten
plt.yticks(np.arange(0.5, len(mydf.index), 1), mydf.index)
# save the figure before showing it; saving after plt.show() can write an empty image
plt.savefig('FireCountCounty.png', dpi=100)
plt.show()
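The heat map script expects a table with one row per year and one column per county. As a minimal sketch of how such a table might be built from one-record-per-fire data, the example below uses `pandas.crosstab`; the column names `year` and `county` and the sample records are hypothetical and are not taken from the wildfire dataset.

```python
import pandas as pd

# Hypothetical long-form records: one row per wildfire.
fires = pd.DataFrame({
    "year":   [1950, 1950, 1950, 1951, 1951, 1952],
    "county": ["Kern", "Kern", "Butte", "Butte", "Kern", "Butte"],
})

# Rows = years, columns = counties, cell values = number of fires.
# County-year combinations without fires are filled with 0.
counts = pd.crosstab(fires["year"], fires["county"])
print(counts)
```

A table of this shape can be passed to `sns.heatmap` directly, and its index then supplies the year labels for the vertical axis.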
This is the second part of the original script. It was changed to allow a color-coded representation
of wildfire areas, and it also inserts an offset into the central portion of the plot. Instead of
areas, the script can be run with radius values to show how the distances expand across the
plot. Alternatively, the values of any column in the attribute table can be used to show how
they change over time.
mydf['year_ix'] = year_ix
# calculate the new polar coordinates and add them to the dataframe
# R: distance from the central point with coordinates (0, 0)
mydf["R"] = np.sqrt(np.power(mydf["x"], 2) + np.power(mydf["y"], 2)).astype(np.float32)
Rmax = mydf["R"].max()
mydf["Angle_rad"] = np.arctan2(mydf["y"], mydf["x"]).astype(np.float32)  # angle in radians
mydf["R1"] = mydf['R'] + (Rmax * mydf['year_ix'])
theta = mydf['Angle_rad'].tolist()  # angle in radians
r = mydf['R1'].tolist()  # transformed radius: combines the time index with the polar radius
A = mydf['Shp_Area'].tolist()
R1max = mydf["R1"].max()
# draw polar scatterplot
my_index = sorted(set(year_ix))  # one label per date ring, in ascending order
area = 5  # this value defines the marker size of the points in the scatterplot
colors = A  # the color of the dots; here defined by the polygon areas (wildfire areas)
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], polar=True)
ax.set_ylim(0, R1max)  # maximum value of R1
# one tick per date ring; to draw a ring only every 10th year, use
# ax.set_yticks(np.arange(0, R1max, 10 * Rmax)) and pass every 10th label
ax.set_yticks(np.arange(0, R1max, Rmax))
ax.set_yticklabels(my_index, fontsize=7)
# the 'spectral' colormap is named 'nipy_spectral' in current Matplotlib versions
heatmap = ax.scatter(theta, r, c=colors, cmap='nipy_spectral', s=area)
ax.set_rmin(-30000000.0)  # set offset in the plot's central portion to enhance visibility
plt.colorbar(heatmap)
plt.show()
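The `year_ix` values assigned at the start of this part come from a portion of the script that is not reproduced here. A minimal sketch of one way to obtain them with pandas is given below, assuming the dataframe holds the occurrence year in a column named `YEAR_` (a hypothetical name). A dense rank assigns 0 to the earliest year and increases by 1 for each further distinct year, so rows do not have to be sorted beforehand.

```python
import pandas as pd

# Hypothetical example data: occurrence year of five wildfires, unsorted.
mydf = pd.DataFrame({"YEAR_": [1950, 1900, 1950, 1902, 1900]})

# Dense rank: earliest year -> 0, next distinct year -> 1, and so on.
year_ix = (mydf["YEAR_"].rank(method="dense").astype(int) - 1).tolist()
print(year_ix)      # → [2, 0, 2, 1, 0]

# The distinct years in ascending order can serve as ring labels.
ring_labels = sorted(mydf["YEAR_"].unique().tolist())
print(ring_labels)  # → [1900, 1902, 1950]
```

With this definition, years without any recorded fire do not receive a ring of their own, because the index counts only the years present in the data.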
Figure 20. California wildfires, 1900–2017. The color represents the size of the burned areas and
the central circle represents the area’s center. The points close to the central point are more
dispersed after the offset has been applied, and the interpretation is easier. This graph is an
illustration of the additional features of the Python tool. The area measurements were not always
accurate; therefore, the results were not discussed.
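The effect of the offset can be illustrated with a short calculation. The sketch below assumes that the radial axis is linear and runs from the negative offset at the plot center to the largest radius at the outer edge; the function names and the sample numbers are hypothetical. Under that assumption, a point of radius r is drawn at the fraction (r + offset) / (r_max + offset) of the plot radius, and two points at the same radius are separated on screen in proportion to that fraction.

```python
def plotted_fraction(r, r_max, offset=0.0):
    """Fraction of the plot radius at which a point of radius r is drawn
    when the radial axis runs from -offset (center) to r_max (outer edge)."""
    return (r + offset) / (r_max + offset)


def arc_separation(r, r_max, delta_theta, offset=0.0):
    """Arc length, in units of the plot radius, between two points that share
    the radius r and differ in angle by delta_theta (radians)."""
    return plotted_fraction(r, r_max, offset) * delta_theta


# Two points close to the center (r = 1 of at most 100), half a radian apart.
without_offset = arc_separation(1.0, 100.0, 0.5)
with_offset = arc_separation(1.0, 100.0, 0.5, offset=25.0)
print(without_offset, with_offset)
```

In this example the two points are drawn roughly twenty times farther apart once the offset is applied, which is why the innermost rings become easier to read.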
Figure 21. Scatterplot of wildfires within the study area of Central Valley East region, 1900–2017.
The colors represent the radius lengths.