Gazetteers Enriched: A Conceptual Basis for Linking Gazetteers with Other Kinds of Information

To appear in Placing Names: Enriching and Integrating Gazetteers, Indiana University Press.
Ryan Shaw, School of Information and Library Science, University of North Carolina at Chapel Hill
Introduction
The gazetteer is a tool that has taken on new importance in the context of networked,
digital communication. Gazetteers are used to facilitate access to geographic information
systems, to integrate data related to places, and to enable text-oriented search of spatial
information (or spatially-oriented search of textual information). Many information systems,
including the major Web search engines, rely on proprietary or publicly available gazetteers
to provide place-oriented search. As tools that support the description, discovery,
understanding, and processing of information, gazetteers are exemplars of a broader class of
tools known as knowledge organization systems (Hodge 2000).
As these tools have moved onto the Web there has been a tendency to de-emphasize the
differences among them, in favor of assimilation into a generic “cloud” of linked data.
Certainly there is value in recognizing affinities among different knowledge organization
tools and designing systems in which they can be used together. But dissolving all of these
tools in a “global data space” (Heath and Bizer 2011) may not be the most effective approach.
Distinctions among knowledge organization tools are grounded in their different functions.
While these distinctions have never been crisp, effacing them completely leads to focusing
too much on details of minting identifiers and modeling data, rather than the purposes to
which the data will be put.
The purpose of this essay is to explore how one might embed gazetteers in a networked
environment without losing sight of what makes them distinctive tools. First I review the
general notion of a knowledge organization system. I then argue that the distinctive
function of a gazetteer is to provide descriptions identifying the referents of proper names
(typically places, but in principle any named thing that can be spatially located). Next I
consider how to expand this functionality both horizontally, giving the gazetteer greater
breadth, and vertically, giving it greater depth. Gazetteers expand their breadth by including
names and descriptions of things other than places, such as people, material objects, or
historical events. Their depth increases when the descriptions of the objects they describe
are enriched with more information about how those descriptions were constructed.
Knowledge Organization Systems
“Knowledge organization system” (KOS) is an umbrella term for a variety of tools including
controlled vocabularies, authority files, synonym rings, taxonomies and classification
schemes, thesauri, and ontologies (Tudhope and Koch 2004, Lei Zeng 2008). Reference
works such as encyclopedias and bibliographies can also be understood as KOSs (Hjørland
2007). Essentially a KOS is any tool that brings together related concepts and their names in
a meaningful way, such that users of the KOS can easily comprehend the relationships
represented. Various kinds of KOS differ primarily in how they are structured. In the
following paragraphs I briefly describe some of these structural differences; readers
interested in learning more should consult Lambe (2007, 448).
The simplest structure for a KOS is just a list, for example a list of Irish civil parish names.
But a long list can be cumbersome to use, so one typically wants a way to break it into
pieces (e.g. parish names grouped by county), each of which can be represented by a name in
a more manageable list (of Irish county names). Now we have a tree structure, which we can
use to make increasingly broad generalizations up the tree and increasingly narrow
distinctions down the tree. We can augment these broader-than and narrower-than relations
by adding relations of equivalence (this name and that name actually refer to the same parish)
or association (these two parishes were originally parts of one larger parish). A KOS that
relates names in these three ways is known as a thesaurus, an example of which in the
geographical domain is the Getty Thesaurus of Geographic Names (TGN).1 A thesaurus like
the TGN differs from a GIS in that its records generally do not contain coordinate
information, because the purpose of the thesaurus (like any KOS) is to understand semantic
relationships, not to display information on a map.
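The progression from list to tree to thesaurus can be sketched as a small data structure. The Python below is only a minimal illustration, not the TGN's data model: the `Term` and `Thesaurus` classes and the placeholder names ("County X", "Parish A") are assumptions invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """A preferred name plus the three thesaurus relation types."""
    name: str
    broader: Optional[str] = None                  # hierarchical: the parent term
    equivalents: set = field(default_factory=set)  # equivalence: other names, same referent
    related: set = field(default_factory=set)      # association: non-hierarchical links


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def add(self, name, broader=None, equivalents=(), related=()):
        self.terms[name] = Term(name, broader, set(equivalents), set(related))

    def narrower(self, name):
        """Move down the tree: terms whose broader term is `name`."""
        return sorted(t.name for t in self.terms.values() if t.broader == name)

    def ancestors(self, name):
        """Move up the tree: successively broader terms."""
        chain = []
        current = self.terms[name].broader
        while current is not None:
            chain.append(current)
            current = self.terms[current].broader
        return chain

    def preferred(self, name):
        """Resolve a name or any of its equivalents to the preferred name."""
        for term in self.terms.values():
            if name == term.name or name in term.equivalents:
                return term.name
        return None


# Hypothetical placeholder data, not real parish or county names.
kos = Thesaurus()
kos.add("Country")
kos.add("County X", broader="Country")
kos.add("Parish A", broader="County X",
        equivalents={"Parish A (variant spelling)"}, related={"Parish B"})
kos.add("Parish B", broader="County X", related={"Parish A"})

print(kos.narrower("County X"))
print(kos.ancestors("Parish A"))
print(kos.preferred("Parish A (variant spelling)"))
```

Note that nothing in this structure records where a parish is. It records only how names relate to one another, which is the point of the contrast with a GIS.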
The most complex form of KOS is an ontology, which attempts to specify relationships
among entities using a formal logic so that new relationships can be algorithmically inferred.
Algorithmic inference requires that definitions of relationships be perfectly consistent and
unambiguous. Unfortunately, relationships as used and understood by people rarely have
these properties. But compromise is possible. The Simple Knowledge Organization System
(SKOS) is an ontology specifying the components of KOSs and the relationships they
represent (Isaac and Summers 2009). By representing a KOS using SKOS, one is formally
modeling the structure of the KOS and not necessarily the full meaning of relationships
represented. This can allow some limited algorithmic inferences to be made, without forcing
the definitions to be fully consistent and unambiguous. This is useful for making KOSs (such
as the TGN) usable by software programs, for example by publishing them on the Web as
linked data or web services.2
1 http://www.getty.edu/research/tools/vocabularies/tgn/
Gazetteer Web Services
The Open Geospatial Consortium, in its document outlining best practices for the design of
gazetteer web services, identified four typical use cases of such services (Fitzke and Atkinson
2006). One use is geocoding: identifying names of places in arbitrary text or recorded speech.
This involves not just finding where place names are used, but disambiguating each use by
identifying the specific place being referred to. So, for example, a geocoding web service
could ideally associate the name “Greenville” with Greenville, North Carolina rather than
Greenville, South Carolina, by examining the discursive context in which it was used.
A second use is navigation: enabling users to search or browse the names of places as a first
step toward accessing information about a place of interest. The assumption here is that a
gazetteer can include names familiar to users and thus can act as an entry vocabulary that
they can use to specify their interests (Buckland et al. 1999). Including vernacular and
domain-specific names is particularly important for this kind of use. For example, a user
considering a move to central North Carolina may know only the names of cities in that
area, yet much useful information may be labeled with region names such as “Piedmont” or
“Triangle”. By linking individual city names to these non-administrative regions, a gazetteer
can help users access this information. A related use is query expansion, the difference being
that in this case the vocabularies being bridged are those used by separate web services
rather than a user and a single web service.
A third use is selection: finding names and other information about places that meet a certain
description. This is simply the inverse of the navigation use case, wherein the user knows
something about the places she is interested in, but is unable to easily refer to them by
name. For example, she may want to know the names of all the places that she will drive
through on a road trip from Mount Pisgah to Cumberland Knob, so that she can make a
mixtape of music by artists from those places (Lamere 2012). A gazetteer web service
providing information about the locations of places in North Carolina could be combined
with information about the route followed by the Blue Ridge Parkway to make this possible.
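The selection use case can be sketched as a query that takes a description (here, "near this route") and returns names. The Python below is a rough illustration under stated assumptions: the gazetteer is a plain dictionary of invented place names and coordinates, the route is a list of invented vertices rather than the actual Blue Ridge Parkway, and proximity is measured to route vertices only, not to the segments between them.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def places_near_route(gazetteer, route, max_km):
    """Return names of places within max_km of any route vertex, in route order."""
    selected = []
    for vertex in route:
        for name, location in gazetteer.items():
            if name not in selected and haversine_km(vertex, location) <= max_km:
                selected.append(name)
    return selected


# Invented names and coordinates, for illustration only.
gazetteer = {
    "Place A": (35.05, -82.0),
    "Place B": (37.0, -80.0),
    "Place C": (36.0, -81.1),
}
route = [(35.0, -82.0), (36.0, -81.0)]

print(places_near_route(gazetteer, route, max_km=10))
```

The user supplies no names at all. The gazetteer turns a spatial description into the names by which the matching places can be referred to.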
2 See http://vocab.getty.edu/ for a SKOS representation of the TGN.
Theorizing Gazetteers
Looking at the use cases outlined in the previous section, we can discern a common theme:
the gazetteer mediates between discourse and the world. In the case of geocoding, a name is
used in discourse to pick out some particular place in the world, and the gazetteer provides
the background knowledge necessary (the coordinates of an extent of space) for a reader or
listener to know which particular place is being picked out. When a gazetteer is used for
navigation, it is the user’s discourse that is being mediated. The user is using a name to
indicate what particular place he is interested in, and the gazetteer is providing to the
system the background knowledge necessary for the system to “understand” which place he
is indicating. The case of query expansion is similar, except now both the “speaker” (using a
name) and the “listener” (trying to understand which place it refers to) are information
services. Finally, in the case of selection a user is giving a description, and the gazetteer is
providing the background knowledge necessary to know what places meet that description,
and how they might be referred to more succinctly using their names.
This way of looking at a gazetteer is broad enough to encompass the kinds of historical and
cultural applications discussed by Southall et al. (2011). A historical gazetteer mediates
between historical discourse and the world. It provides the background knowledge necessary
to understand what place is being referred to by a name in a historical text, even when that
place has changed significantly or no longer exists, or the name is no longer in use. For
example, a historical gazetteer could provide the knowledge necessary to know that the
name “Joara” in a Spanish explorer’s journal referred to a Native American settlement in the
foothills of the Blue Ridge Mountains, near present-day Morganton, North Carolina
(Hudson 2005). This background knowledge may be incomplete, and thus the gazetteer’s
mediation may be imperfect, but the goal is the same.
Any KOS that associates terms with concepts might be said to mediate between discourse
and the world. When one encounters an unfamiliar term used in discourse, one can consult a
dictionary to acquire the information necessary to understand it: the term’s meaning. And
depending on one’s philosophical disposition, one might take this meaning to be part of the
world. So what distinguishes a gazetteer as a distinct kind of KOS is not simply the fact that
it mediates between discourse and the world, but the specific way in which it does this. To
understand the specific way in which gazetteers mediate between discourse and the world, it
is useful to turn to the work of the philosopher P. F. Strawson.
Strawson was interested in how the way we use language to refer differs from the way we use
language to describe. To refer to something is to indicate what it is that one is talking about,
while to describe something is to actually say something about it (Strawson 1950). Most of
what we say or write involves both reference (to a subject) and description (via some
predicate). If I say, “Carrboro was a mill town,” I use the name “Carrboro” to introduce
(refer to) the subject, and the phrase “was a mill town” to introduce a predicate describing
that subject. Both reference and description involve using linguistic expressions to
introduce terms (subjects or predicates) in this way. And in both cases, this introduction
involves identification (Strawson 1959, 181). “Carrboro” identifies a particular place, while
“mill town” identifies a general concept or characteristic. I describe the identified place by
attributing the identified characteristic to it (using the word “was”).
Strawson observed that despite this similarity, the linguistic rules for identifying a particular
thing one is talking about differ from the linguistic rules for identifying a general concept
being attributed to that thing (Strawson 1959, 180–213). When I say to you that “Carrboro
was a mill town,” you can understand the general concept of a “mill town” as long as you
understand English and know what these words mean. But to understand what I mean by
“Carrboro,” knowing English is not enough, or even relevant. To correctly understand
“Carrboro,” you must know something about the world. Specifically, you must know what it
is that uniquely identifies the town of Carrboro: some set of true statements that describe it
and nothing else (Strawson 1959, 183). So, for example, you might understand “Carrboro” if
you know that “Carrboro borders Chapel Hill to the west,” as this statement uniquely
describes Carrboro. But of course to correctly understand that statement, you need to know
what is meant by “Chapel Hill,” which requires knowing the set of true statements that
uniquely identify the place to which it refers.
The example suggests how one might replace expressions referring to particular things (such
as Carrboro) with sets of statements that refer to other particular things (such as Chapel
Hill). Strawson claims that one can do this recursively, without fear of infinite regress,
because eventually one can always reach a set of statements that uniquely describe the
particular thing to which one is referring, yet which do not themselves refer to any other
particular things. In other words, one can always eventually obtain a description that
uniquely identifies the particular thing being referred to and that uses only some
combination of general concepts (and, possibly, actually pointing to things) (Strawson 1959,
210). This is possible because we share a common spatiotemporal framework: we are
embodied beings existing in space and time. By virtue of this shared framework, there is
always the possibility of me uniquely describing some particular thing by locating it in space
and time relative to your location in space and time (Strawson 1959, 24–26).
Of course, Strawson was not claiming that, when I use the name “Carrboro,” I actually have
in mind some specific set of statements that would allow me to uniquely describe its
spatiotemporal location to you without referring to any other particular things. However he
was claiming that when I refer to a particular thing using a name like this, I presuppose that
such statements could be made (Strawson 1959, 183–186). I do not assert these statements; I
take them for granted. If I did not take it for granted that I could, if pressed, actually
identify the referent of the name I used, then my use of the name would be meaningless. As
Strawson put it, identification of a particular thing via the use of its name “rests on”
empirical facts in a way that specification of a general concept does not. Understanding the
meaning of words used in discourse is not enough to understand references to particular
things; one must understand something about the world: the relative locations of things in
space and time.
The Distinctive Function of a Gazetteer
Armed with Strawson’s theory, we can now state more clearly the distinctive function of a
gazetteer: it supplies identifying descriptions for proper names. Proper names are
“syntactically simple expressions that refer, or at least purport to refer, to particular
objects/individuals” (Reimer 2010). Proper names are needed when members of a communicative
community frequently need to refer to some thing, and they intend to all be referring to the
same thing, and there is no short description known to be known to all that could be used
to identify the thing (Strawson 1974, 36). In such a situation there is a need for an expression
that is not tied to a specific description of the thing, which can continue to refer to the
thing even as it changes, and which can successfully be understood as referring to the same
thing even by people who know that thing differently (Strawson 1974, 38). This expression is
a proper name. A name may be given to a particular thing through an act of naming, as when
parents name their children or a government names a town, or it can simply arise over
time in response to the need to identifyingly refer.
When a speaker or writer uses a name to refer to a particular place, she must know
something about the world that enables her to identify specifically which place she is
referring to. This knowledge consists of a description of the place that uniquely identifies it.
By using the name, she presupposes that her audience has such knowledge too. But of
course, they may not. If this happens in conversation, the listener can ask the speaker
questions in order to elicit the uniquely identifying description necessary to understand the
name. But when the name appears in a written text, this is usually not an option. So this is
when a gazetteer is needed: given an unfamiliar place name, a gazetteer can supply a
uniquely identifying description of the place referred to by that name. This description may
not and probably will not be the same one known and presupposed by the original writer.
But that is not necessary. All that is necessary is that both the description presupposed by
the writer and the one supplied by the gazetteer identify the same place (Strawson 1959,
183).
Understanding gazetteers as tools for obtaining uniquely identifying descriptions of things
referred to in discourse helps explain a few things about them. First, it explains why many
gazetteer standards focus on the provision of geospatial footprints (coordinates for points,
lines, or polygons). This is not because gazetteers are simply adjuncts to GIS, but because a
global coordinate system is a particularly efficient and efficacious way to provide a uniquely
identifying description of a place. Recall that the ultimate ground for successful reference to
particular things is our shared spatiotemporal framework. A global coordinate system is just
a tool for systematically generating descriptions of locations within the spatial component
of that framework. (A calendar performs the same function for the temporal component.)
These descriptions (e.g. 35°55′14″N 79°5′2″W) can thus replace the contingent, idiosyncratic
and possibly cumbersome descriptions (e.g. “the town just to the west of where I went to
university”) presupposed by the users of names.
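As a small worked example of how such a coordinate description is handled computationally, the degrees-minutes-seconds pair 35°55′14″N 79°5′2″W can be converted to the signed decimal degrees that most software expects. The function name `dms_to_decimal` is an assumption made for this sketch.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Southern and western hemispheres are negative by convention.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


lat = dms_to_decimal(35, 55, 14, "N")
lon = dms_to_decimal(79, 5, 2, "W")
print(round(lat, 6), round(lon, 6))  # 35.920556 -79.083889
```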
The need to uniquely identify also explains the need for feature types. Two things can exist
in the same place at the same time, if they are different kinds of things. Strawson gave the
example of a man and his body (1974, 14). In a somewhat similar way, one can distinguish
between a populated place and an administrative division that have the same location. Of
course it is likely that the establishment of the populated place predated the creation of the
administrative division that shares its location. So it is true that it is rare for even things of
different types to occupy precisely the same extent of space-time. But it is not impossible.
And given that it is difficult to establish precise spatiotemporal boundaries for a vague thing
like a populated place, it is convenient to use type as a distinguishing factor.
Finally, this problem of vagueness also explains why gazetteers have traditionally been
focused on physical landscape features and administrative boundaries. Both of these can be
precisely located using geospatial coordinates, although for different reasons. Administrative
boundaries can be precisely located because they are established by fiat, while physical
features can be precisely located because they are “public objects of perception” (Strawson
1959, 45). Public objects of perception are material things that different people can see and
touch and directly locate and agree that they are the same things even when encountered at
different times. They are things like mountains and rivers and coastlines. Strawson calls
these things “basic particulars” because they are the kinds of particular things we can use to
establish a shared spatiotemporal frame of reference even in the absence of a global
coordinate system. All of our other locating activity, including the establishment of
administrative boundaries and even the establishment of coordinate systems, depends on
our ability to publicly perceive and locate these basic particulars.
Broad Gazetteers
Of course gazetteers should not be limited to covering basic or easily georeferenced things
like physical features or administrative boundaries. But our understanding of where places
are is ultimately grounded in our ability to locate physical features, and so it is unsurprising
that these features have been a major focus of gazetteer construction. Precisely locatable
places are the points in reference to which we can relatively locate less precisely locatable or
even unlocatable places (Elliott and Gillies 2011). There may not be enough information
known about a place to actually enable it to be located. What is necessary is that the
description given is sufficient to uniquely identify the place, not that it precisely locates the
place. What makes the place eligible for inclusion in a gazetteer is that it is the kind of thing
that could be located given the right information, whether or not it is actually locatable in
practice.
Conceived broadly as tools that supply identifying descriptions for proper names, gazetteers
can provide information about any object of interest in the world that 1) has a name, and 2)
can, at least in theory, be spatiotemporally located. Such objects are a subset of the class of
things that Strawson calls particulars; they are the subset of particulars with proper names.
People, groups, organizations, and institutions are particulars (they can be located in space
and time) and they usually have names. Events (the things that individuals and groups do)
are also particulars, although relatively few of them are given proper names. Finally, there
are named historical periods such as the Renaissance or the Late Bronze Age. There are
many other kinds of spatiotemporally locatable particulars, but lacking proper names most
of them are not suitable for inclusion in gazetteers. Periods, events, people individually and
in groups, and places: these are the primary subjects of gazetteers conceived broadly.
A broad gazetteer does not provide information about the names of non-particular things
such as classes, kinds, relations, numbers, or other general concepts (except to the extent
that these names are used in descriptions of particular things). The category of “named
particular thing” is broad, but not so broad as the category of “concept” claimed as the
domain of KOSs more generally. Typically the things listed in broad gazetteers will be
individual persons, groups of people, organizations and institutions, periods and events, and
places. But they also may be things that are less easily categorized, like cultures, mentalités,
political processes, social conditions, or behavioral patterns. These things can all be
localized in space and time, so they are particular things, but they will only be suited to
inclusion in a broad gazetteer if they have been given proper names, such as the First Great
Awakening or the Great Migration. Names like this refer to large-scale heterogeneous
groupings of people and attitudes and processes. They are not directly observable (Strawson
1959, 44–45) and thus their contours are open to interpretation. To deal properly with these
more complex particular things, a gazetteer needs greater depth.
Deep Gazetteers
Rossum and Lavin’s (2000) study of definitions of the “Great Plains” region exemplifies
what an entry in a “deep” gazetteer might look like. They examined 50 published maps
identifying the “Great Plains” by means of a boundary line. Their goal was not to establish a
consensus on the boundaries of the region, but to treat each published interpretation
equally and observe the range of perspectives taken and judgments made. They observed a
wide variety of criteria used to delimit the region, which they classified broadly into
“physical” and “cultural” criteria. An example of the former is the eastern edge of the Rocky
Mountains, widely but not universally used as a criterion for defining the western boundary of
the Great Plains. Examples of cultural criteria include the edges of Native American tribal
territories or modern state boundaries. These different criteria resulted in widely varying
boundary definitions, which nonetheless shared a (relatively small) common “core.”
Analyzing the changing definitions over time, Rossum and Lavin observed a trend toward
more use of cultural characteristics as boundary-defining criteria, and an accompanying
growth in variance of the areas enclosed by the defined boundaries.
Rossum and Lavin’s example suggests a general form for deep gazetteers. Each entry in a
deep gazetteer is associated with a place name that has been given different definitions over
time. Each definition is a uniquely identifying description of the named place. These
descriptions might come from maps (as in Rossum and Lavin’s study), prose descriptions in
the text of scholarly publications, or ordinary “shallow” gazetteers. Each description is
associated with bibliographic data documenting when, where, and by whom it was
published. Because what is being uniquely identified is not just a spatiotemporal location
but also a particular definition of a spatiotemporal location, it is important that each
uniquely identifying description is “semantic,” reflecting the criteria used for the definition.
While coordinates might be sufficient for uniquely identifying a location, they are not
necessarily sufficient for uniquely identifying a definition of a location, because two
different people might use different defining criteria yet settle upon the same spatial
footprint. Conversely, two people might use the same criteria (e.g. the edge of the Rocky
Mountains) to define a boundary, yet differently translate that boundary into coordinates
(since the “edge” of a mountain range is itself subject to interpretation).
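One way to picture the general form of a deep gazetteer entry is as a name attached to a list of definitions, each carrying its criteria, its footprint, and its bibliographic source. The Python below is a minimal sketch under assumptions: the class names, the field names, and all of the sample authors, years, and coordinates are invented for illustration and are not taken from Rossum and Lavin's data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    """Bibliographic data: who published the definition, when, and where."""
    author: str
    year: int
    title: str


@dataclass(frozen=True)
class Definition:
    criteria: frozenset  # the "semantic" part: criteria used to draw the boundary
    footprint: tuple     # polygon vertices as (lon, lat) pairs
    source: Source


@dataclass
class DeepEntry:
    name: str
    definitions: list = field(default_factory=list)

    def history(self):
        """Definitions in order of publication."""
        return sorted(self.definitions, key=lambda d: d.source.year)


def same_footprint(a, b):
    return a.footprint == b.footprint


def same_criteria(a, b):
    return a.criteria == b.criteria


def same_definition(a, b):
    """Two definitions coincide only if both criteria and footprint agree."""
    return same_criteria(a, b) and same_footprint(a, b)


# Invented sample data.
box = ((-105.0, 35.0), (-95.0, 35.0), (-95.0, 49.0), (-105.0, 49.0))
other_box = ((-104.0, 35.0), (-95.0, 35.0), (-95.0, 49.0), (-104.0, 49.0))

d1 = Definition(frozenset({"physical: mountain front"}), box,
                Source("Author A", 1931, "Hypothetical map A"))
d2 = Definition(frozenset({"cultural: state boundaries"}), box,
                Source("Author B", 1975, "Hypothetical map B"))
d3 = Definition(frozenset({"physical: mountain front"}), other_box,
                Source("Author C", 1960, "Hypothetical map C"))

entry = DeepEntry("Great Plains", [d1, d2, d3])
print(same_footprint(d1, d2), same_definition(d1, d2))  # True False
print(same_criteria(d1, d3), same_footprint(d1, d3))    # True False
```

Comparing footprints alone would treat `d1` and `d2` as the same definition, which is exactly the conflation that recording criteria alongside coordinates is meant to prevent.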
Such a gazetteer could be made yet “deeper” by recording not only the published definitions
of a place, but all known citations or uses of each definition. This would serve two purposes.
First, it would link each gazetteer entry to a wider expanse of literature “downstream” from
the published definitions. This would enable a scholar to move from an appearance of a
name such as “Great Plains” in a text (whether or not the name is accompanied by a
definition or citation) to a history of uniquely identifying descriptions associated with the
term, and then from any one of those descriptions to a bibliography of works ratifying that
description. Second, it would provide some basis for comparing the relative influence of
various definitions and how it has changed over time. Rather than simply giving each
definition equal weight, a deep gazetteer could use citation data to identify “mainstream” or
“marginal” definitions.
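The idea of weighting definitions by citation can be sketched very simply. The Python below assumes each definition has a list of years in which works citing it were published, and it labels a definition "mainstream" when its share of all citations up to a given year reaches a threshold. The identifiers, the sample years, and the 0.25 threshold are all assumptions chosen for illustration; a real gazetteer would need a more careful measure of influence.

```python
def influence_shares(citation_years, as_of_year):
    """Share of all citations (published by as_of_year) going to each definition."""
    counts = {
        definition: sum(1 for year in years if year <= as_of_year)
        for definition, years in citation_years.items()
    }
    total = sum(counts.values())
    if total == 0:
        return {definition: 0.0 for definition in counts}
    return {definition: n / total for definition, n in counts.items()}


def classify(shares, threshold=0.25):
    """Label each definition by whether its citation share meets the threshold."""
    return {
        definition: "mainstream" if share >= threshold else "marginal"
        for definition, share in shares.items()
    }


# Invented citation data: definition id -> publication years of citing works.
citation_years = {
    "def-1": [1950, 1960, 1975, 1990],
    "def-2": [1985],
    "def-3": [1992, 1995, 2001],
}

print(classify(influence_shares(citation_years, as_of_year=1990)))
print(classify(influence_shares(citation_years, as_of_year=2005)))
```

Running the classification for two different years shows how a definition that was marginal at one time can later count as mainstream, which is the kind of change over time that citation data would let a deep gazetteer expose.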
Strawson observed that “proper names ... owe their referential utility to the complexity or
variousness, or both, of their descriptive hinterlands” (1974, 42). Different people typically
have different relationships to, and thus differently know, the persons, places, events and
other particular things about which they communicate. In order to communicate about a
thing they need to be able to identifyingly refer to it at different times and have confidence
that they are continuing to refer to the same thing. Yet given their different bases of
knowledge, there will likely be no one commonly known description of the thing that can
uniquely identify it. This is when it is useful to give the thing a proper name, which can
serve as a constant point of reference among communicators who know it differently
(Strawson 1974, 38). It is the varying yet overlapping sets of background knowledge
presupposed by each use of the name (what Strawson called its “descriptive hinterlands”)
that give it its referential utility. A deep gazetteer provides a map of these descriptive
hinterlands.
Implications for the Design of Gazetteer Services
Gazetteer services can be used for identifying proper names used in text or speech,
browsing through or searching over lists of names, finding names related to a given name,
and querying for named things by description. There is no reason in principle to treat places
differently from other named things that can be located in our spatiotemporal framework,
such as periods, events, individual persons and groups of people. This is not an exhaustive
list; one can imagine naming and spatiotemporally locating things like specific mentalités
(world-views, e.g. “American exceptionalism”) or technologies. (Many period names are
defined on the basis of and named after spatiotemporal patterns of technology use,
estimated on the basis of surviving artifacts.) Some named things, such as physical landscape
features or administrative divisions, might be precisely locatable at a specific time using a
global coordinate system. But in most cases it will be necessary to locate named things in
space and time by relating them to other named things (as when we partially locate a person
in time and space by relating her to her birthplace).
There are many ways that named things can be related to one another. Different people may
know different subsets of the relations that any one thing participates in, and thus have
different kinds of knowledge supporting their command of the name. This will be
particularly true of things that are “vague ideas” (Paasi 1986, 125) like regions. These kinds of
things can be identified as having location and duration, but estimates will vary depending
on the specific perspective taken. And because such things are always participating in some
historically contingent process of institutionalization (Paasi 1986), over time there may be
more or less consensus on the identity of the named thing. A deep gazetteer can trace the
history of how a name has been used to communicate. This history can provide some basis
for making distinctions between different stages in the production of the name’s symbolic
significance, or among the different meanings the name has had for different communicative
communities at a given time.
Gazetteers can build depth by gathering all known examples of name use, and documenting
what is known about the circumstances of each use (Southall et al. 2011, 141). Of course,
compilers of a deep gazetteer may intentionally restrict their scope. One gazetteer may
include only uses of names in scholarly publications; another may include any mention of a
name on Twitter, while another may include any record tagged with the name in a data
repository. Regardless of scope, however, the documentation should include a date and place
for each usage (e.g. when and where it was published), and any assertions made about the
name such as identifying it with a particular range of space or time (possibly indirectly or
approximately), relating it to other names, or associating it with general concepts (such as
categories). By documenting attestations, deep gazetteers can provide tools for historicizing
and contextualizing the processes through which their names have taken on symbolic
significance, and mapping or modeling degrees of consensus and ambiguity at different
points in this process (Mostern 2008, Southall et al. 2011).
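The kind of documentation described above can be sketched as a simple record structure. What follows is only an illustration in Python, not a schema proposed in this essay: the class names `Attestation` and `Assertion`, their fields, the helper `attestations_for`, and all of the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Assertion:
    """One claim made about a name in a particular use (hypothetical structure)."""
    kind: str   # e.g. "spatial_extent", "temporal_extent", "related_name", "category"
    value: str  # free text here; may be approximate or indirect


@dataclass
class Attestation:
    """A documented use of a name, with the circumstances of that use."""
    name: str
    source: str                  # the publication or record in which the name was used
    date: Optional[str] = None   # when the usage was published (four-digit year here)
    place: Optional[str] = None  # where the usage was published
    assertions: List[Assertion] = field(default_factory=list)


def attestations_for(name: str, attestations: List[Attestation]) -> List[Attestation]:
    """Return all recorded uses of a name, ordered by date, undated uses last."""
    matches = [a for a in attestations if a.name == name]
    return sorted(matches, key=lambda a: (a.date is None, a.date or ""))


# Invented sample data, for illustration only.
records = [
    Attestation("Great Plains", "Source B", "1950", "Chicago",
                [Assertion("category", "region")]),
    Attestation("Great Plains", "Source A", "1900", "New York",
                [Assertion("spatial_extent", "an approximate extent, as stated")]),
    Attestation("Great Plains", "Source C"),
    Attestation("Another Name", "Source D", "1920", "Boston"),
]

history = attestations_for("Great Plains", records)
```

Keeping each use as its own record, rather than merging uses into a single entry per place, is what would let such a gazetteer trace how the assertions attached to a name change over time.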
A deep gazetteer, unlike an ontology, is not intended to specify an unambiguous
standardized conceptualization of each thing it describes. Rather it is intended to map the
usage of a proper name by different people at different times in different contexts, and the
overlaps between the things picked out by these uses. A gazetteer is a tool for recording and
discovering common patterns of language use. These patterns can potentially be aggregated
at different levels, when it is desirable to characterize what passes for consensus on the
referent of a name at a particular time or in a particular discourse community. These
aggregations have no special status: they reflect regularities in processes of making meaning,
but there may be more than one way to characterize those regularities (Shaw 2013).
However, the fact that the referents of the names documented in a gazetteer are particulars
(things locatable in our shared spatiotemporal framework) may mean that regularities in
their patterns of use are easier to discern than they are for patterns of word use in general
(Strawson 1974, x).
Current best practices for publishing Linked Data force publishers to make decisions about
identity, in order to assign URIs to particular things. Typically publishers are also pushed to
make decisions about how to categorize these identified things under general concepts.
Thinking of gazetteers as tools for understanding (the history of) name usage suggests a
change of approach. Rather than trying to pin down whether “Great Plains” is one concept
or fifty concepts, I can remain agnostic about the precise outlines of the conceptual
divisions, and simply note that the name has been used at various times to make various
assertions (e.g. about where the boundaries of the region are). Counting uses of a name is
simpler than counting its referents. To answer the question of whether two different authors
referred to the “same” thing by their separate uses of the name “Great Plains,” I can make a
judgment based on what each use explicitly asserted, and what each use seemed to
presuppose, about the referent. If combining all the assertions made or presupposed
between the two uses leads to contradiction, I have some evidence to suggest that they
might not be referring to the same thing. But there is still no need for me to make a strong
denial of identity or to judge whether they are referring to different kinds of thing.
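The judgment described in this paragraph can be sketched in code under a strong simplifying assumption: that what each use asserts or presupposes is encoded as a mapping from a property to a value, and that two values contradict one another whenever they differ. Real comparison would call for scholarly judgment, so this is only an illustration. The names `Claims`, `conflicts`, and `evidence_of_difference`, and the sample values, are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: property -> value asserted or presupposed by one use.
Claims = Dict[str, str]


def conflicts(use_a: Claims, use_b: Claims) -> List[Tuple[str, str, str]]:
    """List the properties on which two uses of a name make differing claims.

    Contradiction is reduced to string inequality, which is a deliberate
    oversimplification made for the sake of the sketch.
    """
    return [
        (prop, use_a[prop], use_b[prop])
        for prop in sorted(use_a.keys() & use_b.keys())
        if use_a[prop] != use_b[prop]
    ]


def evidence_of_difference(use_a: Claims, use_b: Claims) -> str:
    """Report evidence only; never assert or deny identity outright."""
    if conflicts(use_a, use_b):
        return "some evidence of different referents"
    return "no contradiction found"


# Invented sample values, for illustration only.
author_1 = {"eastern_boundary": "98th meridian", "kind": "region"}
author_2 = {"eastern_boundary": "100th meridian", "kind": "region"}

found = conflicts(author_1, author_2)
verdict = evidence_of_difference(author_1, author_2)
```

The function returns a description of the evidence rather than a true-or-false verdict on identity, in keeping with the point that no strong denial of identity is required.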
Conclusions
When designing information services it is critical to think clearly about not only the
representation of the information but also the function of the service. The distinctive
function of a gazetteer service is to supply identifying descriptions for proper names. If we
conceive of gazetteers broadly, these may be the proper names not only of places, but also of
anything that it is possible to locate spatiotemporally. Some things, like periods or regions,
can be located only approximately, because spatial and temporal extents are subject to
interpretation. Proper names for such things are indispensable, as they allow us to
communicate about them without requiring us to agree on precise descriptions of them
(Searle 1958). The precise descriptions associated with a proper name can vary over time and
from person to person. A deep gazetteer is a tool for tracing these variations.
Current approaches to publishing data on the web are ill suited for providing information
services that mediate between discourse and the world. These approaches proceed from the
assumption that, given two instances of using a name, we can say definitively whether the
two uses refer to the same thing (in which case we can replace the two names with a single
identifier) or whether they refer to two different things (in which case we can replace the
two names with two different identifiers). This forces us to explicitly identify a thing before
we can communicate about it. But this is not how names work in natural discourse, where
we communicate about things by implicitly presupposing that identification is possible. We
ought not abandon the semantics of natural discourse simply to fit a convenient
formalization. By focusing our efforts on representing published statements that use names,
rather than trying to definitively identify and classify the things referred to by those names,
we can build gazetteers with breadth and depth: powerful tools for navigating the complex
relationships between what we say and where we are.
References
Ankersmit, Frank R. 1983. Narrative logic: A semantic analysis of the historian’s language. The
Hague: M. Nijhoff.
Buckland, Michael, Aitao Chen, Hui-Min Chen, Youngin Kim, Byron Lam, Ray Larson,
Barbara Norgard, Jacek Purat, and Fredric Gey. 1999. Mapping entry vocabulary to
unfamiliar metadata vocabularies. D-Lib Magazine 5 (1). http://www.dlib.org/dlib/january99/buckland/01buckland.html.
Elliott, Tom, and Sean Gillies. 2011. Pleiades: An un-GIS for ancient geography. In Digital
Humanities Conference 2011. Stanford. http://dh2011abstracts.stanford.edu/xtf/view?docId=tei/ab-192.xml.
Fitzke, Jens, and Rob Atkinson, eds. 2006. Gazetteer service - Application profile of the Web
Feature Service implementation specification. http://portal.opengeospatial.org/files/?artifact_id=15529.
Heath, Tom, and Christian Bizer. 2011. Linked data: evolving the Web into a global data
space. Synthesis Lectures on the Semantic Web: Theory and Technology 1 (1): 1–136. doi: 10.2200/S00334ED1V01Y201102WBE001.
Hjørland, Birger. 2007. Semantics and knowledge organization. Annual Review of Information
Science and Technology 41: 367–405. doi: 10.1002/aris.2007.1440410115.
Hodge, Gail. 2000. Systems of knowledge organization for digital libraries: Beyond
traditional authority files. Washington, DC: Council on Library and Information
Resources. http://www.clir.org/pubs/reports/pub91.
Hudson, Charles. 2005. The Juan Pardo expeditions: Explorations of the Carolinas and Tennessee,
1566-1568. Tuscaloosa: University of Alabama Press.
Lambe, Patrick. 2007. Organising knowledge: Taxonomies, knowledge and organisational
effectiveness. Oxford: Chandos.
Lamere, Paul. 2012. Roadtrip mixtape. Music Machinery, June 17. http://musicmachinery.com/2012/06/17/roadtrip-mixtape/.
Lei Zeng, Marcia. 2008. Knowledge organization systems (KOS). Knowledge Organization 35
(2–3): 160–182.
Mostern, Ruth. 2008. Historical gazetteers: An experiential perspective, with examples from
Chinese history. Historical Methods 41 (1): 39–46. doi: 10.3200/HMTS.41.1.39-64.
Pardo, Juan. 2003. Account of Florida, 1566-1568. Wisconsin Historical Society; Smithsonian
Institution Press.
Paasi, Anssi. 1986. The institutionalization of regions: a theoretical framework for
understanding the emergence of regions and the constitution of regional identity.
Fennia 164 (1): 105–146.
Reimer, Marga. 2010. Reference. The Stanford encyclopedia of philosophy. http://plato.stanford.edu/archives/spr2010/entries/reference/.
Rossum, Sonja, and Stephen Lavin. 2000. Where are the Great Plains? A cartographic
analysis. The Professional Geographer 52 (3): 543–552. doi: 10.1111/0033-0124.00245.
Searle, John R. 1958. Proper names. Mind LXVII (266): 166–173. doi: 10.1093/mind/LXVII.266.166.
Shaw, Ryan. 2013. Information organization and the philosophy of history. Journal of the
American Society for Information Science and Technology 64 (6): 1092–1103. doi: 10.1002/asi.22843.
Southall, Humphrey, Ruth Mostern, and Merrick Lex Berman. 2011. On historical
gazetteers. International Journal of Humanities and Arts Computing 5 (2): 127–145. doi: 10.3366/ijhac.2011.0028.
Strawson, P. F. 1950. On referring. Mind 59 (235): 320–344. http://www.jstor.org/stable/2251176.
Strawson, P. F. 1959. Individuals: An essay in descriptive metaphysics. London: Methuen.
Strawson, P. F. 1974. Subject and predicate in logic and grammar. London: Methuen.
Tudhope, Douglas, and Traugott Koch. 2004. New applications of knowledge organization
systems. Journal of Digital Information 4 (4). http://journals.tdl.org/jodi/index.php/jodi/article/viewArticle/109/108.