Simon Scheider
Utrecht University | UU · Department of Human Geography and Spatial Planning
Dr. rer. nat.
About
120
Publications
53,174
Reads
1,576
Citations
Introduction
Simon Scheider is an associate professor at the Department of Human Geography and Spatial Planning, Utrecht University. He does research in Geographic Information Science, Artificial Intelligence, and Data Mining.
Additional affiliations
August 2014 - December 2015
Publications (120)
The concept of validity is a cornerstone of science. Given this central role, it is somewhat surprising to find that validity remains a rather obscure concept. Unfortunately, the term is often reduced to a matter of ground truth data, seemingly because we fail to come to grips with it. In this paper, instead, we take a purpose-based approach to the...
Exposure is a central concept of the health and behavioural sciences needed to study the influence of the environment on the health and behaviour of people within a spatial context. While an increasing number of studies measure different forms of exposure, including the influence of air quality, noise, and crime, the influence of land cover on phys...
Human behavior may be one of the most challenging phenomena to model and validate. This paper proposes a method for automatically extracting and compiling evidence on human behavior determinants into a knowledge graph. The method (1) extracts associations of behavior determinants and choice options in relation to study groups and moderators from pu...
Scenario microsimulations like agent-based models can account for feedbacks and spatiotemporal and social heterogeneity when projecting future intervention impacts. Addressing air pollution exposure requires traffic scenario models (e.g. of car-free zones).
Traditional air pollution models do not meet all requirements for traffic scenario microsimu...
The term homeomerosity refers to when a whole and its parts are the same kind of thing. For instance, a computer and its processor can both be classified as machines. Homeomerosity is a prerequisite for meaningful addition and subtraction. For example, adding the area sizes of two independent regions gives another area size, but adding an area size...
Transformations are essential for dealing with geographic information. They are involved not only in the conversion between geodata formats and reference systems, but also in turning geodata into useful information according to some purpose. However, since a transformation can be implemented in various formats and tools, its function and purpose us...
The recent success of large language models and AI chatbots such as ChatGPT in various knowledge domains has a severe impact on teaching and learning Geography and GIScience. The underlying revolution is often compared to the introduction of pocket calculators, suggesting analogous adaptations that prioritize higher-level skills over other learning...
There is an increasing trend of applying AI-based automated methods to geoscience problems. An important example is geographic question answering (geoQA) focused on answer generation via GIS workflows rather than retrieval of a factual answer. However, a representative question corpus is necessary for developing, testing, and validating such gener...
Taken literally, geoAI is the use of Artificial Intelligence methods and techniques in solving geo-spatial problems. Similar to AI more generally, geoAI has seen an influx of new (big) data sources and advanced machine learning techniques, but also a shift in the kind of problems under investigation. In this article, we highlight some of these chan...
Krzysztof Janowicz is a professor for Geoinformatics at the University of Vienna and the University of California, Santa Barbara. His research focuses on how humans conceptualize the space around them based on their behavior, focusing particularly on regional and cultural differences, with the goal of assisting machines to better understand the in...
Current artificial intelligence (AI) approaches to handle geographic information (GI) reveal a fatal blindness for the information practices of exactly those sciences whose methodological agendas are taken over with earth-shattering speed. At the same time, there is an apparent inability to remove the human from the loop, despite repeated efforts....
Transformations are essential for dealing with geographic information. They are involved not only in converting between geodata formats and reference systems, but also in turning geodata into useful information according to some purpose. However, since a transformation can be implemented in various formats and tools, its function and purpose usuall...
Understanding the role of humans in environmental change is one of the most pressing challenges of the 21st century. Environmental narratives – written texts with a focus on the environment – offer rich material capturing relationships between people and surroundings. We take advantage of two key opportunities for their computational analysis: mass...
The Netherlands have a problem regarding quality as well as equality in their school system. Many students fail to reach minimal skill levels and underachieve with respect to their own learning capacities. They end up on educational levels that do not correspond to their learning potential, especially when their parents do not speak the languag...
With ever more people living in cities worldwide, it becomes increasingly important to understand and improve the impact of the urban habitat on livability, health behaviors and health outcomes. However, implementing interventions that tackle the exposome in complex urban systems can be costly and have long-term, sometimes unforeseen, impacts. Henc...
Due to the increasing prevalence and relevance of geo-spatial data in the age of data science, Geographic Information Systems are enjoying wider interdisciplinary adoption by communities outside of GIScience. However, properly interpreting and analysing geo-spatial information is not a trivial task due to knowledge barriers. There is a need for a t...
The next generation of Geographic Information Systems (GIS) is anticipated to automate some of the reasoning required for spatial analysis. An important step in the development of such systems is to gain a better understanding and corresponding modeling practice of when to apply arithmetic operations to quantities. The concept of extensivity plays...
In this article, we examined the potential of the current version of WordNet and Google Translate API to enhance the quality of geodata source retrieval in the Dutch geoinformation portal (PDOK) using semantic keywords for the geographic phenomena requested. Keywords gathered from real users’ questions in natural language extracted in an English co...
Natural language Interfaces (NLIs) have the ability to make Geographic Information Systems more accessible for interdisciplinary researchers or any inexperienced users. However, the majority of research on NLIs for GIS explored NLIs for visualization or spatial data retrieval. Research on NLIs for geo-analytical questions is still lacking. Google B...
Geographic Question Answering (GeoQA) systems can automatically answer questions phrased in natural language. Potentially this may enable data analysts to make use of geographic information without requiring any GIS skills. However, going beyond the retrieval of existing geographic facts on particular places remains a challenge. Current systems usu...
Why do some neighborhoods thrive, and others do not? While the importance of the local amenity mix has been established as a key determinant of local livability, its link to urban transport infrastructure remains understudied, partially due to a lack of data. Using spatiotemporal social media data from Foursquare, we analyze the impact of metro sta...
Spatial network analysis is a collection of methods for measuring accessibility potentials as well as for analyzing flows over transport networks. Though it has been part of the practice of geographic information systems for a long time, designing network analytical workflows still requires a considerable amount of expertise. In principle, artifici...
Slides of the keynote talk given by Simon at the 11th International Conference on Geographical Information Science, Poznań, 27-30 September 2021.
The talk was recorded and is available on YouTube at:
https://www.youtube.com/watch?v=jA3lBeWAWEQ
Spatial network analysis is a collection of methods for measuring accessibility potentials as well as for analyzing flows over transport networks. Though it has been part of the practice of Geographic Information Systems (GIS) for a long time, designing network analytical workflows still requires a considerable amount of expertise. In principle, Ar...
Running is a popular form of physical activity. Personal, social, and environmental determinants influence the engagement of the individual. To gain insight into the relation between running behavior and external situations for different types of users, we carried out an extensive data mining study on large-scale datasets. We combined 4 years of histo...
User-generated content provides rich and easily accessible data for tourism destination managers, especially when combined with a sentiment analysis to uncover perceptions and attitudes. These reviews are often primarily useful in a business/attraction-context and scaling up their relevance for destination management is problematic. Furthermore, th...
Loose programming enables analysts to program with concepts instead of procedural code. Data transformations are left underspecified, leaving away procedural details and exploiting knowledge about the applicability of functions to data types. To synthesize workflows of high quality for a geo-analytical task, the semantic type system needs to reflec...
“Data Science” has taken many disciplines by storm. And for a good reason: New forms and unseen quantities of data enter nearly every scientific field, substantially changing the ways scientists do science, and potentially allowing them to answer old questions or to pose them in novel ways. The recent success of Data Science is also reflected i...
Understanding syntactic and semantic structure of geographic questions is a necessary step towards true geographic question-answering (GeoQA) machines. The empirical basis for the understanding of the capabilities expected from GeoQA systems are geographic question corpora. Available corpora in English have been mostly drawn from generic Web search...
In geographic information systems (GIS), analysts answer questions by designing workflows that transform a certain type of data into a certain type of goal. Semantic data types help constrain the application of computational methods to those that are meaningful for such a goal. This prevents pointless computations and helps analysts design effectiv...
This article compares the spatio-temporal concentration and dispersion of day trippers and tourists from Shenzhen as well as from the rest of Mainland China in Hong Kong using Weibo check-in data. The results show that hotspots of visitors from the rest of Mainland China are mostly concentrated in downtown areas, while visitors from Shenzhen also g...
Question Answering (QA), the process of computing valid answers to questions formulated in natural language, has recently gained attention in both industry and academia. Translating this idea to the realm of geographic information systems (GIS) may open new opportunities for data scientists. In theory, analysts may simply ask spatial questions to e...
Background
Our understanding of how food choices are affected by exposure to the food environment is limited, and there are important gaps in the literature. Recently developed smartphone-based technologies, including global positioning systems and ecological momentary assessment, enable these gaps to be filled.
Objective
We present the FoodTrack...
Search engines make information about places available to billions of users, who explore geographic information for a variety of purposes. The aggregated, large-scale search behavioural statistics provided by Google Trends can provide new knowledge about the spatial and temporal variation in interest in places. Such search data can provide useful k...
Spatial point tracks are of concern for an increasing number of analysts studying spatial behaviour patterns and environmental effects. Take an epidemiologist studying the behavior of cyclists and how their health is affected by the city's air quality. The accuracy of such analyses critically depends on the positional accuracy of the tracked points...
The increasing availability of geospatial data offers great opportunities for advancing scientific discovery and practices in society. However, the massive volume and the heterogeneous, distributed nature of global geospatial data pose challenges in geospatial information processing and computing. This chapter introduces three technologies for geospat...
The realization that knowledge often forms a densely interconnected graph has fueled the development of graph databases, Web-scale knowledge graphs and query languages for them, novel visualization and query paradigms, as well as new machine learning methods tailored to graphs as data structures. One such example is the densely connected and global...
We analyzed a large data set from a mobile exercise application to find the preferred running situations of a large number of users. We categorized the users according to their running behaviors (i.e. regularly active, or rarely active over the year), then studied the influence of 15 features, including temporal, geographical and weather-based feat...
A most fundamental and far-reaching trait of geographic information is the distinction between extensive and intensive properties. In common understanding, originating in Physics and Chemistry, extensive properties increase with the size of their supporting objects, while intensive properties are independent of this size. It has long been recognize...
Web data is the most prominent source of information for deciding where to go and what to do. Exploiting this source for geographic analysis, however, does not come without difficulties. First, in recent years, the amount and diversity of available Web information about urban space have exploded, and it is therefore increasingly difficult to overvi...
Geographic information has become central for data scientists of many disciplines to put their analyses into a spatio-temporal perspective. However, just as the volume and variety of data sources on the Web grow, it becomes increasingly hard for analysts to be familiar with all the available geospatial tools, including toolboxes in Geographic Inf...
Every day, practitioners, researchers, and students consult the Web to meet their information needs about GIS concepts and tools. How do we improve GIS in terms of conceptual organisation, findability, interoperability and relevance for user needs? So far, efforts have been mainly top-down, overlooking the actual usage of software and tools. In thi...
In everyday communication, people effortlessly translate between spatial cognitive frames of reference. For example, a tourist guide translates from a map (“the fountain is north-west of the church”) into a cognitive frame for a tourist (“the fountain in front of the church”). While different types of cognitive reference frames and their relevance...
While location-based games (LBGs) have been around for some time, only a few of them have succeeded in attracting a larger number of players. One reason is the difficulty of suitably embedding game concepts in an environment. In order to reach players from different places, LBG concepts need to be relocalized in a way which preserves the particula...
Using cognitive linguistic strategies, people can verbally encode and convey their spatial realities with little effort (e.g., “my house is right across the street from the grocery store”). However, to date there are a limited number of ways to transform such spatial information into forms that are useful for computational analysis in a geographic i...
Currently, systems that let people search for opportunities to fulfill their spatio-temporal needs are built according to the conceptual model of service provider and consumer: After the providers make their needs publicly available, consumers use a specifically tailored query engine to find fitting offers. E.g., in carpooling, someone wants to fil...
In Geographic Information Systems (GIS), geoprocessing workflows allow analysts to organize their methods on spatial data in complex chains. We propose a method for expressing workflows as linked data, and for semi-automatically enriching them with semantics on the level of their operations and datasets. Linked workflows can be easily published on...
In this article, we critically examine the role of semantic technology in data driven analysis. We explain why learning from data is more than just analyzing data, including also a number of essential synthetic parts that suggest a revision of George Box’s model of data analysis in statistics. We review arguments from statistical learning under unc...
The linked data Web provides a simple and flexible way of accessing information resources in a self-descriptive format. This offers a realistic chance of perforating existing data silos. However, in order to do so, space, time and other semantic concepts need to function as dimensions for effectively exploring, querying and filtering contents. Whil...
How can data analysts identify spatio-temporal datasets that are suitable for their task? Answering this question is not only dependent on the aim of the analysis and the semantic contents of the data, but also on knowing whether the required data combinations and transformations, spatio-temporal analysis methods, charts and map visualizations ar...
Private transport accounts for a large amount of total CO2 emissions, thus significantly contributing to global warming. Tools that actively support people in engaging in a more sustainable life-style without restricting their mobility are urgently needed. How can location-aware information and communication technology (ICT) enable novel interactiv...
Maintaining knowledge about the provenance of datasets, that is, about how they were obtained, is crucial for their further use. Contrary to what the overused metaphors of ‘data mining’ and ‘big data’ are implying, it is hardly possible to use data in a meaningful way if information about sources and types of conversions is discarded in the process...
Wayfinding models can be helpful in describing, understanding, and technologically supporting the processes involved in navigation. However, current models either lack a high degree of formalization, or they are not holistic and perceptually grounded, which impedes their use for cognitive engineering. In this paper, we propose a novel formalism tha...
This poster outlines a system that is able to publish and match complementary spatio-temporal needs of people, e.g., the need for carpooling.
We outline a system that is able to publish and match complementary spatio-temporal needs of people, e.g., the need for carpooling. Key points discussed are the modeling and publishing of needs, their specification by the user, and the efficient processing of match queries.
One important issue in developing assistive navigation systems for people with disability is the accuracy and relevancy of the systems’ knowledge bases from the perspective of these special user groups. The theory of affordances coupled with computer-based simulation offers a solution for automating the extraction of the relevant information from r...
In this paper, we provide an overview on the design of scores that can be used in gamification and sketch how user behavior can be influenced by design and communication.
What exactly does interoperability mean in the context of information science? Which entities are supposed to interoperate, how can they interoperate, and when can we say they are interoperating? This question, crucial to assessing the benefit of semantic technology and information ontologies, has been understood so far primarily in terms of standa...
Increasingly large amounts of data are being generated by technical sensors distributed in the human environment. However, a naked sensor value alone is meaningless. It lacks crucial meta-information, including the support and spatio-temporal resolution, and more generally the observation and interpretation process in which it is embedded. In order...
This paper discusses counter-measures for cognitive biases in maps based on pragmatic communication. We argue that communicative measures can be used to either increase bias awareness or to switch the representation to a form which avoids particular biases depending on the task.
We argue that current technical and legal attempts aimed at protecting Geoprivacy are insufficient. We propose a novel 2-dimensional model of privacy, which we term "civilized cyberspace". On one dimension there are engineering, social and legal tools while on the other there are different kinds of interaction with information. We argue why such a...
The workshop on computational models of place was held for the first time in the context of the ACM SIGSPATIAL conference series in 2013. A workshop with a similar focus (PLACE'08) was held before at GIScience 2008 in Park City. Recently it has become apparent that a workshop specifically dedicated to computational approaches to place is required in or...