
Kai Sun
University at Buffalo, The State University of New York | SUNY Buffalo · Department of Geography
Doctor of Philosophy
28
Publications
5,348
Reads
193
Citations
Publications (28)
A geoscience knowledge graph (GKG) organizes geoscience knowledge into a machine-understandable, computable semantic network, making it an effective basis for knowledge-related services. As a result, it has gained significant attention and become a frontier in geoscience. Geoscience knowledge is derive...
Cross-validation (CV) has been widely used in GeoAI research to evaluate the performance of machine learning models. Often, a labeled data set is randomly split into training and validation data, and a machine learning model is trained on the training data and then evaluated on the validation data in an iterative manner. Such a random CV approach c...
Point of interest (POI) data provide digital representations of places in the real world, and have been increasingly used to understand human-place interactions, support urban management, and build smart cities. Many POI datasets have been developed, which often have different geographic coverages, attribute focuses, and data quality. From time to...
Landslide susceptibility assessment is an important means of helping to reduce and manage landslide risk. The existing studies, however, fail to examine the spatially varying relationships between landslide susceptibility and its explanatory factors. This paper investigates the spatial variation in such relationships in Liangshan, China, leveraging...
The risk of coal mine accidents rises significantly with mining depth, making it urgent for accident prevention to be supported by both scientific analysis and advanced technologies. Hence, a comprehensive grasp of the research progress and differences in hotspots of coal mine accidents in China serves as a guide to find the shortcomings of studies...
Geo-parsing, one of the key components of geographical information retrieval, is the process of recognizing and geo-locating toponyms mentioned in texts. With continually updated neural network models and multiple contextual features, such a process can successfully resolve the locations that toponyms refer to. The significant offset distance between the geo...
Time is an essential reference system for recording objects, events, and processes in the field of geosciences. There are currently various time references, such as solar calendar, geological time, and regional calendar, to represent the knowledge in different domains and regions, which subsequently entails a time conversion process required to int...
The investigation of Ecological Agriculture (EA) patterns can reveal the differences, aggregation, and diversity of agricultural development, providing specific paths in agricultural development and environmental protection to achieve the Sustainable Development Goals. Although field surveys, literature analysis, and the method using administrative...
African swine fever (ASF) has spread to many countries in Africa, Europe and Asia in the past decades. However, the potential geographic extent of ASF infection is unknown. Here we combined a modeling framework with the assembled contemporary records of ASF cases and multiple covariates to predict the risk distribution of ASF at a global scale. Loc...
Toponym recognition, the extraction of toponyms from natural language texts, is a fundamental task in ubiquitous geographic information applications. Existing toponym recognition methods with state-of-the-art performance mainly leverage supervised learning (i.e., deep-learning-based approaches) with parameters learned from massive, labeled da...
Geospatial data is an indispensable data resource for research and applications in many fields. The technologies and applications related to geospatial data are constantly advancing and updating, so identifying the technologies and applications among them will help foster and fund further innovation. Through topic analysis, new research hotspots ca...
To understand how related research is evolving in response to COVID-19 and to facilitate the containment of COVID-19, this paper accurately extracted the spatial and topic information from the metadata of COVID-19-related papers using text mining techniques, and with the extracted information, the research evolution was analyzed from th...
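A geographic knowledge base is a collection of geographic entities and the relationships among them, and it plays an important supporting role in knowledge services such as intelligent search, question answering, and recommendation. However, because existing geographic knowledge bases differ in source, form, and builder, they exhibit semantic heterogeneity in entity toponyms, spatial locations, and categories, in the form of "same meaning, different form" and "same form, different meaning", which hinders knowledge fusion and sharing across geographic knowledge bases. Semantic alignment is an effective way to resolve semantic heterogeneity; entity category alignment is its foundation and plays an important role in improving the alignment accuracy of entity toponyms and spatial locations. Existing entity category alignment methods mainly measure category similarity with traditional character similarity and structural similarity, so they cannot capture the deeper semantic relatedness of entity categories, which limits the accuracy of category alignment. This paper therefore proposes a word-embedding-based method for aligning geographic entity categories: a word embedding model learns the semantic information of entity categories from a corpus and expresses it as word vectors, compensating for the shortcomings of existing methods and thereby improving entity alignment accuracy. ...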
Historical maps contain rich geographic information about the past of a region. They are sometimes the only source of information before the availability of digital maps. Despite their valuable content, it is often challenging to access and use the information in historical maps, due to their forms of paper-based maps or scanned images. It is even...
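The massive amount of tourism information on the Internet offers great convenience to tourists, but the semantic heterogeneity of that information also troubles them. Research on tourism ontology is important for resolving the semantic heterogeneity of tourism information and achieving its consistent representation. This paper designs a full-element tourism ontology model: it systematically analyzes the research objects of the tourism domain and proposes a conceptual model of tourism information; on this basis, it designs a full-element tourism ontology model comprising the four-tuple of concepts, attributes, relations, and instances. Taking Shandong Province as an example, the Shandong tourism ontology is built with a modular method under the guidance of the full-element tourism ontology model. The empirical study finds that the full-element tourism ontology can provide a unified, shareable ontology model for tourism ontology research, and that the modular construction method avoids repeatedly building general-purpose foundational sub-ontologies. The results offer a useful reference for the research and construction of tourism ontologies.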
Effective integration and wide sharing of geospatial data is an important and basic premise to facilitate the research and applications of geographic information science. However, the semantic heterogeneity of geospatial data is a major problem that significantly hinders geospatial data integration and sharing. Ontologies are regarded as a promisin...
Geographic knowledge bases (GKBs) drawn from multiple sources and forms are markedly heterogeneous, which hinders the integration of geographic knowledge. Entity alignment provides an effective way to find correspondences of entities by measuring the multidimensional similarity between entities from different GKBs, thereby overcoming the semantic gap....
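Data provenance is an important reference factor for evaluating data reliability and an important research topic in geospatial data ontology. Focusing on provenance as an important research object of geospatial data, this paper systematically analyzes the meaning of geospatial data provenance and establishes a geospatial data provenance ontology model. On this basis, it proposes the concept system of the provenance ontology and a formal representation of the relationships between provenance entities and their attributes, and constructs the geospatial data provenance ontology. Finally, taking the data of the "Special Program on Basic Works for Science and Technology" projects as an example, it uses RDF, based on the provenance ontology base, to semantically link data from the provenance perspective, and uses the web front-end framework D3.js to visualize the data together with their provenance information. The results show that provenance-ontology-based data linking can effectively address non-standard descriptions of data provenance and can support applications such as semantic retrieval and intelligent recommendation of geoscience data, providing a new method and approach for promoting geoscience data sharing and linked-data applications.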
The complexity of geographic modelling is increasing; hence, preparing data to drive geographic models is becoming a time-consuming and difficult task that may significantly hinder the application of such models. Meanwhile, a huge number of data sets have been shared and have become publicly accessible through the Internet. This study presents a da...
Linked Data is known as one of the best solutions for multisource and heterogeneous web data integration and discovery in this era of Big Data. However, data interlinking, which is the most valuable contribution of Linked Data, remains incomplete and inaccurate. This study proposes a multidimensional and quantitative interlinking approach for Linke...
Semantic heterogeneity of scientific data is the main bottleneck for its integration and sharing. Data ontology is an effective way to solve the semantic heterogeneity of data. On the basis of a systematic analysis of geodata characteristics, this paper puts forward the overall architecture of GeoData Ontology (GDO) and mainly studies essential characteristic...
Collecting and standardizing the data resources of national science and technology projects is of great significance for promoting public data sharing, realizing the full value of the data, and maximizing the benefit of state investment in science and technology projects. National Special Program on Basic Works for Science and Techno...
The semantic heterogeneity of geospatial data is the main bottleneck for realizing data association, intelligent recommendation, and accurate data discovery. Geospatial data ontology is known as an effective approach to solving the semantic heterogeneity of geospatial data. The morphological characteristic is an important feature o...
Semantic heterogeneity of geospatial data is the main bottleneck for implementing linked data, intelligent recommendation, and accurate data discovery. Ontology theory offers an effective way to resolve the semantic heterogeneity of data. Morphological characteristics are an important research topic in the semantic heterogeneity of data. This paper m...