About
Publications: 202
Reads: 56,463
Citations: 8,369
Introduction
Current institution
Additional affiliations
December 2013 - December 2014
September 2017 - present
December 2009 - present
Publications (202)
Accurate and real-time traffic forecasting plays an important role in the intelligent traffic system and is of great significance for urban traffic planning, traffic management, and traffic control. However, traffic forecasting has always been considered an "open" scientific issue, owing to the constraints of urban road network topological structur...
Enabling a neural network to sequentially learn multiple tasks is of great significance for expanding the applicability of neural networks in real-world applications. However, artificial neural networks face the well-known problem of catastrophic forgetting. What is worse, the degradation of previously learned skills becomes more severe as the task...
Graph neural networks (GNNs) have achieved great success in many graph-based tasks. Much work is dedicated to empowering GNNs with adaptive locality ability, which enables the measurement of the importance of neighboring nodes to the target node by a node-specific mechanism. However, the current node-specific mechanisms are deficient in distinguish...
Haifeng Li, Yi Li, Guo Zhang, [...], Chao Tao
Recently, supervised deep learning has achieved great success in remote sensing image (RSI) semantic segmentation. However, supervised learning for semantic segmentation requires a large number of labeled samples, which are difficult to obtain in the field of remote sensing. A new learning paradigm, self-supervised learning (SSL), can be used to s...
The existing methods learn geographic network representations through deep graph neural networks (GNNs) based on the i.i.d. assumption. However, the spatial heterogeneity and temporal dynamics of geographic data make the out-of-distribution (OOD) generalisation problem particularly salient. The latter are particularly sensitive to distribution shif...
Investigating causal interactions between entities is a crucial task across various scientific domains. The traditional causal discovery methods often assume a predetermined causal direction, which is problematic when prior knowledge is insufficient. Identifying causal directions from observational data remains a key challenge. Causal discovery typ...
The increasing reliance on deep neural network-based object detection models in various applications has raised significant security concerns due to their vulnerability to adversarial attacks. In physical 3D environments, existing adversarial attacks that target object detection (3D-AE) face significant challenges. These attacks often require large...
In remote sensing scene classification, leveraging the transfer methods with well-trained optical models is an efficient way to overcome label scarcity. However, cloud contamination leads to optical information loss and significant impacts on feature distribution, challenging the reliability and stability of transferred target models. Common soluti...
Urban sensing has become increasingly important as cities evolve into the centers of human activities. Large language models (LLMs) offer new opportunities for urban sensing based on commonsense and worldview that emerged through their language-centric framework. This paper illustrates the transformative impact of LLMs, particularly in the potentia...
RGB, multispectral, point and other spatio-temporal modal data fundamentally represent different observational approaches for the same geographic object. Therefore, leveraging multimodal data is an inherent requirement for comprehending geographic objects. However, due to the high heterogeneity in structure and semantics among various spatio-tempor...
An intelligent understanding model of a remote sensing image will present different visual representations of the same object in the remote sensing image, under the interference of offset factors, such as weather and season. This variability adversely affects the generalization ability of the model; therefore, an open challenge is how to learn inva...
Remote sensing change detection (RS-CD) relies on the model’s ability to learn features of marked change objects, known as foreground targets. Beyond foreground targets, the background targets are more valuable samples for change detection, such as unlabelled ones, semantically ambiguous ones, pseudo changes, and non-interesting changes, referred t...
Although significant advances have been made in the semantic segmentation of high-resolution remote sensing images, obtaining accurate pixel-wise annotations remains resource-intensive. We propose SparseFormer, a credible dual-CNN expert-guided Transformer model designed for semantic segmentation using point-level annotations to reduce this annotat...
Deep neural network-based synthetic aperture radar (SAR) automatic target recognition (ATR) systems are susceptible to attack by adversarial examples, which leads to misclassification by the SAR ATR system, resulting in theoretical model robustness problems and security problems in practice. Inspired by optical images, current SAR ATR adversarial e...
Existing works typically treat spatial-temporal prediction as the task of learning a function $F$ to transform historical observations to future observations. We further decompose this cross-time transformation into three processes: (1) Encoding ($E$): learning the intrinsic representation of observations, (2) Cross-Time Mapping ($M$): transforming...
Scene graph generation (SGG) in satellite imagery (SAI) helps promote understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it attractive to holistically c...
The existing change detection (CD) methods can be summarized as the visual-first change detection (ViFi-CD) paradigm, which first extracts change features from visual differences and then assigns them specific semantic information. However, CD is essentially dependent on change regions of interest (CRoIs), meaning that the CD results are directly d...
Graph neural networks (GNNs) have been highly successful in graph representation learning. The goal of GNNs is to enrich node representations by aggregating information from neighboring nodes. Much work has attempted to improve the quality of aggregation by introducing a variety of graph information with representational capabilities. The class of...
End-to-end interpretation is currently the prevailing paradigm for remote sensing fine-grained ship classification (RS-FGSC) task. However, its inference process is uninterpretable, leading to criticism as a black box model. To address this issue, we propose a large vision-language model (LVLM) named IFShip for interpretable fine-grained ship class...
Scene graph generation (SGG) in satellite imagery (SAI) helps promote understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it attractive to holistically c...
The existing change detection (CD) methods can be summarized as the visual-first change detection (ViFi-CD) paradigm, which first extracts change features from visual differences and then assigns them specific semantic information. However, CD is essentially dependent on change regions of interest (CRoIs), meaning that the CD results are directly de...
Due to the different acquisition conditions, large variations in the feature distributions of two temporal domains generally exist, known as temporal domain shift. The temporal domain shift is primarily influenced by coupled dual-factor: global style variations (such as illumination and weather conditions) and local style variations (such as the in...
Intelligent remote sensing image understanding is undergoing a profound paradigm shift driven by multi-modal large language models (MLLMs): from learning a domain model (LaDM) to learning a pre-trained general foundation model followed by an adaptive domain model (LaGD). Under the new...
Scene graph generation (SGG) in satellite imagery (SAI) helps promote intelligent understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it necessary to hol...
Predicting urban morphology based on local attributes is an important issue in urban science research. The deep generative models represented by generative adversarial network (GAN) models have achieved impressive results in this area. However, in such methods, the urban morphology is assumed to follow a specific probability distribution and be abl...
Ming Zhang, Xin Gu, Ji Qi, [...], Haifeng Li
The self-supervised learning (SSL) technique, driven by massive unlabeled data, is expected to be a promising solution for semantic segmentation of remote sensing images (RSIs) with limited labeled data, revolutionizing transfer learning. Traditional ‘local-to-local’ transfer from small, local datasets to another target dataset plays an ever-shrink...
In synthetic aperture radar (SAR) imaging, intelligent object detection methods are facing significant challenges in terms of model robustness and application security, which are posed by adversarial examples. The existing adversarial example generation methods for SAR object detection can be divided into two main types: global perturbation attacks...
Despite the success of deep learning in land cover classification, high-resolution (HR) land cover mapping remains challenging due to the time-consuming and labor-intensive process of collecting training samples. Many global land cover products (LCP) can reflect the low-level commonality (LLC) knowledge of land covers, such as basic shape and under...
Unmanned aerial vehicle (UAV) imaging object detection systems based on deep neural networks are vulnerable to adversarial patch attacks. However, existing UAV image adversarial patch generation methods mainly target flat digital images, neglecting the adjustments to the adversarial patch morphology brought about by changes in the imaging projectio...
Effective feature representation is pivotal in numerous remote sensing image (RSIs) interpretation tasks. Notably, a distinct attribute of RSIs is their inclination toward multi-scale feature dependence. Previous research predominantly focuses on designing intricate and complex networks or modules to encapsulate rich multi-scale features. However,...
Temporal-based methods effectively improve the utilization rate of remote sensing images, but large ratios of missing information still need to be improved in the reconstruction models. In this paper, based on the imaging theory with the help of a radiation correction model, a decoupling-reconstruction network (DecRecNet) for image reconstruction i...
With the continuous improvement in the volume and spatial resolution of remote sensing images, the self-supervised contrastive learning paradigm driven by a large amount of unlabeled data is expected to be a promising solution for large-scale land cover classification with limited labeled data. However, due to the richness and scale diversity of gr...
Learning from a sequence of tasks for a lifetime is essential for an agent toward artificial general intelligence. Despite the explosion of this research field in recent years, most work focuses on the well-known catastrophic forgetting issue. In contrast, this work aims to explore knowledge-transferable lifelong learning without storing historical...
Contrastive learning techniques make it possible to pretrain a general model in a self-supervised paradigm using a large number of unlabeled remote sensing images. The core idea is to pull positive samples defined by data augmentation techniques closer together while pushing apart randomly sampled negative samples to serve as supervised learning si...
Self-supervised contrastive learning (SSCL) has achieved significant milestones in remote sensing image (RSI) understanding. Its essence lies in designing an unsupervised instance discrimination pretext task to extract image features from a large number of unlabeled images that are beneficial for downstream tasks. However, existing instance discrim...
Automatic and periodic recompiling of building databases with up-to-date high-resolution images has become a critical requirement for rapidly developing urban environments. However, the architecture of most existing approaches for change extraction attempts to learn features related to changes but ignores objectives related to buildings. This inevi...
The open access journal Remote Sensing (IF: 4.509, ISSN 2072-4292) is pleased to announce that we have launched a new Special Issue entitled “Geospatial Foundation Model in Urban Environments: Challenges and New Technologies”
Given the depth of your expertise in this field, I would like to cordially invite you to contribute an article to the Speci...
Graph contrastive learning (GCL) is a promising direction toward alleviating the label dependence, poor generalization and weak robustness of graph neural networks, learning representations with invariance, and discriminability by solving pretasks. The pretasks are mainly built on mutual information estimation, which requires data augmentation to c...
Open-source land cover products (LCPs) are essential for many areas of scientific research. However, they have deficiencies such as low accuracy, low resolution, and poor timeliness when applied to a specific area. Therefore, we developed WESUP-LCP, a novel two-stage weakly supervised semantic segmentation framework to improve the resolution of LCP...
Automatic and high-precision extraction of buildings from remote sensing images has a wide range of application and importance. Optical and synthetic aperture radar (SAR) images are typical types of multimodal remote sensing data with different imaging methods. To bridge the huge gap between them and achieve high-precision joint semantic segmentati...
Deep learning-based semantic segmentation has been widely applied for building extraction. However, due to the domain gap, the extraction of buildings in high-resolution remote sensing imagery is difficult when a model trained on a source dataset is directly tested on a target dataset. Considering that humans can retrieve memory to deal with co...
During the past decades, the invention and employment of multiple sensors enable multi-sensor remote-sensing (RS) image acquisition. In order to effectively use these images for RS scene understanding, a scene classification model trained with samples collected from one sensor should generalize well to other sensors. However, it is extremely challe...
Chao Tao, Ji Qi, Guo Zhang, [...], Haifeng Li
Are we on the right way for remote sensing image understanding (RSIU) by training models in a supervised data-dependent and task-dependent manner, instead of original human vision in a label-free and task-independent way? We argue that a more desirable RSIU model should be trained with intrinsic structure from data rather than extrinsic human label...
Deep learning has achieved great success in learning features from massive remote sensing images (RSIs). To better understand the connection between three feature learning paradigms, which are unsupervised feature learning (USFL), supervised feature learning (SFL), and self-supervised feature learning (SSFL), this paper analyzes and compares them f...
Change detection and extraction of buildings based on convolutional neural networks (CNNs) have made encouraging progress in the remote sensing community. Although these two tasks are different in objective and application scenarios, both focus on building objects. However, previous methods were accustomed to considering these two tasks separately,...
Overcoming catastrophic forgetting is a key difficulty for remote sensing image (RSI) classification in open world applications. The core of this problem lies in the ability of RSI scene classification models to adapt to the changing environment and maintain the learned knowledge while continually learning new knowledge. Mainstream replay-based app...
Self-supervised contrastive learning (SSCL) has achieved significant milestones in remote sensing image (RSI) understanding. Its essence lies in designing an unsupervised instance discrimination pretext task to extract image features from a large number of unlabeled images that are beneficial for downstream tasks. However, existing instance discrim...
In recent years, using a self-supervised learning framework to learn the general characteristics of graphs has been considered a promising paradigm for graph representation learning. The core of self-supervised learning strategies for graph neural networks lies in constructing suitable positive sample selection strategies. However, existing GNNs ty...
Deep learning has achieved great success in learning features from massive remote sensing images (RSIs). To better understand the connection between feature learning paradigms (e.g., unsupervised feature learning (USFL), supervised feature learning (SFL), and self-supervised feature learning (SSFL)), this paper analyzes and compares them from the p...
The existing SSCL of RSI is built based on constructing positive and negative sample pairs. However, due to the richness of RSI ground objects and the complexity of the RSI contextual semantics, positive and negative samples coexist and are imbalanced within the same RSI patches, which causes SSCL to push negative samples far away while pu...
The key to traffic prediction is to accurately depict the temporal dynamics of traffic flow traveling in a road network, so it is important to model the spatial dependence of the road network. The essence of spatial dependence is to accurately describe how traffic information transmission is affected by other nodes in the road network, and the GNN-...
Understanding the evolutionary mechanisms of dynamic graphs is crucial since dynamic is a basic characteristic of real-world networks. The challenges of modeling dynamic graphs are as follows: (1) Real-world dynamics are frequently characterized by group effects, which essentially emerge from high-order interactions involving groups of entities. Th...
Deep Neural Network (DNN) based point cloud semantic segmentation has presented significant achievements on large-scale labeled aerial laser point cloud datasets. However, annotating such large-scaled point clouds is time-consuming. Due to density variations and spatial heterogeneity of the Airborne Laser Scanning (ALS) point clouds, DNNs lack gene...
The pretasks are mainly built on mutual information estimation, which requires data augmentation to construct positive samples with similar semantics to learn invariant signals and negative samples with dissimilar semantics in order to empower representation discriminability. However, an appropriate data augmentation configuration depends heavily o...
With the rapid development of remote sensing technology and the growing demand for applications, the classical deep learning-based object detection model is bottlenecked in processing incremental data, especially in the increasing classes of detected objects. It requires models to sequentially learn new classes of objects based on the current model...
Earth observation resources are becoming increasingly indispensable in disaster relief, damage assessment, and other related domains. Many unpredictable factors, such as changes in observation task requirements, bad weather, and resource malfunctions, may cause the scheduled observation scheme to become infeasible. In these cases, it is crucial to...
High-resolution remote sensing images bring a large amount of data as well as challenges to traditional vision tasks. Vehicle re-identification (ReID), as an essential vision task that can utilize remote sensing images, has been widely used in suspect vehicle searches, cross-border vehicle tracking, traffic behavior analysis, and automatic toll col...
Providing accurate crop yield estimations at large spatial scales and understanding yield losses under extreme climate stress is an urgent challenge for sustaining global food security. While the data-driven deep learning approach has shown great capacity in predicting yield patterns, its capacity to detect and attribute the impacts of climatic ext...
The deep learning method is widely used in remote sensing object detection on the premise that the training data have complete features. However, when data with a fixed class are added continuously, the trained detector is less able to adapt to new instances, impelling it to carry out incremental learning (IL). IL has two tasks with knowledge-relat...
As an important task of Graph Neural Networks (GNN), graph classification has received increasing attention, as it can be widely used in numerous fields, such as protein prediction and community prediction. The GNN models for graph classification usually require aggregating the structure and feature information of an input graph into a hidden repre...
Using complex network analysis methods to analyze the internal structure of geographic networks is a popular topic in urban geography research. Statistical analysis occupies a dominant position in the current research on geographic networks. This perspective mainly focuses on node connectivity, while other perspectives, such as geometric and algebr...
Are we on the right way for remote sensing image understanding (RSIU) by training models in a supervised, data-dependent and task-dependent way, instead of in the label-free and task-independent way of human vision? We argue that a more desirable RSIU model should be trained with intrinsic structure from data rather than extrinsic human labels to realize g...
Urban transit networks need to be upgraded in accordance with urban development. While methods have been studied to design an optimal transit network given the locations of stations, these methods focus on the whole network as the optimization object. However, the strategy to improve parts of an existing transit network based on the gap between tra...
Li Chen, Qi Li, Weiye Chen, [...], Haifeng Li
Adversarial examples pose many security threats to convolutional neural networks (CNNs). Most defense algorithms prevent these threats by finding differences between the original images and adversarial examples. However, the found differences do not contain features about the classes, so these defense algorithms can only detect adversarial examples...
Ming Zhang, Xin Gu, Jun Xiao, [...], Sumin Li
The coexistence of different cultures is a distinctive feature of human society, and globalization makes the construction of cities gradually tend to be the same, so how to find the unique memes of urban culture in a multicultural environment is very important for the development of a city. Most of the previous analyses of urban style have been bas...
While considering the spatial and temporal features of traffic, capturing the impacts of various external factors on travel is an essential step towards achieving accurate traffic forecasting. However, existing studies seldom consider external factors or neglect the effect of the complex correlations among external factors on traffic. Intuitively,...
Remote sensing imagery (RSI) and point of interest (POI) are two complementary data for urban functional zone (UFZ) extraction. However, current methods only use single data or just simply fuse the features extracted from these two data, which may not fully exploit their complementary strength. To solve this problem, in this paper, we propose a uni...
Thick clouds seriously impact the quality of optical remote sensing images (RSIs) and limit their application. For removing the cloud, some learning-based methods have been proposed and attracted considerable attention. However, these methods need to train paired multitemporal images with/without cloud, which are difficult and costly to collect. To...
Self-supervised learning achieves close to supervised learning results on remote sensing image (RSI) scene classification. This is due to the current popular self-supervised learning methods that learn representations by applying different augmentations to images and completing the instance discrimination task which enables convolutional neural net...
The increase in self-supervised learning (SSL), especially contrastive learning, has enabled one to train deep neural network models with unlabeled data for remote sensing image (RSI) scene classification. Nevertheless, it still suffers from the following issues: 1) the performance of the contrastive learning method is significantly impacted by the...
To infer unknown remote sensing scenarios, most existing technologies use a supervised learning paradigm to train deep neural network (DNN) models on closed datasets. This paradigm faces challenges such as highly spatiotemporal variants and ever-changing scale-heterogeneous remote sensing scenarios. Additionally, DNN models cannot scale to new scen...
Hao Wang, Chao Tao, Ji Qi, [...], Haifeng Li
Reducing the feature distribution shift caused by visual-environment changes (VE-changes) is a hot issue in domain adaptation learning. However, in the semantic segmentation of remote sensing imagery, besides VE-changes, the change of semantic scenes (SS-changes) is another factor that widens the domain gap, which brings the...
Few-shot remote sensing scene classification tries to make a model quickly adapt to new scenes with only a few samples that do not appear in the closed training set. Since limited samples can hardly describe the distribution of data, it is a challenge for a model to learn good generalized features. Since limited samples are rarely representative, i...
The imaging process of optical remote sensing images is easily affected by external conditions. Therefore, remote sensing images under different imaging conditions often show color differences, resulting in feature distribution differences between the source and target domain, hindering the migration of semantic segmentation models between domains...
Self-supervised contrastive learning (SSCL) is a potential learning paradigm for learning remote sensing image (RSI)-invariant features through the label-free method. The existing SSCL of RSI is built based on constructing positive and negative sample pairs. However, due to the richness of RSI ground objects and the complexity of the RSI contextual...
Learning from a sequence of tasks for a lifetime is essential for an agent towards artificial general intelligence. This requires the agent to continuously learn and memorize new knowledge without interference. This paper first demonstrates a fundamental issue of lifelong learning using neural networks, named anterograde forgetting, i.e., preservin...
Humans' continual learning (CL) ability is closely related to the stability-versus-plasticity dilemma, which describes how humans achieve ongoing learning capacity and preserve learned information. The notion of CL has been present in artificial intelligence (AI) since its birth. This paper presents a comprehensive review of CL. Different...
Remembering and forgetting mechanisms are two sides of the same coin in a human learning-memory system. Inspired by human brain memory mechanisms, modern machine learning systems have been working to endow machines with lifelong learning capability through better remembering while pushing the forgetting as the antagonist to overcome. Nevertheless, t...
Considering the success of generative adversarial networks (GANs) for image-to-image translation, researchers have attempted to translate satellite images to maps (si2map) through GAN for cartography. However, these studies involved limited scales, which hinders multi-scale map creation. By extending their method, high-resolution satellite images c...
Ling Zhao, Li Luo, Bo Li, [...], Haifeng Li
The city landscape is largely related to the design concept and aesthetics of planners. Influenced by globalization, planners and architects have borrowed from available designs, resulting in the “one city with a thousand faces” phenomenon. In order to create a unique urban landscape, they need to focus on local urban characteristics while learning...
With the adversarial attack of convolutional neural networks (CNNs), we are able to generate adversarial patches to make an aircraft undetectable by object detectors instead of covering the aircraft with large camouflage nets. However, aircraft in remote sensing images (RSIs) have the problem of large variations in scale, which can easily cause siz...
Multi-temporal deep learning approaches have exhibited excellent classification performance in large-scale crop mapping. These approaches efficiently and automatically transform remote sensing time series into high-dimensional feature representations to identify crop types. The lack of interpretation, however, is regarded as a major drawback of the...
Airborne Laser Scanning (ALS) point clouds have complex structures, and their 3D semantic labeling has been a challenging task. It has three problems: (1) the difficulty of classifying point clouds around boundaries of objects from different classes, (2) the diversity of shapes within the same class, and (3) the scale differences between classes. I...
Generative adversarial networks (GANs) are subject to catastrophic forgetting when learning from a stream of data. Inspired by the knowledge of neuroscience, this article develops a memory formation system (MFS) to establish memory for GANs. MFS is composed of three modules, including the identifier, weights upgrade (WU), and weights reactivate (WR). Thes...
Adversarial examples fool the models into predicting wrong results through generated perturbations, demonstrating the vulnerability of convolutional neural networks (CNNs). Recent studies also show that many CNNs applied to remote sensing image (RSI) scene classification are still subject to adversarial example attacks. Through further analysis of...
Accurate real-time traffic forecasting is a core technological problem for the implementation of the intelligent transportation system. However, it remains challenging considering the complex spatial and temporal dependencies among traffic flows. In the spatial dimension, due to the connectivity of the road network, the traffic flows between li...
Graph neural networks (GNNs) have achieved great success in many graph-based tasks. Much work is dedicated to empowering GNNs with the adaptive locality ability, which enables measuring the importance of neighboring nodes to the target node by a node-specific mechanism. However, the current node-specific mechanisms are deficient in distinguishing t...
Haifeng Li, Yi Li, Guo Zhang, [...], Chao Tao
A new learning paradigm, self-supervised learning (SSL), can be used to solve such problems by pre-training a general model with large unlabeled images and then fine-tuning on a downstream task with very few labeled samples. Contrastive learning is a typical method of SSL, which can learn general invariant features. However, most of the existing co...
Museum cultural relics represent a special material cultural heritage, and modern interpretations of them are needed in current society. Based on the catalogue data of cultural relics published by the State Administration of Cultural Heritage, this paper analyzes the continuity and intermittency of cultural relics in time series by using the met...
Detecting the changes of buildings in urban environments is essential. Existing methods that use only nadir images suffer from severe problems of ambiguous features and occlusions between buildings and other regions. Furthermore, buildings in urban environments vary significantly in scale, which leads to performance issues when using single-scale f...