December 2024
·
7 Reads
July 2024
·
72 Reads
·
3 Citations
Deep learning-based building extraction methods are widely applied in diverse fields. However, evaluating large-scale extraction results remains challenging because traditional evaluation metrics rely on manually created ground-truth samples and comprehensive reference building data are lacking for developing countries. To address these problems, we proposed a two-level framework for evaluating large-scale footprint extraction. First, we utilised global open-source population and land use data as proxy data to assess grid-level completeness in areas with insufficient reference data. Second, we introduced an improved two-way area-overlapping method to match the extracted footprints with the reference buildings, thereby enabling a comprehensive evaluation of the study region. In tests in Hyogo Prefecture and Numazu City, Japan, the results demonstrated a 2.6% improvement in grid classification accuracy and an increase of 0.53 in the completeness correlation, compared with the results obtained using a single proxy indicator. Moreover, the optimised matching method achieved a semantic matching accuracy of 99%, with high efficiency and robustness in multi-scale matching. The proposed approach can therefore evaluate large-scale footprint extraction results and interpret their semantic relationship with actual buildings, and it is applicable globally regardless of the availability of reference building datasets.
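The two-way area-overlapping idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes axis-aligned rectangles in place of real footprint polygons, and the names `Box`, `two_way_match`, `relation_labels` and the 0.5 threshold are hypothetical choices made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a building footprint (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(b: Box) -> float:
    return max(0.0, b.xmax - b.xmin) * max(0.0, b.ymax - b.ymin)


def overlap_area(a: Box, b: Box) -> float:
    w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return w * h if w > 0 and h > 0 else 0.0


def two_way_match(extracted, reference, threshold=0.5):
    """Pair footprints whose overlap covers at least `threshold` of either one.

    Checking the ratio in both directions lets a large extracted footprint
    match several small reference buildings, and vice versa.
    """
    pairs = []
    for i, e in enumerate(extracted):
        for j, r in enumerate(reference):
            inter = overlap_area(e, r)
            if inter == 0.0:
                continue
            ratio_e = inter / area(e)  # share of the extracted footprint covered
            ratio_r = inter / area(r)  # share of the reference building covered
            if max(ratio_e, ratio_r) >= threshold:
                pairs.append((i, j))
    return pairs


def relation_labels(pairs):
    """Label each matched pair as 1:1, 1:n, n:1 or n:m (extracted:reference)."""
    e_count = Counter(i for i, _ in pairs)
    r_count = Counter(j for _, j in pairs)
    labels = {}
    for i, j in pairs:
        many_ref = e_count[i] > 1  # extracted i matched several references
        many_ext = r_count[j] > 1  # reference j matched several extractions
        if many_ref and many_ext:
            labels[(i, j)] = "n:m"
        elif many_ref:
            labels[(i, j)] = "1:n"
        elif many_ext:
            labels[(i, j)] = "n:1"
        else:
            labels[(i, j)] = "1:1"
    return labels
```

With real data, a geometry library would replace the rectangle helpers, and the threshold would need tuning against a labelled sample.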
July 2024
·
79 Reads
·
36 Citations
Landscape and Urban Planning
Developing a model to evaluate urban streetscapes based on subjective perceptions is important for quantitative understanding. However, previous studies have considered only limited types of subjective perceptions and have neglected the relationships between them. Further, accurately measuring subjective perception at low computational cost for large-scale urban regions at high spatial resolution has been difficult. We present a deep-learning-based multilabel classification model that can measure 22 subjective perception scores from street-view images. The model is trained on the results of a web questionnaire survey covering 22 subjective perceptions, with 8.8 million responses. Our model measures subjective perception scores from street-view images with high accuracy (0.80–0.91) and achieves low computational cost by training on the relationships among the 22 subjective perceptions. The 22 subjective perceptions were analyzed using PCA and k-means clustering. By projecting them into a two-dimensional space and grouping them into four distinct groups (positive, negative, calm, and lively), we gained insight into how these perceptions relate to one another. In addition, the study used semantic segmentation to extract landscape elements from street-view images and applied ℓ1-regularized sparse modeling to identify the landscape elements structurally correlated with each subjective perception class. The analysis revealed that only seven of the nineteen landscape elements correlated significantly with subjective impressions, and that these effects varied by class. Notably, sky coverage positively influences positive subjective perceptions, such as attractiveness and calmness, but negatively affects lively impressions. The proposed model can be used to map the overall image of a city and to identify landscape design issues in community development.
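The grouping step mentioned in the abstract above (k-means on a two-dimensional projection) can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the perceptions have already been projected to 2-D coordinates (for example by PCA), and the function name `kmeans_2d`, the fixed initial centroids, and the toy coordinates are all hypothetical.

```python
def kmeans_2d(points, initial_centroids, iterations=20):
    """Plain k-means on 2-D points with caller-supplied initial centroids.

    Returns (assignment, centroids), where assignment[i] is the index of the
    centroid closest to points[i].
    """
    centroids = list(initial_centroids)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assignment, centroids


if __name__ == "__main__":
    # Toy 2-D coordinates, invented for illustration (not survey data).
    toy_points = [(2.0, 0.1), (2.2, -0.1), (-2.0, 0.0), (-2.1, 0.2),
                  (0.0, 2.0), (0.1, 2.2), (0.0, -2.0), (-0.1, -2.1)]
    groups, _ = kmeans_2d(toy_points, [(1, 0), (-1, 0), (0, 1), (0, -1)])
    print(groups)
```

Four clusters are used here only because the abstract reports four groups; in practice the number of clusters would be chosen from the data.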
December 2023
·
46 Reads
·
2 Citations
November 2023
·
1,663 Reads
·
13 Citations
Semantic segmentation of land use in remote sensing images has benefitted from the development of deep learning and has consequently made considerable progress in inference accuracy and speed. However, effectively training semantic segmentation models for remote sensing imagery requires extensive, detailed pixel-level annotations, and gathering such data is both time-intensive and laborious. Thus, this study applied low-rank adaptation to a Stable Diffusion model to learn the distribution of the pixel-level annotations in the LoveDA dataset. The annotation-image pairs were then used to train a remote sensing image generator based on Stable Diffusion guided by ControlNet. In this way, we proposed a Stable Diffusion-based approach that can generate image-annotation pairs from scratch. Training on the generated annotation-image pairs achieved a mean intersection-over-union (mIoU) of 0.520 on the LoveDA dataset, which is close to the 0.539 mIoU obtained by training on the original data. Furthermore, mixed training on generated and original data achieved 0.542 mIoU, demonstrating that our approach can serve as data augmentation. This study provides a solution to the high cost of pixel-level annotation and thus shows the potential of artificial intelligence generated content.
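The mIoU figures quoted above follow the usual definition: per-class intersection over union, averaged across classes. A minimal sketch of that computation is given below; the function name `mean_iou` and the convention of skipping classes absent from both prediction and target are assumptions, and the benchmark's official evaluation may differ in detail.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for two label maps of equal shape.

    `pred` and `target` are nested lists of integer class ids. Classes that
    occur in neither map are left out of the average (an assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Tiny hand-made label maps, for illustration only.
    prediction = [[0, 0, 1], [1, 1, 2]]
    annotation = [[0, 1, 1], [1, 1, 2]]
    print(mean_iou(prediction, annotation, num_classes=3))
```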
September 2023
·
14 Reads
April 2023
·
288 Reads
·
17 Citations
Engineering Applications of Artificial Intelligence
Land price is an important economic factor that gives regional planners a meaningful reference for urban planning, economic decision-making, and land resource allocation. However, related studies of land price have focused mainly on site area and plot ratio; the potential impact of streetscape factors and human subjective perception on land price has rarely been analyzed, despite their influence on the supply and demand relationship. Therefore, this study developed a new deep learning approach for estimating and analyzing land prices that considers streetscape and human subjective perception factors. In the estimation part, we developed a fine-grained end-to-end deep learning model whose input is street view images and whose output is land prices. In the analysis part, we extracted semantic segmentation results and human subjective perception scores from the images and combined them with the land price estimates. We then combined gradient-weighted class activation mapping with L1-based sparse linear regression to model quantitatively the relationship between the streetscape, human subjective perception, and land price. Gradient-weighted class activation mapping was used to determine which categories of pixels the deep learning model relied on when estimating land price. Combining it with the segmentation results, we applied L1-based sparse linear regression and quantified the importance of the streetscape factors for land prices. Overall, our deep learning model achieved 77.99% accuracy in estimating land price from street views alone, and we showed that perception scores are more important for land price than streetscape factors such as roads and mountains. In particular, the perceptions of a comfortable and lived-in feel had the greatest impact on land price estimation.
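To make the L1-based sparse linear regression step concrete, the sketch below fits a lasso model by coordinate descent. It is an illustration under stated assumptions, not the paper's code: the objective is taken to be (1/2n)·||y − Xw||² + alpha·||w||₁ with no intercept, features are assumed to be centred, and the names `soft_threshold` and `lasso_coordinate_descent` as well as the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` towards zero by `threshold`; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, iterations=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n = len(X)
    d = len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for row, target in zip(X, y):
                prediction = sum(wk * xk for wk, xk in zip(w, row))
                partial_residual = target - prediction + w[j] * row[j]
                rho += row[j] * partial_residual
                z += row[j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


if __name__ == "__main__":
    # Toy data: the target depends strongly on feature 0 and weakly on feature 1.
    X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
    y = [2.05, 1.95, -1.95, -2.05]
    print(lasso_coordinate_descent(X, y, alpha=0.1))
```

The weak feature's coefficient is driven to exactly zero, which is the property that lets a sparse model single out a small set of influential factors.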
February 2023
·
196 Reads
·
1 Citation
People flow trend estimation is crucial to traffic and urban safety planning and management. However, owing to privacy concerns, collecting individual location data for statistical analysis of people flow is difficult; thus, an alternative approach is urgently needed. Furthermore, the trend in people flow is reflected in streetscape factors, yet the relationship between them remains unclear in the existing literature. To address this, we propose an end-to-end deep-learning approach that combines street view images with the human subjective scores of each street view. For a more detailed study of people flow, estimation and analysis were carried out for different time and movement patterns. We achieved 78% accuracy on the test set. We also combined gradient-weighted class activation mapping, a deep learning visualization technique, with L1-based statistical methods in a quantitative analysis approach that characterizes the landscape elements and subjective feelings of each street view and identifies the elements that are effective for people flow estimation based on a gradient impact method. In summary, this study provides a novel end-to-end people flow trend estimation approach and sheds light on the relationship between streetscape, human subjective feeling, and people flow trend, thereby making an important contribution to the evaluation of existing urban development.
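One simple way to relate a class activation heatmap to landscape elements is to sum the heatmap weight that falls on each segmentation class. The sketch below shows that aggregation only; it does not compute Grad-CAM itself, it is not necessarily how the study measured gradient impact, and the function name `attention_share_by_class` and the toy arrays are hypothetical.

```python
def attention_share_by_class(heatmap, segmentation):
    """Share of total heatmap weight falling on each segmentation class.

    `heatmap` holds non-negative activation weights and `segmentation` holds
    one class label per pixel; both are nested lists of the same shape.
    """
    totals = {}
    for heat_row, seg_row in zip(heatmap, segmentation):
        for weight, label in zip(heat_row, seg_row):
            totals[label] = totals.get(label, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0.0 for label in totals}
    return {label: value / grand_total for label, value in totals.items()}


if __name__ == "__main__":
    # 2x2 toy image, invented for illustration.
    toy_heatmap = [[0.0, 0.25], [0.5, 0.25]]
    toy_segmentation = [["sky", "sky"], ["road", "building"]]
    print(attention_share_by_class(toy_heatmap, toy_segmentation))
```

Per-class shares computed this way for many images could then serve as inputs to a sparse regression.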
January 2023
·
274 Reads
·
18 Citations
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
The built year and structure of individual buildings are crucial factors for estimating and assessing potential earthquake and tsunami damage. Recent advances in sensing and analysis technologies allow the acquisition of high-resolution street view images (SVIs) that present new possibilities for research and development. In this study, we developed a model to estimate the built year and structure of a building using omnidirectional SVIs captured using an onboard camera. We used geographic information system (GIS) building data and SVIs to generate an annotated built year and structure dataset by developing a method to automatically combine the GIS data with images of individual buildings cropped through object detection. Furthermore, we trained a deep learning model to classify the built year and structure of buildings using the annotated image dataset based on a deep convolutional neural network (DCNN) and a vision transformer (ViT). The results showed that SVI accurately predicts the built year and structure of individual buildings using ViT (overall accuracies for structure = 0.94 [3 classes] and 0.96 [2 classes] and for age = 0.68 [6 classes] and 0.90 [3 classes]). Compared with DCNN-based networks, the proposed Swin transformer based on ViT architectures effectively improves prediction accuracy. The results indicate that multiple high-resolution images can be obtained for individual buildings using SVI, and the proposed method is an effective approach for classifying structures and determining building age. The automatic, accurate, and large-scale mapping of the built year and structure of individual buildings can help develop specific disaster prevention measures.
December 2022
·
1,754 Reads
·
9 Citations
The detection of road facilities or roadside structures is essential for high-definition (HD) maps and intelligent transportation systems (ITSs). With the rapid development of deep-learning algorithms in recent years, deep-learning-based object detection techniques have provided more accurate and efficient performance, and have become an essential tool for HD map reconstruction and advanced driver-assistance systems (ADASs). Therefore, the performance evaluation and comparison of the latest deep-learning algorithms in this field is indispensable. However, most existing works in this area limit their focus to the detection of individual targets, such as vehicles or pedestrians and traffic signs, from driving view images. In this study, we present a systematic comparison of three recent algorithms for large-scale multi-class road facility detection, namely Mask R-CNN, YOLOx, and YOLOv7, on the Mapillary dataset. The experimental results are evaluated according to the recall, precision, mean F1-score and computational consumption. YOLOv7 outperforms the other two networks in road facility detection, with a precision and recall of 87.57% and 72.60%, respectively. Furthermore, we test the model performance on our custom dataset obtained from the Japanese road environment. The results demonstrate that models trained on the Mapillary dataset exhibit sufficient generalization ability. The comparison presented in this study aids in understanding the strengths and limitations of the latest networks in multiclass object detection on large-scale street-level datasets.
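For reference, the F1-score is the harmonic mean of precision and recall, and a mean F1-score averages it over classes. The sketch below shows both; the function names are hypothetical. Plugging in the aggregate precision and recall quoted above gives roughly 0.79, but that is only an arithmetic illustration: a class-averaged mean F1, as reported in the study, will generally differ from the F1 of the aggregate figures.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_f1(per_class):
    """Average the per-class F1 over (precision, recall) pairs."""
    scores = [f1_score(p, r) for p, r in per_class]
    if not scores:
        raise ValueError("per_class must contain at least one class")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Aggregate precision and recall quoted in the abstract for YOLOv7.
    print(round(f1_score(0.8757, 0.7260), 4))
```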
... Therefore, instance segmentation methods that rely on natural images will not yield optimal results in remote sensing applications. In order to achieve optimal segmentation results for remote sensing datasets, researchers have developed instance segmentation models tailored to remote sensing image characteristics [9][10][11][12][13]. Xu et al. [14] proposed a remote sensing image instance segmentation method based on BoxInst from the perspective of weak supervision, which fully utilizes the existing rich OBB annotations and reduces the annotation burden. ...
July 2024
... Furthermore, combining street view data with deep learning technologies has become a mainstream method for evaluating residents' subjective perceptions (Ji et al. 2021; Ogawa et al. 2024;. Kruse et al. (2021) used a deep residual network (ResNet) model trained on tagged street view data to evaluate human perceptions of urban playability. ...
July 2024
Landscape and Urban Planning
... Therefore, evaluating the subjective perceptions of each location with high spatial resolution for the entire city is challenging. To overcome this limitation, subjective impression evaluation methods using street-view images and crowdsourcing have been proposed (Dubey et al., 2016; Imadegawa et al., 2023; Naik et al., 2014; Quercia et al., 2014; Rossetti et al., 2019; Xu et al., 2022; Yao et al., 2021; Zhang et al., 2018). These approaches have collected a considerable amount of data on subjective perceptions via web surveys that present street-view images; subsequently, models have been developed to evaluate these (Dubey et al., 2016; Zhang et al., 2018). ...
December 2023
... However, these things alone are not enough, so it is important to generate not only high-quality images, but also image-annotation pairs that are more practical. For example, Zhao et al. [89] implemented a two-stage fine-tuning of the SD model to enable the generation from noise to annotations and then from annotations to images. Toker et al. [90] extended the standard diffusion model to a joint probabilistic model for images and their corresponding labels, which enabled the simultaneous generation of labels and images. ...
November 2023
... [27], weather, and day-night differences [28] on public visual perception, including their direct or moderating effects. Crowdsourced visual perception methods have also been used to infer and analyze urban land prices [29], crime rates [30], and urban development potential [ ... Figure captions: A visual perception reasoning model based on DCNN [22]; Hierarchical clustering analysis of the high-density wooden residential areas in Tokyo based on crowdsourced visual perception method [22]; Optimization and evaluation of street scenes based on crowdsourced visual perception data and GAN model [34]; Optimization and evaluation of street scenes based on crowdsourced visual perception data and SD model [35]. [Methods] The research employs a case study approach, analyzing multiple urban research papers focused on Tokyo, Japan, that utilize crowdsourced visual perception methods. It systematically investigates how these methods, ...
April 2023
Engineering Applications of Artificial Intelligence
... (2023) employ a multi-city material categories (brick, stucco, etc.) using geotagged SVI perspective views, aligning visual patterns with ground-truth material information for scalable building classification. Combining the aspects of building age and material, Ogawa et al. (2023) introduced a method to detect and geolocate buildings from panoramic images, automatically annotating them with objective building data in Kobe, Japan. Furthermore, building type or usage -a critical attribute in urban remote sensing and land use classification frameworks -is another important aspect in street-level research (Kang et al., 2018;Zhao et al., 2021;Lindenthal and Johnson, 2021;Ramalingam and Kumar, 2023;Li et al., 2025b). ...
January 2023
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
... It is a very complete streetscape dataset. An example of an AI trained by Mapillary is [19]. ...
December 2022
... While numerous open-source databases exist that cater to specific domains of building information, they are often limited to a single aspect of building data. The foundational building attribute knowledge within these databases is typically derived from field surveys or footprint data extracted from remote sensing imagery [10]. To date, no open-source database has been developed that integrates street view imagery to provide a comprehensive, visually-driven repository encompassing the spatial. ...
June 2022