Michael Ying Yang

University of Twente | UT · Scene Understanding Group - Department of Earth Observation Science (EOS)

Professor

About

233
Publications
56,155
Reads
3,626
Citations
Citations since 2016
169 Research Items
3410 Citations
Introduction
Michael Ying Yang is currently Assistant Professor with the University of Twente (The Netherlands), heading a group working on scene understanding. He has published over 100 papers. He serves as Associate Editor of the ISPRS Journal of Photogrammetry and Remote Sensing and co-chair of ISPRS Working Group II/5 Dynamic Scene Analysis, and is a recipient of the Best Science Paper Award at BMVC (2016) and The Schermerhorn Award (2020). His research interests are in the fields of computer vision and photogrammetry.
Additional affiliations
March 2015 - May 2016
Technische Universität Dresden
Position
  • PostDoc Position
May 2012 - February 2015
Leibniz Universität Hannover
Position
  • PostDoc Position
August 2008 - April 2012
University of Bonn
Position
  • Research Assistant

Publications

Article
Full-text available
With the increasing demand of autonomous systems, pixelwise semantic segmentation for visual scene understanding needs to be not only accurate but also efficient for potential real-time applications. In this paper, we propose Context Aggregation Network, a dual branch convolutional neural network, with significantly lower computational costs as com...
Article
Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder network (LGENet) for semantic segmentation of ALS point clouds. Adapting the KPConv network, we first...
Preprint
Humans perceive and construct the surrounding world as an arrangement of simple parametric models. In particular, man-made environments commonly consist of volumetric primitives such as cuboids or cylinders. Inferring these primitives is an important step to attain high-level, abstract scene descriptions. Previous approaches directly estimate shape...
Chapter
Full-text available
Semantic image understanding is a challenging topic in computer vision. It requires not only detecting all objects in an image, but also identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph which provides an abstract semantic interpretation of an image. In previous wor...
Preprint
As a natural extension of the image synthesis task, video synthesis has attracted a lot of interest recently. Many image synthesis works utilize class labels or text as guidance. However, neither labels nor text can provide explicit temporal guidance, such as when an action starts or ends. To overcome this limitation, we introduce semantic video sc...
Preprint
Full-text available
Generating a 3D point cloud from a single 2D image is of great importance for 3D scene understanding applications. To reconstruct the whole 3D shape of the object shown in the image, the existing deep learning based approaches use either explicit or implicit generative modeling of point clouds, which, however, suffer from limited quality. In this w...
Preprint
Full-text available
Trajectory prediction has been a long-standing problem in intelligent systems such as autonomous driving and robot navigation. Recent state-of-the-art models trained on large-scale benchmarks have been pushing the limit of performance rapidly, mainly focusing on improving prediction accuracy. However, those models put less emphasis on efficiency, w...
Article
It is observed that a human inspector can obtain better visual observations of surface defects via changing the lighting/viewing directions from time to time. Accordingly, we first build a multi-light source illumination/acquisition system to capture images of workpieces under individual lighting directions and then propose a multi-stream CNN model...
Article
Multispectral pedestrian detection has received much attention in recent years due to its superiority in detecting targets under adverse lighting/weather conditions. In this paper, we aim to generate highly discriminative multi-modal features by aggregating the human-related clues based on all available samples presented in multispectral images. To...
Article
While modern deep learning algorithms for semantic segmentation of airborne laser scanning (ALS) point clouds have achieved considerable success, the training process often requires a large number of labelled 3D points. Pointwise annotation of 3D point clouds, especially for large scale ALS datasets, is extremely time-consuming work. Weak supervisi...
Preprint
Full-text available
Different objects in the same scene are more or less related to each other, but only a limited number of these relationships are noteworthy. Inspired by DETR, which excels in object detection, we view scene graph generation as a set prediction problem and propose an end-to-end scene graph generation model RelTR which has an encoder-decoder architec...
Article
Full-text available
The European Union (EU) Commission’s whitepaper on Artificial Intelligence (AI) proposes shaping the emerging AI market so that it better reflects common European values. It is a master plan that builds upon the EU AI High-Level Expert Group guidelines. This article reviews the masterplan, from a culture cycle perspective, to reflect on its potenti...
Article
In the past decade, object detection has achieved significant progress in natural images but not in aerial images, due to the massive variations in the scale and orientation of objects caused by the bird’s-eye view of aerial images. More importantly, the lack of large-scale benchmarks has become a major obstacle to the development of object detectio...
Preprint
Full-text available
This report summarizes the results of the Learning to Understand Aerial Images (LUAI) 2021 challenge held at ICCV 2021, which focuses on object detection and semantic segmentation in aerial images. Using DOTA-v2.0 and GID-15 datasets, this challenge proposes three tasks for oriented object detection, horizontal object detection, and semantic segmentati...
Preprint
Full-text available
A lifespan face synthesis (LFS) model aims to generate a set of photo-realistic face images of a person's whole life, given only one snapshot as reference. The generated face image given a target age code is expected to be age-sensitive reflected by bio-plausible transformations of shape and texture, while being identity preserving. This is extreme...
Preprint
Full-text available
Dynamic scene graph generation aims at generating a scene graph of the given video. Compared to the task of scene graph generation from images, it is more challenging because of the dynamic relationships between objects and the temporal dependencies between frames allowing for a richer semantic interpretation. In this paper, we propose Spatial-temp...
Article
Full-text available
We report key elements and figures related to the proceedings of the 2021 edition of the XXIVth ISPRS Congress. Similarly to 2020, the COVID-19 pandemic caused global travel challenges and restrictions for the first half of 2021. Consequently, the physical Congress re-scheduled from June 2020 to July 2021 was again postponed to June 2022, still in...
Article
Full-text available
Semantic segmentation for aerial platforms has been one of the fundamental scene understanding tasks for Earth observation. Most semantic segmentation research has focused on scenes captured in nadir view, in which objects have relatively smaller scale variation compared with scenes captured in oblique view. The huge scale variation of object...
Article
Video object detection is a fundamental research task for scene understanding. Compared with object detection in images, object detection in videos has been less researched due to a shortage of labelled video datasets. As frames in a video clip are highly correlated, a larger quantity of video labels is needed to have good data variation, which are...
Article
Full-text available
Unmanned Aerial Vehicles (UAVs) have become an essential photogrammetric measurement as they are affordable, easily accessible and versatile. Aerial images captured from UAVs have applications in small and large scale texture mapping, 3D modelling, object detection tasks, Digital Terrain Model (DTM) and Digital Surface Model (DSM) generation etc. P...
Article
Full-text available
The past years have witnessed great progress on remote sensing (RS) image interpretation and its wide applications. With RS images becoming more accessible than ever before, there is an increasing demand for the automatic interpretation of these images. In this context, the benchmark datasets serve as essential prerequisites for developing and test...
Preprint
Full-text available
A text to image generation (T2I) model aims to generate photo-realistic images which are semantically consistent with the text descriptions. Built upon the recent advances in generative adversarial networks (GANs), existing T2I models have made great progress. However, a close inspection of their generated images reveals two major limitations: (1)...
Conference Paper
Full-text available
A layout to image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against natural background (stuff), conditioned on a given layout. Built upon the recent advances in generative adversarial networks (GANs), existing L2I models have made great progress. However, a close inspection of their generated i...
Preprint
Full-text available
A layout to image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against natural background (stuff), conditioned on a given layout. Built upon the recent advances in generative adversarial networks (GANs), existing L2I models have made great progress. However, a close inspection of their generated i...
Chapter
Automatic captioning of images is a task that combines the challenges of image analysis and text generation. One important aspect of captioning is the notion of attention: how to decide what to describe and in which order. Inspired by the successes in text analysis and translation, previous works have proposed the transformer architecture for image...
Preprint
Full-text available
In the past decade, object detection has achieved significant progress in natural images but not in aerial images, due to the massive variations in the scale and orientation of objects caused by the bird's-eye view of aerial images. More importantly, the lack of large-scale benchmarks becomes a major obstacle to the development of object detection...
Preprint
Full-text available
Semantic segmentation for aerial platforms has been one of the fundamental scene understanding tasks for Earth observation. Most semantic segmentation research has focused on scenes captured in nadir view, in which objects have relatively smaller scale variation compared with scenes captured in oblique view. The huge scale variation of object...
Article
Trajectory prediction is critical for applications of planning safe future movements and remains challenging even for the next few seconds in urban mixed traffic. How an agent moves is affected by the various behaviors of its neighboring agents in different environments. To predict movements, we propose an end-to-end generative model named Attentiv...
Preprint
Full-text available
UAVs have become an essential photogrammetric measurement as they are affordable, easily accessible and versatile. Aerial images captured from UAVs have applications in small and large scale texture mapping, 3D modelling, object detection tasks, DTM and DSM generation etc. Photogrammetric techniques are routinely used for 3D reconstruction from UAV...
Preprint
Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder network (LGENet) for semantic segmentation of ALS point clouds. Adapting the KPConv network, we first...
Preprint
Full-text available
Image Super-Resolution (SR) provides a promising technique to enhance the image quality of low-resolution optical sensors, facilitating better-performing target detection and autonomous navigation in a wide range of robotics applications. It is noted that the state-of-the-art SR methods are typically trained and tested using single-channel inputs,...
Preprint
Full-text available
With the increasing demand of autonomous machines, pixel-wise semantic segmentation for visual scene understanding needs to be not only accurate but also efficient for any potential real-time applications. In this paper, we propose CABiNet (Context Aggregated Bilateral Network), a dual branch convolutional neural network (CNN), with significantly...
Article
Full-text available
Supervised training of a deep neural network for semantic segmentation of point clouds requires a large amount of labelled data. Nowadays, it is easy to acquire a huge number of points with high density in large-scale areas using current LiDAR and photogrammetric techniques. However, it is extremely time-consuming to manually label point clouds for...
Preprint
Full-text available
To accurately predict future positions of different agents in traffic scenarios is crucial for safely deploying intelligent autonomous systems in the real-world environment. However, it remains a challenge due to the behavior of a target agent being affected by other agents dynamically, and there being more than one socially possible path the agen...
Chapter
In this paper, we propose FairNN, a neural network that performs joint feature representation and classification for fairness-aware learning. Our approach optimizes a multi-objective loss function which (a) learns a fair representation by suppressing protected attributes, (b) maintains the information content by minimizing the reconstruction loss, and...
Article
Full-text available
State-of-the-art object detection approaches such as Fast/Faster R-CNN, SSD, or YOLO have difficulties detecting dense, small targets with arbitrary orientation in large aerial images. The main reason is that using interpolation to align RoI features can result in a lack of accuracy or even loss of location information. We present the Local-aware R...
Article
Full-text available
With the development of LiDAR and photogrammetric techniques, more and more point clouds are available with high density and in large areas. Point cloud interpretation is an important step before many real applications like 3D city modelling. Many supervised machine learning techniques have been adapted to semantic point cloud segmentation, aiming...
Conference Paper
Full-text available
Depth is an essential component for various scene understanding tasks and for reconstructing the 3D geometry of the scene. Estimating depth from stereo images requires multiple views of the same scene to be captured, which is often not possible when exploring new environments with a UAV. To overcome this, monocular depth estimation has been a topic o...
Article
Full-text available
Semantic segmentation has been one of the leading research interests in computer vision recently. It serves as a perception foundation for many fields, such as robotics and autonomous driving. The fast development of semantic segmentation is largely attributable to large-scale datasets, especially for deep learning related methods. There alre...
Preprint
Full-text available
The past decade has witnessed great progress on remote sensing (RS) image interpretation and its wide applications. With RS images becoming more accessible than ever before, there is an increasing demand for the automatic interpretation of these images, where benchmark datasets are essential prerequisites for developing and testing intelligent inte...
Preprint
Full-text available
Trajectory prediction is a crucial task in different communities, such as intelligent transportation systems, photogrammetry, computer vision, and mobile robot applications. However, there are many challenges in predicting the trajectories of heterogeneous road agents (e.g. pedestrians, cyclists and vehicles) at a microscopic level. For example, an...