Xiaopeng Zhang

Chinese Academy of Sciences | CAS · National Lab. on Pattern Recognition

Professor


287
Publications
89,228
Reads
4,180
Citations

Publications (287)
Article
Accurate lung lesion segmentation from computed tomography (CT) images is crucial to the analysis and diagnosis of lung diseases, such as COVID-19 and lung cancer. However, the smallness and variety of lung nodules and the lack of high-quality labeling make the accurate lung nodule segmentation difficult. To address these issues, we first introduce...
Article
In physics‐based cloth animation, rich folds and detailed wrinkles are achieved at the cost of expensive computational resources and huge labor tuning. Data‐driven techniques make efforts to reduce the computation significantly by utilizing a preprocessed database. One type of methods relies on human poses to synthesize fitted garments, but these m...
Article
Full-text available
Classifying and segmenting natural disaster images are crucial for predicting and responding to disasters. However, current convolutional networks perform poorly in processing natural disaster images, and there are few proprietary networks for this task. To address the varying scales of the region of interest (ROI) in these images, we propose the H...
Article
Efficiently training accurate deep models for weakly supervised semantic segmentation (WSSS) with image-level labels is challenging and important. Recently, end-to-end WSSS methods have become the focus of research due to their high training efficiency. However, current methods suffer from insufficient extraction of comprehensive semantic informati...
Article
Local features detection and description are widely used in many vision applications with high industrial and commercial demands. With large-scale applications, these tasks raise high expectations for both the accuracy and speed of local features. Most existing studies on local features learning focus on the local descriptions of individual keypoin...
Article
Full-text available
Existing architecture semantic modeling methods in 3D complex urban scenes continue facing difficulties, such as limited training data, lack of semantic information, and inflexible model processing. Focusing on extracting and adopting accurate semantic information into a modeling process, this work presents a framework for lightweight modeling of b...
Preprint
Full-text available
Reverse engineering CAD models from raw geometry is a classic but strenuous research problem. Previous learning-based methods rely heavily on labels due to the supervised design patterns or reconstruct CAD shapes that are not easily editable. In this work, we introduce SECAD-Net, an end-to-end neural network aimed at reconstructing compact and easy...
Article
Full-text available
Traditional multi-view stereo (MVS) is not applicable for the point cloud reconstruction of serialized video frames. Among them, the exhausted feature extraction and matching for all the prepared frames are time-consuming, and the scope of the search requires covering all the key frames. In this paper, we propose a novel serialized reconstruction m...
Article
Full-text available
Specular highlight detection and removal is a fundamental problem in computer vision and image processing. In this paper, we present an efficient end-to-end deep learning model for automatically detecting and removing specular highlights in a single image. In particular, an encoder-decoder network is utilized to detect specular highlights, and then...
Preprint
Efficiently training accurate deep models for weakly supervised semantic segmentation (WSSS) with image-level labels is challenging and important. Recently, end-to-end WSSS methods have become the focus of research due to their high training efficiency. However, current methods suffer from insufficient extraction of comprehensive semantic informati...
Article
High spatial resolution (HSR) remote sensing images contain complex foreground-background relationships, which makes the remote sensing land cover segmentation a special semantic segmentation task. The main challenges come from the large-scale variation, complex background samples and imbalanced foreground-background distribution. These issues make...
Article
The Class Activation Map (CAM) is widely used to generate pseudo-labels for Weakly Supervised Semantic Segmentation (WSSS), but it does not adequately model foreground-independent information, which makes it prone to false positive pixels. In this paper, we propose a Wave-like Class Activation Map (WaveCAM) from the perspective o...
Article
Accurate and efficient keypoint detection and description is a fundamental step in various computer vision tasks. In this paper, we extract robust descriptors and detect accurate keypoints by learning local Features with Domain adaptation (DomainFeat). Specifically, our DomainFeat includes image-level domain invariance supervision, pixel-level doma...
Article
Automatically extracting roads from very high resolution (VHR) remote sensing images is of great importance in a wide range of remote sensing applications. However, the complex shapes of roads (i.e., long, geometrically deformed, and thin) always affect the extraction accuracy, which is one of the challenges of road extraction. Based on the insigh...
Article
Textureless objects, repetitive patterns and limited computational resources pose significant challenges to man-made structure reconstruction from images, because feature-points-based reconstruction methods usually fail due to the lack of distinct texture or ambiguous point matches. Meanwhile multi-view stereo approaches also suffer from high compu...
Article
The point pair feature (PPF) is widely used in manufacturing for estimating 6-D poses. The key to the success of PPF matching is to establish correct 3-D correspondences between the object and the scene, i.e., finding as many valid similar point pairs as possible. However, efficient sampling of point pairs has been overlooked in existing framewor...
Chapter
Since the morphology of retinal vessels plays a pivotal role in clinical diagnosis of eye-related diseases and diabetic retinopathy, retinal vessels segmentation is an indispensable step for the screening and diagnosis of retinal diseases, yet it is still a challenging problem due to the complex structure of retinal vessels. Current retinal vessels...
Article
With its unique advantages of high flexibility and high efficiency, UAV has become a reasonable substitute for conventional aerial measurement technology. Especially in the low altitude remote sensing image processing, the ortho-rectification and mosaic of aerial images are the key to vision-based UAV orthoimage generation. Therefore, how to select...
Article
Limited by the locality of convolutional neural networks, most existing local features description methods only learn local descriptors with local information and lack awareness of global and surrounding spatial context. In this work, we focus on making local descriptors "look wider to describe better" by learning local Descriptors with More Than...
Preprint
Registering urban point clouds is a quite challenging task due to the large-scale, noise and data incompleteness of LiDAR scanning data. In this paper, we propose SARNet, a novel semantic augmented registration network aimed at achieving efficient registration of urban point clouds at city scale. Different from previous methods that construct corre...
Article
Full-text available
Fast and accurate semantic analysis of natural disaster images is crucial for rational rescue plans and resource allocation. However, the scarcity of meticulously labelled datasets and the ignorance of region-of-interest scale variations of popular general-purpose methods lead to undesirable performance. In this paper, we propose a novel triple-str...
Article
Full-text available
Recent learning-based approaches show promising performance improvement for the scene text removal task but usually leave several remnants of text and provide visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stro...
Preprint
Recent studies show that deep neural networks (DNNs) have achieved great success in various tasks. However, even the state-of-the-art deep learning based classifiers are extremely vulnerable to adversarial examples, resulting in sharp decay of discrimination accuracy in the presence of enormous unknown attacks. Given the fact that neural...
Article
Instance segmentation in biological images is an important task in the field of biological images and biomedical analysis. Different from the instance segmentation of natural image scenes, this task is still challenging because there are a large number of overlapping objects with similar appearance as well as great variability in shape, size and te...
Article
The complicated arrangement of pipes in narrow spaces leads to random orientation of the mechanical water meter dial, while its digit wheels are rotated by arbitrary angles, which makes the detection and recognition of meter readings more difficult. Even the latest visual network technology cannot deal with the challenge...
Preprint
The growing size of point clouds enlarges consumptions of storage, transmission, and computation of 3D scenes. Raw data is redundant, noisy, and non-uniform. Therefore, simplifying point clouds for achieving compact, clean, and uniform points is becoming increasingly important for 3D vision and graphics tasks. Previous learning based methods aim to...
Preprint
Limited by the locality of convolutional neural networks, most existing local features description methods only learn local descriptors with local information and lack awareness of global and surrounding spatial context. In this work, we focus on making local descriptors "look wider to describe better" by learning local Descriptors with More Than j...
Article
Existing deep models for facade parsing often fail in classifying pixels in heavily occluded regions of facade images due to the difficulty in feature representation of these pixels. In this paper, we solve facade parsing with occlusions by progressive feature learning. To this end, we locate the regions contaminated by occlusions via Bayesian unce...
Article
For a long time, the local descriptors learning for image matching benefits from the use of L2 normalization, which projects the descriptor space onto the hypersphere. However, there is no free lunch in the world. Although hypersphere description space stabilizes the optimization and improves the repeatability of the descriptors, it causes the desc...
Article
Full-text available
We present TreePartNet, a neural network aimed at reconstructing tree geometry from point clouds obtained by scanning real trees. Our key idea is to learn a natural neural decomposition exploiting the assumption that a tree comprises locally cylindrical shapes. In particular, reconstruction is a two-step process. First, two networks are used to de...
Preprint
Full-text available
Reconstructing high-fidelity 3D facial texture from a single image is a challenging task because of the lack of complete face information and the domain gap between the 3D face and 2D image. The most recent works tackle the facial texture reconstruction problem by applying either generation-based or reconstruction-based methods. Although each method has its...
Chapter
Medical image segmentation is essential for disease diagnosis analysis. Many variants of U-Net that are based on attention mechanisms and dense connections have made progress. However, CNN-based U-Net lacks the ability to capture the global context, and the context information of different scales is not effectively integrated. These limita...
Article
Specular reflections pose great challenges to various multimedia and computer vision tasks, e.g., image segmentation, detection and matching. In this paper, we build a large-scale Paired Specular-Diffuse (PSD) image dataset, where the images are carefully captured by using real-world objects and the ground-truth specular-free diffuse images are...
Article
Automatic registration of point clouds captured by terrestrial laser scanning (TLS) plays an important role in many fields including remote sensing (e.g., transportation management, 3-D reconstruction in large-scale urban areas and environment monitoring), computer vision, and virtual reality and robotics. However, noise, outliers, nonuniform point...
Article
We propose a framework to generate customized summarizations of visual data collections, such as collections of images, materials, 3D shapes, and 3D scenes. We assume that the elements in the visual data collections can be mapped to a set of vectors in a feature space, in which a fitness score for each element can be defined, and we pose the proble...
Article
Existing physical cloth simulators suffer from expensive computation and difficulties in tuning mechanical parameters to get desired wrinkling behaviors. Data-driven methods provide an alternative solution. They typically synthesize cloth animation at a much lower computational cost, and also create wrinkling effects that are similar to the trainin...
Article
We present a novel and efficient approach to estimate 6D object poses of known objects in complex scenes represented by point clouds. Our approach is based on the well-known point pair feature (PPF) matching, which utilizes self-similar point pairs to compute potential matches and thereby cast votes for the object pose by a voting scheme. The main...
Article
Full-text available
Intelligent grasping expects that the manipulator has the ability to grasp objects with a high degree of freedom in a wild (unstructured) environment. Due to low perception ability in handling targets and environments, most industrial robots are limited to top-down 4-DoF grasping. In this work, we propose a novel low-cost coarse to fine robotic gr...
Article
Full-text available
Automatic understanding of floor plan images is a key component of various applications. Due to the style diversity of rural housing design, the latest learning-based approaches cannot achieve satisfactory recognition results. In this paper, we present a new framework for parsing floor plans of rural residence that combines semantic neural networks...
Chapter
Full-text available
Accurate image keypoints detection and description are of central importance in a wide range of applications. Although there are various studies proposed to address these challenging tasks, they are far from optimal. In this paper, we devise a model named MLIFeat with two novel light-weight modules for multi-level information fusion based deep loca...
Preprint
In physics-based cloth animation, rich folds and detailed wrinkles are achieved at the cost of expensive computational resources and huge labor tuning. Data-driven techniques make efforts to reduce the computation significantly by a database. One type of methods relies on human poses to synthesize fitted garments which cannot be applied to general...
Article
Meaningful feature curves provide high-level shape representation of the geometrical shapes and are useful in various applications. In this paper, we propose an automatic method on the basis of the quadric surface fitting technique to extract complete feature curve networks (FCNs) from 3D surface meshes, as well as finding cycles and generating a h...
Article
Full-text available
Recognizing and fitting shape primitives from underlying 3D models is a key component of many computer graphics applications. Although there exist many structure recovery methods, they usually fail to identify blending surfaces, which are small transition regions between relatively large primary patches. To address this issue, we present a novel a...
Article
The traditional stem model is inconsistent with the real geometry of the stem. Terrestrial laser scanning (TLS) provides a possibility of constructing a realistic stem model. In this study, we present a 3D stem model, which includes the stem axis curve and stem cross-sectional profile curve, with geometrical consistency and stem parameter retrieval...
Preprint
Full-text available
Recent learning-based approaches show promising performance improvement for the scene text removal task. However, these methods usually leave some remnants of text and obtain visually unpleasant results. In this work, we propose a novel "end-to-end" framework based on accurate text stroke detection. Specifically, we decouple the text removal problem in...
Article
Advanced computer graphics rendering software tools can now produce computer-generated (CG) images with an increasingly high level of photorealism. This makes it more and more difficult to distinguish natural images (NIs) from CG images by naked human eyes. For this forensic problem, recently some CNN (convolutional neural network)-based methods have b...
Article
In this article, we present a survey on surface remeshing techniques, classifying all collected articles in different categories and analyzing specific methods with their advantages, disadvantages, and possible future improvements. Following the systematic literature review methodology, we define step-by-step guidelines throughout the review proces...
Article
The good fusion of multi-scale features obtained by Convolutional neural networks (CNNs) is key to semantic edge detection; however, obtaining fusion is challenging. This paper presents a Multi-scale Spatial Context-based deep network for Semantic Edge Detection (MSC-SED). Different from state-of-the-art methods, MSC-SED gradually fuses multi-scale...
Preprint
Accurate 2D lung nodules segmentation from medical Computed Tomography (CT) images is crucial in medical applications. Most current approaches cannot achieve precise segmentation results that preserve both rich edge details description and smooth transition representations between image regions due to the tininess, complexities, and irregularitie...
Article
We propose a novel framework for computing descriptors for characterizing points on three-dimensional surfaces. First, we present a new non-learned feature that uses graph wavelets to decompose the Dirichlet energy on a surface. We call this new feature Wavelet Energy Decomposition Signature (WEDS). Second, we propose a new Multiscale Graph Convolu...
Article
We introduce an inverse procedural modeling approach that learns L-system representations of pixel images with branching structures. Our fully automatic model generates a compact set of textual rewriting rules that describe the input. We use deep learning to discover atomic structures such as line segments or branchings. Orientation and scaling of...
Article
Accurate depth estimation from images is a fundamental problem in computer vision. In this paper, we propose an unsupervised learning based method to predict high-quality depth map from multiple images. A novel multi-view constrained DenseDepthNet is designed for this task. Our DenseDepthNet can effectively leverage both the low-level and high-leve...
Preprint
Adversarial examples have been well known as a serious threat to deep neural networks (DNNs). In this work, we study the detection of adversarial examples, based on the assumption that the output and internal responses of one DNN model for both adversarial and benign examples follow the generalized Gaussian distribution (GGD), but with different pa...
Article
Stereo image completion (SIC) is to fill holes existing in a pair of stereo images. SIC is more complicated than single image repairing, which needs to complete the pair of images while keeping their stereoscopic consistency. In recent years, deep learning has been introduced into single image repairing but seldom used for SIC. The authors present...
Article
Full-text available
This paper presents an effective framework for correspondence field estimation. The core idea is to construct pixel-level and superpixel-level patch matching to achieve high accuracy estimation as well as fast speed computation. To this end, a hybrid edge-preserving supported weighting approach is first developed, which contributes to better perfor...
Preprint
Existing physical cloth simulators suffer from expensive computation and difficulties in tuning mechanical parameters to get desired wrinkling behaviors. Data-driven methods provide an alternative solution. They typically synthesize cloth animation at a much lower computational cost, and also create wrinkling effects that highly resemble the much c...
Article
Full-text available
A discriminative local shape descriptor plays an important role in various applications. In this paper, we present a novel deep learning framework that derives discriminative local descriptors for deformable 3D shapes. We use local “geometry images” to encode the multi-scale local features of a point, via an intrinsic parameterization method based...
Preprint
Full-text available
We propose a novel framework for computing descriptors for characterizing points on three-dimensional surfaces. First, we present a new non-learned feature that uses graph wavelets to decompose the Dirichlet energy on a surface. We call this new feature wavelet energy decomposition signature (WEDS). Second, we propose a new multiscale graph convolu...
Article
Full-text available
Particle techniques mainly deal with physically-based particle variations for phenomena and shapes in computer animation and geometric modeling. Typical particle techniques include smoothed particle hydrodynamics (SPH) simulation for animation and particle systems for shape modeling. SPH simulation and particle systems are meshfree methods and have...
Conference Paper
Full-text available
Image colorization achieves more and more realistic results with the increasing power of recent deep learning techniques. It becomes more difficult to identify the synthetic colorized images by human eyes. In the literature, handcrafted-feature-based and convolutional neural network (CNN)-based forensic methods are proposed to distinguish between n...
Conference Paper
Full-text available
In the field of image forensics, many convolutional neural network (CNN)-based forensic methods have been proposed and generally achieved the state-of-the-art performance. However, some questions are worth studying and answering regarding the trustworthiness of such methods, including for example the appropriateness of the discriminative informati...
Article
Full-text available
Face alignment and segmentation are challenging problems which have been extensively studied in the field of multimedia. These two tasks are closely related and their learning processes are supposed to benefit each other. Hence, we present a joint multi-task learning algorithm for both face alignment and segmentation using deep convolutional neural...