Xiaopeng Zhang

Chinese Academy of Sciences | CAS · National Lab. on Pattern Recognition

Professor

255
Publications
67,737
Reads
3,085
Citations

Publications (255)
Preprint
Registering urban point clouds is challenging due to the large scale, noise, and incompleteness of LiDAR scanning data. In this paper, we propose SARNet, a novel semantic augmented registration network aimed at achieving efficient registration of urban point clouds at city scale. Different from previous methods that construct corre...
Article
Full-text available
Fast and accurate semantic analysis of natural disaster images is crucial for rational rescue plans and resource allocation. However, the scarcity of meticulously labelled datasets and the failure of popular general-purpose methods to account for region-of-interest scale variations lead to undesirable performance. In this paper, we propose a novel triple-str...
Article
Full-text available
Recent learning-based approaches show promising performance improvement for the scene text removal task but usually leave several remnants of text and provide visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stro...
Article
Instance segmentation in biological images is an important task in the field of biological images and biomedical analysis. Different from the instance segmentation of natural image scenes, this task is still challenging because there are a large number of overlapping objects with similar appearance as well as great variability in shape, size and te...
Article
The complicated arrangement of pipes in narrow spaces leads to random orientation of the mechanical water meter dial, while its digit wheels rotate by arbitrary angles, which makes the detection and recognition of meter readings more difficult. Even the latest visual network technology cannot deal with the challenge...
Preprint
Limited by the locality of convolutional neural networks, most existing local feature description methods only learn local descriptors with local information and lack awareness of global and surrounding spatial context. In this work, we focus on making local descriptors "look wider to describe better" by learning local Descriptors with More Than j...
Article
Existing deep models for facade parsing often fail in classifying pixels in heavily occluded regions of facade images due to the difficulty in feature representation of these pixels. In this paper, we solve facade parsing with occlusions by progressive feature learning. To this end, we locate the regions contaminated by occlusions via Bayesian unce...
Preprint
Full-text available
Reconstructing high-fidelity 3D facial texture from a single image is a challenging task because of the lack of complete face information and the domain gap between the 3D face and 2D image. The most recent works tackle the facial texture reconstruction problem by applying either generation-based or reconstruction-based methods. Although each method has its...
Chapter
Medical image segmentation is essential for disease diagnosis analysis. Many variants of U-Net based on attention mechanisms and dense connections have made progress. However, CNN-based U-Net lacks the ability to capture the global context, and the context information of different scales is not effectively integrated. These limita...
Article
Specular reflections pose great challenges to various multimedia and computer vision tasks, e.g., image segmentation, detection and matching. In this paper, we build a large-scale Paired Specular-Diffuse (PSD) image dataset, where the images are carefully captured by using real-world objects and the ground-truth specular-free diffuse images are pro...
Article
Automatic registration of point clouds captured by terrestrial laser scanning (TLS) plays an important role in many fields including remote sensing (e.g., transportation management, 3-D reconstruction in large-scale urban areas and environment monitoring), computer vision, and virtual reality and robotics. However, noise, outliers, nonuniform point...
Article
We propose a framework to generate customized summarizations of visual data collections, such as collections of images, materials, 3D shapes, and 3D scenes. We assume that the elements in the visual data collections can be mapped to a set of vectors in a feature space, in which a fitness score for each element can be defined, and we pose the proble...
Article
Existing physical cloth simulators suffer from expensive computation and difficulties in tuning mechanical parameters to get desired wrinkling behaviors. Data-driven methods provide an alternative solution. They typically synthesize cloth animation at a much lower computational cost, and also create wrinkling effects that are similar to the trainin...
Article
We present a novel and efficient approach to estimate 6D object poses of known objects in complex scenes represented by point clouds. Our approach is based on the well-known point pair feature (PPF) matching, which utilizes self-similar point pairs to compute potential matches and thereby cast votes for the object pose by a voting scheme. The main...
Article
Full-text available
Intelligent grasping requires that the manipulator be able to grasp objects with a high degree of freedom in a wild (unstructured) environment. Due to low perception ability in handling targets and environments, most industrial robots are limited to top-down 4-DoF grasping. In this work, we propose a novel low-cost coarse-to-fine robotic gr...
Article
Full-text available
Automatic understanding of floor plan images is a key component of various applications. Due to the style diversity of rural housing design, the latest learning-based approaches cannot achieve satisfactory recognition results. In this paper, we present a new framework for parsing floor plans of rural residences that combines semantic neural networks...
Chapter
Full-text available
Accurate image keypoint detection and description are of central importance in a wide range of applications. Although various studies have been proposed to address these challenging tasks, they are far from optimal. In this paper, we devise a model named MLIFeat with two novel light-weight modules for multi-level information fusion based deep loca...
Preprint
In physics-based cloth animation, rich folds and detailed wrinkles are achieved at the cost of expensive computational resources and laborious tuning. Data-driven techniques aim to reduce the computation significantly by using a database. One type of method relies on human poses to synthesize fitted garments, which cannot be applied to general...
Article
Meaningful feature curves provide high-level shape representation of the geometrical shapes and are useful in various applications. In this paper, we propose an automatic method on the basis of the quadric surface fitting technique to extract complete feature curve networks (FCNs) from 3D surface meshes, as well as finding cycles and generating a h...
Article
Full-text available
Recognizing and fitting shape primitives from underlying 3D models is a key component of many computer graphics applications. Although many structure recovery methods exist, they usually fail to identify blending surfaces, which are small transition regions between relatively large primary patches. To address this issue, we present a novel a...
Article
The traditional stem model is inconsistent with the real geometry of the stem. Terrestrial laser scanning (TLS) provides a possibility of constructing a realistic stem model. In this study, we present a 3D stem model, which includes the stem axis curve and stem cross-sectional profile curve, with geometrical consistency and stem parameter retrieval...
Preprint
Full-text available
Recent learning-based approaches show promising performance improvement for the scene text removal task. However, these methods usually leave some remnants of text and obtain visually unpleasant results. In this work, we propose a novel "end-to-end" framework based on accurate text stroke detection. Specifically, we decouple the text removal problem in...
Article
Advanced computer graphics rendering software tools can now produce computer-generated (CG) images with an increasingly high level of photorealism. This makes it more and more difficult to distinguish natural images (NIs) from CG images with the naked eye. For this forensic problem, recently some convolutional neural network (CNN)-based methods have b...
Article
In this article, we present a survey on surface remeshing techniques, classifying all collected articles in different categories and analyzing specific methods with their advantages, disadvantages, and possible future improvements. Following the systematic literature review methodology, we define step-by-step guidelines throughout the review proces...
Article
Good fusion of multi-scale features obtained by convolutional neural networks (CNNs) is key to semantic edge detection; however, achieving such fusion is challenging. This paper presents a Multi-scale Spatial Context-based deep network for Semantic Edge Detection (MSC-SED). Different from state-of-the-art methods, MSC-SED gradually fuses multi-scale...
Preprint
Accurate 2D lung nodule segmentation from medical Computed Tomography (CT) images is crucial in medical applications. Most current approaches cannot achieve precise segmentation results that preserve both rich edge details and smooth transition representations between image regions due to the tininess, complexities, and irregularitie...
Article
We propose a novel framework for computing descriptors for characterizing points on three-dimensional surfaces. First, we present a new non-learned feature that uses graph wavelets to decompose the Dirichlet energy on a surface. We call this new feature Wavelet Energy Decomposition Signature (WEDS). Second, we propose a new Multiscale Graph Convolu...
Article
We introduce an inverse procedural modeling approach that learns L-system representations of pixel images with branching structures. Our fully automatic model generates a compact set of textual rewriting rules that describe the input. We use deep learning to discover atomic structures such as line segments or branchings. Orientation and scaling of...
Article
Accurate depth estimation from images is a fundamental problem in computer vision. In this paper, we propose an unsupervised learning-based method to predict high-quality depth maps from multiple images. A novel multi-view constrained DenseDepthNet is designed for this task. Our DenseDepthNet can effectively leverage both the low-level and high-leve...
Preprint
Adversarial examples have been well known as a serious threat to deep neural networks (DNNs). In this work, we study the detection of adversarial examples, based on the assumption that the output and internal responses of one DNN model for both adversarial and benign examples follow the generalized Gaussian distribution (GGD), but with different pa...
Article
Full-text available
This paper presents an effective framework for correspondence field estimation. The core idea is to construct pixel-level and superpixel-level patch matching to achieve high accuracy estimation as well as fast speed computation. To this end, a hybrid edge-preserving supported weighting approach is first developed, which contributes to better perfor...
Preprint
Existing physical cloth simulators suffer from expensive computation and difficulties in tuning mechanical parameters to get desired wrinkling behaviors. Data-driven methods provide an alternative solution. They typically synthesize cloth animation at a much lower computational cost, and also create wrinkling effects that highly resemble the much c...
Article
Full-text available
A discriminative local shape descriptor plays an important role in various applications. In this paper, we present a novel deep learning framework that derives discriminative local descriptors for deformable 3D shapes. We use local “geometry images” to encode the multi-scale local features of a point, via an intrinsic parameterization method based...
Preprint
Full-text available
We propose a novel framework for computing descriptors for characterizing points on three-dimensional surfaces. First, we present a new non-learned feature that uses graph wavelets to decompose the Dirichlet energy on a surface. We call this new feature wavelet energy decomposition signature (WEDS). Second, we propose a new multiscale graph convolu...
Article
Full-text available
Particle techniques mainly deal with physically-based particle variations for phenomena and shapes in computer animation and geometric modeling. Typical particle techniques include smoothed particle hydrodynamics (SPH) simulation for animation and particle systems for shape modeling. SPH simulation and particle systems are meshfree methods and have...
Conference Paper
Full-text available
Image colorization achieves more and more realistic results with the increasing power of recent deep learning techniques. It is becoming more difficult for human eyes to identify synthetic colorized images. In the literature, handcrafted-feature-based and convolutional neural network (CNN)-based forensic methods have been proposed to distinguish between n...
Conference Paper
Full-text available
In the field of image forensics, many convolutional neural network (CNN)-based forensic methods have been proposed and generally achieved state-of-the-art performance. However, some questions are worth studying and answering regarding the trustworthiness of such methods, including for example the appropriateness of the discriminative informati...
Article
Full-text available
Face alignment and segmentation are challenging problems which have been extensively studied in the field of multimedia. These two tasks are closely related and their learning processes are supposed to benefit each other. Hence, we present a joint multi-task learning algorithm for both face alignment and segmentation using deep convolutional neural...
Preprint
Full-text available
Image colorization achieves more and more realistic results with the increasing computation power of recent deep learning techniques. It is becoming more difficult for human eyes to identify fake colorized images. In this work, we propose a novel forensic method to distinguish between natural images (NIs) and colorized images (CIs) based on convolut...
Article
Full-text available
3D point cloud classification is one of the basic topics in multimedia analysis and understanding. By the construction of the discriminant model and efficient parameter optimization, point cloud classification can be achieved after the training. However, most parameter optimization methods do not guarantee the highest global classification accuracy...
Article
We present a deep learning framework for efficient large-scale 3D point cloud analysis and classification using the designed feature description matrix (FDM). As the 3D points are unordered in the large-scale scene, and no topology structure can be employed directly for classification and recognition, it is difficult to apply deep neural network di...
Conference Paper
With the development of pedestrian detection technologies, existing methods cannot simultaneously satisfy high-quality detection and fast calculation for practical applications, especially for accurate 3D locating and tracking of basketball players. We propose an algorithm which can robustly and automatically locate and track basketball players fro...
Book
In the past decade, with the power of graphical boards, visualization of virtual natural scenes has become popular in multimedia applications. It usually relies on degraded virtual single plant geometrical representations and massive use of textures. Such scenes, however, show poor variability in terms of the number of species and individual plasticity. T...
Article
Full-text available
The full-image-based kernel estimation strategy is usually susceptible to the impact of smooth and fine-scale background regions, and it is time-consuming for large-size image deblurring. Since not all the pixels in the blurred image are informative and it is common to restore human-interested objects in the foreground rather than the background, we p...
Article
Full-text available
Interactive stereo image segmentation (i.e., cutting out objects from stereo pairs with limited user assistance) is an important research topic in computer vision. Given a pair of images, users mark a few foreground/background pixels, based on which prior models are formulated for labeling unknown pixels. Note that color priors might not help if th...
Article
Full-text available
We introduce a new approach for procedural modeling. Our main idea is to select shapes using selection-expressions instead of simple string matching used in current state-of-the-art grammars like CGA shape and CGA++. A selection-expression specifies how to select a potentially complex subset of shapes from a shape hierarchy, e.g. "select all tall w...
Conference Paper
Full-text available
Square forms of photos are widely used in social media as album covers or thumbnails of image streams. In this study, we realize photo squarization by modeling Retargeting Visual Perception Issues, which reflect human perception preference toward image retargeting. General image retargeting techniques deal with three common issues, namely, salient...
Article
Full-text available
With the development of pedestrian detection technologies, existing methods cannot simultaneously satisfy high-quality detection and fast calculation for practical applications. Therefore, the goal of our research is to balance the accuracy and efficiency of pedestrian detection, and thereby obtain a relatively better method compared with curr...
Article
Illumination estimation is an essential problem in computer vision, graphics and augmented reality. In this paper, we propose a learning based method to recover low‐frequency scene illumination represented as spherical harmonic (SH) functions by pairwise photos from rear and front cameras on mobile devices. An end‐to‐end deep convolutional neural n...
Article
Full-text available
In this paper, we describe a novel procedural modeling technique for generating realistic plant models from multi-view photographs. The realism is enhanced via visual and spatial information acquired from images. In contrast to previous approaches that heavily rely on user interaction to segment plants or recover branches in images, our method auto...
Article
The Guided Filter (GF) is a widely used smoothing tool in computer vision and image processing. However, to the best of our knowledge, few papers investigate the mathematical connection between this filter and the least squares optimization. In this paper, we first interpret the guided filter as the cyclic coordinate descent solver of a least squar...
Chapter
In this paper, we present a novel deep learning framework that derives discriminative local descriptors for 3D surface shapes. In contrast to previous convolutional neural networks (CNNs) that rely on rendering multi-view images or extracting intrinsic shape properties, we parameterize the multi-scale localized neighborhoods of a keypoint into regu...
Article
High-definition ancient paintings have complex structures composed of interlaced drawing curves and distinct canvas textures. Aiming at restoring high-definition ancient Chinese paintings naturally, this paper presents an interactive repairing method based on decomposition of drawing curves. First, a painting is parsed into contents and canvases....