Kun Li

Tianjin University · School of Computer Science and Technology

Ph.D.

About

Publications: 79
Reads: 8,824
Citations: 1,526
Citations since 2017
59 Research Items
1,287 Citations

Publications (79)
Article
The disparity information reflects pixel-wise inter-view correlations among sub-aperture images (SAIs) of a light field (LF) image. Existing CNN-based methods for LF spatial super-resolution (SR) rarely utilize the disparities to incorporate the underlying inter-view correlations due to the lack of ground-truth disparity labels. In this paper, we i...
Article
Full-text available
Novel viewpoint image synthesis is very challenging, especially from sparse views, due to large changes in viewpoint and occlusion. Existing image-based methods fail to generate reasonable results for invisible regions, while geometry-based methods have difficulties in synthesizing detailed textures. In this paper, we propose STATE, an end-to-end d...
Preprint
Naturally controllable human-scene interaction (HSI) generation has an important role in various fields, such as VR/AR content creation and human-centered AI. However, existing methods are unnatural and unintuitive in their controllability, which heavily limits their application in practice. Therefore, we focus on a challenging task of naturally an...
Preprint
Full-text available
Image-based multi-person reconstruction in wide-field large scenes is critical for crowd analysis and security alerts. However, existing methods cannot deal with large scenes containing hundreds of people, which pose the challenges of a large number of people, large variations in human scale, and complex spatial distribution. In this paper, we pr...
Article
Depth sensing is essential for intelligent computer vision applications, but it often suffers from low range precision and spatial resolution. To address this problem, we propose a novel framework that combines non-uniform sampling and reconstruction based on graph theory. Our framework consists of two main components: (1) a graph Laplacian induced...
Chapter
3D face reconstruction from a single image is a challenging problem, especially under partial occlusions and extreme poses. This is because the uncertainty of the estimated 2D landmarks will affect the quality of face reconstruction. In this paper, we propose a novel joint 2D and 3D optimization method to adaptively reconstruct 3D face shapes from...
Preprint
Nowadays, online screen sharing and remote cooperation are becoming ubiquitous. However, the screen content may be downsampled and compressed during transmission, while it may be displayed on large screens or the users would zoom in for detail observation at the receiver side. Therefore, developing a strong and effective screen content image (SCI)...
Article
Accurately estimating the human inner-body under clothing is very important for body measurement, virtual try-on and VR/AR applications. In this paper, we propose the first method to allow everyone to easily reconstruct their own 3D inner-body under daily clothing from a self-captured video with the mean reconstruction error of 0.73 cm within 15 s....
Preprint
The advent of deep learning has led to significant progress in monocular human reconstruction. However, existing representations, such as parametric models, voxel grids, meshes and implicit neural representations, have difficulties achieving high-quality results and real-time speed at the same time. In this paper, we propose Fourier Occupancy Field...
Preprint
Full-text available
Nowadays, there is an explosive growth of screen contents due to the wide application of screen sharing, remote cooperation, and online education. To match the limited terminal bandwidth, high-resolution (HR) screen contents may be downsampled and compressed. At the receiver side, the super-resolution (SR) of low-resolution (LR) screen content imag...
Article
The detection of clouds in remote sensing (RS) images is an important task, and convolutional neural networks (CNNs) have been used to perform it. However, supervised cloud detection CNNs rely heavily on a large number of samples annotated at pixel level to tune their parameters. Annotating RS images is a labor-intensive procedure and requires exper...
Article
Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent realistic 3D facial animation. The encoder is designed with dense connections to str...
Article
3D human reconstruction from a single image is a challenging problem. Existing methods have difficulty inferring 3D clothed human models with consistent topologies for various poses. In this paper, we propose an efficient and effective method using a hierarchical graph transformation network. To deal with large deformations and avoid distorted geo...
Article
This paper proposes an intrinsic decomposition method from a single RGB-D image. To remedy the highly ill-conditioned problem, the reflectance component is regularized by a sparsity term, which is weighted by a bilateral kernel to exploit non-local structural correlation. As shading images are piece-wise smooth and have sparse gradient fields, the...
Article
Most convolutional neural network (CNN)-based cloud detection methods are built upon the supervised learning framework that requires a large number of pixel-level labels. However, it is expensive and time-consuming to manually annotate pixelwise labels for massive remote sensing images. To reduce the labeling cost, we propose an unsupervised domain...
Article
Cloud detection in optical imagery has drawn remarkable attention in the era of big Earth observation data analytics. While multiple supervised learning models have been developed for this purpose, large volumes of paired training samples annotated at the pixel level are essential to ensure the model's generalization capacity. However, constructing...
Preprint
Person image synthesis, e.g., pose transfer, is a challenging problem due to large variation and occlusion. Existing methods have difficulties predicting reasonable invisible regions and fail to decouple the shape and style of clothing, which limits their applications on person image editing. In this paper, we propose PISE, a novel two-stage genera...
Article
Full-text available
Low‐light images suffer from poor visibility and noise. In this paper, a low‐light image enhancement method based on Retinex decomposition is proposed. A pyramid network is first utilized to extract multi‐scale features to improve the quality of Retinex decomposition. Then the decomposed illumination is refined via an adaptive Gamma correc...
Preprint
Human pose transfer, which aims at transferring the appearance of a given person to a target pose, is very challenging and important in many applications. Previous work ignores the guidance of pose features or only uses local attention mechanism, leading to implausible and blurry results. We propose a new human pose transfer method using a generati...
Article
Optical satellite images are often affected by hazy atmospheric conditions, which degrade the quality of remote sensing (RS) data and reduce the accuracy of interpretation and classification. Hence, haze removal becomes a necessary preprocessing step for most applications of RS images. In this article, we propose a novel haze removal method...
Article
Full-text available
Estimating correspondence between two shapes continues to be a challenging problem in geometry processing. Most current methods assume deformation to be near-isometric, however this is often not the case. For this paper, a collection of shapes of different animals has been curated, where parts of the animals (e.g., mouths, tails & ears) correspond...
Article
Human pose transfer, which aims at transferring the appearance of a given person to a target pose, is very challenging and important in many applications. Previous work ignores the guidance of pose features or only uses local attention mechanism, leading to implausible and blurry results. We propose a new human pose transfer method using a generati...
Preprint
Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent realistic 3D facial animation. The encoder is designed with dense connections to streng...
Article
Multispectral remote sensing (RS) images are often contaminated by the haze that degrades the quality of RS data and reduces the accuracy of interpretation and classification. Recently, the emerging deep convolutional neural networks (CNNs) provide us new approaches for RS image dehazing. Unfortunately, the power of CNNs is limited by the lack of s...
Article
This paper addresses the challenge of 3D motion recovery by exploiting the spatio-temporal correlations of corrupted 3D skeleton sequences. We propose a new 3D motion recovery method using spatio-temporal reconstruction, which uses joint low-rank and sparse priors to exploit temporal correlation and an isometric constraint for spatial correlation....
Article
Full-text available
Cloud detection is a crucial preprocessing step for optical satellite remote sensing (RS) images. This article focuses on the cloud detection for RS imagery with cloud-snow coexistence and the utilization of the satellite thumbnails that lose considerable amount of high resolution and spectrum information of original RS images to extract cloud mask...
Article
Full-text available
Based on measuring the polarimetric parameters which contain specific physical information, polarimetric imaging has been widely applied to various fields. However, in practice, the noise during image acquisition could lead to the output of noisy polarimetric images. In this paper, we propose, for the first time to our knowledge, a learning-based m...
Article
This paper proposes a new method for simultaneous 3D reconstruction and semantic segmentation for indoor scenes. Unlike existing methods that require recording a video using a color camera and/or a depth camera, our method only needs a small number of (e.g., 3~5) color images from uncalibrated sparse views, which significantly simplifies data ac...
Chapter
Depth super-resolution (SR) with color guidance is a classic vision problem to upsample low-resolution depth images. It has a wide range of applications in 3D reconstruction, automotive driver assistance and augmented reality. Due to the easy acquirement of the aligned high-resolution color images, there have been many depth SR approaches with colo...
Article
Separating moving objects and backgrounds from a video is an important yet challenging task for video analysis due to complex moving behaviors, camera jitters/movements, and huge data amount in real-world applications. To deal with these issues, this paper proposes a unified framework called spatiotemporally scalable matrix recovery (SSMR), which h...
Article
In this work, we introduce multi‐column graph convolutional networks (MGCNs), a deep generative model for 3D mesh surfaces that effectively learns a non‐linear facial representation. We perform spectral decomposition of meshes and apply convolutions directly in the frequency domain. Our network architecture involves multiple columns of graph convol...
Article
Nowadays, haze is a common and serious problem, and PM2.5 is a main measure of air quality. Current methods estimate the level of the primary pollutant with professional instruments, which is expensive and inconvenient. Moreover, with haze, the captured images are unclear, making it difficult to estimate the depth of the scene using passi...
Preprint
This paper proposes a new method for simultaneous 3D reconstruction and semantic segmentation of indoor scenes. Unlike existing methods that require recording a video using a color camera and/or a depth camera, our method only needs a small number of (e.g., 3-5) color images from uncalibrated sparse views as input, which greatly simplifies data acq...
Article
We present a novel global non-rigid registration method for dynamic 3D objects. Our method allows objects to undergo large non-rigid deformations, and achieves high quality results even with substantial pose change or camera motion between views. In addition, our method does not require a template prior and uses less raw data than tracking based me...
Article
Most tensor completion methods assume that missing entries are randomly distributed in incomplete tensors, and the low-rank prior or its variants are used to well pose the problem. However, this could be violated in practical applications where missing entries are not only randomly but also structurally distributed. To remedy this, this paper propo...
Article
Non-rigid registration is challenging because it is ill-posed with high degrees of freedom and is thus sensitive to noise and outliers. We propose a robust non-rigid registration method using reweighted sparsities on position and transformation to estimate the deformations between 3-D shapes. We formulate the energy function with position and trans...
Article
Full-text available
Non-rigid registration is challenging because it is ill-posed with high degrees of freedom and is thus sensitive to noise and outliers. We propose a robust non-rigid registration method using reweighted sparsities on position and transformation to estimate the deformations between 3-D shapes. We formulate the energy function with dual sparsities on...
Article
Full-text available
Photorealistic animation is a desirable technique for computer games and movie production. We propose a new method to synthesize plausible videos of human actors with new motions using a single cheap RGB-D camera. A small database is captured in a usual office environment, which happens only once for synthesizing different motions. We propose a mar...
Article
Separation of video clips into foreground and background components is a useful and important technique, making recognition, classification and scene analysis more efficient. In this paper, we propose a motion-assisted matrix restoration (MAMR) model for foreground-background separation in video clips. In the proposed MAMR model, the backgrounds ac...
Article
This paper proposes a new approach for nonrigid structure from motion with occlusion, based on sparse representation. We address the occlusion problem based on the latest developments on sparse representation: matrix completion, which can recover the observation matrix that has high percentages of missing data and can also reduce the noises and out...
Article
Non-rigid registration of 3D shapes is an essential task of increasing importance as commodity depth sensors become more widely available for scanning dynamic scenes. Non-rigid registration is much more challenging than rigid registration as it estimates a set of local transformations instead of a single global transformation, and hence is prone to...
Article
This paper proposes a video super-resolution method based on an adaptive superpixel-guided auto-regressive (AR) model. Key-frames are automatically selected and super-resolved by a sparse regression method. Non-key-frames are super-resolved by exploiting the spatio-temporal correlations: the temporal correlation is exploited by an optical flow meth...
Article
This paper proposes a new video super-resolution method based on feature-guided variational optical flow. The key-frames are automatically selected and super-resolved using a method based on sparse regression. To overcome the blocking artifacts and deal with the case of small structures with large displacement, an efficient method based on feature-...
Article
Full-text available
With the advances of depth sensing technologies, color image plus depth information (referred to as RGB-D data hereafter) is more and more popular for comprehensive description of 3-D scenes. This paper proposes a two-stage segmentation method for RGB-D data: 1) oversegmentation by 3-D geometry enhanced superpixels and 2) graph-based merging with l...
Article
This paper proposes an adaptive color-guided autoregressive (AR) model for high quality depth recovery from low quality measurements captured by depth cameras. We observe and verify that the AR model tightly fits depth maps of generic scenes. The depth recovery task is formulated into a minimization of AR prediction errors subject to measurement co...
Conference Paper
This paper introduces novel 3-D geometry enhanced superpixels for RGB-D data. First, we reconstruct the 3-D geometry of the scene by projecting the depth map into 3-D coordinates. Then, a distance metric for superpixel clustering is constructed using 3-D geometry and color information. Finally, pixels are iteratively clustered into superpixels us...
Conference Paper
This paper proposes an adaptive color-guided auto-regressive (AR) model for high quality depth recovery from low quality measurements captured by depth cameras. We formulate the depth recovery task into a minimization of AR prediction errors subject to measurement consistency. The AR predictor for each pixel is constructed according to both the loc...
Article
Temporal-dense 3-D reconstruction for dynamic scenes is a challenging and important research topic in signal processing. Although dynamic scenes can be captured by multiple high frame rate cameras, high price, and large storage are still problematic for practical applications. To address this problem, we propose a new method for temporal-densely ca...
Conference Paper
This paper presents a new approach to image superresolution based on sparse representation. This problem is formulated as a compressive sensing system, in which an over-complete dictionary is used to sparsely represent a low resolution image and generate a high resolution image. We propose a method to adaptively construct training images by selecti...
Conference Paper
In this paper, we propose a novel color face feature extraction approach named statistically orthogonal analysis (SOA). It in turn calculates the projection transforms of the red, green and blue color component image sets by using the Fisher criterion, and simultaneously makes the obtained transforms mutually statistically orthogonal. SOA can enhan...
Article
Full-text available
Three-dimensional motion estimation from multiview video sequences is of vital importance to achieve high-quality dynamic scene reconstruction. In this paper, we propose a new 3-D motion estimation method based on matrix completion. Taking a reconstructed 3-D mesh as the underlying scene representation, this method automatically estimates motions o...
Article
We propose a new markerless shape and motion capture approach from multiview video sequences. The shape recovery method consists of two steps: separating and merging. In the separating step, the depth map represented with a point cloud for each view is generated by solving a proposed variational model, which is regularized by four constraints to en...
Article
This paper proposes a collaborative color calibration method for multi-camera systems. The multi-camera color calibration problem is formulated as an overdetermined linear system, in which a dynamic range shaping is incorporated to ensure the high contrasts for captured images. The cameras are calibrated with the parameters obtained by solving the...
Conference Paper
This paper proposes a new color calibration method for multi-camera systems with a novel omnidirectional color checker. The designed cylindrical color checker contains a periodic array of color patches, and is visible for all the cameras without manual adjustment. For color calibration, accurate global correspondences are first generated by local d...
Article
Full-text available
In this paper, a color transfer technique based on Adaptive Directional Wavelet Transform with Quincunx Sampling (ADWQS) is proposed to transfer color from a color reference image to a grayscale target image. Due to ADWQS's directional selectivity and symmetrical characteristic, the proposed scheme yields the best color transfer performance. We sea...
Article
Discrete wavelet transform is an effective tool to generate scalable stream, but it cannot efficiently represent edges which are not aligned in horizontal or vertical directions, while natural images often contain rich edges and textures of this kind. Hence, recently, intensive research has been focused particularly on the directional wavelets whic...
