Feng Wang

Tusimple · Algorithm Group

Doctor of Engineering

About

37 Publications
6,296 Reads
3,850 Citations
Additional affiliations
October 2016 - present
Johns Hopkins University
Position
  • PhD Student
September 2014 - present
University of Electronic Science and Technology of China
Position
  • PhD Student


Publications (37)
Article
Full-text available
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class diff...
Conference Paper
Full-text available
Thanks to the recent developments of Convolutional Neural Networks, the performance of face verification methods has increased rapidly. In a typical face verification method, feature normalization is a critical step for boosting performance. This motivates us to introduce and study the effect of normalization during training. But we find this is no...
Preprint
Animated video separates foreground and background elements into layers, with distinct processes for sketching, refining, coloring, and in-betweening. Existing video generation methods typically treat animation as a monolithic data domain, lacking fine-grained control over individual layers. In this paper, we introduce LayerAnimate, a novel archite...
Article
LiDAR-based fully sparse architecture has gained increasing attention. FSDv1 stands out as a representative work, achieving impressive efficacy and efficiency, albeit with intricate structures and handcrafted designs. In this paper, we present FSDv2, an evolution that aims to simplify the previous FSDv1 and eliminate the ad-hoc heuristics in its ha...
Preprint
LiDAR-based fully sparse architecture has garnered increasing attention. FSDv1 stands out as a representative work, achieving impressive efficacy and efficiency, albeit with intricate structures and handcrafted designs. In this paper, we present FSDv2, an evolution that aims to simplify the previous FSDv1 while eliminating the inductive bias introd...
Preprint
Radar is ubiquitous in autonomous driving systems due to its low cost and good adaptability to bad weather. Nevertheless, the radar detection performance is usually inferior because its point cloud is sparse and not accurate due to the poor azimuth and elevation resolution. Moreover, point cloud generation algorithms already drop weak signals to re...
Article
As the perception range of LiDAR expands, LiDAR-based 3D object detection contributes increasingly to long-range perception in autonomous driving. Mainstream 3D object detectors often build dense feature maps, where the cost is quadratic in the perception range, making them hard to scale up to long-range settings. To enable efficient lo...
Preprint
This paper aims for high-performance offline LiDAR-based 3D object detection. We first observe that experienced human annotators annotate objects from a track-centric perspective. They first label the objects with clear shapes in a track, and then leverage the temporal coherence to infer the annotations of obscure objects. Drawing inspiration from...
Preprint
Full-text available
As the perception range of LiDAR expands, LiDAR-based 3D object detection contributes increasingly to long-range perception in autonomous driving. Mainstream 3D object detectors often build dense feature maps, where the cost is quadratic in the perception range, making them hard to scale up to long-range settings. To enable efficient lo...
Preprint
Full-text available
As the perception range of LiDAR increases, LiDAR-based 3D object detection becomes a dominant task in long-range perception for autonomous driving. Mainstream 3D object detectors usually build dense feature maps in the network backbone and prediction head. However, the computational and spatial costs on the dense feature map are quadra...
Preprint
Full-text available
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases. Overlooking this difference, many 3D detectors directly follow the common practice of 2D detectors, which downsample the feature maps even after quantizing the point clouds. In this pap...
Preprint
Full-text available
LiDAR-based 3D detection in point clouds is essential in the perception system of autonomous driving. In this paper, we present LiDAR R-CNN, a second-stage detector that can generally improve any existing 3D detector. To fulfill the real-time and high-precision requirements in practice, we resort to a point-based approach rather than the popular voxel-b...
Preprint
Full-text available
In this paper, we propose an anchor-free single-stage LiDAR-based 3D object detector -- RangeDet. The most notable difference with previous works is that our method is purely based on the range view representation. Compared with the commonly used voxelized or Bird's Eye View (BEV) representations, the range view representation is more compact and w...
Preprint
Full-text available
In this work, a discriminatively learned CNN embedding is proposed for remote sensing image scene classification. Our proposed siamese network simultaneously computes the classification loss function and the metric learning loss function of the two input images. Specifically, for the classification loss, we use the standard cross-entropy loss funct...
Article
Pedestrian detection has achieved great improvements in recent years, while complex occlusion handling and highly accurate localization are still the most important problems. To take advantage of the body part semantic information and the contextual information for pedestrian detection, we propose the part and context network (PCN) in this paper. A...
Preprint
Full-text available
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class diff...
Article
This paper develops a novel sequential subspace clustering method for sequential data. Inspired by state-of-the-art methods, ordered subspace clustering (OSC) and temporal subspace clustering (TSC), we design a novel local temporal regularization term based on the concept of temporal predictability. Through minimizing the short-term variance on his...
Article
Visualization from trained deep neural networks has drawn massive public attention recently. One of the visualization approaches is to train images that maximize the activation of specific neurons. However, directly maximizing the activation would lead to unrecognizable images, which cannot provide any meaningful information. In this paper, we introd...
Conference Paper
Full-text available
Limited annotated data is available for the research of estimating facial expression intensities, which makes the training of deep networks for automated expression assessment very challenging. Fortunately, fine-tuning from a data-extensive pre-trained domain such as face verification can alleviate the problem. In this paper, we propose a transferr...
Conference Paper
Full-text available
Compared with unsupervised hashing, supervised hashing commonly achieves better accuracy in many real applications by leveraging semantic (label) information. However, it is tough to solve the supervised learning problem directly because it is essentially a discrete optimization problem. Some other works try to solve the discrete optimization pr...
Preprint
Full-text available
Thanks to the recent developments of Convolutional Neural Networks, the performance of face verification methods has increased rapidly. In a typical face verification method, feature normalization is a critical step for boosting performance. This motivates us to introduce and study the effect of normalization during training. But we find this is no...
Conference Paper
Full-text available
In this work we formulate the problem of image captioning as a multimodal translation task. Analogous to machine translation, we present a sequence-to-sequence recurrent neural network (RNN) model for image caption generation. Different from most existing work where the whole image is represented by convolutional neural network (CNN) features, we p...
Preprint
In this work we formulate the problem of image captioning as a multimodal translation task. Analogous to machine translation, we present a sequence-to-sequence recurrent neural network (RNN) model for image caption generation. Different from most existing work where the whole image is represented by convolutional neural network (CNN) features, we p...
Article
In this paper, we propose a remote sensing image fusion method which combines the wavelet transform and sparse representation to obtain fusion images with high spectral resolution and high spatial resolution. Firstly, the intensity-hue-saturation (IHS) transform is applied to Multi-Spectral (MS) images. Then, the wavelet transform is applied to the intensity...
Article
Full-text available
This paper develops a human action recognition method for human silhouette sequences based on supervised temporal t-SNE and incremental learning. Inspired by the Stochastic Neighbor Embedding (SNE) and its variants, supervised temporal t-SNE is proposed to learn the underlying relationship between action frames in a manifold, where the class label...
Conference Paper
Full-text available
Attacking fingerprint-based biometric systems by presenting fake fingers is a serious threat for unattended devices. In this work, we introduce a novel algorithm, by extracting features along the fingerprint curves, to discriminate between fake fingers and real ones on static images. Pairs of mean value and standard deviation are sampled from the r...
Article
Full-text available
In this paper, a SAR target recognition method is proposed based on the improved joint sparse representation (IJSR) model. The IJSR model can effectively combine multiple-view SAR images from the same physical target to improve the recognition performance. The classification process contains two stages. Convex relaxation is used to obtain support s...
