Peng Wu

Xidian University · School of Artificial Intelligence

PhD Student


36
Publications
2,196
Reads
1,267
Citations


Publications (36)
Preprint
Video anomaly detection (VAD) aims to discover behaviors or events deviating from the normality in videos. As a long-standing task in the field of computer vision, VAD has witnessed much good progress. In the era of deep learning, with the explosion of architectures of continuously growing capability and capacity, a great variety of deep learning b...
Article
Text-based Person Retrieval aims to search the target pedestrian image from video surveillance or a large image database with a text description. Previous works have recognized the significance of mining local information in images and descriptions and performing fine-grained alignment. These approaches adopt hard division or auxiliary networks for...
Preprint
Full-text available
The current weakly supervised video anomaly detection (WSVAD) task aims to achieve frame-level anomalous event detection with only coarse video-level annotations available. Existing works typically involve extracting global features from full-resolution video frames and training frame-level classifiers to detect anomalies in the temporal dimension. How...
Article
Video anomaly detection (VAD) aims to identify events or scenes in videos that deviate from typical patterns. Existing approaches primarily focus on reconstructing or predicting frames to detect anomalies and have shown improved performance in recent years. However, they often depend highly on local spatio-temporal information and face the challeng...
Article
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its current dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multiple event classification. However, such a setup that builds relationships between complicated anomalous events and single labels, e...
Article
In the domain of video surveillance, describing the behavior of each individual within the video is becoming increasingly essential, especially in complex scenarios with multiple individuals present. This is because describing each individual's behavior provides more detailed situational analysis, enabling accurate assessment and response to potent...
Article
Generating high-quality high dynamic range (HDR) images in dynamic scenes is particularly challenging due to the influence of large motion. Despite the effectiveness of existing deep learning methods, they still suffer from ghosting artifacts when saturation and motion coexist. Inspired by fusion on static scenes, we propose an inpainting and fusio...
Preprint
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its current dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multiple event classification. However, such a setup that builds relationships between complicated anomalous events...
Preprint
Full-text available
Video anomaly detection (VAD) is a significant computer vision problem. Existing deep neural network (DNN) based VAD methods mostly follow the route of frame reconstruction or frame prediction. However, the lack of mining and learning of higher-level visual features and temporal context relationships in videos limits the further performance of thes...
Chapter
Existing methods for anomaly detection based on memory-augmented autoencoder (AE) have the following drawbacks: (1) Establishing a memory bank requires additional memory space. (2) The fixed number of prototypes from subjective assumptions ignores the data feature differences and diversity. To overcome these drawbacks, we introduce DLAN-AC, a Dynam...
Preprint
Full-text available
Existing methods for anomaly detection based on memory-augmented autoencoder (AE) have the following drawbacks: (1) Establishing a memory bank requires additional memory space. (2) The fixed number of prototypes from subjective assumptions ignores the data feature differences and diversity. To overcome these drawbacks, we introduce DLAN-AC, a Dynam...
Article
Violence detection in video is very promising in practical applications due to the emergence of massive videos in recent years. Most previous works define violence detection as a simple video classification task and use the single modality of small-scale datasets, e.g., visual signal. However, such solutions are undersupplied. To mitigate this prob...
Article
Automatic defect detection on textured surfaces has long been one of the hotspots in the computer vision community. Many methods based on deep learning have been proposed in the past few years, although most are semantic segmentation based methods that require a large number of accurately labeled samples. However, obtaining precise labels is time-c...
Article
X-ray imagery security screening is an essential component of transportation and logistics. In recent years, some researchers have used computer vision algorithms to replace inefficient and tedious manual baggage inspection. However, X-ray images are complicated, and objects overlap with one another in a semi-transparent state, which underperforms...
Article
Full-text available
Anomaly detection in videos is the task of identifying frames from a video sequence that depict events that do not conform to expected behavior, which is an extremely challenging task due to the ambiguous and unbounded properties of anomalies. With the development of deep learning, video anomaly detection methods based on deep neural networks have...
Preprint
Video-text retrieval is an important yet challenging task in vision-language understanding, which aims to learn a joint embedding space where related video and text instances are close to each other. Most current works simply measure the video-text similarity based on video-level and text-level embeddings. However, the neglect of more fine-grained...
Article
Weakly supervised anomaly detection is a challenging task since frame-level labels are not given in the training phase. Previous studies generally employ neural networks to learn features and produce frame-level predictions and then use multiple instance learning (MIL)-based classification loss to ensure the interclass separability of the learned f...
Chapter
Violence detection has been studied in computer vision for years. However, previous works are either superficial, e.g., classification of short clips and a single scenario, or undersupplied, e.g., a single modality and multimodality based on hand-crafted features. To address this problem, in this work we first release a large-scale and multi-scen...
Preprint
Full-text available
Violence detection has been studied in computer vision for years. However, previous works are either superficial, e.g., classification of short clips and a single scenario, or undersupplied, e.g., a single modality and multimodality based on hand-crafted features. To address this problem, in this work we first release a large-scale and multi-scen...
Article
The semi-supervised video anomaly detection assumes that only normal video clips are available for training. Therefore, the intuitive idea is either to learn a dictionary by sparse coding or to train encoding-decoding neural networks by minimizing the reconstruction errors. For the former, the optimization of sparse coefficients is extremely time-c...
Article
How to build a generic deep one-class (DeepOC) model to solve one-class classification problems for anomaly detection, such as anomalous event detection in complex scenes? The characteristics of existing one-class labels lead to a dilemma: it is hard to directly use a multiple classifier based on deep neural networks to solve one-class classificati...
Article
The complex network is an important tool to represent relational data in nature and human society, which has been widely applied in various real-world application scenarios. A key issue for analyzing the features of networks is to represent the characteristic information in the network with rationality. Network embedding, attracting plenty of atten...
Conference Paper
Babyhood is a very important stage in the human growth process. Thus, learning the facial expression characteristics of infants is of great significance to nursing care for infants. Infants' faces differ significantly from adults' faces in features such as the eyebrows, eyes, nose, cheeks, and skin texture. Therefore, the facial expression recognition mod...
Chapter
The Local Binary Pattern (LBP) is a widely used descriptor in facial expression recognition due to its efficiency and effectiveness. However, existing facial expression recognition methods based on LBP either ignore different kinds of information, such as details and the contour of faces, or rely on the division of face images, such as dividing the...
Article
Hyperspectral unmixing (HU) is a method used to estimate the fractional abundances corresponding to endmembers in each of the mixed pixels in the hyperspectral remote sensing image. In recent times, deep learning has been recognized as an effective technique for hyperspectral image classification. In this letter, an end-to-end HU method is proposed...
