Pheng-Ann Heng

Chinese University of Hong Kong | CUHK

About

746
Publications
199,506
Reads
35,770
Citations

Publications (746)
Preprint
Recently, Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms; however, these methods often overlook latent information in less prominent regions, leading to increased sensitivity to perturbations and limited global comprehension. To solve this issue, we introduce PointACL, an attention-driven con...
Article
Full-text available
The precise classification of cell types from single-cell RNA sequencing (scRNA-seq) data is pivotal for dissecting cellular heterogeneity in biological research. Traditional graph neural network (GNN) models are constrained by reliance on predefined graphs, limiting the exploration of complex cell-to-cell relationships. We introduce scGraphformer,...
Preprint
Full-text available
Proteolysis Targeting Chimeras (PROTACs) are heterobifunctional ligands that form ternary complexes with Proteins of Interest (POIs) and E3 ligases, exploiting the ubiquitin-proteasome system to degrade disease-causing proteins, promising to drug the undruggable. While PROTAC research primarily relies on costly and time-consuming wet experimental a...
Article
Convolutional neural networks (CNNs) have shown remarkable progress in medical image segmentation. However, the lesion segmentation remains a challenge to state-of-the-art CNN-based algorithms due to the variance in scales and shapes. On the one hand, tiny lesions are hard to delineate precisely from the medical images which are often of low resolu...
Article
Full-text available
Label scarcity, class imbalance and data uncertainty are three primary challenges that are commonly encountered in the semi-supervised medical image segmentation. In this work, we focus on the data uncertainty issue that is overlooked by previous literature. To address this issue, we propose a probabilistic prototype-based classifier that introduce...
Preprint
Full-text available
DNA-encoded library (DEL) screening has revolutionized the detection of protein-ligand interactions through read counts, enabling rapid exploration of vast chemical spaces. However, noise in read counts, stemming from nonspecific interactions, can mislead this exploration process. We present DEL-Ranking, a novel distribution-correction denoising fr...
Preprint
Full-text available
Recent advancements in text-to-image (T2I) diffusion models have enabled the creation of high-quality images from text prompts, but they still struggle to generate images with precise control over specific visual concepts. Existing approaches can replicate a given concept by learning from reference images, yet they lack the flexibility for fine-gra...
Preprint
Full-text available
Panoptic lifting is an effective technique to address the 3D panoptic segmentation task by unprojecting 2D panoptic segmentations from multiple views to a 3D scene. However, the quality of its results largely depends on the 2D segmentations, which could be noisy and error-prone, so its performance often drops significantly for complex scenes. In this wo...
Article
Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from...
Preprint
Full-text available
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments. Previous approaches require full supervision on large-scale tabletop datasets for effective pretraining. In this paper, we propose UOIS-SAM, a data-efficient solution for the UOIS task that leverages SAM's high accuracy and strong gen...
Article
The clinical adoption of small interfering RNAs (siRNAs) has prompted the development of various computational strategies for siRNA design, from traditional data analysis to advanced machine learning techniques. However, previous studies have inadequately considered the full complexity of the siRNA silencing mechanism, neglecting critical elements...
Preprint
Full-text available
Shadows are formed when light encounters obstacles, leading to areas of diminished illumination. In computer vision, shadow detection, removal, and generation are crucial for enhancing scene understanding, refining image quality, ensuring visual consistency in video editing, and improving virtual environments. This paper presents a comprehensive su...
Preprint
Full-text available
This paper addresses the limitations of adverse weather image restoration approaches trained on synthetic data when applied to real-world scenarios. We formulate a semi-supervised learning framework employing vision-language models to enhance restoration performance across diverse adverse weather conditions in real-world settings. Our approach invo...
Preprint
We introduce SAM2Point, a preliminary exploration adapting Segment Anything Model 2 (SAM 2) for zero-shot and promptable 3D segmentation. SAM2Point interprets any 3D data as a series of multi-directional videos, and leverages SAM 2 for 3D-space segmentation, without further training or 2D-3D projection. Our framework supports various prompt types,...
Preprint
Pringle maneuver (PM) in laparoscopic liver resection aims to reduce blood loss and provide a clear surgical view by intermittently blocking blood inflow of the liver, whereas prolonged PM may cause ischemic injury. To comprehensively monitor this surgical procedure and provide timely warnings of ineffective and prolonged blocking, we suggest two c...
Preprint
Full-text available
Reversible face anonymization, unlike traditional face pixelization, seeks to replace sensitive identity information in facial images with synthesized alternatives, preserving privacy without sacrificing image clarity. Traditional methods, such as encoder-decoder networks, often result in significant loss of facial details due to their limited lear...
Preprint
Multi-modal brain tumor segmentation typically involves four magnetic resonance imaging (MRI) modalities, while incomplete modalities significantly degrade performance. Existing solutions employ explicit or implicit modality adaptation, aligning features across modalities or learning a fused feature robust to modality incompleteness. They share a c...
Article
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR). However, previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection. This pipeline may pot...
Article
In this study, we propose a novel approach for RGB-D salient instance segmentation using a dual-branch cross-modal feature calibration architecture called CalibNet. Our method simultaneously calibrates depth and RGB features in the kernel and mask branches to generate instance-aware kernels and mask features. CalibNet consists of three simple...
Article
Full-text available
Aims: To develop and externally test deep learning (DL) models for assessing the image quality of three-dimensional (3D) macular scans from Cirrus and Spectralis optical coherence tomography devices. Methods: We retrospectively collected two data sets including 2277 Cirrus 3D scans and 1557 Spectralis 3D scans, respectively, for training (70%), fine...
Preprint
Full-text available
One-shot detection of anatomical landmarks is gaining significant attention for its efficiency in using minimal labeled data to produce promising results. However, the success of current methods heavily relies on the employment of extensive unlabeled data to pre-train an effective feature extractor, which limits their applicability in scenarios whe...
Preprint
Full-text available
Semi-supervised learning (SSL) has achieved notable progress in medical image segmentation. To achieve effective SSL, a model needs to be able to efficiently learn from limited labeled data and effectively exploit knowledge from abundant unlabeled data. Recent developments in visual foundation models, such as the Segment Anything Model (SAM), ha...
Article
Deep generative models have unlocked another profound realm of human creativity. By capturing and generalizing patterns within data, we have entered the epoch of all-encompassing Artificial Intelligence for General Creativity (AIGC). Notably, diffusion models, recognized as one of the paramount generative models, materialize human ideation into tan...
Preprint
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm using tasks arriving sequentially, instead of gathering them due to strict data sharing policies. However, the task evolution can span a wide...
Preprint
Laparoscopic liver surgery poses a complex intraoperative dynamic environment for surgeons, where it remains a significant challenge to distinguish critical or even hidden structures inside the liver. Liver anatomical landmarks, e.g., ridge and ligament, serve as important markers for 2D-3D alignment, which can significantly enhance the spatial percep...
Preprint
Full-text available
The ability to learn sequentially from different data sites is crucial for a deep network in solving practical medical image diagnosis problems due to privacy restrictions and storage limitations. However, adapting to an incoming site leads to catastrophic forgetting on past sites and decreases generalizability on unseen sites. Existing Continual Learn...
Preprint
A comprehensive guidance view for cardiac interventional surgery can be provided by the real-time fusion of the intraoperative 2D images and preoperative 3D volume based on the ultrasound frame-to-volume registration. However, cardiac ultrasound images are characterized by a low signal-to-noise ratio and small differences between adjacent frames, c...
Article
Full-text available
Hypertensive retinopathy (HR) can potentially lead to vision loss if left untreated. Early screening and treatment are critical in reducing the risk of vision loss. The computer-aided diagnostic system presents an opportunity to improve the efficiency and reliability of HR screening and diagnosis, particularly given the shortage of specialized medi...
Article
Full-text available
Graph neural networks (GNNs) have drawn more and more attention from material scientists and demonstrated a strong capacity to establish connections between structures and properties. However, with only unrelaxed structures provided as input, few GNN models can predict the thermodynamic properties of relaxed configurations with an acceptable level...
Preprint
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management. However, traditional methods often struggle to capture the complex dynamics of meteorological systems, particularly in the presence of high-resolution data. In this paper, we propose the Spatial-Frequency Attention Network (SFANet), a novel de...
Preprint
Data augmentation has proven to be a vital tool for enhancing the generalization capabilities of deep learning models, especially in the context of 3D vision where traditional datasets are often limited. Despite previous advancements, existing methods primarily cater to unimodal data scenarios, leaving a gap in the augmentation of multimodal triple...
Preprint
Full-text available
Offline meta reinforcement learning (OMRL) has emerged as a promising approach for interaction avoidance and strong generalization performance by leveraging pre-collected data and meta-learning techniques. Previous context-based approaches predominantly rely on the intuition that maximizing the mutual information between the task and the task repre...
Article
Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferabil...
Article
Building on remarkable progress in cardiac image segmentation, contemporary studies are dedicated to further upgrading model functionality toward perfection by progressively exploring sequentially delivered datasets over time through domain incremental learning. Existing works mainly concentrated on addressing the heterogeneous style variations, but overl...
Article
Deep learning (DL)-based rib fracture detection has shown promise of playing an important role in preventing mortality and improving patient outcome. Normally, developing DL-based object detection models requires a huge amount of bounding box annotation. However, annotating medical data is time-consuming and expertise-demanding, making obtaining a...
Article
Reversible face anonymization, unlike traditional face pixelization, seeks to replace sensitive identity information in facial images with synthesized alternatives, preserving privacy without sacrificing image clarity. Traditional methods, such as encoder-decoder networks, often result in significant loss of facial details due to their limited lear...
Article
Background: Deep learning (DL) is promising for detecting glaucoma. However, patients’ privacy and data security are major concerns when pooling all data for model development. We developed a privacy-preserving DL model using the federated learning (FL) paradigm to detect glaucoma from optical coherence tomography (OCT) images. Methods: This is a multic...
Chapter
In medical image analysis, anomaly detection in weakly supervised settings has gained significant interest due to the high cost associated with expert-annotated pixel-wise labeling. Current methods primarily rely on auto-encoders and flow-based healthy image reconstruction to detect anomalies. However, these methods have limitations in terms of hig...
Chapter
Despite great progress in semi-supervised learning (SSL) that leverages unlabeled data to improve the performance over fully supervised models, existing SSL approaches still fail to exhibit good results when faced with a severe class imbalance problem in medical image segmentation. In this work, we propose a novel Mean-teacher based class imbalance...
Chapter
In medical image analysis, imbalanced noisy dataset classification poses a long-standing and critical problem since clinical large-scale datasets often attain noisy labels and imbalanced distributions through annotation and collection. Current approaches addressing noisy labels and long-tailed distributions separately may negatively impact real-wor...
Chapter
Cross-domain distribution shift is a common problem for medical image analysis because medical images from different devices usually have varied domain distributions. Test-time adaptation (TTA) is a promising solution that efficiently adapts source-domain distributions to target-domain distributions at test time in an unsupervised manner, which has...
Conference Paper
Full-text available
Semi-supervised learning (SSL) has recently demonstrated great success in medical image segmentation, significantly enhancing data efficiency with limited annotations. However, despite its empirical benefits, there are still concerns in the literature about the theoretical foundation and explanation of semi-supervised segmentation. To explore this...
Article
Full-text available
Purpose: The purpose of this study was to develop an artificial intelligence (AI) system for the identification of disease status and recommending treatment modalities for retinopathy of prematurity (ROP). Methods: This retrospective cohort study included a total of 24,495 RetCam images from 1075 eyes of 651 preterm infants who received RetCam exami...
Preprint
Large-scale well-annotated datasets are of great importance for training an effective object detector. However, obtaining accurate bounding box annotations is laborious and demanding. Unfortunately, the resultant noisy bounding boxes could cause corrupt supervision signals and thus diminish detection performance. Motivated by the observation that t...
Conference Paper
Masked Autoencoders (MAE) have shown promising performance in self-supervised learning for both 2D and 3D computer vision. However, existing MAE-style methods can only learn from the data of a single modality, i.e., either images or point clouds, which neglect the implicit semantic and geometric correlation between 2D and 3D. In this paper, we expl...
Preprint
High-accuracy Dichotomous Image Segmentation (DIS) aims to pinpoint category-agnostic foreground objects from natural scenes. The main challenge for DIS involves identifying the highly accurate dominant area while rendering detailed object structure. However, directly using a general encoder-decoder architecture may result in an oversupply of high-...
Preprint
We propose a novel approach for RGB-D salient instance segmentation using a dual-branch cross-modal feature calibration architecture called CalibNet. Our method simultaneously calibrates depth and RGB features in the kernel and mask branches to generate instance-aware kernels and mask features. CalibNet consists of three simple modules, a dynamic i...
Preprint
Robotic bin packing is very challenging, especially when considering practical needs such as object variety and packing compactness. This paper presents SDF-Pack, a new approach based on signed distance field (SDF) to model the geometric condition of objects in a container and compute the object placement locations and packing orders for achieving...
Article
Full-text available
Purpose: Segmentation of liver vessels from CT images is indispensable prior to surgical planning and has aroused a broad range of interest in the medical image analysis community. Due to the complex structure and low-contrast background, automatic liver vessel segmentation remains particularly challenging. Most related studies adopt FCN, U-...
Preprint
Full-text available
Deep learning (DL)-based rib fracture detection has shown promise of playing an important role in preventing mortality and improving patient outcome. Normally, developing DL-based object detection models requires a huge amount of bounding box annotation. However, annotating medical data is time-consuming and expertise-demanding, making obtaining a la...
Preprint
Full-text available
Although the Segment Anything Model (SAM) has achieved impressive results on general-purpose semantic segmentation with strong generalization ability on daily images, its demonstrated performance on medical image segmentation is less precise and less stable, especially when dealing with tumor segmentation tasks that involve objects of small sizes, i...
Preprint
Full-text available
Convolutional Neural Networks (CNNs) have shown remarkable progress in medical image segmentation. However, lesion segmentation remains a challenge to state-of-the-art CNN-based algorithms due to the variance in scales and shapes. On the one hand, tiny lesions are hard to delineate precisely from the medical images which are often of low resolu...
Article
Full-text available
Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In this paper, we focus on the reaction yield predictio...
Preprint
Full-text available
Semi-supervised learning (SSL) methods assume that labeled data, unlabeled data and test data are from the same distribution. Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers). Most previous works focused o...
Preprint
Video dehazing aims to recover haze-free frames with high visibility and contrast. This paper presents a novel framework to effectively explore the physical haze priors and aggregate temporal information. Specifically, we design a memory-based physical prior guidance module to encode the prior-related features into long-range memory. Besides, we fo...
Preprint
Full-text available
Masked Image Modeling (MIM) has achieved impressive representative performance with the aim of reconstructing randomly masked images. Despite the empirical success, most previous works have neglected the important fact that it is unreasonable to force the model to reconstruct something beyond recovery, such as those masked objects. In this work, we...
Preprint
Data augmentation is an effective regularization strategy for mitigating overfitting in deep neural networks, and it plays a crucial role in 3D vision tasks, where the point cloud data is relatively limited. While mixing-based augmentation has shown promise for point clouds, previous methods mix point clouds either on block level or point level, wh...