December 2024 · 3 Reads
December 2024 · 3 Reads
June 2024 · 3 Reads · 4 Citations
June 2024 · 14 Reads · 1 Citation
International Journal of Computer Vision
The presence of noisy examples in the training set inevitably hampers the performance of out-of-distribution (OOD) detection. In this paper, we investigate a previously overlooked problem called OOD detection under asymmetric open-set noise, which is frequently encountered and significantly reduces the identifiability of OOD examples. We analyze the generating process of asymmetric open-set noise and observe the influential role of a confounding variable that entangles many open-set noisy examples with a subset of in-distribution (ID) examples, referred to as hard-ID examples due to their spuriously correlated characteristics. To address the confounding variable, we propose a novel method called Adversarial Confounder REmoving (ACRE) that uses progressive optimization with adversarial learning to curate three collections of potential examples (easy-ID, hard-ID, and open-set noisy) while simultaneously developing invariant representations and reducing spuriously correlated representations. Specifically, by obtaining easy-ID examples with minimal confounding effect, we learn invariant representations from ID examples that aid in identifying hard-ID and open-set noisy examples based on their similarity to the easy-ID set. Through triplet adversarial learning, we jointly minimize and maximize distribution discrepancies across the three collections, enabling the dual elimination of the confounding variable. We also leverage potential open-set noisy examples to optimize a (K+1)-class classifier, further removing the confounding variable and inducing a tailored K+1-Guided scoring function. Theoretical analysis establishes the feasibility of ACRE, and extensive experiments demonstrate its effectiveness and generalization. Code is available at https://github.com/Anonymous-re-ssl/ACRE0.
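To illustrate the kind of K+1-guided scoring the abstract mentions, the sketch below scores a sample by the softmax mass a (K+1)-class classifier assigns to the K ID classes, with the last logit standing in for the open-set class. The function name and exact form are assumptions for illustration, not the paper's definition:

```python
import numpy as np

def kplus1_guided_score(logits):
    """Hypothetical K+1-guided OOD score: softmax mass on the K ID classes,
    i.e. 1 minus the open-set (last) class probability. Higher = more ID-like."""
    m = logits.max(axis=1, keepdims=True)          # stabilize the softmax
    e = np.exp(logits - m)
    probs = e / e.sum(axis=1, keepdims=True)
    return 1.0 - probs[:, -1]
```

A sample whose open-set logit dominates receives a score near 0 and can be flagged as OOD by thresholding.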
May 2024 · 10 Reads
Semi-supervised learning (SSL) can significantly boost model performance by leveraging unlabeled data, particularly when labeled data is scarce. However, real-world unlabeled data often contain unseen-class samples, which can hinder the classification of seen classes. To address this issue, mainstream safe SSL methods suggest detecting and discarding unseen-class samples from the unlabeled data. Nevertheless, these methods typically employ a single-model strategy to simultaneously tackle both the classification of seen classes and the detection of unseen classes. Our research indicates that such an approach may lead to conflicts during training, resulting in suboptimal model optimization. Inspired by this, we introduce a novel framework named Diverse Teacher-Students (DTS), which uniquely utilizes dual teacher-student models to handle these two tasks individually and effectively. DTS employs a novel uncertainty score to softly separate unseen-class and seen-class data in the unlabeled set, and intelligently creates an additional (K+1)-th class supervisory signal for training. By training both teacher-student models with all unlabeled samples, DTS can enhance the classification of seen classes while simultaneously improving the detection of unseen classes. Comprehensive experiments demonstrate that DTS surpasses baseline methods across a variety of datasets and configurations. Our code and models are publicly available at https://github.com/Zhanlo/DTS.
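The (K+1)-th class supervisory signal described above can be sketched as follows: samples whose uncertainty score exceeds a cutoff are assigned the extra unseen class, the rest keep their argmax pseudo-label. The uncertainty score and the cutoff `tau` are treated as given inputs here; how DTS actually computes and softens them is not reproduced:

```python
import numpy as np

def kplus1_targets(probs, uncertainty, tau, num_seen):
    """Assign pseudo-targets over K seen classes plus one extra unseen class.

    probs: (N, K) softmax over seen classes; uncertainty: (N,) scores;
    samples with uncertainty > tau get label index num_seen (the (K+1)-th class).
    """
    labels = probs.argmax(axis=1)
    return np.where(uncertainty > tau, num_seen, labels)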
May 2024 · 8 Reads
Machine Learning
March 2024 · 10 Reads · 2 Citations
Proceedings of the AAAI Conference on Artificial Intelligence
Detecting out-of-distribution (OOD) data is essential to ensure the reliability of machine learning models deployed in real-world scenarios. Unlike most previous test-time OOD detection methods, which focus on designing OOD scores, we examine the challenges of OOD detection from the perspective of typicality and regard a feature's high-probability region as its typical set. However, the existing typical-feature-based OOD detection method carries an implicit assumption: that the proportion of typical features is fixed for every channel. According to our experimental analysis, each channel contributes differently to OOD detection. Adopting a fixed proportion for all channels causes several channels to lose too many typical features or incorporate too many abnormal features, which lowers performance. Exploring channel-aware typical features is therefore crucial for better separating ID and OOD data. Driven by this insight, we propose expLoring channel-Aware tyPical featureS (LAPS). First, LAPS obtains the channel-aware typical set by calibrating the channel-level typical set against the global typical set using the mean and standard deviation. Then, LAPS rectifies the features into the channel-aware typical sets to obtain channel-aware typical features. Finally, LAPS leverages the channel-aware typical features to calculate the energy score for OOD detection. Theoretical and visual analyses verify that LAPS achieves a better bias-variance trade-off. Experiments verify the effectiveness and generalization of LAPS under different architectures and OOD scores.
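The rectify-then-score pipeline the abstract outlines can be sketched as below: clip each channel into a typical interval built from per-channel statistics, then compute the energy score on the resulting logits. The interval form `mu ± k*sigma` and all names are assumptions; LAPS's actual calibration of channel-level against global typical sets is omitted:

```python
import numpy as np

def rectify_channels(feats, mu, sigma, k=1.0):
    """Clip each channel into an assumed typical interval [mu - k*sigma, mu + k*sigma].
    feats: (N, D); mu, sigma: (D,) per-channel statistics from ID data."""
    return np.clip(feats, mu - k * sigma, mu + k * sigma)

def energy_score(feats, W, b):
    """Energy-based OOD score: logsumexp of the class logits (higher = more ID-like)."""
    logits = feats @ W + b
    m = logits.max(axis=1, keepdims=True)          # numerically stable logsumexp
    return (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()
```

At test time one would threshold `energy_score(rectify_channels(feats, mu, sigma), W, b)` to flag OOD inputs.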
January 2024 · 30 Reads · 1 Citation
Machine Learning
Pseudo-labeling methods are popular in semi-supervised learning (SSL). Their performance heavily relies on a proper threshold for generating hard labels for unlabeled data. To this end, most existing studies resort to a manually pre-specified function to adjust the threshold, which requires prior knowledge and suffers from scalability issues. In this paper, we propose a novel method named Meta-Threshold, which learns a dynamic confidence threshold for each unlabeled instance and requires no extra hyperparameters except a learning rate. Specifically, the instance-level confidence threshold is automatically learned by an extra network in a meta-learning manner. Treating the limited labeled data as meta-data, the overall training objective of the classifier network and the meta-net can be formulated as a nested optimization problem and solved by a bi-level optimization scheme. Furthermore, by replacing the indicator function used in pseudo-labeling with a surrogate function, we theoretically establish the convergence of our training procedure, while discussing the training complexity and proposing a strategy to reduce its time cost. Extensive experiments and analyses demonstrate the effectiveness of our method on both typical and imbalanced SSL tasks.
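The instance-level thresholding step can be sketched as follows, with the per-instance thresholds treated as already produced by the meta-net (the bi-level optimization that learns them is not reproduced here):

```python
import numpy as np

def pseudo_label_mask(probs, thresholds):
    """Select pseudo-labels whose confidence exceeds a per-instance threshold.

    probs: (N, K) softmax outputs; thresholds: (N,) per-instance values
    (assumed to come from a learned meta-net). Returns hard labels and a
    boolean mask marking which unlabeled samples enter training.
    """
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    return labels, confidence >= thresholds
```

In a fixed-threshold scheme `thresholds` would be a constant; learning it per instance is what distinguishes the approach described above.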
October 2023 · 2 Reads · 3 Citations
October 2023 · 8 Reads · 5 Citations
June 2023 · 10 Reads · 9 Citations
Proceedings of the AAAI Conference on Artificial Intelligence
Source-free domain adaptation (SFDA) transfers a single source model to an unlabeled target domain without accessing the source data. As intelligent systems spread across fields, a zoo of source models is increasingly available, giving rise to a new setting called multi-source-free domain adaptation (MSFDA). We find that the critical inherent challenge of MSFDA is how to estimate the importance (contribution) of each source model. In this paper, we shed new Bayesian light on the observation that the posterior probability of source importance connects to discriminability and transferability. We propose Discriminability And Transferability Estimation (DATE), a universal solution for source importance estimation. Specifically, a proxy discriminability perception module is equipped with habitat uncertainty and density to evaluate each sample's surrounding environment, and a source-similarity transferability perception module quantifies the data-distribution similarity and encourages the transferability to be reasonably distributed via a domain diversity loss. Extensive experiments show that DATE can precisely and objectively estimate source importance and outperforms prior arts by non-trivial margins. Moreover, experiments demonstrate that DATE can take the most popular SFDA networks as backbones and turn them into advanced MSFDA solutions.
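Once per-source importance is estimated, combining the source models amounts to an importance-weighted fusion of their predictions. The sketch below assumes such weights are given; it stands in for DATE's discriminability/transferability estimates rather than implementing them:

```python
import numpy as np

def combine_source_predictions(prob_list, importance):
    """Fuse per-source softmax predictions by estimated source importance.

    prob_list: list of (N, K) prediction arrays, one per source model;
    importance: per-source non-negative weights (normalized internally).
    """
    w = np.asarray(importance, dtype=float)
    w = w / w.sum()
    return sum(wi * p for wi, p in zip(w, prob_list))
```

With equal weights this degenerates to a plain ensemble average; the point of importance estimation is to down-weight poorly transferring sources.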
... It employs a storage mechanism to calculate the average logits for each class, preparing this simple storage pattern for subsequent OOD detection tasks. Drawing inspiration from this experience, DDCS [84] adaptively selects suitable channels for data classification after correcting each channel in the neural network. These channels are evaluated based on inter-class similarity and variance to measure their discriminative power for ID data. ...
June 2024
... In healthcare, the diversity and complexity of data mean that LLMs are frequently confronted with cases that differ significantly from their training data, such as rare conditions or unique patient demographics. This is where OOD detection becomes crucial: identifying when the model is encountering new, unseen categories allows for better handling of such cases [344], [345], [346], [347]. Effective OOD detection methods can help LLMs determine when they are less confident, enabling healthcare professionals to step in and mitigate risks [348]. ...
June 2024
International Journal of Computer Vision
... VRA (Xu et al. 2023) zeros out anomalously low activations and truncates anomalously high activations. BATS proposes rectifying activations towards their typical set, while LAPS (He et al. 2024) improves BATS by considering channel-aware typical sets. These methods only examine anomalies at the activation level, whereas managing overconfidence anomalies at a more granular parameter level is important for more effective OOD detection. ...
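The VRA-style operation described in this snippet, zeroing anomalously low activations while truncating anomalously high ones, can be sketched as below; the thresholds `low` and `high` are assumed given rather than derived as in the cited work:

```python
import numpy as np

def vra_style_rectify(act, low, high):
    """Zero out activations below `low` and truncate those above `high`
    (a sketch of the rectification attributed to VRA in the text)."""
    out = np.where(act < low, 0.0, act)
    return np.minimum(out, high)
```

BATS-style rectification would instead clip toward the typical set on both sides, and the channel-aware variant uses per-channel intervals.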
March 2024
Proceedings of the AAAI Conference on Artificial Intelligence
... This motivates us to devise effective methods for utilizing such data. In semi-supervised learning (Sohn et al., 2020; Xu et al., 2021; Wei et al., 2024; Zhang et al., 2021a), pseudo-labeling is a widely studied and adopted technique. Grounded in the principle of entropy minimization (Grandvalet & Bengio, 2004), it typically selects the most reliable samples from the unlabeled data, based on their confidence, for inclusion in training. ...
January 2024
Machine Learning
... Another approach, ETLT [10], employs linear regression to predict the output of an OOD detector. Similarly, [8, 16, 23, 56] assume a large set of unlabeled data consisting of in- and OOD samples is available after deployment. ...
October 2023
... Evaluation Metrics. We evaluate our approach using common OOD detection metrics [24,41,52,64,65], with RAM [66] as the underlying vision-language model for OOD sample detection. For consistency across all compared methods, the feature extractor within RAM, Swin-Large [39], is used for all approaches. ...
October 2023
... After selecting the most informative samples, we assign pseudo-labels to the remaining unlabeled samples. We begin by initializing class centroids in the feature space, as in (Liang et al., 2020; Wang et al., 2023). The class centroids o_c are derived from the model's current class predictions: ...
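The centroid initialization this snippet refers to can be sketched as prediction-weighted means of the features per class, in the spirit of the SHOT-style procedure it cites; the exact weighting in the cited works may differ:

```python
import numpy as np

def class_centroids(features, probs):
    """Prediction-weighted class centroids in feature space.

    features: (N, D) feature vectors; probs: (N, K) softmax predictions.
    Returns a (K, D) array: each row is the probs-weighted mean feature
    for one class (weights normalized per class).
    """
    weights = probs / (probs.sum(axis=0, keepdims=True) + 1e-8)
    return weights.T @ features
```

Pseudo-labels can then be assigned by nearest centroid, and centroids re-estimated iteratively.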
June 2023
... Recent works on MSFDA place more emphasis on harmonizing discriminability and transferability (Kundu et al., 2022; Han et al., 2023). Discriminability denotes the ease of classifying objects into given categories by a pretrained classifier, while transferability refers to the invariance of feature representations between different domains (Chen et al., 2020a; Kundu et al., 2022). ...
June 2023
Proceedings of the AAAI Conference on Artificial Intelligence
... Further complicating this scenario is the presence of unseen-class samples within the training data, which can significantly disrupt the learning process, potentially leading to unstable outcomes or severe performance degradation [5, 9, 28]. Several similar definitions have emerged to describe this scenario, including safe SSL [9], open-set SSL [22, 24, 31, 45], and the challenge of managing UnLabeled data from Unseen Classes in Semi-Supervised Learning (ULUC-SSL) [14]. In this paper, we refer to it as the safe classification problem of safe SSL [9], to emphasize that our core goal is to ensure that the model's performance is not compromised, or even degraded, by the presence of unseen-class samples during training. ...
January 2023
IEEE Transactions on Knowledge and Data Engineering
... However, even though our approaches provided results better than the Reproducibility baseline, they were only the best on the tasks W→A and W→D, achieving 89.8% for Generation++ and 99.4% for the Original and Augmentation approaches, respectively. Yet, the unknown exploitation could improve OVANet, a 2021 approach, to be nearly comparable with novel state-of-the-art methods, such as MLNet (Lu et al., 2024), a 2024 approach, and NCAL (Su et al., 2023), a 2023 approach. ...
May 2023
Pattern Recognition