Licheng Zhang’s research while affiliated with University of Science and Technology of China and other places


Publications (14)


Multi-Prototype Grouping for Continual Learning in Visual Question Answering
  • Conference Paper

April 2025

Licheng Zhang · Zhendong Mao · Yixing Peng · [...] · Yongdong Zhang


Feature-Adaptive and Data-Scalable In-Context Learning

May 2024 · 1 Read

In-context learning (ICL), which conditions inference on a handful of demonstrations, has become a widespread paradigm for adapting LLMs to downstream tasks. However, because of context length constraints, ICL cannot benefit from additional training data, and the general features produced by the LLM are not adapted to the specific downstream task. In this paper, we propose a feature-adaptive and data-scalable in-context learning framework (FADS-ICL), which leverages task-adaptive features to improve inference on the downstream task under the supervision of beyond-context samples. Specifically, it first extracts general features of beyond-context samples one by one via the LLM using the ICL input form, and then introduces a task-specific modulator that performs feature refinement and prediction after being fit to the downstream task. We conduct extensive experiments on FADS-ICL under varying data settings (4–128 shots) and LLM scales (0.8B–70B). The results show that FADS-ICL consistently outperforms previous state-of-the-art methods by a significant margin under all settings, verifying its effectiveness and superiority. For example, with a 1.5B-parameter LLM and 32 shots, FADS-ICL achieves +14.3 average accuracy from feature adaptation over vanilla ICL across 10 datasets and +6.2 average accuracy over the previous state-of-the-art method, and performance improves further as training data increases. Code and data are publicly available at https://github.com/jiahaozhenbang/FADS-ICL.
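A minimal sketch of the setup summarized above, assuming a Hugging Face causal LM as the feature extractor and a logistic-regression classifier standing in for the task-specific modulator; this illustrates the described idea and is not the authors' released code (see the linked repository for that).

```python
# Sketch of the FADS-ICL idea: extract "general features" of beyond-context labeled
# samples via an LLM in the ICL input form, then fit a lightweight task-specific
# modulator on those features. Model name and modulator choice are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"  # placeholder; the paper evaluates LLMs from 0.8B to 70B parameters
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, output_hidden_states=True).eval()

def extract_feature(demonstrations: str, sample: str) -> torch.Tensor:
    """Encode one sample in the ICL input form and return the last-token hidden state."""
    inputs = tokenizer(demonstrations + sample, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[-1]   # (1, seq_len, dim)
    return hidden[0, -1]                             # general feature for this sample

def fit_modulator(demos, train_samples, train_labels):
    """Fit a task-specific modulator on beyond-context labeled samples, one by one."""
    feats = torch.stack([extract_feature(demos, s) for s in train_samples]).numpy()
    return LogisticRegression(max_iter=1000).fit(feats, train_labels)

def predict(modulator, demos, test_samples):
    feats = torch.stack([extract_feature(demos, s) for s in test_samples]).numpy()
    return modulator.predict(feats)
```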







Curriculum Learning Driven Domain Adaptation for Low-Resource Machine Reading Comprehension

January 2024 · 3 Reads · 1 Citation

Signal Processing Letters, IEEE

Although pre-trained language models (PTLMs) have achieved great success on machine reading comprehension, they often rely on large-scale annotated data, while only a small amount of data is available in most real-world scenarios. To enhance PTLMs' capabilities in low-resource scenarios, we propose a curriculum learning driven domain adaptation method for low-resource machine reading comprehension, whose basic paradigm is to train a source model with sufficient data and then adapt it to the target domain. In the adaptation procedure, we introduce a curriculum learning strategy, whose core idea is to arrange training examples from easy to difficult, in order to bridge the gap between the source and target domains and let the source model adapt to the target domain progressively. Specifically, before fine-tuning the well-trained source model on target data, we first compute the loss of each target example under the source model to estimate its difficulty. We then sample batches according to an increasing sampling function at each fine-tuning step, so that the source model starts learning from easy examples in the target domain and gradually transitions to difficult ones. Experiments on two public datasets demonstrate the effectiveness of our method.
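A minimal sketch of the easy-to-hard batch sampling described above, assuming a PyTorch source model and using its per-example loss as the difficulty score; the linear pacing function and all names are illustrative, not the paper's exact design.

```python
# Curriculum-driven adaptation sketch: score target examples by source-model loss,
# then draw batches from a pool of easy examples that grows as fine-tuning proceeds.
import torch

def score_difficulty(source_model, target_examples, loss_fn):
    """Use the well-trained source model's loss on each target example as its difficulty."""
    source_model.eval()
    losses = []
    with torch.no_grad():
        for x, y in target_examples:
            losses.append(loss_fn(source_model(x), y).item())
    return losses

def curriculum_batches(target_examples, difficulties, num_steps, batch_size):
    """Yield batches drawn from an easy-to-hard candidate pool that expands each step."""
    order = sorted(range(len(target_examples)), key=lambda i: difficulties[i])
    for step in range(num_steps):
        # increasing sampling function (assumed linear here): the pool grows with progress
        pool = order[: max(batch_size, int(len(order) * (step + 1) / num_steps))]
        idx = torch.randint(len(pool), (batch_size,))
        yield [target_examples[pool[i]] for i in idx]
```

Fine-tuning the source model on the batches yielded by `curriculum_batches` reproduces the easy-to-difficult transition described in the abstract.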


QGAE: an End-to-end Answer-Agnostic Question Generation Model for Generating Question-Answer Pairs

January 2023 · 49 Reads · 3 Citations

JUSTC

Question generation aims to produce meaningful and fluent questions, which can address the lack of question-answer annotated corpora by augmenting the available data. Taking unannotated text, with or without answers, as input, question generation can be divided into two types depending on whether answers are provided: answer-aware and answer-agnostic. While generating questions with provided answers is already challenging, generating high-quality questions without provided answers is even more difficult, for both humans and machines. To address this issue, we propose a novel end-to-end model called QGAE, which transforms answer-agnostic question generation into answer-aware question generation by directly extracting candidate answers. This approach makes effective use of unlabeled data for generating high-quality question-answer pairs, and its end-to-end design is more convenient than a multi-stage method that requires at least two pre-trained models. Moreover, our model achieves better average scores and greater diversity. Our experiments show that QGAE achieves significant improvements in generating question-answer pairs, making it a promising approach for question generation.
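One plausible reading of the described pipeline in code form: a shared encoder, an answer-span extraction head, and a question decoder living in a single model, so candidate answers are extracted and questions generated end to end. This is a structural sketch under assumed module names and sizes, not the QGAE architecture itself.

```python
# Structural sketch only: shared encoder over the unannotated passage, a span head that
# extracts candidate answers, and a decoder that generates the question. Masks,
# training losses, and pre-trained weights are omitted for brevity.
import torch
import torch.nn as nn

class QGAESketch(nn.Module):
    def __init__(self, vocab_size=32000, dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, 8, batch_first=True), num_layers=4)
        self.span_head = nn.Linear(dim, 2)            # start/end logits for candidate answers
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, 8, batch_first=True), num_layers=4)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, passage_ids, question_ids):
        memory = self.encoder(self.embed(passage_ids))                  # encode unannotated text
        start_logits, end_logits = self.span_head(memory).unbind(-1)    # extract candidate answer spans
        hidden = self.decoder(self.embed(question_ids), memory)         # decode the question
        return self.lm_head(hidden), start_logits, end_logits
```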


Citations (7)


... In addition to directly selecting examples from training data, another research trend involves utilizing LLMs to reformat the representation of existing demonstrations (Hao et al., 2022b; Liu et al., 2024a; Li et al., 2024a). [Table excerpt: (Sorensen et al., 2022), human design, GPT-3, Mutual Information; EPR (Rubin et al., 2022), human design, GPT-{J, 3}/CodeX, Score-based Retrieval; IDS (Qin et al., 2023), human design, GPT-3.5, Iterative Selection; AdaICL (Mavromatis et al., 2023), human design, GPT-{J, Neo}, Selective Demonstration; UDR (Li et al., 2023d), human design, GPT-Neo-2.7B, Unified Retrieval] ...

Reference:

A Survey on In-context Learning
Feature-Adaptive and Data-Scalable In-Context Learning
  • Citing Conference Paper
  • January 2024

... Curriculum learning [2] has been extensively studied as a training paradigm that orders the training set by increasing difficulty to enhance stability and sample efficiency. In the context of question answering (QA), curriculum learning has been leveraged to bridge distributional gaps between pre-training and downstream fine-tuning datasets [31], mitigating domain shift and improving generalization. Recent advances in LLMs have incorporated curriculum-inspired self-improvement mechanisms [10], wherein models iteratively augment their training data with instances they can already solve, thereby facilitating generalization to more complex reasoning tasks. ...

Curriculum Learning Driven Domain Adaptation for Low-Resource Machine Reading Comprehension
  • Citing Article
  • January 2024

Signal Processing Letters, IEEE

... Recent parameter-efficient approaches in KGs representation have introduced methods for reducing model complexity and embedding dimensions by utilizing only a small subset of entities [14,15]. In these methods, entities for embedding are chosen randomly beforehand, and rather than independently embedding each entity, the model leverages specific types of distinguishing information to encode all entities. ...
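A toy sketch of the compositional idea in this excerpt, assuming a small randomly assigned codebook shared by all entities; the module name, mean-pooling choice, and sizes are hypothetical, not the cited paper's method.

```python
# Parameter-efficient entity representation sketch: only a small codebook is embedded
# directly, and every entity is encoded by composing its randomly assigned codebook
# entries instead of learning an independent embedding per entity.
import torch
import torch.nn as nn

class CompositionalEntityEmbedding(nn.Module):
    def __init__(self, num_entities=100_000, codebook_size=1_000, codes_per_entity=8, dim=200):
        super().__init__()
        self.codebook = nn.Embedding(codebook_size, dim)        # the only learned entity parameters
        # each entity gets a fixed, random set of codebook indices chosen beforehand
        self.register_buffer(
            "assignments", torch.randint(codebook_size, (num_entities, codes_per_entity)))

    def forward(self, entity_ids):
        codes = self.codebook(self.assignments[entity_ids])     # (batch, codes_per_entity, dim)
        return codes.mean(dim=1)                                # compose into one entity vector
```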

Random Entity Quantization for Parameter-Efficient Compositional Knowledge Graph Representation
  • Citing Conference Paper
  • January 2023

... Providing an appropriate context for QG is crucial in order to produce questions that are relevant to the educational material. Before LLMs appeared, related work [10], [5], [9] implemented context-specific QG models using well-known datasets. Other work focused on educational environments, such as in [4], where resources like school repositories, Wikipedia, or other websites provided context. ...

QGAE: an End-to-end Answer-Agnostic Question Generation Model for Generating Question-Answer Pairs
  • Citing Article
  • January 2023

JUSTC

... Given the scarcity of parallel data (i.e., text pairs conveying the same content but differing in style) and the labor-intensive nature of annotating such pairs, existing research has predominantly focused on unsupervised TST. Recent contributions in this domain (Lee et al., 2021; Huang et al., 2021; Suzgun et al., 2022; Ramesh Kashyap et al., 2022; Han et al., 2023) have demonstrated significant progress. Despite notable success, these works primarily concentrate on the transfer of a single sentence, which we call short TST. ...

Text Style Transfer with Contrastive Transfer Pattern Mining
  • Citing Conference Paper
  • January 2023

... A series of illustrative experiments were meticulously crafted to showcase the advantages of employing curriculum strategies in both image classification and language modeling. When focusing on the field of NLP, by experimenting with several heuristics, (Sachan and Xing 2016; Xu et al. 2020) migrated the success of CL to NLU tasks. (Zhou et al. 2021) improved machine translation modeling by carefully designing different curricula. ...

Curriculum Learning for Natural Language Understanding
  • Citing Conference Paper
  • January 2020