Yuan Yuan’s research while affiliated with Chinese Academy of Sciences and other places


Publications (24)


Ablation studies on different importance metrics, where ||·||_* denotes the nuclear norm.
Time and memory cost comparison between GoRA and baseline methods. We report the number of trainable parameters (#Params), memory cost recorded by DeepSpeed (Memory), training time (Time@train), and initialization time (Time@init).
GoRA: Gradient-driven Adaptive Low Rank Adaptation
  • Preprint
  • File available

February 2025 · 2 Reads

Haonan He · Peng Ye · Yuchen Ren · [...] · Lei Chen

Low-Rank Adaptation (LoRA) is a crucial method for efficiently fine-tuning pretrained large language models (LLMs), with its performance largely influenced by two key factors: rank and initialization strategy. Numerous LoRA variants have been proposed to enhance its performance by addressing these factors. However, these variants often compromise LoRA's usability or efficiency. In this paper, we analyze the fundamental limitations of existing methods and introduce a novel approach, GoRA (Gradient-driven Adaptive Low Rank Adaptation), which adaptively assigns ranks and initializes weights for low-rank adapters simultaneously based on gradient information. Extensive experimental results demonstrate that GoRA significantly improves performance while preserving the high usability and efficiency of LoRA. On the T5 model fine-tuned for the GLUE benchmark, GoRA achieves a 5.88-point improvement over LoRA and slightly surpasses full fine-tuning. Similarly, on the Llama3.1-8B-Base model fine-tuned for GSM8k tasks, GoRA outperforms LoRA with a 5.13-point improvement and exceeds full fine-tuning in high-rank settings by a margin of 2.05 points.
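
As a rough illustration of why rank matters for LoRA, the sketch below counts the trainable parameters of a single adapter and splits a total rank budget across layers in proportion to an importance score. The proportional rule, the function names, and the example layer names are assumptions made for illustration. The abstract does not specify GoRA's actual allocation or initialization procedure, so this should not be read as the authors' method.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of one LoRA adapter.

    Standard LoRA adds two matrices to a frozen weight of shape
    (d_out, d_in): A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def allocate_ranks(importance, total_rank, min_rank=1):
    """Split a rank budget across layers in proportion to importance.

    HYPOTHETICAL rule for illustration only: each layer receives a share of
    `total_rank` proportional to its score, rounded to the nearest integer
    and never below `min_rank`. Because of rounding, the ranks may not sum
    exactly to `total_rank`.
    """
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {
        name: max(min_rank, round(total_rank * score / total))
        for name, score in importance.items()
    }
```

With these definitions, a layer judged three times as important as another would receive roughly three times the rank, and `lora_param_count` shows how that choice translates into trainable parameters.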



ELECTRA-based graph network model for multi-hop question answering

June 2023 · 98 Reads · 9 Citations

Journal of Intelligent Information Systems

The emergence of the HotpotQA dataset addressed the lack of training datasets for multi-hop question answering. Building on the strengths of this dataset, we proposed a novel model for multi-hop question answering, called the ELECTRA-based Graph Network model (EGN). First, the method correlated questions with contextual paragraphs and external Wikipedia data to obtain the next-hop connected paragraphs, and initialized the text data with a pre-trained context encoder, Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Second, it iteratively updated text features at different levels with a modified Graph Attention Network (GATv2). By linking more sensible clues and using ELECTRA to obtain a better representation of the data, EGN achieved a comparable result in less time with the iterative computation of GATv2. In the experiments, EGN performed well in the FullWiki setting, achieving a Joint EM/F1 score of 47.35/74.62 on the HotpotQA validation set.




Question Answering on Agricultural Knowledge Graph Based on Multi-label Text Classification

February 2023 · 57 Reads · 2 Citations

Communications in Computer and Information Science

Traditional search engines retrieve relevant web pages based on keywords in the entered question, but the required information is sometimes missing from these keyword-based results. Compared with search engines, a question answering system can provide more accurate answers. However, traditional question answering systems can only answer by matching the user’s question against the questions in stored question-answer pairs, and the number of such pairs remains limited. As a result, the user’s requirements cannot be met well. In contrast, knowledge graphs store information such as entities and their relationships in a structured form, which makes them highly scalable. In addition, the relationships between entities and the structure of the knowledge graph allow the desired answer to be found quickly. Moreover, relation classification can be regarded as a text classification task. Therefore, this study proposed a new approach to knowledge graph-based question answering that uses a named entity recognition method and a multi-label text classification method to search for answers. The recognized entity name and the predicted question type are turned into a Cypher query that searches for the answer in the knowledge graph. In this paper, three models, i.e., TextCNN, bi-LSTM, and bi-LSTM + Att, are used to examine the effectiveness of multi-label text classification, demonstrating our method’s feasibility. Among these three models, TextCNN worked best, attaining an F1 score of 0.88.
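
The step that turns an entity name and question type into a Cypher query could look roughly like the sketch below. The relationship names, the `name` property, and the question-type labels are hypothetical, since the abstract does not describe the graph schema. Cypher does not allow relationship types to be passed as query parameters, so the sketch maps each predicted label to a fixed, whitelisted relationship type and passes only the entity name as the `$name` parameter.

```python
# HYPOTHETICAL mapping from classifier labels to relationship types;
# the real schema of the agricultural knowledge graph is not given.
QUESTION_TYPE_TO_RELATION = {
    "disease": "HAS_DISEASE",
    "pest": "HAS_PEST",
    "planting_season": "PLANTED_IN",
}


def build_cypher(entity_name, question_types):
    """Build one parameterized Cypher query per predicted question type.

    `entity_name` is the output of named entity recognition and
    `question_types` is the label set from multi-label classification.
    Labels without a known relationship type are skipped.
    Returns a list of (query_string, parameters) pairs.
    """
    queries = []
    for qtype in question_types:
        relation = QUESTION_TYPE_TO_RELATION.get(qtype)
        if relation is None:
            continue  # unknown label: no query can be formed
        query = (
            f"MATCH (e {{name: $name}})-[:{relation}]->(answer) "
            "RETURN answer.name AS answer"
        )
        queries.append((query, {"name": entity_name}))
    return queries
```

Because the classifier is multi-label, a single question can yield several queries, one for each predicted type.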





History Reuse and Bag-of-Words Loss for Long Summary Generation

July 2021 · 35 Reads · 8 Citations

IEEE/ACM Transactions on Audio Speech and Language Processing

Recurrent Neural Network (RNN) based abstractive text summarization models have made great progress over the past few years, largely triggered by the encoder-decoder architecture. However, there has been little work on improving the generation of relatively long summaries. In this paper, we concentrate on two prominent problems in long summary generation. First, although significant efforts have been made to assist the encoder in handling long sequences, the decoder struggles with long sequences owing to the limited storage capacity of RNNs. We propose a simple and effective approach called history reuse, which first mines critical information from the history summary sequence and then transmits the information to the decoder. Second, since encoder-decoder models are typically trained to produce exactly the same summary as the target summary, certain word order deviations between the predicted summary and the target summary are excessively punished. Accordingly, we introduce a fully differentiable loss called bag-of-words (BoW) loss, which takes advantage of the fact that BoW discards word order information in texts, and computes the difference between the two summaries in the BoW space. Experiments on two benchmark datasets, CNN/Daily Mail and PubMed, demonstrate that our methods significantly improve over the baseline.
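
One plausible way to compare two summaries in BoW space is sketched below. It assumes the predicted bag of words is the sum of the decoder's per-step output distributions and that the distance is a mean squared error. These are illustrative choices, and the paper's exact formulation may differ. The sketch uses plain Python lists for clarity, whereas a real model would compute the same quantity on tensors so that gradients flow back to the decoder.

```python
def bow_loss(step_probs, target_ids, vocab_size):
    """Order-insensitive loss between a predicted and a target summary.

    step_probs: one probability distribution over the vocabulary per
        decoding step (a list of lists of floats).
    target_ids: token ids of the target summary.
    ASSUMPTION: squared-error distance between soft and hard word counts.
    """
    # Soft word counts: sum the predicted distributions over all steps.
    predicted = [0.0] * vocab_size
    for probs in step_probs:
        for token_id, p in enumerate(probs):
            predicted[token_id] += p

    # Hard word counts of the target summary; word order is discarded here.
    target = [0.0] * vocab_size
    for token_id in target_ids:
        target[token_id] += 1.0

    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / vocab_size
```

Since only counts are compared, a prediction that contains the right words in a different order from the target incurs no penalty under this loss.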


Citations (15)


... This has made English the most frequently studied language in the field of NLP and has led to the formation of a large literature on tasks such as QA. English QA systems have been developed with SQuAD and its derivative datasets, and the effectiveness of Transformer-based models has been proven in different fields (Raza et al., 2022; Alzubi et al., 2023; Zhu et al., 2023). NLP studies on low-resource languages such as Turkish are limited. ...

Reference:

Performance Evaluation of Transformer-Based Pre-Trained Language Models for Turkish Question-Answering
ELECTRA-based graph network model for multi-hop question answering

Journal of Intelligent Information Systems

... Ensuring stable rice production and increasing harvest yields have long been critical goals in international agricultural production, playing a significant role in maintaining global food security and social stability [1][2][3]. However, rice production is frequently threatened by various diseases, which, if not treated in time, can spread extensively, leading to substantial reductions in both yield and quality, ultimately causing severe economic losses [4][5][6][7]. For instance, bacterial blight can result in yield reductions of 20-30% during outbreaks, with losses reaching 50% or even leading to complete crop failure in extreme cases [8,9]. ...

Improved domain adaptive rice disease image recognition based on a novel attention mechanism
  • Citing Article
  • May 2023

Computers and Electronics in Agriculture

... Their approach delivered good classification results for 14 crop species and 26 diseases in the Plant Village dataset. However, deep learning is a supervised learning method [13]. As a result, especially in the studies of crop disease image recognition, the modeling quality relies heavily on large batches of labeled training samples [14]. ...

Impact of dataset on the study of crop disease image recognition

International Journal of Agricultural and Biological Engineering

... In [15], the authors highlighted some shortcomings with this first baseline, leading to further investigations into optimizing the task's results. Following this work, [16,13,18,11,17] focused on using better metrics and generation strategies as well as enhancing the dataset itself. Notably, [1] aimed to generate more contextually relevant comments by using external knowledge [14]. ...

Multi-modal guided attention for live video comments generation
  • Citing Conference Paper
  • March 2022

... This can lead them into what Hu et al. (2021) call beauty sickness, which arises when women's emotional energy is so consumed by obsession and excessive worry about their outward physical appearance that it becomes difficult for them to see other aspects of their lives. This involves being treated in a discriminatory manner on the basis of physical appearance, or what is called the beautiful woman syndrome. ...

PathosisGAN: Sick Face Image Synthesis with Generative Adversarial Network
  • Citing Conference Paper
  • May 2021

... Fu et al. [21] proposed a potential BOW model to generate definitions and determine discrete potential variables' semantics. Liu et al. [22] fused BOW with historical features for summarization generation tasks. However, counting the number of word occurrences cannot determine word frequency's importance. ...

History Reuse and Bag-of-Words Loss for Long Summary Generation
  • Citing Article
  • July 2021

IEEE/ACM Transactions on Audio Speech and Language Processing

... Traditional methods rely on mature algorithms and analysis technologies for target detection, but they have some drawbacks, such as high-accuracy requirements, long processing times, complex mechanisms, and the need for a large number of manually generated samples [7]. Conversely, deep learning-based methods facilitate hierarchical learning through the automated recognition and learning of targets from data by means of complex neural networks, circumventing the necessity for a manual feature extraction process [5]. ...

Advanced Agricultural Disease Image Recognition Technologies: A Review

Information Processing in Agriculture

... Deng et al. [10] proposed that abstract extraction of answers could be used to generate more concise answers, solving the noise problem caused by overly long answers and making them easier for users to read and understand. Yuan et al. [11] proposed the use of deep learning to enhance the semantics and improve the accuracy of the task of answer selection by mining deeper semantics in a non-factual question-answering system. Bao et al. [12] proposed a double attention recurrent convolutional neural network, realized the interaction between questions and answers by using cross-attention, and conducted multidimensional semantic modeling of questions and answers, so that the question-and-answer text can be better represented, improving the accuracy of the task of answer selection. ...

Answer Selection Using Multi-Layer Semantic Representation Learning

IOP Conference Series Materials Science and Engineering

... Compared to the other deep transfer learning approaches, fine-tuning had the best transferable performance, with an accuracy of >88 % for the test set of the target data. Kathiresan et al. [23] combined datasets from multiple sources, including dataset used by previous studies [11,30] and Chen et al. [10] trained 1100 rice plant disease samples, with 660 images sourced online and 440 from agricultural fields with varied backdrops and lighting, aided by the Fujian Institute of Subtropical Botany. The images were categorized by experts, saved in JPG format, and processed into the RGB model using Photoshop, resized to 224 × 224 pixels. ...

Agricultural Disease Image Dataset for Disease Identification Based on Machine Learning
  • Citing Chapter
  • August 2019

Lecture Notes in Computer Science

... Leaf diseases are the primary issue that reduces agricultural productivity [1]. According to the studies, 50% of crop losses are caused by plant diseases and pests [2]. Managing and controlling diseases is essential to increasing crop productivity. ...

Crop Disease Image Classification Based on Transfer Learning with DCNNs: First Chinese Conference, PRCV 2018, Guangzhou, China, November 23-26, 2018, Proceedings, Part II
  • Citing Chapter
  • November 2018

Lecture Notes in Computer Science