Zhenhua Duan’s research while affiliated with Xidian University and other places


Publications (234)


Figure 1: An instance of the Waterworld problem
Figure 3: DFA of φ = (¬r ∧ ¬g) U ((r ∧ ¬g) ∧ ○((¬r ∧ ¬g) U (g ∧ ¬r))). ¬g ∧ ¬r, g ∧ ¬r, r ∧ ¬g are logical representations of {g, r}, {g}, {r} respectively. g represents {g, r} or {g}, similar to r.
Figure 4: Relationship of task phases in ParMod
Figure 5: Depiction of the Racecar problem
Figure 6: Depiction of the Halfcheetah problem


ParMod: A Parallel and Modular Framework for Learning Non-Markovian Tasks
  • Preprint
  • File available

December 2024 · 5 Reads

Ruixuan Miao · Xu Lu · [...] · Zhenhua Duan

The commonly used Reinforcement Learning (RL) model, the Markov Decision Process (MDP), rests on the premise that rewards depend only on the current state and action. However, many real-world tasks are non-Markovian: they involve long-term memory and dependencies. The reward sparseness problem is further amplified in non-Markovian scenarios. Hence learning a non-Markovian task (NMT) is inherently more difficult than learning a Markovian one. In this paper, we propose a novel Parallel and Modular RL framework, ParMod, specifically for learning NMTs specified by temporal logic. With the aid of formal techniques, the NMT is modularized into a series of sub-tasks based on the automaton structure (equivalent to its temporal logic counterpart). On this basis, the sub-tasks are trained by a group of agents in parallel, with one agent handling one sub-task. Besides parallel training, the core of ParMod lies in a flexible classification method for modularizing the NMT and an effective reward shaping method for improving sample efficiency. A comprehensive evaluation is conducted on several challenging benchmark problems with respect to various metrics. The experimental results show that ParMod achieves superior performance over other relevant studies. Our work thus provides a good synergy among RL, NMT and temporal logic.
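
To make the idea of an automaton-structured task concrete, here is a minimal Python sketch. It is not ParMod's implementation. It only illustrates the general technique of tracking progress on a non-Markovian task with a small DFA. The task assumed here is "observe r alone, then later observe g alone, with nothing else in between"; the state names and the `step` and `run` helpers are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Hypothetical automaton states for the illustrative task
# "see r alone, then later see g alone, nothing else in between".
START, SEEN_R, ACCEPT, FAIL = 0, 1, 2, 3


def step(state: int, label: FrozenSet[str]) -> int:
    """Advance the DFA by one observation (the set of propositions that hold)."""
    if state in (ACCEPT, FAIL):
        return state  # both outcomes are absorbing
    if not label:
        return state  # nothing observed: remain in the current phase
    if state == START and label == frozenset({"r"}):
        return SEEN_R
    if state == SEEN_R and label == frozenset({"g"}):
        return ACCEPT
    return FAIL  # any other observation violates the task


def run(trace: Iterable[Set[str]]) -> Tuple[int, List[int]]:
    """Return the final DFA state and the phase that was active at each step."""
    state = START
    phases: List[int] = []
    for label in trace:
        phases.append(state)
        state = step(state, frozenset(label))
    return state, phases


if __name__ == "__main__":
    final_state, phases = run([set(), {"r"}, set(), {"g"}])
    print(final_state == ACCEPT, phases)
```

In a sketch like this, the automaton state carries the history that the raw environment state lacks, so the phase recorded at each step could serve as the index of the sub-task being attempted. How ParMod actually classifies automaton states into sub-tasks and shapes rewards is described in the paper, not here.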


Citations (40)


... In addition, we used two different criteria to evaluate the importance score of filters: ℓ2-norm [30] and geometric median [28]. Our experiments using the ℓ2-norm criterion are denoted "Ours(L2)" and those using the geometric median criterion are denoted "Ours(GM)" in Tables 1 to 4. We compare FPAD with the previous pruning methods, e.g., FPGM [28], FTWT [52], PFEC [21], HRank [6], ABCPruner [38], LFPC [29], PGMPF [50], MGPF [51], AFPruner [53], CLR-RNF [54], WACP [55]. Experimental results show that our FPAD is superior to the state-of-the-art methods. ...

Reference:

Filter pruning via annealing decaying for deep convolutional neural networks acceleration
A multi-granularity CNN pruning framework via deformable soft mask with joint training
  • Citing Article
  • December 2023

Neurocomputing

... These algorithms continuously share information between searches, optimizing both path length and obstacle avoidance efficiency, making them highly effective for real-time applications. In addition, Yao et al. developed a Dynamic Parameter Adaptive Path Planning Algorithm (DPARL), which introduces a dynamic parameter adjustment strategy to improve path stability and convergence speed in environments lacking prior information [6]. This adaptability enhances the algorithm's performance in complex, unpredictable settings. ...

A Dynamic Parameter Adaptive Path Planning Algorithm
  • Citing Chapter
  • December 2023

Lecture Notes in Computer Science

... Generating test descriptions. DiffSpec incorporates the specifications and constraints for each tested, along with all the additional context extract from the various artifacts, to prompt the LLM to generate a configurable number (we use 10 This step is repeated for every combination of extracted code difference and bug category. ...

SBDT: Search-Based Differential Testing of Certificate Parsers in SSL/TLS Implementations
  • Citing Conference Paper
  • July 2023

... Distributing the monitoring tasks across multiple processors or systems [42,82,83] enables the system to handle a higher volume of events and data without significant performance degradation, hence improving the efficiency of runtime verification. ...

Adaptively parallel runtime verification based on distributed network for temporal properties
  • Citing Article
  • June 2023

Parallel Computing

... Third, the basic ideas of the proposed approach are rather general. As such, it would be interesting to test these ideas on other related problems, such as the clustered set covering problem [1], the maximum group set coverage problem [8], and budgeted maximum coverage problem [15]. Finally, as far as the exact solution of the problem is concerned, only the general CPLEX solver (B&B) has been studied in the literature. ...

Formulate Full View Camera Sensor Coverage by Using Group Set Coverage
  • Citing Chapter
  • February 2023

Lecture Notes of the Institute for Computer Sciences

... Program synthesis [11] and verification [5] are two ways to improve the quality of software. In this paper, we propose a tool, namely PIChecker, that utilizes the PC-DPOR [9] and C-Intp [8] techniques to verify the reachability properties of concurrent programs. These techniques work in two different ways, equivalent trace class partitioning and infeasible conditional branch pruning, to reduce the search space in model checking. ...

Prioritized Constraint-Aided Dynamic Partial-Order Reduction
  • Citing Conference Paper
  • January 2023

... Distributing the monitoring tasks across multiple processors or systems [42,82,83] enables the system to handle a higher volume of events and data without significant performance degradation, hence improving the efficiency of runtime verification. ...

A Distributed Network-Based Runtime Verification of Full Regular Temporal Properties
  • Citing Article
  • January 2022

IEEE Transactions on Parallel and Distributed Systems

... Xiao et al. [24] proposed an adversarial example generation method via source-agnostic adversarial feature inducing for improving the transferability of the generated adversarial examples. Dong et al. [25] proposed SDM-FGSM to improve the transferability of the generated adversarial examples, and alleviate the overfitting of source model in target attack by data augmentation. ...

Improving Transferability of Adversarial Examples by Saliency Distribution and Data Augmentation
  • Citing Article
  • June 2022

Computers & Security

... A trilayer mobile hybrid hierarchical peer-to-peer (MHP2P) model was proposed by Duan et al. in [65] as a cloudlet for efficient load balancing strategy through mobile edge computing (MEC). MHP2P promises high reliability, scalability, and efficiency in service lookups. ...

A novel load balancing scheme for mobile edge computing
  • Citing Article
  • December 2021

Journal of Systems and Software

... Jiang et al. used Node2Graph [109] to compute embedded vectors from function-call graphs in parallel with a boolean model of Windows API calls to input them in an auto-encoder neural network for malware detection [134]. Zhang et al. used GloVe [215], which trains a global word co-occurrence matrix and produces a word vector space model, to compute the embedded vectors and use them to train a CNN for malware detection [325]. ...

Malware Detection using CNN via Word Embedding
  • Citing Conference Paper
  • August 2021