Figure - available from: Sensors
Results of assembly phase analysis: (a–c) upper-hole inserting phase; (d,e) lower-hole approaching phase; and (f–i) lower-hole inserting phase.


Source publication
Article
Full-text available
For the dual peg-in-hole compliant assembly task of micro-devices with an upper and lower double-hole structure, a skill-learning method is proposed. This method combines offline training in a simulation space with online training in a realistic space. In this paper, a dual peg-in-hole model is built according to the results of a force analysis, and conta...
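The two-stage scheme described in the abstract, offline training in simulation followed by online training on the physical setup, can be sketched roughly as below. The environment names (SimDualPegEnv-v0, RealDualPegEnv-v0) and the use of stable-baselines3 DDPG are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np
import gymnasium as gym
from stable_baselines3 import DDPG
from stable_baselines3.common.noise import NormalActionNoise

# Hypothetical Gymnasium environments standing in for the simulated and real setups.
sim_env = gym.make("SimDualPegEnv-v0")
n_actions = sim_env.action_space.shape[0]
noise = NormalActionNoise(mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions))

# Offline stage: learn the insertion skill entirely in simulation.
agent = DDPG("MlpPolicy", sim_env, action_noise=noise, verbose=1)
agent.learn(total_timesteps=200_000)
agent.save("ddpg_dual_peg_sim")

# Online stage: brief fine-tuning of the pre-trained policy on the physical robot.
real_env = gym.make("RealDualPegEnv-v0")
agent = DDPG.load("ddpg_dual_peg_sim", env=real_env)
agent.learn(total_timesteps=10_000)
```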

Similar publications

Preprint
Full-text available
Alignment with human preferences is commonly framed using a universal reward function, even though human preferences are inherently heterogeneous. We formalize this heterogeneity by introducing user types and examine the limits of the homogeneity assumption. We show that aligning to heterogeneous preferences with a single policy is best achieved us...
Article
Full-text available
Reinforcement learning (RL) systems can be complex and non-interpretable, making it challenging for non-AI experts to understand or intervene in their decisions. This is due in part to the sequential nature of RL in which actions are chosen because of their likelihood of obtaining future rewards. However, RL agents discard the qualitative features...
Preprint
Full-text available
This paper presents Dual Action Policy (DAP), a novel approach to address the dynamics mismatch inherent in the sim-to-real gap of reinforcement learning. DAP uses a single policy to predict two sets of actions: one for maximizing task rewards in simulation and another specifically for domain adaptation via reward adjustments. This decoupling makes...
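As a rough illustration of the decoupling described above, the snippet below sketches a policy with a shared trunk and two action heads, one aimed at the simulated task reward and one at domain-adaptation corrections. The network sizes, names, and PyTorch framing are assumptions for illustration, not the DAP authors' architecture.

```python
import torch
import torch.nn as nn

class DualActionPolicy(nn.Module):
    """Single policy that predicts two action sets from one observation."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.task_head = nn.Linear(hidden, act_dim)   # actions maximizing the task reward in sim
        self.adapt_head = nn.Linear(hidden, act_dim)  # corrective actions for domain adaptation

    def forward(self, obs: torch.Tensor):
        z = self.trunk(obs)
        return torch.tanh(self.task_head(z)), torch.tanh(self.adapt_head(z))

# Example: 12-dimensional observation, 6-DoF end-effector action (illustrative sizes).
policy = DualActionPolicy(obs_dim=12, act_dim=6)
task_action, adapt_action = policy(torch.zeros(1, 12))
```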
Article
Full-text available
Multi-feed radial distribution systems reduce system losses through reconfiguration techniques, but reconfiguration can reduce losses only to a certain extent. The introduction of distributed generators has vastly improved the performance of distribution systems. Distributed generators can be used for reduction of los...
Article
Full-text available
This work investigates the implementation of the Deep Deterministic Policy Gradient (DDPG) algorithm to enhance the target-reaching capability of the seven degree-of-freedom (7-DoF) Franka Panda robotic arm. A simulated environment is established by employing OpenAI Gym, PyBullet, and Panda Gym. After 100,000 training time steps, the DDPG algorithm...
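A setup of this kind can be approximated with panda-gym and stable-baselines3, as sketched below; the reach task and hyperparameters chosen here are illustrative and may differ from the article's exact configuration.

```python
import numpy as np
import gymnasium as gym
import panda_gym  # registers PandaReach-v3 and related PyBullet-based tasks
from stable_baselines3 import DDPG
from stable_baselines3.common.noise import NormalActionNoise

# Goal-conditioned reach task for the simulated Franka Panda arm.
env = gym.make("PandaReach-v3")
n_actions = env.action_space.shape[0]
action_noise = NormalActionNoise(mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions))

# Dict observations require the multi-input policy.
model = DDPG("MultiInputPolicy", env, action_noise=action_noise, verbose=1)
model.learn(total_timesteps=100_000)  # same step budget as mentioned in the abstract
model.save("ddpg_panda_reach")
```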

Citations

... AAC showed superior sample efficiency and stability, particularly in more complex tasks, while the HRL algorithm with four options demonstrated success in generalization tasks. Wu et al. (2023) employed a dual peg-in-hole task with an upper and lower double-hole structure, generating an initial pose, identifying contact points, and calculating contact forces before the actor executes actions using the DDPG algorithm. Finally, Apolinarska et al. (2021) investigated the assembly of lap joints for custom timber frames, using an APE-X DDPG algorithm to insert two laps with one joint and three laps with two joints. ...
... To enhance generalization, reward curriculum learning was implemented, and a model fusion framework was employed to integrate previously acquired knowledge. Wu et al. (2023) utilize a skill-learning method that combines offline pre-training in a simulated environment with online training in a real-world setting. Additionally, Gaussian noise is applied, and experiments are conducted on a real robot. ...
... A notable strategy for transferring learning from simulation to reality involves initially training the policy within a simulated environment using domain randomization, followed by retraining the policy on a real robot. This approach tends to yield optimal performance, particularly when the discrepancy between the simulation and real-world physics is minimal (Yasutomi et al., 2023;Zhang et al., 2022;Nguyen et al., 2024;Beltran-Hernandez et al., 2020;Chen et al., 2021;Men et al., 2023;Ji et al., 2024;Li et al., 2022;Li & Wang, 2024;Wu et al., 2023;Shi et al., 2023;Ming et al., 2023;Jiang et al., 2023). Conversely, transferring policies directly from simulation without subsequent retraining (Hebecker et al., 2021;Leyendecker et al., 2022;Apolinarska et al., 2021), or training solely on real robots from scratch (Zhao et al., 2020), generally does not achieve comparable performance. ...
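The domain-randomization step mentioned in this strategy is commonly realized by re-sampling simulator dynamics at every episode reset before the sim-trained policy is fine-tuned on the real robot. The wrapper below is a generic sketch; the parameter names and the set_dynamics hook are hypothetical and not taken from any of the cited works.

```python
import gymnasium as gym
import numpy as np

class DomainRandomizationWrapper(gym.Wrapper):
    """Re-samples simulated dynamics at every reset so the policy cannot overfit
    to a single physics configuration before sim-to-real transfer."""

    def __init__(self, env, rng_seed: int = 0):
        super().__init__(env)
        self.rng = np.random.default_rng(rng_seed)

    def reset(self, **kwargs):
        # Hypothetical dynamics parameters sampled within plausible bounds.
        params = {
            "friction": self.rng.uniform(0.4, 1.2),
            "peg_clearance_mm": self.rng.uniform(0.02, 0.10),
        }
        # Hypothetical hook exposed by the simulated environment.
        self.env.unwrapped.set_dynamics(**params)
        return self.env.reset(**kwargs)
```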
Article
Full-text available
The increasing complexity of production environments and fluctuations in short-term demand require adaptive and robust processes. To cope with the inherent challenges, deep reinforcement learning algorithms have been widely deployed in assembly processes in recent years, due to their generalization capabilities, which ensure enhanced usability and flexibility for diverse assembly applications. Despite a growing number of scientific papers investigating deep learning-based assembly and associated generalization capabilities, a comprehensive review and assessment of potential generalization capabilities has yet to be conducted. This paper aims to provide researchers and practitioners with an evaluation of key influences which contribute to a successful generalization of deep reinforcement learning within assembly processes, thereby facilitating further implementations. Our findings reveal that current research primarily focuses on examining generalization in insertion and sequence-planning assembly tasks. Furthermore, we identified many context-specific approaches to enhance generalization, as well as remaining research challenges and gaps. The results comprise four overarching factors, containing several specific approaches that increase generalizability in assembly processes. However, future research must focus on verifying the context independence of these factors.