Asen Nachkov’s research while affiliated with Medical University of Sofia and other places


Publications (5)


Figure 1. Task scheme. Our method learns the deformation animation of an object between two frames that exhibit large topological changes.
Figure 2. Architecture of the SDF module. Each point x ∈ [0, 1]³ is encoded using (a) the HashGrid presented in Section 3.1. The coordinate encoding (of dimension F × L) is then concatenated with the positional encoding of time, γ(t), and fed into an MLP (the SDF Head). (b) The signed distance value of each point is estimated, and the Lagrangian mesh representation is extracted via Marching Cubes [20]. As our experiments show, the model can learn a continuous representation of the deformation with respect to time (a minimal code sketch of this module follows these captions).
Figure 8. Ten-frame static Voronoi sphere reconstruction. Although the model is supervised on 10 frames (t = 0, 0.1, ..., 0.9), the time-consistency regularization keeps the prediction consistent at intermediate times in this interval as well.
Figure 11. Rendering Module (Sec. 3.4) outcomes after our inference approach, rendered at different time steps. The Rendering Module is initialized by placing splats on the surface of a sphere so that they cover it completely.
Figure 13. NIE [22] model initialized as all spheres and rendered at three sample time steps.
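
To give a concrete picture of the SDF module in Figure 2, here is a minimal PyTorch sketch. It substitutes a single dense feature grid for the paper's multi-resolution HashGrid, and the layer sizes, time-encoding frequencies, and names are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TimeConditionedSDF(nn.Module):
    def __init__(self, grid_res=32, feat_dim=8, t_freqs=4, hidden=64):
        super().__init__()
        # Dense 3D feature grid as a stand-in for the multi-resolution HashGrid.
        self.grid = nn.Parameter(torch.zeros(1, feat_dim, grid_res, grid_res, grid_res))
        self.t_freqs = t_freqs
        in_dim = feat_dim + 2 * t_freqs               # grid features + gamma(t)
        self.head = nn.Sequential(                    # "SDF Head" MLP
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),                     # signed distance value
        )

    def encode_time(self, t):
        # Positional encoding gamma(t): sin/cos at geometrically spaced frequencies.
        freqs = (2.0 ** torch.arange(self.t_freqs, device=t.device)) * torch.pi
        ang = t[:, None] * freqs[None, :]
        return torch.cat([torch.sin(ang), torch.cos(ang)], dim=-1)

    def forward(self, x, t):
        # x: (N, 3) points in [0, 1]^3, t: (N,) times in [0, 1].
        coords = (x * 2 - 1).view(1, -1, 1, 1, 3)     # grid_sample expects [-1, 1]
        feat = F.grid_sample(self.grid, coords, align_corners=True)
        feat = feat.view(self.grid.shape[1], -1).t()  # (N, feat_dim)
        sdf = self.head(torch.cat([feat, self.encode_time(t)], dim=-1))
        return sdf.squeeze(-1)

# At inference, the zero level set of the predicted SDF at a fixed t can be meshed
# with Marching Cubes (e.g., skimage.measure.marching_cubes), as in Fig. 2(b).
```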


Neural 4D Evolution under Large Topological Changes from 2D Images
  • Preprint
  • File available

November 2024 · 24 Reads · Asen Nachkov · [...] · Danda Paudel

In the literature, it has been shown that the evolution of a known explicit 3D surface to a target one can be learned from 2D images using an instantaneous flow field, where the known and target 3D surfaces may differ largely in topology. We are interested in capturing 4D shapes whose topology changes largely over time. We find that a straightforward extension of the existing 3D-based method to the desired 4D case performs poorly. In this work, we address the challenges of extending 3D neural evolution to 4D under large topological changes by proposing two novel modifications. More precisely, we introduce (i) a new architecture to discretize and encode the deformation and learn the SDF, and (ii) a technique to impose temporal consistency. (iii) We also propose a rendering scheme for color prediction based on Gaussian splatting. Furthermore, to facilitate learning directly from 2D images, we propose a learning framework that disentangles geometry and appearance from RGB images. This disentanglement method, while useful for the 4D evolution problem we concentrate on, is also novel and valid for static scenes. Our extensive experiments on various data provide strong results and, most importantly, open a new approach toward reconstructing challenging scenes with significant topological changes and deformations. Our source code and the dataset are publicly available at https://github.com/insait-institute/N4DE.
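
As a concrete illustration of the splat-based color rendering mentioned above, the sketch below initializes splats on the surface of a sphere so that they cover it completely, as described for Figure 11. The uniform sphere sampling and the per-splat parameters (scales, quaternions, colors, opacities) are assumptions for illustration, not the paper's exact initialization.

```python
import torch

def init_splats_on_sphere(n_splats=10_000, radius=0.5, center=(0.5, 0.5, 0.5)):
    # Sample directions uniformly on the unit sphere, then scale and shift.
    dirs = torch.randn(n_splats, 3)
    dirs = dirs / dirs.norm(dim=-1, keepdim=True)
    means = torch.tensor(center) + radius * dirs                        # centers on the sphere surface
    scales = torch.full((n_splats, 3), 2.0 * radius / n_splats ** 0.5)  # rough full coverage
    rotations = torch.zeros(n_splats, 4)
    rotations[:, 0] = 1.0                                               # identity quaternions
    colors = torch.rand(n_splats, 3)                                    # appearance, refined during training
    opacities = torch.full((n_splats, 1), 0.5)
    params = dict(means=means, scales=scales, rotations=rotations,
                  colors=colors, opacities=opacities)
    return {name: torch.nn.Parameter(v) for name, v in params.items()}

splats = init_splats_on_sphere()   # optimizable splat parameters for the Rendering Module
```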



Fig. 1: End-to-end learning of controllers. Our framework utilizes a differentiable simulator to learn autonomous vehicle controllers from the corrections between the simulated new states and the target states.
Fig. 3: Unrolling the model in time with gradient detachment inside the differentiable simulator. Starting from the simulator state s_t, we obtain an observation o_t containing scene elements such as agent locations, traffic lights, and the roadgraph, which is encoded into features x_t. An RNN (recurrent over time) with a policy head outputs actions a_t, which are executed in the simulated environment to obtain the new state s_{t+1}. When applying a loss between s_{t+1} and ŝ_{t+1}, the gradients flow back through the environment and update the policy head, the RNN, and the scene encoder. Similar to BPTT, gradients through the RNN hidden state accumulate. We do not backpropagate through the observation or the simulator state.
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation

September 2024 · 14 Reads

Current methods to learn controllers for autonomous vehicles (AVs) focus on behavioural cloning. Being trained only on exact historic data, the resulting agents often generalize poorly to novel scenarios. Simulators provide the opportunity to go beyond offline datasets, but they are still treated as complicated black boxes, used only to update the global simulation state. As a result, the resulting RL algorithms are slow, sample-inefficient, and prior-agnostic. In this work, we leverage a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers on the large-scale Waymo Open Motion Dataset. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of the environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We combine this setup with a recurrent architecture that can efficiently propagate temporal information across long simulated trajectories. This APG method allows us to learn robust, accurate, and fast policies, while requiring only widely available expert trajectories instead of scarce expert actions. Compared to behavioural cloning, we find significant improvements in performance and robustness to noise in the dynamics, as well as more intuitive, human-like handling overall.
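
To make the training loop of Fig. 3 concrete, below is a minimal PyTorch sketch of analytic policy gradients through a differentiable simulator with gradient detachment: the observation and the incoming simulator state are detached, the loss on the next state backpropagates one step through the dynamics into the policy, and long-horizon credit flows through the RNN hidden state. The toy dynamics, encoder, and dimensions are illustrative placeholders, not the simulator or models used in the paper.

```python
import torch
import torch.nn as nn

state_dim, obs_dim, act_dim, hid, horizon = 4, 4, 2, 32, 20
encoder = nn.Linear(obs_dim, hid)                      # scene encoder (placeholder)
rnn = nn.GRUCell(hid, hid)                             # recurrent policy backbone
policy_head = nn.Linear(hid, act_dim)
params = [*encoder.parameters(), *rnn.parameters(), *policy_head.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

def dynamics(state, action):
    # Stand-in differentiable simulator step (e.g., a simple kinematic update).
    return state + 0.1 * torch.tanh(nn.functional.pad(action, (0, state_dim - act_dim)))

target_traj = torch.randn(horizon, 1, state_dim)       # expert *states*, not actions
state, h, loss = torch.zeros(1, state_dim), torch.zeros(1, hid), 0.0
for t in range(horizon):
    obs = state.detach()                               # no gradient through the observation
    h = rnn(torch.relu(encoder(obs)), h)               # gradients accumulate via the hidden state (as in BPTT)
    action = policy_head(h)
    next_state = dynamics(state.detach(), action)      # gradient flows one step through the simulator
    loss = loss + ((next_state - target_traj[t]) ** 2).mean()
    state = next_state
opt.zero_grad(); loss.backward(); opt.step()
```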



Figure 1: Schematic and data flow in BoP.
Figure 2: Comparison of BDPG and BoP on selected Atari environments.
Bag of Policies for Distributional Deep Exploration

August 2023 · 57 Reads

Efficient exploration in complex environments remains a major challenge for reinforcement learning (RL). In contrast to previous Thompson-sampling-inspired mechanisms that enable temporally extended exploration, i.e., deep exploration, we focus on deep exploration in distributional RL. We develop a general-purpose approach, Bag of Policies (BoP), that can be built on top of any return-distribution estimator by maintaining a population of its copies. BoP consists of an ensemble of multiple heads that are updated independently. During training, each episode is controlled by only one of the heads, and the collected state-action pairs are used to update all heads off-policy, leading to distinct learning signals for each head, which diversifies learning and behaviour. To test whether an optimistic ensemble method can improve distributional RL as it has improved scalar RL (e.g., via Bootstrapped DQN), we implement the BoP approach with a population of distributional actor-critics using Bayesian Distributional Policy Gradients (BDPG). The population thus approximates a posterior distribution over return distributions along with a posterior distribution over policies. Another benefit of building upon BDPG is that it allows us to analyze global posterior uncertainty together with a local curiosity bonus for exploration. As BDPG is already an optimistic method, this pairing helps to investigate whether optimism accumulates in distributional RL. Overall, BoP results in greater robustness and speed during learning, as demonstrated by our experimental results on ALE Atari games.
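
Below is a minimal sketch of the Bag of Policies training pattern described above: a single head controls each episode, and every head is updated off-policy from the same collected transitions. The toy environment, the simple TD-style scalar critic update (standing in for BDPG's distributional machinery), and all sizes are illustrative assumptions, not the paper's implementation.

```python
import random
import torch
import torch.nn as nn

obs_dim, act_dim, n_heads = 8, 4, 5

class Head(nn.Module):
    # One ensemble member: a policy plus a (scalar, stand-in) value critic.
    def __init__(self):
        super().__init__()
        self.policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
        self.critic = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    def act(self, obs):
        return torch.distributions.Categorical(logits=self.policy(obs)).sample()

class ToyEnv:
    # Minimal stand-in environment with an (obs, reward, done) interface.
    def reset(self):
        self.t = 0
        return torch.randn(obs_dim)
    def step(self, action):
        self.t += 1
        return torch.randn(obs_dim), 1.0, self.t >= 10

heads = [Head() for _ in range(n_heads)]
opts = [torch.optim.Adam(h.parameters(), lr=3e-4) for h in heads]

def run_episode(env, behaviour):
    # A single head controls the whole episode; all transitions are kept for everyone.
    obs, transitions, done = env.reset(), [], False
    while not done:
        action = behaviour.act(obs)
        next_obs, reward, done = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        obs = next_obs
    return transitions

def update_all_heads(transitions):
    # Every head learns off-policy from the same data, giving each head a
    # distinct learning signal and keeping the ensemble diverse.
    for head, opt in zip(heads, opts):
        loss = sum(((head.critic(o) - (r + 0.99 * (1 - d) * head.critic(no).detach())) ** 2).mean()
                   for o, a, r, no, d in transitions)
        opt.zero_grad(); loss.backward(); opt.step()

env = ToyEnv()
for _ in range(3):
    behaviour = random.choice(heads)          # one head drives this episode
    update_all_heads(run_episode(env, behaviour))
```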