C. Daniel Freeman’s research while affiliated with Google Inc. and other places

Publications (16)


Learned Neural Physics Simulation for Articulated 3D Human Pose Reconstruction
  • Chapter

October 2024

Mykhaylo Andriluka

·

Baruch Tabanpour

·

C. Daniel Freeman

·

Cristian Sminchisescu

Fig. 3: Example of a generated sequence obtained with a model that includes displacement features and displacement loss (bottom row), and without either of these elements (middle row). We highlight inconsistencies of the joint positions with the red circles.
Fig. 4: Experiments with the dynamics network: training subsequence length N_h (a), ablations of the joint displacement feature and loss (b), ablations of non-linearity type and learning rate schedule (c), and evaluation of variants for the contact network (d). Error bars show one standard deviation calculated over 5 runs. The x-axis sweeps the time window over which metrics are computed.
Fig. 5: Evaluation of LARP on datasets with colliding objects.
Fig. 6: Left: Reconstructed 3D poses on four consecutive video frames from the AIST-hard dataset. Middle row shows results obtained with the kinematic pipeline from [13, 18] that LARP uses for initialization. Bottom row shows results obtained with LARP integrated into [13]. Middle: Motion sequence with person-ball collision simulated with LARP (bottom) and comparison to the Bullet engine [8] (top). Right: Examples of generated human motion sequences of a person kicking a ball for three different ball targets. In each image we show the position of the ball right after the kick and at the end of the sequence. Note that the person's pose differs considerably depending on the ball target.
Fig. 9: Example of estimated pose from "S9-WalkDog" seq. after 11 sec. of input. Left: input frame, middle: result obtained with SuperTrack trained on longer sequences, right: result obtained with LARP.
Learned Neural Physics Simulation for Articulated 3D Human Pose Reconstruction
  • Preprint
  • File available

October 2024

·

12 Reads

We propose a novel neural network approach, LARP (Learned Articulated Rigid body Physics), to model the dynamics of articulated human motion with contact. Our goal is to develop a faster and more convenient methodological alternative to traditional physics simulators for use in computer vision tasks such as human motion reconstruction from video. To that end we introduce a training procedure and model components that support the construction of a recurrent neural architecture to accurately simulate articulated rigid body dynamics. Our neural architecture supports features typically found in traditional physics simulators, such as joint motors, variable dimensions of body parts, and contact between body parts and objects, and is an order of magnitude faster than traditional systems when multiple simulations are run in parallel. To demonstrate the value of LARP we use it as a drop-in replacement for a state-of-the-art classical non-differentiable simulator in an existing video-based reconstruction framework and show comparable or better 3D human pose reconstruction accuracy.
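The recurrent-simulation idea in the abstract can be sketched in a few lines. The sketch below is a hypothetical toy, not the actual LARP architecture: a small network (here with random, untrained weights standing in for a trained dynamics model) predicts joint accelerations from the current joint state, and a semi-implicit Euler step advances the simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Random small MLP weights standing in for a trained dynamics network."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

def simulate(params, q0, qd0, n_steps, dt=1.0 / 60.0):
    """Recurrent rollout: the network predicts joint accelerations,
    which are integrated with a semi-implicit Euler step."""
    q, qd = q0, qd0
    traj = [q]
    for _ in range(n_steps):
        qdd = mlp(params, np.concatenate([q, qd]))  # predicted accelerations
        qd = qd + dt * qdd
        q = q + dt * qd
        traj.append(q)
    return np.stack(traj)

n_joints = 4
params = init_mlp([2 * n_joints, 32, n_joints], rng)
traj = simulate(params, np.zeros(n_joints), np.zeros(n_joints), n_steps=10)
print(traj.shape)  # (11, 4)
```

The real model additionally conditions on contact features, joint-motor targets, and body-part dimensions; this sketch only illustrates the recurrent predict-then-integrate loop.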


Figure 1: Data and the training pipeline. <S TKN>, <P TKN> and <O TKN> are special tokens indicating subject, predicate, and object, respectively. (a) The original data exist in the form of a Knowledge Graph (KG), where nodes representing subjects and objects are connected by predicates (arrows). (b) The KG is then formatted into triplets: subject, predicate, object, and further prefixed with special tokens indicating their identity. Such formatted data are used to pretrain autoregressive LMs with the common next-token-prediction loss. (c) Pretrained LMs are evaluated by prefixing with subject and predicate alongside special tokens to predict objects. (d) On top of pretrained LMs, detectors are trained to detect the presence of hallucinations during generation.
Figure 5: Hallucination detection accuracy as a function of the LM size for various task formulations and detector types. Detectors were trained and evaluated on distinct splits of data obtained by having a given pretrained LM generate 5 completions for every subject-predicate pair in its training set (using temp = 1.0). The accuracy of all the trained hallucination detectors is generally high, especially for outputs of the larger LMs. Larger (full) detectors work better than smaller ones (head). The token-level detection task formulation seems to provide higher detection accuracy, although not in all cases. The results here are confounded by the varying hallucination rates of the underlying LM (e.g., if the LM hallucinates only 5% of the time, a detector which finds no hallucinations achieves 95% accuracy).
Figure 6: AUC-PR as a function of LM hallucination rate for the full detectors. Same setup as in Figure 5, except LM size is now represented by the marker size. Showing results for data generated by LMs trained for 100 (resp. 20) epochs on 1% (resp. 10%) of the data. AUC-PR does not depend on the proportion of hallucinations in the evaluation data (i.e., the LM's hallucination rate), thus providing a better measure of the detector's ability to catch hallucinations. Unlike for accuracy (Figure 5), the sentence task is clearly superior in AUC-PR terms (can also be seen in Figure 9), although better token performance can be attained by attaching the detector to a different LM layer (Figure 10). More importantly, the detectability of hallucinations is inversely proportional to the LM size (largest dots/LMs in bottom left, smallest in top right). Larger LMs have lower hallucination rates, but their hallucinations are also harder to detect. This can be seen even more clearly in Figure 7.
Figure 8: Hallucination rate per LM training FLOPs on examples seen (top) and not seen (bottom) during training. Same as Figure 2, except for using temp = 0 instead of temp = 1 to generate the samples. Note the more pronounced decay in out-of-distribution (IVS) performance with length of training, emphasising the possible trade-off between training set hallucination rate and
Number of training steps for different data sizes.
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

August 2024

·

54 Reads

·

1 Citation

Jiri Hron

·

Laura Culp

·

Gamaleldin Elsayed

·

[...]

·

Simon Kornblith

While many capabilities of language models (LMs) improve with increased training budget, the influence of scale on hallucinations is not yet fully understood. Hallucinations come in many forms, and there is no universally accepted definition. We thus focus on studying only those hallucinations where a correct answer appears verbatim in the training set. To fully control the training data content, we construct a knowledge graph (KG)-based dataset, and use it to train a set of increasingly large LMs. We find that for a fixed dataset, larger and longer-trained LMs hallucinate less. However, hallucinating on ≤5% of the training data requires an order of magnitude larger model, and thus an order of magnitude more compute, than Hoffmann et al. (2022) reported was optimal. Given this costliness, we study how hallucination detectors depend on scale. While we see that detector size improves performance on a fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.
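The KG serialization described in Figure 1 above can be illustrated with a short sketch. The special token strings and the example triples below are illustrative, not the paper's exact tokens or data: each triple is linearized with identity tokens for subject, predicate, and object, and an evaluation prompt truncates the sequence after the object token so the LM must complete the object.

```python
def format_triple(subject, predicate, obj):
    """Serialize one knowledge-graph triple with special identity tokens,
    as in the Figure 1 pipeline (token names here are illustrative)."""
    return f"<S TKN> {subject} <P TKN> {predicate} <O TKN> {obj}"

def query_prefix(subject, predicate):
    """Evaluation prompt: the pretrained LM must complete the object."""
    return f"<S TKN> {subject} <P TKN> {predicate} <O TKN>"

# Toy KG as (subject, predicate, object) triples.
kg = [("Marie Curie", "field", "physics"),
      ("Marie Curie", "born_in", "Warsaw")]

corpus = [format_triple(*t) for t in kg]
print(corpus[0])
# <S TKN> Marie Curie <P TKN> field <O TKN> physics
print(query_prefix("Marie Curie", "born_in"))
# <S TKN> Marie Curie <P TKN> born_in <O TKN>
```

Because every correct object appears verbatim in the serialized training corpus, a completion that disagrees with the KG can be labeled a hallucination mechanically, which is what makes the controlled study possible.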




Transformer-Based Learned Optimization

December 2022

·

15 Reads

In this paper, we propose a new approach to learned optimization. As common in the literature, we represent the computation of the update step of the optimizer with a neural network. The parameters of the optimizer are then learned on a set of training optimization tasks, in order to perform minimisation efficiently. Our main innovation is to propose a new neural network architecture for the learned optimizer inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates but use a transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization approaches, our formulation allows for conditioning across different dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for evaluation of optimization algorithms, as well as on the real-world task of physics-based reconstruction of articulated 3D human motion.
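The "preconditioner as a sum of rank-one updates" idea can be sketched concretely. In the toy below the rank-one vectors are random placeholders; in the paper they would be predicted by the transformer, jointly with the step length. This is a minimal illustration of the update structure, not the paper's architecture.

```python
import numpy as np

def precondition(grad, update_vecs, step_length):
    """Build a preconditioner as identity plus a sum of symmetric
    rank-one terms v v^T (BFGS-style), then take a scaled
    preconditioned gradient step."""
    d = grad.shape[0]
    P = np.eye(d)
    for v in update_vecs:
        P += np.outer(v, v)  # symmetric rank-one update
    return -step_length * (P @ grad)

rng = np.random.default_rng(1)
d = 5
grad = rng.normal(size=d)
# Placeholder rank-one vectors; the transformer would predict these.
vecs = [rng.normal(size=d) * 0.1 for _ in range(3)]
step = precondition(grad, vecs, step_length=0.01)
print(step.shape)  # (5,)
```

Since P = I + Σ v vᵀ is positive definite by construction, the resulting step is always a descent direction for the local gradient, which is the same structural guarantee BFGS-style updates provide.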


VeLO: Training Versatile Learned Optimizers by Scaling Up

November 2022

·

20 Reads

·

3 Citations

While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. Meta-trained with approximately four thousand TPU-months of compute on a wide variety of optimization tasks, our optimizer not only exhibits compelling performance, but optimizes in interesting and unexpected ways. It requires no hyperparameter tuning, instead automatically adapting to the specifics of the problem being optimized. We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive optimizer benchmark suite with baselines at velo-code.github.io.
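The core interface of a learned optimizer like the one described above can be sketched as follows. This is a hypothetical toy with random (un-meta-trained) weights, not the VeLO network: a small shared MLP consumes per-parameter features (here just gradient and momentum) and emits a per-parameter update.

```python
import numpy as np

class TinyLearnedOptimizer:
    """Toy learned optimizer: a shared two-layer MLP maps per-parameter
    features (gradient, momentum) to an update. Weights here are random;
    in practice they would be meta-trained over many tasks."""
    def __init__(self, rng, hidden=8):
        self.W1 = rng.normal(0, 0.1, (2, hidden))
        self.W2 = rng.normal(0, 0.1, (hidden, 1))
        self.m = None  # momentum accumulator, created lazily

    def step(self, params, grads, beta=0.9):
        if self.m is None:
            self.m = np.zeros_like(grads)
        self.m = beta * self.m + (1 - beta) * grads
        feats = np.stack([grads, self.m], axis=-1)  # shape (..., 2)
        h = np.tanh(feats @ self.W1)                # shape (..., hidden)
        update = (h @ self.W2)[..., 0]              # shape (...,)
        return params + 0.01 * update

opt = TinyLearnedOptimizer(np.random.default_rng(2))
params = np.ones(3)
new_params = opt.step(params, grads=2 * params)
print(new_params.shape)  # (3,)
```

The point of the interface is that nothing here is a tunable hyperparameter exposed to the user: once meta-trained, the network's weights determine how the optimizer adapts to each problem.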


Fig. 2: The planning loop takes in state s and optimizes reference signals (i.e., operational space references ∆x and null space references ∆q) for the robot controller, which generates joint torques τ as actuation commands.
Fig. 5: Example contact force profile when reaching a goal in the wall environment. Pictures correspond to the robot configuration induced by our method at time stamps marked by dotted vertical lines; (a, b) the robot tracks trajectories to push obstacles in a compliant manner and adjusts its joint configuration in the null space; (c) the robot reaches the goal while maintaining a minimum contact force. Compared to the ablation method, controlling both operational and null space trajectories reduces the overall contact forces.
Allowing Safe Contact in Robotic Goal-Reaching: Planning and Tracking in Operational and Null Spaces

October 2022

·

45 Reads

In recent years, impressive results have been achieved in robotic manipulation. While many efforts focus on generating collision-free reference signals, few allow safe contact between the robot bodies and the environment. However, in humans' daily manipulation, contact between arms and obstacles is prevalent and even necessary. This paper investigates the benefit of allowing safe contact during robotic manipulation and advocates generating and tracking compliance reference signals in both operational and null spaces. In addition, to optimize the collision-allowed trajectories, we present a hybrid solver that integrates sampling- and gradient-based approaches. We evaluate the proposed method on a goal-reaching task in five simulated and real-world environments with different collisional conditions. We show that allowing safe contact improves goal-reaching efficiency and provides feasible solutions in highly collisional scenarios where collision-free constraints cannot be enforced. Moreover, we demonstrate that planning in null space, in addition to operational space, improves trajectory safety.
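The general shape of a hybrid sampling-plus-gradient solver can be illustrated on a toy objective. The code below is an invented minimal example, not the paper's solver: random sampling handles multi-modality (many local minima), and gradient descent then refines the best sample locally.

```python
import numpy as np

def hybrid_minimize(f, grad_f, dim, rng, n_samples=64, n_grad_steps=100, lr=0.02):
    """Hybrid solver sketch: global random sampling to cope with
    multi-modality, then local gradient refinement of the best sample."""
    candidates = rng.uniform(-3.0, 3.0, size=(n_samples, dim))
    best = min(candidates, key=f)          # sampling stage
    x = best.copy()
    for _ in range(n_grad_steps):          # gradient stage
        x = x - lr * grad_f(x)
    return best, x

# Multi-modal toy objective with many local minima.
f = lambda x: float(np.sum(x**2) + np.sum(np.cos(5.0 * x)))
grad_f = lambda x: 2.0 * x - 5.0 * np.sin(5.0 * x)

rng = np.random.default_rng(3)
best, x = hybrid_minimize(f, grad_f, dim=2, rng=rng)
print(f(best), f(x))  # the gradient stage never increases the objective here
```

With a sufficiently small step size relative to the objective's smoothness, the refinement stage is monotone, so the hybrid result is never worse than pure sampling; the sampling stage supplies the globalization that a purely local receding-horizon planner lacks.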


Practical tradeoffs between memory, compute, and performance in learned optimizers

March 2022

·

18 Reads

Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric functions. The parameters of these functions are then optimized so that the resulting learned optimizer minimizes a target loss on a chosen class of models. Learned optimizers can both reduce the number of required training steps and improve the final test loss. However, they can be expensive to train, and once trained can be expensive to use due to computational and memory overhead for the optimizer itself. In this work, we identify and quantify the design features governing the memory, compute, and performance trade-offs for many learned and hand-designed optimizers. We further leverage our analysis to construct a learned optimizer that is both faster and more memory efficient than previous work.
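One axis of the memory trade-off above can be made concrete with back-of-the-envelope arithmetic: SGD keeps no extra per-parameter state, momentum keeps one accumulator per parameter, Adam keeps two, and a learned optimizer keeps several accumulators plus its own (usually small) network weights. The function and the learned-optimizer figures below are illustrative assumptions, not measurements from the paper.

```python
def optimizer_memory(n_params, n_accumulators_per_param,
                     n_optimizer_weights=0, bytes_per_float=4):
    """Rough optimizer memory footprint in bytes: per-parameter state
    plus the optimizer's own weights (nonzero for learned optimizers)."""
    return (n_params * n_accumulators_per_param
            + n_optimizer_weights) * bytes_per_float

n = 10_000_000  # a 10M-parameter model, float32 state
print(optimizer_memory(n, 0))          # SGD: 0 bytes of extra state
print(optimizer_memory(n, 1))          # SGD + momentum: 40 MB
print(optimizer_memory(n, 2))          # Adam: 80 MB
print(optimizer_memory(n, 4, 20_000))  # hypothetical learned optimizer
```

The compute side of the trade-off behaves similarly: each extra accumulator or per-parameter network evaluation adds overhead to every training step, which is exactly the design space the paper quantifies.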


Gradients are Not All You Need

November 2021

·

28 Reads

Differentiable programming techniques are widely used in the community and are responsible for the machine learning renaissance of the past several decades. While these methods are powerful, they have limits. In this short report, we discuss a common chaos-based failure mode which appears in a variety of differentiable circumstances, ranging from recurrent neural networks and numerical physics simulation to training learned optimizers. We trace this failure to the spectrum of the Jacobian of the system under study, and provide criteria for when a practitioner might expect this failure to spoil their differentiation-based optimization algorithms.
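The failure mode can be reproduced in a few lines with a standard example (the specific map and parameters below are a common illustration, not taken from the paper): differentiating through an unrolled chaotic system makes the gradient a product of per-step Jacobians, and when that product's spectral radius exceeds one, the gradient magnitude grows exponentially with the rollout horizon.

```python
def rollout_grad(r, x0, n_steps):
    """Unroll the logistic map x_{t+1} = r * x_t * (1 - x_t) and
    accumulate d x_T / d x_0 by the chain rule, i.e. the product
    of the per-step Jacobians r * (1 - 2 x_t)."""
    x, g = x0, 1.0
    for _ in range(n_steps):
        g *= r * (1.0 - 2.0 * x)  # per-step Jacobian
        x = r * x * (1.0 - x)
    return x, g

# In the chaotic regime (r = 3.9) the gradient explodes with horizon.
for T in (10, 50, 100):
    _, g = rollout_grad(3.9, 0.3, T)
    print(T, abs(g))
```

This is exactly the situation the report describes for long unrolls of physics simulators or learned-optimizer training: the loss surface remains finite, but its gradients become enormous and effectively useless for optimization.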


Citations (3)


... While GANs were initially designed for generating synthetic data, researchers like [10] began exploring their potential in optimization tasks, including aircraft loading. In a more recent study, [11] utilized transformer-based neural networks for aircraft loading, emphasizing the potential of attention mechanisms in optimization. ...

Reference:

A Hybrid Optimization Algorithm for a Multi-Objective Aircraft Loading Problem With Complex Constraints
Transformer-Based Learned Optimization
  • Citing Conference Paper
  • June 2023

... Additionally, local minima can be a problem when using contact-aware controllers, and global planning algorithms are necessary. Due to the possible multi-modal characteristics of the planning problem, the receding horizon planning may also be stuck into the local minimum [5] [9]. ...

Allowing Safe Contact in Robotic Goal-Reaching: Planning and Tracking in Operational and Null Spaces
  • Citing Conference Paper
  • May 2023

... Inspired by the recent development of GPU-based simulators in reinforcement learning (Bonnet et al. 2022; Freeman et al. 2021; Lange 2022b; Makoviychuk et al. 2021), we also implemented simulators for perishable inventory problems using JAX, which enabled us to run large numbers of simulations in parallel. ...

Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation