FIG 2
DeepLabCut work-flow. The diagram delineates the work-flow as well as the directory and file structures. Each step of the work-flow is color-coded to indicate where its outputs are stored. The main steps are opening a Python session, importing deeplabcut, creating a project, selecting frames, labeling frames, and training a network. Once trained, this network can be used to apply labels to new videos, or it can be refined if needed. The process is supported by interactive GUIs at several key steps, and all steps can be run from a simple terminal interface.
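A minimal sketch of this work-flow in code, following DeepLabCut's documented high-level API; the project name, experimenter name, paths, and arguments below are placeholders, and exact signatures may vary between DeepLabCut versions.

```python
import deeplabcut

# Placeholders: project/experimenter names and video paths are illustrative only.
config_path = deeplabcut.create_new_project(
    "reaching-task", "experimenter", ["/path/to/video1.avi"]
)                                                 # create a project (Step A-B)
deeplabcut.extract_frames(config_path)            # select frames to label
deeplabcut.label_frames(config_path)              # interactive labeling GUI
deeplabcut.create_training_dataset(config_path)   # build the training dataset
deeplabcut.train_network(config_path)             # train the network
deeplabcut.evaluate_network(config_path)          # report train/test pixel errors
deeplabcut.analyze_videos(                        # apply labels to new videos
    config_path, ["/path/to/new_video.avi"]
)
```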
Source publication
Noninvasive behavioral tracking of animals during experiments is crucial to many scientific pursuits. Extracting the poses of animals without using markers is often essential for measuring behavioral effects in biomechanics, genetics, ethology & neuroscience. Yet, extracting detailed poses without markers in dynamically changing backgrounds has bee...
Contexts in source publication
Context 1
... is organized according to the following workflow (Fig 2). The user starts by creating a new project based on a project name and username as well as some (initial) videos, which are required to create the training dataset (Step A-B). ...
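A sketch of Step A-B under the same assumptions (placeholder names and paths): create_new_project takes the project name, the username, and the initial videos, and returns the path to the project's config.yaml, which every later call takes as its first argument.

```python
import deeplabcut

# Step A-B sketch; names and paths are placeholders.
config_path = deeplabcut.create_new_project(
    "open-field",                                  # project name
    "alice",                                       # username / experimenter
    ["/data/session1.mp4", "/data/session2.mp4"],  # (initial) videos for the training dataset
    copy_videos=True,                              # copy videos into the project folder
)
print(config_path)  # e.g., .../open-field-alice-<date>/config.yaml
```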
Context 2
... (numerically) constitutes an acceptable MAE depends on many factors (including the size of the tracked body parts, the labeling variability, etc.). Note that the test error can also be larger than the training error due to human variability (in labeling, see Figure 2 in [12]). ...
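To make the quantity concrete, a toy illustration of the pixel MAE discussed here, with hypothetical human-labeled and predicted keypoints (n_frames x n_bodyparts x 2):

```python
import numpy as np

# Hypothetical labels: 2 frames, 2 body parts, (x, y) pixel coordinates.
human = np.array([[[100.0, 200.0], [150.0, 240.0]],
                  [[102.0, 198.0], [149.0, 243.0]]])
predicted = np.array([[[101.5, 201.0], [152.0, 238.5]],
                      [[103.0, 196.0], [151.0, 244.0]]])

# Mean Euclidean distance between prediction and human label, in pixels.
errors = np.linalg.norm(predicted - human, axis=-1)
print(f"MAE: {errors.mean():.2f} px")
# Whether this value is acceptable depends on body-part size, labeling variability, etc.
```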
Similar publications
Third-party resources (e.g., samples, backbones, and pre-trained models) are usually involved in the training of deep neural networks, which brings backdoor threats. To facilitate the research and development of more secure training schemes, we propose a Python toolbox that implements representative and advanced backdoor attacks and defenses unde...
Noninvasive behavioral tracking of animals during experiments is critical to many scientific pursuits. Extracting the poses of animals without using markers is often essential to measuring behavioral effects in biomechanics, genetics, ethology, and neuroscience. However, extracting detailed poses without markers in dynamically changing backgrounds...
A Bayesian optimizer in Python tries to optimize the cumulative oil production as the only objective function in a five-spot waterflooding simulation in the 61st layer of the SPE 10 model using MATLAB Reservoir Simulation Toolbox (MRST), in Google Colab. This experiment is not intended to conduct the optimization commonly done in the well placement...
Citations
... OpenPose is based on pre-determined landmarks (not exactly the same as the ones used in the manual analysis) and has been trained by non-experts; whilst it draws information from an extensive library of labelled training images (26), these images are unlikely to reflect the intricacies of the long jump movement. Allowing the researcher to train their own models with self-selected landmarks and custom datasets (e.g., DeepLabCut) could be a potential advancement; however, such techniques still present significant limitations and do not necessarily perform better than OpenPose for basic two-dimensional movements and 3D joint locations (10,13,18,31,32). ...
This study tested the performance of OpenPose on footage collected by two cameras at 200 Hz from a real-life competitive setting by comparing it with manually analyzed data in SIMI motion. The same take-off recording from the men's Long Jump finals at the 2017 World Athletics Championships was used for both approaches (markerless and manual) to reconstruct the 3D coordinates from each of the camera's 2D coordinates. Joint angle and Centre of Mass (COM) variables during the final step and take-off phase of the jump were determined. Coefficients of Multiple Determinations (CMD) for joint angle waveforms showed large variation between athletes with the knee angle values typically being higher (take-off leg: 0.727 ± 0.242; swing leg: 0.729 ± 0.190) than those for hip (take-off leg: 0.388 ± 0.193; swing leg: 0.370 ± 0.227) and ankle angle (take-off leg: 0.247 ± 0.172; swing leg: 0.155 ± 0.228). COM data also showed considerable variation between athletes and parameters, with position (0.600 ± 0.322) and projection angle (0.658 ± 0.273) waveforms generally showing better agreement than COM velocity (0.217 ± 0.241). Agreement for discrete data was generally poor with high random error for joint kinematics and COM parameters at take-off and an average ICC across variables of 0.17. The poor agreement statistics and a range of unrealistic values returned by the pose estimation underline that OpenPose is not suitable for in-competition performance analysis in events such as the long jump, something that manual analysis still achieves with high levels of accuracy and reliability.
... Both achieve relatively high accuracy and partial detection of key points, but false detections also occur under challenging illumination. The DeepLabCut (DLC) algorithm [23] has been used for pose estimation of experimental animals (e.g., mice and fruit flies) in high-definition videos, owing to its robust models and its ability to learn from small labeled samples [24][25][26]. DLC's multi-target tracking accuracy can exceed 95%, and its running efficiency is also improved. ...
... Animal Pose Estimation from 2D to 3D. Several approaches exist that estimate keypoints in 3D by either computing them from extracted 2D keypoints (Hu et al., 2021; Joska et al., 2021; Kearney et al., 2020; Martinez et al., 2017; Nath et al., 2018; Zhang et al., 2021; Tome et al., 2017) or inferring them directly from 2D images or videos using volumetric convolutional networks (Iskakov et al., 2019; Mehta et al., 2017; Pavlakos et al., 2017). Our method falls into the first category. ...
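As a generic sketch of the first category (lifting 2D keypoints from two calibrated cameras to 3D by linear triangulation), not the specific method of any paper cited above; the projection matrices and 2D points are hypothetical stand-ins for calibrated values.

```python
import numpy as np
import cv2

# Hypothetical 3x4 projection matrices for two calibrated cameras.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# One matched 2D keypoint per camera, as 2xN arrays of pixel coordinates.
pts1 = np.array([[320.0], [240.0]])
pts2 = np.array([[300.0], [240.0]])

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4xN homogeneous 3D points
X = (X_h[:3] / X_h[3]).T                         # N x 3 Euclidean coordinates
print(X)
```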
Accurate tracking of the 3D pose of animals from video recordings is critical for many behavioral studies, yet there is a dearth of publicly available datasets that the computer vision community could use for model development. We here introduce the Rodent3D dataset that records animals exploring their environment and/or interacting with each other with multiple cameras and modalities (RGB, depth, thermal infrared). Rodent3D consists of 200 min of multimodal video recordings from up to three thermal and three RGB-D synchronized cameras (approximately 4 million frames). For the task of optimizing estimates of pose sequences provided by existing pose estimation methods, we provide a baseline model called OptiPose. While deep-learned attention mechanisms have been used for pose estimation in the past, with OptiPose, we propose a different way by representing 3D poses as tokens for which deep-learned context models pay attention to both spatial and temporal keypoint patterns. Our experiments show how OptiPose is highly robust to noise and occlusion and can be used to optimize pose sequences provided by state-of-the-art models for animal pose estimation.
... The purpose of this paper is to summarise popular markerless approaches for estimating joint angles, highlighting their strengths and limitations. I focus mainly on 2D applications, since the use of pose estimation for markerless 3D joint angle prediction is still in its infancy (see Nakano et al., 2020;Nath et al., 2019). ...
... Some of these methods even allow videos to be processed in real-time (Cao et al., 2017; Kane et al., 2020). One algorithm that has received particular attention is DeepLabCut (Mathis et al., 2018), which was initially designed for tracking animal behaviour, but can also be used to track human movement in 2D or 3D (Nath et al., 2019). These and many other recent studies have demonstrated the potential value of markerless neural network approaches in the field of human movement science (see also Tome et al., 2018). ...
Kinematic analysis is often performed in a lab using optical cameras combined with reflective markers. With the advent of artificial intelligence techniques such as deep neural networks, it is now possible to perform such analyses without markers, making outdoor applications feasible. In this paper I summarise 2D markerless approaches for estimating joint angles, highlighting their strengths and limitations. In computer science, so-called “pose estimation” algorithms have existed for many years. These methods involve training a neural network to detect features (e.g. anatomical landmarks) using a process called supervised learning, which requires “training” images to be manually annotated. Manual labelling has several limitations, including labeller subjectivity, the requirement for anatomical knowledge, and issues related to training data quality and quantity. Neural networks typically require thousands of training examples before they can make accurate predictions, so training datasets are usually labelled by multiple people, each of whom has their own biases, which ultimately affects neural network performance. A recent approach, called transfer learning, involves modifying a model trained to perform a certain task so that it retains some learned features and is then re-trained to perform a new task. This can drastically reduce the required number of training images. Although development is ongoing, existing markerless systems may already be accurate enough for some applications, e.g. coaching or rehabilitation. Accuracy may be further improved by leveraging novel approaches and incorporating realistic physiological constraints, ultimately resulting in low-cost markerless systems that could be deployed both in and outside of the lab.
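The transfer-learning idea summarized above can be illustrated with a short, generic sketch (not the method of any cited paper): an ImageNet-pretrained backbone is frozen and only a small new head is retrained to regress keypoint coordinates; the layer sizes and keypoint count are arbitrary.

```python
import tensorflow as tf

NUM_KEYPOINTS = 4  # hypothetical, e.g. hip, knee, ankle, toe

# Reuse pretrained ImageNet features; train only the new regression head.
backbone = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg", input_shape=(224, 224, 3)
)
backbone.trainable = False

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2 * NUM_KEYPOINTS),  # (x, y) per keypoint
])
model.compile(optimizer="adam", loss="mse")
# model.fit(images, keypoints, ...)  # far fewer labeled images needed than training from scratch
```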
... We also trained neural networks to analyze string-pulling videos to detect ears, nose, and hands in Black mice. We used the Python-based framework of the DeepLabCut toolbox (Nath et al., 2018) to train the ResNet-50 network to identify ears, nose, and hands (He et al., 2016). Four separate networks were trained: three for the individual recognition of the ears, nose, and hands and one for the combined recognition of all three in a single step. ...
String-pulling by rodents is a behavior in which animals make rhythmical body, head, and bilateral forearm as well as skilled hand movements to spontaneously reel in a string. Typical analysis includes kinematic assessment of hand movements done by manually annotating frames. Here, we describe a Matlab-based software that allows whole-body motion characterization using optical flow estimation, descriptive statistics, principal component, and independent component analyses as well as temporal measures of Fano factor, entropy, and Higuchi fractal dimension. Based on image-segmentation and heuristic algorithms for object tracking, the software also allows tracking of body, ears, nose, and forehands for estimation of kinematic parameters such as body length, body angle, head roll, head yaw, head pitch, and path and speed of hand movements. The utility of the task and software is demonstrated by characterizing postural and hand kinematic differences in string-pulling behavior of two strains of mice, C57BL/6 and Swiss Webster.
... DeepLabCut using the ResNet-50 neural network was trained on the annotated images for 1,030,000 iterations, then used to track the locations of the nose and digits in the full set of video segments. Frames with poor tracking were visually identified by manual inspection of the videos or L and D traces (see below), corrected using DeepLabCut's refinement GUI [43], and the model retrained until satisfactory tracking results were obtained. Sections of tracking that remained poor after refinement (i.e., exhibited large single-frame jumps or jitter) were excluded from analysis. ...
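A sketch of the refine-and-retrain loop described here, using DeepLabCut's documented refinement functions; the config path and video list are placeholders, and exact arguments may vary between versions.

```python
import deeplabcut

config_path = "/path/to/project/config.yaml"   # placeholder
videos = ["/path/to/segment1.mp4"]             # placeholder

deeplabcut.analyze_videos(config_path, videos)
deeplabcut.extract_outlier_frames(config_path, videos)  # pull frames with poor tracking
deeplabcut.refine_labels(config_path)                   # correct them in the refinement GUI
deeplabcut.merge_datasets(config_path)                  # fold corrections into the training data
deeplabcut.create_training_dataset(config_path)
deeplabcut.train_network(config_path)                   # retrain until tracking is satisfactory
```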
The small first digit (D1) of the mouse’s hand resembles a volar pad, but its thumb-like anatomy suggests ethological importance for manipulating small objects. To explore this possibility, we recorded high-speed close-up video of mice eating seeds and other food items. Analyses of ethograms and automated tracking with DeepLabCut revealed multiple distinct microstructural features of food-handling. First, we found that mice indeed made extensive use of D1 for dexterous manipulations. In particular, mice used D1 to hold food with either of two grip types: a pincer-type grasp, or a “thumb-hold” grip, pressing with D1 from the side. Thumb-holding was preferentially used for handling smaller items, with the smallest items held between the two D1s alone. Second, we observed that mice cycled rapidly between two postural modes while feeding, with the hands positioned either at the mouth (oromanual phase) or resting below (holding phase). Third, we identified two highly stereotyped D1-related movements during feeding, including an extraordinarily fast (~20 ms) “regrip” maneuver, and a fast (~100 ms) “sniff” maneuver. Lastly, in addition to these characteristic simpler movements and postures, we also observed highly complex movements, including rapid D1-assisted rotations of food items and dexterous simultaneous double-gripping of two food fragments. Manipulation behaviors were generally conserved for different food types, and for head-fixed mice. Wild squirrels displayed a similar repertoire of D1-related movements. Our results define, for the mouse, a set of kinematic building-blocks of manual dexterity, and reveal an outsized role for D1 in these actions.
... The method only requires a small amount of manual labelling of image frames, and in the best case, this training process only needs to be performed once. The successfully trained network can then be used to label new videos quickly (45 s for a 10 s video on a standard CPU), and near real-time tracking is also possible with GPU support (Nath et al., 2018). Given the challenges associated with imaging deep water running, it is likely that this approach could easily be modified to analyse kinematics in other human movements and measurement settings, simply by re-training the network using a suitable dataset. ...
... Given the challenges associated with imaging deep water running, it is likely that this approach could easily be modified to analyse kinematics in other human movements and measurement settings, simply by re-training the network using a suitable dataset. Moreover, using additional cameras, this approach could be used to perform 3D analyses (Nath et al., 2018). As stated by Colyer et al. (2018), the development of methods aided by artificial intelligence could revolutionise sports biomechanics and rehabilitation by broadening the applications of motion analysis to training or even competition environments. ...
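A brief sketch of applying an already-trained network to new videos, as described in the two excerpts above; the paths are placeholders, and gputouse/save_as_csv follow DeepLabCut's documented options for selecting a GPU and exporting trajectories.

```python
import deeplabcut

config_path = "/path/to/project/config.yaml"   # placeholder
new_videos = ["/path/to/new_session.mp4"]      # placeholder

deeplabcut.analyze_videos(
    config_path, new_videos,
    gputouse=0,        # GPU index; omit for CPU-only analysis (slower, e.g. ~45 s per 10 s video)
    save_as_csv=True,  # export predicted trajectories as CSV
)
deeplabcut.create_labeled_video(config_path, new_videos)
```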
Kinematic analysis is often performed with a camera system combined with reflective markers placed over bony landmarks. This method is restrictive (and often expensive), and limits the ability to perform analyses outside of the lab. In the present study, we used a markerless deep learning-based method to perform 2D kinematic analysis of deep water running, a task that poses several challenges to image processing methods. A single GoPro camera recorded sagittal plane lower limb motion. A deep neural network was trained using data from 17 individuals, and then used to predict the locations of markers that approximated joint centres. We found that 300–400 labelled images were sufficient to train the network to be able to position joint markers with an accuracy similar to that of a human labeler (mean difference < 3 pixels, around 1 cm). This level of accuracy is sufficient for many 2D applications, such as sports biomechanics, coaching/training, and rehabilitation. The method was sensitive enough to differentiate between closely-spaced running cadences (45–85 strides per minute in increments of 5). We also found high test–retest reliability of mean stride data, with between-session correlation coefficients of 0.90–0.97. Our approach represents a low-cost, adaptable solution for kinematic analysis, and could easily be modified for use in other movements and settings. Using additional cameras, this approach could also be used to perform 3D analyses. The method presented here may have broad applications in different fields, for example by enabling markerless motion analysis to be performed during rehabilitation, training or even competition environments.
Obsessive-compulsive disorder (OCD) is a debilitating psychiatric disorder characterized by intrusive obsessive thoughts and compulsive behaviors. Multiple studies have shown the association of polymorphisms in the SLC1A1 gene with OCD. The most common of these OCD-associated polymorphisms increases the expression of the encoded protein, excitatory amino acid transporter 3 (EAAT3), a neuronal glutamate transporter. Previous work has shown that increased EAAT3 expression results in OCD-relevant behavioral phenotypes in rodent models. In this study, we created a novel mouse model with targeted, reversible overexpression of Slc1a1 in forebrain neurons. The mice do not have a baseline difference in repetitive behavior but show increased hyperlocomotion following a low dose of amphetamine (3 mg/kg) and increased stereotypy following a high dose of amphetamine (8 mg/kg). We next characterized the effect of amphetamine on striatal cFos response and found that amphetamine increased cFos throughout the striatum in both control and Slc1a1-overexpressing (OE) mice, but Slc1a1-OE mice had increased cFos expression in the ventral striatum relative to controls. We used an unbiased machine classifier to robustly characterize the behavioral response to different doses of amphetamine and found a unique response to amphetamine in Slc1a1-OE mice, relative to controls. Lastly, we found that the differences in striatal cFos expression in Slc1a1-OE mice were driven by cFos expression specifically in D1 neurons, as Slc1a1-OE mice had increased cFos in D1 ventral medial striatal neurons, implicating this region in the exaggerated behavioral response to amphetamine in Slc1a1-OE mice.
A major challenge in human stroke research is interpatient variability in the extent of sensorimotor deficits and determining the time course of recovery following stroke. Although the relationship between the extent of the lesion and the degree of sensorimotor deficits is well established, the factors determining the speed of recovery remain uncertain. To test these experimentally, we created a cortical lesion over the motor cortex using a reproducible approach in four common marmosets, and characterized the time course of recovery by systematically applying several behavioral tests before and up to 8 weeks after creation of the lesion. Evaluation of in-cage behavior and reach-to-grasp movement revealed consistent motor impairments across the animals. In particular, performance in reaching and grasping movements continued to deteriorate until 4 weeks after creation of the lesion. We also found consistent time courses of recovery across animals for in-cage and grasping movements. For example, in all animals, the score for in-cage behaviors showed full recovery at 3 weeks after creation of the lesion, and the performance of grasping movement partially recovered from 4 to 8 weeks. In addition, we observed longer time courses of recovery for reaching movement, which may rely more on cortically initiated control in this species. These results suggest that the different recovery speeds for each movement could be influenced by the extent to which cortical control is required to properly execute each movement.