Conference Paper

MURE: fast agent based crowd simulation for VFX and Animation


Abstract

Crowd simulation in visual effects and animation is a field where creativity is often bound by the scalability of its tools. High-end animation systems like Autodesk Maya [Autodesk] are tailored for scenes with at most tens of characters, whereas more scalable VFX packages like SideFX's Houdini [SideFX] can lack the directability required by character animation. We present a suite of technologies built around Houdini that vastly improves both its scalability and directability for agent-based crowd simulation. Dubbed MURE (Japanese for "crowd"), this system employs a new VEX context with lock-free, multithreaded KD-Tree construction and look-up, a procedural finite state machine for massive animation libraries, a suite of VEX nodes for fuzzy logic, and a fast GPU drawing plugin built upon the open source USD (Universal Scene Description) library [Pixar Animation Studios]. MURE has proven itself on two feature films, The Good Dinosaur and Finding Dory, with crowd spectacles including flocks of birds, swarms of fireflies, automobile traffic, and schools of fish.
Pixar has a history with agent-based crowd simulation using a custom Massive [Massive Software] based pipeline, first developed on Ratatouille [Ryu and Kanyuk 2007] and subsequently used on Wall-E, Up, and Cars 2. A rewrite of the studio's proprietary animation software, Presto, deprecated this crowd pipeline. The crowds team on Brave and Monsters University replaced it with a new system for "non-simulated" crowds that sequenced geometry caches [Kanyuk et al. 2012] via finite state machines and sketch-based tools [Arumugam et al. 2013]. However, the story reels for The Good Dinosaur called for large crowds with such complex inter-agent and environment interaction that simulated crowds were necessary. This creative need afforded Pixar's crowd team the opportunity to evaluate the pros and cons of our former agent-based simulation pipeline and weigh which features would be part of its successor.
Fuzzy logic brains and customizable navigation were indispensable, but our practice of approximating hero-quality rigs with simulatable equivalents was fraught with problems. Creating the mappings was labor intensive and lossy, and even when the mappings were mostly correct, animators found the synthesized animation splines so foreign that many would start from scratch rather than build upon a crowd simulation. To avoid this pitfall, we instead opted to build our new pipeline around pre-cached clips of animation, so that we could always deliver clean splines to crowd animators. This reliance on caches also affords tremendous opportunities for interactivity at massive scales. Thus, rather than focusing on rigging and posing, the goals of our new system, MURE, became interactivity and directability.
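As a rough illustration of how a fuzzy logic "brain" and a finite state machine over pre-cached clips might fit together, the following Python sketch drives a single agent. It is not MURE's implementation, which the abstract describes as VEX nodes inside Houdini; the clip names, clip lengths, thresholds, and the rule of switching only at a clip boundary are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


def ramp_down(x: float, lo: float, hi: float) -> float:
    """Fuzzy membership: 1 at or below lo, 0 at or above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def fuzzy_and(a: float, b: float) -> float:
    """Common min-based fuzzy AND."""
    return min(a, b)


# Hypothetical clip library: state name -> (cached clip name, length in frames).
CLIPS = {"walk": ("walk_cycle", 24), "flee": ("flee_run", 16)}

# Hypothetical transition graph: which states may follow which.
TRANSITIONS = {"walk": {"flee"}, "flee": {"walk"}}


@dataclass
class Agent:
    state: str = "walk"
    frame: int = 0


def step(agent: Agent, threat_distance: float, fatigue: float) -> tuple[str, int]:
    """Advance one frame and return (clip name, frame index) to read from the cache."""
    # Fuzzy inputs (thresholds are made up for the example).
    threat_is_near = ramp_down(threat_distance, 2.0, 10.0)
    is_rested = ramp_down(fatigue, 0.3, 0.9)
    want_flee = fuzzy_and(threat_is_near, is_rested)

    desired = "flee" if want_flee > 0.5 else "walk"
    _, length = CLIPS[agent.state]
    at_clip_end = agent.frame == length - 1

    # Switch clips only on a clip boundary so that cached animation is
    # sequenced whole rather than cut mid-cycle.
    if desired != agent.state and desired in TRANSITIONS[agent.state] and at_clip_end:
        agent.state = desired
        agent.frame = 0
    else:
        agent.frame = (agent.frame + 1) % length
    return CLIPS[agent.state][0], agent.frame
```

Calling `step` once per frame for each agent yields a clip name and frame index, so the agent's motion is always a lookup into an untouched animation cache rather than a synthesized pose.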


... Large multi-agent systems, such as the real-time simulation of human crowds, are important tasks in games [3,4], animation [5,6] and visual effects [7][8][9]. In the realm of crowd simulation studies, the classification typically involves two categories: flow-based, wherein the entire crowd is conceptualized as a substance with fluid-like characteristics, and agent-based, where each individual of the crowd is represented as an intelligent agent. ...
Preprint
Full-text available
Modeling crowds has many important applications in games and computer animation. Inspired by the emergent following effect in real-life crowd scenarios, in this work, we develop a method for implicitly grouping moving agents. We achieve this by analyzing local information around each agent and rotating its preferred velocity accordingly. Each agent could automatically form an implicit group with its neighboring agents that have similar directions. In contrast to an explicit group, there are no strict boundaries for an implicit group. If an agent's direction deviates from its group as a result of positional changes, it will autonomously exit the group or join another implicitly formed neighboring group. This implicit grouping is autonomously emergent among agents rather than deliberately controlled by the algorithm. The proposed method is compared with many crowd simulation models, and the experimental results indicate that our approach achieves the lowest congestion levels in some classic scenarios. In addition, we demonstrate that adjusting the preferred velocity of agents can actually reduce the dissimilarity between their actual velocity and the original preferred velocity. Our work is available online.
... -Virtual conditions: Crowd investigation strategies can be utilized to comprehend the basic marvel accordingly empowering us to set up numerical models that can give precise reproductions. These numerical models can be additionally utilized for recreation of group wonders for different applications, for example, PC games, embedding special visualizations in film scenes and planning clearing plans [15], [16]. -Forensic investigate: Crowd investigation also be utilized to look for suspect and casualties in occasions like besieging, shooting or mishaps in huge social events. ...
Preprint
Full-text available
Crowd counting is one of the most challenging issues in computer vision community for safety and security through surveillance systems. It has extensive range of applications, such as disaster management, surveillance event detection, intelligence gathering and analysis, public safety control, traffic monitoring, design of public spaces, anomaly detection and military. Early approaches still encounter many issues, like non-uniform density distribution, partial occlusion and discrepancies in scale and point of view. To address the above problems, Feature Pyramid Networks are introduced in deep convolution networks for counting the individuals in the Crowd. The designed network has extracted the features at all resolutions and is constructed rapidly from only one input image. This method achieves out-performance results compared to the well-known networks on three demanding standard crowd counting datasets.
... • Virtual conditions Crowd investigation strategies can be utilized to comprehend the basic marvel accordingly empowering us to set up numerical models that can give precise reproductions. These numerical models can be additionally utilized for recreation of group wonders for different applications, for example PC games, embedding special visualizations in film scenes and planning clearing plans (Gustafson et al. 2016;Perez et al. 2016). • Forensic investigation Crowd investigation also be utilized to look for suspect and casualties in occasions like besieging, shooting or mishaps in huge social events. ...
Article
Full-text available
Crowd counting is one of the most challenging issues in the computer vision community for safety and security through surveillance systems. It has extensive range of applications, such as disaster management, surveillance event detection, intelligence gathering and analysis, public safety control, traffic monitoring, design of public spaces, anomaly detection and military. Early approaches still encounter many issues like non-uniform density distribution, partial occlusion and discrepancies in scale and perspective. To address the above problems, feature pyramid networks are introduced in deep convolution networks for counting the individuals in the crowd. The designed network has extracted the features at all resolutions and is constructed rapidly from only one input image. This method achieves outperformance results compared to the well-known networks on three standard crowd counting datasets.
... in SideFX's Houdini [SideFX, [n. d.]] via our MURE tools [Gustafson et al., 2016], the crowds team on Incredibles 2 produced rich scenes of busy streets and urban panic. ...
Conference Paper
The stylized world of Incredibles 2 features large urban crowds both in everyday situations and in scenes of panicked mayhem. While Pixar's now academy award winning animation software, Presto, has allowed us to create expressive and nuanced rigs for our crowd characters, our proprietary approach has made it difficult to utilize animation from external sources, such as crowd simulations or from motion capture. In this talk, we discuss how we can automatically approximate our complex rigs with skinned skeletons, as well as how this has opened up our crowd pipeline to procedural look-ats, motion blending, ragdoll physics, and motion capture. In particular, the use of motion capture is novel for Pixar, and finding a way to integrate this workflow into our animator-centric pipeline and culture has been an ongoing effort. The system we designed allows us to capture motion data for multiple characters in the context of complex shots in Presto, and it facilitates choreography of nuanced and specifically timed crowd motions. Together with traditional hand animated motion cycles, our crowd choreography tools in Presto [Arumugam et al., 2013], and skeletal agent based simulation in SideFX's Houdini [SideFX, [n. d.]] via our MURE tools [Gustafson et al., 2016], the crowds team on Incredibles 2 produced rich scenes of busy streets and urban panic.
... This allows fast iteration and is easily interchangeable with a full animation rig when additional specificity is needed. MURE [Gustafson et al. 2016] ...
Conference Paper
Coco, Pixar's largest human-based crowds film to date, was ambitious both visually and technically. Over a third of the film contains crowd scenes, ranging from a mansion-filled dance party to thousands of skeleton families journeying across a bridge, to a colossal cheering stadium. This complexity required vast amounts of both animation specificity and look variation in our characters. Asset management, animation directability, and rendering would have been extremely difficult with our previous pipeline for human crowds at this scale. An array of techniques were developed to tackle these challenges, including crowd asset and workflow improvements; a new skeletal rigging and posing system to procedurally control animation; more automated, aggressive shading and geometric level of detail; and optimized geometry unrolling in Katana to significantly reduce scene processing time and file IO.
... Virtual environments: Crowd analysis methods can be used to understand the underlying phenomenon thereby enabling us to establish mathematical models that can provide accurate simulations. These mathematical models can be further used for simulation of crowd phenomena for various applications such as computer games, inserting visual effects in film scenes and designing evacuation plans [36,74]. Forensic search: Crowd analysis can be used to search for suspects and victims in events such as bombing, shooting or accidents in large gatherings. Traditional face detection and recognition algorithms can be speeded up using crowd analysis techniques which are more adept at handling such scenarios [47,7]. ...
Article
Full-text available
Estimating count and density maps from crowd images has a wide range of applications such as video surveillance, traffic monitoring, public safety and urban planning. In addition, techniques developed for crowd counting can be applied to related tasks in other fields of study such as cell microscopy, vehicle counting and environmental survey. The task of crowd counting and density map estimation is riddled with many challenges such as occlusions, non-uniform density, intra-scene and inter-scene variations in scale and perspective. Nevertheless, over the last few years, crowd count analysis has evolved from earlier methods that are often limited to small variations in crowd density and scales to the current state-of-the-art methods that have developed the ability to perform successfully on a wide range of scenarios. The success of crowd counting methods in the recent years can be largely attributed to deep learning and publications of challenging datasets. In this paper, we provide a comprehensive survey of recent Convolutional Neural Network (CNN) based approaches that have demonstrated significant improvements over earlier methods that rely largely on hand-crafted representations. First, we briefly review the pioneering methods that use hand-crafted representations and then we delve in detail into the deep learning-based approaches and recently published datasets. Furthermore, we discuss the merits and drawbacks of existing CNN-based approaches and identify promising avenues of research in this rapidly evolving field.
Article
Reproducing realistic collective behaviors presents a captivating yet formidable challenge. Traditional rule-based methods rely on hand-crafted principles, limiting motion diversity and realism in generated collective behaviors. Recent imitation learning methods learn from data but often require ground-truth motion trajectories and struggle with authenticity, especially in high-density groups with erratic movements. In this paper, we present a scalable approach, Collective Behavior Imitation Learning (CBIL), for learning fish schooling behavior directly from videos, without relying on captured motion trajectories. Our method first leverages Video Representation Learning, in which a Masked Video AutoEncoder (MVAE) extracts implicit states from video inputs in a self-supervised manner. The MVAE effectively maps 2D observations to implicit states that are compact and expressive for the following imitation learning stage. Then, we propose a novel adversarial imitation learning method to effectively capture complex movements of the schools of fish, enabling efficient imitation of the distribution of motion patterns measured in the latent space. It also incorporates bio-inspired rewards alongside priors to regularize and stabilize training. Once trained, CBIL can be used for various animation tasks with the learned collective motion priors. We further show its effectiveness across different species. Finally, we demonstrate the application of our system in detecting abnormal fish behavior from in-the-wild videos.
Article
We propose a novel contact-aware method to synthesize highly-dense 3D crowds of animated characters. Existing methods animate crowds by, first, computing the 2D global motion approximating subjects as 2D particles and, then, introducing individual character motions without considering their surroundings. This creates the illusion of a 3D crowd, but, with density, characters frequently intersect each other since character-to-character contact is not modeled. We tackle this issue and propose a general method that considers any crowd animation and resolves existing residual collisions. To this end, we take a physics-based approach to model contacts between articulated characters. This enables the real-time synthesis of 3D high-density crowds with dozens of individuals that do not intersect each other, producing an unprecedented level of physical correctness in animations. Under the hood, we model each individual using a parametric human body incorporating a set of 3D proxies to approximate their volume. We then build a large system of articulated rigid bodies, and use an efficient physics-based approach to solve for individual body poses that do not collide with each other while maintaining the overall motion of the crowd. We first validate our approach objectively and quantitatively. We then explore relations between physical correctness and perceived realism based on an extensive user study that evaluates the relevance of solving contacts in dense crowds. Results demonstrate that our approach outperforms existing methods for crowd animation in terms of geometric accuracy and overall realism.
Article
Simulation of swarm motion is a crucial research area in computer graphics and animation, and is widely used in a variety of applications such as biological behavior research, robotic swarm control, and the entertainment industry. In this paper, we address the challenges of preserving structural relations between the individuals in swarm flight simulations by proposing an innovative motion control framework that utilizes a graph‐based hierarchy to illustrate patterns within a swarm and allows the swarm to perform flight motions along externally specified paths. In addition, this study designs motion propagation strategies with different focuses for varied application scenarios, analyzes the effects of information transfer latencies on pattern preservation under these strategies, and optimizes the control algorithms at the mathematical level. This study not only establishes a complete set of control methods for group flight simulations, but also has excellent scalability, which can be combined with other techniques in this field to provide new solutions for group behavior simulations.
Article
Full-text available
Pixar Animation Studios has created 21 computer-animated feature films (as of 2019), including Monsters, Inc., Inside Out, Finding Nemo, and the Toy Story saga, among many others. These films have made an impact on several generations thanks to their stories, the wide range of emotions they evoke, their endearing characters, and the incredible worlds they have developed. Creating each of these films takes a long time and the work of many people in different departments such as art, character design, script, and animation, among many others; but one in particular is of vital importance, technology: You might be surprised by the amount of technology that was needed to make [the films] possible. In fact, Pixar is largely responsible for some of the most significant developments in computer graphics in history. Many people look at the product and think it is only the art, but they do not understand what went on behind it. It is a combination of art and technology that makes a film a reality. (Evers, 2019, 0:32, own translation). Accordingly, the aim of this monograph is to present the evolution and innovation of computer graphics technology in animated feature films, starting with Toy Story. This monograph is important because it will show some of the technological advances in areas such as rendering, lighting, simulation, and 3D software, among many others, that have been developed over the years at Pixar, and thereby demonstrate that technology is closely tied to animation. It also shows how each story poses a greater technological challenge, and communicates the process, techniques, and details through which those challenges were met, in language accessible to any reader.
It is necessary to encourage young people, and people in general, to investigate how technology has changed their lives: not only through common examples such as a telephone or a computer, but through something they would not expect, entertainment, so that they can contribute to its development in the future.
Article
Pedestrian counting from unconstrained images is an important task in various applications such as resource management, transportation engineering, urban design, and advertising, but it is greatly challenged by factors such as inter-occlusion, cross-scene variation, scale, and scene perspective distortion. Traditional image-based methods suffer from these factors, and the performance of conventional sensor-based methods such as Kinect and LASER degrades gradually as the pedestrian count and the distance from the device to the pedestrians increase. To address these challenges, this paper proposes a new network model that uses stacked multicolumn convolutional neural networks (CNNs) for pedestrian counting. Head features are used in place of the whole body to address severe occlusion, and multicolumn CNNs are chosen to handle scale and scene perspective distortion. Also, a pretrained VGG-16 is used to generate deeper detailed features and expand the receptive field of the model. Extensive analysis and experiments on current major pedestrian counting datasets show that the proposed network model has considerable advantages in pedestrian counting tasks compared to other state-of-the-art models, and that it also improves the training process. Moreover, the differences between the generated density map and the ground-truth density map are visualized and analyzed quantitatively to demonstrate the feasibility of the model.
Conference Paper
In Epic (2013), crowds are integral to the narrative and form a character as a whole. This required a new type of crowd at Blue Sky Studios, one that permits dynamic interaction between crowd characters and the environments around them in addition ...
Article
Visual representations of traffic flow and density in 3D city models provide substantial decision support in urban planning. While a large repertoire of efficient techniques exists for visualizing the static components of such environments (e.g., digital ...
Article
A new architecture for controlling mobile robots is described. Layers of control system are built to let the robot operate at increasing levels of competence. Layers are made up of asynchronous modules that communicate over low-bandwidth channels. Each module is an instance of a fairly simple computational machine. Higher-level layers can subsume the roles of lower levels by suppressing their outputs. However, lower levels continue to function as higher levels are added. The result is a robust and flexible robot control system. The system has been used to control a mobile robot wandering around unconstrained laboratory areas and computer machine rooms. Eventually it is intended to control a robot that wanders the office areas of our laboratory, building maps of its surroundings using an onboard arm to perform simple tasks.
Project Website: http://bulletphysics.org/wordpress
  • Bulletphysics.org
Headstrong, hairy, and heavily clothed: Animating crowds of Scotsmen
  • P. Kanyuk
  • L. J. W. Park
  • E. Weihrich
Company Website: http://sidefx
  • SideFX