
Ubisoft Proposal: Rapid Motion Prototyping System for Virtual Creatures

Nam Hee Kim
Finnish Center for Artificial Intelligence / Aalto University

About the Author

I am a doctoral researcher at Aalto University studying machine learning and optimization for human-in-the-loop applications. My core research interests are motion control and character animation.

Problem Domain: Animation Authoring

Character animation is a promising area for artificial intelligence techniques, with a variety of practical applications including robotics, film, and games. As the market for animation grows, so does the demand for high-quality, efficient animation. While some products use advanced learning, optimization, and tooling methods (e.g. [Ragdoll, Unity IK]) to address these needs, there is still no fully-fledged physics-based animation interface for creating character animations.

Knowledge Gaps

Non-humanoids. Since animation is a creative process, an animator's experience should be expressive, explorative, and imaginative. However, modern animation workflows still involve a great deal of manual and repetitive work. The rise of learning-based methods leveraging the abundance of motion capture (MoCap) data shows promise in alleviating this tedium. For example, recent breakthroughs in motion tracking via imitation learning have enabled fast physics-based MoCap cleanup [SuperTrack, AMP] as well as skill discovery [ASE]. However, for non-humanoids, such MoCap data are rarely available. Until adequate MoCap and/or retargeting techniques are invented for creatures of more general morphologies, such as bipedal birds, quadrupeds, insects, and snakes, animating wildlife and alien creatures will remain difficult. Meanwhile, recent years have witnessed the rising popularity of gaming experiences involving non-humanoid protagonists (e.g. [Stray]) and NPCs (e.g. [Horizon Forbidden West, Red Dead Redemption 2]). 3D storytelling interests have also expanded into expressive anthropomorphic characters (e.g. [DC League of Super Pets, The Bad Guys, How to Train Your Dragon]). The gap between the sparse supply of methods supporting non-humanoid rigs and the demand for artful experiences involving diverse character rigs will likely widen at a suffocating rate.

Lack of Authoring Applications. Additionally, supporting intuitive and diversity-focused authoring processes with learning-based techniques has remained a bottleneck for real-world applications. Recent methods support in-betweening across sparse keyframes in both kinematic and physics-based domains [Robust Motion Inbetweening], but their applications have been constrained to humanoid rigs.

Threats of AI-Generated Art. The recent surge of AI-generated art movements [DALL-E, StableDiffusion, Midjourney, etc.] serves as a vivid example of how expressive data-driven automation can be. The techniques enabling AI-generated art have already spilled over into the animation domain, where movement assets of decent quality are now generated from text prompts or music [EDGE, PhysDiff]. Meanwhile, artists and audiences increasingly fear that these highly advanced automations will eventually replace humans in art [ArtStation protest article]. Animators are no exception: while the interactive and technical nature of animation offers fulfilling opportunities for artistic expression, advances in data-driven asset creation may reduce the workflow to mere prompt engineering.

Objectives

Animate Non-humanoids with Physics. In light of the knowledge gaps pointed out above, I propose developing methods to generate virtual character movements using GPU-accelerated rigid-body simulation frameworks and learned control methods that rely on little or no MoCap data. These methods should handle a variety of morphologies with high degrees of freedom. To achieve this goal, I will use my expertise in GPU-accelerated simulation frameworks such as [Isaac] and [Brax], as well as my background in the theory and application of deep reinforcement learning.
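To make the intended approach concrete, the following is a minimal sketch (not a production implementation) of MoCap-free controller learning: a toy multi-segment crawler simulated in plain JAX stands in for Isaac/Brax, and a linear policy is trained with a simple evolution-strategies loop whose only objective is forward progress. All names, dynamics, and hyperparameters are illustrative assumptions.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Toy stand-in for a GPU-accelerated rigid-body simulator (Isaac/Brax):
# a planar "crawler" of N point-mass segments joined by springs and driven
# by per-segment horizontal actuation. All constants are illustrative.
N_SEGMENTS, DT, HORIZON = 5, 0.05, 200

def step(state, action):
    """One explicit-Euler step; state = (pos[N], vel[N]), action in R^N."""
    pos, vel = state
    stretch = pos[1:] - pos[:-1] - 1.0                       # deviation from unit rest length
    spring = (jnp.concatenate([stretch, jnp.zeros(1)])        # pull from right neighbour
              - jnp.concatenate([jnp.zeros(1), stretch]))     # pull from left neighbour
    force = 5.0 * spring + 2.0 * jnp.tanh(action) - 0.5 * vel
    vel = vel + DT * force
    pos = pos + DT * vel
    return (pos, vel)

def rollout(params):
    """Closed-loop rollout; the return is forward progress of the centre of mass
    (no MoCap reference appears anywhere in the objective)."""
    W, b = params
    pos0 = jnp.arange(N_SEGMENTS, dtype=jnp.float32)
    def body(state, _):
        obs = jnp.concatenate([state[0] - state[0].mean(), state[1]])
        return step(state, W @ obs + b), None
    (pos, _), _ = jax.lax.scan(body, (pos0, jnp.zeros(N_SEGMENTS)), None, length=HORIZON)
    return pos.mean() - pos0.mean()

def es_update(params, key, pop=64, sigma=0.1, lr=0.02):
    """One evolution-strategies step; vmap over the population mirrors the
    massive batching a GPU simulator would provide."""
    flat, unravel = ravel_pytree(params)
    noise = jax.random.normal(key, (pop, flat.size))
    returns = jax.vmap(lambda n: rollout(unravel(flat + sigma * n)))(noise)
    advantage = (returns - returns.mean()) / (returns.std() + 1e-8)
    return unravel(flat + lr / (pop * sigma) * (noise.T @ advantage))

key = jax.random.PRNGKey(0)
params = (jnp.zeros((N_SEGMENTS, 2 * N_SEGMENTS)), jnp.zeros(N_SEGMENTS))
for _ in range(100):
    key, sub = jax.random.split(key)
    params = es_update(params, sub)
```

In the actual project, the toy dynamics would be replaced by Isaac/Brax environments and the evolution-strategies loop by stronger reinforcement learning algorithms, but the interface stays the same: batched rollouts and an objective that requires no MoCap data.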

Keep Humans in the Loop. AI-generated art often lacks precision and specificity due to difficulties in data gathering and inherent design limitations of generative models. To address this, I propose developing methods that can be directly controlled through storytelling and interactive intent, and that are designed specifically for human-in-the-loop inference scenarios. As a first step, I plan to use partial keyframes as input goals: the animation system will generate physically plausible animations that satisfy these goals, and the user can then fine-tune the movements as desired. The system should handle a large number of keyframes and remain responsive and reactive in order to support the exploration of animation scenarios involving virtual characters.
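As a concrete illustration of how partial keyframes could enter the objective, the sketch below (names and shapes are assumptions, not a finished API) scores a simulated trajectory against keyframes that each constrain only a user-chosen subset of degrees of freedom; unkeyed DOFs are left for the system to fill in.

```python
import jax.numpy as jnp

def partial_keyframe_cost(trajectory, keyframes):
    """Penalise deviation from user-specified partial keyframes.

    trajectory: (T, D) array of simulated poses (D degrees of freedom).
    keyframes:  list of (frame, target, mask) where `target` is a (D,) pose and
                `mask` is a (D,) 0/1 array marking the DOFs the animator keyed;
                unkeyed DOFs carry no penalty and remain free.
    """
    cost = 0.0
    for frame, target, mask in keyframes:
        cost = cost + jnp.sum(mask * (trajectory[frame] - target) ** 2)
    return cost

# Example: key only the head height (DOF 0) at frame 30 and the last three
# (tail) DOFs at frame 90 of a 12-DOF character.
D = 12
keyframes = [
    (30, jnp.zeros(D).at[0].set(1.5),   jnp.zeros(D).at[0].set(1.0)),
    (90, jnp.zeros(D).at[-3:].set(0.2), jnp.zeros(D).at[-3:].set(1.0)),
]
```

Such a cost can be minimised by the same learned-control or trajectory-optimization machinery described above, and re-evaluated cheaply as the animator edits keyframes.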

Find and Present Surprises. I propose developing an animation system that generates complex and creative motions emerging from a character's interaction with its environment. This will allow for exploration-driven workflows and inspire interesting animation scenarios, such as a 7-legged alien or robotic creature attacking the player character through an obstacle-ridden environment. The system should explore and learn from a wide variety of simulated interactions to create an illusion of intelligent behavior, exploiting the character's morphology, scale, agility, and environment to perform the task. Such a system can also spark creativity beyond generating animation assets; for example, analyzing the diverse emergent behaviors may inspire improvements in character and environment design. Because a focus on high precision may limit the diversity of the output, the system should balance the precision of the output animation with the ability to find and present diverse emergent behaviors.
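One possible way to operationalise this precision/diversity trade-off, sketched below under the assumption of a novelty-search-style archive of previously seen rollouts (all function names are hypothetical):

```python
import jax.numpy as jnp

def behaviour_descriptor(trajectory):
    """Compress a (T, D) rollout into a small feature vector; here simply the
    final root position plus per-DOF variability as a crude gait signature."""
    return jnp.concatenate([trajectory[-1, :2], trajectory.std(axis=0)[:4]])

def novelty(descriptor, archive, k=5):
    """Mean distance to the k nearest descriptors already in the archive."""
    dists = jnp.linalg.norm(archive - descriptor, axis=1)
    return jnp.sort(dists)[:k].mean()

def score(task_reward, descriptor, archive, beta=0.3):
    """Blend task precision with novelty; beta trades accuracy against surprise."""
    return (1.0 - beta) * task_reward + beta * novelty(descriptor, archive)
```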

Project Description

I propose exploring, implementing, and evaluating systems that augment creative motion design workflows using machine learning and numerical optimization methods. Starting with simple wildlife character scenarios, I will design and deploy physically simulated environments and visualize diverse, emergent behaviors, using partial keyframes to pursue an animator's intent. I aim to make the following broad contributions:

  • create a system to generate skilled movements for non-humanoid rigid-body characters in real-time using partial keyframes
  • present and validate the emerging learning, optimization, and tooling methods
  • analyze the capabilities and limitations of my system

From a technical perspective, my project will involve large-scale scientific computing experiments that explore the fusion of closed-loop control (e.g. reinforcement learning) and open-loop control (e.g. trajectory optimization) via meta-learning and self-supervised learning.
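A minimal sketch of this fusion, assuming differentiable toy dynamics (the `step`, `cost`, and `policy` callables below are placeholders): a closed-loop policy proposes an open-loop action plan, which trajectory optimization then refines by gradient descent; the refined trajectories could in turn supervise the policy, which is where the meta-learning and self-supervision enter.

```python
import jax
import jax.numpy as jnp

def rollout_policy(policy, step, state0, horizon):
    """Closed-loop pass: the policy proposes an initial open-loop action plan."""
    def body(state, _):
        action = policy(state)
        return step(state, action), action
    _, actions = jax.lax.scan(body, state0, None, length=horizon)
    return actions                                  # (horizon, action_dim)

def refine_open_loop(step, cost, state0, actions, lr=1e-2, iters=200):
    """Open-loop pass: gradient-descend the action sequence through the
    (assumed differentiable) dynamics, i.e. first-order trajectory optimization."""
    def total_cost(acts):
        def body(state, a):
            nxt = step(state, a)
            return nxt, cost(nxt, a)
        _, costs = jax.lax.scan(body, state0, acts)
        return costs.sum()
    grad_fn = jax.jit(jax.grad(total_cost))
    for _ in range(iters):
        actions = actions - lr * grad_fn(actions)
    return actions

# Toy usage: a double-integrator "creature" (pos, vel) driven toward position 1.0.
step = lambda s, a: s + 0.1 * jnp.concatenate([s[1:], a])
cost = lambda s, a: (s[0] - 1.0) ** 2 + 1e-3 * jnp.sum(a ** 2)
policy = lambda s: jnp.atleast_1d(1.0 - s[0])       # crude hand-written feedback law
plan = rollout_policy(policy, step, jnp.zeros(2), horizon=50)
plan = refine_open_loop(step, cost, jnp.zeros(2), plan)
```

The refined (state, action) pairs can then be distilled back into the closed-loop policy via supervised learning, closing the loop between the open-loop optimizer and the learned controller.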

My project will broadly require the following technological components:

  • Custom implementations of deep reinforcement learning algorithms and novel optimization algorithms written in Python
  • Use of GPU-accelerated rigid-body simulators (Isaac and/or Brax)
  • Development of an interface prototype for common workstation software (e.g. Unity) that leverages the learned solution (a hypothetical interaction sketch follows below)
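To indicate how the interface prototype might talk to the learned solution, here is a deliberately simple, hypothetical sketch: a local Python service accepts partial keyframes as JSON from the Unity-side plugin and returns a generated trajectory. The endpoint, message fields, and the trivial placeholder solver are all assumptions, not part of any existing Unity or Isaac/Brax API.

```python
import json
import socket
import numpy as np

HOST, PORT = "127.0.0.1", 9000   # hypothetical local endpoint polled by the Unity plugin

def generate_trajectory(keyframes, horizon=120, dof=12):
    """Placeholder for the learned solver: the real system would invoke the
    trained policy / trajectory optimizer; this stub only writes the keyed
    values into an otherwise empty trajectory."""
    traj = np.zeros((horizon, dof))
    for kf in keyframes:
        traj[kf["frame"], kf["dofs"]] = np.asarray(kf["values"])
    return traj

def serve():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        while True:
            conn, _ = srv.accept()
            with conn:
                request = json.loads(conn.recv(1 << 20).decode())
                traj = generate_trajectory(request["keyframes"])
                conn.sendall(json.dumps({"trajectory": traj.tolist()}).encode())

if __name__ == "__main__":
    serve()
```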

Definition of Done

My project will be considered done when the emerging system and its constituent methods are validated, such that a technical paper can be produced and submitted to a relevant venue with a late-spring deadline, such as SIGGRAPH Asia 2023.

Milestones

  • P1 (baseline goals)
    • Draft use cases and implement them as GPU-accelerated, physically simulated character scenarios (using e.g. Brax or Isaac).
    • Explore and implement spacetime optimization solutions for a simplified case with no variation in motion specification (e.g. a fixed initial pose and always the same keyframe to reach). At this stage, inference will be slower than interactive time.
    • Integrate the emerging partial keyframing solution into an existing workflow.
    • Document technical details into an internal review-ready paper draft.
  • P2 (main goals)
    • Develop and evaluate methods to support full variation of motion specification with multiple partial keyframes. Some re-optimization may be required, so inference will run at interactive or near-interactive rates.
    • Integrate the updated solution into an existing workflow.
    • Document technical details into a submission-ready paper draft and an accompanying video.
  • P3 (reach goals)
    • Make the computation real-time (inference and visualization above 30 Hz)
    • Support additional inputs such as emotion

Timeline

  • January-February 2023: survey relevant literature; brainstorm use cases with minimal viable toy examples (e.g. snake, jellyfish, quadruped, hand)
  • March-April 2023: develop and evaluate methods
  • May 2023: write up results for relevant venues (SIGGRAPH Asia 2023, SCA 2023)

References