ChainQueen - HackMD

<h1>ChainQueen</h1> - Reference: [Hu et al., ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics, ICRA, 2019.](https://arxiv.org/abs/1810.01054) - Post author: [Jedd Yang](https://hackmd.io/@jeddiot) - Post written date: 2025-Oct-11 - Keywords: Soft Robotics, Robotics simulator <h2>Overview</h2> <div class="section-box"> <h3>Background</h3> Differentiable physical simulator enables gradient-based optimizers for efficient control. Rigid body are studied in approximate and exact methods. Deformable body, however, cannot be effectively used for solving inverse problems such as optimal control and motion planning due to: - Slow simulation runtime, due to higher DoFs. - Challenging Contact detection and resolution, due to geometry change. - Closed-form and efficient computation of gradients is challenging in the presence of contact Therefore, a differentiable physical simulator for soft body is in urgent need. <h3>Preliminary Knowledge</h3> - Eulerian-Lagrangian description - Continuum mechanics - Material point method - Deformation matrix <h3>Contribution</h3> Hu et al. built the first fully differentiable simulation framework for soft robotics, using moving least squares material point method (MLS-MPM). </div> <h2>Methodology</h2> <div class="section-box"> <h3>Material Point Method (MPM)</h3> >[!Note] What is MPM > MPM is a algorithm using hybrid Eulerian-Lagrangian description of continuum mechanics, where both particles and grid nodes are used. <center>  <iframe width="560" height="315" src="https://www.youtube.com/embed/O0kyDKu8K-k?si=XkHg6TpLG6HM_aro" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> <br> <em>Vid. 1: Disney Frozen MPM demo.</em> </center> MPM discretizes a continuum body by a set of material points (but generally people call them particles) onto a background grid. It solves dynamic (transient) problem in timesteps, and 4 stages are done in each timestep Figure 1. >[!Tip] Why use MPM among all computational mechanics algorithm? > 1. Physics-based (compared to other graphics-based approaches) > 2. Friendly parallelization > 3. Naturally handles large deformation and (self-)collision <h3>MPM Solution Process</h3> 1. Particle-to-grid projection (P2G): Particles projecting physical quantities onto grid nodes. 2. Grid updating: Solves physical properies for the next step. 3. Grid-to-particle interpolation (G2P): Particles interpolate properties for the next step from grid nodes. 4. Grid resetting: Grid resets to prevent mesh distortion issue. <div class="three-column-layout"> <center> <img src="https://geomechanics.berkeley.edu/wp-content/uploads/2019/08/Capture.png" width="600"> <figcaption>Fig. 1: MPM Solution Steps Within Each Timestep.</figcaption> <br> </center> <center> <img src="https://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00466-023-02381-0/MediaObjects/466_2023_2381_Fig1_HTML.png" width="400"> <figcaption>Fig. 2: MLS-MPM discretization.</figcaption> <br> </center> <center> <img src="https://www.researchgate.net/profile/Masaaki-Nagahara/publication/255787253/figure/fig1/AS:669434721099777@1536617083063/Polynomial-spline-of-order-N-0-1-2-3.png" width="500"> <figcaption>Fig. 3: B-Spline Smoothness at different degree comparison.</figcaption> <br> </center> </div> --- <h3>Moving Least Square (MLS) Interpolation</h3> Notice in step 3 (the bottom right corner of Fig. 1), an interpolation is done using nearby 4 grid nodes like the Finite Element Method (FEM). This is problematic because regular interpolation uses piecewise-linear shape functions. MLS interpolation creates effective support, covering more grid nodes, ensuring smoother kernel (weighting function), therefore ensuring better momentum conservation. >[!Tip] Why better momentum conservation? > When a particle crosses a cell boundary, with smoother kernel (weighting function), its weighting on grid nodes changes gradually, not abruptly. --- <h3>Actuation model for soft robotics</h3> Soft robotic actuators produce internal stresses, causing an <font color="#FAC31D">active deformation</font>. ChainQueen introduces an extra Cauchy stress $\mathbf{A}_p$. This term is added to the normal material stress in the momentum equation. $$ \mathbf{A}_p = \mathbf{F}_p \mathbf{\sigma}_{pa} \mathbf{F}_p^T $$ where $\mathbf{\sigma}_{pa}$ is a diagonal matrix of actuation coefficients defined in the initial configuration, and $\mathbf{F}_{p}$ is the deformation gradient that describes how the material around that particle has deformed, a <font color="#FAC31D">passive deformation</font>. $$ \mathbf{F}_p = \frac{\partial \mathbf{x}}{\partial \mathbf{X}} $$ in which the $\mathbf{x}$ represents the [deformed configuration](https://www.sciencedirect.com/topics/engineering/deformed-configuration); $\mathbf{X}$, the initial configuration. $$ \mathbf{\sigma}_{pa} = \begin{bmatrix} a_x & 0 & 0\\ 0 & a_y & 0\\ 0 & 0 & a_z \end{bmatrix} $$ By multiplying $\mathbf{\sigma}_{pa}$ with $\mathbf{F}_p$ and $\mathbf{F}_p^T$, we convert it from initial configuration to the deformed configuration. The total Cauchy stress used in MPM’s momentum update is then: $$ \mathbf{\sigma}^{total}_{p} = \mathbf{\sigma}^{passive}_{p} + \mathbf{A}_p $$ >[!Tip] Are there other types of stress? Why do we use Cauchy stress? > Yes, there are other types of stresses. Because Cauchy stress represents stress w.r.t. deformed configuration, and in MPM, the update is calculate based on previous deformation, that is, the deformed configuration in previous timestep. > > | **Stress type** | **Config** | **Formula** | **Physical meaning** | > |-----------------|------------|-------------|----------------------| > | **Cauchy stress** ($\mathbf{\sigma}$) | Deformed | $\mathbf{f} / \text{current area}$ | True stress in deformed body | > | **First Piola–Kirchhoff** ($\mathbf{P}$) | Mixed | $\mathbf{P} = J \mathbf{\sigma} \mathbf{F}^{-T}$ | Force per unit initial area | > | **Second Piola–Kirchhoff** ($\mathbf{S}$) | Initial | $\mathbf{S} = \mathbf{F}^{-1} \mathbf{P}$ | Symmetric; used in constitutive laws | <center> <img src="https://www.researchgate.net/publication/367431871/figure/fig3/AS:11431281114847533@1674702919807/Schematic-drawing-illustrating-the-initial-and-deformed-configuration-for-continuum.png" width="800"> <figcaption>Fig. 4: Relationship between initial configuration and the deformed configuration.</figcaption> <br> </center> --- <h3>Backward Differentiation</h3> Hu et al. stated MLS-MPM is naturally differentiable, capable of both foward simulation and backward differentiation. Although how it is differentiable is unstated (secret knowhow?). </div> <h2>Result</h2> <div class="section-box"> <center>  <iframe width="560" height="315" src="https://www.youtube.com/embed/4IWD4iGIsB4?si=Fwiqgb6i7OnUSJ-d" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> <br> <em>Vid. 2: ChainQueen demo.</em> </center> --- <h3>Application I: Inference / System Identification</h3> :::success The simulator can use gradient descent to optimize parameters, to figure out the value of the physical parameters that make the physical outcome happen. ::: Normal physics simulator (e.g., PhysX, Havok) works in one direction: You input the physical parameters (mass, density, stiffness, etc.); the simulator outputs motion. But differential simulator can go both ways. In this scenario 1. Ball A is moving to the right. 2. Ball B is stationary or slower. 3. A hits B, and B moves to a final target position C. ChainQueen is able to figure out the physical parameters (density of ball A) that make this outcome happen, B ends up exactly at C. --- <h3>Application II: Controller Design</h3> :::success The simulator is capable of providing gradients with respect to dynamics and controller parameters in low iterations. ::: The controller takes as input the state vector $z$ (target position, center of mass position, velocity of each composed soft component). The actuation vector $\mathbf{a} = tanh(\mathbf{W}\mathbf{z} + \mathbf{b})$ for up to 16 actuators is generated by gradient descent on variables $\mathbf{W}$ and $\mathbf{b}$. They designed a series of experiments including the 2D biped runner and robotic finger, and 3D quadrupedal runner, crawler and robotic arm. Gradient-based optimizers successfully compute desired controllers within only tens or hundreds of iterations. ![Screenshot 2025-10-11 103056](https://hackmd.io/_uploads/BkNErHPaxx.png) ![Screenshot 2025-10-11 103258](https://hackmd.io/_uploads/r1BESSvpxe.png) <center> <figcaption>Fig. 5: Gradient-based optimizers successfully compute desired controllers for biped and quadruped runners.</figcaption> <br> </center> Now simplify the case to single link version, and compare with Proximal Policy Optimization (PPO) (state-of-the-art reinforcement learning algorithm @ 2018). ![Screenshot 2025-10-11 103355](https://hackmd.io/_uploads/r1XkIBw6ex.png) <center> <figcaption>Fig. 6: Comparison of ChainQueen-based control method with PPO.</figcaption> <br> </center> --- <h3>Application III: Co-design</h3> :::success The simulator is capable of providing gradients with respect to structural design parameters as well. In this work, it can design how stiff/soft each part of the arm is, according to the task. ::: Consider a multi-link robot arm (two links, two joints each with two sideby-side actuators; all parts deformable) with a goal ball on the side, the desired, optimized end-effector of the arm is to reach the goal ball with final $0$ arm velocity, and minimized for actuation cost. $$ Cost = \sum^N_{i=0} u_i^T u_i dt $$ where $u_i$ is the actuation vector at timestep $i$; $N$ is the total number of timesteps. Actuation for each time step is solved for, along with the time-invariant Young’s modulus of the system for each particle. They used [NLOPT](https://nlopt.readthedocs.io/en/latest/)’s sequential least squares programming algorithm (SLSQP) for optimization. They compared their co-design solution to fixed designs showned as following ![Screenshot 2025-10-11 103140](https://hackmd.io/_uploads/r1rkwBwpge.png) <center> <figcaption>Fig. 7: Comparison of designed stiffness distribution.</figcaption> <br> </center> Only the co-design arm (the far right one) fully converges to the target goal, and with lower actuation cost. ![Screenshot 2025-10-11 103410](https://hackmd.io/_uploads/r1pdwHD6gl.png) <center> <figcaption>Fig. 8: Convergence for the different tasks.</figcaption> <br> </center> </div>