Humanoid Synergy Editor Leveraging Postural Synergies for Kinematically Optimal Free-Space Control

Rhea Malhotra, William Chong, Catie Cuan, Oussama Khatib
Stanford University

Intro GIF

SynSculptor is a humanoid motion synergy editing tool for kinematically optimal and stylistically accurate humanoid control that directly imitates human free-space dance motion in real time.


Abstract

Generating sequences of human-like motions for humanoid robots presents challenges in collecting and analyzing reference human motions, synthesizing new motions based on these reference motions, and mapping the generated motion onto humanoid robots. To address these issues, we introduce SynSculptor, a humanoid motion analysis and editing framework that leverages postural synergies for training-free human-like motion scripting. To analyze human motion, we collect 3+ hours of motion capture data across 20 individuals, with a real-time operational space controller reproducing the human motion on a simulated humanoid robot. The major postural synergies are extracted by applying principal component analysis (PCA) to velocity trajectories segmented by changes in robot momentum, yielding a style-conditioned synergy library for free-space motion generation. To evaluate motions generated from the synergy library, we compute the foot-sliding ratio and proposed motion-smoothness metrics based on total momentum and kinetic energy deviations for each generated motion and compare them with reference motions. Finally, we combine the synergies with a motion-language transformer, so that the humanoid adapts its posture based on the chosen synergy while executing motion tasks with its end-effectors.

Video


Real-Time Controller

Task-Specific Control of HRP4c

We develop a real-time framework for mapping OptiTrack motion capture data onto a Supraped HRP4c humanoid. Human motion was tracked using OptiTrack's 41-marker baseline suit and streamed via NatNetLinux, with the user first assuming a reference posture to initialize landmark-specific frames. Relative poses and velocities of each anatomical landmark were computed from these frames and provided as inputs to a constraint-consistent operational space controller, which managed a prioritized stack of tasks: pose, orientation, then joint posture. Joint torques were computed at 1 kHz and executed in real time in the OpenSai dynamics simulation engine at the same rate.

Stack-Of-Tasks Hierarchy

Priority Level   Anatomical Landmark   Task
1                Pelvis                Pose
1                Right Foot            Pose
1                Left Foot             Pose
1                Right Hand            Pose
1                Left Hand             Pose
1                Head                  Orientation
2                Upper Torso           Orientation
2                Right Elbow           Orientation
2                Left Elbow            Orientation
3                Posture               Joint
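
The priority ordering in the table can be illustrated with a minimal velocity-level sketch of null-space prioritization. This is a simplification for illustration only: the controller described above works at the torque level with a constraint-consistent operational space formulation, whereas the hypothetical helpers below (`truncated_pinv`, `prioritized_joint_velocities`) resolve desired task velocities through plain Jacobian pseudoinverses. The tolerance value is likewise an assumption.

```python
import numpy as np

def truncated_pinv(A, tol=1e-8):
    """Pseudoinverse via SVD, dropping singular values below an absolute tolerance."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > tol
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv[:, None] * U.T)

def prioritized_joint_velocities(tasks, n_joints):
    """Resolve a list of (J, xdot_des) tasks, highest priority first.

    Each task only uses the motion left free by the tasks before it,
    so a lower-priority task cannot disturb a higher-priority one.
    """
    qdot = np.zeros(n_joints)
    N = np.eye(n_joints)  # null-space projector of the tasks handled so far
    for J, xdot_des in tasks:
        JN = J @ N                      # Jacobian restricted to the free motion
        JN_pinv = truncated_pinv(JN)
        qdot = qdot + JN_pinv @ (xdot_des - J @ qdot)
        N = N @ (np.eye(n_joints) - JN_pinv @ JN)
    return qdot
```

When a lower-priority task conflicts with a higher-priority one, its restricted Jacobian `J @ N` collapses to (near) zero and the truncated pseudoinverse ignores it, which is the behavior the priority levels are meant to express.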

Postural Synergy Extraction

We propose humanoid postural synergies as a compact representation that parameterizes full-body motion as differential joint motions encoded in low-dimensional spaces. Each pose and velocity trajectory is decomposed over time into synergies, where each synergy is defined by an initial configuration and a small set of principal joint-velocity directions that capture the dominant temporal gradients. To extract meaningful synergies from continuous motion, we segment trajectories at momentum discontinuities.
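
A minimal sketch of the two steps described above, segmentation followed by PCA on joint velocities, might look as follows. The function names, the simple threshold on sample-to-sample momentum change, and the mean-centering before PCA are all assumptions made for illustration; the source does not specify these details.

```python
import numpy as np

def segment_by_momentum(momentum, threshold):
    """Split a (T, d) momentum trajectory where the step change exceeds threshold.

    Returns a list of (start, end) index pairs, end exclusive.
    """
    jumps = np.linalg.norm(np.diff(momentum, axis=0), axis=1)
    cuts = np.flatnonzero(jumps > threshold) + 1
    bounds = np.concatenate(([0], cuts, [len(momentum)]))
    return [(int(a), int(b)) for a, b in zip(bounds[:-1], bounds[1:]) if b > a]

def extract_synergy(q, qdot, n_components):
    """PCA of the joint velocities in one segment.

    q, qdot: (T, n_joints) arrays for the segment.
    Returns the initial configuration, the leading principal directions
    (rows, unit norm), and their explained-variance ratios.
    """
    centered = qdot - qdot.mean(axis=0)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    var = s ** 2
    total = var.sum()
    ratio = var / total if total > 0 else np.zeros_like(var)
    return q[0], Vt[:n_components], ratio[:n_components]
```

Applying `extract_synergy` to each segment returned by `segment_by_momentum` would give one (initial configuration, directions) pair per segment, which matches the definition of a synergy given in the paragraph.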


Evaluations

We evaluate the intuition and utility of human synergies and SynSculptor with the following objectives:

  • Benchmark biomechanical (OpenSim) against robot baseline mechanical power consumption to validate the motion mapping and motivate synergies.
  • Identify the minimal synergy basis that accurately reconstructs prototypical tasks, and quantify stylistic expressivity across motion genres.
  • Compare the energetic profiles of reconstructed motions against original movements.
  • Integrate human synergies into text-to-motion models to fine-tune the stylistic outputs of MotionGPT.

Human and Robot Energetic Consumption

OpenSai vs OpenSim walking comparison
Bar comparison chart
Lines comparison chart

We map MoCap data to a biomechanical model in OpenSim and our humanoid mapping framework to compute and compare power. (Left) Mean power consumption over 10-second trials, averaged across 20 individuals for each motion. (Right) Representative power time series for a single individual performing each motion.
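
As a rough illustration of the quantity being compared, mechanical power can be computed from joint torques and joint velocities. The sketch below assumes power is summed as the absolute value of torque times velocity per joint, which is one common convention; the source does not state which convention is used, and `mean_mechanical_power` is a hypothetical helper.

```python
import numpy as np

def mean_mechanical_power(torques, joint_velocities):
    """Instantaneous and mean mechanical power for one trial.

    torques, joint_velocities: (T, n_joints) arrays in N*m and rad/s.
    Power per sample is sum_i |tau_i * qdot_i| (assumed convention).
    """
    torques = np.asarray(torques, dtype=float)
    joint_velocities = np.asarray(joint_velocities, dtype=float)
    instantaneous = np.sum(np.abs(torques * joint_velocities), axis=1)
    return instantaneous, float(instantaneous.mean())
```

The first return value corresponds to a power time series for one individual, and averaging the second value across individuals gives a per-motion mean.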


Kinematic Efficacy: Power

We further analyze power to show that operating within the span of the synergy space offers kinematic advantages in power consumption.



Kinematic Efficacy: Power

Here we compare mean instantaneous changes in whole-body momentum (∆P) and kinetic energy (∆KE), sampled at 1 kHz, for four prototypical moves. Synergy-based reconstructions closely match the original energetic profiles, demonstrating that the reduced basis preserves core dynamics while filtering out high-frequency components.
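
One way to compute such sample-to-sample changes is sketched below. It assumes the momentum trajectory and the joint-space mass matrix are already available from the dynamics model at each sample; `kinetic_energy` and `mean_step_change` are hypothetical helper names, and using the mean magnitude of consecutive differences is an assumption about how the metric is aggregated.

```python
import numpy as np

def kinetic_energy(M, qdot):
    """KE[t] = 0.5 * qdot[t]^T M[t] qdot[t].

    M: (T, n, n) joint-space mass matrices, qdot: (T, n) joint velocities.
    """
    return 0.5 * np.einsum("ti,tij,tj->t", qdot, M, qdot)

def mean_step_change(signal):
    """Mean magnitude of the change between consecutive samples.

    Accepts a (T,) scalar series (e.g. KE) or a (T, d) vector series
    (e.g. whole-body momentum).
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim == 1:
        signal = signal[:, None]
    steps = np.linalg.norm(np.diff(signal, axis=0), axis=1)
    return float(steps.mean())
```

Evaluating `mean_step_change` on both the original and the reconstructed trajectories gives a pair of numbers per move that can be compared directly.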


MotionGPT Synergy Conditioned Fine-tuning

Synergy-Augmented MotionGPT with Torso Projection

Synergy-based fine-tuning enhances the realism and kinematic accuracy of generated motions.

Across 20 MotionGPT trials per movement, raw MotionGPT outputs (teal) show higher foot-sliding ratios and power than the original MoCap (gold) and synergy-only reconstructions (blue). Projecting MotionGPT trajectories through the torso null-space synergy reduces foot sliding and power, matching the original MoCap performance.

MotionGPT Foot Sliding Comparison
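
For reference, a foot-sliding ratio can be sketched as the fraction of ground-contact frames in which the foot still moves horizontally. Definitions vary between papers, so the version below is an assumption rather than the exact metric used here: it assumes a z-up frame in meters, and the contact-height and speed thresholds are hypothetical defaults.

```python
import numpy as np

def foot_sliding_ratio(foot_pos, dt, height_thresh=0.05, speed_thresh=0.05):
    """Fraction of contact frames in which the foot slides.

    foot_pos: (T, 3) foot positions (x, y, z) with z up, in meters.
    dt: sample period in seconds.
    A frame counts as contact if z < height_thresh, and as sliding if
    its horizontal speed exceeds speed_thresh (m/s).
    """
    foot_pos = np.asarray(foot_pos, dtype=float)
    horizontal_step = np.diff(foot_pos[:, :2], axis=0)
    horizontal_speed = np.linalg.norm(horizontal_step, axis=1) / dt
    in_contact = foot_pos[1:, 2] < height_thresh
    if not in_contact.any():
        return 0.0
    return float(np.mean(horizontal_speed[in_contact] > speed_thresh))
```

Computing this ratio per foot and per trial, then averaging over trials, would give one value per movement and method for a comparison like the one in the figure.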