Yesterday, Alibaba launched the second version of the EchoMimic series, EchoMimicV2. The main update is support for half-body character animation. Alibaba had previously released EchoMimicV1, which I introduced in an earlier post.
GitHub address: https://github.com/antgroup/echomimic_v2
Paper address: https://arxiv.org/abs/2411.10061
🚀 The Evolutionary History of the EchoMimic Series
EchoMimicV1 achieves realistic, audio-driven portrait animation and supports generating vivid character animations from editable keypoint conditions.
EchoMimicV2 builds on this foundation, focusing on a simplified condition set and smoother results, and extends support to half-body portrait animation.
🔬 Method
The research team proposes a half-body character animation method built on three strategies:

- Optimizing pose sampling for half-body detail: enhancing the expressiveness of facial expressions and body movements while cutting redundant conditions (see the first sketch after this list).
- Supplementing scarce half-body data with head-portrait data during training: the head data can simply be dropped at inference, yielding a "free lunch" for animation training (second sketch below).
- Guiding motion trajectory, detail, and low-level quality at different denoising stages: improving the expressiveness and consistency of the animation (third sketch below).
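The condition simplification in the first strategy can be pictured as pruning a dense keypoint sequence down to the subset the pose branch actually needs, leaving the rest to the audio signal. Here is a minimal NumPy sketch; the keypoint layout, index ranges, and the `simplify_pose_condition` helper are hypothetical illustrations, not the repo's actual code.

```python
import numpy as np

# Hypothetical keypoint layout: indices 0-67 face, 68-91 body, 92-133 hands.
# The real EchoMimicV2 pipeline defines its own layout; this sketch only
# illustrates the idea of pruning redundant pose conditions.
FACE_IDX = np.arange(0, 68)
BODY_IDX = np.arange(68, 92)
HAND_IDX = np.arange(92, 134)

def simplify_pose_condition(keypoints: np.ndarray, keep: str = "hands") -> np.ndarray:
    """Reduce a (T, K, 2) keypoint sequence to the subset used as the
    driving condition; facial and body dynamics are left to the audio branch."""
    if keep == "hands":
        return keypoints[:, HAND_IDX, :]
    if keep == "body+hands":
        return keypoints[:, np.concatenate([BODY_IDX, HAND_IDX]), :]
    return keypoints  # full condition (most redundant)

# Example: 120 frames, 134 keypoints, (x, y) coordinates.
full = np.random.rand(120, 134, 2).astype(np.float32)
cond = simplify_pose_condition(full)  # (120, 42, 2)
print(full.shape, "->", cond.shape)
```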
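For the second strategy, the "free lunch" amounts to mixing abundant head-only clips into the half-body training pool while restricting their supervision to the head region, so they still teach facial dynamics without biasing full-frame generation. A schematic data mixer, assuming a made-up rectangular head mask and 50/50 sampling ratio:

```python
import random
import numpy as np

def make_sample(frames: np.ndarray, head_only: bool) -> dict:
    """Return a training sample plus a loss mask. For head-only clips the
    mask restricts supervision to a rough head region (an assumption here);
    half-body clips are supervised everywhere."""
    h, w = frames.shape[-2:]
    mask = np.ones((h, w), dtype=np.float32)
    if head_only:
        mask[:] = 0.0
        mask[: h // 3, w // 4 : 3 * w // 4] = 1.0  # rough head region
    return {"frames": frames, "loss_mask": mask, "head_only": head_only}

def sample_batch(half_body_pool, head_pool, batch_size=4, head_ratio=0.5):
    """Draw a mixed batch; the extra head data is used only at training
    time, so inference needs nothing beyond the half-body setup."""
    batch = []
    for _ in range(batch_size):
        if random.random() < head_ratio:
            batch.append(make_sample(random.choice(head_pool), head_only=True))
        else:
            batch.append(make_sample(random.choice(half_body_pool), head_only=False))
    return batch

half_body = [np.zeros((3, 256, 256), np.float32)]
heads = [np.zeros((3, 256, 256), np.float32)]
print(sum(s["head_only"] for s in sample_batch(half_body, heads)))
```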
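The third strategy can be approximated as a timestep-dependent loss schedule: early, high-noise denoising steps weight a motion term, middle steps a detail term, and late steps plain low-level reconstruction. The phase thresholds, `T = 1000`, and the component losses below are assumptions for illustration; the paper's actual formulation may differ.

```python
import torch

T = 1000  # assumed total number of diffusion timesteps

def phase_weights(t: torch.Tensor):
    """Per-sample weights (motion, detail, quality) for timestep t in [0, T);
    larger t means more noise, i.e. an earlier denoising phase."""
    motion = (t >= 2 * T // 3).float()                    # early phase
    detail = ((t >= T // 3) & (t < 2 * T // 3)).float()   # middle phase
    quality = (t < T // 3).float()                        # final phase
    return motion, detail, quality

def phase_specific_loss(pred, target, pose_feat_pred, pose_feat_tgt, t):
    """Base denoising loss plus a pose-feature term active only in the
    motion phase; detail/quality terms would slot in analogously."""
    base = torch.nn.functional.mse_loss(pred, target, reduction="none").mean(dim=(1, 2, 3))
    pose = torch.nn.functional.mse_loss(pose_feat_pred, pose_feat_tgt, reduction="none").mean(dim=1)
    w_motion, w_detail, w_quality = phase_weights(t)
    return (base + w_motion * pose).mean()

# Example with a batch of 4.
pred, target = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
pf_pred, pf_tgt = torch.randn(4, 128), torch.randn(4, 128)
t = torch.randint(0, T, (4,))
print(phase_specific_loss(pred, target, pf_pred, pf_tgt, t).item())
```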
📊 Comparative Analysis
Figures: comparison with pose-driven algorithms; comparison with audio-driven algorithms.