Alibaba releases EchoMimicV2, a half-body digital human model

Yesterday, Alibaba launched the second version of the EchoMimic series, EchoMimicV2.

The main update is half-body character animation. Previously, Alibaba had released EchoMimicV1, which I introduced earlier:

  • GitHub address: https://github.com/antgroup/echomimic_v2

  • Paper address: https://arxiv.org/abs/2411.10061

🚀The Evolutionary History of the EchoMimic Series

  • EchoMimicV1: achieves realistic portrait animation driven by audio, supporting the generation of vivid character animations through editable keypoint conditions.

  • EchoMimicV2: builds on this foundation with simplified conditions and smoother results, extending support to half-body portrait animation.

🔬Method

The research team proposed a half-body character animation method, introducing the following innovative strategies:

  1. An audio-pose harmonization strategy made up of two parts:

  • Pose sampling optimized for half-body details.
  • Audio conditioning that enhances the expressiveness of facial expressions and body movements while reducing conditional redundancy.

  2. To address the scarcity of half-body data, head (portrait-only) data is used to supplement the training framework. During inference, the head data can be ignored, thus achieving a "free lunch" in animation training.

  3. A denoising loss that guides motion trajectories, details, and low-level quality at different stages, thereby enhancing the expressiveness and consistency of the animation.
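
The stage-specific guidance in the last strategy can be illustrated with a minimal sketch: early (noisy) denoising steps emphasize motion, middle steps emphasize detail, and late (nearly clean) steps emphasize low-level quality. The function names, thresholds, and weights below are illustrative assumptions, not values from the paper:

```python
# Hypothetical sketch of a phase-specific denoising loss schedule.
# All names, phase boundaries, and weights are illustrative only.

def phase_weights(t: int, num_steps: int = 1000) -> dict:
    """Return loss-term weights for diffusion timestep t (0 = clean image)."""
    ratio = t / num_steps
    if ratio > 0.66:      # high noise: shape the overall motion trajectory
        return {"motion": 1.0, "detail": 0.1, "low_level": 0.0}
    elif ratio > 0.33:    # mid noise: refine structural detail
        return {"motion": 0.3, "detail": 1.0, "low_level": 0.1}
    else:                 # low noise: polish low-level visual quality
        return {"motion": 0.0, "detail": 0.3, "low_level": 1.0}

def total_loss(t, motion_loss, detail_loss, low_level_loss, num_steps=1000):
    """Combine the three loss terms using the phase-dependent weights."""
    w = phase_weights(t, num_steps)
    return (w["motion"] * motion_loss
            + w["detail"] * detail_loss
            + w["low_level"] * low_level_loss)
```

The point of such a schedule is that different denoising stages are responsible for different aspects of the output, so weighting the loss terms per stage lets each one be supervised where it matters most.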

📊 Comparative Analysis

  1. Comparison with pose-driven algorithms
  2. Comparison with audio-driven algorithms