OmniTalker: Video to Video (Alibaba)
https://humanaigc.github.io/omnitalker/

ACTalker:
https://github.com/harlanhong/ACTalker

Voice Cloner and Text-to-Speech:
"Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens"
https://sparkaudio.github.io/spark-tts/
---------------------------------------------------------------------------------------
Image + Audio = Video with Hands and Face movement
"EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation"
https://github.com/antgroup/echomimic_v2
Tested System Environment: CentOS 7.2 / Ubuntu 22.04, CUDA >= 11.7
Tested GPUs: A100 (80G) / RTX4090D (24G) / V100 (16G)
Tested Python Version: 3.8 / 3.10 / 3.11
---------------------------------------------------------------------------------------
Image + Audio = Video with only Face movement
"Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation"
https://fudan-generative-vi...