Alibaba's new AI turns images into singers
🤖 Meet EMO (short for "Emote Portrait Alive").
Alibaba is showing off its newest AI tool that transforms still portraits into moving, lifelike actors and singers.
Imagine a single picture coming alive, the face emoting and singing along to your favorite song. EMO accomplishes this by taking a single portrait image and combining it with audio input such as speech or singing.
This results in an animated avatar video complete with realistic facial expressions and head movements.
While AI video generation isn't new, pairing it with audio and achieving natural lip sync has been a major challenge. Traditional methods struggled to capture the full range of human expressions and subtle facial details.
🖼️ 250 hours and 150 million images to train
EMO was trained on a dataset of over 250 hours of footage and 150 million images encompassing various languages, speeches, movies, and singing performances.
This breadth of training data allows EMO not only to generate smooth facial movements and head poses but also to adapt to different artistic styles, including photographs, paintings, and even anime characters.
In one example showcased by the team, an AI-generated image of a man in a tracksuit raps “Rap God” by Eminem. This versatility gives EMO potential for creating engaging content across fields like education, entertainment, marketing, and e-commerce.