This incredible AI transforms any image into a photorealistic video

Created by researchers at Stanford University, WALT is a new video-generation AI. Unlike existing tools such as Runway, it is not only capable of creating a video from text, but also of transforming an image into a video!

For a little over a year now, many of us have been using generative AI tools such as DALL-E or Midjourney to create images from text prompts! However, this is only the beginning of the artificial intelligence revolution.

This technology is evolving extremely fast, and you probably don’t realize what awaits us in the coming months. Among these major advances are video-generating AIs.

Solutions for creating a video from a prompt already exist, such as Runway, Meta's Make-A-Video and Google's VideoPoet, but they are still in their infancy and less impressive than image generators.

However, it’s only a matter of time before it becomes possible to create a YouTube video, or even a movie, simply by typing a few words on your keyboard.

For its part, the WALT model offers a slightly different but equally impressive approach. In addition to generating a video from a prompt, it is capable of transforming an image into a photorealistic video!

A number of clips have been released to demonstrate the project, including a flame-breathing dragon, asteroids hitting the Earth head-on, and horses galloping on the beach.

This AI was created by a team at Stanford University and excels in its ability to create coherent 3D motion on a static object from a natural-language prompt.

AI trained on both videos and images

There are already several video-generating AIs, created by Pika Labs, Runway, Meta and Stability AI. Performance varies from model to model, particularly in terms of fluidity, consistency and quality.

However, as the researcher behind WALT, Agrim Gupta, explains, this AI stands out for its ability to generate videos from text or images, and can be used for 3D animation.

In his words, "although generative AI has made great strides for images, progress on video generation lags behind." He is convinced that a unified framework will close the gap between image and video generation.

This artificial intelligence has been trained on photographs and video clips encoded into the same latent space. This enabled it to learn from both types of content from the outset, giving the model a deeper understanding of the notion of motion.
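
To make this idea concrete, here is a minimal sketch (not WALT's actual code) of how images and videos can share one latent space: a still image is simply treated as a one-frame video, so a single encoder handles both. The encoder, shapes and dimensions below are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class ToyJointEncoder(nn.Module):
    """Encodes a batch of frames (B, T, C, H, W) into latents."""
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        # A single per-frame convolution stands in for a real video tokenizer.
        self.conv = nn.Conv2d(3, latent_dim, kernel_size=4, stride=4)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = frames.shape
        x = frames.reshape(b * t, c, h, w)
        z = self.conv(x)                      # (B*T, D, H/4, W/4)
        return z.reshape(b, t, *z.shape[1:])  # back to (B, T, D, h', w')

encoder = ToyJointEncoder()

video_batch = torch.randn(2, 8, 3, 128, 128)   # two 8-frame clips
image_batch = torch.randn(2, 3, 128, 128)      # two still photographs

# Key idea: an image is just a video with T = 1, so the same encoder
# (and therefore the same latent space) serves both kinds of training data.
image_as_video = image_batch.unsqueeze(1)      # (2, 1, 3, 128, 128)

video_latents = encoder(video_batch)           # (2, 8, 64, 32, 32)
image_latents = encoder(image_as_video)        # (2, 1, 64, 32, 32)
print(video_latents.shape, image_latents.shape)
```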

Designed to be scalable and efficient, WALT delivers its results with a cascade of three models covering image and video generation: a base model followed by upsampling stages. In this way, definition is increased while motion stays coherent.

WALT vs Runway and Pika Labs: the best AI video generator?

Thanks to this innovative approach, WALT seems a step ahead of the competition in the field of video generation. This is particularly true for 3D motion.

The quality of the result is nevertheless below that of Runway or Pika Labs, but this is just the beginning. This research AI is designed to improve.

The base model produces small 128 x 128 clips, which are then upsampled to a definition of 512 x 896 at eight frames per second.
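
As a rough illustration of this cascade (a hypothetical sketch, not WALT's pipeline), a low-resolution base clip is enlarged in two spatial upsampling stages. Plain bilinear interpolation, the intermediate 256 x 448 resolution and the 16-frame clip length below stand in for the learned models and the real clip length, which the article does not specify.

```python
import torch
import torch.nn.functional as F

def fake_base_model(num_frames: int = 16) -> torch.Tensor:
    # Stand-in for the base generator: a random 128 x 128 clip,
    # shaped (frames, channels, height, width).
    return torch.rand(num_frames, 3, 128, 128)

def upsample_stage(clip: torch.Tensor, size: tuple[int, int]) -> torch.Tensor:
    # Stand-in for a learned super-resolution stage.
    return F.interpolate(clip, size=size, mode="bilinear", align_corners=False)

clip = fake_base_model()                  # (16, 3, 128, 128)
clip = upsample_stage(clip, (256, 448))   # first upsampling stage (hypothetical)
clip = upsample_stage(clip, (512, 896))   # second stage: the final definition
print(clip.shape)                         # torch.Size([16, 3, 512, 896])
```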

In comparison, Runway Gen-2 creates video clips up to 1536 x 896, but requires a paid subscription. The free basic version produces 768 x 448 videos, a lower definition than WALT.

Nevertheless, Runway and Pika Labs can generate up to 24 frames per second. This is much closer to a real video shot by a human being. It remains to be seen how WALT will improve with future versions…
