Meta announces Make-A-Video, which generates video from text
- October 1, 2022
Today, Meta announced Make-A-Video, an AI-powered video generator that can create novel video content from text or image prompts, similar to existing image synthesis tools like DALL-E and Stable Diffusion. It can also make variations of existing videos, though it’s not yet available for public use.
On Make-A-Video’s announcement page, Meta shows example videos generated from text, including “a young couple walking in heavy rain” and “a teddy bear painting a portrait.” It also showcases Make-A-Video’s ability to take a static source image and animate it. For example, a still photo of a sea turtle, once processed through the AI model, can appear to be swimming.
The key technology behind Make-A-Video, and the reason it has arrived sooner than some experts anticipated, is that it builds on existing work in text-to-image synthesis used by image generators like OpenAI’s DALL-E. In July, Meta announced its own text-to-image AI model called Make-A-Scene.
Instead of training the Make-A-Video model on labeled video data (for example, captioned descriptions of the actions depicted), Meta took image synthesis data (still images paired with captions) and added unlabeled video training data so the model learns a sense of where a text or image prompt might exist in time and space. It can then predict what comes after the image and display the scene in motion for a short period.
“Using function-preserving transformations, we extend the spatial layers at the model initialization stage to include temporal information,” Meta wrote in a white paper. “The extended spatial-temporal network includes new attention modules that learn temporal world dynamics from a collection of videos.”
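To make the phrase "function-preserving transformation" concrete, here is a toy sketch in plain Python. It is not Meta's code. It assumes, purely for illustration, that a pretrained per-image layer is followed by a new temporal layer whose 1D kernel starts as the identity, so the extended network initially produces exactly what the image-only network produced for each frame. All names (`spatial_layer`, `temporal_layer`, `identity_temporal_kernel`, `spatiotemporal_layer`) are hypothetical.

```python
from typing import List

Frame = List[float]   # toy "frame": a flat list of pixel values
Video = List[Frame]   # toy "video": a list of frames in time order


def spatial_layer(frame: Frame) -> Frame:
    """Stand-in for a pretrained per-image layer (hypothetical: doubles each value)."""
    return [2.0 * x for x in frame]


def identity_temporal_kernel(size: int = 3) -> List[float]:
    """1D kernel over time with weight 1 at the centre and 0 elsewhere."""
    kernel = [0.0] * size
    kernel[size // 2] = 1.0
    return kernel


def temporal_layer(video: Video, kernel: List[float]) -> Video:
    """Mix each pixel across neighbouring frames (1D convolution, zero padding)."""
    half = len(kernel) // 2
    out: Video = []
    for t in range(len(video)):
        mixed = [0.0] * len(video[t])
        for k, weight in enumerate(kernel):
            src = t + k - half
            if 0 <= src < len(video):
                for i, x in enumerate(video[src]):
                    mixed[i] += weight * x
        out.append(mixed)
    return out


def spatiotemporal_layer(video: Video, kernel: List[float]) -> Video:
    """Apply the image layer to every frame, then the new temporal layer."""
    per_frame = [spatial_layer(frame) for frame in video]
    return temporal_layer(per_frame, kernel)


video = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
image_only = [spatial_layer(frame) for frame in video]

# With the identity kernel, the extended layer preserves the image-only output.
preserved = spatiotemporal_layer(video, identity_temporal_kernel())

# Once training moves the kernel away from identity, frames start to influence
# each other, which is where temporal behaviour would come from.
blended = spatiotemporal_layer(video, [0.5, 0.5, 0.0])
```

In this sketch `preserved` equals `image_only`, while `blended` does not. That mirrors the idea in the quote: the temporal parts can be added without discarding what the image model already learned, and are then trained on video.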
Meta has not announced how or when Make-A-Video might become available to the public or who would have access to it. Meta provides a sign-up form that people can fill out if they are interested in trying it in the future.
Meta acknowledges that the ability to create photorealistic videos on demand presents certain social hazards. At the bottom of the announcement page, Meta says that all AI-generated video content from Make-A-Video contains a watermark to “help ensure viewers know the video was generated with AI and is not a captured video.”