Google Research’s VideoPoet: All You Need to Know

Here is all you need to know about VideoPoet, Google’s new AI model for video generation.

OpenAI’s ChatGPT showed the world that artificial intelligence can draft content from natural-language prompts and create relevant images for the prompts its users give it. Since then, other tech giants and startups have launched a ton of AI tools that promise results tailored to user prompts and greater creativity. Google launched Bard (since renamed Gemini) to compete with OpenAI, and Microsoft launched its own chatbot, Copilot. After these text-focused chatbots, Google Research has now introduced VideoPoet, an AI model that can create videos from text or images, generate audio for a video, and perform several other video tasks. Let’s have a look at what Google’s VideoPoet can do.

What is Google VideoPoet?

Google Research has introduced VideoPoet amid competition from OpenAI and other AI developers. It is not a chatbot but a simple modelling method that can convert any Large Language Model (LLM) into a video generator. It relies on the components described below.

A pre-trained MAGVIT V2 video tokenizer and a SoundStream audio tokenizer transform images, video, and audio clips of variable length into sequences of discrete codes. These codes are compatible with text-based language models, which helps VideoPoet integrate with other modalities such as text.

It is equipped with an autoregressive language model that learns across video, image, audio, and text modalities and predicts the next video or audio token in the sequence.

The model is trained with a mixture of multimodal generative learning objectives within the LLM training framework, including text-to-video, text-to-image, image-to-video, video frame continuation, video inpainting and outpainting, video stylization, and video-to-audio. These tasks can also be combined with one another to provide additional zero-shot capabilities.
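The idea behind these components can be illustrated with a toy sketch. This is not VideoPoet's code: the ID offsets, the helper names (`to_unified`, `train_bigram`, `generate`), and the bigram counting model are all assumptions made for illustration, standing in for the real tokenizers and the real language model. It only shows how codes from different modalities can share one vocabulary and how a sequence is extended one token at a time.

```python
from collections import Counter, defaultdict

# Hypothetical vocabulary layout: each modality gets its own ID range
# so that a single sequence model can handle all of them.
OFFSETS = {"text": 0, "video": 1000, "audio": 2000}


def to_unified(modality, codes):
    """Shift modality-specific codes into the shared vocabulary."""
    return [OFFSETS[modality] + c for c in codes]


def train_bigram(sequences):
    """Count which token follows which (a toy stand-in for an LLM)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, prefix, steps):
    """Autoregressive loop: append the most likely next token each step."""
    out = list(prefix)
    for _ in range(steps):
        followers = counts.get(out[-1])
        if not followers:
            break  # nothing was ever seen after this token
        out.append(followers.most_common(1)[0][0])
    return out


# One training example: a text prompt followed by the video codes it describes.
example = to_unified("text", [5, 7]) + to_unified("video", [1, 2, 3])
model = train_bigram([example])
print(generate(model, to_unified("text", [5, 7]), steps=3))
# prints [5, 7, 1001, 1002, 1003]
```

Given only the text tokens as a prefix, the loop continues into the video range of the vocabulary, which is the sense in which a text-style language model can act as a video generator once video is expressed as tokens.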

VideoPoet can both create and edit videos. Starting from the clips, images, or text it is given, it uses its model to predict what comes next, and it can build a high-quality video out of a series of predicted clips.

The model also edits and synthesizes clips. According to Google, VideoPoet delivers state-of-the-art video generation and can produce a wide range of large, interesting, and high-quality motions. It supports square or portrait orientation, so it can generate videos for short-form content. It also supports audio generation from video.

So, basically, this AI can generate a wide variety of high-quality videos and add relevant audio to them.

How Does VideoPoet Work?

You can see demos of VideoPoet on the Google Research page.

To tell a story with this AI, first use a chatbot like ChatGPT or Bard to generate a series of prompts that outline the story. Then use each of those prompts to generate a short video clip. Finally, stitch the clips together to form a video depicting the story.
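The workflow above can be sketched in a few lines. This is only an illustration under stated assumptions: `generate_clip` is a hypothetical placeholder for a text-to-video call, not a real VideoPoet API, and a clip is represented as a plain list of frame labels.

```python
def generate_clip(prompt, num_frames=4):
    """Hypothetical placeholder for a text-to-video model call."""
    return [f"{prompt} [frame {i}]" for i in range(num_frames)]


def make_story_video(prompts):
    """Generate one short clip per prompt and stitch them in order."""
    video = []
    for prompt in prompts:
        video.extend(generate_clip(prompt))
    return video


# Prompts like these would come from a chatbot asked for a scene-by-scene story.
story_prompts = [
    "A raccoon packs a suitcase",
    "The raccoon boards a train",
    "The raccoon arrives at the beach",
]
story = make_story_video(story_prompts)
print(len(story))  # prints 12 (3 clips of 4 frames each)
```

Keeping the prompts in a list preserves the order of the scenes, so the stitched video follows the story as the chatbot wrote it.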

VideoPoet can generate longer videos by taking a 1-second video clip as input and predicting the next 1 second of video output. Repeating this process lets it create a video of any duration. It also supports interactive editing: users can choose from a list of candidate outputs and pick the one that best suits their video, and they can finely control the result by selecting from the larger set of generated clips. The model can also turn input images into short videos. Finally, VideoPoet can stylize an input video, restyling its appearance based on a text prompt to make it more appealing.
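The long-video idea described above, predicting the next second from the previous one and repeating, can be sketched as a simple loop. This is a toy sketch, not the model itself: the frame rate `FPS`, the helper names, and the `predict_next_second` stand-in (which just continues a frame counter) are assumptions for illustration.

```python
FPS = 8  # assumed frame rate for this toy example


def predict_next_second(last_second):
    """Hypothetical stand-in for the model: continues a frame counter."""
    start = last_second[-1] + 1
    return list(range(start, start + FPS))


def extend_video(frames, target_seconds, predict=predict_next_second):
    """Grow a video by repeatedly predicting one more second."""
    video = list(frames)
    while len(video) < target_seconds * FPS:
        context = video[-FPS:]  # condition only on the last second
        video.extend(predict(context))
    return video


clip = list(range(2 * FPS))           # a 2-second starting clip
long_video = extend_video(clip, 5)    # extend it to 5 seconds
print(len(long_video) // FPS)         # prints 5
```

Because each step looks only at the last second of video, the loop can run as many times as needed, which is why the duration is not fixed in advance.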

As the demos on the Google Research website show, this new addition to the AI world promises to make video generation and editing easier and faster, increasing the productivity of video creators. AI tools like this open up new possibilities for people and help them expand their creativity. With VideoPoet, we can create short motion clips or GIFs as well as longer-duration videos. Those are all the features Google’s VideoPoet offers.