Text-to-video model

This is an old revision of this page, as edited by Hr7161 (talk | contribs) at 23:23, 2 February 2023 (added information about a new ai model that came out which is state of the art.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A text-to-video model is an artificial intelligence model that takes only a text description as input and produces a video as output. The approach was inspired by text-to-image models, which generate images from text input.

Video prediction that renders realistic objects against a stable background can be performed with a recurrent neural network used as a sequence-to-sequence model, connected to a convolutional neural network that encodes and decodes each frame pixel by pixel,[1] so that the video is created using deep learning.[2]
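The data flow described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the method of the cited sources: the weights are random and untrained, the "convolution" is a fixed 2x2 averaging kernel with stride 2, and every name and size (generate_video, encode_frame, decode_frame, HIDDEN, the 4x4 frame) is hypothetical. It shows only how a text vector can seed a recurrent state that is updated once per frame, with each frame encoded before the update and decoded after it.

```python
import math
import random

FRAME_H, FRAME_W = 4, 4   # tiny frame size, chosen only for illustration
HIDDEN = 8                # size of the recurrent state (hypothetical)


def make_matrix(rows, cols, rng):
    """Random weight matrix; a real model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_text(prompt):
    """Toy text encoder: fold character codes into a fixed-size vector."""
    vec = [0.0] * HIDDEN
    for i, ch in enumerate(prompt):
        vec[i % HIDDEN] += ord(ch) / 1000.0
    return vec


def encode_frame(frame):
    """Toy convolutional encoder: 2x2 averaging kernel with stride 2."""
    pooled = []
    for r in range(0, FRAME_H, 2):
        for c in range(0, FRAME_W, 2):
            patch = [frame[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1)]
            pooled.append(sum(patch) / 4.0)
    return pooled


def decode_frame(hidden, w_out):
    """Toy decoder: linear map to pixels, squashed into (0, 1) by a sigmoid."""
    flat = [1.0 / (1.0 + math.exp(-v)) for v in matvec(w_out, hidden)]
    return [flat[r * FRAME_W:(r + 1) * FRAME_W] for r in range(FRAME_H)]


def generate_video(prompt, num_frames, seed=0):
    rng = random.Random(seed)
    feat = (FRAME_H // 2) * (FRAME_W // 2)
    w_h = make_matrix(HIDDEN, HIDDEN, rng)               # state -> state
    w_x = make_matrix(HIDDEN, feat, rng)                 # frame features -> state
    w_out = make_matrix(FRAME_H * FRAME_W, HIDDEN, rng)  # state -> pixels

    hidden = encode_text(prompt)                         # text seeds the state
    frame = [[0.0] * FRAME_W for _ in range(FRAME_H)]    # blank starting frame
    video = []
    for _ in range(num_frames):
        features = encode_frame(frame)
        pre = [a + b for a, b in zip(matvec(w_h, hidden), matvec(w_x, features))]
        hidden = [math.tanh(v) for v in pre]             # recurrent update
        frame = decode_frame(hidden, w_out)              # next frame
        video.append(frame)
    return video


video = generate_video("a red ball rolling", num_frames=3)
print(len(video), len(video[0]), len(video[0][0]))  # prints "3 4 4"
```

Because the weights are random, the output frames are noise; the sketch is meant to show the loop structure (encode frame, update state, decode frame), which a trained model would fill with learned parameters.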

Methodology

Models

Several text-to-video models exist, including open-source ones. The code for CogVideo has been published on GitHub.[3] Meta Platforms presents its text-to-video model, Make-A-Video, at makeavideo.studio.[4][5][6] Google developed Imagen Video for text-to-video generation.[7][8][9][10][11]

Make-A-Video is one of the more recent text-to-video models and, as of late 2022, was claimed to be state of the art.[12]

Antonia Antonova presented another model.[13]

References

  1. ^ "Leading India" (PDF).
  2. ^ Narain, Rohit (2021-12-29). "Smart Video Generation from Text Using Deep Neural Networks". Retrieved 2022-10-12.
  3. ^ CogVideo, THUDM, 2022-10-12, retrieved 2022-10-12
  4. ^ Davies, Teli (2022-09-29). "Make-A-Video: Meta AI's New Model For Text-To-Video Generation". W&B. Retrieved 2022-10-12.
  5. ^ Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt". Medium. Retrieved 2022-10-12.
  6. ^ "Meta's Make-A-Video AI creates videos from text". www.fonearena.com. Retrieved 2022-10-12.
  7. ^ "Google takes on Meta, introduces own video-generating AI". The Economic Times. Retrieved 2022-10-12.
  8. ^ Monge, Jim Clyde (2022-08-03). "This AI Can Create Video From Text Prompt". Medium. Retrieved 2022-10-12.
  9. ^ "Nuh-uh, Meta, we can do text-to-video AI, too, says Google". www.theregister.com. Retrieved 2022-10-12.
  10. ^ "Papers with Code - See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction". paperswithcode.com. Retrieved 2022-10-12.
  11. ^ "Papers with Code - Text-driven Video Prediction". paperswithcode.com. Retrieved 2022-10-12.
  12. ^ CatalyzeX. "Make-A-Video: Text-to-Video Generation without Text-Video Data: Paper and Code". CatalyzeX. Retrieved 2023-02-02.
  13. ^ "Text to Video Generation". Antonia Antonova. Retrieved 2022-10-12.