Text-to-video model: Difference between revisions

This is a list and should be formatted as such. Also adds clarity.
Reworded first sentences for grammar. Cogvideo shouldn't be in the introduction of the topic and is already mentioned and referenced in the Models section.
Line 1:
'''Text-to-Video''' is a state-of-the-art technology that takes only text as input and produces video as output. It was inspired by [[text-to-image model]]s, which deliver images as output from text as input.<ref>{{Citation |title=CogVideo |date=2022-10-12 |url=https://github.com/THUDM/CogVideo |publisher=THUDM |access-date=2022-10-12}}</ref>
 
Video prediction that keeps objects realistic against a stable background is performed using a [[recurrent neural network]] in a sequence-to-sequence model, connected to a [[convolutional neural network]] that encodes and decodes each frame pixel by pixel,<ref>{{Cite web |title=Leading India |url=https://www.leadingindia.ai/downloads/projects/VP/vp_16.pdf}}</ref> creating video using [[deep learning]].<ref>{{Cite web |last=Narain |first=Rohit |date=2021-12-29 |title=Smart Video Generation from Text Using Deep Neural Networks |url=https://www.datatobiz.com/blog/smart-video-generation-from-text/ |access-date=2022-10-12 |language=en-US}}</ref>
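
The pipeline described above can be loosely sketched as follows. This is only a minimal illustration under stated assumptions, not the method of the cited sources: a single 3×3 convolution stands in for the convolutional encoder, an Elman-style recurrent update stands in for the sequence-to-sequence model, and a sigmoid layer decodes the hidden state into a frame pixel by pixel. All function names, sizes, and the random, untrained weights are hypothetical.

```python
import math
import random


def conv2d(frame, kernel):
    """Valid 2-D cross-correlation of one frame with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(frame) - kh + 1, len(frame[0]) - kw + 1
    return [[sum(frame[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def encode(frame, kernel):
    """Convolutional encoder: convolve, apply ReLU, flatten to a vector."""
    return [max(0.0, v) for row in conv2d(frame, kernel) for v in row]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(h, x, w_h, w_x):
    """Elman recurrent update: h' = tanh(W_h h + W_x x)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_h, h), matvec(w_x, x))]


def decode(h, w_out, height, width):
    """Decoder: one sigmoid-activated intensity per output pixel."""
    flat = [1.0 / (1.0 + math.exp(-z)) for z in matvec(w_out, h)]
    return [flat[r * width:(r + 1) * width] for r in range(height)]


def rand_matrix(rows, cols, rng):
    """Random weights in [-0.5, 0.5]; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def predict_next_frame(frames, hidden_size=8, seed=0):
    """Encode each frame, run the recurrence over time, decode one new frame."""
    rng = random.Random(seed)
    height, width = len(frames[0]), len(frames[0][0])
    kernel = rand_matrix(3, 3, rng)
    feature_size = (height - 2) * (width - 2)  # size of a valid 3x3 conv output
    w_x = rand_matrix(hidden_size, feature_size, rng)
    w_h = rand_matrix(hidden_size, hidden_size, rng)
    w_out = rand_matrix(height * width, hidden_size, rng)
    h = [0.0] * hidden_size
    for frame in frames:
        h = rnn_step(h, encode(frame, kernel), w_h, w_x)
    return decode(h, w_out, height, width)
```

Because the weights are untrained, the output is not a meaningful prediction. The sketch only shows how data flows from per-frame convolutional encoding, through the recurrent state, to a decoded frame of the same size as the input.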
 
== Methodology ==