[[File:OpenAI Sora in Action- Tokyo Walk.webm|thumb|upright=1.35|A video generated using OpenAI's [[Sora (text-to-video model)|Sora]] text-to-video model, using the prompt: <code>A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.</code>]]
A '''text-to-video model''' is a [[machine learning model]] that uses a [[natural language]] description as input to produce a [[video]] relevant to the input text.<ref name="AIIR">{{cite report|url=https://aiindex.stanford.edu/wp-content/uploads/2023/04/HAI_AI-Index-Report_2023.pdf|title=Artificial Intelligence Index Report 2023|publisher=Stanford Institute for Human-Centered Artificial Intelligence|page=98|quote=Multiple high quality text-to-video models, AI systems that can generate video clips from prompted text, were released in 2022.}}</ref> Advancements during the 2020s in the generation of high-quality, text-conditioned videos have largely been driven by the development of video [[diffusion model]]s.<ref>{{cite arXiv |last1=Melnik |first1=Andrew |title=Video Diffusion Models: A Survey |date=2024-05-06 |eprint =2405.03150 |last2=Ljubljanac |first2=Michal |last3=Lu |first3=Cong |last4=Yan |first4=Qi |last5=Ren |first5=Weiming |last6=Ritter |first6=Helge|class=cs.CV }}</ref>
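As a concrete illustration, a pretrained text-to-video diffusion model can be invoked through a library such as Hugging Face's <code>diffusers</code>. The sketch below is a minimal example, not a reference implementation: the checkpoint (<code>damo-vilab/text-to-video-ms-1.7b</code>), the prompt, and the parameter values are illustrative assumptions.

<syntaxhighlight lang="python">
# Minimal sketch: generating a short clip with a text-to-video diffusion
# pipeline from the Hugging Face "diffusers" library. Model ID, prompt,
# and step count are illustrative assumptions.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load a pretrained text-to-video diffusion model (assumed checkpoint).
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The natural-language description that conditions the generation.
prompt = "A stylish woman walks down a neon-lit Tokyo street at night."

# Run the iterative denoising loop; the output's .frames field holds the
# generated frames (indexed per prompt in recent diffusers versions).
frames = pipe(prompt, num_inference_steps=25).frames[0]

# Encode the frames as a video file on disk.
export_to_video(frames, "output.mp4")
</syntaxhighlight>

Here <code>num_inference_steps</code> sets how many denoising iterations the diffusion sampler runs; more steps generally trade generation time for visual quality.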
== Models ==