{{short description|Machine learning model}}
{{Use dmy dates|date=November 2024}}
[[File:OpenAI Sora in Action- Tokyo Walk.webm|thumb|upright=1.35|A video generated using OpenAI's [[Sora (text-to-video model)|Sora]] text-to-video model]]
A '''text-to-video model''' is a [[machine learning model]] that takes a [[natural language]] description as input and produces a [[video]] matching that description.<ref name="AIIR">{{cite report|url=https://aiindex.stanford.edu/wp-content/uploads/2023/04/HAI_AI-Index-Report_2023.pdf|title=Artificial Intelligence Index Report 2023|publisher=Stanford Institute for Human-Centered Artificial Intelligence|page=98|quote=Multiple high quality text-to-video models, AI systems that can generate video clips from prompted text, were released in 2022.}}</ref> Advances during the 2020s in generating high-quality, text-conditioned video have largely been driven by the development of video [[diffusion model]]s.<ref>{{cite arXiv |last1=Melnik |first1=Andrew |title=Video Diffusion Models: A Survey |date=2024-05-06 |eprint=2405.03150 |last2=Ljubljanac |first2=Michal |last3=Lu |first3=Cong |last4=Yan |first4=Qi |last5=Ren |first5=Weiming |last6=Ritter |first6=Helge |class=cs.CV}}</ref>
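In practice, openly released video diffusion models are commonly invoked through libraries such as Hugging Face's ''diffusers''. The following minimal Python sketch illustrates the text-in, video-out interface described above; the specific checkpoint name and parameters are illustrative assumptions rather than details drawn from this article.

<syntaxhighlight lang="python">
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load an open text-to-video diffusion checkpoint (illustrative choice).
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe = pipe.to("cuda")

# The natural-language prompt conditions the video diffusion process.
prompt = "A panda eating bamboo on a rock"
frames = pipe(prompt, num_inference_steps=25).frames[0]

# Write the generated frames out as a video file.
video_path = export_to_video(frames)
</syntaxhighlight>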