[[File:OpenAI Sora in Action- Tokyo Walk.webm|thumb|upright=1.35|A video generated using OpenAI's [[Sora (text-to-video model)|Sora]] text-to-video model, using the prompt: <code>A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.</code>]]
A '''text-to-video model''' is a [[machine learning model]] that uses a [[natural language]] description as input to produce a [[video]] relevant to the input text.<ref name="AIIR">{{cite report|url=https://aiindex.stanford.edu/wp-content/uploads/2023/04/HAI_AI-Index-Report_2023.pdf|title=Artificial Intelligence Index Report 2023|publisher=Stanford Institute for Human-Centered Artificial Intelligence|page=98|quote=Multiple high quality text-to-video models, AI systems that can generate video clips from prompted text, were released in 2022.}}</ref> Advancements during the 2020s in the generation of high-quality, text-conditioned videos have largely been driven by the development of video [[diffusion model]]s.<ref>{{cite arXiv |last1=Melnik |first1=Andrew |title=Video Diffusion Models: A Survey |date=2024-05-06 |eprint =2405.03150 |last2=Ljubljanac |first2=Michal |last3=Lu |first3=Cong |last4=Yan |first4=Qi |last5=Ren |first5=Weiming |last6=Ritter |first6=Helge|class=cs.CV }}</ref>
== Models ==