Text-to-video model

In January 2024, [[Google]] announced development of a text-to-video model named Lumiere, which is anticipated to integrate advanced video editing capabilities.<ref>{{Cite web |last=Yirka |first=Bob |date=2024-01-26 |title=Google announces the development of Lumiere, an AI-based next-generation text-to-video generator. |url=https://techxplore.com/news/2024-01-google-lumiere-ai-based-generation.html |access-date=2024-11-18 |website=Tech Xplore}}</ref> [[Matthias Niessner]] and [[Lourdes Agapito]] at AI company [[Synthesia (company)|Synthesia]] work on developing 3D neural rendering techniques that can synthesise realistic video by using 2D and 3D neural representations of shape, appearance, and motion for controllable video synthesis of avatars.<ref>{{Cite web |title=Text to Speech for Videos |url=https://www.synthesia.io/text-to-speech |access-date=2023-10-17 |website=Synthesia.io}}</ref> In June 2024, Luma Labs launched its [[Dream Machine (text-to-video model)|Dream Machine]] video tool.<ref>{{Cite web |last=Nuñez |first=Michael |date=2024-06-12 |title=Luma AI debuts 'Dream Machine' for realistic video generation, heating up AI media race |url=https://venturebeat.com/ai/luma-ai-debuts-dream-machine-for-realistic-video-generation-heating-up-ai-media-race/ |access-date=2024-11-18 |website=VentureBeat |language=en-US}}</ref><ref>{{Cite web |last=Fink |first=Charlie |title=Apple Debuts Intelligence, Mistral Raises $600 Million, New AI Text-To-Video |url=https://www.forbes.com/sites/charliefink/2024/06/13/apple-debuts-intelligence-mistral-raises-600-million-new-ai-text-to-video/ |access-date=2024-11-18 |website=Forbes |language=en}}</ref> That same month, [[Kuaishou]] extended its Kling AI text-to-video model to international users.<ref>{{Cite web |last=Franzen |first=Carl |date=2024-06-12 |title=What you need to know about Kling, the AI video generator rival to Sora that's wowing creators |url=https://venturebeat.com/ai/what-you-need-to-know-about-kling-the-ai-video-generator-rival-to-sora-thats-wowing-creators/ |access-date=2024-11-18 |website=VentureBeat |language=en-US}}</ref> In July 2024, [[TikTok]] owner [[ByteDance]] released Jimeng AI in China through its subsidiary Faceu Technology.<ref>{{Cite web |date=2024-08-06 |title=ByteDance joins OpenAI's Sora rivals with AI video app launch |url=https://www.reuters.com/technology/artificial-intelligence/bytedance-joins-openais-sora-rivals-with-ai-video-app-launch-2024-08-06/ |access-date=2024-11-18 |publisher=[[Reuters]]}}</ref> By September 2024, the Chinese AI company [[MiniMax (company)|MiniMax]] had debuted its video-01 model, joining other established Chinese AI model developers such as [[Zhipu AI]], [[Baichuan]], and [[Moonshot AI]] in the country's growing generative AI sector.<ref>{{Cite web |date=2024-09-02 |title=Chinese AI "tiger" MiniMax launches text-to-video-generating model to rival OpenAI's Sora |url=https://finance.yahoo.com/news/chinese-ai-tiger-minimax-launches-093000322.html |access-date=2024-11-18 |website=Yahoo! Finance}}</ref>
 
Alternative approaches to text-to-video models include<ref>{{Citation |title=Text2Video-Zero |date=2023-08-12 |url=https://github.com/Picsart-AI-Research/Text2Video-Zero |access-date=2023-08-12 |publisher=Picsart AI Research (PAIR)}}</ref> Google's Phenaki, Hour One, [[Colossyan]],<ref name=":5" /> [[Runway (company)|Runway]]'s Gen-3 Alpha,<ref>{{Cite web |last=Kemper |first=Jonathan |date=2024-07-01 |title=Runway's Sora competitor Gen-3 Alpha now available |url=https://the-decoder.com/runways-sora-competitor-gen-3-alpha-now-available/ |access-date=2024-11-18 |website=THE DECODER |language=en-US}}</ref><ref>{{Cite news |date=2023-03-20 |title=Generative AI's Next Frontier Is Video |url=https://www.bloomberg.com/news/articles/2023-03-20/generative-ai-s-next-frontier-is-video |access-date=2024-11-18 |work=Bloomberg.com |language=en}}</ref> and OpenAI's [[Sora (text-to-video model)|Sora]].<ref>{{Cite web |date=2024-02-15 |title=OpenAI teases 'Sora,' its new text-to-video AI model |url=https://www.nbcnews.com/tech/tech-news/openai-sora-video-artificial-intelligence-unveiled-rcna139065 |access-date=2024-11-18 |website=NBC News |language=en}}</ref><ref>{{Cite web |last=Kelly |first=Chris |date=2024-06-25 |title=Toys R Us creates first brand film to use OpenAI's text-to-video tool |url=https://www.marketingdive.com/news/toys-r-us-openai-sora-gen-ai-first-text-video/719797/ |access-date=2024-11-18 |website=Marketing Dive |publisher=[[Informa]] |language=en-US}}</ref> Several additional text-to-video models, such as Plug-and-Play, Text2LIVE, and Tune-A-Video, have emerged.<ref>{{Cite book |last1=Jin |first1=Jiayao |last2=Wu |first2=Jianhang |last3=Xu |first3=Zhoucheng |last4=Zhang |first4=Hang |last5=Wang |first5=Yaxin |last6=Yang |first6=Jielong |chapter=Text to Video: Enhancing Video Generation Using Diffusion Models and Reconstruction Network |date=2023-08-04 |title=2023 2nd International Conference on Computing, Communication, Perception and Quantum Technology (CCPQT) |chapter-url=https://ieeexplore.ieee.org/document/10336607 |publisher=IEEE |pages=108–114 |doi=10.1109/CCPQT60491.2023.00024 |isbn=979-8-3503-4269-7}}</ref> [[FLUX.1]] developer Black Forest Labs has announced plans for a state-of-the-art text-to-video model.<ref>{{Cite web |date=2024-08-01 |title=Announcing Black Forest Labs |url=https://blackforestlabs.ai/announcing-black-forest-labs/ |access-date=2024-11-18 |website=Black Forest Labs |language=en-US}}</ref> In September 2024, [[Google]] announced plans to bring its video generation model [[Veo (text-to-video model)|Veo]] to [[YouTube Shorts]] in 2025.<ref>{{Cite web |last=Forlini |first=Emily Dreibelbis |date=2024-09-18 |title=Google's Veo text-to-video AI generator is coming to YouTube Shorts |url=https://www.pcmag.com/news/googles-veo-text-to-video-ai-generator-is-coming-to-youtube-shorts |access-date=2024-11-18 |website=[[PC Magazine]]}}</ref> In May 2025, Google launched Veo 3, the third iteration of the model, which was noted for its audio generation capabilities, a feature earlier text-to-video models had lacked.<ref>{{Cite web |last1=Elias |first1=Jennifer |last2=Subin |first2=Samantha |date=2025-05-20 |title=Google launches Veo 3, an AI video generator that incorporates audio |url=https://www.cnbc.com/2025/05/20/google-ai-video-generator-audio-veo-3.html |access-date=2025-05-22 |website=CNBC |language=en}}</ref>
 
== Architecture and training ==