Multimodal machine learning has numerous applications across various domains:
=== Cross-modal retrieval ===
Cross-modal retrieval allows users to search for data across different modalities (e.g., retrieving images based on text descriptions), improving multimedia search engines and content recommendation systems. Models like [[Contrastive Language-Image Pre-training|CLIP]] facilitate efficient, accurate retrieval by embedding data in a shared space, demonstrating strong performance even in zero-shot settings.<ref>{{Cite arXiv |last1=Hendriksen |first1=Mariya |last2=Vakulenko |first2=Svitlana |last3=Kuiper |first3=Ernst |last4=de Rijke |first4=Maarten |date=2023 |title=Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study |class=cs.CV |eprint=2301.05174}}</ref>
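
The shared-space idea can be illustrated with a minimal sketch: once text and images are embedded in the same vector space, retrieval reduces to ranking images by their similarity to the query vector. The embeddings below are hand-made toy values, not outputs of CLIP or any real model, and the names <code>cosine</code>, <code>retrieve</code>, and the file names are hypothetical, chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query_embedding, image_embeddings, k=2):
    """Return the names of the k images most similar to the query."""
    scored = [
        (cosine(query_embedding, emb), name)
        for name, emb in image_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Toy 3-dimensional embeddings (assumed values; a real encoder would
# produce vectors with hundreds of dimensions).
image_embeddings = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Assumed embedding of a text query such as "a photo of a dog".
text_query = [0.8, 0.2, 0.1]

top = retrieve(text_query, image_embeddings, k=2)
print(top)
```

Because the query vector is compared against every stored image vector, this brute-force ranking is only practical for small collections; larger systems typically rely on approximate nearest-neighbour indexes.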