Important
On 30 June 2025, Azure AI Vision Video Retrieval will be retired. This retirement is part of our ongoing effort to improve and simplify the features offered for video processing. Migrate to Azure AI Content Understanding and Azure AI Search to benefit from their additional capabilities.
Video Processing: Video Retrieval vs. Azure AI Content Understanding
Feature | Video Retrieval for video description | Azure AI Content Understanding |
---|---|---|
Video Length Supported | Optimized for short videos, up to ~3 minutes | Supports short & long videos, up to 4 hours |
Frame Processing | Up to 20 frames | Batch processing, with frames sampled shot by shot across the entire video |
Content Extraction Pre-Processing | Transcription | Transcription, Shot identification, Face grouping |
Structured Output Support | Not supported | Supports schema-conforming structured outputs |
Data types | Video supported | Video, images, documents, and speech supported |
Pricing | Variable Token-based | Fixed cost per minute of video processed |
To migrate to Content Understanding for video summaries and descriptions, we recommend reviewing the Azure AI Content Understanding documentation.
Video Search: Video Retrieval vs. Azure AI Search and Content Understanding
Feature | Video Retrieval for video search | Azure AI Search and Content Understanding |
---|---|---|
Visual Embedding type | Frame-based Image Embeddings | Video description text embeddings |
Content Extraction Pre-Processing | Transcription, OCR | Transcription, Shot identification, Face grouping |
People & Object search support | Strong support | Strong support |
Action and Event support | Limited | Strong support |
Customization | None | The Content Understanding analyzer can be customized, using fields and field descriptions, to focus on specific content |
To start building the search use case with Content Understanding, we recommend starting with this sample, which shows how to use Azure AI Search to search videos.
To avoid service disruptions, migrate by 30 June 2025.
Video Retrieval is a service that lets you create a search index, add documents (videos and images) to it, and search with natural language. Developers can define metadata schemas for each index and ingest metadata to the service to help with retrieval. Developers can also specify what features to extract from the index (vision, speech) and filter their search based on features.
Input requirements
Supported formats
File format | Description |
---|---|
asf | ASF (Advanced / Active Streaming Format) |
avi | AVI (Audio Video Interleaved) |
flv | FLV (Flash Video) |
matroskamm, webm | Matroska / WebM |
mov, mp4, m4a, 3gp, 3g2, mj2 | QuickTime / MOV |
Supported video codecs
Codec | Format |
---|---|
h264 | H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 |
h265 | H.265 / HEVC |
libvpx-vp9 | libvpx VP9 (codec vp9) |
mpeg4 | MPEG-4 part 2 |
Supported audio codecs
Codec | Format |
---|---|
aac | AAC (Advanced Audio Coding) |
mp3 | MP3 (MPEG audio layer 3) |
pcm | PCM (uncompressed) |
vorbis | Vorbis |
wmav2 | Windows Media Audio 2 |
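The input requirements above can be checked locally before you ingest a file. The following Python sketch is illustrative only and is not part of the Video Retrieval API: `MediaInfo` and `find_unsupported` are hypothetical names, and the sketch assumes you have already obtained the container and codec names from a media inspection tool of your choice. Such tools may report different names for the same codec (for example, `hevc` rather than `h265`), so you might need to normalize names before comparing.

```python
from dataclasses import dataclass
from typing import List, Optional

# Names copied from the "Supported formats" and codec tables above.
SUPPORTED_FORMATS = {
    "asf", "avi", "flv", "matroskamm", "webm",
    "mov", "mp4", "m4a", "3gp", "3g2", "mj2",
}
SUPPORTED_VIDEO_CODECS = {"h264", "h265", "libvpx-vp9", "mpeg4"}
SUPPORTED_AUDIO_CODECS = {"aac", "mp3", "pcm", "vorbis", "wmav2"}


@dataclass
class MediaInfo:
    """Hypothetical container for the properties of one input file."""
    container: str
    video_codec: str
    audio_codec: Optional[str] = None  # None means the file has no audio track


def find_unsupported(info: MediaInfo) -> List[str]:
    """Return one message per property that is not listed in the tables."""
    problems = []
    if info.container.lower() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported file format: {info.container}")
    if info.video_codec.lower() not in SUPPORTED_VIDEO_CODECS:
        problems.append(f"unsupported video codec: {info.video_codec}")
    if info.audio_codec is not None and info.audio_codec.lower() not in SUPPORTED_AUDIO_CODECS:
        problems.append(f"unsupported audio codec: {info.audio_codec}")
    return problems


if __name__ == "__main__":
    # An MP4 file with H.264 video and AAC audio matches every table.
    print(find_unsupported(MediaInfo("mp4", "h264", "aac")))
    # None of these three names appears in the tables.
    print(find_unsupported(MediaInfo("ogg", "theora", "opus")))
```

An empty list means every property matched an entry in the tables. It does not guarantee that the service accepts the file, because the tables don't cover other constraints such as video length.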