How to generate image embeddings with Azure AI Foundry Models

2025-05-20

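Important

Items marked "(preview)" in this article are currently in public preview. This preview is provided without a service-level agreement, and it isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

This article explains how to use the image embeddings API with Azure AI Foundry Models.

Prerequisites

To use embedding models in your application, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. In that case, read Upgrade from GitHub Models to Azure AI Foundry Models.
An Azure AI Foundry resource (formerly known as Azure AI Services). For more information, see Create an Azure AI Foundry resource.
The endpoint URL and key.

Install the Azure AI inference package for Python with the following command: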
```
pip install -U azure-ai-inference
```

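An image embeddings model deployment. If you don't have one, read Add and configure Foundry Models to add an embeddings model to your resource.
- This example uses Cohere-embed-v3-english from Cohere.

Use image embeddings

First, create the client to consume the model. The following code reads the key from the AZURE_INFERENCE_CREDENTIAL environment variable; replace <resource> in the endpoint URL with the name of your resource.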

import os
from azure.ai.inference import ImageEmbeddingsClient
from azure.core.credentials import AzureKeyCredential

client = ImageEmbeddingsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
    model="Cohere-embed-v3-english"
)
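The endpoint can be kept out of the source code in the same way as the key. The following is a minimal sketch that uses only the standard library. The variable name AZURE_INFERENCE_ENDPOINT and the helper load_inference_settings are assumptions made for illustration; only AZURE_INFERENCE_CREDENTIAL appears in the snippet above.

```python
import os


def load_inference_settings(env=os.environ):
    """Return (endpoint, key) read from environment variables.

    AZURE_INFERENCE_ENDPOINT is an assumed variable name chosen for this
    sketch; AZURE_INFERENCE_CREDENTIAL is the one used in this article.
    """
    names = ("AZURE_INFERENCE_ENDPOINT", "AZURE_INFERENCE_CREDENTIAL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a KeyError later on.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Drop a trailing slash so routes can be appended consistently.
    endpoint = env["AZURE_INFERENCE_ENDPOINT"].rstrip("/")
    return endpoint, env["AZURE_INFERENCE_CREDENTIAL"]
```

The two returned values can then be passed as the endpoint argument and wrapped in AzureKeyCredential when constructing the client.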

If you configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client.

import os
from azure.ai.inference import ImageEmbeddingsClient
from azure.identity import DefaultAzureCredential

client = ImageEmbeddingsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=DefaultAzureCredential(),
    model="Cohere-embed-v3-english"
)

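Create embeddings

To create image embeddings, you need to pass the image data as part of your request. Image data should be in PNG format and encoded as base64.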

from azure.ai.inference.models import ImageEmbeddingInput

image_input= ImageEmbeddingInput.load(image_file="sample1.png", image_format="png")
response = client.embed(
    input=[ image_input ],
)
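To make the "PNG encoded as base64" requirement concrete, the following sketch builds a data URL of the form data:image/png;base64,... from raw bytes, which is the shape used in the REST examples of this article. The helper name to_data_url is hypothetical, and whether ImageEmbeddingInput.load produces exactly this string internally is an assumption.

```python
import base64


def to_data_url(image_bytes: bytes, image_format: str = "png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{image_format};base64,{encoded}"


# The eight-byte signature that starts every PNG file, used here as
# stand-in data so the sketch runs without an image on disk.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature))
```

For a real image, read the file in binary mode (for example with open("sample1.png", "rb")) and pass the bytes to the helper.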

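Tip

When creating a request, take into account the token input limit for the model. If you need to embed larger portions of text, you need a chunking strategy.

The response is as follows, where you can see the model's usage statistics: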

import numpy as np

for embed in response.data:
    print("Embedding of size:", np.asarray(embed.embedding).shape)

print("Model:", response.model)
print("Usage:", response.usage)

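Important

Computing embeddings in batches might not be supported for all models. For example, for the Cohere-embed-v3-english model, you need to send one image at a time.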
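When a model accepts only one image per request, several images can be embedded by issuing one call per image. The following sketch shows that loop in isolation. The helper embed_one_at_a_time and its embed_single parameter are hypothetical names, and the callable stands in for whatever performs a single request.

```python
from typing import Callable, Iterable, List, Sequence


def embed_one_at_a_time(
    items: Iterable[object],
    embed_single: Callable[[object], Sequence[float]],
) -> List[List[float]]:
    """Call embed_single once per item and collect the vectors in order."""
    vectors: List[List[float]] = []
    for item in items:
        # One request per item, because the model rejects batched input.
        vectors.append(list(embed_single(item)))
    return vectors
```

With the client created earlier, embed_single could be a function such as lambda image: client.embed(input=[image]).data[0].embedding.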

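Embed images and text pairs

Some models can generate embeddings from images and text pairs. In this case, you can use the image and text fields in the request to pass the image and text to the model. The following example shows how to create embeddings for images and text pairs: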

text_image_input= ImageEmbeddingInput.load(image_file="sample1.png", image_format="png")
text_image_input.text = "A cute baby sea otter"
response = client.embed(
    input=[ text_image_input ],
)

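Create different types of embeddings

Some models can generate multiple embeddings for the same input depending on how you plan to use them. This capability allows you to retrieve more accurate embeddings for RAG patterns.

The following example shows how to create embeddings for a document that will be stored in a vector database: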

from azure.ai.inference.models import EmbeddingInputType

response = client.embed(
    input=[ image_input ],
    input_type=EmbeddingInputType.DOCUMENT,
)

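When you work on a query to retrieve such a document, you can use the following code snippet to create the embeddings for the query and maximize retrieval performance.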

from azure.ai.inference.models import EmbeddingInputType

response = client.embed(
    input=[ image_input ],
    input_type=EmbeddingInputType.QUERY,
)
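Once document and query embeddings exist, retrieval usually means ranking the stored document vectors by their similarity to the query vector. The following sketch uses cosine similarity, which is a common choice but an assumption here, since this article doesn't prescribe a metric. The function names and the in-memory dictionary that stands in for a vector database are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # A zero vector has no direction, so treat it as unrelated.
    return dot / (norm_a * norm_b)


def rank_documents(
    query: Sequence[float],
    documents: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (document id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this setup the dictionary values would come from requests made with the document input type, and the query vector from a request made with the query input type.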

Notice that not all embedding models support indicating the input type in the request; in those cases, a 422 error is returned.
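One way to cope with models that reject the input type is to retry the request without it. The following is a sketch of that fallback, not a recommended pattern from the service documentation. It assumes the raised exception exposes a status_code attribute, as azure.core.exceptions.HttpResponseError does; the helper name embed_with_optional_input_type is hypothetical.

```python
from typing import Any, Callable, Optional


def embed_with_optional_input_type(
    embed: Callable[..., Any],
    image_input: Any,
    input_type: Optional[Any] = None,
) -> Any:
    """Call embed with input_type, retrying without it on a 422 error."""
    if input_type is None:
        return embed(input=[image_input])
    try:
        return embed(input=[image_input], input_type=input_type)
    except Exception as error:
        # Assumption: the error carries the HTTP status in `status_code`.
        if getattr(error, "status_code", None) == 422:
            return embed(input=[image_input])
        raise
```

With the client from this article, the first argument would be client.embed. Note that embeddings created without an input type might not be comparable to ones created with it, so it is worth recording which path was taken.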

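Prerequisites

To use embedding models in your application, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. In that case, read Upgrade from GitHub Models to Azure AI Foundry Models.
An Azure AI Foundry resource (formerly known as Azure AI Services). For more information, see Create an Azure AI Foundry resource.
The endpoint URL and key.

Install the Azure Inference library for JavaScript with the following commands: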

npm install @azure-rest/ai-inference
npm install @azure/core-auth
npm install @azure/identity

If you use Node.js, you can configure the dependencies in package.json:

package.json

{
  "name": "main_app",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "type": "module",
  "dependencies": {
    "@azure-rest/ai-inference": "1.0.0-beta.6",
    "@azure/core-auth": "1.9.0",
    "@azure/core-sse": "2.2.0",
    "@azure/identity": "4.8.0"
  }
}

Import the following:

import fs from "node:fs"; // Needed by the examples that read the image file.
import ModelClient from "@azure-rest/ai-inference";
import { isUnexpected } from "@azure-rest/ai-inference";
import { createSseStream } from "@azure/core-sse";
import { AzureKeyCredential } from "@azure/core-auth";
import { DefaultAzureCredential } from "@azure/identity";

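An image embeddings model deployment. If you don't have one, read Add and configure Foundry Models to add an embeddings model to your resource.
- This example uses Cohere-embed-v3-english from Cohere.

Use image embeddings

First, create the client to consume the model. The following code reads the key from the AZURE_INFERENCE_CREDENTIAL environment variable; replace <resource> in the endpoint URL with the name of your resource.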

const client = ModelClient(
    "https://<resource>.services.ai.azure.com/models", 
    new AzureKeyCredential(process.env.AZURE_INFERENCE_CREDENTIAL)
);

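If you configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client.

// Request tokens for the Cognitive Services scope when using Microsoft Entra ID.
const clientOptions = { credentials: { scopes: ["https://cognitiveservices.azure.com/.default"] } };

const client = ModelClient(
    "https://<resource>.services.ai.azure.com/models",
    new DefaultAzureCredential(),
    clientOptions
);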

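Create embeddings

To create image embeddings, you need to pass the image data as part of your request. Image data should be in PNG format and encoded as base64.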

var image_path = "sample1.png";
var image_data = fs.readFileSync(image_path);
var image_data_base64 = Buffer.from(image_data).toString("base64");

var response = await client.path("/images/embeddings").post({
    body: {
        input: [ { image: image_data_base64 } ],
        model: "Cohere-embed-v3-english",
    }
});

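Tip

When creating a request, take into account the token input limit for the model. If you need to embed larger portions of text, you need a chunking strategy.

The response is as follows, where you can see the model's usage statistics:

if (isUnexpected(response)) {
    throw response.body.error;
}

// The embeddings are returned in the `data` array of the response body.
console.log(response.body.data);
console.log(response.body.model);
console.log(response.body.usage);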

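Important

Computing embeddings in batches might not be supported for all models. For example, for the Cohere-embed-v3-english model, you need to send one image at a time.

Embed images and text pairs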

var image_path = "sample1.png";
var image_data = fs.readFileSync(image_path);
var image_data_base64 = Buffer.from(image_data).toString("base64");

var response = await client.path("/images/embeddings").post({
    body: {
        input: [
            {
                text: "A cute baby sea otter",
                image: image_data_base64
            }
        ],
        model: "Cohere-embed-v3-english",
    }
});

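Create different types of embeddings

Some models can generate multiple embeddings for the same input depending on how you plan to use them. This capability allows you to retrieve more accurate embeddings for RAG patterns.

The following example shows how to create embeddings for a document that will be stored in a vector database: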

var response = await client.path("/images/embeddings").post({
    body: {
        input: [ { image: image_data_base64 } ],
        input_type: "document",
        model: "Cohere-embed-v3-english",
    }
});

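When you work on a query to retrieve such a document, you can use the following code snippet to create the embeddings for the query and maximize retrieval performance.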

var response = await client.path("/images/embeddings").post({
    body: {
        input: [ { image: image_data_base64 } ],
        input_type: "query",
        model: "Cohere-embed-v3-english",
    }
});

Notice that not all embedding models support indicating the input type in the request; in those cases, a 422 error is returned.

Note

Using image embeddings is only supported with Python, JavaScript, C#, or REST requests.

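Prerequisites

To use embedding models in your application, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. In that case, read Upgrade from GitHub Models to Azure AI Foundry Models.
An Azure AI Foundry resource (formerly known as Azure AI Services). For more information, see Create an Azure AI Foundry resource.
The endpoint URL and key.

Install the Azure AI inference package with the following command:

dotnet add package Azure.AI.Inference --prerelease

If you use Microsoft Entra ID, you also need the following package: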
```
dotnet add package Azure.Identity
```

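An image embeddings model deployment. If you don't have one, read Add and configure Foundry Models to add an embeddings model to your resource.
- This example uses Cohere-embed-v3-english from Cohere.

Use image embeddings

First, create the client to consume the model. The following code reads the key from the AZURE_INFERENCE_CREDENTIAL environment variable; replace <resource> in the endpoint URL with the name of your resource.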

ImageEmbeddingsClient client = new ImageEmbeddingsClient(
    new Uri("https://<resource>.services.ai.azure.com/models"),
    new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_INFERENCE_CREDENTIAL"))
);

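If you configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client. Note that includeInteractiveCredentials is set to true only for demonstration purposes, so that authentication can happen through the web browser. For production workloads, remove this parameter.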

TokenCredential credential = new DefaultAzureCredential(includeInteractiveCredentials: true);
AzureAIInferenceClientOptions clientOptions = new AzureAIInferenceClientOptions();
BearerTokenAuthenticationPolicy tokenPolicy = new BearerTokenAuthenticationPolicy(credential, new string[] { "https://cognitiveservices.azure.com/.default" });

clientOptions.AddPolicy(tokenPolicy, HttpPipelinePosition.PerRetry);

ImageEmbeddingsClient client = new ImageEmbeddingsClient(
    new Uri("https://<resource>.services.ai.azure.com/models"),
    credential,
    clientOptions
);

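Create embeddings

To create image embeddings, you need to pass the image data as part of your request. Image data should be in PNG format and encoded as base64.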

List<ImageEmbeddingInput> input = new List<ImageEmbeddingInput>
{
    ImageEmbeddingInput.Load(imageFilePath:"sampleImage.png", imageFormat:"png")
};

var requestOptions = new ImageEmbeddingsOptions()
{
    Input = input,
    Model = "Cohere-embed-v3-english"
};

Response<EmbeddingsResult> response = client.Embed(requestOptions);

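Tip

When creating a request, take into account the token input limit for the model. If you need to embed larger portions of text, you need a chunking strategy.

The response is as follows, where you can see the model's usage statistics: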

foreach (EmbeddingItem item in response.Value.Data)
{
    List<float> embedding = item.Embedding.ToObjectFromJson<List<float>>();
    Console.WriteLine($"Index: {item.Index}, Embedding: <{string.Join(", ", embedding)}>");
}

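Important

Computing embeddings in batches might not be supported for all models. For example, for the Cohere-embed-v3-english model, you need to send one image at a time.

Embed images and text pairs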

var image_input = ImageEmbeddingInput.Load(imageFilePath:"sampleImage.png", imageFormat:"png");
// Attach the text that is paired with the image.
image_input.Text = "A cute baby sea otter";

var requestOptions = new ImageEmbeddingsOptions()
{
    Input = new List<ImageEmbeddingInput>
    {
        image_input
    },
    Model = "Cohere-embed-v3-english"
};

Response<EmbeddingsResult> response = client.Embed(requestOptions);

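Create different types of embeddings

Some models can generate multiple embeddings for the same input depending on how you plan to use them. This capability allows you to retrieve more accurate embeddings for RAG patterns.

The following example shows how to create embeddings for a document that will be stored in a vector database: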

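// Image inputs use ImageEmbeddingsOptions, with the inputs passed as a list.
var requestOptions = new ImageEmbeddingsOptions()
{
    Input = new List<ImageEmbeddingInput> { image_input },
    InputType = EmbeddingInputType.Document,
    Model = "Cohere-embed-v3-english"
};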

Response<EmbeddingsResult> response = client.Embed(requestOptions);

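When you work on a query to retrieve such a document, you can use the following code snippet to create the embeddings for the query and maximize retrieval performance.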

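// Image inputs use ImageEmbeddingsOptions, with the inputs passed as a list.
var requestOptions = new ImageEmbeddingsOptions()
{
    Input = new List<ImageEmbeddingInput> { image_input },
    InputType = EmbeddingInputType.Query,
    Model = "Cohere-embed-v3-english"
};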

Response<EmbeddingsResult> response = client.Embed(requestOptions);

Notice that not all embedding models support indicating the input type in the request; in those cases, a 422 error is returned.

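Prerequisites

To use embedding models in your application, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. In that case, read Upgrade from GitHub Models to Azure AI Foundry Models.
An Azure AI Foundry resource (formerly known as Azure AI Services). For more information, see Create an Azure AI Foundry resource.
The endpoint URL and key.

An image embeddings model deployment. If you don't have one, read Add and configure Foundry Models to add an embeddings model to your resource.
- This example uses Cohere-embed-v3-english from Cohere.

Use image embeddings

To use image embeddings, use the route /images/embeddings appended to your base URL, along with your credential indicated in the api-key header. The Authorization header is also supported with the format Bearer <key>.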

POST https://<resource>.services.ai.azure.com/models/images/embeddings?api-version=2024-05-01-preview
Content-Type: application/json
api-key: <key>

If you configured the resource with Microsoft Entra ID support, pass your token in the Authorization header with the format Bearer <token>. Use the scope https://cognitiveservices.azure.com/.default.

POST https://<resource>.services.ai.azure.com/models/images/embeddings?api-version=2024-05-01-preview
Content-Type: application/json
Authorization: Bearer <token>

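Using Microsoft Entra ID might require extra configuration in your resource to grant access. Learn how to configure keyless authentication with Microsoft Entra ID.

Create embeddings

To create image embeddings, you need to pass the image data as part of your request. Image data should be in PNG format and encoded as base64.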

{
    "model": "Cohere-embed-v3-english",
    "input": [
        {
            "image": "data:image/png;base64,iVBORw0KGgoAAAANSUh..."
        }
    ]
}
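The request line, headers, and body shown above can be assembled with Python's standard library. The following sketch only builds the request object and doesn't send it. The helper name build_embeddings_request and the example.services.ai.azure.com host are placeholders chosen for illustration; the route, API version, and api-key header are taken from this article.

```python
import json
import urllib.request


def build_embeddings_request(endpoint: str, key: str, body: dict) -> urllib.request.Request:
    """Build (but don't send) a POST request for the image embeddings route."""
    url = endpoint.rstrip("/") + "/images/embeddings?api-version=2024-05-01-preview"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": key},
        method="POST",
    )


request = build_embeddings_request(
    "https://example.services.ai.azure.com/models",  # placeholder endpoint
    "<key>",  # placeholder key
    {
        "model": "Cohere-embed-v3-english",
        "input": [{"image": "data:image/png;base64,iVBORw0KGgo="}],
    },
)
print(request.get_method(), request.full_url)
```

Passing the object to urllib.request.urlopen would send the request. For Microsoft Entra ID, the api-key header would be replaced by an Authorization header that carries the bearer token.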

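Tip

When creating a request, take into account the token input limit for the model. If you need to embed larger portions of text, you need a chunking strategy.

The response is as follows, where you can see the model's usage statistics: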

{
    "id": "0ab1234c-d5e6-7fgh-i890-j1234k123456",
    "object": "list",
    "data": [
        {
            "index": 0,
            "object": "embedding",
            "embedding": [
                0.017196655,
                // ...
                -0.000687122,
                -0.025054932,
                -0.015777588
            ]
        }
    ],
    "model": "Cohere-embed-v3-english",
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 0,
        "total_tokens": 9
    }
}
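A response of this shape can be read with any JSON parser. The following sketch extracts the model name, the length of each embedding, and the token usage. The sample string is a shortened, hypothetical copy of the response above with only four embedding values, so the dimension it reports isn't the model's real embedding size; the helper name summarize_response is also hypothetical.

```python
import json

# Shortened stand-in for the response shown above (four values only).
sample_response = """
{
    "object": "list",
    "data": [
        {
            "index": 0,
            "object": "embedding",
            "embedding": [0.017196655, -0.000687122, -0.025054932, -0.015777588]
        }
    ],
    "model": "Cohere-embed-v3-english",
    "usage": {"prompt_tokens": 9, "completion_tokens": 0, "total_tokens": 9}
}
"""


def summarize_response(response_text: str) -> dict:
    """Return the model name, per-item embedding lengths, and total tokens."""
    payload = json.loads(response_text)
    # Sort by index so the lengths line up with the order of the inputs.
    items = sorted(payload["data"], key=lambda item: item["index"])
    return {
        "model": payload["model"],
        "dimensions": [len(item["embedding"]) for item in items],
        "total_tokens": payload["usage"]["total_tokens"],
    }


print(summarize_response(sample_response))
```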

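Important

Computing embeddings in batches might not be supported for all models. For example, for the Cohere-embed-v3-english model, you need to send one image at a time.

Embed images and text pairs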

{
    "model": "Cohere-embed-v3-english",
    "input": [
        {
            "image": "data:image/png;base64,iVBORw0KGgoAAAANSUh...",
            "text": "A photo of a cat"
        }
    ]
}

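Create different types of embeddings

Some models can generate multiple embeddings for the same input depending on how you plan to use them. This capability allows you to retrieve more accurate embeddings for RAG patterns.

The following example shows how to create embeddings for a document that will be stored in a vector database: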

{
    "model": "Cohere-embed-v3-english",
    "input": [
        {
            "image": "data:image/png;base64,iVBORw0KGgoAAAANSUh..."
        }
    ],
    "input_type": "document"
}

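When you work on a query to retrieve such a document, you can use the following code snippet to create the embeddings for the query and maximize retrieval performance.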

{
    "model": "Cohere-embed-v3-english",
    "input": [
        {
            "image": "data:image/png;base64,iVBORw0KGgoAAAANSUh..."
        }
    ],
    "input_type": "query"
}

Notice that not all embedding models support indicating the input type in the request; in those cases, a 422 error is returned.
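The request bodies in this section differ only in the optional text and input_type fields, so a single function can produce all of them. The following is a sketch; the helper name build_payload is hypothetical, and the field names are the ones used in the JSON examples above.

```python
from typing import Optional


def build_payload(
    model: str,
    image_data_url: str,
    text: Optional[str] = None,
    input_type: Optional[str] = None,
) -> dict:
    """Build the JSON body for one image, with optional text and input type."""
    item = {"image": image_data_url}
    if text is not None:
        item["text"] = text  # Image and text pair.
    payload = {"model": model, "input": [item]}
    if input_type is not None:
        # Omit the field entirely for models that don't support it.
        payload["input_type"] = input_type
    return payload
```

For example, build_payload("Cohere-embed-v3-english", data_url, input_type="query") yields the body of the query request, where data_url holds the base64 data URL of the image.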
