Get started with AI imaging

2025-05-20

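The AI imaging features in Windows AI Foundry support the following capabilities:

Image Super Resolution: scale and sharpen images.
Image Description: generate text that describes an image.
Image Segmentation: identify objects within an image.
Object Erase: remove objects from an image.

For API details, see the API reference for AI imaging features.

For content moderation details, see Content safety with generative AI APIs.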

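Important

The following list shows the Windows AI features and the Windows App SDK release in which each is currently supported.

Version 1.8 Experimental (1.8.0-experimental1) - Object Erase, Phi Silica, LoRA fine-tuning for Phi Silica, Conversation Summarization (Text Intelligence)

Private preview - Semantic Search

Version 1.7.1 (1.7.250401001) - All other APIs

These APIs only function on Windows Insider Preview (WIP) devices that have received the May 7 update. On May 28-29, an optional update will be released to non-WIP devices, followed by the June 10 update. This update brings the AI models that the Windows AI APIs need in order to function. With these updates, an app cannot use the Windows AI APIs until it has been granted package identity at runtime.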

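What can I do with Image Super Resolution?

The Image Super Resolution APIs support image sharpening and scaling.

Scaling is limited to a maximum factor of 8x, because higher scale factors can introduce artifacts and compromise image accuracy. If either the final width or the final height is greater than 8x its original value, an exception is thrown.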
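Because exceeding the 8x limit throws, an app may want to validate the requested size before calling the model. The following is a minimal sketch of such a pre-check in standard C++. The helper name IsScaleWithinLimit and the constant kMaxScaleFactor are hypothetical and are not part of the Windows App SDK; only the 8x limit itself comes from the text above.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the Windows App SDK.
// The 8x maximum is the limit stated in this article.
constexpr std::int64_t kMaxScaleFactor = 8;

// Returns true when all dimensions are positive and neither target
// dimension exceeds 8x the corresponding source dimension.
bool IsScaleWithinLimit(int sourceWidth, int sourceHeight,
                        int targetWidth, int targetHeight)
{
    if (sourceWidth <= 0 || sourceHeight <= 0 ||
        targetWidth <= 0 || targetHeight <= 0)
    {
        return false;
    }

    // The multiplication happens in 64 bits, so large sizes cannot overflow.
    return targetWidth <= kMaxScaleFactor * sourceWidth &&
           targetHeight <= kMaxScaleFactor * sourceHeight;
}
```

For example, a 100 x 100 source can be scaled to 800 x 800, while a request for 801 x 800 would fail this check.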

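More details about the image scaler

The following example shows how to change the scale (targetWidth, targetHeight) of an existing software bitmap image (softwareBitmap) and improve its sharpness using an ImageScaler object. To improve sharpness without scaling the image, specify the existing image width and height.

Ensure the Image Super Resolution model is available by calling the ImageScaler.GetReadyState method and then waiting for the ImageScaler.EnsureReadyAsync method to return successfully.
Once the Image Super Resolution model is available, create an ImageScaler object to reference it.
Get a sharpened and scaled version of the existing image by passing the existing image and the desired width and height to the model using the ScaleSoftwareBitmap method.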

using Microsoft.Graphics.Imaging;
using Microsoft.Windows.Management.Deployment;
using Microsoft.Windows.AI;
using Windows.Graphics.Imaging;

if (ImageScaler.GetReadyState() == AIFeatureReadyState.EnsureNeeded) 
{
    var result = await ImageScaler.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}
ImageScaler imageScaler = await ImageScaler.CreateAsync();
SoftwareBitmap finalImage = imageScaler.ScaleSoftwareBitmap(softwareBitmap, targetWidth, targetHeight);

#include <winrt/Microsoft.Graphics.Imaging.h>
#include <winrt/Microsoft.Windows.AI.h>
#include <winrt/Windows.Foundation.h>
#include <winrt/Windows.Graphics.Imaging.h>

using namespace winrt::Microsoft::Graphics::Imaging;
using namespace winrt::Microsoft::Windows::AI;
using namespace winrt::Windows::Foundation; 
using namespace winrt::Windows::Graphics::Imaging; 

if (ImageScaler::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageScaler::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}
int targetWidth = 100;
int targetHeight = 100;
ImageScaler imageScaler = ImageScaler::CreateAsync().get();
Windows::Graphics::Imaging::SoftwareBitmap finalImage = 
    imageScaler.ScaleSoftwareBitmap(softwareBitmap, targetWidth, targetHeight);

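What can I do with Image Description?

Important

Image Description is currently unavailable in China.

The Image Description APIs generate various types of text descriptions for an image.

The following types of text descriptions are supported:

Accessibility - Provides a long description with details intended for users with accessibility needs.
Caption - Provides a short description suitable for an image caption. This is the default if no value is specified.
DetailedNarration - Provides a long description.
OfficeCharts - Provides a description suited for charts and diagrams.

Because these APIs use Machine Learning (ML) models, the generated text can occasionally describe the image incorrectly. We therefore do not recommend using these APIs for images in the following scenarios:

Where the image contains potentially sensitive content and an inaccurate description could be controversial, such as flags, maps, globes, cultural symbols, or religious symbols.
Where an accurate description is critical, such as for medical advice or diagnosis, legal content, or financial documents.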

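Get a text description from an image

The Image Description API takes an image, the desired text description type (optional), and the level of content moderation to apply (optional) to protect against harmful use.

The following example shows how to get a text description for an image.

Note

The image must be an ImageBuffer object, because SoftwareBitmap is not currently supported. This example shows how to convert a SoftwareBitmap to an ImageBuffer.

Ensure the Image Description model is available by calling the ImageDescriptionGenerator.GetReadyState method and then waiting for the ImageDescriptionGenerator.EnsureReadyAsync method to return successfully.
Once the Image Description model is available, create an ImageDescriptionGenerator object to reference it.
(Optional) Create a ContentFilterOptions object and specify your preferred values. If you choose to use the default values, you can pass in a null object.
Get the image description (LanguageModelResponse.Response) by calling the ImageDescriptionGenerator.DescribeAsync method with the original image, an enum value for the preferred description type (optional), and the ContentFilterOptions object (optional).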

using Microsoft.Graphics.Imaging;
using Microsoft.Windows.Management.Deployment;  
using Microsoft.Windows.AI;
using Microsoft.Windows.AI.ContentModeration;
using Windows.Storage;
using Windows.Storage.Streams;  
using Windows.Graphics.Imaging;

if (ImageDescriptionGenerator.GetReadyState() == AIFeatureReadyState.EnsureNeeded) 
{
    var result = await ImageDescriptionGenerator.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}

ImageDescriptionGenerator imageDescriptionGenerator = await ImageDescriptionGenerator.CreateAsync();

// Convert already available softwareBitmap to ImageBuffer.
ImageBuffer inputImage = ImageBuffer.CreateCopyFromBitmap(softwareBitmap);  

// Create content moderation thresholds object.
ContentFilterOptions filterOptions = new ContentFilterOptions();
filterOptions.PromptMinSeverityLevelToBlock.ViolentContentSeverity = SeverityLevel.Medium;
filterOptions.ResponseMinSeverityLevelToBlock.ViolentContentSeverity = SeverityLevel.Medium;

// Get text description.
LanguageModelResponse languageModelResponse = await imageDescriptionGenerator.DescribeAsync(inputImage, ImageDescriptionScenario.Caption, filterOptions);
string response = languageModelResponse.Response;

#include <winrt/Microsoft.Graphics.Imaging.h>
#include <winrt/Microsoft.Windows.AI.Imaging.h>
#include <winrt/Microsoft.Windows.AI.ContentSafety.h>
#include <winrt/Microsoft.Windows.AI.h>
#include <winrt/Windows.Foundation.h>
#include <winrt/Windows.Graphics.Imaging.h> 
#include <winrt/Windows.Storage.Streams.h>
#include <winrt/Windows.Storage.h>

using namespace winrt::Microsoft::Graphics::Imaging; 
using namespace winrt::Microsoft::Windows::AI;
using namespace winrt::Microsoft::Windows::AI::ContentSafety; 
using namespace winrt::Microsoft::Windows::AI::Imaging; 
using namespace winrt::Windows::Foundation; 
using namespace winrt::Windows::Graphics::Imaging;
using namespace winrt::Windows::Storage::Streams;
using namespace winrt::Windows::Storage;

if (ImageDescriptionGenerator::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageDescriptionGenerator::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}

ImageDescriptionGenerator imageDescriptionGenerator = 
    ImageDescriptionGenerator::CreateAsync().get();

// Convert already available softwareBitmap to ImageBuffer.
auto inputImage = ImageBuffer::CreateForSoftwareBitmap(softwareBitmap);

// Create content moderation thresholds object.

ContentFilterOptions contentFilter{};
contentFilter.PromptMaxAllowedSeverityLevel().Violent(SeverityLevel::Medium);
contentFilter.ResponseMaxAllowedSeverityLevel().Violent(SeverityLevel::Medium);

// Get text description.
auto response = imageDescriptionGenerator.DescribeAsync(inputImage, ImageDescriptionKind::BriefDescription, contentFilter).get();
winrt::hstring text = response.Description();

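What can I do with Image Segmentation?

Image Segmentation can be used to identify specific objects in an image. The model takes both an image and a "hints" object and returns a mask of the identified object.

Hints can be provided through any combination of the following:

Coordinates for points that belong to what you're identifying.
Coordinates for points that don't belong to what you're identifying.
A coordinate rectangle that encloses what you're identifying.

The more hints you provide, the more precise the model can be. Follow these hint guidelines to minimize inaccurate results or errors.

Avoid using multiple rectangles in a hint, because they can produce an inaccurate mask.
Avoid using exclude points alone, without include points or a rectangle.
Don't specify more than the supported maximum of 32 coordinates (1 for a point, 2 for a rectangle), because this returns an error.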
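The hint guidelines above can be checked before a hint is submitted. The following is a rough sketch in standard C++ under stated assumptions: the types Point, Rect, and HintInputs and the functions CountHintCoordinates and CheckHint are hypothetical stand-ins for illustration and are not SDK types. Only the rules themselves (32 coordinates, 1 per point, 2 per rectangle, no multiple rectangles, no exclude-only hints) come from this article.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the hint inputs; these are not SDK types.
struct Point { int x; int y; };
struct Rect { int x; int y; int width; int height; };

struct HintInputs
{
    std::vector<Rect> includeRects;
    std::vector<Point> includePoints;
    std::vector<Point> excludePoints;
};

// Maximum number of coordinates stated in this article.
constexpr std::size_t kMaxHintCoordinates = 32;

// A point counts as 1 coordinate and a rectangle counts as 2.
std::size_t CountHintCoordinates(const HintInputs& hint)
{
    return hint.includePoints.size() + hint.excludePoints.size() +
           2 * hint.includeRects.size();
}

enum class HintCheck { Ok, TooManyCoordinates, MultipleRects, ExcludeOnly };

// Applies the three guidelines in order and reports the first one violated.
HintCheck CheckHint(const HintInputs& hint)
{
    if (CountHintCoordinates(hint) > kMaxHintCoordinates)
    {
        return HintCheck::TooManyCoordinates;
    }
    if (hint.includeRects.size() > 1)
    {
        return HintCheck::MultipleRects;
    }
    if (!hint.excludePoints.empty() &&
        hint.includePoints.empty() && hint.includeRects.empty())
    {
        return HintCheck::ExcludeOnly;
    }
    return HintCheck::Ok;
}
```

Only the coordinate limit is described as producing an error. The other two results flag hints that are likely to give an inaccurate mask, so an app might treat them as warnings.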

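The returned mask is in greyscale-8 (Gray8) format, in which the pixels of the mask for the identified object have a value of 255 (all other pixels have a value of 0).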
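To illustrate how a mask in this format might be consumed, the sketch below scans a Gray8 pixel buffer and reports the bounding box of the pixels set to 255. It is plain C++ and makes several assumptions: the pixel data has already been copied out of the bitmap into a std::vector, rows are tightly packed (stride equals width), and the names MaskBounds and FindMaskBounds are hypothetical helpers, not SDK APIs.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type; all edges are inclusive pixel coordinates.
struct MaskBounds
{
    bool found;
    int left;
    int top;
    int right;
    int bottom;
};

// Scans a tightly packed Gray8 buffer (one byte per pixel, stride == width)
// and returns the bounding box of all pixels whose value is 255.
// Returns found == false when the size is inconsistent or no pixel is set.
MaskBounds FindMaskBounds(const std::vector<std::uint8_t>& gray8,
                          int width, int height)
{
    MaskBounds bounds{false, 0, 0, 0, 0};
    if (width <= 0 || height <= 0 ||
        gray8.size() != static_cast<std::size_t>(width) *
                            static_cast<std::size_t>(height))
    {
        return bounds;
    }

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            if (gray8[static_cast<std::size_t>(y) * width + x] != 255)
            {
                continue;
            }
            if (!bounds.found)
            {
                bounds = MaskBounds{true, x, y, x, y};
            }
            else
            {
                bounds.left = std::min(bounds.left, x);
                bounds.top = std::min(bounds.top, y);
                bounds.right = std::max(bounds.right, x);
                bounds.bottom = std::max(bounds.bottom, y);
            }
        }
    }
    return bounds;
}
```

A real bitmap buffer may pad each row, so the row stride reported for the buffer should be used instead of the width when indexing actual bitmap data.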

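Identify an object within an image

The following examples show ways to identify an object within an image. The examples assume that you already have a software bitmap object (softwareBitmap) for the input.

Ensure the Image Segmentation model is available by calling the GetReadyState method and waiting for the EnsureReadyAsync method to return successfully.
Once the Image Segmentation model is available, create an ImageObjectExtractor object to reference it.
Pass the image to ImageObjectExtractor.CreateWithSoftwareBitmapAsync.
Create an ImageObjectExtractorHint object. Other ways to create a hint object with different inputs are demonstrated later.
Submit the hint to the model using the GetSoftwareBitmapObjectMask method, which returns the final result.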

using Microsoft.Graphics.Imaging;
using Microsoft.Windows.AI;
using Microsoft.Windows.Management.Deployment;
using Windows.Graphics.Imaging;

if (ImageObjectExtractor.GetReadyState() == AIFeatureReadyState.EnsureNeeded) 
{
    var result = await ImageObjectExtractor.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}

ImageObjectExtractor imageObjectExtractor = await ImageObjectExtractor.CreateWithSoftwareBitmapAsync(softwareBitmap);

// Build a hint from two include points; no rectangles or exclude points.
ImageObjectExtractorHint hint = new ImageObjectExtractorHint(
    includeRects: null,
    includePoints:
        new List<PointInt32> { new PointInt32(306, 212),
                               new PointInt32(216, 336) },
    excludePoints: null);
SoftwareBitmap finalImage = imageObjectExtractor.GetSoftwareBitmapObjectMask(hint);

#include <winrt/Microsoft.Graphics.Imaging.h> 
#include <winrt/Microsoft.Windows.AI.Imaging.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Foundation.h>
using namespace winrt::Microsoft::Graphics::Imaging; 
using namespace winrt::Microsoft::Windows::AI::Imaging;
using namespace winrt::Windows::Graphics::Imaging; 
using namespace winrt::Windows::Foundation;

if (ImageObjectExtractor::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageObjectExtractor::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}

ImageObjectExtractor imageObjectExtractor = ImageObjectExtractor::CreateWithSoftwareBitmapAsync(softwareBitmap).get();

ImageObjectExtractorHint hint(
    {},
    {
        Windows::Graphics::PointInt32{306, 212},        
        Windows::Graphics::PointInt32{216, 336}
    },
    {}
);

Windows::Graphics::Imaging::SoftwareBitmap finalImage = imageObjectExtractor.GetSoftwareBitmapObjectMask(hint);

Specify hints with included and excluded points

This code snippet shows how to use both included and excluded points as hints.

ImageObjectExtractorHint hint = new ImageObjectExtractorHint(
    includeRects: null,
    includePoints: 
        new List<PointInt32> { new PointInt32(150, 90), 
                               new PointInt32(216, 336), 
                               new PointInt32(550, 330)},
    excludePoints: 
        new List<PointInt32> { new PointInt32(306, 212) });

ImageObjectExtractorHint hint(
    {}, 
    { 
        PointInt32{150, 90}, 
        PointInt32{216, 336}, 
        PointInt32{550, 330}
    },
    { 
        PointInt32{306, 212}
    }
);

Specify hints with a rectangle

This code snippet shows how to use a rectangle (RectInt32 is X, Y, Width, Height) as a hint.

ImageObjectExtractorHint hint = new ImageObjectExtractorHint(
    includeRects: 
        new List<RectInt32> { new RectInt32(370, 278, 285, 126) },
    includePoints: null,
    excludePoints: null);

ImageObjectExtractorHint hint(
    { 
        RectInt32{370, 278, 285, 126}
    }, 
    {},
    {}
);

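What can I do with Object Erase?

Object Erase can be used to remove objects from images. The model takes an image and a greyscale mask that indicates the object to remove, erases the masked area from the image, and replaces the erased area with the image background.

Remove unwanted objects from an image

The following example shows how to remove an object from an image. The example assumes that you already have software bitmap objects for both the image and the mask (imageBitmap and maskBitmap in the code). The mask must be in Gray8 format, with each pixel of the area to be removed set to 255 and all other pixels set to 0.

Ensure the Object Erase model is available by calling the GetReadyState method and waiting for the EnsureReadyAsync method to return successfully.
Once the Object Erase model is available, create an ImageObjectRemover object to reference it.
Finally, submit the image and the mask to the model using the RemoveFromSoftwareBitmap method, which returns the final result.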

using Microsoft.Graphics.Imaging;
using Microsoft.Windows.AI;
using Microsoft.Windows.Management.Deployment;
using Windows.Graphics.Imaging;

if (ImageObjectRemover.GetReadyState() == AIFeatureReadyState.EnsureNeeded) 
{
    var result = await ImageObjectRemover.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}
ImageObjectRemover imageObjectRemover = await ImageObjectRemover.CreateAsync();
SoftwareBitmap finalImage = imageObjectRemover.RemoveFromSoftwareBitmap(imageBitmap, maskBitmap); // Insert your own imagebitmap and maskbitmap

#include <winrt/Microsoft.Graphics.Imaging.h>
#include <winrt/Microsoft.Windows.AI.Imaging.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Foundation.h>
using namespace winrt::Microsoft::Graphics::Imaging;
using namespace winrt::Microsoft::Windows::AI::Imaging;
using namespace winrt::Windows::Graphics::Imaging; 
using namespace winrt::Windows::Foundation;
if (ImageObjectRemover::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageObjectRemover::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}

ImageObjectRemover imageObjectRemover = ImageObjectRemover::CreateAsync().get();
// Insert your own imagebitmap and maskbitmap
Windows::Graphics::Imaging::SoftwareBitmap buffer = 
    imageObjectRemover.RemoveFromSoftwareBitmap(imageBitmap, maskBitmap);
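The mask requirement described in this section (Gray8, 255 for every pixel to remove, 0 elsewhere) can be illustrated with a small sketch that fills a rectangular region of a pixel buffer. This is plain C++ under stated assumptions: MakeRectMask is a hypothetical helper rather than an SDK API, the buffer is tightly packed with one byte per pixel, and copying the bytes into a Gray8 software bitmap is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the Windows App SDK.
// Builds a tightly packed Gray8 buffer of width x height pixels in which
// pixels inside the rectangle are 255 and all other pixels are 0.
// The rectangle is clipped to the image; invalid sizes give an empty buffer.
std::vector<std::uint8_t> MakeRectMask(int width, int height,
                                       int rectX, int rectY,
                                       int rectWidth, int rectHeight)
{
    if (width <= 0 || height <= 0)
    {
        return {};
    }

    std::vector<std::uint8_t> mask(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0);

    // Clip in 64 bits so rectX + rectWidth cannot overflow.
    // right and bottom are exclusive bounds.
    const long long left = std::max<long long>(rectX, 0);
    const long long top = std::max<long long>(rectY, 0);
    const long long right =
        std::min<long long>(static_cast<long long>(rectX) + rectWidth, width);
    const long long bottom =
        std::min<long long>(static_cast<long long>(rectY) + rectHeight, height);

    for (long long y = top; y < bottom; ++y)
    {
        for (long long x = left; x < right; ++x)
        {
            mask[static_cast<std::size_t>(y) * width +
                 static_cast<std::size_t>(x)] = 255;
        }
    }
    return mask;
}
```

In practice the mask would more often come from Image Segmentation, whose returned mask already uses the same 255 and 0 convention.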

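Responsible AI

We have used a combination of the following steps to ensure that these imaging APIs are trustworthy, secure, and built responsibly. We recommend reviewing the best practices described in Responsible Generative AI Development on Windows when implementing AI features in your app.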

通过