Vertex AI in express mode overview

This guide provides an overview of Vertex AI in express mode, the fastest way to start building generative AI applications on Google Cloud. It covers the following topics:

  • What is express mode?: Learn about the features, limitations, and how it compares to the full Vertex AI experience.
  • Express mode workflow: Understand the three-step process to test models in Vertex AI Studio and integrate them into your application using an API key.
  • Manage your account: Find out how to view API keys, check quotas, enable billing, and graduate to the full Google Cloud experience.

Vertex AI in express mode is the fastest way to start building generative AI applications on Google Cloud. Signing up in express mode is quick and easy, and it doesn't require entering any billing information. After you sign up, you can access and use Google Cloud APIs in just a few steps.

To learn more about Vertex AI in express mode, see Google Cloud express mode FAQs.

Concepts

  • Express mode: A streamlined, no-billing-required environment for quickly trying Generative AI on Vertex AI with a 90-day limit and specific usage quotas.
  • API Key: An encrypted string used to authenticate your application's requests to Vertex AI APIs in express mode, instead of using a project ID.
  • Quota: A limit on the amount of a Google Cloud resource you can use, which in express mode restricts the rate of API requests you can make for free.

Express mode eligibility

Vertex AI in express mode is separate from, and not available through, the Google Cloud Free Program. If you are in the Google Cloud Free Program, see the other quickstarts in the Get Started section to start using Generative AI on Vertex AI.

Vertex AI is available in express mode for developers that click the Try Vertex AI Studio free button and sign up using a @gmail.com Google Account. Accounts used previously to access Google Cloud are ineligible for express mode and are not shown the Try Vertex AI Studio free button. For example, if you used your Google Account to create a Google Cloud free trial account, you are not eligible to sign up in express mode with that same Google Account.

About Vertex AI in express mode

Upon completing your sign-up in express mode, you get access to the following:

  • Core Vertex AI Studio features: You can test and customize prompts for different generative AI models in Vertex AI Studio in express mode, and get the corresponding code to use in your application.
  • An API key.
  • 90 days to try Vertex AI in express mode.

During your 90 days, you can use the Vertex AI APIs that support express mode for free up to their quotas. You can increase your quota limits at any time by enabling billing.

After enabling billing, the 90 day limit is removed, your quotas are increased, and you only pay for what you use. At any time, you can choose to end express mode and start using all the Google Cloud services and capabilities.

The following table lists the differences between Vertex AI express mode without billing, Vertex AI with billing, and Vertex AI without express mode:

Feature Vertex AI express mode Vertex AI express mode with billing Vertex AI
Time limit 90 days Unlimited Unlimited
Available services Basic Generative AI on Vertex AI services. Expanded Vertex AI services and select Google Cloud services. All Google Cloud services, including Vertex AI.
Data sources Google Drive
  • Google Drive
  • Web files
  • YouTube video URLs
All data sources available in Google Cloud.
Quota See Available models and rate limits in express mode. See Rate limits. See Rate limits.
Service level agreement (SLA) None Vertex AI SLA Vertex AI SLA
Standard format of API endpoints Specify API key instead of project ID and ___location. For example:
https://aiplatform.googleapis.com/v1/publishers/google/models/{model}:streamGenerateContent?key={API_KEY}
Specify API key instead of project ID and ___location. For example:
https://aiplatform.googleapis.com/v1/publishers/google/models/{model}:streamGenerateContent?key={API_KEY}
Specify project ID and ___location. For example:
https://{___location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{___location}/publishers/google/models/{model}:streamGenerateContent

Available models and rate limits in express mode

You can try out several models in express mode, including the Gemini 2.0 Flash models. The following table lists the models that are available in express mode, along with their rate limits:

Model category Available models Requests per minute Discontinuation date
Gemini gemini-2.5-pro 30 June 17, 2026
gemini-2.5-flash 30 June 17, 2026
gemini-2.5-flash-lite-preview-06-17 30
gemini-2.5-flash-preview-05-20
30 July 15, 2025
gemini-2.5-flash-preview-04-17
30 July 15, 2025
gemini-2.5-pro-preview-05-06 30 July 15, 2025
gemini-2.0-flash-001 30 February 5, 2026
gemini-2.0-flash-lite-001 30 February 25, 2026

For Gemini 2.0 models, the Multimodal Live API isn't available in the Console in express mode. To use the Multimodal Live API in express mode, use the Vertex AI API or the Google Gen AI SDK.

Vertex AI in express mode workflow

You can start sending requests from your application to Vertex AI APIs in three steps:

  1. Use Vertex AI Studio to test features.

    In the Google Cloud console in express mode, select Vertex AI > Freeform and use the Freeform page to create and optimize multimodal prompts using a variety of Gemini models.

  2. Get the code.

    On the Freeform page, click Get code. A panel opens showing code that programmatically sends the same requests that you implemented in the UI. You can get the code for a programming language or curl. You can use Google Colab to try the Python code.

  3. Use your API key to authenticate.

    In the Google Cloud console in express mode, click Menu and select API Keys, and then copy your key into your code where it says "YOUR_API_KEY". For example:

    Python

    The Google Gen AI SDK for Python is available on PyPI and GitHub:

    To learn more, see the Python SDK reference (opens in a new tab).

    from google import genai
    
    # TODO(developer): Update below line
    API_KEY = "YOUR_API_KEY"
    
    client = genai.Client(vertexai=True, api_key=API_KEY)
    
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Explain bubble sort to me.",
    )
    
    print(response.text)
    # Example response:
    # Bubble Sort is a simple sorting algorithm that repeatedly steps through the list

What's different in express mode

Vertex AI in express mode provides a subset of the features for Generative AI on Vertex AI. Therefore, some of the Vertex AI documentation is not relevant if you signed up in express mode. For details on the available API endpoints in express mode, see the Vertex AI in express mode REST API reference.

In addition, customers in Google Cloud typically use organizations and projects to work with resources (for example, to call an API endpoint). When using Vertex AI in express mode, you don't need to worry about organizations or projects. However, you might see them mentioned in some of the Google Cloud documentation that you reference while you're using Vertex AI in express mode. You can still use the documentation, but ignore concepts and instructions that refer to organizations and projects. In addition, the ___location you selected when signing up in express mode is used throughout your experience.

When calling REST API endpoints in express mode, you'll use the endpoint format for express mode and specify your API key. For example:

Standard endpoint URL https://{___location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{___location}/publishers/google/models/{model}:streamGenerateContent
Endpoint URL in express mode https://aiplatform.googleapis.com/v1/publishers/google/models/{model}:streamGenerateContent?key={API_KEY}

Manage your express mode account

What's next