How to Call Claude on Vertex AI | Model IDs, Authentication & Pricing

Google Cloud's Vertex AI lets you call Anthropic's Claude models directly from GCP infrastructure. This guide is for engineers who want to use Claude on Vertex AI but are unsure which model ID to specify or how to configure SDK authentication. Key benefits include integration with existing GCP projects via IAM authentication and VPC Service Controls, as well as enterprise-grade compliance and audit logging.
On Vertex AI, the model ID is specified as part of the request URL path, which differs from the Anthropic direct API. Claude Opus 4.8 / 4.7 / 4.6 and Claude Sonnet 4.6 support a 1M-token context window.
This article covers the list of available model IDs, GCP project prerequisites, SDK installation and authentication setup, endpoint differences, and pricing, based on Anthropic's official documentation and Google Cloud's primary sources.
Contents (10)
- What Is Claude on Vertex AI?
- Available Models and Model ID List
- Prerequisites: GCP Project and Model Garden Activation
- SDK Installation
- Authentication Setup (gcloud Command)
- Basic Python Code for Calling Claude
- Global, Multi-Region, and Regional Endpoint Differences
- Understanding Pricing
- Supported and Unsupported Features
- Choosing Between Vertex AI and the Direct API
What Is Claude on Vertex AI?
Claude on Vertex AI is a service that lets you call Anthropic's Claude models through Google Cloud's managed AI platform. Since 2023, Anthropic and Google have maintained a partnership spanning both capital investment and technology, and successive generations of Claude models are available via Vertex AI's Model Garden.
The main benefits of using Claude through Vertex AI are:
- Integrates with existing GCP projects (IAM authentication, VPC Service Controls, etc.)
- Zero data retention (compliant with Google Cloud Vertex AI terms)
- Enterprise-grade compliance and audit logging
- Choice of global, multi-region, or regional endpoints
Source: Claude on Vertex AI — Anthropic Docs
Available Models and Model ID List
Below is the list of Claude models callable from Vertex AI as of June 2026, along with their API model IDs. On Vertex AI, the model ID is specified as part of the request URL path (unlike the Anthropic direct API).
| Model Name | Vertex AI Model ID |
|---|---|
| Claude Opus 4.8 | claude-opus-4-8 |
| Claude Opus 4.7 | claude-opus-4-7 |
| Claude Opus 4.6 | claude-opus-4-6 |
| Claude Sonnet 4.6 | claude-sonnet-4-6 |
| Claude Sonnet 4.5 | claude-sonnet-4-5@20250929 |
| Claude Haiku 4.5 | claude-haiku-4-5@20251001 |
| Claude Opus 4.5 | claude-opus-4-5@20251101 |
Claude Opus 4.8 / 4.7 / 4.6 and Claude Sonnet 4.6 support a 1M-token context window. All other models support 200k tokens. Note that Vertex AI imposes a 30 MB request payload limit, so when sending large documents or many images you may hit this ceiling before reaching the token limit.
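The payload ceiling can be checked locally before a request is sent. The following is a minimal sketch, assuming the limit applies to the serialized JSON body and reading "30 MB" conservatively as 30,000,000 bytes (the exact byte count is not stated here). The names `CONTEXT_WINDOW_TOKENS`, `MAX_PAYLOAD_BYTES`, `payload_size_bytes`, and `fits_payload_limit` are hypothetical helpers introduced for illustration, not part of any SDK.

```python
import json

# Context windows (in tokens) as listed in the model table above.
CONTEXT_WINDOW_TOKENS = {
    "claude-opus-4-8": 1_000_000,
    "claude-opus-4-7": 1_000_000,
    "claude-opus-4-6": 1_000_000,
    "claude-sonnet-4-6": 1_000_000,
    "claude-sonnet-4-5@20250929": 200_000,
    "claude-haiku-4-5@20251001": 200_000,
    "claude-opus-4-5@20251101": 200_000,
}

# Assumption: "30 MB" is read conservatively as 30,000,000 bytes.
MAX_PAYLOAD_BYTES = 30_000_000


def payload_size_bytes(body: dict) -> int:
    """Return the size of the UTF-8 encoded JSON body in bytes."""
    return len(json.dumps(body).encode("utf-8"))


def fits_payload_limit(body: dict, limit: int = MAX_PAYLOAD_BYTES) -> bool:
    """Return True if the serialized body is within the payload limit."""
    return payload_size_bytes(body) <= limit
```

Base64-encoded images and documents count toward the serialized size, so a check like this is most useful for image-heavy or document-heavy requests, where the byte limit may be reached well before the token limit.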
Model availability may vary by region. For the latest information, search for "Claude" in the Vertex AI Model Garden or refer to the Google Cloud Claude on Vertex AI page.
Prerequisites: GCP Project and Model Garden Activation
Before calling Claude on Vertex AI, you need to complete the following setup in Google Cloud Console.
- Prepare a Google Cloud project and enable the Vertex AI API.
- Open the Vertex AI Model Garden and search for "Anthropic Claude."
- On the page for the model you want to use, click "Enable" and complete the terms of service agreement.
- Once you agree to Anthropic's terms of service, API access for that model is activated.
Skipping these steps will result in permission errors when calling the API. You must activate each model separately when you start using a new one.
SDK Installation
Anthropic provides official SDKs for Vertex AI. Python and TypeScript are the most widely used.
Python:
pip install -U google-cloud-aiplatform "anthropic[vertex]"
TypeScript / Node.js:
npm install @anthropic-ai/vertex-sdk
Official SDKs for C#, Go, Java, PHP, and Ruby are also available. See the Anthropic client SDK documentation for details.
Authentication Setup (gcloud Command)
Calling Claude on Vertex AI requires Google Cloud credentials (Application Default Credentials). For local development environments, authenticate with the following command:
gcloud auth application-default login
This generates ~/.config/gcloud/application_default_credentials.json, which the SDK reads automatically. For production environments, service account-based authentication is recommended. When using a service account key, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the key file so the SDK can auto-detect it (see: Google Cloud Service Accounts official documentation).
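When a call fails with a credentials error, it can help to confirm which credentials file is present. The sketch below is a simplified approximation of the lookup described above, covering only the two file locations mentioned in the text (on Linux and macOS). It does not reproduce the full Application Default Credentials logic, which can also use the service account attached to a GCP compute resource. `find_adc_file` is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_file(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the credentials file that would likely be read, or None.

    Simplified: checks GOOGLE_APPLICATION_CREDENTIALS first, then the
    file written by `gcloud auth application-default login`.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # 1. An explicit key file path takes priority.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        # The variable is set but the file is missing: report nothing found.
        return path if path.is_file() else None

    # 2. Fall back to the gcloud default location (Linux/macOS path).
    default = home / ".config" / "gcloud" / "application_default_credentials.json"
    return default if default.is_file() else None
```

Calling `find_adc_file()` with no arguments inspects the current environment and home directory; a `None` result suggests that neither file-based option has been configured.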
Basic Python Code for Calling Claude
Once authentication is complete, use the following code to call Claude. Set project_id to your GCP project ID.
from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "global"  # Global endpoint recommended

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Calling Claude via Vertex AI.",
        }
    ],
)
print(message.content[0].text)
In TypeScript, use the AnthropicVertex class from @anthropic-ai/vertex-sdk in the same way.
import { AnthropicVertex } from "@anthropic-ai/vertex-sdk";

const client = new AnthropicVertex({
  projectId: "MY_PROJECT_ID",
  region: "global",
});

const result = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello from Vertex AI!" }],
});
console.log(result.content[0]);
The Vertex AI request body format differs from the Anthropic direct API in two ways: the model is specified in the URL path (not in the request body), and anthropic_version must be passed in the body as "vertex-2023-10-16". When using the SDK, these differences are handled automatically.
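To make the two differences concrete, here is a minimal sketch of how a raw (non-SDK) request might be assembled. The URL pattern is an assumption based on Vertex AI's publisher-model convention (`:rawPredict` on a `publishers/anthropic/models/...` path) and should be checked against Google Cloud's documentation. Host names for the multi-region endpoints are not covered. `build_vertex_request` is a hypothetical helper, and the function only builds the URL and body without sending anything.

```python
ANTHROPIC_VERSION = "vertex-2023-10-16"  # value given in the text above


def build_vertex_request(project_id, region, model, messages, max_tokens):
    """Return (url, body) for a non-streaming call to Claude on Vertex AI."""
    # Assumption: the global endpoint uses the bare host, while regional
    # endpoints use a "{region}-" host prefix.
    if region == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{region}-aiplatform.googleapis.com"

    # Difference 1: the model ID goes in the URL path.
    url = (
        f"https://{host}/v1/projects/{project_id}/locations/{region}"
        f"/publishers/anthropic/models/{model}:rawPredict"
    )

    # Difference 2: anthropic_version goes in the body, and there is
    # no "model" key in the body.
    body = {
        "anthropic_version": ANTHROPIC_VERSION,
        "max_tokens": max_tokens,
        "messages": messages,
    }
    return url, body
```

A real request would also need an OAuth access token in the `Authorization` header, which is one reason the SDK route is usually simpler.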
Global, Multi-Region, and Regional Endpoint Differences
Vertex AI offers three endpoint types to choose from.
Global Endpoint (Recommended)
Specify region = "global". Traffic is dynamically routed to the region with available capacity, giving you the highest availability. There is no pricing premium, and billing is pay-as-you-go only. Best for applications with flexible data residency requirements.
Multi-Region Endpoint
Specify region = "us" or region = "eu". Use this when you need to restrict data residency geographically to the US or EU. A 10% pricing premium applies compared to the global endpoint.
Regional Endpoint
Specify a particular region such as region = "us-east1". Use this for strict data residency requirements or when Provisioned Throughput (guaranteed throughput) is needed. A 10% pricing premium also applies. Provisioned Throughput is only available with regional endpoints.
Understanding Pricing
Claude pricing on Vertex AI is comparable to the Anthropic direct API, but varies by endpoint type. The global endpoint has no premium; multi-region and regional endpoints carry a +10% premium (applicable to Claude Sonnet 4.5 and newer models).
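The premium rule can be expressed as a small calculation. This is a rough sketch only: the per-million-token rates are placeholders supplied by the caller (no actual prices are assumed), and `endpoint_type`, `price_multiplier`, and `estimate_cost` are hypothetical helper names. The `premium_applies` flag stands in for the "Claude Sonnet 4.5 and newer" condition.

```python
MULTI_REGION = {"us", "eu"}   # multi-region values named in this article
PREMIUM = 0.10                # +10% on multi-region and regional endpoints


def endpoint_type(region: str) -> str:
    """Classify a region string as global, multi-region, or regional."""
    if region == "global":
        return "global"
    if region in MULTI_REGION:
        return "multi-region"
    return "regional"


def price_multiplier(region: str, premium_applies: bool = True) -> float:
    """Return 1.0 for the global endpoint, otherwise 1.0 plus the premium."""
    if endpoint_type(region) == "global" or not premium_applies:
        return 1.0
    return 1.0 + PREMIUM


def estimate_cost(input_tokens, output_tokens,
                  input_rate_per_mtok, output_rate_per_mtok,
                  region, premium_applies=True):
    """Estimate request cost; rates are per million tokens (placeholders)."""
    base = (input_tokens * input_rate_per_mtok
            + output_tokens * output_rate_per_mtok) / 1_000_000
    return base * price_multiplier(region, premium_applies)
```

For example, with placeholder rates of 2.0 and 10.0 per million input and output tokens, 1M input tokens plus 500k output tokens would come to 7.0 on the global endpoint and about 7.7 on a multi-region or regional endpoint.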
For the latest pricing, refer to the official Vertex AI pricing page on Google Cloud and Anthropic's pricing documentation.
Supported and Unsupported Features
Here are the main features supported by Claude on Vertex AI.
Supported Features
- Messages API
- Prompt caching (with flexible TTL settings)
- Extended Thinking
- Tool use (Bash tool, computer use tool, text editor tool)
- Web search tool
- Structured output
- Batch Predictions
- Activity logging (request and response recording)
Unsupported Features
- Files API (image and document input via URL source)
- Message Batches API endpoint
- Managed Agents / MCP connectors
- Models, Admin, and Usage and Cost API endpoints
- Code execution tool, Web Fetch tool (server-side tools)
For the complete list of features, see the Anthropic Features overview.
Choosing Between Vertex AI and the Direct API
Using Claude via Vertex AI is a good fit when you want to integrate with existing GCP infrastructure or leverage Google Cloud's IAM, compliance, and audit logging capabilities. It is particularly strong for enterprise internal systems and scenarios requiring control over data residency.
On the other hand, for individual development, small-scale projects, or when you want immediate access to the latest features such as the Files API or Managed Agents, the Anthropic direct API offers a broader range of options.
In addition to Vertex AI, Claude is also available on Amazon Bedrock and Microsoft Foundry. Choose the platform that best fits your cloud environment and development stack.