Using Claude 4 on Vertex AI | Model IDs, Pricing & Implementation Code
Most people searching "vertex claude 4" want to know three things: whether Anthropic's Claude 4 family can be called from Google Cloud's Vertex AI, which model ID to specify, and how to implement the call.
The lineup has grown since the initial launch: as of June 2026, Opus 4.5 / 4.6 / 4.7 / 4.8, Sonnet 4.5 / 4.6, and Haiku 4.5 are all available on Vertex AI. Because Vertex AI embeds the model ID in the endpoint URL, knowing the correct ID string is essential.
This article draws on primary sources from Google Cloud and Anthropic to compile a quick-reference table of Claude 4 model IDs available on Vertex AI, Model Garden activation steps, AnthropicVertex SDK call code, per-endpoint pricing differences, and 1M token support.
Table of Contents (9)
- Can You Use Claude 4 on Vertex AI? — The Bottom Line
- Quick-Reference Table of Claude 4 Model IDs on Vertex AI
- When Did Claude Opus 4 / Sonnet 4 Arrive on Vertex?
- How to Enable Claude 4 in Model Garden
- Call Code with the AnthropicVertex SDK
- Pricing Differences: Global, Multi-Region, and Region Endpoints
- 1M Token Support and Feature Coverage
- Model ID Pinning and Migration Considerations
- Summary
Can You Use Claude 4 on Vertex AI? — The Bottom Line
The short answer is yes: the Claude 4 family is available on Vertex AI as a Model-as-a-Service (MaaS) offering. Anthropic and Google Cloud announced on May 23, 2025 that Claude Opus 4 and Claude Sonnet 4 were generally available (GA) on Vertex AI (source: Google Cloud Blog). Both models are hybrid reasoning models that can switch between instant-response mode and extended thinking mode for working through complex problems.
Subsequent models have been added since then, and as of June 2026, Opus 4.5 / 4.6 / 4.7 / 4.8, Sonnet 4.5 / 4.6, and Haiku 4.5 are all available on Vertex AI. Opus 4, Sonnet 4, and Opus 4.1 have been marked "Deprecated" but remain callable for the time being.
Quick-Reference Table of Claude 4 Model IDs on Vertex AI
Because Vertex AI embeds the model in the endpoint URL rather than the request body, knowing the correct model ID string is important. The Claude 4 model IDs listed in the official Anthropic documentation (Claude on Vertex AI) are as follows.
| Model | Vertex AI Model ID | Status |
|---|---|---|
| Claude Opus 4.8 | claude-opus-4-8 | Latest Opus |
| Claude Opus 4.7 | claude-opus-4-7 | Available |
| Claude Opus 4.6 | claude-opus-4-6 | Available |
| Claude Opus 4.5 | claude-opus-4-5@20251101 | Available |
| Claude Opus 4.1 | claude-opus-4-1@20250805 | Deprecated |
| Claude Opus 4 | claude-opus-4@20250514 | Deprecated |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | Latest Sonnet |
| Claude Sonnet 4.5 | claude-sonnet-4-5@20250929 | Available |
| Claude Sonnet 4 | claude-sonnet-4@20250514 | Deprecated |
| Claude Haiku 4.5 | claude-haiku-4-5@20251001 | Available |
The date suffix after @ is a "pinned" version specifier that locks the model to a specific version. Models from the 4.6 generation onward (Opus 4.6 / 4.7 / 4.8 and Sonnet 4.6) are listed in an alias format without a suffix.
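To make the pinned-versus-alias distinction concrete, here is a minimal sketch that splits a model ID at the @ separator. The names ModelId, parse_model_id, and is_pinned are hypothetical helpers for illustration, not part of any SDK. The 8-digit date check is an assumption inferred from the IDs in the table.

```python
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    name: str               # e.g. "claude-opus-4-5"
    version: Optional[str]  # e.g. "20251101", or None for a suffix-less ID


def parse_model_id(model_id: str) -> ModelId:
    """Split a Vertex AI Claude model ID at the '@' separator."""
    name, sep, version = model_id.partition("@")
    if not name:
        raise ValueError(f"empty model name in {model_id!r}")
    # Assumption: date suffixes are 8 digits (YYYYMMDD), as in the table.
    if sep and not (len(version) == 8 and version.isdigit()):
        raise ValueError(f"expected an 8-digit date after '@' in {model_id!r}")
    return ModelId(name=name, version=version if sep else None)


def is_pinned(model_id: str) -> bool:
    """True when the ID carries a date suffix such as '@20251101'."""
    return parse_model_id(model_id).version is not None
```

With these helpers, is_pinned("claude-opus-4-5@20251101") is true, while is_pinned("claude-opus-4-8") is false.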
When Did Claude Opus 4 / Sonnet 4 Arrive on Vertex?
The Claude 4 family on Vertex AI began with the GA of Opus 4 and Sonnet 4 on May 23, 2025. Claude Opus 4 is positioned as a strong performer for coding and long-running autonomous tasks, while Claude Sonnet 4 offers a balance of performance and cost, surpassing its predecessor Sonnet 3.7. Because both run on Vertex AI's fully managed infrastructure, there is no need to manage GPUs or servers (source: Google Cloud Blog).
How to Enable Claude 4 in Model Garden
If you are using Claude 4 on Vertex AI for the first time, you first need to enable the model in Model Garden. The steps are as follows.
- Open the Google Cloud Console, select your target project, and navigate to Vertex AI's Model Garden.
- Type "Claude" in the search bar and open the model card for the model you want to use (e.g., Claude Opus 4.8).
- Click "Enable" on the model card and follow the on-screen instructions.
- If calling from a local machine, run gcloud auth application-default login in your terminal to authenticate with GCP.
- From there, call the API by specifying the model ID, project ID, and region.
In addition to enabling models from the Model Garden model card, you can also procure them through the Google Cloud Marketplace (source: Google Cloud Blog).
Call Code with the AnthropicVertex SDK
The Claude API on Vertex AI is almost identical to the standard Messages API, with two differences: the model is specified in the endpoint URL rather than in the request body, and anthropic_version is passed in the request body with the value vertex-2023-10-16. Anthropic's official client SDKs abstract away both differences.
For Python, install and call as follows.
pip install -U google-cloud-aiplatform "anthropic[vertex]"
from anthropic import AnthropicVertex

# region="global" selects the global endpoint; replace MY_PROJECT_ID with your GCP project ID.
client = AnthropicVertex(project_id="MY_PROJECT_ID", region="global")

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hey Claude!"}],
)
print(message)
For TypeScript, use the dedicated package.
npm install @anthropic-ai/vertex-sdk
import { AnthropicVertex } from "@anthropic-ai/vertex-sdk";

// region: "global" selects the global endpoint; replace MY_PROJECT_ID with your GCP project ID.
const client = new AnthropicVertex({ projectId: "MY_PROJECT_ID", region: "global" });

const result = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 100,
  messages: [{ role: "user", content: "Hey Claude!" }],
});
(Source: Claude on Vertex AI (Anthropic Docs))
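For readers who want to see what the SDKs hide, the following sketch builds the endpoint URL and request body by hand without sending anything. The function names are hypothetical. The URL layout (publishers/anthropic/models/...:rawPredict, with streamRawPredict for streaming) is an assumption based on how Vertex AI publisher-model endpoints are commonly addressed, so verify it against the current Google Cloud documentation. Host names for the multi-region endpoints ("us", "eu") are deliberately left out because this sketch does not assume them.

```python
ANTHROPIC_VERSION = "vertex-2023-10-16"  # value stated in this article


def build_endpoint_url(project_id: str, region: str, model: str,
                       stream: bool = False) -> str:
    """Return the (assumed) Vertex AI URL; the model ID lives in the path."""
    if region in ("us", "eu"):
        # Not covered here: the multi-region host name is not assumed.
        raise ValueError("multi-region hosts are outside this sketch")
    if region == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{region}-aiplatform.googleapis.com"
    method = "streamRawPredict" if stream else "rawPredict"
    return (
        f"https://{host}/v1/projects/{project_id}/locations/{region}"
        f"/publishers/anthropic/models/{model}:{method}"
    )


def build_request_body(messages: list, max_tokens: int) -> dict:
    """Messages API body for Vertex: anthropic_version in, no 'model' key."""
    return {
        "anthropic_version": ANTHROPIC_VERSION,
        "max_tokens": max_tokens,
        "messages": messages,
    }
```

In practice the SDK also attaches Google Cloud credentials to the request, which is one more reason to prefer AnthropicVertex over hand-built requests.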
Pricing Differences: Global, Multi-Region, and Region Endpoints
Vertex AI offers three types of endpoints, selected using the region parameter.
- Global endpoint (recommended): Specify region="global". Dynamically routes to regions with available capacity, offering the highest availability with no pricing premium.
- Multi-region endpoint: Specify region="us" or region="eu". Keeps data resident within the specified geography while load balancing. Pricing is 10% higher than global.
- Regional endpoint: Specify a specific region such as region="us-east1". Required for single-region data residency or provisioned throughput (dedicated capacity). Also carries a 10% premium.
This pricing structure applies to newer models from Claude Sonnet 4.5 onward; older models such as Opus 4 and Sonnet 4 retain the previous pricing structure. If data residency requirements are flexible, the global endpoint is the first choice because it incurs no additional cost and offers higher availability (source: Claude on Vertex AI (Anthropic Docs)).
The base unit prices (based on the global endpoint) are as follows. These are input/output prices per 1 million tokens (MTok) based on the official Anthropic pricing page as of June 2026. The 10% premium described above is added on top of these prices for multi-region and regional endpoints.
| Model | Input ($ / MTok) | Output ($ / MTok) |
|---|---|---|
| Claude Opus 4.8 | $5 | $25 |
| Claude Sonnet 4.6 | $3 | $15 |
| Claude Haiku 4.5 | $1 | $5 |
Prices are subject to change, so always check the latest rates on the official Anthropic pricing page and the Vertex AI Claude model pricing page (source: Pricing (Anthropic Docs)).
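As a worked example of the arithmetic, the sketch below estimates the cost of a request from the table's base prices and applies the 10% premium for non-global endpoints. The function names are hypothetical, the prices are copied from the table in this article and may be out of date, and the estimate ignores anything the article does not price (for example, prompt caching discounts).

```python
from decimal import Decimal

# USD per 1M tokens (input, output) on the global endpoint, from the table.
PRICES_PER_MTOK = {
    "claude-opus-4-8": (Decimal("5"), Decimal("25")),
    "claude-sonnet-4-6": (Decimal("3"), Decimal("15")),
    "claude-haiku-4-5": (Decimal("1"), Decimal("5")),
}
REGIONAL_PREMIUM = Decimal("1.10")  # 10% on multi-region and regional endpoints
MTOK = Decimal(1_000_000)


def endpoint_type(region: str) -> str:
    """Classify the region parameter into the article's three endpoint types."""
    if region == "global":
        return "global"
    if region in ("us", "eu"):
        return "multi-region"
    return "regional"


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int,
                      region: str = "global") -> Decimal:
    """Estimate request cost; a date suffix such as '@20251001' is ignored."""
    input_price, output_price = PRICES_PER_MTOK[model.partition("@")[0]]
    cost = (Decimal(input_tokens) * input_price
            + Decimal(output_tokens) * output_price) / MTOK
    if endpoint_type(region) != "global":
        cost *= REGIONAL_PREMIUM
    return cost
```

For instance, 1M input tokens plus 1M output tokens on Claude Opus 4.8 comes to $30 on the global endpoint and $33 on a regional one. Decimal is used instead of float so the 10% multiplication stays exact.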
1M Token Support and Feature Coverage
Context window size varies by model. Opus 4.8 / 4.7 / 4.6 and Sonnet 4.6 support 1M tokens on Vertex AI, while Sonnet 4.5, Sonnet 4, and others support 200K tokens. Note that Vertex AI limits request payloads to 30MB, so sending large numbers of images or very long documents may hit this limit before reaching the token ceiling.
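A rough pre-flight check for the payload limit might look like the sketch below. It is an estimate only: the helper names are hypothetical, "30MB" is interpreted here as 30 MiB (an assumption; the exact definition should be confirmed in the official documentation), and the SDK's actual serialization may differ by a few bytes.

```python
import json

# Assumption: "30MB" is read as 30 * 1024 * 1024 bytes.
MAX_PAYLOAD_BYTES = 30 * 1024 * 1024


def payload_size_bytes(body: dict) -> int:
    """Approximate the serialized request size as compact UTF-8 JSON."""
    text = json.dumps(body, ensure_ascii=False, separators=(",", ":"))
    return len(text.encode("utf-8"))


def fits_payload_limit(body: dict, limit: int = MAX_PAYLOAD_BYTES) -> bool:
    """True when the estimated request size is within the limit."""
    return payload_size_bytes(body) <= limit
```

Because base64-encoded images inflate quickly, running a check like this before sending image-heavy requests can surface the problem earlier than an API error would.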
In terms of features, prompt caching, extended thinking, tool use (Bash, computer use, text editor, etc.), web search, citations, and structured outputs are all supported. On the other hand, some capabilities are not available via Vertex: specifying input sources through the Files API, certain API endpoints such as Message Batches, and Managed Agents. Data handling follows Google Cloud Vertex AI policies (source: Claude on Vertex AI (Anthropic Docs)).
Model ID Pinning and Migration Considerations
For production use, it is recommended to pin the version using a model ID with a date suffix, such as claude-opus-4-5@20251101. Using an alias without a suffix means behavior could change whenever Anthropic releases an update, potentially breaking existing workflows.
What about running models in production that are currently available only as suffix-less aliases, such as Opus 4.6 and later? First, check whether a date-suffixed ID has been added to the model card in Model Garden; if so, pin to that. If you must use the alias form, track Anthropic's model update announcements (release notes) and make sure you have a process in place to run regression checks on existing workflows whenever an update occurs.
Also, model availability and regions on Vertex can change at any time. Check availability by searching for "Claude" in the Vertex AI Model Garden or by visiting Google Cloud's Anthropic Claude model list for the latest information. Models marked as Deprecated (Opus 4, Sonnet 4, Opus 4.1, etc.) will eventually be retired, so it is safer to migrate to a successor generation sooner rather than later.
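One way to turn these considerations into a routine check is to audit the model IDs in your configuration against the statuses listed in this article. The sketch below is illustrative only: audit_model_ids is a hypothetical helper, and the hard-coded sets are a snapshot of the article's quick-reference table, which will drift as models are added or retired.

```python
# Snapshot of the quick-reference table in this article (will go stale).
DEPRECATED_IDS = {
    "claude-opus-4-1@20250805",
    "claude-opus-4@20250514",
    "claude-sonnet-4@20250514",
}
ACTIVE_IDS = {
    "claude-opus-4-8",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-opus-4-5@20251101",
    "claude-sonnet-4-6",
    "claude-sonnet-4-5@20250929",
    "claude-haiku-4-5@20251001",
}


def audit_model_ids(model_ids):
    """Return (model_id, finding) pairs for IDs that need attention."""
    findings = []
    for model_id in model_ids:
        if model_id in DEPRECATED_IDS:
            findings.append((model_id, "deprecated"))
        elif model_id not in ACTIVE_IDS:
            findings.append((model_id, "unknown"))
        elif "@" not in model_id:
            # Suffix-less alias: behavior may change when an update ships.
            findings.append((model_id, "unpinned"))
    return findings
```

A "deprecated" finding suggests planning a migration, "unpinned" suggests scheduling regression checks around model updates, and "unknown" suggests confirming the ID in Model Garden.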
Summary
Here are the key points for using Claude 4 on Vertex AI. Opus 4 and Sonnet 4 went GA on May 23, 2025, and a full lineup of successors up through Opus 4.8 and Sonnet 4.6 is now available. To implement, enable the model in Model Garden, authenticate with gcloud auth application-default login, and pass the model ID (e.g., claude-opus-4-8), your project ID, and region="global" to the AnthropicVertex SDK. The global endpoint is the default choice for lower cost and higher availability; choose multi-region or regional if you have data residency requirements. In production, pin your model ID and migrate away from Deprecated models promptly.