How to Use Claude Code with Google | Vertex AI Setup and Model Pinning

Developers searching for "how to run Claude Code on Google Cloud" or "how to integrate with company GCP billing without a separate Anthropic contract" need two things above all: the quickest path to launching Claude Code (CLI) via Google Vertex AI, and an understanding of how it differs from a personal Pro/Max login. This article draws on Anthropic's official Claude Code documentation and Google Cloud Vertex AI documentation to provide a single, comprehensive overview of the /setup-vertex wizard, environment variables, IAM, and model pinning.

Conclusion

The primary way to use Claude Code with Google is to launch the CLI via Google Cloud Vertex AI Model Garden. Claude Code v2.1.98 and later includes a /setup-vertex wizard that reads gcloud Application Default Credentials to auto-detect your GCP project and region, then walks you all the way through pinning available Claude models. No separate Anthropic contract is needed — billing is consolidated into Google Cloud, and IAM, VPC Service Controls, and Cloud Logging all apply out of the box.

Signing in to a personal Pro/Max plan with a Google account (OAuth) and accessing Claude through Vertex AI in an enterprise are two separate paths. The former is a subscription tied to Anthropic billing via claude.ai; the latter uses Vertex AI token-based pricing and is only activated when CLAUDE_CODE_USE_VERTEX=1 is set. For rolling out to multiple users in a company, Vertex AI is the right choice for SSO, auditing, and cost management.

The four most common setup pain points are: (1) enabling the Vertex AI API, (2) requesting model access in Model Garden (which can take 24–48 hours), (3) mismatches between region and model, and (4) Haiku fallback issues. This article covers prerequisites, the wizard walkthrough, manual environment variable setup, model pinning, IAM configuration, enabling 1M token context, and troubleshooting.

Two Paths for Using Claude Code with Google

There are two main ways to use Claude Code with Google, each suited to a different use case. The first is for individual developers: log in with a Google account via OAuth (the same account used for claude.ai) and run the Claude Code CLI under a Pro or Max subscription. The second is for enterprises and Google Cloud users: launch the CLI via Vertex AI Model Garden, integrating with GCP billing and IAM. The first path simply requires running claude and signing in with your claude.ai account (the subscription login option, as opposed to the API-billed Anthropic Console option), while the second requires enabling CLAUDE_CODE_USE_VERTEX=1 to route requests to Vertex AI.

This article focuses primarily on Vertex AI setup. Compared to a direct Anthropic contract, it offers five advantages: (1) consolidating billing into a single Google Cloud account, (2) managing access via roles/aiplatform.user in IAM, (3) applying VPC Service Controls for perimeter protection, (4) retaining audit logs in Cloud Logging, and (5) applying existing Google Cloud SSO and compliance controls as-is. (source) On the other hand, note that the latest models are often released on the Anthropic direct API first, and there can be a lag of several days before they are enabled on Vertex AI.

Note that a personal Pro/Max "Google account login" and a Vertex AI "Google Cloud authentication" are entirely different things. The former uses claude.ai OAuth; the latter uses Application Default Credentials (gcloud auth). The credential storage and billing routes are different. Even if you log into claude.ai with a Workspace account, you are not automatically routed through Vertex AI — for business use, you must explicitly select Vertex AI.

Prerequisites for Using Claude Code on Vertex AI

Before starting the Vertex AI setup, confirm that the following five conditions are met. Billing configuration and quota allocation are especially easy to overlook and can later surface as "model not found 404" errors.

  1. You have a Google Cloud Platform account with billing enabled.
  2. The Vertex AI API (aiplatform.googleapis.com) is enabled in the GCP project you plan to use.
  3. Access to the target Claude model (e.g., Claude Sonnet 4.6, Opus 4.7) has been requested and approved in the Vertex AI Model Garden.
  4. Google Cloud SDK (gcloud CLI) is installed in your local development environment and gcloud auth application-default login has been completed.
  5. Quota for Claude models is allocated in your target GCP region (e.g., us-east5, europe-west1, or the global endpoint).

Additionally, Claude Code v2.1.98 or later is recommended. Check with claude --version and update via npm i -g @anthropic-ai/claude-code or the official installer if needed. Versions prior to v2.1.98 lack the /setup-vertex wizard and startup model check described below, meaning you must configure environment variables entirely by hand. (source)

Model access requests can be submitted via the "Request" button on the Claude detail page in Model Garden. Since Anthropic's review can take 24–48 hours, submit your request early to fit your testing schedule.
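
The command-line side of these prerequisites can be sketched as a short script. This is a minimal sketch, assuming a placeholder project ID (YOUR-PROJECT-ID) and a hypothetical run helper that only prints each command unless DRY_RUN=0 is set. Requesting model access still has to be done in the Model Garden console and is not covered here.

```shell
# Placeholder: replace with your own project ID.
PROJECT_ID="${PROJECT_ID:-YOUR-PROJECT-ID}"

# Hypothetical helper: print the command by default, execute only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Prerequisite 2: enable the Vertex AI API in the target project.
run gcloud services enable aiplatform.googleapis.com --project="$PROJECT_ID"

# Prerequisite 4: create Application Default Credentials for local use.
run gcloud auth application-default login

# Confirm the installed Claude Code version.
run claude --version
```

Review the printed commands first, then re-run with DRY_RUN=0 to execute them.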

3 Steps to Sign In with the /setup-vertex Wizard

Claude Code v2.1.98 and later includes the /setup-vertex wizard for interactively completing the initial Vertex AI connection setup. It reads gcloud Application Default Credentials to auto-detect your project and region, lists available Claude models, and completes pinning all in one flow. With GCP prerequisites in place, the whole process takes 3–5 minutes.

  1. Enable the Vertex AI API in your GCP project and request access to the target Claude model in Model Garden. Wait for approval so the model can be called from the project.
  2. Run claude in your terminal. At the initial login prompt, select 3rd-party platform, then choose Google Vertex AI. If you are already logged in, run /setup-vertex directly.
  3. Follow the wizard prompts to choose your authentication method: gcloud Application Default Credentials, a service account key file, or environment variables. The project ID and region are auto-detected, and you pin the model you want from the list of callable models.

The wizard writes its results to the env block of your user settings file (e.g., ~/.claude/settings.json), so you do not need to manually export CLAUDE_CODE_USE_VERTEX or CLOUD_ML_REGION. If you later want to change the project or region, simply re-run /setup-vertex to overwrite the settings. (source)
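
For orientation, the resulting env block might look roughly like the output of the snippet below. This is a sketch, not the wizard's verbatim output: the exact keys and values it writes are an assumption based on the variables named in this article, and the file is written to a scratch directory (DEMO_DIR, a hypothetical name) rather than to ~/.claude, so nothing real is overwritten.

```shell
# Scratch location for the demo; the real file lives under ~/.claude/.
DEMO_DIR="${TMPDIR:-/tmp}/claude-settings-demo"
mkdir -p "$DEMO_DIR"

# Write a sample settings file with an env block (values are placeholders).
printf '%s\n' \
  '{' \
  '  "env": {' \
  '    "CLAUDE_CODE_USE_VERTEX": "1",' \
  '    "CLOUD_ML_REGION": "global",' \
  '    "ANTHROPIC_VERTEX_PROJECT_ID": "YOUR-PROJECT-ID"' \
  '  }' \
  '}' > "$DEMO_DIR/settings.json"

cat "$DEMO_DIR/settings.json"
```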

In environments where interactive input is not possible — such as CI or containers — use the manual setup described in the next section to set environment variables explicitly. For enterprise rollouts, the standard practice is to use the wizard for individual developers and manual setup for CI/images.

Verifying a Successful Setup

Once the wizard or manual configuration is complete, verify that Claude Code is actually running via Vertex AI with these three checks. If all three pass, initial setup is complete.

  1. Launch claude and check whether the status bar at the bottom of the screen shows the Vertex AI connection (project and region).
  2. Send a simple prompt and confirm you get a response. A successful response means authentication, model pinning, and region are all correctly aligned.
  3. Check claude --version (client version), gcloud config list (project and region), and gcloud auth list (account) to confirm the values are as expected.

If the status bar does not show a Vertex connection, or no response is returned, refer to the "Common Errors and Troubleshooting" section later in this article.
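
The third check can be scripted so that it also behaves sensibly on a machine where one of the tools is missing. This is a minimal sketch; report_tool is a hypothetical helper name, and the three commands are the ones listed above.

```shell
# Hypothetical helper: run a tool if it is installed, otherwise say it is missing.
report_tool() {
  tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    echo "== $*"
    "$@"
  else
    echo "$tool: not installed"
  fi
}

report_tool claude --version      # client version
report_tool gcloud config list    # active project and region
report_tool gcloud auth list      # active account
```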

Manual Setup | Configuring 4 Environment Variables

In environments where the wizard cannot be launched, such as CI pipelines or Docker images, enable Vertex AI by setting four environment variables. The minimum configuration requires only the first three (CLAUDE_CODE_USE_VERTEX / CLOUD_ML_REGION / ANTHROPIC_VERTEX_PROJECT_ID); the fourth, GOOGLE_APPLICATION_CREDENTIALS, is needed only for service account authentication. The values to export are as follows:

# Enable routing via Vertex AI
export CLAUDE_CODE_USE_VERTEX=1

# Region: global / us / eu / specific region (e.g., us-east5)
export CLOUD_ML_REGION=global

# Project ID (takes precedence over GCLOUD_PROJECT)
export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID

# Only required when using service account authentication (optional/advanced)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa-key.json

CLOUD_ML_REGION accepts three formats: global endpoint (global), multi-region (us or eu), or specific region (e.g., us-east5). Claude Code automatically selects the hostname based on the format, and for multi-region sends requests to aiplatform.eu.rep.googleapis.com or aiplatform.us.rep.googleapis.com. global offers the most stable availability and is recommended as the first choice. (source)
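
The three formats can be illustrated with a small mapping function. This is a sketch of the idea only, since Claude Code performs the selection internally. The multi-region hostnames come from the paragraph above; the global and single-region hostnames are an assumption based on Vertex AI's usual endpoint naming, and vertex_host_for_region is a hypothetical name.

```shell
# Illustrative mapping from a CLOUD_ML_REGION value to an API hostname.
vertex_host_for_region() {
  case "$1" in
    global) echo "aiplatform.googleapis.com" ;;          # global endpoint (assumed)
    us|eu)  echo "aiplatform.$1.rep.googleapis.com" ;;   # multi-region
    *)      echo "$1-aiplatform.googleapis.com" ;;       # specific region (assumed)
  esac
}

vertex_host_for_region "${CLOUD_ML_REGION:-global}"
```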

Project ID resolution follows this priority order: (1) ANTHROPIC_VERTEX_PROJECT_ID, (2) GCLOUD_PROJECT / GOOGLE_CLOUD_PROJECT, (3) the credentials file in GOOGLE_APPLICATION_CREDENTIALS, (4) gcloud configuration, (5) the attached service account. When switching between multiple projects, explicitly setting ANTHROPIC_VERTEX_PROJECT_ID prevents mistakes.
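
To see which project the environment-variable tiers would select in a given shell, the priority order can be approximated as follows. This is an illustration, not Claude Code's implementation: resolve_project_id is a hypothetical helper, steps 3 and 5 are omitted because they depend on the Google auth libraries, and checking GCLOUD_PROJECT before GOOGLE_CLOUD_PROJECT within tier 2 is an assumption.

```shell
# Approximate the lookup order using only environment variables and gcloud.
resolve_project_id() {
  if [ -n "${ANTHROPIC_VERTEX_PROJECT_ID:-}" ]; then
    echo "$ANTHROPIC_VERTEX_PROJECT_ID"        # tier 1
  elif [ -n "${GCLOUD_PROJECT:-}" ]; then
    echo "$GCLOUD_PROJECT"                     # tier 2
  elif [ -n "${GOOGLE_CLOUD_PROJECT:-}" ]; then
    echo "$GOOGLE_CLOUD_PROJECT"               # tier 2
  elif command -v gcloud >/dev/null 2>&1; then
    gcloud config get-value project 2>/dev/null   # tier 4
  fi
}

echo "Project that would be used: $(resolve_project_id)"
```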

Prompt caching is enabled by default on Vertex AI as well. To disable it, set DISABLE_PROMPT_CACHING=1. To extend the default 5-minute TTL to 1 hour, set ENABLE_PROMPT_CACHING_1H=1. The 1-hour TTL comes with a higher write cost tier, so it is practical to limit it to agent tasks that genuinely require long session continuity.
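
In CI, the minimum manual configuration from this section can be guarded with a preflight check that reports missing settings before Claude Code starts. This is a minimal sketch; missing_vertex_vars is a hypothetical helper, and the check only looks at the three required variables.

```shell
# Print the name of each required variable that is unset or empty.
missing_vertex_vars() {
  for name in CLAUDE_CODE_USE_VERTEX CLOUD_ML_REGION ANTHROPIC_VERTEX_PROJECT_ID; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

missing="$(missing_vertex_vars)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are joined onto one line.
  echo "Missing Vertex AI settings:" $missing
else
  echo "Vertex AI environment looks complete"
fi
```

In a real pipeline the first branch would typically end with a non-zero exit so the job stops early.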

Model Pinning and Region Selection in Practice

When rolling out to multiple users, pinning model versions is essential. Without a pin, aliases like sonnet or opus resolve to "the latest version Anthropic has published," but if that new model has not yet been enabled in your Vertex AI project, requests will fail. Pinning gives administrators control over when new models are introduced.

As of June 2026, the current flagship is Claude Opus 4.8 (with 1M context support), and the pin examples below list Opus 4.8 first. However, since the latest models may take a few days to be enabled on Vertex AI, confirm that the target model is available in Model Garden before pinning. Note that these pinning environment variables are optional (advanced) for an individual developer's initial setup and become required when rolling out to multiple users. Representative pinning environment variables are as follows:

# Pin Opus to the current flagship 4.8 (1M context supported)
export ANTHROPIC_DEFAULT_OPUS_MODEL='claude-opus-4-8'
export ANTHROPIC_DEFAULT_SONNET_MODEL='claude-sonnet-4-6'
export ANTHROPIC_DEFAULT_HAIKU_MODEL='claude-haiku-4-5@20251001'

Without pins, the default is claude-sonnet-4-5@20250929, and smaller/faster models also fall back to the same value as the primary. On Vertex AI, Haiku is sometimes not enabled for a given project or region, causing background tasks like session title generation to run on Sonnet and incur unexpected costs. The safe approach is to set ANTHROPIC_DEFAULT_HAIKU_MODEL explicitly and to make sure Haiku is enabled in the region those requests are sent to. (source)

Even when running CLOUD_ML_REGION=global, some models may not support the global endpoint. In that case, use the optional/advanced VERTEX_REGION_CLAUDE_* variables to override the region per model.

export VERTEX_REGION_CLAUDE_HAIKU_4_5=us-east5
export VERTEX_REGION_CLAUDE_4_6_SONNET=europe-west1

At startup, Claude Code automatically checks whether the pinned models can be called from the project. If a new version becomes available in Model Garden, you will be prompted to update, and accepting will write the new ID to your settings file. If administrators want to control update timing centrally, a common approach is to share the settings file across the team to suppress individual acceptance prompts.

IAM Permissions and a Dedicated GCP Project

The minimum IAM permission required to run Claude Code on Vertex AI is aiplatform.endpoints.predict, which covers both model invocation and token counting. The built-in role roles/aiplatform.user includes this permission, making it the simplest choice to assign to individual developers.

For tighter restrictions, create a custom role containing only aiplatform.endpoints.predict and assign it to a Claude Code service account. Including the Vertex AI API within a VPC Service Controls perimeter allows you to block calls from outside the corporate network. (source)

Anthropic officially recommends creating a dedicated GCP project for Claude Code. The reasons are: (1) cost tracking can be isolated to a single billing line, (2) access control does not interfere with other workloads, and (3) the blast radius of quota exhaustion is limited. Giving the project a descriptive name like claude-code-prod and attaching labels for team name and cost center makes monthly billing analysis significantly easier.
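
The project and IAM setup described above might be scripted along these lines. This is a sketch under stated assumptions: the member address, label values, and custom role ID (claudeCodePredict) are hypothetical examples, linking a billing account is not shown, and the run helper only prints commands unless DRY_RUN=0 is set.

```shell
PROJECT_ID="${PROJECT_ID:-claude-code-prod}"
MEMBER="${MEMBER:-user:dev@example.com}"   # hypothetical developer account

# Hypothetical helper: print the command by default, execute only when DRY_RUN=0.
# (The printed form drops quoting, so multi-word arguments appear unquoted.)
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Dedicated project with labels for billing analysis (label values are examples).
run gcloud projects create "$PROJECT_ID" --labels=team=platform,cost-center=cc-1234

# Simplest option: grant the built-in role to a developer.
run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="$MEMBER" --role=roles/aiplatform.user

# Tighter option: a custom role that contains only the predict permission.
run gcloud iam roles create claudeCodePredict --project="$PROJECT_ID" \
  --title="Claude Code predict only" \
  --permissions=aiplatform.endpoints.predict
```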

Enabling 1 Million Token Context and Key Considerations

Claude Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 all support a 1 million token context window on Vertex AI. When you select a 1M variant, Claude Code automatically enables extended context — no additional CLI configuration is needed. The /setup-vertex wizard presents the 1M option when pinning models. Note that the latest models, including the current flagship Opus 4.8, may take a few days to be enabled on Vertex AI, so confirm availability in Model Garden before use.

For manual pinning, append the [1m] suffix to the model ID. For example, setting claude-opus-4-8[1m] as ANTHROPIC_DEFAULT_OPUS_MODEL runs that model with 1M context. Since 1M mode uses a higher pricing tier than the standard 200K mode, the general principle is to enable it only for tasks that clearly require a large context — such as comprehending a massive codebase all at once or running long sessions.
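
When setting the suffix from a shell, quote the value, because unquoted square brackets are glob characters. A minimal sketch follows; with_1m is a hypothetical helper that appends the suffix only if it is not already present.

```shell
# Hypothetical helper: append [1m] once, leaving already-suffixed IDs unchanged.
with_1m() {
  case "$1" in
    *'[1m]') printf '%s\n' "$1" ;;
    *)       printf '%s[1m]\n' "$1" ;;
  esac
}

# Pin Opus to the 1M-context variant (model ID taken from the example above).
ANTHROPIC_DEFAULT_OPUS_MODEL="$(with_1m 'claude-opus-4-8')"
export ANTHROPIC_DEFAULT_OPUS_MODEL
echo "$ANTHROPIC_DEFAULT_OPUS_MODEL"
```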

From a practical standpoint, using Claude Code's /compact and /clear commands effectively means many use cases run fine within 200K context, and there is rarely a need to keep 1M "always on." The realistic approach is to monitor monthly costs and selectively switch to 1M pinning for specific repositories or agent workflows.

Common Errors and Troubleshooting

Errors encountered when running Claude Code via Vertex AI fall into four main patterns. Knowing the cause and fix for each significantly reduces time lost during initial setup.

  1. "Could not load default credentials" error: Application Default Credentials are not configured. Run gcloud auth application-default login, or set GOOGLE_APPLICATION_CREDENTIALS to the path of a service account key file.
  2. Model not found 404 error: The model is not enabled in Model Garden, or it is not available in the specified region. Check "Supported features" in Model Garden, and for models that don't support the global endpoint, use VERTEX_REGION_CLAUDE_* to specify individual regions.
  3. 429 Too Many Requests error: Quota is insufficient for the regional endpoint. Switch to CLOUD_ML_REGION=global or request a quota increase in the Cloud Console. Also verify that both the primary and Haiku models are supported in that region.
  4. Quota exhausted / "Resource Exhausted" error: The project's Vertex AI quota is depleted. Check the relevant quota (e.g., online_prediction_requests_per_base_model) on the Quotas page in the Cloud Console and submit an increase request. (source)
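
The four patterns above can be condensed into a small lookup for use in setup scripts or team runbooks. This is a sketch: suggest_fix is a hypothetical helper, and the matched substrings are assumptions about how the messages appear, so adjust them to the exact text you see.

```shell
# Map a fragment of an error message to the first fix worth trying.
suggest_fix() {
  case "$1" in
    *"Could not load default credentials"*)
      echo "Run: gcloud auth application-default login" ;;
    *404*|*"not found"*)
      echo "Check Model Garden access and region; consider VERTEX_REGION_CLAUDE_* overrides" ;;
    *429*|*"Too Many Requests"*)
      echo "Try CLOUD_ML_REGION=global or request a quota increase" ;;
    *"Resource Exhausted"*|*RESOURCE_EXHAUSTED*|*uota*)
      echo "Check the Quotas page in the Cloud Console and request an increase" ;;
    *)
      echo "No known pattern; check claude --version, gcloud config list, gcloud auth list" ;;
  esac
}

suggest_fix "Error: Could not load default credentials"
```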

Additionally, Claude Code v2.1.121 and later support X.509 certificate-based workload identity federation. Pointing GOOGLE_APPLICATION_CREDENTIALS to a credential configuration file that uses certificates issued by an internal PKI lets you authenticate without distributing service account keys, enabling a more secure configuration.

When troubleshooting, the standard flow is: first check the client version with claude --version, then inspect the authentication context with gcloud config list and gcloud auth list. About 80% of issues fall into one of two categories: "running under an unexpected project or account" or "model access not yet approved in Model Garden." (source)

Choosing Between Anthropic Direct API, Vertex AI, and AWS Bedrock

In addition to Vertex AI, Claude Code also works via the Anthropic direct API and AWS Bedrock. Choosing between the three comes down to billing route and existing cloud integrations.

The Anthropic direct API offers early access to new models and is tied to Pro/Max plan or Anthropic Console billing. Vertex AI excels at unified GCP operations (IAM, VPC SC, Cloud Logging, Workspace SSO) and suits organizations that center their governance on Google Cloud. AWS Bedrock similarly excels at unified AWS operations and is ideal for organizations that manage controls through IAM, KMS, and CloudTrail. On the Claude Code side, the switch is CLAUDE_CODE_USE_BEDROCK=1 in place of CLAUDE_CODE_USE_VERTEX=1, with AWS credentials and region taking the place of the GCP ones.

Base token pricing is roughly equivalent across all three routes, but Vertex AI and Bedrock integrate into their respective cloud billing statements, making it possible to combine them with annual contracts or reserved capacity. When using multiple routes simultaneously, configure explicit route switching via settings files or environment variables to prevent accidental requests flowing to the Anthropic direct API and incurring double billing.
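
One way to make route selection explicit is to check the switching variables before launching Claude Code. This is a minimal sketch: active_route is a hypothetical helper, it treats only the value 1 as enabled, and reporting a conflict when both variables are set is this sketch's own policy, not documented Claude Code behavior.

```shell
# Report which route the current environment would select.
active_route() {
  vertex="${CLAUDE_CODE_USE_VERTEX:-}"
  bedrock="${CLAUDE_CODE_USE_BEDROCK:-}"
  if [ "$vertex" = "1" ] && [ "$bedrock" = "1" ]; then
    echo "conflict"      # both switches set: fix the environment first
  elif [ "$vertex" = "1" ]; then
    echo "vertex"
  elif [ "$bedrock" = "1" ]; then
    echo "bedrock"
  else
    echo "anthropic"     # neither switch set: direct API or subscription login
  fi
}

echo "Active route: $(active_route)"
```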

Finally, given that Google has invested approximately $2.5 billion in Anthropic in total and signed a multi-year contract to supply up to one million of its proprietary AI chips (TPUs), Claude availability on Vertex AI is expected to continue expanding. (source) For organizations primarily operating on GCP, standardizing Claude Code via Vertex AI is the most rational path, providing access to the latest models (subject to the rollout lag noted earlier) while minimizing long-term vendor lock-in risk.

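Clauder Navi Editorial Team
@clauder_navi

Daily coverage of the latest developments and practical know-how around Anthropic's Claude and Claude Code, for engineers in Japan. For our editorial policy, see About This Media.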