Can You Run Claude Locally? | Closed Weights and Alternatives
"I want Claude to run entirely on my own computer." "I need local execution for privacy or offline use." — Many people search run claude locally with exactly these goals in mind. The short answer: you cannot run the Claude model itself locally. That said, a practical middle ground does exist: you can run Claude Code — which is a locally operating tool — and point it at a different model running on your own machine. This article explains why fully local execution is impossible, what actually can run locally, and which alternatives best fit your needs.
Claude's model weights are not publicly released, so — like ChatGPT — you cannot run inference entirely locally or offline. The model is only accessible through Anthropic's cloud API.
That said, the terminal tool Claude Code runs on your machine. The tool is local; the model doing the thinking is in the cloud. Keeping this distinction clear prevents most of the confusion.
If you truly need everything local, two realistic options exist: (1) redirect Claude Code to a local LLM (LM Studio / Ollama), or (2) use an open-weight model as a substitute. The first approach uses a bridge like LiteLLM to translate Anthropic-format requests to a locally hosted OpenAI-compatible server.
目次 (8)
- The Bottom Line: Claude's Model Cannot Run Locally
- Why Doesn't Anthropic Open-Source Claude?
- Claude Code, However, Does Run Locally
- Workaround 1: Point Claude Code at a Local LLM
- Workaround 2: Substitute an Open-Weight Model
- Official Self-Hosting Options Cover Only the Execution Environment
- By Use Case: What Do You Actually Need?
- Summary: The Realistic Middle Ground for Local Execution
The Bottom Line: Claude's Model Cannot Run Locally
Let's be direct. Anthropic does not distribute the model weights for Claude (Opus / Sonnet / Haiku). This means you cannot download a weights file and run inference on your own GPU the way you can with Llama or Qwen. The only way to use Claude is through Anthropic's cloud API or via partner clouds such as AWS Bedrock or Google Cloud Vertex AI.
In other words, if "running Claude locally" means making it fully self-contained and offline, the answer is still No as of 2026. Multiple sources confirm this (Can You Run Claude Locally in 2026?). In cases where Claude appears to be "running locally," actual inference is still happening in the cloud — your machine is merely relaying input and output.
Why Doesn't Anthropic Open-Source Claude?
You might wonder why such a capable model isn't released openly. There are two main reasons.
- The business model depends on API access: Anthropic's primary revenue stream is charging for access to the cloud API. Releasing the weights would instantly commoditize their core product.
- Protecting safety-aligned training: Claude is trained with embedded value guidelines (Constitutional AI). Publishing the weights would make it trivial for third parties to fine-tune those behaviors away, undermining the safety design.
Given both factors, an open-weight release of Claude in the near future is unlikely.
Claude Code, However, Does Run Locally
Here is where many people get confused. The model does not run locally, but Anthropic's official terminal tool Claude Code does run locally on your machine.
Once installed, Claude Code launches from the shell with the claude command, reading and writing local files as it works. The "thinking" portion, however, queries the Claude model in the cloud (official setup documentation).
To summarize:
- Runs locally: Claude Code itself, file operations, command execution
- Lives in the cloud: Claude model inference (reasoning and generation)
When people say they "installed Claude locally," what they actually experienced is this local operation of the tool — not local execution of the model.
Workaround 1: Point Claude Code at a Local LLM
If you genuinely cannot allow traffic to leave your network, or you want to eliminate API costs, the most practical approach is to redirect Claude Code's connection to a locally running model. Claude Code lets you override the endpoint URL and authentication token via environment variables, allowing you to point it at a locally hosted OpenAI-compatible server (Claude Code: Self host model configuration, How to run Claude Code against a free local model).
The basic setup looks like this:
- Start a local model: Load an open-weight model such as Qwen into LM Studio or Ollama and start the server.
- Set up a bridge: Launch LiteLLM as a proxy. LiteLLM translates the Anthropic-format requests Claude Code sends into the OpenAI-compatible format your local server expects.
- Redirect the connection: Set the environment variable
ANTHROPIC_API_KEY=<dummy string (e.g., fake-key)>and pointANTHROPIC_BASE_URLto your LiteLLM proxy URL. - Start Claude Code and verify: Launch
claudeand confirm that responses are coming from your local model.
With this setup, all inference runs on your own machine and no code is sent externally. Keep in mind that since you are not running the actual Claude model, complex reasoning and nuanced instruction-following will fall short of genuine Claude. For simple edits and routine automation tasks, however, this setup is practically viable. Set your expectations accordingly.
Note that official support for local LLMs has been requested from Anthropic, and the current approach remains an unofficial bridging configuration (Feature Request: Support for Self-Hosted LLMs · Issue #7178).
Workaround 2: Substitute an Open-Weight Model
If you are not attached to Claude specifically and simply want AI running locally, using an open-weight model from the start is the straightforward path.
- Ollama + Open WebUI: Run a model locally and access it through a browser with a ChatGPT-style interface. With a carefully crafted system prompt, some users report achieving roughly 80% of Claude's perceived usability.
- LM Studio: A GUI-based option that handles everything from model download to startup, making it accessible even for beginners.
Where open-weight models tend to fall short is in complex reasoning accuracy and the precision with which they decline inappropriate requests — areas where proprietary training retains an advantage. The more routine and well-defined your use case, the more satisfied you will be with a local alternative.
Official Self-Hosting Options Cover Only the Execution Environment
You may have heard that Anthropic offers an official self-hosting option. This does not mean self-hosting the model. What Anthropic provides is the ability to host the sandbox (execution environment) for running code on your own infrastructure — inference itself still runs on Claude in the cloud (Self-hosted sandboxes — Claude API Docs).
The same pattern holds: execution is local, thinking is in the cloud.
By Use Case: What Do You Actually Need?
The desire to "run it locally" often masks a more specific underlying goal. Identifying that goal points you to the right solution.
- Privacy (don't want code leaving your network): Use Workaround 1 to redirect to a local LLM, or substitute an open-weight model.
- Offline use: Not possible with genuine Claude. Use an open-weight model.
- Reduce costs: Switch to a free local model, or if you want genuine Claude, get the most out of a Pro / Max flat-rate plan.
- A self-contained working environment on your own machine: Claude Code's local operation already provides this.
"Genuine Claude quality" and "fully local" cannot coexist. Deciding which one takes priority is the first decision you need to make.
Summary: The Realistic Middle Ground for Local Execution
Key takeaways:
- Claude's model cannot run locally (weights are closed; cloud API only).
- Claude Code runs locally, but the model doing the thinking is in the cloud.
- If you truly need local execution, your two options are: (1) redirect Claude Code to a local LLM, or (2) substitute an open-weight model.
- Anthropic's official self-hosting covers only the execution environment — not the model itself.
Running Claude the model locally is not possible today, but if your real goal is a self-contained working environment or keeping data off external networks, that is entirely achievable. Start by determining whether quality or locality is your priority, then choose the option that matches.
WROTE — run-claude-locally