How It Works
What this actually does
A developer named Alishahryar1 built a tiny proxy called Free Claude Code. It sits where Anthropic's API normally would. Claude Code thinks it's talking to Anthropic. Every message is actually routed to whatever free or local model you point the proxy at.
NVIDIA NIM, OpenRouter, Kimi, DeepSeek, Z.ai, Ollama, LM Studio, llama.cpp, and more. Ten provider backends in total. You pick.
26,000+ developers are already running it. Quality lands at roughly 80 to 90 percent of real Claude for most coding tasks. The CLI, the model picker, the keyboard flow, all identical.
The repo: github.com/Alishahryar1/free-claude-code
The Setup
The 6-step install
Plan for about 5 minutes if you already have Node and Python around. Closer to 10 if you don't.
Install Claude Code
Skip this if you already have it.
The repo specifically recommends npm (npm install -g @anthropic-ai/claude-code). The new native installer at claude.ai/install.sh works too.
Install uv + Python 3.14
The proxy is a Python tool. uv is the fastest way to manage it.
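macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
uv python install 3.14
Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
uv python install 3.14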
Grab a free NVIDIA NIM API key
Go to build.nvidia.com/settings/api-keys. Sign up, create a key, copy it. NVIDIA NIM is the easiest free starting point. You can swap to OpenRouter, DeepSeek, Z.ai, or fully local Ollama later from the same config screen.
Install the proxy
Use the install command from the repo README. Run the same command later when you want to update it.
Start the proxy and paste your key
Run fcc-server. The terminal prints an admin URL, usually http://127.0.0.1:8082/admin. Open it in your browser. Paste your NVIDIA NIM API key into the NVIDIA_NIM_API_KEY field. Click Validate, then Apply. Default model is already set to nvidia_nim/z-ai/glm4.7.
Launch Claude Code through the proxy
Open a new terminal window. Leave fcc-server running in the first one. Run fcc-claude.
Done. You're inside Claude Code, running on a free open source model.
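If you're curious what "through the proxy" means in practice, here is a minimal sketch. It assumes the launcher does little more than point the stock claude CLI at the local proxy using the two environment variables that show up in the IDE config later in this article. The helper names proxy_env and launch_claude are made up for illustration, and this is not the launcher's actual source.

```python
import os
import subprocess


def proxy_env(base_url="http://localhost:8082", token="freecc", base=None):
    """Return a copy of the environment with Claude Code pointed at the proxy.

    The URL and token defaults are the values used in the IDE config blocks
    in this article; adjust them if your proxy runs elsewhere.
    """
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = token
    return env


def launch_claude(argv=("claude",)):
    """Run the stock Claude Code CLI with the proxy environment applied.

    Assumption: the `claude` binary is on PATH and the proxy is already up.
    """
    return subprocess.run(list(argv), env=proxy_env()).returncode
```

Calling launch_claude() from a script should behave like exporting both variables and then running claude by hand.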
Model Picks
Which free models actually hold up
Not all open source models are equal for coding. These are the ones worth pointing at.
nvidia_nim/z-ai/glm4.7: Default. Solid all-around coding. What the proxy ships with. Start here unless you have a reason not to.
nvidia_nim/moonshotai/kimi-k2.5: Long files, big refactors, large contexts. Kimi's strong suit is long-context coding work where you need the model to hold a lot in working memory.
nvidia_nim/minimaxai/minimax-m2.5: Architectural reasoning. Strong on system design questions and refactor planning.
open_router/deepseek/deepseek-r1-0528:free: Reasoning-heavy tasks. R1-style thinking model. Use when you want the model to actually deliberate before writing code.
wafer/DeepSeek-V4-Pro: Pro-grade DeepSeek on a free tier. Underrated. Worth testing for general coding.
zai/glm-5.1: Newer GLM. Very capable. Z.ai's hosted endpoint. Faster than the NIM version in some workloads.
ollama/llama3.1: Full privacy. Local only. Runs on your machine via Ollama. Needs decent hardware (16 GB+ RAM minimum, GPU strongly preferred).
Change the active model anytime from the Admin UI. No restart needed.
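Every ID in the list above reads as a provider prefix followed by that provider's own model name. A minimal sketch of pulling the two apart, assuming the first path segment is always the provider (inferred from the IDs in this article, not confirmed against the proxy's source). split_model_id is a hypothetical helper.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' at the first slash.

    Everything after the first slash is kept intact, because the provider's
    own model name may itself contain slashes or a ':free' suffix.
    """
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


# Example: the default model from the Admin UI
provider, model = split_model_id("nvidia_nim/z-ai/glm4.7")
```

Handy if you script against the config and want to group your models by provider.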
Pro Move #1
Mix providers per model tier
Claude Code has three model tiers: Opus, Sonnet, Haiku. The proxy lets you route each tier to a different provider.
In the Admin UI, set each tier independently:
MODEL_OPUS → nvidia_nim/moonshotai/kimi-k2.5
MODEL_SONNET → open_router/deepseek/deepseek-r1-0528:free
MODEL_HAIKU → ollama/llama3.1
MODEL → zai/glm-5.1
Now your Claude Code session quietly picks the best free model for whatever it's doing. Heavy lifting on Kimi. Reasoning on DeepSeek. Fast local stuff on Ollama. Everything else on GLM 5.1.
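To make the tier idea concrete, here is a rough sketch of how a proxy could map the model name Claude Code requests onto those four settings. The substring matching is an assumption for illustration; the real proxy may decide differently. resolve_backend is a hypothetical name, and the Claude model names in the usage lines are just examples.

```python
# Tier settings copied from the example configuration above.
TIER_MODELS = {
    "opus": "nvidia_nim/moonshotai/kimi-k2.5",
    "sonnet": "open_router/deepseek/deepseek-r1-0528:free",
    "haiku": "ollama/llama3.1",
}
DEFAULT_MODEL = "zai/glm-5.1"  # the plain MODEL setting


def resolve_backend(requested_model: str) -> str:
    """Pick a backend model for the Claude model name in the request.

    Assumption: the tier can be read off the requested name by substring.
    Anything that matches no tier falls back to DEFAULT_MODEL.
    """
    name = requested_model.lower()
    for tier, backend in TIER_MODELS.items():
        if tier in name:
            return backend
    return DEFAULT_MODEL


print(resolve_backend("claude-3-5-haiku-20241022"))
print(resolve_backend("some-unknown-model"))
```

The fallback matters: background requests that don't name a tier still need somewhere to go.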
Pro Move #2
Turn on the model picker
The proxy can populate Claude Code's native /model picker with every free model on your gateway. It already exposes a /v1/models endpoint. You just need Claude Code to ask for it.
Set this env var by exporting it in your shell:
Add to your shell profile
export CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY=1
Now hit /model inside Claude Code. You'll see the full list of free models you can swap to mid-session. Workflow upgrade.
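You can also peek at the list yourself before opening Claude Code. The sketch below hits the proxy's /v1/models endpoint with the standard library. Two assumptions: the response is shaped like {"data": [{"id": ...}]}, and the proxy accepts the freecc token as a bearer header. Neither is confirmed here, so adjust to whatever your proxy actually returns.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs from an assumed {"data": [{"id": ...}]} payload."""
    return [item["id"] for item in payload.get("data", []) if "id" in item]


def fetch_model_ids(base_url="http://localhost:8082", token="freecc", timeout=5.0):
    """Ask the running proxy which models it exposes."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/models",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return model_ids(json.load(response))
```

With fcc-server running, print(fetch_model_ids()) should list roughly what the /model picker will show.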
IDE Integration
Works with VS Code and JetBrains too
VS Code
Open Settings. Search for claudeCode.environmentVariables. Choose "Edit in settings.json". Add this block:
settings.json
"claudeCode.environmentVariables": [
  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" },
  { "name": "CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY", "value": "1" },
  { "name": "CLAUDE_CODE_AUTO_COMPACT_WINDOW", "value": "190000" }
]
Reload the extension. Done.
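The JetBrains section that follows uses the same four variables, but as a plain object instead of a list of name/value pairs. If you keep both IDEs in sync, a tiny converter saves retyping. This is only a sketch; to_env_object is a hypothetical helper, not part of either extension.

```python
import json

# The VS Code entries from the settings.json block above.
VSCODE_ENTRIES = [
    {"name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082"},
    {"name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc"},
    {"name": "CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY", "value": "1"},
    {"name": "CLAUDE_CODE_AUTO_COMPACT_WINDOW", "value": "190000"},
]


def to_env_object(entries):
    """Turn a list of name/value entries into a flat name -> value object."""
    return {entry["name"]: entry["value"] for entry in entries}


# Print the object form, ready to paste under an "env" key.
print(json.dumps({"env": to_env_object(VSCODE_ENTRIES)}, indent=2))
```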
JetBrains
Edit the Claude ACP config file:
- macOS / Linux: ~/.jetbrains/acp.json
- Windows: %APPDATA%\JetBrains\acp-agents\installed.json
Under the acp.registry.claude-acp block, set the four env vars. Restart the IDE.
JetBrains ACP env block
"env": {
  "ANTHROPIC_BASE_URL": "http://localhost:8082",
  "ANTHROPIC_AUTH_TOKEN": "freecc",
  "CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY": "1",
  "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "190000"
}
Bonus
Code from Discord, Telegram, or with voice notes
This is the part most people miss. The proxy ships with an optional Discord and Telegram bot wrapper. It runs Claude Code sessions on your machine and lets you text them from your phone.
Discord setup
- Create a bot in the Discord Developer Portal. Enable Message Content Intent.
- Invite the bot to your server with read, send, and history permissions.
- In the Admin UI go to Messaging. Set platform to discord. Paste your bot token plus channel ID. Set Allowed Directory to the project folder the bot can edit.
- Click Validate, then Apply.
Useful commands: /stop cancels a task. /clear resets sessions. /stats shows session state.
Voice notes
Install the voice extras alongside the proxy:
Local Whisper (CPU or CUDA)
Or NVIDIA NIM transcription
Restart fcc-server. Enable Voice Notes in Admin UI → Messaging. Send a voice memo from Telegram and Claude Code writes the code.
Honest Tradeoffs
Where this wins and where it doesn't
Quality is roughly 80 to 90 percent of real Claude. For client work or anything mission-critical, real Claude Code is still worth paying for. For personal projects, learning, side hacks, throwaway scripts, this is a no-brainer.
Free tiers have rate limits. Fine for solo dev work. Not for production traffic. If you hit limits, swap models in the Admin UI.
It's community-built. Not Anthropic-supported. The proxy is open source. You can read every line.
Tool-use support varies. Heavy agentic workflows want GLM 4.7+, Kimi K2.5, or DeepSeek V4-Pro. Smaller local models drop tool calls. Don't run those for anything that depends on edits + bash chains.
Troubleshooting
When things break
fcc-server doesn't start
Make sure Python 3.14 is installed. Run uv python list to confirm. Re-run the proxy install command.
Admin UI won't load
Check the port in the startup logs. Usually 8082, but it shifts if that port is taken.
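If you'd rather not scroll through logs, a quick probe tells you whether anything is listening on the port at all. A minimal sketch using only the standard library; port_is_open is a hypothetical helper, and 8082 is just the usual default mentioned above.

```python
import socket


def port_is_open(host="127.0.0.1", port=8082, timeout=1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A True result only proves that some process owns the port. It doesn't prove that process is the proxy, so still check the startup logs if the Admin UI stays blank.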
fcc-claude says "unauthorized"
Open the Admin UI. Click Apply once more. The auth token sometimes needs to be re-saved after a config change.
Model returns errors mid-session
You most likely hit the free tier's rate limit. Swap models in the Admin UI or wait a few minutes.
Model picker isn't showing gateway models
Confirm CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY=1 is in your environment when you launch Claude Code.
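One easy way to trip over this: the variable is set in one terminal but not in the one where Claude Code was started. A small sketch you can run from the same shell to check what a child process would inherit. discovery_enabled is a hypothetical helper, and treating only the value 1 as "on" follows the setting shown in this article.

```python
import os

FLAG = "CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY"


def discovery_enabled(env=None) -> bool:
    """Return True if the discovery flag is set to "1" in the environment."""
    env = os.environ if env is None else env
    return env.get(FLAG, "").strip() == "1"


print(f"{FLAG} enabled: {discovery_enabled()}")
```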
Tool calls failing on a local model
Some local models, especially smaller Llamas, don't properly support tool use. Switch to GLM 4.7 or Kimi K2.5.