OpenClaude
OpenClaude is an open-source coding-agent CLI for cloud and local model providers.
Use OpenAI-compatible APIs, Gemini, GitHub Models, Codex, Ollama, Atomic Chat, and other supported backends while keeping one terminal-first workflow: prompts, tools, agents, MCP, slash commands, and streaming output.
Quick Start | Setup Guides | Providers | Source Build | VS Code Extension | Community
Why OpenClaude
- Use one CLI across cloud APIs and local model backends
- Save provider profiles inside the app with /provider
- Run with OpenAI-compatible services, Gemini, GitHub Models, Codex, Ollama, Atomic Chat, and other supported providers
- Keep coding-agent workflows in one place: bash, file tools, grep, glob, agents, tasks, MCP, and web tools
- Use the bundled VS Code extension for launch integration and theme support
Quick Start
Install
npm install -g @gitlawb/openclaude
If the install later reports ripgrep not found, install ripgrep system-wide and confirm rg --version works in the same terminal before starting OpenClaude.
Start
openclaude
Inside OpenClaude:
- run /provider for guided provider setup and saved profiles
- run /onboard-github for GitHub Models onboarding
Fastest OpenAI setup
macOS / Linux:
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_API_KEY=sk-your-key-here
export OPENAI_MODEL=gpt-4o
openclaude
Windows PowerShell:
$env:CLAUDE_CODE_USE_OPENAI="1"
$env:OPENAI_API_KEY="sk-your-key-here"
$env:OPENAI_MODEL="gpt-4o"
openclaude
Fastest local Ollama setup
macOS / Linux:
export CLAUDE_CODE_USE_OPENAI=1
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_MODEL=qwen2.5-coder:7b
openclaude
Windows PowerShell:
$env:CLAUDE_CODE_USE_OPENAI="1"
$env:OPENAI_BASE_URL="http://localhost:11434/v1"
$env:OPENAI_MODEL="qwen2.5-coder:7b"
openclaude
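If the model is not available locally yet, pull it first (assuming Ollama is already installed and running):

ollama pull qwen2.5-coder:7b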
Setup Guides
Beginner-friendly guides:
Advanced and source-build guides:
Supported Providers
| Provider | Setup Path | Notes |
|---|---|---|
| OpenAI-compatible | /provider or env vars | Works with OpenAI, OpenRouter, DeepSeek, Groq, Mistral, LM Studio, and other compatible /v1 servers |
| Gemini | /provider or env vars | Supports API key, access token, or local ADC workflow on current main |
| GitHub Models | /onboard-github | Interactive onboarding with saved credentials |
| Codex | /provider | Uses existing Codex credentials when available |
| Ollama | /provider or env vars | Local inference with no API key |
| Atomic Chat | advanced setup | Local Apple Silicon backend |
| Bedrock / Vertex / Foundry | env vars | Additional provider integrations for supported environments |
What Works
- Tool-driven coding workflows: Bash, file read/write/edit, grep, glob, agents, tasks, MCP, and slash commands
- Streaming responses: Real-time token output and tool progress
- Tool calling: Multi-step tool loops with model calls, tool execution, and follow-up responses
- Images: URL and base64 image inputs for providers that support vision
- Provider profiles: Guided setup plus saved .openclaude-profile.json support
- Local and remote model backends: Cloud APIs, local servers, and Apple Silicon local inference
Provider Notes
OpenClaude supports multiple providers, but behavior is not identical across all of them.
- Anthropic-specific features may not exist on other providers
- Tool quality depends heavily on the selected model
- Smaller local models can struggle with long multi-step tool flows
- Some providers impose lower output caps than the CLI defaults, and OpenClaude adapts where possible
For best results, use models with strong tool/function calling support.
Agent Routing
OpenClaude can route different agents to different models through settings-based routing. This is useful for cost optimization or splitting work by model strength.
Add to ~/.claude/settings.json:
{
"agentModels": {
"deepseek-chat": {
"base_url": "https://api.deepseek.com/v1",
"api_key": "sk-your-key"
},
"gpt-4o": {
"base_url": "https://api.openai.com/v1",
"api_key": "sk-your-key"
}
},
"agentRouting": {
"Explore": "deepseek-chat",
"Plan": "gpt-4o",
"general-purpose": "gpt-4o",
"frontend-dev": "deepseek-chat",
"default": "gpt-4o"
}
}
When no routing match is found, the global provider remains the fallback.
Note: api_key values in settings.json are stored in plaintext. Keep this file private and do not commit it to version control.
Web Search and Fetch
By default, WebSearch works on non-Anthropic models using DuckDuckGo. This gives GPT-4o, DeepSeek, Gemini, Ollama, and other OpenAI-compatible providers a free web search path out of the box.
Note: the DuckDuckGo fallback works by scraping search results and may be rate-limited, blocked, or subject to DuckDuckGo's Terms of Service. If you want a more reliable, supported option, configure Firecrawl.
For Anthropic-native backends and Codex responses, OpenClaude keeps the native provider web search behavior.
WebFetch works, but its basic HTTP plus HTML-to-markdown path can still fail on JavaScript-rendered sites or sites that block plain HTTP requests.
Set a Firecrawl API key if you want Firecrawl-powered search/fetch behavior:
export FIRECRAWL_API_KEY=your-key-here
With Firecrawl enabled:
- WebSearch can use Firecrawl's search API while DuckDuckGo remains the default free path for non-Claude models
- WebFetch uses Firecrawl's scrape endpoint instead of raw HTTP, handling JS-rendered pages correctly
The free tier at firecrawl.dev includes 500 credits. The key is optional.
Headless gRPC Server
OpenClaude can be run as a headless gRPC service, allowing you to integrate its agentic capabilities (tools, bash, file editing) into other applications, CI/CD pipelines, or custom user interfaces. The server uses bidirectional streaming to send real-time text chunks, tool calls, and request permissions for sensitive commands.
1. Start the gRPC Server
Start the core engine as a gRPC service on localhost:50051:
npm run dev:grpc
Configuration
| Variable | Default | Description |
|---|---|---|
| GRPC_PORT | 50051 | Port the gRPC server listens on |
| GRPC_HOST | localhost | Bind address. Use 0.0.0.0 to expose on all interfaces (not recommended without authentication) |
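For example, on macOS or Linux you can override both for a single run by prefixing the start command:

GRPC_PORT=50052 GRPC_HOST=127.0.0.1 npm run dev:grpc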
2. Run the Test CLI Client
We provide a lightweight CLI client that communicates exclusively over gRPC. It acts just like the main interactive CLI, rendering colors, streaming tokens, and prompting you for tool permissions (y/n) via the gRPC action_required event.
In a separate terminal, run:
npm run dev:grpc:cli
Note: The gRPC definitions are located in src/proto/openclaude.proto. You can use this file to generate clients in Python, Go, Rust, or any other language.
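As a starting point for your own integration, here is a minimal TypeScript client sketch using @grpc/grpc-js and @grpc/proto-loader. The package, service, method, and field names below (openclaude.OpenClaude, Chat, chat_request, prompt) are illustrative assumptions; confirm the real definitions in src/proto/openclaude.proto before relying on them.

import * as grpc from "@grpc/grpc-js";
import * as protoLoader from "@grpc/proto-loader";

// Load the service definition directly from the repo's proto file.
const packageDefinition = protoLoader.loadSync("src/proto/openclaude.proto", {
  keepCase: true,
  defaults: true,
  oneofs: true,
});
const loaded = grpc.loadPackageDefinition(packageDefinition) as any;

// Hypothetical package/service path; check openclaude.proto for the real one.
const client = new loaded.openclaude.OpenClaude(
  "localhost:50051",
  grpc.credentials.createInsecure()
);

// Open the bidirectional stream (method name assumed here to be Chat).
const call = client.Chat();

call.on("data", (msg: any) => {
  if (msg.text_chunk) {
    // Streamed token output.
    process.stdout.write(msg.text_chunk.text ?? "");
  } else if (msg.done) {
    // done.full_text is the fallback when no text chunks were streamed.
    console.log("\n[done]", msg.done.full_text ?? "");
    call.end();
  }
});
call.on("error", (err: Error) => console.error("stream error:", err.message));

// Send a ChatRequest; reusing the same session_id on a later stream
// resumes the conversation context.
call.write({
  chat_request: {
    prompt: "Summarize this repository in one sentence",
    session_id: "demo-session-1",
  },
});

A complete client would also need to answer action_required permission prompts before sensitive tools run, as the bundled test CLI does.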
Source Build And Local Development
bun install
bun run build
node dist/cli.mjs
Helpful commands:
- bun run dev
- bun test
- bun run test:coverage
- bun run security:pr-scan -- --base origin/main
- bun run smoke
- bun run doctor:runtime
- bun run verify:privacy
- focused bun test ... runs for the areas you touch
Testing And Coverage
OpenClaude uses Bun's built-in test runner for unit tests.
Run the full unit suite:
bun test
Generate unit test coverage:
bun run test:coverage
Open the visual coverage report:
open coverage/index.html
If you already have coverage/lcov.info and only want to rebuild the UI:
bun run test:coverage:ui
Use focused test runs when you only touch one area:
- bun run test:provider
- bun run test:provider-recommendation
- bun test path/to/file.test.ts
Recommended contributor validation before opening a PR:
- bun run build
- bun run smoke
- bun run test:coverage for broader unit coverage when your change affects shared runtime or provider logic
- focused bun test ... runs for the files and flows you changed
Coverage output is written to coverage/lcov.info, and OpenClaude also generates a git-activity-style heatmap at coverage/index.html.
Repository Structure
- src/ - core CLI/runtime
- scripts/ - build, verification, and maintenance scripts
- docs/ - setup, contributor, and project documentation
- python/ - standalone Python helpers and their tests
- vscode-extension/openclaude-vscode/ - VS Code extension
- .github/ - repo automation, templates, and CI configuration
- bin/ - CLI launcher entrypoints
VS Code Extension
The repo includes a VS Code extension in vscode-extension/openclaude-vscode for OpenClaude launch integration, provider-aware control-center UI, and theme support.
Security
If you believe you found a security issue, see SECURITY.md.
Community
- Use GitHub Discussions for Q&A, ideas, and community conversation
- Use GitHub Issues for confirmed bugs and actionable feature work
Contributing
Contributions are welcome.
For larger changes, open an issue first so the scope is clear before implementation. Helpful validation commands include:
- bun run build
- bun run test:coverage
- bun run smoke
- focused bun test ... runs for touched areas
Disclaimer
OpenClaude is an independent community project and is not affiliated with, endorsed by, or sponsored by Anthropic.
OpenClaude originated from the Claude Code codebase and has since been substantially modified to support multiple providers and open use. "Claude" and "Claude Code" are trademarks of Anthropic PBC. See LICENSE for details.
License
See LICENSE.