A self-hosted, Claude-compatible API endpoint powered by Qwen2.5-Coder-7B.
Choose a model based on your needs: the 7B model for quality, or the 1.5B model for speed (3x faster).
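The exact identifiers for the available models can be discovered at runtime from the `/anthropic/v1/models` route (listed in the API table below). A minimal sketch using only the Python standard library; the `x-api-key` value is arbitrary, as in the other examples here:

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://likhonsheikh-anthropic-compatible-api.hf.space"

def models_request(base_url=BASE_URL):
    """Build the GET request for the model-listing endpoint."""
    return Request(
        f"{base_url}/anthropic/v1/models",
        headers={"x-api-key": "any-key"},
    )

def list_models():
    """Fetch and return the server's model list as parsed JSON."""
    with urlopen(models_request()) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(list_models())
```

The response shape depends on the server, but following the Anthropic convention it should contain a `data` array of model objects whose `id` fields you can pass as `--model` or `model=`.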
```bash
# Set environment variables
export ANTHROPIC_API_KEY="any-key"
export ANTHROPIC_BASE_URL="https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic"

# Run Claude Code with the custom model
claude --model qwen2.5-coder-7b "Write a hello world in Python"
```
```python
import anthropic

client = anthropic.Anthropic(
    api_key="any-key",
    base_url="https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic",
)

message = client.messages.create(
    model="qwen2.5-coder-7b",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```
```bash
curl -X POST "https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: any-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "qwen2.5-coder-7b",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Health check with full status |
| GET | /health | Simple health check |
| GET | /logs | View API logs |
| GET | /queue/status | Request queue statistics |
| GET | /models/status | Loaded models info |
| POST | /anthropic/v1/messages | Anthropic Messages API |
| POST | /v1/chat/completions | OpenAI Chat API |
| GET | /anthropic/v1/models | List available models |
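The table lists an OpenAI-compatible `/v1/chat/completions` route alongside the Anthropic one, but no example uses it. A minimal sketch with only the Python standard library; the request body follows the standard OpenAI Chat Completions format, and the response-parsing path (`choices[0].message.content`) assumes an OpenAI-style response body:

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://likhonsheikh-anthropic-compatible-api.hf.space"

def build_chat_request(prompt, model="qwen2.5-coder-7b", max_tokens=256):
    """Build an OpenAI-style chat completion POST request."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(prompt):
    """Send the request and return the assistant's reply text."""
    with urlopen(build_chat_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Hello!"))
```

This route is useful for tools that only speak the OpenAI API; point their base URL at the Space and keep the same model name.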
Built with llama.cpp + FastAPI | Model: Qwen2.5-Coder-7B-Instruct (Q4_K_M)
Open source and self-hostable