A self-hosted, Claude-compatible API endpoint powered by Qwen2.5-Coder-7B.
Choose a model based on your needs: the 7B model for quality, or the 1.5B model for speed (3x faster).
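The exact identifiers for the available models can be discovered at runtime from the `/anthropic/v1/models` route (listed in the API table below). A minimal sketch using only the Python standard library; the `x-api-key` value is arbitrary, as in the other examples here:

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://likhonsheikh-anthropic-compatible-api.hf.space"

def models_request(base_url=BASE_URL):
    """Build the GET request for the model-listing endpoint."""
    return Request(
        f"{base_url}/anthropic/v1/models",
        headers={"x-api-key": "any-key"},
    )

def list_models():
    """Fetch and return the server's model list as parsed JSON."""
    with urlopen(models_request()) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(list_models())
```

The response shape depends on the server, but following the Anthropic convention it should contain a `data` array of model objects whose `id` fields you can pass as `--model` or `model=`.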
```bash
# Set environment variables
export ANTHROPIC_API_KEY="any-key"
export ANTHROPIC_BASE_URL="https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic"

# Run Claude Code with the custom model
claude --model qwen2.5-coder-7b "Write a hello world in Python"
```
```python
import anthropic

client = anthropic.Anthropic(
    api_key="any-key",
    base_url="https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic",
)

message = client.messages.create(
    model="qwen2.5-coder-7b",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
```
```bash
curl -X POST "https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: any-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "qwen2.5-coder-7b",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Health check with full status |
| GET | /health | Simple health check |
| GET | /logs | View API logs |
| GET | /queue/status | Request queue statistics |
| GET | /models/status | Loaded models info |
| POST | /anthropic/v1/messages | Anthropic Messages API |
| POST | /v1/chat/completions | OpenAI Chat API |
| GET | /anthropic/v1/models | List available models |
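The table lists an OpenAI-compatible `/v1/chat/completions` route alongside the Anthropic one, but no example uses it. A minimal sketch with only the Python standard library; the request body follows the standard OpenAI Chat Completions format, and the response-parsing path (`choices[0].message.content`) assumes an OpenAI-style response body:

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://likhonsheikh-anthropic-compatible-api.hf.space"

def build_chat_request(prompt, model="qwen2.5-coder-7b", max_tokens=256):
    """Build an OpenAI-style chat completion POST request."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(prompt):
    """Send the request and return the assistant's reply text."""
    with urlopen(build_chat_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Hello!"))
```

This route is useful for tools that only speak the OpenAI API; point their base URL at the Space and keep the same model name.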
Built with llama.cpp + FastAPI | Model: Qwen2.5-Coder-7B-Instruct (Q4_K_M)
Open source and self-hostable