API Overview
Kouri Ai API basics and usage guide
Kouri Ai provides API services fully compatible with the OpenAI API format, and also supports the Anthropic, Gemini, and other protocols, letting you easily call dozens of flagship models including GPT, Claude, Gemini, and DeepSeek.
Endpoints
OpenAI Compatible Endpoints
| Endpoint Type | URL | Description |
|---|---|---|
| Chat Completions | https://api.kourichat.com/v1/chat/completions | Chat completion API, for most models |
| Responses | https://api.kourichat.com/v1/responses | Responses API, required for certain reasoning models |
| Standard Endpoint | https://api.kourichat.com/v1 | Recommended for SDKs |
| Base Endpoint | https://api.kourichat.com | For some applications |
Important: Some models like gpt-5.2-pro and o3-pro only support the Response API, not Chat Completions. Please choose the correct API based on model requirements.
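Endpoint selection can be automated on the client side. A minimal sketch in Python; the Responses-only set below reflects the models named above and is an assumption that may grow, so verify each model's documented requirements:

```python
BASE_URL = "https://api.kourichat.com"

# Models that only support the Responses API (per the note above).
# This set is illustrative -- check each model's requirements.
RESPONSES_ONLY_MODELS = {"gpt-5.2-pro", "o3-pro"}

def endpoint_for(model: str) -> str:
    """Return the full endpoint URL appropriate for the given model."""
    if model in RESPONSES_ONLY_MODELS:
        return f"{BASE_URL}/v1/responses"
    return f"{BASE_URL}/v1/chat/completions"
```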
Other Protocol Endpoints
| Protocol | Endpoint URL | Description |
|---|---|---|
| Anthropic Protocol | https://api.kourichat.com/v1/messages | Claude native protocol |
| Gemini Protocol | https://api.kourichat.com/v1beta | Gemini native protocol |
Model Compatibility: Both Anthropic and Gemini protocol endpoints support calling all models (not limited to Claude or Gemini), while the OpenAI protocol's Responses endpoint only supports specific reasoning models.
Authentication
All API requests require authentication via API token. You can create and manage tokens in the Console.
HTTP Header Authentication
Add the Authorization header to your request:

```
Authorization: Bearer sk-xxxxxxxxxxxxxxxx
```

Complete Request Example
```bash
curl https://api.kourichat.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Security Notice: Keep your API token safe. Never expose it in client-side code, public repositories, or logs.
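For Python users, the same request can be made with only the standard library. A minimal sketch, assuming your token is stored in a KOURI_API_KEY environment variable rather than hard-coded:

```python
import json
import os
import urllib.request

API_URL = "https://api.kourichat.com/v1/chat/completions"

def build_request(model, messages, api_key):
    """Assemble the authenticated JSON POST request (not yet sent)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def chat(model, messages):
    """Send the request and return the decoded JSON response."""
    req = build_request(model, messages, os.environ["KOURI_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Using a dedicated `build_request` step keeps the authentication logic in one place and out of application code.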
Request Format
All API requests use JSON format:
- Content-Type: application/json
- Method: POST (chat endpoints), GET (query endpoints)
Basic Request Structure
```json
{
  "model": "model-name",
  "messages": [
    {"role": "system", "content": "System prompt"},
    {"role": "user", "content": "User message"}
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "stream": false
}
```

Common Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✅ | Model name, e.g., gpt-4o, claude-sonnet-4-20250514 |
| messages | array | ✅ | Message list with roles and content |
| temperature | number | ❌ | Randomness, 0-2, default 1 |
| max_tokens | integer | ❌ | Maximum output tokens |
| stream | boolean | ❌ | Enable streaming, default false |
| top_p | number | ❌ | Nucleus sampling, 0-1 |
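Range constraints like those in the table can be checked client-side before sending, turning a 400 round trip into an immediate error. A minimal validation sketch; the ranges mirror the table above, and server-side limits remain authoritative:

```python
def validate_params(params: dict) -> dict:
    """Check required fields and documented ranges before sending."""
    for required in ("model", "messages"):
        if required not in params:
            raise ValueError(f"missing required parameter: {required}")
    temperature = params.get("temperature", 1)  # default 1 per the table
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    top_p = params.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    return params
```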
Response Format
Standard Response
```json
{
  "id": "chatcmpl-xxxxxxxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
```

Streaming Response
With stream: true, responses are returned as Server-Sent Events (SSE):
```
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"index":0}]}
data: [DONE]
```

Error Codes
| HTTP Status | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Invalid or missing token |
| 403 | Forbidden | Access denied |
| 404 | Not Found | Endpoint or model not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Service temporarily unavailable |
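Of these, 429 and the 5xx statuses are usually transient, so clients commonly retry them with exponential backoff while treating other 4xx errors as permanent. A minimal retry-decision sketch; the status classification follows the table above, while the backoff timing is an assumption to tune for your workload:

```python
import random

# Transient statuses from the table above; other 4xx codes are permanent.
RETRYABLE_STATUSES = {429, 500, 503}

def should_retry(status: int, attempt: int, max_attempts: int = 5) -> bool:
    """Retry only transient statuses, up to max_attempts tries."""
    return status in RETRYABLE_STATUSES and attempt < max_attempts

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform over [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```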
Error Response Example
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
```

Universal Model Access
Kouri Ai provides universal model access, allowing you to call all chat models through the standard ChatCompletion endpoint, including:
- OpenAI Series: GPT-4o, GPT-4, o1, o3, etc.
- Anthropic Series: Claude Sonnet, Claude Opus, etc.
- Google Series: Gemini 2.5 Flash, Gemini 2.5 Pro, etc.
- Other Models: DeepSeek, Qwen, etc.
Kouri Ai handles protocol conversion automatically. You can use /v1/chat/completions to call all models without worrying about underlying protocol differences.
Rate Limits
- Request Rate: Varies by token type and account level
- Request Timeout: 5 minutes for normal requests, longer for complex reasoning
- Max Message Length: Depends on the model's context window
For higher quotas, please contact support for enterprise plans.