API Documentation

OpenAI-compatible API gateway to top Chinese AI models — DeepSeek, Qwen, GLM, MiniMax. Use your existing SDK, just change the base URL.

Quick Start

Three steps to your first API call: Sign up → Get API key → Send request. New accounts get $0.50 in free credits.

1. Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-YOUR_API_KEY",
    base_url="https://api.tunanapi.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

2. cURL

curl https://api.tunanapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

3. Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-YOUR_API_KEY',
  baseURL: 'https://api.tunanapi.com/v1'
});

const response = await client.chat.completions.create({
  model: 'deepseek-chat',
  messages: [{ role: 'user', content: 'Hello!' }]
});

console.log(response.choices[0].message.content);

Authentication

All requests require an API key in the Authorization header:

Authorization: Bearer sk-YOUR_API_KEY

Get your key at tunanapi.com → Sign up with email → Copy API key from dashboard.

Keep your key secret. Never commit it to git or expose it in client-side code. Use environment variables: export OPENAI_API_KEY="sk-..."
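
A minimal sketch of loading the key from the environment instead of hard-coding it. The helper names `load_api_key` and `mask_key` are hypothetical; only the `OPENAI_API_KEY` variable name comes from the text above.

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running.")
    return key


def mask_key(key, visible=4):
    """Mask a key for logs, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

The result can be passed as `api_key=load_api_key()` when constructing the client. The OpenAI Python SDK also reads `OPENAI_API_KEY` on its own when `api_key` is omitted.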

Base URL & SDK Configuration

Environment	Variable	Value
OpenAI SDK	`base_url` / `OPENAI_BASE_URL`	`https://api.tunanapi.com/v1`
LangChain	`base_url`	`https://api.tunanapi.com/v1`
cURL / HTTP	—	`https://api.tunanapi.com/v1/...`

Drop-in replacement: If you're already using OpenAI's API, just change base_url and api_key. Everything else stays the same.

Models & Pricing

All prices per 1M tokens. Input = prompt tokens, Output = completion tokens. Billed per token, no minimums.

Flagship Models

Model ID	Provider	Input	Output	Context	Best For
`deepseek-chat`	DeepSeek	$0.20	$0.40	128K	General tasks, best value
`deepseek-reasoner`	DeepSeek	$2.50	$5.00	128K	Complex reasoning, math, code
`qwen3.7-max`	Qwen	$1.80	$5.40	128K	High-quality generation
`minimax-m3`	MiniMax	$0.43	$3.31	1M	Ultra-long context

Fast & Affordable

Model ID	Provider	Input	Output	Context
`qwen3.5-flash`	Qwen	$0.07	$0.22	128K
`glm-4-flash`	GLM	$0.07	$0.22	128K
`qwen3.7-plus`	Qwen	$0.58	$1.74	128K
`glm-4-plus`	GLM	$1.80	$5.40	128K

Specialized Models

Model ID	Provider	Input	Output	Context	Type
`minimax-m2.5`	MiniMax	$0.22	$1.65	197K	Long-context
`minimax-m2.7`	MiniMax	$0.29	$1.73	205K	Long-context
`qwen-coder-plus`	Qwen	$0.58	$1.74	128K	Code generation
`qwen-math-plus`	Qwen	$0.58	$1.74	128K	Math & science

Not sure which model? Start with deepseek-chat, the best-value all-rounder. Use deepseek-reasoner for hard problems, qwen3.5-flash for speed and the lowest cost, and minimax-m3 when you need a 1M-token context.
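
To make the per-1M-token pricing concrete, here is a small cost estimator. It is a sketch: the `PRICES` dictionary copies a few rows from the tables above, and `estimate_cost` is a hypothetical helper, not part of any SDK.

```python
# USD per 1M tokens, copied from the pricing tables: (input, output).
PRICES = {
    "deepseek-chat": (0.20, 0.40),
    "deepseek-reasoner": (2.50, 5.00),
    "qwen3.5-flash": (0.07, 0.22),
    "minimax-m3": (0.43, 3.31),
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / 1_000_000
```

For example, a deepseek-chat request with 8 prompt tokens and 9 completion tokens comes to about $0.0000052.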

Chat Completions

POST /v1/chat/completions

Create a model response for a conversation. Fully compatible with OpenAI's chat completions endpoint.

Request Body

Parameter	Type	Required	Default	Description
`model`	string	Yes	—	Model ID from the table above
`messages`	array	Yes	—	Array of message objects with `role` and `content`
`temperature`	number	No	1	0–2. Higher = more creative, lower = more deterministic
`max_tokens`	integer	No	auto	Maximum tokens in the completion
`stream`	boolean	No	false	Stream partial results via SSE
`top_p`	number	No	1	Nucleus sampling threshold
`stop`	string/array	No	null	Stop sequences (max 4)
`frequency_penalty`	number	No	0	-2 to 2. Penalize repeated tokens
`presence_penalty`	number	No	0	-2 to 2. Encourage new topics
`n`	integer	No	1	Number of completions to generate
`response_format`	object	No	—	JSON mode: `{"type": "json_object"}`
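
The ranges in the table can be checked on the client before a request is sent. The sketch below assumes the documented limits (0 to 2 for `temperature`, -2 to 2 for the penalties, at most 4 stop sequences); `build_chat_request` is a hypothetical helper, and the gateway's own validation may differ.

```python
# Allowed ranges taken from the parameter table.
RANGES = {
    "temperature": (0, 2),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}
MAX_STOP_SEQUENCES = 4


def build_chat_request(model, messages, **options):
    """Build a request body, rejecting values outside the documented ranges."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for name, (low, high) in RANGES.items():
        value = options.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > MAX_STOP_SEQUENCES:
        raise ValueError(f"stop accepts at most {MAX_STOP_SEQUENCES} sequences")
    body = {"model": model, "messages": messages}
    # Options left as None are omitted so the server default applies.
    body.update({k: v for k, v in options.items() if v is not None})
    return body
```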

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1718900000,
  "model": "deepseek-chat",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 9,
    "total_tokens": 17
  }
}
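
A short sketch of pulling the useful fields out of a response shaped like the one above. `summarize_completion` is a hypothetical helper that works on the parsed JSON as a plain dictionary. Treating `finish_reason == "length"` as a truncated reply follows OpenAI's convention and is assumed to carry over here.

```python
def summarize_completion(payload):
    """Extract the reply text, finish reason, and token usage from a response."""
    choice = payload["choices"][0]
    usage = payload.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # In OpenAI's API, "length" means the max_tokens limit cut the reply short.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }
```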

Example: Multi-turn conversation

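# Resend the full history on every call: earlier turns go back in as messages.
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to reverse a linked list"},
        {"role": "assistant", "content": "def reverse(head): ..."},
        {"role": "user", "content": "Now add type hints and a docstring"},
    ]
)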

Example: JSON mode

response = client.chat.completions.create(
    model="deepseek-chat",
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content": "Return a JSON object with name, age, city"}]
)
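
In JSON mode the reply still arrives as a string in `message.content`, so it has to be parsed. A defensive sketch follows; `parse_json_reply` is a hypothetical helper.

```python
import json


def parse_json_reply(content):
    """Parse a JSON-mode reply and make sure it is a JSON object."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

Typical use would be `parse_json_reply(response.choices[0].message.content)`.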

List Models

GET /v1/models

List all available models on TunanAPI.

Request

curl https://api.tunanapi.com/v1/models \
  -H "Authorization: Bearer sk-YOUR_API_KEY"

Response

{
  "object": "list",
  "data": [
    {"id": "deepseek-chat", "object": "model", "owned_by": "deepseek"},
    {"id": "qwen3.7-max", "object": "model", "owned_by": "qwen"},
    ...
  ]
}
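
As an illustration of consuming this response, the sketch below groups model IDs by their `owned_by` field. `group_models_by_owner` is a hypothetical helper that works on the parsed JSON.

```python
from collections import defaultdict


def group_models_by_owner(payload):
    """Group model IDs from a /v1/models response by their owned_by field."""
    groups = defaultdict(list)
    for model in payload.get("data", []):
        groups[model["owned_by"]].append(model["id"])
    return {owner: sorted(ids) for owner, ids in groups.items()}
```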

Embeddings

POST /v1/embeddings

Generate text embeddings for semantic search, clustering, or classification.

Request Body

Parameter	Type	Required	Description
`model`	string	Yes	`text-embedding-v3` (Qwen, 1024d) or `embedding-3` (GLM, 2048d)
`input`	string/array	Yes	Text or array of texts to embed

Example

response = client.embeddings.create(
    model="text-embedding-v3",
    input="Hello world"
)

print(response.data[0].embedding[:5])  # [0.0023, -0.0094, ...]
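
For semantic search, embeddings are usually compared with cosine similarity. This is a dependency-free sketch; `cosine_similarity` and `rank_by_similarity` are hypothetical helpers. Vectors from different models (1024d versus 2048d) cannot be compared, so the length check raises in that case.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must come from the same embedding model")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```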

Vision

POST /v1/chat/completions

Analyze images using vision models. Same endpoint — just add image content to messages.

Available Vision Models

Model ID	Context	Best For
`qwen-vl-max`	128K	High-quality image understanding
`qwen-vl-plus`	128K	Fast & affordable vision
`glm-4v`	128K	GLM vision model

Example: Image URL

response = client.chat.completions.create(
    model="qwen-vl-max",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
        ]
    }]
)

Example: Base64 image

response = client.chat.completions.create(
    model="qwen-vl-max",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,/9j/4AAQ..."}}
        ]
    }]
)
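
The base64 data URL can be built from raw image bytes with the standard library. A sketch follows; `to_data_url` and `image_message` are hypothetical helpers, and the MIME type defaults to JPEG as an assumption.

```python
import base64


def to_data_url(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a data URL for the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a user message that pairs a text prompt with an inline image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}},
        ],
    }
```

With a local file, the bytes would come from `open(path, "rb").read()`.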

Streaming

Set stream: true to receive partial results as Server-Sent Events (SSE). Essential for chat interfaces and long responses.

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a poem about the sea"}],
    stream=True
)

for chunk in stream:
    # Some chunks can arrive with an empty choices list; skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Raw SSE format (cURL)

curl https://api.tunanapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-chat","messages":[{"role":"user","content":"Hi"}],"stream":true}'

# Each chunk:
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"}}]}
data: [DONE]
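
When consuming the stream without an SDK, each `data:` line has to be parsed by hand. The sketch below assumes the chunk shape shown above and takes any iterable of text lines; `iter_sse_content` is a hypothetical helper.

```python
import json


def iter_sse_content(lines):
    """Yield the text fragments carried by a stream of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments with `"".join(...)` reconstructs the full reply.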

Fallback & Routing

Automatic Failover: TunanAPI automatically retries failed requests on backup channels. If DeepSeek is down, your request is routed to a backup channel such as Qwen, with zero config needed.

How it works

When you call a model like deepseek-chat, TunanAPI routes your request to the primary provider. If that provider returns a 5xx error or times out, the system automatically retries on a configured backup channel. You get:

Higher uptime: one provider going down doesn't break your app
Zero code changes: just call the model as usual
Transparent billing: you're billed at the rate of the model that actually served the request

Pro tip: For critical production workloads, use models available from multiple providers (e.g., deepseek-chat has backup channels) to maximize uptime.
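
The routing rule described in this section (try the primary, move to a backup on a 5xx error or timeout) can be pictured with a small sketch. It only illustrates the logic and is not TunanAPI's implementation. `UpstreamError`, `call_with_failover`, and the channel callables are all hypothetical.

```python
class UpstreamError(Exception):
    """Hypothetical error carrying the HTTP status returned by a provider."""

    def __init__(self, status):
        super().__init__(f"upstream returned {status}")
        self.status = status


def call_with_failover(channels, request):
    """Try each channel in order, primary first, moving on after 5xx or timeout."""
    if not channels:
        raise ValueError("at least one channel is required")
    last_error = None
    for channel in channels:
        try:
            return channel(request)
        except TimeoutError as exc:
            last_error = exc
        except UpstreamError as exc:
            if exc.status < 500:
                raise  # client errors such as 400 or 401 are not retried elsewhere
            last_error = exc
    raise last_error
```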

Rate Limits

Limit	Free Tier	Paid
Requests per minute (RPM)	60	60
Tokens per minute (TPM)	~500K	~500K
Max concurrent requests	5	10

Need higher limits? Contact us for custom rate limits. Enterprise plans available.

Rate limit headers

Every response includes headers to help you track your usage:

Header	Description
`X-RateLimit-Limit`	Your RPM limit
`X-RateLimit-Remaining`	Requests remaining in this window
`X-RateLimit-Reset`	Unix timestamp when the window resets
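
These headers can drive a simple client-side pause. The sketch assumes the header values arrive as strings in a plain dictionary keyed exactly as in the table; real HTTP clients usually expose case-insensitive header mappings. `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before sending the next request."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0  # budget left in the current window
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```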

Error Codes

Status	Type	Description	Action
`400`	Bad Request	Invalid parameters or malformed JSON	Check request body format
`401`	Unauthorized	Invalid or missing API key	Verify your API key
`402`	Insufficient Quota	Not enough credits	Top up at tunanapi.com
`404`	Not Found	Model not available or endpoint doesn't exist	Check model ID spelling
`429`	Rate Limited	Too many requests	Slow down, use exponential backoff
`500`	Server Error	Internal error	Retry after a moment
`503`	Unavailable	Upstream provider down	Retry; fallback may activate

Error response format

{
  "error": {
    "message": "Insufficient quota. Please top up your account.",
    "type": "insufficient_quota",
    "code": 402
  }
}
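
A sketch of turning an error response into something a caller can act on. It assumes the body follows the format above and treats 429 and 5xx as retryable, as the error table suggests. `parse_error` is a hypothetical helper.

```python
import json


def parse_error(status, body_text):
    """Summarize an error response and flag whether a retry makes sense."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = {}  # non-JSON body, for example a proxy error page
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    return {
        "status": status,
        "type": error.get("type", "unknown"),
        "message": error.get("message", body_text),
        "retryable": status == 429 or 500 <= status < 600,
    }
```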

Retry strategy

Recommended: Use exponential backoff with jitter for 429 and 5xx errors. Wait 1s → 2s → 4s between attempts, for a maximum of 3 retries.
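
A minimal sketch of this strategy: waits double from 1 second, a small random jitter is added, and the call gives up after three retries. `RetryableError`, `backoff_delays`, and `call_with_retry` are hypothetical names, and the jitter size of up to 0.5 s is an assumption.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for 429 and 5xx failures."""


def backoff_delays(max_retries=3, base=1.0, jitter=0.5, rng=random.random):
    """Yield waits of base, 2*base, 4*base, ... each plus up to `jitter` seconds."""
    for attempt in range(max_retries):
        yield base * (2 ** attempt) + rng() * jitter


def call_with_retry(func, max_retries=3, sleep=time.sleep, rng=random.random):
    """Call func, retrying on RetryableError until the delays run out."""
    delays = backoff_delays(max_retries, rng=rng)
    while True:
        try:
            return func()
        except RetryableError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted, surface the last error
            sleep(delay)
```

In real code, `func` would wrap the API call and raise `RetryableError` when the status is 429 or 5xx. The OpenAI Python SDK also has a built-in `max_retries` client option that may be enough on its own.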

Billing & Quota

How billing works

You're billed per token, per the pricing table above. There are no subscriptions or minimums — pay only for what you use.

Check your balance

# Using API
curl https://api.tunanapi.com/api/user/self \
  -H "Authorization: Bearer sk-YOUR_API_KEY"

# Response includes:
{"data": {"quota": 500000, "used_quota": 12345, ...}}

Quota is reported in units: 500,000 units = $1, so 1,000,000 units = $2. One unit corresponds to one token at a reference rate of $0.002 per 1K tokens, i.e. $0.000002.
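
Converting quota units to dollars is a single division. The sketch uses the stated rate of 1,000,000 units = $2; `quota_to_usd` is a hypothetical helper.

```python
UNITS_PER_DOLLAR = 500_000  # from "1,000,000 units = $2"


def quota_to_usd(units):
    """Convert quota units from /api/user/self into US dollars."""
    return units / UNITS_PER_DOLLAR
```

With the sample response above, `quota_to_usd(500000)` gives 1.0 and `quota_to_usd(12345)` gives roughly 0.0247.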

Top up

Visit tunanapi.com dashboard → Top Up. We accept credit cards and PayPal.

Pricing Tiers

Tier	Price	Credits	Bonus
Starter	$5	$5.00	—
Growth	$20	$21.00	+5%
Business	$50	$55.00	+10%
Enterprise	$100	$115.00	+15%

Integrations

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-YOUR_API_KEY",
    base_url="https://api.tunanapi.com/v1"
)

response = llm.invoke("Explain quantum computing in one paragraph")

LlamaIndex

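# OpenAILike (pip install llama-index-llms-openai-like) accepts non-OpenAI model names
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="deepseek-chat",
    api_key="sk-YOUR_API_KEY",
    api_base="https://api.tunanapi.com/v1",
    is_chat_model=True
)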

CrewAI

from crewai import Agent, Task, Crew
import os

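os.environ["OPENAI_API_KEY"] = "sk-YOUR_API_KEY"
os.environ["OPENAI_API_BASE"] = "https://api.tunanapi.com/v1"
os.environ["OPENAI_MODEL_NAME"] = "deepseek-chat"  # otherwise CrewAI falls back to an OpenAI model name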

agent = Agent(role="Researcher", goal="Find insights", backstory="Expert researcher")

AutoGen

import autogen

config_list = [{
    "model": "deepseek-chat",
    "api_key": "sk-YOUR_API_KEY",
    "base_url": "https://api.tunanapi.com/v1"
}]

assistant = autogen.AssistantAgent("assistant", llm_config={"config_list": config_list})

Migration from OpenAI

It takes 2 lines of code. Just change base_url and api_key. All other code stays the same.

# Before (OpenAI)
client = OpenAI(api_key="sk-openai-...")  # uses api.openai.com

# After (TunanAPI)
client = OpenAI(
    api_key="sk-YOUR_TUNANAPI_KEY",
    base_url="https://api.tunanapi.com/v1"
)

# Everything else is identical!

Model mapping

OpenAI Model	TunanAPI Equivalent	Savings
`gpt-4o`	`deepseek-chat`	~97%
`gpt-4o-mini`	`qwen3.5-flash`	~90%
`o1`	`deepseek-reasoner`	~95%
`gpt-4-turbo`	`qwen3.7-max`	~93%
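
If model names are scattered through an existing codebase, the table can be applied as a lookup during migration. A sketch follows; `MODEL_MAP` copies the table above, and `to_tunan_model` is a hypothetical helper that leaves unknown names unchanged.

```python
# OpenAI model name -> suggested equivalent, copied from the mapping table.
MODEL_MAP = {
    "gpt-4o": "deepseek-chat",
    "gpt-4o-mini": "qwen3.5-flash",
    "o1": "deepseek-reasoner",
    "gpt-4-turbo": "qwen3.7-max",
}


def to_tunan_model(model):
    """Translate an OpenAI model name, passing through anything not in the table."""
    return MODEL_MAP.get(model, model)
```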

Changelog

2026-06-14

Launched comprehensive API documentation
Added Fallback & Routing documentation
Added Vision API and Embeddings documentation
Added integration guides (LangChain, LlamaIndex, CrewAI, AutoGen)

2026-06-09

Pricing V2.1 launched: mixed gross-margin strategy, 8 models online
MiniMax M3 (1M context) added

2026-06-07

I Ching Oracle product launch at oracle.tunanapi.com
Terms of Service & Privacy Policy published