API Documentation

OpenAI-compatible API gateway to top Chinese AI models — DeepSeek, Qwen, GLM, MiniMax. Use your existing SDK, just change the base URL.

Quick Start

Three steps to your first API call: Sign up → Get API key → Send request. New accounts get $0.50 in free credits.

1. Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-YOUR_API_KEY",
    base_url="https://api.tunanapi.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

2. cURL

curl https://api.tunanapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

3. Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-YOUR_API_KEY',
  baseURL: 'https://api.tunanapi.com/v1'
});

const response = await client.chat.completions.create({
  model: 'deepseek-chat',
  messages: [{ role: 'user', content: 'Hello!' }]
});

console.log(response.choices[0].message.content);

Authentication

All requests require an API key in the Authorization header:

Authorization: Bearer sk-YOUR_API_KEY

Get your key at tunanapi.com → Sign up with email → Copy API key from dashboard.

Keep your key secret. Never commit it to git or expose it in client-side code. Use environment variables: export OPENAI_API_KEY="sk-..."
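
A minimal sketch of reading the key from the environment instead of hard-coding it. The helper name load_api_key is hypothetical; only the OPENAI_API_KEY variable name comes from the text above.

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable.

    Raises RuntimeError when the variable is unset or empty, so a
    missing key fails fast instead of surfacing later as a 401.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key
```

The returned value can then be passed as api_key=load_api_key() when constructing the client.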

Base URL & SDK Configuration

| Environment | Variable | Value |
| --- | --- | --- |
| OpenAI SDK | base_url / OPENAI_BASE_URL | https://api.tunanapi.com/v1 |
| LangChain | base_url | https://api.tunanapi.com/v1 |
| cURL / HTTP | - | https://api.tunanapi.com/v1/... |

Drop-in replacement: If you're already using OpenAI's API, just change base_url and api_key. Everything else stays the same.

Models & Pricing

All prices per 1M tokens. Input = prompt tokens, Output = completion tokens. Billed per token, no minimums.

Flagship Models

| Model ID | Provider | Input | Output | Context | Best For |
| --- | --- | --- | --- | --- | --- |
| deepseek-chat | DeepSeek | $0.20 | $0.40 | 128K | General tasks, best value |
| deepseek-reasoner | DeepSeek | $2.50 | $5.00 | 128K | Complex reasoning, math, code |
| qwen3.7-max | Qwen | $1.80 | $5.40 | 128K | High-quality generation |
| minimax-m3 | MiniMax | $0.43 | $3.31 | 1M | Ultra-long context |

Fast & Affordable

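| Model ID | Provider | Input | Output | Context |
| --- | --- | --- | --- | --- |
| qwen3.5-flash | Qwen | $0.07 | $0.22 | 128K |
| glm-4-flash | GLM | $0.07 | $0.22 | 128K |
| qwen3.7-plus | Qwen | $0.58 | $1.74 | 128K |
| glm-4-plus | GLM | $1.80 | $5.40 | 128K |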

Specialized Models

| Model ID | Provider | Input | Output | Context | Type |
| --- | --- | --- | --- | --- | --- |
| minimax-m2.5 | MiniMax | $0.22 | $1.65 | 197K | Long-context |
| minimax-m2.7 | MiniMax | $0.29 | $1.73 | 205K | Long-context |
| qwen-coder-plus | Qwen | $0.58 | $1.74 | 128K | Code generation |
| qwen-math-plus | Qwen | $0.58 | $1.74 | 128K | Math & science |

Not sure which model? Start with deepseek-chat: it's the best all-rounder at the lowest price. Use deepseek-reasoner for hard problems, qwen3.5-flash for speed, and minimax-m3 when you need 1M token context.
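
To make the per-token billing concrete, here is a rough cost estimator. It is a sketch, not an official billing formula: the PRICES dictionary copies a subset of the tables above, and the function name estimate_cost is hypothetical.

```python
# Prices in USD per 1M tokens (input, output), copied from the tables above.
PRICES = {
    "deepseek-chat": (0.20, 0.40),
    "deepseek-reasoner": (2.50, 5.00),
    "qwen3.5-flash": (0.07, 0.22),
    "minimax-m3": (0.43, 3.31),
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / 1_000_000
```

The token counts would normally come from the usage field of a response.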

Chat Completions

POST /v1/chat/completions

Create a model response for a conversation. Fully compatible with OpenAI's chat completions endpoint.

Request Body

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | - | Model ID from the table above |
| messages | array | Yes | - | Array of message objects with role and content |
| temperature | number | No | 1 | 0 to 2. Higher = more creative, lower = more deterministic |
| max_tokens | integer | No | auto | Maximum tokens in the completion |
| stream | boolean | No | false | Stream partial results via SSE |
| top_p | number | No | 1 | Nucleus sampling threshold |
| stop | string/array | No | null | Stop sequences (max 4) |
| frequency_penalty | number | No | 0 | -2 to 2. Penalize repeated tokens |
| presence_penalty | number | No | 0 | -2 to 2. Encourage new topics |
| n | integer | No | 1 | Number of completions to generate |
| response_format | object | No | - | JSON mode: {"type": "json_object"} |
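
The ranges in the table can be checked on the client before a request is sent. The sketch below is only a convenience check under that assumption; the server remains the authority on what it accepts, and validate_chat_params is a hypothetical helper name.

```python
def validate_chat_params(params):
    """Return a list of problems found in a chat-completions request body."""
    problems = []
    for required in ("model", "messages"):
        if required not in params:
            problems.append(f"missing required field: {required}")
    # Defaults below mirror the Default column of the parameter table.
    if not 0 <= params.get("temperature", 1) <= 2:
        problems.append("temperature must be between 0 and 2")
    for name in ("frequency_penalty", "presence_penalty"):
        if not -2 <= params.get(name, 0) <= 2:
            problems.append(f"{name} must be between -2 and 2")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    return problems
```

An empty list means the body passed these checks.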

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1718900000,
  "model": "deepseek-chat",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 9,
    "total_tokens": 17
  }
}

Example: Multi-turn conversation

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to reverse a linked list"},
    ]
)
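
To continue the conversation for further turns, the usual pattern with chat-style APIs is to append the assistant's reply and the next user message to the same list and send it again. A small sketch of that bookkeeping; add_turn is a hypothetical helper and the reply text is a placeholder.

```python
def add_turn(history, assistant_reply, next_user_message):
    """Return a new message list extended with one assistant/user exchange."""
    return history + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


history = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to reverse a linked list"},
]
# In real use, the reply would be response.choices[0].message.content.
history = add_turn(history, "def reverse(head): ...", "Now add type hints.")
```

The extended history is then passed as messages in the next create call.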

Example: JSON mode

response = client.chat.completions.create(
    model="deepseek-chat",
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content": "Return a JSON object with name, age, city"}]
)
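
In JSON mode the message content is still a string, so it has to be parsed before use. A minimal sketch; parse_json_reply is a hypothetical helper and the sample string stands in for a real reply.

```python
import json


def parse_json_reply(content):
    """Parse the message content of a JSON-mode reply into a dict.

    Raises ValueError if the content is not a JSON object.
    """
    data = json.loads(content)  # json.JSONDecodeError is a ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


# Hypothetical content; a real value comes from response.choices[0].message.content.
sample = '{"name": "Ada", "age": 36, "city": "London"}'
person = parse_json_reply(sample)
```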

List Models

GET /v1/models

List all available models on TunanAPI.

Request

curl https://api.tunanapi.com/v1/models \
  -H "Authorization: Bearer sk-YOUR_API_KEY"

Response

{
  "object": "list",
  "data": [
    {"id": "deepseek-chat", "object": "model", "owned_by": "deepseek"},
    {"id": "qwen3.7-max", "object": "model", "owned_by": "qwen"},
    ...
  ]
}
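
Once the response is parsed into a dictionary, the model IDs can be pulled out with a short helper. This is a sketch that assumes the response shape shown above; model_ids is a hypothetical name.

```python
def model_ids(listing, owned_by=None):
    """Return model IDs from a /v1/models response, optionally filtered by owner."""
    return [
        model["id"]
        for model in listing.get("data", [])
        if owned_by is None or model.get("owned_by") == owned_by
    ]
```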

Embeddings

POST /v1/embeddings

Generate text embeddings for semantic search, clustering, or classification.

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | text-embedding-v3 (Qwen, 1024d) or embedding-3 (GLM, 2048d) |
| input | string/array | Yes | Text or array of texts to embed |

Example

response = client.embeddings.create(
    model="text-embedding-v3",
    input="Hello world"
)

print(response.data[0].embedding[:5])  # [0.0023, -0.0094, ...]
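
For semantic search, embeddings are typically compared with cosine similarity. The function below is a plain-Python sketch of that comparison; it is a general technique, not something specific to this API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Both vectors must come from the same embedding model, since the two models listed above have different dimensions.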

Vision

POST /v1/chat/completions

Analyze images using vision models. Same endpoint — just add image content to messages.

Available Vision Models

| Model ID | Context | Best For |
| --- | --- | --- |
| qwen-vl-max | 128K | High-quality image understanding |
| qwen-vl-plus | 128K | Fast & affordable vision |
| glm-4v | 128K | GLM vision model |

Example: Image URL

response = client.chat.completions.create(
    model="qwen-vl-max",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
        ]
    }]
)

Example: Base64 image

response = client.chat.completions.create(
    model="qwen-vl-max",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,/9j/4AAQ..."}}
        ]
    }]
)
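
The data URL in the example above can be built from raw image bytes with the standard library. A minimal sketch; to_data_url is a hypothetical helper name.

```python
import base64


def to_data_url(image_bytes, mime_type="image/jpeg"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For a local file, read it in binary mode (open(path, "rb").read()) and pass the bytes to the helper.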

Streaming

Set stream: true to receive partial results as Server-Sent Events (SSE). Essential for chat interfaces and long responses.

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a poem about the sea"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Raw SSE format (cURL)

curl https://api.tunanapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-chat","messages":[{"role":"user","content":"Hi"}],"stream":true}'

# Each chunk:
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"}}]}
data: [DONE]
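
When consuming the raw stream without an SDK, each data: line has to be parsed by hand. The sketch below assumes the chunk shape shown above and stops at the [DONE] marker; iter_sse_content is a hypothetical helper name.

```python
import json


def iter_sse_content(lines):
    """Yield content fragments from raw SSE lines until the [DONE] marker."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:
                yield fragment
```

Joining the yielded fragments reproduces the full completion text.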

Fallback & Routing

Automatic Failover: TunanAPI automatically retries failed requests on backup channels. If DeepSeek is down, your request is routed to Qwen — same model family, zero config needed.

How it works

When you call a model like deepseek-chat, TunanAPI routes your request to the primary provider. If that provider returns a 5xx error or times out, the system automatically retries on a configured backup channel.

Pro tip: For critical production workloads, use models available from multiple providers (e.g., deepseek-chat has backup channels) to maximize uptime.

Rate Limits

| Limit | Free Tier | Paid |
| --- | --- | --- |
| Requests per minute (RPM) | 60 | 60 |
| Tokens per minute (TPM) | ~500K | ~500K |
| Max concurrent requests | 5 | 10 |

Need higher limits? Contact us for custom rate limits. Enterprise plans available.

Rate limit headers

Every response includes headers to help you track your usage:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Your RPM limit |
| X-RateLimit-Remaining | Requests remaining in this window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
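
These headers can be used to pause before the next request instead of running into a 429. A sketch under the assumption that the headers arrive as a plain dictionary with exactly these names (real HTTP clients often expose case-insensitive header mappings); seconds_until_reset is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait before the rate-limit window resets.

    Returns 0.0 while requests remain or when the headers are absent.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```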

Error Codes

| Status | Type | Description | Action |
| --- | --- | --- | --- |
| 400 | Bad Request | Invalid parameters or malformed JSON | Check request body format |
| 401 | Unauthorized | Invalid or missing API key | Verify your API key |
| 402 | Insufficient Quota | Not enough credits | Top up at tunanapi.com |
| 404 | Not Found | Model not available or endpoint doesn't exist | Check model ID spelling |
| 429 | Rate Limited | Too many requests | Slow down, use exponential backoff |
| 500 | Server Error | Internal error | Retry after a moment |
| 503 | Unavailable | Upstream provider down | Retry; fallback may activate |

Error response format

{
  "error": {
    "message": "Insufficient quota. Please top up your account.",
    "type": "insufficient_quota",
    "code": 402
  }
}
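
A small sketch of turning an error response into a message plus a retry decision. It assumes the body shape shown above and treats 429 and 5xx as retryable, as the table suggests; describe_error is a hypothetical helper name.

```python
def describe_error(status, body):
    """Return (message, should_retry) for an error response."""
    error = body.get("error", {}) if isinstance(body, dict) else {}
    message = error.get("message", f"HTTP {status}")
    should_retry = status == 429 or 500 <= status < 600
    return message, should_retry
```

Client errors such as 400, 401, 402, and 404 are reported as not retryable, since repeating the same request would fail the same way.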

Retry strategy

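Recommended: use exponential backoff with jitter for 429 and 5xx errors. Double the wait after each failed attempt (1s → 2s → 4s → 8s) and stop after a fixed number of retries; with the suggested maximum of 3 retries, only the first three waits apply.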
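
One way to implement this strategy is a small wrapper around the request function. This is a sketch, not an official client: RetryableError and call_with_backoff are hypothetical names, the caller is expected to raise RetryableError for 429 and 5xx responses, and the 10% jitter is an assumed value.

```python
import random
import time


class RetryableError(Exception):
    """Raised by the caller's function for 429 and 5xx responses."""


def call_with_backoff(fn, max_retries=3, base_delay=1.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on RetryableError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RetryableError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay + rng() * delay * 0.1)   # add up to 10% jitter (assumed)
```

The sleep and rng parameters exist so the timing can be replaced in tests.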

Billing & Quota

How billing works

You're billed per token, per the pricing table above. There are no subscriptions or minimums — pay only for what you use.

Check your balance

# Using API
curl https://api.tunanapi.com/api/user/self \
  -H "Authorization: Bearer sk-YOUR_API_KEY"

# Response includes:
{"data": {"quota": 500000, "used_quota": 12345, ...}}

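Quota is reported in internal units: 500,000 units = $1 (so 1,000,000 units = $2, and 1,000 units = $0.002). The quota of 500000 in the sample response above therefore corresponds to $1.00.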
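
The conversion stated above (1,000,000 units = $2) can be wrapped in a one-line helper. A sketch only; quota_to_usd is a hypothetical name.

```python
UNITS_PER_USD = 500_000  # derived from "1,000,000 units = $2"


def quota_to_usd(units):
    """Convert quota units from /api/user/self into US dollars."""
    return units / UNITS_PER_USD
```

Subtracting quota_to_usd(used_quota) from quota_to_usd(quota) gives a dollar figure, but whether quota already excludes used_quota is not stated here, so check the dashboard before relying on that.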

Top up

Visit tunanapi.com dashboard → Top Up. We accept credit cards and PayPal.

Pricing Tiers

| Tier | Price | Credits | Bonus |
| --- | --- | --- | --- |
| Starter | $5 | $5.00 | - |
| Growth | $20 | $21.00 | +5% |
| Business | $50 | $55.00 | +10% |
| Enterprise | $100 | $115.00 | +15% |

Integrations

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-YOUR_API_KEY",
    base_url="https://api.tunanapi.com/v1"
)

response = llm.invoke("Explain quantum computing in one paragraph")

LlamaIndex

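# OpenAILike (pip install llama-index-llms-openai-like) accepts non-OpenAI model
# names; the stock OpenAI class fails on model IDs it cannot look up.
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="deepseek-chat",
    api_key="sk-YOUR_API_KEY",
    api_base="https://api.tunanapi.com/v1",
    is_chat_model=True,  # send requests to the chat completions endpoint
)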

CrewAI

from crewai import Agent, Task, Crew
import os

os.environ["OPENAI_API_KEY"] = "sk-YOUR_API_KEY"
os.environ["OPENAI_API_BASE"] = "https://api.tunanapi.com/v1"

agent = Agent(role="Researcher", goal="Find insights", backstory="Expert researcher")

AutoGen

import autogen

config_list = [{
    "model": "deepseek-chat",
    "api_key": "sk-YOUR_API_KEY",
    "base_url": "https://api.tunanapi.com/v1"
}]

assistant = autogen.AssistantAgent("assistant", llm_config={"config_list": config_list})

Migration from OpenAI

Migration takes two lines of code: change base_url and api_key. All other code stays the same.

# Before (OpenAI)
client = OpenAI(api_key="sk-openai-...")  # uses api.openai.com

# After (TunanAPI)
client = OpenAI(
    api_key="sk-YOUR_TUNANAPI_KEY",
    base_url="https://api.tunanapi.com/v1"
)

# Everything else is identical!

Model mapping

| OpenAI Model | TunanAPI Equivalent | Savings |
| --- | --- | --- |
| gpt-4o | deepseek-chat | ~97% |
| gpt-4o-mini | qwen3.5-flash | ~90% |
| o1 | deepseek-reasoner | ~95% |
| gpt-4-turbo | qwen3.7-max | ~93% |
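
If an existing codebase hard-codes OpenAI model names, the mapping above can be applied in one place. A sketch; MODEL_MAP and map_model are hypothetical names, and the pairs are copied from the table.

```python
# Suggested equivalents from the mapping table above.
MODEL_MAP = {
    "gpt-4o": "deepseek-chat",
    "gpt-4o-mini": "qwen3.5-flash",
    "o1": "deepseek-reasoner",
    "gpt-4-turbo": "qwen3.7-max",
}


def map_model(openai_model):
    """Return the suggested TunanAPI model, or the input unchanged if unmapped."""
    return MODEL_MAP.get(openai_model, openai_model)
```

Names that are not in the table pass through unchanged, so TunanAPI model IDs already in use keep working.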

Changelog

2026-06-14

2026-06-09

2026-06-07