OpenAI-, Claude- & Gemini-compatible

One API for every AI model.

Access GPT, Claude, Gemini, DeepSeek and dozens more through a single endpoint. Better prices, higher uptime, low latency.

Get a free API key → Browse models

✓ Drop-in compatible — your OpenAI SDK just works

request.sh

# Point your OpenAI client at VietToken
curl https://api.viettoken.app/v1/chat/completions \
  -H "Authorization: Bearer $VIETTOKEN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "stream": true,
    "messages": [{"role":"user","content":"Hi!"}]
  }'
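
With "stream": true, OpenAI-compatible endpoints normally answer with server-sent events: one `data: {...}` line per chunk, closed by `data: [DONE]`. The sketch below shows how a client could reassemble the text, assuming VietToken follows that OpenAI wire format. `collect_stream_text` is a hypothetical helper, and the sample lines are made up for illustration, not captured from the service.

```python
import json

def collect_stream_text(sse_lines):
    """Join the delta.content pieces from OpenAI-style SSE lines."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)

# Illustrative lines only (assumed shape, not real VietToken output).
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))
```

The official SDKs do this parsing for you; the helper only makes the wire format visible.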

89+ Models

35+ Providers

~50ms Gateway overhead

99.9% Uptime target

How it works

One request, every model

Send one request to a single endpoint. VietToken routes it to the best model and streams tokens straight back — with automatic failover.

💻 Your app (openai SDK) → request → ⚡ VietToken (gateway · 1 endpoint) → route → model
Model → stream → VietToken → tokens → your app

Models shown in the diagram:

openai/gpt-5.x (frontier)
anthropic/claude-opus-4 (reasoning)
google/gemini-3-pro (multimodal)
deepseek/deepseek-v4 (reasoning)
x-ai/grok-4 (frontier)
qwen/qwen-3.5 (open)
moonshot/kimi-k2.5 (long-context)
mistral/mistral-large (fast)

Auto-failover: if a model or key goes down, traffic reroutes instantly — your app never breaks.
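
The failover idea can be illustrated with a small sketch. This is not VietToken's gateway code, which this page does not show; `call_with_failover`, the `send` callback, and the fallback order are hypothetical names chosen for the example.

```python
def call_with_failover(targets, send):
    """Try each target in order and return (target, result) for the first success.

    `send` is any callable that raises an exception when a target is down.
    """
    errors = []
    for target in targets:
        try:
            return target, send(target)
        except Exception as exc:  # a real router would match specific errors
            errors.append((target, exc))
    raise RuntimeError(f"all targets failed: {errors}")

# Simulated upstreams: the first one is down, the second answers.
def fake_send(target):
    if target == "openai/gpt-4o-mini":
        raise ConnectionError("upstream unavailable")
    return f"reply from {target}"

used, reply = call_with_failover(
    ["openai/gpt-4o-mini", "anthropic/claude-haiku-4.5"], fake_send
)
print(used, "->", reply)
```

With a gateway in front, this loop lives on the server side, so the application code keeps a single request path.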

Why VietToken

What you actually need

One integration, every model. Fast, reliable, and easy to plug in.

⚡

Low latency

Edge servers + unbuffered streaming for near-instant first tokens. ~30–90ms overhead.

🔀

Any model, one API

OpenAI, Anthropic, Google, DeepSeek, xAI and more, across 35+ providers. Switch with one line.

🛡️

Higher uptime

Automatic failover across keys and providers. When one fails, traffic reroutes instantly.

🔌

Drop-in compatible

Keep your existing code. Just change the base URL — no rewrites, no lock-in.
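
Because the model IDs on this page follow a `provider/model` pattern, switching models should only mean changing one string. A minimal sketch using only the Python standard library: `build_chat_request` is a hypothetical helper, `sk-example` is a placeholder key, and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.viettoken.app/v1"

def build_chat_request(api_key, model, prompt):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Same code path for two providers; only the model string differs.
req_a = build_chat_request("sk-example", "openai/gpt-4o-mini", "Hello")
req_b = build_chat_request("sk-example", "anthropic/claude-haiku-4.5", "Hello")
print(req_a.full_url == req_b.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; the endpoint, headers, and body shape stay identical across models.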

Get started

Live in three steps

From sign-up to your first streamed token in under a minute.

1. Create an account

2. Add credits

Top up once, spend on any model. No subscriptions.

3. Get your API key

Create a key and start making requests.

base_url = "https://api.viettoken.app/v1"

One key — every model

The latest models, instantly

Access frontier models through one unified API. Add or remove models anytime.

GPT-5.x, GPT-4o mini, Claude Opus 4, Claude Haiku 4.5, Gemini 3 Pro, Gemini 3 Flash, DeepSeek V4, Qwen 3.5, Grok 4, Kimi K2.5, Mistral Large, plus your custom model

Pricing

Pay as you go. No surprises.

Buy credits once, use on any model. No subscriptions, no hidden fees.

Free

For trials & side projects

✓ $1 trial credit
✓ Access to all models
✓ 3 API keys
✓ Community support

Start free

Pro

For growing products

At cost

✓ No per-token markup: pay the model price
✓ Unlimited tokens & keys
✓ Private custom providers
✓ Load-balancing & failover
✓ Priority support

Add credits

Enterprise

For teams & scale

Custom

✓ Volume-based discounts
✓ SLA & uptime guarantees
✓ SSO, roles & data policies
✓ Dedicated 24/7 support

Contact sales

Quickstart

Integrate in 60 seconds

OpenAI-compatible — just change the base URL.

request.sh

curl https://api.viettoken.app/v1/chat/completions \
  -H "Authorization: Bearer $VIETTOKEN_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model":"openai/gpt-4o-mini", "messages":[{"role":"user","content":"Hello"}] }'

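import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.viettoken.app/v1",
    # Read the key from the environment; "$VIETTOKEN_KEY" is not expanded inside a Python string.
    api_key=os.environ["VIETTOKEN_KEY"],
)
stream = client.chat.completions.create(
    model="anthropic/claude-haiku-4.5",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices or no text, so guard before printing.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")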

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.viettoken.app/v1",
  apiKey: process.env.VIETTOKEN_KEY,
});
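const stream = await client.chat.completions.create({
  model: "google/gemini-3-flash",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
});
// Print each streamed piece of text as it arrives.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}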
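
To check the latency figures against your own traffic, time to first token can be measured on any streamed response. A rough sketch: `time_to_first_token` is a hypothetical helper that accepts any iterable of text pieces, so it is shown here with a simulated stream instead of a live call.

```python
import time

def time_to_first_token(pieces, clock=time.perf_counter):
    """Return (seconds until the first non-empty piece, full text)."""
    start = clock()
    first = None
    text = []
    for piece in pieces:
        if piece and first is None:
            first = clock() - start
        text.append(piece or "")
    return first, "".join(text)

def simulated_stream():
    # Stand-in for iterating over a real streamed response.
    time.sleep(0.05)
    yield ""       # e.g. a role-only chunk with no content
    yield "Hel"
    yield "lo"

ttft, text = time_to_first_token(simulated_stream())
print(f"first token after {ttft * 1000:.0f} ms: {text!r}")
```

With the Python SDK, the iterable could be a generator over `chunk.choices[0].delta.content`. The result is end-to-end time, which includes the model's own latency, so it is an upper bound on gateway overhead and not a direct measurement of it.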

FAQ

Questions, answered

Is it really OpenAI-compatible?

Yes. Use the official OpenAI SDKs — just set the base URL to VietToken and your existing code works unchanged.

Which providers are supported?

OpenAI, Anthropic, Google, DeepSeek, xAI, Mistral, OpenRouter and more (35+ provider types), plus any OpenAI-compatible custom endpoint, including self-hosted vLLM or Ollama.

How do you keep it fast?

Edge servers close to your users plus pass-through SSE streaming with no buffering. The gateway adds only ~30–90ms.

How does billing work?

Pay as you go: buy credits once and spend them on any model. Track usage, latency and cost per key. No subscriptions.
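
Per-request cost can be estimated from the `usage` object that OpenAI-style responses include. A sketch under stated assumptions: the prices below are placeholders, not VietToken or provider rates, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(usage, input_price_per_mtok, output_price_per_mtok):
    """Estimate USD cost from OpenAI-style token counts and per-million-token prices."""
    input_cost = usage["prompt_tokens"] * input_price_per_mtok / 1_000_000
    output_cost = usage["completion_tokens"] * output_price_per_mtok / 1_000_000
    return input_cost + output_cost

# Placeholder numbers for illustration only.
usage = {"prompt_tokens": 2_000, "completion_tokens": 500}
cost = estimate_cost(usage, input_price_per_mtok=1.00, output_price_per_mtok=4.00)
print(f"${cost:.4f}")
```

Summing these estimates per key gives a rough cross-check against the usage and cost figures in the dashboard.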


Start building with every model today