Appendix A — Providers, models & prices
This appendix is the thClaws.cloud gateway catalogue — the models you can call when
you point an agent at https://thclaws.cloud/gateway (or run thClaws.cloud
self-hosted). Desktop / CLI builds have their own catalogue covering 22 providers; see
Chapter 6 — Providers, models & API keys for that
list.
Catalogue refreshed 2026-06-02. To re-pull the latest models and rates from upstream APIs and LiteLLM’s pricing feed, run
python3 scripts/refresh-model-catalogue.py from the repo root.
Pricing model
Rates below are what you pay — that is, upstream cost × platform markup. Internally the gateway keeps two pieces:
| Layer | What | Where |
|---|---|---|
| DB row rate | Raw upstream cost from LiteLLM (USD per token) | model_pricing table, seeded by scripts/refresh-model-catalogue.py |
| Platform markup | 1.25× multiplier applied at meter time | THCLAWS_PLATFORM_MARKUP env var on the gateway service |
This means: if Anthropic raises Opus 4.8 to $6/M, you only update LiteLLM (or wait for the next sync) and the gateway picks up the new rate on the next pricing refresh — the markup stays a single env-var knob.
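As a rough illustration of the two-layer split, the sketch below multiplies a raw upstream rate by the markup. The function name `billed_rate` is hypothetical, and the $5/M upstream input rate for Opus 4.8 is inferred from the $6.25 listed in this appendix divided by 1.25; this is not the gateway's actual code.

```python
from decimal import Decimal

# Default value of THCLAWS_PLATFORM_MARKUP as documented in this appendix.
PLATFORM_MARKUP = Decimal("1.25")


def billed_rate(upstream_usd_per_m: Decimal,
                markup: Decimal = PLATFORM_MARKUP) -> Decimal:
    """Return the user-facing $/M rate: upstream cost times platform markup."""
    return upstream_usd_per_m * markup


# Opus 4.8 input today: $5/M upstream (inferred) gives the $6.25/M in the table.
current = billed_rate(Decimal("5"))

# If upstream rose to $6/M, the billed rate follows with no markup change.
after_raise = billed_rate(Decimal("6"))

print(current, after_raise)
```

Using `Decimal` rather than `float` keeps the money arithmetic exact, which matters once rates such as $0.1875/M are involved.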
All prices are in US dollars per 1,000,000 tokens (1M tokens). The DB stores these
as microcents per 1k tokens (µ¢/kt); the formula is $/M = µ¢/kt / 100,000.
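The divisor follows from the units: 1 µ¢ is 10⁻⁸ USD and 1M tokens is 1,000 kt, so a rate of x µ¢/kt equals x × 10⁻⁸ × 10³ = x / 100,000 USD per 1M tokens. A minimal sketch of the conversion in both directions, with hypothetical helper names:

```python
from decimal import Decimal

# 1 microcent = 1e-8 USD, and 1M tokens = 1,000 kt, hence the factor 100,000.
MICROCENTS_PER_KT_PER_USD_PER_M = Decimal(100_000)


def db_to_usd_per_m(microcents_per_kt: int) -> Decimal:
    """Convert a stored rate (microcents per 1k tokens) to USD per 1M tokens."""
    return Decimal(microcents_per_kt) / MICROCENTS_PER_KT_PER_USD_PER_M


def usd_per_m_to_db(usd_per_m: Decimal) -> int:
    """Convert USD per 1M tokens to microcents per 1k tokens, rounded to an integer."""
    return int((usd_per_m * MICROCENTS_PER_KT_PER_USD_PER_M).to_integral_value())


# A rate of 500,000 microcents per 1k tokens is $5 per 1M tokens.
example_usd = db_to_usd_per_m(500_000)
example_db = usd_per_m_to_db(Decimal("0.1875"))
```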
Anthropic
| Model | Tier | Input ($/M) | Output ($/M) | Remark |
|---|---|---|---|---|
| claude-haiku-4-5-20251001 | starter | $1.25 | $6.25 | Latest Haiku (dated alias of claude-haiku-4-5) |
| claude-opus-4-8 | enterprise | $6.25 | $31.25 | Current flagship Opus |
| claude-opus-4-7 | enterprise | $6.25 | $31.25 | Previous flagship |
| claude-opus-4-6 | enterprise | $6.25 | $31.25 | |
| claude-opus-4-5-20251101 | enterprise | $6.25 | $31.25 | |
| claude-opus-4-1-20250805 | enterprise | $18.75 | $93.75 | Legacy Opus 4.1; Anthropic still serves it, but prefer Opus 4.8 for new work |
| claude-opus-4-20250514 | enterprise | $18.75 | $93.75 | Legacy Opus 4.0 |
| claude-sonnet-4-6 | pro | $3.75 | $18.75 | Current default model |
| claude-sonnet-4-5-20250929 | pro | $3.75 | $18.75 | |
| claude-sonnet-4-20250514 | pro | $3.75 | $18.75 | Legacy Sonnet 4.0 |
Note: the tier column is display-only. thClaws.cloud dropped the tier-gating ladder in v0.28, so any user with positive credit can call any active model; the per-call price differential is the only gate. See ch27 § “Why no tier gate” for the rationale.
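Read literally, the rule above reduces to two conditions. The sketch below restates it; the function and parameter names are hypothetical and the gateway's real check may consider more than this.

```python
def can_call(credit_balance_usd: float, model_active: bool) -> bool:
    """Post-v0.28 rule as described: positive credit plus an active model.

    The model's tier is deliberately not an input, since it is display-only.
    """
    return model_active and credit_balance_usd > 0
```

For example, `can_call(0.01, True)` is true for a starter-tier and an enterprise-tier model alike, while a zero balance or an inactive row blocks the call.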
OpenAI
| Model | Tier | Input ($/M) | Output ($/M) | Remark |
|---|---|---|---|---|
| gpt-4o | pro | $3.125 | $12.50 | |
| gpt-4o-mini | starter | $0.1875 | $0.75 | Cheapest chat-capable OpenAI model on the gateway |
| o1 | enterprise | $18.75 | $75.00 | Reasoning model; output includes hidden reasoning tokens |
OpenAI exposes ~76 chat-capable models via /v1/models (every gpt-5.x, o3, o4 variant,
plus dated snapshots). Only the three above are currently seeded into the gateway. If
you need a specific model, run the refresh script with --providers openai --apply
to add it; it will pull pricing from LiteLLM automatically. The full list available
upstream:
- gpt-5.5, gpt-5.5-pro, gpt-5.4, gpt-5.4-pro, gpt-5.4-mini, gpt-5.4-nano, gpt-5.3-codex, gpt-5.3-chat-latest
- gpt-5.2, gpt-5.2-pro, gpt-5.2-codex, gpt-5.2-chat-latest
- gpt-5.1, gpt-5.1-codex, gpt-5.1-codex-max, gpt-5.1-codex-mini
- gpt-5, gpt-5-pro, gpt-5-codex, gpt-5-mini, gpt-5-nano, gpt-5-search-api
- gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
- o3, o3-pro, o3-mini, o3-deep-research, o4-mini, o4-mini-deep-research, o1-pro
- Legacy: gpt-4, gpt-4-turbo, gpt-3.5-turbo*
Google (Gemini)
| Model | Tier | Input ($/M) | Output ($/M) | Remark |
|---|---|---|---|---|
| gemini-2.0-flash | starter | $0.125 | $0.50 | Cheapest model on the gateway across all providers |
| gemini-2.0-pro | pro | $1.95 | $7.81 | ⚠ Not in LiteLLM; rate is from the initial seed and may be stale (see Remarks on accuracy below) |
Google exposes 20+ Gemini models via /v1beta/models (gemini-2.5-flash, gemini-2.5-pro,
gemini-3-pro-preview, gemini-3.1-flash-lite, etc.). Run the refresh script to seed them
with current pricing.
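To compare what a given workload would cost on different models, the per-1M rates can be applied to token counts directly. The sketch below copies three rows from the tables in this appendix; the helper name `call_cost_usd` and the 10,000-in / 2,000-out workload are illustrative assumptions.

```python
from decimal import Decimal

# (input $/M, output $/M), copied from the tables in this appendix.
RATES = {
    "claude-sonnet-4-6": (Decimal("3.75"), Decimal("18.75")),
    "gpt-4o-mini": (Decimal("0.1875"), Decimal("0.75")),
    "gemini-2.0-flash": (Decimal("0.125"), Decimal("0.50")),
}


def call_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the billed USD cost of one call from the published $/M rates."""
    in_rate, out_rate = RATES[model]
    total = in_rate * input_tokens + out_rate * output_tokens
    return total / Decimal(1_000_000)


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
costs = {model: call_cost_usd(model, 10_000, 2_000) for model in RATES}
cheapest = min(costs, key=costs.get)

for model, cost in costs.items():
    print(f"{model}: ${cost}")
```

For reasoning models such as o1, the output token count should include the hidden reasoning tokens mentioned in the OpenAI table, so the visible reply length understates the cost.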
OpenRouter
| Model | Tier | Input ($/M) | Output ($/M) | Remark |
|---|---|---|---|---|
| openrouter/auto | starter | (pass-through) | (pass-through) | OpenRouter’s auto-router; actual cost determined by the routed model. The gateway forwards usage as-is. |
The DB row for openrouter/auto holds 0/0 because the upstream cost varies per request.
We rely on OpenRouter’s own metering in this case.
Inactive / deprecated rows
| Model | Reason |
|---|---|
| anthropic/claude-haiku-4-5 | Replaced by the dated alias claude-haiku-4-5-20251001, because Anthropic’s /v1/models no longer returns the bare alias. Kept in the table with active=FALSE so historical usage rows still join. |
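The reason for keeping the row rather than deleting it can be shown with a toy schema. The sketch below uses an in-memory SQLite database as a stand-in; the `usage_log` table, all column names other than `active`, and the token counts are assumptions for illustration, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the real tables.
    CREATE TABLE model_pricing (model_id TEXT PRIMARY KEY, active INTEGER NOT NULL);
    CREATE TABLE usage_log (model_id TEXT NOT NULL, total_tokens INTEGER NOT NULL);
    INSERT INTO model_pricing VALUES
        ('anthropic/claude-haiku-4-5', 0),
        ('anthropic/claude-haiku-4-5-20251001', 1);
    INSERT INTO usage_log VALUES
        ('anthropic/claude-haiku-4-5', 1200),
        ('anthropic/claude-haiku-4-5-20251001', 800);
""")

# Only active rows are offered to callers.
callable_models = [row[0] for row in conn.execute(
    "SELECT model_id FROM model_pricing WHERE active = 1 ORDER BY model_id")]

# Historical usage still joins, because the inactive row was kept.
history = conn.execute("""
    SELECT u.model_id, u.total_tokens, p.active
    FROM usage_log u
    JOIN model_pricing p ON p.model_id = u.model_id
    ORDER BY u.model_id
""").fetchall()
```

Had the bare-alias row been deleted, the inner join would silently drop its usage rows from any historical report.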
Remarks on accuracy
| Row | Status |
|---|---|
| 14 of 16 active rows | Exact match with LiteLLM’s published pricing as of 2026-06-02 |
| google/gemini-2.0-pro | Not in LiteLLM; its current rate is from the initial migration 006 seed. Google may have changed it since. The refresh script can’t reprice this row automatically; review periodically against ai.google.dev/pricing. |
| openrouter/auto | Pass-through, no canonical cost |
If you spot a price drift, run:
python3 scripts/refresh-model-catalogue.py --reprice-only # dry-run
python3 scripts/refresh-model-catalogue.py --reprice-only --apply # commit
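Conceptually, a drift check compares each stored rate with the feed and sets aside rows the feed cannot price. The sketch below illustrates that idea only; it is not the refresh script's logic, and the function name, model ids, and rates are all invented for the example.

```python
from decimal import Decimal


def find_price_drift(db_rates, feed_rates):
    """Split stored rows into drifted rows and rows the feed cannot price.

    Both arguments map a model id to an (input, output) pair of upstream $/M rates.
    """
    drifted, unpriced = {}, []
    for model, stored in db_rates.items():
        latest = feed_rates.get(model)
        if latest is None:
            unpriced.append(model)  # absent from the feed: needs manual review
        elif latest != stored:
            drifted[model] = (stored, latest)
    return drifted, unpriced


# Invented sample data: one unchanged row, one drifted row, one missing from the feed.
db = {
    "example/model-a": (Decimal("1.00"), Decimal("2.00")),
    "example/model-b": (Decimal("3.00"), Decimal("4.00")),
    "example/model-c": (Decimal("5.00"), Decimal("6.00")),
}
feed = {
    "example/model-a": (Decimal("1.00"), Decimal("2.00")),
    "example/model-b": (Decimal("3.50"), Decimal("4.00")),
}
drifted, unpriced = find_price_drift(db, feed)
```

Rows that land in the second list correspond to cases like google/gemini-2.0-pro above, where only a manual check against the provider's pricing page can confirm the rate.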
Adjusting the platform markup
The 1.25× multiplier is a single env var on the gateway service:
# thclaws-cloud/docker-compose.yml
gateway:
  environment:
    THCLAWS_PLATFORM_MARKUP: ${THCLAWS_PLATFORM_MARKUP:-1.25}
Changing it to 1.50 and restarting the gateway is the fastest way to widen margin
uniformly across every model. The gateway clamps any value below 1.0 up to 1.0 and
emits a warning: running at exact upstream cost is the minimum, since anything lower
would mean losing money per call.
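The clamping behaviour can be restated in a few lines. The real parsing lives in the Rust file config.rs; the Python sketch below only mirrors the behaviour described here. The logger name, the fallback to 1.25 when the variable is unset or empty (copied from the compose default), and letting non-numeric input raise `ValueError` are assumptions.

```python
import logging
import os

log = logging.getLogger("gateway.config")  # hypothetical logger name

DEFAULT_MARKUP = 1.25  # mirrors the docker-compose default
MIN_MARKUP = 1.0       # exact upstream cost; lower would lose money per call


def platform_markup(env=os.environ) -> float:
    """Read THCLAWS_PLATFORM_MARKUP, clamping values below 1.0 up to 1.0."""
    raw = env.get("THCLAWS_PLATFORM_MARKUP")
    if not raw:
        return DEFAULT_MARKUP
    value = float(raw)  # assumption: non-numeric input raises ValueError
    if value < MIN_MARKUP:
        log.warning("THCLAWS_PLATFORM_MARKUP=%s is below %s; clamping",
                    raw, MIN_MARKUP)
        return MIN_MARKUP
    return value
```

Passing the environment as a parameter keeps the function easy to exercise, for example `platform_markup({"THCLAWS_PLATFORM_MARKUP": "0.8"})`.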
For per-model differential pricing, edit the model_pricing row directly, but note
that the refresh script will reprice it back to the LiteLLM rate on the next
--reprice-only run. Mark such rows by setting active=FALSE then active=TRUE with a
custom rate, and exclude them from refresh with --providers scoping.
Where the rates live in code
| Concern | File |
|---|---|
| Cost computation | thclaws-cloud/gateway/src/pricing.rs (cost_cents_with_markup) |
| Markup env-var parsing | thclaws-cloud/gateway/src/config.rs (platform_markup) |
| Meter call-site | thclaws-cloud/gateway/src/meter.rs |
| Pricing table schema | thclaws-cloud/api/alembic/versions/006_*.py |
| Refresh script | scripts/refresh-model-catalogue.py |