Active API Keys

24h Requests

Total Tokens

Circuit Breaker

📊 24h Request Trends

📈 24h Token Usage

⚡ Quick Start

cURL
curl https://api.stormengine.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "Qwen2.5-32B",
    "messages": [{"role": "user", "content": "你好"}],
    "stream": false
  }'

🔑 Create API Key (Admin required)

Name

Quota (tokens)

Rate Limit (rpm)

Key List (Max 2 keys per account)

Name	ID	Usage / Quota	Rate Limit	Status	Actions

Today's Cost (est.)

This Month

Pricing

$/M tokens

💵 Per-Key Billing

Key	Token Usage	Today	Month Est.	Quota Used

📈 Cost Trends (24h)

💳 Recharge

Recharge is not yet available. Contact us for API Key access.

💬 Chat Test

Type a question to test the STORM model

📡 Available Models

Qwen2.5-32B DGX Spark

Context: 64K tokens

Max Output: 8192 tokens

Max Concurrency: 30

Avg Latency: ~19s

Function Calling: ✅

Streaming: ✅ SSE

JSON Mode: ✅

Qwen2.5:14B Mac M4

Max Concurrency: 18

Avg Latency: ~9s

Role: Lightweight task offload

Qwen3-8B FALLBACK

Context: 32K tokens

Role: Circuit-breaker fallback

🔗 API Endpoints

POST	`/v1/chat/completions`	Chat Completions (OpenAI SDK compatible)
GET	`/v1/models`	Model List
GET	`/health`	Health Check
GET	`/admin/usage`	Usage Query (API Key required)
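
The read-only endpoints in the table can be probed with the Python standard library alone. This is a minimal sketch, assuming each endpoint returns a JSON body and that `/health` sits at the host root without the `/v1` prefix, as the table suggests; the response shapes are not documented here, and `build_request` and `fetch_json` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://api.stormengine.cloud"  # host taken from the Quick Start example


def build_request(path, api_key=None):
    """Build a GET request for one of the endpoints listed above."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Same Bearer scheme as the cURL example.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(BASE_URL + path, headers=headers, method="GET")


def fetch_json(path, api_key=None, opener=urllib.request.urlopen):
    """Send the request and decode the body (assumes the server replies with JSON)."""
    with opener(build_request(path, api_key), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `fetch_json("/v1/models", api_key="YOUR_API_KEY")` would list the models, and `fetch_json("/admin/usage", api_key="YOUR_API_KEY")` would query usage.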

🐍 Python SDK Example

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.stormengine.cloud/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="Qwen2.5-32B",
    messages=[{"role": "user", "content": "你好，STORM！"}],
    max_tokens=1024,
    temperature=0
)

print(response.choices[0].message.content)
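
The model list advertises SSE streaming, so a streaming variant of the Python call may be useful. The sketch below assumes STORM emits OpenAI-style stream chunks, which this page implies but does not spell out; `make_client` and `stream_reply` are hypothetical helper names.

```python
def make_client(api_key):
    """Create an OpenAI SDK client pointed at STORM (needs `pip install openai`)."""
    from openai import OpenAI  # imported lazily so stream_reply has no hard dependency

    return OpenAI(base_url="https://api.stormengine.cloud/v1", api_key=api_key)


def stream_reply(client, prompt, model="Qwen2.5-32B"):
    """Yield text fragments as they arrive instead of waiting for the full reply."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask the server for incremental chunks
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

Typical use: `for piece in stream_reply(make_client("YOUR_API_KEY"), "Hello"): print(piece, end="", flush=True)`.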

📦 Node.js Example

JavaScript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stormengine.cloud/v1',
  apiKey: 'YOUR_API_KEY'
});

const response = await client.chat.completions.create({
  model: 'Qwen2.5-32B',
  messages: [{ role: 'user', content: 'Hello STORM!' }]
});

console.log(response.choices[0].message.content);

🔐 Authentication

All API requests require an API Key in the Authorization header:
Authorization: Bearer sk-storm-xxxxxxxx
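
To keep the key out of source code, the header can be built from an environment variable. This is a minimal sketch; the variable name `STORM_API_KEY` and the helper `auth_headers` are assumptions for illustration, not something this page defines.

```python
import os


def auth_headers(env_var="STORM_API_KEY"):
    """Return request headers carrying the key as a Bearer token."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message rather than sending an empty token.
        raise RuntimeError(f"Set {env_var} to your API key first")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```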

⚡ STORM AI — Pricing

Inference API for Agent developers · Updated 2026-05

Built for developers who need determinism, dedicated capacity, and data sovereignty — not the cheapest tokens on the market. Pay for guaranteed throughput on physical hardware, not shared cloud quota.

Plans

Free

$0/mo

Try it out, side projects.

100K tokens/month
5 req/min
Qwen2.5-32B
Shared concurrency
Community support

Start free

Experience

$3.9/mo

Getting started with Agents.

500K tokens/month
10 req/min
Qwen2.5-32B
Shared concurrency
Community support

Subscribe — $3.9/mo

Starter

$9.9/mo

Testing an Agent in pre-production.

1M tokens/month
10 req/min
Qwen2.5-32B + 14B
Shared concurrency
Community support

Subscribe — $9.9/mo

Developer

$49/mo

Indie devs running a single Agent.

Unlimited tokens
20 req/min
1 dedicated slot
32B + 14B
48h email support
99% SLA

Subscribe — $49/mo

Full Feature Comparison

Feature	Free	Experience	Starter	Developer	Pro	Business	Enterprise
Monthly tokens	100K	500K	1M	Unlimited	Unlimited	Unlimited	Unlimited
Rate limit	5/min	10/min	10/min	20/min	60/min	None	None
Dedicated concurrency slots	—	—	—	1	4	16	Custom
Available models	32B	32B	32B + 14B	32B + 14B	32B + 14B	32B + 14B	All + Custom
TTFT P50	~1.5s	~1.5s	~1.5s	~1.5s	~1.2s	~1.0s	<1s
SLA uptime	—	—	—	99%	99.5%	99.5%	99.9%
Data residency	China	China	China	China	China	China	Your choice
Support response	Community	Community	Community	48h	24h	4h	24/7
Custom system prompts	✓	✓	✓	✓	✓	✓	✓
Tool calling	✓	✓	✓	✓	✓	✓	✓
Streaming output	✓	✓	✓	✓	✓	✓	✓
Usage dashboard	✓	✓	✓	✓	✓	✓	✓
Custom fine-tuning	—	—	—	—	—	—	✓
On-premise deployment	—	—	—	—	—	—	✓

Pay-as-you-go (Alternative)

Prefer per-token billing instead of a subscription? You can charge usage to your account balance:

Model	Input	Output	Best for
Qwen2.5-32B (STORM main)	$0.50 / 1M	$1.00 / 1M	Complex code, Agents, long context
Qwen2.5-14B (Mac M4)	$0.20 / 1M	$0.40 / 1M	Simple tasks, classification, extraction

When to subscribe vs pay-as-you-go: Subscriptions become cheaper after roughly 3M tokens/month on the 32B model. Most production Agents cross this threshold within days. If you run a real Agent, subscribe.
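
Where the break-even falls for your own workload depends on the plan price and on your input/output mix. The sketch below estimates it from the per-token rates in the table above; the default 50/50 split between input and output tokens is an assumption, and `payg_cost` and `break_even_tokens` are hypothetical helper names.

```python
# Rates in USD per 1M tokens, copied from the pay-as-you-go table above.
RATES = {
    "Qwen2.5-32B": {"input": 0.50, "output": 1.00},
    "Qwen2.5-14B": {"input": 0.20, "output": 0.40},
}


def payg_cost(model, input_tokens, output_tokens):
    """Pay-as-you-go cost in USD for one month's token counts."""
    rate = RATES[model]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000


def break_even_tokens(model, plan_price, output_share=0.5):
    """Monthly tokens at which pay-as-you-go spend equals a flat plan price.

    output_share is the assumed fraction of all tokens that are output tokens.
    """
    rate = RATES[model]
    blended = rate["input"] * (1 - output_share) + rate["output"] * output_share
    return plan_price / blended * 1_000_000
```

Pass your real monthly counts to `payg_cost`, or a plan price to `break_even_tokens`, and compare the result with the token allowance of the plan you are considering.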

Why STORM vs OpenAI / Claude / DeepSeek

Capability	STORM	OpenAI GPT-4o-mini	Anthropic Claude 3.5	DeepSeek
Output price ($/1M tokens)	$1.00	$0.60	$15	$0.30
Dedicated capacity (no noisy neighbor)	✓	—	—	—
No rate limits (Business and Enterprise tiers)	✓	—	—	—
Data stays on our hardware (never trained on)	✓	Opt-out	Opt-out	Opt-out
Tool calling / Function calling	✓	✓	✓	✓
Full system prompt control	Full	Limited	Limited	Limited
1,280 tests, 0 structural errors	✓	n/a	n/a	n/a

Get Started in 60 Seconds

STORM is OpenAI API compatible. Drop in your key, change the base_url, and you're running.

# 1. Install the OpenAI SDK (run this in your shell, not in Python):
#    pip install openai

# 2. Drop in your STORM key
from openai import OpenAI

client = OpenAI(
    base_url="https://api.stormengine.cloud/v1",
    api_key="YOUR_STORM_KEY",
)

response = client.chat.completions.create(
    model="Qwen2.5-32B",
    messages=[{"role": "user", "content": "Write a Python sort function."}],
    temperature=0,  # Deterministic output for Agent pipelines
)

print(response.choices[0].message.content)

100,000 free tokens every month. No credit card required for the Free tier.

Payment Methods

💳 Credit / Debit Card (Stripe)
🅿️ PayPal
🏦 Wire Transfer (Business+, $500+)
₿ USDC / USDT (on request)

FAQ

Why is STORM cheaper than OpenAI but more expensive than DeepSeek?

We're not a cloud giant with shared inference. Each paying customer gets dedicated capacity on physical hardware. You're paying for guaranteed throughput, not just tokens.

What happens when I hit my Free token cap?

Requests return HTTP 429 until the next month. Upgrade anytime to remove the cap immediately.
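
A client can handle HTTP 429 explicitly rather than letting it crash a pipeline. The sketch below retries with exponential backoff, which only helps if per-minute rate limits also answer with 429 (an assumption; this page states 429 only for the monthly cap). Once the monthly cap is reached, retries will keep failing until the next month or an upgrade. `call_with_backoff` is a hypothetical helper, and `send` is any zero-argument function that performs the request with `urllib`.

```python
import time
import urllib.error


def call_with_backoff(send, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call send(); on HTTP 429 wait and retry, doubling the delay each time."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise  # other errors, or out of retries: surface to the caller
            sleep(base_delay * 2 ** attempt)
```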

Can I cancel anytime?

Yes. Monthly subscriptions, cancel anytime in your dashboard. No questions, no retention calls.

Where is my data processed?

On our NVIDIA DGX Spark in Nanjing, China by default. Enterprise customers choose their deployment region.

Do you train on my data?

No. Ever. Prompts and outputs are never used for training. Logs are kept for billing and abuse-prevention only, and auto-purge after 30 days.

Refund policy?

7-day full refund on the first month of any paid plan. No questions asked.

How is the SLA calculated?

SLA = (Total time − Downtime) / Total time, measured monthly. Credits are issued automatically when we miss the target.
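
As a worked example of the formula, the sketch below converts each SLA target from the comparison table into a monthly downtime budget. The 30-day month is an assumption; the page does not say how month length is handled.

```python
def uptime_ratio(total_minutes, downtime_minutes):
    """SLA = (Total time - Downtime) / Total time."""
    return (total_minutes - downtime_minutes) / total_minutes


def downtime_budget(target, total_minutes):
    """Minutes of downtime allowed before the target is missed."""
    return total_minutes * (1 - target)


MONTH_MINUTES = 30 * 24 * 60  # 43,200 minutes in a 30-day month (assumed)

for target in (0.99, 0.995, 0.999):
    print(f"{target:.1%} -> {downtime_budget(target, MONTH_MINUTES):.1f} min/month")
# 99.0% -> 432.0 min/month
# 99.5% -> 216.0 min/month
# 99.9% -> 43.2 min/month
```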

What's a "dedicated slot"?

A guaranteed concurrent inference slot on our hardware. You can have 1 (Developer) to 16 (Business) requests in flight without queuing — no noisy neighbors.
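
On the client side, it can help to cap in-flight requests at your slot count so extra calls wait locally instead of queuing on the server. This is a minimal sketch using `asyncio.Semaphore`; how STORM enforces slots server-side is not described here, and `run_with_slots` is a hypothetical helper that takes zero-argument coroutine functions.

```python
import asyncio


async def run_with_slots(jobs, slots):
    """Run async callables with at most `slots` of them in flight at once."""
    gate = asyncio.Semaphore(slots)

    async def guarded(job):
        async with gate:  # waits here while all slots are busy
            return await job()

    # gather preserves the order of `jobs` in the returned list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With a Developer plan you would pass `slots=1`; with Business, `slots=16`.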

Enterprise & Custom

Need something beyond Business? We work directly with compliance-sensitive teams:

On-premise deployment: Install STORM on your hardware, behind your firewall.

Custom fine-tuning: Train Qwen2.5 on your domain data, hosted privately.

Air-gapped environments: Fully isolated deployments for classified workloads.

Compliance: SOC2, HIPAA, GDPR DPA. Ask for our compliance pack.

Dedicated engineer: 24/7 named contact, Slack channel, custom integrations.

Volume discounts: 50%+ off list at high commitment levels.

Contact: [email protected]

Trust & Transparency

Open benchmark data: github.com/YIQI-NUMBER1/stormengine

Hardware: NVIDIA DGX Spark (verified) + Apple M4 Pro

Inference engine: vLLM 0.21 (open source)

Models: Qwen2.5 series (open weights)

Ready to build?

Get a free key in 60 seconds. No credit card.

Start free View benchmark

At-a-Glance

Free: $0/mo · 100K tokens

Starter: $9.9/mo · 1M tokens

Developer: $49/mo · Unlimited + 1 slot

Pro: $159/mo · Unlimited + 4 slots

Business: $799/mo · Unlimited + 16 slots

Enterprise: $4,999+/mo · Custom + on-prem

📧 Contact Us

Suggestions, complaints, or partnership ideas? Leave a message and it will be sent to our inbox.

⚡ STORM Developer Platform

Nanjing Storm Engine Technology Co., Ltd. (南京暴风引擎科技有限公司)

or log in with a Key

Management features require an Admin Key; usage queries work with an API Key.