Active API Keys
-
24h Requests
-
Total Tokens
-
Circuit Breaker
-

📊 24h Request Trends

📈 24h Token Usage

Quick Start

cURL
curl https://api.stormengine.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "Qwen2.5-32B",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'

🔑 Create API Key (logged-in users only; max 5 keys per account)

Key List

All your API keys are listed here. An API key is visible and copyable only at creation time, so save it securely. Do not share your API key with others or expose it in browser or other client-side code. To protect your account, we may automatically disable any API key that we find has been publicly leaked.
Name | ID | Usage / Quota | Rate | Status | Actions
Today (est.)
$0
This Month
$0
Pricing
$/M tokens

💵 Per-Key Billing

Key | Token Usage | Today | Month Est. | Quota Used

📈 Cost Trends (24h)

💳 Recharge

Recharge is not yet available. Contact us for API Key access.

📊 Daily Usage

API Requests

Token Usage

🧾 Recharge Bills

Order ID | Amount | Status | Time
No records

🔄 Refund

Notice

  1. Recharge balance that is unused and not yet invoiced is eligible for a full refund.
  2. Consumed amounts are non-refundable. For amounts that are invoiced but not consumed, void the invoice first, then apply.
  3. Wire transfers cannot be refunded through the online application.
  4. WeChat/Alipay payments older than 12 months and PayPal payments older than 180 days are non-refundable.
  5. Billing may be delayed, so the refundable amount shown is an estimate.
  6. Approved refunds are returned to the original payment method within 5 business days.
  7. Each account can have only one refund in progress at a time.

Refund Amount

¥ 0.00

📄 Invoices

📌 Rules

  1. The invoice title must match your verified identity.
  2. Invoices take about 7 business days and are sent by email.
  3. Invoiced amounts are non-refundable.
  4. Invoice information cannot be modified after submission.

Available for Invoicing

¥ 0.00

📡 Available Models

Qwen2.5-32B DGX Spark

Context: 64K tokens
Max Output: 8192 tokens
Max Concurrency: 30
Avg Latency: ~19s
Function Calling:
Streaming: ✅ SSE
JSON Mode:

Qwen2.5:14B Mac M4

Max Concurrency: 18
Avg Latency: ~9s
Role: Lightweight offload

Qwen3-8B FALLBACK

Context: 32K tokens
Role: Circuit-breaker fallback
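
The circuit-breaker fallback role listed above can be illustrated with a minimal client-side sketch. This is not STORM's gateway logic, which this page does not describe: the class name `ModelBreaker`, the threshold of 3 failures, and the 30-second cooldown are all assumptions chosen for illustration. Only the two model names come from the list above.

```python
import time


class ModelBreaker:
    """Route to a fallback model after repeated failures (illustrative only)."""

    def __init__(self, primary, fallback, max_failures=3, cooldown_s=30.0,
                 clock=time.monotonic):
        self.primary = primary
        self.fallback = fallback
        self.max_failures = max_failures  # assumed threshold
        self.cooldown_s = cooldown_s      # assumed cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at = None            # None means the breaker is closed

    def choose_model(self):
        """Return the model name the next request should target."""
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.cooldown_s:
                return self.fallback
            # Cooldown elapsed: close the breaker and try the primary again.
            self._opened_at = None
            self._failures = 0
        return self.primary

    def record_success(self):
        """Reset the consecutive-failure counter."""
        self._failures = 0

    def record_failure(self):
        """Count a failure and open the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()


breaker = ModelBreaker("Qwen2.5-32B", "Qwen3-8B")
print(breaker.choose_model())
```

A caller would pass `breaker.choose_model()` as the `model` argument of each request, then call `record_success()` or `record_failure()` depending on the outcome.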

🔗 API Endpoints

POST /v1/chat/completions: Chat Completions (OpenAI-compatible)
GET /v1/models: Model List
GET /health: Health Check
GET /admin/usage: Usage Query (API Key required)
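
As a rough sketch, the endpoints above can be addressed from the Python standard library alone. The helper name `build_request` and the short endpoint labels are hypothetical. Whether `/v1/models` and `/health` need a key is not stated on this page, so the helper treats the key as optional.

```python
import json
import urllib.request

BASE_URL = "https://api.stormengine.cloud"

# Method and path pairs copied from the endpoint list above.
ENDPOINTS = {
    "chat": ("POST", "/v1/chat/completions"),
    "models": ("GET", "/v1/models"),
    "health": ("GET", "/health"),
    "usage": ("GET", "/admin/usage"),
}


def build_request(name, api_key=None, body=None):
    """Build (but do not send) a urllib Request for one listed endpoint."""
    method, path = ENDPOINTS[name]
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


req = build_request("usage", api_key="YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it keeps the URL and header logic easy to check without network access.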

🐍 Python SDK Example

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.stormengine.cloud/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="Qwen2.5-32B",
    messages=[{"role": "user", "content": "Hello, STORM!"}],
    max_tokens=1024,
    temperature=0
)

print(response.choices[0].message.content)
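
The model list advertises SSE streaming, and the Quick Start request carries a `stream` flag. The sketch below shows how streamed text could be reassembled. It assumes the stream follows the OpenAI chat-completions chunk format (`data: {...}` lines carrying `choices[0].delta.content`, ending with `data: [DONE]`), which this page implies through "OpenAI-compatible" but does not spell out. The function name `iter_stream_text` and the sample lines are hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample lines, not captured from the real service.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", STORM!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With the `openai` SDK, the equivalent is to pass `stream=True` to `client.chat.completions.create(...)` and read `chunk.choices[0].delta.content` from each chunk it yields.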

📦 Node.js Example

JavaScript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.stormengine.cloud/v1',
  apiKey: 'YOUR_API_KEY'
});

const response = await client.chat.completions.create({
  model: 'Qwen2.5-32B',
  messages: [{ role: 'user', content: 'Hello STORM!' }]
});

console.log(response.choices[0].message.content);

🔐 Authentication

All API requests require an API Key in the Authorization header:
Authorization: Bearer sk-storm-xxxxxxxx
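
One way to follow the advice against exposing keys in client code is to read the key from the environment on the server side and mask it in logs. This is a minimal sketch: the variable name `STORM_API_KEY` and the helper names are hypothetical, and only the `Bearer` header form comes from the line above.

```python
import os


def load_api_key(env_var="STORM_API_KEY"):
    """Read the key from the environment (variable name is hypothetical)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return key


def auth_header(api_key):
    """Return the Authorization header in the Bearer form shown above."""
    return {"Authorization": f"Bearer {api_key}"}


def redact(api_key, visible=4):
    """Mask a key for logs, keeping only its last few characters."""
    if len(api_key) <= visible:
        return "*" * len(api_key)
    return "*" * (len(api_key) - visible) + api_key[-visible:]
```

Typical use is `headers = auth_header(load_api_key())`, with `redact(...)` applied wherever the key might be printed.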

⚡ STORM AI - Pricing

Inference API for Agent developers · Updated 2026-05

Built for developers who need determinism, dedicated capacity, and data sovereignty - not the cheapest tokens on the market. Pay for guaranteed throughput on physical hardware, not shared cloud quota.

Plans

Free
$0/mo
Try it out, side projects.
  • 100K tokens/month
  • 5 req/min
  • Qwen2.5-32B
  • Shared concurrency
  • Community support
Start free
Starter
$9.9/mo
Try it out, small projects.
  • 1M tokens/month
  • 10 req/min
  • Qwen2.5-32B
  • Shared concurrency
  • Email support
Subscribe - $9.9/mo
Developer
$49/mo
Indie devs running a single Agent.
  • Unlimited tokens
  • 20 req/min
  • 1 dedicated slot
  • 32B + 14B
  • 48h email support
  • 99% SLA
Subscribe - $49/mo
Most popular
Pro
$159/mo
Small teams, multiple Agents.
  • Unlimited tokens
  • 60 req/min
  • 4 dedicated slots
  • 32B + 14B
  • 24h email support
  • 99.5% SLA
Subscribe - $159/mo
Business
$799/mo
Production workloads, no limits.
  • Unlimited tokens
  • No rate limit
  • 16 dedicated slots
  • 32B + 14B
  • 4h priority support
  • 99.5% SLA
Subscribe - $799/mo
Enterprise
$4,999+/mo
Compliance, on-prem, custom.
  • Custom capacity
  • On-premise deploy
  • Custom fine-tuning
  • Your data residency
  • 24/7 dedicated engineer
  • 99.9% SLA
Contact sales

Full Feature Comparison

Feature | Free | Starter | Developer | Pro | Business | Enterprise
Monthly tokens | 100K | 1M | Unlimited | Unlimited | Unlimited | Unlimited
Rate limit | 5/min | 10/min | 20/min | 60/min | None | None
Dedicated concurrency slots | - | - | 1 | 4 | 16 | Custom
Available models | 32B | 32B | 32B + 14B | 32B + 14B | 32B + 14B | All + Custom
TTFT P50 | ~1.5s | ~1.5s | ~1.5s | ~1.2s | ~1.0s | <1s
SLA uptime | - | - | 99% | 99.5% | 99.5% | 99.9%
Data residency | China | China | China | China | China | Your choice
Support response | Community | Email | 48h | 24h | 4h | 24/7
Custom system prompts
Tool calling
Streaming output
Usage dashboard
Custom fine-tuning----
On-premise deployment----
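
To stay under the per-minute rate limits in the comparison table, a client can throttle itself before sending. The sketch below assumes the limit applies over a rolling 60-second window. The page gives only "req/min" figures, so the window behavior, the class name `MinuteLimiter`, and its methods are illustrative assumptions.

```python
import time
from collections import deque

# Requests per minute per plan, copied from the table above.
PLAN_RPM = {"Free": 5, "Starter": 10, "Developer": 20, "Pro": 60}


class MinuteLimiter:
    """Client-side sliding-window limiter (assumes a rolling 60 s window)."""

    def __init__(self, max_per_minute, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self._clock = clock
        self._sent = deque()  # timestamps of requests in the last 60 s

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if none)."""
        now = self._clock()
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.max_per_minute:
            return 0.0
        return 60.0 - (now - self._sent[0])

    def record(self):
        """Call once per request actually sent."""
        self._sent.append(self._clock())


limiter = MinuteLimiter(PLAN_RPM["Free"])
```

Before each call, run `time.sleep(limiter.wait_time())` and then `limiter.record()`.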

Pay-as-you-go (Alternative)

Prefer per-token billing instead of a subscription? You can charge usage to your account balance:

Model | Input | Output | Best for
Qwen2.5-32B (STORM main) | $0.50 / 1M | $1.00 / 1M | Complex code, Agents, long context
Qwen2.5-14B (Mac M4) | $0.20 / 1M | $0.40 / 1M | Simple tasks, classification, extraction
When to subscribe vs pay-as-you-go: Subscriptions become cheaper after roughly 3M tokens/month on the 32B model. Most production Agents cross this threshold within days. If you run a real Agent, subscribe.
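
For a concrete workload, the comparison can be estimated from the list prices in the table. The break-even point depends on the plan fee and on how traffic splits between input and output tokens, neither of which is fixed here, so the sketch takes both as inputs. The function names and the default 50/50 split are assumptions.

```python
# List prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "Qwen2.5-32B": {"input": 0.50, "output": 1.00},
    "Qwen2.5-14B": {"input": 0.20, "output": 0.40},
}


def payg_cost(model, input_tokens, output_tokens):
    """Pay-as-you-go cost in USD for the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def break_even_tokens(model, monthly_fee, output_share=0.5):
    """Total monthly tokens at which per-token billing equals a flat fee.

    output_share is the assumed fraction of tokens that are output.
    """
    p = PRICES[model]
    blended = (1 - output_share) * p["input"] + output_share * p["output"]
    return monthly_fee / blended * 1_000_000


# Example: 2M input tokens plus 1M output tokens on the 32B model.
print(payg_cost("Qwen2.5-32B", 2_000_000, 1_000_000))
```

Calling `break_even_tokens("Qwen2.5-32B", monthly_fee)` with a plan's monthly price gives the volume above which that flat fee costs less than per-token billing for the assumed split.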

Why STORM vs OpenAI / Claude / DeepSeek

Capability | STORM | OpenAI GPT-4o | Anthropic Claude 3.5 | DeepSeek
Output price ($/1M tokens) | $1.00 | $10 | $15 | $0.30
Dedicated capacity (no noisy neighbor) |  | - | - | -
No rate limits on paid tier |  | - | - | -
Data stays on our hardware (never trained on) |  | Opt-out | Opt-out | Opt-out
Tool calling / Function calling |  |  |  |
Full system prompt control | Full | Limited | Limited | Limited
1,280 tests, 0 structural errors |  | n/a | n/a | n/a

Get Started in 60 Seconds

STORM is OpenAI API compatible. Drop in your key, change the base_url, and you're running.

# 1. Install the OpenAI SDK
pip install openai

# 2. Drop in your STORM key
from openai import OpenAI

client = OpenAI(
    base_url="https://api.stormengine.cloud/v1",
    api_key="YOUR_STORM_KEY",
)

response = client.chat.completions.create(
    model="Qwen2.5-32B",
    messages=[{"role": "user", "content": "Write a Python sort function."}],
    temperature=0,  # Deterministic output for Agent pipelines
)

print(response.choices[0].message.content)

100,000 free tokens every month. No credit card required for the Free tier.

Payment Methods

💳 Credit / Debit Card (Stripe)
🅿️ PayPal
🏦 Wire Transfer (Business+, $500+)
₿ USDC / USDT (on request)

FAQ

Why is STORM cheaper than OpenAI but more expensive than DeepSeek?
We're not a cloud giant with shared inference. Each paying customer gets dedicated capacity on physical hardware. You're paying for guaranteed throughput, not just tokens.
What happens when I hit my Free token cap?
Requests return HTTP 429 until the next month. Upgrade anytime to remove the cap immediately.
Can I cancel anytime?
Yes. Monthly subscriptions, cancel anytime in your dashboard. No questions, no retention calls.
Where is my data processed?
On our NVIDIA DGX Spark in Nanjing, China by default. Enterprise customers choose their deployment region.
Do you train on my data?
No. Ever. Prompts and outputs are never used for training. Logs are kept for billing and abuse-prevention only, and auto-purge after 30 days.
Refund policy?
7-day full refund on the first month of any paid plan. No questions asked.
How is the SLA calculated?
SLA = (Total time - Downtime) / Total time, measured monthly. Credits are issued automatically when we miss the target.
What's a "dedicated slot"?
A guaranteed concurrent inference slot on our hardware. You can have 1 (Developer) to 16 (Business) requests in flight without queuing - no noisy neighbors.
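
The SLA formula in the FAQ above translates directly into a small calculation. This sketch measures time in minutes and assumes a 30-day month for the downtime budget. The FAQ specifies neither, and the function names are hypothetical.

```python
def sla_percent(total_minutes, downtime_minutes):
    """Uptime percentage: (total time - downtime) / total time."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return (total_minutes - downtime_minutes) / total_minutes * 100


def allowed_downtime_minutes(target_percent, days=30):
    """Downtime budget for a target over one month (30 days assumed)."""
    total = days * 24 * 60
    return total * (1 - target_percent / 100)


# Downtime budgets for the SLA targets listed in the plans.
for target in (99.0, 99.5, 99.9):
    print(target, round(allowed_downtime_minutes(target), 1))
```

Comparing `sla_percent(...)` for a month against the plan's target shows whether the target was met.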

Enterprise & Custom

Need something beyond Business? We work directly with compliance-sensitive teams:

On-premise deployment: Install STORM on your hardware, behind your firewall.
Custom fine-tuning: Train Qwen2.5 on your domain data, hosted privately.
Air-gapped environments: Fully isolated deployments for classified workloads.
Compliance: SOC2, HIPAA, GDPR DPA - ask for our compliance pack.
Dedicated engineer: 24/7 named contact, Slack channel, custom integrations.
Volume discounts: 50%+ off list at high commitment levels.

Contact: [email protected]

Trust & Transparency

Hardware: NVIDIA DGX Spark (verified) + Apple M4 Pro
Inference engine: vLLM 0.21 (open source)
Models: Qwen2.5 series (open weights)

Ready to build?

Get a free key in 60 seconds. No credit card.

At-a-Glance

Free: $0/mo · 100K tokens
Starter: $9.9/mo · 1M tokens · 10/min
Developer: $49/mo · Unlimited + 1 slot
Pro: $159/mo · Unlimited + 4 slots
Business: $799/mo · Unlimited + 16 slots
Enterprise: $4,999+/mo · Custom + on-prem

Active Keys
-
24h Requests
-
24h Tokens
-
Total
-

📊 24h Requests

📈 24h Token Usage

🔑 All Keys

Name | ID | Usage/Quota | Rate | Status | Actions

🌐 Recent IPs

IP | Requests | Last Seen

💵 Key Costs

Key | Tokens | Cost | Quota%

📧 Contact Us

Suggestions, complaints, or partnership inquiries? Leave a message and it will be sent to our inbox.

Or email: [email protected]

STORM AI

Nanjing Storm Engine Technology

Management functions require an Admin Key; usage queries can use an API Key.
