Stop rebuilding the same limits for every app.
Per-user caps, plans, credits: the same boring backend, every time. Define plans in the dashboard and wrap the AI call in reserve(). Quota is held atomically, so two requests landing 1 ms apart cannot both slip through.
Every AI call — quantity, cost, metadata.
→ counters · persons
Atomic increments. reserve() holds quota for 60s.
→ plans & limits
Quotas per unit, defined in the dashboard.
Current plan + full change history.
→ plans & limits · credit_ledger · persons
Pre-paid credits, partial burns, rollover.
→ persons
Every event and quota joined to one profile.
if (usage < limit) leaks under parallel calls. reserve → commit → release (60s TTL) doesn’t.
Stop checking your users with SQL. The dashboard joins every event and quota back to one person.


Fifteen jobs, one API. Every card is a full tutorial with working code.
One bill per account, every limit pooled across your workspaces. Start free, upgrade only when you outgrow a tier.
Your first AI feature, metered in production.
For a launched app with paying users.
Scale across apps and millions of calls.
Your volume and terms, on our metering rails.
Define a per-user quota in the dashboard, then wrap your call in vevee.reserve(). It returns { allowed: false } the moment a user hits their cap, before you spend a cent. Any provider, any unit.
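Here is a rough sketch of what wrapping a call could look like from your side. The QuotaClient interface, its method signatures, and the withQuota helper are assumptions made for illustration, not the actual Vevee SDK surface; only the reserve() name and the { allowed: false } result come from the description above.

```typescript
// Hypothetical shapes: the real SDK may name or structure these differently.
interface Reservation {
  allowed: boolean;
  reservationId?: string;
}

interface QuotaClient {
  reserve(userId: string, unit: string, quantity: number): Promise<Reservation>;
  commit(reservationId: string): Promise<void>;
  release(reservationId: string): Promise<void>;
}

type Gated<T> =
  | { ok: true; value: T }
  | { ok: false; reason: "quota_exceeded" };

// Runs `call` only if the reservation is granted, so a capped user
// never triggers the (paid) model call.
async function withQuota<T>(
  client: QuotaClient,
  userId: string,
  unit: string,
  quantity: number,
  call: () => Promise<T>,
): Promise<Gated<T>> {
  const reservation = await client.reserve(userId, unit, quantity);
  if (!reservation.allowed || reservation.reservationId === undefined) {
    return { ok: false, reason: "quota_exceeded" };
  }
  try {
    const value = await call();
    // The call succeeded: turn the hold into recorded usage.
    await client.commit(reservation.reservationId);
    return { ok: true, value };
  } catch (err) {
    // The call failed: give the held quota back, then surface the error.
    await client.release(reservation.reservationId);
    throw err;
  }
}
```

The client is passed in as a parameter, so the same wrapper works with any provider call and can be exercised with a fake client in tests.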
A naive if (usage < limit) check has a race condition: two requests arrive in the same millisecond, both pass, both fire. Vevee exposes a reserve → commit → release pattern with a 60-second TTL so the second request sees the first reservation. Auto-released on crash or timeout.
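To see why a held reservation closes the gap, here is a minimal in-memory model of the pattern. It is an illustration under stated assumptions, not Vevee's implementation: the QuotaCounter class, its method shapes, and the injectable clock are all made up for this sketch, and a real service would need the same check-and-hold step to be atomic across processes.

```typescript
// In-memory model of reserve → commit → release with a TTL.
// Illustrative only; names and shapes are hypothetical.
type Hold = { quantity: number; expiresAt: number };

class QuotaCounter {
  private committed = 0;
  private holds = new Map<number, Hold>();
  private nextId = 1;
  private limit: number;
  private ttlMs: number;
  private now: () => number;

  constructor(limit: number, ttlMs: number = 60000, now: () => number = Date.now) {
    this.limit = limit;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Drop holds whose TTL has passed, so a crashed caller cannot pin quota forever.
  private expire(): void {
    const t = this.now();
    this.holds.forEach((hold, id) => {
      if (hold.expiresAt <= t) this.holds.delete(id);
    });
  }

  private held(): number {
    let total = 0;
    this.holds.forEach((hold) => { total += hold.quantity; });
    return total;
  }

  // The check and the hold happen in one synchronous step, so a second
  // request always sees the first request's reservation.
  reserve(quantity: number): { allowed: boolean; id?: number } {
    this.expire();
    if (this.committed + this.held() + quantity > this.limit) {
      return { allowed: false };
    }
    const id = this.nextId++;
    this.holds.set(id, { quantity, expiresAt: this.now() + this.ttlMs });
    return { allowed: true, id };
  }

  // Converts a live hold into permanent usage; returns false if it already expired.
  commit(id: number): boolean {
    this.expire();
    const hold = this.holds.get(id);
    if (!hold) return false;
    this.holds.delete(id);
    this.committed += hold.quantity;
    return true;
  }

  // Gives held quota back early, for example when the model call fails.
  release(id: number): boolean {
    return this.holds.delete(id);
  }
}
```

With a limit of 1, two back-to-back reserve(1) calls yield one grant and one denial, where a plain usage < limit check would have passed both because usage is only updated after the call completes.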
Neither. We never touch money: keep your existing billing (or run none) and call upsertSubscription() from your webhook. Your code also calls the model directly; we sit beside the call to gate it and record it. Prompts stay yours unless you opt in.
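As a sketch of the webhook side, the handler below maps a billing event onto an upsertSubscription() call. Everything except the upsertSubscription name is an assumption: the BillingEvent shape is not taken from any particular billing provider, and the payload fields are guesses at what such a call might accept.

```typescript
// Hypothetical shapes throughout: neither the billing event nor the
// upsertSubscription() payload is taken from real documentation.
interface BillingEvent {
  type: "subscription.created" | "subscription.updated" | "subscription.canceled";
  customerId: string;
  planId: string;
}

interface SubscriptionUpsert {
  personId: string;
  planId: string;
  status: "active" | "canceled";
}

interface SubscriptionClient {
  upsertSubscription(input: SubscriptionUpsert): Promise<void>;
}

// Pure mapping step, kept separate so it can be tested without a network call.
function toUpsert(event: BillingEvent): SubscriptionUpsert {
  return {
    personId: event.customerId,
    planId: event.planId,
    status: event.type === "subscription.canceled" ? "canceled" : "active",
  };
}

// Call this from your billing provider's webhook route after verifying the event.
async function handleBillingWebhook(
  client: SubscriptionClient,
  event: BillingEvent,
): Promise<void> {
  await client.upsertSubscription(toUpsert(event));
}
```

The money stays with your billing provider; the handler only forwards which plan a person is on so that limits can follow it.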
All of them. Vevee is provider-agnostic: OpenAI, Anthropic, Gemini, Mistral, Replicate, fal, or your own self-hosted model. You decide what an event is, we meter it.
You define the compose type once in the dashboard — a prompt plus which data it can see. Vevee runs it per user, grounded in that user's real usage and behavior, returns typed output, and meters the AI cost. You never assemble the context or build the data plumbing.
Five minutes from npm install to your first enforced limit. Free to 50k ops/mo, no credit card.