Question 1

How do I limit OpenAI usage per user?

Accepted Answer

Define a plan with a per-user quota in the Vevee dashboard, then wrap your AI call in vevee.reserve(). It returns { allowed: false } the moment a user hits their cap - before you spend a cent on the actual API call. Works for any model provider and any unit (tokens, images, seconds, requests).
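The gating flow can be sketched roughly as follows. Only the name reserve() and the { allowed } field come from the answer above; the argument list, the QuotaGate interface, the InMemoryGate stand-in, and guardedCall are hypothetical names made up for illustration, so check the SDK docs for the real signature.

```typescript
// Assumed shapes: the real reserve() arguments may differ.
type ReserveResult = { allowed: boolean };

interface QuotaGate {
  reserve(userId: string, units: number): Promise<ReserveResult>;
}

// In-memory stand-in for the hosted service, for illustration only.
class InMemoryGate implements QuotaGate {
  private used = new Map<string, number>();
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  async reserve(userId: string, units: number): Promise<ReserveResult> {
    const current = this.used.get(userId) ?? 0;
    if (current + units > this.limit) return { allowed: false };
    this.used.set(userId, current + units);
    return { allowed: true };
  }
}

// Gate first: the provider call only runs when the reservation succeeds.
async function guardedCall<T>(
  gate: QuotaGate,
  userId: string,
  units: number,
  callModel: () => Promise<T>,
): Promise<T | null> {
  const { allowed } = await gate.reserve(userId, units);
  if (!allowed) return null; // cap reached, so nothing is spent on the provider
  return callModel();
}
```

Here callModel is whatever function calls your provider; returning null on a denied reservation is one possible convention, and throwing or returning an HTTP 429 would work just as well.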

Question 2

How do I stop users from bypassing rate limits with parallel requests?

Accepted Answer

A naive "if (usage < limit) callAI()" check has a race condition: two requests arrive in the same millisecond, both read the same usage, both pass, and both fire. Vevee exposes a reserve → commit → release pattern: a reservation counts against the quota as soon as it is made, so the second request sees the first one even when the two arrive 1ms apart. Each reservation carries a 60-second TTL and is released automatically on crash or timeout.
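To make the pattern concrete, here is a minimal in-memory sketch of a reservation ledger with a TTL. It is not Vevee's implementation: the ReservationLedger class, its method signatures, and the injectable clock are all assumptions for illustration. The key idea is that pending reservations count against the limit alongside committed usage.

```typescript
type Reservation = { units: number; expiresAt: number };

class ReservationLedger {
  private committed = 0;
  private pending = new Map<number, Reservation>();
  private nextId = 1;
  private limit: number;
  private ttlMs: number;
  private now: () => number;

  constructor(limit: number, ttlMs: number = 60000, now: () => number = Date.now) {
    this.limit = limit;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Drop holds whose TTL has elapsed (stands in for auto-release on crash or timeout).
  private expire(): void {
    const t = this.now();
    this.pending.forEach((r, id) => {
      if (r.expiresAt <= t) this.pending.delete(id);
    });
  }

  // Sum of units currently held by unexpired reservations.
  private held(): number {
    let sum = 0;
    this.pending.forEach((r) => { sum += r.units; });
    return sum;
  }

  // Returns a reservation id, or null when committed + held usage leaves no room.
  reserve(units: number): number | null {
    this.expire();
    if (this.committed + this.held() + units > this.limit) return null;
    const id = this.nextId++;
    this.pending.set(id, { units, expiresAt: this.now() + this.ttlMs });
    return id;
  }

  // Turns a live reservation into permanent usage; false if it already expired.
  commit(id: number): boolean {
    this.expire();
    const r = this.pending.get(id);
    if (r === undefined) return false;
    this.pending.delete(id);
    this.committed += r.units;
    return true;
  }

  // Gives the held units back, e.g. when the model call fails.
  release(id: number): void {
    this.pending.delete(id);
  }
}
```

In a real multi-instance deployment the check-and-hold step has to be atomic in shared storage; a single-process Map like this only demonstrates the accounting.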

Question 3

How do I enforce different limits for free, pro, and enterprise plans?

Accepted Answer

Build plans visually in the dashboard. Each plan has its own limits, periods (daily, weekly, monthly, lifetime), and units. When a user upgrades, call vevee.upsertSubscription({ userId, planId }) from your existing billing webhook - Vevee handles the rest, including history for downgrade and churn analysis.
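A webhook handler along these lines could forward plan changes. The upsertSubscription({ userId, planId }) call shape is taken from the answer above; everything else here is an assumption for illustration, including the BillingEvent shape, the price-to-plan table, the plan IDs, and the return type of the call.

```typescript
// Minimal view of the client: only the call named in the answer above.
interface SubscriptionClient {
  upsertSubscription(args: { userId: string; planId: string }): Promise<unknown>;
}

// Hypothetical event shape; a real billing provider's payload will differ.
type BillingEvent = { customerId: string; priceId: string };

// Hypothetical mapping from billing price IDs to plan IDs defined in the dashboard.
const PLAN_BY_PRICE = new Map<string, string>([
  ["price_pro", "pro"],
  ["price_enterprise", "enterprise"],
]);

// Returns true when the event mapped to a known plan and was forwarded.
async function onBillingEvent(
  client: SubscriptionClient,
  event: BillingEvent,
): Promise<boolean> {
  const planId = PLAN_BY_PRICE.get(event.priceId);
  if (planId === undefined) return false; // unknown price: leave the subscription alone
  await client.upsertSubscription({ userId: event.customerId, planId });
  return true;
}
```

Ignoring unknown price IDs rather than guessing a plan keeps a misconfigured webhook from silently changing a user's plan.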

Question 4

Does Vevee process payments?

Accepted Answer

No. Vevee never touches money. You keep your existing billing system - or no billing at all - and call vevee.upsertSubscription() from your webhook to tell us what plan a user is on. We do the enforcement and tracking; you do the charging.

Question 5

Does Vevee proxy my AI calls?

Accepted Answer

No. Your code calls OpenAI, Anthropic, Replicate, or your own model directly. Vevee sits beside that call to gate it before and record it after. We never see your prompts or model responses unless you explicitly opt in to prompt logging.

Question 6

Which AI providers does it support?

Accepted Answer

All of them. Vevee is provider-agnostic: it works with OpenAI, Anthropic, Gemini, Mistral, Replicate, fal, your own self-hosted model, or anything else you call. You decide what an event is.

Question 7

Can I use Vevee for a free AI app with no payments?

Accepted Answer

Yes. Define a single free plan with limits and never call upsertSubscription - every user is implicitly on the free plan. This is the simplest way to cap AI cost on a free app without writing any backend code.

Question 8

What is the latency overhead?

Accepted Answer

Around 30ms at p50 and under 80ms at p95. The reserve call is the only one on the hot path; commit and track can be fire-and-forget so they never block your response.
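One way to keep commit off the hot path is to start it without awaiting it. This is a sketch under assumptions: the Meter interface, its argument lists, and handleRequest are hypothetical names, and only the reserve and commit call names come from the answers above.

```typescript
// Assumed minimal interface; real signatures may differ.
interface Meter {
  reserve(userId: string): Promise<{ allowed: boolean }>;
  commit(userId: string, units: number): Promise<void>;
}

async function handleRequest(
  meter: Meter,
  userId: string,
  runModel: () => Promise<{ text: string; units: number }>,
): Promise<string | null> {
  // Hot path: the only metering call the response waits for.
  const { allowed } = await meter.reserve(userId);
  if (!allowed) return null;

  const result = await runModel();

  // Fire-and-forget: not awaited, so it adds no latency to the response.
  // The catch handler keeps a failed commit from becoming an unhandled rejection.
  void meter.commit(userId, result.units).catch((err) => {
    console.error("commit failed", err);
  });

  return result.text;
}
```

The trade-off is that a failed commit is only logged here; on serverless platforms that freeze the process once the response is sent, un-awaited work may need the platform's own background-task mechanism.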

Question 9

What is compose() and how is it different from a prompt I write myself?

Accepted Answer

You define a compose type once in the Vevee dashboard - a prompt plus which data it can see. Vevee runs it per user, grounded in that user's real usage and behavior, returns typed output, and meters the AI cost. You never assemble the context or build the data plumbing yourself.

The backend every AI app rebuilds. Shipped as one SDK.

You know this schema. You’ve written it before.

Every user. What they did, what it cost, where they stand.

One compose type per job. Define it once - it writes fresh for every user.

Pay for usage, not seats.

The questions devs ask first.

The boring backend is built. Go ship the interesting part.