Usage Quotas & Tracking
Real-time usage monitoring with quotas, alerts, and controls to manage consumption and costs
Overview
Quotas in Instafill.ai are enforced per workspaceId at the API layer. Each subscription plan defines limits for sources, storage, forms, and sessions. When a workspace's usage reaches a plan limit, further requests return HTTP 429. Usage counters are tracked per workspaceId and are visible in the Settings → Usage dashboard.
API responses include rate limit headers — X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset — so callers can monitor remaining capacity on every request. When a quota is exceeded, the platform emits a quota.exceeded webhook event that can be used to trigger alerts or automation on your end.
Key Capabilities
- Per-Workspace Quota Enforcement: All quota checks run against the workspaceId; limits apply independently per workspace
- HTTP 429 on Limit Breach: Requests that exceed plan limits receive a 429 response at the API layer, with no partial processing
- Rate Limit Headers on Every Response: X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset returned with API responses so clients can track remaining capacity
- quota.exceeded Webhook: Platform fires a quota.exceeded event when a workspace hits a plan limit, enabling downstream alerting or automation
- Settings → Usage Dashboard: View current usage counters per workspaceId without making API calls
- Plan-Based Limits: Quotas derive from the active subscription plan and update immediately when the Stripe subscription webhook fires a plan change event
How It Works
Plan Limits by Tier:
- Free: 10 sources / 100 MB storage, limited forms, limited sessions
- Starter: 100 sources / 1 GB storage
- Professional: 1,000 sources / 10 GB storage
- Enterprise: Custom limits negotiated per contract
Enforcement at the API Layer: Every request that could consume a quota-tracked resource runs a check against the workspace's current counters before proceeding. If the counter is at or above the plan limit, the request is rejected with HTTP 429 before any processing occurs.
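The check described above can be sketched roughly as follows. This is a minimal illustration, not Instafill.ai's actual implementation: the function name check_quota and the PLAN_SOURCE_LIMITS table are hypothetical, and only the source limits from the tier list are modelled.

```python
from http import HTTPStatus

# Source limits per plan, copied from the tier list above.
# Enterprise limits are contract-specific, so they are left out.
# The table name and shape are assumptions made for illustration.
PLAN_SOURCE_LIMITS = {"free": 10, "starter": 100, "professional": 1000}


def check_quota(plan: str, current_count: int) -> HTTPStatus:
    """Return 429 when the counter is at or above the plan limit, else 200.

    The comparison runs before any work is done, mirroring the
    "reject before processing" behaviour described in the text.
    """
    limit = PLAN_SOURCE_LIMITS[plan]
    if current_count >= limit:
        return HTTPStatus.TOO_MANY_REQUESTS
    return HTTPStatus.OK
```

Note that the comparison is "at or above", so a Free workspace that already holds 10 sources is rejected on its next source-creating request.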
Rate Limit Headers: Each API response carries:
- X-RateLimit-Limit: The plan's defined limit for the relevant resource
- X-RateLimit-Remaining: Units remaining before the limit is hit
- X-RateLimit-Reset: Unix timestamp indicating when the counter resets
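As a rough sketch, a client could read these three headers into a small structure. The names RateLimitInfo and parse_rate_limit_headers are hypothetical, and the code assumes the header values are plain integers with X-RateLimit-Reset expressed in Unix seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass(frozen=True)
class RateLimitInfo:
    limit: int          # X-RateLimit-Limit
    remaining: int      # X-RateLimit-Remaining
    reset_at: datetime  # X-RateLimit-Reset, converted to an aware UTC datetime


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Extract the three rate limit headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        # Assumption: the timestamp is in seconds, not milliseconds.
        reset_at=datetime.fromtimestamp(
            int(lowered["x-ratelimit-reset"]), tz=timezone.utc
        ),
    )
```

Most HTTP client libraries expose response headers as a mapping, so the function can usually be called directly on that object.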
quota.exceeded Webhook: When a workspace hits a quota boundary, the platform fires a quota.exceeded event. You can subscribe to this event to trigger notifications, pause automated workflows, or prompt users to upgrade.
Usage Dashboard: Settings → Usage shows current counters for each tracked resource within the workspace. Counters update as requests are processed.
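For the quota.exceeded webhook described above, a receiving endpoint might turn the event into an alert message along these lines. The payload shape is not documented here, so the field names type, data, workspaceId, and resource are assumptions chosen for illustration; check them against a real delivery before relying on them.

```python
import json
from typing import Optional


def build_quota_alert(raw_body: bytes) -> Optional[str]:
    """Return an alert message for quota.exceeded events, or None otherwise.

    Assumed (hypothetical) payload shape:
        {"type": "quota.exceeded",
         "data": {"workspaceId": "...", "resource": "sources"}}
    """
    event = json.loads(raw_body)
    if event.get("type") != "quota.exceeded":
        return None  # Ignore every other event type.
    data = event.get("data", {})
    workspace_id = data.get("workspaceId", "unknown")
    resource = data.get("resource", "unknown")
    return f"Workspace {workspace_id} hit its {resource} limit"
```

The returned string can then be forwarded to whatever alerting channel the team already uses.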
Quota Reset: Counters reset on the workspace's subscription renewal date. Unused quota does not roll over.
Use Cases
A startup on the Starter plan uses X-RateLimit-Remaining headers in their integration to detect when sources are running low and surface an in-product upgrade prompt before hitting 429.
An agency running multiple client workspaces monitors each workspace's usage independently in Settings → Usage, since quota tracking is scoped per workspaceId and never aggregated across workspaces.
An enterprise operations team subscribes to the quota.exceeded webhook to post an alert to their Slack channel whenever any workspace hits its source or storage limit, allowing them to reallocate or upgrade before workflows are blocked.
Benefits
- 429 Before Processing: Quota limits are checked at the API boundary, so over-limit requests fail fast without consuming partial resources or producing inconsistent state
- Client-Side Visibility Without Extra Calls: Rate limit headers on every response mean API consumers always know remaining capacity without polling a separate endpoint
- Webhook-Driven Alerting: The quota.exceeded event enables integration with external alerting and automation tools without polling
- Workspace Isolation: Quota counters are scoped per workspaceId, so a high-volume workspace does not affect the limits or visibility of other workspaces
- Immediate Limit Updates on Plan Change: When a Stripe subscription webhook fires after a plan upgrade or downgrade, the workspace's enforced limits update immediately
Security & Privacy
Usage counters track resource consumption counts per workspaceId — they record how much was used, not the content of what was processed. Form content and document data are not part of usage tracking.
Data is scoped to workspaceId and protected via the shared JWT authentication middleware running in both the .NET and Python service layers. Only authenticated requests carrying a valid JWT for a given workspace can read that workspace's usage counters or receive its quota-related webhook events. Workspace members can view their workspace's usage in Settings → Usage; counters from other workspaces are not accessible.
Common Questions
What happens when I hit my quota limit?
Requests that exceed the active plan's limits return HTTP 429 at the API layer. The response includes X-RateLimit-Limit, X-RateLimit-Remaining (which will be 0), and X-RateLimit-Reset indicating when the counter resets.
At the same time, the platform fires a quota.exceeded webhook event for the workspace. To resume normal operation before the reset date, upgrade the workspace's subscription plan from Settings → Billing. The new limits apply as soon as the Stripe plan-change webhook is processed.
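On the client side, a 429 can be handled by working out how long remains until the counter resets. The sketch below is one possible approach, not a prescribed one: seconds_until_reset is a hypothetical helper, and it assumes X-RateLimit-Reset is given in Unix seconds under exactly that header name.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    status_code: int,
    headers: Mapping[str, str],
    now: Optional[float] = None,
) -> Optional[float]:
    """Return seconds until the quota counter resets, or None if not a 429.

    `now` can be injected for testing; it defaults to the current time.
    """
    if status_code != 429:
        return None
    current = time.time() if now is None else now
    reset_at = float(headers["X-RateLimit-Reset"])
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)
```

Because counters reset on the subscription renewal date, the wait can be days rather than seconds. In practice it is usually better to pause or queue the workflow and notify someone than to sleep until the reset.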
How do quotas reset? Can I roll over unused quota?
Quota counters reset on the workspace's subscription renewal date. Unused quota does not roll over — counters return to zero at the start of each billing period regardless of how much was consumed.
Annual plans may have quotas expressed as annual totals depending on the contract. Contact the sales team for Enterprise arrangements that require custom rollover or flexible period structures.
Can I set different quotas for different workspaces?
Each workspace has its own subscription and its own quota counters scoped to its workspaceId. Quotas are not shared across workspaces. If different workspaces within an organization need different limits, each workspace's subscription plan can be set independently from Settings → Billing within that workspace.
Enterprise contracts can include custom limits per workspace. Contact the sales team to configure per-workspace limits as part of an Enterprise agreement.
What counts toward different quota types?
- Sources quota: Each source added to a workspace increments the source counter. Free workspaces are limited to 10 sources; Starter to 100; Professional to 1,000; Enterprise is custom.
- Storage quota: Total size of stored source documents and associated files, measured in MB/GB. Free workspaces are limited to 100 MB; Starter to 1 GB; Professional to 10 GB.
- Forms and sessions: Tracked per workspace. Free and Starter tiers have defined limits; Professional and Enterprise tiers have higher or custom limits.
The X-RateLimit-Limit and X-RateLimit-Remaining headers in API responses reflect the limit and remaining capacity for the resource dimension being accessed in that request.
Can I get alerts before running out of quota?
Yes, with one caveat. The quota.exceeded webhook event fires when a workspace hits its plan limit, so it reports that a limit has been reached rather than warning ahead of time. This event can be routed to any endpoint: use it to send Slack messages, trigger PagerDuty, or post to any alerting pipeline.
The X-RateLimit-Remaining header on API responses also provides per-request visibility into remaining capacity, so your integration can detect when limits are approaching and take action before a 429 is returned.
In-platform, the Settings → Usage dashboard shows current counters so workspace owners can monitor usage without writing custom tooling.
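The early-warning check based on X-RateLimit-Remaining could look something like this. The helper name is_running_low and the 10% default threshold are assumptions picked for illustration; the right threshold depends on how quickly a workspace consumes its quota.

```python
def is_running_low(limit: int, remaining: int, threshold: float = 0.1) -> bool:
    """Return True when remaining capacity is at or below `threshold` of the limit.

    `limit` and `remaining` are the integer values of X-RateLimit-Limit and
    X-RateLimit-Remaining from the most recent API response.
    """
    if limit <= 0:
        # No usable capacity at all: treat as low so the caller is alerted.
        return True
    return remaining / limit <= threshold
```

For example, on the Starter plan's 100-source limit, the default threshold flags the workspace once 10 or fewer sources remain.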