AI-powered endpoints are expensive to run and attractive targets for abuse. Rate limiting is essential to protect your infrastructure and budget.
Why Rate Limit AI Endpoints?
- Prevent cost overruns from API abuse
- Protect against denial-of-service attacks
- Ensure fair usage among users
- Maintain service availability
In-Memory Rate Limiting
Simple rate limiting for single-server deployments:
```typescript
// lib/rate-limit.ts
// Fixed-window counter keyed by caller. Single-server only: the Map lives in
// one process's memory, so it won't coordinate limits across instances.
const rateLimitMap = new Map<string, { count: number; resetAt: number }>();

export function checkRateLimit(
  key: string,
  limit: number,
  windowMs: number
): { allowed: boolean; remaining: number } {
  const now = Date.now();
  const record = rateLimitMap.get(key);

  // First request for this key, or the previous window has expired.
  if (!record || record.resetAt < now) {
    rateLimitMap.set(key, { count: 1, resetAt: now + windowMs });
    return { allowed: true, remaining: limit - 1 };
  }

  if (record.count >= limit) {
    return { allowed: false, remaining: 0 };
  }

  record.count++;
  return { allowed: true, remaining: limit - record.count };
}
```

Using in API Routes
```typescript
// app/api/ai/generate/route.ts
import { auth } from "@/auth";
import { checkRateLimit } from "@/lib/rate-limit";

export async function POST(request: Request) {
  const session = await auth();
  if (!session?.user?.id) {
    return Response.json({ error: "Unauthorized" }, { status: 401 });
  }

  // Rate limit: 10 requests per minute per user
  const { allowed, remaining } = checkRateLimit(
    `ai:${session.user.id}`,
    10,
    60 * 1000
  );

  if (!allowed) {
    return Response.json(
      { error: "Rate limit exceeded" },
      {
        status: 429,
        headers: { "X-RateLimit-Remaining": "0" },
      }
    );
  }

  // Process AI request...
}
```

Production Considerations
- Use Redis or similar for distributed rate limiting
- Implement sliding window algorithms for smoother limits
- Add different tiers for free vs. paid users
- Include Retry-After headers in 429 responses so well-behaved clients know when to retry
What VibeCheck Detects
- VC-ABUSE-001: AI endpoints without rate limiting
- VC-ABUSE-002: Missing cost controls on expensive operations