Rate Limiting for AI Endpoints

AI-powered endpoints are expensive to run and attractive targets for abuse. Rate limiting is essential to protect your infrastructure and budget.

Why Rate Limit AI Endpoints?

  • Prevent cost overruns from API abuse
  • Protect against denial-of-service attacks
  • Ensure fair usage among users
  • Maintain service availability

In-Memory Rate Limiting

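A fixed-window counter kept in process memory is enough for single-server deployments. The counters are lost on restart and are not shared between instances: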

// lib/rate-limit.ts
// Fixed-window counter per key. State lives in this process only.
const rateLimitMap = new Map<string, { count: number; resetAt: number }>();

export function checkRateLimit(
  key: string,
  limit: number,
  windowMs: number
): { allowed: boolean; remaining: number } {
  const now = Date.now();
  const record = rateLimitMap.get(key);

  // First request for this key, or the previous window has ended: start a new window.
  if (!record || record.resetAt <= now) {
    rateLimitMap.set(key, { count: 1, resetAt: now + windowMs });
    return { allowed: true, remaining: limit - 1 };
  }

  // Limit already reached in the current window: reject without counting.
  if (record.count >= limit) {
    return { allowed: false, remaining: 0 };
  }

  // Still under the limit: count this request.
  record.count++;
  return { allowed: true, remaining: limit - record.count };
}
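
One property of the map above is worth noting: a key that is never seen again keeps its record forever, so memory grows with the number of distinct users. A minimal sketch of a cleanup pass follows. The name pruneExpired and the record type alias are hypothetical and not part of the original module; the function takes the map as a parameter so it can be tried in isolation.

```typescript
// Hypothetical helper: same record shape as the entries in rateLimitMap.
type RateLimitRecord = { count: number; resetAt: number };

// Deletes every record whose window ended strictly before `now`.
// Returns the number of records removed.
function pruneExpired(
  map: Map<string, RateLimitRecord>,
  now: number = Date.now()
): number {
  let removed = 0;
  // Deleting entries while iterating a Map with forEach is safe in JavaScript.
  map.forEach((record, key) => {
    if (record.resetAt < now) {
      map.delete(key);
      removed++;
    }
  });
  return removed;
}
```

One way to use it would be to call pruneExpired(rateLimitMap) from a timer, for example once per window. The interval is a judgment call and depends on how many distinct keys the server sees.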

Using It in API Routes

// app/api/ai/generate/route.ts
import { auth } from "@/auth";
import { checkRateLimit } from "@/lib/rate-limit";

export async function POST(request: Request) {
  const session = await auth();
  if (!session?.user?.id) {
    return Response.json({ error: "Unauthorized" }, { status: 401 });
  }

  // Rate limit: 10 requests per minute per user
  const { allowed, remaining } = checkRateLimit(
    `ai:${session.user.id}`,
    10,
    60 * 1000
  );

  if (!allowed) {
    return Response.json(
      { error: "Rate limit exceeded" },
      {
        status: 429,
        headers: { "X-RateLimit-Remaining": "0" },
      }
    );
  }

  // Process AI request...
}

Production Considerations

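  • Use a shared store such as Redis when you run more than one server instance, so every instance sees the same counters
  • Implement a sliding window algorithm for smoother limits; a fixed window can let through up to twice the limit in a burst that straddles a window boundary
  • Add different tiers for free vs. paid users
  • Include a Retry-After header in 429 responses so clients know when to try again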
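
Several of these points can be sketched together. The example below is an in-memory sliding window log with per-tier limits and a computed Retry-After value. It is an illustration under assumptions, not a production implementation: the tier names, the limit of 100 for paid users, and the names checkSlidingWindow and rateLimitHeaders are all hypothetical, and a multi-server deployment would need to keep the timestamps in a shared store such as Redis instead of a local Map.

```typescript
// Hypothetical tiers and limits, chosen only for illustration.
type Tier = "free" | "paid";

const TIER_LIMITS: Record<Tier, { limit: number; windowMs: number }> = {
  free: { limit: 10, windowMs: 60_000 },
  paid: { limit: 100, windowMs: 60_000 },
};

interface SlidingResult {
  allowed: boolean;
  remaining: number;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Timestamps (ms) of accepted requests per key, oldest first.
const requestLog = new Map<string, number[]>();

function checkSlidingWindow(
  key: string,
  tier: Tier,
  now: number = Date.now()
): SlidingResult {
  const { limit, windowMs } = TIER_LIMITS[tier];
  const windowStart = now - windowMs;

  // Keep only the requests that still fall inside the trailing window.
  const recent = (requestLog.get(key) ?? []).filter((t) => t > windowStart);

  if (recent.length >= limit) {
    requestLog.set(key, recent);
    // A slot frees up when the oldest accepted request leaves the window.
    const oldest = recent[0] ?? now;
    const waitMs = oldest + windowMs - now;
    const retryAfterSeconds = Math.max(1, Math.ceil(waitMs / 1000));
    return { allowed: false, remaining: 0, retryAfterSeconds };
  }

  recent.push(now);
  requestLog.set(key, recent);
  return { allowed: true, remaining: limit - recent.length, retryAfterSeconds: 0 };
}

// Builds the response headers; Retry-After is only set on rejections.
function rateLimitHeaders(result: SlidingResult): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Remaining": String(result.remaining),
  };
  if (!result.allowed) {
    headers["Retry-After"] = String(result.retryAfterSeconds);
  }
  return headers;
}
```

In a route handler, the result of rateLimitHeaders could be passed as the headers of the 429 response. The log approach stores one timestamp per accepted request, which is fine for small limits like these. For large limits, a counter-based approximation of the sliding window uses less memory.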

What VibeCheck Detects

  • VC-ABUSE-001: AI endpoints without rate limiting
  • VC-ABUSE-002: Missing cost controls on expensive operations