Problem Statement
Your API gateway at Stripe must limit each user to 100 requests per second. Implement a thread-safe Token Bucket rate limiter that stays correct under concurrent access.
Token Bucket Algorithm
- Bucket holds tokens up to a maximum capacity
- Tokens are added at a fixed rate (e.g., 100/second)
- Each request consumes one token
- If no tokens are available, the request is rejected
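The rules above reduce to two small pieces of arithmetic: a capped refill and a conditional decrement. The sketch below shows them as pure functions, with no locking or clock handling. The function names `refill` and `try_consume` are illustrative and not from the original text.

```python
from typing import Tuple


def refill(tokens: float, capacity: float, rate: float, elapsed: float) -> float:
    """Add `rate * elapsed` tokens, never exceeding `capacity`."""
    return min(capacity, tokens + rate * elapsed)


def try_consume(tokens: float) -> Tuple[bool, float]:
    """Spend one token if at least one is available.

    Returns (allowed, tokens_remaining).
    """
    if tokens >= 1.0:
        return True, tokens - 1.0
    return False, tokens
```

Tokens are kept as a float so that partial refills between requests accumulate instead of being rounded away.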
Implementation
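One possible shape for a thread-safe bucket is sketched below, assuming a single process where a `threading.Lock` is enough to serialize the refill-and-consume step. The class name, the `allow` method, and the injectable `clock` parameter are choices made for this sketch. Refill is computed lazily on each call from a monotonic clock, so no background thread is needed.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Token bucket guarded by one lock; starts full."""

    def __init__(
        self,
        capacity: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if capacity <= 0 or refill_rate <= 0:
            raise ValueError("capacity and refill_rate must be positive")
        self._capacity = float(capacity)
        self._refill_rate = float(refill_rate)  # tokens per second
        self._clock = clock  # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()
        self._lock = threading.Lock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        with self._lock:
            now = self._clock()
            elapsed = max(0.0, now - self._last)
            self._last = now
            # Lazy refill: credit the tokens earned since the last call.
            self._tokens = min(
                self._capacity, self._tokens + elapsed * self._refill_rate
            )
            if self._tokens >= cost:
                self._tokens -= cost
                return True
            return False
```

Usage for the stated limit would be `bucket = TokenBucket(capacity=100, refill_rate=100)` followed by `bucket.allow()` per request. The whole read-modify-write happens inside the lock, which is what prevents two threads from spending the same token.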
Per-User Rate Limiting
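Per-user limiting can be sketched as a map from user ID to bucket, with a second lock protecting the map itself so that two threads never create separate buckets for the same user. The `PerUserRateLimiter` name and its parameters are assumptions for this sketch, and a compact bucket is included only so the example runs on its own.

```python
import threading
import time
from typing import Callable, Dict


class TokenBucket:
    """Compact bucket included so this example is self-contained."""

    def __init__(self, capacity: float, rate: float, clock: Callable[[], float]) -> None:
        self._capacity, self._rate, self._clock = float(capacity), float(rate), clock
        self._tokens, self._last = float(capacity), clock()
        self._lock = threading.Lock()

    def allow(self) -> bool:
        with self._lock:
            now = self._clock()
            earned = max(0.0, now - self._last) * self._rate
            self._tokens = min(self._capacity, self._tokens + earned)
            self._last = now
            if self._tokens >= 1.0:
                self._tokens -= 1.0
                return True
            return False


class PerUserRateLimiter:
    """Keeps one independent bucket per user ID."""

    def __init__(
        self,
        capacity: float = 100,
        rate: float = 100,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity, self._rate, self._clock = capacity, rate, clock
        self._buckets: Dict[str, TokenBucket] = {}
        self._registry_lock = threading.Lock()  # guards the dict only

    def allow(self, user_id: str) -> bool:
        # Hold the registry lock just long enough to find or create the bucket.
        with self._registry_lock:
            bucket = self._buckets.get(user_id)
            if bucket is None:
                bucket = TokenBucket(self._capacity, self._rate, self._clock)
                self._buckets[user_id] = bucket
        # The per-bucket lock is taken outside the registry lock, so one busy
        # user does not block lookups for other users.
        return bucket.allow()
```

Keeping the registry lock separate from the per-bucket locks is a design choice: contention on the map stays brief, and users are throttled independently of each other.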
HTTP Middleware Example
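A minimal middleware sketch follows, written against the WSGI interface using only the standard library. It assumes the user is identified by an `X-User-Id` request header (falling back to the client address), and it takes any callable `allow(user_id) -> bool` as the limiter; both are assumptions for illustration, as is the `RateLimitMiddleware` name.

```python
from typing import Callable


class RateLimitMiddleware:
    """WSGI middleware that rejects over-limit requests with HTTP 429."""

    def __init__(
        self,
        app: Callable,
        allow: Callable[[str], bool],
        retry_after_seconds: int = 1,
    ) -> None:
        self._app = app
        self._allow = allow  # e.g. a per-user limiter's allow method
        self._retry_after = retry_after_seconds

    def __call__(self, environ: dict, start_response: Callable):
        # Assumption: the caller is identified by the X-User-Id header,
        # which WSGI exposes as HTTP_X_USER_ID.
        user_id = environ.get("HTTP_X_USER_ID") or environ.get(
            "REMOTE_ADDR", "anonymous"
        )
        if self._allow(user_id):
            return self._app(environ, start_response)

        body = b"rate limit exceeded\n"
        start_response(
            "429 Too Many Requests",
            [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
                ("Retry-After", str(self._retry_after)),
            ],
        )
        return [body]
```

Wrapping an application would look like `app = RateLimitMiddleware(app, limiter.allow)`, where `limiter` is whatever per-user limiter the gateway uses. In a real gateway the user ID should come from an authenticated identity, not a client-supplied header.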
Production Considerations
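- Memory leak: the per-user bucket map grows without bound unless idle buckets are evicted periodically
- Distributed: with multiple gateway instances, keep bucket state in Redis and update it with a Lua script so the refill-and-consume step is atomic
- Graceful degradation: reject with HTTP 429 and a Retry-After header so clients know when to retry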
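Two of these points can be sketched without external services: evicting idle users and computing a Retry-After value. The helper names and the `last_seen` map (user ID to last request timestamp) are hypothetical, introduced only for this example. The Redis variant is not shown here.

```python
import math
from typing import Dict, List


def evict_idle(last_seen: Dict[str, float], now: float, idle_ttl: float) -> List[str]:
    """Remove users not seen for more than `idle_ttl` seconds.

    Returns the evicted user IDs so the caller can drop their buckets too.
    """
    stale = [user for user, seen in last_seen.items() if now - seen > idle_ttl]
    for user in stale:
        del last_seen[user]
    return stale


def retry_after_seconds(tokens: float, refill_rate: float, cost: float = 1.0) -> int:
    """Whole seconds until `cost` tokens will be available (0 if already there).

    Retry-After carries an integer number of seconds, so sub-second
    waits are rounded up to 1.
    """
    deficit = cost - tokens
    if deficit <= 0:
        return 0
    return max(1, math.ceil(deficit / refill_rate))
```

If new buckets start full, evicting a bucket that has been idle for longer than `capacity / refill_rate` seconds does not change behavior, because that bucket would have refilled completely by then anyway. The caller is expected to hold whatever lock protects `last_seen` while `evict_idle` runs.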
Follow-up Questions
- How would you implement a sliding window rate limiter?
- How do you handle distributed rate limiting across multiple servers?
- What's the difference between token bucket and leaky bucket?