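Common strategies are token bucket, leaky bucket, and fixed- or sliding-window counters. You can enforce limits at the edge or API gateway, at the load balancer, or in the app itself (keyed per user or per IP). Enforce them as close to the entry point as you can, so that rejected requests are dropped before they consume backend resources.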
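To make the token bucket idea concrete, here is a minimal in-process sketch of a per-key limiter (the key could be a user ID or an IP). The class names `TokenBucket` and `PerKeyLimiter`, the `rate` and `capacity` parameters, and the injectable `clock` are all assumptions made for illustration, not part of any particular gateway or library. It is single-threaded and keeps state in memory, so treat it as a way to see the mechanics rather than something to deploy as-is.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Hypothetical token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill lazily, based on the time elapsed since the last call.
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PerKeyLimiter:
    """Hypothetical wrapper that keeps one bucket per key (e.g. user ID or IP)."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[key] = bucket
        return bucket.allow()


if __name__ == "__main__":
    # Assumed policy for the demo: 5 requests/second sustained, bursts of up to 10.
    limiter = PerKeyLimiter(rate=5.0, capacity=10.0)
    for i in range(12):
        verdict = "allowed" if limiter.allow("203.0.113.7") else "rejected"
        print(i, verdict)
```

In a real deployment the bucket state usually lives in the gateway or in a shared store rather than in one process, since several app instances would otherwise each enforce their own separate limit. A rejected request is typically answered with HTTP 429 (Too Many Requests).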