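With many instances, an in-memory counter only sees the requests handled by its own instance, so with N instances behind a load balancer the total traffic can reach roughly N times the intended limit. You usually need to keep the counters in a shared store (e.g., Redis) or enforce limits at the gateway, where all traffic passes through one place. Hard parts: correctness under concurrency (the check and the increment must be atomic), choosing and aligning time windows, clock drift between instances, and avoiding hot keys.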
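
To make the difference concrete, here is a minimal sketch of a fixed-window limiter, one of several possible algorithms. `SharedCounterStore`, `FixedWindowLimiter`, and `simulate` are hypothetical names made up for this illustration. The store is an in-process stand-in so the example runs without any dependencies; in a real deployment its role would be played by something like Redis, whose `INCR` command gives the same atomic increment-and-read. Expiring old window keys is left out here.

```python
import threading
import time


class SharedCounterStore:
    """In-process stand-in for a shared store such as Redis (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def incr(self, key: str) -> int:
        # Increment and read in one atomic step. A separate read followed
        # by a write would let concurrent requests slip past the limit.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]


class FixedWindowLimiter:
    """Allows at most `limit` requests per client per fixed time window."""

    def __init__(self, store, limit: int, window_seconds: int, clock=time.time):
        self._store = store
        self._limit = limit
        self._window = window_seconds
        self._clock = clock

    def allow(self, client_id: str) -> bool:
        # The window id comes from the local clock, so instances with
        # drifting clocks can disagree about it near window boundaries.
        window_id = int(self._clock() // self._window)
        key = f"rl:{client_id}:{window_id}"
        return self._store.incr(key) <= self._limit


def simulate(shared: bool, instances: int = 2, limit: int = 5, requests: int = 8) -> int:
    """Return how many of `requests` are allowed across all instances."""
    clock = lambda: 1000.0  # frozen clock keeps every request in one window
    common = SharedCounterStore()
    limiters = [
        FixedWindowLimiter(common if shared else SharedCounterStore(), limit, 60, clock)
        for _ in range(instances)
    ]
    # Spread requests round-robin across instances, as a load balancer might.
    return sum(limiters[i % instances].allow("client-a") for i in range(requests))


if __name__ == "__main__":
    print("per-instance counters:", simulate(shared=False))
    print("shared counter:", simulate(shared=True))
```

With per-instance counters each of the two instances sees only four of the eight requests, so none is rejected even though the limit is five. With the shared counter every instance increments the same key, and the limit holds globally. Note that a single key per client is also where hot keys come from: one very busy client concentrates all its increments on one entry in the store.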