Scale horizontally (multiple stateless instances behind a load balancer) and move heavy work to async jobs/queues (background processing). You can also scale reads with caching and replicas.