© 2026 LetsGit.IT. All rights reserved.

Architecture, medium

Why do teams watch p95/p99 latency, not just average latency?

Tags
#latency #p99 #performance #observability

Answer

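Averages hide tail latency: a few very slow requests barely move the mean but are painful for the users who hit them. p95 and p99 are the latencies that 95% and 99% of requests stay under, so they show how the slowest 5% and 1% of requests behave and help catch queueing and saturation issues.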
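
To make the gap between the mean and the tail concrete, here is a minimal sketch. The data set is hypothetical (1000 requests, 2% of them slow), and `percentile` is an illustrative helper using the nearest-rank definition, not the API of any particular metrics library.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p% of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  // Integer arithmetic first, to avoid floating-point surprises in the rank.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function mean(samples) {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Hypothetical traffic: 980 fast requests (20 ms) and 20 slow ones (2000 ms).
const latenciesMs = [
  ...Array(980).fill(20),
  ...Array(20).fill(2000),
];

const summary = {
  mean: mean(latenciesMs),           // 59.6
  p50: percentile(latenciesMs, 50),  // 20
  p95: percentile(latenciesMs, 95),  // 20
  p99: percentile(latenciesMs, 99),  // 2000
};

console.log(summary);
```

In this made-up sample the mean (59.6 ms) looks healthy and p95 still reads 20 ms, because only 2% of requests are slow; p99 is the first number that exposes the 2-second requests. That is one reason to watch more than one percentile.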

Advanced answer

Deep dive

Expanding on the short answer, here is what usually matters in practice:

  • Context (tags): latency, p99, performance, observability
  • Scaling: what scales horizontally vs vertically, where bottlenecks appear.
  • Reliability: retries/circuit breakers/idempotency, observability (logs/metrics/traces).
  • Evolution: keep changes cheap (boundaries, contracts, tests).
  • Explain the "why", not just the "what" (intuition + consequences).
  • Trade-offs: what you gain/lose (time, memory, complexity, risk).
  • Edge cases: empty inputs, large inputs, invalid inputs, concurrency.
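
The bottleneck and queueing points can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions: a single FIFO server, one request arriving every 10 ms, 8 ms of service time, and a single hypothetical 500 ms stall (for example a long pause in one request). All names and numbers are invented for illustration.

```javascript
// Nearest-rank percentile (illustrative helper, not a library API).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Simulates one FIFO server and returns each request's latency
// (queueing delay plus service time) in milliseconds.
function simulateQueue({ requests, arrivalGapMs, serviceMs, stallIndex, stallMs }) {
  const latencies = [];
  let serverFreeAt = 0;
  for (let i = 0; i < requests; i++) {
    const arrival = i * arrivalGapMs;
    // A request waits if the server is still busy with earlier work.
    const start = Math.max(arrival, serverFreeAt);
    const service = i === stallIndex ? stallMs : serviceMs;
    serverFreeAt = start + service;
    latencies.push(serverFreeAt - arrival);
  }
  return latencies;
}

// Hypothetical workload: 10,000 requests, one of which stalls for 500 ms.
const queueLatencies = simulateQueue({
  requests: 10000,
  arrivalGapMs: 10,
  serviceMs: 8,
  stallIndex: 5000,
  stallMs: 500,
});

const queueStats = {
  mean: queueLatencies.reduce((sum, x) => sum + x, 0) / queueLatencies.length,
  p50: percentile(queueLatencies, 50),
  p99: percentile(queueLatencies, 99),
  max: Math.max(...queueLatencies),
};

console.log(queueStats);
```

Under these assumptions the single stall delays the 245 requests queued behind it, because the server only recovers 2 ms of backlog per request. The mean rises from 8 ms to about 14 ms and the median stays at 8 ms, while p99 jumps to 300 ms, which is the kind of signal the average smooths away.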

Examples

A tiny example (an explanation template):

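// Example: discuss trade-offs for "why-do-teams-watch-p95/p99-latency,-not-just-ave"
function explain() {
  // Start from the core idea:
  // averages hide tail latency, because a few very slow requests can be
  // invisible in the mean but painful for users.
  return "Averages hide tail latency; p95/p99 show how the slowest 5%/1% of requests behave.";
}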

Common pitfalls

  • Too generic: no concrete trade-offs or examples.
  • Mixing average-case and worst-case (e.g., complexity).
  • Ignoring constraints: memory, concurrency, network/disk costs.

Interview follow-ups

  • When would you choose an alternative and why?
  • What production issues show up and how do you diagnose them?
  • How would you test edge cases?

Related questions

Architecture
Cache stampede (thundering herd): what is it and how do you mitigate it?
#architecture #caching #cache-stampede
Architecture
What makes a good alert and how do you avoid alert fatigue?
#alerting #runbook #observability
Architecture
Load Balancing Strategies?
#load-balancing #performance #scalability
Operating Systems
Explain virtual memory and paging.
#virtual-memory #paging #performance
Operating Systems
What is context switching and why is it expensive?
#context-switch #scheduler #performance
Observability
How do you investigate a latency regression in production?
#latency #incident #tracing