Microservices

A recruitment and knowledge question base: filter, search, and test your knowledge.

Topics
medium · architecture · microservices · monolith · +1

Answer

A monolith is one deployable application with shared runtime and usually one database, simpler to build and debug but harder to scale parts independently. Microservices split the system into independently deployable services that can scale and evolve separately, but add distributed‑system complexity and operational overhead.

medium · microservices · tradeoffs · scalability · +1

Answer

Benefits include independent deployment and scaling, team autonomy, and technology flexibility. Trade‑offs include distributed‑system complexity, network latency, harder debugging/observability, and eventual consistency challenges.

Answer

Microservices communicate synchronously via REST/gRPC where the caller waits for a response, or asynchronously via messaging/events where services publish and consume messages. Sync is simple but tightly couples services and propagates failures; async improves resilience but adds eventual consistency and complexity.
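
A minimal sketch of the two styles, with an in-memory `queue.Queue` standing in for a message broker and a dictionary lookup standing in for an HTTP/gRPC call (both are illustrative stand-ins, not a real transport):

```python
import queue

# --- Synchronous style: the caller blocks until it gets a result ---
def get_price(product_id: str) -> float:
    # Stands in for an HTTP/gRPC call; a failure here propagates to the caller.
    return {"sku-1": 9.99}[product_id]

# --- Asynchronous style: the producer publishes and moves on ---
events = queue.Queue()  # stands in for a broker topic/queue

def publish_order_placed(order_id: str) -> None:
    events.put({"type": "OrderPlaced", "order_id": order_id})

def consume_one() -> dict:
    # A separate service would subscribe/poll; here we just pop one event.
    return events.get_nowait()

price = get_price("sku-1")     # caller waits for the answer
publish_order_placed("o-42")   # caller does not wait for processing
event = consume_one()          # happens later, possibly in another process
```

Note that in the async path the producer has no idea when (or whether) the consumer has processed the event yet, which is exactly the eventual-consistency trade-off described above.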

medium · service-discovery · registry · kubernetes · +1

Answer

Service discovery is a mechanism for locating service instances dynamically (via a registry like Eureka/Consul or Kubernetes DNS). It’s needed because instances scale up/down and change IPs, so clients must discover healthy endpoints at runtime.
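
A toy registry sketch under the assumption of heartbeat-based health (the `register`/`healthy_instances` functions are illustrative; real systems delegate this to Consul, Eureka, or Kubernetes DNS):

```python
import time

# service name -> {instance address: last heartbeat timestamp}
registry = {}

def register(service: str, address: str, now=time.time) -> None:
    # Instances call this periodically as a heartbeat.
    registry.setdefault(service, {})[address] = now()

def healthy_instances(service: str, ttl: float = 10.0, now=time.time) -> list:
    # Instances that stopped heartbeating within `ttl` seconds are dropped,
    # so clients never receive stale IPs of scaled-down instances.
    beats = registry.get(service, {})
    return [addr for addr, seen in beats.items() if now() - seen < ttl]
```

Clients resolve `healthy_instances("orders")` at call time instead of hardcoding addresses, which is the core idea behind any registry.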

Answer

Consistency across microservices is usually eventual. Sagas coordinate a series of local transactions with compensating actions on failure. The outbox pattern stores events in the same DB transaction as state changes, then publishes them reliably to a broker.

easy · microservices · architecture · bounded-context

Answer

A microservice is a small, independently deployable service that owns a business capability and its data. The goal is independent change and scaling, but the trade-off is distributed complexity.

easy · api-gateway · routing · security

Answer

It’s a single entry point in front of services: routing, auth, rate limiting, TLS termination, and request aggregation. It simplifies clients, but can become a bottleneck or a single point of failure if too much logic accumulates in it.

Answer

In dynamic environments (autoscaling), service instances come and go. Service discovery lets clients find healthy instances (via registry/DNS) without hardcoding IPs.

Answer

Sync calls (HTTP/gRPC) are simpler and give immediate response, but create tight coupling and can cascade failures. Async messaging improves decoupling and resilience, but adds eventual consistency and operational complexity (queues, retries, ordering).

medium · circuit-breaker · resilience · timeouts

Answer

It stops calling a failing dependency for a while (open state) and fails fast, protecting your service and giving the dependency time to recover. It reduces cascading failures and improves overall stability.
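
A minimal sketch of the closed/open/half-open logic (the `CircuitBreaker` class and its thresholds are illustrative, not a specific library's API):

```python
import time

class CircuitBreaker:
    """Opens after N consecutive failures, fails fast while open,
    and allows one trial call after a cool-down period (half-open)."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

While open, callers get an immediate error instead of tying up threads on a dependency that is already down.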

Answer

A trace represents one request across services; spans are timed operations inside it. Correlation/trace IDs let you connect logs, metrics, and spans across boundaries, making debugging production incidents much faster.

Answer

It writes an event/message to an “outbox” table in the same DB transaction as the business change, then publishes it asynchronously. This avoids losing events when the DB commit succeeds but publishing fails.
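
A sketch of the pattern using SQLite for brevity (table names and the `relay_outbox` job are illustrative; production setups use the service's real database plus a relay or CDC process):

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute("""CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT,
    published INTEGER DEFAULT 0)""")

def place_order(order_id: str) -> None:
    with db:  # one transaction: both rows commit atomically, or neither
        db.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"type": "OrderPlaced", "order_id": order_id}),))

def relay_outbox(publish) -> None:
    # A background relay reads unpublished rows and pushes them to the broker.
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()
```

If the broker publish fails, the outbox row stays unpublished and the relay retries later, so a committed state change can never silently lose its event (at the cost of possible duplicate publishes, which is why consumers should be idempotent).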

Answer

Because messages can be delivered more than once (retries, redeliveries). Idempotent consumers handle duplicates safely (e.g., by dedup keys or upserts), preventing double side-effects.

hard · contracts · versioning · backward-compatibility

Answer

Prefer backward-compatible changes (add optional fields, don’t remove/rename), version when needed, and validate contracts with consumer-driven tests. Deploy in an order that keeps old and new versions compatible during rollout.

Answer

Use timeouts + circuit breakers, and keep retries bounded with jitter/backoff. Also consider bulkheads (limit concurrency per dependency) to prevent one failure from exhausting all threads/connections.

easy · timeouts · resilience · cascading-failures

Answer

Without timeouts, requests can hang and consume threads/connections, causing cascading failures. Timeouts let you fail fast, recover, and apply retries or fallbacks safely.

medium · backpressure · queues · streams · +1

Answer

Backpressure is a way to signal “slow down” when producers generate data faster than consumers can handle. It shows up in streams, messaging systems, and APIs—without it you get growing queues, memory pressure, and timeouts.
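
The simplest form of backpressure is a bounded buffer that refuses new work when full, sketched here with `queue.Queue` (the `produce` helper is illustrative):

```python
import queue

buffer = queue.Queue(maxsize=2)  # the bounded buffer IS the backpressure point

def produce(item) -> bool:
    try:
        buffer.put(item, block=False)  # reject instead of growing without bound
        return True
    except queue.Full:
        return False  # signal upstream to slow down, retry later, or shed load

accepted = [produce(i) for i in range(4)]  # only the first two fit
```

With an unbounded queue the producer would never learn it is too fast; the queue would just grow until memory pressure and timeouts appear downstream.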

medium · schema-registry · events · compatibility · +1

Answer

A schema registry stores and versions event schemas (Avro, Protobuf, JSON Schema) and enforces compatibility rules. It helps producers and consumers evolve safely without breaking each other.

Answer

Bulkhead isolates resources so one failing dependency can’t exhaust everything. In practice: separate thread pools, connection pools, or concurrency limits per dependency, so the rest of the system stays responsive.
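
A sketch of a concurrency-limit bulkhead built on a semaphore (the `Bulkhead` class is illustrative; thread-pool-per-dependency is the other common form):

```python
import threading

class Bulkhead:
    """Caps concurrent calls to one dependency so a slow dependency
    cannot hog every worker thread/connection in the service."""

    def __init__(self, max_concurrent: int):
        self.slots = threading.Semaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        if not self.slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full: rejecting call")
        try:
            return fn(*args, **kwargs)
        finally:
            self.slots.release()

payments = Bulkhead(max_concurrent=2)  # small pool for a critical dependency
reports = Bulkhead(max_concurrent=5)   # separate pool for a slow, non-critical one
```

If `reports` stalls, only its own slots fill up; `payments` calls keep flowing, which is the isolation the pattern is named after.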

Answer

You need a distributed lock when multiple instances must ensure only one performs a critical section (e.g., one scheduler job). Risks: lock leaks, split-brain, clock/network issues, and added latency; prefer idempotency and DB constraints when possible.

easy · shared-database · coupling · data-ownership

Answer

It couples services through shared schema and transactions: one change can break others, deployments must be coordinated, and ownership becomes unclear. It also makes scaling and security boundaries harder.

Answer

Through APIs (request/response) or events (publish/subscribe). A service owns its data and exposes it via stable contracts; other services can build read models or caches from events when they need local reads.

Answer

A retry storm is when many clients retry at once, amplifying load on a struggling dependency and making recovery harder. Prevent it with exponential backoff + jitter, bounded retries, circuit breakers, and rate limiting.
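
Exponential backoff with "full jitter" can be sketched in a few lines (the function name and defaults are illustrative; the `rng` parameter is injected only so the behaviour is testable):

```python
import random

def backoff_delays(base: float = 0.1, cap: float = 10.0,
                   attempts: int = 5, rng=random.random):
    """Full-jitter backoff: delay n is uniform in [0, min(cap, base * 2**n)].
    Randomizing within the window spreads out clients that failed at the
    same moment, instead of letting them all retry in lockstep."""
    return [rng() * min(cap, base * (2 ** n)) for n in range(attempts)]
```

Bounding `attempts` and capping the window matter as much as the jitter: unbounded retries are themselves a source of retry storms.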

Answer

Assume duplicates and make the handler idempotent. Common patterns: store a processed message ID with a unique constraint, use upserts, and keep changes + dedup in one transaction (inbox/dedup table).
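
A sketch of the inbox/dedup-table variant, using SQLite and a unique constraint (table and function names are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER)")
db.execute("CREATE TABLE inbox (message_id TEXT PRIMARY KEY)")  # dedup table
db.execute("INSERT INTO balances VALUES ('acc-1', 0)")
db.commit()

def handle_deposit(message_id: str, account: str, amount: int) -> bool:
    """Returns False for a duplicate delivery. The dedup record and the
    state change commit in ONE transaction, so a crash between them
    cannot leave the message half-processed."""
    try:
        with db:
            db.execute("INSERT INTO inbox VALUES (?)", (message_id,))  # dup -> error
            db.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (amount, account))
        return True
    except sqlite3.IntegrityError:
        return False  # already processed: safe to ack and drop
```

Redelivering the same `message_id` becomes a no-op, so at-least-once delivery from the broker turns into effectively-once processing.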

Answer

Orchestration uses a central coordinator that tells services what to do next. Choreography is decentralized: services react to events and trigger the next step. Orchestration is easier to reason about; choreography reduces central coupling but can be harder to trace.

Answer

In consumer-driven contract testing, the consumer defines expectations for the API (request/response shapes), and the provider verifies it still satisfies them. It catches breaking changes early and helps teams deploy independently with more confidence.

Answer

mTLS encrypts traffic and authenticates both sides (service identity), which helps prevent impersonation and sniffing. It does NOT solve authorization by itself (what a service is allowed to do), and it doesn’t replace input validation or business-level security rules.

Answer

A compensating action is a business operation that “undoes” a previous step (e.g., cancel a reservation after payment fails). It’s tricky because it’s not a real rollback: it can fail, it may not perfectly restore the previous state, and it must be idempotent and well-observed.
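
A sketch of a saga runner that applies compensations in reverse order on failure (the `run_saga` shape is illustrative, not a specific framework's API):

```python
def run_saga(steps) -> bool:
    """Each step is a (action, compensate) pair. If a later action fails,
    the compensations of already-completed steps run in reverse order.
    This is NOT a rollback: compensations are ordinary business operations
    that can themselves fail and must be idempotent and monitored."""
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for comp in reversed(done):
                comp()  # real systems retry/alert if a compensation fails too
            return False
    return True
```

For example, if "charge payment" fails after "reserve inventory" succeeded, the runner invokes "cancel reservation"; the reservation briefly existed and was observable, which is why this differs from a database rollback.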

Answer

With many instances, an in-memory counter limits only one instance, so overall traffic can exceed the limit. You usually need a shared store (e.g., Redis) or enforce limits at the gateway. Hard parts: correctness under concurrency, time windows, clock drift, and avoiding hot keys.
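
A fixed-window counter against a shared store can be sketched like this; the `SharedStore` class stands in for Redis (where `INCR` on a windowed key plus an expiry would do this atomically server-side), and the injected `now` is only for testability:

```python
import time

class SharedStore:
    """Stand-in for a shared store such as Redis. Unlike an in-memory
    counter, ALL service instances would talk to this one store."""
    def __init__(self):
        self.counters = {}

    def incr(self, key: str) -> int:
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key]

store = SharedStore()

def allow(client_id: str, limit: int = 100, window: int = 60,
          now=time.time) -> bool:
    # Fixed window: all requests in the same `window`-second bucket share a key.
    window_key = f"{client_id}:{int(now()) // window}"
    return store.incr(window_key) <= limit
```

The dict-based store here is not safe under real concurrency; that is exactly the "correctness under concurrency" hard part mentioned above, and why production limiters lean on atomic server-side operations.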

Answer

Benefits: higher availability and lower latency for global users. Pain points: data replication, consistency/conflict resolution, higher operational complexity, and cross-region latency/cost. Many teams start with active-passive (failover) before going active-active.

Answer

A BFF is a backend tailored to one frontend (web, mobile, etc.). It helps when different clients need different data shapes, when you want to reduce chatty calls from the UI, or when you need a safe place to aggregate multiple service calls into one API for that client.

Answer

REST (JSON over HTTP) is easy to debug and widely compatible. gRPC uses HTTP/2 + Protobuf, gives strong contracts, good performance, and supports streaming, but is harder to inspect without tooling and is less browser-friendly. Pick based on interoperability, performance needs, and team/tooling maturity.

Answer

It increases latency (you wait for multiple calls), increases failure probability (one dependency failing breaks the whole request), and can amplify load. Reduce it by aggregating in a BFF/API Gateway, caching, using async/event-driven flows, and by setting timeouts + bulkheads so one slow dependency doesn’t stall everything.

Answer

Kafka guarantees ordering only within a single partition. To keep events for an entity in order, publish them with the same partition key (e.g., `orderId`) so they land in the same partition. There is no global ordering across partitions, and consumers should still handle duplicates/retries.
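
The key-to-partition idea can be sketched as hashing the key modulo the partition count (this uses CRC32 for illustration; Kafka's default partitioner actually uses murmur2, but the property that matters, same key to same partition, is the same):

```python
import zlib

NUM_PARTITIONS = 6  # example topic size

def partition_for(key: str) -> int:
    # Deterministic hash: the same key always maps to the same partition,
    # so all events for one entity (e.g. one orderId) stay ordered.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS
```

Note the corollary: changing the partition count remaps keys to different partitions, which breaks per-key ordering across the change.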

Answer

`traceparent` is the W3C Trace Context header that carries the trace ID and parent span info across hops. If every service forwards it to downstream calls, you can connect logs/spans into one end-to-end trace and debug latency/failures across many services.

```http
GET /api/orders/123 HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```