Circuit Breakers, Retries & Bulkheads with Resilience4j in Spring Boot
A practical guide to the four Resilience4j patterns I applied across 9 microservices in a production ecommerce platform, with real configuration you can use today.
Why resilience patterns matter in microservices
In a monolith, a failing database query throws an exception and you handle it. In a microservices architecture, a slow downstream service doesn't just fail: it hangs. Threads pile up waiting for a response that never comes. The service that made the call runs out of thread pool capacity. Now a second service is degraded because the first one is stuck. This cascading failure can take down your entire platform in minutes.
Resilience4j is a lightweight fault-tolerance library for Java (the 1.x line targets Java 8+; 2.x requires Java 17+). Four of its modules address this problem directly: Circuit Breaker, Retry, Rate Limiter, and Bulkhead. Here's how each works and how I configured them across the ecommerce microservices platform.
1. Circuit Breaker: stop calling a failing service
A circuit breaker wraps a remote call. It monitors the failure rate over a sliding window of calls. When the failure rate reaches a threshold, it opens: subsequent calls are rejected immediately, without attempting the remote call, and your fallback (if you defined one) supplies the response. After a configurable wait time, it enters the half-open state and allows a few trial calls through. If enough of those succeed, it closes and normal operation resumes; if the failure rate is still at or above the threshold, it opens again.
This prevents thread exhaustion and gives the failing downstream service time to recover.
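The closed, open, and half-open transitions described above can be sketched in plain Java. This is an illustration of the state machine only, not Resilience4j's implementation: the class name `CircuitBreakerSketch` and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and slow-call tracking is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Minimal count-based circuit breaker sketch (hypothetical, not the library's code). */
final class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;          // plays the role of slidingWindowSize
    private final int minimumCalls;        // plays the role of minimumNumberOfCalls
    private final int halfOpenCalls;       // plays the role of permittedNumberOfCallsInHalfOpenState
    private final double failureThreshold; // percent, plays the role of failureRateThreshold
    private final long openMillis;         // plays the role of waitDurationInOpenState

    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int halfOpenIssued;

    CircuitBreakerSketch(int windowSize, int minimumCalls, int halfOpenCalls,
                         double failureThreshold, long openMillis) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.halfOpenCalls = halfOpenCalls;
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    State state() { return state; }

    /** Returns true if a call may proceed at time nowMillis. */
    boolean tryAcquire(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN;   // wait time elapsed: allow trial calls
            window.clear();
            halfOpenIssued = 0;
        }
        if (state == State.OPEN) return false;
        if (state == State.HALF_OPEN) return halfOpenIssued++ < halfOpenCalls;
        return true;
    }

    /** Records the outcome of a permitted call and updates the state. */
    void record(boolean failed, long nowMillis) {
        boolean halfOpen = state == State.HALF_OPEN;
        window.addLast(failed);
        if (window.size() > (halfOpen ? halfOpenCalls : windowSize)) window.removeFirst();
        if (window.size() < (halfOpen ? halfOpenCalls : minimumCalls)) return; // too few calls to judge

        long failures = window.stream().filter(Boolean::booleanValue).count();
        double failureRate = 100.0 * failures / window.size();
        if (failureRate >= failureThreshold) {
            state = State.OPEN;        // trip (or re-trip) the breaker
            openedAt = nowMillis;
            window.clear();
        } else if (halfOpen) {
            state = State.CLOSED;      // trial calls were healthy
            window.clear();
        }
    }
}
```

With a window of 10, a minimum of 5 calls, and a 50% threshold, three failures among the first five calls are enough to open the sketch breaker, which is why `minimumNumberOfCalls` deserves as much attention as the threshold itself.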
Configuration (application.yml):
resilience4j:
  circuitbreaker:
    instances:
      paymentService:
        registerHealthIndicator: true
        slidingWindowSize: 10                      # evaluate last 10 calls
        minimumNumberOfCalls: 5                    # wait for 5 calls before evaluating
        permittedNumberOfCallsInHalfOpenState: 3
        waitDurationInOpenState: 10s               # stay open for 10 seconds
        failureRateThreshold: 50                   # open if 50% of calls fail
        slowCallRateThreshold: 80                  # also open if 80% are slow
        slowCallDurationThreshold: 3s              # a "slow" call takes > 3s
Applying it in code:
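import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;

@Service
public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);
    private final PaymentClient paymentClient; // type name assumed; the client is not shown here

    public OrderService(PaymentClient paymentClient) {
        this.paymentClient = paymentClient;
    }

    @CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
    public PaymentResponse initiatePayment(PaymentRequest request) {
        return paymentClient.processPayment(request);
    }

    // Called automatically when the circuit is open or the call throws.
    // Same parameters as the protected method, plus the exception.
    public PaymentResponse paymentFallback(PaymentRequest request, Exception ex) {
        log.warn("Payment service unavailable, queuing for retry: {}", ex.getMessage());
        // Queue to outbox for async retry, return pending status
        return PaymentResponse.pending(request.getOrderId());
    }
}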
2. Retry: handle transient failures automatically
Not all failures are worth a fallback. A network blip, a momentary database lock, or a brief Kafka broker restart are transient: retrying after a short wait usually succeeds. The Retry pattern wraps a call and automatically retries it a configured number of times with exponential backoff before giving up.
The key insight: retry only on exceptions that are genuinely recoverable. Don't retry a 400 Bad Request: retrying it a hundred times won't fix a malformed payload.
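To make the mechanics concrete, here is a rough plain-Java sketch of a retry loop with exponential backoff and a "is this exception worth retrying?" check. It is not Resilience4j code: `RetrySketch`, `backoffSchedule`, and `callWithRetry` are hypothetical names, and the sketch assumes that the attempt count includes the first call, so N attempts mean N - 1 waits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Retry-with-backoff sketch (hypothetical helper, not the library's implementation). */
final class RetrySketch {

    /** Waits between attempts: initial, initial * m, initial * m^2, ... (maxAttempts - 1 entries). */
    static List<Long> backoffSchedule(int maxAttempts, long initialWaitMillis, double multiplier) {
        List<Long> waits = new ArrayList<>();
        double wait = initialWaitMillis;
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            waits.add((long) wait);
            wait *= multiplier;
        }
        return waits;
    }

    /** Calls until success, a non-retryable exception, or maxAttempts is reached. */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialWaitMillis,
                               double multiplier, Predicate<Exception> retryable) throws Exception {
        List<Long> waits = backoffSchedule(maxAttempts, initialWaitMillis, multiplier);
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception ex) {
                // Give up when attempts are exhausted or the failure is not recoverable.
                if (attempt >= maxAttempts || !retryable.test(ex)) throw ex;
                Thread.sleep(waits.get(attempt - 1));
            }
        }
    }
}
```

In this sketch, `backoffSchedule(3, 500, 2.0)` yields waits of 500 ms and 1000 ms: three attempts leave room for only two pauses, which is worth remembering when you budget the worst-case latency of a retried call.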
resilience4j:
  retry:
    instances:
      inventoryService:
        maxAttempts: 3                     # includes the first call, so at most 2 retries
        waitDuration: 500ms
        enableExponentialBackoff: true
        exponentialBackoffMultiplier: 2    # waits 500ms, then 1000ms
        retryExceptions:
          - java.net.ConnectException
          - java.util.concurrent.TimeoutException
        ignoreExceptions:
          - com.example.exceptions.InvalidRequestException
@Retry(name = "inventoryService", fallbackMethod = "inventoryFallback")
@CircuitBreaker(name = "inventoryService")
public InventoryResponse checkStock(String productId) {
    return inventoryClient.getStock(productId);
}
// Note: the order of the annotations in source does not set the nesting.
// With Resilience4j's default aspect order, Retry wraps CircuitBreaker,
// so the circuit breaker records every individual attempt, not just the final outcome.
3. Rate Limiter: protect your service from being overwhelmed
Rate limiting is about protecting your service from too many inbound requests, not about being polite to downstream services. In the API Gateway of the ecommerce platform, rate limiting prevents a single client from flooding the system and degrading service for everyone else.
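The idea of "N permits per refresh period, reject when they run out" can be sketched in a few lines of plain Java. This is a simplified illustration, not how Resilience4j implements its limiter: `RateLimiterSketch` is a hypothetical class, the caller passes the current time in, and there is no waiting for the next period.

```java
/** Fixed-cycle rate limiter sketch (hypothetical, simplified; never blocks). */
final class RateLimiterSketch {
    private final int limitForPeriod;        // permits available in each cycle
    private final long refreshPeriodMillis;  // length of one cycle
    private long currentCycle = -1;
    private int used;

    RateLimiterSketch(int limitForPeriod, long refreshPeriodMillis) {
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodMillis = refreshPeriodMillis;
    }

    /** Returns true if a permit is available in the cycle containing nowMillis. */
    synchronized boolean tryAcquire(long nowMillis) {
        long cycle = nowMillis / refreshPeriodMillis;
        if (cycle != currentCycle) {   // a new cycle started: refill the permits
            currentCycle = cycle;
            used = 0;
        }
        if (used >= limitForPeriod) return false; // over the limit: reject immediately
        used++;
        return true;
    }
}
```

Rejecting immediately keeps request threads free under load; a limiter that makes callers wait for the next cycle trades that for fewer rejections, at the cost of holding threads.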
resilience4j:
  ratelimiter:
    instances:
      apiGateway:
        limitForPeriod: 100      # max 100 requests...
        limitRefreshPeriod: 1s   # ...per second
        timeoutDuration: 0       # don't wait: reject immediately if over limit
@RateLimiter(name = "apiGateway", fallbackMethod = "rateLimitFallback")
public ResponseEntity<?> handleRequest(HttpServletRequest request) {
    return processRequest(request);
}

public ResponseEntity<?> rateLimitFallback(HttpServletRequest request,
                                           RequestNotPermitted ex) {
    return ResponseEntity.status(429)
            .header("Retry-After", "1")
            .body("Too many requests. Please slow down.");
}
4. Bulkhead: isolate capacity between services
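Without bulkheads, all your outbound calls compete for the same pool of threads. If calls to the Payment Service hang (because payment is slow), they consume all available threads and starve calls to other services like Inventory or Shipping, even though those services are perfectly healthy. A bulkhead caps how much of that capacity each downstream dependency can take. The semaphore variant used in this section limits the number of concurrent calls per dependency; Resilience4j also offers a thread-pool variant (Bulkhead.Type.THREADPOOL) that gives each dependency its own bounded pool.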
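A semaphore-style bulkhead is small enough to sketch with `java.util.concurrent.Semaphore`. The class below is a hypothetical illustration rather than Resilience4j's implementation: `BulkheadSketch` and `execute` are made-up names, and the fallback is passed in as a plain `Supplier`.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Semaphore bulkhead sketch (hypothetical): caps concurrent calls to one dependency. */
final class BulkheadSketch {
    private final Semaphore permits;
    private final long maxWaitMillis;

    BulkheadSketch(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs the call if a slot frees up within maxWaitMillis, otherwise runs the fallback. */
    <T> T execute(Supplier<T> call, Supplier<T> fallback) {
        boolean acquired;
        try {
            // A timeout of 0 means "do not wait at all".
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            acquired = false;
        }
        if (!acquired) return fallback.get();   // bulkhead full: reject
        try {
            return call.get();
        } finally {
            permits.release();                  // always free the slot
        }
    }

    int availablePermits() { return permits.availablePermits(); }
}
```

Note that a semaphore bulkhead runs the call on the caller's thread, so it bounds how many threads a slow dependency can hold but does not move the work to a separate pool.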
resilience4j:
  bulkhead:
    instances:
      paymentService:
        maxConcurrentCalls: 10   # max 10 concurrent calls to payment
        maxWaitDuration: 100ms   # wait 100ms for a slot, then reject
      shippingService:
        maxConcurrentCalls: 15
        maxWaitDuration: 0ms     # reject immediately if no slot
@Bulkhead(name = "paymentService", type = Bulkhead.Type.SEMAPHORE,
          fallbackMethod = "bulkheadFallback")
@CircuitBreaker(name = "paymentService")
public PaymentResponse charge(ChargeRequest request) {
    return paymentClient.charge(request);
}

public PaymentResponse bulkheadFallback(ChargeRequest request,
                                        BulkheadFullException ex) {
    // Payment service is saturated: queue the request
    outboxService.queuePayment(request);
    return PaymentResponse.queued(request.getOrderId());
}
Stacking the patterns: the right order
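When you combine multiple Resilience4j annotations on the same method, the nesting order matters, but it is not determined by the order in which you write the annotations. Each Resilience4j aspect has its own configured order, and the documented default nesting is Retry ( CircuitBreaker ( RateLimiter ( TimeLimiter ( Bulkhead ( method ) ) ) ) ), with Retry outermost. To get a different nesting, set the aspect-order properties (for example resilience4j.retry.retryAspectOrder and resilience4j.circuitbreaker.circuitBreakerAspectOrder, where a higher value means higher priority) or compose the decorators programmatically.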
// All four patterns on one outbound service call.
// The nesting comes from the aspect order, not from the order of these annotations;
// with the defaults it is Retry( CircuitBreaker( RateLimiter( Bulkhead( charge ))))
@RateLimiter(name = "paymentService")     // caps calls per refresh period
@Bulkhead(name = "paymentService")        // caps concurrent calls
@CircuitBreaker(name = "paymentService")  // rejects calls while the circuit is open
@Retry(name = "paymentService")           // re-invokes on transient failure
public PaymentResponse charge(ChargeRequest request) {
    return paymentClient.charge(request);
}
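If you want the nesting to be explicit rather than implied, wrapping by hand makes it unambiguous. The sketch below uses plain `Supplier` wrappers as stand-ins for the real decorators, purely to show which layer runs first for a given nesting; `DecoratorOrderSketch`, `wrap`, and the trace list are hypothetical and nothing here calls Resilience4j.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Shows how nested wrappers execute: the outermost layer is entered first and exited last. */
final class DecoratorOrderSketch {

    /** Wraps a supplier in a named layer that logs when it is entered and exited. */
    static <T> Supplier<T> wrap(String name, List<String> trace, Supplier<T> inner) {
        return () -> {
            trace.add("enter " + name);
            try {
                return inner.get();
            } finally {
                trace.add("exit " + name);
            }
        };
    }

    /** Builds RateLimiter( Bulkhead( CircuitBreaker( Retry( call )))) and returns the trace. */
    static List<String> run() {
        List<String> trace = new ArrayList<>();
        Supplier<String> call = () -> {
            trace.add("remote call");
            return "ok";
        };
        Supplier<String> decorated =
                wrap("RateLimiter", trace,
                        wrap("Bulkhead", trace,
                                wrap("CircuitBreaker", trace,
                                        wrap("Retry", trace, call))));
        decorated.get();
        return trace;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Whatever nesting you choose, the layer you put outermost sees the call first and the result last, so a quick trace like this is a cheap way to check that the configured order matches your intent.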
Key takeaways
- Circuit Breaker: stops calling a downstream service when it's consistently failing, which prevents cascading failures.
- Retry: handles transient failures automatically; only retry exceptions that are actually recoverable.
- Rate Limiter: protects your service from inbound overload; return 429 immediately rather than queueing indefinitely.
- Bulkhead: caps concurrent calls per dependency, so one slow service can't starve calls to healthy ones.
- Stack all four (RateLimiter → Bulkhead → CircuitBreaker → Retry) for defence in depth, and configure the aspect order to match, because annotation order alone does not set the nesting.
Want to see the full configuration?
The complete ecommerce microservices platform with Resilience4j wired across all 9 services is on GitHub.