Resiliency in MSA
Designing for resiliency should never be ignored or given low priority. Architects should design for known failure modes, and when something does fail, it should fail fast and gracefully.
A few reasons for failure:
- In a distributed architecture, failure can occur whenever a resource is unavailable or exhausted, for any reason.
- 100% availability of hardware, infrastructure, and communication channels cannot be assumed.
- An external system or service may not respond.
- In a microservices architecture, some services may not respond in time, which can result in cascading failure.
- Communication with the database server may take a long time.
Patterns for Resiliency
- Timeout Pattern
With the Timeout pattern, a client service or application sets a maximum waiting time (a timeout) when connecting to a network resource, so that a slow or unresponsive resource cannot block the caller indefinitely.
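As a rough illustration of the Timeout pattern, the sketch below runs a call in a worker thread and stops waiting after a fixed time. The helper name call_with_timeout is hypothetical and chosen only for this example. It assumes the operation is a plain Python callable.

```python
import concurrent.futures


def call_with_timeout(fn, timeout_seconds, *args, **kwargs):
    """Run fn in a worker thread and stop waiting after timeout_seconds.

    Hypothetical helper for illustration. Raises
    concurrent.futures.TimeoutError if the result is not ready in time.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    finally:
        # Do not block on the worker: the caller has already given up.
        # Note that the worker thread itself is not cancelled.
        executor.shutdown(wait=False)
```

Because the worker is not cancelled, this only bounds how long the caller waits. For real network calls it is usually better to use the client library's own timeout setting (for example, the timeout argument of socket.create_connection), which also releases the underlying connection.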
- Retry Pattern
The Retry pattern applies when an operation experiences transient faults, so that a repeated attempt can reasonably be expected to succeed.
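A minimal sketch of the Retry pattern with exponential backoff follows. TransientError and retry are hypothetical names used only for this example. In a real service the retryable exception types would come from the client library in use, and the attempt count and delays would be tuned to the dependency.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for faults that are worth retrying."""


def retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation until it succeeds or max_attempts is reached.

    The delay doubles after every failed attempt (exponential backoff).
    Only TransientError is retried; other exceptions propagate at once.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep function is injectable so that the backoff can be checked in tests without waiting. Retries are only safe for operations that are idempotent or otherwise tolerate being executed more than once.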
- Circuit Breaker Pattern
The Circuit Breaker pattern lets an application stop invoking an operation that is likely to fail, so that callers fail fast instead of waiting on a dependency that is known to be unhealthy.
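The Circuit Breaker idea can be sketched as a small state machine. This is a simplified, single-threaded sketch, not a production implementation. The class names, the failure threshold, and the reset timeout are assumptions made for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset timeout elapsed: half-open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if (self.opened_at is not None
                    or self.failures >= self.failure_threshold):
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, calls are rejected immediately without touching the failing dependency, which gives it time to recover and keeps callers from piling up behind it.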
- Bulkhead Pattern
The Bulkhead pattern isolates parts of a system from one another so that a fault in one part does not spread and bring down the whole system.
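One common way to build a bulkhead is to cap the number of concurrent calls allowed into each dependency. The sketch below does this with a semaphore. Bulkhead and BulkheadFullError are hypothetical names for this example, and the limit would be chosen per dependency.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when the compartment has no free slot."""


class Bulkhead:
    def __init__(self, max_concurrent):
        # Each dependency gets its own Bulkhead, so one slow dependency
        # can use up only its own slots, not the whole worker pool.
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, operation):
        # Reject immediately instead of queueing behind a saturated dependency.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot in this compartment")
        try:
            return operation()
        finally:
            self._slots.release()
```

Giving each downstream service a separate Bulkhead instance means that a slow service can exhaust only its own slots, while calls to other services continue to be served.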
- Compensating Transaction Pattern
In a distributed transaction scenario, the Compensating Transaction pattern helps achieve eventual consistency by providing a way to undo the work already performed by a series of steps in a saga.
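The undo behaviour described above can be sketched as a list of steps, each paired with its compensation. The function name run_saga and the representation of steps as (action, compensation) pairs are assumptions made for this example. Real saga implementations usually also persist progress and handle failures of the compensations themselves.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action fails, the compensations of all previously completed
    steps are run in reverse order, and the original error is re-raised.
    Hypothetical helper for illustration only.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()  # undo the most recent completed step first
            raise
        completed.append(compensation)
```

For example, if the steps are "reserve stock", "book shipment", and "charge payment", and the payment step fails, the shipment is cancelled first and the stock is released afterwards. Compensations should be idempotent so that they can be retried safely.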