Throttling Pattern

The Throttling pattern controls the consumption of resources by an instance of an application, an individual tenant, or an entire service.

A throttling limit can be set at the application level, the resource level, or the API level, depending on the requirement. When a request is throttled, the service can return HTTP status code 429 (“Too Many Requests”) or 503 (“Service Unavailable”).
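
As a rough illustration of an API-level limit per tenant, here is a minimal sketch using a token bucket, which is one common way to implement throttling and not the only one. The names TokenBucket, handle_request, and buckets, as well as the limits (a burst of 5 requests, refilled at 1 request per second), are assumptions made for this example and do not come from any particular framework.

```python
import math
import time


class TokenBucket:
    """Hypothetical helper: allows bursts of up to `capacity` requests,
    refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity, refill_per_sec, now=None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, now=None):
        """Return (allowed, seconds_until_next_token)."""
        now = time.monotonic() if now is None else now
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0.0
        return False, (1.0 - self.tokens) / self.refill_per_sec


# One bucket per tenant (in-memory only; an assumption for this sketch).
buckets = {}


def handle_request(tenant_id, now=None, capacity=5, refill_per_sec=1.0):
    """Return (status_code, headers) for one incoming request."""
    bucket = buckets.get(tenant_id)
    if bucket is None:
        bucket = buckets[tenant_id] = TokenBucket(capacity, refill_per_sec, now)
    allowed, retry_after = bucket.try_acquire(now)
    if allowed:
        return 200, {}
    # Throttled: tell the client when it is reasonable to retry.
    return 429, {"Retry-After": str(math.ceil(retry_after))}
```

The optional now argument exists only so the clock can be controlled in tests; in normal use the function reads time.monotonic(). In a service running on several instances, the counters would have to live in a shared store instead of a local dictionary.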


Throttling is typically used:

  • To maintain the SLA.
  • To help mitigate DDoS attacks.
  • To allow the system to continue to function when an increase in demand places an extreme load on resources.
  • As an alternative to autoscaling: applications may use resources only up to a limit and are throttled once that limit is reached.
  • Together with autoscaling, to keep the system responsive and highly available.