Calculate system throughput and capacity planning with saturation analysis. Part of the DevTools Surf developer suite.
Use Cases
Calculate maximum sustainable request rate for a service under queue saturation constraints.
Determine the number of parallel workers needed to process a batch job within a time budget.
Model the impact of adding capacity (more servers, more threads) on system throughput.
Calculate end-to-end throughput for a pipeline of sequential processing stages.
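For the pipeline use case above, end-to-end throughput of sequential stages is capped by the slowest stage. A minimal sketch, using hypothetical per-stage capacities:

```python
def pipeline_throughput(stage_rates):
    """End-to-end throughput of a sequential pipeline (requests/second).

    Each stage must process every request, so the slowest stage
    is the bottleneck and sets the pipeline's sustainable rate.
    """
    return min(stage_rates)

# Hypothetical stage capacities in req/s: parse, transform, write.
rates = [500.0, 220.0, 350.0]
print(pipeline_throughput(rates))  # → 220.0
```

Adding capacity to any stage other than the bottleneck (here, the 220 req/s transform stage) leaves end-to-end throughput unchanged.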
Tips
Calculate throughput under saturation conditions, not just at normal load — most systems experience nonlinear throughput degradation above 70–80% utilization due to queuing effects.
Use Little's Law: L = λW (average number of requests in the system = arrival rate × average time each request spends in the system) to connect throughput measurements to latency observations.
Measure throughput at the bottleneck resource, not the system entry point — they differ when upstream components buffer or shed load.
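The Little's Law tip above can be sketched directly: given any two of arrival rate, time in system, and in-flight count, the third follows. The numbers below are hypothetical:

```python
def littles_law_L(arrival_rate, avg_time_in_system):
    # L = λW: average number of requests in flight.
    return arrival_rate * avg_time_in_system

def littles_law_W(avg_in_flight, arrival_rate):
    # W = L / λ: average time a request spends in the system.
    return avg_in_flight / arrival_rate

# Hypothetical: 200 req/s at 50 ms average latency → 10 requests in flight.
print(littles_law_L(200.0, 0.050))  # → 10.0
```

This is useful for sanity-checking metrics: if your dashboard shows 200 req/s and 50 ms latency but 40 concurrent requests, one of the three measurements is wrong.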
Fun Facts
Little's Law was proven by John D.C. Little in 1961 and is one of the most powerful results in queuing theory — it applies to any stable system regardless of distributions, making it useful across hardware, software, and service systems.
Amdahl's Law (1967) sets a hard upper bound on throughput gains from parallelization: if 5% of a program is sequential, maximum speedup is 20x no matter how many processors are added.
The Universal Scalability Law (Neil Gunther, 1993) extended Amdahl's model to account for coherency penalties in concurrent systems, predicting throughput degradation beyond an optimal thread count — a pattern observed in virtually every highly concurrent system.
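The two laws above can be compared numerically. A sketch of Amdahl's speedup bound S(n) = 1 / (s + (1 − s)/n) and Gunther's USL C(n) = n / (1 + σ(n − 1) + κn(n − 1)); the contention (σ) and coherency (κ) coefficients below are illustrative, not measured:

```python
def amdahl_speedup(n, serial_fraction):
    # Amdahl: S(n) = 1 / (s + (1 - s) / n); approaches 1/s as n grows.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

def usl_throughput(n, sigma, kappa):
    # USL: C(n) = n / (1 + sigma*(n - 1) + kappa*n*(n - 1)).
    # The kappa term makes throughput *decline* past an optimal n,
    # which plain Amdahl scaling never predicts.
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

# With 5% serial work, speedup is capped near 20x regardless of n:
print(round(amdahl_speedup(10**6, 0.05), 2))  # ≈ 20.0

# Hypothetical sigma=0.05, kappa=0.001: find the thread count where
# USL throughput peaks before coherency costs drag it back down.
peak_n = max(range(1, 257), key=lambda n: usl_throughput(n, 0.05, 0.001))
```

With these coefficients the USL optimum lands around √((1 − σ)/κ) ≈ 31 threads; beyond that, adding concurrency reduces throughput.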
FAQ
What is the difference between throughput and latency?
Throughput is the number of requests processed per unit time (requests/second). Latency is the time to process a single request (milliseconds). They are related but distinct: a slow individual request can coexist with high aggregate throughput if many requests are processed in parallel.
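The parallelism point above is Little's Law rearranged: with N requests in flight and latency W, sustainable throughput is λ = N / W. The concurrency and latency figures here are hypothetical:

```python
def max_throughput(concurrency, latency_seconds):
    # λ = N / W: aggregate throughput from in-flight count and per-request latency.
    return concurrency / latency_seconds

# Hypothetical: each request takes 100 ms, but 64 run concurrently,
# so the system still sustains 640 req/s in aggregate.
print(max_throughput(64, 0.100))  # → 640.0
```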
Why does throughput degrade above 70–80% utilization?
Queuing theory predicts that as utilization approaches 100%, queue lengths grow without bound (M/M/1 queue model: L = ρ/(1 − ρ)). At 80% utilization the average number of requests in the system is already 4, and it more than doubles to 9 at 90%. The resulting latency spikes reduce effective throughput.
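The nonlinear growth described above can be sketched with the M/M/1 formula L = ρ/(1 − ρ):

```python
def mm1_avg_in_system(utilization):
    # M/M/1: average number of requests in the system, L = rho / (1 - rho).
    # Grows without bound as utilization approaches 1.
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization)

for rho in (0.5, 0.8, 0.9, 0.95):
    print(f"{rho:.0%} utilization -> ~{mm1_avg_in_system(rho):.1f} in system")
# 50% → ~1, 80% → ~4, 90% → ~9, 95% → ~19
```

Note how each step toward saturation costs far more than the last: going from 50% to 80% utilization quadruples queue depth, and 95% is nearly 20x the 50% baseline.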