A brief survey of the theory of waiting in line, as it applies to computers, software, and systems
Given: a bank with one teller, customers taking 10 minutes on average to serve and arriving at a rate of 5.8/hour...
What is the expected wait time? How about with two tellers?
Ref: What happens when you add a new teller? (Cook 2008)
Arrival rate (λ) and service rate (μ) summarize distributions — interarrival and service times are random!
"M" in M/M/1 is for Markovian (Poisson and Exponential)
Given: customers taking 10 minutes on average to serve and arriving at a rate of 5.8/hour...
Skipping a whole bunch of math, aka the "rest of the owl"...
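The skipped math can be checked with a short calculation, assuming an M/M/c model (Poisson arrivals, exponential service) and the talk's numbers: λ = 5.8/hour, μ = 6/hour (10-minute average service). The function below is the standard Erlang C formula for mean time in queue.

```python
# Back-of-the-envelope check of the teller example, assuming M/M/c.
from math import factorial

def erlang_c_wait(lam, mu, c):
    """Mean time in queue (Wq) for an M/M/c system."""
    a = lam / mu                  # offered load
    rho = a / c                   # per-server utilization (must be < 1)
    erlang_term = (a**c / factorial(c)) / (1 - rho)
    # Erlang C: probability an arriving customer has to wait at all
    p_wait = erlang_term / (sum(a**k / factorial(k) for k in range(c)) + erlang_term)
    return p_wait / (c * mu - lam)

lam, mu = 5.8, 6.0                # per hour
print(f"1 teller:  Wq ~ {erlang_c_wait(lam, mu, 1):.1f} hours")
print(f"2 tellers: Wq ~ {erlang_c_wait(lam, mu, 2) * 60:.1f} minutes")
```

With these numbers one teller gives roughly 5 hours of expected queueing and two tellers roughly 3 minutes — the dramatic drop is Cook's punchline.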
Which is better for response time? A single fast server or slower, parallel servers?
Ref: Controlling Queue Delay (Nichols 2012)
Ref: Meet Bandaid, the Dropbox service proxy (Dropbox 2018)
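The fast-vs-parallel question can be sketched under the same Markovian assumptions, holding total capacity fixed: one server at rate c·μ versus c servers at rate μ each (numbers reuse the talk's λ = 5.8/hour, μ = 6/hour, c = 2).

```python
# Sketch: mean response time W = Wq + service time for M/M/c,
# comparing one double-speed server to two normal-speed servers.
from math import factorial

def mmc_response(lam, mu, c):
    """Mean response time (queueing + service) for an M/M/c queue."""
    a = lam / mu
    rho = a / c
    term = (a**c / factorial(c)) / (1 - rho)
    p_wait = term / (sum(a**k / factorial(k) for k in range(c)) + term)
    return p_wait / (c * mu - lam) + 1 / mu

lam, mu, c = 5.8, 6.0, 2
fast = mmc_response(lam, c * mu, 1)    # one server, twice as fast
slow = mmc_response(lam, mu, c)        # two servers at normal speed
print(f"one 2x server:  W ~ {fast * 60:.1f} min")
print(f"two 1x servers: W ~ {slow * 60:.1f} min")
```

At this load the single fast server wins on mean response time; the tradeoff shifts once variability, utilization, and fault tolerance enter the picture.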
Offline, batch systems: optimized for throughput
Online, live systems: optimized for latency; queueing is critical to load-balancing
Ref: Kafka: a Distributed Messaging System for Log Processing (Kreps 2011)
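The throughput-vs-latency tradeoff can be illustrated with made-up numbers: batching amortizes a fixed per-flush overhead (syscall, RPC, disk write) across many messages, raising throughput while delaying individual messages.

```python
# Illustrative only -- OVERHEAD and PER_MSG are invented constants.
OVERHEAD = 0.005   # seconds of fixed cost per flush
PER_MSG  = 0.0001  # seconds of marginal cost per message

def stats(batch_size):
    t = OVERHEAD + PER_MSG * batch_size   # time to send one batch
    throughput = batch_size / t           # messages per second
    batch_latency = t                     # first message waits for the whole batch
    return throughput, batch_latency

for b in (1, 10, 100, 1000):
    tput, lat = stats(b)
    print(f"batch={b:4d}  throughput={tput:8.0f} msg/s  batch latency={lat * 1000:6.1f} ms")
```

Bigger batches buy throughput at the price of latency — the batch-system side of the tradeoff above.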
Queueing is central to systems performance:
Ref/Example: Don't Block the Event Loop (or the Worker Pool) (node.js docs)
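The event-loop hazard can be demonstrated in a few lines; here Python's asyncio stands in for Node's event loop (same single-threaded model). A heartbeat task ticks every 10 ms; a blocking call in a sibling coroutine stalls it, while an awaited sleep does not.

```python
# Sketch: blocking the event loop stalls every other task on it.
import asyncio
import time

async def heartbeat(ticks):
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)

async def blocking_task():
    time.sleep(0.2)            # BAD: synchronous sleep blocks the whole loop

async def polite_task():
    await asyncio.sleep(0.2)   # GOOD: yields, so other tasks keep running

async def measure(make_task):
    ticks = []
    await asyncio.gather(heartbeat(ticks), make_task())
    gaps = [b - a for a, b in zip(ticks, ticks[1:])]
    return max(gaps)           # worst observed heartbeat stall

worst_blocking = asyncio.run(measure(blocking_task))
worst_polite = asyncio.run(measure(polite_task))
print(f"worst stall with blocking call: {worst_blocking:.3f} s")
print(f"worst stall with awaited sleep: {worst_polite:.3f} s")
```

The blocking version stalls the heartbeat for the full 200 ms; in Node the same mistake freezes every pending request on the loop.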
If your load test is a closed-loop load generator, it might be lying to you!
Ref: Open Versus Closed: A Cautionary Tale (Schroeder 2006)
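A toy deterministic model shows how a closed-loop generator hides overload (all parameters invented for illustration): the server can handle 5 req/s, the open-loop generator offers 10 req/s on a fixed schedule, while the closed-loop "user" only sends a request after the previous response arrives.

```python
# Toy comparison of open vs closed load generation against a
# single FIFO server with fixed 0.2 s service time.
SERVICE = 0.2          # seconds per request
N = 100                # requests per experiment

def open_loop(rate):
    """Arrivals follow a fixed schedule, regardless of responses."""
    free_at = 0.0      # time the server next becomes idle
    latencies = []
    for i in range(N):
        arrival = i / rate
        start = max(arrival, free_at)    # FIFO queueing delay
        free_at = start + SERVICE
        latencies.append(free_at - arrival)
    return latencies

def closed_loop():
    """One user, zero think time: the next request is only issued
    after the previous response, so arrivals slow with the server."""
    return [SERVICE] * N                 # never queues behind itself

open_lat = open_loop(rate=10)
closed_lat = closed_loop()
print(f"open-loop mean latency:   {sum(open_lat) / N:.2f} s")   # queue grows without bound
print(f"closed-loop mean latency: {sum(closed_lat) / N:.2f} s") # flat at the service time
```

The closed-loop generator reports a steady 0.2 s even though the system cannot sustain the intended offered load — the "lie" Schroeder's paper warns about.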
Ref: Guerilla Capacity Planning (Gunther 2007)
Ref: Task Assignment with Unknown Duration (Harchol-Balter 2002)
There's so much more. Where to learn more about queueing?