Problem Statement
In a Principal Engineer interview at Google, you're asked: "Explain how Go can run millions of goroutines on a few OS threads. What is the GMP model and how does work stealing improve performance?"
The GMP Model
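- G (Goroutine): User-space thread scheduled by the Go runtime; starts with a 2 KB stack that grows and shrinks dynamically
- M (Machine): OS thread; must hold a P to execute Go code
- P (Processor): Logical processor; holds a local run queue of runnable Gs, and there are GOMAXPROCS of them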
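
To make the "many Gs, few Ms" claim concrete, here is a minimal sketch that parks a batch of goroutines on a channel and compares the live goroutine count with the number of Ps. The helper name `spawnParked` and the batch size are illustrative choices, not anything from the runtime.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnParked (hypothetical helper) starts n goroutines that all block on a
// channel, records the goroutine count while they are parked, then releases
// them and waits for them to exit.
func spawnParked(n int) int {
	var started, done sync.WaitGroup
	release := make(chan struct{})
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			started.Done()
			<-release // the G is parked here; it does not occupy an OS thread
			done.Done()
		}()
	}
	started.Wait() // every goroutine has started and none can finish yet
	count := runtime.NumGoroutine()
	close(release)
	done.Wait()
	return count
}

func main() {
	fmt.Println("Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines alive:", spawnParked(10_000))
}
```

At roughly 2 KB of initial stack per goroutine, memory rather than thread count is the practical limit when scaling this toward millions.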
Visual Representation
Key Scheduler Events
| Event | What Happens |
|---|---|
| Goroutine blocks (channel, mutex, network I/O) | G is parked; M keeps its P and runs the next runnable G |
| Goroutine makes a blocking syscall | M blocks with G; the P is handed off to another M (idle or newly created) |
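| P run queue empty | M checks the global queue and the network poller, then steals work from other Ps |
| Goroutine runs too long (>10ms) | Preempted by the sysmon thread (asynchronously since Go 1.14), put back on the global run queue |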
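
The preemption row can be observed with a small experiment. The sketch below assumes Go 1.14 or later on a platform with signal-based preemption; on older releases the tight loop would monopolize the only P and the program would hang. `mainRegainsCPU` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// mainRegainsCPU (hypothetical helper) restricts the scheduler to one P,
// starts a goroutine that spins without any function calls, and reports
// whether the caller was scheduled again within the timeout.
func mainRegainsCPU(timeout time.Duration) bool {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	go func() {
		// Tight loop with no calls, so there are no cooperative
		// preemption points; only asynchronous preemption can interrupt it.
		for atomic.LoadInt32(&stop) == 0 {
		}
	}()

	start := time.Now()
	time.Sleep(50 * time.Millisecond) // gives the only P to the spinner
	elapsed := time.Since(start)
	atomic.StoreInt32(&stop, 1) // let the spinner exit
	return elapsed < timeout
}

func main() {
	fmt.Println("main regained the CPU:", mainRegainsCPU(2*time.Second))
}
```

If the caller returns promptly even though a spinning goroutine owns the single P, the runtime must have interrupted that goroutine, which is the behavior the last table row describes.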
Debugging with GODEBUG
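The runtime can print scheduler state when the `GODEBUG` environment variable includes `schedtrace=X`, which emits one summary line to standard error every X milliseconds; adding `scheddetail=1` expands that into per-G, per-M, and per-P detail. Each summary line reports values such as `gomaxprocs`, `idleprocs`, `threads`, `spinningthreads`, `idlethreads`, `runqueue` (the global queue length), and a bracketed list of per-P local queue lengths. The program below is only a hypothetical workload that gives the trace something to show; `busyWork` and its parameters are illustrative.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// busyWork (hypothetical helper) runs `workers` CPU-bound goroutines for
// roughly d and returns how many of them finished.
func busyWork(workers int, d time.Duration) int {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		finished int
	)
	deadline := time.Now().Add(d)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for time.Now().Before(deadline) {
				// spin to keep the Ps busy
			}
			mu.Lock()
			finished++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return finished
}

func main() {
	// Build first, then run the binary with tracing enabled, e.g.:
	//   go build -o app . && GODEBUG=schedtrace=1000 ./app
	// Add scheddetail=1 (comma-separated) for per-G/M/P detail.
	fmt.Println("workers finished:", busyWork(64, 3*time.Second))
}
```

Building the binary before setting `GODEBUG` keeps the `go` tool's own scheduler trace out of the output.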
Work Stealing Algorithm
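When a P runs out of local work, its M looks for more in a fixed order: the local queue, then the global queue, then other Ps, taking about half of a victim's queue at a time. The following is a toy, single-threaded model of that search order, not the runtime's implementation (which uses lock-free ring buffers and also consults the network poller). The `P` type, `stealHalf`, and `findWork` here are simplified stand-ins.

```go
package main

import (
	"fmt"
	"math/rand"
)

// P is a toy logical processor: an ID plus a FIFO queue of task IDs.
type P struct {
	id   int
	runq []int
}

// pop removes and returns the task at the head of the local queue.
func (p *P) pop() (int, bool) {
	if len(p.runq) == 0 {
		return 0, false
	}
	t := p.runq[0]
	p.runq = p.runq[1:]
	return t, true
}

// stealHalf moves half of the victim's queue (rounded up), taken from the
// head, onto the thief's queue and returns the number of tasks moved.
func stealHalf(thief, victim *P) int {
	n := len(victim.runq)
	take := n - n/2
	thief.runq = append(thief.runq, victim.runq[:take]...)
	victim.runq = victim.runq[take:]
	return take
}

// findWork tries the local queue, then the global queue, then steals from
// the other Ps, starting at a random position to spread contention.
func findWork(p *P, global *[]int, all []*P, rng *rand.Rand) (int, bool) {
	if t, ok := p.pop(); ok {
		return t, true
	}
	if len(*global) > 0 {
		t := (*global)[0]
		*global = (*global)[1:]
		return t, true
	}
	start := rng.Intn(len(all))
	for i := range all {
		victim := all[(start+i)%len(all)]
		if victim != p && stealHalf(p, victim) > 0 {
			return p.pop()
		}
	}
	return 0, false
}

func main() {
	ps := []*P{{id: 0}, {id: 1, runq: []int{10, 11, 12, 13, 14}}}
	var global []int
	rng := rand.New(rand.NewSource(1))

	task, ok := findWork(ps[0], &global, ps, rng)
	fmt.Println(task, ok, len(ps[0].runq), len(ps[1].runq)) // 10 true 2 2
}
```

Stealing half rather than a single task amortizes the cost of the steal: the idle P gets enough work to stay busy for a while, and the victim keeps enough to avoid immediately stealing back.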
GOMAXPROCS
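GOMAXPROCS sets the number of Ps, and therefore how many goroutines can execute Go code at the same instant. It can be set through the `GOMAXPROCS` environment variable or `runtime.GOMAXPROCS(n)`; by default it is derived from the number of logical CPUs available to the process. A brief sketch of querying and changing it follows, where `setProcs` is an illustrative wrapper.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs (hypothetical wrapper) sets GOMAXPROCS to n and returns the
// previous value together with the value now in effect.
func setProcs(n int) (prev, now int) {
	prev = runtime.GOMAXPROCS(n) // returns the previous setting
	now = runtime.GOMAXPROCS(0)  // an argument below 1 only queries
	return prev, now
}

func main() {
	fmt.Println("NumCPU:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

	prev, now := setProcs(2)
	fmt.Println("previous:", prev, "now:", now)

	runtime.GOMAXPROCS(prev) // restore the original setting
}
```

For CPU-bound work, more Ps than cores does not add parallelism: the extra OS threads simply compete for the same cores and pay for additional context switches.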
Follow-up Questions
- What happens when a goroutine makes a blocking syscall?
- How does Go 1.14+ achieve asynchronous preemption?
- Why doesn't increasing GOMAXPROCS beyond NumCPU help CPU-bound work?