This post summarizes the key topics in this very accessible paper published in 2009.
The authors argue that a new OS for a multicore machine should be designed from the ground up as a distributed system, using concepts from that field.
Modern hardware resembles a networked system even more than past large multi-processors: in addition to familiar latency effects, it exhibits node heterogeneity and dynamic membership changes.
A modern computer is a networked system of point-to-point links exchanging messages.
Figure 1: Node layout of a commodity 32-core machine
A single machine today consists of a dynamically changing collection of heterogeneous processing elements, communicating via channels (whether messages or shared memory) with diverse latencies.
Distributed systems are historically distinguished from centralized ones by three additional challenges:
Node heterogeneity:
Node Dynamicity: Nodes (CPUs, memory, etc.) come and go due to partial failures and other reconfigurations. However, operating systems do not usually view a computer's hardware this way.
Communication Latency: The problem of latency in cache-coherent NUMA machines is well-known.
Access | Cycles | Normalized to L1 | Per-hop cost |
---|---|---|---|
L1 cache | 2 | 1 | - |
L2 cache | 15 | 7.5 | - |
L3 cache | 75 | 37.5 | - |
Other L1/L2 | 130 | 65 | - |
1-hop cache | 190 | 95 | 60 |
2-hop cache | 260 | 130 | 70 |
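
The two derived columns in the table can be recomputed from the cycle counts alone. The sketch below is only an illustration of that arithmetic: the function names are made up for this post, the L3 figure is taken as 75 cycles (the value consistent with its 37.5 normalized entry), and the per-hop cost is assumed to be the difference from the next-nearest remote level.

```python
# Cycle counts from the table above. L3 is taken as 75 cycles,
# i.e. 37.5 times the 2-cycle L1 latency.
CYCLES = {
    "L1 cache": 2,
    "L2 cache": 15,
    "L3 cache": 75,
    "Other L1/L2": 130,
    "1-hop cache": 190,
    "2-hop cache": 260,
}


def normalized_to_l1(cycles):
    """Express every access latency as a multiple of the L1 latency."""
    l1 = cycles["L1 cache"]
    return {name: count / l1 for name, count in cycles.items()}


def per_hop_cost(cycles):
    """Extra cycles paid for each additional hop (assumed: difference
    between a level and the next-nearer one in the remote chain)."""
    chain = ["Other L1/L2", "1-hop cache", "2-hop cache"]
    return {far: cycles[far] - cycles[near]
            for near, far in zip(chain, chain[1:])}
```

Read this way, a 2-hop cache access costs 130 times as much as an L1 hit, which is the kind of gap that makes latency a first-class design concern.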
The principal impact on clients is that they now invoke an agreement protocol (propose a change to system state, and later receive agreement or failure notification) rather than modifying data under a lock or transaction.
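
To make the client-side change concrete, here is a minimal sketch of the propose-then-notify pattern. It is not the protocol from the paper: `Replica`, `Proposal`, `AgreementService`, and the all-replicas-must-accept rule are hypothetical stand-ins chosen to keep the example short, and everything runs in a single process.

```python
class Replica:
    """One copy of some piece of OS state (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = {}

    def can_accept(self, key, value):
        # Placeholder policy for the sketch: reject negative values.
        return value >= 0

    def apply(self, key, value):
        self.state[key] = value


class Proposal:
    """Handle returned to the client; filled in once a round completes."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.agreed = None  # None = pending, True/False = decided


class AgreementService:
    def __init__(self, replicas):
        self.replicas = replicas
        self.pending = []

    def propose(self, key, value):
        """Queue a change. The client does not touch any replica directly."""
        proposal = Proposal(key, value)
        self.pending.append(proposal)
        return proposal

    def run_round(self):
        """Decide each queued proposal: apply it everywhere or nowhere."""
        pending, self.pending = self.pending, []
        for p in pending:
            p.agreed = all(r.can_accept(p.key, p.value) for r in self.replicas)
            if p.agreed:
                for r in self.replicas:
                    r.apply(p.key, p.value)
```

The point of the sketch is the shape of the interface: `propose` returns immediately with a pending handle, and the client learns the outcome only after the replicas have decided, instead of taking a lock and writing shared data itself.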
Closer to the level of system software, routing problems emerge when considering where to place buffers in memory as data flows through processors, DMA controllers, memory, and peripherals.
For example, data that arrives at a machine and is immediately forwarded back over the network should be placed in buffers close to the NIC, whereas data that will be read in its entirety should be DMAed to memory local to the computing core.
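
The placement rule in that example can be written down as a small decision function. This is a sketch under stated assumptions: the `Flow` categories, the `place_buffer` function, and the hop-count table are all hypothetical, and a real system would get its distances from the platform rather than from a hand-written dictionary.

```python
from enum import Enum


class Flow(Enum):
    FORWARDED = "forwarded"  # arrives and goes straight back out over the network
    CONSUMED = "consumed"    # read in its entirety by a local core


def closest_memory_node(anchor, hops_from):
    """Pick the memory node with the fewest hops from `anchor`.

    hops_from[anchor] maps each memory node to its hop count from `anchor`.
    """
    distances = hops_from[anchor]
    return min(distances, key=distances.get)


def place_buffer(flow, nic, consumer_core, hops_from):
    """Place forwarded data near the NIC and consumed data near the core."""
    anchor = nic if flow is Flow.FORWARDED else consumer_core
    return closest_memory_node(anchor, hops_from)
```

For instance, with a hypothetical table in which `mem0` is adjacent to the NIC and `mem1` is local to the consuming core, forwarded traffic lands in `mem0` and fully read data lands in `mem1`.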