Python — From Basics to Advanced · Concurrency II — Threads, Processes, GIL
multiprocessing — True Parallelism
Introduction
The multiprocessing module gives Python true CPU parallelism by sidestepping the GIL: each process runs its own interpreter with its own GIL. The costs are process startup time, pickling of arguments and results, communication through IPC rather than directly shared memory, and a higher RAM footprint. Rule of thumb for choosing between threading and multiprocessing: I/O-bound work → threading, CPU-bound work → multiprocessing.
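As a rough illustration of the CPU-bound case, the sketch below counts primes by trial division, first serially and then with a Pool. The helper names count_primes and split_range, the range size, and the choice of 4 worker processes are assumptions made for this example. Timings depend on the machine, so none are shown.

```python
import multiprocessing as mp
import time


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division (CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Split [lo, hi) into at most `parts` contiguous chunks (assumes hi > lo)."""
    step = (hi - lo + parts - 1) // parts
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def main():
    chunks = split_range(0, 100_000, 8)

    # Serial baseline: one process does all the work.
    t0 = time.perf_counter()
    serial = sum(count_primes(c) for c in chunks)
    t_serial = time.perf_counter() - t0

    # Parallel version: each chunk is pickled, sent to a worker, and computed there.
    t0 = time.perf_counter()
    with mp.Pool(processes=4) as pool:
        parallel = sum(pool.map(count_primes, chunks))
    t_parallel = time.perf_counter() - t0

    assert serial == parallel
    print(f"serial {t_serial:.2f}s, 4 processes {t_parallel:.2f}s")


if __name__ == "__main__":
    main()
```

For small inputs the parallel run can be slower than the serial one, because process startup and pickling dominate. The benefit shows up only once each chunk carries enough computation to outweigh those costs.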
API: Process(target, args) mirrors Thread; Pool provides a worker pool (map, imap, apply_async); Queue and Pipe handle IPC; Manager exposes shared structures (list, dict, Namespace) through proxies. Start methods: fork (the Linux default before Python 3.14; fast, copy-on-write), spawn (the default on Windows, and on macOS since Python 3.8; starts a clean interpreter), forkserver (a compromise, and the default on POSIX systems other than macOS from Python 3.14).
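A minimal sketch of Process and Queue working together, with the start method chosen explicitly through multiprocessing.get_context. The names square_worker and run_squares, and the use of None as a stop sentinel, are conventions assumed for this example and are not part of the module's API.

```python
import multiprocessing as mp


def square_worker(inbox, outbox):
    """Read numbers from inbox until the None sentinel; put (n, n*n) on outbox."""
    for n in iter(inbox.get, None):
        outbox.put((n, n * n))


def run_squares(numbers, start_method="spawn"):
    """Square each number in a single child process and return {n: n*n}."""
    # get_context returns an object with the same API as the multiprocessing
    # module, bound to one start method, without changing the global default.
    ctx = mp.get_context(start_method)
    inbox, outbox = ctx.Queue(), ctx.Queue()

    worker = ctx.Process(target=square_worker, args=(inbox, outbox))
    worker.start()

    for n in numbers:
        inbox.put(n)
    inbox.put(None)  # sentinel: tells the worker to stop

    # Collect exactly one result per input before joining the worker.
    results = dict(outbox.get() for _ in numbers)
    worker.join()
    return results


if __name__ == "__main__":
    print(run_squares([1, 2, 3]))
```

Results are drained from the queue before join() is called. Joining a process that still has unread items buffered in a Queue can block, so reading first is the safer order.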
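Pitfalls: (1) the if __name__ == "__main__": guard is REQUIRED under spawn (and forkserver), because child processes re-import the main module; (2) arguments must be picklable (lambdas and local classes do not work); (3) fork in a process that already runs threads is unsafe: only the forking thread exists in the child, so locks held by other threads may never be released; (4) NumPy + fork can hang with OpenBLAS; (5) IPC overhead: task arguments and results are pickled and sent between processes, so Pool.map with chunksize=1 on large arguments is slow.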
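Pitfalls (1), (2) and (5) can be sketched in a few lines. The helper is_picklable and the function double are hypothetical names introduced for this example, and the chunksize of 250 is an arbitrary value chosen to show batching, not a recommendation.

```python
import multiprocessing as mp
import pickle


def double(x):
    """Module-level function: pickled by reference, so workers can import it."""
    return 2 * x


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj, as Pool requires for its tasks."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def main():
    # Pitfall (2): a module-level function pickles, a lambda does not.
    print("double picklable:", is_picklable(double))
    print("lambda picklable:", is_picklable(lambda x: 2 * x))

    with mp.Pool(processes=2) as pool:
        # Pitfall (5): chunksize groups the 1000 tiny tasks into batches of 250,
        # so far fewer messages cross the process boundary than with chunksize=1.
        result = pool.map(double, range(1000), chunksize=250)
    assert result == [2 * x for x in range(1000)]


# Pitfall (1): without this guard, spawn and forkserver children would re-run
# the pool creation when they import the main module.
if __name__ == "__main__":
    main()
```

Batching reduces per-message overhead but does not shrink the data itself. When individual arguments are large, the pickling cost remains, and passing a small key (a file path or an index) for the worker to load locally is often the cheaper design.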