# Python Multithreading: Running the World in Parallel

> "Concurrency is not parallelism, but it enables it." — Rob Pike

> "The art of programming is the art of organizing complexity." — Edsger W. Dijkstra
## What Is Multithreading?
Multithreading is Python's way of doing multiple things at once, or at least appearing to. A thread is the smallest unit of execution within a program. When you run multiple threads, your program can juggle tasks like downloading files, responding to user input, and writing to a database simultaneously, without waiting for each to finish before starting the next.
> "Do not communicate by sharing memory; instead, share memory by communicating." — Go proverb (equally wise in Python)
## The `threading` Module

Python's built-in `threading` module is the standard way to create and manage threads.
```python
import threading

def greet(name):
    print(f"Hello, {name}!")

t1 = threading.Thread(target=greet, args=("Alice",))
t2 = threading.Thread(target=greet, args=("Bob",))

t1.start()
t2.start()
t1.join()
t2.join()
```
`start()` launches the thread; `join()` tells the main program to wait until that thread finishes before moving on.
## The GIL: Python's Double-Edged Sword

> "The GIL is not a bug, it's a feature, just not always your feature." — Community wisdom
The Global Interpreter Lock (GIL) is the most talked-about limitation of Python threads. It ensures only one thread executes Python bytecode at a time, even on multi-core machines. This makes Python thread-safe for memory, but means threads don't truly run in parallel for CPU-bound tasks.
Multithreading shines for:

- I/O-bound tasks (network requests, file reading, database calls)
- Tasks that spend most time waiting, not computing

Multithreading struggles with:

- CPU-bound tasks (heavy math, image processing). Use `multiprocessing` instead.
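You can see the GIL's effect directly by timing a CPU-bound function run sequentially versus in two threads. This is a minimal sketch (the `busy` function and iteration count are illustrative, not from the original text); on a stock CPython build the two timings come out roughly equal, because the GIL serializes the pure-Python arithmetic:

```python
import threading
import time

def busy(n):
    # Pure-Python CPU work; the thread holds the GIL while computing
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 1_000_000

start = time.perf_counter()
busy(N)
busy(N)
sequential = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=busy, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# Under the GIL, the threaded run is not meaningfully faster
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

Swap `threading.Thread` for `multiprocessing.Process` and the threaded timing drops, since each process gets its own interpreter and GIL.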
## Thread Synchronization with Locks

When multiple threads access shared data, race conditions can corrupt it. A `Lock` ensures only one thread touches the data at a time.
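To illustrate the failure mode, here is the same counter incremented *without* a lock, as a hedged sketch: whether the lost updates actually show up depends on the CPython version and thread scheduling, so the final count may or may not come up short on any given run.

```python
import threading

unsafe_counter = 0

def increment_unsafe():
    global unsafe_counter
    for _ in range(100000):
        unsafe_counter += 1  # read-modify-write: not atomic

threads = [threading.Thread(target=increment_unsafe) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Expected 500000; without a lock, two threads can read the same
# old value and one increment is silently lost.
print(f"Unsafe count: {unsafe_counter}")
```

The lock-protected version below never comes up short.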
```python
import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Final count: {counter}")  # Always 500000
```
> "A lock is a promise between threads. Keep it, or pay with corrupted state."
## Using ThreadPoolExecutor (The Modern Way)

The `concurrent.futures` module offers a cleaner, higher-level API:
```python
from concurrent.futures import ThreadPoolExecutor

import requests  # third-party: pip install requests

urls = [
    "https://api.example.com/data/1",
    "https://api.example.com/data/2",
    "https://api.example.com/data/3",
]

def fetch(url):
    return requests.get(url).status_code

with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(fetch, urls))

print(results)
```
This is the recommended approach for most real-world use cases: cleaner, exception-safe, and easy to scale.
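`executor.map` returns results in input order; when you instead want results as each task finishes, or per-task exception handling, `submit` plus `as_completed` works well. A stdlib-only sketch (the `work` function and its inputs are hypothetical stand-ins for real jobs):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def work(n):
    # Stand-in for a real task; raises on bad input
    if n < 0:
        raise ValueError("negative input")
    return n * n

inputs = [3, -1, 2]
results = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {executor.submit(work, n): n for n in inputs}
    for fut in as_completed(futures):
        n = futures[fut]
        try:
            results[n] = fut.result()  # re-raises any exception from work()
        except ValueError as exc:
            results[n] = f"failed: {exc}"

print(results)
```

A failed task surfaces its exception only when you call `fut.result()`, so one bad input doesn't bring down the whole pool.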
## Daemon Threads

> "A daemon thread lives to serve. It dies when the program does."

Daemon threads run in the background and are killed automatically when the main program exits. They are useful for background tasks like log flushing or heartbeat checks:
```python
import threading
import time

def background_task():
    # Stand-in for real background work, e.g. flushing logs
    while True:
        time.sleep(1)

t = threading.Thread(target=background_task, daemon=True)
t.start()
# No need to join — it will die with the main program
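A common shape for such a background worker is a daemon thread draining a `queue.Queue`, roughly like the log-flushing case mentioned above. This is a sketch under assumed names (`log_flusher`, `flushed`); `log_queue.join()` blocks until every queued message has been marked done:

```python
import queue
import threading

log_queue = queue.Queue()
flushed = []

def log_flusher():
    # Drain messages forever; dies automatically with the main program
    while True:
        msg = log_queue.get()
        flushed.append(msg)
        log_queue.task_done()

t = threading.Thread(target=log_flusher, daemon=True)
t.start()

for i in range(3):
    log_queue.put(f"event {i}")
log_queue.join()  # wait until all queued messages are flushed
print(flushed)
```

Because `Queue` handles its own locking, the producer and the daemon consumer never need an explicit `Lock`.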
## Quick Reference

| Concept | Tool | Use When |
|---|---|---|
| Basic threads | `threading.Thread` | Simple parallel tasks |
| Thread pool | `ThreadPoolExecutor` | Managing many threads cleanly |
| Shared data safety | `threading.Lock` | Preventing race conditions |
| Background tasks | `daemon=True` | Fire-and-forget background work |
| CPU-bound tasks | `multiprocessing` | GIL-free true parallelism |
## When NOT to Use Threads

> "Premature optimization is the root of all evil." — Donald Knuth

Don't reach for threads just because something feels slow. If the bottleneck is computation, threads won't help due to the GIL. Profile first, then decide: threads for I/O, processes for CPU, and sometimes `asyncio` for high-concurrency I/O.
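For the `asyncio` option, many waiting tasks run on a single thread by cooperatively yielding at each `await`. A minimal sketch, with `asyncio.sleep` standing in for real network latency and the `fetch` coroutine name assumed for illustration:

```python
import asyncio

async def fetch(i):
    # Simulated network wait; real code would await an HTTP client here
    await asyncio.sleep(0.01)
    return i * 2

async def main():
    # Run all five "requests" concurrently on one thread
    return await asyncio.gather(*(fetch(i) for i in range(5)))

results = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
```

Unlike threads, there is no lock to forget: only one coroutine runs at a time, and control changes hands only at explicit `await` points.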