Skip to main content

Concurrency

Threads, processes, async, locks, and the common race patterns — and how to reason about them out loud in an interview.

Free reference · last reviewed

Concurrency is dealing with many things at once by interleaving them; parallelism is doing many at once on multiple cores. Most interview bugs trace to shared mutable state touched without a lock — a race — so the skill is spotting the critical section and naming the primitive that guards it. Below: threads vs processes vs coroutines, the lock zoo, deadlock, and the classic race patterns, each with how to reason about it out loud.


Concurrency vs Parallelism

ConcurrencyParallelism
Many tasks in progress at onceMany tasks executing simultaneously
One CPU + interleavingMultiple CPUs / cores
Async I/O fitsCPU-bound work benefits

You can have concurrency without parallelism (single-threaded async). You need parallelism for CPU-bound speed-up.


Process vs Thread vs Coroutine

ProcessThreadCoroutine (async)
Memoryisolatedshared in same processshared, single-threaded
Creation costhigh (~ms)medium (~µs)very low (~ns)
Switch costhigh (kernel + TLB flush)medium (kernel)low (userspace)
CommunicationIPC (sockets, pipes, shared mem)shared memory + locksdirect
Fault isolationcrash doesn't kill siblingsone bad thread crashes processone exception bubbles up
Best forisolation, CPU-bound parallelism, GIL bypassI/O + CPU mix where shared state mattersI/O-bound at scale

Python specifics

  • GIL: only one thread runs Python bytecode at a time → threads don't help CPU-bound code.
  • CPU-boundmultiprocessing (or concurrent.futures.ProcessPoolExecutor).
  • I/O-boundthreading or, better, asyncio.
  • Mostly I/O at huge scaleasyncio (handles 10K+ connections per process).

When to pick what

  • 1000 HTTP fetches → asyncio.
  • Image processing → multiprocessing.
  • 50 background jobs polling DBs → threads are fine.
  • A web server → framework handles it; usually thread or async worker pool.

Async I/O (Python asyncio)

import asyncio, aiohttp

async def fetch(session, url):
    async with session.get(url) as r:
        return await r.text()

async def main(urls):
    async with aiohttp.ClientSession() as s:
        return await asyncio.gather(*(fetch(s, u) for u in urls))

asyncio.run(main(["https://a", "https://b"]))

Mental model

  • One event loop, one thread.
  • await = "yield control until ready"; the loop runs other coroutines.
  • Never time.sleep() inside async: blocks the loop. Use asyncio.sleep().
  • Never call blocking C extensions / sync libs from async; use asyncio.to_thread().

Locks & Friends

Mutex / Lock

Mutual exclusion: only one holder at a time.

import threading
lock = threading.Lock()
with lock:
    # critical section
    counter += 1

RLock (reentrant)

Same thread can acquire it multiple times (releases must match). Use when a method calls another method that also locks.

Semaphore

Predict the pattern

You have a connection pool capped at 10. Multiple threads need to acquire a slot; any of the 10 may be taken simultaneously. Which primitive models this directly?

Allow up to N holders. Use for bounded resource (connection pool, rate limit).

sem = threading.Semaphore(10)
with sem:
    do_io()

Condition variable

Wait until some predicate is true.

cv = threading.Condition()
with cv:
    while not ready:
        cv.wait()
    process()
# Producer:
with cv:
    ready = True
    cv.notify_all()

Event

One-shot flag; threads wait() until set().

e = threading.Event()
# consumer
e.wait()
# producer
e.set()

Barrier

N threads wait at the barrier; all released together.

Atomic ops

In Python, single bytecode ops on built-ins are atomic-ish (don't rely on it for invariants spanning multiple ops). Use threading.Lock to be safe. Other languages: AtomicInteger, compareAndSwap.


Read-Write Lock (RWLock)

Many readers OR one writer.

  • Reader acquires shared lock.
  • Writer acquires exclusive lock.
  • Avoid writer starvation: queue writers ahead of subsequent readers.

Python threading doesn't ship one; implement with a lock + counter + condition, or use readerwriterlock library.


Deadlock

Predict the pattern

Thread A holds lock L1 and waits for L2; Thread B holds L2 and waits for L1. Neither can proceed. What concurrency problem is this?

Four necessary conditions (Coffman):

  1. Mutual exclusion: resource held exclusively
  2. Hold and wait: hold one, waiting for another
  3. No preemption: can't be forcibly taken away
  4. Circular wait: A waits B, B waits C, C waits A

Break any one → no deadlock.

Prevention (most common)

  • Global lock ordering: always acquire locks in the same order (e.g. by id).
def transfer(a, b, amount):
    first, second = sorted([a, b], key=lambda x: x.id)
    with first.lock:
        with second.lock:
            a.balance -= amount; b.balance += amount
  • Try-lock with timeout + retry:
if lock.acquire(timeout=2):
    try: ...
    finally: lock.release()
  • Hold one lock at a time: copy state, release, then act.

Detection

  • Cycle in the "wait-for" graph.
  • DB tools: pg_locks, SHOW ENGINE INNODB STATUS.
  • Thread dump: py-spy dump, jstack.

Livelock & Starvation

  • Livelock: threads keep changing state in response to each other but make no progress (two people stepping aside in a corridor).
  • Starvation: low-priority thread never runs (greedy scheduler / unfair lock).

Fix: bounded retries + random backoff (livelock); fair locks / priority caps (starvation).


Common Race Patterns

1. Check-then-act

if not exists(key):       # someone else might create it here
    create(key)

Fix: atomic upsert (DB unique key, INSERT ... ON CONFLICT, SETNX).

2. Read-modify-write (counter)

n = counter
counter = n + 1            # lost updates

Fix: lock, atomic increment, or DB UPDATE t SET n = n + 1.

3. Time-of-check to time-of-use (TOCTOU)

if user.balance >= 100:    # ok
    sleep(1)               # someone else withdraws
    user.balance -= 100    # now negative

Fix: do check and act atomically (DB transaction with row lock, CAS).

4. Double-checked locking (broken in many languages without barriers)

if instance is None:
    with lock:
        if instance is None:
            instance = Foo()

Works in Python due to GIL guarantees on assignment; in Java needs volatile.

5. Lazy init under concurrency

Prefer functools.cache / lru_cache or module-level eager init.

6. Iterator invalidation

for x in items:
    items.remove(x)        # skips elements or raises

Fix: iterate over a copy list(items), or build a new list.

7. Async forgetting to await

result = fetch_data()       # returns coroutine, not value

Always await. Lint with mypy / ruff.


Producer / Consumer

Classic pattern with a bounded queue.

import threading, queue, time

q = queue.Queue(maxsize=10)

def producer():
    for i in range(100):
        q.put(i)        # blocks if full → backpressure
    for _ in range(N_CONS): q.put(None)   # poison pills

def consumer():
    while True:
        item = q.get()
        if item is None: return
        process(item)
        q.task_done()

threads = [threading.Thread(target=consumer) for _ in range(N_CONS)]
for t in threads: t.start()
producer()
for t in threads: t.join()

Bounded queue = built-in backpressure. Unbounded = OOM risk under burst.


Reader / Writer Problem

Many threads can read; only one can write. Reads must wait for writers, writers wait for readers + other writers.

Pitfall: continuous readers can starve writers. Solution: queue any incoming reader behind a waiting writer (writer-priority RWLock).


Dining Philosophers

5 philosophers, 5 forks, each needs both adjacent forks.

  • Naive: each picks left then right → deadlock.
  • Fix 1: global ordering (always pick lower-numbered fork first).
  • Fix 2: one philosopher picks right then left (asymmetry breaks cycle).
  • Fix 3: waiter that grants both forks together.
  • Fix 4: max 4 philosophers eating at once (semaphore).

Distributed concurrency

When state spans machines, in-process locks don't help.

Distributed lock

  • Redis SETNX with TTL: simple, fast. Caveats: lock expires under GC pause.
  • Redlock: quorum across N Redis nodes.
  • ZooKeeper / etcd: strong consistency, ephemeral znodes.
  • DB row lock (SELECT ... FOR UPDATE): easy if you're already in DB.

Optimistic concurrency (preferred when contention is rare)

UPDATE orders SET total = ?, version = version + 1
WHERE id = ? AND version = ?;
-- 0 rows updated → someone else won; retry with fresh read

Idempotency

For retries to be safe: client provides a key, server dedupes.


Memory Model 101

Predict the pattern

In Java, marking a field volatile guarantees that a write by one thread is immediately seen by other threads. What property does this provide — and what does it NOT provide for compound operations like i++?

  • Happens-before: write x = 1 before reads of x for that read to see it. Without sync, no guarantee.
  • Reordering: compilers/CPUs reorder operations for speed; not visible to single thread; deadly across threads.
  • Memory barriers / fences: enforce ordering.
  • Volatile (Java) / atomic (C++): prevent reordering / cache staleness for that variable.

In Python: GIL gives you a lot for free, but only on single ops. For multi-op invariants, use locks.


Cheat sheet: which primitive

From memory: which primitive fits each situation?

SituationPrimitive
Protect a counterLock (or atomic if available)
One-time initdouble-checked + lock, or eager init
Bound concurrent connections to NSemaphore(N)
Wait for a flagEvent
Wait until predicateCondition variable
Many readers, occasional writerRWLock
Multi-producer multi-consumerBounded Queue
Reset across N workersBarrier
Avoid cross-machine raceDB row lock / distributed lock
Avoid double-charge on retryIdempotency key

Common Interview Questions

"Implement a thread-safe counter"

import threading
class Counter:
    def __init__(self):
        self._n = 0
        self._lock = threading.Lock()
    def inc(self):
        with self._lock: self._n += 1
    @property
    def value(self):
        with self._lock: return self._n

"Implement a bounded blocking queue"

import threading
class BoundedQueue:
    def __init__(self, cap):
        self.q = []; self.cap = cap
        self.lock = threading.Lock()
        self.not_full = threading.Condition(self.lock)
        self.not_empty = threading.Condition(self.lock)
    def put(self, x):
        with self.not_full:
            while len(self.q) >= self.cap:
                self.not_full.wait()
            self.q.append(x)
            self.not_empty.notify()
    def get(self):
        with self.not_empty:
            while not self.q:
                self.not_empty.wait()
            x = self.q.pop(0)
            self.not_full.notify()
            return x

"Implement a rate limiter (token bucket)"

See lld.md §10.

import threading
class FooBar:
    def __init__(self, n):
        self.n = n
        self.foo_e = threading.Event(); self.foo_e.set()
        self.bar_e = threading.Event()
    def foo(self, printer):
        for _ in range(self.n):
            self.foo_e.wait(); self.foo_e.clear()
            printer("foo")
            self.bar_e.set()
    def bar(self, printer):
        for _ in range(self.n):
            self.bar_e.wait(); self.bar_e.clear()
            printer("bar")
            self.foo_e.set()

"Dining philosophers without deadlock"

  • Number forks 0..N-1.
  • Each philosopher picks min(left, right) first, then max.

"Implement async / await semantics yourself"

  • A coroutine is a state machine; await saves resume point; event loop schedules ready coroutines (callbacks on I/O completion).

"How does the GIL affect performance?"

  • One Python bytecode runs at a time.
  • I/O releases the GIL during the syscall → threads help for I/O.
  • CPU-bound work doesn't parallelize → use multiprocessing.
  • C extensions can release the GIL (NumPy, etc.).

Anti-patterns

Anti-patternWhy bad
time.sleep() for polling instead of condition variablewastes CPU, slow response
Catching exception in thread and ignoringthread dies silently; surface it
threading.Thread(daemon=False) then forgetting to joinprogram hangs on exit
Locks held across I/Ohuge critical section, contention
Two-level locks without orderingdeadlock waiting to happen
Async function with sync DB call insideblocks event loop, kills throughput
try: lock.acquire() without finally: release()leaked locks on exception
Using multiprocessing to share stateuse Queue / Manager, not bare globals