Every systems programmer eventually writes a bounded queue. And almost every systems programmer eventually gets one wrong.
This page is about why I wrote yet another bounded work queue using
pthread_mutex_t and pthread_cond_t — even though
this problem has been “solved” for decades.
On paper, bounded queues are simple: a fixed-size buffer, producers, consumers, and some blocking when things fill up or run dry.
In practice, they are a graveyard of subtle bugs: missed wakeups, incorrect signaling, spurious wakeups, broken invariants under load, and code that works perfectly… until it doesn’t.
I’ve seen these bugs show up in thread pools, logging systems, job schedulers, and “temporary” utilities that somehow made it into production.
This implementation started as a personal reset: what does the simplest correct solution actually look like, if we care more about reasoning than raw speed?
The goal here was intentionally unambitious:
If this queue ever misbehaves, it should be easy to explain why. If it’s slow, at least it’s slow in a predictable way.
Rather than starting with code, I started with a short list of invariants. Everything else exists only to preserve these under concurrency:
- 0 <= size <= capacity
- head and tail stay within bounds

If any of these are violated, the queue is broken — no amount of performance can save it.
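As a minimal sketch of what those invariants look like in code (the struct and function names here are my own illustration, not the post's actual definitions), the queue state can be checked with a small predicate:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for illustration; the post does not show its struct. */
typedef struct {
    size_t capacity;  /* fixed buffer size, > 0 */
    size_t size;      /* number of items currently stored */
    size_t head;      /* next slot to dequeue from */
    size_t tail;      /* next slot to enqueue into */
} queue_state;

/* Returns 1 if the core invariants hold, 0 otherwise.
   Every public operation should preserve this predicate. */
int queue_invariants_hold(const queue_state *q)
{
    return q->size <= q->capacity   /* 0 <= size is implied by size_t */
        && q->head < q->capacity    /* head stays within bounds */
        && q->tail < q->capacity;   /* tail stays within bounds */
}
```

A debug build can call this under assert() on entry and exit of every operation, which turns a corrupted queue into an immediate, explainable crash rather than a silent misbehavior.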
The entire queue state is protected by a single mutex. This is a deliberate choice.
Fine-grained locking can reduce contention, but it also multiplies the number of states your brain has to simulate. For this exercise, I wanted one lock and one place where the truth lives.
Two condition variables complete the picture:
- not_full — producers sleep when the buffer is full
- not_empty — consumers sleep when the buffer is empty
All waits use while, never if.
This isn’t defensive paranoia — it’s required to handle spurious wakeups
and racing producers or consumers.
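Putting the single mutex, the two condition variables, and the while-based waits together, the core operations look roughly like this (a self-contained sketch with my own names — bq_*, a fixed CAP — rather than the post's actual code):

```c
#include <pthread.h>
#include <stddef.h>

#define CAP 8  /* illustrative fixed capacity */

typedef struct {
    void *buf[CAP];
    size_t size, head, tail;       /* invariants: size <= CAP, head/tail < CAP */
    pthread_mutex_t lock;          /* the single mutex guarding all state */
    pthread_cond_t  not_full;      /* producers wait here when size == CAP */
    pthread_cond_t  not_empty;     /* consumers wait here when size == 0 */
} bqueue;

void bq_init(bqueue *q)
{
    q->size = q->head = q->tail = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

void bq_enqueue(bqueue *q, void *item)
{
    pthread_mutex_lock(&q->lock);
    /* while, not if: a wakeup only hints that the state *may* have changed;
       spurious wakeups and racing producers force a recheck. */
    while (q->size == CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->buf[q->tail] = item;
    q->tail = (q->tail + 1) % CAP;
    q->size++;
    pthread_cond_signal(&q->not_empty);  /* wake one sleeping consumer */
    pthread_mutex_unlock(&q->lock);
}

void *bq_dequeue(bqueue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->size == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    void *item = q->buf[q->head];
    q->head = (q->head + 1) % CAP;
    q->size--;
    pthread_cond_signal(&q->not_full);   /* wake one sleeping producer */
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

Note that every condition check and every state mutation happens with the lock held; the condition variables never carry state themselves, they only say "go look again".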
Correctness isn’t just about mutual exclusion. It’s also about memory visibility.
The mutex and condition variable operations establish a happens-before relationship between producers and consumers. Anything a producer writes before enqueue is guaranteed to be visible to a consumer after dequeue.
This guarantee is easy to take for granted — and easy to break if synchronization is misused.
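To make the visibility point concrete, here is a stripped-down single-slot handoff (names and layout are my own, for illustration): the payload field is a plain, non-atomic variable, yet the consumer is guaranteed to see the producer's write, because the mutex release in the producer happens-before the mutex acquire in the consumer.

```c
#include <pthread.h>

/* Illustrative one-item handoff, not the post's queue. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             has_item;
    long            payload;  /* plain field, no atomics needed */
} handoff;

void handoff_put(handoff *h, long value)
{
    pthread_mutex_lock(&h->lock);
    h->payload = value;              /* ordinary write... */
    h->has_item = 1;
    pthread_cond_signal(&h->ready);
    pthread_mutex_unlock(&h->lock);  /* ...published by the release */
}

long handoff_get(handoff *h)
{
    pthread_mutex_lock(&h->lock);    /* acquire pairs with producer's release */
    while (!h->has_item)
        pthread_cond_wait(&h->ready, &h->lock);
    long v = h->payload;             /* guaranteed to see the producer's write */
    h->has_item = 0;
    pthread_mutex_unlock(&h->lock);
    return v;
}
```

If, say, the flag were checked outside the lock as a "fast path", this guarantee would evaporate — which is exactly the kind of misuse the paragraph above warns about.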
This design deliberately avoids a few common “optimizations”:
- if-based waits

The result is not the fastest queue you can build. It is, however, a queue whose behavior you can predict when things get ugly.
I don’t believe in “it looks correct”. This implementation was exercised with:
None of this proves correctness. It simply increases confidence that the invariants survive contact with reality.
This queue is a baseline. Something to measure more complex designs against.
It is not lock-free. It is not wait-free. It is not meant to win benchmarks.
But if you’re building something faster, you should be able to explain exactly which guarantees you’re trading away — and why.