2016-03-26

Thread Synchronization: Condition Variables

A common way to use condition variables looks like this:

thread 1:
    pthread_mutex_lock(&mutex);
    while (!condition)
        pthread_cond_wait(&cond, &mutex);
    /* do something that requires holding the mutex and condition is true */
    pthread_mutex_unlock(&mutex);

thread 2:
    pthread_mutex_lock(&mutex);
    /* do something that might make condition true */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
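
The two fragments above can be fleshed out into a small compilable sketch. The names ready, shared_value, waiter, signaler and run_wait_signal_demo are made up for illustration, and the "condition" is assumed to be a plain boolean that is only touched while the mutex is held:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;    /* the "condition"; only touched with mutex held */
static int shared_value = 0;  /* data that is valid once ready is true */

/* thread 1: sleep until ready is true, then read the data */
static void *waiter(void *arg)
{
    int *out = arg;

    pthread_mutex_lock(&mutex);
    while (!ready)                         /* re-test after every wakeup */
        pthread_cond_wait(&cond, &mutex);
    assert(ready);                         /* mutex held, condition true */
    *out = shared_value;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* thread 2: make the condition true, then signal */
static void *signaler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    shared_value = 42;
    ready = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Runs one waiter and one signaler. Returns the value the waiter saw,
 * or -1 if a thread could not be created. */
int run_wait_signal_demo(void)
{
    pthread_t t1, t2;
    int seen = -1;

    if (pthread_create(&t1, NULL, waiter, &seen) != 0)
        return -1;
    if (pthread_create(&t2, NULL, signaler, NULL) != 0) {
        signaler(NULL);            /* still release the waiter */
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return seen;
}
```

Called from main (built with -pthread), run_wait_signal_demo() returns 42 whichever thread happens to run first: if the signaler wins, the waiter finds ready already true and never sleeps.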

1. pthread_cond_wait

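What exactly does pthread_cond_wait do?
(1) It releases the mutex.
(2) It adds the calling thread to the condition variable's queue of waiting threads and puts the thread to sleep.
(3) When another thread calls pthread_cond_signal or pthread_cond_broadcast, the thread that called pthread_cond_wait wakes up and tries to reacquire the mutex. If the mutex is available, pthread_cond_wait returns; otherwise the thread blocks until whoever holds the mutex releases it.

The system guarantees that steps (1) and (2) happen atomically. Why? Without atomicity there would be a race condition: after the current thread releases the mutex, another thread could acquire it and call pthread_cond_signal before the current thread has executed step (2). The current thread would then go to sleep and stay asleep (until some thread calls pthread_cond_signal again), having missed the moment the condition became true.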

2. pthread_cond_signal

pthread_cond_signal does not unlock the mutex (it can’t: it has no reference to the mutex, so how could it know what to unlock?). In fact, the signal need not have any connection to the mutex; the signalling thread does not need to hold the mutex, though for most algorithms based on condition variables it will.

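3. How thread 1 and thread 2 execute concurrently

Putting this together, thread 1 and thread 2 interleave as follows:
(1) Thread 1 acquires the mutex.
(2) Thread 1 tests the condition, finds it false, and calls pthread_cond_wait, which releases the mutex, adds thread 1 to the condition variable's wait queue, and puts it to sleep.
(3) Thread 2 acquires the mutex.
(4) Thread 2 sets the condition to true and calls pthread_cond_signal to wake thread 1. Once woken, thread 1 tries to reacquire the mutex, but thread 2 still holds it, so thread 1 blocks on the mutex.
(5) Thread 2 releases the mutex.
(6) Thread 1 acquires the mutex and returns from pthread_cond_wait.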
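
The interleaving above can be traced with a small sketch. Everything here (the event codes, t1_waiting, record, trace_event, run_trace) is invented for illustration. To force thread 1 to wait first, thread 2 polls a flag that thread 1 sets while holding the mutex; that is an assumption of this demo, not part of the usual pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

enum { EV_T1_WAIT = 1, EV_T2_SET, EV_T2_SIGNAL, EV_T1_RESUME };
#define MAX_EVENTS 8

static pthread_mutex_t trace_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t trace_cond = PTHREAD_COND_INITIALIZER;
static bool condition = false;
static bool t1_waiting = false;   /* demo-only flag: thread 1 is about to wait */
static int events[MAX_EVENTS];
static int n_events = 0;

/* Append an event to the log; call only with trace_mutex held. */
static void record(int ev)
{
    assert(n_events < MAX_EVENTS);
    events[n_events++] = ev;
}

static void *trace_thread1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&trace_mutex);                  /* step (1) */
    t1_waiting = true;
    record(EV_T1_WAIT);
    while (!condition)
        pthread_cond_wait(&trace_cond, &trace_mutex);  /* step (2) */
    record(EV_T1_RESUME);                              /* step (6) */
    pthread_mutex_unlock(&trace_mutex);
    return NULL;
}

static void *trace_thread2(void *arg)
{
    (void)arg;
    for (;;) {                                         /* step (3) */
        pthread_mutex_lock(&trace_mutex);
        if (t1_waiting)
            break;                 /* leave the loop with the mutex held */
        pthread_mutex_unlock(&trace_mutex);
        sched_yield();
    }
    condition = true;
    record(EV_T2_SET);
    pthread_cond_signal(&trace_cond);                  /* step (4) */
    record(EV_T2_SIGNAL);
    pthread_mutex_unlock(&trace_mutex);                /* step (5) */
    return NULL;
}

/* Returns the i-th logged event, or 0 if i is out of range. */
int trace_event(int i)
{
    return (i >= 0 && i < n_events) ? events[i] : 0;
}

/* Runs both threads; returns the number of logged events, or -1 on error. */
int run_trace(void)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, trace_thread1, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, trace_thread2, NULL) != 0) {
        trace_thread2(NULL);       /* still wake thread 1 */
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return n_events;
}
```

Because every event is recorded while the mutex is held, the log always comes out as EV_T1_WAIT, EV_T2_SET, EV_T2_SIGNAL, EV_T1_RESUME: thread 1 cannot leave pthread_cond_wait until thread 2 has released the mutex in step (5).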

4. Order of pthread_cond_signal and pthread_mutex_unlock

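Thread 2 calls pthread_cond_signal first and then releases the mutex. Can the two calls be swapped? Yes, provided the condition itself is changed while the mutex is held. It may look as if swapping them opens a race: thread 2 releases the mutex, thread 1 acquires it, tests the condition and is about to call pthread_cond_wait, thread 2 then signals, and thread 1 goes to sleep having missed the wakeup. That sequence cannot happen. Thread 2 set the condition to true before releasing the mutex, so when thread 1 tests the condition it finds it true and never calls pthread_cond_wait. And if thread 1 had tested the condition earlier and found it false, it would already be on the wait queue, because pthread_cond_wait releases the mutex and enqueues the thread atomically. POSIX allows pthread_cond_signal to be called whether or not the caller holds the mutex; it recommends holding the mutex while signalling only when predictable scheduling behavior is required.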
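
The reversed order (unlock first, then signal) can be exercised with a sketch like the one below. struct round, round_waiter, round_signaler and run_unlock_then_signal are hypothetical names, and the sketch assumes the predicate is only ever changed under the mutex. In that setup the waiter cannot miss the wakeup: it tests ready while holding the mutex, and ready is already true by the time the signaler releases the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct round {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool ready;                 /* predicate, protected by mutex */
};

static void *round_waiter(void *arg)
{
    struct round *r = arg;

    pthread_mutex_lock(&r->mutex);
    while (!r->ready)
        pthread_cond_wait(&r->cond, &r->mutex);
    assert(r->ready);
    pthread_mutex_unlock(&r->mutex);
    return NULL;
}

static void *round_signaler(void *arg)
{
    struct round *r = arg;

    pthread_mutex_lock(&r->mutex);
    r->ready = true;                  /* predicate changes under the mutex */
    pthread_mutex_unlock(&r->mutex);  /* unlock first ... */
    pthread_cond_signal(&r->cond);    /* ... then signal */
    return NULL;
}

/* Runs the waiter/signaler pair `rounds` times; returns how many rounds
 * completed. A lost wakeup would show up as a hang, not a short count. */
int run_unlock_then_signal(int rounds)
{
    int completed = 0;

    for (int i = 0; i < rounds; i++) {
        struct round r;
        pthread_t t1, t2;
        bool ok = true;

        pthread_mutex_init(&r.mutex, NULL);
        pthread_cond_init(&r.cond, NULL);
        r.ready = false;

        if (pthread_create(&t1, NULL, round_waiter, &r) != 0) {
            ok = false;
        } else if (pthread_create(&t2, NULL, round_signaler, &r) != 0) {
            round_signaler(&r);       /* still release the waiter */
            pthread_join(t1, NULL);
            ok = false;
        } else {
            pthread_join(t1, NULL);
            pthread_join(t2, NULL);
        }

        /* Destroy only after both threads are joined: with the reversed
         * order the signaler may still be inside pthread_cond_signal
         * after the waiter has already returned. */
        pthread_cond_destroy(&r.cond);
        pthread_mutex_destroy(&r.mutex);
        if (!ok)
            break;
        completed++;
    }
    return completed;
}
```

One practical caveat with this order is object lifetime: the condition variable must stay alive until the signaler has returned from pthread_cond_signal, which is why the sketch joins both threads before destroying it.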

5. Postscript

A spin lock is like a mutex, except that instead of blocking a process by sleeping, the process is blocked by busy-waiting (spinning) until the lock can be acquired.
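
To make the busy-waiting concrete, here is a minimal spin lock sketch built on C11 atomic_flag. It is only an illustration (demo_spin_lock, demo_spin_unlock, spin_worker and run_spin_demo are made-up names), not how any particular library implements its spin locks; POSIX also offers pthread_spin_lock where available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define SPIN_THREADS 4

static atomic_flag spin = ATOMIC_FLAG_INIT;
static long counter = 0;          /* protected by spin */

static void demo_spin_lock(void)
{
    /* test_and_set returns the previous value: true means another
     * thread holds the lock, so keep spinning instead of sleeping. */
    while (atomic_flag_test_and_set_explicit(&spin, memory_order_acquire))
        ;
}

static void demo_spin_unlock(void)
{
    atomic_flag_clear_explicit(&spin, memory_order_release);
}

static void *spin_worker(void *arg)
{
    long iters = *(long *)arg;

    for (long i = 0; i < iters; i++) {
        demo_spin_lock();
        counter++;                /* critical section */
        demo_spin_unlock();
    }
    return NULL;
}

/* Starts SPIN_THREADS workers that each increment the counter `iters`
 * times under the spin lock; returns the final counter value. */
long run_spin_demo(long iters)
{
    pthread_t threads[SPIN_THREADS];
    int created = 0;

    counter = 0;
    for (int i = 0; i < SPIN_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, spin_worker, &iters) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);

    assert(counter == created * iters);   /* no increments were lost */
    return counter;
}
```

A waiting thread here burns CPU in the while loop rather than being descheduled, which is the trade-off the paragraph above describes: cheap when the lock is held only briefly, wasteful when it is held for long.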