Two types of processes
- Independent : Not affected by other process
- Cooperating : Affected by each other
- Need data generated by the other process (data sharing
- Need to execute after a certain computation by another process (control)
- Split computation to many parallel processes (parallelism)
The method of communication is Synchronisation
- Part of the OS
- Implementation depends on OS
- Another reason for compiling programs for particular OS
- Mechanism for passing messages between processes
- Processes can SEND and RECEIVE messages with data or control commands
Similar to trap commands between process and OS
Primitives
- Mutex
- Semaphore
Linux kernel synchronisation primitives:
- Atomics: read-modify-write (RMW) operations
- Memory barriers: impose a perceived partial ordering over the memory operations
- Spin locks
- Futexes And also special mutexes and semaphores
Mutex
A Boolean variable (locked, unlocked) Asking to lock a locked mutex blocks execution Resume on release Called locks, or binary semaphores
#include <pthread.h>
pthread_mutex_t lock;
void* trythis(void* arg)
{
pthread_mutex_lock(&lock);
… //do some work
pthread_mutex_unlock(&lock);
}Semaphore
A counter atomically incremented/decremented
If (--sem & sem <1) blockBlock until sem >= 1
Modern processors provide special atomic instructions that allow implementing semaphores efficiently.
#include <pthread.h>
#include <semaphore.h>
sem_t lock;
void* trythis(void* arg)
{
sem_wait (&lock);
… //do some work
sem_post (&lock);
}Atomics
Read-modify-Write (RMW) operations No other write will occur to that location between the read and the write
Two classes: Operate on atomic_t or atomic64_t data type Operate on bitmaps Unsigned long/ array of unsigned long
#include <stdatomic.h>
atomic_int acnt;
…
++acnt;
…Barriers
Multi-CPU memory access is relaxed - not ordered
Strict overall order for threads
before_atomic,after_atomic are not the only barriers available.
General barrier
Only serves as an instruction to the compiler to prevent reordering of memory access A general barrier does not affect runtime. It is only a compiler instruction to keep the correct order.
Mandatory barrier
General, write, read Mandatory barriers are most useful when communicating with memory-mapped external peripherals. There are three types, the general, write and read barrier and act as full system memory barriers! Any operation before the barrier must commit its outputs to memory before any operations after the barrier can start execution.
SMP conditional barrier
SMP barriers ensure consistency between CPU cores meaning cache coherency.
Implicit barrier
Use locking constructs to act as implicit SMP barriers
Implicit barriers use other locking constructs to perform the operation of a barrier , often as a loop to spin the program until the memory commitment condition is met.
atomic_fetch_add();
// is equivalent to :
smp_mb before_atomic();
atomic_fetch_add_relaxed();
smp_mb after_atomic();Locks
Simplest form of locking Task goes into a loop doing nothing until it acquires a lock Occupies the CPU while waiting (busy wait) Should be put to sleep if there are other tasks using the CPU
while (! has_lock ) {
// try to get the lock
}Futexes
Fast user-space mutex Linux specific Optimised for performance for the case when there is no contention for resources
Occurs entirely in user space for the non-contended case Kernel is only involved to handle the contended case
Lightweight and scalable Semaphore semantics
Kernel Mutexes
Exclusive to the kernel Three-state atomic counter Unlocked 1 Locked no waiters 0 Locked with waiters < 0 Spin lock with wait-queue Three paths wen acquiring mutex
- Fastpath Tries to atomically acquire the lock by decrementing the counter. If it was already taken by another task, it goes to the next possible path. This logic is architecture-specific but typically requires only a few instructions.
- Midpath aka optimistic spinning, tries to spin for acquisition while the lock owner is running, and there are no other tasks ready to run that have higher priority (need_resched). The rationale is that if the lock owner is running, it is likely to release the lock soon.
- Slowpath if the lock is still unable to be acquired, the task is added to the wait queue and sleeps until woken up by the unlock path. Under normal circumstances, it blocks as TASK_- UNINTERRUPTIBLE.
Core observation: busy-waiting for a few cycles has a lower overhead than putting a task on the wait queue
Kernel Semaphores
Sempaphores are also locks with blocking wait (sleep), they are a generalized version of mutexes.
Where a mutex can only have values 0 or 1, a semaphore can hold
an integer count, i.e., a semaphore may be acquired count times before sleeping.
If the count is zero, there may be tasks waiting on the wait_list. The spin lock controls access to the other members of the semaphore.
Unlike the mutex above, the semaphore always sleeps. Incrementing and decrementing the counter follows the generic semaphore rules but there is also a waiters queue like in kernel mutexes.
POSIX primitives
As Kernel Mutex & Semaphores happen in kernel space, there is an interface in user space to use them:
#include <pthread.h>
pthread_mutex_t lock; = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_init(&lock);
pthread_mutex_lock(&lock);
pthread_mutex_unlock(&lock);
pthread_mutex_destroy(&lock);
#include <semaphore.h>
//unnamed version
int sem_init(sem_t *sem, int pshared, unsigned int value);
int sem_destroy(sem_t *sem);
int sem_post(sem_t *sem); //release
int sem_wait(sem_t *sem); //acquire
// Variants
int sem_trywait(sem_t *sem);
int sem_timedwait(sem_t *sem, const struct timespec *abs_timeout);Semaphores
- Named: have a name id Two processes can operate under the same name
- Unnamed: memory-based Memory is shared between threads/processes
Condition variables
-
Condition on the actual value of data
-
Wait until condition is satisfied without polling
-
Always used together with a mutex lock
Conditional variables can also have a signal or a broadcast sent out to one or all threads waiting for a condition to be met.