Random Facts
A 64-bit architecture can technically address up to 2^64 bytes, i.e. 16 exabytes. In practice, x86-64 only uses the lower 48 bits, meaning it supports only 2^48 bytes = 256 TB of virtual address space.
Multi-Value Types
Struct
This can be defined as such:

```c
struct point { int x; int y; };

int main() {
    struct point p = {1, 2};
}
```

or using the shorthand:

```c
typedef struct { int x; int y; } point;

int main() {
    point p = {1, 2};
}
```

Strings
Strings in C are represented as arrays of characters:

```c
char greeting[] = "Hello World";
```

is the same as

```c
char greeting[] = {'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '\0'};
```

We use double quotes " to write an (ASCII) string literal and single quotes ' to write a character literal.

Strings in C are terminated by the special character '\0', which is added automatically for string literals.

To print a string with printf we use the %s format specifier:

```c
printf("%s\n", greeting);
```

If we forget the terminating '\0' character, this will print the contents of memory until it hits the next bit pattern equivalent to '\0'!
Functions
Definition vs Declaration
All a declaration does is define the interface for how to interact with the function, i.e.:

```c
int max(int lhs, int rhs);
```

This is what the compiler uses to verify the validity of calls; the linker then searches for the definition, which can be in a separate file or library.
The definition specifies how the function works. A function definition in C looks like this:

```c
int max(int lhs, int rhs) {
    if (lhs > rhs) { return lhs; } else { return rhs; }
}
```

It consists of:
- The return type, which specifies the data type of the value returned after evaluating the function. If a function returns no value, the special type void is used as the return type.
- The name, which should describe the behaviour of the function.
- A parameter list, which specifies the data type and name of each parameter expected by the function.
- The function body, a block containing the code executed when calling the function.
To call a function we provide an argument for each parameter and capture/process the return value:

```c
int x = max(5, 3);
```

Pass by Value vs Pass by Reference
Pass by Value
In programming, "Pass by Value" means that:
- When a function is called, a copy of the argument's value is passed to the function.
- Changes made to the parameter inside the function do not affect the original argument outside the function.
This method is typically used with primitive data types like integers and characters.
It ensures that the original data remains unchanged and secure from unintended modifications within functions. However, it can be less efficient for large data structures or objects due to the overhead of copying data.
```c
#include <stdio.h>

void modifyValue(int num) {
    num = 20; // attempt to modify the parameter within the function
    printf("Value inside function: %d\n", num);
}

int main() {
    int x = 10;
    modifyValue(x); // x is passed by value
    printf("Value in main after function call: %d\n", x); // x remains unchanged
    return 0;
}

// Output:
// Value inside function: 20
// Value in main after function call: 10
```

Side Note: Arrays

Arrays are typically not passed fully by value, as this would be expensive; an array argument decays to a pointer to its first element.
Pass by Reference
"Pass by Reference" involves passing the actual address (reference) of an argument to a function, rather than a copy of its value.
This allows the function to modify the original argument directly, reflecting changes outside the function.
It is more memory-efficient, especially for large objects or arrays, as it avoids the need to copy data. However, it can lead to bugs if the data is unintentionally altered.
This method is commonly used in languages that support reference types, like C++ and Java (via object references). C has no reference types, so we emulate pass by reference by passing a pointer:
```c
#include <stdio.h>

void modifyReference(int *num) {
    *num = 20; // directly modify the value at the address
    printf("Value inside function: %d\n", *num);
}

int main() {
    int x = 10;
    modifyReference(&x); // address of x is passed
    printf("Value in main after function call: %d\n", x); // x is changed to 20
    return 0;
}

// Output:
// Value inside function: 20
// Value in main after function call: 20
```

Function Pointer
Function names are automatically converted to function pointers, i.e. we do not have to write &func_name. A function pointer is declared as:

```c
return_type (* function_name)(argument_types)
```

Pointers
On 64-bit architectures, all pointers have a size of 8 bytes.
The & "address of" operator gives the address that a variable is stored at.
The * "dereference" operator is the inverse: it lets us access the value of the variable at the address we are "pointing" to.
We typically use the macro NULL to represent a pointer that currently points to nothing.
This unfortunately has a problem: dereferencing a NULL pointer will crash your program.
Const
Pointers can be const, i.e. unmodifiable, in three ways:
- The pointer itself, i.e. the address, cannot be changed:

```c
float * const ptr;
```

- The value we are pointing to cannot be changed:

```c
const float * ptr;
```

- Both value and pointer cannot be changed:

```c
const float * const ptr;
```
Pointer Arithmetic
We can use pointer arithmetic to modify the value of a pointer (useful in the context of arrays).
We can:
- Add and subtract integer values to/from a pointer (float, etc. is invalid)
- subtract two pointers from each other
- compare pointers
```c
int vector[6] = {1, 2, 3, 4, 5, 6};
int * ptr = vector; // start at the beginning
while (ptr <= &(vector[5])) {
    printf("%d ", *ptr); // print the element in the array
    ptr++;               // go to the next element
}
```

Pointer arithmetic takes into account the size of the type the pointer is pointing to:

```c
int * i_ptr = &i;
char * c_ptr = &c;
i_ptr++;    // this adds 4 bytes (1x sizeof(int)) to the address stored in i_ptr
c_ptr += 2; // this adds 2 bytes (2x sizeof(char)) to the address stored in c_ptr
```

Note: Pointers cannot be added or multiplied.
Linked Lists
Typically:
```c
struct node { char value; struct node * next; };
```

With the last node pointing at NULL:

```c
int main() {
    struct node c = {'c', NULL};
    struct node b = {'b', &c};
    struct node a = {'a', &b};
    struct node * ptr = &a;
    while (ptr) {
        printf("%c\n", ptr->value); // %c, since value is a char
        ptr = ptr->next;
    }
}
```

Note: ptr->m is notation to access member m through a pointer to a struct, ptr, and is equivalent to (*ptr).m.
Void Pointer
A void pointer provides a generic means to address something, but it is not concrete: the compiler does not know what type it points to, so dereferencing a void pointer is forbidden. It must first be cast to a concrete pointer type.
Dynamic Memory Management
The heap is a region of a computer's memory used for dynamic memory allocation, where blocks are allocated and freed at runtime. Unlike the stack, which manages memory in a last-in-first-out (LIFO) manner for local variables with automatic lifetimes, the heap allows for more flexible memory usage, accommodating varying-sized data blocks as needed by an application.
Mismanagement of the heap, such as failing to free allocated memory, can lead to memory leaks.
Usage:
We request memory from the heap using the malloc function, specifying how many bytes we want:

```c
void* malloc( size_t size ); // in <stdlib.h>
```

This returns a void pointer to the first byte of the block on success, or NULL on failure.

Memory that is malloced must be freed:

```c
void free( void* ptr );
```

Otherwise this memory is "leaked": it remains allocated, and therefore unusable by the rest of the system, but is never used. Leaked memory is potentially wasted until the end of the program.
Segmentation Fault (SegFault)
Dangling pointer
- a reference to memory which has been de-allocated and possibly reallocated to another variable
- if you set the pointer to NULL after freeing, it is easier to check whether the pointer is still "active"

Dereferencing a NULL pointer
- NULL is an invalid memory location

Buffer overflow
- accessing memory outside allocated bounds, typically with arrays
- you can often read the values at these extra locations, but the data is effectively garbage

Stack overflow
- using more stack memory than is allocated
- often triggered by a recursion without a base case

Heap overflow
- exhausting heap memory, e.g. through memory leaks
Debugging
First, if we use the -g flag while compiling and enter lldb:

```
$ lldb -- ./program 12345
(lldb) run
Process 85058 launched: './program' (x86_64)
2018-10-21 20:56:00.106714+0100 program[85058:12827554] detected buffer overflow
Process 85058 stopped
```

At this point the debugger has stopped, and we can use a backtrace (bt):

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff76d1ab86 libsystem_kernel.dylib`__pthread_kill + 10
    ...
    frame #6: 0x00007fff76ca8e84 libsystem_c.dylib`__strcpy_chk + 8
    frame #7: 0x0000000100000eaf program`overflow(argc=2, argv=0x00007ffeefbff300) at program.c:35
    frame #8: 0x0000000100000efb program`main(argc=2, argv=0x00007ffeefbff3c8) at program.c:42
    frame #9: 0x00007fff76bdc085 libdyld.dylib`start + 1
```

Here frame 7 shows us the file (program.c) and line (35) that triggered the segfault.
Linter
A linter is a static analysis tool (the name comes from lint, the first UNIX tool to perform static analysis on C).
clang-tidy is a flexible tool that allows you to enforce coding guidelines and to modernize source code. It is possible to extend clang-tidy by writing your own checks.
It is invoked like clang, accepting the same flags, but placed after two dashes (--). A series of checks can be enabled and disabled. Here we enable the checks for readability:

```
$ clang-tidy -checks="readability-*" 6.c -- -Wall -Werror
/Users/lito/Desktop/examples/6.c:2:16: warning: pointer parameter 'p' can be pointer to const [readability-non-const-parameter]
void test(int *p) {
               ~~~ ^
               const
/Users/lito/Desktop/examples/6.c:4:9: warning: statement should be inside braces [readability-braces-around-statements]
  if (p)
        ^
         {
```

It suggests using const and putting braces around the branch of the if statement.
Dynamic Analysis
There exists a family of bug detection tools that use dynamic analysis
These tools need the program to run and can only detect bugs which are encountered during the execution of a particular test input
The clang project calls these tools sanitizers. The most important are:
- [[#addresssanitizer|AddressSanitizer]] - a memory error detector
- [[#memorysanitizer|MemorySanitizer]] - a detector of uninitialized reads
- [[#leaksanitizer|LeakSanitizer]] - a memory leak detector
- [[#undefinedbehaviorsanitizer|UndefinedBehaviorSanitizer]] - a detector of undefined behaviour
- [[#threadsanitizer|ThreadSanitizer]] - a data race detector
AddressSanitizer
AddressSanitizer is a memory error detector for:
- Out-of-bounds access (buffer overflow)
- Use-after-free (dangling pointer)
- Double free

This is executed as such:

```
clang -fsanitize=address -fno-omit-frame-pointer -O1 -g -Wall -Werror program.c -o program
```

Where:
- -fno-omit-frame-pointer produces a readable call stack
- -O1 enables basic optimizations
MemorySanitizer
MemorySanitizer detects uninitialized reads from memory:

```
clang -fsanitize=memory -fno-omit-frame-pointer -g -O2 umr.c
```

Example output:

```
$ clang -fsanitize=memory -fno-omit-frame-pointer -g -O2 umr.c
$ ./a.out
WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7f45944b418a in main umr.c:6
    #1 0x7f45938b676c in __libc_start_main libc-start.c:226
```

This tool is under active development and currently only available for Linux and BSD.
LeakSanitizer
This detects memory leaks:

```
$ clang -fsanitize=address -g memory-leak.c
$ ./a.out
==23646==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x4af01b in __interceptor_malloc /projects/compiler-rt/lib/asan/asan_malloc_linux.cc:52:3
    #1 0x4da26a in main memory-leak.c:4:7
    #2 0x7f076fd9cec4 in __libc_start_main libc-start.c:287

SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).
```

UndefinedBehaviorSanitizer
UndefinedBehaviorSanitizer detects various types of undefined behaviour. Here, an integer overflow is detected:

```c
int main(int argc, char **argv) {
    int k = 0x7fffffff; // this is the largest possible signed int value ...
    k += argc;          // ... this will produce an integer overflow
    return 0;
}
```

```
% clang -fsanitize=undefined -Wall -Werror intOverflow.c -o intOverflow
% ./intOverflow
intOverflow.c:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
```

RAII
RAII, or Resource Acquisition Is Initialization, is a programming paradigm used primarily in C++ to manage resource allocation and deallocation automatically. The core idea behind RAII is that an object’s lifetime, specifically when it is constructed and destructed, should control resource usage such as memory allocation, file handles, and network connections.
In RAII, resources are acquired in a constructor and released in a destructor. This ensures that resources are properly cleaned up when an object goes out of scope, thus preventing resource leaks and ensuring more robust exception safety.
For example, when an object that manages a file handle is created, it opens the file in its constructor. When this object is no longer needed and is destroyed, its destructor automatically closes the file. This automatic management of resources simplifies the code and reduces errors since the cleanup activities are made implicit and tied to the object lifecycle.
The beauty of RAII lies in its leveraging of C++'s deterministic object lifetimes, which provide automatic resource management without requiring explicit cleanup code from the programmer. This results in cleaner, safer, and more maintainable code, particularly in the face of exceptions, where manually managing resource cleanup can be complicated and error-prone.
For storing a single value on the heap we should use one of two "smart" pointers:
- std::unique_ptr for unique ownership: the default case, when a single variable owns the value on the heap
- std::shared_ptr for shared ownership: when it is not possible to identify a single owner (e.g. multi-threaded programs)
POSIX Threads
POSIX Threads (pthreads for short) is the most commonly used threading implementation for C.
To use it we need to #include <pthread.h> and specify the compiler flag -lpthread.
Create Thread
Threads are created with:
```c
int pthread_create(pthread_t *thread,
                   const pthread_attr_t *attr,
                   void *(*start_routine)(void*),
                   void *arg);
```

It takes four arguments:
1. A thread identifier, i.e. a pointer to a memory location of type pthread_t.
2. Thread attributes, which set properties such as scheduling policies or stack size. Passing NULL results in default attributes.
3. A function pointer to the start_routine. This function takes a single argument of type void* and returns a value of type void*.
4. The argument that is passed to start_routine.

It returns 0 if the thread is created successfully, or a non-zero error code.
Passing pointers to and from start_routine allows the passing of arbitrary data, but requires care to ensure that the memory locations pointed to have appropriate lifetimes.
Wait for Other Thread to Terminate
To wait for another thread to terminate we use pthread_join
```c
int pthread_join(pthread_t thread, void **value_ptr);
```

It takes two arguments:
- A thread identifier.
- A pointer to a memory location of type void*. The return value of the start_routine passed to pthread_create will be copied to this location.

It returns 0 on success and a non-zero error code otherwise.
Key Threading Concepts
Race Conditions
A Race Condition occurs when two or more threads or processes attempt to change shared data at the same time.
Imagine you and your friend are both trying to be the first to grab the last slice of pizza; you both reach for it simultaneously without knowing who will get it first. In computer terms, this uncertainty can cause problems.
For instance, if two threads are trying to update the same counter, they might both read the value at the same time, increment it, and write it back, resulting in a lost update if not handled correctly.
Critical Regions
A Critical Region is a part of the code where shared resources are accessed or modified.
Using the pizza slice example, the moment when someone reaches out to grab the slice can be seen as a critical region. In programming, you need to make sure that only one thread or process enters this critical part at a time to prevent confusion or errors.
This is like saying only one person can reach for the pizza slice at any one moment, ensuring that the slice goes to one person without dispute.
Mutexes
Mutexes (Mutual Exclusions) are tools used to avoid race conditions by controlling access to critical regions.
A mutex is like a key to a locked door that leads to the critical region or the pizza slice. If one thread has the key (owns the mutex), it can enter the critical region while other threads must wait outside until the mutex is released (the door is unlocked).
This ensures that only one thread can access the shared resource at a time.
Monitor
A monitor is a class that provides thread-safe access to a shared resource (like a bounded buffer). All public functions must provide mutual exclusion. A Java class where all the public methods are synchronized is an example of a monitor.
General Flow of Resolving Multithreading Issues
- Identify the Problem (Race Condition): You realize that if multiple threads try to update the same data concurrently, you might end up with incorrect data because the threads are racing each other to complete their task.
- Protect the Solution (Critical Region): You determine which parts of your code access or modify shared resources. This part is your critical region, which needs protection to ensure data accuracy.
- Implement a Tool (Mutex): You use a mutex as a tool to lock access to the critical region. Only the thread that holds the lock can enter this protected section at a time, while others must wait, thus preventing any overlap and ensuring data is updated correctly and consistently.
This means, before executing a critical section, so as to avoid a race condition, the thread must acquire a lock (mutex), and release this lock (mutex) on leaving.
If a thread tries to acquire a lock that is already owned by another thread, the new thread is blocked until the owner releases it.
Thread-Safe Linked List in C++
We encapsulate an std::list<int> and add a thread-safe interface protected by a (private) std::mutex
```cpp
#include <list>
#include <thread>
#include <optional>
#include <mutex>

struct list {
private:
    std::list<int> data; // the member cannot share the name of the struct
    std::mutex mutex;    // mutex to protect the critical section
public:
    void append_to_list(int value) {
        std::unique_lock<std::mutex> lock(mutex); // lock mutex
        data.push_back(value);
    } // mutex will be automatically unlocked here

    std::optional<int> remove_from_list(int position) {
        std::unique_lock<std::mutex> lock(mutex); // lock mutex
        auto iter = data.begin();
        while (position > 0 && iter != data.end()) { iter++; position--; }
        if (position != 0 || iter == data.end()) { return {}; /* return nothing */ }
        int value = *iter;
        data.erase(iter);
        return value;
    } // mutex will be automatically unlocked here
};
```
Deadlock and Busy Waiting in the Bounded Buffer Problem
In concurrent programming, deadlock can occur when multiple processes hold onto resources while waiting for others to release different resources, preventing any from moving forward.
For example: the bounded buffer, shared between producers (who add items) and consumers (who remove items), provides a clear example of potential deadlock situations.
Deadlock Scenario
Consider a setup with two mutexes: buffer_mutex for buffer access and count_mutex for managing the count of items. A deadlock may develop if:

- Producer process:
  1. Acquires buffer_mutex to add an item.
  2. Also needs count_mutex to update the item count.
- Consumer process:
  1. Acquires count_mutex to check the buffer's item count.
  2. Attempts to acquire buffer_mutex to remove an item.

If both processes hold one mutex while attempting to acquire the other, neither can proceed:
- The producer holds buffer_mutex and waits for count_mutex.
- The consumer holds count_mutex and waits for buffer_mutex.
Busy Waiting as a Solution
Busy waiting involves a process repeatedly checking a condition while waiting for a resource to become available. It’s a simple method to avoid deadlock under certain conditions but can lead to inefficient CPU usage. In the bounded buffer context, busy waiting can be implemented by having processes repeatedly attempt to acquire a lock until they succeed, without ever blocking.
To incorporate busy waiting effectively:
- Ensure processes release their mutex if they cannot acquire the second one, then retry acquiring both after a short delay.
- Use it conservatively to avoid high CPU consumption, especially in systems where thread or process responsiveness is critical.
Busy waiting can be a practical, if resource-intensive, strategy to prevent deadlocks in systems like the bounded buffer, where resource allocation order might lead to process standstills. However, this approach should be balanced with efficient CPU usage to maintain system performance.
Tea Maker Example
```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

pthread_mutex_t m;
bool teaIsReady = false;

void *me(void* arg) {
    (void)arg;
    printf("Me: Waiting for my tea ...\n");
    pthread_mutex_lock(&m);
    while (!teaIsReady) {
        pthread_mutex_unlock(&m);
        printf("Me: (Unamused) // do nothing\n");
        pthread_mutex_lock(&m);
    }
    pthread_mutex_unlock(&m);
    printf("Me: (Happy) ... finished waiting.\n");
    return NULL;
}

void *teaRobot(void* arg) {
    (void) arg;
    printf(" Tea Robot: Making tea ...\n");
    usleep(randInt()); // randInt() is a helper returning a random delay
    pthread_mutex_lock(&m);
    teaIsReady = true;
    pthread_mutex_unlock(&m);
    printf(" Tea Robot: Done!\n");
    return NULL;
}

int main() {
    pthread_t t1; pthread_t t2; int err;
    err = pthread_mutex_init(&m, NULL); assert(err == 0);
    err = pthread_create(&t1, NULL, me, NULL); assert(err == 0);
    err = pthread_create(&t2, NULL, teaRobot, NULL); assert(err == 0);
    err = pthread_join(t1, NULL); assert(err == 0);
    err = pthread_join(t2, NULL); assert(err == 0);
    pthread_mutex_destroy(&m);
}
```
Condition Variables as a Solution
As we saw, busy waiting is wasteful of energy and CPU cycles.
It is better instead to use condition variables (CVs): a thread blocks on a CV until the condition may have changed, and on being woken it re-acquires the lock.
```c
pthread_cond_wait(&cv, &m)
```

must be called with a locked mutex m. It releases the mutex and blocks the calling thread on cv.

```c
pthread_cond_signal(&cv)
```

assigns mutex ownership to one of the threads blocked on cv, and wakes it.
The condition may have changed before the wakened thread is scheduled, so it is important to check the condition again:

```c
pthread_mutex_lock(&m);
while (!cond) {
    pthread_cond_wait(&cv, &m);
}
```

This is the same idea as a hardware interrupt/signal.
Tea Maker Example
```c
pthread_mutex_t m; pthread_cond_t cv;
bool teaIsReady = false;

void *me(void* arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    while (!teaIsReady) {
        printf("Me: Waiting for my tea ...\n");
        pthread_cond_wait(&cv, &m);
    }
    printf("Me: (Happy) ... finished waiting.\n");
    printf("Me: Is the tea really finished? %s\n",
           teaIsReady ? "Yes" : "No");
    pthread_mutex_unlock(&m);
    return NULL;
}

void *teaRobot(void* arg) {
    (void) arg;
    printf(" Tea Robot: Making tea ...\n");
    usleep(randInt());
    pthread_mutex_lock(&m);
    teaIsReady = true;
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&cv);
    printf(" Tea Robot: Done!\n");
    return NULL;
}

int main() {
    pthread_t t1; pthread_t t2; int err;
    err = pthread_mutex_init(&m, NULL); assert(err == 0);
    err = pthread_cond_init(&cv, NULL); assert(err == 0);
    err = pthread_create(&t1, NULL, me, NULL); assert(err == 0);
    err = pthread_create(&t2, NULL, teaRobot, NULL); assert(err == 0);
    err = pthread_join(t1, NULL); assert(err == 0);
    err = pthread_join(t2, NULL); assert(err == 0);
    pthread_cond_destroy(&cv); pthread_mutex_destroy(&m);
}
```
Bounded Buffer Example
MAIN
```c
struct BoundedBuffer {
    int start; int end; int size; int* buffer;
    pthread_mutex_t m;
    pthread_cond_t add;
    pthread_cond_t remove;
};
typedef struct BoundedBuffer BoundedBuffer;

BoundedBuffer * createBoundedBuffer(int size) { ... }
void destroyBoundedBuffer(BoundedBuffer * bb) { ... }
void addItem(BoundedBuffer * bb, int item) { ... }
int removeItem(BoundedBuffer * bb) { ... }
void * producer(void * arg) { ... }
void * consumer(void * arg) { ... }

int main() {
    pthread_t t1; pthread_t t2; int err;
    BoundedBuffer * bb = createBoundedBuffer(4);
    err = pthread_create(&t1, NULL, consumer, bb); assert(err == 0);
    err = pthread_create(&t2, NULL, producer, bb); assert(err == 0);
    err = pthread_join(t1, NULL); assert(err == 0);
    err = pthread_join(t2, NULL); assert(err == 0);
}
```

Producer
```c
void * producer(void * arg) {
    BoundedBuffer * bb = (BoundedBuffer*)arg;
    for (int i = 0; i < 10; i++) {
        int item = randInt();
        printf("produced item %d\n", item);
        addItem(bb, item);
        usleep(randInt());
    }
    return NULL;
}
```

Consumer
```c
void * consumer(void * arg) {
    BoundedBuffer * bb = (BoundedBuffer*)arg;
    for (int i = 0; i < 10; i++) {
        int item = removeItem(bb);
        printf(" consumed item %d\n", item);
        usleep(randInt());
    }
    return NULL;
}
```

AddItem
```c
void addItem(BoundedBuffer * bb, int item) {
    if (!bb) return;
    pthread_mutex_lock(&bb->m);
    while (bb->start == bb->end) { // buffer is full
        printf("== Buffer is full ==\n");
        pthread_cond_wait(&bb->add, &bb->m);
    }
    // buffer is no longer full
    bb->buffer[bb->start] = item;
    bb->start = (bb->start + 1) % bb->size; // move start one forward
    pthread_mutex_unlock(&bb->m);
    pthread_cond_signal(&bb->remove);
}
```

RemoveItem
```c
int removeItem(BoundedBuffer * bb) {
    if (!bb) assert(0);
    pthread_mutex_lock(&bb->m);
    while ( ((bb->end + 1) % bb->size) == bb->start ) { // buffer is empty
        printf("== Buffer is empty ==\n");
        pthread_cond_wait(&bb->remove, &bb->m);
    }
    // buffer is no longer empty
    bb->end = (bb->end + 1) % bb->size; // move end one forward
    int item = bb->buffer[bb->end];
    pthread_mutex_unlock(&bb->m);
    pthread_cond_signal(&bb->add);
    return item;
}
```

Coordination in Concurrent Programming
Partitioning
Determining what parts of the computation should be separately evaluated, e.g. a thread to serve each request, to render each frame of a film.
Data Sharing
What data to share between threads, and when to acquire it, e.g. film rendering threads may hold two frames: one being processed and another ready to go as soon as the current frame is complete.
Synchronisation
Ensuring threads can cooperate without interference, e.g. if threads representing two players compete for a single resource, only one succeeds.
Semaphore
Imagine you're at a library with a limited number of study rooms. To manage these rooms so that no more than the available rooms are used at one time, the librarian uses a counter or a tracking system. This system ensures that everyone gets a turn without overcrowding. A semaphore operates similarly in computer systems: it is a tool (more specifically, a variable) used to control access to a common resource by multiple processes in a concurrent system, such as a multitasking operating system.
Types of Semaphores
There are mainly two types of semaphores:
- Counting Semaphore
- Binary Semaphore (or Mutex)
Counting Semaphore
A counting semaphore can hold any non-negative value and is used to represent the number of resources available. Here’s how it works:
- Initialize: You set a semaphore to the number of identical resources available (like the number of free study rooms).
- Wait (or P operation): When a process needs to use a resource, it performs a wait operation. This decrements the semaphore’s value. If the value is positive, the resource is allocated to the process. If the value is zero (meaning no resources are free), the process has to wait until a resource becomes free.
- Signal (or V operation): When a process is finished using the resource, it performs a signal operation, incrementing the semaphore’s value, signaling that a resource has become free.
For example, if there are 5 identical printers, the semaphore would be initialized to 5. Each time a process needs a printer, the semaphore is decremented. When a printer is released, the semaphore is incremented.
Binary Semaphore (Mutex)
A binary semaphore is simpler, as it can only be 0 or 1. It’s mainly used to ensure that only one process can access the resource at a time (mutual exclusion).
- Initialize: Typically initialized to 1, indicating that the resource is available.
- Wait: If a process wants the resource, it attempts to perform a wait operation. If the semaphore value is 1, the semaphore is set to 0 and the process uses the resource. If it’s already 0, the process must wait.
- Signal: When the process no longer needs the resource, it performs a signal operation, setting the semaphore back to 1.
This type of semaphore acts like a key to a single room. If you have the key (semaphore is 1), you can enter the room. If someone else has the key (semaphore is 0), you must wait outside.
Bounded Buffer Example
```c
struct BoundedBuffer { int start; int end; int size; int* buffer; };
typedef struct BoundedBuffer BoundedBuffer;

sem_t fillCount;  // data in the buffer
sem_t emptyCount; // free space in the buffer

BoundedBuffer * createBoundedBuffer(int size) { ... }
void destroyBoundedBuffer(BoundedBuffer * bb) { ... }
void addItem(BoundedBuffer * bb, int item) { ... }
int removeItem(BoundedBuffer * bb) { ... }
void * producer(void * arg) { ... }
void * consumer(void * arg) { ... }

int main() {
    pthread_t t1; pthread_t t2; int err;
    BoundedBuffer * bb = createBoundedBuffer(4);
    fillCount = sem_create(0, 4);  // no data in the buffer yet
    emptyCount = sem_create(4, 4); // all spaces in the buffer are free
    err = pthread_create(&t1, NULL, consumer, bb); assert(err == 0);
    err = pthread_create(&t2, NULL, producer, bb); assert(err == 0);
    err = pthread_join(t1, NULL); assert(err == 0);
}
```
Producer
```c
void * producer(void * arg) {
    BoundedBuffer * bb = (BoundedBuffer*)arg;
    for (int i = 0; i < 10; i++) {
        sem_wait(emptyCount); // wait for free space
        int item = randInt();
        printf("produced item %d\n", item);
        addItem(bb, item);
        sem_signal(fillCount); // one more item available
        usleep(randInt());
    }
    return NULL;
}
```

Consumer
```c
void * consumer(void * arg) {
    BoundedBuffer * bb = (BoundedBuffer*)arg;
    for (int i = 0; i < 10; i++) {
        int item = removeItem(bb); // removeItem waits on the semaphores itself
        printf(" consumed item %d\n", item);
        usleep(randInt());
    }
    return NULL;
}
```

RemoveItem
```c
int removeItem(BoundedBuffer * bb) {
    if (!bb) assert(0);
    sem_wait(fillCount); // wait until there is data in the buffer
    // buffer is no longer empty
    bb->end = (bb->end + 1) % bb->size; // move end one forward
    int item = bb->buffer[bb->end];
    sem_signal(emptyCount); // one more free space
    return item;
}
```

Note how much simpler this is than with condition variables.
Lambda Expression / Anon Functions
Example:
We have seen a function pointer used to pass a function as an argument, e.g. to create threads with pthread:

```c
void * PrintHelloWorld(void* arg) {
    (void)arg;
    printf("Hello World\n");
    return NULL;
}

int main() {
    // ...
    int error = pthread_create(&thread, NULL, PrintHelloWorld, NULL);
    // ...                            ^ this is passed as a function pointer
}
```

By writing a lambda expression we can write a function without naming it:

```cpp
int main() {
    auto t = std::thread( [](){ printf("Hello World\n"); } );
    t.join();
}
```

Here [](){ printf("Hello World\n"); } is the lambda.
Other names for lambdas are: function literal or anonymous functions
Lambda expressions in C++ are written as:

```cpp
[ /*captures*/ ] ( /*params*/ ) { /*body*/ }
```

- The captures list variables from the surrounding scope that are passed to the lambda when it is created.
- The params list the parameters that are passed to the lambda when it is called.
- The body is the function body that executes when the lambda is called.

As for function parameters, variables are captured by value, i.e. they are copied.
We can share access to a variable by capturing a pointer to it using this notation:
```cpp
int main() {
    auto l = list{};
    l.append_to_list('a');
    l.append_to_list('b');
    auto t1 = std::thread([l_ptr = &l] (){ l_ptr->remove_from_list(1); });
    t1.join();
}
```

When calling a standard pthread, our function pointer argument is passed as a void*, which means it loses its type information; using captures we can maintain this information.
Async Call
auto f6 = std::async([] { return fib(6); });This is same as:
std::future<int> f6 = std::async([] { return fib(6); });
printf("fib(6) = %d (computed asynchronously)\n", f6.get() );
Futures and Promises:
- Futures: A future provides a mechanism to access the result of asynchronous operations. It acts as a handle to a result that is initially unknown but will be provided once the asynchronous operation completes.
- Promises: A promise is an object that can deliver a value to a corresponding future. A promise can set the value once, and the future can read the value, potentially in another thread.
Simplification with Futures and Promises:
- Futures and promises decouple the production of a value from its consumption, which simplifies code by eliminating the need for explicit thread management. This abstraction allows developers to focus on the computation’s logic rather than on synchronization details like locking mechanisms and condition variables, leading to code that is easier to write, read, and maintain.
We can see a future as the reading end of a communication channel: some_future.get() extracts the value
The writing end of the channel is called a promise: some_promise.set_value(42);
We can obtain a std::future object from a std::promise:
std::future<int> some_future = some_promise.get_future();
Promise Example
// This sum function writes to the sum_promise
void sum(std::vector<int>::iterator begin,
std::vector<int>::iterator end,
std::promise<int> sum_promise)
{
int sum = std::accumulate(begin, end, 0);
sum_promise.set_value(sum); // 4. write result
}
int main() {
auto numbers = std::vector<int>{ 1, 2, 3, 4, 5, 6 };
std::promise<int> sum_promise; // 1. create promise for an int
std::future<int> sum_future = sum_promise.get_future(); // 2. get future from promise
// 3. create thread that takes ownership of the promise (ownership transfer with std::move)
auto t = std::thread(sum, numbers.begin(), numbers.end(), std::move(sum_promise) );
printf("result = %d\n", sum_future.get() ); // 4. wait and then read result
t.join();
}
Synchronisation
We can use futures and promises purely to synchronise threads, with
- a std::promise<void>, a promise to produce nothing (but to say when you're done)
- the std::future<T>::wait() method
void do_work(std::promise<void> barrier) {
std::this_thread::sleep_for(std::chrono::seconds(1)); // do something (like sleeping)
barrier.set_value(); // 4. send signal to other thread
}
int main() {
std::promise<void> barrier; // 1. create promise
std::future<void> barrier_future = barrier.get_future(); // 2. get future from it
auto t = std::thread(do_work, std::move(barrier) ); // 3. launch thread
barrier_future.wait(); // 4. wait for signal
… // 5. continue execution: we know that do_work has completed
t.join();
}
Futures Example
int main() {
auto fs = std::vector<std::future<int>> ();
// launch 10 asynchronous tasks to compute fibonacci numbers
for (int i = 0; i < 10; i++) {
fs.push_back(std::async([i] {
return fib(i+1);
}));
}
// ... do some other work; then print the computed numbers
for (int i = 0; i < 10; i++) {
printf("fib(%d) = %d\n", i+1, fs[i].get());
}
}
Call future.get() to retrieve the value. Blocks calling thread until the value has been computed
Parallel Sum
BAD EXAMPLE
int parallelSum(std::vector<int>::iterator begin, std::vector<int>::iterator end) {
auto len = end - begin;
// compute sequentially for small arrays
if (len < 1000) { return std::accumulate(begin, end, 0); }
auto mid = begin + len/2;
// launch asynchronous task for the left half of the array
auto left_side = std::async([=] {
return parallelSum(begin, mid);
});
// compute right half of array recursively
int right_side = parallelSum(mid, end);
// block to wait for left side to finish
return left_side.get() + right_side;
}
int main() {
std::vector<int> vec = createLargeVector();
auto sum = parallelSum(vec.begin(), vec.end());
printf("sum: %d\n", sum);
}
The naïve parallel sum creates too many threads, and the time spent managing all of them outweighs the time saved by running them in parallel
If we have a small number of cores (say 8) we can fix this by only spawning threads for the first few recursive calls:
int parallelSum(std::vector<int>::iterator begin, std::vector<int>::iterator end, int depth = 0)
{
auto len = end - begin;
if (len < 1000 || depth > 3) { return std::accumulate(begin, end, 0); }
auto mid = begin + len/2;
auto left_side = std::async([=] {
return parallelSum(begin, mid, depth + 1);
});
int right_side = parallelSum(mid, end, depth + 1);
return left_side.get() + right_side;
}
Task Packaging
int main() {
auto task = std::packaged_task<int(int,int)>([](int a, int b) {
return pow(a, b);
});
std::future<int> result = task.get_future() ;
// The task can now be stored, or passed as a parameter.
// When we are ready to use it either
// launch task in the same thread via:
// task(2, 10);
// or start a new thread:
auto t = std::thread(std::move(task) , 2, 10);
t.join();
printf("task result: %d\n", result.get() );
}