Programming languages and operating systems separate the stack from the heap, storing each in a different region of virtual memory, and managing the two regions differently. Outline what data is stored on the stack and what data is stored on the heap. Explain why it is necessary to store the stack and the heap in separate regions of the virtual address space. [6]
Answer
Typically the stack consists of stack frames, each of which contains local context for some scope of the program. A good example is a function call: a new stack frame is pushed, and all of the function's local variables are stored there, on the stack. When the function returns, the stack frame is popped and its memory is reclaimed. Data on the stack also needs to be static in size, as the stack isn't designed to allow dynamic resizing of frames.
In contrast, heap memory stores data that must outlive the scope in which it was created and therefore can't be stored on the stack. In addition, dynamic memory structures like vectors or linked lists need to be stored on the heap, since they may need to take up increasingly more memory over time. The two must live in separate regions of the virtual address space because they grow differently: the stack grows and shrinks contiguously as frames are pushed and popped, while heap blocks are allocated and freed in arbitrary order, so separating the regions lets each grow without colliding with the other and be managed with a different policy.
1b)
Question
One approach to automatic management of heap memory uses reference counts attached to each object. The runtime increments the reference count when a new reference to the object is created, and decrements it when a reference is removed. The memory allocated to an object is reclaimed when the reference count of the object is decremented to zero. Briefly outline the main benefits and problems inherent in using reference counting as a means of automatic memory management. [4]
Answer
The main benefit of reference counting (which I'll refer to as RC) is that, while manual memory management requires significant effort to ensure memory doesn't leak and is relatively error-prone, RC takes away most of that complexity for much less overhead than tracing garbage collection. In addition, the overhead RC does introduce is far more predictable than a garbage collector's.
A core disadvantage is the bookkeeping cost: every time a reference is created or destroyed, a count must be updated, which is expensive for heavily shared objects, and dropping the last reference to the root of a large structure triggers a cascade of decrements and deallocations, especially in systems with lots of "live" nodes.
Another key disadvantage is that RC cannot reclaim cyclic references, which limits the programmer's choice of data structures: a naive doubly linked list, where neighbouring nodes count references to each other, will leak unless weak (non-counted) references are used.
1c)
Question
In the safe subset of the Rust programming language, a data item of type T can be accessed through one of two kinds of reference, represented as &T and &mut T. State what is the difference between these two kinds of reference. Describe the rules and restrictions around the existence of multiple references to the same data for each kind of reference. [4]
Answer
Rust provides two kinds of reference: &T, a shared (immutable) reference, which gives read-only access to the value it borrows, and &mut T, an exclusive (mutable) reference, which allows you to modify the value stored there. Neither takes ownership; both merely borrow the value for some lifetime.
Rust has strict rules on how you can use these references: at any given time you can have any number of immutable references or exactly one mutable reference, but not both. While the mutable reference exists, the value is exclusively borrowed, and no immutable references to it can be created or used.
1d)
Question
With the aid of an example, explain why the restrictions on references discussed in part (c) of this question make it impossible to write certain classes of program using the safe subset of Rust. Discuss what is gained as a result of imposing such restrictions, and whether you think such benefits outweigh the costs of the restrictions. [6]
Answer
Conceptually, the easiest example of the limitations this imposes is multithreaded applications.
Say I have a reference to some storage object, like a bank transaction log. In theory, if I have four threads spawned all making "transactions", I would (at least naively) want four mutable references to that storage object, which Rust's rules forbid.
The result of this is two-fold. On one hand, we are physically unable to induce a data race in this scenario, as we can never have two separate actors trying to impose a change on the object at the same time. On the other, it complicates the design: a common solution is to give each thread a message channel through which it sends its "transactions" to a central thread, which is the only actor responsible for accessing the storage object.
While we don't run the risk of data races, we now have different concerns, such as threads sending messages faster than the central thread is able to process them.
Question 2
2a)
Question
It’s common for programming languages to support concurrency by providing multiple threads of execution within a single address space, along with locks to control access to shared mutable data. This is the model adopted by the C programming language with the pthreads library, and by Java, for example. Discuss what are the problems inherent in this programming model, considering in particular correctness of the resulting code and composition of operations. Use pseudocode fragments to provide examples that illustrate key points. [8]
Answer
A core challenge is that while individual components may be correct in isolation, their composition can lead to unintended consequences that cause the program to hang without crashing.
The most common examples of this are deadlock and livelock.
Consider a function:

def fun(first_acc, second_acc):
    lock(first_acc)
    first_acc += 15
    lock(second_acc)  # we have to wait for the second_acc lock
    second_acc += 15
    unlock(second_acc)
    unlock(first_acc)

thread1 = thread.spawn(fun, account_a, account_b)
thread2 = thread.spawn(fun, account_b, account_a)
In this naive example, thread1 locks account_a, then waits for account_b before it unlocks account_a. The problem is that account_b is locked by thread2, which won't release it until thread1 releases account_a. Neither condition can ever be satisfied, and these threads will never make progress.
A livelock is similar, but instead of the threads blocking, they keep reacting to each other in a cycle: for instance, ownership of a lock bounces back and forth between thread1 and thread2 with no real progress in the core functionality.
Other problems stem from the visibility of shared memory between threads. Each core has its own cache, which is not implicitly synchronized on every action by every thread. In Java, for example, writes to non-volatile 64-bit variables (long and double) are not guaranteed to be atomic or visible to other threads until both threads synchronize on the same lock. A thread can therefore act on a view of memory that isn't necessarily true, making the system non-deterministic.
2b)
Question
As an alternative to the thread-based programming model, some languages offer support for concurrency via transactions with automatic roll-back and retry, or through message passing. Systems that use message passing can be further subdivided into those that avoid concerns about shared mutable state by making messages immutable, and those that avoid such problems by tracking ownership to prevent shared access of mutable data. Such languages and systems were discussed in the lectures, and in the recommended reading for the course. Discuss the advantages and disadvantages of each of these three approaches for systems programming. Justify your answer, stating which you consider to be the most promising approach for improving systems programming; if you think none are promising, explain why. [12]
Answer
Transaction-based concurrency, such as Haskell's STM (Software Transactional Memory), offers a novel approach to managing concurrent state. The idea is that operations within a transaction execute atomically – they either all succeed or are automatically rolled back and retried upon conflicts. This eliminates many traditional concurrency hazards like deadlocks.
However, STM faces significant challenges in systems programming contexts. The primary issue concerns IO operations, which cannot be simply rolled back. For example, once a system has written to a file, sent a network packet, or updated a hardware register, these actions cannot be “undone” in a meaningful way. Haskell addresses this by requiring IO operations to be kept outside STM transactions, executing them only after a transaction successfully completes. This creates a complicated boundary between transactional and non-transactional code that developers must carefully manage.
STM also introduces overhead from conflict detection and rollbacks, and relies on guarantees (such as the strict separation of IO) that are fundamentally impractical in most languages.
Message passing systems with immutable messages were popularized by Erlang. This model eliminates shared mutable state entirely by ensuring that all communication happens through complete copies or immutable references, which immediately prevents most race conditions and data corruption issues.
The immutable approach offers fault isolation – errors in one component cannot corrupt data used by others. It also simplifies reasoning about program behavior since data, once received, cannot be unexpectedly modified by concurrent processes. This model extends naturally to distributed systems, as the immutability principle works equally well across process and machine boundaries.
The primary disadvantage is potential performance overhead from copying data, though this is often mitigated through implementation techniques like persistent data structures that share memory efficiently while preserving immutability semantics. Still, for very large data structures or performance-critical paths, the copying overhead can be significant.
The third approach, mutable message passing with ownership tracking, is exemplified by Rust's concurrency model. Rather than making data immutable, it ensures that mutable data has exactly one owner at any time. When data is sent between threads, ownership transfers entirely, preventing concurrent access to mutable state without requiring immutability.
This approach offers excellent performance characteristics, avoiding unnecessary copying while still preventing race conditions and data corruption. It works particularly well for large data structures where copying would be prohibitively expensive. The ownership transfer is verified at compile time, eliminating runtime overhead for these safety checks.
The downsides include a steeper learning curve and increased development complexity. Rust’s ownership system, while powerful, requires developers to think explicitly about data ownership and lifetimes. This approach also doesn’t extend as naturally to distributed systems, where ownership across machine boundaries becomes more challenging to enforce.
Of these three approaches, I believe ownership-tracked mutable message passing shows the most promise for systems programming specifically. While all three models offer significant improvements over traditional threads and locks, Rust's approach is particularly well suited to ensuring correctness through compile-time ownership checks while keeping runtime overhead as low as possible. The ownership model addresses the core issues of shared mutable state without imposing the performance costs of immutability or the runtime overhead of STM.
Immutable message passing remains a great choice, however, for distributed systems and applications where simplicity and fault isolation outweigh raw performance concerns. STM, while elegant, faces too many practical challenges with IO operations and performance to be the primary concurrency model for general systems programming.
Question 3
3a)
Question
What is meant by the concept of a strong type system in programming languages? [2]
Answer
A strong type system strictly enforces rules on how different types interact: it prevents operations on incompatible types without explicit conversion, and generally disallows implicit coercions and unchecked casts of the kind C permits. This is done to produce more reliable code.
3b)
Question
Discuss whether it is possible, and a good idea, to write a complex systems program, such as an operating system, low-level device driver, or network protocol stack, entirely in a strongly typed programming language. Justify your answer. Give examples of any operations, features, or behaviours that you believe make it difficult to write an entire system in a strongly typed language. [8]
Answer
This answer requires a caveat: while operating systems like Redox OS have been created entirely in a strongly typed programming language, it's potentially disingenuous to say the entire system is strongly typed. Systems interacting with hardware typically use direct register access, bit manipulation, and raw memory manipulation, which shouldn't be allowed by a strong type system; Rust, for example, makes this possible by wrapping all such interactions in an unsafe {} block.
As for whether this is sensible, it also depends. For many systems, such as an operating system, the extra safety and robustness are certainly desirable. However, while languages like Rust aim for zero-cost abstraction, strong typing makes zero-copy conversion and processing impossible in the safe subset (again, with the caveat of being possible in unsafe blocks), which naturally increases code size and memory usage. These increases are minimal and go unnoticed in many systems, but many low-level systems operate with extremely minimal memory and storage, where they become much more noticeable.
3c)
Question
We discussed the type-driven development approach to designing and implementing systems programs. Describe what is meant by this approach. [8]
Answer
Type-driven development is an approach to programming that first breaks a system down into the common types and states it uses, before directly considering how to implement its functions.
For instance, think of an login system. We can define out states clearly as an enum:
enum AuthState {
    Unauthorized,
    Authorized(User),
}

struct User {
    name: Name,
    email: EmailAddress,
}

struct Name {
    first_name: String,
    middle_names: Option<String>,
    last_name: String,
}

...
And we can use the defined types to handle data easily and enforce constraints. The code also self-documents: we know exactly what each string inside our Name represents, and how it is used in User.
This means we can directly use these types in state transitions to make illegal states unrepresentable:
As we see here, login only works when we're in the Unauthorized state, and it consumes our state, giving us an Authorized one instead.
This all lets the compiler detect when we're doing anything that would directly cause errors, making our code less error-prone.
3d)
Question
Type-driven development can be viewed as imposing a relatively high up-front cost, by pushing the programmer to resolve certain design decisions and constraints early in the process. Discuss to what extent you think this is beneficial, or whether it’s desirable to leave some decisions until later in the process—even, potentially, at the cost of leaving some inconsistencies unresolved in the system. [2]
Answer
In general, it's potentially naive for most systems to complain about the increased up-front cost imposed by type-driven development: it creates a more maintainable system for future development, and any time saved by skipping the approach is typically spent later fixing the errors it would have prevented, to say nothing of the cost of errors and crashes in potentially critical systems.