2019q1 Homework6 (gc) === contributed by < `martinxluptak` > * Complete the Mark & sweep garbage collection principle, explanation and code analysis required for the [7th week test (below)](https://hackmd.io/s/Hy8aBIgYV), cover the implementation of reference counting and explore existing defects; [I completed the quiz](https://hackmd.io/s/B1batJbF4) and started examining the ogc program. In the mark and sweep garbage collection, every object has an extra mark bit which is reserved for memory management. During the mark phase, the algorithm identifies which objects are traceable. It does this by setting their reserved mark bit to 1. All other objects are the garbage objects - these objects can not be referenced from memory and therefore do not need to occupy memory anymore. During the sweep phase, the algorithm scans the heap for all objects which have their mark bit set to 0. These objects were not reached during the mark phase, these objects are freed. After the garbage collection, all mark bits are set back to zero to reset the state. Let's take a look at some of ogc's functions: * `gc_internals.h`: This file contains the definition for the global GC object. The global GC object is the whole memory management structure as opposed to a single user-allocated block of data. Let's examine it: ``` struct __gc { void *stack_start; gc_list_t *ptr_map[PTR_MAP_SIZE]; size_t ptr_num; size_t limit; size_t ref_count; uintptr_t min, max; gc_list_t *globals; }; typedef struct __gc gc_t; /* global GC object */ extern gc_t __gc_object; ``` Inside this object is a pointer map `gc_list_t *ptr_map[PTR_MAP_SIZE]`. It stores pointers to `gc_list_t` which is a singly linked list. `min` `max` point to the lowest/highest address assigned by malloc. `ptr_num` increments/decrements when objects are added/freed. `limit` is the treshold of added objects after which a garbage collection is ran. `ref_count` stores the number of places (i.e. processes) that can reach the global GC object - using it as your preferred memory allocation is always initiated through `gc_init()` and ends with `gc_destroy()`. When the last `gc_destroy()` is called, the whole memory structure is freed (each linked list inside `ptr_map`). `globals` keeps a list of objects which are always marked during the marking process. It is not used in `tests/`, but I think the intended use is for (other) global objects which are stored on neither stack or heap or for the user to define some blocks they do not want to be freed for whatever reason. `void *stack_start` is the first address passed to `gc_mark` during the marking phase. * `alloc.c` defines alloc/free to be used by the user: `gc_alloc()` alllocates memory by calling malloc and assigns it a helper struct `gc_ptr_t` which tracks its size and also their reserved mark bit. the function then appends it to the global GC `ptr_map`. `gc_free()` searches for a piece of allocated memory in `ptr_map` and frees it if found. * `core.c` defines init/destroy mark and sweep functions as well as `gc_run()`. `gc_mark()` loops over reachable blocks of memory and marks them and nests itself in a DFS. * `utils.c` defines `gc_list_t * gc_ptr_index()` which is used to retrieve data about a block by inputting its address. `gc_mark_stack()` is the top level call which starts marking reachable stack objects. Interestingly, it uses this call to determine the stack's address range: ``` uint8_t tmp; gc_mark(__gc_object.stack_start, &tmp); ``` In this call, a reference to a new uninitialized `uint8_t` is passed to `gc_mark` to start marking. This `uint8_t` value always allocates at the end of the used stack address space, effectively providing the upper boundary address, as stack is always allocated sequentially (Is this transferable to different architectures?). * `list.c` provides list manipulation utilities, similar to those in [F02: list](https://hackmd.io/s/S12jCWKHN). I noticed the global GC object is actually a global variable, perhaps it is best to only have one such memory management structure per stack, but in a multi-threaded environment, more stacks are going to be present and `gc_mark_stack()` will break because of the aforementioned call: ``` uint8_t tmp; gc_mark(__gc_object.stack_start, &tmp); ``` `tmp` will be allocated on current thread's stack whereas `__gc_object.stack_start` will use an address from the thread on which the global GC object was first instantiated (first `gc_init()` call). Mutexes/atomics need to be used to prevent other threads from writing to memory during garbage collection, as there can be discrepancies in contents of the global GC object. Apart from this, the stack parallelism issue needs to be addressed somehow: * allocate one `ptr_map` for each stack, during garbage collection only traverse one of the maps. This is perhaps for the best, rather than accessing stacks of other threads. Still, threads can pass each other heap accessors, and reference count can not be easily counted accross threads. * Bind stack in a different way, without relying on values in global GC object's `stack_start` or `min`, `max`. * Remember a `stack_start` for each thread and only run `gc_mark_stack()` on current stack. [ztex proposed this solution](https://hackmd.io/s/r19NJfF3N), although, as is case in the multi-`ptr_map` solution, if threads share heap data with each other, reaching into freed memory might occur. [Here is my attempt to parallelize OGC. Threading is still unsafe.](https://github.com/martinxluptak/ogc) Best to take inspiration from C++11 which features three standard smart pointer classes to tackle reference counting / garbage collection: * `unique_ptr<T>` wraps a pointer to a value and guarantees that there is only one pointer at a time to that value. You cannot copy `unique_ptr<T>` instances around (only move them), and when the `unique_ptr<T>` instance is destroyed, the underlying pointer is deleted. * `shared_ptr<T>` wraps a pointer to a value and a reference count, allowing multiple locations in the program to point to the same value at the same time. You can copy `shared_ptr<T>` instances around, and when the last `shared_ptr<T>` instance with a given pointer is destroyed, the underlying pointer is deleted. * `weak_ptr<T>` wraps a pointer to a value and can be converted to a `shared_ptr<T>`, temporarily (using the lock() method). However, while you have only a `weak_ptr<T>` instance, the underlying pointer can be deleted (because there are no “strong” pointers to it), and then your attempt to convert it to a `shared_ptr<T>` will fail gracefully.