# Modeling a tracing, mark-and-sweep GC
What do we need to change about our Alloy model to instead model a simple mark-and-sweep tracing GC?
:::success
What metadata do we now need on heap objects?
:::
<details>
<summary>(Think, then click!)</summary>
<br>
We need to track whether a heap object is marked. Rather than adding a boolean to each heap object, we can just track this globally with a `Marked` set, similar to our previous `Allocated` sig.
```java
sig HeapObject extends Memory {}
var sig Allocated in HeapObject {}
// Marked: heap object is in the set if marked.
// Variable: will change over time.
var sig Marked in HeapObject {}
```
</details>
<br>
:::success
The stencil code for for reference counting has 4 predicates: `heapChangedByProgrammer`, `referenceCountingFrees`, `referenceCountingCascade`, and `doNothing`.
Which of these do we need to only slightly update for a mark-and-sweep GC, and which do we need to replace entirely? What should we replace them with?
:::
<details>
<summary>(Think, then click!)</summary>
<br>
`heapChangedByProgrammer` and `doNothing` need to be only slightly updated.
Both should be updated to include the new variable sig, `Marked`. In both, we can just say the marked set does not change:
```java
Marked' = Marked
```
Because the marked set is only used by the active steps of the garbage collector, here we could also just say the set should be empty in these steps:
```java
no Marked
```
The other two steps, `referenceCountingFrees`, `referenceCountingCascade`, should be replaced.
What would be a good model for what to replace them with?
The natural two steps in the algorithm: `mark` and `sweep`.
</details>
<br>
### New `mark` and `sweep` transition predicates
```java
pred mark {
// Think about var sigs: references, Allocated, Marked
references' = references
// The allocated set does not change in this first pass
Allocated' = Allocated
// Don't call predicates, just say what marked should be
// Reachable tells us: ^references
Marked' = RootSet.^references
}
pred sweep {
// Assume this happens immediately after mark
// Think about var sigs: references, Allocated, Marked
no Marked'
// What stays allocated in the next step is what is marked in the current step.
Allocated' = Marked
// How should we change the references set.
// Hint: it should potentially shrink
//
// Subtract: outgoing edges from free, incoming edges to free
references' = references - ((HeapObject-Marked)->Memory) - (Memory -> (HeapObject-Marked))
// Same as:
references' = references & ((Marked + RootSet)->Marked)
// Same as:
references' = references & ((Allocated' + RootSet)->Allocated')
}
```
# Addition to make it easier to see steps
One way to make it easier to identify which step the trace is currently taking is to add an explicit sig marking the options for steps:
```java
// Abstract because we only want specific subsigs
abstract sig Step {}
// One because each is a single category of step.
// "extends" because each is mutually exclusive.
one sig Programmer, Mark, Sweep, DoNothing extends Step {}
// Variable, single sig tracks which step we are currently in.
// "in" because it is not mutually exclusive.
var one sig WhichStep in Step {}
We then constrain WhichStep within each predicate, for example:
pred heapChangedByProgrammer {
Programmer = WhichStep
// ...
}
```
Now, the visualizer will always show which step is the current step within a trace, which can make debugging easier. This is an addition that might be helpful if you use Alloy for your final project.
# Partial mark and sweep model from class
```java
/*
Goal: model tracing GC (mark and sweep) for safe, automatic memory management.
*/
abstract sig Memory {
// Variable, since the reference graph changes over time as the programmer
// modifies the heap.
var references: set HeapObject
}
// references: relation on Memory->HeapObject
// For simplicity, the root set is a single sig with potentially many outgoing
// edges in the references relation.
one sig RootSet extends Memory {}
sig HeapObject extends Memory {
// Successor in memory order
succ: lone HeapObject
}
fact {
// succ is a linked list
one a: HeapObject | all b: HeapObject - a | b in a.^succ
no iden & succ
}
// Objects not in Allocated are freed
// Variable, since what is allocated vs freed changes over time.
var sig Allocated in HeapObject {}
// Variable, since what is marked changes over time
var sig Marked in HeapObject {}
abstract sig Step {}
one sig Programmer, Mark, Sweep, DoNothing extends Step {}
var one sig WhichStep in Step {}
------------------------- Helper predicates -----------------------------
// Takes the var fields as arguments, so we can use this pred "now" and "after"
pred reachableFromRootSet[o: HeapObject, refs: Memory->HeapObject] {
o in RootSet.^refs
}
// Soundness: memory safety: no reachable memory has been freed.
pred memorySafe[alloc: Allocated, refs: Memory->HeapObject] {
// Everything reachable from the root set is still allocated (not free)
all o: HeapObject | reachableFromRootSet[o, refs] => o in alloc
}
// Completeness: no leaks in the given step (we've freed everything that is
// no longer reachanble).
pred noLeaks[alloc: Allocated, refs: Memory->HeapObject] {
// Everything we have not freed is actually reachable. No leaked, no longer
// reachable memory.
all o: HeapObject | o in alloc => reachableFromRootSet[o, refs]
}
------------------------- Provided facts -----------------------------
// Valid traces following the pattern we have seen before, allowing do nothing here.
fact validTrace {
init
always
(heapChangedByProgrammer or
mark or
sweep or
doNothing)
always (mark => after sweep)
always (sweep => before mark)
always (heapChangedByProgrammer => (after mark))
}
------------------------- State change predicates -----------------------------
// Starting condition: there are no leaks or memory safety issues to begin with,
// and everything is reachable.
pred init {
}
pred heapChangedByProgrammer {
// Number of total allocated object can increase, but not decrease
Allocated in Allocated'
// Nothing reachable is freed at this point
all o: HeapObject | reachableFromRootSet[o, references'] => o in Allocated'
// No changed to marked set
Marked' = Marked
WhichStep = Programmer
}
pred mark {
Marked' = RootSet.^references
Allocated' = Allocated
references' = references
WhichStep = Mark
}
pred sweep {
references' = references - ((HeapObject-Allocated')->HeapObject)
Allocated' = Marked
no Marked'
WhichStep = Sweep
}
pred doNothing {
}
------------------------- Pre-written change logic -----------------------------
-------------------- Predicates and runs to show instances -----------------------
pred fragmented {
-- there is some allocated heap object
some o: HeapObject | {
o in Allocated
-- with an unallocated object before and after it
some opred, osucc: HeapObject {
opred not in Allocated
osucc not in Allocated
o in opred.^succ
o.succ = osucc
}
}
}
//run {heapChangedByProgrammer and eventually (fragmented) } for 6
run { eventually fragmented } for 5
```