# Moving GC Extensions to Julia's Foreign Function Interface
## Introduction
Easy interoperability with C is a fundamental feature of Julia. It is common for users to allocate a data structure in Julia and pass it to C, where it is accessed by foreign code.
However, interoperating with C in a garbage-collected language like Julia presents challenges, since any object accessed by foreign code must remain alive and accessible through its original address for the duration of the foreign function’s execution - and in some cases, even longer, as discussed in contrived scenarios below.
Julia addresses this through the `GC.@preserve` macro, which defines a lexical scope within which the referenced objects are guaranteed to remain alive.
While this approach suffices under Julia’s current non-moving garbage collector, it becomes inadequate in the presence of a moving GC. A moving GC requires an additional guarantee: objects passed to foreign code must not be relocated, so that their addresses remain valid even if a collection occurs during the foreign function's execution.
This document proposes extensions to Julia’s foreign function GC API to ensure compatibility with a moving garbage collector.
## `GC.@preserve`: Rooting Objects Within a Scope For Foreign Function Calls
As mentioned above, users can allocate a data structure in Julia and pass it to C, where it will be accessed by foreign code. This pattern requires ensuring the data structure remains alive, which is currently achieved using `GC.@preserve`:
```Julia=
mutable struct MyObject
    x::Int
    y::Union{Nothing,MyObject}
end

function call_c_with_object(obj::MyObject)
    GC.@preserve obj begin
        ptr = Ptr{Cvoid}(Base.pointer_from_objref(obj))
        ccall((:my_c_function, "libmyclib"), Cvoid, (Ptr{Cvoid},), ptr)
    end
end
```
This idiom guarantees that `obj` remains alive within the `GC.@preserve` scope and, therefore, that `ptr` does not become a dangling pointer.
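The same idiom can be tried without a custom C library. The following is a minimal sketch, assuming libc's `memcpy` is resolvable in the running process (as it is on typical platforms); the helper name `copy_via_c` is hypothetical and not part of the proposal.

```julia
# Copy `src` into a fresh buffer by handing raw pointers to libc's memcpy.
# `copy_via_c` is a hypothetical helper used only for illustration.
function copy_via_c(src::Vector{UInt8})
    dst = Vector{UInt8}(undef, length(src))
    # Both arrays must stay alive while C holds their raw addresses.
    GC.@preserve src dst begin
        ccall(:memcpy, Ptr{Cvoid}, (Ptr{Cvoid}, Ptr{Cvoid}, Csize_t),
              pointer(dst), pointer(src), length(src))
    end
    return dst
end

result = copy_via_c(UInt8[1, 2, 3, 4])
```

Here the raw pointers are taken inside the `GC.@preserve` block, so they are only used while both arrays are guaranteed to be alive.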
## An Insufficient Extension for a Moving GC: Transitively Pinning Objects Within the GC.@preserve Scope
Note that in addition to the liveness properties described above, we must also ensure that the addresses of objects remain valid throughout the execution of the foreign function. In a moving GC, this is achieved by preventing the relocation of any object whose address is passed to C - for example, by pinning it.
Since foreign C code may access objects a few edges away from the original one by following pointers, pinning only the original data structure is not sufficient. We must also pin any objects that are transitively reachable from it.
The most straightforward extension of `GC.@preserve` for use with moving GCs would be to transitively pin all objects passed to the macro. That is, in the example above, `obj.y` would be pinned - not just the original `obj` passed to C.
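To illustrate which objects transitive pinning would have to cover, the sketch below walks an object graph by reflection and collects every mutable object reachable from a root. The names `ListNode` and `reachable_objects` are hypothetical, and the traversal is only a model of the idea, not how a runtime would implement it.

```julia
mutable struct ListNode
    x::Int
    next::Union{Nothing,ListNode}
end

# Hypothetical helper: collect every mutable object reachable from `root`
# by following fields. These are the objects transitive pinning would cover.
function reachable_objects(root)
    seen = IdDict{Any,Nothing}()          # identity-keyed visited set
    worklist = Any[root]
    while !isempty(worklist)
        obj = pop!(worklist)
        if ismutable(obj)
            haskey(seen, obj) && continue # already visited (handles cycles)
            seen[obj] = nothing
        end
        for i in 1:nfields(obj)
            isdefined(obj, i) && push!(worklist, getfield(obj, i))
        end
    end
    return collect(keys(seen))
end

leaf = ListNode(2, nothing)
root = ListNode(42, leaf)
objs = reachable_objects(root)            # contains both `root` and `leaf`
```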
## The Pathological Case: Objects Whose Pointers Are Stored in Native C Code
While the solution outlined above works in most cases, it doesn't apply to all scenarios. This is because a foreign C function might store a pointer to a Julia data structure and attempt to access it even after returning control to Julia - for example, if the C code is multi-threaded, or on a subsequent call to the C library.
Let's consider the C code below:
```=C
#include <pthread.h>
#include <stdio.h>
#include <unistd.h> // For sleep

typedef struct {
    long x;
    void *y; // Pointer to another MyObject
} MyObject;

static MyObject *stored = NULL;
static pthread_t thread;

void store_myobject(MyObject *obj) {
    stored = obj;
}

void *background_reader(void *arg) {
    for (int i = 0; i < 10; i++) {
        if (stored) {
            printf("MyObject.x = %ld\n", stored->x);
        }
        sleep(1);
    }
    return NULL;
}

void start_background_thread() {
    pthread_create(&thread, NULL, background_reader, NULL);
}
```
and the corresponding Julia wrappers and callers:
```=Julia
# Global table to root objects
const pinned_objects = IdDict{Any, Nothing}()

mutable struct MyObject
    x::Int
    y::Union{Nothing,MyObject}
end

# Provide unsafe_convert for C interop
Base.unsafe_convert(::Type{Ptr{Cvoid}}, obj::MyObject) =
    Base.unsafe_convert(Ptr{Cvoid}, pointer_from_objref(obj))

# Declare C bindings
function store_myobject(obj::MyObject)
    ccall((:store_myobject, "libmyobject.so"), Cvoid, (Ptr{Cvoid},), obj)
end

function start_background_thread()
    ccall((:start_background_thread, "libmyobject.so"), Cvoid, ())
end

# Safe version: Keep the object reachable
function run_test_safe()
    leaf = MyObject(2, nothing)
    root = MyObject(42, leaf)
    # Add to global table to keep it alive
    pinned_objects[root] = nothing
    pinned_objects[leaf] = nothing # optional, GC won't collect `leaf` if `root` points to it
    store_myobject(root)
end

# Start background thread and test
start_background_thread()
run_test_safe()

# Let the thread read safely
sleep(20)
```
The fundamental issue in this example, which makes `GC.@preserve` unsuitable, is that `root` may be accessed by the C code even after `run_test_safe` has returned and its lexical scope has ended.
To prevent `root` from becoming a dangling pointer, we use the global `pinned_objects` table to ensure that the object remains alive for the duration of the foreign code.
**If we were using a moving GC, we would need to ensure not just that `root` was alive, but that its address remained valid for as long as the C code might access it**. However, the GC lacks the context to recognize that `pinned_objects` holds references to objects passed to C and must therefore avoid moving them.
We need an additional mechanism to inform the GC that it must not move certain objects that will be accessed by foreign code.
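To make the address-stability requirement concrete, the sketch below records an object's address, forces a collection, and compares the two values. Under today's non-moving collector the addresses match; under a moving GC nothing would guarantee this unless the object were pinned. The names `Cell` and `address_stable_across_gc` are hypothetical.

```julia
mutable struct Cell
    x::Int
end

# Hypothetical helper: does `obj` keep its address across a forced collection?
function address_stable_across_gc(obj)
    GC.@preserve obj begin
        before = UInt(pointer_from_objref(obj))
        GC.gc()                               # force a full collection
        after = UInt(pointer_from_objref(obj))
        return before == after
    end
end

kept = Cell(42)
stable = address_stable_across_gc(kept)       # holds on the current non-moving GC
```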
## Extensions for moving-GC–safe FFI in Julia
We introduce `increment_pin_count!` and `decrement_pin_count!` to explicitly preserve and pin objects beyond a lexical scope. `increment_tpin_count!` and `decrement_tpin_count!` do the same for transitive pinning.
These functions respectively increment and decrement an object’s pin count (the number of times the object has been pinned). When the count drops to zero, the object is no longer pinned and is no longer kept alive by the pin; it may then be moved, or collected if no other reference to it exists.
````
"""
    increment_pin_count!(obj)

Increment the pin count of `obj` to preserve it beyond a lexical scope.
This ensures that `obj` is not moved or collected by the garbage collector.
It is crucial for safely passing references to foreign code.

Each call increments the pin count by one. The object remains pinned and
alive until the count is decremented to zero via `decrement_pin_count!`.

# Examples
```julia
x = SomeObject()
increment_pin_count!(x) # x is now pinned (count = 1)
increment_pin_count!(x) # pin count is now 2
```
"""
````
````
"""
    decrement_pin_count!(obj)

Decrement the pin count of `obj`.

When the count drops to zero, the object is no longer pinned and may be
moved (or collected by the garbage collector, if no other references
exist).

This is necessary to release objects that were previously preserved for
foreign code.

# Examples
```julia
x = SomeObject()
increment_pin_count!(x) # x is now pinned (count = 1)
increment_pin_count!(x) # pin count is now 2
decrement_pin_count!(x) # reduces pin count to 1
decrement_pin_count!(x) # count is 0; x may now be collected
```
"""
````
````
"""
    increment_tpin_count!(obj)

Increment the transitive pin count of `obj` to preserve it beyond a
lexical scope.

This ensures that `obj` and any other objects reachable from it are not
moved or collected by the garbage collector. This is crucial for safely
passing references to foreign code.

Each call increments the transitive pin count by one. The object remains
transitively pinned and alive until the count is decremented to zero via
`decrement_tpin_count!`.

# Examples
```julia
x = SomeObject()
increment_tpin_count!(x) # x is now transitively pinned (count = 1)
increment_tpin_count!(x) # transitive pin count is now 2
```
"""
````
````
"""
    decrement_tpin_count!(obj)

Decrement the transitive pin count of `obj`.

When the count drops to zero, `obj` and any objects reachable from it are
no longer pinned and may be moved (if no other pins exist) or collected by
the garbage collector (if no other references exist).

This is necessary to release object graphs that were previously preserved
for foreign code.

# Examples
```julia
x = SomeObject()
increment_tpin_count!(x) # pins x and reachable objects (count = 1)
decrement_tpin_count!(x) # reduces transitive pin count to 0
# objects may now be collected
```
"""
````
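The counting behaviour described in these docstrings can be modelled with an identity-keyed table. The following is a sketch of the semantics only: every `mock_`-prefixed name is hypothetical, the table can keep objects alive on today's collector but cannot pin them, it is not thread-safe, and throwing on an unbalanced decrement is an assumption rather than something the proposal specifies.

```julia
# Hypothetical model: identity-keyed table mapping each pinned object to its count.
const MOCK_PIN_COUNTS = IdDict{Any,Int}()

function mock_increment_pin_count!(obj)
    MOCK_PIN_COUNTS[obj] = get(MOCK_PIN_COUNTS, obj, 0) + 1
    return nothing
end

function mock_decrement_pin_count!(obj)
    n = get(MOCK_PIN_COUNTS, obj, 0)
    # Assumption: an unbalanced decrement is treated as an error.
    n > 0 || throw(ArgumentError("object is not pinned"))
    if n == 1
        delete!(MOCK_PIN_COUNTS, obj)     # last pin released: drop the root
    else
        MOCK_PIN_COUNTS[obj] = n - 1
    end
    return nothing
end

# Current count for `obj`; zero means the model holds no pin on it.
mock_pin_count(obj) = get(MOCK_PIN_COUNTS, obj, 0)
```

Removing the entry when the count reaches zero matters in this model: leaving a zero-count entry in the table would keep the object alive indefinitely.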
We also provide the functions `get_pin_count` and `get_tpin_count` to allow users to get the pin and transitive pin counts of each object:
````
"""
    get_pin_count(obj)

Return the current pin count of `obj`.

This indicates how many times `obj` has been explicitly pinned via
`increment_pin_count!`. A nonzero count means the object is currently
pinned and will not be moved or collected by the garbage collector (GC).

# Examples
```julia
x = SomeObject()
increment_pin_count!(x)
get_pin_count(x) # returns 1
increment_pin_count!(x)
get_pin_count(x) # returns 2
decrement_pin_count!(x)
get_pin_count(x) # returns 1
```
"""
````
````
"""
    get_tpin_count(obj)

Return the current transitive pin count of `obj`.

This indicates how many times `obj` has been explicitly transitively pinned
via `increment_tpin_count!`. A nonzero count means `obj` and all objects
reachable from it are currently pinned and will not be moved or collected
by the garbage collector (GC).

# Examples
```julia
x = SomeObject()
increment_tpin_count!(x)
get_tpin_count(x) # returns 1
increment_tpin_count!(x)
get_tpin_count(x) # returns 2
decrement_tpin_count!(x)
get_tpin_count(x) # returns 1
```
"""
````
Once we have these API extensions, the Julia snippet shown above would look like this:
```=Julia
mutable struct MyObject
    x::Int
    y::Union{Nothing,MyObject}
end

# Provide unsafe_convert for C interop
Base.unsafe_convert(::Type{Ptr{Cvoid}}, obj::MyObject) =
    Base.unsafe_convert(Ptr{Cvoid}, pointer_from_objref(obj))

# Declare C bindings
function store_myobject(obj::MyObject)
    ccall((:store_myobject, "libmyobject.so"), Cvoid, (Ptr{Cvoid},), obj)
end

function start_background_thread()
    ccall((:start_background_thread, "libmyobject.so"), Cvoid, ())
end

# Safe version: keep the object graph alive and pinned
function run_test_safe()
    leaf = MyObject(2, nothing)
    root = MyObject(42, leaf)
    # Transitively pin both (leaf is optional here, but safe)
    increment_tpin_count!(root)
    increment_tpin_count!(leaf)
    store_myobject(root)
end

# Start background thread and test
start_background_thread()
run_test_safe()

# Let the thread read safely
sleep(20)
```
In this particular example, we don’t actually decrement the objects' transitive pin count, but we could do so once Julia is informed that the foreign C code has finished (e.g., through a callback).
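One way such a completion callback could look is sketched below. It assumes a hypothetical C library that invokes a user-supplied function pointer with the object's address when it is done; here that call is simulated from Julia. `IN_FLIGHT`, `hold!`, `release!`, `Job`, `on_done`, and `simulate_completion` are all hypothetical names, and the table merely stands in for the proposed transitive pin count.

```julia
# Hypothetical stand-in for the proposed pin table; on today's non-moving GC
# it only keeps objects alive.
const IN_FLIGHT = IdDict{Any,Int}()

hold!(obj) = (IN_FLIGHT[obj] = get(IN_FLIGHT, obj, 0) + 1; nothing)

function release!(obj)
    n = get(IN_FLIGHT, obj, 0)
    n > 1 ? (IN_FLIGHT[obj] = n - 1) : delete!(IN_FLIGHT, obj)
    return nothing
end

mutable struct Job
    payload::Int
end

# Completion callback: foreign code would invoke it with the object's address.
function on_done(ptr::Ptr{Cvoid})
    release!(unsafe_pointer_to_objref(ptr))
    return nothing
end

# Simulate the C library signalling completion by calling the callback pointer.
simulate_completion(cb::Ptr{Cvoid}, addr::Ptr{Cvoid}) =
    ccall(cb, Cvoid, (Ptr{Cvoid},), addr)

job = Job(7)
hold!(job)                                   # stands in for increment_tpin_count!
was_held = haskey(IN_FLIGHT, job)
done_cb = @cfunction(on_done, Cvoid, (Ptr{Cvoid},))
simulate_completion(done_cb, pointer_from_objref(job))
still_held = haskey(IN_FLIGHT, job)          # false once the callback has run
```

Passing the object's address back through the callback is only safe because the object is held (and, under a moving GC, would be pinned) until the callback releases it.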
## On the Semantics of `GC.@preserve` under a Moving GC
In the extended FFI, `GC.@preserve` continues to preserve objects for the duration of its lexical scope. It additionally accepts an optional pinning mode, `:pin` or `:tpin`, which controls whether the preserved objects are pinned or transitively pinned.
Consider the snippet below, in which `my_other_c_function` accesses only `obj`, not its children. Passing a pinning mode `:pin` informs the GC that only `obj` needs to be pinned. This can reduce unnecessary pinning and allow the GC to move more objects.
```=Julia
function call_c_with_object(obj::MyObject)
    GC.@preserve obj :pin begin
        ptr = Ptr{Cvoid}(pointer_from_objref(obj))
        ccall((:my_other_c_function, "libmyclib"), Cvoid, (Ptr{Cvoid},), ptr)
    end
end
```
Semantically, when an object is passed to `GC.@preserve`, its pin count (or its transitive pin count, under `:tpin`) is incremented on block entry and decremented on block exit.
The pinning mode defaults to `:tpin` if it is not specified by the user.
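The entry/exit pairing can be modelled as a scoped helper that increments a count before running a block and decrements it afterwards, even if the block throws. This is a sketch under the assumption of a table-based model; `SCOPED_PINS` and `with_pinned` are hypothetical names, and the real macro would lower to runtime support rather than to a dictionary.

```julia
# Hypothetical model of per-object pin counts.
const SCOPED_PINS = IdDict{Any,Int}()

# Hypothetical helper mirroring the described semantics:
# increment on entry, decrement on exit (including exceptional exit).
function with_pinned(f, obj)
    SCOPED_PINS[obj] = get(SCOPED_PINS, obj, 0) + 1
    try
        return f(obj)
    finally
        n = SCOPED_PINS[obj] - 1
        if n == 0
            delete!(SCOPED_PINS, obj)
        else
            SCOPED_PINS[obj] = n
        end
    end
end

buf = [1, 2, 3]
inside = with_pinned(buf) do b
    get(SCOPED_PINS, b, 0)            # count observed inside the scope
end
outside = get(SCOPED_PINS, buf, 0)    # count observed after the scope ends
```

The `finally` clause is what makes the decrement unconditional, which matches the lexical-scope guarantee described above.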
## On the Semantics of `ccall` Argument Conversion under a Moving GC
If `cconvert` is implemented for a type, Julia allows users to directly pass the object to a `ccall` and [handles the conversion and rooting automatically](https://docs.julialang.org/en/v1/manual/calling-c-and-fortran-code/#automatic-type-conversion).
The example from [this section](https://hackmd.io/2-vWNswsSaK23Sb0d71dCg#GCpreserve-Rooting-Objects-Within-a-Scope-For-Foreign-Function-Calls) can be rewritten as:
```=Julia
# Provide unsafe_convert for C interop
Base.unsafe_convert(::Type{Ptr{Cvoid}}, obj::MyObject) =
    Base.unsafe_convert(Ptr{Cvoid}, pointer_from_objref(obj))

function call_c_with_object(obj::MyObject)
    ccall((:my_c_function, "libmyclib"), Cvoid, (Ptr{Cvoid},), obj)
end
```
In the extended FFI, automatic conversion of `ccall` arguments transitively pins the corresponding objects.
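For reference, the sketch below shows the two-step conversion that `ccall` performs today, written out by hand with `Base.cconvert`, `GC.@preserve`, and `Base.unsafe_convert`, next to the automatic form. It assumes libc's `strlen` is resolvable in the running process; `manual_strlen` is a hypothetical name. Under the proposal, the rooting step is where transitive pinning would additionally take place.

```julia
s = "hello"

# Automatic form: ccall converts and roots `s` for the duration of the call.
auto_len = ccall(:strlen, Csize_t, (Cstring,), s)

# Roughly what the automatic form does, written out by hand.
function manual_strlen(str::String)
    rooted = Base.cconvert(Cstring, str)            # Julia-side object to keep alive
    GC.@preserve rooted begin
        ptr = Base.unsafe_convert(Cstring, rooted)  # raw pointer, valid only in this scope
        return ccall(:strlen, Csize_t, (Cstring,), ptr)
    end
end

manual_len = manual_strlen(s)
```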
If the user knows their code doesn't access other objects in the graph by chasing pointers, they can prevent unnecessary pinning by following the pattern in [this section](https://hackmd.io/2-vWNswsSaK23Sb0d71dCg#On-the-Semantics-of-ccall-Argument-Conversion-under-a-Moving-GC). That is, they can: explicitly convert the object to a raw pointer (via `Base.pointer_from_objref` or `Base.pointer`), call the foreign function inside a `GC.@preserve` block, and pass the `:pin` mode to `GC.@preserve`.
## On the Semantics of `Base.pointer` and `Base.pointer_from_objref` under a Moving GC
These functions are provided in `Base` to allow users to obtain raw pointers from Julia objects. We expect to keep them unchanged and propose that users be responsible for ensuring the validity of objects whose pointer they take.
Users can achieve this either by calling `increment_pin_count!` before `Base.pointer` or `Base.pointer_from_objref`, or by calling `Base.pointer` or `Base.pointer_from_objref` within a `GC.@preserve` block.
This approach introduces an additional burden on users to manage memory correctly when calling foreign code. We considered several alternatives, such as changing the behavior of `Base.pointer` and `Base.pointer_from_objref` to automatically pin their arguments. However, one of the main challenges with this approach was determining when an object could be unpinned. For example, if an object were pinned during a call to `Base.pointer_from_objref`, it would be unclear when it should be unpinned unless we explicitly paired `Base.pointer_from_objref` with a corresponding unpinning call. We decided not to pursue this option.
## Backwards Compatibility
We expect the moving GC to remain behind an experimental feature flag for multiple years, and we do not anticipate it being fully compatible with existing Julia code.
Users who wish to run Julia with a moving GC and pass objects to C or obtain raw pointers (via `Base.pointer` or `Base.pointer_from_objref`) will need to use either `GC.@preserve` or the new pin-count functions (`increment_pin_count!`, `increment_tpin_count!`, and their `decrement_*` counterparts).
## Rollout Plan
If we introduce these APIs, we expect users to gradually adopt them as they begin running their code with the moving GC. Since enabling the moving GC will involve breaking changes, we may need to wait until most code has migrated to the new APIs - or until the release of Julia 2.0.
## Acknowledgements
XYZZY.