# CodeInstance, MethodInstances, CodeInfo, world ages, and more

Let's take a look at each of these internal representations of code, and document which of the fields are relevant to each consumer. This is not necessarily the current state, but rather the intended future state.

## Definitions

First we have `Method`: this seems fairly self-explanatory. It represents a user-defined dispatch entry target.

Next we have `MethodInstance`: this is the pair of details `(Method, specialization types)`, which uniquely defines each object by egal and type-equality.

Next we have `CodeInfo`: this describes a precise input to codegen. It might be unspecialized (just the direct output of code lowering) or optimized. It contains the statements of the function, and some metadata for each of the statements. It may contain some extra data unused by codegen, but generally that data may not be available.

Next we have `IRCode`: this represents a superset of the information in `CodeInfo`, but in particular represents it in an exploded way that is easier to mutate. The `CodeInfo` representation is instead optimized for size when compressed, and is not easy to mutate directly (though it can be approximately converted back to an `IRCode` with some effort).

And finally we have `CodeInstance`: this represents an executable chunk of code OR an inference edge. It will be important to know which fields are valid for each use case, so this will be refined further below.

A `CodeInstance` has both "extrinsic" and "intrinsic" properties (using the terms as defined here: https://hackmd.io/@Og2_pcUySm6R_RPqbZ06JA/S1bqP1D_6), which are as follows.

The "extrinsic" properties are:

- `MethodInstance`: The cache that contained this. This might be the same as the `def` field (as is guaranteed now), or perhaps might be a subtype of `def` in the future.
- `owner` (e.g.
the cache_owner from AbstractInterpreter): the particular inference configuration that generated this, or the token `nothing` for the builtin rules
- `min_world`/`max_world`: this inclusive range defines when the intrinsic properties are valid
- `edges` (not yet present): This collects the inference predicates that fed into defining this. It is an exact dual of the `worlds`, in that the `worlds` can be computed from the `edges` and the `edges` could be computed from the `worlds`:
  - Computing `worlds` from `edges` is the intersection of the `worlds` in each of the `edges`.
  - Computing `edges` from `worlds` is inference, recording each of the intrinsic properties that influenced this result.

The "intrinsic" properties are most of the remaining fields (except for `next`, which is a structural property), such as `inferred`, `invoke`, `ipo_effects`, `rettype`, and so on. The list is not fixed, but rather is any property that we decide to add to inference to track. Of particular note: these fields do _not_ define a specialization axis, but rather are defined _by_ it. One particularly crucial consequence is that they can be narrowed at any time by an intersection over their lattice, provided that they remain compatible with the `worlds` fields--or the `worlds` can be widened, provided that they remain compatible with these properties. Of additional note: the `edges` list is also not part of the intrinsic properties, since it is equivalent to `worlds`. However, for implementation simplicity, we don't typically narrow these intrinsic properties, as those updates can race with another thread trying to widen the `worlds`. It is nonetheless legal to narrow these properties over their lattice, as long as that is consistent with the current (or wider) `worlds`, and there are cases where we will do so. Generally `edges` doesn't get changed, as there is no reason to add new edges, and intersection on its lattice is rather complicated.

Lastly there is a "debuginfo" field.
This seems to fall somewhere between being an intrinsic property and an edge, as it contains the metadata of what source code was inlined to create the "inferred" field contents (and thus also the "invoke" field contents). This makes it an extrinsic property like an edge, but an edge to the source code definition of that exact Method, rather than to just the inferred intrinsic properties.

Additionally, a new boolean field can be added to the `MethodInstance` that states whether it is currently the invoke target for its own signature in the `world_counter` world.

## Usage Steps

Now let's turn to looking at how these definitions connect in a simple program, then we will add details about concurrency and (de)serialization.

First the user defines some Methods, which get added to the global MethodTable (represented piecemeal in lookups from various `TypeName` fields).

Secondly the user calls one of these methods, resulting in selection of a Method by comparison of `morespecific`. This ends up being heavily cached in various places, such that this lookup in this particular world can be repeated quickly.

Next, the caching logic selects a particular `MethodInstance` to represent the desired monomorphization (this is usually, but not always, the same as what will be codegened later). Then it tries to invoke this MethodInstance, leading to the next phase: CodeInstance creation.

The cache then tries to locate a CodeInstance that is valid for executing in the current world. Failing that, it asks type-inference (`jl_type_infer`) to give it one. If inference fails to do that, it will create one without optimizations.

Each of these steps is mediated by the JIT engine (henceforth just called "engine", as it does not actually JIT or optimize code itself, but simply maintains the ownership of it), as follows. In the following description, the word "then" should be taken to imply an SCC graph ordering, not a linear ordering.
- The process first submits a notification to the engine for a right to create this CodeInstance. The engine will atomically determine if another thread has already started work on an equivalent request and give back a handle to that work. This will be repeated for each vertex in the graph, so the process will collect quite a number of these rights. These are not locks, as the presence of cycles in inference may lead to different threads needing to acquire these rights in a different order. Thus, in particular, failure to acquire one of these rights will need extra handling.
  - _Initially, the process will simply acquire the `codegen_lock` before asking for any rights and while holding any rights. This simplifies the initial implementation, although it prevents any parallelism from being realized yet._
- The process then runs inference, resulting in a cache of inference details for a graph of functions, in an internal representation (e.g. `InferenceResult`).
- The process then runs optimizations, resulting in a cache of optimizations, still in an internal representation.
- The process then releases those rights with the engine by trading each for an object known to the engine, such as a `CodeInfo`. In exchange, the engine gives it back a `CodeInstance` object containing the intrinsic data and worlds that identify it as an edge.
- The engine can be asked at any time later for a native pointer and will provide one. However, if it has lost the ability to generate one exactly for the specific intrinsic properties, the native pointers (both `invoke` and `specptr`) might be a trampoline that calls back to invoke on the MethodInstance and restarts this process later. It might even be an interpreter call. This is because it is always legal to redispatch from CodeInstance to MethodInstance, since the intrinsic properties are inherent to any execution in the given world, and not inherent to that particular copy of it (e.g. because invalidation of worlds is not possible).
- Alternatively, the process can instead release the rights without fulfilling them at all, or partially fulfill them, or fulfill them with only the exact content needed for the engine to create a `CodeInstance` (including llvm bitcode or native pointers). The process might not fulfill a right at all if LimitedAccuracy got returned, for example.
- The process might partially fulfill a right if it cannot get a later right that it needed to finish creating it. When a process only partially fulfills a request, it also may choose to give up any or all of the other rights it was holding and enter a waiting state. The next time a process tries to take that right, it can choose to steal as much of that partial work as it wants to. Eventually, when that right becomes available or has been fulfilled, the process will resume and can finish fulfilling any others that didn't get fulfilled.

## Edges

Everything up to this point is a description of the containers, so now let's turn our attention to the contents and the properties of those contents, otherwise known as "edges". So let's talk about what those do.

Edges represent the complete set of facts used by inference interprocedurally. Edges also represent the static call graph. These are almost the same, but not in all cases.

Previously, the former set of facts was only represented as backedges from a MethodInstance to other MethodInstances, as a summary of the CodeInstances. Additionally, the backedges only represented the set of facts that might be falsified later. Therefore, they exclude info that already has an upper bound on the usable world or which did not influence the final resulting code generated. Previously also, the latter was represented by Expr(:invoke) calls to a MethodInstance. Except during incremental precompile, when both used a third representation. Now, it would benefit us to try to unify those.
What I propose is this: The argument to `Expr(:invoke)` is changed from a MethodInstance to an integer index into the edges table, and commits codegen to emitting a particular call signature based on the intrinsic properties at that edge table index. The edges table is a list of CodeInstance objects (each of them typically in the cache, though not required). These serve several different purposes, and the valid fields to examine change based on the consumer (take care here!):

- When `codegen` examines a CodeInstance, from an `:invoke` target, it may use the `specTypes` and `rettype` fields as well as any other intrinsic properties it considers useful. Excluded are `inferred` and `invoke`. In particular, codegen does not in any way recurse.
- When `inference` examines a CodeInstance, from a cache lookup, it may use the same fields as `codegen`, plus `inferred`. In particular, it may recurse to recreate `inferred` also, if it chooses to.
- If the `optimizer` were to examine a CodeInstance (such as from a cache lookup or `:invoke`), it may use the same fields as `codegen`, which does not give it much useful or stable info. Instead, this information should typically be given to it by `inference`.
- When `serialization` examines a CodeInstance, it keeps the edges and intrinsic properties, but discards the `worlds`.
- When `method-add` examines a CodeInstance following backedges, it truncates the `max_world` field of any it finds that contains an edge intersecting with the new Method.
- When `deserialization` or re-validation examines a CodeInstance:
  - from serialization: it recomputes the valid worlds, for the given set of intrinsic properties, from the edges, if the edges are recursively still valid
  - from recursion: there must be found a provably equivalent set of edges, which can be proven in several ways:
    - the set of method lookup results is identical
    - or inference over the lookup has the same intrinsic properties, but has a new set of edges

### Edges Encoding

There are several other encodings possible in the table, as well as just a CodeInstance, which have slightly different meanings for "identical method lookup". In these cases, where it states a lookup of a CodeInstance, it means specifically the extrinsic properties (e.g. just the `MethodInstance`): for example, that computing `only(methods_by_ftype(ci.def.specTypes))` returns `ci.def.def::Method` in the lookup world.

- A CodeInstance alone means it is both the call target and specialization target. This is most common, as it works for all `call` where there is only one result and a rettype was successfully inferred.
- A type followed by a CodeInstance means it is an invoke target. The same as a `call`, but it only considers whether the exact subtype lookup of that `type` results in that CodeInstance, and does not consider all intersections.
- An `Int` value, `N`, followed by a type followed by `N` CodeInstances means that a lookup of methods via `type` should result in all of those following CodeInstances (and no others).
  - Corollary: A `0` followed by a type means it is a "missing" edge, which targets the lookup of that `type` being empty.
- A MethodInstance can appear anywhere a CodeInstance is valid in the previous list, and means that that target was used for properties beyond what could be captured in a CodeInstance.
For example, because it was inlined or because some form of constant propagation was run that refined the result. It may be necessary to also simultaneously inline the edges from that CodeInstance, if we are not keeping that CodeInstance around, to preserve that graph connection for invalidation.

If those edge properties all hold, recursively over each CodeInstance in the list, then the current CodeInstance is valid. Otherwise, inference can attempt to locate replacement CodeInstances for each of those, such that the intrinsic properties of it still hold, resulting in a copy of the current CodeInstance with the same `inferred` field and intrinsic properties, but with a different list of edges in it.

## Implementation APIs

This is a list of the APIs expected to be needed to implement the runtime, as well as the documentation to include with them. Note that `jl_engine_reserve` is where most of the state management complexity occurs. The APIs here correspond most closely to the current `SOURCE_MODE_ABI`, while `SOURCE_MODE_FORCE_SOURCE*` will not be relevant (or possible) anymore. Any caller that used `SOURCE_MODE_FORCE_SOURCE*` in the past will instead use the `(Core.Compiler).code_typed` reflection utility from Julia, as everything becomes "push" based instead of calling back into inference from C for these utilities.

```cpp=
// Request inference to populate an optimized jl_code_instance_t for execution
//jl_code_instance_t *jl_type_infer(jl_method_instance_t *m, size_t world, int notrequired) __nullable;
void jl_type_infer(jl_code_instance_t *ci, size_t world, int notrequired);

// Request codegen to emit native code for a given src object.
// Note that there is no recursion/workqueue. If the code is not available already for a given edge, no attempt is made to infer it and it simply immediately falls back to making a slower trampoline.
jl_llvm_functions_t jl_emit_code(Module &M, jl_code_info_t *src, jl_codegen_params_t &params) __nullable;

// Request engine to grant the right to infer m for owner (that includes world).
// This may block if another thread has already reserved this right.
// Returns either a new, empty, unattached jl_code_instance_t that just records the reservation (with the `partial` boolean set),
// or a previously inferred object. The `inferred` field is volatile, so if the caller needs to ensure it sticks around, it should copy it to a local field.
// If it is empty (the `partial` boolean is true), the caller may use the `inferred` field to stash a handle to the current progress.
// If the `partial` boolean is false, then inference is done on this, and it is now part of the global cache. Do not mutate this further. There may or may not be source code available for it in the `inferred` field (assume not).
// The `leases` array is a list of all current rights that the caller is holding. If the call blocks, other tasks may be permitted to borrow those rights. All rights will be returned to the caller (possibly still incomplete or possibly completed) when this returns. If they are already completed, the caller should not try to fulfill them again.
// The `leases` array can be NULL, in which case the engine will not block (it will return NULL) if the request cannot be immediately fulfilled. This is useful if the caller knows that it can keep working on other things while it waits.
// Additionally, this will return NULL if the request would result in a deadlock (because the caller already holds that reservation). The caller should treat that result similarly to toplevel LimitedAccuracy (and avoid caching the current result, to force use of the fallback interpreter).
// The returned object may have a finalizer attached, such that if it gets forgotten, we can print a warning and reject the promise implicitly, as well as release any internal resources before the object becomes invalid.
jl_code_instance_t *jl_engine_reserve(jl_method_instance_t *m, size_t world, jl_value_t *owner, int notrequired, jl_array_t *leases __nullable) __nullable;

// Fulfill a promise made for the right to infer a signature. This will finish populating `ci` with the data from `src`, call `jl_emit_code` to associate it with LLVM code in the JIT, and release all tasks waiting on this lease afterwards. This also handles installing their backedges correctly.
void jl_engine_fulfill(jl_code_instance_t *ci, jl_code_info_t *src);

// Advanced API for jl_engine_fulfill which takes arguments separately instead of nicely pre-packaged in a src object. The user is required to also pass in either an LLVM Module or populated native function pointers.
void jl_engine_fulfill_advanced(jl_code_instance_t *ci, /* intrinsic properties, etc */);

// Sometimes fulfilling a promise requires simultaneously fulfilling other promises, since they have circularities in their edges. Since all edges must be valid before fulfilling a reservation, sometimes it will be necessary to return an entire list of strongly connected components simultaneously.
void jl_engine_fulfill_scc(jl_code_instance_t *ci[], jl_code_info_t *src[]);
void jl_engine_fulfill_scc_advanced(jl_code_instance_t *ci[], /* arrays of all the intrinsic properties, etc */);

// Reject a promise made for the right to infer a signature (garbage collect it or allow the next task waiting on the lease to take it back).
void jl_engine_reject(jl_code_instance_t *ci);

// Query if there is a jl_code_instance_t from an older world, which is already inferred but might not be relevant, so that inference can decide if it is relevant.
jl_code_instance_t *jl_engine_closest(jl_method_instance_t *m, jl_value_t *owner, size_t world) __nullable;

// Internally, the engine can add a Module to the OrcJIT and retrieve a list of function pointers to content in it, given a map of edges (by index) to their corresponding function (jl_callptr_t) or specialization (specsig) pointers, derived from the edges list of the CodeInstance along with the calling convention selector.
void jl_add_to_ee(std::unique_ptr<ThreadSafeModule> M, std::vector<jl_fptr_t> LinkMap);
```

For example, here is roughly the body of the fallback code:

```cpp=
jl_code_instance_t *fallback_inference(jl_method_instance_t *mi, jl_value_t *owner, size_t world)
{
    jl_code_instance_t *ci = jl_engine_reserve(mi, world, owner, 0, jl_an_empty_array);
    if (ci->partial) {
        JL_GC_PUSH1(&ci);
        jl_code_info_t *src = jl_code_for_interpreter(mi, world);
        if (src) {
            jl_engine_fulfill(ci, src);
        }
        else {
            jl_engine_reject(ci);
            ci = NULL;
        }
        JL_GC_POP();
    }
    return ci;
}
```

The non-fallback code should look very similar to this, except of course with inference in there too! As a special case, this can also be run without a reservation, to get out of situations where it would cause deadlocks.
## Implementation progress

- [x] add edges and rettype fields to CodeInfo
- [x] refactor inference to allocate CodeInstance up-front, then fill it in and cache it at the end (instead of doing both at once)
- [ ] keep codegen data as an array (svec) until after optimizations, so we can drop any unused values that do not survive it instead of adding them to global roots immediately
- [ ] add an atomic byte to CodeInfo to control whether the contents are currently shared or volatile (instead of is_src_volatile); remove explicit compression from optimization (in lieu of just doing codegen/optimization more eagerly--on a thread--if code is not inlineable)
- [x] teach inference to correctly generate edges
- [x] teach optimizer to more correctly generate edges
- [ ] ~~add isonlymatch boolean to MethodInstance, or world age of last success~~
- [x] convert staticdata to using edges, rather than recomputing from backedges
- [x] alter codegen to take a CodeInfo directly, never computing it
- [x] update code_llvm implementation functions to take as input a CodeInfo (from code_typed), never computing it (similarly for code_llvm with dump_module=true; take a CodeInstance directly for dump_module=false)
- [ ] update codegen to expect :invoke to have a CodeInstance, to get specsig explicitly, never computing it, and deprecate having a MethodInstance there
- [ ] update downstream consumers as needed (e.g. GPUCompiler, JuliaInterpreter, Cthulhu)
- [x] add multithreading "engine"
- [ ] add workstealing when compilation begins in JIT
- [ ] teach runtime how/when it is legal to "reuse" a CodeInstance
- [ ] teach JIT how to register with the engine before compiling, instead of using big codegen lock (BCL)