Format:
what ?
what is the point of the idea ?
inline methods up to twice as large as the normal threshold if there is at least one parameter that does neither escape the caller nor the callee
A stack object must not be moved in memory, but the heap objects referenced by its fields need to be visited and kept alive. Therefore, we modified the garbage collector so that it does not visit stack objects but treats pointer fields within stack objects as root pointers
how is it interesting compared to prior works ?
how is it validated ?
is there any remaining discussion ?
what's next to read ?
In many cases, however, an object escapes just in a single unlikely branch.
Object getValue(int idx, Object ref) {
Key key = new Key(idx, ref);
if (key.equals(cacheKey)) {
return cacheValue; // not escape
} else {
cacheKey = key; // escapes to the static field
cacheValue = createValue(...);
return cacheValue;
}
}
Object getValue(int idx, Object ref) {
if (/*scalar-replaced operations*/ ...) {
return cacheValue;
} else {
Key key = new(idx, ref); // "materiazation"
cacheKey = key;
cacheValue = createValue(...);
return cacheValue;
}
}
class Id extends Node {...} // allocation identity
class ObjectState {}
class VirtualState extends ObjectState {
int lockCount;
Node[] fields;
}
class EscapedState extends ObjectState {
Node materializedValue;
}
class State { // each Node has this state
Map<Id, ObjectState> states;
/**
* Loading an object with `VirtualState` from a field of another
* `VirtualState` object inserts a new entry into this map, so
* that the `Load node can be recognized as referring to the
* `VirtualState` object during further processing.
*
* If any of the operands of a node is key in this `aliases` map,
* then the node needs to be examined also.
*/
Map<Node, Id> aliases;
}
Merge
node: merge control-flowsNotes:
This talk summarizes the status of escape analysis and related optimizations in Java community:
public static int simple_math_myobject_valueof(int val1, int val2) {
// these allocations can't be eliminated because they are represented as
// phi nodes of (global object, scalar-replaceable object)
MyObject int1 = MyObject.valueOf(val1);
MyObject int2 = MyObject.valueOf(val2);
return int1.add(int2);
}
public MyOjbect valueOf(int val) {
if (0 <= val && val <= 2) {
return _cache[val];
} else {
return new MyObject(val);
}
}
9432
lines of C++ codealias
methodgetModRefInfo(CS1, CS2)
: memory effect test between function calls
doesNotAccessMemory
/ onlyReadsMemory
basic-aa
: walks through dominationGeneral design choices:
Escape analyses in the wild:
struct Point
x::Float64
y::Float64
end
add(a::Point, b::Point) = Point(a.x + b.x, a.y + b.y)
function compute()
a = Point(1.5, 2.5)
b = Point(2.25, 4.75)
for i in 0:(100000000-1)
a = add(add(a, b), b)
end
a.x, a.y
end
struct Key
idx::Int
ref::Symbol
end
const CACHED_KEY = Ref(Key(0, :orig))
@inline eqkey(k1, k2) = k1.idx == k2.idx && k1.ref === k2.ref
function getref(idx, ref)
key = Key(idx, ref) # move the allocation into the `if` clause
if !eqkey(key, CACHED_KEY[])
CACHED_KEY[] = key
end
return key.ref
end
let
if cond
key = CACHED_KEY[]
else
key = Key(idx, ref) # can we avoid this allocation ?
CACHED_KEY[] = key
end
return key.ref # key::ϕ(global object, allocation)
end
Union
-typed fieldsImmutableArray
PartialStruct
for mutablesPartialStruct.fields
contain Replaceable(::SlotNumber)
getfield(...)::Union{Const,Replaceable}
Q. "SROA at inference"?
*
)why on LLVM ?
problems:
more researches
focus on compactness ?
other compilers (tracing JIT) vs. Julia (ahead of runtime)
This seems to be such a concept that is very specific to tracing-JIT compilers,
where VM execution sometimes transfers the control to the interpreter
in a case any speculative assumption for optimization gets invalidated. ↩︎
"synchronization removal" here seems to simply means "lock elision" ↩︎
This statement seems to reflect their purpose of SROA: SROA is done for allocation elimination and an object allocation can't be eliminated if there is any site it escapes. ↩︎
They seem to use "formal" when referring to declared method arguments, and "actual" to those actually passed at callsite. ↩︎
"How LuaJIT encodes flow-sensitivity ?"
LuaJIT's tracing-JIT nature separates an unlikely branch into a separate snapshot.
It allows escapes that happens in the rare snapshots not to escape the objects in the main trace.
In other word, necessary flow-sensitivity is encoded within the nature of tracing JIT. ↩︎ ↩︎
Well, usually, the "inter-procedural" part isn't so challenging: basically, inter-procedural
information is encoded within callee method descriptor, and the main lifting is to translate
the information in caller context. ↩︎
Even flow-insensitive SROA sometimes fails on Julia-level.
For example, if Point
is declared as mutable struct
, Julia-level SROA fails on the above example.
But for such cases, succeeding LLVM passes will optimize them away. ↩︎