<!-- using Weave; weave("docs/escapeanalysis20210614.jmd", doctype="github", mod=Main) -->
# Escape Analysis MTG @ 2021/06/14
## Wrap-ups of 2021/06/07
- we will try to implement escape analysis as a separate pass between inference and optimization
  * for easier prototyping, without risking breaking or complicating inference
  * at least for this GSoC term
- we will focus on using escape information for Julia-level optimizations
* especially for:
+ **scalar replacements** and
+ **heap to stack allocation conversions**
* although there are many other applications of escape information, they'll be out of scope for the term
## Possible Target Examples (Shuhei)
### Scalar Replacements
Object fields are currently scalar-replaced only in the following cases:
- immutable objects whose fields are constants:
  * these get constant-folded at inference
- mutable objects where every call taking the object is fully inlined during optimization:
  * this is done by [`getfield_elim_pass`](https://github.com/JuliaLang/julia/blob/82ae530db62f364853b355bcd64b28c71445a2c4/base/compiler/ssair/passes.jl#L550-L885)
  * mutable objects can't be constant-folded at inference even if their contents are fully known to be constant
```julia
struct Result
inf
opt
end
function Base.show(io::IO, res::Result)
printstyled(io, "=== inference ===\n"; color=:cyan)
show(io, res.inf)
println(io)
printstyled(io, "=== optimization ===\n"; color=:light_green)
show(io, res.opt)
end
function inf_vs_opt(f, types=(); kwargs...)
Result(
first(code_typed(f, types; optimize=false, kwargs...)),
first(code_typed(f, types; optimize=true, kwargs...)))
end
struct Immutable{S,T}
fld1::S
fld2::T
end
mutable struct Mutable{S,T}
fld1::S
fld2::T
end
g = nothing # untyped, non-constant global
@noinline forcenoinline(_) = return g # never inlined (`@noinline`); returns the untyped global so the result is inferred as `Any`
falseescape(cond, o) = cond ? o : forcenoinline(o)
```
```
falseescape (generic function with 1 method)
```
NOTE: the `getproperty` calls below are inlined and are thus equivalent to `getfield` in these cases.
```julia
#############################
### fully constant-folded ###
#############################
# works pretty well for immutable objects
julia> inf_vs_opt((Bool,)) do cond
o = Immutable(0, 1) # ::Const
falseescape(cond, o)
return o.fld1, o.fld2 # ::Const((0, 1))
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Immutable(0, 1))::Core.Const(Immutable{Int64, Int64}(0, 1))
│ Main.falseescape(cond, o::Core.Const(Immutable{Int64, Int64}(0, 1)))::Any
│ %3 = Base.getproperty(o::Core.Const(Immutable{Int64, Int64}(0, 1)), :fld1)::Core.Const(0)
│ %4 = Base.getproperty(o::Core.Const(Immutable{Int64, Int64}(0, 1)), :fld2)::Core.Const(1)
│ %5 = Core.tuple(%3, %4)::Core.Const((0, 1))
└── return %5
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ goto #3 if not cond
2 ─ goto #4
3 ─ invoke Main.forcenoinline($(QuoteNode(Immutable{Int64, Int64}(0, 1)))::Immutable{Int64, Int64})::Any
└── goto #4
4 ┄ return (0, 1)
) => Tuple{Int64, Int64}
# mutable objects can't be scalar-replaced here even if their contents are fully known to be constant
julia> inf_vs_opt((Bool,)) do cond
o = Mutable(0, 1) # ::Mutable{Int} (allocation)
falseescape(cond, o)
return o.fld1, o.fld2 # ::Tuple{Int,Int}
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Mutable(0, 1))::Mutable{Int64, Int64}
│ Main.falseescape(cond, o)::Any
│ %3 = Base.getproperty(o, :fld1)::Int64
│ %4 = Base.getproperty(o, :fld2)::Int64
│ %5 = Core.tuple(%3, %4)::Tuple{Int64, Int64}
└── return %5
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = %new(Mutable{Int64, Int64}, 0, 1)::Mutable{Int64, Int64}
└── goto #3 if not cond
2 ─ goto #4
3 ─ invoke Main.forcenoinline(%1::Mutable{Int64, Int64})::Any
└── goto #4
4 ┄ %6 = Base.getfield(%1, :fld1)::Int64
│ %7 = Base.getfield(%1, :fld2)::Int64
│ %8 = Core.tuple(%6, %7)::Tuple{Int64, Int64}
└── return %8
) => Tuple{Int64, Int64}
# if we manage to fully inline, the `getfield`s can be eliminated
# NOTE the return type is still not `Const` (the constant-folding happened during optimization, not at inference)
julia> inf_vs_opt((Bool,)) do cond
o = Mutable(0, 1) # ::Mutable{Int,Int} (allocation)
falseescape(true, o) # inlined
return o.fld1, o.fld2 # ::Tuple{Int,Int} (NOTE not constant-folded)
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Mutable(0, 1))::Mutable{Int64, Int64}
│ Main.falseescape(true, o)::Any
│ %3 = Base.getproperty(o, :fld1)::Int64
│ %4 = Base.getproperty(o, :fld2)::Int64
│ %5 = Core.tuple(%3, %4)::Tuple{Int64, Int64}
└── return %5
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = Core.tuple(0, 1)::Tuple{Int64, Int64}
└── return %1
) => Tuple{Int64, Int64}
julia> @noinline function not_folded(cond)
o = Mutable(0, 1) # ::Mutable{Int,Int} (allocation)
falseescape(true, o) # inlined
return o.fld1, o.fld2 # ::Tuple{Int,Int} (NOTE not constant-folded)
end
not_folded (generic function with 1 method)
julia> inf_vs_opt((Bool,)) do cond
not_folded(cond) # ::Tuple{Int,Int}, not Const(0, 1)
end
=== inference ===
CodeInfo(
1 ─ %1 = Main.not_folded(cond)::Tuple{Int64, Int64}
└── return %1
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = invoke Main.not_folded(_2::Bool)::Tuple{Int64, Int64}
└── return %1
) => Tuple{Int64, Int64}
#################################
### partially constant-folded ###
#################################
# works pretty well for immutable objects
julia> inf_vs_opt((Bool,Int,)) do cond, a0
o = Immutable(a0, 1) # ::PartialStruct
falseescape(cond, o)
return o.fld1, o.fld2 # ::PartialStruct(Tuple{Int,Int},Any[Int,Const(1)])
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Immutable(a0, 1))::Core.PartialStruct(Immutable{Int64, Int64}, Any[Int64, Core.Const(1)])
│ Main.falseescape(cond, o::Core.PartialStruct(Immutable{Int64, Int64}, Any[Int64, Core.Const(1)]))::Any
│ %3 = Base.getproperty(o::Core.PartialStruct(Immutable{Int64, Int64}, Any[Int64, Core.Const(1)]), :fld1)::Int64
│ %4 = Base.getproperty(o::Core.PartialStruct(Immutable{Int64, Int64}, Any[Int64, Core.Const(1)]), :fld2)::Core.Const(1)
│ %5 = Core.tuple(%3, %4)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(1)])
└── return %5
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = %new(Immutable{Int64, Int64}, a0, 1)::Immutable{Int64, Int64}
└── goto #3 if not cond
2 ─ goto #4
3 ─ invoke Main.forcenoinline(%1::Immutable{Int64, Int64})::Any
└── goto #4
4 ┄ %6 = Core.tuple(a0, 1)::Tuple{Int64, Int64}
└── return %6
) => Tuple{Int64, Int64}
# basically the same as in the fully constant-folded cases
julia> inf_vs_opt((Bool,Int,)) do cond, a0
o = Mutable(a0, 1) # ::PartialStruct
falseescape(cond, o)
return o.fld1, o.fld2 # ::Tuple{Int,Int}
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Mutable(a0, 1))::Mutable{Int64, Int64}
│ Main.falseescape(cond, o)::Any
│ %3 = Base.getproperty(o, :fld1)::Int64
│ %4 = Base.getproperty(o, :fld2)::Int64
│ %5 = Core.tuple(%3, %4)::Tuple{Int64, Int64}
└── return %5
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = %new(Mutable{Int64, Int64}, a0, 1)::Mutable{Int64, Int64}
└── goto #3 if not cond
2 ─ goto #4
3 ─ invoke Main.forcenoinline(%1::Mutable{Int64, Int64})::Any
└── goto #4
4 ┄ %6 = Base.getfield(%1, :fld1)::Int64
│ %7 = Base.getfield(%1, :fld2)::Int64
│ %8 = Core.tuple(%6, %7)::Tuple{Int64, Int64}
└── return %8
) => Tuple{Int64, Int64}
julia> inf_vs_opt((Bool,Int,)) do cond, a0
o = Mutable(a0, 1) # ::PartialStruct
falseescape(true, o) # inlined
return o.fld1, o.fld2 # ::Tuple{Int,Int}
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Mutable(a0, 1))::Mutable{Int64, Int64}
│ Main.falseescape(true, o)::Any
│ %3 = Base.getproperty(o, :fld1)::Int64
│ %4 = Base.getproperty(o, :fld2)::Int64
│ %5 = Core.tuple(%3, %4)::Tuple{Int64, Int64}
└── return %5
) => Tuple{Int64, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = Core.tuple(a0, 1)::Tuple{Int64, Int64}
└── return %1
) => Tuple{Int64, Int64}
```
> :warning: maybe off-topic
Could we form `PartialStruct` even when the type is not fully concrete, as long as it is immutable?
In particular, we could replace [`isconcretetype(t)` here](https://github.com/JuliaLang/julia/blob/3f27f88ec9344e96d4c51b538bf0352e4674b037/base/compiler/abstractinterpretation.jl#L1431) with a more relaxed check; a rough sketch follows the example below.
```julia
julia> inf_vs_opt((Bool,Any,)) do cond, a0
o = Immutable(a0, 1) # want ::PartialStruct(Immutable{A_,Int} where A_, Any[Any,Const(1)])
falseescape(cond, o)
return o.fld1, o.fld2 # ::PartialStruct(Tuple{Int,Int},Any[Int,Const(1)])
end
=== inference ===
CodeInfo(
1 ─ (o = Main.Immutable(a0, 1))::Immutable{_A, Int64} where _A
│ Main.falseescape(cond, o)::Any
│ %3 = Base.getproperty(o, :fld1)::Any
│ %4 = Base.getproperty(o, :fld2)::Int64
│ %5 = Core.tuple(%3, %4)::Tuple{Any, Int64}
└── return %5
) => Tuple{Any, Int64}
=== optimization ===
CodeInfo(
1 ─ %1 = Main.Immutable(a0, 1)::Immutable{_A, Int64} where _A
│ Main.falseescape(cond, %1)::Any
│ %3 = Base.getfield(%1, :fld1)::Any
│ %4 = Base.getfield(%1, :fld2)::Int64
│ %5 = Core.tuple(%3, %4)::Tuple{Any, Int64}
└── return %5
) => Tuple{Any, Int64}
```
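As a thought experiment, the relaxed check might look something like the sketch below, reusing the `Immutable`/`Mutable` types defined above. `may_form_partialstruct` is a hypothetical name, not actual compiler code, and `ismutabletype` assumes Julia 1.7 or later:
```julia
# Hypothetical predicate (illustrative only): keep the current `isconcretetype`
# fast path, but also accept immutable struct types whose parameters are not
# fully known, e.g. `Immutable{_A,Int64} where _A`.
function may_form_partialstruct(@nospecialize(t))
    isconcretetype(t) && return true   # current condition
    u = Base.unwrap_unionall(t)
    return u isa DataType && isstructtype(u) && !ismutabletype(u)
end

may_form_partialstruct(Immutable{Int,Int})        # true (already concrete)
may_form_partialstruct(Immutable{T,Int} where T)  # true under the relaxed check
may_form_partialstruct(Mutable{T,Int} where T)    # false: mutable
```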
### Heap to Stack Allocation Conversions
Maybe this code allocates `x` on the heap?
```julia
# by @vtjnash
julia> @noinline f(x) = length(x) # no escape actually
f (generic function with 1 method)
julia> inf_vs_opt() do
x = Any[]
y = "new thing"
push!(x, y)
# everything inlined up to this point
return f(x) # not inlined, but `x` can actually be stack allocated ???
end
=== inference ===
CodeInfo(
1 ─ (x = Base.getindex(Main.Any))::Vector{Any}
│ (y = "new thing")::Core.Const("new thing")
│ Main.push!(x, y::Core.Const("new thing"))::Any
│ %4 = Main.f(x)::Int64
└── return %4
) => Int64
=== optimization ===
CodeInfo(
1 ── %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
│ %2 = π ("new thing", String)
│ %3 = Core.lshr_int(1, 63)::Int64
│ %4 = Core.trunc_int(Core.UInt8, %3)::UInt8
│ %5 = Core.eq_int(%4, 0x01)::Bool
└─── goto #3 if not %5
2 ── invoke Core.throw_inexacterror(:check_top_bit::Symbol, UInt64::Type{UInt64}, 1::Int64)::Union{}
└─── unreachable
3 ── goto #4
4 ── %10 = Core.bitcast(Core.UInt64, 1)::UInt64
└─── goto #5
5 ── goto #6
6 ── goto #7
7 ── goto #8
8 ── $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), :(%1), :(%10), :(%10)))::Nothing
└─── goto #9
9 ── %17 = Base.arraylen(%1)::Int64
│ Base.arrayset(true, %1, %2, %17)::Vector{Any}
└─── goto #10
10 ─ %20 = invoke Main.f(%1::Vector{Any})::Int64
└─── return %20
) => Int64
```
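One way to handle the non-inlined call above would be an interprocedural summary: compute and cache, per callee argument, whether that argument can escape, and let the caller consult the summary instead of relying on inlining. The following is only a sketch of that idea with made-up names (`ArgEscapeSummary` is not an existing compiler type):
```julia
# Made-up data structure for illustration: a per-argument escape summary that
# the analysis could compute for a callee and cache alongside its inferred code.
struct ArgEscapeSummary
    returned::Bool  # can the argument flow into the return value?
    escaped::Bool   # can it reach a global, a thrown exception, unknown code, ...?
end

# For `f(x) = length(x)` the summary of `x` would be `ArgEscapeSummary(false, false)`:
# `x` is neither returned nor otherwise escaped, so the caller's `x = Any[]` could
# in principle be stack-allocated even though the call to `f` is not inlined.
f_arg_summary = ArgEscapeSummary(false, false)
```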
#### Questions
- what exactly counts as an "escape"? (some of the candidate escape sites below are illustrated by the sketch after this list)
  - escape to where?
  - does `setfield!(x, ...)` "escape" `x`?
    - for construction and `setfield!`, we need to prove that the arguments/the parent object do not escape
  - ~~`unsafe_xxx` ?~~
  - ccall with `Any`-typed arguments (and `Ref{Any}`?)
  - global (non-local?) assignments
  - non-analyzed function calls
  - return values
  - throw
  - finalizers (a special case of escaping)
- (again, long-term) other possible benefits of escape information at the inference stage
  * "product abstract domain": type information × escape information
  * we will use the former to derive the latter, but the reverse is also possible
    + in the abstract interpretation literature, this seems to be called "reduction" (cf. the reduced product of abstract domains)
  * e.g. once escape information is available at inference time, we can:
    + constant-fold mutable objects
    + form `MustAlias` for mutable objects (?)
  * possible integration?
    1. forward analysis (inference)
    2. backward analysis (escape analysis, etc.)
    3. forward analysis (inference, only locally and as lightweight as possible)
    4. optimization
- how can we tell which objects are heap-allocated vs. stack-allocated?
- <https://discourse.julialang.org/t/a-nice-explanation-of-memory-stack-vs-heap/53915>
From @mbauman
> It’s automatic. But roughly:
> Arrays, Dicts, and mutable structs are on the heap
> StaticArrays, Tuples, Named Tuples and non-mutable structs are on the stack (but sometimes never exist at all)
- we can use `@allocated` to test the optimization: <https://discourse.julialang.org/t/why-is-heap-memory-bad/33290/3>
- how do we want to tell the compiler to allocate objects on the stack?
- "forward" vs. "reverse"
from slack:
> The difference in terminology reflects that forward-inference means information is created at the definition site, and flows forward to the uses.
> In reverse-inference, the information is created at the usage sites (“is this an escape?“) and flows backwards to the definitions.
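To make the candidate escape sites listed above concrete, here are some deliberately small illustrative functions (all names below, such as `Obj` and `GLOBAL_SINK`, are made up; whether each case must really be treated as an escape is exactly the open question). As noted above, `@allocated` can later serve as a coarse check of whether a given allocation was actually eliminated.
```julia
# Illustrative examples of the candidate escape sites listed above.
mutable struct Obj
    x::Any
end

const GLOBAL_SINK = Ref{Any}(nothing)

escape_by_return(a) = Obj(a)                    # flows into the return value
escape_by_global(a) = (GLOBAL_SINK[] = Obj(a))  # stored into a global
escape_by_throw(a)  = throw(Obj(a))             # reachable from the thrown value
escape_by_field!(p::Obj, a) = (p.x = Obj(a); nothing)         # escapes iff the parent `p` escapes
escape_by_ccall(a)  = ccall(:jl_typeof, Any, (Any,), Obj(a))  # `Any`-typed ccall argument

@noinline opaque(x) = objectid(x)
escape_by_call(a)   = opaque(Obj(a))            # passed to a non-analyzed call
```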
## Related
- [Julep/Very WIP - Heap allocated immutable arrays and compiler support](https://github.com/JuliaLang/julia/pull/31630):
  * to enable allocation-free mutable-to-immutable array conversion, Keno also tried to collect (very limited and seemingly local-only) escape information in this PR
## Week 1 progress (Xuanda)
- familiarized myself with the `AbstractInterpreter` code (type inference) and the optimization code, thanks to @aviatesk's code snippets and XCompiler
- put up a WIP PR here: https://github.com/TH3CHARLie/julia/pull/1
- some current design ideas: 1. run right after the conversion to `IRCode`; 2. SSA-based; 3. a backward pass; 4. the pseudo-algorithm shown in the comments of `escape_analysis.jl` (a rough sketch of the backward idea is given below)
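For discussion, here is a rough, self-contained toy of the backward, SSA-based propagation idea. It is not the WIP implementation: `ToyStmt` and the three-element lattice are made-up stand-ins for the real `IRCode` statements and escape lattice.
```julia
# Toy backward escape propagation: escape information is created at the use
# sites (returns, unknown calls, ...) and flows backwards to the definitions.
@enum Escape NoEscape ReturnEscape AllEscape   # a tiny totally-ordered "lattice"

struct ToyStmt
    head::Symbol       # :new, :call, :return, ...
    args::Vector{Int}  # indices of the SSA values this statement uses
end

function backward_escape_pass(stmts::Vector{ToyStmt})
    state = fill(NoEscape, length(stmts))
    changed = true
    while changed                              # iterate to a fixed point
        changed = false
        for idx in reverse(eachindex(stmts))   # walk the statements backwards
            stmt = stmts[idx]
            # seed escape information at the sinks
            info = stmt.head === :return ? ReturnEscape :
                   stmt.head === :call   ? AllEscape    :  # unknown call: assume the worst
                   state[idx]
            # propagate to the SSA values this statement uses (join on the lattice)
            for use in stmt.args
                new = max(state[use], info)
                if new != state[use]
                    state[use] = new
                    changed = true
                end
            end
        end
    end
    return state
end

# toy program: %1 = new(...); %2 = call unknown(%1); %3 = return %1
stmts = [ToyStmt(:new, Int[]), ToyStmt(:call, [1]), ToyStmt(:return, [1])]
backward_escape_pass(stmts)  # state[1] == AllEscape: %1 escapes via both uses
```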
## Some questions (Xuanda)
- About SSA definitions: I checked https://docs.julialang.org/en/v1/devdocs/ssair/ for Julia's SSA IR, but I found nodes other than the ones mentioned in the docs (e.g. `ReturnNode`; also `Expr` is a front-end type). Do we have docs for the mapping between the SSA-language node kinds and the Julia structs that represent them?
- Currently, I create two kinds of escape sites: returns and calls (the call handling still needs debugging to compile, so it's not on the pull request). Is there anything else significant that I am missing? For calls, I can think of recursively walking into the callee's CFG, running the escape analysis on it, and returning the result to the caller to see whether the arguments escape.
  - A: `setfield!`, construction, assignment to a global, non-analyzed function calls
- Backward alias analysis: if I understand correctly, we need alias analysis to identify the real source of an escape. I am not very familiar with how to do alias analysis in a backward direction (see the small sketch below).
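A small made-up example of why alias tracking matters in the backward direction: `o` never appears at an escape site directly, but it is stored into a field of `p`, and `p` escapes to a global, so `o` must be marked as escaping too. The backward pass therefore has to record that whatever is stored into `p`'s fields shares `p`'s escape state.
```julia
# The escape of `o` is only visible through its alias relationship with `p.next`.
mutable struct Node
    next::Any
end

const SINK = Ref{Any}(nothing)

function indirect_escape()
    o = Node(nothing)
    p = Node(o)    # `o` becomes reachable from `p`
    SINK[] = p     # `p` escapes to a global, therefore `o` escapes as well
    return nothing
end
```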
## Type-based analysis alternative?
Jameson to write.