Shuhei Kadowaki
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Versions and GitHub Sync Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
Invitee
Publish Note

Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

Your note will be visible on your profile and discoverable by anyone.
Your note is now live.
This note is visible on your profile and discoverable online.
Everyone on the web can find and read all notes of this public team.
See published notes
Unpublish note
Please check the box to agree to the Community Guidelines.
View profile
Engagement control
Commenting
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
  • Everyone
Suggest edit
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
Emoji Reply
Enable
Import from Dropbox Google Drive Gist Clipboard
   owned this note    owned this note      
Published Linked with GitHub
Subscribed
  • Any changes
    Be notified of any changes
  • Mention me
    Be notified of mention me
  • Unsubscribe
Subscribe
# Intensive Prior Research on Escape Analysis <!--{%hackmd theme-dark %}--> [TOC] ## Prior Researches **Format**: 1. what ? 1. what is the point of the idea ? 1. how is it interesting compared to prior works ? 1. how is it validated ? 1. is there any remaining discussion ? 1. what's next to read ? ### (JVM) Escape Analysis in the Context of Dynamic Compilation and Deoptimization - info: _paper_, _2005_ - <https://www.usenix.org/legacy/events/vee05/full_papers/p111-kotzmann.pdf> - Thomas Kotzmann Institute for System Software Johannes Kepler University Linz Linz, Austria - Hanspeter Mössenböck Institute for System Software Johannes Kepler University Linz Linz, Austria 1. what ? - intra-procedural and inter-procedural algorithm of escape analysis - with the support for dynamic class loading and "deoptimization"[^1] - target optimization: SROA / synchronization removal[^3] / stack allocation - by-product: a better inlining heuristic based on escape information [^1]: This seems to be such a concept that is very specific to tracing-JIT compilers, where VM execution sometimes transfers the control to the interpreter in a case any speculative assumption for optimization gets invalidated. [^3]: "synchronization removal" here seems to simply means "lock elision" 1. what is the point of the idea ? - _intra-procedural_ analysis: _forward_ abstract interpretation on SSA from - escape state: - local objects (SSA allocation): fixed-length array - fields: growing-length array - NOTE: propagate locals escapes to fields - control flow handling (i.e. phi-node handling) - if any operand of phi-function escapes, propagate the escapes to any other (since then "global SROA doesn't make any sense"[^4]) - use _equi-escape set (EES)_ to propagate escape information among the related phi-operands - EES is implemented as an union-find algorithm with path compression - _inter-procedural_ analysis: propagate the ("formal"[^2]) arguments escape information as a method descriptor - purpose: stack allocation (i.e. identifies objects that can be allocated on the caller stack) - benefits of stack allocation: - cheap allocation (reuse the same stack slot for allocation in a loop) - efficient field load (access can be relative to frame's base pointer) - (efficient field store: possible but note that actual argument might be heap-allocated) - by-product: inter-procedural escape information turned out to be a good inlining heuristics - > inline methods up to twice as large as the normal threshold if there is at least one parameter that does neither escape the caller nor the callee - runtime support - support for deoptimization: - debugging information keeps a record of how SROA and synchronization removal was performed by the compiler - GC modification for stack allocation: - > A stack object must not be moved in memory, but the heap objects referenced by its fields need to be visited and kept alive. Therefore, we modified the garbage collector so that it does not visit stack objects but treats pointer fields within stack objects as root pointers [^4]: This statement seems to reflect their purpose of SROA: SROA is done for allocation elimination and an object allocation can't be eliminated if there is any site it escapes. [^2]: They seem to use "formal" when referring to declared method arguments, and "actual" to those actually passed at callsite. 1. how is it interesting compared to prior works ? - As as good summary of a basic escape analysis design, and basic target optimization design - Their idea to use _equi-escape set_ to handle control flow effects 1. how is it validated ? - SROA validation: insert runtime assertion (run without SROA but with assertions on otherwise-replaced values) - performance: benchmarks 1. is there any remaining discussion ? - Is "graph" (EES) is really needed to handle control flow ? Couldn't backward analysis make it more simpler ? - ~~How they limit the lattice of field escapes (since its size isn't limited) ?~~ # of SSA statements is finite, and # of allocations is finite, thus field states is also finite 1. what's next to read ? - F. Vivien and M. Rinard. Incrementalized pointer and escape analysis. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 35–46, Snowbird, June 2001. - general observation on allocations, proposes incremental analysis ### (JVM) Partial Escape Analysis and Scalar Replacement for Java - info: _paper_, _2014_ - https://ssw.jku.at/Research/Papers/Stadler14/Stadler2014-CGO-PEA.pdf - Lukas Stadler Johannes Kepler University Linz, Austria - Thomas Würthinger Oracle Labs - Hanspeter Mössenböck Johannes Kepler University Linz, Austria 1. what ? - flow-sensitive escape analysis and scalar replacements - (with a support for deoptimization from speculative execution) - target optimizations: SROA, lock elision (_on individual branches_) 1. what is the point of the idea ? - > In many cases, however, an object escapes just in a single unlikely branch. ```java Object getValue(int idx, Object ref) { Key key = new Key(idx, ref); if (key.equals(cacheKey)) { return cacheValue; // not escape } else { cacheKey = key; // escapes to the static field cacheValue = createValue(...); return cacheValue; } } ``` - the basic idea is to transform above snippet into something like: ```java Object getValue(int idx, Object ref) { if (/*scalar-replaced operations*/ ...) { return cacheValue; } else { Key key = new(idx, ref); // "materiazation" cacheKey = key; cacheValue = createValue(...); return cacheValue; } } ``` - concepts - _"partial escape analysis"_: flow-sensitive escape analysis - _"materialization"_: materialize allocation to where it escapes - analysis: _forward_, _flow-sensitive_, _intra-procedural_ - allocation state ```java class Id extends Node {...} // allocation identity class ObjectState {} class VirtualState extends ObjectState { int lockCount; Node[] fields; } class EscapedState extends ObjectState { Node materializedValue; } class State { // each Node has this state Map<Id, ObjectState> states; /** * Loading an object with `VirtualState` from a field of another * `VirtualState` object inserts a new entry into this map, so * that the `Load node can be recognized as referring to the * `VirtualState` object during further processing. * * If any of the operands of a node is key in this `aliases` map, * then the node needs to be examined also. */ Map<Node, Id> aliases; } ``` - state propagation - each "node" has the state (=> flow-sensitivity) - each node updates its state using the states from previous nodes - `Merge` node: merge control-flows - loop entry node: merge loop backedges (could be more than two ?) - others: just propagate 1. how is it interesting compared to prior works ? - flow-sensitive SROA (and "materialization") 1. how is it validated ? - performance benchmarks: reduced allocated memory by up to 58.5%, improves performance by up to 33% 1. is there any remaining discussion ? - in worst case scenario, it may lead to bigger code size if "materialization" happens at both branches ? - how can it be extended to eliminate array allocations ? 3. what's next to read ? - A. Shankar, M. Arnold, and R. Bodik. Jolt: lightweight dynamic analysis and removal of object churn. In Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 127–142. ACM Press, 2008. - a lightweight dynamic analysis to reduce excessive creation of short-lived objects, by guiding inlining so that escape analysis is more effective - P. Molnar, A. Krall, and F. Brandner. Stack allocation of objects in the CACAO virtual machine. In Proceedings of the International Conference on the Principles and Practice of Programming in Java, pages 153–161. ACM Press, 2009 - similar approach as Kotzmann and Mössenböck to perform stack allocation on Cacao VM - V8: very local analysis (looking only at the usages of the allocation, _independently_) Notes: - "Zone": heap areas with a known, limited lifetime. The whole area can be freed in bulk when a certain scope is left. ### (JVM) Current state of EA and its uses in the JVM - info: _talk_, _2020_ - <https://www.youtube.com/watch?v=p1MhRBYS-k0> - Charlie Gracie This talk summarizes the status of escape analysis and related optimizations in Java community: - 3 main optimizations: 1. monitor elision (lock elision) 2. stack allocation 3. scalar replacement - C2 vs. Graal vs. OpenJ9 - C2: flow-insensitive escape analysis, global SROA - Graal: flow-sensitive escape analysis, partial SROA - OpenJ9: flow-sensitive escape analysis, partial SROA & stack allocation ! - SROA vs. stack allocation: SA can be applied if the control flow is complicated: e.g. motivating example: ```java public static int simple_math_myobject_valueof(int val1, int val2) { // these allocations can't be eliminated because they are represented as // phi nodes of (global object, scalar-replaceable object) MyObject int1 = MyObject.valueOf(val1); MyObject int2 = MyObject.valueOf(val2); return int1.add(int2); } public MyOjbect valueOf(int val) { if (0 <= val && val <= 2) { return _cache[val]; } else { return new MyObject(val); } } ``` - challenges with stack allocation - GC extension - lifetime overwrap detection - "heapification" at deoptimization - prototyped code can be found at: <https://github.com/eclipse-openj9/openj9/blob/master/runtime/compiler/optimizer/EscapeAnalysis.cpp> - complicated, `9432` lines of C++ code ### (LuaJIT) Allocation Sinking Optimization - info: _documentation_, _2012_ - <http://wiki.luajit.org/Allocation-Sinking-Optimization#implementation%5B> - Mike Pall 1. what ? - a variant of code motion, which moves an allocation to where it's really needed (e.g. in a fallback path), built on top of flow-sensitive SROA enhanced by LuaJIT's "advanced alias analysis" 1. what is the point of the idea ? - observations: - global escape analysis is pointless especially in dynamic languages, where their code typically has many fallback paths that often escape things - escapes often happen in uncommon paths, and allocations are avoidable in fast path - implementation: "mark & sweep" - (mark) marks all allocations that cannot be sunk - this phase is essentially one instance of escape analysis - _intra-procedural_, _backward_, _flow-insensitive_[^5] - (sweep) tags the unmarked allocations and the related stores as sunk 1. how is it interesting compared to prior works ? - one of the most earlier approaches of flow-sensitive SROA ? 1. how is it validated ? 1. is there any remaining discussion ? 1. what's next to read ? [^5]: "How LuaJIT encodes flow-sensitivity ?" LuaJIT's tracing-JIT nature separates an unlikely branch into a separate snapshot. It allows escapes that happens in the rare snapshots not to escape the objects in the main trace. In other word, necessary flow-sensitivity is encoded within the nature of tracing JIT. ### (JavaScript) Escape Analysis in V8 - info: _talk_, _2018_ - <https://www.youtube.com/watch?v=KiWEWLwQ3oI> - Tobias Tebbi 1. what ? - speculative escape analysis and allocation elimination with load replacement (SROA) - _forward_, _flow-sensitive_, allocation analysis - each escape site will refine analysis state - state update can be costly since each "node" has state and # of nodes is large - => "functional map data structure" solved it 1. how is it interesting compared to prior works ? - implement escape analysis on "unscheduled" SSA form - "unscheduled" - scheduled graph: one state per basic block - unscheduled: one state per "effectful node" - "functional map data structure" turns out to be successful - handle V8 deoptimization 1. how is it validated ? 1. is there any remaining discussion ? 1. what's next to read ? ### (LLVM) Alias Analysis - [LLVM Alias Analysis Infrastructure](https://llvm.org/docs/AliasAnalysis.html) - `alias` method - `getModRefInfo(CS1, CS2)`: memory effect test between function calls - helper methods: `doesNotAccessMemory` / `onlyReadsMemory` - interesting: interface methods to keep the analysis state up-to-date even after other consumer transformations - analysis passes: * `basic-aa`: walks through domination * TBAA: walk through type domain ### Other references - [JOLT: Lightweight Dynamic Analysis and Removal of Object Churn](https://www.boopidy.com/aj/cs/jolt.pdf) - [Escape Analysis for Java](https://www.researchgate.net/publication/2804836_Escape_Analysis_for_Java) - [Escape Analysis in Java: Theory and Practice](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.463.1254&rep=rep1&type=pdf) - [Incrementalized Pointer and Escape Analysis](https://people.csail.mit.edu/rinard/paper/pldi01.pdf) ## Observations ### General transition of memory optimization: 1. SROA: _eliminates_ allocations 1. partial SROA (allocation sinking): _eliminates_ allocations in the existence of simpler branches 1. stack allocation: _eases_ allocation cost in the existence of complicated branches ### Analysis design **General design choices**: - _flow-insensitive_ vs. _flow-sensitive_: simply, analysis complexity vs.analysis accuracy - _intra-procedural_ vs. _inter-procedural_[^6]: * intra-procedural for SROA * inter-procedural for stack allocation - _forward_ vs. _backward_: * forward: easier to follow control-flow, suited for SROA * forward analysis will invalidate previous state whenever it sees new escape information (which may lead to more iterations to reach a fixpoint) * backward: easier to propagate escape information, suited for general escape analysis **Escape analyses in the wild**: - (JVM C2) SROA: _flow-insensitive_, _intra-procedural_, _forward_ - (JVM Graal) partial SROA: _flow-sensitive_, _intra-procedural_, _forward_, - (JVM OpenJ9) partial SROA & stack allocation: _flow-sensitive_, _inter-procedural_, ??? - (JavaScript V8) partial SROA: _flow-sensitive_, _intra-procedural_, _forward_, - (LuaJIT) allocation sinking (partial SROA): _flow-insensitive_[^5], _intra-procedural_, two-phases analysis 1. _backward_ analysis for escape analysis (collect escape information about allocation) 2. _forward_ analysis for partial SROA (to tag sunk allocations) [^6]: Well, usually, the "inter-procedural" part isn't so challenging: basically, inter-procedural information is encoded within callee method descriptor, and the main lifting is to translate the information in caller context. ## What's for Julia ### status quo, and future - SROA and stack allocation - flow-insensitive SROA :+1: (:warning:[^7]) ```julia struct Point x::Float64 y::Float64 end add(a::Point, b::Point) = Point(a.x + b.x, a.y + b.y) function compute() a = Point(1.5, 2.5) b = Point(2.25, 4.75) for i in 0:(100000000-1) a = add(add(a, b), b) end a.x, a.y end ``` - flow-sensitive SROA :disappointed: ```julia struct Key idx::Int ref::Symbol end const CACHED_KEY = Ref(Key(0, :orig)) @inline eqkey(k1, k2) = k1.idx == k2.idx && k1.ref === k2.ref function getref(idx, ref) key = Key(idx, ref) # move the allocation into the `if` clause if !eqkey(key, CACHED_KEY[]) CACHED_KEY[] = key end return key.ref end ``` - stack allocation :drooling_face: ```julia let if cond key = CACHED_KEY[] else key = Key(idx, ref) # can we avoid this allocation ? CACHED_KEY[] = key end return key.ref # key::ϕ(global object, allocation) end ``` - DCE, more constant-folding/inference: SROA can derive new type information * Java: every field is statically-typed (mostly ?) * Julia: untyped, `Union`-typed fields - third consumers of escape information * `ImmutableArray` * some taint analysis for efficient contribution prop' for Diffractor [^7]: Even flow-insensitive SROA sometimes fails on Julia-level. For example, if `Point` is declared as `mutable struct`, Julia-level SROA fails on the above example. But for such cases, succeeding LLVM passes will optimize them away. - some implementation ideas 1. separate "state" into two parts - _backward_ escape analysis: general usages, allocation movement (for partial SROA) - with escape state _per basic block_ (rather than _per SSA statement_) - _forward_ allocation analysis: for SROA 1. SROA at inference time: - insight: we already have a decent forward analysis - inference time - (effect analysis) inter-procedural memory-effect propagation - form `PartialStruct` for mutables - (SROA, tagging): constant-folding or as encoded lattice property _at inference_ - make `PartialStruct.fields` contain `Replaceable(::SlotNumber)` - (DCE, constant-folding): inference is already flow-sensitive - optimization time - (SROA, replacement): look for `getfield(...)::Union{Const,Replaceable}` - (SROA) allocation sinking: new code motion pass - escape analysis: just summarizes the result of inference for succeeding optimizations - what's needed ? * lifetime query --- - Q. "SROA at inference"? - Valentin: _what kind of information do you want to propagate around ?_ - Shuhei (revisit) - memory effect analysis should be presented to inference - SROA tagging during inference: - may complicates thing: - brings a part of optimization into inference - the domain of inference lattice gets complicated: it now encodes CFG information - we need to keep tags up-to-date throughout many optimizations, including slot2ssa, inlining - idea: reuse existing flow-sensitive, forward, analysis to implement more "decent" SROA - stack allocation: - non-escape argument: ok - return value: (change calling convention) - calling convention: - LLVM level: struct pointer (`*`) - mutability is still very hard to handle - how information propagates ? - LLVM: rerun analysis - on high-level Julia IR: how to handle inter-procedural information - why on LLVM ? - existed for long - opportunities for more optimizations using LLVM infrastructure - problems: - intra-procedural - stack allocation - more researches - publish review paper: before something novel - focus on compactness ? - other compilers (tracing JIT) vs. Julia (ahead of runtime) * predictability * implementation complexities of tracing-JIT: - complexity vs. optimization opportunities

Import from clipboard

Paste your markdown or webpage here...

Advanced permission required

Your current role can only read. Ask the system administrator to acquire write and comment permission.

This team is disabled

Sorry, this team is disabled. You can't edit this note.

This note is locked

Sorry, only owner can edit this note.

Reach the limit

Sorry, you've reached the max length this note can be.
Please reduce the content or divide it to more notes, thank you!

Import from Gist

Import from Snippet

or

Export to Snippet

Are you sure?

Do you really want to delete this note?
All users will lose their connection.

Create a note from template

Create a note from template

Oops...
This template has been removed or transferred.
Upgrade
All
  • All
  • Team
No template.

Create a template

Upgrade

Delete template

Do you really want to delete this template?
Turn this template into a regular note and keep its content, versions, and comments.

This page need refresh

You have an incompatible client version.
Refresh to update.
New version available!
See releases notes here
Refresh to enjoy new features.
Your user state has changed.
Refresh to load new user state.

Sign in

Forgot password

or

By clicking below, you agree to our terms of service.

Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
Wallet ( )
Connect another wallet

New to HackMD? Sign up

Help

  • English
  • 中文
  • Français
  • Deutsch
  • 日本語
  • Español
  • Català
  • Ελληνικά
  • Português
  • italiano
  • Türkçe
  • Русский
  • Nederlands
  • hrvatski jezik
  • język polski
  • Українська
  • हिन्दी
  • svenska
  • Esperanto
  • dansk

Documents

Help & Tutorial

How to use Book mode

Slide Example

API Docs

Edit in VSCode

Install browser extension

Contacts

Feedback

Discord

Send us email

Resources

Releases

Pricing

Blog

Policy

Terms

Privacy

Cheatsheet

Syntax Example Reference
# Header Header 基本排版
- Unordered List
  • Unordered List
1. Ordered List
  1. Ordered List
- [ ] Todo List
  • Todo List
> Blockquote
Blockquote
**Bold font** Bold font
*Italics font* Italics font
~~Strikethrough~~ Strikethrough
19^th^ 19th
H~2~O H2O
++Inserted text++ Inserted text
==Marked text== Marked text
[link text](https:// "title") Link
![image alt](https:// "title") Image
`Code` Code 在筆記中貼入程式碼
```javascript
var i = 0;
```
var i = 0;
:smile: :smile: Emoji list
{%youtube youtube_id %} Externals
$L^aT_eX$ LaTeX
:::info
This is a alert area.
:::

This is a alert area.

Versions and GitHub Sync
Get Full History Access

  • Edit version name
  • Delete

revision author avatar     named on  

More Less

Note content is identical to the latest version.
Compare
    Choose a version
    No search result
    Version not found
Sign in to link this note to GitHub
Learn more
This note is not linked with GitHub
 

Feedback

Submission failed, please try again

Thanks for your support.

On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

Please give us some advice and help us improve HackMD.

 

Thanks for your feedback

Remove version name

Do you want to remove this version name and description?

Transfer ownership

Transfer to
    Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

      Link with GitHub

      Please authorize HackMD on GitHub
      • Please sign in to GitHub and install the HackMD app on your GitHub repo.
      • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
      Learn more  Sign in to GitHub

      Push the note to GitHub Push to GitHub Pull a file from GitHub

        Authorize again
       

      Choose which file to push to

      Select repo
      Refresh Authorize more repos
      Select branch
      Select file
      Select branch
      Choose version(s) to push
      • Save a new version and push
      • Choose from existing versions
      Include title and tags
      Available push count

      Pull from GitHub

       
      File from GitHub
      File from HackMD

      GitHub Link Settings

      File linked

      Linked by
      File path
      Last synced branch
      Available push count

      Danger Zone

      Unlink
      You will no longer receive notification when GitHub file changes after unlink.

      Syncing

      Push failed

      Push successfully