# The New Pass Manager for llvmlite
Presented by Graham Markall, NVIDIA: [gmarkall@nvidia.com](mailto:gmarkall@nvidia.com)
## Introduction
* There are two pass managers in LLVM:
    * The Legacy Pass Manager, used by llvmlite
    * The New Pass Manager, which llvmlite needs to move to
* The Legacy Pass Manager is removed in LLVM 17 (Numba / llvmlite are presently hovering around LLVM 14 / 15).
* [PR #1046: Add basic infra required to move Numba to NewPassManager](https://github.com/numba/llvmlite/pull/1046) (today's topic) adds support for the New Pass Manager to llvmlite
## Scope for today
* Discussion of the llvmlite-specific design considerations
* Where necessary, talk about the differences between the pass managers and implementation details
* Quick examples
* Details of Numba testing
* Summary of review and proposed next steps
## Acknowledgments
* Yashwant Singh at NVIDIA:
    * Author of [PR #1046: Add basic infra required to move Numba to NewPassManager](https://github.com/numba/llvmlite/pull/1046)
* Modi Mo at Meta:
    * Author of [PR #1042: Update llvmlite to be compatible with llvm-17](https://github.com/numba/llvmlite/pull/1042)
## Apologies
* There are quite a few issues / design decisions to talk about
* They're all a bit fiddly
* and none of them are really interesting
* but they are all important for "quality of life" for the llvmlite user
## Approaches
Two possible approaches:
* Reimplement the existing llvmlite pass manager APIs using the New Pass Manager
* Create New Pass Manager APIs in llvmlite
    * ... and possibly allow old and new to coexist

We take the "old and new coexisting" approach:

* Differences between the new and old pass managers are possible:
    * The New Pass Manager is target aware (noted by Yashwant Singh)
    * The inlining threshold is available in the Legacy Pass Manager, but not in the New Pass Manager until LLVM 16 (noted by Da Li and Yashwant)
* Potential for performance / vectorization regressions
    * cf. discussions like [Compilation pipeline, compile time, and vectorization](https://numba.discourse.group/t/compilation-pipeline-compile-time-and-vectorization/1716)

Numba should support both old and new pass managers:
* Default to new as soon as possible (e.g. for 0.61)
* Allow switch back to old pass manager with config variable
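
A minimal sketch of how such a switch might look, assuming the choice is read from an environment variable. The variable name `EXAMPLE_USE_LEGACY_PASS_MANAGER` and both helper functions are hypothetical, made up for illustration; they are not Numba's actual configuration.

```python
import os

# Hypothetical variable name, chosen for this sketch only.
ENV_VAR = "EXAMPLE_USE_LEGACY_PASS_MANAGER"


def use_legacy_pass_manager(environ=os.environ):
    """Return True if the user asked to switch back to the old pass manager."""
    value = environ.get(ENV_VAR, "0").strip().lower()
    return value in ("1", "true", "yes", "on")


def select_pass_manager(environ=os.environ):
    """Pick the pipeline to build: the new one by default, legacy on request."""
    return "legacy" if use_legacy_pass_manager(environ) else "new"


if __name__ == "__main__":
    print(select_pass_manager())
```

Defaulting to the new pass manager and treating the legacy one as an opt-out keeps the escape hatch available if regressions show up.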
## Implementation
* The implementation in PR #1046 implements exactly the subset of functionality needed by Numba:
    * Module and Function Pass Managers
    * Pipeline Tuning Options and Pass Builders needed to construct the pass managers
    * Exactly the passes used by Numba (other passes omitted)
* Not implemented for the initial iteration:
    * Addition of all the other passes
    * CGSCC pass manager
    * Lots of other APIs (we also omit many legacy pass manager APIs anyway)
## Handling naming
Considerations:
* Can't rename existing classes and functions
    * Would break backwards compatibility
* Don't want to encode the word "New" in any New Pass Manager APIs
    * That doesn't look great, and eventually it will be the only pass manager
    * The same potentially applies to the word "Old"
* Can't have naming conflicts between legacy and new pass managers
* Want to try and mirror LLVM names as much as possible
    * Makes it easier to use LLVM API docs
    * Generally less confusing / lower mental load for llvmlite users
## Naming choices
The least worst solution seems to be:
* Legacy pass manager classes and functions:
    * `ModulePassManager` and `FunctionPassManager`
    * `create_module_pass_manager`, `create_function_pass_manager`
* New Pass Manager classes and functions:
    * `PipelineTuningOptions`, `PassBuilder`, `ModulePassManager`, `FunctionPassManager`
    * `create_pipeline_tuning_options`, `create_pass_builder`, `create_new_module_pass_manager`, `create_new_function_pass_manager`
### Issues / conflicts in naming
* Legacy pass manager APIs are in the existing `llvmlite.binding.passmanagers` module
* New pass manager APIs are in the new `llvmlite.binding.newpassmanagers` module
* However, everything from both of these is imported into `llvmlite.binding`:
    * This is why we have `create_new_module_pass_manager` vs. `create_module_pass_manager`
* Only the legacy `ModulePassManager` and `FunctionPassManager` get imported into `llvmlite.binding`
    * That is backwards-compatible
    * Documentation advises using `create_new_*_pass_manager` if you want to create a new pass manager
    * Note that pass managers can also be constructed by the `PassBuilder` with [`getModulePassManager()`](https://llvmlite--1046.org.readthedocs.build/en/1046/user-guide/binding/optimization-passes.html#llvmlite.binding.ModulePassManager) and [`getFunctionPassManager()`](https://llvmlite--1046.org.readthedocs.build/en/1046/user-guide/binding/optimization-passes.html#llvmlite.binding.FunctionPassManager)
* When the Legacy Pass Manager is removed, the new pass manager classes could be imported into `llvmlite.binding`
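
The import arrangement described above can be modelled in plain Python. This is only a toy sketch: the three module objects stand in for `llvmlite.binding.passmanagers`, `llvmlite.binding.newpassmanagers` and `llvmlite.binding`, and the placeholder classes `LegacyModulePassManager` and `NewModulePassManager` are invented for the illustration.

```python
import types

# Stand-ins for llvmlite.binding.passmanagers and
# llvmlite.binding.newpassmanagers; the class bodies are placeholders.
legacy = types.ModuleType("passmanagers")
new = types.ModuleType("newpassmanagers")


class LegacyModulePassManager:
    pass


class NewModulePassManager:
    pass


# Both submodules expose a class under the same public name.
legacy.ModulePassManager = LegacyModulePassManager
new.ModulePassManager = NewModulePassManager
legacy.create_module_pass_manager = lambda: legacy.ModulePassManager()
new.create_new_module_pass_manager = lambda: new.ModulePassManager()

# Stand-in for the flat llvmlite.binding namespace.
binding = types.ModuleType("binding")
binding.ModulePassManager = legacy.ModulePassManager  # legacy class only
binding.create_module_pass_manager = legacy.create_module_pass_manager
binding.create_new_module_pass_manager = new.create_new_module_pass_manager
```

Because the factory functions have distinct names, both can live in the flat namespace, while the class name there keeps pointing at the legacy class for backwards compatibility.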
### Naming on the C++ side
* On the C++ side, there are no name conflicts between new and legacy pass managers
    * C++ namespacing, different API names, etc.
* New pass manager bindings are in `newpassmanagers.cpp`
    * When the legacy ones are deleted, we might move them to `passmanagers.cpp`
    * Names here are not exposed in the public API, so it's not a big issue
## Implementation: running the pass manager
```C++
API_EXPORT(void)
LLVMPY_RunNewModulePassManager(LLVMModulePassManagerRef MPMRef,
                               LLVMPassBuilderRef PBRef, LLVMModuleRef mod) {
    ModulePassManager *MPM = llvm::unwrap(MPMRef);
    PassBuilder *PB = llvm::unwrap(PBRef);
    Module *M = llvm::unwrap(mod);

    // Fresh analysis managers are created on every run
    LoopAnalysisManager LAM;
    FunctionAnalysisManager FAM;
    CGSCCAnalysisManager CGAM;
    ModuleAnalysisManager MAM;

    // Register the analyses, then cross-register the managers with each other
    PB->registerLoopAnalyses(LAM);
    PB->registerFunctionAnalyses(FAM);
    PB->registerCGSCCAnalyses(CGAM);
    PB->registerModuleAnalyses(MAM);
    PB->crossRegisterProxies(LAM, FAM, CGAM, MAM);

    MPM->run(*M, MAM);
}
```
What's going on here:
* Create analysis managers and cross-register them with each other
* Run the pass manager with the Module Analysis Manager
* After the function exits, the analysis managers are out of scope and deleted
    * Safe: No reference to the analysis managers is held by the pass manager

Thoughts:

* Each time we run the pass manager, we construct new analysis managers
    * Is this likely to be a performance issue?
    * My guess: Probably not in the context of Numba
* I'm inclined to keep this simple implementation for initial work
    * Can re-visit if it's a performance issue later
* **Contrasting approach**: Modi's implementation in PR #1042 caches the analysis managers on the pass managers.
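
The two approaches can be contrasted with a toy Python model. It is only an analogy under stated assumptions: `AnalysisManagers`, `PerRunPassManager` and `CachingPassManager` are hypothetical names invented here, and no real LLVM work happens. The counter simply shows how often a set of analysis managers is built.

```python
class AnalysisManagers:
    """Toy stand-in for the four LLVM analysis managers."""

    constructed = 0  # counts how many sets have been built

    def __init__(self):
        AnalysisManagers.constructed += 1


class PerRunPassManager:
    """Builds fresh analysis managers on every run (the simple approach)."""

    def run(self, module):
        managers = AnalysisManagers()  # discarded when run() returns
        return f"optimized {module}"


class CachingPassManager:
    """Keeps one set of analysis managers alive between runs."""

    def __init__(self):
        self._managers = None

    def run(self, module):
        if self._managers is None:
            self._managers = AnalysisManagers()
        return f"optimized {module}"
```

Three runs of `PerRunPassManager` build three sets of managers, while three runs of `CachingPassManager` build one. The caching variant saves construction cost but has to keep the managers alive alongside the pass manager, which is presumably where the extra complexity comes from.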
## Adapting module passes to run on functions
* With the New Pass Manager, passes can be specific to modules, functions, loops, or CGSCCs.
    * cf. the legacy pass manager, where passes could run on anything
* We need an adapter to provide equivalent functionality for some of our legacy pass manager APIs, e.g. (from [`newpassmanagers.cpp`](https://github.com/numba/llvmlite/pull/1046/files#diff-45eef2a2ab57512c9c03ab44fbebc4228722e001d790af044bd2ca511df08414)):
```C++
API_EXPORT(void)
LLVMPY_AddJumpThreadingPass_module(LLVMModulePassManagerRef MPM, int T) {
    llvm::unwrap(MPM)->addPass(
        createModuleToFunctionPassAdaptor(JumpThreadingPass(T)));
}
```
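
The idea behind the adaptor can be sketched in plain Python. This is a toy model, not LLVM's implementation: a "module" is assumed to be a dict mapping function names to instruction lists, and `module_to_function_adaptor` and `remove_nops` are names invented for the illustration.

```python
def module_to_function_adaptor(function_pass):
    """Wrap a per-function pass so it can be added to a module pipeline."""

    def module_pass(module):
        # Apply the wrapped pass to every function in the module.
        return {name: function_pass(body) for name, body in module.items()}

    return module_pass


def remove_nops(body):
    """Toy function pass: drop 'nop' instructions from one function body."""
    return [inst for inst in body if inst != "nop"]


module = {"f": ["add", "nop", "ret"], "g": ["nop", "ret"]}
module_pipeline = [module_to_function_adaptor(remove_nops)]
for module_pass in module_pipeline:
    module = module_pass(module)
print(module)  # {'f': ['add', 'ret'], 'g': ['ret']}
```

The module pipeline only ever sees module-level passes; the adaptor is what lets a function-level pass take part in it.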
## Example usage
### Creating a pipeline with explicitly-specified passes
From [`examples/npm_passes.py`](https://github.com/numba/llvmlite/pull/1046/files#diff-58e513c36aa180090e33e5eb1d9793277a8b79c23ff702b10eb2d38ec6d9cb78):
```python
# Set up the module pass manager used to run our optimization pipeline.
# We create it unpopulated, and then add the loop unroll and simplify CFG
# passes.
pm = llvm.create_new_module_pass_manager()
pm.add_loop_unroll_pass()
pm.add_simplify_cfg_pass()
# To run the pass manager, we need a pass builder object - we create pipeline
# tuning options with no optimization, then use that to create a pass builder.
target_machine = llvm.Target.from_default_triple().create_target_machine()
pto = llvm.create_pipeline_tuning_options(speed_level=0)
pb = llvm.create_pass_builder(target_machine, pto)
# Now we can run the pass manager on our module
pm.run(module, pb)
```
### Creating a default pipeline
From [`examples/npm_pipeline.py`](https://github.com/numba/llvmlite/pull/1046/files#diff-1b5e4027661f8219a245391a5ba113af5c15c5b64f89892e0440c324f3bf925b):
```python
# Create a ModulePassManager for speed optimization level 3
target_machine = llvm.Target.from_default_triple().create_target_machine()
pto = llvm.create_pipeline_tuning_options(speed_level=3)
pb = llvm.create_pass_builder(target_machine, pto)
pm = pb.getModulePassManager()
# Run the optimization pipeline on the module
pm.run(module, pb)
```
## Testing with Numba
* Branch that uses the new module pass manager: [gmarkall's `npm` Numba branch](https://github.com/gmarkall/numba/tree/npm)
    * The idea was to prove the concept of the Numba changes
* Test results:
```
Ran 11970 tests in 1489.362s
FAILED (failures=19, errors=3, skipped=607, expected failures=33)
```
Or sometimes:
```
FAILED (failures=8, errors=3, skipped=607, expected failures=33)
```
* Test failures / errors are due to:
    * Quick hack messing up caching (occasionally)
    * Refop pruning not ported to the new pass manager
    * Changes to debuginfo transformation not anticipated by the test suite
        * Regexes are failing to match, likely need updating
* General conclusion:
    * This is a good base on which to start moving Numba to the New Pass Manager.
## Notes on my review
* I've been reviewing and guiding this PR so far
* I'm comfortable with the code changes and have been over them for multiple iterations:
    * C++ FFI binding, Python binding
* Test cases seem aligned with how much we test the legacy pass manager
* Numba testing did not expose any unexpected issues
* The PR is low-risk in that it only adds new, as-yet-unused APIs
* I wrote a substantial part of the documentation and commentary of the examples

Therefore:

* As far as I'm concerned this is good to merge, however:
    * Someone else should review the docs and examples
    * Some sanity checking of the design decisions around naming would be good
        * (make sure there are no footguns here)
## Next steps
* Merge PR #1046 as soon as possible:
    * It only adds new APIs so is low risk
    * Any real issues will be much easier to tease out once we're using Numba with it
* Implement proper support for the New Pass Manager in Numba and create a PR shortly after
* Port refop pruning to the New Pass Manager
    * An implementation already exists in Modi's LLVM 17 PR
* Add all passes to the New Pass Manager binding
    * Can we find a procedural way to do that?
* Support for out-of-tree passes?
* Expose [`buildModuleSimplificationPipeline`](https://llvm.org/doxygen/classllvm_1_1PassBuilder.html#ad6f258d31ffa2d2e4dfaf990ba596d0d) and extension points - suggestion from Adrian Seyboldt, tracked in [Issue #1055](https://github.com/numba/llvmlite/issues/1055).
* Add support for running with remarks, pass timing, etc.
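
On the question of adding passes procedurally, one possible starting point is to derive the binding names from the LLVM pass class names. The sketch below is only an assumption about how that might work: the naming pattern is inferred from `add_loop_unroll_pass`, `add_simplify_cfg_pass` and `LLVMPY_AddJumpThreadingPass_module` as they appear in this talk, and `to_snake_case` and `binding_names` are hypothetical helpers, not part of llvmlite.

```python
import re

# Pass class names taken from the examples in this talk.
PASSES = ["LoopUnrollPass", "SimplifyCFGPass", "JumpThreadingPass"]


def to_snake_case(name):
    """Convert an LLVM CamelCase name, keeping acronyms such as CFG together."""
    boundary = r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])"
    return re.sub(boundary, "_", name).lower()


def binding_names(pass_name):
    """Return (Python method name, C symbol) for one module-level pass."""
    return (f"add_{to_snake_case(pass_name)}", f"LLVMPY_Add{pass_name}_module")


for name in PASSES:
    print(binding_names(name))
```

A real generator would also need each pass's constructor arguments and whether it needs a module-to-function adaptor, which a name table alone does not capture.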
## References
* LLVM documentation: [Using the New Pass Manager](https://llvm.org/docs/NewPassManager.html)
    * Provides a quick tutorial overview of how to use it
    * Seems aimed towards those moving code from the legacy pass manager
    * Therefore, missing a lot of explanation / details
* LLVM blog post: [The New Pass Manager](https://blog.llvm.org/posts/2021-03-26-the-new-pass-manager/)
    * A nice overview of the motivation for, and design goals of, the new pass manager.