# Benchmarking Rust runtimes

We have good benchmarking for Rust compile times, via [rustc-perf](https://github.com/rust-lang/rustc-perf/). We should add benchmarking for Rust runtimes, i.e. the speed of the generated code.

There is a defunct project called [lolbench](https://github.com/anp/lolbench) which used to do this. It was only run on each Rust release, it hasn't run for a couple of years, and the [website](lolbench.rs) no longer exists, but the code is still available on GitHub.

The (in-progress) implementation is [here](https://github.com/rust-lang/rustc-perf/tree/master/collector/runtime-benchmarks).

## Goals

The goals are similar to those of the existing compile-time benchmarks: to detect regressions and improvements in the Rust compiler. There are several types of runtime performance changes that we would like to detect:

1) **Codegen changes**
   We can add small micro-benchmarks that can detect codegen regressions. For example, if some very small benchmark takes 3x more instructions after an innocent-looking change, that can indicate a change in codegen. These are already covered by codegen tests, but we can also add them to the perf suite. It is probably easier to write a small benchmark and let the infrastructure track its instruction counts over time than to write a full codegen test.
2) **LLVM changes**
   LLVM is regularly updated, and its updates cause non-trivial changes in the performance of generated code. We would therefore like to detect regressions caused by LLVM bumps, and also track how much the generated code improves when we upgrade LLVM.
3) **MIR optimization changes**
   When a new MIR pass is added or an existing one is updated, we currently do not have a good way of measuring how it affects the performance of generated code. With a runtime benchmark suite, we could be more confident about the effect of MIR optimizations on real-ish code.

An explicit non-goal is to compare Rust's speed against the speed of other languages.

- There are existing benchmark suites for such purposes (e.g. the [Computer Language Benchmarks Game](https://benchmarksgame-team.pages.debian.net/benchmarksgame/index.html), formerly the Great Computer Language Shootout).
- These suites are magnets for controversy, because the results often depend greatly on the quality of the implementations.
- Cross-language comparisons won't help compiler developers on a day-to-day basis.

## Benchmark methodology

This is possibly the most important section, and also the one with the most unanswered questions. How do we actually measure the performance of the runtime benchmarks? Here are three groups of metrics that we could use, from the most to the least stable:

1) Instruction counts - this is probably the most stable and dependable metric, so we should definitely measure it, but it also doesn't tell the whole story (e.g. cache/TLB misses, branch mispredictions, etc.).
2) Cycles, branch mispredictions, cache misses, etc. - these metrics could give us a better idea of how the code behaves, but they are also increasingly noisy and sensitive to annoying effects like code layout.
3) Wall time - this is ultimately the metric that interests us the most (how fast is this program compiled with `rustc`?), but it's also the most difficult one to measure correctly.

We could measure wall times by launching the benchmark multiple times (here we have an advantage over the comptime benchmarks, because the runtime benchmarks will probably be much quicker to execute than compiling a crate), but there will probably still be considerable noise. We could try to employ techniques to reduce noise (disable ASLR, hyper-threading, turbo-boost and interrupts, pin threads to cores, compile code with over-aligned functions, etc.), but that won't solve everything.

A related question is what tool to use to actually execute the benchmarks. Using `cargo bench` or `Criterion` will probably be quite noisy and only produce wall times. We could measure the other metrics too, but they would include the benchmark tool itself, which seems bad. Another option would be to write a small benchmark runner that would e.g. let the user define a block of code to be benchmarked using a macro. We could then use e.g. `perf_event_open` to manually gather metrics only for that specific block of code. This is basically what [`iai`](https://github.com/bheisler/iai) does (although it seems unmaintained?).
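To make the `perf_event_open` idea concrete, here is a minimal sketch, assuming the [`perf-event`](https://crates.io/crates/perf-event) crate (a thin wrapper around `perf_event_open`); the `workload` function is just a hypothetical stand-in for the block of code being benchmarked:

```rust
use perf_event::{events::Hardware, Builder};

// Hypothetical workload standing in for the benchmarked block of code.
fn workload() -> u64 {
    (0..10_000u64).map(|x| x.wrapping_mul(x)).sum()
}

fn main() -> std::io::Result<()> {
    // Count retired instructions only while the counter is enabled,
    // so the measurement excludes the runner's own setup code.
    let mut counter = Builder::new().kind(Hardware::INSTRUCTIONS).build()?;

    counter.enable()?;
    let result = std::hint::black_box(workload());
    counter.disable()?;

    println!("result = {result}, instructions = {}", counter.read()?);
    Ok(())
}
```

A macro-based runner would hide this setup behind the benchmark definition and could collect cycles, cache misses, etc. in the same way.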
## Benchmark selection

There are many things that we could benchmark. We could roughly divide them into two categories (although the distinction might not always be clear):

1) **Micro-benchmarks**
   Small-ish pieces of code, like the ones in the current rustc benchmark suite. For these we mostly care specifically about their codegen. Examples: check that an identity match is a no-op, check that collecting a vector doesn't allocate unnecessarily, etc. Such micro-benchmarks would probably have considerable overlap with existing codegen tests.
2) **Real-world benchmarks**
   These benchmarks would probably be more useful, as they should check the performance of common pieces of code that occur in real-world Rust projects. Examples: computing an `n-body` simulation, searching through text using the `regex`/`aho-corasick` crates, stress-testing a `hashbrown` table, etc.

These two categories roughly correspond to the existing `primary` and `secondary` categories of comptime benchmarks.

Example pull request that adds a runtime benchmark: https://github.com/rust-lang/rustc-perf/pull/1459

Candidate benchmarks:

- **Sorting** - take e.g. the `slice::sort_unstable` microbenchmarks from `stdlib` and port them to `rustc-perf`.
- **Text manipulation** - use e.g. the `regex` crate to go through a body of text and find/replace several regexes.
- **Hashmap** - insertion/lookup/removal from hashmaps containing items of various sizes and counts (see the sketch after this list).
- **I-slow issues** - port issues with the `I-slow` label from the `rustc` repo. Candidates:
  - [x] [BufReader](https://github.com/rust-lang/rust/issues/102727) issues - [PR](https://github.com/rust-lang/rustc-perf/pull/1460)
  - [nested, chunked iteration](https://github.com/rust-lang/rust/issues/53340); the crate from which it was extracted [has benchmarks and sample data](http://chimper.org/rawloader-rustc-benchmarks/)
  - wasmi ([#102952](https://github.com/rust-lang/rust/issues/102952))
- **Compression/encoding** - adapt benchmarks from various compression and encoding crates.
- **Hashing**
- **Math/simulation**
  - [x] nbody simulation - [PR](https://github.com/rust-lang/rustc-perf/pull/1459)
  - stencil kernels
- **Complex applications** that can still be condensed into a CPU-bound benchmark:
  - rendering the acid3 test with servo
  - TechEmpower [web framework benches](https://github.com/TechEmpower/FrameworkBenchmarks/tree/master/frameworks/Rust)
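As an illustration of the hashmap candidate, the body of such a benchmark might look roughly like the sketch below (the function name and sizes are made up, and the measuring runner is not shown):

```rust
use std::collections::HashMap;
use std::hint::black_box;

/// Hypothetical "hashmap" benchmark body: insert, look up and remove
/// `count` small keys. The surrounding runner (not shown) would measure
/// instruction counts / wall time around a single call to this function.
fn hashmap_insert_lookup_remove(count: u64) -> u64 {
    let mut map: HashMap<u64, u64> = HashMap::with_capacity(count as usize);
    for i in 0..count {
        map.insert(i, i.wrapping_mul(31));
    }
    let mut sum = 0u64;
    for i in 0..count {
        sum = sum.wrapping_add(map[&i]);
    }
    for i in 0..count {
        map.remove(&i);
    }
    sum
}

fn main() {
    // Pass the input and result through `black_box` so the compiler
    // cannot optimise the whole benchmark away.
    println!("{}", black_box(hashmap_insert_lookup_remove(black_box(100_000))));
}
```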
## Benchmark location

We should decide in which repository the benchmarks should live.

- `rustc-perf`
  Since we will most probably want to use the existing rustc-perf infrastructure for storing and visualising the results, this is an obvious choice.
- `rust`
  This would probably make the benchmarks more discoverable and easier for existing rustc developers to add to, but the same could be said about the existing comptime benchmarks.
- A separate repository (e.g. reviving `lolbench`)
  Probably not worth it to spread the benchmarks across yet another repository.

## Benchmark configuration

What configurations should we measure? A vanilla `release` build is the obvious starting point. Any others?

## Infrastructure (CI) cost

Obviously, it will take some CI time to execute the benchmarks. We should decide whether the current perf.rlo infrastructure can handle it.

Since the vast majority of rustc commits shouldn't affect codegen, we can make the runtime benchmarks optional for manual perf runs (e.g. `@rust-timer build runtime=yes`). In terms of automated runs on merge commits, we could run the benchmarks only if some specific parts of the compiler are changed (MIR, LLVM). But this was already envisioned for comptime benchmarks and may not work well.

## Regression policy

We should also think about how important the runtime benchmarks will be for us. Are they lower or higher priority than comptime benchmarks? Do we want to stop merging a PR because of a runtime regression? Do we want to run the runtime suite on all merged commits?

## Implementation ideas

We could reuse the `rustc-perf` infrastructure. We can use the same DB as is used for comptime benchmarks (e.g. store profile=opt, scenario=runtime or something like that, and reuse the `pstat_series` table).

We could put the runtime benchmarking code under the `collector` crate, because we will need the existing infrastructure to use a specific rustc version for actually compiling the runtime benchmarks. Then we could prepare a benchmarking mini-library that would allow crates to register a set of benchmarks; it would execute them and write the results as e.g. JSON to stdout. Something like this:

```rust
fn main() {
    let mut suite = BenchmarkSuite::new();
    suite.register("bench1", || { ... });
    suite.run();
}
```

Then we could create a new directory for runtime benchmarks in `collector`. We could either put all of the runtime benchmarks into a single crate or (preferably) create several crates (each with different dependencies etc., based on the needs of its benchmarks) that would each contain a set of benchmarks. `collector` would then go through all the crates, build them, execute them, read the results from the JSON and store them in the DB.
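For illustration, the per-benchmark results written to stdout could be serialized with `serde`/`serde_json` into something like the following (the field names and shape are entirely hypothetical; `serde` with the `derive` feature and `serde_json` are assumed as dependencies):

```rust
use serde::Serialize;

// Hypothetical shape of the results that a benchmark suite binary
// could print to stdout as JSON for `collector` to parse.
#[derive(Serialize)]
struct BenchmarkResult {
    name: String,
    instructions: u64,
    cycles: u64,
    wall_time_ns: u64,
}

#[derive(Serialize)]
struct BenchmarkOutput {
    results: Vec<BenchmarkResult>,
}

fn main() -> serde_json::Result<()> {
    let output = BenchmarkOutput {
        results: vec![BenchmarkResult {
            name: "bench1".into(),
            instructions: 0,
            cycles: 0,
            wall_time_ns: 0,
        }],
    };
    println!("{}", serde_json::to_string(&output)?);
    Ok(())
}
```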
I'm not sure what the interface should look like; maybe we could introduce another level to the existing commands, like `bench_next comptime --profiles ...` and `bench_next runtime --benchmarks x,y,z`.

A sketch of this implementation can be found [here](https://github.com/rust-lang/rustc-perf/tree/lolperf/runtime).

## User interface

Since we already have the perf.rlo dashboard, it probably makes the most sense to reuse it. Even though we will probably reuse the DB structure of the existing comptime benchmarks, the runtime benchmarks might want a separate UI page, for these reasons:

1) Avoid mixing the runtime benchmarks with the comptime benchmarks.
   The compare page is already complex as it is, and stuffing a bunch of very different benchmarks into it would make it worse.
2) This is not decided yet, but it's possible that we will eventually have **a lot** of runtime benchmarks. After all, they will probably be much faster to execute than the comptime benchmarks, so we could in theory afford that. The compare page might not be prepared for such a large number of benchmarks.
3) We can add runtime-specific functionality to the runtime performance UI page. For example, we could display the codegen diff of some benchmark (this might not be so easy though). Maybe we could just include a redirect to godbolt :)
