# Rust versions
## Rust channels
Rust has three release channels, `stable`, `beta` and `nightly`, of which `stable` and `nightly` are the two that matter most here.
This blog post "Stability as a Deliverable" explains the approach well:
https://blog.rust-lang.org/2014/10/30/Stability.html
Key section "The Plan":
> - New work lands directly in the master branch.
> - Each day, the last successful build from master becomes the new nightly release.
> - Every six weeks, a beta branch is created from the current state of master, and the previous beta is promoted to be the new stable release.
>
> In short, there are three release channels -- nightly, beta, and stable -- with regular, frequent promotions from one channel to the next.
>
> New features and new APIs will be flagged as unstable via feature gates and stability attributes respectively. Unstable features and standard library APIs will only be available on the nightly branch, and only if you explicitly "opt in" to the instability.
>
> The beta and stable releases, on the other hand, will only include features and APIs deemed stable, which represents a commitment to avoid breaking code that uses those features or APIs.
Every 6 weeks a snapshot of `nightly` starts its journey to `stable` (via `beta`), but unstable features stay locked behind feature flags that only `nightly` can enable.
More comprehensive docs here:
https://doc.rust-lang.org/book/appendix-07-nightly-rust.html
Something can be "in stable" but locked away behind a feature flag in reality. Don't get too excited about the "6 weeks" - it is largely meaningless for gauging when features will land, it's only useful for testing (see below).
## Holochain rust channel
Unfortunately `stable` features have been insufficient for conductor development _and_ feature flags cannot be enabled on the `stable` branch.
This was certainly true in 2018 and is still mostly true in mid 2019.
Examples of things that are/were "nightly only" when we first wanted them:
- futures
- async/await
- try_from trait
- fmt
- clippy
- key macrology for HDK
- incremental compilation
- code coverage
- **wasm**
That said, Rust has made a ton of progress towards stabilising all the things that we're using which is awesome. About half of the above is stable at the time of writing, the rest (even clippy!) is ostensibly on the "stable track" and moving forward IIRC.
It is likely that, if we ignore tooling (see below) and downstream consumers (see below), we could move most Holochain repos across to a stable rust version "soon".
Note that as long as WASM needs nightly we are stuck chasing _some_ nightly version, at least for zome/conductor support.
## Zome development channel
Even if we could build conductors entirely on stable there are shared crates used "under the hood" and also exposed via the HDK to zome developers.
For example, serialization, persistence, core types and wasm utils are all required by both conductors and the HDK at the time of writing:
- https://github.com/holochain/holochain-rust/blob/develop/core/Cargo.toml
- https://github.com/holochain/holochain-rust/blob/develop/hdk-rust/Cargo.toml
Rust will not compile a zome if _any_ dependency uses _any_ feature that is incompatible with the compiler version being used for the build.
For example, `nightly` has deprecated trait objects without the `dyn` keyword somewhere between what is currently in the conductor and what is in the current rust compiler.
This means that, for example, zome developers cannot use holochain crates on an old nightly alongside non-holochain crates on a newer nightly if they contain trait objects.
Similar incompatibilities are fairly common.
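To make the `dyn` example concrete, here is a minimal sketch. `Store`, `MemoryStore`, `make_store` and `describe` are hypothetical names for illustration only, not Holochain types.

```rust
// Hypothetical trait and type, used only to illustrate the syntax change.
trait Store {
    fn name(&self) -> String;
}

struct MemoryStore;

impl Store for MemoryStore {
    fn name(&self) -> String {
        "memory".to_string()
    }
}

// Older code wrote the trait object type as `Box<Store>`.
// Newer compilers flag that form via the `bare_trait_objects` lint, so a
// crate that denies warnings stops building after a compiler bump.
// The explicit `dyn` form below has been accepted since Rust 1.27.
fn make_store() -> Box<dyn Store> {
    Box::new(MemoryStore)
}

// Borrowed trait objects need the same keyword: `&dyn Store`, not `&Store`.
fn describe(store: &dyn Store) -> String {
    format!("store: {}", store.name())
}
```

Calling `describe(&*make_store())` yields `"store: memory"`. The fix is mechanical, but it has to be applied in every crate that names a trait object.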
The issue is exacerbated by `cargo` making no distinction (AFAICS) between compiler versions at the crate level. i.e. the supported compiler versions aren't surfaced/checked/tooled in either `cargo` or crates.io.
This means that a crate can have a dependency that "works on my machine" but not yours if we have different nightlies. Semver and `Cargo.toml` config does nothing to help this.
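Since `cargo` does not check this for us, one option is for tooling to fail fast when the local compiler is not the expected one. The sketch below is an assumption about how such a check could look, not something that exists in our repos. `Toolchain`, `parse_rustc_version` and `check_pinned` are hypothetical names, and the version strings in the comments are sample values.

```rust
use std::process::Command;

/// Channel and commit date parsed from `rustc --version` output.
#[derive(Debug, PartialEq)]
struct Toolchain {
    nightly: bool,
    commit_date: String,
}

/// Parse output shaped like `rustc 1.38.0-nightly (69656fa4c 2019-07-13)`.
fn parse_rustc_version(output: &str) -> Option<Toolchain> {
    let mut parts = output.split_whitespace();
    if parts.next()? != "rustc" {
        return None;
    }
    let version = parts.next()?;
    let _hash = parts.next()?; // e.g. `(69656fa4c`, not needed here
    let date = parts.next()?.trim_end_matches(')');
    Some(Toolchain {
        nightly: version.ends_with("-nightly"),
        commit_date: date.to_string(),
    })
}

/// Compare the compiler we found against the one the tooling expects.
fn check_pinned(expected: &Toolchain, output: &str) -> Result<(), String> {
    let actual = parse_rustc_version(output)
        .ok_or_else(|| format!("unrecognised rustc output: {}", output))?;
    if &actual == expected {
        Ok(())
    } else {
        Err(format!("expected {:?}, found {:?}", expected, actual))
    }
}

/// Ask whatever `rustc` is first on `$PATH` for its version string.
fn installed_rustc_version() -> std::io::Result<String> {
    let out = Command::new("rustc").arg("--version").output()?;
    Ok(String::from_utf8_lossy(&out.stdout).trim().to_string())
}
```

A build script or smoke test could pass the result of `installed_rustc_version()` to `check_pinned` and abort with the error message when the two differ.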
This means we must target the compiler version that downstream zome developers need for compatibility with third party crates they will need to use.
For zome dev support then we need to either:
- commit to supporting `stable` forever
- commit to chasing `nightly` until `stable` can be supported forever
As we need nightly ourselves right now we don't really have a choice here.
## Scheduling updates
It's obviously not possible to know _in general_ what crates wild zome developers will need.
The best we can do is offer a reasonable and clear update schedule that aligns with how Rust itself operates.
In an ideal world we would simply track nightly... nightly. Such a short cycle is _currently_ too disruptive for both crate maintainers and the integration team. Long term we could potentially accommodate this, or even move to stable, but we're not there yet.
Our experience with the core release process, dev pulse communications, holoport shipping dates, etc. all demonstrates a _very strong_ pull from the community for predictable, timely and reliable updates across the board (comms, binaries, tooling, docs, etc.).
Even more so than regular software development, the crypto industry is plagued by vapourware and "exit scams". People _in general_ are hyper-sensitive to bad surprises and many will _intentionally_ spread FUD to manipulate markets or just fabricate some "news".
In the immediate term the (proposed) best way to serve the community would be:
- Select a channel and stick to it
- Select an update frequency and stick to it
- Create processes aligned with The Rust Way
- Hardcode correct compiler versions directly into the tooling
- Document, train and communicate what is going on
Many of the difficulties we have experienced with hitting our external deliverables through the conductor binary release process are a direct result of allowing schedules/standards/CI to drift upstream which often leads to decoherence, confusion and late nights for the integration/comms/happs teams downstream. This is a similar situation.
## Update overhead/co-ordination & Rust "trains"
Scheduled upgrades are also amenable to some degree of automation.
Automation is critical long term because:
- We plan to have many interdependent repositories
- This process is very time consuming and
- error prone and
- sensitive to team dynamics/deadlines and
- scales badly as more repos are added
For example, a scripted cron job can:
- Find the latest version that hits the tooling requirements e.g. https://rust-lang.github.io/rustup-components-history/
- Bump holonix config e.g. with a nix parser https://gitlab.com/jD91mZM2/rnix
- Throw PRs up against all repos we know about
- Maybe merge PRs automatically if they pass CI
- Maybe do several passes of this to handle the DAG of dependencies between upstream/downstream shared crates
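The "several passes" idea in the last bullet amounts to ordering the repos so that each one is bumped only after everything upstream of it. A minimal sketch, assuming we can describe the repos as a map from repo name to its upstream repos. `plan_passes` is a hypothetical name, and any dependency edges fed to it are our own description of the repos, not something read from `Cargo.toml`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group repos into passes so that every repo is bumped only after all of
/// its upstream dependencies. Repos in the same pass can get PRs in parallel.
/// Returns `None` if there is a cycle or a dependency on an unknown repo.
fn plan_passes<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<Vec<String>>> {
    let mut done: BTreeSet<&'a str> = BTreeSet::new();
    let mut passes = Vec::new();
    while done.len() < deps.len() {
        // A repo is ready once all of its upstream repos have been handled.
        let ready: Vec<&'a str> = deps
            .iter()
            .filter(|(repo, ups)| {
                !done.contains(*repo) && ups.iter().all(|u| done.contains(u))
            })
            .map(|(repo, _)| *repo)
            .collect();
        if ready.is_empty() {
            return None; // nothing can make progress
        }
        done.extend(ready.iter().copied());
        passes.push(ready.iter().map(|r| r.to_string()).collect());
    }
    Some(passes)
}
```

Each inner list is one round of PRs. The next round only starts once the previous one has merged, which is where the "maybe merge automatically if CI passes" step would slot in.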
In the immediate term all of this will be manual of course (someone from iSquid will bumble through it kaizen style). Just imagine some future state where it is all magically automated so we can be working backwards from that conceptually now.
But the happy path all assumes the CI passes.
If it doesn't then that means something in Rust nightly has changed in a way that breaks what we are doing OR we have shipped breaking changes in our own upstream crates that need to be accommodated downstream.
For an example of both, see this suite of first-pass feedback PRs:
- https://github.com/holochain/holochain-rust/pull/1555
- https://github.com/holochain/lib3h/pull/125
- https://github.com/holochain/holochain-persistence
- https://github.com/holochain/holochain-serialization
We can see evidence of the following:
- The deprecation of trait objects without the `dyn` keyword here means manual updates to _all_ repos
- Removal of `FnBox` means changes to _one_ repo
- Upstream changes in the third party `libc` crate are required to track nightly
- Significant breaking changes to `RealEngine` in lib3h `develop` need to be supported by the downstream core repo
- Downstream crates are impacted far more than upstream
- Downstream crates need to do weird things in their `Cargo.toml` file before upstream releases are finalised, so a single pass of PRs is not sufficient
All of these things need to be cleaned up before the new nightly can be shipped.
_At this point manual review by humans is required._
This is roughly the same problem that Rust itself faces, which they mitigate with the `beta` branch. Downstream consumers are encouraged to include all of `nightly`, `beta` and `stable` in their CI even if they only plan to use `stable`. There are official channels to provide upstream feedback ahead of time to Rust maintainers if they see things breaking unreasonably, otherwise downstream eats the consequences when they land.
The Rust docs describe the beta branch as a train leaving a station (it is literally called the "train model").
https://doc.rust-lang.org/book/appendix-07-nightly-rust.html#choo-choo-release-channels-and-riding-the-trains
> This is called the “train model” because every six weeks, a release “leaves the station”, but still has to take a journey through the beta channel before it arrives as a stable release.
>
> Rust releases every six weeks, like clockwork. If you know the date of one Rust release, you can know the date of the next one: it’s six weeks later. A nice aspect of having releases scheduled every six weeks is that the next train is coming soon. If a feature happens to miss a particular release, there’s no need to worry: another one is happening in a short time! This helps reduce pressure to sneak possibly unpolished features in close to the release deadline.
>
> Thanks to this process, you can always check out the next build of Rust and verify for yourself that it’s easy to upgrade to: if a beta release doesn’t work as expected, you can report it to the team and get it fixed before the next stable release happens! Breakage in a beta release is relatively rare, but rustc is still a piece of software, and bugs do exist.
It would be relatively straightforward (conceptually and technically) for us to incorporate a similar model where "feedback PRs" for rust versions are automatically created on a similar schedule (e.g. 4/6/8 weeks) and then force merged at the end of the period if not accepted earlier.
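The scheduling arithmetic behind such a train is simple. The sketch below is a hypothetical illustration (`TrainStatus` and `train_status` are names we made up) of how automation could answer "which train are we on and how long until it leaves?" for any period length.

```rust
/// Position of a given day within a fixed-length release train.
#[derive(Debug, PartialEq)]
struct TrainStatus {
    /// Zero-based index of the train whose feedback PRs are currently open.
    train: u32,
    /// Days left before the open feedback PRs are force merged.
    days_until_departure: u32,
}

/// `period_weeks` is the train length (e.g. 4, 6 or 8) and
/// `days_since_first_train` counts days since the first feedback PRs opened.
fn train_status(period_weeks: u32, days_since_first_train: u32) -> TrainStatus {
    assert!(period_weeks > 0, "a train needs a non-zero period");
    let period_days = period_weeks * 7;
    TrainStatus {
        train: days_since_first_train / period_days,
        days_until_departure: period_days - days_since_first_train % period_days,
    }
}
```

With a 6 week period, day 41 is the last day of train 0 (one day left) and day 42 opens train 1 with a full 42 days to go. The same numbers could drive the automated notifications.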
Most of the time, where crates are focussed in scope and mostly targeting stable features, the rust version changes will have little to no material impact.
Rust version changes have larger impact when:
- More unstable features are used
- Crates are poorly factored or very large in terms of LOC
- Crates use many dependencies that also chase nightly
- Frequency of updates is too short (choppy) or too long (overwhelming)
- They are accompanied by unrelated breaking changes from upstream crates
In the "train model" the exact length of review time (within reason) is less important than the fact that at some point the train will leave the station and that everybody knows when this is. It is up to crate maintainers to decide what that means for their crate (sprint planning, etc.) but ultimately code _will_ ship on schedule even if that means missing features/bug-fixes (until the next train).
It is up to the integration team to tweak the frequency of trains to find the "least bad" sweet spot for everyone (crate, conductor, internal and wild zome devs). We probably need to create some super clear automated notifications too.
From Damien:
> Nightly version changes can break previously valid code and can take time to solve. This actually happened to us back at the first devCamp when Nico did an update the day before the DevCamp. We were tight on having something to show for DevCamp and this certainly did not help.
Downstream and upstream should be tracking/mitigating the impact of breaking changes bidirectionally during the grace period. It makes sense that there would be more pressure further upstream to clean things up faster, to allow as much time as possible for downstream to assess the impact of changes and do their own cleanup.
It may make sense for upstream crates to "quarantine" unrelated breaking changes behind compiler flags, git branches, semver or similar until they are ready to be incorporated downstream.
It may make sense for upstream crates under active development to schedule/automate releases on a much shorter frequency than the rust version updates, e.g. any one of the weekly binaries/release tags shipping from core can easily catch a 6 or 8 weekly train without additional planning. This would make it easier to roll rust version updates straight into a more frequent pre-existing "train" without the need for additional processes/infrastructure.
It may make sense for an upstream crate to target `stable` as this effectively guarantees that it will never be impacted by `nightly` changes.
## Rust ecosystem discussions
This has been heavily discussed for years. The Rust community takes stability and the "trains" very seriously _and_ this model causes similar opportunities and pain for others.
A tiny snapshot of discussions:
- https://internals.rust-lang.org/t/idea-semi-stabilization/9655
- https://www.reddit.com/r/rust/comments/7umj04/nightly_or_stable_you_may_have_no_choice/
- https://www.reddit.com/r/rust/comments/bc4whg/whats_with_the_nightly_fixation/
- http://xion.io/post/programming/rust-nightly-vs-stable.html
- https://jonathanmh.com/rust-nightly-stable-rustup-may-not-used-error/
- https://www.reddit.com/r/rust/comments/800e3n/recommended_to_learn_using_the_stable_or_nightly/
- https://blog.rust-lang.org/2014/10/30/Stability.html
There are some comments in those threads such as "firefox uses stable" but this is achieved by rewriting binaries in a post-compilation phase to simulate things the nightly compiler does natively, which is something we can't expect typical zome developers to be implementing.
OTOH we aren't expecting zome developers to be doing anything approaching the complexity of creating firefox.
Either way, zomes _are_ WASM so _must_ be using nightly for short-mid term. The more pragmatic question is whether our HDK macros, shared holochain crates and all third-party crates that make sense inside a zome can commit to stable _internal Rust APIs_ any time in the near future.
Every year a rust survey is conducted and a standard question is usage of stable vs. nightly.
From 2018:
https://blog.rust-lang.org/2018/11/27/Rust-survey-2018.html
> We’re seeing similar numbers in users of the current stable release since last year. Perhaps surprisingly, we’re continuing to see a rise in the number of users who use the Nightly compiler in their workflow. For the second year in a row, Nightly usage has continued to rise, and is now over 56% (up from 51.6% of last year).
>
> When asked why they used nightly, people responded with a broad range of reasons including: access to 2018 edition, asm, async/await, clippy, embedded development, rocket, NLL, proc macros, and wasm.
## Tooling channels
Tooling is frequently discussed in the Rust ecosystem as the exception to what should be expected in stable. It is frequently recommended to develop in nightly to access tooling but target the stable channel for release artifacts.
That said it appears that all the following might be on stable now or very soon (this is all badly documented and moves very fast, we'd need to test it thoroughly):
- fmt
- clippy
- tarpaulin
If we can hit these three things reliably in stable then that gives us a _lot_ more freedom to minimise our use of nightly. We could restrict `nightly` to only crates that need WASM and advanced concurrency primitives. We could be more careful about designing crates that touch WASM and concurrency vs. everything else we do.
The good news is that `holonix` is mature enough to be able to handle at least `fmt` and `clippy` natively as well as centralised deployments of `nightly` and `stable` channels.
One major caveat is that Rust itself doesn't consistently support all tooling for all versions for all target platforms. Holonix only supports the tools that line up in context.
There is a nice website visualising the mess here:
https://rust-lang.github.io/rustup-components-history/
Until I found this site I was just throwing dates randomly at Holonix until I could get smoke tests to pass, which is pretty tedious/error prone :sweat_smile:
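The lookup that replaces "throwing dates randomly" can be sketched as follows. This is a minimal illustration under assumptions: `latest_usable_nightly` is a hypothetical name, and the availability data is supplied by hand rather than read from the site, whose data format is not assumed here.

```rust
use std::collections::BTreeMap;

/// Return the latest date on which every required component was available.
/// `history` maps an ISO date (`YYYY-MM-DD`) to the components shipped with
/// that day's nightly. ISO dates sort chronologically as plain strings, so
/// walking the map in reverse visits the newest nightly first.
fn latest_usable_nightly<'a>(
    history: &BTreeMap<&'a str, Vec<&'a str>>,
    required: &[&str],
) -> Option<&'a str> {
    history
        .iter()
        .rev()
        .find(|(_, available)| required.iter().all(|c| available.contains(c)))
        .map(|(date, _)| *date)
}
```

Given a history where the newest nightly is missing `rustfmt`, asking for both `rustfmt` and `clippy` returns the most recent earlier date that ships both, or `None` if no such date exists.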
Now all we need to do to pull in the right Rust versions for all devs using `nix-shell` is to update Holonix and have each repo point to the latest release tag.
Anyone tracking `https://holochain.love` will get the latest rust version as it lands in holonix (we can setup "trains" here also to sync with the timelines we schedule above).
Our experience is that any ad-hoc tooling based off `rustup` e.g. "install scripts" or `make` commands ends up being very fragile/verbose/defensive quickly or starts to "drift" so that developers find themselves forked onto a different compiler reality.
This problem is well known in sysadmin/ops, e.g. this whitepaper:
https://www.usenix.org/legacy/event/lisa02/tech/full_papers/traugott/traugott_html/
That defines systems in terms of "divergence", "convergence" and "congruence". Basically, we have some representation of how we want our system to look as data and the system either drifts away from, towards, or immutably matches the data over time. Any drift at all could be inconsequential or catastrophic depending on context.
Normally this is mostly handled by virtualisation (e.g. docker) across both development/testing environments and servers. We _do_ have VMs in addition to (on top of) nix-shell but they don't tick all our boxes.
The nature of (Holo)chain blurs a lot of lines here and creates a lot of additional constraints for what we must/can/cannot reasonably plan to do with tooling:
- Holochain is a suite of open source rust/js/etc. repositories
- Holo is a company with proprietary/closed/paid infra
- Users of conductors are Holochain "servers" we have no access to
- HoloPorts are delegated Holochain "servers" we have no physical access to
- Developers of zomes are doing their own thing in the wild, both personally/individually and commercially, they look to us for "best practice" but also need an extension point for their bespoke needs
- Zomes/conductors need high assurance that they can compile consistently from known source components in order to validate the DHT consistently
From Damien (and experienced by others):
> Notification of version changes can take time to propagate through the org which can end up in programmers wasting time trying to figure out why CI is red but locally everything works. The actual problem being a difference in versions.
>
> On version change, all programmers (or maybe its just me?) need to manually install the new version & components and set the new default.
This is all a real pain point (of which the rust version is just one tiny component) and a big motivation behind the holonix adoption/work in general.
This "drift" cannot happen inside a nix-shell as the versions are scanned by nixos hashing the derivations that build rust (sort of like zomes for devops). If a developer has the wrong version of rust when using holonix it means they have done something locally that overrides what nix is doing (also somewhat common when working with `rustup`, but a separate issue that is being addressed elsewhere). If the `default.nix` file in a repository targets a specific holonix tag/commit then the rust compiler is not only distributed in real time to all devs but they also _time travel back to the correct compiler version_ when using older versions of the code.
## Are we stable yet?
Until WASM + async/await lands in stable we clearly cannot be all stable.
Assuming the tooling story (see above) lands then...
Individual crates probably can and should be targeting `stable` ASAP as it makes all the pain go away (in theory).
Rust's stability policy _guarantees_ that crates using only `stable` features can _always_ be used with each other, and alongside `nightly` crates, on any later compiler in the same major version (e.g. rust 1.x). The inverse is not true: a crate that needs `nightly` features has no such guarantee.
This is a good reason to be splitting crates out into carefully factored components as it quarantines the "need" for nightly to the bits that directly touch WASM and concurrency.
It would be relatively easy to put long-running PRs up against candidate repos that use the `stable` compiler in holonix. This PR can be periodically reviewed with the latest `develop` branch merged in to gauge whether the crate can be "stabilized" from the compiler's perspective.
Note that we're strictly only talking about using the `stable` compiler here. A crate could be under heavy development and be "unstable" while still only using "stable" Rust features. The faster we can identify relatively stable crates from both the code and compiler's perspective and split them out, the better. This has tons of benefits but is out of scope for this document.
## IDE support
From Damien (similar concerns coming from community and internal):
> Using latest nightly means your IDE is not up to date with the version of the code you are writing in, which can lead to a broken/buggy IDE or an IDE with some features not accessible/working (e.g. debugger).
Admittedly this is pretty patchy right now.
Every IDE seems to do its own thing, most try to wrap rustup and managing rust versions is a manual/skilled process.
For example, the Atom rust integration seems to be configurable but target `stable` by default, then latest `nightly` as the next fallback.
Realistically, if we can get Holonix integrated with IDEs I would expect that the rust versioning comes "for free" at that point.
The things needed for IDE support AFAICS:
- not force usage of rustup (which overrides nix)
- support being launched from nix-shell so $PATH is nicely setup
- support config pointing to rust or detection from $PATH (point to /nix/store)
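The `$PATH` detection in the last bullet could look something like the sketch below. It is only an assumption about how an IDE plugin might do it. `find_rustc_dir` and `is_nix_managed` are hypothetical names, and the file system check is passed in as a closure so the logic can be exercised without a real install.

```rust
use std::path::{Path, PathBuf};

/// First directory in a `PATH`-style string for which `has_rustc` returns true.
/// A real caller would pass something like `|dir| dir.join("rustc").is_file()`.
fn find_rustc_dir(path_var: &str, has_rustc: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    std::env::split_paths(path_var).find(|dir| has_rustc(dir.as_path()))
}

/// True if the directory lives under the nix store, i.e. nix-shell put it there.
fn is_nix_managed(dir: &Path) -> bool {
    dir.starts_with("/nix/store")
}
```

If the first `rustc` found is not nix managed, the IDE is about to use a compiler that holonix did not provide, which is exactly the drift described above.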
We _are_ getting pull internally and externally for this but it isn't being treated as a super high priority on the integration team SOA at the time of writing.
All I can say at this point is that versioning is unlikely to be a major additional hassle anywhere that a nix-managed $PATH can be accessed.
Therefore, without knowing more, I don't want to get too deep into that here and rather treat it as its own line of inquiry.
The biggest potential gnarliness I see is that nix-shell support for Windows is currently handled through VMs (vagrant/docker) and will be until the native NixOS WSL lands upstream.
At the least, clearly defining a versioning schedule that everyone can plan to won't make IDE support any worse/messy than it already is.