# Notes for pSIMD RFC

###### tags: `Portable SIMD` `RFC Draft`

## Describing What We've Been Doing

I have been reviewing the thing we've been doing, and currently I am brainstorming ways to describe what we have done so that we can pitch it in our RFC, describe the pros, and address the cons.

The way I figure it, we've been approaching it kind of as "array types that you can math on". So maybe it's really `core::APL` but without the weird symbols. ^^; And this is, given a somewhat reductive perspective, what SIMD *is* (and yes, APL interpreters and the rare compiler do vectorize nicely!).

I say this is what we've been doing rather than aiming directly at providing hardware access because if an op **is not** provided by SIMD ISAs but **is** considered an essential part of the numeric types, like division, we've been tending towards adding it anyways, on the assumption that we are going to be able to provide a reasonably efficient emulation and you, the Rust programmer, can't actually do better. But also in some cases where an op is in some weird AVX512-specific extension and has no clear and obvious path to making it happen as a "Rusty" function because it doesn't clearly map to existing Rust primitive types, we've been glancing aside.

The way I figure it, the reason we don't just implement Portable SIMD as an abstraction on top of raw arrays (I mean, they are internally, but) is that we would have to create a way to limit this to only the appropriate integer types and such... to make `repr_simd` into a trait, in essence. Maybe that would be a good thing, though?

Part of the reason I linked SIMDe and Wasm SIMD stuff is that they are two alternate perspectives, so I found the contrast useful to reflect on: focusing on direct emulation of the ISA's model (whether it's via intrinsic functions or whatever) or on making something that is perfectly absolutely portable and has high performance in all cases.
---Jubilee

## Notable Challenges / Tradeoffs

- the "libm problem"
- portability vs. performance
- kiiinda incomplete without multiversioning?
- the inline vs. ABI issue
- bonus issues about float portability too!!!
- what does "architecture neutral" SIMD even mean anyways?

# RFC Template

- Feature Name: `portable_simd_api`
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

A portable SIMD library will be added to the Rust standard library, accessible through `core` and `std` as appropriate. This will offer abstractions over arrays of primitive datatypes which can explicitly use SIMD instructions where applicable.

# Motivation
[motivation]: #motivation

A programming language fundamentally is in the business of providing abstractions over the implementation details of computing architectures. Rust, since rustc's 1.0 release in 2015, has mostly been aimed at "conventional" modern architectures such as the x86-64 processor. To allow the use of the more powerful features of a given architecture, the ability to use what are called "intrinsic functions" was added to the language in the form of `std::arch`. Since then, Rust has expanded its support to many architectures, including tier 1 support for an aarch64 platform, and users expect it to be able to obtain high performance on all platforms.

Much as Rust provides abstractions over floating point and atomic memory instructions, there is a growing desire to have an explicit abstraction over SIMD operations so that code can make use of SIMD registers or vector coprocessors without having to be rewritten for the details of every possible target architecture.
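
To make that desire a bit more concrete, here is a minimal sketch, on stable Rust, of what an architecture-neutral "array you can do math on" might feel like to use. The `Lanes<T, N>` type and its `Add` impl are hypothetical names invented for illustration; they are not the API this RFC proposes, and the plain loop relies on the optimizer rather than explicitly selecting SIMD instructions.

```rust
use std::ops::Add;

/// Hypothetical stand-in for a portable vector type: `N` lanes of a primitive `T`.
/// Illustration only; this is not the API proposed by the RFC.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes<T, const N: usize>(pub [T; N]);

impl<T: Add<Output = T> + Copy, const N: usize> Add for Lanes<T, N> {
    type Output = Self;

    /// Lane-wise addition: lane `i` of the result depends only on lane `i`
    /// of each input, and the trip count `N` is known at compile time.
    fn add(self, rhs: Self) -> Self {
        let mut out = self.0;
        for i in 0..N {
            out[i] = self.0[i] + rhs.0[i];
        }
        Lanes(out)
    }
}

fn main() {
    // The lane count (4) is inferred from the array literal.
    let a = Lanes([1.0f32, 2.0, 3.0, 4.0]);
    let b = Lanes([10.0f32, 20.0, 30.0, 40.0]);
    let sum = a + b;
    println!("{:?}", sum.0); // prints [11.0, 22.0, 33.0, 44.0]
}
```

Nothing in this source names a register width or an instruction set, so it compiles unchanged for any target; what differs from target to target is how the lane-wise loop is lowered onto that target's SIMD operations.
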
The implementation of these has many subtle differences, including in performance, but the basic notion, of operating on `N` instances of some primitive `T`,

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means:

- Introducing new named concepts.
- Explaining the feature largely in terms of examples.
- Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible.
- If applicable, provide sample error messages, deprecation warnings, or migration guidance.
- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers.

For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

This is the technical portion of the RFC. Explain the design in sufficient detail that:

- Its interaction with other features is clear.
- It is reasonably clear how the feature would be implemented.
- Corner cases are dissected by example.

The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.

# Drawbacks
[drawbacks]: #drawbacks

Why should we *not* do this?

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

The underlying details of SIMD architectures vary much more than those of atomics or even floating point do.
This makes selecting the best approach difficult, as a severe impedance mismatch between the programmer and the architecture can result in poorly optimized code. However, the status quo is that every Rust programmer who wants to vectorize their code has to tediously attempt variations of code and inspect the assembly output of every variation.

Even so, all SIMD architectures, whether they use fixed registers or variable-length vectors, operate on arrays of a given data type. This offers a natural programming idiom that suits Rust's existing semantics and generalizes well over architectures: use an array of whatever size you need and allow the compiler to take care of the details.

- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?
- What is the impact of not doing this?

# Prior art
[prior-art]: #prior-art

Discuss prior art, both the good and the bad, in relation to this proposal. A few examples of what this can include are:

- For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience has their community had?
- For community proposals: Is this done by some other community and what were their experiences with it?
- For other teams: What lessons can we learn from what other communities have done here?
- Papers: Are there any published papers or great posts that discuss this? If you have some relevant papers to refer to, this can serve as a more detailed theoretical background.

This section is intended to encourage you as an author to think about the lessons from other languages and to provide readers of your RFC with a fuller picture. If there is no prior art, that is fine - your ideas are interesting to us whether they are brand new or if it is an adaptation from other languages.

Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC. Please also take into consideration that Rust sometimes intentionally diverges from common language features.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

- What parts of the design do you expect to resolve through the RFC process before this gets merged?
- What parts of the design do you expect to resolve through the implementation of this feature before stabilization?
- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

# Future possibilities
[future-possibilities]: #future-possibilities

Think about what the natural extension and evolution of your proposal would be and how it would affect the language and project as a whole in a holistic way. Try to use this section as a tool to more fully consider all possible interactions with the project and language in your proposal. Also consider how this all fits into the roadmap for the project and of the relevant sub-team.

This is also a good place to "dump ideas", if they are out of scope for the RFC you are writing but otherwise related.

If you have tried and cannot think of any future possibilities, you may simply state that you cannot think of anything.

Note that having something written down in the future-possibilities section is not a reason to accept the current or a future RFC; such notes should be in the section on motivation or rationale in this or subsequent RFCs. The section merely provides additional information.