# RFC: Fixing ABI for SIMD types
- Feature Name: `x86_simd_abi`
- Start Date: 2023-10-09
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
# Summary
[summary]: #summary
Currently, the ABI used to pass SIMD types (and, in some cases, floating-point types) can differ depending on `#[target_feature]` specifications when crucial features are disabled. This RFC seeks to disallow code whose ABI would differ, and to allow functions using `#[target_feature]` and SIMD types to be soundly callable. It is also intended to allow users to declare external functions that use SIMD types, which is currently unstable due to ABI issues.
# Motivation
[motivation]: #motivation
Currently, on at least x86 and x86_64, the `extern "C"` ABI of SIMD types is poor. By default, the ABI passes the whole value in a single SIMD register. When this is not possible, because the required registers are unavailable, the calling convention is silently adjusted to a wholly different ABI. This can be both subtle and surprising to low-level unsafe programmers (and even safe programmers).
Further, because the behaviour the ABI prescribes in this case is subtle and surprising, the proper feature-less ABI is not even correctly implemented by LLVM, and thus rustc, leading to differences from C code (and likely Rust code) compiled by gcc and other C or Rust compilers.
Two mitigations are already implemented in Rust today: passing SIMD types indirectly for the `extern "Rust"` ABI, and linting on SIMD types under the `improper_ctypes_definitions` lint. Both of these solutions have issues, addressed in [Rationale and alternatives](#rationale-and-alternatives).
# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation
The SIMD types supported by `rustc` behave inconsistently across processor variants. For example, modern x86 processors have wider vector registers available, so code compiled for them will naturally want to pass parameters in these wider registers. For externally-facing code, this is a problem. As such, `rustc` should reject SIMD types in functions where the "most correct register" does not exist, until a safe and consistent approach is determined.
# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation
The following types are considered "x86 SIMD" types (types without a prefix live in the `core::arch::x86` and `core::arch::x86_64` modules depending on platform):
* `__m128`
* `__m128i`
* `__m128d`
* `__m256`
* `__m256i`
* `__m256d`
* `__m512`
* `__m512i`
* `__m512d`
* `__m128bh`
* `__m256bh`
* `__m512bh`
(In general, any type that is implemented using `#[repr(simd)]` when targeting x86.)
The x86 SIMD types are divided into groups, based on the target feature that each requires.
When a function is called, declared, or defined, such that any parameter or return value directly contains an x86 SIMD type, the required target feature must be enabled or an error is issued.
Functions that pass or return x86 SIMD types behind a pointer or reference type should be accepted in all cases.
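As a rough illustration of the by-reference case, the sketch below defines an `extern "C"` function that takes `&__m256` without enabling `avx`. The name `first_lane` is hypothetical, the `#[allow]` is only there in case the existing `improper_ctypes_definitions` lint fires, and the example assumes an x86_64 target (the items are compiled out elsewhere).

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::__m256;

/// Hypothetical example: `avx` is NOT enabled here. The vector is only
/// reached through a reference, so the by-value ABI of `__m256` (the part
/// that depends on target features) is never involved in the call.
#[cfg(target_arch = "x86_64")]
#[allow(improper_ctypes_definitions)]
pub extern "C" fn first_lane(x: &__m256) -> f32 {
    // Safety: `__m256` and `[f32; 8]` have the same size, and every bit
    // pattern is valid for both types.
    let lanes: [f32; 8] = unsafe { core::mem::transmute(*x) };
    lanes[0]
}
```

A caller can build the vector with `core::mem::transmute` from a `[f32; 8]` and pass `&v`; only a pointer crosses the function boundary.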
The target features that are considered include those implied by the current target and CPU, those globally enabled by a compiler switch, and those enabled per-function using the `#[target_feature]` attribute.
In general, x86 SIMD types that are 128 bits in length require the `sse` feature, x86 SIMD types that are 256 bits in length require the `avx` feature, and x86 SIMD types that are 512 bits in length require the `avx512f` feature (or future equivalent, such as `avx10/512`).
When an error is issued under this rule, a hint to enable the target feature may be appropriate. The error should be type-specific for types in `core::arch` (naming the exact type).
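The width-to-feature rule and the error it produces can be modelled in a few lines. This is only a sketch of the rule as stated above, not of any compiler's implementation: `required_feature` and `check_by_value` are hypothetical names, and the `enabled` slice is assumed to already contain every feature in effect (target, compiler switch, and `#[target_feature]`).

```rust
/// Hypothetical helper: the target feature this RFC would require for an
/// x86 SIMD type of the given width in bits, or `None` for other widths.
pub fn required_feature(bits: u32) -> Option<&'static str> {
    match bits {
        128 => Some("sse"),
        256 => Some("avx"),
        512 => Some("avx512f"),
        _ => None,
    }
}

/// Hypothetical check: `Ok(())` if a type of this width may be passed or
/// returned by value with the `enabled` features, otherwise an error
/// message that names the exact type and the missing feature.
pub fn check_by_value(type_name: &str, bits: u32, enabled: &[&str]) -> Result<(), String> {
    match required_feature(bits) {
        Some(feature) if !enabled.contains(&feature) => Err(format!(
            "passing (or returning) `{type_name}` requires the `{feature}` target feature"
        )),
        _ => Ok(()),
    }
}
```

For instance, `check_by_value("__m256", 256, &["sse", "sse2"])` returns an error naming `avx`, while adding `"avx"` to the slice makes the same call succeed.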
Each of the following declarations is disallowed under this RFC.
```rust
// compile-args: --target x86_64-unknown-linux-gnu -C target-cpu=x86-64
use core::arch::x86_64::{__m256, __m512};

// Error: Passing (or returning) `__m256` requires the `avx` target feature
// Hint: Add `#[target_feature(enable="avx")]` to use `__m256`
pub extern "C" fn foo(x: __m256) -> __m256 {
    x
}

// Error: Passing (or returning) `__m512` requires the `avx512f` target feature
// Hint: Add `#[target_feature(enable="avx512f")]` to use `__m512`
pub extern "C" fn bar(x: __m512) -> __m512 {
    x
}
```
When a function is called via an fn-item type (not a function pointer type), the target features of the callee are considered. This allows calling functions across target feature boundaries, such as platform intrinsics (where `is_x86_feature_detected!` reports the feature as available).
Implementations of Rust would be required to ensure this call is performed properly (with parameters passed/returned in the appropriate registers), which may entail the use of a shim function or alternative code generation techniques.
The following code should be accepted, and have well-defined results.
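```rust
// compile-args: --target x86_64-unknown-linux-gnu -C target-cpu=x86-64
use core::arch::x86_64::__m256;

// `avx` is enabled for this function only, not for the whole crate.
#[target_feature(enable="avx")]
pub extern "C" fn foo(x: __m256) -> __m256 {
    x
}

pub fn main() {
    // The call goes through the fn-item `foo`, so the callee's features apply.
    eprintln!("{:?}", unsafe { foo(core::mem::transmute([1.0f32; 8])) });
    // Prints: __m256(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
}
```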
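For comparison, calling across a target-feature boundary is usually written by hand in current Rust roughly as follows: detect the feature at run time, then call the feature-gated function. This is a sketch only. The names `double_avx` and `double_all` are hypothetical, and the lanes are deliberately passed as a plain array behind a reference so the example does not depend on the by-value vector ABI this RFC is about.

```rust
/// Hypothetical AVX-only kernel. No vector type appears in the signature;
/// the 256-bit vector exists only inside the function body.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn double_avx(lanes: &[f32; 8]) -> [f32; 8] {
    use core::arch::x86_64::{_mm256_add_ps, _mm256_loadu_ps, _mm256_storeu_ps};
    let mut out = [0.0f32; 8];
    // Safety: both pointers refer to 8 valid `f32`s, and the unaligned
    // load/store intrinsics have no alignment requirement.
    unsafe {
        let v = _mm256_loadu_ps(lanes.as_ptr());
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(v, v));
    }
    out
}

/// Portable entry point: uses AVX when the running CPU reports it,
/// and a scalar loop otherwise (including on non-x86_64 targets).
pub fn double_all(lanes: [f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx") {
            // Safety: AVX support was just confirmed at run time.
            return unsafe { double_avx(&lanes) };
        }
    }
    lanes.map(|x| x + x)
}
```

Calling `double_all([1.5; 8])` yields eight lanes of `3.0` on either path. Under this RFC, the same pattern could pass `__m256` by value to the `avx` function, with the implementation responsible for making the call use the right registers.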
When a function is called via a function pointer, the target features of the callee cannot be determined. The error is thus issued based on the target features of the caller. This choice is sound because, absent the use of FFI, which is unsafe, it is impossible to declare or define functions, and thus obtain valid pointers to them, without the feature required to call them properly.
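The difference between the two kinds of call can be summarised in a small model. This is an illustrative sketch of the rule as described, with hypothetical names (`CallKind`, `features_for_check`); it does not reflect how any compiler represents calls.

```rust
/// Hypothetical model of how a call reaches its target.
pub enum CallKind<'a> {
    /// Call through an fn-item type: the callee, and so its features, are known.
    FnItem { callee_features: &'a [&'a str] },
    /// Call through a function pointer: the callee cannot be determined.
    FnPointer,
}

/// Returns the feature set that the by-value SIMD check would consult
/// for a call of the given kind.
pub fn features_for_check<'a>(
    kind: &CallKind<'a>,
    caller_features: &'a [&'a str],
) -> &'a [&'a str] {
    match kind {
        // Known callee: its own features decide whether the call is allowed.
        CallKind::FnItem { callee_features } => *callee_features,
        // Unknown callee: fall back to the features of the calling function.
        CallKind::FnPointer => caller_features,
    }
}
```

With a caller that only has `sse` and a callee declared with `avx`, a direct call is checked against the callee's set, while a call through a pointer is checked against `sse` alone and would therefore reject a by-value `__m256`.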
This behaviour should be applied to functions with FFI calling conventions as well as the `Rust` calling convention. This will allow SIMD types to be passed by value in the `Rust` calling convention; passing them indirectly, as is done today, can be a source of performance regressions in tightly optimized SIMD code.
This behaviour should not be applied to functions with the `unadjusted` ABI. Such functions are intrinsics and cannot be defined in user code. They typically already assume the applicable target feature.
Functions on `core::simd::Simd`, which are generic over sizes, may need to be specifically adjusted to account for this. This will likely entail compiler-specific support, such as a special ABI or an attribute. For rustc, an attribute like `#[rustc_stdsimd_target_features]` may be useful to control this in combination with `target_feature_11`.
Support for `AMX` is not yet implemented in the Rust compiler. However, the `__tile` type referred to by Intel's Intrinsic Documentation should be considered an "x86 SIMD type", and similarly disallowed without the `amx-tile` feature.
As this does disallow code that was previously valid, two mitigations should be put in place to avoid breaking too much code:
1. Prior to issuing a hard error, `rustc` should issue a future compatibility lint on any code disallowed by this RFC. The lint should be deny-by-default, but it may be reasonable to start at warn-by-default. Though FFI declarations using SIMD types are unstable, this lint should also be applied to declarations that use the `simd_ffi` feature. This lint should remain a lint for at least 2 full stability cycles after first being issued on a stable version of rustc. It is left to other Rust implementations to determine whether to issue a lint or to immediately issue a hard error.
2. A crater run should be used to determine whether or not the hard error would cause substantial ecosystem level breakage. If the breakage is significant, adjusting the set of types this is applied to (notably removing the `core::simd::Simd` family) or leaving the error as a lint can be considered, at least in some cases.
# Drawbacks
[drawbacks]: #drawbacks
Several notable drawbacks exist:
1. As this introduces hard errors on previously allowed stable code, this is nominally a breaking change. Whether or not this breaks a substantial amount of code in practice will need to be determined.
2. Further, because this implicates (potentially private) fields of structs, this introduces a semver hazard on code that passes structs by value.
3. C code can currently be defined using SIMD types without the appropriate features. This would preclude linking to those functions.
   - This is not as significant a drawback, as the current ABI for these functions differs between LLVM and gcc, as well as rustc.
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives
The following alternatives were considered:
- Only apply this change to the `core::arch` types, rather than also to `core::simd::Simd`.
- Make no change (keep the status quo).
- Blanket disallow passing SIMD types by value.
- Only apply this change to FFI ABIs.
Two notable alternatives already exist:
For `extern "C"` (et al.), the `improper_ctypes_definitions` lint is issued, and for `extern "Rust"`, SIMD types are passed indirectly.
The former is poor because the `improper_ctypes_definitions` lint emits warnings on a significant amount of well-defined code, and thus may be suppressed by an irritated user or ignored wholesale. This may cause the legitimate warnings for SIMD types to be suppressed as well, either as an unfortunate side effect of suppressing a false positive elsewhere in the code, or by mistake.
The latter has a notable performance impact on `extern "Rust"` functions, causing even types that rarely (if ever) change ABI to be passed in memory.
This can affect even some inlined code as memory indirections for pointer arguments sometimes persist after inlining.
# Prior art
[prior-art]: #prior-art
There is no currently implemented prior art that I am aware of.
The author of this RFC intends to implement this as a codegen-level error in the lccc project, with the x86 codegen attempting to pass sufficiently sized vectors in ymm registers and erroring if unavailable. Mentioning this in discussions about target_feature ABI considerations prompted this RFC.
# Unresolved questions
[unresolved-questions]: #unresolved-questions
- Should `core::simd::Simd` be affected by this, or should an alternative for it be created?
  - An alternative proposed within project-portable-simd was a `HardSimd` type which *would* be affected by this, while `Simd` would not.
- The ABI of `f32` and `f64` depends on the x87 (IA-32) or sse (x86_64) features. Should these types be disallowed when these features are not present?
- Are other architectures affected by this? How should this RFC cover those architectures?
- Should `#[repr(Rust)]` types be infected by the presence of SIMD types, or should they get an ABI that does not depend on target features (i.e. passed by-value indirectly)?
# Future possibilities
[future-possibilities]: #future-possibilities
While no targets for 16-bit x86 currently exist, this RFC recommends that the types `__m256`, `__m512`, `__m512d`, `__m512i`, and all of the `__m*bh` types not be defined on such a target, as the ymm and zmm registers required to pass 32- and 64-byte vectors are not available in real and Virtual 8086 modes (even on CPUs that have the `avx512f` feature), due to the lack of support for the VEX and EVEX prefixes needed to load ymm and zmm registers.
The `__m128bh` type is disallowed simply because it cannot be used in this case; it can most likely be accepted otherwise.
As new targets are accepted, Rust implementations should consider whether each target has target-specific types whose ABI can change in-target depending on target features. Those target-specific types should similarly be disallowed without those features enabled. If the target designer has the option, it may be useful to consider an ABI for passing vector types that is not affected by target features in this manner, if this can be done efficiently.