RFC PR #3296: Improve C types for cross-language LLVM CFI support
Broader "LLVM Control Flow Integrity" Tracking issue https://github.com/rust-lang/rust/issues/89653
Papaevripides and Athanasopoulos (2020): "Exploiting Mixed Binaries" (PDF)
Mergendahl et al. (2022): "Cross-Language Attacks" (PDF)
RFC #3296 proposes to "identify" uses of C char and integer types at "time types are encoded". (PnkFelix didn't understand what the quoted terms meant on their first read, but they believe they understand it now, and can work with the RFC author to revise the text so that this is more immediately clear.)
Indirect branches in Rust-compiled code are not validated, while such branches would be validated under analogous C/C++ code compiled with forward-edge control-flow protection. Thus, Rust is allowing potentially desirable control-flow protection to be bypassed.
PnkFelix thinks that part of the goal (maybe all of it?) is to ensure that enough metadata is stored to enable, at runtime, a test of whether a given (function) pointer was assigned a given type by the compiler.
So: you need to encode the type itself, and you need to associate that encoded-type with the pointer.
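As a rough illustration of those two requirements, the sketch below models a CFI-style check in plain Rust: each function pointer is paired with an encoded type string, and the call site compares that string against the one it expects before branching. `CheckedFn`, `call_checked`, and `EXPECTED_ID` are hypothetical names made up for these notes, and the ID strings are hand-written in the style of Itanium C++ ABI typeinfo names. This is only a mental model: the actual LLVM CFI mechanism attaches type metadata and emits the checks in the compiler, and a failed check aborts rather than returning `None`.

```rust
/// A function pointer paired with the encoded type the compiler assigned to it.
/// (Hypothetical model; real CFI does not store strings next to pointers.)
#[derive(Clone, Copy)]
pub struct CheckedFn {
    /// Encoded signature, written in the style of an Itanium typeinfo name.
    pub type_id: &'static str,
    pub ptr: fn(i32) -> i32,
}

/// ID the call site expects: a function taking `int` and returning `int`.
pub const EXPECTED_ID: &str = "_ZTSFiiE";

pub fn add_one(x: i32) -> i32 {
    x + 1
}

/// Indirect call guarded by a type-ID comparison.
/// Returns `None` instead of branching when the IDs differ.
pub fn call_checked(f: CheckedFn, arg: i32) -> Option<i32> {
    if f.type_id == EXPECTED_ID {
        Some((f.ptr)(arg))
    } else {
        None
    }
}
```

A pointer tagged with `"_ZTSFiiE"` passes the check, while the same pointer tagged with, say, `"_ZTSFllE"` (a `long (long)` signature) is rejected even though the machine code behind it is identical. That is the situation that arises when the two sides of an FFI boundary encode "the same" integer type differently.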
Dealing with this problem at the cross-language level requires cooperation between distinct tools (since you inherently will be combining object code from a Rust compiler and a C/C++ compiler such as Clang). Achieving such cooperation requires either:
Meta-point: "Motivation" section seems very long. (Does it need to go into this level of depth to motivate this change?)
PnkFelix doesn't understand the point the doc is making when it distinguishes between "provide comprehensive protection for C and C++ -compiled code" vs "provide comprehensive protection across the FFI boundary". Is the idea that one might try to hack something up that doesn't commit to an encoding supported on the C/C++ side, and try to ensure all the necessary checks here are entirely handled at the FFI boundary? (To PnkFelix, that sounds inherently broken, in terms of the amount of dynamic validation that would require during any FFI call.)
Meta-point: It seems like this RFC is perhaps trying to elaborate the whole design space in the motivation section. This is not the way good RFCs are written. Instead: move the elaboration of the space into the "Alternatives" section towards the end, and have the motivation section focus on the single point (or at least design subspace) that is being recommended here.
PnkFelix is guessing/hoping that the section "Rust vs C char and integer types" is going to get into the real meat of the specific proposal at hand here.
Aha!:
For convenience, some C-like type aliases are provided by libcore and libstd (and also by the libc crate) for use when interoperating with foreign code written in C. For instance, one of these type aliases is c_char, which is a type alias to Rust’s i8.
To be able to encode these correctly, the Rust compiler must be able to identify C char and integer type uses at the time types are encoded, and the C type aliases may be used for disambiguation. However, at the time types are encoded, all type aliases are already resolved to their respective ty::Ty type representations[11] (i.e., their respective Rust aliased types), making it currently not possible to identify C char and integer type uses from their resolved types.
So it seems like the heart of the issue is that we have implemented our C FFI support via definitions of type aliases, and the compiler throws away the fact that the original "intended" types correspond to certain (abstract) C types, rather than to implementation-defined (concrete) integer types of specific sizes.
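The alias transparency is easy to observe from ordinary Rust code. The sketch below (helper names are made up for these notes) uses `std::any::TypeId` to show that `c_char` is the very same type as one of Rust's 8-bit integers, and that a function signature spelled with `c_char` is the same type as one spelled with that integer. It accepts either `i8` or `u8`, because the alias target depends on the platform: some targets define `c_char` as `u8` rather than `i8`.

```rust
use std::any::TypeId;
use std::os::raw::c_char;

/// True when `c_char` is the very same type as `i8` or `u8`.
pub fn c_char_is_plain_integer() -> bool {
    let id = TypeId::of::<c_char>();
    id == TypeId::of::<i8>() || id == TypeId::of::<u8>()
}

/// True when a signature written with `c_char` is the very same type as the
/// signature written with the underlying fixed-width integer.
pub fn c_char_signature_is_plain_integer() -> bool {
    let id = TypeId::of::<fn(c_char) -> c_char>();
    id == TypeId::of::<fn(i8) -> i8>() || id == TypeId::of::<fn(u8) -> u8>()
}
```

Both functions return `true`: once the alias is resolved, nothing in the type records that the author wrote `c_char`. A type encoder that only sees the resolved type therefore cannot tell whether the C side meant `char`, `signed char`, or `unsigned char`, which are three distinct types in C.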
the Rust compiler must be changed to
be able to identify C char and integer type uses at the time types are encoded.
not assume that C char and integer types and their respective Rust aliased types can be used interchangeably across the FFI boundary when forward-edge control flow protection is enabled.
Okay, the above seems like the heart of what is actually being proposed here. And that is indeed a severe language change. PnkFelix needs to go back and review why this makes sense (versus stating that the implementation-selected concrete type is entirely suitable when it matches up with the given semi-abstract C type).
It sounds like the discussion of a new encoding was perhaps being floated as another way to deal with the above problem, but PnkFelix thinks the problem is that the encoding they chose ends up providing less protection (and perhaps inherently was forced to do so? It's not clear to PnkFelix why, but they don't know if they care to dig into that rabbit hole…)
Q: can you associate more than one fn sig with a given fn-ptr?
*type
with *void
…why would there be a runtime need to look in multiple buckets…
"every function pointer should have just one ID"
why not allow multiple "IDs" for a single fn defn
linker
Q: a "generalize pointers" flag exists. Can that be generalized to allow other normalizations (e.g. lowering all integers to their bitwidth)?
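To make the bitwidth idea concrete, here is a minimal sketch assuming an LP64 target (e.g. x86-64 Linux). `CInt` and its methods are hypothetical; the "exact" codes are the Itanium C++ ABI builtin-type codes for `int`, `long`, and `long long`, while the `i32`/`i64`-style normalized strings are invented for these notes and are not an encoding that any tool is known to emit.

```rust
/// A few C signed integer types that can appear in a signature.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CInt {
    Int,
    Long,
    LongLong,
}

impl CInt {
    /// Exact encoding: the Itanium C++ ABI builtin-type code for the C type.
    pub fn exact_code(self) -> &'static str {
        match self {
            CInt::Int => "i",
            CInt::Long => "l",
            CInt::LongLong => "x",
        }
    }

    /// Width in bits, assuming an LP64 target.
    pub fn bits_lp64(self) -> u32 {
        match self {
            CInt::Int => 32,
            CInt::Long | CInt::LongLong => 64,
        }
    }

    /// Hypothetical normalized encoding: signedness plus bit width only.
    pub fn normalized_code(self) -> String {
        format!("i{}", self.bits_lp64())
    }
}
```

Under the exact encoding, `long` and `long long` get different codes even though both are 64 bits wide here, so a Rust `i64` would have to pick one of them. Under the normalized encoding both collapse to the same code and the ambiguity disappears, at the cost of a coarser check: signatures that differ only in which 64-bit integer type they name become interchangeable.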
PnkFelix muses that we could allow a local attribute at the Rust definition of an extern "C" fn, to indicate how the given signature should be remapped for purposes of CFI.
potentially related to above grsecurity PaX RAP analysis
CFI checks (all for indirect calls)