2022-10-04 Notes on RFC PR #3296

RFC PR #3296: Improve C types for cross-language LLVM CFI support

References

Broader "LLVM Control Flow Integrity" Tracking issue https://github.com/rust-lang/rust/issues/89653

Papaevripides and Athanasopoulos (2020): "Exploiting Mixed Binaries" (PDF)

Mergendahl et al. (2022): "Cross-Language Attacks" (PDF)

RFC #3296 proposes to "identify" uses of C char and integer types at the "time types are encoded". (PnkFelix didn't understand what the quoted terms meant on their first read, but they believe they understand them now, and can work with the RFC author to revise the text so that this is more immediately clear.)

Background

Indirect branches in Rust-compiled code are not validated, while such branches would be validated in analogous C/C++ code compiled with forward-edge control-flow protection. Thus, Rust-compiled code allows potentially desirable control-flow protection to be bypassed.

PnkFelix thinks that part of the goal (maybe all of it?) is to ensure that enough metadata is stored to enable a runtime test of whether a given (function) pointer was assigned a given type by the compiler.

So: you need to encode the type itself, and you need to associate that encoded-type with the pointer.
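To make that concrete for myself, here is a minimal sketch of "encode the type, associate it with the pointer, test it before the indirect call". Everything in it is hypothetical: `TaggedFn`, `checked_call` and the tag strings are placeholders invented for illustration. As far as PnkFelix understands, real LLVM CFI attaches type metadata to functions and tests it at the call site rather than carrying a tag next to each pointer, so this only models the idea.

```rust
/// Hypothetical model: a function pointer stored next to the encoded type
/// that the compiler assigned to it.
#[derive(Clone, Copy)]
pub struct TaggedFn {
    /// Placeholder encoding of the function's type (not a real mangling).
    pub encoded_type: &'static str,
    pub ptr: fn(i32) -> i32,
}

/// Performs the indirect call only if the pointer's encoded type matches the
/// type the call site expects; otherwise reports a violation.
pub fn checked_call(target: TaggedFn, expected_type: &str, arg: i32) -> Result<i32, String> {
    if target.encoded_type != expected_type {
        return Err(format!(
            "CFI violation: expected {}, found {}",
            expected_type, target.encoded_type
        ));
    }
    Ok((target.ptr)(arg))
}

/// An ordinary function used as the call target in the usage note.
pub fn double(x: i32) -> i32 {
    x * 2
}
```

Calling `checked_call` with a `TaggedFn` whose `encoded_type` equals the expected string runs `double`; any other tag yields an `Err` instead of a call.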

Dealing with this problem at the cross-language level requires cooperation between distinct tools (since you inherently will be combining object code from a Rust compiler and a C/C++ compiler such as Clang). Achieving such cooperation requires either:

  1. One tool (e.g. Rust) reuses an established compatible encoding (e.g. the Itanium C++ ABI mangling used by Clang), or
  2. Both tools select an entirely new encoding that both can adopt.
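For option 1, a rough sketch of what an Itanium-style encoding of a C function type looks like. The `CType` enum and `type_id` helper are made up for these notes, and only a handful of builtin types are covered. The single-letter codes and the `F...E` function-type wrapper come from the Itanium C++ ABI mangling rules; the `_ZTS` prefix reflects PnkFelix's understanding of how Clang names CFI type identifiers, which should be double-checked against the Clang docs.

```rust
/// A few C builtin types, enough to show why `char` matters.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CType {
    Void,
    Char,
    SignedChar,
    UnsignedChar,
    Int,
    Long,
    LongLong,
}

/// Itanium C++ ABI builtin-type codes for the types above.
fn code(t: CType) -> &'static str {
    match t {
        CType::Void => "v",
        CType::Char => "c",
        CType::SignedChar => "a",
        CType::UnsignedChar => "h",
        CType::Int => "i",
        CType::Long => "l",
        CType::LongLong => "x",
    }
}

/// Builds an Itanium-style identifier for a function type:
/// `_ZTS` + `F` + return type + parameter types + `E`.
/// An empty parameter list is encoded as a single `v`.
pub fn type_id(ret: CType, params: &[CType]) -> String {
    let mut id = String::from("_ZTSF");
    id.push_str(code(ret));
    if params.is_empty() {
        id.push('v');
    } else {
        for p in params {
            id.push_str(code(*p));
        }
    }
    id.push('E');
    id
}
```

Under this scheme `void f(char)` and `void f(signed char)` get different identifiers even when `char` is signed on the target, which is exactly the distinction a Rust-side encoder would have to reproduce.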

Meta-point: "Motivation" section seems very long. (Does it need to go into this level of depth to motivate this change?)

PnkFelix doesn't understand the point the doc is making when it distinguishes between "provide comprehensive protection for C and C++ -compiled code" vs "provide comprehensive protection across the FFI boundary". Is the idea that one might try to hack something up that doesn't commit to an encoding supported on the C/C++ side, and try to ensure all the necessary checks here are entirely handled at the FFI boundary? (To PnkFelix, that sounds inherently broken, in terms of the amount of dynamic validation that would require during any FFI call.)

Meta-point: It seems like this RFC is perhaps trying to elaborate the whole design space in the motivation section. This is not the way good RFCs are written. Instead: move the elaboration of the space into the "Alternatives" section towards the end, and have the motivation section focus on the single point (or at least design subspace) that is being recommended here.

The Sub-Problem Of Interest

PnkFelix is guessing/hoping that the section "Rust vs C char and integer types" is going to start into the real meat of the specific proposal at hand here.

Aha!:

For convenience, some C-like type aliases are provided by libcore and libstd (and also by the libc crate) for use when interoperating with foreign code written in C. For instance, one of these type aliases is c_char, which is a type alias to Rust’s i8.

To be able to encode these correctly, the Rust compiler must be able to identify C char and integer type uses at the time types are encoded, and the C type aliases may be used for disambiguation. However, at the time types are encoded, all type aliases are already resolved to their respective ty::Ty type representations[11] (i.e., their respective Rust aliased types), making it currently not possible to identify C char and integer type uses from their resolved types.

So it seems like the heart of the issue is that we have implemented our C FFI support via definitions of type aliases, and the compiler throws away the fact that the original "intended" types correspond to certain (abstract) C types, rather than to implementation-defined (concrete) integer types of specific sizes.
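A quick way to see the alias problem from inside Rust itself. This is only a sketch; the helper name `c_char_is_plain_integer` is invented for these notes. Because `c_char` is an alias, it has the same `TypeId` as whichever concrete integer type (`i8` or `u8`, depending on the target) it resolves to, so nothing downstream of alias resolution can tell the two apart.

```rust
use std::any::TypeId;
use std::os::raw::c_char;

/// Returns true if `c_char` is indistinguishable from a plain Rust integer
/// type after alias resolution. Which integer it is depends on the target.
pub fn c_char_is_plain_integer() -> bool {
    let id = TypeId::of::<c_char>();
    id == TypeId::of::<i8>() || id == TypeId::of::<u8>()
}
```

This returns `true` on every target PnkFelix is aware of, since `c_char` is defined as an alias of either `i8` or `u8`.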

the Rust compiler must be changed to

  • be able to identify C char and integer type uses at the time types are encoded.

  • not assume that C char and integer types and their respective Rust aliased types can be used interchangeably across the FFI boundary when forward-edge control flow protection is enabled.

Okay, the above seems like the heart of what is actually being proposed here. And that is indeed a severe language change. PnkFelix needs to go back and review why this makes sense (versus stating that the implementation-selected concrete type is entirely suitable when it matches up with the given semi-abstract C type).
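One way to picture "not interchangeable" is a distinct wrapper type that carries its own encoding. This is purely a hypothetical illustration for these notes and is not claimed to be the RFC's design: `CChar`, the `CfiCode` trait and the code strings are all made up here (`"c"` is the Itanium code for C `char`; `"u2i8"` is an assumed vendor-style code for Rust's `i8`, chosen only so that the two differ).

```rust
/// Hypothetical distinct type standing in for C `char`. It has the same
/// layout as `i8` but is a different type to the compiler.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CChar(pub i8);

/// Hypothetical hook: the encoding an encoder would emit for a type.
pub trait CfiCode {
    const CODE: &'static str;
}

impl CfiCode for CChar {
    // Itanium builtin code for C `char`.
    const CODE: &'static str = "c";
}

impl CfiCode for i8 {
    // Assumed placeholder code for Rust's `i8`; illustrative only.
    const CODE: &'static str = "u2i8";
}

/// Looks up the encoding for any type that provides one.
pub fn code_of<T: CfiCode>() -> &'static str {
    T::CODE
}
```

With a distinct type, `code_of::<CChar>()` and `code_of::<i8>()` can differ while the in-memory representation stays identical. The cost is the one noted above: code that passes an `i8` where a C `char` is expected would no longer type-check without an explicit conversion.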

It sounds like the discussion of a new encoding was perhaps being floated as another way to deal with the above problem, but PnkFelix thinks the problem is that the encoding they chose ends up providing less protection (and perhaps inherently was forced to do so? It's not clear to PnkFelix why, but they don't know if they care to dig into that rabbit hole).

Follow-up Questions for rcvalle

  • What is the relevance of the specific function signature itself for purposes of forward-edge CFI?
    • How is it sufficient for establishing any kind of security boundary?
    • Dan notes: potentially due to how LLVM is aggregating based on types.
  • Option set: it is not totally clear to PnkFelix whether all of the suggested options are being proposed as independent knobs one can toggle, or if there is actually a preferred subset.
  • Dan notes: options that involve depending on details of Clang are potentially hazardous.
  • PnkFelix sees evidence that giving aliases (or perhaps just C types) first-class treatment in rustc is important.
  • Dan points out that workingjubilee is making reasonable points on the RFC PR thread, in the vein of not letting the perfect get in the way of progress.