Rust All-Hands 2025 C++ Interop

# Rust/C++ Interoperability - RustWeek 2025

## Thursday morning

- Florian: where do people figure out where the layout of a slice is defined?
- Taylor: it's in the Unsafe Code Guidelines, but that's non-normative; there is an RFC in process
- Devin: C++ has been more diligent about *not* defining things like this (`string_view`, bytes)
- Taylor: generating bindings for all non-templated types/functions means interacting with a lot of things which are not defined in a stable way
- JF: does a standardized layout matter? Swift interacts with the compiler to figure it out. Itanium defines some things, but not everything. There are 4 compiler implementations and 3 standard libraries, and they more or less guarantee a specific ABI that they don't want to break.
- Taylor: having a standardized layout isn't what we need; it's about discoverability via the toolchain
- Josh: the equivalent of `into_raw_parts` and `from_raw_parts` would be useful. Return the ptr/len/capacity and take ownership of it (so it isn't freed later).
- Bjorn: layout randomization would make this infeasible
- Gonzalo: specifying the layout in a spec isn't necessary, but `into_raw_parts` and `from_raw_parts` wouldn't cut it. Marshalling exceeds the 1% budget we have for interop. The compiler knows the layout.
- Devin: agree that `into_raw_parts` and `from_raw_parts` aren't enough. The layout is needed for compound data types. There are necessarily differences between C++ and Rust that make unifying similar types (e.g. vec) difficult. (For example, poisoning after the end of the vector.)
- Florian: there are times when people *want* randomized layout for security reasons. There are different use cases and we need to figure out which ones we do and don't want to address here. What should be spec and what should be addendum? E.g., in the spec the borrow checker is not specified.
- JF: breaking ABI is a difficult issue.
  Hashmap performance improvements (changing the hash function) entail an ABI break, but ABI breaks have major consequences that C++ implementations have historically not wanted to take on. A similar issue arises if you want to adopt new security improvements, such as ptrauth for data structures: doing so would be an ABI break. Advises against Rust locking in an ABI and having the same bad experiences. Swift uses resilience domains for finer-grained control, forcing the user to specify where their ABI boundary exists and allowing optimization within the domain. Making the legibility of stability clear to the user is important so people know what they can rely on. C++ has a protocol-oriented approach that could allow interoperability at that level rather than something more constrained. This also allows a user to have their own protocol-conformant types, like string or views. An implementation can then be smart and happen to have the same ABI between Rust and C++ to provide maximal performance, but it'll "just work" if the ABI differs.
- Michał: what about allocators?
- Taylor: C++ strings aren't necessarily heap-allocated
- Josh: Agree that `into_raw_parts` and `from_raw_parts` aren't sufficient. The idea is not to specify an exact layout, but to operate in terms of functions. On the Rust side, we've talked about having a cross-language ABI (crABI) with a set of core types for interop, and a stable ABI just for Rust-to-Rust interoperability. But in both cases, we're talking about using trait-object-like interfaces for things like hashmaps (e.g. `dyn HashMap`), rather than fixed layouts, hash functions, etc. For strings, the idea is that _one of the functions_ could be about deconstructing and giving up ownership, or the reverse, constructing from parts and taking ownership. But in general, you access objects by calling their methods, which allows interoperability across versions of the standard library.
  Regarding allocator interop, it's not unreasonable to require using the same allocator for interoperability. Similar to what JF was saying.
- Victor: The Swift model for ABI resilience would work very well here. (https://www.youtube.com/live/g6vDO62TNmE?si=iw4tL2FktDpM08cD&t=27105)
- Victor: How do we deal with multiple versions? At MS, they need to support a minimum of 3 versions of rustc, not to mention MSVC and clang.
- Bjorn: even if you use the system allocator, it's not compatible with the Windows malloc/free due to lack of alignment
- Taylor: How are MS people linking?
- Victor: Mostly dynamically, for a great amount of interop.
- Taylor: The use case for MS is a bit more special and different from what Google is doing. It's an asset for the Rust ABI to be able to evolve more freely for optimization. Strings are as hard as anything: C++ `string` doesn't have a standardized size, alignment, or layout, may or may not be heap allocated, has a non-trivial move ctor, etc. Rather than unifying the types, what's the right way to **model** the C++ type in Rust? How should Rust types be made available to C++ without violating safety semantics?
- Victor: wg21.link/P2786 (Trivial Relocatability For C++26)
- Devin: some C++ string impls can't be trivially relocatable due to small string optimizations. In particular, libstdc++'s. They already did an ABI break.
- Victor (?): it took ten years.
- Taylor: We use some `emplace!` macros to leverage Pin ctors to deal with the lack of relocatability
- Josh/Taylor: (Discussion about whether we could operate differently on string implementations which *are* trivially relocatable vs implementations that would require copying/deconstructing in order to relocate them.)
- Michał: there's a specific allocator on Windows for aligned allocations
- JF: allocation isn't simple, but it is solvable. What's tricky is that C++ allocation is very pluggable. The compiler always has the answer.
  For example, see this C++26 paper, which illustrates how allocation is complicated but hookable: [P2719](https://wg21.link/P2719). Similarly, whether a C++ string is trivially relocatable could be one of the properties queryable from the protocol. So Rust can ask the compiler "at this point, how do I allocate/deallocate this specific type" and the C++ compiler will have a single answer (having multiple in different translation units is UB).
- Bjorn: what about the `c_str` function for converting to a C-string?
- JF: the constructor can deal with that in C++.
- Josh: (in response to mention that the spec doesn't say what order the string is stored in memory) Do we really need to interoperate with implementations that e.g. store strings backwards, or could we handle those by copying, and interoperate efficiently with reasonable implementations?
- Amanieu: there are two things:
  1. The semantic problems
  2. The ABI issues

  The latter is only an issue if we need to support separate compilation.
- Taylor: there are 3 things here: a big subset of Rust features that are useful (Rust calling C++) where Rust is consuming non-Rusty things. Those are relevant regardless of the previous 2 cases.
- Josh: There are multiple axes (stable ABI/stable toolchain), but more broadly this group should determine whether we want different work streams.
  - Maximum efficiency/ergonomics for people who have greater control over their toolchains and whose toolchains are actively working on interoperability (e.g. rustc and clang and libc++)
  - Dealing with the gnarlier realities of arbitrary interop between arbitrary toolchains
  - The latter may still be needed for arbitrary C++ libraries (e.g. a user-defined class that's not trivially relocatable), but users in the first case shouldn't have to pay the ergonomic cost for `std::string` just because the second case needs it. Don't make the happy path worse
- Gonzalo: we are going deep in the weeds.
  The separation into 3 problems was very good; can we figure out the problem space and capture a small litmus test first? Like string or slice. Use it to say "hey, if these two work, we'll likely be happy" and then broaden the scope.
- Devin: not sure I agree with the breakdown. For the Google server ecosystem, comprehensiveness is the most important concern.
- Victor: we may be splitting too early or too harshly. Want to keep things abstracted in a way that doesn't lock people out.
- Taylor: Not sure about the distinct categories either. I'm in the latter case, but not sure it's the right way of sorting.
- Josh: This isn't meant to be a complete dichotomy. We're going to have to do the work anyway to provide a solution for e.g. non-relocatable types, and various other hard problems. The separation is more about the knowledge/assumptions and not paying for what we don't need. People shouldn't have to worry about pinning for `std::string` if they don't care about supporting implementations where that isn't required.
- Michał: how are we determining which case we're in (relocatable vs not)? Ergonomics causing loss of portability?
- Josh: it should be explicit
- Kora: If solutions depend on compilers emitting metadata, does that stand in the way of wider support?
- Gonzalo: the impact on the ecosystem is significant depending on the priorities for the use case
- Devin: the happy path isn't always possible for portable code
- Florian: the Rust side feels absent from the conversation. The Rust allocator API is still unstable. What kind of things on the Rust roadmap are blocking? Should Rust gain move constructors, for example?
- Room: no (though there are some advantages and Crubit folks do want it)
- Florian: just LLVM is fine for me; what are the opportunities for that? Can we view the set of toolchains as a queryable space (like Predrag's approaches)?
- Jeff: the successful tools in this space are doing the 80/20 thing. What can this group do to find the 80?
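
The `into_raw_parts` / `from_raw_parts` idea raised earlier in this session (hand out pointer, length, and capacity, and give up ownership so nothing is freed) can be sketched in stable Rust roughly as follows. This is only an illustration: `RawBytes`, `into_raw`, and `from_raw` are hypothetical names, and the sketch assumes both sides of the boundary use the same allocator, as was suggested in the discussion.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical C-compatible record holding the raw parts of a byte vector.
#[repr(C)]
pub struct RawBytes {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

/// Give up ownership of the buffer without freeing it.
pub fn into_raw(v: Vec<u8>) -> RawBytes {
    // ManuallyDrop prevents the Vec destructor from running,
    // so the allocation stays alive after this function returns.
    let mut v = ManuallyDrop::new(v);
    RawBytes {
        ptr: v.as_mut_ptr(),
        len: v.len(),
        cap: v.capacity(),
    }
}

/// Take ownership back from the raw parts.
///
/// # Safety
/// `raw` must have been produced by `into_raw`, must not be reused after this
/// call, and the buffer must still belong to the same allocator.
pub unsafe fn from_raw(raw: RawBytes) -> Vec<u8> {
    unsafe { Vec::from_raw_parts(raw.ptr, raw.len, raw.cap) }
}
```

A round trip such as `unsafe { from_raw(into_raw(vec![1, 2, 3])) }` yields the original contents without copying. As several people noted, this only covers a single flat buffer and does not address layout for compound types.
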
- Devin: we have a list of desired features on the Rust side
  - Notes from Devin: "e.g. specialization" (everyone disliked that). [link](https://github.com/google/crubit/blob/main/docs/overview/unstable_features.md)
- Josh: Gonzalo brought up portability; we all care about this on a spectrum. Some people care about portability in that they run on Linux and macOS and Windows. Some people care about portability in that they run on many obscure architectures and questionably-compliant toolchains. Some people care about portability in that they run on many different versions of the Linux kernel. This isn't about drawing a line and leaving people out in the cold. Rust uses the target tier policy to formalize how much of our common resources we will commit to different things of different popularity, and which cases are primarily supported by the subset of people who care about them.
  - Similarly, we should consider different cases of portability in Rust/C++ interoperability:
    - Code that needs to run on LLVM, GCC, and MSVC.
    - Code that only needs to run on LLVM.
    - Code that runs on arbitrary toolchains including random proprietary vendor toolchains.
  - We're going to end up with different levels of support no matter what we do; it's only a question of whether we talk about that explicitly or leave it implicit.
  - We should not make people in the "LLVM, GCC, and MSVC" case pay the additional ergonomic costs for *arbitrary* portability, for instance.
  - But we still *do* need the solutions for as-ergonomic-as-possible handling of e.g. non-trivially-relocatable types. It's a question of whether people have to use it for more core things.
- David: as a group we need to align about things like user-defined moves
- JF: can we create a repo of code that doesn't work today? And use that to align our efforts? This is an engineer's solution: an executable repo where we are at 0% and agree what to prioritize and track how to get to 100% support.
  areweinteropyet.org, automatically generated from the repo.
- Michał: is there a set of types that *don't* have problems? Can we find the easy wins? Does that enable people to solve their problems?
- Devin: (something about the "hard" problems actually being not all that hard, more about community alignment and agreement than technical difficulty. Even string_view is not free of problems (still needs layout))

## Thursday afternoon

- Jeff: when looking at prioritization, we asked how do we reuse C++ <-> C++? The unit of reuse is functions. Looking at the most commonly used functions in the codebase implies the set of types to bridge. Can we take the same approach for this problem generally?
- Devin: did a similar thing and found that the most common type is string, which is one of the more difficult. More common than unique_ptr, which is (relatively) easy. \[ERRATA: actual list is: string_view, the Abseil Status/StatusOr types, _then_ string\]
- JF: Allowing people to do their own ranking can be helpful. But type composition is also important.
- Devin: we can share the work we did internally, but it's based on using uncommon tools ([Kythe](https://kythe.io/), and internal tools on top of it.)
- Dmytro: when talking about Google's tooling for C++: upstream clang doesn't have the same large-scale, parallel analysis support. For topics: different groups were speaking under different assumptions about toolchain, release process, etc.
- Mike: want to know about the goals of the people in the room; we should align on

### Possible topics

- Prioritizing key types (strings, unique pointers)
- What toolchain constraints are different users operating under?
  - Google (server): single large monorepo. Has a single LLVM/Clang/Rustc at the same LLVM version from HEAD, released ~weekly. Crubit, a C++/Rust interop tool, is released together with the toolchain. We can place very tight constraints on the C++/Rust toolchain.
    Some components are released to OSS (e.g., TensorFlow); in those cases we need an OSS solution for LLVM/Clang/Rustc/Crubit all synced to the same LLVM version. libc++ is also built from HEAD, using the unstable ABI. We can take in any ABI break in libc++. We would prefer to use an interop solution that is shared with the community. Also, Carbon.
    - Primary server-side platforms are Linux x86-64 and arm64, but there is a long tail of platforms that portable code builds for. Linux (arm64, x86-64, RISC-V, PowerPC32, Xtensa), iOS, macOS, Windows, Fuchsia, WASM, various bare-metal and embedded targets.
    - C++20 almost everywhere. Some toolchains from the long tail are on C++17, but on a trajectory to move to C++20.
    - \[edited to add\]: we disable exceptions, but enable panic=unwind.
  - Facebook: staged versions of LLVM (currently 17 & 19; different binaries use a different version). We build a separate Rustc linked against each LLVM. We use both libc++ and libstdc++. For FFI we use the cxx crate, and strongly prefer not to use bindgen (it requires perfect understanding of a header and becomes too brittle on new C++ syntax). 500 libraries using cxx.
    - Similar to MS: debugger experience is a pain point, such as running a Display impl on a Rust value from lldb
  - Adobe: must support many surfaces: WASM, backend Linux, phones, tablets. All sorts of toolchains and stdlib versions. Sometimes gcc, sometimes clang. Built with both static and dynamic linking. Different build systems. Very large C++ codebases that won't be rewritten. The goal is migrating new development towards Rust. Generally move C++ versions as a company; currently moving to C++20.
  - NVIDIA mainly provides libraries/toolchains for developers on our platform, not applications; so we have to support whatever target platforms our users care about, which is pretty much everything, from server to desktop to embedded.
    We support all major C++ compilers (GCC, Clang/LLVM, MSVC, ICPX), C++11/14/17/20, x86/ARM, Linux/Windows/QNX.
    - Less about how NVIDIA uses Rust and more about how NVIDIA's users use Rust. Mostly server-side Linux; predominantly gcc. Other users are heavily tied to clang. Users care about the interop intersection of C++, Rust **and** Python. C++ pain comes from the build/package ecosystem.
    - Internally we build from source a lot; some we distribute as source artifacts to users. But for our users, it is a mixed bag. Some build from source, some distribute binaries. Traditional HPC does not care about ABIs. But people who build, say, a GPU application that is going to be deployed to a Windows computer for 10 years care about long-term stability guarantees. We tend to not have a very long support window. We drop support for older versions of our core libraries after 2-3 years, and we make breaking changes when we do a major release. But we have some users who care deeply about stability.
  - Microsoft has a wide variety of systems and no monorepo. More interested in back-end and binary (ABI) compatibility. Primary interest is making sure multiple toolchains are supported. Working to unblock teams going to production. Currently teams are going to production based on exceptions.
    - Most commonly rewrite a whole DLL in Rust. Sometimes we rewrite only some part. Try to identify natural boundaries; sometimes that means COM.
    - A potential easy win is in quality of debug information
    - https://learn.microsoft.com/en-us/cpp/porting/binary-compat-2015-2017?view=msvc-170
    - Want to share the "Rust runtime" between users; afraid of linking it statically and breaking ODR.
  - Bloomberg: Linux server-side, bindgen interfaces. No monorepo, people are shipping static binaries. Target the Red Hat version that is used in the company. Weird platforms occasionally, some Windows, some mobile platforms. Typically one compiler version at a time. All gcc, libstdc++.
    Experimenting with trying to build with clang. C++20 everywhere. C++ -> Rust interop isn't very widespread right now. Hard to drop Rust into CMake projects. We don't do much C++/Rust interop right now, but are interested in growing this. Currently lack of interop is seen as a blocker that prevents Rust adoption. Currently we recommend writing new RPC services, but that works only for a minority of users.
  - Toyota: we have a lot of tools. In a vehicle we have a lot of hardware. Application cores, realtime cores, controllers. We have a lot of ARM cores, microcontrollers (G4MH), RTOSes, QNX, Linux, bare-metal. Lots of C++, some generated by Simulink (MATLAB). Safety-related stuff tends to be constrained, tends to come from the vendor and be built with their toolchain. Lots of code that needs to interop across binary boundaries because it comes from different vendors; not everything is compiled at once. C++ versions follow MISRA. Currently C++11/17, moving to 20. We interop Rust with Dart (Flutter). QCC (based on GCC), Clang, compilers based on Clang. libc++, libstdc++. There's not much Rust in the vehicle; we can change the status quo. Safety-critical code tends to be rewritten often. What does it mean to do vehicle safety in Rust? There isn't that much code on a microcontroller, so you would tend to rewrite the whole thing at once in Rust. In other cases you would use RPC, so no interop. Only in big binaries in the Linux environment would you use interop.
  - ARM: not interested in any particular piece of software, but more focused on making sure good code can be generated for ARM cores. Similar to NVIDIA, but add microcontrollers into the mix. The role in interop is writing the ABI specifications for the ARM platform.
  - \<add a description of your toolchain constraints here\>
- C++ exceptions? Do we care about them and how?
  - Google compiles w/o C++ exceptions, but *with* Rust unwinding
  - Adobe uses exceptions for error handling
  - MS does structured exceptions in the codegen, which leads to better fidelity between Rust/C++. No major blockers. C++ exceptions -> (catchable) Rust panics.
  - Making C++ exceptions into Rust `Result::Err` is sensible, but making Rust panics into C++ exceptions seems to generate too much complexity
  - Bryce (NVIDIA) agrees fully with David Sankel: we shouldn't lower all exceptions to aborts. NVIDIA uses exceptions in CPU code (stuff that does orchestration, etc.), but not in GPU code (where they are unsupported). We avoid exceptions only in places where they inhibit CPU compiler vectorization or other optimizations.

### Unfinished conversations

1. Connor: Itanium C++ ABI (confusingly not limited to Itanium) does provide for C++ catching a foreign exception
2. Victor: Can we also talk about x-lang LTO?
3. Connor: Build systems (brought up by Taylor but generally applicable) are an interesting situation
4. Michał - re x-lang LTO: issues with a few existing backends (https://rust-lang.zulipchat.com/#narrow/channel/136281-t-opsem/topic/How.20does.20the.20GCC.20backend.20handle.20code.20C.20would.20disallow.3F/near/394411414 - LTO causes UB-free Rust and C to misbehave when enabled). LTO requires us to fully follow the semantics of Rust and C++; can this be done correctly? The interop boundary may cause some kinds of UB to be currently "missed" (not cause bugs). Will x-lang LTO make this UB cause issues? (Paper about cross-lang UB: https://arxiv.org/pdf/2404.11671 - will LTO cause this kind of stuff to be a problem?)
5. David - on exceptions
6. Victor: Opinions on the Lakos rule for narrow contracts/noexcept? https://wg21.link/P2861
7. Devin: Question on exceptions: does any C++ team use exceptions/expected the way Rust uses panic/Result?
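
To make the exceptions discussion above concrete, here is a minimal sketch of one way a Rust library can keep panics from crossing a C ABI boundary: catch the unwind at the boundary and report a status code, which the C++ side could then turn into whatever error mechanism it prefers. The function names and status codes are hypothetical, the sketch assumes the default `panic=unwind` setting, and it is not how any tool mentioned in these notes is known to work.

```rust
use std::panic;

// Hypothetical status codes shared with the C++ caller.
pub const STATUS_OK: i32 = 0;
pub const STATUS_ERR: i32 = 1;
pub const STATUS_PANIC: i32 = 2;

/// Hypothetical fallible operation: ordinary failures use `Result`,
/// while an unexpected condition panics.
fn parse_positive(input: i32) -> Result<u32, String> {
    if input == i32::MIN {
        panic!("unsupported input");
    }
    u32::try_from(input).map_err(|e| e.to_string())
}

/// C ABI entry point. A real export would also need an unmangled symbol name;
/// that attribute is omitted here to keep the sketch edition-independent.
pub extern "C" fn interop_parse_positive(input: i32, out: *mut u32) -> i32 {
    // catch_unwind stops the panic here instead of letting it unwind
    // into the foreign caller's frames.
    match panic::catch_unwind(|| parse_positive(input)) {
        Ok(Ok(value)) => {
            if !out.is_null() {
                // The caller promises `out` points to writable storage.
                unsafe { *out = value };
            }
            STATUS_OK
        }
        Ok(Err(_)) => STATUS_ERR,
        Err(_) => STATUS_PANIC,
    }
}
```

The three-way split mirrors the distinction drawn in the room: `Result` errors and panics are different channels, and only the former maps naturally onto a value the other language can inspect.
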
## Friday

### (10:15) Language Evolution Process in C++ and Rust

- [Slides links](https://github.com/jfbastien/papers/blob/master/source/Standardizing%20C%2B%2B.pdf)
- P papers in WG21 are the closest analog to Rust RFCs
- The main benefits of P papers are that they have revision numbers, as JF mentioned, but also that they're not official ISO documents, so we don't have to manage them in the ISO document management system. ISO wanted to prevent us from publishing our own proposals on our own site and have us use their system instead. We don't have to do that because we stopped using N papers for proposals.
- Q: C++ process feels more waterfall and Rust more agile, is this so?
  - A: C++ is train-based (3 year sprint), but does require iterative feedback and implementation experience
  - It's possible to get things behind a compiler flag and available on Compiler Explorer
  - Overall, more similar to Rust than it initially appears
  - A source of tension is compilers accepting experimental flags, since they don't want to appear to promise to accept something which may not make the standard
- Q: Can a compiler claim to be C++X without implementing all the features?
  - A: Yes, in practice, no implementation is perfectly conforming
- Q: Do things that aren't implemented get taken out of the standard?
  - A: Yes, this sometimes happens
  - C++ has less insight into what might break things (no crater equivalent) so tends to be more careful
- Q: What can the two languages learn from each other with regards to process evolution?
  - A: C++ processes are pretty onerous, but this is more about informing the Rust community about what it's going to take to get convergent changes through.
- Q: Does meeting more in-person help accelerate the process?
  - A: Different processes work for different communities; it's particularly hard to change though

### (12:00) C++/Rust Interop Through WebAssembly Component Model

Q: Is the Rust code operating in native instructions or interpreting WebAssembly?

A: This is compiled to native, non-sandboxed

Q: What's the WIT overhead compared to direct C++ to self?

A: Overhead is manageable with elided copies. C++ vector requires copies, however.

Q: Is this an RPC with a binary serialization protocol?

A: No, at the WIT level, data types can be described directly and mapped to ABIs. ABI variants are translated on the fly. For Rust types, the stdlib types work well, but C++ requires custom types to deal with allocation differences. There are no references, only pass by value/ownership. Currently there is investigation into using shared memory to avoid copying overhead.

Q: Is this using multi-memory?

A: There's not a good way of using this in Rust. It could be investigated for C++. For native execution it uses shared memory. Example [in clang](https://godbolt.org/z/eE9cqPM96).

Q: The data structures defined in WIT are limited to what can be defined in that language?

A: Everything is mapped back to the sandbox provider. Resources are totally opaque handles, but methods can be called.

Q: How are the shared libraries run? Is there an orchestrator?

A: These applications map to a WASI standard run function. If you can live with an interface description which is sandboxed, this is a solution that works very well for integrating. Unlike SWIG, this is language-neutral.

Q: What do you mean by sandboxed? What problems can be solved here?

A: WASM sandboxed with linear memory. Any interaction with the outside world must be mediated.

[Link to Christof's WASMCon presentation](https://wasmcon24.sched.com/event/1qvIG/component-model-in-software-defined-vehicles-christof-petig-aptiv) and [GitHub repo](https://github.com/cpetig/wit-bindgen).
Example code using streams in C++ and Rust compiled to native at https://github.com/cpetig/wit-bindgen/tree/work-in-progress/crates/cpp/tests/symmetric_stream

### (13:00) Lunch

### (14:00) Interop-inspired ISO C++ improvements

TODO: Add slides

#### Zngur

Driving principle: Rust semantics tend to be a subset of C++ semantics, so use C++ to describe Rust types.

Exceptions: destructive moves mean C++ types can't be on the Rust stack.

Zngur provides `Ref` and `RefMut` types to account for the significantly different reference semantics between Rust and C++.

Q: Why is the drop flag checked at runtime, but the aliasing guarantees are the programmer's responsibility?

A: It's not possible to check the aliasing at runtime, and the drop flag is needed for proper destruction semantics.

Q: Can you inject Rust code into the generated file to implement additional Rust traits (e.g., for operator impls)?

A: Yes, but traits can also be implemented on the C++ side to provide the Rust trait impls.

Owned C++ objects must be heap allocated (but look like normal object syntax on the Rust side). Fixing this seems to require language changes.

What are the tradeoffs for implementing custom (i.e., C++-like) move operations in Rust?

- Could break existing uses of mem::replace
- Macro Pin-based solutions exist, but suffer in terms of ergonomics
- `Pin` ergonomics and `super let` could address this

`[[no_unique_address_really_please]]` // for some compilers :)

Q: How does Zngur-generated code end up in terms of name mangling and symbols?

### (15:35) Informal intro to Crubit

- Most bindings automatically generated
  - Exception: Rust side of C++ vec was hand-crafted; ended up needing to make 2 versions since the (unstable) ABI changed
- Clang annotations used to specify ABI translation
- Bridging code exists on both sides
- Goal: using Rust for a new source file at any point should be as easy as C++.
  - Prioritize comprehensiveness
  - Porting should be achievable

Q: what would be required to use Crubit with other toolchains?

- Replacing the Clang AST -> Crubit IR code; possible but non-trivial
- IFC spec could potentially be leveraged instead of Crubit IR, at the Clang AST stage of processing:
  - https://github.com/microsoft/ifc-spec
  - https://github.com/microsoft/ifc
- https://discourse.llvm.org/t/rfc-lifetime-annotations-for-c/61377
- https://discourse.llvm.org/t/rfc-intra-procedural-lifetime-analysis-in-clang/86291

Devin: Dropping some links to things in the Crubit space that we talked about yesterday:

- Crubit itself: https://github.com/google/crubit/
- ctor.rs (preliminary, already-taken name) -- the emplace!{} logic and so on for pinning on the stack: https://github.com/google/crubit/blob/main/support/ctor.rs
- forward_declare.rs: https://github.com/google/crubit/blob/main/support/forward_declare.rs
- templates: https://github.com/google/crubit/blob/a85afb1491be708d894200ed7416775f6c5e23a2/rs_bindings_from_cc/test/golden/templates_rs_api.rs#L297 -- and that whole directory has many examples of various things.
- ffi_11: https://crates.io/crates/ffi_11

### (16:40) Integer Types

- It would be really useful if there were a 1:1 mapping between Rust and C++ integer types such that there would be lossless roundtripping
  - https://doc.rust-lang.org/stable/std/ffi/type.c_char.html
  - https://docs.rs/ffi_11/latest/ffi_11/struct.c_char.html
- There's also std::byte, which is closer to `MaybeUninit<u8>`
  - Is this relevant to interop?
- On long being isize/usize: Windows 64-bit is fun.
  - Windows kept long 32-bit on 64-bit platforms.
- gcc supports targets where size_t is `unsigned __int20` and int is `i32` WHICH IS FUN
  - adding two unsigned size_t values promotes to signed int 😭
- The set of conversions possible with `as` casts doesn't match `Into`
  - We still need `.cast_sign()`, `.extend()`, and `.truncate()`, preferably all const-compatible.
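
The conversion points in the list above can be illustrated with a small sketch. The helper functions below are hypothetical free functions that only mimic the *intent* of the `.cast_sign()`, `.extend()`, and `.truncate()` operations mentioned in the notes; they are not standard library APIs. The `c_long` and `c_char` aliases are the real `std::ffi` types, whose width and signedness depend on the target.

```rust
use std::ffi::{c_char, c_long};

/// Hypothetical sign cast: same bits, opposite signedness.
pub const fn cast_sign_u8(x: u8) -> i8 {
    x as i8
}

/// Hypothetical lossless widening.
pub const fn extend_u32(x: u32) -> u64 {
    x as u64
}

/// Hypothetical explicit truncation: keeps only the low 16 bits.
pub const fn truncate_u32(x: u32) -> u16 {
    x as u16
}

/// `c_long` is 32 bits on 64-bit Windows and 64 bits on most other 64-bit
/// targets, so widening to `i64` is lossless on either.
pub fn long_to_i64(x: c_long) -> i64 {
    i64::from(x)
}

/// `c_char` is `i8` on some targets and `u8` on others; converting to `u8`
/// keeps the byte value the C side sees.
pub fn c_char_to_byte(c: c_char) -> u8 {
    c as u8
}
```

Each helper names exactly one kind of change (sign, widening, or truncation), which is the property a bare `as` cast does not make visible at the call site.
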
- Would it be valuable to deprecate `std::ffi` in an edition?
- I think requiring uint8_t to be unsigned char would be a no-op for all platforms where char is 8 bits. Requiring int8_t to be signed char would require Solaris to be fixed, but they already need to fix that to conform to C99 anyway

## Saturday

### (09:30) C++ <=> Rust Debugging Story

#### Top-line takeaways

* **Use lldb**!
* What's emitted from the compiler seems OK, could be improved

#### Coming from C++ and C# the quality of tooling for debug in Rust is not great

* mismatched expectations
  * highlighting discrepancy
* want first-class support for common types, e.g. break on an `Option<T>` is not there today
* Rust pretending to be C++ is failing from a debug perspective
* feeling is that in Rust a debugger is less required, but if you bring C++ into the mix it might be

#### Interop-specific concerns

* mismatch between how e.g. a C++ type is represented when using Rust debug info. could be an improvement!
* real-world use-cases at vendors
  * code-patching of mixed binaries: can we patch C++ and/or Rust into the binary? what are the boundaries and specific issues with doing this? still exploring
  * compliance tooling on mixed binaries, e.g. debugging information; must be updated to handle the mixed case

#### Status quo on Linux and Windows is quite different

* Windows - not great
  * low quality .pdb output from Rust
  * Microsoft is upstreaming some of this work
* Linux - usable

#### Debugging support for asynchronous code is missing

* probe executor is not great atm
* Implementation details vary based on executor
* MIR interpreter in the debugger

#### Hierarchy of needs

1. hooking into existing logging solutions
2. post-mortem debugging, ensuring stack traces show up
   - Google: abseil installs a segfault handler that uses non-DWARF information (the dynamic symbols from the binary etc.), because binaries are shipped to the server without debug info
   - Google has to link abseil into mixed-language binaries to make sure this shows up; seems to work great after that
     - but the Rust standard library could still be enhanced to fall back to using the symbol table if DWARF is not available
3. Interactive debugging
   - doing this at scale on large codebases can be challenging
   - e.g. if you've got something like a microservice architecture, then this is not easy: RPCs have deadlines, and they time out while the developer is single-stepping the code
   - WebAssembly can make this an easier sell: pull them all into a single binary and then you can "see" all the RPCs happen in the same process space

### Ad-hoc Discussion of Next Steps

David: I'd like to have a session where we could determine what alignment there is in the room on the following topics:

- Building a shared vision
  - What is the north star?
    - A single solution everyone uses or solutions that are tailored with their own tradeoff sets?
    - Do we want complete autogeneration, partial IDL, or complete IDL?
    - Completely defined Rust to C++ mapping?
    - Completely defined C++ to Rust mapping?
    - How much desire is there for stable ABIs being part of this effort?
- What concrete things do we want to drive now?
  - Improve performance of using Rust objects in C++
    - Destructive move in C++
    - Explicit data member layout in C++
  - Addressing ergonomics of using C++ objects within Rust
    - Custom move support in Rust
    - Better Pin ergonomics
    - Relocatable in standard library
- https://docs.google.com/presentation/d/1LBl71Yc1EfloFP_TehJMJFyequ5JKP5SZ1JANOvWaYY/edit?slide=id.p#slide=id.p

Dmytro: Reading the room on Thursday, and summarizing other 1:1 conversations we had with other companies, I don't think we can all use a single interop tool any time soon.
That's because our needs are too different: which code can we recompile, where do we depend on stable ABIs, where do we explicitly want to use an unstable or some specific ABI (e.g., building libc++ from source with unstable ABI), can our build system support a custom compiler-based interop code generation tool, do we want to check in bindings into the repo etc. If in future we find a solution that satisfies everyone's needs (maybe with a configurable tool that has many modes), of course we should converge on the minimal number of tools. But it just seems unlikely within the planning horizon that most companies have wrt. adopting Rust.

Dmytro: When we talk to some of our OSS customers about Crubit, we often hear that they would rather choose bindgen or cxx because these tools are better integrated into the OSS workflow and have a well-established OSS community of users, even though Crubit would satisfy their interop needs better. So it seems to me that projects very carefully consider the tradeoff of usage and build maintenance complexity versus the benefits they get from the tool. And since many projects don't like to work on their build systems, they often choose simplicity of the build over ease of interop. So until complex interop tools that integrate into the compiler are well-integrated into cargo, rustup etc., users will prefer to use simple tools that "just work" in terms of the build process (the "worse is better" philosophy is also playing a role here).

Someone started a straw poll about agreement in principle on converging on one interop tool. There was no agreement in the room.

### (10:45) Bridging compilers for interop

What can we not do without compiler integration?

* Instantiating templates
* Overload set resolution
* Instantiating Rust generics

These all need calls back and forth between the two language frontends.

tmandry and cramertj have been prototyping a compiler plugin-like API in the Rust compiler.
The prototype simulates a plugin that calls into clang and uses the semantic information to create new items in the Rust HIR.

Ilya: There is another drawback of doing this compared to an IDL system, because processing C++ is more expensive and complex than processing an IDL. The protocol shared between Clang and MSVC can be quite complicated.

Christof: If this had existed three years ago, I wouldn't have been looking into wasm so early. I was looking for a solution like this but didn't see any. The API I'm working with has e.g. a custom result type that's similar to Rust's `Result`. I want to instantiate C++ templates from Rust, and use inheritance (overriding virtual functions).

Taylor: Probably

Fractal: Maybe worth looking into Stable MIR; this looks like a protocol we could use. They have descriptions of type layouts and so on. Might be something that can work with multiple Rust versions.

Taylor: One thing that it doesn't support is trait solving / resolution; Crubit already uses this.

Fractal: Maybe we can extend it and influence the shape of the API such that it is usable for interop. If we agree it seems reasonable I think they would accept it.

Kristof: Is this the road towards a new programming language which encapsulates both Rust and C++, with corresponding complexity?

Taylor: We talked about this with the lang team a few years ago. The important part for me is that we don't disturb the core language too much and don't disrupt the benefits of using a modern memory-safe language. I do have fears about creating a system where people need to understand the features of both languages and how the features of one map onto the other. We have talked about some advanced features here, and unfortunately a lot of advanced features do come up a lot in C++. Making this ergonomic and having it map in a natural way is important for balancing this out. It should stay contained to the spaces that need C++.

Jubilee: (sorry missed it) Why run a Rust and C++ compiler in the same process?

Taylor: We need type information e.g. for doing overload resolution and feeding that back into the Rust compiler for doing type checking. ...

Devin: Doesn't strictly need to be one process, just that we need to keep going back and forth until you're done.

David: Can see the value of going back and forth. Zngur has you specify templates ahead of time. ...

Fractal: How many times do you need to call back and forth?

Taylor: Once per function boundary.

Taylor: Key insight: If you are instantiating a template with some type, the C++ bindings for that type already need to be instantiated within the C++ compiler.

Devin: The C++ compiler needs something it can query.

Victor: I wonder if we can make them spit out metadata from both sides and do the resolution in some third party.

Taylor: It's a good idea, but getting information as simple as a function signature, in general, requires compiling arbitrary C++.

Victor: The general problem is unsolvable without a compiler.

Taylor: Maybe we could carve off smaller pieces, e.g. as we're doing with `std::vector`.

Victor: That's what I was thinking.

Devin: Maybe if instead there is a protocol that can handle back and forth, it would help extend that direction to the general case. It's also easy to do a

Vadim: We can have two different rustc instances. (... sorry I missed some of this)

Richard: We've been focusing mainly on starting with a Rust compiler and calling into C++; is the other way in scope?

Taylor: I think it's in scope and it's way easier.

Taylor: Another thing is ... It would require C++ extensions.

David: This is based on the supposition that you can map a C++ function to a Rust function. What about exceptions?

Taylor: You can detect exceptions, maybe by wrapping a C++ function. I would prefer not to add `Result` returns to every function. You can allow unwinding through Rust and have a bridging thing that allows catching C++ exceptions (not `catch_unwind`, but `catch_cpp_exception`).

gnzlbg: ...

Jonathan: I appreciate this problem. I don't think it's completely necessary. You can work around the problem by defining the type you want to instantiate with on the C++ side and bridging methods. Worse ergonomics.

Taylor Cramer: Yes. It's a level of annoyance that might be too much to accept; it might prevent people from doing things that feel native. E.g. if you have a function that accepts two lambdas as arguments... you could do it, but it wouldn't feel nice.

Jon: This seems like it might be a point of divergence. The desire to remove all the hurdles might not fit within the work we define as a shared goal.

Jubilee: Regarding exceptions, the way we do panics works the same way as C++ exceptions.

Devin: The worry I have is that if we have one solution that isn't enough for the people dumping lots of resources, we end up with two solutions.

Jon: It isn't well defined what constitutes a solution. People have differing use cases and there are a lot of pieces.

David: I think having exceptions would fundamentally change the feeling of the Rust language. Really concerned that we end up with a dialect of Rust where `?` doesn't mean what it means anymore. Panics in Rust are a programming error. Exceptions can mean something else.

Ilya: There are multiple approaches to exceptions. ... Mapping exceptions to `Result` does make sense. If we had a distinction in C++ between exceptions that are programming errors and those that signal errors, that would support mapping it correctly. I don't think we should submit to a single solution; why not both?

Taylor: We can definitely add an attribute to a function to wrap its bindings and return a `Result`.

Niko: The point of the panic=abort RFC was to encode that panics are not necessarily recoverable.

Jon: There is more consistency of idiomaticity in Rust. C++ is not the same way. We may need to be open to the idea that we are not going to have only one solution.

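As a rough illustration of the "wrap the binding and return a `Result`" option raised in this discussion, the sketch below does the equivalent in pure Rust. It is only an analogy under stated assumptions: `may_throw` is a hypothetical stand-in for a bound C++ function and panics instead of throwing, `BridgedError` is an invented error type, and `std::panic::catch_unwind` stands in for the `catch_cpp_exception` mentioned above, which does not exist today.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical error type standing in for a caught C++ exception.
#[derive(Debug, PartialEq)]
pub struct BridgedError(pub String);

/// Hypothetical stand-in for a bound C++ function that may throw.
/// In this sketch it panics instead of throwing.
pub fn may_throw(x: i32) -> i32 {
    if x < 0 {
        panic!("negative input");
    }
    x * 2
}

/// Opt-in wrapper: converts unwinding out of `may_throw` into a `Result`,
/// so callers who want error values can use `?`.
pub fn may_throw_checked(x: i32) -> Result<i32, BridgedError> {
    panic::catch_unwind(AssertUnwindSafe(|| may_throw(x))).map_err(|payload| {
        // Panic payloads are usually `&'static str` or `String`.
        let msg = payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown failure".to_string());
        BridgedError(msg)
    })
}
```

Calling `may_throw_checked(3)` yields `Ok(6)`, while `may_throw_checked(-1)` yields an `Err` carrying the message. Callers that prefer unwinding can keep calling `may_throw` directly, which is the "why not both?" shape suggested in the room.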
Taylor Foxhall: The result of the lack of idiomaticity is that I don't know what a function might do and have to handle every single case.

cramertj: David, how do you expect to distinguish between functions that do and do not throw exceptions?

David: Not `noexcept`. Any function that can allocate can throw an exception in C++... that's an interesting one and I'll have to think about it.

### (12:00) Stable ABI

Jon: What is important to us wrt. ABI? It is very helpful to know the ABI for predicting the behavior of the other side. Versioned ABIs would be very helpful. A way to output the ABI of a function would help for interop. This would be helpful for both the Rust and the C++ side. This is part of the crate information, which is unstable. The information is similar to what is inside a C++ header.

- Connor: I am interested in working on an ABI Description Format (I'm actually working on a subset of one for an OS project)

A practical example for where ABI matters:

- You need it for passing by value to C++, basically to shortcut the marshalling between languages. E.g. vector of spans (C++) to `Vec<&[T]>`. Passing a C++ value returned from a function to another C++ function, especially a non-relocatable object. The result is returned in a pre-allocated location, but you can't forward the object unless it has a move or copy constructor. This feature (guaranteed copy elision) has existed since C++17. An example of an immovable type is mutex.
- Passing a reference to a slice of ints. Rust and C++ need to agree on the layout of types in a slice/span.
- Example:

```C++
struct S {
    S(const S&) = delete;
    S(S&&) = delete;
    S() {}
};

void f(S s) { }

int main() {
    f(S());
}
```

```C++
// Note: Since constructors are special in C++, an example that uses a free function
struct S {
    S(const S&) = delete;
    S(S&&) = delete;
    S() {}
};

S g() { return S{}; }

void f(S s) { }

int main() {
    f(g());
}
```

- Michał: What are the requirements of a stable Rust ABI? Does an ABI like this need to be callable from C++?
That may currently be impossible: the Rust ABI is unpredictable. It is based on the C ABI at the LLVM level, but it sometimes produces ABIs that are impossible in C. LLVM will legalize those invalid ABIs in *some* way, but it is not documented *how* that process happens. E.g.: On `x86_64-pc-windows-msvc`, the Rust ABI returns some values (like slices) in 2 registers. This is not (currently) possible to replicate on the C++ side. https://godbolt.org/z/6cWKfdW4c
- The expectation is not to have a stable ABI but to report in a machine-readable form what is used at compilation time.
- Rust function pointers aren't always representable in C. To be clarified.
- An MVP for Crubit was intended to cover `std::function` and `Fn` interoperability, but it got dropped because of complexity.
- You can't implement the `Fn` trait in stable Rust. This causes some people to remain on unstable for interop.
- In theory this is possible but we want to not limit future changes.
- The metadata format doesn't change that often, but there are no semver guarantees. A compiler flag to dump the metadata could introduce a versioned format. The Rust compiler could also provide a library to read the unstable format.
- The layout of an object in a slice would need to be known.
- An empty C++ span can have a length of zero and a `nullptr`, which isn't valid in Rust. Zero-sized types are represented differently. An `Option<&[T]>` is the same size as `&[T]`.
- A span roundtrip would require the pointer to remain the same. C++ can't accept a dangling pointer in an empty slice. Using an `Option<&[T]>` on the Rust side would make the C++ representation work, but `Some(&[])` would be illegal to pass to C++.
- We agree that conceptually span and slice are the same, similar to `int32_t` and `i32`.
- Another problem is uninitialized span objects and aliasing violating Rust's guarantees.
- FYI: https://github.com/cramertj/rfcs/blob/slice-repr/text/0000-guaranteed-slice-repr.md
- How to use iterator-like types on the other side?
- Separating an empty span into pointer and length is common, but a dangling pointer would confuse existing C++ code.
- Ideally a span/slice would just use the idiomatic access on both sides.
- Slice and span feel very close to being cross-language equivalents.
- What do the bits say? What APIs can you call on the respective type?
- Providing all functionality on one side to the other side makes a case for not making the types identical (as done in Zngur).
- The inability to map between these two types is a source of frustration.
- This touches on similar problems to unambiguously mapping integer types.
- For interop, explicit differences double the complexity; implicit differences introduce magic misbehaviour.
- Both individual languages already have the problem of unportable integer types.

### (14:00) Allow reuse of tail padding (aka C++'s "potentially overlapping subobjects")

- TODO: Get slides

[Background description](https://hackmd.io/uZ8hHkYXQQCLtwuS7xP6JQ?view#Allow-reuse-of-tail-padding-aka-Cs-potentially-overlapping-subobjects)

- Derived class members can be placed in the tail padding space of the superclass. Rust might overwrite this information in unsafe code.
- This is not a property of the C ABI, only of the C++ ABI.
- `std::mem::swap` could be changed to respect this, but unsafe code could clobber the data.
- This is only about tail padding at the destination.
- `copy_nonoverlapping` will copy `size_of::<T>()` bytes, which conflicts with tail padding.
- Rust doesn't distinguish between a size 1 slice and a pointer to an object.
- https://doc.rust-lang.org/std/slice/fn.from_ref.html is affected
- A new trait `FullSized` could solve this; the function `size_of_without_tail_padding` would report this.
- Amanieu would like to do this if possible, but suspects that it is not, because it breaks existing code. (Behaviour of `repr(Rust)`)
- Question: Do we want to solve this for C++ interop, or do we want to improve Rust performance?
- Constraints that lead to this problem:
    - Want C++ types to work with Rust generics and/or want to do the optimization in Rust
    - Unsafe code can write to padding bytes today, but that would clobber data readable by safe code when we turn this on.
- An auto trait to prevent `mem::swap` (by Niko) would solve this. `NonSwap` / `Replace`
- In Rust, types have padding, memory doesn't. Memory is just bytes.
- Proposal: Do two crater runs, one with storing information in the tail padding, one which moves the tail padding before the variable. Compare the failures.
- Pinned types already prevent moving. So the reference to the base class could become a `Pin`.
- Move-only fields have been proposed to solve high-density packing. This could be a solution here. This would prevent referencing C++ elements.
- Potential solution: Only the types having tail padding must be not `Unpin`, and you can only get a pinned reference to the object.
- You can only pass a reference to a base class which is pinned.
- This is only a problem for C++ objects which contain tail padding.

### (15:00) Concrete ways of reducing UB in the C++ spec

- Within the C++ committee, there was an effort to document the existing UB, because all behaviour that is not described is also undefined.
- This document was used to measure and reduce UB.
- There are different reasons for UB
    - Leaving implementation choices open; this would make it implementation-defined
    - UB can result from a mistake, e.g. UB in the preprocessor, or a missing newline at the end of a file
    - where existing implementations diverge and prevent the standard from defining it
- There is also ill-formed, no diagnostic required (IFNDR) and unspecified behaviour
- UBSan was created to catch violations that can be checked without requiring more memory at runtime. Also, C++ interpreters know which entry of an enum is used.
- A patch to clang initialized the stack to zero while avoiding already known contents; the runtime overhead was acceptable (unlike previous attempts)
- This caused a push towards implementing this security enhancement in other compilers.
- Also, trapping after a function without a return has become common to reduce UB and replace it with erroneous behavior (EB)
- This is still opt-in in the compiler, because it might break existing code
- Some changes introducing EB can make code slower because of reasoning about loops ending.
- Examples of UB hurting interop are appreciated
- Library UB often doesn't hurt Rust interop
- Assembly and system calls operate outside the abstract machine

### (16:00) Memory model standard shared between C++ and Rust

The idea is to create a shared abstract virtual machine between C++ and Rust. This describes memory access (parallelism), arithmetic and control flow. JavaScript and WebAssembly define the machine more formally than C++. The memory model could be split out of the C++ standard to more closely define language interoperability. An IEEE working group would implement changes, and these changes would get fed back to C++. Ideally the changes would be discussed in a pull request on GitHub and be available under a Creative Commons compatible license.

A formal description derived from the English text would enable formal proofs. The final mathematical representation is still being worked on. The Linux kernel memory model is meant to be integrated into the unified model. Ideally C++ would switch over in the C++29 standard. The first import should be unchanged, so referencing it from C++29 is viable. A liaison to the t-opsem group is needed. The memory model effort is supported by the Rust language side.

# ✋ Queue For Speaking ✋
