ryanlevick
# Safe(r) Transmute

The ability to view one type as another with no copying and only the absolutely necessary runtime checks is important for systems programming. However, this process, known as "transmuting" one type to another, is extremely dangerous, so much so that the docs for [std::mem::transmute](https://doc.rust-lang.org/std/mem/fn.transmute.html) are essentially a long list of how to avoid doing so.

Transmuting cannot always be avoided though. For instance, in extremely performance-sensitive use cases, it may be necessary to transmute from bytes instead of explicitly deserializing and copying bytes from a buffer into a struct.

While type transmuting is the general act of viewing one type as another type, there is one flavor of transmute that is extremely common: viewing a slice of bytes (i.e., a byte buffer) as some arbitrary type and vice versa. This pre-RFC attempts to solve only this problem while still leaving room for future improvements that allow for arbitrary transmute.

## Use Cases

Viewing bytes as a type and vice versa is useful in a wide range of use cases such as:

* **Parsing**: many file formats lay out bytes in a way compatible with C struct layouts, meaning copying is often not necessary. For example:
  * Network protocols like HTTP, TLS, etc.
  * Binary files like image, zip, executables, etc.
  * Memory-mapped files
* **Search**: High performance search algorithms generally want to copy as little data as possible.
* **Kernel and Embedded Development**: in low-level contexts you will often not have the stack space to copy items from memory to the stack to perform manipulations.

## Causes of Unsafety and Undefined Behavior (UB)

At the core of understanding the safety properties of transmutation is understanding Rust's layout properties (i.e., how Rust represents types in memory). The best resource I've found for understanding this is [Alexis Beingessner's blog post]() on the matter.

The following are the reasons that transmutation from some buffer of bytes is generally unsafe:

* **Wrong Size**: A buffer of bytes might not contain the correct number of bytes to encode a given type. Referring to uninitialized fields of a struct is UB. Of course, this assumes that the size of a given type is known ahead of time which is not always the case.
* **Illegal Representations**: Safe transmutation of a slice of bytes to a type `T` is only possible if every possible value of those bytes corresponds to a valid value of type `T`. For example, this property doesn't hold for `bool` or for most enums. While `size_of::<bool>() == 1`, a `bool` can _only_ legally be either `0b1` or `0b0` - transmuting `0b10` to `bool` is UB.
* **Non-Deterministic Layout**: Certain types might not have a deterministic layout in memory. The Rust compiler is allowed to rearrange the layout of any type that does not have a well defined layout associated with it. Explicitly setting the layout of a type is done through `#[repr(..)]` attributes. To be deterministic, both the order of fields of a complex type as well as the exact value of their offsets from the beginning of the type must be well known. This is generally only possible by marking a complex type `#[repr(C)]` and recursively ensuring that all fields of the struct are composed of types with deterministic layout.
* **Alignment**: Types must be "well-aligned" meaning that where they are in memory falls on a certain memory address interval (usually some power of 2). For example the alignment of `u32` is 4 meaning that a valid `u32` must always start at a memory address evenly divisible by 4. Transmuting a slice of bytes to a type `T` that does not have proper alignment for type `T` is UB.

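
As a rough illustration of the size, alignment, and illegal-representation points above, the following sketch shows the runtime checks a hand-written byte view has to perform today. The helpers `view_as_u32` and `bool_from_byte` are hypothetical names invented for this example and are not part of the proposal.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper (not part of this proposal): view the first bytes of
/// `bytes` as a `u32` in place, returning `None` if a check fails.
fn view_as_u32(bytes: &[u8]) -> Option<&u32> {
    // Wrong Size: the buffer must hold at least `size_of::<u32>()` bytes.
    if bytes.len() < size_of::<u32>() {
        return None;
    }
    // Alignment: the start address must be a multiple of `align_of::<u32>()`.
    if (bytes.as_ptr() as usize) % align_of::<u32>() != 0 {
        return None;
    }
    // SAFETY: size and alignment were checked above, and every bit pattern
    // is a valid `u32`, so no illegal representation is possible.
    Some(unsafe { &*(bytes.as_ptr() as *const u32) })
}

/// Hypothetical helper: `bool` only allows the byte values 0 and 1, so a byte
/// must be validated (here by copying) before it can be treated as a `bool`.
fn bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None, // e.g. 0b10 is not a legal `bool`
    }
}
```

Calling `view_as_u32` on a sub-slice that starts one byte into a 4-byte-aligned buffer, or on a slice shorter than four bytes, yields `None`; `bool_from_byte(2)` does the same. The value read through the returned reference depends on the platform's endianness.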

Transmuting from a type `T` to a slice of bytes can also be unsafe or cause UB:

* **Padding**: Since padding bytes (i.e., bytes internally inserted to ensure all elements of a complex type have proper alignment) are not initialized, viewing them is UB.
* **Non-Deterministic Layout**: The same issues that arise when transmuting from bytes to type `T` apply when going in the other direction.

## Proposed Improvements

### Introduce traits for types that can be safely transformed to/from bytes

We first introduce the traits `FromAnyBytes` and `ToBytes` (names subject to bikeshedding - see below).

`FromAnyBytes` represents any type where all properly aligned and sized byte patterns are legal (from here on referred to as "byte-complete" types), such that any byte slice of the same size can be transmuted into the type in-place without further checking.

`ToBytes` represents any type that can be transmuted into bytes in-place, which requires that the type must not have any padding.

All core types that are byte-complete implement both `FromAnyBytes` and `ToBytes` (a full list appears below). Core types like `bool` that need further validation before being safely transmuted from bytes only implement `ToBytes`.

Both traits can be safely opted into either using `#[derive(...)]` or impl blocks as long as:

* They are only recursively composed of `FromAnyBytes` or `ToBytes` types respectively
* They have a deterministic layout (such as types using `repr(C)` or `repr(transparent)`)
* Additionally, for `ToBytes` they contain no padding bytes.

The compiler will return an error when the type does not fit all of the necessary conditions.

### `FromAnyBytes` Definition

`FromAnyBytes` contains no methods and serves as a marker trait.

### `ToBytes` Definition

```rust
trait ToBytes: Sized {
    #[inline]
    fn to_bytes(&self) -> &[u8; std::mem::size_of::<Self>()] {
        // ... implementation
    }

    #[inline]
    fn to_bytes_mut(&mut self) -> &mut [u8; std::mem::size_of::<Self>()] {
        // ... implementation
    }

    #[inline]
    fn into_bytes(self) -> [u8; std::mem::size_of::<Self>()] {
        // ... implementation
    }

    // One more discussed below ...
}
```

### Casting

This proposal only proposes one initial function for doing safe transmute: `cast`

```rust
trait ToBytes: Sized {
    // ... other methods seen above

    fn cast<U: FromAnyBytes>(self) -> U {
        // ... implementation
    }
}
```

### Impact on Public API

The user must opt into a complex type implementing `FromAnyBytes` and `ToBytes`, because this has implications on the public API of the type. For instance, changing normally private details of a complex type such as ordering of private fields may become a breaking change.

### Padding

A struct that requires internal padding can become a struct that can derive `ToBytes` by explicitly defining padding fields.

```rust
#[derive(ToBytes)]
#[repr(C)]
struct Foo {
    field1: u8,
    _0: u8,
    field2: u16,
}
```

Note that some structs may have "surprise" padding at the end and as such should not implement `ToBytes`. For example: `struct MyType(u32, u8)`.

### Implementing `std` Types

The following core types will be marked as `FromAnyBytes` and `ToBytes`:

* `u8`, `u16`, `u32`, `u64`, `u128`, `usize`
* `i8`, `i16`, `i32`, `i64`, `i128`, `isize`
* `f32`, `f64`
* `()`
* all SIMD types that are byte-complete
* `Option<T>` where `T` is any `NonZeroU*` or `NonZeroI*` type
* `[T; N]` for any `T` implementing the corresponding trait.
  * Note that all types guarantee their size is a multiple of their alignment, so an array `[T; N]` can never contain padding that the type `T` doesn't itself contain.

The following additional core types will be marked as `ToBytes` only:

* `bool`
* any `NonZeroU*` or `NonZeroI*` type
* `char`
  * Note that this will produce and consume UCS-4 characters and would require committing to the internal UCS-4 representation of `char`. We could, alternatively, omit the trait implementations for `char`.

All tuples composed of `FromAnyBytes` types will themselves implement `FromAnyBytes`.
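
To make the padding rules in this section concrete, here is a small sketch that compares a type's size with the sum of its field sizes. It assumes a typical target where `u32` has an alignment of 4; `fields_fill_type` is a hypothetical helper written for this example and is not how the proposed derive would be implemented.

```rust
use std::mem::size_of;

/// Trailing ("surprise") padding: 4 + 1 bytes of fields, rounded up to a
/// multiple of the 4-byte alignment.
#[allow(dead_code)]
#[repr(C)]
struct MyType(u32, u8);

/// Interior padding made explicit with a named field, as in the proposal.
#[allow(dead_code)]
#[repr(C)]
struct Foo {
    field1: u8,
    _0: u8,
    field2: u16,
}

/// Hypothetical check: a type has no padding exactly when its size equals
/// the sum of the sizes of its fields (which the caller passes in).
fn fields_fill_type<T>(field_sizes: &[usize]) -> bool {
    field_sizes.iter().sum::<usize>() == size_of::<T>()
}
```

Under the stated assumption, `size_of::<MyType>()` is 8 although its fields only occupy 5 bytes, so `fields_fill_type::<MyType>(&[4, 1])` is `false`, while `fields_fill_type::<Foo>(&[1, 1, 2])` is `true`.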
All tuples composed of `ToBytes` types without padding can implement `ToBytes`. (Providing such implementations in the standard library may require compiler assistance.)

### Enums

C-style enum types (with no fields in any variant) marked with `#[repr(C)]` or `#[repr($INT)]` may derive `ToBytes`.

### Generics

While it is theoretically possible to derive `ToBytes` and/or `FromAnyBytes` for generic structs which are generic over types that are `ToBytes` and/or `FromAnyBytes`, this is left to future work.

### Endianness

Transmute deals with in-memory data in-place, and thus does not have any provisions to perform translations between native endianness and non-native endianness.

### Unsafe Impl

There is no way to `unsafe impl` either `FromAnyBytes` or `ToBytes` for a type that doesn't meet the requirements.

### Raw Pointers

Raw pointers could potentially implement both `ToBytes` and `FromAnyBytes`, and references or `Option` of references could potentially implement `ToBytes`. There may be uses for such implementations, but they also seem potentially error-prone. We propose to evaluate them further and consider such implementations in the future, but to not provide such implementations in the initial version.

## Naming

The names for these traits are still subject to bikeshedding.

There were several criteria used to select each trait name. First, the names should make their usages recognizable out of context although not necessarily sufficiently clear without prior exposure. It should be clear through the names how the two marker traits contrast with each other as well as potential extensions in the future.

The `FromAnyBytes` trait should convey that any combination of bytes the same length as `size_of::<T>()` is a valid representation of type `T` in memory. The `ToBytes` trait should convey that it is a well-defined operation to view the raw memory representation of the marked type.

Note that the working assumption is that these types will exist in the `std::mem` namespace.

Other names that were considered include:

* `FromValidBytes` / `AsValidBytes`
* `FromValidBytes` / `ToValidBytes`
* `SafeFromBytes` / `SafeToBytes`
* `FromBytes` / `AsBytes`
* `SafeTransmuteFrom` / `SafeTransmuteTo`
* `FromAnyBytes` / `ToBytesInPlace`

## Limitations

This proposal is purposefully fairly limited. It does not, for instance, give any way to convert to a type from bytes, only the tools to ensure that doing so would be safe.

## Possible Future Extensions

The following proposals could be made in the future in a way that is compatible with this proposal. This document is not advocating one way or another for their adoption, but these proposals were also considered in the creation of this document.

### Extension to `ToBytes` for Transmuting

The `ToBytes` trait could be extended to allow transmuting between types through a bytes intermediary.

```rust
trait ToBytes {
    // to_bytes() defined here

    /// Safely cast this type in-place to another type, returning a
    /// reference to the same memory. Panics if alignment or size of types
    /// differ.
    fn cast<T: FromAnyBytes>(&self) -> &T { /*...*/ }

    /// Safely cast this type in-place to another type, returning a mutable
    /// reference to the same memory. This requires `Self` to satisfy
    /// `FromAnyBytes`, because writes through the returned mutable
    /// reference will mutate `Self` without validation. As with the above,
    /// this method will panic if size or alignment is off.
    fn cast_mut<T: FromAnyBytes>(&mut self) -> &mut T
    where
        Self: FromAnyBytes,
    { /*...*/ }
}
```

### AlignOf and SizeOf

The `ToBytes` proposal above requires two runtime checks to be provably safe. The size of the type being cast to must not exceed the size of the type being cast from, and the types must have alignments which are compatible with each other. These properties are (usually) possible to know at compile time, so having some sort of traits that encapsulate this in the type system such as `SizeOf<N>` and `AlignOf<N>` would be ideal.
That way a safe casting function could be written as:

```rust
fn safe_transmute<From, To, Size, Align>(from: From) -> To
where
    From: SizeOf<Size> + AlignOf<Align> + ToBytes,
    To: SizeOf<Size> + AlignOf<Align> + FromAnyBytes,
{
    // implementation
}
```

### `FromBytes`

Additionally a `FromBytes` trait could be introduced for expressing conversions from bytes that might fail:

```rust
trait FromBytes: Sized {
    fn from_bytes(bytes: &[u8; std::mem::size_of::<Self>()]) -> Result<&Self, FromBytesError>;
}

#[non_exhaustive]
#[derive(Debug, PartialEq, Eq, Copy, Clone)]
enum FromBytesError {
    InsufficientAlignment,
    InsufficientBytes,
    InvalidValue,
}
```

With this in place it may be tempting to have some sort of variant of the above `ToBytes` casting methods which can fail instead of panicking.

### Safe Unions

Unions whose fields all implement both `FromAnyBytes` and `ToBytes` can potentially allow reads of their fields without requiring unsafe, since writing to one field and reading from another acts as a transmute operation, and these traits make transmutes safe.

However, when a union's fields have differing lengths (referred to here as "unbalanced unions"), initializing a shorter field does not necessarily zero out the remainder of the union. This means initializing a union with a shorter field and then reading a longer field leads to reading from uninitialized memory.

To make this well defined, it would perhaps be wise to add a new repr, `#[repr(zero_init)]`, which initializes the remainder of the union to zero when initializing any field. Thus, safe Rust can allow reading fields of unbalanced unions if and only if the union type implements `ToBytes` and `FromAnyBytes` and is `#[repr(zero_init)]`.

## Alternatives

### typic

`typic` is an experimental crate for encoding a type's layout as a trait. This allows for easy comparison at compile time of two distinct types' layouts. As long as they implement equivalent layout types, they should be safe to transmute between (assuming proper alignment).

This approach is appealing since it seeks to solve general transmutation instead of simply viewing bytes as types and vice versa. What is unclear is what impact this approach has on compiler performance and whether using the type system in such a way is an approach the compiler team feels comfortable supporting.

More information on `typic` can be [found here](https://github.com/jswrenn/typic).

### `Compatible<T>`

The `Compatible<T>` proposal also seeks to address the general problem of transmute through use of the type system. The proposal suggests an extension to the type system that allows the type system to know if two types are transitively compatible with one another.

The downside to this approach is that it requires changes to the trait system (albeit ones that are currently in the works in [chalk](https://github.com/rust-lang/chalk)).

More about this proposal can be [found here](https://gist.github.com/gnzlbg/4ee5a49cc3053d8d20fddb04bc546000).

### Padding

Instead of forbidding padding in `ToBytes`, an API that returns an `[std::mem::MaybeUninit<u8>; std::mem::size_of::<T>()]` could be provided. We believe that this is an ergonomics hit that can more easily be overcome through other means (e.g., adding explicit padding fields that are zeroed or introducing a `repr(zeroed)` attribute).

## Open Questions

* Is it possible that two types with the same size and alignment that are both `FromAnyBytes` and `ToBytes` can still not be convertible to one another due to differing ABI needs?
* Should all fields of a `FromAnyBytes` type be required to be marked public, as exposing the type's in-memory representation already exposes its internals publicly?

## Acknowledgments

Shout out to the following crates for paving the way with many good ideas:

* [safe-transmute](https://crates.io/crates/safe-transmute)
* [zerocopy](https://crates.io/crates/zerocopy)
* [`Compatible<T>`](https://internals.rust-lang.org/t/pre-rfc-frombits-intobits/7071/24)
* [typic](https://crates.io/crates/typic)
