Advanced inline assembly usage in Linux kernel

This document covers advanced inline assembly usage in the Linux kernel. The focus is asm goto (as per the project goal), but it also covers a few other asm features that the Linux kernel uses and that stable Rust does not provide as of today.

asm goto

The kernel has hundreds of asm goto uses. I have grouped them into the following categories:

Handles exceptions inside exceptions

An example is in arch/x86/kvm/vmx/vmx.c:

asm goto("1: vmxon %[vmxon_pointer]\n\t"
                  _ASM_EXTABLE(1b, %l[fault])
                  : : [vmxon_pointer] "m"(vmxon_pointer)
                  : : fault);

There are many other examples in the implementation of uaccess (user-space access), where an access can fault and the fault must be handled gracefully. Performance matters a lot in this case too, so a polyfill without asm goto that reports the fault through an extra output register is undesirable.

The corresponding Rust pattern is:

unsafe {
    asm!("/* do some op, jump to {fault} on fail */", fault = label {
        // Fault
    })
}

In this use case, an issue with the current design is that the fault block ends up inside the unsafe block. The following workaround keeps the fault handling code outside of it:

'out: {
    'fault: {
        unsafe {
            asm!("/* do some op, jump to {fault} on fail */", fault = label { break 'fault; });
        }
        break 'out;
    }
    // Fault handling code.
}
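
The nested labelled blocks can also carry a value, which makes it straightforward to wrap a faulting operation in a Result. The sketch below illustrates only the control flow: try_op and its fault flag are hypothetical names, and a plain if stands in for the asm! invocation so that the example runs on any target.

```rust
/// Hypothetical wrapper: `fault` simulates the asm taking the exception path.
fn try_op(fault: bool) -> Result<u32, &'static str> {
    'out: {
        'fault: {
            // Stand-in for:
            //   unsafe { asm!("...", fault = label { break 'fault; }); }
            // The label block does nothing except leave the unsafe region.
            if fault {
                break 'fault;
            }
            // Fall-through path: the operation succeeded.
            break 'out Ok(1);
        }
        // Fault handling code runs here, outside of any `unsafe` block.
        Err("faulted")
    }
}
```

Here try_op(false) takes the fall-through path and try_op(true) ends up in the fault handling code.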

Breaking out from loop in asm-implemented computation

An example is in arch/riscv/lib/csum.c:

    ...
    asm goto("/*...*/ beqz ..., %[end] /*...*/":/*...*/:/*...*/::end);
    ...

end:
    return ...;

This corresponds to the Rust pattern:

asm!("/* ... */ beqz ..., {end} /* ... */", end = label {
    return ... // or break
})

The code inside the label block is immediately another control-flow transfer. Because the block is minimal, the current design works fine.

Alternative mechanism

Alternatives are needed when different code must run on different CPUs and a function call would be too costly.

Example in arch/riscv/include/asm/bitops.h:

asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0,
                              RISCV_ISA_EXT_ZBB, 1)
                  : : : : legacy);

The old content (j legacy) is emitted, but it will be replaced with a nop if the Zbb ISA extension is detected.

The implementation would be more involved in Rust because Rust cannot stringify constants. It will probably look something like this:

asm!(alternative!("j {legacy}", "nop"), vendor_id = const 0, patch_id = const RISCV_ISA_EXT_ZBB, config = const 1, legacy = label {
    /* Legacy */
})

This is similar to the fault handling case.

Static branch

Similar to the alternative mechanism, but allows run-time switching.

static __always_inline bool arch_static_branch(struct static_key * const key,
                                               const bool branch)
{
        asm goto(
                "       .align          2                       \n\t"
                "       .option push                            \n\t"
                "       .option norelax                         \n\t"
                "       .option norvc                           \n\t"
                "1:     nop                                     \n\t"
                "       .option pop                             \n\t"
                "       .pushsection    __jump_table, \"aw\"    \n\t"
                "       .align          " RISCV_LGPTR "         \n\t"
                "       .long           1b - ., %l[label] - .   \n\t"
                "       " RISCV_PTR "   %0 - .                  \n\t"
                "       .popsection                             \n\t"
                :  :  "i"(&((char *)key)[branch]) :  : label);

        return false;
label:
        return true;
}

This translates to something like this in Rust:

'outer: {
    unsafe {
        asm!(/* use label */, label { break 'outer true; });
        false
    }
}

Note that this static inline function cannot be translated to a function in Rust, because Rust lacks i constraints (covered later), so it has to be a macro.

The intended usage is to return true or false from the block and then immediately test the value with if:

if arch_static_branch!(KEY, BRANCH) {
    /* */
}

LLVM's SimplifyCFG pass easily optimises away the additional if. We prefer the if form because it avoids custom control flow in macros, e.g.

// Note this still has the unsafety issue.
arch_static_branch!(KEY, BRANCH, {
    /* */
})
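
To illustrate the macro shape that makes the if form possible, here is a minimal sketch. Everything in it is an assumption made for illustration: StaticKey is a hypothetical type, the macro takes only a key (no branch argument), and an AtomicBool load stands in for the patched nop/jump that the real asm! would emit, so that the example runs on any target.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical key type; a real static key patches the code instead of
/// being loaded at run time.
pub struct StaticKey(AtomicBool);

impl StaticKey {
    pub const fn new(enabled: bool) -> Self {
        Self(AtomicBool::new(enabled))
    }

    pub fn set(&self, enabled: bool) {
        self.0.store(enabled, Ordering::Relaxed);
    }
}

/// Evaluates to a `bool`, so callers keep ordinary `if` control flow.
macro_rules! arch_static_branch {
    ($key:expr) => {
        'outer: {
            // Stand-in for:
            //   unsafe { asm!(/* use label */, label { break 'outer true; }); }
            if $key.0.load(Ordering::Relaxed) {
                break 'outer true;
            }
            false
        }
    };
}

static KEY: StaticKey = StaticKey::new(false);

fn describe() -> &'static str {
    if arch_static_branch!(KEY) {
        "enabled"
    } else {
        "disabled"
    }
}
```

Because the macro is just an expression of type bool, the caller's branches stay ordinary safe code outside of the macro.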

The static-branch pattern is the most common. Another example is the following Rust implementation of ctrl_dep/volatile_cond (not in mainline yet):

/// Enforce a control dependency on `x`.
fn ctrl_dep(x: bool) -> bool {
    unsafe {
        core::arch::asm!(
            "cmp {:e}, 0",
            "jne {}",
            in(reg) x as i32,
            label {
                return true;
            }
        );
    }
    false
}

Restartable sequences (rseq)

Restartable sequences are a userspace feature, so the uses in the kernel tree are limited to selftests. The asm jumps to different labels to handle different rseq outcomes.

Design decisions?

asm goto with outputs

The preferred action is to put asm_goto_outputs behind a separate feature gate, and to defer its stabilisation until GCC supports it and LLVM stops miscompiling it.

Return values directly from asm! block

For the static branch case:

'outer: {
    unsafe {
        asm!(/* use label */, label { break 'outer true; });
        false
    }
}

can be written as:

unsafe {
    asm!(/* use label */, label { true }, fallthrough { false })
}

However, this adds complexity to an asm feature that is rarely used (rare in terms of how often it appears in code, not how often it runs).

I am not certain what should happen here; input is needed.

Fallthrough block

Should asm goto include a fall-through block?

  • If asm! cannot return a value, then adding a fall-through block would not help with any of the above use cases.

The preferred action would be to not implement it unless a use case arises.

i constraint

The i constraint in C means that a constant is passed to the assembler. It only needs to be an assemble-time constant, not a compile-time one. Relocations are also allowed.

int x;
static inline void foo(int *ptr, int value) {
    int y;   
    asm volatile ("/* ... */"::"i"(100)); // OKAY
    asm volatile ("/* ... */"::"i"(1+1)); // OKAY
    asm volatile ("/* ... */"::"i"(&x)); // OKAY
    asm volatile ("/* ... */"::"i"(&y)); // NOT OKAY
    asm volatile ("/* ... */"::"i"(ptr)); // OKAY if caller uses a constant `ptr`.
    asm volatile ("/* ... */"::"i"(value)); // OKAY if caller uses a constant `value`.
}

In contrast, Rust's const operand is string interpolation only, and its usage is quite limited.

static X: (i32, i32) = (1, 2);

#[inline]
unsafe fn foo<const N: i32>(ptr: *const i32, value: i32) {
    let y: i32 = 2;
    asm!("/* ... */", const 100); // OKAY
    asm!("/* ... */", const 1+1); // OKAY
    asm!("/* ... */", const &X); // NOT OKAY
    asm!("/* ... */", const &X.1); // NOT OKAY
    asm!("/* ... */", sym X); // but this is OKAY
    asm!("/* ... */", const &y); // NOT OKAY
    asm!("/* ... */", const ptr); // NOT OKAY
    asm!("/* ... */", const value); // NOT OKAY
    asm!("/* ... */", const N); // but this is OKAY
}

Currently, sym can be used in Rust to pass the address of a static, but there is no way to embed the address of a &'static Foo into assembly.

The idea is to extend const to any CTFE constant. I don't think Rust needs to support assemble-time constants; CTFE constants should be sufficient. This would make the const &X and const &X.1 lines above legal.

Issue: https://github.com/rust-lang/rust/issues/128464

Of course, passing multiple different pointers to the function is still not ergonomic. This still needs a trait and an associated constant.

trait Foo {
    const X: *const i32;
}

unsafe fn foo<F: Foo, const N: usize>() {
    asm!("/* ... */", const F::X); // OKAY with proposal
    asm!("/* ... */", const N); // OKAY
}
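
As a rough illustration of how callers would select a pointer through such a trait, the sketch below implements Foo for two marker types and picks one at the call site. First, Second and read are hypothetical names, the pointers refer to promoted constants rather than kernel statics, and a plain dereference stands in for the asm! that would receive F::X as a const operand under the proposal.

```rust
/// Mirrors the `Foo` trait from the text: each implementor names one pointer.
trait Foo {
    const X: *const i32;
}

struct First;
struct Second;

impl Foo for First {
    // `&10` is promoted to a `'static` allocation, so the pointer stays valid.
    const X: *const i32 = &10;
}

impl Foo for Second {
    const X: *const i32 = &20;
}

/// Stand-in for a function that would pass `F::X` to `asm!`.
fn read<F: Foo>() -> i32 {
    // SAFETY: every `Foo::X` in this sketch points to a promoted `'static` i32.
    unsafe { *F::X }
}
```

The pointer is chosen with a turbofish, e.g. read::<First>(), which is the ergonomic cost the text refers to: every distinct pointer needs its own type and impl.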

Constant function arguments

If a language feature allowed constant function arguments, then this could be written as:

unsafe fn foo(const ptr: *const i32, const value: i32) {
    asm!("/* ... */", const ptr);
    asm!("/* ... */", const value);
}

These const values do not flow into the type system, so they don't need adt_const_params, and they can depend on generics, similar to a const {} block.

This is helpful for other scenarios, e.g. atomic ordering:

impl AtomicU32 {
    fn fetch_add(&self, val: u32, const order: Ordering) -> u32;
}

or making sure something can be checked at compile time:

impl<const SIZE: usize> IoMem<SIZE> {
    fn readb(&self, const addr: usize) -> u8 {
        const { assert!(addr < SIZE); }
        /* ... */
    }
}
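
For comparison, the closest approximation that I believe works on stable Rust today moves the address into a const generic parameter. This is only a sketch under assumptions: the array-backed IoMem is a hypothetical stand-in for real MMIO, and ADDR takes the place of the proposed const addr argument, so unlike the proposal the value does flow into the type system.

```rust
/// Hypothetical MMIO-like region, backed by an array so the sketch runs anywhere.
struct IoMem<const SIZE: usize> {
    bytes: [u8; SIZE],
}

impl<const SIZE: usize> IoMem<SIZE> {
    /// `ADDR` is a const generic standing in for the proposed `const addr` argument.
    fn readb<const ADDR: usize>(&self) -> u8 {
        // Evaluated at compile time for each (ADDR, SIZE) pair that is instantiated.
        const { assert!(ADDR < SIZE) };
        self.bytes[ADDR]
    }
}
```

With an IoMem<4>, a call such as io.readb::<3>() builds and reads the last byte, while io.readb::<4>() is rejected when the code is built instead of panicking at run time.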

Other constraints

Rust notably lacks:

  • m memory constraints
  • Hybrid constraints, e.g. ri to allow a choice of register or immediate depending on whether the value can be optimised to a constant at compile time, or rm to allow a value to be either in memory or in a register (useful for x86 assembly).