
Crater analysis for #124141

The second recent crater run in #124141 ("Remove Nonterminal and TokenKind::Interpolated") had 18 failures. They all involve edge cases with declarative macros, and I believe they are all cases where the current compiler is buggy, and the new behaviour of failing to compile is the correct behaviour.

I will use "rustc0" to refer to the compiler with the current behaviour, and "rustc1" to refer to the compiler with the new behaviour introduced by this PR.

Summary

In every case, there is a declarative macro with some kind of complexity: a cfg attribute, a cross-crate boundary, a nested macro, or some combination. In every case, rustc0 accepts the complex declarative macro while rejecting a simpler form of the same macro, e.g. with the cfg removed, or in an intra-crate case. In every case, rustc1 rejects both the more complex and simpler forms, for improved consistency. I believe that rustc0 is buggy in its current behaviour, and that the inconsistencies arise because of the complexity of the internal handling of Interpolated.

None of the failing crates are particularly high profile. I expect fixing the compilation failures would be easy for some or most of them, though I haven't taken the time to check this.

hostfxr-sys-0.11.0

I minimized this to the following:

macro_rules! m {
    ($ty:ty) => {
        #[derive(WrapperApi)]
        struct S {
            // #[cfg(feature = "netcore1_0")] 
            f: $ty
        }
    }
}

m! { fn() -> i32 }

rustc0 and rustc1 both reject this, because the WrapperApi proc macro (from the dlopen2 crate) can't handle the invisible delimiters around $ty and aborts. However, if you uncomment the cfg line, rustc0 will accept it, because the cfg (which evaluates to true) causes the invisible delimiters to get stripped, and so the proc macro succeeds. There is no good reason for the invisible delimiters to be stripped; it's just an unintentional behaviour of rustc0. Therefore, rustc1 increases consistency by treating this code the same whether the cfg is present or not.

enclave-runner-0.6.0, dcap-provider-0.4.0, dcap-ql-0.4.0, fortanix-sgx-tools-0.5.1, report-test-0.4.0

I minimized this to the following. In enclave-runner/src/lib.rs:

macro_rules! consume_bang {
    (!) => {
        fn _exit1() {}
    };
    ($r:tt) => {
        fn _exit2() -> Option<$r> { panic!(); }
    };
}

fortanix_sgx_abi::invoke_with_ty_bang!(consume_bang);

In fortanix-sgx-abi/src/lib.rs:

macro_rules! mk_invoke_with_ty_bang {
    ($r:ty) => { // changing this to `tt` fixes the problem
        #[macro_export]
        macro_rules! invoke_with_ty_bang {
            ($m:ident) => { $m!($r); }
        }
    };
}

mk_invoke_with_ty_bang! { ! }

If all this code is in a single crate, both rustc0 and rustc1 reject it, because the ! input to mk_invoke_with_ty_bang is pasted as a ty, which gets invisible delimiters around it. That then ends up as an input to consume_bang, where it doesn't match the (!) rule because of the invisible delimiters, so the ($r:tt) rule matches and Option<!> is produced, which triggers the error message "error: the ! type is experimental".
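The opacity described above can be seen in isolation with a stable type. This is a minimal sketch, not code from any of the affected crates: the macro names classify, forward_as_ty and forward_as_tt are hypothetical, and i32 stands in for ! so that the example compiles on stable Rust.

```rust
// Reports which rule matched: the literal-token rule or the catch-all rule.
// This mirrors the shape of `consume_bang`, with `i32` in place of `!`.
macro_rules! classify {
    (i32) => { "literal" };
    ($t:tt) => { "opaque" };
}

// Captures the input as a `ty` fragment before forwarding it. The forwarded
// fragment is opaque, so the `(i32)` rule of `classify` cannot match it.
macro_rules! forward_as_ty {
    ($t:ty) => { classify!($t) };
}

// Captures the input as a raw token tree before forwarding it. The original
// `i32` token is passed through unchanged, so the `(i32)` rule matches.
macro_rules! forward_as_tt {
    ($t:tt) => { classify!($t) };
}

fn main() {
    assert_eq!(classify!(i32), "literal");
    assert_eq!(forward_as_ty!(i32), "opaque");
    assert_eq!(forward_as_tt!(i32), "literal");
}
```

This is the same reason that changing $r:ty to $r:tt in mk_invoke_with_ty_bang makes the (!) rule match again.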

However, if the code is split across two crates, rustc0 will accept it, while rustc1 rejects it. I don't 100% understand why, but I assume that the invisible delimiters are somehow stripped when the generated invoke_with_ty_bang macro is written into metadata. There is no good reason for the behaviour to be different when crossing crate boundaries; it's unintentional. Therefore, rustc1 increases consistency by treating this code the same whether it's intra-crate or inter-crate.

aver-0.1.5

I minimized this; the key part is:

macro_rules! log {
  ($arg:expr) => {
    print!("{}", $arg);
  };
}

macro_rules! make_log_info {
  ($level:ty) => {
    #[macro_export]
    macro_rules! log_info {
      ($arg:expr) => {
        if ($crate::$level >= $crate::LogLevel::Info) {
          log!($arg);
        }
      };
    }
  }
}

make_log_info!(LogLevel::Info);

If you have a call to log_info! in the same crate, both rustc0 and rustc1 reject it, because you can't concatenate a ty fragment specifier to something else to create a path in a declarative macro; you can only do that with an ident fragment specifier.

However, if the call to log_info! is in a different crate (in this case because it's in a unit test), then rustc0 will accept it. Again, something about being a nested macro that crosses crate boundaries lets it slip through. Once again, there is no good reason for different behaviour across a crate boundary, and rustc1 increases consistency.

(If $level:ty is changed to $($level:tt)*, then the concatenation is valid and accepted by both rustc0 and rustc1 in the intra-crate case.)
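The tt-based alternative can be sketched without a second crate. The example below is an illustration under assumptions: it uses ::std:: as the fixed path prefix instead of $crate:: so that it is self-contained, and the macro name std_item is hypothetical.

```rust
// Appends the caller's tokens to a fixed `::std::` prefix. Because the input
// is captured as raw token trees, the output is re-parsed as ordinary tokens
// and forms a single path.
macro_rules! std_item {
    ($($seg:tt)*) => { ::std::$($seg)* };
}

fn main() {
    // Expands to `::std::f64::consts::PI`.
    let pi: f64 = std_item!(f64::consts::PI);
    assert_eq!(pi, std::f64::consts::PI);

    // Expands to `::std::cmp::max(1, 2)`.
    assert_eq!(std_item!(cmp::max(1, 2)), 2);
}
```

Capturing the same input as a ty or path fragment and writing it after a fixed prefix is the form that, per the analysis above, both compilers reject within a single crate.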

fftconvolve-0.1.1

The minimized version includes this code:

macro_rules! generate_assert {
    ($assert:ident, $close:path) => {
        #[macro_export]
        macro_rules! $assert {
            ($test: expr, $truth: expr) => {
                $crate::$close($test, $truth);
            };
        }
    };
}

This is similar to the previous case: a nested macro crossing crate boundaries, involving a path specifier being concatenated with $crate::, which shouldn't be allowed. Again, rustc1 increases consistency by treating the intra-crate and inter-crate cases the same.

assert-cmp-0.2.1, parallel-disk-usage-0.9.3

A minimization:

#[macro_export]
macro_rules! assert_op_inner { () => {}; }

macro_rules! m {
    ($module:path) => {
        #[macro_export]
        macro_rules! assert_op { () => { $module::assert_op_inner!() }; }
    };
}

m!(::assert_cmp);

and in a unit test:

#[test]
fn assert_op_passes() {
    ::assert_cmp::assert_op!();
}

Once again we have a path fragment specifier being concatenated into a longer path. If all the code is in the same crate, both rustc0 and rustc1 reject it. But when it's cross-crate, rustc0 accepts it, for no good reason. Again, rustc1 increases consistency by treating the intra-crate and inter-crate cases the same.

for_each-0.9.0, math_adapter-0.3.8, meta_tools_min-0.2.13, non_std-0.1.4, proc_macro_tools-0.1.17, std_tools-0.1.4, std_x-0.1.4, type_constructor-0.3.0

These are all related: they are all by the same author, and they are all failures compiling test code, because they all use similar code in tests.

A minimized version:

// Produces a macro `produce_item` that, when called, produces `$item`.
macro_rules! tests_impls {
    ($item: item) => {
        macro_rules! produce_item {
            () => { $item };
        }
    };
}

// Produce a macro `produce_item` that, when called, produces a macro `_m`.
tests_impls! {
    macro_rules! _m {
        ($( $arg:tt )*) => { $( $arg )* };
    }
}

produce_item!();

rustc0 accepts this, while rustc1 rejects this with "error: attempted to repeat an expression containing no syntax variables matched as repeating at this depth". If you simplify it to this:

macro_rules! tests_impls {
    () => {
        macro_rules! _m {
            ($( $arg:tt )*) => { $( $arg )* };
        }
    };
}

tests_impls! {}

fn main() {}

then both rustc0 and rustc1 reject it. This is a standard problem with nested macros, where the repetition $( $arg:tt )* looks like it belongs to the inner macro but it really belongs to the outer macro.

There is no clear reason why rustc0 accepts the more complex code; it's just an accidental byproduct of the imperfect token handling of the more complex declarative macros. Once again, rustc1 increases consistency by treating the two cases the same.
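For the simplified form of the problem, one commonly used workaround on stable Rust is to pass a literal $ token into the outer macro, so that the inner repetition only comes into existence after the outer macro has expanded. The sketch below is an assumption about how such code could be restructured, not a fix taken from the affected crates; make_sum, sum and the metavariable $d are hypothetical names.

```rust
// The outer macro receives a `$` token as `$d`. Inside its body, `$d(...)` is
// not a repetition of the outer macro: it is the metavariable `$d` followed
// by a parenthesised group. After expansion, the inner macro reads:
//
//     macro_rules! sum {
//         ($($x:expr),*) => { 0 $(+ $x)* };
//     }
macro_rules! make_sum {
    ($d:tt) => {
        macro_rules! sum {
            ($d($d x:expr),*) => { 0 $d(+ $d x)* };
        }
    };
}

// Hand the outer macro the `$` token it needs to build the inner repetition.
make_sum!($);

fn main() {
    assert_eq!(sum!(1, 2, 3), 6);
    assert_eq!(sum!(), 0);
}
```

Whether this pattern fits the test code in the crates listed above has not been checked here.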
