# `compiletest` Directive and Error Annotation Parsing Rework
`compiletest` directive parsing is a *mess* right now. Some problems I've noticed:
- It's extremely difficult to navigate through the directive parsing code.
- We use "headers" / "directives" interchangeably everywhere. Choose one and stick with it.
- Some directives only make sense / only work in certain test modes. This is not consistently enforced; sometimes unsupported directives **silently** do nothing.
- We have very minimal docs, and the docs we do have are often outdated. We should strive to do better.
- We don't document the grammar accepted by the directives. This causes chaos and confusion for the test writer and the reviewers.
- We have a ton of duplicate code that's scattered everywhere.
- Quite a few directives have various bugs related to parsing and validation.
- Error handling is very hard to follow: we kinda rely on a magic `poison: &mut bool` flag to signal that compiletest should fail.
- If a parsing "branch" fails, we silently fallthrough to the next one. If all parsing branches fail, tough luck, it gets silently ignored. This has a bug where if a directive is known but the parsing branch fails... then it's a silent fail that just gets ignored.
- Unknown-directive checking is very fragile right now: it relies on a hard-coded array of known directives. THIS IS NOT NICE. If someone wants to add a new directive, they have to remember to add it to this array. But this should fall out of the parsing branches automatically: if no parsing branch accepted the directive, then of course it's an unknown directive!
- Error reporting is a mess. It does not help the test writer diagnose where things went wrong. We should strive to do better.
- We should consistently report:
  - *Why* it failed: is the directive not supported in the test mode? Did parsing fail?
  - *Where* it failed: line number and file path!
  - *How* it can be fixed: can we give helpful suggestions to guide the user to solve the problem?
- Test writers should be treated like users of the compiler too. We strive for good error reporting to improve devex for compiler users; we should do the same for compiler contributors!
These problems not only hurt the development experience of compiler contributors, they also threaten the robustness of the compiler itself. If the test runner is unreliable, then how can we be sure the compiler is reliable?
Note that this is only a small part of compiletest. I want to tidy up and restructure compiletest to the point where we can start working towards experimentally integrating `ui_test` as the testing framework for running `ui` tests, and eventually drop compiletest in favor of `ui_test`. But at its current development pace, `ui_test` needs *years* of work before it reaches feature parity with compiletest and we can fully migrate (due to a lack of contributors and defined direction, despite oli's hard work). So while we still rely on compiletest for the compiler, we'd better make sure that the test runner itself is reliable...
## Key Concepts
- `compiletest` supports multiple test modes. Each **test mode** can correspond to multiple **test suites**. For example, the `ui` test mode corresponds to the {`ui`, `ui-fulldeps`} test suites.
- Each test ("root" file of a test, or in supported cases, auxiliary files as well) can have:
- **Directives**. Directives are instructions to `compiletest` in regards to how and when the test will be run.
- **Auxiliaries** (these can be e.g. supporting files that need to be compiled and linked in). What constitutes auxiliaries and how they are processed to support the root test file depends on the test mode and test suite.
- **Revisions**. Revisions are like multiple versions of the same test file. Directives can be revisioned such that they only affect a specific revision of the test.
- **Error annotations**. Error annotations check compiler output (in practice, the diagnostics emitted on stderr). Error annotations can be revisioned.
### Directives
- A directive might or might not make sense to revision. A directive can also apply to all revisions by omitting the revision qualifier.
- A directive might only make sense for particular test modes and/or test suites. We want to error if a user uses an unsupported directive in an unsupported test mode / test suite.
- A directive may take one of two forms (see the example below):
  - Name-only (`//@ <directive-name> [(:)?<comment>]`)
  - Name-value (`//@ <directive-name>: <value_1> ... <value_n>`)
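For example (these are real directives; the concrete values are just illustrative):
```rs
// Name-only directive:
//@ check-pass
// Name-value directives:
//@ edition: 2021
//@ revisions: foo bar
// Revisioned directive (only applied when building the `foo` revision):
//@[foo] compile-flags: --cfg special
```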
### Error Annotations
- Error annotations check stderr diagnostics (see the example below).
- They only make sense in a limited set of test modes.
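For reference, the annotation syntax: `//~` anchors the expectation to the same line, `//~^` to the line above, and `//[rev]~` restricts it to a single revision. An illustrative snippet:
```rs
//@ revisions: foo

fn main() {
    let x: u32 = "hi"; //~ ERROR mismatched types
    missing_fn();
    //~^ ERROR cannot find function
    let y = Bar; //[foo]~ ERROR cannot find value
}
```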
### FileCheck
- Handled separately by the codegen test mode
### Error Handling
We should rely on strong typing whenever possible, and separate error handling for directive parsing from error reporting. We should also try to recover as much as possible, i.e. try to report all the errors related to directive parsing *in one go* for all test files in a test suite (see the sketch below). This avoids having to "re-run" directive parsing after fixing each small batch of issues.
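To make "recover and report everything" concrete, here is a minimal sketch; the line scanning and the `ParseError` type are placeholders (not the real compiletest code), and the `DirectiveError` enum sketched later in this document would take the placeholder's place:
```rs
// Placeholder error type; the `DirectiveError` enum sketched later in this
// document would take its place.
struct ParseError {
    line_number: usize,
    message: String,
}

// Scan every `//@` line, collecting *all* problems instead of stopping at the
// first failure, so the test writer sees the full picture in a single run.
fn collect_directive_errors(source: &str) -> Vec<ParseError> {
    let mut errors = Vec::new();
    for (index, line) in source.lines().enumerate() {
        let Some(rest) = line.trim_start().strip_prefix("//@") else {
            continue;
        };
        if rest.trim().is_empty() {
            errors.push(ParseError {
                line_number: index + 1,
                message: "`//@` with no directive name".to_string(),
            });
        }
        // ... further per-directive checks would push more errors here,
        // rather than returning early.
    }
    errors
}
```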
### Error Reporting
Error reporting in `compiletest` should **not** rely on panicking. We should provide *informative*, *actionable* errors, with *help* or *notes* when applicable. Something like:
```text
error: unknown revision
 --> <path_to_test_file>:<line>:<col>
  |
2 | let foo = Foo; //[unknown_rev]~ ERROR undeclared
  |                   ^^^^^^^^^^^ undeclared revision
help: did you mean `known_rev`?
note: revisions can only be used if they're declared in a `revisions` directive, e.g. `//@ revisions: foo`
```
We could probably wrap `annotate-snippets` here?
## Directive Parsing, Validation and Test Properties Design
Probably two steps for each file:
1. Directive collection.
2. Error annotation collection (if applicable).
Example: ui test mode, "ui" test suite
```rs
//@ aux-bin: foo.rs
//@ revisions: foo bar

fn main() {
    let foo = Foo; //[bar]~ ERROR not found
    // error annotations ONLY make sense in stderr-checking modes
}

//@[foo] check-pass
//@ check-fail // <- directives do not have to be in the "header" (top of the file)
```
```rs
use std::collections::BTreeMap;

enum TestMode {
    Ui,
    Codegen,
}

// TestMode::Ui => ["ui", "ui-fulldeps"]
struct TestSuite(&'static str);

enum TestProps {
    Ui(UiTestProps),
    Codegen(CodegenTestProps),
}

struct UiTestProps {
    aux_bin: Option<String>,
    revisions: Vec<String>,
    revisioned_props: BTreeMap<String, RevisionedUiTestProps>,
}

struct RevisionedUiTestProps {
    expectation: Option<TestExpectation>,
}

struct CodegenTestProps {
    // ...
}

enum TestExpectation {
    CheckFail,
    BuildFail,
    RunFail,
}
```
This also allows validation after the test props have been constructed, though I'm not certain that's even necessary.
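As a rough sketch of such a post-construction pass (it assumes the `UiTestProps` shape above; the concrete checks and the plain-string errors are just illustrative, and the `DirectiveError` enum below would be the real error type):
```rs
use std::collections::BTreeSet;

fn validate_ui_props(props: &UiTestProps) -> Vec<String> {
    let mut errors = Vec::new();

    // Every revisioned directive must refer to a declared revision.
    for revision in props.revisioned_props.keys() {
        if !props.revisions.contains(revision) {
            errors.push(format!(
                "revision `{revision}` is not declared via `//@ revisions:`"
            ));
        }
    }

    // Revisions should be declared at most once.
    let mut seen = BTreeSet::new();
    for revision in &props.revisions {
        if !seen.insert(revision) {
            errors.push(format!("revision `{revision}` is declared more than once"));
        }
    }

    errors
}
```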
What do errors for a single file look like? Maybe:
```rs
// Each test file can generate a `Vec<DirectiveError>`.
// A test suite can generate a `BTreeMap<PathBuf, Vec<DirectiveError>>`.
// Note: the span isn't shared because not all errors need a `Span`.
// (`Span` here is some file-relative location type: line plus column range.)
enum DirectiveError {
    SuspiciousDirective {
        directive_name: &'static str,
        span: Span,
    },
    MissingValues {
        directive_name: &'static str,
        span: Span,
    },
    UnknownDirective {
        reason: String,
        // Maybe automatically derived via edit distance against known
        // directive names? (Trickier for directives with variable components
        // like `only-x86`.)
        suggestion: Option<String>,
        span: Span,
    },
    // Revisions come from the test file, so this is an owned `String`.
    DirectiveDoesNotSupportRevision {
        revision: String,
        span: Span,
    },
    MissingMandatoryDirective {
        required_directive: &'static str,
    },
}
```
These errors could then be converted into diagnostics and fed through an emitter to report the failures.
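A minimal sketch of that conversion, assuming the `DirectiveError` enum above and a hypothetical `Diagnostic` shape (message plus optional help); the span rendering itself could then be handed off to something like `annotate-snippets`:
```rs
// Hypothetical diagnostic shape; a real emitter would also render the span
// (source line, caret underline, help/note footers).
struct Diagnostic {
    message: String,
    help: Option<String>,
}

fn to_diagnostic(error: &DirectiveError) -> Diagnostic {
    match error {
        DirectiveError::UnknownDirective { reason, suggestion, .. } => Diagnostic {
            message: format!("unknown directive: {reason}"),
            help: suggestion.as_ref().map(|s| format!("did you mean `{s}`?")),
        },
        DirectiveError::MissingValues { directive_name, .. } => Diagnostic {
            message: format!("`{directive_name}` expects at least one value"),
            help: Some(format!("use `//@ {directive_name}: <value>`")),
        },
        DirectiveError::MissingMandatoryDirective { required_directive } => Diagnostic {
            message: format!("missing mandatory `{required_directive}` directive"),
            help: None,
        },
        // Remaining variants elided for brevity.
        _ => Diagnostic {
            message: "directive error".to_string(),
            help: None,
        },
    }
}
```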
Then parsing at the test file level could be:
```rs
fn parse_directives(
    test: PathBuf,
    mode: TestMode,
    suite: TestSuite,
) -> Result<TestProps, Vec<DirectiveError>> {
    todo!()
}

fn parse_error_annotations<'rev>(
    test: PathBuf,
    mode: TestMode,
    suite: TestSuite,
    known_revisions: &[&'rev str],
) -> Result<Vec<ErrorAnnotation>, Vec<ErrorAnnotationError>> {
    todo!()
}
```
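And a sketch of how a suite-level driver could tie this together, accumulating errors per file into the `BTreeMap<PathBuf, Vec<DirectiveError>>` mentioned earlier. This assumes the `parse_directives` signature and types sketched above, and additionally assumes `TestMode` / `TestSuite` derive `Copy`:
```rs
use std::collections::BTreeMap;
use std::path::PathBuf;

// Suite-level driver sketch; the file list would come from walking the
// suite's test directory.
fn check_suite(mode: TestMode, suite: TestSuite, files: Vec<PathBuf>) {
    let mut all_errors: BTreeMap<PathBuf, Vec<DirectiveError>> = BTreeMap::new();

    for file in files {
        match parse_directives(file.clone(), mode, suite) {
            Ok(_props) => {
                // Phase 2 (error annotation collection) would run here, using
                // the revisions declared by the parsed props.
            }
            Err(errors) => {
                // Keep going so that *all* files' errors are reported in one go.
                all_errors.insert(file, errors);
            }
        }
    }

    // Convert `all_errors` into diagnostics, emit them, and fail the run if
    // the map is non-empty.
}
```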