# Questions for T-lang re Rust spec

## History

### Ferrocene

T-spec has reviewed the [Ferrocene specification](https://spec.ferrocene.dev/). We have not been convinced that adopting that document would yield the outcome we want: a community-maintained document that answers questions from Rust users.

T-spec decided that before it could decide whether (and how) to adapt the Ferrocene specification, it needed a more concrete idea of what an idealized specification would look like.

### The sample chapter exercise

T-spec went through an exercise of drafting an idealized "sample chapter", using "arrays" as the agreed upon base topic for the chapter. This yielded the following sample documents:

* [Connor's array chapter](https://github.com/rust-lang/spec/blob/8476adc4a7a9327b356f4a0b19e5d6e069125571/spec/lang/exprs/array.md)
* [Joel's array chapter](https://github.com/JoelMarcey/rust-lang-spec/blob/5f76e25254a823ac8fc88c9d62256e8af2ba18f7/specification/arrays.md)
* [Mara's array chapter](https://htmlpreview.github.io/?https://gist.githubusercontent.com/m-ou-se/529e3db782a0ce8ecd5043cc0adfa4af/raw/0f982c83e5733a72ec0eff6c285a48d1f5e839e4/draft.html)

If you look at the three sample chapters above, you'll notice that they all spend some time describing the fragments of the grammar for array types, array-generating expressions, and array indexing expressions. Each sample describes, in its own words, the static constraints on the type and expression forms, and the dynamic behavior of the expression forms.

Each sample chapter made independent decisions about source file structure, encoding of specification elements, and presentation of the rendered product. The rendered product is the most important topic of feedback; we are not seeking T-lang's feedback on source file structure. (Nor are we seeking feedback on encoding of specification elements, apart from whether they are sufficiently expressive to meet our audience's needs.)
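As a rough illustration of the three forms those chapters cover (this sketch is not taken from any of the sample chapters, and the function names `listed`, `repeated`, and `element_at` are made up for the example), the following shows an array type, the two array-generating expressions, and an indexing expression, with comments marking which aspects are static constraints and which are dynamic behavior:

```rust
/// Array type `[T; N]`: an element type plus a constant length.
fn listed() -> [u8; 3] {
    // Array expression that lists every element.
    // Static constraint: the number of elements must match the length in the type.
    [10, 20, 30]
}

fn repeated() -> [u8; 4] {
    // Repeat expression `[operand; length]`.
    // Static constraint: with a length greater than one, the operand type
    // must be `Copy` (or the operand must be a constant item); `u8` is `Copy`.
    [7; 4]
}

fn element_at(values: [u8; 3], index: usize) -> u8 {
    // Indexing expression.
    // Dynamic behavior: the index is bounds-checked at run time and the
    // expression panics when it is out of range.
    values[index]
}

fn main() {
    assert_eq!(listed(), [10, 20, 30]);
    assert_eq!(repeated(), [7, 7, 7, 7]);
    assert_eq!(element_at(listed(), 1), 20);

    // An out-of-range index is a run-time panic, not a compile-time error,
    // because the index value is only known when the function is called.
    let out_of_range = std::panic::catch_unwind(|| element_at(listed(), 3));
    assert!(out_of_range.is_err());
}
```

Even this small example touches grammar, type checking, and run-time semantics, which is the mix of topics the sample chapters had to cover.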

Here are some high-level points we extracted from discussion of the sample chapters:

* Chapters on a specific concept (like "arrays") touch a lot of different topics ranging from grammar to static and dynamic semantics, from expressions to types, etc., making it hard to write and review and determine completeness.
* We think having space for non-normative notes (as in Mara's example) helps accessibility and allows the normative parts to be concise and precise.
* We think it will be useful for each point in the specification to be individually addressable via a stable name.
  * See for example Mara's sample, with labels like "array.type.syntax" or "array.repeat.copy".
  * These are meant to be usable for all versions of the document after those points are introduced. (Even if the relevant text is replaced by a later version, our aim is to still provide index entries that point from old labels to a list of relevant new ones.)
* Some readers may find it useful to know about language changes that corresponded to bug fixes.
  * See for example Mara's array.repeat.zero with an annotation describing a shift at 1.63.0.
  * But all such information should be provided as extra annotation, analogous to how Mara's sample offsets the examples and exposition in separate blocks that are not part of the formal specification itself.
* We think edition changes should be fully documented, but version changes only on a best effort (non-normative) level.

Since then, we have had further discussions about the different choices each sample chapter made. We questioned whether the chosen sample topic ("arrays") was a good representative of what an ideal spec would even have as a chapter topic: the chapter content jumped between parsing, static semantics, and dynamic semantics. It was not clear who the audience for a chapter written like this would be.

**Question: Looking at the three sample chapters, what do you like or dislike?
Is there anything that you think we should keep in mind (to do, or not to do) going forward?**

### Top-level Chapter Structure

Many language specifications seem to centralize the grammar of the language: each section introduces a new piece of grammar (e.g. "match expressions" or "use statements") and defines the full semantics related to that piece of grammar.

After our experience writing the sample chapter on "arrays", we had [a discussion][discussion] about what high-level chapter structure an ideal specification would have. In particular, Mara has argued for a different high-level structure, where each chapter more closely aligns with a different type of expertise (e.g. grammar+AST vs. type system vs. dynamic semantics vs. stdlib).

One of the advantages of such a structure is that it will allow for a more obvious coupling between high-level chapters and a corresponding *Rust team* (or at least individual developers) to help with production and review of the content relevant to that chapter. E.g. T-opsem could be expected to help with a dynamic semantics chapter.

The rough idea is that there is not a single section on "arrays" (with a small part about the memory layout), but instead there is a single section on "memory layout" (with a small part about arrays).

**Question: How does that sound? Is this a good idea?**

We are still drafting what that high-level chapter structure should look like. Here's an extracted outline of topics based on [a recent brainstorm meeting][brainstorm] (note that these are not considered complete, though we would want to know about gross omissions):

* Source code and Rust syntax tree (graph?)
  * Grammar, AST
  * Crates, modules, source files
  * Macro invocations
  * Macro expansion and conditional compilation
  * Name/Path resolution of (mod-level) items
* Static semantics
  * type checking
  * associated item resolution
  * existential (`impl Trait`) resolution
  * borrow checking
  * unsafe checking
  * const eval
  * type inference
* Dynamic Semantics
  * high level expression form
  * pattern matching and binding
  * dyn traits and dynamic method dispatch
  * memory layout and value representation
  * low level (MIR-like) statement form
  * memory model (borrowing; atomics)
  * ABIs and FFI linkage
* The Core library crate
  * builtin types' traits and methods
  * `core::*` items

[discussion]: https://rust-lang.zulipchat.com/#narrow/stream/399173-t-spec/topic/Structure.3A.20chapters/near/420245542
[brainstorm]: https://hackmd.io/gqNmzYyKRD-slKYSidwrsA

### Compilation phases

An important open question is to what extent phases of compilation (and their intermediate results) are relevant to the specification.

Can we define any phases of compilation that are inherent to the language rather than just implementation details? For example: tokenizing, parsing, macro expansion, static analysis and name resolution, monomorphization, const eval, codegen/execution. (Relevant for e.g. defining when `N` is checked in `[u8; N]`.)

If yes, then we can use these phases for the top-level chapter structure.

**Question: To what extent should *compilation phases* exist in the specification?**

### Intermediate representations

- For parsing/grammar, we will need to define **tokens**. (This seems uncontroversial.)
- For macros, it will be relevant to specify (the existence and at least some parts of) the **AST**. (E.g. to properly define interaction between $tt and $expr fragment specifiers.)
- For borrow checking and more, the spec might need to define some kind of "desugaring" and "lowering" to some simplified model/representation.
- For operational semantics (and const eval), the spec might need to define "some kind of MIR".

**Questions: Should the spec define "some kind of MIR" or "abstract machine"? How can we keep that minimal? What can we specify without one?**
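
To make the point about macros and the AST more concrete, here is a minimal sketch of how the `tt` and `expr` fragment specifiers interact. The macro names `match_tokens` and `capture_expr_then_match` are hypothetical and chosen only for this illustration. Once tokens have been captured as an `expr` fragment, they are forwarded as a single opaque syntax node, so a second macro can no longer match on the individual tokens:

```rust
// Matches directly on the shape of the token trees it receives.
macro_rules! match_tokens {
    ($a:tt + $b:tt) => { "addition" };
    (($i:ident)) => { "parenthesized identifier" };
    ($($other:tt)*) => { "something else" };
}

// Captures its input as an expression first, then forwards it.
// The forwarded `$e` is one opaque node, not the original tokens.
macro_rules! capture_expr_then_match {
    ($e:expr) => { match_tokens!($e) };
}

fn main() {
    // Passed as raw tokens, the structure is visible to `match_tokens`.
    assert_eq!(match_tokens!(3 + 6), "addition");
    assert_eq!(match_tokens!((caravan)), "parenthesized identifier");
    assert_eq!(match_tokens!(5), "something else");

    // After an `expr` capture, only the catch-all rule can match.
    assert_eq!(capture_expr_then_match!(3 + 6), "something else");
    assert_eq!(capture_expr_then_match!((caravan)), "something else");
    assert_eq!(capture_expr_then_match!(5), "something else");
}
```

Describing this behavior seems to require saying that an already-parsed expression exists as a distinct kind of thing during macro expansion, which is one reason the specification may need to acknowledge at least part of the AST.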