# Range Reform
###### tags: `Libs RFCs` `RFC Draft`
- Feature Name: (fill me in with a unique ident, `my_awesome_feature`)
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
# Summary
[summary]: #summary
This RFC proposes adding new versions of the range types, and changing the range syntax desugaring to use those types in the 2021 edition. By implementing `IntoIterator` rather than `Iterator`, the interfaces of the new types avoid some of the awkward drawbacks of the existing types.
# Motivation
[motivation]: #motivation
The Rust language's range syntax (`..`, `a..b`, `a..`, `..b`, `a..=b`, `..=b`) desugars into types defined in the `core::ops` module (`RangeFull`, `Range`, `RangeFrom`, `RangeTo`, `RangeInclusive`, and `RangeToInclusive` respectively). Of those types, `Range`, `RangeFrom`, and `RangeInclusive` implement the `Iterator` trait. This implementation imposes a couple of constraints:
* We do not allow iterator types to be `Copy`, as it can cause confusion when an implicit copy of an iterator is advanced rather than the original.
* While most of the range types are simple POD types with public `start` and/or `end` fields, `RangeInclusive` is not, as it has to keep track of an extra bit of state to ensure its `Iterator` implementation behaves correctly.
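The first constraint can be illustrated with a small sketch. The `CopyCounter` type below is hypothetical and deliberately made `Copy` (the standard range types are not) to show how an implicit copy gets advanced instead of the original:

```rust
/// Hypothetical iterator that is `Copy`, which the standard ranges avoid.
#[derive(Clone, Copy)]
struct CopyCounter {
    next: u32,
    end: u32,
}

impl Iterator for CopyCounter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.next < self.end {
            let value = self.next;
            self.next += 1;
            Some(value)
        } else {
            None
        }
    }
}

/// Takes two items, then collects "the rest".
fn first_two_then_rest() -> (Vec<u32>, Vec<u32>) {
    let it = CopyCounter { next: 0, end: 5 };
    // `take` consumes `self` by value. Because the type is `Copy`, a copy
    // is consumed here and `it` itself is never advanced.
    let first: Vec<u32> = it.take(2).collect();
    // This starts again from 0 rather than continuing from 2.
    let rest: Vec<u32> = it.collect();
    (first, rest)
}
```

A reader might expect `rest` to hold only the remaining items, but it holds all five, which is the kind of confusion the rule is meant to prevent.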
However, these types have many uses other than as iterators. For example, slicing syntax is generic over range types to easily pull different bits out of the sequence (`buf[..10]`, `buf[..]`, etc). Similarly, the `rand` crate's [`Rng::gen_range`](https://docs.rs/rand/0.8.3/rand/trait.Rng.html#method.gen_range) method takes a range argument to support selection from both half-open and inclusive ranges. In other cases, a developer implementing a data structure may want to store a `Range<usize>` field to represent the start and end of some region of the structure since it is more self-describing than just a pair of `usize`.
When working with ranges in these contexts, the `Iterator` implementation is superfluous, but the limitations it imposes on the types make them more awkward to work with and add unnecessary overhead. In the case of `gen_range`, `RangeInclusive`'s bounds can only be accessed via methods that return references. In the case of using a `Range<usize>` as a field, the field prevents the outer type from ever being `Copy`.
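Both limitations can be seen with the existing types. The following is a minimal sketch; `Span`, `span_len`, and `inclusive_width` are hypothetical names chosen for illustration:

```rust
use std::ops::{Range, RangeInclusive};

/// A region of some buffer. `Copy` cannot be derived here because
/// `Range<usize>` does not implement `Copy`.
#[derive(Clone, Debug, PartialEq)]
struct Span {
    bytes: Range<usize>,
}

/// `Range` exposes its bounds as public fields.
fn span_len(span: &Span) -> usize {
    span.bytes.end - span.bytes.start
}

/// `RangeInclusive` hides its fields; `start()` and `end()` return
/// references, so the bounds have to be dereferenced (or cloned).
fn inclusive_width(range: &RangeInclusive<u32>) -> u32 {
    *range.end() - *range.start() + 1
}
```

Adding `Copy` to the `derive` list on `Span` is rejected by the compiler today, even though the struct is logically just a pair of `usize` values.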
Fortunately, it's not actually necessary for the types to implement `Iterator` in the first place! The most common way to work with iterators is through a for loop:
```rust
for i in 0..10 {
    println!("iteration {}", i);
}
```
However, `for` loop syntax does not require the expression on the right-hand side of the `in` keyword to implement `Iterator` itself. It only requires the `IntoIterator` trait, which allows a type to be *converted* into an iterator without itself being an iterator.
If range syntax were added today, these types would clearly implement the `IntoIterator` trait, but they were initially added before that existed and things were unfortunately not updated after `IntoIterator`'s introduction. Luckily, we can use an edition boundary to fix this oversight.
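As a rough illustration of why `IntoIterator` is sufficient, a `for` loop behaves approximately like a call to `IntoIterator::into_iter` followed by repeated calls to `next`. The sketch below is a simplification of the real desugaring, and the function names are hypothetical:

```rust
/// Sums `0..n` with an ordinary `for` loop.
fn sum_with_for(n: u32) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += i;
    }
    total
}

/// Roughly what the loop above expands to: the expression after `in`
/// only needs to implement `IntoIterator`.
fn sum_desugared(n: u32) -> u32 {
    let mut iter = IntoIterator::into_iter(0..n);
    let mut total = 0;
    while let Some(i) = iter.next() {
        total += i;
    }
    total
}
```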
# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation
Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means:
- Introducing new named concepts.
- Explaining the feature largely in terms of examples.
- Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible.
- If applicable, provide sample error messages, deprecation warnings, or migration guidance.
- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers.
For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms.
# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation
For each of the iterable range types, a corresponding module will be added to `core::ops`. Taking `Range` as an example, the `core::ops::range` module will contain two types:
```rust
pub struct Range<T> {
    pub start: T,
    pub end: T,
}
pub struct IntoIter<T> {
    pub start: T,
    pub end: T,
}
```
The `IntoIter` type is just the original `core::ops::Range` type moved and renamed. It will continue to have exactly the same API and implement the same traits.
The new `Range` type will implement the same methods and traits as its original counterpart, with a few exceptions:
* It will implement `Copy` for `T: Copy`.
* It will not implement `Iterator`.
* It will implement `IntoIterator<Item = T, IntoIter = IntoIter<T>>`.
Notably, *both types* will implement the traits required to be used in places like slice indexing.
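The shape of the new type can be sketched in ordinary user code that compiles today. This is only an approximation under stated assumptions: the local `Range` stands in for the proposed `core::ops::range::Range`, the existing `core::ops::Range` plays the role of `IntoIter`, and the `where` bound is a workaround chosen here because the standard library's `Step` trait is unstable.

```rust
use core::ops;

/// Stand-in for the proposed range type: plain data, `Copy` when `T` is.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Range<T> {
    pub start: T,
    pub end: T,
}

/// Not an `Iterator` itself; it converts into the existing iterator type.
impl<T> IntoIterator for Range<T>
where
    // Assumption for this sketch: reuse whatever element types the
    // existing `core::ops::Range` can already iterate over.
    ops::Range<T>: Iterator<Item = T>,
{
    type Item = T;
    type IntoIter = ops::Range<T>;

    fn into_iter(self) -> Self::IntoIter {
        self.start..self.end
    }
}
```

Because the sketched type is `Copy`, the same value can be iterated more than once, and a struct holding it can itself derive `Copy`.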
Back in the `core::ops` module, an *edition specific type alias* will be defined for the name `Range`. When building a crate for the 2021 edition, the alias will map to `core::ops::range::Range`, and when building a crate for an earlier edition, the alias will map to `core::ops::range::IntoIter`. A prototype implementation of the compiler side of this exists in [rust-lang/rust#82489](https://github.com/rust-lang/rust/pull/82489):
```rust
#[rustc_per_edition]
pub type Range<T> = (
    range::IntoIter<T>, // in the 2015 edition
    range::IntoIter<T>, // in the 2018 edition
    range::Range<T>,    // in the 2021 edition
);
```
The same pattern is used for `RangeFrom` and `RangeInclusive`.
The module structure and naming conventions here match the pattern established by the `std::collections` module. There, for example, `HashMap` is defined as `std::collections::hash_map::HashMap` and reexported at the `std::collections` level, while its iterator types are named after their functionality and located in the `std::collections::hash_map` module.
The `RangeFull`, `RangeTo`, and `RangeToInclusive` types do not implement `Iterator`, and so can remain as they are.
## Rustfix
Rustfix uses a two-pass approach to update crates to a new edition. In the first pass, the code is updated to compile properly in *both* the 2018 and 2021 editions. In this case, that will involve:
* Any expression calling an `Iterator` method directly on a `Range`, `RangeFrom`, or `RangeInclusive` type will have an `.into_iter()` call inserted. Since all iterators already implement `IntoIterator<IntoIter = Self>`, this will compile on both editions.
* Any explicit use of the `core::ops::Range`, `core::ops::RangeFrom`, or `core::ops::RangeInclusive` types will be changed to `core::ops::range::IntoIter`, `core::ops::range_from::IntoIter`, or `core::ops::range_inclusive::IntoIter` respectively. While the names in the `core::ops` module change based on the edition, the names in the respective submodule will be fixed across editions.
* Any expression that passes a `Range`, `RangeFrom`, or `RangeInclusive` value to a method expecting either that exact type *or* a type implementing a certain trait will have an `.into_iter()` call inserted. As an exception, the compiler knows that the standard library APIs consuming range types (e.g. for slice indexing) work with both the old and new versions of the types, so calls to them can remain unchanged.
### Examples
```rust
let x = (0..10).collect::<Vec<_>>();
// converts to
let x = (0..10).into_iter().collect::<Vec<_>>();
```
```rust
pub struct Foo {
    range: Range<usize>,
}
// converts to
pub struct Foo {
    range: core::ops::range::IntoIter<usize>,
}
```
```rust
let x = rand::thread_rng().gen_range(0..=10);
// converts to
let x = rand::thread_rng().gen_range((0..=10).into_iter());
```
```rust
let y = &slice[..5];
// is not changed
```
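The first-pass rewrites above rely on `.into_iter()` being a no-op for types that are already iterators. A minimal sketch checking this against the existing range types follows; the function names are hypothetical:

```rust
/// Original form: calls an `Iterator` method directly on the range.
fn collect_direct() -> Vec<u32> {
    (0..10).collect()
}

/// Rewritten form: the inserted `.into_iter()` returns the range itself
/// today, since every iterator implements `IntoIterator<IntoIter = Self>`.
fn collect_via_into_iter() -> Vec<u32> {
    (0..10).into_iter().collect()
}

/// An API bounded on `IntoIterator` accepts either form unchanged.
fn total<I: IntoIterator<Item = u32>>(items: I) -> u32 {
    items.into_iter().sum()
}
```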
The second pass of rustfix changes code in ways that would stop it from compiling in the 2018 edition but more closely match the intended idioms of the 2021 edition. In this case, a potential change could be to switch explicit uses of the range types back from `IntoIter`s to the new definitions, but it may be difficult to ensure that the modified code will compile. In the worst case, this cleanup may be left to the developer.
Removing `.into_iter()` calls when passing a range type to a third party API will unfortunately be a manual action, since even if the compiler can check that the new range type implements the relevant trait bound, it cannot guarantee that the behavior of the implementations of those traits will be equivalent.
### Examples
```rust
pub struct Foo {
    range: core::ops::range::IntoIter<usize>,
}
// maybe converts to (??)
pub struct Foo {
    range: Range<usize>,
}
```
# Drawbacks
[drawbacks]: #drawbacks
Having multiple versions of the range types that differ very slightly will almost certainly cause some amount of confusion down the line. No matter what name we choose for these new types, it will almost certainly be worse than the original names.
It may be difficult to make `rustfix`'s update logic exactly precise, particularly in cases where a range expression is being passed to a method. However, hopefully we can make something "good enough" to catch the common cases, leaving a small number of errors that have to be manually fixed when migrating editions.
While crates are updating to the 2021 edition, there will be a transitional period in which awkward conversions between the range types are required to pass them from 2021 edition crates to APIs defined in 2015/2018 edition crates.
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives
To solve the `Copy` issue, @eddyb has proposed adding a `#[must_clone]` annotation. When applied to a `Copy` type, it would warn whenever the type is implicitly copied, telling users to call `.clone()` explicitly instead. This would then allow `Range`, `RangeFrom`, and `RangeInclusive` to implement `Copy` while avoiding the confusion around implicit copies during iteration.
However, this does not solve the other problems caused by the `Iterator` implementations. It also feels a bit unfortunate to say that a type is `Copy`, but that you shouldn't ever actually try to copy it! In the contexts where you aren't using the range type as an iterator at all, you wouldn't want to have to ever deal with manual `.clone()` calls.
# Unresolved questions
[unresolved-questions]: #unresolved-questions
N/A
# Future possibilities
[future-possibilities]: #future-possibilities
N/A