Meeting 2024-03-20
The `Range`, `RangeInclusive`, and `RangeFrom` types are generally considered to be flawed. RFC 3550 proposes resolving this by introducing three new types and changing the range syntax to resolve to these new types in Edition 2024.
However, this is an unprecedented kind of Edition change, and there are concerns that the implications for third-party libraries break the interoperability promise of the Edition process. Adding type inference for the range syntax could resolve those concerns.
The current iterable range types (`Range`, `RangeFrom`, `RangeInclusive`) implement `Iterator` directly. This is now widely considered to be a mistake, because it makes implementing `Copy` for those types hazardous due to how the two traits interact.
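The hazard can be illustrated with a minimal sketch. `CopyRange` and `remaining_after_loop` are hypothetical names invented for this example; they are not part of `std` or of the RFC. The sketch assumes a range-like type that implements both `Iterator` and `Copy`, which is exactly the combination the standard library avoids.

```rust
// Hypothetical type for illustration only; not part of std or the RFC.
#[derive(Clone, Copy)]
struct CopyRange {
    start: u32,
    end: u32,
}

impl Iterator for CopyRange {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.start < self.end {
            let value = self.start;
            self.start += 1;
            Some(value)
        } else {
            None
        }
    }
}

/// Returns how many items remain in `range` after a `for` loop has run over it.
fn remaining_after_loop() -> usize {
    let range = CopyRange { start: 0, end: 3 };
    // `for` takes the iterator by value. Because the type is `Copy`, the loop
    // advances a silent copy and `range` itself is left untouched.
    for _ in range {}
    range.count()
}

fn main() {
    // The loop consumed a copy, so all three items are still there.
    println!("{}", remaining_after_loop());
}
```

A reader could reasonably expect the loop to exhaust `range`; the silent copy is what makes the combination error-prone.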
However, there is considerable demand for `Copy` range types for multiple reasons:
- avoiding explicit `.clone()`s or rewriting the `a..b` syntax repeatedly
- use in `Copy` types (currently people work around this by using a tuple instead)

Another primary motivation is the extra size of `RangeInclusive`. It uses an extra `bool` field to keep track of when the upper bound has been yielded by the iterator. This results in a size 50% larger than necessary to just store the bounds, but this extra size is useless when the type is not used as an iterator.
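The size overhead can be checked on a given target with a small sketch like the one below. The helper names `inclusive_size` and `bounds_only_size` are made up for this example, and the exact byte counts depend on the target and on layout decisions that Rust does not guarantee, so the code only prints them rather than assuming particular values.

```rust
use std::mem::size_of;
use std::ops::RangeInclusive;

/// Size of the current `RangeInclusive<usize>`: both bounds plus iterator state.
fn inclusive_size() -> usize {
    size_of::<RangeInclusive<usize>>()
}

/// Size needed to store only the two bounds.
fn bounds_only_size() -> usize {
    size_of::<(usize, usize)>()
}

fn main() {
    println!("RangeInclusive<usize>: {} bytes", inclusive_size());
    println!("(usize, usize):        {} bytes", bounds_only_size());
}
```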
Introducing new range types and new types for their iterators would allow us to correct the `ExactSizeIterator` impls. Currently, the `ExactSizeIterator` impls for `Range` and `RangeInclusive` are incorrect on platforms with `usize` of less than 64 bits.
Finally, a new `RangeInclusive` iterator type could allow for performance optimizations that the current API does not (because it returns the bounds by reference).
RFC 3550 proposes to introduce three new range types:
// New range types in `std::ops::range` module
// (or maybe `std::range`)
pub struct Range<Idx> {
    pub start: Idx,
    pub end: Idx,
}
pub struct RangeInclusive<Idx> {
    pub start: Idx,
    pub end: Idx,
}
pub struct RangeFrom<Idx> {
    pub start: Idx,
}
Unlike the legacy range types, these new types will implement `Copy` and `IntoIterator` instead of `Iterator`.
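As a rough sketch of what "`Copy` plus `IntoIterator`" means in practice, the example below defines a stand-in type. `NewRange` and `sum_twice` are hypothetical names, and reusing `std::ops::Range` as the iterator is a shortcut taken here for brevity; the RFC instead talks about new dedicated iterator types.

```rust
// Hypothetical stand-in for the proposed range type; the name `NewRange`
// and the delegation to `std::ops::Range` are assumptions for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NewRange<Idx> {
    pub start: Idx,
    pub end: Idx,
}

impl IntoIterator for NewRange<usize> {
    type Item = usize;
    type IntoIter = std::ops::Range<usize>;

    fn into_iter(self) -> Self::IntoIter {
        // Build a fresh legacy range to act as the iterator.
        self.start..self.end
    }
}

/// Iterates the same range value twice, which works because it is `Copy`.
pub fn sum_twice(range: NewRange<usize>) -> (usize, usize) {
    let first: usize = range.into_iter().sum();
    let second: usize = range.into_iter().sum();
    (first, second)
}

fn main() {
    let r = NewRange { start: 1, end: 4 };
    println!("{:?}", sum_twice(r));
}
```

Because iteration state lives in the separate iterator value, copying the range never duplicates a half-consumed iterator.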
The range syntax would resolve to either these types or the legacy range types depending on edition:
Syntax | Edition 2021 and prior | Edition 2024 and later |
---|---|---|
`a..b` | `std::ops::Range` | `std::ops::range::Range` |
`a..=b` | `std::ops::RangeInclusive` | `std::ops::range::RangeInclusive` |
`a..` | `std::ops::RangeFrom` | `std::ops::range::RangeFrom` |
There are cases where existing APIs specify a legacy range type explicitly or accept `Iterator`s instead of `IntoIterator`s:
pub fn takes_range(range: std::ops::Range<usize>) { ... }
pub fn takes_iter(range: impl Iterator<Item = usize>) { ... }
impl Index<std::ops::Range<usize>> for Bar { ... }
And current code can use range syntax directly in those APIs:
takes_range(5..11);
takes_iter(5..11);
bar[1..8];
Such code will result in errors in Edition 2024, where range syntax will resolve to new distinct types that do not implement `Iterator`. To gracefully migrate code between editions, `cargo fix --edition` will add explicit conversions where necessary:
takes_range((5..11).to_legacy());
takes_iter((5..11).into_iter());
bar[(1..8).to_legacy()];
Existing code can be easily migrated using `cargo fix --edition`, but these explicit conversions are a significant ergonomic downgrade when writing new code using legacy APIs. The explicit conversions also add visual noise that hurts the readability of the migrated code.
Libraries facing these problems are encouraged to issue updates which change their API to accept the new range types in a backwards-compatible way:
// Before
pub fn takes_range(range: std::ops::Range<usize>) { ... }
pub fn takes_iter(range: impl Iterator<Item = usize>) { ... }
impl Index<std::ops::Range<usize>> for Bar { ... }
// After
pub fn takes_range(range: impl Into<range::legacy::Range<usize>>) { ... }
pub fn takes_iter(range: impl IntoIterator<Item = usize>) { ... }
impl Index<range::legacy::Range<usize>> for Bar { ... }
impl Index<range::Range<usize>> for Bar { ... }
This allows users of the library to upgrade to Edition 2024 without `cargo fix --edition` adding explicit conversions, and new users of the library can use the exact same syntax regardless of which edition they are using.
The RFC recommends that the new types be stabilized as soon as possible so library authors have time to implement these changes.
Although not the first to bring them up, Mara summed up the concerns well:
> So we end up in a world where library authors have to manually make their public API compatible with both the old and new edition, which very much goes against the idea of editions being a package/crate local decision.
> Basically, the edition change only works effortlessly if all libraries that take ranges in their API will update their API before the new edition. This is very unlike other edition changes.
> …
> Most importantly, I would like to avoid a world where library maintainers will have to deal with bug reports or feature requests like "please support 2024 ranges" or "please support legacy ranges". In the past, edition changes that resulted in extra work for some library maintainers were not received well, and I don't think we should repeat that mistake on even larger scale.
The RFC recommends only a mitigation path where the new types are stabilized well before the new edition is released, allowing third-party libraries to publish updates supporting the new types in time for the new edition.
The best alternative I've seen is to make range syntax similar to integer literals, where the concrete range type is inferred based on context. This would prevent the most prominent ergonomic drawbacks:
- fewer changes made by `cargo fix --edition` (possibly zero)
- `takes_range(0..5)` with `fn takes_range(range: std::ops::Range<usize>)`
- `custom_type[0..5]` where `custom_type: CustomType` and `CustomType: Index<std::ops::Range<usize>>`
- `takes_iter(0..5)` with `fn takes_iter(iter: impl Iterator<Item = usize>)`

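The integer-literal behavior that this alternative wants to imitate can be observed in today's Rust. The sketch below shows only the existing literal inference; the helper names `takes_u8` and `takes_i64` are invented for the example, and how closely range syntax could follow the same rules is the open question of this section.

```rust
use std::mem::size_of_val;

fn takes_u8(x: u8) -> usize {
    size_of_val(&x)
}

fn takes_i64(x: i64) -> usize {
    size_of_val(&x)
}

fn main() {
    // The same literal `5` takes a different concrete type in each call,
    // chosen from the expected parameter type.
    println!("{}", takes_u8(5));
    println!("{}", takes_i64(5));

    // With no expectation, an integer literal falls back to `i32`.
    let no_expectation = 5;
    println!("{}", size_of_val(&no_expectation));
}
```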
Please note that I am not an expert on Rust's type system, so the following is a best-effort attempt at outlining how "range type inference" might work.
{range<_>}
let lr: legacy::Range<_> = 0..5; // Uses legacy type
let nr: range::Range<_> = 0..5; // Uses new type
fn takes_new_range(r: range::RangeFrom<u8>) {}
takes_new_range(0..); // Uses new type
fn takes_legacy_range(r: legacy::RangeFrom<u8>) {}
takes_legacy_range(0..); // Uses legacy type
impl ops::Index<legacy::Range<usize>> for Thing { ... }
fn use_thing(t: Thing) {
    t[3..7]; // Uses legacy range
}
impl ops::Index<range::Range<usize>> for Stuff { ... }
fn use_stuff(s: Stuff) {
    s[3..7]; // Uses new range
}
We could choose to also apply this to method resolution, but that would probably not be worth it.
let no_expectation = 13..67; // Uses new range
ops::RangeBounds::contains(&no_expectation, &5); // (Both types impl `RangeBounds`)
impl ops::Index<legacy::Range<usize>> for Doohicky { ... }
impl ops::Index<range::Range<usize>> for Doohicky { ... }
fn use_doohicky(d: Doohicky) {
    d[3..7]; // Uses new range
}
Based on proposal from Niko documented here
I included only the `regressed` and `test-pass` categories in my analysis, excluding the 8251 `error`, 170 `spurious-fixed`, 246 `spurious-regressed`, 2 `fixed`, and 3 `unknown` results.
Total | Crates.io | Github |
---|---|---|
229,793 | 92,273 | 137,520 |
`a..b` (`Range`) | `a..=b` (`RangeInclusive`) | `a..` (`RangeFrom`) |
---|---|---|
1,312,526 | 159,341 | 284,161 |
`a..b` is used 8.2x more often than `a..=b` and 4.6x more often than `a..`.
Some form of range syntax was used in 43% of the sample crates (98,483).
`RangeBounds` usage in public APIs

Total | `fn` | trait impl | struct def | struct impl | trait def | enum def | enum impl |
---|---|---|---|---|---|---|---|
2,292 | 2,004 (87.4%) | 202 (8.8%) | 39 (1.7%) | 32 (1.4%) | 12 (0.52%) | 2 (0.09%) | 1 (0.04%) |
`RangeBounds` was used as a trait bound (including `impl RangeBounds`) in 0.25% of the sample crates (584).
Type | Total | `fn` | struct field | enum field | type alias | const item | static item |
---|---|---|---|---|---|---|---|
`Range` | 5,395 | 4,118 (76.3%) | 698 (12.9%) | 383 (7.1%) | 124 (2.3%) | 70 (1.3%) | 2 (0.04%) |
`RangeInclusive` | 1,207 | 901 (74.7%) | 92 (7.6%) | 91 (7.5%) | 19 (1.6%) | 103 (8.5%) | 1 (0.08%) |
`RangeFrom` | 882 | 864 (98.0%) | 3 (0.34%) | 10 (1.13%) | 4 (0.45%) | 1 (0.11%) | 0 |
Total | 7,484 | 5,883 (78.6%) | 793 (10.6%) | 484 (6.5%) | 147 (2.0%) | 174 (2.3%) | 3 (0.04%) |
Range types were used in public APIs (excluding trait impls) in 0.79% of the sample crates (1,826).
Total | `Range` | `RangeInclusive` | `RangeFrom` |
---|---|---|---|
3,647 | 1,893 (52%) | 687 (19%) | 1,067 (29%) |
Range types were used in public trait impls in 0.33% of the sample crates (766).
Note: it's likely the following are undercounted due to the data extraction being unable to see into macro invocations.
700 of these were impls for `ops::Index` or `ops::IndexMut` with one of the range types as the index type.
Another 741 were impls of one of the standard conversion traits (`From`, `Into`, `TryFrom`, `TryInto`).
Range types were explicitly used in any public API in 0.89% of sample crates (2,038). 554 crates have both (a) a public trait impl involving a range type and (b) some other API explicitly using a range type.
These constitute the cases where less ergonomic library interop could apply.
I ranked each affected crate by the number of crates that depend on it ("dependents"), because I thought that was a good analog for how often users in the wild will depend directly on such crates. Of the 1,449 total affected crates on crates.io, only 39 crates had more than 100 dependents, but those 39 had more than 93% of the total number of dependents.
The 39 crates with over 100 dependents:
Name | Latest Version | # of Dependents | Most Recent Release | Recent Downloads | Downloads of Latest Version | All Time Downloads |
---|---|---|---|---|---|---|
serde | 1.0.193 | 35043 | 2024-02-20 | 34268992 | 3741887 | 275397546 |
rand | 0.8.5 | 13441 | 2024-02-18 | 29254072 | 4560 | 291883321 |
regex | 1.10.2 | 8439 | 2024-01-21 | 28861662 | 7995426 | 227215737 |
bytes | 1.5.0 | 5423 | 2023-09-07 | 21418791 | 25385538 | 182917006 |
rayon | 1.8.0 | 3011 | 2024-02-27 | 13987087 | 761638 | 97231137 |
bincode | 2.0.0-rc.3 | 2496 | 2023-03-30 | 7437296 | 1188981 | 57011490 |
indexmap | 2.1.0 | 1785 | 2024-02-29 | 35930957 | 1229167 | 205879156 |
quickcheck | 1.0.3 | 1067 | 2021-01-15 | 1366240 | 7467603 | 15756539 |
schemars | 0.8.16 | 859 | 2023-11-11 | 3050968 | 2114567 | 14500776 |
ndarray | 0.15.6 | 847 | 2022-07-30 | 1359994 | 5154517 | 10641179 |
pest | 2.7.5 | 643 | 2024-03-02 | 7220140 | 234133 | 57121309 |
pyo3 | 0.20.0 | 551 | 2024-03-10 | 4158271 | 793 | 24956831 |
scale-info | 2.10.0 | 443 | 2024-03-12 | 1126482 | 3159 | 5761728 |
rustyline | 12.0.0 | 441 | 2024-03-06 | 1004925 | 3829 | 8560979 |
bitvec | 1.0.1 | 439 | 2022-07-10 | 6590366 | 21419740 | 43913058 |
rocket | 0.5.0 | 403 | 2023-11-17 | 394120 | 248754 | 4352752 |
miette | 5.10.0 | 376 | 2024-03-07 | 1702824 | 8128 | 8179648 |
tower-http | 0.5.0 | 356 | 2024-02-23 | 4727121 | 115135 | 28181117 |
arbitrary | 1.3.2 | 334 | 2023-10-30 | 2027869 | 1861244 | 10472715 |
wgpu | 0.17.2 | 331 | 2024-03-01 | 598911 | 32881 | 3921522 |
azure_core | 0.17.0 | 318 | 2024-01-05 | 237609 | 50471 | 1146701 |
tree-sitter | 0.20.10 | 308 | 2024-03-10 | 440632 | 870 | 3145906 |
egui | 0.24.1 | 284 | 2024-02-14 | 374680 | 39643 | 2031028 |
bstr | 1.8.0 | 276 | 2024-02-24 | 10631156 | 576274 | 77912405 |
logos | 0.13.0 | 204 | 2024-02-07 | 752791 | 6665 | 5139345 |
regress | 0.7.1 | 193 | 2024-02-26 | 664300 | 5246 | 2790599 |
rkyv | 0.7.42 | 182 | 2024-02-23 | 3165527 | 341 | 13007693 |
wiremock | 0.5.22 | 154 | 2024-02-11 | 1060791 | 49514 | 5201466 |
fancy-regex | 0.12.0 | 142 | 2023-12-22 | 3155625 | 114428 | 14967513 |
glium | 0.33.0 | 140 | 2024-01-03 | 119700 | 4949 | 1579022 |
codespan-reporting | 0.11.1 | 137 | 2021-02-25 | 3629090 | 24338635 | 25631055 |
object | 0.32.1 | 132 | 2024-03-11 | 15658574 | 685 | 107373131 |
sodiumoxide | 0.2.7 | 122 | 2021-06-24 | 285641 | 1558541 | 2751305 |
aho-corasick | 1.1.2 | 117 | 2023-10-09 | 29259349 | 25600773 | 214824438 |
amplify | 4.5.0 | 113 | 2024-02-15 | 50524 | 7289 | 308644 |
bitcoin_hashes | 0.13.0 | 111 | 2023-08-24 | 1040867 | 109601 | 5818899 |
smartstring | 1.0.1 | 107 | 2022-03-24 | 1648792 | 6565392 | 8816787 |
wasmer | 4.2.4 | 107 | 2024-03-04 | 297222 | 2706 | 3636787 |
similar | 2.3.0 | 105 | 2023-12-29 | 2917823 | 1033808 | 15768494 |
32 of the 39 (82%) have issued a release within the last 7 months.
I analyzed each crate to categorize exactly what kind of explicit `Range` usage they have.
- `serde` provides `Serialize` and `Deserialize` impls for the three range types.
- `rand` has a trait `SampleRange` with impls for `Range` and `RangeInclusive`. It also impls `From` conversions from those two types to `Uniform`.
- `regex` has the `Match::range` API that directly returns a `Range<usize>` and the corresponding `From` conversion impl.
- `bytes` has `Index` and `IndexMut` impls for indexing `UninitSlice` with the range types.
- `rayon` provides `IntoParallelIterator` impls for `Range` and `RangeInclusive`.
- `bincode` provides `Decode`, `BorrowDecode`, and `Encode` impls for `Range` and `RangeInclusive`.
- `indexmap` has `Index` and `IndexMut` impls for the range types for both `IndexMap` and `IndexSet`.
- `quickcheck` has a trait `Arbitrary` with impls for the range types.
- `schemars` has the `JsonSchema` trait with impls for `Range` and `RangeInclusive`.
- `ndarray` implements `From` conversions from range types to `Slice`.
- `pest` has `impl Index<Range<usize>> for Stack` and the function `ParserState::match_range` which takes a `Range<char>`.
- `pyo3` has `Index<Range*>` impls for `PyList`, `PyTuple`, and `PySequence`.
- `scale-info` provides `TypeInfo` impls for `Range` and `RangeInclusive`.
- `rustyline` has two functions that take `Range<usize>`: `LineBuffer::replace` and `LineBuffer::delete_range`.
- `bitvec` has `BitPtrRange::from_range` and the corresponding `From` impl, range `Index` and `IndexMut` impls for `BitSlice`, plus a few other APIs that return ranges.
- `rocket` has the function `Request::segments` that takes a `RangeFrom<usize>`.
- `miette` has a conversion `From<Range<usize>> for SourceSpan`.
- `tower-http` has `StatusInRangeAsFailures::new` which takes a `RangeInclusive<u16>`.
- `arbitrary` has a trait `Arbitrary` with impls for the range types.
- `wgpu` has 17 functions explicitly taking `Range`s, including `RenderEncoder::draw`.
- `azure_core` has its own `Range` type with `From` conversions for `Range` and `RangeFrom`.
- `tree-sitter` has the functions `set_point_range` and `set_byte_range` that take `Range`s on each of the following types: `QueryCursor`, `QueryMatches`, `QueryCaptures`.
- `egui` has a `Rangef` type with `From` conversions for `Range` and `RangeInclusive`, `History::new` which takes a `Range`, and struct `LayoutSection` which has a public `Range<usize>` field.
- `bstr` has `Index` and `IndexMut` impls for `BStr` with the range types.
- `logos` has the `Source` trait with functions `slice` and `slice_unchecked` that take `Range`s and the `Span` type alias of `Range<usize>`.
- `regress` has the `Range` type alias of `Range<usize>`.
- `rkyv` has the `Archive` trait with impls for the range types and `ArchivedRange*` types which impl `PartialEq<Range*>`.
- `wiremock` has a `Times` type with `From` conversions for ranges.
- `fancy-regex` has the `Match::range` API that directly returns a `Range<usize>` and the corresponding `From` conversion impl.
- `glium` has `TextureAnyMipmap::raw_upload_from_pixel_buffer(_inverted)` which takes `Range`s and the `DrawParameters::primitive_bounding_box` public field composed of `Range`s.
- `codespan-reporting` has a public `Range` field in the `Label` struct.
- `object` has the function `read_bytes_at_until` which takes a `Range` on three types: `ReadRef`, `ReadCache`, `ReadCacheRange`.
- `sodiumoxide` (Archived) implements range indexing for four types: `generichash::Digest`, `sha256::Digest`, `sha512::Digest`, `siphash24::Digest`.
- `aho-corasick` has a custom range type `Span` which implements `From` conversion to `Range` and `PartialEq` comparison with `Range`.
- `amplify` implements range indexing for the wrapper types `Array` and `Confined`.
- `bitcoin_hashes` implements range indexing for `Hmac`.
- `smartstring` implements range indexing for `SmartString`.
- `wasmer` has `MemoryView::copy_range_to_vec` which takes a `Range`.
- `similar` has 25 functions which take `Range`, including `capture_diff` and `DiffableStr::slice`.

Here is a breakdown of those categories:
Count | derive | trait bound | trait method | index | from | fn | field or alias |
---|---|---|---|---|---|---|---|
total occurrences | 22 | 4 | 9 | 67 | 15 | 63 | 5 |
affected crates | 6 | 1 | 4 | 10 | 8 | 13 | 5 |
- derive: `#[derive(serde::Serialize)]`
- trait bound: `fn gen_range(range: impl SampleRange)`
- trait method: `(0..11).into_par_iter()`
- index: `Index<Range*>` implementations
- from: `From<Range*>` implementations

I wanted to provide a good overview of what could be supported:
Option | derive | trait bound | trait method | index | from | fn | field or alias |
---|---|---|---|---|---|---|---|
Do Nothing / Reject RFC | OK | OK | OK | OK | OK | OK | OK |
Accept Current RFC | X | X | X | X | X | X | X |
Type Inference | X | OK | X | OK | X | OK | OK |
Copy Trait Impls | OK | OK | OK | OK | OK | X | X |
Note: "Copy Trait Impls" refers to a hypothetical option where a compiler hack automatically duplicates any third-party trait impls with the old range types to the new range types. I don't consider it a viable option, but included it as an example that would cover the cases that "Type Inference" would not.
I stand by the RFC as written: stabilize the new range types ASAP and encourage libraries to make changes to ergonomically support them. The changes libraries need to make are pretty straightforward, and most of the affected popular libraries are actively maintained.
Type inference alone would help significantly for cases like indexing, but it won't help cases like `serde::Serialize`.
I think we can provide enough mitigations to minimize the issue without the need for special handling at the language level:
While this will be burdensome on maintainers to a degree, I think most would agree that having better range types is worth it.
Finally, a quick reminder:
- `cargo fix --edition` will 100% cover any interop between old and new range types
- smaller `RangeInclusive`
- correct `ExactSizeIterator` impls
- potentially faster `RangeInclusive` iterator
- `Copy` for all range types