Itertools is a Rust crate that extends Rust's Iterators with more than 100 additional methods. The crate is at once one of Rust's oldest (in development since Jul 28, 2014), most depended upon (~4,300 direct dependants), and most downloaded (115+ million downloads).
The Itertools mission is two-fold:
[...] Iterator trait
This document addresses a persistent name-conflict issue between Itertools and the Rust standard library that undermines this mission.
The Itertools crate implements the extension trait design pattern. In this pattern, a library-defined trait is used to add additional methods to third-party code.
Concretely, Itertools extends the Iterator trait defined by the Rust standard library:
/* in `std::iter` */
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    /* additional methods */
}
It does so by defining a trait called Itertools that extends Iterator and is implemented for all types implementing Iterator:
/* in `itertools` */
// `Itertools` extends `Iterator`
trait Itertools: Iterator {
    // And provides methods, like `flatten`, that are
    // not available on `Iterator`.
    fn flatten(self) -> Flatten<Self>
    where
        Self: Sized,
        Self::Item: IntoIterator,
    {
        /* ... */
    }
}
// These methods are made available by providing a
// blanket implementation of `Itertools` for all things
// that implement `Iterator`.
impl<I> Itertools for I
where
    I: Iterator,
{}
The extension trait pattern used by Itertools falters if any other applicable trait defines a method with the same name. For instance, if Iterator were to also define a flatten method, Rust could no longer tell which of the two methods a .flatten() call refers to, and would begin to produce compilation errors at all .flatten() call sites.
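The ambiguity can be reproduced in miniature. The following is a minimal sketch, not Itertools code: `Upstream`, `Extension`, `Thing`, and `describe` are hypothetical names standing in for `Iterator`, `Itertools`, an iterator type, and a conflicting method such as `flatten`.

```rust
// Hypothetical stand-in for `std::iter::Iterator` after it gains a new method.
trait Upstream {
    fn describe(&self) -> String {
        "upstream".to_string()
    }
}

// Hypothetical stand-in for `Itertools`: an extension trait of `Upstream`
// that already offered a method with the same name.
trait Extension: Upstream {
    fn describe(&self) -> String {
        "extension".to_string()
    }
}

struct Thing;

impl Upstream for Thing {}

// Blanket implementation, mirroring `impl<I: Iterator> Itertools for I`.
impl<T: Upstream> Extension for T {}

fn call_sites() -> (String, String) {
    let t = Thing;
    // With both traits in scope, `t.describe()` is expected to be rejected
    // on stable Rust (at the time of writing) as ambiguous: error E0034,
    // "multiple applicable items in scope".
    // Naming the trait explicitly removes the ambiguity:
    (Upstream::describe(&t), Extension::describe(&t))
}
```

Here `call_sites()` returns `("upstream", "extension")`. Every method-call site has to be rewritten in this qualified form to compile, which is why a single new method on `Iterator` can break many downstream crates at once.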
In 2018, a PR was opened on rust-lang/rust to promote Itertools::flatten into the standard library's Iterator trait. Upon release, this (unstable) change caused rustc to produce a compilation error at all existing call sites of .flatten().
Three countermeasures were adopted, and by the time flatten was stabilized, only 11 crates were broken directly by the addition (and 26 crates, transitively).
Rust was modified to deprioritize the conflicting method and warn (rather than error) in cases where that method is unstable. This change creates a buffer period in which users (provided that they are using an up-to-date Rust) receive a warning that breakage is impending.
Itertools began to add 'free' function alternatives to its methods. Concerned or affected users are encouraged to use these functions instead of Itertools method calls, since these functions are unaffected by naming conflicts. These functions are generally less ergonomic than their method counterparts.
Itertools adopted a policy of discouraging new contributions that could conflict with future additions to Iterator.
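The free-function countermeasure can be sketched as follows. This is an illustration under stated assumptions rather than Itertools' actual code: `SketchTools` and `every_other` are hypothetical names invented for the example.

```rust
use std::iter::StepBy;

// Hypothetical extension trait standing in for `Itertools`.
trait SketchTools: Iterator {
    // Hypothetical adapter: yields the first item, then every second one.
    fn every_other(self) -> StepBy<Self>
    where
        Self: Sized,
    {
        self.step_by(2)
    }
}

impl<I: Iterator> SketchTools for I {}

// Free-function counterpart. It is called by path rather than through
// method resolution, so a later `Iterator::every_other` could not make
// calls to it ambiguous.
fn every_other<I: IntoIterator>(iterable: I) -> StepBy<I::IntoIter> {
    // The trait is named explicitly so this body is also conflict-proof.
    SketchTools::every_other(iterable.into_iter())
}

fn free_function_demo() -> Vec<i32> {
    every_other(vec![1, 2, 3, 4, 5]).collect()
}
```

The ergonomic cost shows up in chains: `every_other(xs.iter().map(f))` reads inside-out, whereas the method form `xs.iter().map(f).every_other()` reads left to right.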
In 2020, a PR was opened on rust-lang/rust to promote Itertools::intersperse into the standard library's Iterator trait.
This PR was merged and stabilized without substantial consideration of the breakage it could cause. The stabilization was reverted upon discovery that it broke 59 direct (and 538 indirect) dependents of Itertools.
Although this kind of breakage is expressly allowed by Rust's compatibility guarantees, the Rust Library Team takes, in practice, a more conservative position and tries to avoid breakage whenever possible. The stabilization of intersperse remains incomplete.
The persistent risk of naming conflicts between Iterator and Itertools undermines the maintenance of both of these projects; e.g.:
[...] Iterator for fear of causing widespread breakage.
[...] Iterator.
RFC2845: Supertrait Item Shadowing proposes altering Rust's method resolution to (mostly) eliminate this class of compiler errors. Specifically, Rust would instead treat same-named items in sub-traits as 'shadowing' their counterparts in super-traits. This RFC was accepted in September 2021, but has not yet been implemented.
If implemented, the Rust Library Team could freely promote items from Itertools into Iterator without causing breakage.
There would remain some risk that users would get 'stuck' on the sub-optimal implementations provided by Itertools (since standard library iterator adapters can make optimizations that crates cannot), but Itertools would mitigate this by periodically removing items that have been promoted into the standard library.
Itertools could prefix its methods with it_, making future conflicts with Iterator far less likely. This approach could be readily implemented by Itertools, but would cause a brief period of breakage (or, at least, myriad compiler warnings) for all users of Itertools.
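A minimal sketch of what a prefixed method could look like, assuming the `it_` prefix described above. `PrefixedTools` and `it_flatten` are hypothetical names for illustration, and the body simply forwards to the standard library rather than reflecting how Itertools would implement it.

```rust
use std::iter::Flatten;

// Hypothetical extension trait standing in for a renamed `Itertools`.
trait PrefixedTools: Iterator {
    // The `it_` prefix keeps this name distinct from `Iterator::flatten`,
    // so both methods can be in scope without ambiguity.
    fn it_flatten(self) -> Flatten<Self>
    where
        Self: Sized,
        Self::Item: IntoIterator,
    {
        // Forwarding to the standard adapter is an assumption made to keep
        // the sketch short.
        Iterator::flatten(self)
    }
}

impl<I: Iterator> PrefixedTools for I {}

fn prefixed_demo() -> (Vec<i32>, Vec<i32>) {
    let nested = vec![vec![1, 2], vec![3]];
    // Both call sites compile side by side.
    let via_prefixed: Vec<i32> = nested.clone().into_iter().it_flatten().collect();
    let via_std: Vec<i32> = nested.into_iter().flatten().collect();
    (via_prefixed, via_std)
}
```

Both halves of the returned tuple are `[1, 2, 3]`. The cost is on the caller's side: every existing `.flatten()`-style call that relied on the extension trait would need to be renamed.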
RFC2845 has every advantage over Rename Methods, except that it would surely take longer to release. However, given its advantages (namely: no breakage for Itertools users), we should at least investigate implementing it. I asked Michael Goulet (@compiler-errors) for his impression of its feasibility: a few (probably amendable) corner cases aside, he believes it would not be too difficult to implement.
If it proves to be too difficult, we can always Rename Methods.