Iterator Namepocalypse

Itertools is a Rust crate that extends Rust's Iterators with more than 100 additional methods. The crate is one of Rust's oldest (in development since Jul 28, 2014), most depended upon (~4,300 direct dependants), and most downloaded (115+ million downloads).

The Itertools mission is two-fold:

  • to make the experience of working with iterators more pleasant
  • to serve as a laboratory for future improvements to Rust's standard library Iterator trait

This document addresses a persistent name-conflict issue between Itertools and the Rust standard library that undermines this mission.

Technical Context

The Itertools crate implements the extension trait design pattern. In this pattern, a library-defined trait is used to add additional methods to third-party code.

Concretely, Itertools extends the Iterator trait defined by the Rust standard library:

/* in `std::iter` */

trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
    
    /* additional methods */
}

It does so by defining a trait called Itertools that extends Iterator and is implemented for all types implementing Iterator:

/* in `itertools` */

// `Itertools` extends `Iterator`
trait Itertools: Iterator {
    // And provides methods, like `flatten`, that are
    // not available on `Iterator`.
    fn flatten(self) -> Flatten<Self>
    where
        Self: Sized,
        Self::Item: IntoIterator
    {
        /* ... */
    }
}

// These methods are made available by providing a
// blanket-implementation of `Itertools` for all things
// that implement `Iterator`.
impl<I> Itertools for I
where
    I: Iterator
{}

Fatal Flaw

The extension trait pattern used by Itertools falters if any other applicable trait defines a method with the same name. For instance, if Iterator were to also define a flatten method, every .flatten() call site with Itertools in scope would become ambiguous, and Rust would begin to produce compilation errors there.
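
The conflict can be reproduced in miniature. The sketch below uses hypothetical stand-ins (`Base` for `Iterator`, `Ext` for `Itertools`, and a `describe` method that exists in neither), so it illustrates the shape of the problem rather than the real traits. An unqualified `thing.describe()` is the kind of call that is rejected as ambiguous under the behavior described above; fully qualified calls name the trait explicitly and are not affected.

```rust
// Hypothetical stand-in for `Iterator`.
trait Base {
    fn describe(&self) -> &'static str {
        "from Base"
    }
}

// Hypothetical stand-in for `Itertools`: an extension trait whose
// method happens to share a name with a method of its supertrait.
trait Ext: Base {
    fn describe(&self) -> &'static str {
        "from Ext"
    }
}

struct Thing;

impl Base for Thing {}

// Blanket implementation, as in the extension trait pattern.
impl<T: Base> Ext for T {}

fn describe_both() -> (&'static str, &'static str) {
    let thing = Thing;
    // An unqualified `thing.describe()` would have two candidates here
    // (`Base::describe` and `Ext::describe`). Naming the trait removes
    // the ambiguity.
    let from_base = <Thing as Base>::describe(&thing);
    let from_ext = <Thing as Ext>::describe(&thing);
    (from_base, from_ext)
}
```

Fully qualified syntax always works, but it gives up the method-chaining ergonomics that motivate an extension trait in the first place.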

Historical Context

In 2018, a PR was opened on rust-lang/rust to promote Itertools::flatten into the standard library's Iterator trait. Upon release, this (unstable) change caused rustc to produce a compilation error at all existing callsites of .flatten().

Three countermeasures were adopted, and by the time flatten was stabilized, only 11 crates were broken directly by the addition (and 26 crates, transitively).

Resolve unstable methods with lower priority and warn about future incompatibility.

Rust was modified to deprioritize the unstable method and emit a warning (rather than an error) in cases where the conflicting method is unstable.

This change creates a buffer period in which users (provided that they are using an up-to-date Rust) receive a warning that breakage is impending.

Add free-function alternatives to Itertools methods.

Itertools began to add 'free' function alternatives to its methods. Concerned or affected users are encouraged to use these functions instead of Itertools method calls, since these functions are unaffected by naming conflicts. These functions are generally less ergonomic than their method counterparts.
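
As a rough illustration of the trade-off, the sketch below pairs a method with a free-function counterpart. The names `JoinExt` and `join_with` are hypothetical and chosen only for this example; they are not taken from Itertools. The free function lives in the module namespace rather than in method resolution, so a future `Iterator` method with the same name could not make calls to it ambiguous.

```rust
use std::fmt::Display;

// Hypothetical extension trait standing in for `Itertools`.
trait JoinExt: Iterator {
    // Method form: pleasant to chain, but resolved by method lookup,
    // so it is exposed to name conflicts with other traits.
    fn join_with(self, sep: &str) -> String
    where
        Self: Sized,
        Self::Item: Display,
    {
        // Delegates to the free function defined below.
        join_with(self, sep)
    }
}

impl<I: Iterator> JoinExt for I {}

// Free-function form: called by path, so method resolution is not involved.
fn join_with<I>(iterable: I, sep: &str) -> String
where
    I: IntoIterator,
    I::Item: Display,
{
    let mut out = String::new();
    for (index, item) in iterable.into_iter().enumerate() {
        if index > 0 {
            out.push_str(sep);
        }
        out.push_str(&item.to_string());
    }
    out
}
```

With these definitions, `join_with(vec![1, 2, 3], ", ")` and `vec![1, 2, 3].into_iter().join_with(", ")` build the same string; the first form reads inside-out once several adapters are nested, which is the ergonomic cost mentioned above.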

Discourage potentially-conflicting new contributions to Itertools.

Itertools adopted a policy of discouraging new contributions that could conflict with future additions to Iterator.

Naming Conflicts Continue

In 2020, a PR was opened on rust-lang/rust to promote Itertools::intersperse into the standard library's Iterator trait.

This PR was merged and stabilized without substantial consideration of the breakage it could cause. The stabilization was reverted upon discovery that it broke 59 direct (and 538 indirect) dependents of Itertools.

Although this kind of breakage is expressly allowed by Rust's compatibility guarantees, the Rust Library Team takes, in practice, a more conservative position and tries to avoid breakage whenever possible. The stabilization of intersperse remains incomplete.

Corrective Action

The persistent risk of naming conflicts between Iterator and Itertools undermines the maintenance of both of these projects; e.g.:

  • The Rust Library Team is unable to stabilize much-desired features in Iterator for fear of causing widespread breakage.
  • The Itertools Team is unable to accept new useful contributions that could, plausibly, become future additions to Iterator.

Rust: Supertrait Item Shadowing

RFC2845: Supertrait Item Shadowing proposes altering Rust's method resolution to (mostly) eliminate this class of compiler errors. Specifically, Rust would instead treat same-named items in sub-traits as 'shadowing' their counterparts in super-traits. This RFC was accepted in September 2021, but has not yet been implemented.

If implemented, the Rust Library Team could freely promote items from Itertools into Iterator without causing breakage.

There would remain some risk that users would get 'stuck' on the sub-optimal implementations provided by Itertools (since standard library iterator adapters can make optimizations that crates cannot), but Itertools would mitigate this by periodically removing items that have been promoted into the standard library.

Itertools: Rename Methods

Itertools could prefix its methods with it_, making future conflicts with Iterator far less likely. This approach could be readily implemented by Itertools, but would cause a brief period of breakage (or, at least, myriad compiler warnings) for all users of Itertools.
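
A minimal sketch of what such a rename could look like follows. The trait name `PrefixedTools` is hypothetical, and keeping the old name as a deprecated alias is an assumption about how the migration might be staged (it would account for the compiler warnings mentioned above); none of this is taken from an actual Itertools change.

```rust
// Hypothetical extension trait; not the actual Itertools source.
trait PrefixedTools: Iterator {
    // New, prefixed name. A future `Iterator` method is unlikely to be
    // called `it_all_equal`, so this name is unlikely to conflict.
    fn it_all_equal(mut self) -> bool
    where
        Self: Sized,
        Self::Item: PartialEq,
    {
        match self.next() {
            // An empty iterator has no unequal elements.
            None => true,
            Some(first) => self.all(|item| item == first),
        }
    }

    // Old, unprefixed name kept as a deprecated alias (an assumption
    // about the migration path). Callers get a warning, not an error.
    #[deprecated(note = "renamed to `it_all_equal`")]
    fn all_equal(self) -> bool
    where
        Self: Sized,
        Self::Item: PartialEq,
    {
        self.it_all_equal()
    }
}

impl<I: Iterator> PrefixedTools for I {}
```

Under this sketch, existing callers of the old name would keep compiling with a deprecation warning until the alias is removed, at which point they would have to switch to the prefixed name.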

Next Steps

RFC2845 has every advantage over Rename Methods, except that it would surely take longer to release. However, given its advantages (namely: no breakage for Itertools users), we should at least investigate implementing it. I asked Michael Goulet (@compiler-errors) for his impression of its feasibility: a few (probably amendable) corner cases aside, he believes it would not be too difficult to implement.

If it proves to be too difficult, we can always Rename Methods.