# 2022-02-16 deep dive: async I/O traits

## Meeting document

https://github.com/nrc/portable-interoperable/blob/master/io-traits/README.md

## Nick's overview (delivered live)

* There's a shorter path
* There's a supertrait (`Ready`) designed to help with the key advantage of readiness, which is allocating buffers just when needed etc

## Questions

### Parity and async overloading

nikomatsakis: Are there reasons that we can't implement readiness in sync Rust?

nrc: Nope, you could do it, just doesn't make a lot of sense, but if you want symmetry, that would work.

pnkfelix: I thought `ready` was not supposed to block...?

nrc: No, it is supposed to *block until ready*.

eholk: Maybe we should rename it, like `wait_for_ready`.

nrc: Maybe. `ready` comes from Linux? epoll? something? It has connotations.

came up later:

nikomatsakis: you could use it to control total memory usage if you had a ton of threads, in the same way as in an async application

### Chaining, use cases

nikomatsakis: Do we have some sample use cases worked out? I'm particularly interested in wrappers and in e.g. modeling a TLS wrapper around some underlying stream.

nrc: I've played around with some simple use-cases, especially the edges of use-cases which are hard. For example, where you've got a socket you want to read+write to, which has come up a few times. I don't think I specifically did TLS.

nikomatsakis: mini-redis might be a nice "simple code example", TLS is interesting because reads require reads+writes.

nrc: I've played with good uses.

nikomatsakis: re: chaining, I just meant like a thing that "takes any Read+Write and returns a version that is buffered", and I was curious to see if the code would "just work" or if it would hit some weird problems around zero-copy.

nrc: I've tried some examples, you have this question of whether to do read/write directly or use ready. I'll play around with it.

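As a rough illustration of the chaining case nikomatsakis describes (a thing that "takes any Read+Write and returns a version that is buffered"), here is a minimal sketch using today's synchronous `std::io` traits as a stand-in for the async traits under discussion. The `Buffered` type and the `demo` function are hypothetical names made up for this example; nothing here is taken from the proposal itself.

```rust
use std::io::{self, BufRead, BufReader, Cursor, Read, Write};

/// Hypothetical wrapper (not from the proposal): buffers reads from any
/// `Read + Write` stream and forwards writes straight to the inner stream.
struct Buffered<S: Read + Write> {
    inner: BufReader<S>,
}

impl<S: Read + Write> Buffered<S> {
    fn new(stream: S) -> Self {
        Buffered { inner: BufReader::new(stream) }
    }
}

impl<S: Read + Write> Read for Buffered<S> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Copies bytes out of the internal buffer into the caller's buffer.
        self.inner.read(buf)
    }
}

impl<S: Read + Write> BufRead for Buffered<S> {
    fn fill_buf(&mut self) -> io::Result<&[u8]> {
        // Lends out the internal buffer instead of copying from it.
        self.inner.fill_buf()
    }

    fn consume(&mut self, amt: usize) {
        self.inner.consume(amt)
    }
}

impl<S: Read + Write> Write for Buffered<S> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Writes are unbuffered in this sketch; they go to the inner stream.
        self.inner.get_mut().write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.get_mut().flush()
    }
}

/// Usage: an in-memory `Cursor` plays the role of a socket.
fn demo() -> io::Result<Vec<u8>> {
    let mut stream = Buffered::new(Cursor::new(b"hello".to_vec()));
    let mut head = [0u8; 5];
    stream.read_exact(&mut head)?;
    stream.write_all(b" world")?;
    Ok(stream.inner.into_inner().into_inner())
}
```

In this sync sketch the wrapper composes without trouble, but `read` still copies into the caller's buffer; only `fill_buf` hands out the wrapper's own buffer. Whether the same shape carries over to the proposed async traits is exactly the open question in this section.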
### Completion and zero-copy

nikomatsakis: From the doc, "When working with completion-based systems, the traits should support zero-copy reads and writes" -- but the APIs, from what I can tell, always write into some buffer, which seems to me to imply a copy. That may be inevitable, but is there a plan to give some other API?

pnkfelix: perhaps the same Q but phrased differently: The first constraint, "When working with completion-based systems, the traits should support zero-copy reads and writes", to me implies that you *must* pass in a destination buffer at the outset (in order to avoid redundant copies), while the second constraint: "When working with readiness-based systems, the traits should not require access to buffers until IO is ready" implies that you *cannot* pass in a destination buffer at the outset. Are these two constraints fundamentally opposed? Or is there a workaround (e.g. by adding a level-of-indirection, e.g. a closure that produces a reference (or `Cow`?) to the destination buffer that is guaranteed to be called only when IO is ready...)

---

nrc: The problem with the read/write design is that it just doesn't work well with completion systems. One of the principles I landed on early on: if you are writing code that really wants to be optimal in the way it does reads/writes, then you're writing platform-specific code. You can get fast code that's cross-platform, but if you really want "true zero copy" code, you're going to be taking into account the platform you're targeting. Therefore, you can't write code optimal for both readiness + completion. Just doesn't work, they're competing. You have to choose, am I writing code that's optimal in terms of memory pressure and works on top of readiness systems... or writing code that's copying memory and works well on completion? There's no world where you get to have both. Read/write don't give you zero-copy for any system. Using completion, using `BufRead` or `OwnedRead` will get you zero-copy.

nikomatsakis: BufRead also copies out to an external buffer, right? Seems ok, but I don't know how it's different than `Read` and `Write` here.

nrc: Because of cancellation, you've got to have an internal buffer you copy into, but with BufRead, you read into the buffer, and you give a reference to that buffer.

/me nikomatsakis notices the `fill_buf` API. "Oohhh, I see"

### Nit: saving stack space

nikomatsakis: the document includes this example...

```rust
async fn read_stream(stream: &TcpStream) -> Result<()> {
    loop {
        stream.ready(Interest::READABLE).await?;

        let mut buf = [0; 1024];
        let bytes_read = continue_on_block!(stream.non_blocking_read(&mut buf)?);

        // we've read bytes_read bytes
        // do something with buf

        return Ok(());
    }
    Ok(())
}
```

...with the purported motivation to save stack space, but in fact, due to how async works, we are going to allocate all the stack space we might ever need up front, and therefore this 1024 will be allocated per task even when data is not ready. This motivates a few things I guess

* recommendation to use `Box::new`, or perhaps some easy way to say "allocate more stack space when you enter here"
* convenient APIs to lazily allocate
* etc

### Spurious readiness

eholk: Will `reader.ready(Interest::READ).await.write()` ever return true?

Answer: no

Related: would it be possible to have a stronger type for ready so we know which conditions we need to check on the result? Maybe something like this:

```rust
match stream.ready(Interest::READ) {
    Readiness::Read => { ... }
    // other cases are known statically to be impossible
}
```

Answer: want to be extensible, different platforms have other things you can be interested in, so matching could be trouble.

nikomatsakis: also, yagni, plus I can imagine generic code that wants to support many paths but let caller pick which ones (e.g., read/write).

### vectored reads

tmandry: The current Read trait uses `IoSliceMut` for vectored reads. Are there plans to deprecate this in favor of `ReadBuf`?

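For context on tmandry's question, this is roughly what a vectored read looks like with today's synchronous `std::io::Read` and `IoSliceMut`. It is a minimal sketch: the function name `read_header_and_body` and the 4-byte header / 8-byte body split are assumptions chosen for illustration, and the async traits under discussion may end up with a different shape.

```rust
use std::io::{self, IoSliceMut, Read};

/// Hypothetical helper: reads a fixed-size header and the start of a body
/// with a single vectored call, returning both buffers and the byte count.
fn read_header_and_body(mut src: impl Read) -> io::Result<([u8; 4], Vec<u8>, usize)> {
    let mut header = [0u8; 4];
    let mut body = vec![0u8; 8];
    let n = {
        // Each `IoSliceMut` wraps one destination buffer; the reader may
        // fill them in order within one call.
        let mut bufs = [IoSliceMut::new(&mut header), IoSliceMut::new(&mut body)];
        src.read_vectored(&mut bufs)?
    };
    Ok((header, body, n))
}
```

Note that the default `read_vectored` implementation only reads into the first non-empty buffer, so callers cannot assume every slice gets filled; readers such as `&[u8]` override it to fill the buffers in sequence.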
### Interest / Readiness

tmandry: Why `u32` and not `u8` or an enum? Maybe to answer my own question: Do we want to provide for including a number of bytes in the Interest?

nrc: That's a good point. I looked at that and readiness is opaque in the way that it is so that this can be added backwards compatibly, and possibly only on certain platforms. The idea would be, rather than readiness be an enum, you can say "is it read" but you might also be able to ask "how many bytes".

nikomatsakis: we don't generally commit to the *size* of structs, more just "Are they copy or not".

### Owned read vs BufRead

nikomatsakis: So:

* OwnedRead is for the caller giving up ownership of the buffer when calling read (and then getting it back)
* BufRead is for the trait allocating and managing the buffer and then copying the data out

Do these have to be distinct traits? They seem kind of similar.

But also: "(We can't use borrowed slices due to cancellation)." -- is that the only reason? i.e., if we didn't support cancellation, would we just not want `OwnedRead`? It seems like it might still be nice if resizing is potentially an option.

---

general discussion:

nikomatsakis: I'm thinking of how io-uring supports a lot of modes, maybe we can unify into something like that? It seems like they're there for a reason.

nrc: it is possible to make a trait, for sure, that supports a variety of modes, but those kinds of decisions are generally made at application level, and generics are used more in library code that isn't needing that level of control

### ReadBufVec?

Just checking: we have tracking issue https://github.com/rust-lang/rust/issues/78485 which I think covers `ReadBuf` ([RFC 2930](https://github.com/rust-lang/rfcs/pull/2930))

Is `ReadBufVec` formally proposed somewhere?

### Should we check in with the libs team?

:)

nikomatsakis: probably yes, I imagine we are still at the stage of deciding if this is the right overall shape (ready etc etc), and that the libs team may have good ideas for how to improve ergonomics and "fine tune" but that comes later.

### Simultaneous readers / writers

tmandry: I'd like to think more about the footguns here. Some systems restrict when you can clone underlying handles, so we would potentially be introducing new footguns on those.

nrc: there was this argument that if you had a split trait, you can create a read/write *where it's allowed*, but can't arbitrarily clone readers to get multiple readers on the same source. Others point out that this is a false sense of security, since you can ultimately clone the handle etc.

tmandry: seems system specific. Answer might be just "don't implement on reference types", but then I'm not sure what you would do in the stdlib.

nikomatsakis: I feel like you could do something .. based on ready ... I guess you'd need a mutex.

nikomatsakis: Not an either/or thing, right? Maybe for total complexity, but in principle we could add more later.

### general discussion

Discussing readiness vs completion:

nrc: readiness tends to be used in networking, they have more concerns about tons of outstanding connections and thus memory pressure.

nikomatsakis: io-uring isn't used for networking?

nrc: more for disk

tmandry: though sometimes for networking

nikomatsakis: because it lets you ask the kernel to do the allocation on your behalf

### nrc questions!

nrc: Are there any gaps, things I've totally missed?

nrc: I'd like to dig more into the `OwnedRead` vs `BufRead` question, whether we support both, etc

nrc: For `Seek`, where we should have a reader-writer API.
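
On the `OwnedRead` vs `BufRead` question nrc wants to dig into, the two buffer-ownership shapes can be contrasted in a small synchronous sketch. `OwnedReadSketch`, `read_owned`, and `first_line_len` are hypothetical names invented here, and the signature below is an assumption for illustration, not the `OwnedRead` definition from the meeting document. The `BufRead` half uses the real `std::io::BufRead` API (`fill_buf` / `consume`).

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical owned-buffer read (an assumption, not the proposal's trait):
/// the caller hands the buffer over and gets it back along with the result.
trait OwnedReadSketch {
    fn read_owned(&mut self, buf: Vec<u8>) -> (Vec<u8>, io::Result<usize>);
}

impl<R: Read> OwnedReadSketch for R {
    fn read_owned(&mut self, mut buf: Vec<u8>) -> (Vec<u8>, io::Result<usize>) {
        // A completion-based backend would pass `buf` to the OS here; this
        // sketch simply falls back to an ordinary blocking read.
        let res = self.read(&mut buf);
        (buf, res)
    }
}

/// `BufRead` style: the reader owns the buffer and lends out a slice.
/// Returns the number of bytes before the first newline in the buffer.
fn first_line_len(reader: &mut impl BufRead) -> io::Result<usize> {
    let available = reader.fill_buf()?;
    let len = available
        .iter()
        .position(|&b| b == b'\n')
        .unwrap_or(available.len());
    // Tell the reader how many bytes we are done with.
    reader.consume(len);
    Ok(len)
}
```

In the first shape the caller controls allocation and sizing of the buffer; in the second the reader does, and the caller only ever sees a borrowed slice.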