Libs Meetings
Minutes
Meeting Link: https://meet.jit.si/rust-libs-meeting-crxoz2at8hiccp7b3ixf89qgxfymlbwr
Attendees: Amanieu, David, JoshT, Mara, The 8472, Chris Denton, martinomburajr, Tomas Sedovic, TC
1 rust-lang/rfcs T-libs-api FCPs
12 rust-lang/rust T-libs-api FCPs
dtolnay (1), nikomatsakis (3), joshtriplett (9), the8472 (1), tmandry (1), compiler-errors (1), BurntSushi (9), scottmcm (2), Amanieu (4), m-ou-se (9), jackh726 (1)
Amanieu: On me. Need to write down the notes from the meeting we had last week.
Amanieu: What the author of the PR thought was happening is not what's actually happening. The docs are wrong, so things are confusing, and our behaviour right now is not consistent across platforms.
Amanieu: On Linux, 0 is ignored. On Windows it's set to the default stack size (1MB). On ESP-IDF it treats 0 as ??. We could make it so that if you set the stack size to 0, we set it to the platform default. It just feels like we're using this value of 0 instead of a proper switch to select the native stack size.
Josh: Environment variables are stringly typed, so we could say "we take a number, but you can also specify `default`".
The 8472: Considering it's not platform consistent, I'm not sure that talking about people's expectations makes much sense here.
Chris: It's definitely a number that sets the stack size above zero.
The 8472: We have no platform where 0 tries to set it to zero.
Amanieu: The doc for stack size says that the actual number can be higher if the platform specifies a minimum stack size.
Amanieu: Do we want to change the behaviour? Document something? Right now it says "if you set it to zero, it's going to round up to the platform minimum".
The 8472: That's what we do on Linux, but on Windows is it the minimum?
Josh: On Windows you can set it to a lower value, right?
people trying to figure out the minimum size on Windows
Amanieu: The minimum size on Windows is 1 page plus the guard page plus the stack guarantee. Technically we could calculate it manually, but in practice we don't want to do that.
Chris: For panic handling it also uses a large number.
Amanieu: I'm inclined to leave it unspecified.
Amanieu: Could we say: "setting the stack size below the platform minimum will result in platform-specific behaviour, which may be a minimum stack size or something else the platform defines"?
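The API under discussion is the stable `std::thread::Builder::stack_size`. A minimal sketch of how an explicit stack size is requested today (the exact treatment of very small or zero values is the platform-specific behaviour being debated above):

```rust
use std::thread;

// Request an explicit stack size for a spawned thread. What happens for
// very small values (or for 0 via the RUST_MIN_STACK env var) is
// platform-specific: Linux rounds up to PTHREAD_STACK_MIN, Windows uses
// its own minimum, etc.
fn spawn_with_stack_size() -> i32 {
    let handle = thread::Builder::new()
        .stack_size(64 * 1024) // request 64 KiB; the platform may round this up
        .spawn(|| 6 * 7)
        .expect("failed to spawn thread");
    handle.join().expect("thread panicked")
}

fn main() {
    println!("{}", spawn_with_stack_size());
}
```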
David: We have an accepted FCP from 12 months ago. This breaks the itertools code. We're waiting for shadowing.
TC: We've accepted on the lang side; I don't know any blockers for stabilization.
Amanieu: Adding `Iterator::contains` doesn't break anything.
David: Do you know why we had to change rust-analyzer?
Amanieu: In theory the rust-analyzer change is not needed if it's not enabling the `Iterator::contains` nightly feature.
David: I will investigate.
David: The signatures of itertools vs. this are different. It could be that the code in rust-analyzer will break when we stabilize this.
Amanieu: If it's just a warning this is acceptable. If it's a hard breakage, it'll be an issue.
TC: Looking at the code I can't imagine why it'd cause breakage on just nightly.
Amanieu looking at logs
Amanieu: "Error: a method of this name may be added to Rust in the future" – a future-compat warning. It's because rust-analyzer is denying warnings.
David: Maybe the fix is to allow "future-compat" warnings in rust-analyzer.
TC: I'd say this is ok since if they're denying warnings, they're asking for breakage.
Amanieu: Ok.
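The unstable `Iterator::contains` being stabilized is roughly equivalent to this stable formulation with `any`; the method-resolution overlap with itertools' own `contains` is what triggers the future-compat warning discussed above:

```rust
// Stable stand-in for the unstable `Iterator::contains`: check whether an
// iterator yields a given value. Once `Iterator::contains` is stable,
// `values.iter().contains(&3)` from itertools would resolve differently,
// which is what the future-compat lint warns about.
fn has_three(values: &[i32]) -> bool {
    values.iter().any(|&v| v == 3)
}

fn main() {
    assert!(has_three(&[1, 2, 3]));
    assert!(!has_three(&[4, 5]));
}
```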
(Ongoing discussion about specifying semantics.)
Josh: The FCP got derailed by the ongoing discussion on the lang side that's taking place asynchronously.
TC: I nominated it on the libs-api side based on the lang discussion, for awareness. The bulk of our sentiment was to treat it as an implementation quality issue of doing what Josh is suggesting here. On the lang side, we need to specify the behavior more precisely, and in particular, decide whether a no-op or unconditional abort is an acceptable implementation.
Amanieu: We could avoid this by documenting which instruction will be emitted.
Josh: That's already what the documentation says for some platforms. The main issue is more about what the specification would say about this instruction: which parts are guaranteed and which aren't.
TC: It says what instruction is produced currently but it doesn't guarantee what instruction will be produced.
Mara: After that, the lang discussion turned to why this function is safe.
Amanieu: A trapping specification is safe.
Mara: Yes but that gets us to having to specify this. Do we generate the code after the trap? Is it safe that the code can continue after the trap instruction?
Josh: This discussion is getting into what the behavior would be on the Rust abstract machine.
TC: The other thing that came up is that you want this instruction to emit a compiler fence, and therefore it has an observable effect. Meaning even if this turns into a no-op, you still want a compiler fence.
Mara: That's the same for every inline assembly.
Amanieu: With the default flags, yes.
TC: Conceivably, a workflow here is that you have some code, and you put a breakpoint in the middle (so you want the remaining code to still be emitted, and you're expecting it to be able to run through the breakpoint). Since it's going to emit that compiler fence, you'd want to pass a compiler flag that causes it to emit a no-op (with the compiler fence) to test and confirm the fence doesn't affect the behavior. Then you could turn off that flag and proceed with debugging.
Mara: If it's that sensitive, don't modify the program at all. Set the breakpoint from your debugger.
The 8472: You could set a static value that's exported. The assembly would read from it and skip the trap if it's set. That way you could configure the behaviour. Since the compiler can't look into the assembly, it won't know whether it could skip the code or not.
Mara: If you're working on a heisenbug you don't want to modify the program to observe it.
Josh: I'd expect to use this, for instance, in situations where the code runs for 20 minutes. In that case I'd put it in an `if`.
The 8472: Can we just make a no-op function named `__rust_breakpoint`, so the debugger can set a Rust entry breakpoint that doesn't modify code?
Mara: Yes, but this is about providing a way to insert the existing arch instruction without having to write inline assembly.
TC: The lang team is open to this as long as they figure out how to specify it and get a consensus on how exactly this will work.
Mara: Sounds like we need to wait for the spec discussion, then.
Josh: The breakpoint intrinsic doesn't use inline assembly; it uses LLVM's intrinsic. It's unclear if emitting an LLVM trap implies emitting a compiler fence.
Amanieu: It depends on what attributes you put on the call instruction. By default it does treat it as a compiler fence.
TC: We'll probably want to write that down in the documentation.
Mara: I don't think we have a definition of what a compiler fence means from a semantics (opsem) perspective yet.
Amanieu: This should mean what an FFI call means.
The 8472: The unsafe code guidelines issue has been resolved.
TC: We have a `std::sync::atomic::compiler_fence` that takes an `Ordering`.
Mara: I don't think that's the right thing to use.
Amanieu: It just treats it as an FFI call.
Chris: Do we need to do anything else in this meeting? This needs lang/spec to resolve it.
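For reference, the `compiler_fence` mentioned above is stable today. It only restricts compile-time reordering of memory accesses around it and emits no CPU instruction; the open question in the discussion is whether a breakpoint intrinsic should carry the same "treat this like an opaque FFI call" effect. A minimal sketch:

```rust
use std::sync::atomic::{compiler_fence, Ordering};

// `compiler_fence` prevents the compiler from reordering memory accesses
// across it; it compiles to no machine instruction of its own.
fn store_with_fence(slot: &mut i32, value: i32) -> i32 {
    *slot = value;
    // The compiler may not move the store above below this point.
    compiler_fence(Ordering::SeqCst);
    *slot
}

fn main() {
    let mut slot = 0;
    assert_eq!(store_with_fence(&mut slot, 7), 7);
}
```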
There is still a plan to make `&CStr` a thin pointer that does not know the string length.
Josh: Should this still be the plan? For many years we've carefully limited what we add to `CStr`/`CString`, but it's still the case that making them thin pointers will likely impose costs on many programs, invoking `strlen` many times. Would it make sense to have a `ThinCStr`?
Mara: It'd reduce the number of `strlen` calls in plenty of programs.
Mara: Regardless, I don't see any case where we want to use a raw pointer for this. Even if we don't make &CStr thin, we'd have another type to represent a thin C string pointer.
Amanieu: Does anyone remember if panic location is passed by value or by reference?
Mara: By reference I think.
Josh: I don't think this is a question of size. If it's a `CStr` it may require a `strlen` call.
Amanieu: It's initialized as a constant, including the length.
Mara: Today we don't use any strlen for Location. This PR adds calls to strlen for Location::file().
Amanieu: This PR also adds an API function that produces a `CStr`…
Josh: You're right, Mara. I'm not sure why this change was proposed if it's introducing a new strlen operation.
The 8472: The common use will be returning a Rust string anyway. Can we store the Rust string terminated with a null byte? And if you need a `CStr`, just hand out the pointer to the Rust string.
Mara: We're doing that anyway. I don't see the point of this PR. It uses a worse type (a raw pointer, which isn't even `Send` or `Sync`), and adds `strlen` calls for the regular (non-`CStr`) call.
The 8472: The argument here is that creating a `CStr` requires knowing a length.
Mara: But if we come from Rust we already know the length.
Amanieu: This doesn't introduce any new calls. But if we want to combine this with #117431, which actually changes… https://github.com/rust-lang/rust/pull/117431
Amanieu: This is always used in the slow path.
Mara: Even if we did make `Location` not store the length (and we don't make `&CStr` thin), it's still fine because the compiler can optimize away the `strlen` call if you don't use the length.
Josh: Sounds like there isn't even a strlen call because we know the length.
Amanieu: Agreed, looks like we'll reject it.
Mara to respond.
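As Mara and Amanieu note, `Location::file()` today returns a `&str` whose length is carried in the fat reference, so no `strlen` is ever needed. A small sketch of the existing stable API:

```rust
use std::panic::Location;

// `Location::caller()` gives the caller's source location; `file()` returns
// a `&str` whose length is known statically, so no strlen is involved.
#[track_caller]
fn where_am_i() -> &'static str {
    Location::caller().file()
}

fn main() {
    let file = where_am_i();
    assert!(!file.is_empty()); // length is available without any strlen
}
```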
waiting on lang, not libs.
Amanieu: This is waiting on lang.
TC: This is towards the top of the lang queue. It's for our awareness, and so we can review the great write-up by David.
The 8472: Didn't we reject something similar recently?
no one recalls that
Josh: Equivalent to what `then_some` does for `Option`.
Amanieu: The naming's bad because `then_err` shouldn't take a closure. Could be `.else` but `else` is a keyword.
Josh: If we want consistency: `bool::then` takes a closure and `bool::then_some` doesn't take a closure but a value. That would imply `.then_err` should take a value.
TC: The obvious but bad name is `.then_result`.
Mara: Seems a bit weird to have a function specifically for `Result<()>`.
Amanieu: Could we just add `ok_or` and `ok_or_else` to `bool`?
Mara: We could, but then you'd start treating `bool` as an `Option<()>`. Not entirely unfair, but meh.
Mara: I don't think there are that many use cases to justify adding this.
Amanieu: The motivating use cases link shows a lot.
Josh: Seems like a common pattern.
Mara: Interesting, that's more than I expected. I wouldn't object to adding this as unstable.
TC: Is there something we could add to concisely turn it into an `Option<()>` or `Result<(), ()>`?
Mara: We have that today with `then_some`, and then convert it to a `Result`.
TC: What I mean is, we already have all the needed combinators on `Option` and `Result`; is it enough to have a concise way to "lift" the `bool` into both an `Option` and a `Result` (and in the case of `Option`, do so more concisely than `.then_some(())`)?
Amanieu: The most concise possible thing would be `bool::opt`, and that still doesn't look great.
The 8472: I think what people want is method chaining and you can't do that with if-else. Postfix macros would help here.
Mara: I wouldn't object to adding this to `bool`.
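What the proposal wants can be expressed with today's stable combinators, by lifting the `bool` into an `Option` with `then_some` and then into a `Result` with `ok_or`:

```rust
// Today's stable spelling of the proposed bool -> Result conversion:
// `then_some(())` lifts the bool into Option<()>, `ok_or` attaches the error.
fn check(ok: bool) -> Result<(), &'static str> {
    ok.then_some(()).ok_or("condition failed")
}

fn main() {
    assert_eq!(check(true), Ok(()));
    assert_eq!(check(false), Err("condition failed"));
}
```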
The 8472: The author says feel free to close, as this can all be done with pointer and `read` methods.
Josh: It does seem like a mild improvement in terms of clarity, but if you don't want to add it and tell people to use pointer casts, then so be it.
The 8472: I wouldn't call it `from_raw_parts` – it's copying the value, not constructing it from a reference.
TC: What about just `from_raw`?
The 8472: They're proposing multiple methods. One is reading a value; another is providing an array from a pointer. That could be called `from_raw_parts`. Or `from_pointer`. Or `core::array::from_ptr`.
Josh: Should we close this since it can be done with pointer manipulation, or accept it?
Amanieu: "parts" being plural feels weird.
The 8472: We could name it differently. On the pointer type we could have a method that could read n values instead of a single one.
Amanieu: I think these methods should be on pointer types rather than on array types, if we want them at all.
The 8472: There are 2 proposals: The first is to read an array from a pointer. The other is constructing an array reference from a raw pointer. They could be named differently, have different signatures, live in different places.
Amanieu: It seems to me that this is just `.cast().read()` and ??
TC: Trying to figure out if this is safer.
The 8472: `cast` does type inference for you on both the type and the array length. This fixes the type.
TC: Yes, so it constrains it.
Chris: Could this just be a clippy lint that says if you want the cast, you should specify the type?
The 8472: There seems to be a lot of untyped casts in the compiler. If we add a lint, that would probably trigger a lot of work.
TC: This makes sense to me if we come up with good names. If these existed, would I reach for them? Yes, I probably would.
Josh: I probably wouldn't reach for these. This is just barely making it any nicer than the raw pointer cast.
discussion about whether this is to support converting between types too: the problem statement talks about `[T; N]` to `[U; N]` but the solution sketch only talks about a single type
Josh: We should ask about the discrepancy in the problem statement.
TC: The thing that makes sense to me is what's in the solution sketch, where it constrains the `T`.
Josh: The problem statement is the part that would be much more interesting to me.
The 8472: I'll ask for clarification.
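The `.cast().read()` pattern the proposal would wrap can be sketched as follows; the proposed constructor would mainly fix the target type instead of leaving it to inference:

```rust
// What a hypothetical `core::array::from_ptr`-style helper would wrap:
// reading a fixed-size array out of a raw pointer via `.cast().read()`.
fn read_array(ptr: *const u8) -> [u8; 4] {
    // SAFETY: the caller must guarantee `ptr` points to at least 4
    // readable, initialized bytes.
    unsafe { ptr.cast::<[u8; 4]>().read() }
}

fn main() {
    let data = [1u8, 2, 3, 4, 5];
    assert_eq!(read_array(data.as_ptr()), [1, 2, 3, 4]);
}
```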
Amanieu: It's unclear whether it's the job of the standard library to provide these. We don't want some generic `from` impl. Instead we'd like a safe version of the intrinsics, e.g. `from_pointer` taking a reference to an array. But at that point it becomes a high-level API on top of `std::arch`, and it's unclear if it's std's job to provide this.
Josh: I appreciate the point about us not wanting to provide a stable SIMD implementation. But in this particular case there are two ways to do this. You could use an unsafe operation, do the operation, and store the results back. But you should be able to do safe SIMD where you load, do an operation, and store. And if the only thing stopping you from doing that safely is the load and store, then it makes sense.
Amanieu: Load and store we already have stable, by using regular Rust moves.
Josh: If I have an array of bytes and I want to write a SIMD algorithm using only safe intrinsics, the only unsafe code I'm writing is turning a `[u8]` into a `u32`.
Amanieu: Agreed there's value, but I don't think it's the job of the standard library to provide that.
The 8472: Can we provide these on a platform-by-platform basis, only on platforms where it's unambiguous? Where there's no endianness confusion?
Josh: Is it actually ambiguous on ARM?
Josh: If we're only using the common ones that are safe to use?
Amanieu: That way you'd end up mirroring the entire load API from the intrinsics.
Josh: Yes, that's what I'd like.
Amanieu: I don't think that belongs in the standard library. There's a lot of potential API design that comes into play. I'd feel much more comfortable if this was an external crate that provides a wrapper over the base intrinsics in std::arch.
Josh: If this were created in the ecosystem, iterated on and settled, would you feel different about including it into the standard library?
Amanieu: I would feel different then.
Josh: I don't agree that this doesn't belong in the standard library but I agree the standard library is not the place to experiment on this.
Josh: Does it seem sensible to tell them to experiment on this in the ecosystem and then come back here?
Amanieu: Yes.
The 8472: That's what portable-simd is, isn't it?
Josh: This is a much smaller surface that may get in at some point, in a way that portable-simd likely never will. I'm suggesting implementing some specific load/store methods on standard library types, possibly via extension traits. The `u8` case is pretty trivial; it's the `u16`, `u32`, `u64` types where it can get trickier.
Josh will write this up.
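For scalar types, the safe "load" Josh is describing already exists as `u32::from_le_bytes` and friends; the missing piece is the equivalent for `std::arch` vector types. A small sketch of the existing safe path:

```rust
// Safe load of a u32 from a byte slice, with explicit endianness. No
// unsafe code is needed; the open question is the analogue for SIMD
// vector types in std::arch.
fn load_u32_le(bytes: &[u8]) -> Option<u32> {
    let arr: [u8; 4] = bytes.get(..4)?.try_into().ok()?;
    Some(u32::from_le_bytes(arr))
}

fn main() {
    assert_eq!(load_u32_le(&[0x78, 0x56, 0x34, 0x12]), Some(0x1234_5678));
    assert_eq!(load_u32_le(&[1, 2]), None); // too short
}
```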
Josh: Asks for a method that gives you the inner iterator back out. This seems reasonable. Not widely used, but it seems trivial to either make the field public or provide an `into_inner` method.
TC: Agreed.
Josh: Prefer the `into_inner` method.
TC: I feel adding `into_inner` is the right thing to do.
Josh: That way we don't have to make any guarantees about the exact layout.
TC: +1
Josh: The alternatives section mentions an `into_inner` downside for a `&mut` iterator. The proposal suggests `as_inner_mut`. No objections to this.
TC: I'd probably do both at once. Surprising we haven't had to do that anywhere else.
The 8472: `into_inner` is less problematic than `as_inner_mut`. If you do any transformation of the inner type, and it's reversible, we can undo it for `into_inner`. Not if we hand out mutable references.
Josh: If we're considering `as_inner_mut`, we may as well just make the field public. At that point you can do everything you want with it as if it were public.
TC: It's hard to imagine why making the field public would end up being a problem.
The 8472: Specializations may rely on having exclusive access to the inner type. `Rev` doesn't implement the `SourceIter` specialization trait, so that should be ok.
TC: What is that?
The 8472: It's used for in-place iteration. It gives the iteration temporary access to the inner type, and has assumptions about who can access the inner iterator. `StepBy` transforms the inner iterator and we can't return it afterwards. So for some adapters we have constraints that don't allow us to hand out the inner one.
Josh: How do we want to handle this?
The 8472: We can make it public, but it'll limit our future options to provide optimizations. So I'd prefer `into_inner`.
The 8472: Let's say you have a linked list. You could swap start and end and get a reverse. We could undo this for `into_inner` but not for `as_inner_mut`.
TC: Wondering when you'd actually need to call `into_inner` here? A mutable reference to an iterator is still an iterator. Can't you reverse that reference?
The 8472: The user wanted `Clone`, which doesn't work with a mutable reference. They do have a workaround, so this isn't necessary. A nice-to-have, maybe?
agreement on using `into_inner` rather than making the whole field public
The 8472 will reply.
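The agreed direction can be sketched with a tiny stand-in wrapper (`MyRev` is hypothetical; `Rev` in std does not have a stable `into_inner` yet). The point is that reversal is trivially undoable, so handing back the inner iterator is safe for future optimizations, unlike exposing a mutable field:

```rust
// Hypothetical stand-in for a `Rev::into_inner`-style method: an adapter
// that can undo its (trivial) transformation and return the inner iterator.
struct MyRev<I>(I);

impl<I: DoubleEndedIterator> MyRev<I> {
    // Reversal doesn't consume or transform the inner state, so handing
    // the iterator back is always possible.
    fn into_inner(self) -> I {
        self.0
    }
}

fn main() {
    let rev = MyRev(1..4);
    let inner: Vec<i32> = rev.into_inner().collect();
    assert_eq!(inner, vec![1, 2, 3]);
}
```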
The 8472: Do we allow these FCPs? They'd have to be separate extension traits for each OS.
Josh: Yeah.
Amanieu: The PR was closed; the FreeBSD maintainers think this is out of scope for the standard library.
Josh: Then we should take that as a clear indication that we shouldn't be doing that. We should close the FCP.
Amanieu: Yep.
The 8472 will comment on the FCP.
Josh: Effectively reading a single line from `Stdin` or `BufRead` and reading it into a `String` that's allocated for you. `read_line` is more efficient in a loop, but this can be more convenient for single input.
Chris: Seems to make more sense on `BufRead`, maybe?
Amanieu: Makes sense for both.
The 8472: `Stdin` is already buffered, I think?
Amanieu: Yes.
Josh: Seems like the correct API on the correct type. Do we actually want to provide this given that people might misuse it? We could lint on using in a loop.
The 8472: They also propose a different behaviour in handling newlines (including or not including `\n`).
Josh: They're claiming it's the same behaviour as the `lines` iterator.
The 8472: I was looking at `read_line`. So we already have two different things.
Josh: We could do the same behaviour as `.lines().next().transpose()`, but they're proposing that we could…
Josh: Inclination to add this but send a heads-up to the clippy folks for lints about using it in a loop and repeating allocations?
The 8472: Could we instead have a `read_line` on `String` that clears the string? You could say: `String::new().read_line(Stdio)`.
Josh: Is that additional complexity worth saving the allocation? Because at that point you only avoid calling `clear`.
The 8472: And you can write it on a single line.
Josh: The FCP makes the point that interactive programs are bottle-necked on the human. And so it makes sense there.
The 8472: Prompting a human is something that probably happens more in beginner programs, but as you're processing data you'll run into the efficiency issues.
Josh: shrug either one is an improvement – which one should we do?
Amanieu: Do we still need this if we add something like the proposed `inputln!` macro?
Josh: The input macro did actually have a solution for both "read it into my buffer" and "read it into an allocated string" cases.
Josh: We can close this, say `input_line` will provide this, and cc the folks working on `inputln`.
Josh will reply to this
The 8472: Note `inputln` was attempted several times and never made it.
Josh: True but there are folks that are actively pushing it. I don't think it's stalled because no one wants it, more because people don't have time.
closed with a message
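The single-line-read the proposal wants can already be spelled with `.lines().next()` on any `BufRead`; a sketch using an in-memory reader (`Stdin` works the same way through `.lock()`):

```rust
use std::io::BufRead;

// Read one line into a freshly allocated String. `.lines()` strips the
// trailing newline, matching the behaviour Josh refers to above.
fn read_one_line<R: BufRead>(reader: R) -> Option<String> {
    reader.lines().next().and_then(|r| r.ok())
}

fn main() {
    let input = std::io::Cursor::new("hello\nworld\n");
    assert_eq!(read_one_line(input), Some("hello".to_string()));
}
```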
Generated by fully-automatic-rust-libs-team-triage-meeting-agenda-generator