Extern types were accepted into the language as RFC 1861, but that RFC is not implementable as written because extern types do not have a known alignment. I am proposing RFC 3396, which changes the meaning of ?Sized and introduces a new trait (MetaSized) in order to fix this. Doing so requires an edition change, and some design questions remain unresolved.
Some types do not have a size known at runtime; these fall into three categories (that I know about):
CStr-like types - CStr has a known alignment (1) but its size can only be determined by iterating over the bytes.
In order to more easily discuss these types I will introduce some descriptors for the size and alignment of types:
u32, String, &[u8].
[u8] has a statically known alignment but its size can only be determined from the pointer metadata; dyn Debug's size and alignment are both obtained from the vtable in the pointer metadata.
CStr, which has a statically known alignment but its size can only be determined by iterating over its contents to find the position of the null byte. Note that these types are odd: for example, determining the size of a Mutex<CStr> requires taking a lock on the mutex.
"Dynamically aligned" (or "unknown aligned") types cannot be placed as the last field of a struct, as their offset cannot be determined without already having a pointer to the field. This is the main issue with the previous RFC. Because generic structs exist, I don't believe we can do something simpler like an explicit error for using these types in a struct (unless we accepted a post-monomorphisation error).
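As a rough illustration of where each kind of size comes from, here is a minimal sketch that uses only stable standard-library calls. The function names (static_size, slice_size, debug_size_and_align, cstr_size) are made up for this example and are not part of the proposal.

```rust
use std::ffi::CStr;
use std::fmt::Debug;
use std::mem::{align_of_val, size_of, size_of_val};

/// Size known at compile time: no value is needed to compute it.
pub fn static_size() -> usize {
    size_of::<u32>()
}

/// Size computed from the pointer metadata (the slice length).
pub fn slice_size(s: &[u8]) -> usize {
    size_of_val(s)
}

/// Size and alignment both read from the vtable in the pointer metadata.
pub fn debug_size_and_align(d: &dyn Debug) -> (usize, usize) {
    (size_of_val(d), align_of_val(d))
}

/// Size found by walking the contents until the nul terminator.
/// This manual scan stands in for what a thin-pointer CStr would need.
pub fn cstr_size(c: &CStr) -> usize {
    let ptr = c.as_ptr();
    let mut len = 0;
    // SAFETY: a CStr is always nul-terminated, so the scan stays in bounds.
    unsafe {
        while *ptr.add(len) != 0 {
            len += 1;
        }
    }
    len + 1 // include the nul byte itself
}
```

Only the last function has to look at the pointee's contents, which is why a type like Mutex<CStr> is awkward: reading the contents means taking the lock.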
The solution this RFC proposes is to add a MetaSized trait that means a type is metadata sized (and implies it's metadata aligned) and to relax ?Sized to mean a type has unknown size and alignment (rather than metadata size and alignment).
The lack of the Sized and MetaSized traits on a type prevents you from calling ptr::read, mem::size_of_val, etc., which are not meaningful for opaque types.
In the 2021 edition and earlier, these types cannot be used in generic contexts as T: Sized and T: ?Sized both imply that T has a computable size and alignment.
In the 2024 edition and later, T: ?Sized no longer implies any knowledge of the size and alignment, so opaque types can be used in generic contexts. If you require your generic type to have a computable size and alignment, you can use the bound T: ?Sized + MetaSized, which will enable you to store the type in a struct.
The automated tooling for migrating from the 2021 edition to the 2024 edition will replace ?Sized bounds with ?Sized + MetaSized bounds.
The RFC proposes adding extern types defined like so:
extern "C" {
    type Foo;
}
Foo is !Sized, !MetaSized, !Send, !Sync, !Freeze and is FFI safe. It cannot be included in a struct unless it is the single non-zero-sized field of a repr(transparent) struct.
I believe sorting out the syntax and precise semantics of extern types is secondary to the MetaSized trait, as they can be sorted out after the 2024 edition.
Box and Arc will require MetaSized bounds, which means that you'd have to write traits like this:
pub trait Trait {
    fn foo(self: Box<Self>) where Self: MetaSized;
    fn bar(self: Arc<Self>) where Self: MetaSized;
}
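To try this shape out on a stable compiler today, one option is a sketch with a local stand-in for the proposed trait. Everything below is an assumption made for illustration: the MetaSized here is an ordinary hand-implemented marker trait rather than the compiler-provided one, the String return types are added so the calls have something to observe, and Greeter is a hypothetical implementor.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the proposed `MetaSized` trait. The real
/// trait would be implemented by the compiler, not by hand.
pub trait MetaSized {}

pub trait Trait {
    fn foo(self: Box<Self>) -> String
    where
        Self: MetaSized;
    fn bar(self: Arc<Self>) -> String
    where
        Self: MetaSized;
}

/// Hypothetical implementor, used only to exercise the trait.
pub struct Greeter(pub String);

// Manual impl because the stand-in trait is not automatic.
impl MetaSized for Greeter {}

impl Trait for Greeter {
    fn foo(self: Box<Self>) -> String {
        format!("boxed {}", self.0)
    }

    fn bar(self: Arc<Self>) -> String {
        format!("shared {}", self.0)
    }
}
```

The where clauses sit on the individual methods, so the trait itself can still be implemented for types that do not satisfy the bound; only the Box and Arc receivers become unavailable for them.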
All Sized things are MetaSized but Sized doesn't semantically require MetaSized.
#[repr(align(n))] attribute onto opaque types to give them an alignment?
This would allow us to represent CStr properly but would necessitate splitting MetaSized and MetaAligned, as CStr is only "dynamically sized" but "statically aligned". (We may be able to get away with the Aligned trait.)
Attendees: Jack Rickard (Skepfyr), tmandry, Josh, scottmcm, TC
Minutes: TC
+ MetaSized.
MetaSized bound on Box so that it doesn't need to be mentioned as much?
pnkfelix: Why can't the compiler just assume that such types have an alignment of 1? It won't be able to construct the pointers to them itself anyway, right – as in, it has to accept the pointers from foreign code, so it seems to me like assuming the maximally conservative alignment (in terms of not knowing anything about what the foreign code might do) would work here, in terms of making it illegal to e.g. use those low-order bits to store a niche?
pnkfelix: Oh, I just read enough to see the bit about "using the extern type as the last field of a struct" … that … hmm.
Skepfyr: The other thing is that the compiler still needs to know that these types are different so that it can prevent you from putting them in structs.
pnkfelix: And we cannot just "treat extern types as special" because we need to be able to instantiate type parameters as instances of them, okay.
pnkfelix: We could get away with a post-monomorphization time error here though, right? Where we reject uses of all extern types (unless they otherwise indicate their alignment) as a struct field? (Indeed, the doc as written alludes to this…)
(in meeting…)
Skepfyr: These types have to be understood by the compiler to not be allowed in certain places. Post-mono errors are definitely a possibility here.
pnkfelix: I need to better understand the use-cases here to understand how the value compares to adding this complexity to the language.
pnkfelix: This adds a lot of mental overhead for users. Maybe?
Josh: It is almost the case that rather than a post-mono error you could prohibit its use in generic contexts entirely.
scottmcm: I'm not convinced by that. People will want byte-add on this kind of thing, for example.
Josh: Fair.
Josh: Clarifying… under what circumstances do you want the generic type rather than a pointer to it?
scottmcm: https://doc.rust-lang.org/nightly/std/primitive.pointer.html#method.byte_add has the generic type as the pointee, for example.
Skepfyr: …something like Swift.
Josh: If you had a generic type that included the pointer type, that wouldn't be a problem, but since we have types like Rc, we need to allow these in a generic context.
tmandry: I agree about the overhead and the high cost. All contexts where we have ?Sized today we don't want to accept these types. There was previous t-lang guidance to not add a new ?-bound. If this is something you only run into very rarely, maybe it's not so bad.
Skepfyr: The thing that worries me the most is blanket impls. There are probably some of those that people will want to add to these types.
tmandry: So you're saying that you'll see ?MetaSized a lot, e.g. on the blanket impls.
scottmcm: I'm curious how often these will end up in a box rather than being a reference where you aren't looking at the size at all. I don't know that references always care, but Box and Arc do.
Skepfyr: Yeah, I'm not sure. I don't know.
digama0: How do you differentiate between NonNull<T>, which can take a ?MetaSized, and Rc<T>, which can't, without a trait bound? Would a post-mono error even be able to get this right without hard-coding everything about these types?
Skepfyr: That's roughly what we're talking about. You need that or some kind of post-mono error.
scottmcm: I ponder things like &A: PartialEq<&B> https://doc.rust-lang.org/std/primitive.reference.html#impl-PartialEq<%26B>-for-%26A, which doesn't need MetaSized, and don't know if that's more or less common than generic-in-Box.
Skepfyr: There's also the fact that things like PartialEq will be implemented on references to the type since the types themselves don't have a size. So it's likely that the blanket impls will usually apply because people will have implemented on the references.
tmandry: scottmcm, clarification on your question…
scottmcm: If you have && to extern type, you'd want to still forward that to the underlying PartialEq. Once you know that it's a slice, you know it's MetaSized, but the forwarding didn't require that, so it forwarded the generic.
tmandry: We still use the impl, PartialEq for slice of T. And that impl knows how to get the size…
scottmcm: Once you're implementing PartialEq for a slice, you have .len on that slice. When you're implementing it on i32, you have the size.
TC: The document writes:
Because generic structs exist, I don't believe we can do something simpler like an explicit error for using these types in a struct (unless we accepted a post-monomorphisation error).
How bad is this exactly as compared with the alternative proposed?
scottmcm: Would we feel comfortable saying this is a layout post-mono error? Like we have existing [u8; 1<<49] post-mono errors on x64…
Skepfyr: This is just a marker type. So a post-mono error is definitely available. The only minor caveat: I have wondered whether this type should have methods on it, such as size_of_val and align_of_val. At the moment they are free functions so it would just work. But if you made them methods on the trait, that maybe makes a bit more sense.
tmandry: It definitely feels like a different category of post-mono error than the existing ones. It does feel more like the case we already have for Sized. So there's an argument that it would be inconsistent. But we do need to be pragmatic.
pnkfelix: I can see the value in having a trait that lets people who want to perform this reasoning do so. But I worry about people being forced to do that plumbing when a post-mono error would be fine for most cases. I haven't yet figured out whether the ?-bounds would help.
scottmcm: We'll definitely have a bound that includes it, since T: Thin is already RFC-accepted, but I don't know if that would meet everyone's needs.
tmandry: The set of use-cases for ?MetaSized is the subset of ?Sized, and so I could see a world where we just continue writing ?Sized everywhere… and we wait to decide whether it's MetaSized and we defer figuring that out to post-mono.
Josh: It sounds like a thing that we could do. But how often do we expect this to come up? Our expectation if we go with the proposal here is that… there's an ordering of how often each thing is.
scottmcm: The thing I'd emphasize is impls/functions, we'll have more types that are Sized rather than MetaSized.
For example, fn foo<T: ?Sized>(x: &T) { ... } might usually not actually care about MetaSized.
Josh: Does anyone feel like a huge number of things will not be Sized but will want to go in a struct?
Skepfyr: You do need the MetaSized bound to put it in a Box. You need the layout when dropping it.
scottmcm: I guess part of the problem with Box<T> is that it's Drop that needs the bound, so it's not something where we could do some kind of "you don't need to prove it just to mention the type" escape hatch? (Since Drop can't have more requirements than the type itself, IIRC.)
Josh: If it weren't an issue for Drop, then I'd agree, the obvious answer is that Box::new requires MetaSized but the existence of the type doesn't, but since we don't have linear types, we don't have a way to say that.
Skepfyr: You could do something like require that all ways to create a Box require MetaSized, and then not require MetaSized on Box itself, and that would fix the issue.
scottmcm: But that would be new magic, a layout that only Box can use?
scottmcm: I guess we can have a version of "get the layout" that panics if the type is not MetaSized, and then if something like Box or Arc used it, it would always be fine because the panic would always be unreachable.
tmandry: Maybe reasonable?
scottmcm: As a temperature check… if we had a way where this would be required for Box::new or Arc::new, but it wouldn't require the bound to use the Box or Arc, would that make people feel better about this?
tmandry: Not me. It's type-state as a pattern. And it might make the MetaSized trait less ubiquitous.
Josh: I'm honestly hesitant to do this as a post-mono error, because then anyone who does encounter it, it would be unique and special and different. Doing it as a bound is at least normal, even if advanced. We have a big fancy type system but then choose not to use it here? It's a bit weird. It seems like it would be optimizing for minimizing the number of times one writes MetaSized rather than optimizing for users of those interfaces.
pnkfelix: scottmcm was proposing implied bounds?
scottmcm: Josh is skeptical of the post-mono, so writing MetaSized would be good, so then the question is, can we not write this where not needed.
pnkfelix: Three things. The proposal as written here; if you're not ?Sized, you're ?MetaSized. MetaSized is a sort of normal trait. But it's implicit. Option 2: A new ?Sized bound. I could imagine a new world where ?Sized stays the same; it means MetaSized… Third option: Implied bounds of some form.
scottmcm: The option in the middle, ?Sized + MetaSized… if you call size_of_val, the compiler could tell you what to do.
tmandry: Presumably the compiler could tell you that you should have written one rather than the other.
scottmcm: One way around may be harder. If we've taught people ?Sized for "look, I want non-sized things", then the compiler suggesting + MetaSized seems easier than it being able to suggest "you meant ?MetaSized instead".
Josh: If we do a world with one, then that's a world in which extern types can't appear everywhere.
pnkfelix: Is that a bad world? You have to reason about each individual type.
Josh: The default should be that you ask for the things that you need. Most things are going to ask for Sized. How often will this really come up?
tmandry: That is the key question that we need to answer. I had assumed most things would be MetaSized.
Josh: What cases will likely come up in practice? How often will you have a case where you have to add + MetaSized for it to work, other than for smart pointers.
Josh: To be clear, I think it will be less than 10% in user code, but not in the standard library. There are more smart pointers there.
tmandry: The use-cases for this are when you want to allocate and when this needs to be the trailing field in a struct.
Josh: You could generalize it a bit to placement in structs.
Josh: We could do something implicit; if you put it in a field, obviously you require that it's MetaSized.
scottmcm: That would be the first trait bound we did that for; we do that for lifetime bounds.
Josh: We shouldn't do that in this proposal. But we should consider a follow-on proposal that would make this better.
scottmcm: Everything that today uses ?Sized on a type would need ?MetaSized? Unless it's holding a &T.
scottmcm: The first thing I think of for ?Sized is something like
fn foo<T: ?Sized + Ord>(x: &T, y: &T) { ... }
that almost certainly doesn't care about MetaSized.
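A minimal sketch of such a function, runnable on stable today. The body and the Ordering return type are assumptions added for illustration; the signature otherwise follows the one quoted above.

```rust
use std::cmp::Ordering;

/// Compares two values purely through references. Nothing in the body
/// asks for the size or alignment of `T`, so a `MetaSized` bound would
/// add nothing here.
pub fn foo<T: ?Sized + Ord>(x: &T, y: &T) -> Ordering {
    x.cmp(y)
}
```

The same function accepts str, slices, and ordinary sized types, because all it ever does is hand the references on to Ord::cmp.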
Skepfyr: What I'm interested in is, what needs to happen to progress this?
Josh: There are some factual questions that could be answered here. E.g. what fraction of bounds that now need ?Sized will need + MetaSized? 10%, 50%, 90%?
Skepfyr: I keep flipping back and forth.
tmandry: Is there something that we could do with crater?
Josh: Could we just categorize the things in the standard library.
Skepfyr: There actually aren't that many ?Sized in the standard library.
scottmcm: It might not be as bad as you think, because impl<B: ?Sized + ToOwned> Cow<'_, B> { for example is sort of a smart pointer, but doesn't use MetaSized.
scottmcm: Maybe the top libraries really should be exporting the MetaSized bounds. Maybe that's OK if most code doesn't have to say this most of the time.
tmandry: Going through the exercise is probably what has the most value.
scottmcm: There are just so many forwarding things like impl<T: fmt::Display + ?Sized> ToString for T { that don't care about MetaSized.
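The forwarding pattern can be sketched with a blanket impl of the same shape as the ToString one. The Describe trait and its describe method are hypothetical names invented for this example.

```rust
use std::fmt;

/// Hypothetical trait, used only to show the blanket-impl shape.
pub trait Describe {
    fn describe(&self) -> String;
}

// Same bound as the `ToString` blanket impl: the body only forwards
// `&self` to `Display`, so it never needs the size of `T`.
impl<T: fmt::Display + ?Sized> Describe for T {
    fn describe(&self) -> String {
        format!("<{}>", self)
    }
}
```

Because the impl only touches T through a reference, it applies equally to i32, str, and dyn Display.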
Josh: As an example, trait AsRef<T: ?Sized> doesn't need + MetaSized because it just gives you a reference back to T.
Skepfyr: From what I've seen, ?Sized just doesn't show up that much.
TC: Maybe the value of going through the exercise is to come up with a set of patterns. Even if we didn't have numbers, we could look at the patterns and build an intuition for how likely they are to come up.
Skepfyr: Happy to go through code and document this.
scottmcm: You said something really interesting; that in your early survey most code never says ?Sized.
Josh: That's why the post-mono error seemed like the wrong path.
scottmcm: The other thing that might be possible is maybe exploring what does a Box look like that maybe can get away without this requirement on the type so it doesn't need to be mentioned everywhere.
Josh: You're proposing that Box could call that, but then if it failed on drop, it would panic?
scottmcm: But it would be unreachable. But the point is that this should be explored as a way forward.
tmandry: Sketching it out as an alternative or future possibility would be good.
Josh: There's one other aspect worth considering. Putting a not-MetaSized thing into a Box with a different allocator might be OK.
scottmcm: Interesting. I was imagining a Box using a malloc/free allocator… It wouldn't be a type restriction that it's MetaSized at all.
Josh: Your allocator could be that deallocating calls a specific FFI function.
scottmcm: I like this as a principled reason for why Box itself does not require MetaSized.
Josh: The majority of allocators would, but this wouldn't.
tmandry: We probably don't need to design the full API…
Skepfyr: You'd get some of the bounds from the other generic.
Josh: Looking at the allocator trait, none of it passes around a size, it passes around a layout. There's something you could do here, where an FFIAllocator would know the Layout that works for that one size and doesn't let you use that Layout to allocate the type.
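The idea that deallocation can be a call to one specific foreign function, with no Layout involved, can be sketched on stable without the (unstable) allocator trait. This is not the FFIAllocator design discussed above, only a hand-rolled owner illustrating the same principle. All names here (Handle, handle_new, handle_free, ForeignBox, live) are hypothetical, and the "foreign" library is mocked in Rust so the example is self-contained.

```rust
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a foreign type whose real layout Rust does not know.
#[repr(C)]
pub struct Handle {
    _private: [u8; 0],
}

/// Count of live handles, maintained by the mock library below.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Current number of live handles.
pub fn live() -> usize {
    LIVE.load(Ordering::SeqCst)
}

/// Mock of a foreign constructor. A real library would be reached
/// through an `extern "C"` declaration instead.
pub fn handle_new() -> *mut Handle {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(0u64)).cast::<Handle>()
}

/// Mock of the matching foreign destructor.
///
/// # Safety
/// `ptr` must come from `handle_new` and must not be used afterwards.
pub unsafe fn handle_free(ptr: *mut Handle) {
    // SAFETY: `handle_new` produced this pointer from a `Box<u64>`.
    drop(unsafe { Box::from_raw(ptr.cast::<u64>()) });
    LIVE.fetch_sub(1, Ordering::SeqCst);
}

/// Owning pointer that never asks Rust for the pointee's size or alignment.
pub struct ForeignBox {
    ptr: NonNull<Handle>,
}

impl ForeignBox {
    pub fn new() -> Self {
        let ptr = NonNull::new(handle_new()).expect("mock allocation failed");
        ForeignBox { ptr }
    }
}

impl Drop for ForeignBox {
    fn drop(&mut self) {
        // SAFETY: `self.ptr` came from `handle_new` and is freed exactly once.
        unsafe { handle_free(self.ptr.as_ptr()) }
    }
}
```

Drop here needs nothing from the type system about Handle's layout, which is the property that would let a Box with such an allocator hold a non-MetaSized pointee.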
TC: We already use the term "opaque types" to mean something very specific in Rust. We probably shouldn't overload that word (@oli also raised this in Zulip).
tmandry: +1
Skepfyr: Right… I'll try to think of a synonym there.
Josh: Maybe we could just call these "extern types".
Skepfyr: That sounds good.
(Meeting ended here.)
Josh: Is there any circumstance where we need the length of such a type, other than at the user's explicit request for the length, or when moving it, or when dropping it? I want to make sure we never implicitly take the mutex behind the scenes.
Send or Sync?
Josh: Is it possible to explicitly mark extern types as being Send or Sync? Or mark pointers to them as being Send?
Skepfyr: Yes, I expect unsafe impl Send etc. to be supported like anything else. (This means that they can never be Freeze but can be any of the other auto traits.)
MetaSized is unnecessary?
Josh: The proposed edition migration turns ?Sized into ?Sized + MetaSized. However, in some cases such bounds won't actually require MetaSized. 1) Can rustc detect cases where a MetaSized bound appears unnecessary, and lint? 2) Can we somehow integrate that detection into the edition migration, and not add the bound when unnecessary?
Josh: Can you put a pointer to an extern type roughly anywhere? What about a reference, or mutable reference? Presumably you cannot create one unless you manually allocate memory?
scottmcm: I think the core thing you can do is that extern types are why there's a difference between https://doc.rust-lang.org/core/ptr/traitalias.Thin.html and Sized: you get to make *const ()-sized pointers and references to pointees that are not Sized.
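For context, the thin/wide distinction is already observable on stable for the pointee kinds that exist today. The sketch below only measures reference sizes; the function names are made up for this example, and it does not involve extern types themselves, which are not available on stable.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Size of a thin pointer: one machine word.
pub fn thin_ptr_size() -> usize {
    size_of::<*const ()>()
}

/// A reference to a Sized pointee carries no metadata.
pub fn ref_to_sized_size() -> usize {
    size_of::<&u32>()
}

/// A slice reference carries a length next to the address.
pub fn ref_to_slice_size() -> usize {
    size_of::<&[u8]>()
}

/// A trait-object reference carries a vtable pointer next to the address.
pub fn ref_to_dyn_size() -> usize {
    size_of::<&dyn Debug>()
}
```

Today every thin reference points at a Sized type; a reference to an extern type would be the first case that is thin while its pointee is not Sized.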
Josh: Could we implicitly have a reference be a thin reference, since there's no way to have a non-thin reference?
scottmcm: Can you elaborate on "implicitly" there? It needs to be based on something, right?
Sized?
tmandry: Could we leave the existing meaning of ?Sized as it is and only allow uses of extern types in places that have + ?MetaSized bounds?
(As I understand it, this proposal changes the meaning of ?Sized so that every existing bound has to be rewritten as + ?Sized + MetaSized. Is that right?)
Skepfyr: Yes, that's correct. This proposal was written with the previous lang team guidance of not adding any more ?Trait bounds.
scottmcm: see https://github.com/Skepfyr/rfcs/blob/extern-types-v2/text/3396-extern-types-v2.md#opt-out-trait-bound---metasized for that alternative in the proposed RFC.
In the 2021 edition and earlier, these types cannot be used in generic contexts as T: Sized and T: ?Sized both imply that T has a computable size and alignment.
pnkfelix: when you say "these types" here, which types are you referring to? opaque types alone? or some larger set?
Skepfyr: ?MetaSized types.