wilson@wilson-HP-Pavilion-Plus-Laptop-14-eh0xxx ~/CrustOfRust> neofetch --stdout
wilson@wilson-HP-Pavilion-Plus-Laptop-14-eh0xxx
-----------------------------------------------
OS: Ubuntu 22.04.3 LTS x86_64
Host: HP Pavilion Plus Laptop 14-eh0xxx
Kernel: 6.2.0-37-generic
Uptime: 22 mins
Packages: 2367 (dpkg), 11 (snap)
Shell: bash 5.1.16
Resolution: 2880x1800
DE: GNOME 42.9
WM: Mutter
WM Theme: Adwaita
Theme: Yaru-dark [GTK2/3]
Icons: Yaru [GTK2/3]
Terminal: gnome-terminal
CPU: 12th Gen Intel i5-12500H (16) @ 4.500GHz
GPU: Intel Alder Lake-P
Memory: 1894MiB / 15695MiB
wilson@wilson-HP-Pavilion-Plus-Laptop-14-eh0xxx ~/CrustOfRust> rustc --version
rustc 1.70.0 (90c541806 2023-05-31) (built from a source tarball)
In this episode of Crust of Rust, we go over the "drop check", another niche part of Rust that most people don't have to think about, but which rears its moderately attractive head occasionally when you use generic types in semi-weird ways. In particular, we explore how to implement a Norwegian version of Box (which is really just Box with a different name), and find that the straightforward implementation is not quite as flexible as the standard Box is due to the drop check. When we fix it, we then make it too flexible, and open the type up to undefined behavior. Which, in turn, we use the drop check to fix. Towards the end, we go through a particularly interesting example at the intersection of the drop check and variance in the form of (ab)using std::iter::Empty.
This stream tries to work out why the drop check exists, what it is for, when it bites back, and how to work around it.
For background, see: Drop Check
Start by creating the Rust project:
$ cargo new boks
$ cd boks
$ vim src/main.rs
Start with a first prototype of Boks:
// "Boks" is Norwegian for "box".
// T does not have to be a mutable reference; it can be any type, such as i32.
pub struct Boks<T>
{
p: *mut T,
// p is a raw pointer, the same type that Box::into_raw returns
}
impl<T> Boks<T> {
// "ny" is Norwegian for "new".
fn ny(t: T) -> Self
{
// Boks simply wraps a heap allocation created by Box.
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
Write a small test. The Box allocation is never freed, so this leaks memory:
fn main()
{
let x = 42;
let b = Boks::ny(x);
}
pub struct Boks<T>
{
p: *mut T,
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
}
Implement Drop for Boks to fix the memory leak:
impl<T> Drop for Boks<T>
{
fn drop(&mut self)
{
// We have to free the Box allocation ourselves.
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
// drop_in_place would only drop the T; it would not free the Box allocation:
// std::ptr::drop_in_place(self.p);
}
}
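The difference between leaking and freeing can be observed on stable Rust. The following is a minimal sketch that is not part of the stream: `Noisy` and `leak_then_free` are hypothetical names, and the counter is just one way to see that rebuilding the Box with `Box::from_raw` and dropping it runs the destructor of the pointee.

```rust
use std::cell::Cell;

// Hypothetical helper type: bumps a shared counter when it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns the drop count before and after rebuilding the Box from the raw pointer.
fn leak_then_free() -> (u32, u32) {
    let drops = Cell::new(0);
    // Box::into_raw gives up ownership: nothing frees this allocation for us.
    let p = Box::into_raw(Box::new(Noisy(&drops)));
    let before = drops.get();
    // SAFETY: p came from Box::into_raw just above and has not been freed.
    unsafe { drop(Box::from_raw(p)) };
    (before, drops.get())
}
```

If the `unsafe` line is removed, the counter stays at zero and the allocation is never returned to the allocator, which is the leak the Drop impl above is meant to prevent.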
pub struct Boks<T>
{
p: *mut T,
}
impl<T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
}
Implement Deref and DerefMut to give access to the inner value:
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// Dereferencing a raw pointer is unsafe.
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
// We can refer to Self::Target here because DerefMut is a subtrait of Deref.
// Anything that implements DerefMut also implements Deref,
// so the compiler knows that Self::Target is the associated type
// of DerefMut's parent trait, Deref.
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
Add a line to main to check that dereferencing works:
fn main()
{
let x = 42;
let b = Boks::ny(x);
+ println!("{:?}", *b);
}
pub struct Boks<T>
{
p: *mut T,
}
impl<T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
}
Q : Never wrote my own drop. Why unsafe?
impl<T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
A : The call to Box::from_raw is unsafe; the Drop implementation itself is not an unsafe impl.
Now let's see what limitations our Boks has. Change main to:
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
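// We expect this to compile, because the implicit drop of b at the end of the scope does nothing with y.
// With Box, that implicit drop does not cause a compile error;
// with our Boks, it does.
// drop(b); // happens implicitly here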
}
pub struct Boks<T>
{
p: *mut T,
}
impl<T> Drop for Boks<T>
{
fn drop(&mut self)
{
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
}
Compiling and running produces the following error:
$ cargo run
...
error[E0502]: cannot borrow `y` as immutable because it is also borrowed as mutable
--> src\main.rs:51:22
|
50 | let b = Boks::ny(&mut y);
| ------ mutable borrow occurs here
51 | println!("{:?}", y);
| ^ immutable borrow occurs here
52 | }
| - mutable borrow might be used here, when `b` is dropped and runs the `Drop` code for type `Boks`
...
Nothing stops drop from doing the following (although nobody would write a drop like this):
impl<T> Drop for Boks<T>
{
fn drop(&mut self)
{
+ let _: u8 = unsafe { std::ptr::read(self.p as *const u8) };
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
With that drop, the following main would be a problem:
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y); // read of y is not ok
y = 43;
// drop(b); // read from &mut y
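// Both y = 43 and the implicit drop(b) use y while it is mutably borrowed,
// so this must not compile.
// The compiler protects us with the drop check,
// which prevents drop from using the mutable reference while y is used elsewhere.
// The compiler needs to know whether it has to check anything the dropped value may contain.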
}
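The standard library's Box also implements Drop, because it has to free the underlying allocation. But somehow the compiler knows that Box's drop does not read what it contains; it only drops the value, unlike what we just did in Boks::drop. So the compiler accepts the following code: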
fn main()
{
let mut y = 42;
let b = Box::new(&mut y);
println!("{:?}", y);
}
For our type, however, the compiler does not know whether Boks reads its contents in drop. Nothing tells the compiler that the following code is fine:
fn main()
{
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
// drop(b);
}
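Specifically, nothing tells the compiler that Boks's drop implementation will not try to access the inner value.
To understand why, we need to look at some of the odder corners of the language and the compiler, in particular the mechanism called the drop check. When a variable goes out of scope or is otherwise dropped, the compiler needs to know whether to treat that drop as a use of anything contained in the type.
The compiler's rule is that if your type is generic over some type parameter, here T, and implements Drop, then dropping a value of that type is assumed to access T. In our case, because Boks<T> implements Drop, the compiler assumes that dropping a Boks<T> uses T: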
fn main()
{
let mut y = 42;
let b = Boks::ny(&mut y);
// The compiler assumes that Boks::drop() accesses &mut y,
println!("{:?}", y);
// so any intervening use of the target of &mut y is a conflicting use.
}
If we comment out the Drop impl for Boks, the compiler knows that dropping a Boks<T> cannot possibly access T, because Boks has no drop implementation. The program then compiles, but that is not what we want:
fn main()
{
let mut y = 42;
let b = Boks::ny(&mut y);
// The lifetime of b can end at this line,
// because there is no drop to extend it.
println!("{:?}", y);
// drop(b); // no implicit drop runs here,
// because we have temporarily commented out the Drop impl.
}
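The effect of a Drop impl on borrow checking can be reproduced on stable Rust without Boks. This is a minimal sketch with hypothetical names (`NoDrop`, `WithDrop`); the two wrappers are identical except that one implements Drop.

```rust
// Hypothetical wrappers: identical except that one implements Drop.
struct NoDrop<T>(T);
struct WithDrop<T>(T);

impl<T> Drop for WithDrop<T> {
    fn drop(&mut self) {}
}

fn without_drop_impl() -> i32 {
    let mut y = 42;
    let _b = NoDrop(&mut y);
    // NoDrop has no Drop impl, so the borrow of y can end right after the line above.
    y
}

fn with_drop_impl() -> i32 {
    let mut y = 42;
    let b = WithDrop(&mut y);
    // Dropping b explicitly ends the borrow before y is read.
    // Without this line, the read of y below is rejected by the borrow checker,
    // because the implicit drop at the end of the scope counts as a use of &mut y.
    drop(b);
    y
}
```

Even though `WithDrop::drop` has an empty body, the compiler only looks at the existence of the impl, not at what it does.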
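Commenting out the Drop impl shows that adding a Drop impl to a type is a breaking change, at least when the type is generic (for a non-generic type this particular issue does not arise).
Implementing Drop is backwards-incompatible in another way as well. If your type has a pub field and a user moves the value out of that field, then once your type implements Drop the compiler rejects that move: drop takes a mutable reference to self, that reference must point to a complete value, and a value behind a mutable reference cannot be partially moved out of.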
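The partial-move restriction can be seen in a small stable example. This is a sketch with a hypothetical type `Named`; `std::mem::take` is one common workaround and is not something the stream itself covers.

```rust
// Hypothetical type with a public field and a Drop impl.
pub struct Named {
    pub name: String,
}

impl Drop for Named {
    fn drop(&mut self) {}
}

fn take_name(mut n: Named) -> String {
    // let s = n.name; // rejected with E0509 because Named implements Drop
    // Swapping in an empty String leaves n complete, so drop(&mut n) can still run.
    std::mem::take(&mut n.name)
}
```

Removing the `impl Drop for Named` block makes the commented-out move legal again, which is why adding a Drop impl can break downstream code.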
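How do you tell the compiler that your drop does not access the inner type? On stable there is currently no way to do this, partly because the right mechanism has not been settled yet. There is a temporary workaround on nightly: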
$ rustup override set nightly
// This feature is perpetually unstable,
// pending a decision on the proper solution.
#![feature(dropck_eyepatch)]
// dropck_eyepatch lets us opt out of part of the drop check:
// it lets us shield specific type parameters from the drop check.
Change the code:
+#![feature(dropck_eyepatch)]
pub struct Boks<T>
{
p: *mut T,
}
-impl<T> Drop for Boks<T>
+unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
...
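This tells the compiler that although Boks contains a T and is generic over T, we promise that the code inside this unsafe Drop impl does not access T. It may still drop a T, since we make no promise that it won't.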
#![feature(dropck_eyepatch)]
pub struct Boks<T>
{
p: *mut T,
}
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
}
The code now compiles, but the program still has a problem.
Q : And "Access" here includes creating any kind of reference to the T, right?
A : Creating a mutable reference would be a problem. Mutable references must never alias, so creating one is illegal even if you never use it.
0:22:33
Q : uh so with this feature a trait can sometimes be safe and other times unsafe to implement? Or is that a general thing?
A : It is not general purpose, which is one of the reasons it is unsafe. You can think of the unsafe keyword here as applying not to the Drop trait but to the #[may_dangle] attribute.
0:23:33
Q : What if your container holds two T's, and only accesses one of them in drop()? It seems strange to me that we talk about "accesses a T" without being more specific about which inner T…
A :
Q : You are creating a Box<T> to the T, which like mutable references, may not alias, no?
A : We are not allowed to mutably alias T, and we were not doing so here.
...
unsafe { Box::from_raw(self.p) };
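// Box::from_raw creates a Box<T>,
// which means the T must not be aliased.
// The T is now owned and dropped, which is fine, because self.p is a raw pointer, not a reference.
// Even if another pointer points to the T, that is not another reference to it.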
...
Q : Is it possible to make T keep track of all the references and drops them when accessed?
A : I'm not sure what you are asking.
Q : Dont you still need to acess the T to it's drop?
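A : No, or rather, it depends on whether you access T or merely drop it. Take a mutable reference to T as an example: dropping a mutable reference is a no-op. It is an operation as far as the borrow checker is concerned, but it emits no instructions. The compiler never dereferences the mutable reference when dropping it, whereas accessing it may dereference it.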
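The claim that dropping a reference does nothing can be probed with `std::mem::needs_drop`. This is a small sketch, not from the stream; the function names are hypothetical. The documentation describes `needs_drop` as conservative (it may answer true for a type that has no drop glue), so the result for references reflects how current compilers behave rather than a documented guarantee.

```rust
use std::mem::needs_drop;

// A mutable reference has no drop glue: dropping it emits no code.
fn reference_needs_drop() -> bool {
    needs_drop::<&mut String>()
}

// An owned String has to free its heap buffer when dropped.
fn owned_needs_drop() -> bool {
    needs_drop::<String>()
}
```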
First define a new type:
use std::fmt::Debug;
// "oisann" roughly means "oopsie daisy".
struct Oisann<T: Debug>(T);
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
// Oisann accesses T when it is dropped.
println!("{:?}", self.0);
// The compiler already assumes that a Drop impl accesses T,
// and here that assumption is correct.
}
}
Test the new type:
fn main()
{
let mut z = 42;
let b = Boks::ny(Oisann(&mut z));
println!("{:?}", z);
}
#![feature(dropck_eyepatch)]
pub struct Boks<T>
{
p: *mut T,
}
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
use std::fmt::Debug;
struct Oisann<T: Debug>(T);
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
println!("{:?}", self.0);
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
let mut z = 42;
let b = Boks::ny(Oisann(&mut z));
println!("{:?}", z);
}
We expect this code to be rejected, yet it compiles and runs:
fn main()
{
...
let mut z = 42;
let b = Boks::ny(Oisann(&mut z));
// When we drop this Boks, we drop the Oisann.
// We do not access the Oisann ourselves, just as Boks's drop does not access the inner T,
// but Boks does drop the Oisann, and Oisann's drop does touch its inner T.
println!("{:?}", z); // shared (immutable) access
// drop(b);
// So when b is dropped here, its drop accesses &mut z,
// which is equivalent to dereferencing the &mut.
// We expect the borrow checker to notice that the println! above conflicts with this drop,
// but it does not.
}
With the standard library's Box, the problem is caught:
fn main()
{
...
let mut z = 42;
let b = Box::new(Oisann(&mut z));
println!("{:?}", z); // shared (immutable) access
}
We get the expected compile error:
error[E0502]: cannot borrow `z` as immutable because it is also borrowed as mutable
--> src\main.rs:72:22
|
71 | let b = Box::new(Oisann(&mut z));
| ------ mutable borrow occurs here
72 | println!("{:?}", z);
| ^ immutable borrow occurs here
73 | }
| - mutable borrow might be used here, when `b` is dropped and runs the destructor for type `Box<Oisann<&mut i32>>`
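Why does the drop check catch the problem with Box but not with Boks? With #[may_dangle] (enabled by #![feature(dropck_eyepatch)]) we told the compiler that T may dangle and promised that we do not access T, but we never told it whether we drop a T. With #[may_dangle] in place, the compiler simply assumes that we do nothing with T when dropping a Boks. That is not true: we do drop a T, but the compiler does not know it.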
pub struct Boks<T>
{
p1: *mut T, // A raw pointer does not own a T, so the compiler does not know that dropping Boks drops a T.
p2: T, // This field owns a T, so the compiler knows that dropping Boks drops p2.
}
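We have to tell the compiler that Boks's drop will drop the inner T. The way to do that is PhantomData.
PhantomData is a type that is generic over another type without containing it; it is the only type in Rust that works this way. A PhantomData<i32> has size zero. It does not hold an i32; in fact it holds nothing. But as far as the compiler and the type system are concerned, it holds an i32, which is exactly what we need here: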
#![feature(dropck_eyepatch)]
+use std::marker::PhantomData;
pub struct Boks<T>
{
p: *mut T,
+ _t: PhantomData<T>,
}
...
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
+ _t: PhantomData,
}
}
}
...
With PhantomData added, the program now fails to compile, as intended.
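The zero-size property of PhantomData can be checked directly on stable Rust. In this sketch, `WithMarker` is a hypothetical stand-in that has the same two fields as Boks.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical stand-in with the same fields as Boks.
#[allow(dead_code)]
struct WithMarker<T> {
    p: *mut T,
    _t: PhantomData<T>,
}

// PhantomData<i32> occupies no memory.
fn marker_size() -> usize {
    size_of::<PhantomData<i32>>()
}

// Adding the marker does not make the struct larger than the bare pointer.
fn marker_adds_nothing() -> bool {
    size_of::<WithMarker<i32>>() == size_of::<*mut i32>()
}
```

The marker only changes what the type checker believes about ownership; the layout of the struct is unaffected.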
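A short summary so far: #[may_dangle] tells the compiler that Boks does not access T when it is dropped, and PhantomData tells the compiler that Boks does drop a T, so the compiler must also check T's Drop impl to see whether it does the right thing.
Line 59 is left uncommented here to show the compiler checking T's Drop impl. The later sections cover other features, so their current-code listings all have Line 59 commented out.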
#![feature(dropck_eyepatch)]
use std::marker::PhantomData;
pub struct Boks<T>
{
p: *mut T,
_t: PhantomData<T>,
}
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
_t: PhantomData,
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
use std::fmt::Debug;
struct Oisann<T: Debug>(T);
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
println!("{:?}", self.0);
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
let mut z = 42;
let b = Boks::ny(Oisann(&mut z));
println!("{:?}", z);
}
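Q : Without the PhantomData<T>… "the only uses of T will be moves or drops" - nomicon
A : No, that is not correct. PhantomData<T> is how we tell the compiler that we may drop a T. Jon is not sure whether the nomicon documents this; he has seen it discussed in a nomicon PR or issue. If you do not use #[may_dangle], the PhantomData is unnecessary, because a use is stronger than a drop.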
Q : What if Oisann didn't touch the data in the drop? Would it compile?
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
- println!("{:?}", self.0);
}
}
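A : It still would not compile. You have not told the compiler that Oisann does not access T, so it still assumes that it does. You can use #[may_dangle] here as well to tell the compiler that Oisann does not access T: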
-impl<T: Debug> Drop for Oisann<T>
+unsafe impl<#[may_dangle] T: Debug> Drop for Oisann<T>
{
fn drop(&mut self) {}
}
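Alternatively, commenting out Oisann's Drop impl also makes the code compile. The compiler is then certain that dropping an Oisann cannot access T, because Oisann has no Drop impl at all: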
// impl<T: Debug> Drop for Oisann<T>
// {
// fn drop(&mut self)
// {
// println!("{:?}", self.0);
// }
// }
You cannot implement a PhantomData-like type yourself:
struct Phantom<T>;
Compilation fails with the following error:
error[E0392]: parameter `T` is never used
--> src\main.rs:63:16
|
63 | struct Phantom<T>;
| ^ unused parameter
// Cannot treat Boks<&'static str> as Boks<&'some_shorter_lifetime str>,
// even though we can treat &'static str as &'some_shorter_lifetime str
// and can treat Box<&'static str> as Box<&'some_shorter_lifetime str>.
Chat offered an example showing that Boks is invariant in T:
fn main()
{
...
let s = String::from("hei");
let mut boks1 = Boks::ny(&*s);
// the inner type of boks1 does not have a 'static lifetime
let boks2: Boks<&'static str> = Boks::ny("heisann");
// the inner type of boks2 has a 'static lifetime
boks1 = boks2;
}
#![feature(dropck_eyepatch)]
use std::marker::PhantomData;
pub struct Boks<T>
{
p: *mut T,
_t: PhantomData<T>,
}
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
p: Box::into_raw(Box::new(t)),
_t: PhantomData,
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p }
}
}
use std::fmt::Debug;
struct Oisann<T: Debug>(T);
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
// println!("{:?}", self.0);
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
let mut z = 42;
let b = Boks::ny(Oisann(&mut z));
println!("{:?}", z);
let s = String::from("hei");
let mut boks1 = Boks::ny(&*s);
let boks2: Boks<&'static str> = Boks::ny("heisann");
boks1 = boks2;
}
Compilation fails with the following error:
error[E0597]: `s` does not live long enough
--> src\main.rs:79:32
|
78 | let s = String::from("hei");
| - binding `s` declared here
79 | let mut boks1 = Boks::ny(&*s);
| ^ borrowed value does not live long enough
80 | let boks2: Boks<&'static str> = Boks::ny("heisann");
| ------------------ type annotation requires that `s` is borrowed for `'static`
...
83 | }
| - `s` dropped here while still borrowed
Swapping Boks for Box makes it compile, because Box is covariant in T:
fn main()
{
...
let s = String::from("hei");
let mut boks1 = Box::new(&*s);
let boks2: Box<&'static str> = Box::new("heisann");
boks1 = boks2;
}
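Variance can also be expressed as plain functions that only compile when the conversion is allowed. The sketch below uses hypothetical function names and standard types only, so it builds on stable.

```rust
// Compiles because Box<T> is covariant in T.
fn shorten_box<'a>(b: Box<&'static str>) -> Box<&'a str> {
    b
}

// Compiles because *const T is covariant in T.
fn shorten_const<'a>(p: *const &'static str) -> *const &'a str {
    p
}

// Does not compile: *mut T is invariant in T.
// fn shorten_mut<'a>(p: *mut &'static str) -> *mut &'a str { p }

// Same assignment as in the example above, using Box.
fn box_assignment() -> String {
    let s = String::from("hei");
    let mut boks1: Box<&str> = Box::new(&*s);
    let first = boks1.len();
    let boks2: Box<&'static str> = Box::new("heisann");
    boks1 = boks2;
    let out = format!("{} {}", first, boks1);
    out
}
```

Because Boks stores a `*mut T`, it inherits the invariance shown in the commented-out function, which is what the error above reports.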
NonNull solves this: Boks goes from invariant in T to covariant in T:
+use std::ptr::NonNull;
pub struct Boks<T>
{
- p: *mut T,
+ p: NonNull<T>,
_t: PhantomData<T>,
}
...
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
- unsafe { Box::from_raw(self.p) };
+ unsafe { Box::from_raw(self.p.as_mut()) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
- p: Box::into_raw(Box::new(t)),
+ // SAFETY: Box never creates a null pointer
+ p: unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(t))) },
_t: PhantomData,
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
- unsafe { &*self.p }
+ unsafe { &*self.p.as_ref() }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
- unsafe { &mut *self.p }
+ unsafe { &mut *self.p.as_mut() }
}
}
...
NonNull<T> : like *mut T, but non-zero and covariant.
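The covariance gained from NonNull can be tried without nightly. The following is a sketch under one stated assumption: it uses a plain `impl<T> Drop` instead of `#[may_dangle]`, so it shows only the variance aspect and not the relaxed drop check. `shorten` and `covariance_demo` are hypothetical helper names.

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

pub struct Boks<T> {
    p: NonNull<T>,
    _t: PhantomData<T>,
}

impl<T> Boks<T> {
    pub fn ny(t: T) -> Self {
        Boks {
            // SAFETY: Box::into_raw never returns a null pointer.
            p: unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(t))) },
            _t: PhantomData,
        }
    }
}

// Plain Drop impl (no #[may_dangle]) so that this sketch builds on stable.
impl<T> Drop for Boks<T> {
    fn drop(&mut self) {
        // SAFETY: p came from Box::into_raw in ny and has not been freed.
        unsafe { drop(Box::from_raw(self.p.as_ptr())) };
    }
}

impl<T> Deref for Boks<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: p points to a live, aligned T for as long as self exists.
        unsafe { self.p.as_ref() }
    }
}

// Compiles only because this Boks<T> is covariant in T.
fn shorten<'a>(b: Boks<&'static str>) -> Boks<&'a str> {
    b
}

fn covariance_demo() -> String {
    let s = String::from("hei");
    let mut boks1 = Boks::ny(&*s);
    let first = boks1.len();
    let boks2: Boks<&'static str> = Boks::ny("heisann");
    // Assigning a Boks<&'static str> to a Boks<&'shorter str> relies on covariance.
    boks1 = boks2;
    let out = format!("{} {}", first, *boks1);
    out
}
```

Changing the field back to `p: *mut T` makes both `shorten` and the assignment in `covariance_demo` fail to compile.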
#![feature(dropck_eyepatch)]
use std::marker::PhantomData;
use std::ptr::NonNull;
pub struct Boks<T>
{
p: NonNull<T>,
_t: PhantomData<T>,
}
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p.as_mut()) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
// SAFETY: Box never creates a null pointer
p: unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(t))) },
_t: PhantomData,
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p.as_ref() }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p.as_mut() }
}
}
use std::fmt::Debug;
struct Oisann<T: Debug>(T);
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
// println!("{:?}", self.0);
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
let mut z = 42;
// let b = Boks::ny(Oisann(&mut z));
println!("{:?}", z);
let s = String::from("hei");
let mut boks1 = Box::new(&*s);
let boks2: Box<&'static str> = Box::new("heisann");
boks1 = boks2;
let s = String::from("hei");
let mut boks1 = Boks::ny(&*s);
let boks2: Boks<&'static str> = Boks::ny("heisann");
boks1 = boks2;
}
This compiles: our Boks is now covariant in T as well.
You may have seen PhantomData used with fn() -> T as its type parameter:
pub struct Boks<T>
{
p: NonNull<T>,
_t: PhantomData<fn() -> T>,
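// Written this way, Boks is still covariant in T,
// but it is no longer subject to the drop check for T.
// So why would anyone use this form?
// You see it in types that need to be covariant in T
// but never drop a T, because they do not contain one.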
}
...
fn main()
{
...
let mut z = 42;
let b = Boks::ny(Oisann(&mut z)); // The problem goes undetected again.
println!("{:?}", z);
...
}
The best example of this usage is a deserializer:
struct Deserializer<T>
{
// The deserializer itself does not contain a T,
// so the compiler has no need to check T's Drop impl.
_t: PhantomData<T>,
// Written this way, the compiler is told to check T's Drop impl.
}
Deserializer<Oisann<&mut>>
// may be rejected by the drop check,
// because the compiler has been told to consider Oisann's Drop impl,
// and Oisann accesses its T,
// even though Deserializer does not actually contain an Oisann
// and therefore never drops one.
So in a deserializer you use fn() -> T to get covariance in T:
struct Deserializer<T>
{
_t: PhantomData<fn() -> T>,
// Deserializer does not contain a T,
// so the compiler does not need to be told to check T's Drop impl.
}
Q : Another example is the Empty iterator
A : That is a good example:
struct EmptyIterator<T>
{
// This iterator neither contains nor yields a T.
_t: PhantomData<fn() -> T>,
// _t: PhantomData<T>,
// We do want the type to be covariant in T,
// but writing it this way would make the compiler check T's Drop impl.
}
impl<T> Iterator for EmptyIterator<T>
{
type Item = T;
fn next(&mut self) -> Option<Self::Item>
{
None
}
}
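The EmptyIterator above becomes runnable once it has a constructor. In this sketch, `new` and `collect_all` are hypothetical additions; the struct and the Iterator impl follow the listing.

```rust
use std::marker::PhantomData;

struct EmptyIterator<T> {
    // The marker mentions T only as a function return type,
    // so the struct does not claim to own a T.
    _t: PhantomData<fn() -> T>,
}

impl<T> EmptyIterator<T> {
    // Hypothetical constructor, added so the type can be instantiated.
    fn new() -> Self {
        EmptyIterator { _t: PhantomData }
    }
}

impl<T> Iterator for EmptyIterator<T> {
    type Item = T;
    fn next(&mut self) -> Option<Self::Item> {
        None
    }
}

// Collecting from the iterator yields an empty Vec for any item type.
fn collect_all() -> Vec<String> {
    EmptyIterator::<String>::new().collect()
}
```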
Q : hm its pub struct Empty<T>(marker::PhantomData<T>); in std 🤔
A : Let's play with it:
use std::iter::Empty;
...
fn main()
{
...
let mut a = 42;
let mut it = Empty::default();
{
let mut o = Some(Oisann(&mut a));
o = it.next();
}
println!("{:?}", a);
}
#![feature(dropck_eyepatch)]
use std::marker::PhantomData;
use std::ptr::NonNull;
use std::iter::Empty;
pub struct Boks<T>
{
p: NonNull<T>,
_t: PhantomData<T>,
}
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p.as_mut()) };
}
}
impl<T> Boks<T> {
fn ny(t: T) -> Self
{
Boks {
// SAFETY: Box never creates a null pointer
p: unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(t))) },
_t: PhantomData,
}
}
}
impl<T> std::ops::Deref for Boks<T>
{
type Target = T;
fn deref(&self) -> &Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
unsafe { &*self.p.as_ref() }
}
}
impl<T> std::ops::DerefMut for Boks<T> {
fn deref_mut(&mut self) -> &mut Self::Target
{
// SAFETY: is valid since it was constructed from a valid T, and turned into a pointer
// through Box which creates aligned pointers, and hasn't been freed, since self is alive.
// Also, since we have &mut, no other mutable reference has been given out to p.
unsafe { &mut *self.p.as_mut() }
}
}
use std::fmt::Debug;
struct Oisann<T: Debug>(T);
impl<T: Debug> Drop for Oisann<T>
{
fn drop(&mut self)
{
// println!("{:?}", self.0);
}
}
fn main()
{
let x = 42;
let b = Boks::ny(x);
println!("{:?}", *b);
let mut y = 42;
let b = Boks::ny(&mut y);
println!("{:?}", y);
let mut z = 42;
// let b = Boks::ny(Oisann(&mut z));
println!("{:?}", z);
let s = String::from("hei");
let mut boks1 = Box::new(&*s);
let boks2: Box<&'static str> = Box::new("heisann");
boks1 = boks2;
let s = String::from("hei");
let mut boks1 = Boks::ny(&*s);
let boks2: Boks<&'static str> = Boks::ny("heisann");
boks1 = boks2;
let mut a = 42;
let mut it = Empty::default();
{
let mut o = Some(Oisann(&mut a));
o = it.next();
}
println!("{:?}", a);
}
This compiles and runs, because Empty does not implement Drop.
1:07:34
Why does the following compile, even though drop is called explicitly?
use std::iter::Empty;
...
fn main()
{
...
let mut a = 42;
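let mut it: Empty<Oisann<&'static mut i32>> = Empty::default();
let mut o = Some(Oisann(&mut a));
{
o /* ...<&'a mut i32> */ = it.next(); /* returns ...<&'static mut i32> */
}
// &'a mut T is invariant in T, but covariant in 'a
drop(o);
println!("{:?}", a);
let _ = it.next();
// Although Empty claims it can produce mutable references with a 'static lifetime
// (see the type annotation on it), it never actually has to produce one.
// Calling it.next() again at the end is therefore fine,
// because Empty is not tied to the lifetime of &mut a:
// the 'static lifetime of the value returned by it.next() is shortened
// to the lifetime of the borrow already stored in o.
// When drop(o) runs,
// the value being dropped is an Oisann with a 'static lifetime.
// That is not a use of a, because it has nothing to do with a's lifetime;
// it is tied only to 'static.
// That is why this program is legal.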
}
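This last example needs no nightly features, so it can be tried on stable as a standalone function. The sketch below follows the listing above; `empty_demo` is a hypothetical wrapper name, and Oisann's drop prints its value so the moment of the drop is visible.

```rust
use std::fmt::Debug;
use std::iter::Empty;

struct Oisann<T: Debug>(T);

impl<T: Debug> Drop for Oisann<T> {
    fn drop(&mut self) {
        // Accesses the inner value on drop.
        println!("{:?}", self.0);
    }
}

fn empty_demo() -> i32 {
    let mut a = 42;
    let mut it: Empty<Oisann<&'static mut i32>> = Empty::default();
    let mut o = Some(Oisann(&mut a));
    // The old Some(Oisann(&mut a)) is dropped by this assignment;
    // the 'static lifetime of the returned value is shortened to fit o.
    o = it.next();
    drop(o);
    // a is no longer borrowed here.
    let seen = a;
    // it is still usable, since it was never tied to the borrow of a.
    let _ = it.next();
    seen
}
```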
Recap:
pub struct Boks<T>
{
p: NonNull<T>,
_t: PhantomData<T>,
}
NonNull gives us a *mut T that is also covariant in T; PhantomData tells the compiler to check T's Drop impl.
Recap:
unsafe impl<#[may_dangle] T> Drop for Boks<T>
{
fn drop(&mut self)
{
// SAFETY: p was constructed from a Box in the first place, and has not been freed,
// since self still exists (otherwise, drop could not be called).
unsafe { Box::from_raw(self.p.as_mut()) };
}
}
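We implement Drop for Boks because otherwise it would leak memory. #[may_dangle] is our promise to the compiler that we do not access T when dropping a Boks, so the compiler no longer has to assume that we do.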
Q : complicated for no reason lol
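A : The compiler makes assumptions that restrict what you can write so that the program does not go wrong at run time. You have to know exactly what you are doing before you lift those restrictions.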