Rust 讀書會導讀

# Rust 讀書會導讀 ## Ch. 15: Smart Pointers ###### tags: `Rust` `Smart Pointers` `Talk` --- ## 初學 Rust 常見問題 - 如何動態配置記憶體 - 如何寫鏈結串列 - 如何共用記憶體位址 - 如何變更記憶體內容 - 記憶體操作注意事項 ---- ### Agenda 本章試圖用 smart pointers 探討上述問題 - Smart pointers: `Box<T>`, `Rc<T>` - Interior Mutability Pattern - Reference cycles --- ## Pointers & Smart Pointers - Pointers: 存放記憶體位置的變數。Rust 的 reference 就是一種指標。 - Smart pointers: 行為類似指標/reference，但有額外功能的 struct。 - References 只是「借用」資料；而 Smart pointers 則有「所指向資料」的所有權。 - `String` 和 `Vec<T>` 也算是 smart pointers。 --- ## `Box<T>`  - 將型別為 `T` 的資料存放在 heap 中，`Box<T>` 本身在 stack 中留一個指向 heap 的指標。 - `Box<T>` 變數離開作用域時，會自動清除 heap 中的資料。 - 類似於 C++ 的 `std::unique_ptr<T>`。 - 使用時機： - 資料的大小在編譯時無法確定。 - 要搬大量資料、確保資料本身不會被複製。 - 想管理一筆資料，卻不在乎它的實際型別、只在乎它實作了哪個特徵。(*Trait Object*; 細節請參閱 Ch17) ---- ### Example ```rust fn main() { let b = Box::new(5); println!("b = {}", b); } ``` ![](https://hackmd.io/_uploads/B1hsFO3Ds.png) ---- ### 實作 Recursive Type - Cons List  ```rust // ERROR: recursive type has infinite size enum List { Cons(i32, List), Nil, } use crate::List::{Cons, Nil}; fn main() { let list = Cons(1, Cons(2, Cons(3, Nil))); } ``` ![](https://hackmd.io/_uploads/Sk0VtdhDj.png) ---- ### 使用 `Box<T>` 實作 Cons List  ```rust fn main() { let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil)))))); println!("{:?}", list); } use crate::List::{Cons, Nil}; #[derive(Debug)] enum List { Cons(i32, Box<List>), Nil, } ``` ---- ### 延伸思考：利用 reference 實作 Cons List?  ```rust #[derive(Debug)] enum List<'a> { Cons(i32, &'a List<'a>), Nil, } use crate::List::{Cons, Nil}; fn main() { println!("{:?}", Cons(1, &Cons(2, &Cons(3, &Nil)))); println!("{:?}", return_list()); } fn return_list() -> List<'static> { Cons(4, &Cons(5, &Cons(6, &Nil))) // let local_list = Nil; // Cons(4, &Cons(5, &Cons(6, &local_list))) } ``` --- ### Trait `Deref`  ```rust fn main() { let x = 5; let y = &x; assert_eq!(5, x); // assert_eq!(5, y); // 這行不會過，因為 5 (i32) 和 y (&i32) 型別不同 assert_eq!(5, *y); // *(&i32) ---> i32 let boxed_y = Box::new(x); assert_eq!(5, *boxed_y); // 為什麼 *Box<i32> 會變成 i32 ?? } ``` ---- ### Deref Coercion  - 如果型別 `T` 實作了 `Deref<Target = U>`，令 `x: T`： 1. `*x` 等同於 `*(x.deref())`（先轉為 `&U` 再取值）。 2. 在需要 `&U` 的地方（例如參數）傳入 `&x`，編譯器會自動呼叫 `x.deref()`。 3. `T` 自動實作所有 `U` 實作的 methods。 ----  ### 簡易版 `Box<T>`: `MyBox<T>` ```rust struct MyBox<T>(T); // generic tuple struct impl<T> MyBox<T> { fn new(x: T) -> MyBox<T> { MyBox(x) } } impl<T> Deref for MyBox<T> { type Target = T; fn deref(&self) -> &Self::Target { &self.0 } } let x = MyBox::new(100); let y1 = x.deref(); // typeof y1 == &i32 let y2 = *x; // typeof y2 == i32; 相當於 *(x.deref()) ``` ----  - 因為 `String` 實作了 `Deref<Target = str>`... ```rust fn count_words(s: &str) -> usize { /*...*/ } fn main() { let hello = String::from("Hello, world."); let _ = count_words(&hello); // OK; 自動轉成 count_words(hello.deref()) let _ = hello.find('w'); // OK; String 自動實作了 str::find() let m = MyBox::new(String::from("Rust")); let _ = count_words(&m); // It works!! // 相當於 hello(m.deref().deref()) } ``` - 編譯器會依照參數所需要的型別，自動判斷需要呼叫 `deref()` 的次數。而且會在編譯時決定，不會影響執行時的效能。 --- ### Trait `Drop`  - 所有的型別都能實作 `Drop`，定義資料離開作用域時該做的事，例如釋放檔案 or 網路資源。 - Smart pointer 型別幾乎一定都會實作 `Drop`，因為需要清除配置出來的記憶體。 - 數值在離開 scope 時，`Drop::drop()`（相當於其他語言中的 destructor）就會自動被呼叫。 - 若 scope 中有多個不同的數值，`Drop::drop()` 被呼叫的順序與數值產生的順相反（因為數值是存在 stack 中的）。 - 如果想在 scope 結束前先清除數值，可以利用 `std::mem::drop()`。 - 無法直接呼叫 `Drop::drop()`。因為 Rust 會自動在作用域結束時呼叫一次，如果可以直接呼叫，就有可能造成 double free。 --- ## Multiple Ownership with `Rc<T>`  - `Rc<T>` 是 reference counted smart pointer, 支援多個 owner, 並透過引用計數 (reference counting) 管理其指向的記憶體 * `Rc<T>` 初始化: 計數器 = 1 * `Rc<T>` 被複製: 計數器 += 1 * `Rc<T>` 被銷毀: 計數器 -= 1 * 當計數器歸零時, `Rc<T>` 就會自動釋放記憶體 ---- ### Using `Rc<T>` to Share Data  ```graphviz digraph RcDemo { rankdir="LR" NodeA [label="1234"] PtrA [label="a" shape=plaintext] PtrB [label="b" shape=plaintext] PtrC [label="c" shape=plaintext] PtrA -> NodeA PtrB -> NodeA PtrC -> NodeA } ``` ```rust= fn main() { let a = Rc::new(1234u32); // ref. count of 'a' is 1 let b = Rc::clone(&a); // ref. count of 'a' becomes 2 { let c = Rc::clone(&a); // ref. count of 'a' becomes 3 } // ref. count of 'a' drops to 2 after c goes out of scope println!("{}", Rc::strong_count(&a)); } ``` ---- ### A Case Study of Multiple Ownership  範例: Linked lists with shared nodes ```graphviz digraph SharedList { rankdir="LR" NodeA [label="5"] NodeB [label="3"] NodeC [label="4"] NodeD [label="10"] Nil [label="Nil" shape=plaintext] NodeB -> NodeA NodeC -> NodeA NodeA -> NodeD NodeD -> Nil } ``` ---- ### `Box`'ed type cannot have multiple owners  ```graphviz digraph BoxedList { rankdir="LR" NodeA [label="5 (a)"] NodeB [label="3 (b)"] NodeC [label="4 (c)"] NodeD [label="10"] Nil [label="Nil"] NodeB -> NodeA NodeC -> NodeA NodeA -> NodeD NodeD -> Nil } ``` ```rust= enum List { Cons(i32, Box<List>), Nil, } fn main() { let a = Cons(5, Box::new(Cons(10, Box::new(Nil)))); let b = Cons(3, Box::new(a)); // 'a' is moved into 'b' let c = Cons(4, Box::new(a)); // Error E0382: value used after move } ``` 節點 `b` & `c` 無法同時持有節點 `a` ---- ### `Rc<T>` on Linked List  ```graphviz digraph RcList { rankdir="LR" NodeA [label="(a) 5"] NodeB [label="(b) 3"] NodeC [label="(c) 4"] NodeD [label="10"] Nil [label="Nil"] NodeB -> NodeA NodeC -> NodeA NodeA -> NodeD NodeD -> Nil } ``` ```rust= enum List { Cons(i32, Rc<List>), Nil, } fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); let b = Cons(3, Rc::clone(&a)); let c = Cons(4, Rc::clone(&a)); // the entire list of 'a' will be dropped automatically, // from head to tail. } ``` --- ## Mutating a Value through a Shared Reference  - 原則上, Rust 的 shared references 必然是 immutable - 為了讓多個 owners 可以透過 immutable reference 修改資料, Rust 提出 *Interior Mutability Pattern* - 實作此範式的型別允許使用者在 __執行期間__ 取得 mutable borrow, 但用戶仍需遵守 borrowing rules * 同一時間只能有一個 mutable borrow ---- ### Interior Mutability with `RefCell<T>`  `RefCell<T>` 是一種 *contained type*, 提供 interior mutability ```rust= fn main() { let a = RefCell::new(1234u32); { let mut ra = a.borrow_mut(); // ra is a RefMut *ra = 5678; } println!("{:?}", a); // prints 5678 } ``` ---- ### A Use Case for Interior Mutability  ``` +--------------------+ Messenger +---------------+ | LimitTracker (lib) | ----------> ○-------| MockMessenger | +--------------------+ send +---------------+ ``` ```rust= pub trait Messenger { fn send(&self, msg: &str); } ``` - 考量 thread-safety, 所以 `Messenger::send()` 設計為 immutable method - 在這種狀況下, trait object 如何藉由 `send()` 更改自身資料, 記錄收到的訊息? ---- ### Mutating a Trait Object via Immutable Method  ```rust= pub trait Messenger { fn send(&self, msg: &str); } struct MockMessenger { // wrap the mutable data into a RefCell<T> sent_messages: RefCell<Vec<String>>, } impl Messenger for MockMessenger { fn send(&self, message: &str) { // stores the messages sent by LimitTracker self.sent_messages.borrow_mut().push(String::from(message)); } } ``` ---- ### Linked list with `RefCell`  ```graphviz digraph RefCellList { rankdir="LR" NodeA [label="5"] NodeB [label="3"] NodeC [label="4"] Nil [label="Nil"] NodeB -> NodeA NodeC -> NodeA NodeA -> Nil } ``` ```rust= enum List { // two pointers in a list node, for demonstration purpose Cons(Rc<RefCell<i32>>, Rc<List>), Nil, } fn main() { let value = Rc::new(RefCell::new(5)); let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a)); let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a)); *value.borrow_mut() += 10; } ``` --- ## Memory Leak with Reference Cycles  - Rust 儘可能保證 memory safety (例如, 沒有無效的 pointers), 但不保證不會發生 memory leak - 循環參照 (reference cycles) 即可造成 memory leak `---` ```graphviz digraph RefCycle { rankdir="LR" NodeA [label="a"] NodeB [label="b"] NodeA -> NodeB NodeB -> NodeA } ``` ---- ### An Example of Reference Cycles  ```rust= // the second element is wrapped by a RefCell so that // it could be modified while the Node is behind a Rc. struct Node(i32, RefCell<Option<Rc<Node>>>); fn main() { let a = Rc::new(Node(5, RefCell::new(None))); let b = Rc::new(Node(10, RefCell::new(None))); *a.1.borrow_mut() = Some(Rc::clone(&b)); *b.1.borrow_mut() = Some(Rc::clone(&a)); } ``` `a` & `b` 在 scope 結束後, 皆無法被釋放 ---- ### Further Issues caused by Reference Cycle  ```rust= #[derive(Debug)] struct Node(i32, RefCell<Option<Rc<Node>>>); fn main() { let a = Rc::new(Node(5, RefCell::new(None))); let b = Rc::new(Node(10, RefCell::new(None))); *a.1.borrow_mut() = Some(Rc::clone(&b)); *b.1.borrow_mut() = Some(Rc::clone(&a)); println!("{:?}", a); // will deref the pointer } ``` `--- output ---` ``` Node(5, RefCell { value: Some(Node(10, RefCell { value: Some(Node(5, ... thread 'main' has overflowed its stack ``` ---- ### The Weak References  ```graphviz digraph RcDemo { rankdir="LR" NodeA [label="1234"] PtrA [label="strong" shape=plaintext] PtrB [label="weak" shape=plaintext] PtrA -> NodeA PtrB -> NodeA [style="dashed"] } ``` ```rust= fn main() { let weak = { let strong = Rc::new(1234u32); let weak = Rc::downgrade(&strong); println!("{}", Rc::strong_count(&strong)); println!("{}", Rc::weak_count(&strong)); // keep the weak ref. while the strong ref. goes out of scope weak }; println!("{:?}", weak.upgrade()); // prints "None" } ``` ---- ### Preventing Reference Cycles by using Weak References  - 應用範例 * Doubly linked list * Tree nodes with a back reference ```graphviz digraph WeakCycle { rankdir="LR" NodeA [label="a"] NodeB [label="b"] NodeA -> NodeB NodeB -> NodeA [style="dashed"] } ``` ```rust= struct Node { value: i32, parent: RefCell<Weak<Node>>, children: RefCell<Vec<Rc<Node>>>, } ``` --- ## Summary  - Smart pointer types: `Box<T>` and `Rc<T>` - How `Deref` and `Drop` play their roles on smart pointers - Interior Mutability Pattern and `RefCell<T>` - Preventing reference cycles with weak references --- ## References  - [Learning Rust With Entirely Too Many Linked Lists](https://rust-unofficial.github.io/too-many-lists/) - [Tutorials by Jon Gjengset](https://www.youtube.com/watch?v=8O0Nt9qY_vo) --- ## Backup Slides --- ## Appendix: Review of List Definition  - The `Cons` list in ch.15 requires an extra `Nil` node as the list end - I personally prefer writing a list as following ```rust= // 'Option<Box>' will be optimized into a raw pointer struct Node(i32, Option<Box<Node>>); let list = Node(5, Some(Box::new(Node(15, Some(Box::new(Node(25, None))))))); ```