owned this note
owned this note
Published
Linked with GitHub
---
type: slide
slideOptions:
transition: slide
---
# Rust 讀書會導讀
## Ch. 15: Smart Pointers
###### tags: `Rust` `Smart Pointers` `Talk`
---
## 初學 Rust 常見問題
- 如何動態配置記憶體
- 如何寫鏈結串列
- 如何共用記憶體位址
- 如何變更記憶體內容
- 記憶體操作注意事項
----
### Agenda
本章試圖用 smart pointers 探討上述問題
- Smart pointers: `Box<T>`, `Rc<T>`
- Interior Mutability Pattern
- Reference cycles
---
## Pointers & Smart Pointers
- Pointers: 存放記憶體位置的變數。Rust 的 reference 就是一種指標。
- Smart pointers: 行為類似指標/reference,但有額外功能的 struct。
- References 只是「借用」資料;而 Smart pointers 則有「所指向資料」的所有權。
- `String` 和 `Vec<T>` 也算是 smart pointers。
---
## `Box<T>`
<!-- .slide: style="font-size: 32px;" -->
- 將型別為 `T` 的資料存放在 heap 中,`Box<T>` 本身在 stack 中留一個指向 heap 的指標。
- `Box<T>` 變數離開作用域時,會自動清除 heap 中的資料。
- 類似於 C++ 的 `std::unique_ptr<T>`。
- 使用時機:
- 資料的大小在編譯時無法確定。
- 要搬大量資料、確保資料本身不會被複製。
- 想管理一筆資料,卻不在乎它的實際型別、只在乎它實作了哪個特徵。(*Trait Object*; 細節請參閱 Ch17)
----
### Example
```rust
fn main() {
let b = Box::new(5);
println!("b = {}", b);
}
```

----
### 實作 Recursive Type - Cons List
<!-- .slide: style="font-size: 32px;" -->
```rust
// ERROR: recursive type has infinite size
enum List {
Cons(i32, List),
Nil,
}
use crate::List::{Cons, Nil};
fn main() {
let list = Cons(1, Cons(2, Cons(3, Nil)));
}
```

----
### 使用 `Box<T>` 實作 Cons List
<!-- .slide: style="font-size: 32px;" -->
```rust
fn main() {
let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
println!("{:?}", list);
}
use crate::List::{Cons, Nil};
#[derive(Debug)]
enum List {
Cons(i32, Box<List>),
Nil,
}
```
----
### 延伸思考:利用 reference 實作 Cons List?
<!-- .slide: style="font-size: 22px;" -->
```rust
#[derive(Debug)]
enum List<'a> {
Cons(i32, &'a List<'a>),
Nil,
}
use crate::List::{Cons, Nil};
fn main() {
println!("{:?}", Cons(1, &Cons(2, &Cons(3, &Nil))));
println!("{:?}", return_list());
}
fn return_list() -> List<'static> {
Cons(4, &Cons(5, &Cons(6, &Nil)))
// let local_list = Nil;
// Cons(4, &Cons(5, &Cons(6, &local_list)))
}
```
---
### Trait `Deref`
<!-- .slide: style="font-size: 32px;" -->
```rust
fn main() {
let x = 5;
let y = &x;
assert_eq!(5, x);
// assert_eq!(5, y); // 這行不會過,因為 5 (i32) 和 y (&i32) 型別不同
assert_eq!(5, *y); // *(&i32) ---> i32
let boxed_y = Box::new(x);
assert_eq!(5, *boxed_y); // 為什麼 *Box<i32> 會變成 i32 ??
}
```
----
### Deref Coercion
<!-- .slide: style="font-size: 32px;" -->
- 如果型別 `T` 實作了 `Deref<Target = U>`,令 `x: T`:
1. `*x` 等同於 `*(x.deref())`(先轉為 `&U` 再取值)。
2. 在需要 `&U` 的地方(例如參數)傳入 `&x`,編譯器會自動呼叫 `x.deref()`。
3. `T` 自動實作所有 `U` 實作的 methods。
----
<!-- .slide: style="font-size: 24px;" -->
### 簡易版 `Box<T>`: `MyBox<T>`
```rust
struct MyBox<T>(T); // generic tuple struct
impl<T> MyBox<T> {
fn new(x: T) -> MyBox<T> {
MyBox(x)
}
}
impl<T> Deref for MyBox<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
&self.0
}
}
let x = MyBox::new(100);
let y1 = x.deref(); // typeof y1 == &i32
let y2 = *x; // typeof y2 == i32; 相當於 *(x.deref())
```
----
<!-- .slide: style="font-size: 24px;" -->
- 因為 `String` 實作了 `Deref<Target = str>`...
```rust
fn count_words(s: &str) -> usize { /*...*/ }
fn main() {
let hello = String::from("Hello, world.");
let _ = count_words(&hello); // OK; 自動轉成 count_words(hello.deref())
let _ = hello.find('w'); // OK; String 自動實作了 str::find()
let m = MyBox::new(String::from("Rust"));
let _ = count_words(&m); // It works!!
// 相當於 hello(m.deref().deref())
}
```
- 編譯器會依照參數所需要的型別,自動判斷需要呼叫 `deref()` 的次數。而且會在編譯時決定,不會影響執行時的效能。
---
### Trait `Drop`
<!-- .slide: style="font-size: 28px;" -->
- 所有的型別都能實作 `Drop`,定義資料離開作用域時該做的事,例如釋放檔案 or 網路資源。
- Smart pointer 型別幾乎一定都會實作 `Drop`,因為需要清除配置出來的記憶體。
- 數值在離開 scope 時,`Drop::drop()`(相當於其他語言中的 destructor)就會自動被呼叫。
- 若 scope 中有多個不同的數值,`Drop::drop()` 被呼叫的順序與數值產生的順相反(因為數值是存在 stack 中的)。
- 如果想在 scope 結束前先清除數值,可以利用 `std::mem::drop()`。
- 無法直接呼叫 `Drop::drop()`。因為 Rust 會自動在作用域結束時呼叫一次,如果可以直接呼叫,就有可能造成 double free。
---
## Multiple Ownership with `Rc<T>`
<!-- .slide: style="font-size: 32px;" -->
- `Rc<T>` 是 reference counted smart pointer, 支援多個 owner, 並透過引用計數 (reference counting) 管理其指向的記憶體
* `Rc<T>` 初始化: 計數器 = 1
* `Rc<T>` 被複製: 計數器 += 1
* `Rc<T>` 被銷毀: 計數器 -= 1
* 當計數器歸零時, `Rc<T>` 就會自動釋放記憶體
----
### Using `Rc<T>` to Share Data
<!-- .slide: style="font-size: 32px;" -->
```graphviz
digraph RcDemo {
rankdir="LR"
NodeA [label="1234"]
PtrA [label="a" shape=plaintext]
PtrB [label="b" shape=plaintext]
PtrC [label="c" shape=plaintext]
PtrA -> NodeA
PtrB -> NodeA
PtrC -> NodeA
}
```
```rust=
fn main() {
let a = Rc::new(1234u32); // ref. count of 'a' is 1
let b = Rc::clone(&a); // ref. count of 'a' becomes 2
{
let c = Rc::clone(&a); // ref. count of 'a' becomes 3
}
// ref. count of 'a' drops to 2 after c goes out of scope
println!("{}", Rc::strong_count(&a));
}
```
----
### A Case Study of Multiple Ownership
<!-- .slide: style="font-size: 32px;" -->
範例: Linked lists with shared nodes
```graphviz
digraph SharedList {
rankdir="LR"
NodeA [label="5"]
NodeB [label="3"]
NodeC [label="4"]
NodeD [label="10"]
Nil [label="Nil" shape=plaintext]
NodeB -> NodeA
NodeC -> NodeA
NodeA -> NodeD
NodeD -> Nil
}
```
----
### `Box`'ed type cannot have multiple owners
<!-- .slide: style="font-size: 32px;" -->
```graphviz
digraph BoxedList {
rankdir="LR"
NodeA [label="5 (a)"]
NodeB [label="3 (b)"]
NodeC [label="4 (c)"]
NodeD [label="10"]
Nil [label="Nil"]
NodeB -> NodeA
NodeC -> NodeA
NodeA -> NodeD
NodeD -> Nil
}
```
```rust=
enum List {
Cons(i32, Box<List>),
Nil,
}
fn main() {
let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
let b = Cons(3, Box::new(a)); // 'a' is moved into 'b'
let c = Cons(4, Box::new(a)); // Error E0382: value used after move
}
```
節點 `b` & `c` 無法同時持有節點 `a`
----
### `Rc<T>` on Linked List
<!-- .slide: style="font-size: 32px;" -->
```graphviz
digraph RcList {
rankdir="LR"
NodeA [label="(a) 5"]
NodeB [label="(b) 3"]
NodeC [label="(c) 4"]
NodeD [label="10"]
Nil [label="Nil"]
NodeB -> NodeA
NodeC -> NodeA
NodeA -> NodeD
NodeD -> Nil
}
```
```rust=
enum List {
Cons(i32, Rc<List>),
Nil,
}
fn main() {
let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
let b = Cons(3, Rc::clone(&a));
let c = Cons(4, Rc::clone(&a));
// the entire list of 'a' will be dropped automatically,
// from head to tail.
}
```
---
## Mutating a Value through a Shared Reference
<!-- .slide: style="font-size: 32px;" -->
- 原則上, Rust 的 shared references 必然是 immutable
- 為了讓多個 owners 可以透過 immutable reference 修改資料, Rust 提出 *Interior Mutability Pattern*
- 實作此範式的型別允許使用者在 __執行期間__ 取得 mutable borrow, 但用戶仍需遵守 borrowing rules
* 同一時間只能有一個 mutable borrow
----
### Interior Mutability with `RefCell<T>`
<!-- .slide: style="font-size: 32px;" -->
`RefCell<T>` 是一種 *contained type*, 提供 interior mutability
```rust=
fn main() {
let a = RefCell::new(1234u32);
{
let mut ra = a.borrow_mut(); // ra is a RefMut
*ra = 5678;
}
println!("{:?}", a); // prints 5678
}
```
----
### A Use Case for Interior Mutability
<!-- .slide: style="font-size: 32px;" -->
```
+--------------------+ Messenger +---------------+
| LimitTracker (lib) | ----------> ○-------| MockMessenger |
+--------------------+ send +---------------+
```
```rust=
pub trait Messenger {
fn send(&self, msg: &str);
}
```
- 考量 thread-safety, 所以 `Messenger::send()` 設計為 immutable method
- 在這種狀況下, trait object 如何藉由 `send()` 更改自身資料, 記錄收到的訊息?
----
### Mutating a Trait Object via Immutable Method
<!-- .slide: style="font-size: 32px;" -->
```rust=
pub trait Messenger {
fn send(&self, msg: &str);
}
struct MockMessenger {
// wrap the mutable data into a RefCell<T>
sent_messages: RefCell<Vec<String>>,
}
impl Messenger for MockMessenger {
fn send(&self, message: &str) {
// stores the messages sent by LimitTracker
self.sent_messages.borrow_mut().push(String::from(message));
}
}
```
----
### Linked list with `RefCell`
<!-- .slide: style="font-size: 32px;" -->
```graphviz
digraph RefCellList {
rankdir="LR"
NodeA [label="5"]
NodeB [label="3"]
NodeC [label="4"]
Nil [label="Nil"]
NodeB -> NodeA
NodeC -> NodeA
NodeA -> Nil
}
```
```rust=
enum List {
// two pointers in a list node, for demonstration purpose
Cons(Rc<RefCell<i32>>, Rc<List>),
Nil,
}
fn main() {
let value = Rc::new(RefCell::new(5));
let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil)));
let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a));
let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a));
*value.borrow_mut() += 10;
}
```
---
## Memory Leak with Reference Cycles
<!-- .slide: style="font-size: 32px;" -->
- Rust 儘可能保證 memory safety (例如, 沒有無效的 pointers), 但不保證不會發生 memory leak
- 循環參照 (reference cycles) 即可造成 memory leak
`---`
```graphviz
digraph RefCycle {
rankdir="LR"
NodeA [label="a"]
NodeB [label="b"]
NodeA -> NodeB
NodeB -> NodeA
}
```
----
### An Example of Reference Cycles
<!-- .slide: style="font-size: 32px;" -->
```rust=
// the second element is wrapped by a RefCell so that
// it could be modified while the Node is behind a Rc.
struct Node(i32, RefCell<Option<Rc<Node>>>);
fn main() {
let a = Rc::new(Node(5, RefCell::new(None)));
let b = Rc::new(Node(10, RefCell::new(None)));
*a.1.borrow_mut() = Some(Rc::clone(&b));
*b.1.borrow_mut() = Some(Rc::clone(&a));
}
```
`a` & `b` 在 scope 結束後, 皆無法被釋放
----
### Further Issues caused by Reference Cycle
<!-- .slide: style="font-size: 32px;" -->
```rust=
#[derive(Debug)]
struct Node(i32, RefCell<Option<Rc<Node>>>);
fn main() {
let a = Rc::new(Node(5, RefCell::new(None)));
let b = Rc::new(Node(10, RefCell::new(None)));
*a.1.borrow_mut() = Some(Rc::clone(&b));
*b.1.borrow_mut() = Some(Rc::clone(&a));
println!("{:?}", a); // will deref the pointer
}
```
`--- output ---`
```
Node(5, RefCell { value: Some(Node(10, RefCell { value: Some(Node(5, ...
thread 'main' has overflowed its stack
```
----
### The Weak References
<!-- .slide: style="font-size: 32px;" -->
```graphviz
digraph RcDemo {
rankdir="LR"
NodeA [label="1234"]
PtrA [label="strong" shape=plaintext]
PtrB [label="weak" shape=plaintext]
PtrA -> NodeA
PtrB -> NodeA [style="dashed"]
}
```
```rust=
fn main() {
let weak = {
let strong = Rc::new(1234u32);
let weak = Rc::downgrade(&strong);
println!("{}", Rc::strong_count(&strong));
println!("{}", Rc::weak_count(&strong));
// keep the weak ref. while the strong ref. goes out of scope
weak
};
println!("{:?}", weak.upgrade()); // prints "None"
}
```
----
### Preventing Reference Cycles by using Weak References
<!-- .slide: style="font-size: 32px;" -->
- 應用範例
* Doubly linked list
* Tree nodes with a back reference
```graphviz
digraph WeakCycle {
rankdir="LR"
NodeA [label="a"]
NodeB [label="b"]
NodeA -> NodeB
NodeB -> NodeA [style="dashed"]
}
```
```rust=
struct Node {
value: i32,
parent: RefCell<Weak<Node>>,
children: RefCell<Vec<Rc<Node>>>,
}
```
---
## Summary
<!-- .slide: style="font-size: 36px;" -->
- Smart pointer types: `Box<T>` and `Rc<T>`
- How `Deref` and `Drop` play their roles on smart pointers
- Interior Mutability Pattern and `RefCell<T>`
- Preventing reference cycles with weak references
---
## References
<!-- .slide: style="font-size: 32px;" -->
- [Learning Rust With Entirely Too Many Linked Lists](https://rust-unofficial.github.io/too-many-lists/)
- [Tutorials by Jon Gjengset](https://www.youtube.com/watch?v=8O0Nt9qY_vo)
---
## Backup Slides
---
## Appendix: Review of List Definition
<!-- .slide: style="font-size: 32px;" -->
- The `Cons` list in ch.15 requires an extra `Nil` node as the list end
- I personally prefer writing a list as following
```rust=
// 'Option<Box>' will be optimized into a raw pointer
struct Node(i32, Option<Box<Node>>);
let list = Node(5, Some(Box::new(Node(15, Some(Box::new(Node(25, None)))))));
```