wilson@wilson-HP-Pavilion-Plus-Laptop-14-eh0xxx ~/CrustOfRust> neofetch --stdout
wilson@wilson-HP-Pavilion-Plus-Laptop-14-eh0xxx
-----------------------------------------------
OS: Ubuntu 22.04.3 LTS x86_64
Host: HP Pavilion Plus Laptop 14-eh0xxx
Kernel: 6.2.0-37-generic
Uptime: 22 mins
Packages: 2367 (dpkg), 11 (snap)
Shell: bash 5.1.16
Resolution: 2880x1800
DE: GNOME 42.9
WM: Mutter
WM Theme: Adwaita
Theme: Yaru-dark [GTK2/3]
Icons: Yaru [GTK2/3]
Terminal: gnome-terminal
CPU: 12th Gen Intel i5-12500H (16) @ 4.500GHz
GPU: Intel Alder Lake-P
Memory: 2876MiB / 15695MiB
wilson@wilson-HP-Pavilion-Plus-Laptop-14-eh0xxx ~/CrustOfRust> rustc --version
rustc 1.70.0 (90c541806 2023-05-31) (built from a source tarball)
In the 2019 Rust Survey, a lot of people were asking for video content covering intermediate Rust content. So in this first video (possibly of many), we're going to investigate a case where you need multiple explicit lifetime annotations. We explore why they are needed, and why we need more than one in this particular case. We also talk about some of the differences between the string types and introduce generics over a self-defined trait in the process.
Q: Will I be able to follow at all if I have never seen rust before? I have done python and some C/C++ though
A: Not sure. This video is best suited to viewers who have already read the Rust book.
0:03:36
Start by creating the Rust project:
$ cargo new --lib strsplit
$ cd strsplit
$ vim src/lib.rs
At the very top of the program, add a prelude of warn lints:
#![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
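We use warn rather than deny because the compiler gets smarter over time, and that changes what some lints flag. You do not want a lint to break your build just because you compiled with a newer compiler. During early development you may not want these warnings at all, since the extra noise distracts from debugging; the prelude is shown here so that, in the middle and late stages of development, you do not forget the small details that still need attention.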
First, write the prototype of the StrSplit constructor:
pub struct StrSplit {}
impl StrSplit {
pub fn new(haystack: &str, delimiter: &str) -> Self {}
}
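haystack is the string to search in, and delimiter is what the string is split on. The return type is Self, which refers to the type named by the impl block. Writing Self instead of spelling out StrSplit means the return type does not need to change if the type is later renamed, which keeps the code a little more flexible.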
Next, implement Iterator for StrSplit:
// let x: StrSplit;
// for part in x {
// }
impl Iterator for StrSplit
{
type Item = &str;
fn next(&mut self) -> Option<Self::Item> {}
}
A for loop really just calls xxx.next() repeatedly: it keeps iterating while it gets Some values and stops when it gets None.
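As a rough sketch of that expansion (the real desugaring goes through `IntoIterator::into_iter` and is slightly more involved), the two functions below are hypothetical helpers, not part of the strsplit crate, and they produce the same result:

```rust
/// Collects upper-cased items with an ordinary `for` loop.
pub fn collect_with_for(items: Vec<&str>) -> Vec<String> {
    let mut out = Vec::new();
    for part in items {
        out.push(part.to_uppercase());
    }
    out
}

/// Roughly what the `for` loop above expands to:
/// call `next()` until it returns `None`.
pub fn collect_with_loop(items: Vec<&str>) -> Vec<String> {
    let mut out = Vec::new();
    let mut iter = items.into_iter();
    loop {
        match iter.next() {
            Some(part) => out.push(part.to_uppercase()),
            None => break,
        }
    }
    out
}
```

This is why implementing `Iterator::next` is all a type needs in order to be usable in a `for` loop.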
Next, write a test case for the library:
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
}
Q: equality comparison compares element-wise?
(assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
)
A: The comparison is element-wise: it checks whether every item is equal.
Q: Will Higher-Kind lifetimes to be covered?
A: No. WK: look into "Using higher-ranked trait bounds with generics" when there is time.
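Q: Won't that just add noise when debugging early prototypes? (about the prelude line)
A: Early in development you may not want the prelude's warnings: you have not yet written documentation or met the other conventions, so the compiler emits a pile of warnings that do not affect compilation. If the code then has a real error, the warnings distract you and make debugging harder. It is better to turn the prelude's warnings on once the program has reached a certain level of maturity.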
Q: how do you decide between library and binary and how do your check the library output results while coding?
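A: Anything you run from the command line is a binary; everything else is a library. A binary has src/main.rs and a library has src/lib.rs, and a crate can contain both. To check a library's results, write test cases.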
Q: what do you use to mock external dependencies in your projects? I have tried mockall unit testing library but am hoping to find something that does not rely on traits for mocking.
A: Not covered in this video, though there are good ways to do it.
0:10:22 Need to revisit what this means!
Q: i thought all loops desugared to loop with a break condition
A: while loops desugar to loop as well. Turning for into while is easier to explain, but you're right that deeper down while is turned into loop.
Going back, define the fields that StrSplit needs:
pub struct StrSplit
{
remainder: &str,
delimiter: &str,
}
remainder is the rest of the string that the program has not looked at yet, and delimiter is what the string is split on.
Fully implement the StrSplit constructor:
impl StrSplit {
pub fn new(haystack: &str, delimiter: &str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
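When a field and a parameter have the same name (line 6 of the block, delimiter), the ":" can be omitted, which avoids repeating the name. The ":" is only needed when the field and parameter names differ (line 5, remainder: haystack).
Why is the string being searched called haystack outside StrSplit but remainder inside it? Because StrSplit processes part of the string on each call and leaves the rest unprocessed, then continues from that unprocessed part until every character has been examined. That is why the field inside StrSplit is named remainder.
Next, flesh out the Iterator implementation for StrSplit: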
impl Iterator for StrSplit
{
type Item = &str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
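The goal of the code above is to find the next delimiter in remainder, split the string at that delimiter, return the part before it, and set remainder to the part after it, repeating until the whole string has been processed.
Line 6 uses if let because searching remainder for delimiter has two possible outcomes: either it is found or it is not.
Found: find returns Some, so the pattern Some(next_delim) matches. Some is the variant of Option that wraps a value, and the value inside it (not the whole Option) is bound to next_delim.
Not found: the left side is Some but the right side is None, so the pattern does not match and the body of the if is not entered.
On line 7, the .. in [..next_delim] means the slice starts at the beginning of the string; on line 8, the .. in [(next_delim + self.delimiter.len())..] means the slice runs to the end of the string.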
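The find-then-slice step can be tried in isolation. The following is a minimal sketch; `split_first` is a hypothetical helper name, and the `'a` annotation it needs is the topic the rest of these notes build up to:

```rust
/// Splits `s` at the first occurrence of `delim`.
/// Returns `(before, after)`, or `None` when `delim` does not occur in `s`.
pub fn split_first<'a>(s: &'a str, delim: &str) -> Option<(&'a str, &'a str)> {
    // `str::find` returns the byte index of the first match wrapped in `Some`,
    // or `None` when there is no match.
    if let Some(next_delim) = s.find(delim) {
        // `..next_delim`: from the start of the string up to the match.
        let before = &s[..next_delim];
        // `(next_delim + delim.len())..`: from just past the match to the end.
        let after = &s[(next_delim + delim.len())..];
        Some((before, after))
    } else {
        None
    }
}
```

Calling `split_first("a b c", " ")` gives the pair `("a", "b c")`, which corresponds to one step of `next()`: the first half is yielded and the second half becomes the new remainder.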
If the else branch on line 13 only returned self.remainder without executing self.remainder = "";, every later call would end up in that else branch again, which is an infinite loop.
Q: Is the cascaded Self {} really the "preferred" way of implementing that? I'm very much new to Rust and it seems a bit odd coming from other languages
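A: Jon likes Self because it keeps the code more flexible, though it comes at a cost: the compiler has to support using Self this flexibly. That support was added very early on, so the impact is small.
Q: Jon, maybe you could explain later when should I use associated types VS generics, they don't look that different to me, and thus I always use generics
A: Use generics when there can be multiple implementations of the trait for a given type; use an associated type when only one implementation makes sense for a given type.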
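A small sketch may make the distinction concrete. The traits and types below (`ConvertTo`, `Parsed`, `Celsius`, `Digits`) are invented for illustration only and do not come from the video:

```rust
/// Generic trait parameter: one type can implement the trait several times,
/// once per choice of `T`.
pub trait ConvertTo<T> {
    fn convert(&self) -> T;
}

pub struct Celsius(pub f64);

impl ConvertTo<f64> for Celsius {
    fn convert(&self) -> f64 {
        self.0
    }
}

impl ConvertTo<String> for Celsius {
    fn convert(&self) -> String {
        format!("{} C", self.0)
    }
}

/// Associated type: each implementing type picks exactly one `Output`.
pub trait Parsed {
    type Output;
    fn value(&self) -> Self::Output;
}

pub struct Digits(pub &'static str);

impl Parsed for Digits {
    type Output = u32;
    fn value(&self) -> u32 {
        // Falls back to 0 when the text is not a valid number.
        self.0.parse::<u32>().unwrap_or(0)
    }
}
```

Iterator uses an associated type for the same reason as `Parsed` here: a given iterator type yields exactly one kind of item, so callers never have to say which `Item` they mean.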
Q: When do you use match ... with
vs if let Some(...)
?
A: Use match when there are several patterns to compare against; use if let when there is only one pattern to match.
pub struct StrSplit
{
remainder: &str,
delimiter: &str,
}
impl StrSplit {
pub fn new(haystack: &str, delimiter: &str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
impl Iterator for StrSplit
{
type Item = &str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
}
Try compiling the code so far. It fails; only the key part of the error is quoted here:
$ cargo test
...
error[E0106]: missing lifetime specifier
...
$ cargo check # shows the errors without duplicates
Next, modify the code following the compiler's suggestions.
#![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
pub struct StrSplit<'a>
{
remainder: &'a str,
delimiter: &'a str,
}
impl StrSplit<'_> {
pub fn new(haystack: &str, delimiter: &str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
impl Iterator for StrSplit<'_>
{
type Item = &str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
}
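Notes on the next() function: <'_> is the anonymous lifetime. It works much like type inference: the compiler works out how long each lifetime is on its own. type Item = &str; also needs a lifetime parameter because the returned type is &str (a pointer to a str) and Rust does not know how long the program will hold on to that pointer, so it cannot guarantee that the value outlives the pointer. (We never want the pointer to outlive the value, because that would be a dangling pointer.)
0:18:55
Need to clarify this concept further!
Give Item a lifetime parameter as well: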
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
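Giving Item a lifetime parameter also gives one to the return value of next(). Every call to next() passes in &mut self, which lets the lifetime be tied to StrSplit (because fn next(&mut self) -> Option<Self::Item> is really fn next(self: &mut StrSplit<'a>) -> Option<&'a str>). StrSplit is in turn tied to the lifetime of the haystack argument, so the return value's lifetime ends up tied to haystack as well.
With the changes so far the code still does not compile; the fixes continue below.
As an aside, if the lifetime parameter after StrSplit is changed from <'a> to <'_> (while impl<'a> is kept) and the code is compiled, the compiler reports: error[E0207]: the lifetime parameter `'a` is not constrained by the impl trait, self type, or predicates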
0:20:33
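Q: can I be wrong by specifying lifetimes?
A: No, because wrong lifetimes do not compile. It is like accidentally using the wrong type: when you call a function that requires one type and you pass another, the compiler compares them, finds that they do not match, and compilation fails.
Q: how to tell where anonymous lifetime can be used?
A: <'_> tells the compiler to work out the lifetime itself, which it can only do when there is exactly one possible guess. (This does not mean that every '_ in the same impl block stands for the same lifetime. Each use of '_ gets its own unique lifetime, although some of them may turn out to be equally long.)
To elaborate: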
impl Foo
{
fn get_ref(&self) -> &'_ str {}
}
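In get_ref() the only lifetime available is that of &self, so the compiler can infer that the return value lives as long as the &self argument. That is why it does not need to be written in the following form: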
impl Foo
{
fn get_ref<'a>(&'a self) -> &'a str {}
}
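To see that the two signatures mean the same thing, here is a small compilable sketch. The `name` field and the second method name `get_ref_explicit` are assumptions added so that both forms can live side by side:

```rust
pub struct Foo {
    pub name: String,
}

impl Foo {
    /// Elided form: the compiler ties the returned reference to `&self`.
    pub fn get_ref(&self) -> &str {
        &self.name
    }

    /// Fully spelled-out form with the same meaning.
    pub fn get_ref_explicit<'a>(&'a self) -> &'a str {
        &self.name
    }
}
```

Both methods borrow from `self`, so the returned `&str` cannot be used after the `Foo` it came from has been dropped.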
Q: What is the difference between 'a and '_ ?
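A: '_ (with an underscore) tells the compiler to infer the lifetime itself. Because we know the compiler has only one possible choice, we can safely leave it to the compiler instead of handling it ourselves. 'a is a specific, named lifetime, somewhat like the T of a generic.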
Q: Is there any kind of ordering on lifetime specifiers? Like, is 'a > 'b? Or is it just a way of grouping references together as a unit?
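A: Yes, through subtyping. For example, the special lifetime 'static lasts from the point of declaration until the end of the program, so some lifetime 'a can be shorter than 'static. The name of a lifetime does not matter; you could just as well call it 'b.
Q: How can the compiler know that something is wrong and yet be unable to infer the right thing?
A: Consider this example: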
fn multiply(x: (), y: i32) -> i32
{
}
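The compiler knows this is wrong, but it does not know what type x was meant to have; only you know that. The unit type () tells the compiler nothing about your intent, so it cannot give you the correct answer.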
0:24:43
Q: why would you not elide the lifetime if your leaving the ’_ in the type
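A: You basically want to elide whenever you can. In some cases, writing '_ is meant to say: do not consider this lifetime for the purpose of guessing.
Q: there is a way to use multiple lifetimes specifiers at same impl?
A: That comes up later.
Q: Is there any kind of ordering on lifetime specifiers? Like, is 'a > 'b? Or is it just a way of grouping references together as a unit?
A: Yes, but it is not needed in this session.
Q: Does '_ can be used when there is only one possible lifetime? So the compiler can guess properly
A: Not quite. Compare the following two forms.
With explicit lifetimes:
fn foo<'x, 'y>(x: &'x str, y: &'y str) -> &'x str {}
With '_:
fn foo(x: &str, y: &'_ str) -> &'_ str {}
Each '_ in argument position turns into an arbitrary, unique lifetime that nothing else shares. The '_ in the return type was explained as being inferred to tie to the argument x rather than y, because y has its own lifetime. Note, however, that rustc's elision rules count the elided lifetime on x and the '_ on y as two separate input lifetimes, so the second form is rejected with error E0106 (missing lifetime specifier); only the explicit form compiles.
Q: So in other words, the life time of StrSplit.remainder and StrSplit.delimiter, is now tied to the lifetime of the StrSplit itself?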
pub struct StrSplit<'a>
{
remainder: &'a str,
delimiter: &'a str,
}
A: Not quite. WK: StrSplit.remainder and StrSplit.delimiter are tied to the lifetimes of the arguments that were passed in, not to StrSplit itself.
#![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
pub struct StrSplit<'a>
{
remainder: &'a str,
delimiter: &'a str,
}
impl StrSplit<'_> {
pub fn new(haystack: &str, delimiter: &str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
}
On my machine the compiler reports the following errors:
$ cargo check
Checking strsplit v0.1.0 (/home/wilson/CrustOfRust/strsplit)
error: lifetime may not live long enough
--> src/lib.rs:10:9
|
8 | pub fn new(haystack: &str, delimiter: &str) -> Self
| - ---- return type is StrSplit<'2>
| |
| let's call the lifetime of this reference `'1`
9 | {
10 | / Self {
11 | | remainder: haystack,
12 | | delimiter,
13 | | }
| |_________^ associated function was supposed to return data with lifetime `'2` but it is returning data with lifetime `'1`
error: lifetime may not live long enough
--> src/lib.rs:10:9
|
8 | pub fn new(haystack: &str, delimiter: &str) -> Self
| - ---- return type is StrSplit<'2>
| |
| let's call the lifetime of this reference `'3`
9 | {
10 | / Self {
11 | | remainder: haystack,
12 | | delimiter,
13 | | }
| |_________^ associated function was supposed to return data with lifetime `'2` but it is returning data with lifetime `'3`
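Jon's machine reported a different error. My compiler is newer than the one Jon used and its inference is stronger, so I did not get the same error message that Jon saw when compiling.
The lifetime of Self is 'a (inferred by the compiler), so remainder should also get lifetime 'a (from the definition of StrSplit<'a>). Instead, remainder is given the lifetime of haystack, and the compiler does not know whether the haystack pointer lives longer or shorter than StrSplit. The same applies to delimiter.
If the caller freed the memory behind haystack or delimiter right after calling new(), the fields of StrSplit could point to freed values, which would be dangling pointers.
Continue improving the program by tying the lifetimes of the StrSplit fields to the lifetimes of the arguments: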
-impl StrSplit<'_> {
+impl<'a> StrSplit<'a> {
- pub fn new(haystack: &str, delimiter: &str) -> Self
+ pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
Q: why do we use generic names like 'a, 'b, etc. for lifetimes and not proper names like (typical) variables?
A: The lifetime parameter names will be made more descriptive shortly.
Q: how resilient is the anonymous lifetime? will you get yourself in trouble if you rely on it too much or is the compiler going to pick correctly the vast majority of the time?
A: Use the anonymous lifetime whenever you can.
Q: Can you impose restrictions between lifetimes?
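A: Yes. You can give an impl block multiple lifetime parameters and state relationships between them. For example, you can require that 'a outlives 'b, that is, lives at least as long as 'b. That is not discussed here.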
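For reference, such a relationship is written as a bound of the form `'a: 'b`. The sketch below uses a made-up `Pair` type with lifetimes named `'long` and `'short`; it is an illustration under those assumptions, not code from the video:

```rust
/// Holds two references with independent lifetimes.
pub struct Pair<'long, 'short> {
    pub long: &'long str,
    pub short: &'short str,
}

impl<'long, 'short> Pair<'long, 'short>
where
    'long: 'short, // read as: 'long outlives 'short (lives at least as long)
{
    /// Returns either field typed as `&'short str`. Returning `self.long`
    /// is only allowed because of the `'long: 'short` bound above.
    pub fn pick(&self, want_long: bool) -> &'short str {
        if want_long {
            self.long
        } else {
            self.short
        }
    }
}
```

Without the `where` clause the compiler would reject `pick`, since nothing would guarantee that data living for `'long` is still valid for all of `'short`.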
Q: why is the 'a next to the "impl" keyword needed?
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
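A: See the following example:
struct Foo<T>;
impl Foo<T> {}
This impl uses T, but the compiler does not know what T is.
struct Foo<T>;
impl<T> Foo<T> {}
Here impl<T> declares that the impl block is generic over T. Lifetime parameters work the same way: impl<'a> declares 'a so that StrSplit<'a> can use it.
Q: The Rust typesystem has two bottom types
A: Yes.
Q: "subtyping" is actually the language used for lifetimes in the Rustonomicon
A: Yes.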
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
pub struct StrSplit<'a>
{
remainder: &'a str,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
}
Run the compiler check on the current code once more; it finally passes:
$ cargo check
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
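&'a str <- &'static str
Why is this OK?
self.remainder = "";
The literal "" is 'static because at compile time it really is placed in the initialized data section of the binary stored on disk. Since 'static outlives every other lifetime, a &'static str can be assigned wherever a &'a str is expected.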
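The same coercion can be seen in a tiny standalone function. `or_placeholder` is a hypothetical name used only for this sketch:

```rust
/// Returns `s` unless it is empty, in which case a string literal is returned.
/// The literal has type `&'static str`; because `'static` outlives any `'a`,
/// it is accepted where a `&'a str` is expected.
pub fn or_placeholder<'a>(s: &'a str) -> &'a str {
    if s.is_empty() {
        "<empty>"
    } else {
        s
    }
}
```

The reverse direction does not work: a `&'a str` borrowed from a local `String` cannot be returned where a `&'static str` is required.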
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'a>
{
remainder: &'a str,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
// assert_eq!(letters, vec!["a", "b", "c", "d", "e"].into_iter());
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
Run the tests again; they pass as well:
$ cargo test
...
running 1 test
test it_works ... ok
...
Q: everything by default has static lifetime?
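A: A value's lifetime depends on when it is dropped. If a value is not declared 'static but is never dropped, it gives the appearance of having a 'static lifetime. Variables declared inside a function live on the stack; on leaving the function, the values of those local variables are cleaned up, and at that point their lifetimes have ended.
Q: can i think about strsplit like a foldr?
A: No. All StrSplit does is split a string.
Make the test case more concise: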
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", "e"]);
}
Q: Don't variables die at end-of-scope, not just return?
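A: As long as a value has not been dropped, its lifetime has not ended. A value is dropped when it goes out of scope, and that is when its lifetime expires. Returning to the earlier question of whether values default to 'static: again, it depends on how long the value stays in memory; lifetimes are not 'static by default.
Add a test case in which the delimiter is at the tail; the last substring is expected to be "":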
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'a>
{
remainder: &'a str,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: haystack,
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(next_delim) = self.remainder.find(self.delimiter) {
let until_delimiter = &self.remainder[..next_delim];
self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
The actual result, however, is:
$ cargo test
left: `["a", "b", "c", "d"]`,
right: `["a", "b", "c", "d", ""]`', src/lib.rs:50:5
So the next() function has to change. This is the part we want to modify:
else if self.remainder.is_empty() {
// TODO: bug
None
} else {
let rest = self.remainder;
self.remainder = "";
Some(rest)
}
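At this point remainder is "". We need to tell apart two situations: remainder is "" and everything has already been yielded, or remainder is "" but that empty string has not been yielded yet.
To solve this, go back to the StrSplit struct and change the type of remainder to Option. This is the key step, because we will shortly use Option's take() function to take ownership of the value. The constructor must change as well: wrap the incoming argument in Some: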
#[derive(Debug)]
pub struct StrSplit<'a>
{
- remainder: &'a str,
+ remainder: Option<&'a str>,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
- remainder: haystack,
+ remainder: Some(haystack),
delimiter,
}
}
}
Q: lifetimes are for stack allocated memory? heap allocations like String don't have specified lifetimes?
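A: Heap values have lifetimes too: once a heap value is dropped, its lifetime has ended. A heap value may also never be dropped, in which case it behaves as if it were 'static. One way for that to happen is Box::leak, which returns a 'static reference; using it deliberately is not the same thing as an accidental memory leak.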
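A minimal sketch of the Box::leak case, assuming a hypothetical helper named `leak_string`:

```rust
/// Moves `s` into a heap allocation that is deliberately never freed,
/// which yields a reference that is valid for the rest of the program.
pub fn leak_string(s: String) -> &'static str {
    // `into_boxed_str` converts the String into a `Box<str>`;
    // `Box::leak` gives up ownership and hands back a `'static` reference.
    Box::leak(s.into_boxed_str())
}
```

The memory is reclaimed only when the process exits, so this is usually reserved for values that are created once and needed for the whole run.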
Q: If you dumped the binary, could you spot the static allocation ?
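A: You can see static allocations in a dump, but not the empty string (""), because the compiler optimizes it away.
The plan was to fix only part of next(), but it turns into a rewrite of the whole function: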
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
if let Some(ref mut remainder) = self.remainder {
// ref mut borrows the value inside self.remainder mutably (remainder: &mut &'a str),
// rather than creating a mutable binding to a shared borrow of it.
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take() // lets us yield the final piece, including an empty string, exactly once
}
} else {
None // once the last piece has been yielded,
// take() has moved the value out of self.remainder, leaving None
}
}
}
if let only checks whether the pattern matches; it does not check whether the two sides are equal in value.
Q: what does ref keyword mean?
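A: See the following example:
if let Some(remainder) = self.remainder {
Without ref, a matching pattern moves (or copies) the value out of self.remainder.
If the left side matches the right side with ref mut (Some(ref mut x) against an Option holding y), x gets the borrow &mut y instead of having the value of y moved into x:
if let Some(ref mut remainder /* &mut &'a str */) = self.remainder /* Option<&'a str> */{
If the left side matches the right side with only mut (Some(mut x) against an Option holding y), x becomes a mutable binding of y, which moves the value of y into x.
The mut in ref mut means the value being referred to can be modified through the reference, not that the binding can be pointed at something else. This is how the referred-to value is changed:
*remainder = &remainder[(next_delim + self.delimiter.len())..]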
Q: what is ref keyword means? Is it same as & ?
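A: See the following examples:
if let Some(&mut remainder) = self.remainder
This pattern checks whether the right side is Some(&mut T); if it is, remainder has type T.
if let Some(ref mut remainder) = self.remainder
This pattern checks whether the right side is Some(T); if it is, remainder has type &mut T. So ref is not the same as &: in a pattern, & matches a reference and strips it off, while ref creates one. The line above can also be written as:
if let Some(remainder) = &mut self.remainder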
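The three pattern forms can be compared in a small sketch. The function names are hypothetical, and `u32` is used instead of `&str` only to keep the example short:

```rust
/// `ref mut` borrows the value inside the Option, so the write is visible afterwards.
pub fn bump_with_ref_mut(mut slot: Option<u32>) -> Option<u32> {
    if let Some(ref mut n) = slot {
        // n: &mut u32, pointing into `slot`
        *n += 1;
    }
    slot
}

/// Same effect through match ergonomics: matching on `&mut slot`
/// binds `n` as `&mut u32`.
pub fn bump_with_borrow(mut slot: Option<u32>) -> Option<u32> {
    if let Some(n) = &mut slot {
        *n += 1;
    }
    slot
}

/// A `&mut` pattern goes the other way: it matches a reference
/// and binds what the reference points to.
pub fn read_through(slot: Option<&mut u32>) -> Option<u32> {
    if let Some(&mut n) = slot {
        // n: u32, copied out from behind the reference
        Some(n)
    } else {
        None
    }
}
```

In current Rust the second form (`Some(n) = &mut slot`) is generally the more common spelling; `ref mut` is shown because it is what the code in these notes uses.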
0:51:36
Q: what is the deref on the left side of the assignment doing?
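A: See the following code:
*remainder = &remainder[(next_delim + self.delimiter.len())..]
Type of the left side (remainder): &mut &'a str (a pointer to a pointer)
Type of the right side: &'a str (a pointer)
Because the two sides have different types, the left side must be dereferenced before the right-hand value can be assigned to it.
&remainder[(next_delim + self.delimiter.len())..]
The compiler reads this in the following order:
remainder[(next_delim + self.delimiter.len())..] takes a range of the string.
&remainder[(next_delim + self.delimiter.len())..] takes a reference to that range.
0:52:46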
Q: What is the ".take()" call doing ?
A: See the explanation below:
self.remainder.take()
// impl<T> Option<T> { fn take(&mut self) -> Option<T> }
// if the Option is None, return None
// if the Option is Some, set the Option to None and return the Some
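The behaviour described in those comments can be checked directly. `take_once` is a hypothetical helper that reports both the value taken out and what is left behind:

```rust
/// Takes the value out of `slot`, leaving `None` in its place.
/// Returns `(taken, left_behind)`.
pub fn take_once(
    slot: &mut Option<&'static str>,
) -> (Option<&'static str>, Option<&'static str>) {
    let taken = slot.take(); // `slot` is now None
    (taken, *slot)
}
```

This is what lets `next()` yield a trailing empty string exactly once: the first `take()` returns `Some("")`, and every later call sees `None`.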
Every let statement is a pattern match.
Simplify the code in the next() function:
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
let ref mut remainder = self.remainder?;
// the line above can also be written as:
// let remainder = &mut self.remainder?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'a>
{
remainder: Option<&'a str>,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
let ref mut remainder = self.remainder?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
$ cargo test
Finished test [unoptimized + debuginfo] target(s) in 0.00s
Running unittests src/lib.rs (target/debug/deps/strsplit-a9fa65918e243300)
running 2 tests
test it_works ... FAILED
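Q: If self is mutable here, why is self.remainder not mutable by default? (Coming from a C background, I'm thinking about this kind of like const)
A: Mutable references are only one level deep. Passing &mut self lets us modify any field of self, but not necessarily what a field points to. Take delimiter as an example: it points to an immutable string, so the string it points to cannot be modified, but delimiter itself can be changed to point to a different immutable string.
The infinite loop above happens because the value is never moved: ref mut does not have the intended effect, for the following reason:
let ref mut remainder = self.remainder?;
The ? operator returns None if self.remainder is None; otherwise it evaluates to the value inside Some, like unwrapping a package. Normally the line above would work, but the value inside the Option in self.remainder is a Copy type, so the assignment copies it rather than moving it. The remainder on the left (call it ptrRemainderCopy) and self.remainder on the right (ptrRemainder) therefore become two different pointers.
So when *remainder = &remainder[(next_delim + self.delimiter.len())..]; runs, it only changes the copy (ptrRemainderCopy); self.remainder (ptrRemainder) is not updated, which ultimately causes the infinite loop.
Open question: why is the &'a str inside the Option a Copy type? (Shared references &T are Copy for every T, and Option<T> is Copy whenever T is.)
To make the left side borrow the value on the right, add as_mut() on the right: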
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
- let ref mut remainder = self.remainder?;
+ let remainder = self.remainder.as_mut()?;
+ // impl<T> Option<T> { fn as_mut(&mut self) -> Option<&mut T> }
+ // combined with ? to unwrap it, this gives the &mut &'a str we need
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
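The difference between the two versions can be reproduced outside the iterator. In this sketch, `advance_copy` and `advance_in_place` are hypothetical helpers that drop the first byte of the stored string; they assume a non-empty ASCII string so that slicing at index 1 is valid:

```rust
/// Mirrors the buggy version: `?` copies the `&str` out of the Option,
/// so advancing the local copy leaves `slot` untouched.
pub fn advance_copy(slot: &mut Option<&str>) -> Option<()> {
    let ref mut remainder = (*slot)?; // a copy of the inner &str
    *remainder = &remainder[1..]; // only the copy moves forward
    Some(())
}

/// Mirrors the fix: `as_mut()` yields `Option<&mut &str>`, so the write
/// goes through to the value stored in `slot`.
pub fn advance_in_place(slot: &mut Option<&str>) -> Option<()> {
    let remainder = slot.as_mut()?; // &mut &str pointing into `slot`
    *remainder = &remainder[1..];
    Some(())
}
```

With `Some("abc")` as input, `advance_copy` leaves the Option at `Some("abc")`, whereas `advance_in_place` changes it to `Some("bc")`. In the iterator, the first behaviour means `remainder` never shrinks, hence the endless loop.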
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'a>
{
remainder: Option<&'a str>,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
This fixes the infinite loop:
$ cargo test
...
running 2 tests
test tail ... ok
test it_works ... ok
...
So far nothing has required multiple lifetimes; that is the next topic.
First, add an until_char() function:
pub fn until_char(s: &str, c: char) -> &str
{
StrSplit::new(s, &format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
Add a test case:
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'a>
{
remainder: Option<&'a str>,
delimiter: &'a str,
}
impl<'a> StrSplit<'a> {
pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
impl<'a> Iterator for StrSplit<'a>
{
type Item = &'a str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
pub fn until_char(s: &str, c: char) -> &str
{
StrSplit::new(s, &format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
The compiler reports the following error:
$ cargo check
Checking strsplit v0.1.0 (/home/wilson/CrustOfRust/strsplit)
error[E0515]: cannot return value referencing temporary value
--> src/lib.rs:35:5
|
35 | StrSplit::new(s, &format!("{}", c))
| ^ ---------------- temporary value created here
| _____|
| |
36 | | .next()
37 | | .expect("StrSplit always gives at least one result!")
| |_____________________________________________________________^ returns a value referencing data owned by the current function
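The return value's lifetime is tied to &format!("{}", c), but that temporary is dropped when the function returns, so the return value would point to invalid memory. Why is the return value tied to &format!("{}", c) and not to s? Because we declared that both arguments of new() have the same lifetime 'a. Since &format!("{}", c) has the shorter lifetime (it only lives inside the function), that shorter lifetime is used as 'a (when several lifetimes must be unified, the shortest one wins). The return value is therefore tied to the function body: &format!("{}", c) is only alive while the function is running. What we want instead is: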
pub fn until_char<'s>(s : &'s str, c: char) -> &'s str
How do we tell Rust that this is OK? We need more than one lifetime.
Q: Should we copy the delimiter into our struct?
A: Declaring delimiter as a String avoids the multiple-lifetime problem:
#[derive(Debug)]
pub struct StrSplit<'a>
{
remainder: Option<&'a str>,
delimiter: String,
}
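Because a String is heap-allocated and owned, no lifetime is attached to it.
str -> [char]
str is similar to [char] (strictly speaking, its contents are UTF-8 bytes). str has no size: like a slice, it is just a sequence of characters and does not know how long the sequence is.
&str -> &[char]
&str is a fat pointer: a two-word value containing a pointer to the first element of the slice and the number of elements in the slice.
String -> Vec<char>
From a String you can easily get a &str:
String -> &str (cheap: AsRef)
&str -> String (expensive: Clone)
Declaring delimiter as a String has drawbacks, though. Using String means you need a memory allocator, which would make this library incompatible with environments that do not have one, such as some embedded devices. So the String approach is not used here.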
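You usually do not need multiple lifetimes; they are only needed in a few special cases such as the one discussed here, where a type holds several references. The important point is that these references do not point to the same thing, and we want the return value to be tied to only one of them: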
#[derive(Debug)]
-pub struct StrSplit<'a>
+pub struct StrSplit<'haystack, 'delimiter>
{
- remainder: Option<&'a str>,
+ remainder: Option<&'haystack str>,
- delimiter: &'a str,
+ delimiter: &'delimiter str,
}
-impl<'a> StrSplit<'a> {
+impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
- pub fn new(haystack: &'a str, delimiter: &'a str) -> Self
+ pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
-impl<'a> Iterator for StrSplit<'a>
+impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
{
- type Item = &'a str;
+ type Item = &'haystack str;
+ // this way the return value is tied to haystack only
fn next(&mut self) -> Option<Self::Item>
{
...
}
}
...
The compiler no longer requires the arguments passed to new() to have the same lifetime.
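Put together, a compact version of the two-lifetime design might look like the sketch below. It follows the struct and function names used in these notes, with the body of next() taken from the as_mut() version above; treat it as an illustration rather than the final code from the video:

```rust
/// Splitter whose yielded items borrow from the haystack only.
pub struct StrSplit<'haystack, 'delimiter> {
    remainder: Option<&'haystack str>,
    delimiter: &'delimiter str,
}

impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
    pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self {
        Self {
            remainder: Some(haystack),
            delimiter,
        }
    }
}

impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter> {
    // Items are tied to 'haystack, never to 'delimiter.
    type Item = &'haystack str;

    fn next(&mut self) -> Option<Self::Item> {
        let remainder = self.remainder.as_mut()?;
        if let Some(next_delim) = remainder.find(self.delimiter) {
            let until_delimiter = &remainder[..next_delim];
            *remainder = &remainder[(next_delim + self.delimiter.len())..];
            Some(until_delimiter)
        } else {
            self.remainder.take()
        }
    }
}

/// The delimiter is a temporary String that dies inside this function,
/// yet the result may outlive it because it is tied to `s` alone.
pub fn until_char(s: &str, c: char) -> &str {
    StrSplit::new(s, &format!("{}", c))
        .next()
        .expect("StrSplit always gives at least one result!")
}
```

Because `Item` mentions only `'haystack`, the borrow checker accepts returning a piece of `s` even though the delimiter string is dropped at the end of `until_char`.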
Next, deliberately change Some(until_delimiter) to Some(self.delimiter).
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'haystack, 'delimiter>
{
remainder: Option<&'haystack str>,
delimiter: &'delimiter str,
}
impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(self.delimiter)
} else {
self.remainder.take()
}
}
}
pub fn until_char(s: &str, c: char) -> &str
{
StrSplit::new(s, &format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
The compiler reports the following error:
$ cargo check
error: lifetime may not live long enough
--> src\lib.rs:28:13
|
17 | impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
| --------- ---------- lifetime `'delimiter` defined here
| |
| lifetime `'haystack` defined here
...
28 | Some(self.delimiter)
| ^^^^^^^^^^^^^^^^^^^^ method was supposed to return data with lifetime `'haystack` but it is returning data with lifetime `'delimiter`
|
= help: consider adding the following bound: `'delimiter: 'haystack`
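As an experiment, add the bound where 'delimiter: 'haystack, which tells the compiler that 'delimiter lives at least as long as 'haystack. It reads much like 'delimiter "implementing" 'haystack, and it uses the subtyping relationship mentioned earlier: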
impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
where
'delimiter: 'haystack
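With this bound, declaring type Item = &'haystack str; and then returning Some(self.delimiter) compiles, but the earlier compile error comes back: the return value is again tied to &format!("{}", c), whose lifetime ends when the function exits.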
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'haystack, 'delimiter>
{
remainder: Option<&'haystack str>,
delimiter: &'delimiter str,
}
impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
where
'delimiter: 'haystack
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(self.delimiter)
} else {
self.remainder.take()
}
}
}
pub fn until_char(s: &str, c: char) -> &str
{
StrSplit::new(s, &format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters = StrSplit::new(haystack, " ");
assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
The compiler check now reports the following error:
$ cargo check
error[E0515]: cannot return value referencing temporary value
--> src\lib.rs:39:5
|
39 | StrSplit::new(s, &format!("{}", c))
| ^ ---------------- temporary value created here
| _____|
| |
40 | | .next()
41 | | .expect("StrSplit always gives at least one result!")
| |_____________________________________________________________^ returns a value referencing data owned by the current function
Change the code as follows:
...
impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
-where
- 'delimiter: 'haystack
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
- Some(self.delimiter)
+ Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
-pub fn until_char(s: &str, c: char) -> &str
+pub fn until_char<'s>(s : &'s str, c: char) -> &'s str
{
StrSplit::new(s, &format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
- let letters = StrSplit::new(haystack, " ");
+ let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
- assert!(letters.eq(vec!["a", "b", "c", "d", "e"].into_iter()));
+ assert_eq!(letters, vec!["a", "b", "c", "d", "e"]);
}
...
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]
#[derive(Debug)]
pub struct StrSplit<'haystack, 'delimiter>
{
remainder: Option<&'haystack str>,
delimiter: &'delimiter str,
}
impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some(next_delim) = remainder.find(self.delimiter) {
let until_delimiter = &remainder[..next_delim];
*remainder = &remainder[(next_delim + self.delimiter.len())..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
pub fn until_char<'s>(s : &'s str, c: char) -> &'s str
{
StrSplit::new(s, &format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", "e"]);
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
Run the tests again:
$ cargo test
Compiling strsplit v0.1.0 (/home/wilson/CrustOfRust/strsplit)
Finished test [unoptimized + debuginfo] target(s) in 0.28s
Running unittests src/lib.rs (target/debug/deps/strsplit-dd83426d9f98ae71)
running 3 tests
test until_char_test ... ok
test it_works ... ok
test tail ... ok
test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Doc-tests strsplit
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
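Why don't we need to tie the lifetime of the return value to &format!("{}", c)? Because the returned value never points into the delimiter. It only points into the haystack, so there is no reason to bind the two lifetimes together.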
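The same point can be shown in isolation. The sketch below uses a hypothetical free function `before` (not part of the original code) with the same lifetime shape as `StrSplit::new` followed by `next()`. Because the result borrows only from the haystack, the delimiter may be dropped while the result is still in use.

```rust
/// Hypothetical helper: the result borrows only from `haystack`,
/// never from `delimiter`, so the two lifetimes are independent.
pub fn before<'h, 'd>(haystack: &'h str, delimiter: &'d str) -> &'h str {
    match haystack.find(delimiter) {
        Some(index) => &haystack[..index],
        None => haystack,
    }
}

/// The delimiter is a short-lived String that is dropped before the result is returned.
pub fn head_with_temporary_delimiter(haystack: &str) -> &str {
    let head;
    {
        let delimiter = String::from("o");
        head = before(haystack, &delimiter);
    } // `delimiter` is dropped here; `head` is still valid
    head
}
```

If `before` used a single lifetime for both parameters, `head` would not be allowed to outlive `delimiter` and this would be rejected.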
Q: can you put _ for delimiter lifetime to say it's not needed?
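A: Yes. The two signatures can be changed as follows.
'_ : an anonymous lifetime. Here it stands for some unique lifetime, distinct from 'haystack, that we do not need to name.
impl<'haystack> Iterator for StrSplit<'haystack, '_>
In the next signature, '_ in the return type is tied to the lifetime of s, because there is no other input lifetime it could be tied to.
pub fn until_char(s : &str, c: char) -> &'_ str
'_ asks the compiler to infer the lifetime while still making it explicit that a lifetime is there. It can also be left out entirely and the code still compiles:
pub fn until_char(s : &str, c: char) -> &str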
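As a small illustration of the elision rule at work, the three functions below have the same meaning to the compiler. The function names are hypothetical, and the bodies use the standard library's `str::split` instead of StrSplit to keep the sketch short.

```rust
// `char` carries no lifetime, so `s` is the only input lifetime
// the output reference can be tied to in all three signatures.

/// Fully explicit lifetime.
pub fn head_explicit<'s>(s: &'s str, c: char) -> &'s str {
    s.split(c).next().unwrap_or(s)
}

/// Anonymous lifetime: "there is a lifetime here, please infer it".
pub fn head_anonymous(s: &str, c: char) -> &'_ str {
    s.split(c).next().unwrap_or(s)
}

/// Fully elided lifetime.
pub fn head_elided(s: &str, c: char) -> &str {
    s.split(c).next().unwrap_or(s)
}
```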
format!("{}", c) returns a String, so this still performs a heap allocation. Next we get rid of that allocation.
How can we avoid the allocation made by &format!("{}", c)? How can c, instead of being converted to a String, be any type that knows how to find itself in a string?
First, add a new trait:
pub trait Delimiter
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}
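Then do the following:
Replace every 'delimiter lifetime parameter with a generic type D.
In next(), call find_next() on the delimiter instead of calling find() on the remainder.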
#[derive(Debug)]
-pub struct StrSplit<'haystack, 'delimiter>
+pub struct StrSplit<'haystack, D>
{
remainder: Option<&'haystack str>,
- delimiter: &'delimiter str,
+ delimiter: D,
}
-impl<'haystack, 'delimiter> StrSplit<'haystack, 'delimiter> {
+impl<'haystack, D> StrSplit<'haystack, D> {
- pub fn new(haystack: &'haystack str, delimiter: &'delimiter str) -> Self
+ pub fn new(haystack: &'haystack str, delimiter: D) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
-impl<'haystack, 'delimiter> Iterator for StrSplit<'haystack, 'delimiter>
+impl<'haystack, D> Iterator for StrSplit<'haystack, D> where D: Delimiter
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
- if let Some(next_delim) = remainder.find(self.delimiter) {
+ if let Some((delim_start, delim_end)) = self.delimiter.find_next(remainder) {
- let until_delimiter = &remainder[..next_delim];
+ let until_delimiter = &remainder[..delim_start];
- *remainder = &remainder[(next_delim + self.delimiter.len())..];
+ *remainder = &remainder[delim_end..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
...
Next, implement the Delimiter trait for &str:
impl Delimiter for &str
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>
{
s.find(self).map(|start| (start, start + self.len()))
}
}
Then change &format!("{}", c) to &*format!("{}", c) (which has type &str), because Delimiter is so far only implemented for &str.
How to turn a String into a &str
Excerpt from What does &* combined together do in Rust? :
let s = "hi".to_string(); // : String
let a = &s;
What's the type of a? It's simply &String! This shouldn't be very surprising, since we take the reference of a String. Ok, but what about this?
let s = "hi".to_string(); // : String
let b = &*s; // equivalent to `&(*s)`
What's the type of b? It's &str! Wow, what happened?
…
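The reason the explicit `&*` is needed here can be sketched as follows. When a parameter has the concrete type `&str`, a `&String` argument is coerced automatically; when the parameter is a generic `D`, no such coercion takes place, so `D` is inferred as `&String`, which has no `Delimiter` impl. The function `first_match` is a hypothetical stand-in for `StrSplit::new`.

```rust
pub trait Delimiter {
    fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}

impl Delimiter for &str {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        s.find(*self).map(|start| (start, start + self.len()))
    }
}

/// Hypothetical generic function standing in for StrSplit::new.
pub fn first_match<D: Delimiter>(haystack: &str, delimiter: D) -> Option<(usize, usize)> {
    delimiter.find_next(haystack)
}

pub fn first_match_with_owned_delimiter() -> Option<(usize, usize)> {
    let owned: String = String::from("o");
    // `first_match("hello world", &owned)` does not compile here:
    // D would be inferred as &String, which does not implement Delimiter.
    let slice: &str = &*owned; // explicit deref (String -> str), then re-borrow
    first_match("hello world", slice)
}
```

`owned.as_str()` produces the same `&str` and may read more clearly than `&*owned`.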
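After the change above, the code is generic over any D. D can be a reference, or it can be some type that lives as long as it likes. The only requirement is that the type you pass implements the Delimiter trait.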
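To illustrate that D does not have to be a reference, the sketch below adds a hypothetical owned delimiter type, `AnyOf`, which splits on any of several characters. `AnyOf` is not part of the original code; the trait, struct and iterator are repeated from the notes (without `#[derive(Debug)]`) so the block is self-contained.

```rust
pub trait Delimiter {
    fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}

pub struct StrSplit<'haystack, D> {
    remainder: Option<&'haystack str>,
    delimiter: D,
}

impl<'haystack, D> StrSplit<'haystack, D> {
    pub fn new(haystack: &'haystack str, delimiter: D) -> Self {
        Self { remainder: Some(haystack), delimiter }
    }
}

impl<'haystack, D: Delimiter> Iterator for StrSplit<'haystack, D> {
    type Item = &'haystack str;

    fn next(&mut self) -> Option<Self::Item> {
        let remainder = self.remainder.as_mut()?;
        if let Some((start, end)) = self.delimiter.find_next(remainder) {
            let until_delimiter = &remainder[..start];
            *remainder = &remainder[end..];
            Some(until_delimiter)
        } else {
            self.remainder.take()
        }
    }
}

/// Hypothetical owned delimiter: matches any one of the listed characters.
pub struct AnyOf(pub Vec<char>);

impl Delimiter for AnyOf {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        s.char_indices()
            .find(|(_, c)| self.0.contains(c))
            // the end offset uses the byte length of the character that matched
            .map(|(start, c)| (start, start + c.len_utf8()))
    }
}
```

Since `AnyOf` owns its `Vec<char>`, StrSplit carries no lifetime for the delimiter at all; only `'haystack` remains.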
#[derive(Debug)]
pub struct StrSplit<'haystack, D>
{
remainder: Option<&'haystack str>,
delimiter: D,
}
impl<'haystack, D> StrSplit<'haystack, D> {
pub fn new(haystack: &'haystack str, delimiter: D) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
pub trait Delimiter
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}
impl<'haystack, D> Iterator for StrSplit<'haystack, D> where D: Delimiter
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some((delim_start, delim_end)) = self.delimiter.find_next(remainder) {
let until_delimiter = &remainder[..delim_start];
*remainder = &remainder[delim_end..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
impl Delimiter for &str
{
// Here, &self has type &&str
fn find_next(&self, s: &str) -> Option<(usize, usize)>
{
s.find(self).map(|start| (start, start + self.len()))
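/*
What does s.find(self) do?
find is a method on str: given a pattern (here another string slice), it returns the byte index at which the first match starts.
find returns Option<index of the match>.
map(...) passes None through unchanged. For Some, it replaces the value inside with (start, start + self.len()).
Q: why self.len() and not s.len()?
A: self is the delimiter we are searching for, so self.len() is the length of the delimiter. Adding it to start gives the end position of the delimiter we just found, so we return both where it starts and where it ends.
*/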
}
}
pub fn until_char(s : &str, c: char) -> &str
{
StrSplit::new(s, &*format!("{}", c))
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", "e"]);
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
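The `find` + `map` step explained in the comment inside `find_next` can also be tried on its own. This is a minimal sketch; `locate` is a hypothetical name, and it takes the delimiter as a plain `&str` argument instead of `self`.

```rust
/// Returns the byte range (start, end) of the first occurrence of `needle`.
pub fn locate(haystack: &str, needle: &str) -> Option<(usize, usize)> {
    haystack
        .find(needle) // Option<usize>: byte index where the match starts
        .map(|start| (start, start + needle.len())) // None stays None
}
```

For example, `locate("hello world", "lo")` yields `Some((3, 5))`, and slicing the haystack with `[..3]` and `[5..]` gives the text before and after the delimiter.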
Next, implement the Delimiter trait for char:
impl Delimiter for char
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>
{
s.char_indices()
.find(|(_, c)| c == self)
.map(|(start, _)| (start, start + 1))
}
}
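char_indices() : iterates over the string, yielding each character together with its byte offset.
find(...) : searches for the character we are looking for.
map(...) : turns the result of find into the value we want, (start, start + 1). The + 1 assumes the character is one byte long, which only holds for ASCII characters.
Finally, replace StrSplit::new(s, &*format!("{}", c)) with StrSplit::new(s, c), and we have a version that needs no heap allocation.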
#[derive(Debug)]
pub struct StrSplit<'haystack, D>
{
remainder: Option<&'haystack str>,
delimiter: D,
}
impl<'haystack, D> StrSplit<'haystack, D> {
pub fn new(haystack: &'haystack str, delimiter: D) -> Self
{
Self {
remainder: Some(haystack),
delimiter,
}
}
}
pub trait Delimiter
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}
impl<'haystack, D> Iterator for StrSplit<'haystack, D> where D: Delimiter
{
type Item = &'haystack str;
fn next(&mut self) -> Option<Self::Item>
{
let remainder = self.remainder.as_mut()?;
if let Some((delim_start, delim_end)) = self.delimiter.find_next(remainder) {
let until_delimiter = &remainder[..delim_start];
*remainder = &remainder[delim_end..];
Some(until_delimiter)
} else {
self.remainder.take()
}
}
}
impl Delimiter for &str
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>
{
s.find(self).map(|start| (start, start + self.len()))
}
}
impl Delimiter for char
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>
{
s.char_indices()
.find(|(_, c)| c == self)
.map(|(start, _)| (start, start + 1))
}
}
pub fn until_char(s : &str, c: char) -> &str
{
StrSplit::new(s, c)
.next()
.expect("StrSplit always gives at least one result!")
}
#[test]
fn until_char_test()
{
assert_eq!(until_char("hello world", 'o'), "hell");
}
#[test]
fn it_works()
{
let haystack = "a b c d e";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", "e"]);
}
#[test]
fn tail()
{
let haystack = "a b c d ";
let letters: Vec<_> = StrSplit::new(haystack, " ").collect();
assert_eq!(letters, vec!["a", "b", "c", "d", ""]);
}
The 1 in start + 1 can be changed to self.len_utf8(), which is also correct for characters that take more than one byte in UTF-8:
impl Delimiter for char
{
fn find_next(&self, s: &str) -> Option<(usize, usize)>
{
s.char_indices()
.find(|(_, c)| c == self)
.map(|(start, _)| (start, start + self.len_utf8()))
}
}
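Why this matters can be seen with a non-ASCII delimiter. The sketch below uses hypothetical free functions, `find_char` and `rest_after`, rather than the trait impl. With `start + 1`, the end offset for a two-byte character such as U+00E9 would land in the middle of that character, and slicing the string there would panic.

```rust
/// Byte range of the first occurrence of `needle` in `s`.
pub fn find_char(s: &str, needle: char) -> Option<(usize, usize)> {
    s.char_indices()
        .find(|(_, c)| *c == needle)
        // len_utf8() is the number of bytes the char occupies in a str (1 to 4)
        .map(|(start, _)| (start, start + needle.len_utf8()))
}

/// Everything after the first occurrence of `needle`.
pub fn rest_after(s: &str, needle: char) -> Option<&str> {
    find_char(s, needle).map(|(_, end)| &s[end..])
}
```

In the string "a\u{e9}b" the character U+00E9 occupies bytes 1 and 2, so the correct range is (1, 3). An end offset of 2 would not be a character boundary.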
1:25:30
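Everything we implemented today has a more complete counterpart in the standard library:
str::find()
str::split()
In str::split(), the haystack has the lifetime 'a, and the delimiter (here a Pattern, which supports more complex matching) is likewise generic.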
pub fn split<'a, P>(&'a self, pat: P) -> Split<'a, P>
where
P: Pattern<'a>,
Trait std::str::pattern::Pattern
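Everything done here can be achieved by calling the standard library's split() directly. The goal of this session was to explore lifetimes, not to teach the standard library or to publish this code as a crate, since the standard library's implementation is already very complete.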
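For comparison, here is a short sketch of the standard library's `str::split` accepting different pattern types, in the same way StrSplit accepts different Delimiter types. The wrapper function names are hypothetical.

```rust
/// Split on a string slice pattern.
pub fn split_by_str(s: &str) -> Vec<&str> {
    s.split(" ").collect()
}

/// Split on a single character pattern.
pub fn split_by_char(s: &str) -> Vec<&str> {
    s.split(' ').collect()
}

/// Split on a closure pattern, something StrSplit does not support as written.
pub fn split_by_predicate(s: &str) -> Vec<&str> {
    s.split(|c: char| c.is_whitespace()).collect()
}
```

All three give the same result as the `tail` test above for the input "a b c d ", including the trailing empty string.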
Q: Why can't you create a String from the str fat pointer? You already know where the bytes are in memory and the length of it.
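A: Because you do not own the memory that the str fat pointer points to. String assumes it owns the underlying memory: it assumes it must free that memory when it is dropped, and that it may grow or shrink the allocation when needed. Taking an arbitrary pointer and length and deciding that you now own them would be incorrect, because you do not have ownership of the underlying memory.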
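The safe way to get a String from a &str is therefore to copy the bytes into a fresh allocation that the String does own. A minimal sketch, with a hypothetical function name:

```rust
/// Creates an owned String by allocating a new buffer and copying the bytes.
/// The original `&str` and the memory it points to are left untouched.
pub fn to_owned_string(borrowed: &str) -> String {
    borrowed.to_owned()
}
```

The new String has the same contents as the slice but lives at a different address, so dropping or growing it cannot affect whoever owns the original memory.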
Q: Don't you think Rust is kind of less readable than other languages like Go, Python? the syntax is kind of different I guess?
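A: Not really. If you stick to the features that other languages also have, Rust is just as readable. Rust adds some extra features that require extra syntax, and that extra syntax is what lets Rust do things other languages cannot.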
Q: The pattern and the haystack seem to be sharing the same lifetime 'a
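A: Because the designers had not yet worked out what a separate lifetime should look like, so the two were made the same for now. Addendum: the lifetime on Searcher is the lifetime of the string that the Pattern is searching in.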
Q: when you see something like Type<'x>, how do you know what x is the lifetime of?
A: You cannot tell what 'x is the lifetime of, just as seeing a type parameter T does not tell you what T is.
Q: what do you think of Rust having a future in the industry?
A: Please refer to my earlier talks.
Q: could you publish this as a gist so we can play with it?
A: github repo
Q: Can you make it work with stdin() as an input instead of a &str? :)
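A: That is not easy to implement, because stdin is a stream rather than a fixed buffer, so you cannot seek within it. You are welcome to try implementing it yourself, though.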
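One simple workaround, which is not the streaming splitter the question asks for, is to buffer the whole input first and split the buffer afterwards. The sketch below assumes that approach; `split_reader` is a hypothetical name, and it returns owned Strings because the buffer does not outlive the function.

```rust
use std::io::{self, Read};

/// Reads everything from `reader` into memory, then splits it on `delimiter`.
/// This trades memory for simplicity and does not work on endless streams.
pub fn split_reader<R: Read>(mut reader: R, delimiter: char) -> io::Result<Vec<String>> {
    let mut buffer = String::new();
    reader.read_to_string(&mut buffer)?;
    Ok(buffer.split(delimiter).map(str::to_owned).collect())
}
```

Any `Read` implementor can be passed in, for example `std::io::stdin().lock()` or, for testing, a byte slice such as `"a b c".as_bytes()`.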
Q: How do you think generic associates types will improve trait definitions? (aka think StreamIterator that allows complex iterators with borrowed items)
A: Not much, because what you need more of is existential types. It can help reduce the amount of cloning, though.
Q: do you intend to do some lectures for newcomers to Rust from other languages and if not, is there some resources/streamers that you would recommend?
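A: There are no plans to make videos for newcomers. Every video in this series focuses on a specific topic, and the target audience is intermediate Rust programmers.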
GitHub Comment
Q : Sorry if this was already asked elsewhere, but I am still not sure why:
if let Some(ref mut remainder) = self.remainder { // (1) ok
let ref mut remainder = self.remainder?; // (2) nok
if let Some(remainder) = &mut self.remainder { // (3) ok
let remainder = &mut self.remainder?; // (4) nok
if let Some(remainder) = self.remainder.as_mut() { // ok
let remainder = self.remainder.as_mut()?; // ok
I see that if a type is Copy, it gets copied instead of moved. But aren't we dealing with same types in 1 vs 2, 3 vs 4?
A : The ? in 2 and 4 copies the entire Option when the inner type is Copy. What we then take a mutable reference to is what is inside of that copy, not the original Option. In the last case, we turn Option<T> into Option<&mut T>, which, when copied, still yields a mutable reference into the original Option. Does that help?
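The difference between form (4) and the `as_mut()` form can be observed directly. The sketch below uses a hypothetical struct `Holder` with the same `remainder` field as StrSplit. Both methods compile, but only the second one changes the field.

```rust
pub struct Holder<'a> {
    pub remainder: Option<&'a str>,
}

impl<'a> Holder<'a> {
    /// Form (4): `?` binds tighter than `&mut`, so this is `&mut (self.remainder?)`.
    /// Option<&str> is Copy, so the mutable reference points at a temporary copy.
    pub fn advance_on_copy(&mut self) -> Option<()> {
        let remainder = &mut self.remainder?;
        *remainder = &remainder[1..]; // updates the copy only
        Some(())
    }

    /// as_mut() turns &mut Option<&str> into Option<&mut &str>,
    /// so the mutable reference points into `self.remainder` itself.
    pub fn advance_in_place(&mut self) -> Option<()> {
        let remainder = self.remainder.as_mut()?;
        *remainder = &remainder[1..]; // updates the field
        Some(())
    }
}
```

Starting from `Some("abc")`, calling `advance_on_copy` leaves the field unchanged, while `advance_in_place` moves it on to `Some("bc")`.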
Q : Hey @jonhoo, how common is it to see or do:
impl Trait for &Type {...} //?
// such as
impl Delimiter for &str {...}
Additionally, if you intend to use both Type and &Type as implementers of Trait, for instance:
let x1: Vec<_> = StrSplit::new(haystack, MyOtherDelimiterFlavor::new(...)).collect();
let x2: Vec<_> = StrSplit::new(haystack, &MyOtherDelimiterFlavor::new(...)).collect();
would one then need to write redundant impl blocks, or is there some syntactic sugar for controlling this?
A : It's actually fairly common, precisely to improve the ergonomics of using the trait as you indicate. In general, if you implement a trait that only takes &self it's pretty reasonable to implement the trait for &MyType, &mut MyType and Box<MyType>. If a trait method takes &mut self, skip &MyType. But of course, the downside is that now if the trait changes in the future, more breakage will ensue since your consumers expected to be able to transparently use & (or &mut).
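One way to avoid writing a separate impl block per reference type is a blanket impl over `&T`. This is a sketch of that alternative and not the approach taken in the notes: it implements Delimiter for `str` and `char` and lets the blanket impl cover `&str`, `&char` and any other `&T`. The function `first_match` is a hypothetical stand-in for `StrSplit::new`.

```rust
pub trait Delimiter {
    fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}

impl Delimiter for str {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        s.find(self).map(|start| (start, start + self.len()))
    }
}

impl Delimiter for char {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        s.char_indices()
            .find(|(_, c)| c == self)
            .map(|(start, _)| (start, start + self.len_utf8()))
    }
}

/// Blanket impl: any shared reference to a Delimiter is itself a Delimiter.
/// `?Sized` is needed so that T can be the unsized type `str`.
impl<T: Delimiter + ?Sized> Delimiter for &T {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        (**self).find_next(s)
    }
}

/// Hypothetical generic consumer standing in for StrSplit::new.
pub fn first_match<D: Delimiter>(haystack: &str, delimiter: D) -> Option<(usize, usize)> {
    delimiter.find_next(haystack)
}
```

With this in place, `first_match` accepts `" "`, `' '` and `&' '` alike. The trade-off mentioned in the answer still applies: callers come to rely on the reference impl, so changing the trait later breaks more code.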
Is the &'a str inside the Option a Copy type?