Practical Go: Real world advice for writing maintainable Go programs

---
tags: golang
slideOptions:
  transition: slide
---

# Practical Go: Real world advice for writing maintainable Go programs

> [original english version](https://dave.cheney.net/practical-go/presentations/qcon-china.html)
> [Simplified Chinese version](https://github.com/llitfkitfk/go-best-practice)

:::warning
Long read ahead; it may make you sleepy, so stay alert.
:::

---

# Table of contents

1. Guiding principles
2. Identifiers
3. Comments
4. Package Design
5. Project Structure
6. API Design
7. Error handling
8. Concurrency

---

> $Software\ engineering=Programming\ +\ Time\ +\ other\ programmers$
> [name=Go team lead, [Russ Cox](https://twitter.com/_rsc)]

---

# 1. Guiding principles

> The guiding principles underlying Go itself

1. Simplicity
2. Readability
3. Productivity

> The joke goes that Go was designed while waiting for a C++ program to compile.

---

# 2. Identifiers

> An identifier is a fancy word for a **name**; the name of a variable, the name of a function, the name of a method, the name of a type, the name of a package, and so on.

## 2.1 Choose identifiers for clarity, not brevity

> Good naming is like a good joke. If you have to explain it, it's not funny.
> [name=[Dave Cheney](https://twitter.com/davecheney/status/997155238929842176)]

- **A good name is concise.**
- **A good name is descriptive.**
    - For an ordinary variable, describe its **meaning at the application level**.
    - For the output of a function, method, or behaviour, describe its **result**.
    - For a package, describe its **purpose**.
- **A good name should be predictable.**
    - On seeing the name, you know how to use it and you won't misuse it.

The examples below follow these principles.

## 2.2. Identifier length

**Principles:**

1. The closer an identifier's **declaration** is to its **last use**, the shorter the name can be; the further apart, the longer the name.
2. **Don't** include the data type in the name.
3. **Constants** should be named for the value they hold (a noun), not for how that value is used (no verb).
4. Prefer **single letters** for variables in **loops and `if` statements**.
5. Prefer **single words** for **parameters, return values, methods, interfaces, and packages**.
6. When should you use **multiple words**? For **functions and package-level declarations**.
7. Don't mix long and short names on the same line, i.e. **whole words** next to **single letters**.

**Example:**

```go=
type Person struct {
	Name string
	Age  int
}

// AverageAge returns the average age of people.
func AverageAge(people []Person) int {
	if len(people) == 0 {
		return 0
	}

	var count, sum int
	for _, p := range people {
		sum += p.Age
		count += 1
	}

	return sum / count
}
```

- `p` is declared in the `range` clause and used only on the next line, so a single letter is enough.
- `people`, `count`, and `sum` live for many lines, so they get easily recognisable names.
- Use blank lines to break up the flow of a function.

### 2.2.1 Context is key

```go=
for index := 0; index < len(s); index++ {
	//
}
```

**vs.**

```go=
for i := 0; i < len(s); i++ {
	//
}
```

The first version is no more readable than the second, so avoid this kind of redundant naming.

## 2.3 Don't name your variables for their types

> You don't name your pets "dog" and "cat"

**Bad example 1**

```go=
var usersMap map[string]*User
```

> *Go is a statically typed language.*
> If `users` still doesn't feel descriptive enough, then `usersMap` won't be either.

**Bad example 2**

```go=
type Config struct {
	//
}

func WriteConfig(w io.Writer, config *Config)
```

- The `Config` in `WriteConfig` is redundant; plain `Write` would do.
- The parameter `config` can simply be `conf` or `c`, which also avoids mixing long and short names.

## 2.4. Use a consistent naming style

- Another requirement for predictable names is that **your own naming style stays consistent**.
    - If a `*sql.DB` value appears throughout your code with the same meaning, don't give it a different name each time.
        - e.g. `d *sql.DB`, `dbase *sql.DB`, `DB *sql.DB`, or `database *sql.DB`
        - Stick to one name instead: **`db *sql.DB`**
- **Conventional variable names**
    - loop induction variable - **`i`**, **`j`**, **`k`**
        > If you find your nested loops running out of i, j, k, your function needs to be refactored!
    - counter or accumulator - **`n`**
    - value in a generic encoding function - **`v`** (what exactly is a "generic encoding function"?)
    - key of a map - **`k`**
    - shorthand for parameters of type `string` - **`s`**

## 2.5. Use a consistent declaration style

**Go has at least the following five ways to declare a variable and assign it a value:**

```go=
var x int = 1
var x = 1
var x int; x = 1
var x = int(1)
x := 1
```

**Which style should you use?**

> **TL;DR**
> - When declaring a variable that will be assigned later, use **`var`**.
> - When declaring a variable whose initial value is already known, use **`:=`**.

**Are there exceptions?**

> **When something is complicated, it should look complicated.**

```go=
var length uint32 = 0x80
```

- In one of your packages, breaking the convention like this can be a hint to the reader: in this package `length` is a `uint32`!

## 2.6. ***Be a team player***

- You will certainly work on many projects where you are not the sole author, so learn to fit in with the team's style.
- Don't change a team's existing style lightly; **don't add chaos**. As long as the code passes `gofmt`, leave the team's style alone.

---

# 3. Comments

**Comments** play a pivotal role when writing Go.

> Good code has lots of comments, bad code requires lots of comments.
> [name=Dave Thomas and Andrew Hunt, The Pragmatic Programmer]

A meaningful comment answers one of the following questions:

1. **What** - what is this code meant to do?
2. **How** - how does it do it?
3. **Why** - why does it do it?

**What** belongs on ***public symbols***:

```go
// Open opens the named file for reading.
// If successful, methods on the returned file can be used for reading.
```

**How** suits comments inside a method:

```go
// queue all dependant actions
var results []chan error
for _, dep := range a.Deps {
	results = append(results, execute(seen, dep))
}
```

**Why** explains "external factors" that cannot be worked out quickly from the surrounding code. In the example below `HealthyPanicThreshold` is set to `0`; without the comment (`// Disable HealthyPanicThreshold`) you would not know what `0` means.

```go
return &v2.Cluster_CommonLbConfig{
	// Disable HealthyPanicThreshold
	HealthyPanicThreshold: &envoy_type.Percent{
		Value: 0,
	},
}
```

## 3.1 Comments on variables and constants should describe their ***contents*** not their ***purpose***

**Examples:**

```go
const randomNumber = 6 // determined from an unbiased die
```

```go
const (
	StatusContinue           = 100 // RFC 7231, 6.2.1
	StatusSwitchingProtocols = 101 // RFC 7231, 6.2.2
	StatusProcessing         = 102 // RFC 2518, 10.1

	StatusOK = 200 // RFC 7231, 6.3.1
	// ...
)
```

**Are there exceptions?** When a variable has no explicit initial value, the comment should say who is responsible for maintaining its state:

```go
// sizeCalculationDisabled indicates whether it is safe
// to calculate Types' widths and alignments. See dowidth.
var sizeCalculationDisabled bool
```

If a better name is available, though, the comment becomes unnecessary. **e.g.**

```go
// registry of SQL drivers
var registry = make(map[string]*sql.Driver)
```

A better name says what the registry holds:

```go
var sqlDrivers = make(map[string]*sql.Driver)
```

## 3.2 **Always** document public symbols

According to the Google Style Guide:

- Any **public function** that is not both short and obvious should be commented.
- Any function in a library/package should be commented, regardless of length or complexity.

> Er, doesn't that mean commenting every function and method? Take the comments in the `io` package as an example:

```go!
// LimitReader returns a Reader that reads from r
// but stops with EOF after n bytes.
// The underlying implementation is a *LimitedReader.
func LimitReader(r Reader, n int64) Reader { return &LimitedReader{r, n} }

// A LimitedReader reads from R but limits the amount of
// data returned to just N bytes. Each call to Read
// updates N to reflect the new amount remaining.
// Read returns EOF when N <= 0 or when the underlying R returns EOF.
type LimitedReader struct {
	R Reader // underlying reader
	N int64  // max bytes remaining
}

func (l *LimitedReader) Read(p []byte) (n int, err error) {
	if l.N <= 0 {
		return 0, EOF
	}
	if int64(len(p)) > l.N {
		p = p[0:l.N]
	}
	n, err = l.R.Read(p)
	l.N -= int64(n)
	return
}
```

> Why does `Read` need no comment? Because this interface method is already documented elsewhere in the package, and its behaviour is self-evident once you have read the comments on `LimitReader` and `LimitedReader`.

> **Reminder:** `Reader` is an **interface**
> ```go!
> // Reader is the interface that wraps the basic Read method.
> ...
> // Implementations must not retain p.
> type Reader interface {
> 	Read(p []byte) (n int, err error)
> }
> ```

So when should you **not** comment?

### 3.2.0 Don't document methods that implement an interface

A comment that only says "A implements interface B" says nothing. Don't write comments like this one:

```go
// Read implements the io.Reader interface
func (r *FileReader) Read(buf []byte) (int, error)
```

### 3.2.1. Don't comment bad code, rewrite it

> Don't comment bad code, rewrite it
> [name=Brian Kernighan]

That is the principle, but schedule pressure often forces you to leave a piece of code carrying technical debt in place and merely comment it. In that case, also add a `TODO` and a `username` so that readers take note.

```go!
// TODO(John) this is O(N^2), find a faster way to do this.
```

Here `John` is not necessarily the implementer, but is at least the best person to ask about the context.

### 3.2.2. Rather than commenting a block of code, refactor it

> **Good code** is its own **best documentation.**
> As you're about to add a comment, ask yourself:
>> **'How can I improve the code so that this comment isn't needed?'**
>
> Improve the code and then document it to make it even clearer.
> [name=Steve McConnell]

According to the [single-responsibility principle](https://en.wikipedia.org/wiki/Single-responsibility_principle), the S in [SOLID](https://en.wikipedia.org/wiki/SOLID), a function should do one thing well. When you break this principle you will find yourself needing comments in many places inside the function. When you stick to it, functions tend to be small, independent, and easy to test, and at that point the function usually expresses its intent without extra comments.

---

# 4. Package Design

> **Write shy code** - modules that **don't reveal anything unnecessary** to other modules and that **don't rely on other modules' implementations.**
> [name=[Dave Thomas](https://twitter.com/codewisdom/status/1045305561317888000?s=12)]

A good Go package should work to reduce its coupling with other source code during development, and should not expose code that is of no use to others.

## 4.1 A good package starts with its name

A good package name should try to answer this question:

> **What service does this package provide?**
>> *Name your package for **what it provides**, **not what it contains.***

### 4.1.1 Good package names should be unique.

If two of your packages have names that are too similar or identical, the likely causes are:

1. The name is too generic; rename it.
2. The functionality overlaps; review your architecture, then trim or merge the packages.

## 4.2 Avoid package names like `base`, `common`, or `util`

Inevitably there will be utility or helper code that several packages are expected to share, and you may be tempted to gather it into packages named `utils` or `helpers`.

**The advice is not to do this.** Instead, **allow the code to be duplicated**: let each calling package own its copy of these functions, so that their use and purpose stay obvious.

> [A little] duplication is far cheaper than the wrong abstraction.
> [name=Sandi Metz]

## 4.3 Return early rather than nesting deeply

**Go has no try/catch/exception mechanism**, so there is no need to build a layered structure just to place a try/catch block at the top. The recommended pattern is to **return as soon as something goes wrong**, a technique known as ***guard clauses***.

**Example:** the `bytes` package uses guard clauses:

```go!
func (b *Buffer) UnreadRune() error {
	if b.lastRead <= opInvalid {
		return errors.New("bytes.Buffer: UnreadRune: previous operation was not a successful ReadRune")
	}
	if b.off >= int(b.lastRead) {
		b.off -= int(b.lastRead)
	}
	b.lastRead = opInvalid
	return nil
}
```

The same function ***without a guard clause***:

```go!
func (b *Buffer) UnreadRune() error {
	if b.lastRead > opInvalid {
		if b.off >= int(b.lastRead) {
			b.off -= int(b.lastRead)
		}
		b.lastRead = opInvalid
		return nil
	}
	return errors.New("bytes.Buffer: UnreadRune: previous operation was not a successful ReadRune")
}
```

As a reader you can feel the extra cognitive load, and this style really is more prone to bugs.

## 4.4 Make the zero value useful

Go gives every type a ***zero value***: the value a variable holds when you **declare** it without an **explicit initialisation**.

- For numeric types the *zero value* is `0`.
- For pointer types the *zero value* is `nil`; the same holds for slices, maps, and channels.

For the sake of correctness, always either give your variables a **useful initial value** or **make the zero value useful in the design of your package/struct**.

**Examples:**

**`sync.Mutex`**

![sync.Mutex example](https://i.imgur.com/IpL44ie.png)

**`bytes.Buffer`**

![bytes.Buffer example](https://i.imgur.com/SqoV3IL.png)

Now look at how a slice is defined in the [runtime](https://golang.org/src/runtime/slice.go):

```go
type slice struct {
	array *[...]T // pointer to the underlying array
	len   int
	cap   int
}
```

A slice that is only declared has every field at its own *zero value*, so you can start using it straight after the declaration without error:

```go=
var s []string // just declaration

s = append(s, "Hello")
s = append(s, "world")
fmt.Println(strings.Join(s, " "))
```

> You can still observe the difference between a declared-only slice and an initialised one:
> ```go=
> var s1 = []string{} // same as s1 := make([]string, 0)
> var s2 []string
> if s1 != nil {
> 	fmt.Println("s1 is not nil")
> }
> if s2 == nil {
> 	fmt.Println("s2 is nil")
> }
> fmt.Println(reflect.DeepEqual(s1, s2)) // false
> ```

Another benefit of pointer types having a `nil` *zero value* is that you can still call methods on a `nil` pointer, and decide inside the method what to do when the receiver is `nil`:

```go=
type Config struct {
	path string
}

func (c *Config) Path() string {
	if c == nil {
		return "/usr/home"
	}
	return c.path
}

func main() {
	var c1 *Config
	var c2 = &Config{
		path: "/export",
	}
	fmt.Println(c1.Path(), c2.Path())
	// will print: /usr/home /export
}
```

## 4.5 Avoid package level state

A maintainable program is as **loosely coupled** as possible, so that changing one package does not affect another.

**Guidelines:**

1. Use an `interface` to describe the functions/methods you need.
2. Avoid **global state**.

When you declare a package-level **variable whose name starts with an upper-case letter**, it becomes a **global variable for the whole program**: it is **visible at all times** and can be modified from anywhere. This introduces tight coupling between parts of the program that were otherwise independent. To reduce that coupling, the suggested approach is:

1. Wrap the variable in a `struct`.
2. Use an `interface` to define the behaviours that may operate on it.

---

# 5. Project Structure

Several **packages** grouped together are referred to below as a **project** or **module**, stored in a single git repository. As with packages, the name of each project must make its purpose clear to the reader.

## 5.1. Consider fewer, larger packages

Unlike other languages, Go does not offer much syntax for controlling visibility:

- Java: `public`, `protected`, `private`
- C++: `friend` class

Go keeps it simple. It only distinguishes **public** from **private**, and it does so by **whether the first letter is upper case**. Anything public, i.e. starting with an upper-case letter, can be accessed by other code.

> Some people say ***exported*** and ***not exported*** instead.

### 5.1.1. Arrange code into files by *import* statements

What rule should you follow when arranging code into files? Use the **`import` statements** as a hint.

**Guidelines:**

1. Use a **noun** as the package name.
2. When you start a package, create a directory with the same name and put a Go file with the same name inside it.
    - e.g. to write `package http`, create an `http` directory and write the code in `http.go` inside it.
3. As the package grows, split the different responsibilities into separate files. e.g.
    - put the `Client` type in `client.go`
    - put the `Server` type in `server.go`
    - put the `Request` and `Response` types in `messages.go`
4. As development goes on, watch for **files whose `import` statements are similar**: that suggests they should be merged, with the genuinely different parts extracted.

### 5.1.2. Prefer internal tests to external tests

- [internal tests vs. external tests](https://dev.to/julianchu/go-internal-vs-external-testing-27hg)
- Use internal tests for unit tests,
    - because they can test everything, including private functions and methods.
- For `Example` functions, use external tests; this makes them easier to read and copy later in [godoc](https://godoc.org/).

### 5.1.3. Use `internal` packages to reduce your public API surface

What can you do if you **don't want your public APIs to be too public**, and only want certain packages to see them?

> Put such packages under `internal/`.

For example, if your package lives at `…/a/b/c/internal/d/e/f`, only code in the directory tree rooted at `…/a/b/c` can import it; **`…/a/b/g` and every other project cannot**.

## 5.2. Keep package main as small as possible

`main` usually acts as a singleton: there is **only one of it** in the whole program. Because `main` runs only once per program, writing tests for it is very hard, so business logic must be moved out of `main`.

**So what is `main` usually responsible for?**

1. parse [flags](https://golang.org/pkg/flag/)
2. open connections to databases
3. open loggers

It then hands these objects over to something else to do the work.

---

# 6. API Design

This section contains the most important **design advice**. Everything before it was a **soft suggestion**: if you ignore it, the consequences mostly affect only you and raise no backward-compatibility problems. A (public) API is different. If you are not rigorous from the start, every later change will affect your users.

## 6.1. Design APIs that are hard to misuse.

> APIs should be easy to use and hard to misuse.
> [name=[Josh Bloch](https://www.infoq.com/articles/API-Design-Joshua-Bloch/)]

### 6.1.1. Be wary of functions which take several parameters of the same type

> ***TL;DR***
> APIs with multiple parameters of the same type are hard to use correctly.

One trait that makes a function easy to misuse is taking **several parameters of the same type**. How so, you might ask?

**Example:**

```go=
func Max(a, b int) int
func CopyFile(to, from string) error
```

`Max` is commutative, so the order of the arguments doesn't matter:

```go=
Max(8, 10) // 10
Max(10, 8) // 10
```

With `CopyFile`, however, can you always tell the **source** from the **destination** in the calls below without peeking at the definition or the comments?

```go
CopyFile("/tmp/backup", "presentation.md")
CopyFile("presentation.md", "/tmp/backup")
```

For the `CopyFile` case, here is one possible solution, offered as a starting point for discussion:

```go=
type Source string

func (src Source) CopyTo(dest string) error {
	return CopyFile(dest, string(src))
}

func main() {
	var from Source = "presentation.md"
	from.CopyTo("/tmp/backup")
}
```

This solution uses a helper type (`Source`) to restrict what the caller can do, so that `CopyFile` is always called correctly. Unit tests become easier to write, and `CopyFile` can even be made private to prevent misuse.

## 6.2. Design APIs for their default use case

### 6.2.0 Functional options for friendly APIs

> References:
> - [functional options](https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis)
> - [another Chinese explanation](https://blog.csdn.net/liyunlong41/article/details/89048382)

Dave Cheney argues that an API should be very easy to use, especially in the **default case**. In other words, you should not force users to supply parameters they don't care about.

> Go has no [default arguments](https://en.wikipedia.org/wiki/Default_argument). A 2017 [issue](https://github.com/golang/go/issues/21909) briefly discussed this.
>> For example, in Python:
>> ```python
>> def connect(host="127.0.0.1", port=5432):
>>     # do something
>>     pass
>> ```

So what are functional options? The requirements are:

1. As a caller, I don't want to pass a pile of parameters, and I don't want to pass `nil` when I want the defaults.
2. I don't want to write a pile of `if conf == nil {}` checks to handle defaults.

**Example:**

```go=
package main

import (
	"errors"
	"fmt"
)

type dbConfig struct {
	Host  string
	Port  int
	Table string
}

type client struct {
	*dbConfig
}

func newConnect(conf *dbConfig) (*client, error) {
	// do some necessary process
	return &client{dbConfig: conf}, nil
}

type dbConfigOption func(c *dbConfig) error

func setHost(host string) dbConfigOption {
	return func(c *dbConfig) error {
		if c == nil {
			return errors.New("given config is nil")
		}
		c.Host = host
		return nil
	}
}

func setPort(port int) dbConfigOption {
	return func(c *dbConfig) error {
		if c == nil {
			return errors.New("given config is nil")
		}
		c.Port = port
		return nil
	}
}

func setTable(table string) dbConfigOption {
	return func(c *dbConfig) error {
		if c == nil {
			return errors.New("given config is nil")
		}
		c.Table = table
		return nil
	}
}

func wrapConnect(options ...dbConfigOption) (*client, error) {
	// default configuration
	conf := &dbConfig{
		Host:  "127.0.0.1",
		Port:  5432,
		Table: "tx",
	}
	for _, f := range options {
		err := f(conf)
		if err != nil {
			// do some error handling
			fmt.Println(err)
		}
	}
	return newConnect(conf)
}

func main() {
	db, err := wrapConnect(setHost("192.168.1.130"), setPort(1234))
	// db, err := wrapConnect(setHost("192.168.1.130"))
	// db, err := wrapConnect()
	fmt.Println(db.Host)
	fmt.Println(db.Port)
	fmt.Println(db.Table)
	fmt.Println(err)
}
```

### 6.2.1. Discourage the use of nil as a parameter

> **TL;DR**
> Don't let users pass `nil` as a way of "choosing the defaults"; have them create a default object and pass that instead.

### 6.2.2. Prefer var args to []T parameters

Sooner or later you will need a design like this:

```go
func ShutdownVMs(ids []string) error
```

Because the parameter is a slice, callers may pass an **empty slice** or **`nil`**, which creates an **extra testing burden**. As another example, the following snippet needs refactoring:

```go!
if svc.MaxConnections > 0 || svc.MaxPendingRequests > 0 || svc.MaxRequests > 0 || svc.MaxRetries > 0 {
	// apply the non zero parameters
}
```

Since the `if` statement keeps growing, the logic is extracted into a function:

```go!
// anyPositive indicates if any value is greater than zero.
func anyPositive(values ...int) bool {
	for _, v := range values {
		if v > 0 {
			return true
		}
	}
	return false
}

if anyPositive(svc.MaxConnections, svc.MaxPendingRequests, svc.MaxRequests, svc.MaxRetries) {
	// apply the non zero parameters
}
```

But could `anyPositive` be misused?

```go
if anyPositive() { ... }
```

This call still returns `false`, so it does little harm here. **But what if such a function returned `true`?** The recommended practice is to force the caller to pass at least one argument, which avoids misuse and surprising results:

```go
// anyPositive indicates if any value is greater than zero.
func anyPositive(first int, rest ...int) bool {
	if first > 0 {
		return true
	}
	for _, v := range rest {
		if v > 0 {
			return true
		}
	}
	return false
}
```

## 6.3. Let functions define the behaviour they require

> **TL;DR**
> Pass a function only the information it needs; that makes it clearer how to use it.

Here is a requirement that also shows how to refactor step by step. I want to save some content to a file, and I could write:

```go
// Save writes the contents of doc to the file f.
func Save(f *os.File, doc *Document) error
```

The problem is that `*os.File` defines many methods that have nothing to do with `Save`. It would be better if the signature itself spelled out the required behaviour, so we can rewrite it as:

```go
// Save writes the contents of doc to the supplied
// ReadWriteCloser.
func Save(rwc io.ReadWriteCloser, doc *Document) error
```

Considering the single-responsibility principle, a function with this name should not need to read, so refactor again:

```go
// Save writes the contents of doc to the supplied
// WriteCloser.
func Save(wc io.WriteCloser, doc *Document) error
```

But do we really want to close the file descriptor immediately after every write? We may need to keep writing other content, so the best practice is actually:

```go
// Save writes the contents of doc to the supplied
// Writer.
func Save(w io.Writer, doc *Document) error
```

The rule followed above is known as the **interface segregation principle**.

---

# 7. Error handling

The author has written a lot about error handling; skim these articles when you have time:

- [Inspecting errors](https://dave.cheney.net/2014/12/24/inspecting-errors)
- [Constant errors](https://dave.cheney.net/2016/04/07/constant-errors)
- [Don't just check errors, handle them gracefully](https://dave.cheney.net/2016/04/27/dont-just-check-errors-handle-them-gracefully)

The material below does not repeat those articles.

## 7.1. Eliminate error handling by eliminating errors

> **TL;DR**
> The point is not to **suppress** errors, but to refactor your code so that **you don't have to handle the error yourself**.

### 7.1.1. Counting lines

**Example:** to count the lines in a file, a first attempt might look like this:

```go=
func CountLines(r io.Reader) (int, error) {
	var (
		br    = bufio.NewReader(r)
		lines int
		err   error
	)

	for {
		_, err = br.ReadString('\n')
		lines++
		if err != nil {
			break
		}
	}

	if err != io.EOF {
		return 0, err
	}
	return lines, nil
}
```

It looks fine, and it satisfies the interface segregation principle from earlier. But the logic on lines 8 ~ 14 is a little odd: why does `lines++` come before the error check?

> This is dictated by how `ReadString` behaves: if it reaches end-of-file (`io.EOF`) before it finds a newline character, it returns an error.
>
> The code is written this way so that a file with no newline character at all still counts as one line.

You could of course rework the logic inside this function and, after adding all sorts of error handling, get rid of the oddity. But then the function becomes hard to read and its original intent gets lost. Once you know Go a little better, you will find a better way to write it:

```go
func CountLines(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	lines := 0

	for sc.Scan() {
		lines++
	}
	return lines, sc.Err()
}
```

When `sc.Scan()` returns `true`, the scanner has matched a line of text without error, so `lines++` is correct; when it returns `false`, it has hit an error or EOF, and the loop exits. `sc.Err()` holds the first error encountered, and if that error is `io.EOF` it converts it to `nil` for us.

> `bufio.Scanner` can scan for any pattern; by default it scans for newlines.

### 7.1.2. WriteResponse

This example is inspired by [go blog - Errors are values](https://blog.golang.org/errors-are-values). If you are writing very low-level logic, you really do have to handle errors by hand in your implementation, and the code fills up with repeated error handling. What can you do about it in practice?

**Example:** an HTTP server needs to construct an HTTP response:

```go=
type Header struct {
	Key, Value string
}

type Status struct {
	Code   int
	Reason string
}

func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
	_, err := fmt.Fprintf(w, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)
	if err != nil {
		return err
	}

	for _, h := range headers {
		_, err := fmt.Fprintf(w, "%s: %s\r\n", h.Key, h.Value)
		if err != nil {
			return err
		}
	}

	if _, err := fmt.Fprint(w, "\r\n"); err != nil {
		return err
	}

	_, err = io.Copy(w, body)
	return err
}
```

The function writes the parts of the response in order, and if any step fails it abandons the rest and returns the error. The problem is that there are **too many repetitive error checks**, and the final lines 27~28 are ugly as well. Creating a **small wrapper type, `errWriter`,** helps a lot here. The refactored code:

```go
type errWriter struct {
	io.Writer
	err error
}

func (e *errWriter) Write(buf []byte) (int, error) {
	if e.err != nil {
		return 0, e.err
	}

	var n int
	n, e.err = e.Writer.Write(buf)
	return n, nil
}

func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
	ew := &errWriter{Writer: w}
	fmt.Fprintf(ew, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)

	for _, h := range headers {
		fmt.Fprintf(ew, "%s: %s\r\n", h.Key, h.Value)
	}

	fmt.Fprint(ew, "\r\n")
	io.Copy(ew, body)

	return ew.err
}
```

Once a write has failed, every later call to `errWriter.Write` discards its input and returns the earlier error. The logic of `WriteResponse` becomes much clearer, and at the end it only needs to return `ew.err`.

## 7.2. Only handle an error once

> **TL;DR**
> Use [github.com/pkg/errors](https://godoc.org/github.com/pkg/errors) to wrap errors, so that your error reports become wonderful [K&D](https://www.gopl.io/) style errors.
>> - Get the error wrapper with `go get github.com/pkg/errors`
>> - [The Go Programming Language](https://github.com/KeKe-Li/book/blob/master/Go/The.Go.Programming.Language.pdf)
>> *Alan A. A. **D**onovan, Google Inc.*
>> *Brian W. **K**ernighan, Princeton University*

Go code very often handles errors like this:

```go
if err != nil {
	// print some error message
	return err
}
```

What trouble does that cause? Consider code with the following layers:

```go=
func WriteAll(w io.Writer, buf []byte) error {
	_, err := w.Write(buf)
	if err != nil {
		log.Println("unable to write:", err) // annotated error goes to log file
		return err                           // unannotated error returned to caller
	}
	return nil
}

func WriteConfig(w io.Writer, conf *Config) error {
	buf, err := json.Marshal(conf)
	if err != nil {
		log.Printf("could not marshal config: %v", err)
		return err
	}
	if err := WriteAll(w, buf); err != nil {
		log.Printf("could not write config: %v", err)
		return err
	}
	return nil
}
```

On failure you can expect it to print the following:

```shell
unable to write: io.EOF
could not write config: io.EOF
```

But at the top level of the program you receive only the original error and have lost the **error context**:

```go
err := WriteConfig(f, &conf)
fmt.Println(err) // io.EOF
```

So we need a wrapper that reports, alongside the original error, the error context added at every layer. That is what the `github.com/pkg/errors` package provides. Here is the rewritten example:

```go=
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"

	"github.com/pkg/errors"
)

func ReadFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, errors.Wrap(err, "open failed")
	}
	defer f.Close()

	buf, err := ioutil.ReadAll(f)
	if err != nil {
		return nil, errors.Wrap(err, "read failed")
	}
	return buf, nil
}

func ReadConfig() ([]byte, error) {
	home := os.Getenv("HOME")
	config, err := ReadFile(filepath.Join(home, ".settings.xml"))
	return config, errors.WithMessage(err, "could not read config")
}

func main() {
	_, err := ReadConfig()
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}
```

On failure, the output is wrapped like this:

```shell!
could not read config: open failed: open /home/maxcian/.settings.xml: no such file or directory
```

The error value also keeps a detailed call stack. The original error can be recovered with [`Cause(err)`](https://godoc.org/github.com/pkg/errors#Cause), and the stack trace is printed with the `%+v` verb:

```go
func main() {
	_, err := ReadConfig()
	if err != nil {
		fmt.Printf("Original error:\n%T %v\n", errors.Cause(err), errors.Cause(err))
		fmt.Println()
		fmt.Printf("Stack trace:\n%+v\n", err)
		os.Exit(1)
	}
}
```

```shell
Original error:
*os.PathError open /home/maxcian/.settings.xml: no such file or directory

Stack trace:
open /home/maxcian/.settings.xml: no such file or directory
open failed
main.ReadFile
	/home/maxcian/go/src/github.com/hjcian/gophercises/main.go:15
main.ReadConfig
	/home/maxcian/go/src/github.com/hjcian/gophercises/main.go:28
main.main
	/home/maxcian/go/src/github.com/hjcian/gophercises/main.go:33
runtime.main
	/usr/local/go/src/runtime/proc.go:203
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1373
could not read config
exit status 1
```

---

# 8. Concurrency

The Go team has worked hard to provide concurrency support that is cheap on hardware resources, efficient, and easy to use, but developers who misuse it can easily write inefficient and unreliable code. This chapter discusses some **pitfalls** people often fall into when using `chan`, `select`, and `go`.

## 8.1. Keep yourself busy or do the work yourself

> **TL;DR**
> Don't overuse goroutines; use them in moderation.

The following code is meant to run a web server listening on port 8080. Is there a hidden problem?

```go=
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, GopherCon SG")
	})
	go func() {
		if err := http.ListenAndServe(":8080", nil); err != nil {
			log.Fatal(err)
		}
	}()

	for {
	}
}
```

It does work as intended and runs a simple web server. The problem is that **line 19 makes one CPU spin uselessly**.

> You can try this experiment on your own computer.

You may also see people write the following, which shows they don't understand what the problem actually is:

```go=19
	for {
		runtime.Gosched()
	}
```

`runtime.Gosched()` merely tells the scheduler to go and run another goroutine and come back later. Your CPU is still busy doing nothing useful.

> Demonstration on my computer:
> ![](https://i.imgur.com/KPjpQP1.png)

A slightly more experienced developer might think of using an empty select statement to **block the main goroutine forever**:

```go=19
	select {}
```

This does stop the CPU from spinning, but it **treats the symptom, not the cause**. In this example there is **no need** to put `http.ListenAndServe` in a goroutine at all; just run it on the main goroutine:

```go=10
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, GopherCon SG")
	})
	if err := http.ListenAndServe(":8080", nil); err != nil {
		log.Fatal(err)
	}
```

The main goroutine has nothing else to do anyway, so let it do the work itself! (do the work yourself)

## 8.2. Leave concurrency to the caller

Leave the decision of when to use concurrency to the caller. What is the difference between these two APIs?

```go
// ListDirectory returns the contents of dir.
func ListDirectory(dir string) ([]string, error)
```

```go
// ListDirectory returns a channel over which
// directory entries will be published. When the list
// of entries is exhausted, the channel will be closed.
func ListDirectory(dir string) chan string
```

Potential problems with the first version:

1. The caller is blocked until the function has scanned the whole directory.
2. The scan may take a long time.
3. It uses a certain amount of memory.

Potential problems with the second version:

1. If an **error** occurs along the way **you won't know**; you **only know that the channel was closed**.
2. Even if you have already found the entry you wanted, the only way to get the channel closed is to read it to the end, so you gain nothing from this design.

Weighing these pros and cons, a compromise is to pass in a **callback function** and let the caller decide what to do (in fact, this is how [`filepath.Walk`](https://golang.org/pkg/path/filepath/#Walk) works):

```go
func ListDirectory(dir string, cb func(string))
```

## 8.3. Never start a goroutine without knowing when it will stop

> **TL;DR**
> Don't start a goroutine without designing a way to stop it.

Suppose we have a simple web application:

1. **0.0.0.0:8080** serves application traffic.
2. **127.0.0.1:8001** provides access to the `/debug/pprof` endpoint.

```go
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	go http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux) // debug
	http.ListenAndServe("0.0.0.0:8080", mux)                       // app traffic
}
```

> What is `/debug/pprof`?
>> ![/debug/pprof](https://i.imgur.com/lraPKVk.png)

With future growth of the application in mind, start with a small refactoring:

```go
func serveApp() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	http.ListenAndServe("0.0.0.0:8080", mux)
}

func serveDebug() {
	http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux)
}

func main() {
	go serveDebug()
	serveApp()
}
```

This decouples the two pieces of code from `main.main` and also follows the earlier advice: **leave concurrency to the caller**. But there is a problem: if either goroutine dies for some reason, nobody will know! If `serveDebug` is the one that dies, the operations staff will be furious when they try to fetch information from `/debug/pprof` and fail. So here is a new requirement: **when either goroutine dies, the whole application should shut down too, so that nobody mistakes it for a healthy program.** Refactor again:

```go
func serveApp() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	if err := http.ListenAndServe("0.0.0.0:8080", mux); err != nil {
		log.Fatal(err)
	}
}

func serveDebug() {
	if err := http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux); err != nil {
		log.Fatal(err)
	}
}

func main() {
	go serveDebug()
	go serveApp()
	select {}
}
```

`select {}` blocks the main goroutine, and when either goroutine dies it calls `log.Fatal` to terminate the program. There are still some problems that have not been considered:

1. If `ListenAndServe` returned a `nil` error, that goroutine would stop serving without stopping the program.
    > In fact a comment in `server.go` says: `ListenAndServe always returns a non-nil error.`, so this is only a hypothetical problem.
2. `log.Fatal` calls `os.Exit`, which ends the program unconditionally, so the `defer`s in your program are never run and the other goroutines die without warning. **This makes writing tests awkward.**
    > **TIP**: Only use `log.Fatal` from `main.main` or `init` functions.

So add another requirement: **when a goroutine dies, return the error to the caller so that it knows why the goroutine stopped** and can shut the program down cleanly. Refactor again:

```go
func serveApp() error {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	return http.ListenAndServe("0.0.0.0:8080", mux)
}

func serveDebug() error {
	return http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux)
}

func main() {
	done := make(chan error, 2)
	go func() {
		done <- serveDebug()
	}()
	go func() {
		done <- serveApp()
	}()

	for i := 0; i < cap(done); i++ {
		if err := <-done; err != nil {
			fmt.Printf("error: %v \n", err)
		}
	}
}
```

The error information is now collected, but that is not enough: we also **need a way** to **pass a shutdown signal to the other goroutines**. First write a helper function `serve` that implements this logic; it uses a `stop` channel to carry the shutdown signal and calls `Shutdown()` when the signal arrives. Refactored once more:

```go=
func serve(addr string, handler http.Handler, stop <-chan struct{}) error {
	s := http.Server{
		Addr:    addr,
		Handler: handler,
	}

	go func() {
		<-stop // wait for stop signal
		s.Shutdown(context.Background())
	}()

	return s.ListenAndServe()
}

func serveApp(stop <-chan struct{}) error {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	return serve("0.0.0.0:8080", mux, stop)
}

func serveDebug(stop <-chan struct{}) error {
	return serve("127.0.0.1:8001", http.DefaultServeMux, stop)
}

func main() {
	done := make(chan error, 2)
	stop := make(chan struct{})
	go func() {
		done <- serveDebug(stop)
	}()
	go func() {
		done <- serveApp(stop)
	}()

	var stopped bool
	for i := 0; i < cap(done); i++ {
		if err := <-done; err != nil {
			fmt.Printf("error: %v \n", err)
		}
		if !stopped {
			stopped = true
			close(stop)
		}
	}
}
```
The main goroutine blocks on line 39 (`err := <-done`). As soon as `done` receives a value, the blocked main goroutine resumes and lines 42~44 close the `stop` channel. The other goroutines then receive the signal from `stop` (line 8) and shut down their `http.Server`. Once a goroutine's `http.Server` is shut down, that goroutine returns as well, and finally `main.main` stops cleanly, just as we wanted, with clear error information.

:::success
Why use an extra channel to deliver the shutdown signal? A usage rule may have been slipped in here: **The Channel Closing Principle**

1. Don't close a channel that has multiple senders.
2. Don't close a channel from the receiving side.

The reason lies in these facts:

1. close a closed channel -> **panic**
2. send a value to closed channel -> **panic**

Further reading

- [Channels in Go - **A Comprehensive Interpretation**](https://go101.org/article/channel.html)
- [**The Channel Closing Principle** - How to Gracefully Close Channels](https://go101.org/article/channel-closing.html)
:::
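
The two facts behind the Channel Closing Principle are easy to check for yourself. The following is only a minimal sketch, not part of the original talk: `closeTwice`, `sendAfterClose`, and the `stopper` type are hypothetical names made up for this illustration. The first two helpers recover from the panic so the program can report it, and `stopper` shows one possible way (assuming `sync.Once` fits your design) to let a single owner signal shutdown safely even if `Stop` is called more than once.

```go
package main

import (
	"fmt"
	"sync"
)

// closeTwice reports whether closing an already closed channel panics.
func closeTwice() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch := make(chan struct{})
	close(ch)
	close(ch) // closing a closed channel panics
	return false
}

// sendAfterClose reports whether sending on a closed channel panics.
func sendAfterClose() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch := make(chan int, 1)
	close(ch)
	ch <- 1 // sending on a closed channel panics
	return false
}

// stopper is a hypothetical helper: it owns a signal-only channel and
// guarantees that the channel is closed at most once.
type stopper struct {
	once sync.Once
	ch   chan struct{}
}

func newStopper() *stopper { return &stopper{ch: make(chan struct{})} }

// Stop closes the channel on the first call; later calls do nothing.
func (s *stopper) Stop() { s.once.Do(func() { close(s.ch) }) }

// Done exposes the channel as receive-only, so receivers cannot close it.
func (s *stopper) Done() <-chan struct{} { return s.ch }

func main() {
	fmt.Println("close twice panics:", closeTwice())
	fmt.Println("send after close panics:", sendAfterClose())

	s := newStopper()
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-s.Done() // every worker blocks until the stop signal arrives
		}()
	}
	s.Stop()
	s.Stop() // safe: the second call is a no-op
	wg.Wait()
	fmt.Println("all workers stopped")
}
```

Because nothing is ever sent on the stop channel and only its owner closes it, neither of the two panics can occur, which is the same reason the `stop` channel in section 8.3 is closed only by `main`.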