# Refactoring in Python

[TOC]

## Techniques

### Extract Function

Inverse of: [Inline Function](#Inline-Function)

```python=
# before
def print_owing(invoice):
    print_banner()
    outstanding = calculate_outstanding()

    # print details
    print("name: {}".format(invoice.customer))
    print("amount: {}".format(outstanding))

# after
def print_owing(invoice):
    print_banner()
    outstanding = calculate_outstanding()
    print_details(invoice, outstanding)

def print_details(invoice, outstanding):
    print("name: {}".format(invoice.customer))
    print("amount: {}".format(outstanding))
```

There are roughly three schools of thought on when to use Extract Function:

- Length
  - Functions should be no larger than fit on a screen
- Reuse
  - Any code used more than once should be put in its own function
  - Code only used once should be left inline
- Expressing intent
  - Separation between intention and implementation

Extract Function is one of the techniques the author uses most often, and **expressing intent** is his most important reason for using it:

> If you have to spend effort looking at a fragment of code and figuring out what it’s doing, then you should extract it into a function and name the function after the “what.”
>
> When you read it again, the purpose of the function leaps right out at you, and most of the time you won’t need to care about how the function fulfills its purpose (which is the body of the function).

The author does not even mind if an extracted function contains only a single line:

> Once I accepted this principle, I developed a habit of writing very small functions — typically, only a few lines long.
>
> To me, any function with more than half-a-dozen lines of code starts to smell (note 1), and it’s not unusual for me to have functions that are a single line of code

I have come across another principle (I forget whether it was in this book or another one) that I personally like a lot:

> When you feel the need to write a comment explaining the intent of a piece of code, extract that code into a function and let the function's name express the intent. Code written this way is easy to understand without comments.

---

> Writing code is like writing prose: making code read as smoothly as an article is a skill that takes constant polishing.
> [name=Cy 心之俳句][color=#0DBF8C]

---

Note 1:
"Smell" is often used to describe bad code, suggesting that the code gives off an unpleasant odor.

### Inline Function

Inverse of: [Extract Function](#Extract-Function)

```python=
# before
def get_rating(driver):
    if more_than_five_late_deliveries(driver):
        return 2
    return 1

def more_than_five_late_deliveries(driver):
    return driver.number_of_late_deliveries > 5

# after
def get_rating(driver):
    if driver.number_of_late_deliveries > 5:
        return 2
    return 1
```

Inline Function is the inverse of Extract Function: when a function's body already expresses its intent clearly enough, the extra layer of wrapping is unnecessary.

> But sometimes, I do come across a function in which the body is as clear as the name. Or, I refactor the body of the code into something that is just as clear as the name.

Inline Function also helps during refactoring: inline all the functions first, then extract functions again in a better way:

> I also use Inline Function is when I have a group of functions that seem badly factored. I can inline them all into one big function and then reextract the functions the way I prefer.

The situation in which the author uses it most is code with too many layers of indirection, where many functions have bodies that simply delegate to other functions. When a body already expresses the intent clearly, Inline Function is a good fit.

> Some of this indirection may be worthwhile, but not all of it. By inlining, I can flush out the useful ones and eliminate the rest.
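
The delegation case described above can be sketched as follows. This is only an illustration under stated assumptions: the `Driver` dataclass, the intermediate `compute_rating` function, and the `_before`/`_after` suffixes are hypothetical names that do not come from the book. The wrapper layers add no information beyond what their bodies already say, so they are inlined into one function, and the closing loop checks that behavior is unchanged (including at the boundary value of 5), which is the property any inlining step should preserve:

```python
from dataclasses import dataclass


@dataclass
class Driver:
    # Hypothetical stand-in for the driver object used in the notes.
    number_of_late_deliveries: int


# before: get_rating_before only delegates, and each layer restates the next
def get_rating_before(driver: Driver) -> int:
    return compute_rating(driver)


def compute_rating(driver: Driver) -> int:
    return 2 if more_than_five_late_deliveries(driver) else 1


def more_than_five_late_deliveries(driver: Driver) -> bool:
    return driver.number_of_late_deliveries > 5


# after: the delegating layers are inlined into a single function
def get_rating_after(driver: Driver) -> int:
    return 2 if driver.number_of_late_deliveries > 5 else 1


# Inlining must not change behavior, including at the boundary value.
for late in (0, 5, 6, 20):
    driver = Driver(number_of_late_deliveries=late)
    assert get_rating_before(driver) == get_rating_after(driver)
```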

---

> As the saying goes: what has long been divided must unite, and what has long been united must divide.
> [name=Cy 心之俳句][color=#0DBF8C]

---

### Extract Variable

Inverse of: [Inline Variable](#Inline-Variable)

```python=
# before
def calculate(order):
    return (
        order.quantity * order.item_price
        - max(0, order.quantity - 500) * order.item_price * 0.05
        + min(order.quantity * order.item_price * 0.1, 100)
    )

# after
def calculate(order):
    base_price = order.quantity * order.item_price
    quantity_discount = max(0, order.quantity - 500) * order.item_price * 0.05
    shipping = min(base_price * 0.1, 100)
    return base_price - quantity_discount + shipping
```

The author sees the following benefits in Extract Variable:

- It gives a complex fragment of code a meaningful name, making it easier to understand
- It helps with debugging (Easy hook for a debugger or print statement to capture)

When using Extract Variable, pay attention to the scope in which the variable is used:

- function scope:
  - When the variable lives only inside a function, Extract Variable is usually a good choice as long as it makes the meaning clearer
- class/broader scope:
  - When the variable affects a wider scope (for example, a class property), Extract Variable not only makes the meaning clearer but also reduces duplication
  - Extracting at a wider scope takes more refactoring effort, though, and may need to be combined with other techniques such as Replace Temp with Query (covered later)

```python=
# At class scope this can be done with a getter (property) or a get function.
# If only a single function uses the variable, extract it inside that function instead.

# before
class Order:
    def __init__(self, record):
        self._data = record

    @property
    def price(self):
        return (
            self._data.quantity * self._data.item_price
            - max(0, self._data.quantity - 500) * self._data.item_price * 0.05
            + min(self._data.quantity * self._data.item_price * 0.1, 100)
        )

# after
class Order:
    def __init__(self, record):
        self._data = record

    @property
    def price(self):
        return self.base_price - self.quantity_discount + self.shipping

    @property
    def quantity(self):
        return self._data.quantity

    @property
    def item_price(self):
        return self._data.item_price

    @property
    def base_price(self):
        return self.quantity * self.item_price

    @property
    def quantity_discount(self):
        return max(0, self.quantity - 500) * self.item_price * 0.05

    @property
    def shipping(self):
        return min(self.base_price * 0.1, 100)
```

Extract Variable and Extract Function are in fact quite similar: both make the intent of the code more obvious and reduce duplication. Extracting too much, however, creates clutter of its own, and extracting just the right amount is an art in itself.

---

>
> One should certainly be proactive, but also know when to stop: going too far is as bad as falling short. -- [教育百科](https://pedia.cloud.edu.tw/Entry/Detail/?title=%E9%81%8E%E7%8C%B6%E4%B8%8D%E5%8F%8A)
> [name=Cy 心之俳句][color=#0DBF8C]

---

### Inline Variable

Inverse of: [Extract Variable](#Extract-Variable)

```python=
# before
def check(order):
    base_price = order.base_price
    return base_price > 1000

# after
def check(order):
    return order.base_price > 1000
```

A variable usually conveys the intent of the code better, but when the variable communicates no more than the original expression does, Inline Variable is the recommended move.

Inline Variable can also be used during refactoring to rearrange the structure and intent of the code, much as [Inline Function](#Inline-Function) does.

---

> Learn English well, and naming stops being a headache.
> [name=Cy 心之俳句][color=#0DBF8C]

---

### Change Function Declaration (Change Signature)

```python=
# before
def circum(circle):
    ...

# after
def circumference(radius):
    ...
```

The author describes functions as the joints of a program. Good joints make it easier to add new features; bad joints make it harder to understand and modify the existing code.

The author considers the name of a function to be the most important part:

> The most important element of such a joint is the name of the function. A good name allows me to understand what the function does when I see it called, **without seeing the code that defines its implementation**.

Even the author, who speaks English every day, finds naming hard, so we need not be too discouraged about it:

> However, coming up with good names is hard, and I rarely get my names right the first time.
>
> ...
>
> If I see a function with the wrong name, it is imperative that I change it as soon as I understand what a better name could be. That way, the next time I’m looking at this code, I don’t have to figure out again what’s going on.

The author offers an interesting way to name a function: first write a comment describing the function's intent, then turn that comment into the new function name:

> A good way to improve a name is to write a comment to describe the function’s purpose, then turn that comment into a name.

Besides the name, the design of a function's parameters also matters a great deal. The author gives a few examples to show that there is no single right answer when designing parameters:

- Designing for generality (so the function applies to more situations)

```python=
class Person:
    phone_number: str

class Company:
    phone_number: str

# before
def format_phone_number(person: Person):
    ...

format_phone_number(person)

# after
def format_phone_number(phone_number: str):
    ...

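format_phone_number(person.phone_number)
format_phone_number(company.phone_number)
```

- Designing for extensibility and encapsulation (new behavior can be added without changing the code outside the function)

```python=
from datetime import datetime, timedelta

class Payment:
    status: str
    created_at: datetime

# before
def is_overdue(payment: Payment):
    # overdue once 30 days have passed since the payment was created
    return datetime.now() >= payment.created_at + timedelta(days=30)

# after
def is_overdue(payment: Payment):
    if payment.status == "Archived":
        return False
    return datetime.now() >= payment.created_at + timedelta(days=30)
```

Because function design has no single right answer, the author believes you must be fluent in Change Function Declaration, so that the program can improve over time and always use the design that best fits the present.

---

> Every real-world choice involves a trade-off. There is no best option, only the one that fits the moment.
> [name=Cy 心之俳句][color=#0DBF8C]

---

### Encapsulate Variable

```python=
# before
owner = {
    "first_name": "Martin",
    "last_name": "Fowler"
}

# after
_owner = {
    "first_name": "Martin",
    "last_name": "Fowler"
}

def owner():
    return _owner

def set_owner(owner):
    global _owner  # rebind the module-level variable, not a local one
    _owner = owner

# after (further)
_owner = {
    "first_name": "Martin",
    "last_name": "Fowler"
}

class Person:
    def __init__(self, data: dict):
        self.first_name = data["first_name"]
        self.last_name = data["last_name"]

def owner():
    return Person(_owner)

def set_owner(owner):
    global _owner
    _owner = owner
```

The author argues that functions are easier to work with than data variables because they are easier to refactor. When renaming a function, for example, the old function can simply forward to the new one:

```python=
# before
def old_func(*args, **kwargs):
    ...  # do something

# after
def old_func(*args, **kwargs):
    return new_func(*args, **kwargs)

def new_func(*args, **kwargs):
    ...  # do something
```

Another benefit is that once data is encapsulated behind functions, there is a single entry point that controls changes to it. Validation logic or extra update rules can be added easily without touching the rest of the production code. This is also why object-oriented thinking tries to keep the data fields of an object private.

Taking `set_owner` from the opening example:

```python=
# before
def set_owner(owner):
    global _owner
    _owner = owner

# after
def set_owner(owner):
    global _owner
    validate(owner)  # check the incoming value before storing it
    _owner = owner
```

The author does note, however, that if the data itself is immutable, encapsulating the variable matters much less.

> Immutability is a powerful preservative.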
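
To make the single-entry-point benefit concrete, here is a minimal runnable sketch. It rests on assumptions that are not in the book or the notes: the validation rule (both name fields must be non-empty strings) is invented for illustration, and `get_owner` and `_validate` are hypothetical names. Returning a copy from the getter is one possible design choice, so that callers cannot bypass the setter by mutating the returned dict:

```python
_owner = {"first_name": "Martin", "last_name": "Fowler"}


def _validate(owner: dict) -> None:
    # Hypothetical rule for illustration: both fields are non-empty strings.
    for key in ("first_name", "last_name"):
        value = owner.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"owner[{key!r}] must be a non-empty string")


def get_owner() -> dict:
    # Return a shallow copy so callers cannot mutate the encapsulated data.
    return dict(_owner)


def set_owner(owner: dict) -> None:
    global _owner  # rebind the module-level variable
    _validate(owner)
    _owner = dict(owner)  # store a copy, detached from the caller's dict


set_owner({"first_name": "Ada", "last_name": "Lovelace"})
snapshot = get_owner()
snapshot["last_name"] = "changed"  # only affects the local copy
assert get_owner()["last_name"] == "Lovelace"
```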

---

> Happy New Year! Here is a song for everyone: [許含光的新年未老](https://www.youtube.com/watch?v=Z_sELFDsSpU).
> [name=Cy 心之俳句][color=#0DBF8C]

---

### Rename Variable

```python=
# before
a = height * width

# after
area = height * width
```

Good variable names let a program explain its own intent, but even the author often picks poor names. A poor name can have several causes:

- Simply not thinking it through (Not thinking carefully enough)
- Learning more about the problem (Understanding of the problem improves as I learn more.)
- The program's purpose or the users' needs changing (Program’s purpose changes as my users’ needs change.)

The author goes on to mention a useful naming practice: putting the type of the value into the variable name (especially helpful in dynamic languages such as JavaScript and Python). For example:

```python=
# a_customer is of type Customer
a_customer = Customer()
# result_dict is of type dict
result_dict = {}
```

Beyond the type, I also often put details that the type cannot express into the variable name, so that the code documents itself. Take a timestamp as an example:

```python=
ts_msecs = 1611625060123
```

Timestamps are usually stored as integers, but an integer can represent a timestamp in either seconds or milliseconds. Adding `msecs` to the name states explicitly that this variable holds a timestamp in milliseconds.

---

> Neither programs nor life get there in a single step. I recommend the book 《如何讓馬飛起來》.
> [name=Cy 心之俳句][color=#0DBF8C]

---

### Introduce Parameter Object

```python=
from datetime import datetime

# before
def amount_invoiced(start: datetime, end: datetime):
    ...

def amount_received(start: datetime, end: datetime):
    ...

def amount_overdue(start: datetime, end: datetime):
    ...

# after
class DateRange:
    start: datetime
    end: datetime

def amount_invoiced(dr: DateRange):
    ...

def amount_received(dr: DateRange):
    ...

def amount_overdue(dr: DateRange):
    ...
```

When a group of data items always appears together (the author calls this a data clump), it is a good candidate for extraction into a data structure (e.g., a class in Python). This extraction has several benefits:

- It states the relationship between the data items (Make explicit the relationship between the data items)
- It reduces the number of parameters of the functions that take this data (Reduces the size of parameter lists)
- It keeps the names used for the data consistent (Help consistency since all functions that use the structure will use the same names to get at its elements.)
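
A minimal runnable sketch of the `DateRange` idea follows. The `Invoice` dataclass, the `contains` method, the inclusive-bounds rule, and the sample data are all hypothetical additions for illustration and do not come from the notes. The point it tries to show is that once the clump has its own type, range-related behavior such as `contains` has a natural home:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DateRange:
    start: datetime
    end: datetime

    def contains(self, moment: datetime) -> bool:
        # Inclusive on both ends; this is an assumption made for the sketch.
        return self.start <= moment <= self.end


@dataclass(frozen=True)
class Invoice:
    # Hypothetical record type used only to give the function some data.
    issued_at: datetime
    amount: int


def amount_invoiced(invoices: list[Invoice], dr: DateRange) -> int:
    # Sum the invoices whose issue date falls inside the range.
    return sum(inv.amount for inv in invoices if dr.contains(inv.issued_at))


invoices = [
    Invoice(datetime(2021, 1, 5), 100),
    Invoice(datetime(2021, 1, 20), 250),
    Invoice(datetime(2021, 2, 3), 400),
]
january = DateRange(datetime(2021, 1, 1), datetime(2021, 1, 31))
print(amount_invoiced(invoices, january))  # 350
```

Because both dataclasses are frozen, a `DateRange` can be passed to several functions without the risk of one of them changing it.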