Ruby Conf 2018 Lightning Talk Large Table Migration

## Large Table Migration

----

Hi, I am Steven Huang ([@steventtud](http://steventtud.com/)), a backend developer at [Splashtop](https://www.splashtop.com/).

----

## Splashtop

----

Splashtop is a remote desktop company (remote desktop access solution).

----

![](https://i.imgur.com/0mjDsfC.png)

----

![](https://i.imgur.com/JAr9Wo0.png)

----

![](https://i.imgur.com/z26fsyu.jpg)

----

![](https://i.imgur.com/023jE8Y.png)

----

OK! Now back to today's topic!

---

## Large table migration

----

### What is migration?

----

When a migration modifies a table that already exists, it uses MySQL's `ALTER TABLE` statement.

----

A migration

```ruby
change_table(:users) do |t|
  t.string :company_name
  t.change :birthdate, :datetime
end
```

----

Running `rake db:migrate` generates SQL along these lines:

```sql
ALTER TABLE `users` ADD `company_name` varchar(255)
ALTER TABLE `users` CHANGE `birthdate` `birthdate` datetime DEFAULT NULL
```

----

As shown above, `ALTER TABLE` simply modifies a table that already exists.

----

With a small amount of data, altering a table is easy.

----

But if the table holds tens of millions of rows (rows > 10 million)

----

and reads and writes are split (has read replicas),

----

altering the table can become your nightmare.

----

This talk shares how to overcome slave lag in a read/write-split setup and successfully alter tables in a large database.

----

We usually do not disclose how much data the company handles.

----

But when I was browsing our official website just now...

----

![](https://i.imgur.com/vMc171L.png)

----

That gives a rough idea of how much data our company currently handles.

---

## Outline

- How to run a migration on a table with tens of millions of rows and continuous writes
- What can happen when your database uses read/write splitting and you use lhm?
- How to run migrations safely in this extreme situation
- How Lhm::Throttler::SlaveLag works

---

## How to run a migration on a table with tens of millions of rows and continuous writes

----

Solution: [lhm](https://github.com/soundcloud/lhm) - Online MySQL schema migrations

----

### How does a normal migration work?

----

A normal migration uses `ALTER TABLE` to modify the database.

----

While `ALTER TABLE` is running, the original table cannot be written to.

![](https://i.imgur.com/r9Iax8a.png)

----

Once the new table has been modified, the original table and the new table swap names, and the original table is dropped.

![](https://i.imgur.com/gmoNGX3.png)

----

### What problems does using alter table directly cause?
- During the copy, the rows in the origin table that need to be written stay locked the whole time.
- A large number of write operations queue up, waiting for the locked rows to release their locks.
- In a write-heavy system, this can push the database's CPU and memory usage sharply up, and may even bring the database down.

----

Lhm is a solution for this situation.

----

### How does lhm work?

----

Lhm uses [MySQL triggers](https://dev.mysql.com/doc/refman/5.7/en/triggers.html): whenever the original table is written to, a trigger applies the same write to the new table.

![](https://i.imgur.com/GVqSLsS.png)

----

When the copy finishes, the two tables are swapped, so the write lock is held only briefly.

![](https://i.imgur.com/9LEVMiT.png)

----

Example of an lhm migration

```ruby
require 'lhm'

class MigrateUsers < ActiveRecord::Migration
  def self.up
    Lhm.change_table :users do |m|
      m.add_column :arbitrary, "INT(12)"
      m.add_index [:arbitrary_id, :created_at]
      m.ddl("alter table %s add column flag tinyint(1)" % m.name)
    end
  end

  def self.down
    Lhm.change_table :users do |m|
      m.remove_index [:arbitrary_id, :created_at]
      m.remove_column :arbitrary
    end
  end
end
```

---

## What can happen when your database uses read/write splitting and you use LHM?

----

What is a read replica?

----

In large systems, read/write splitting is commonly used to spread the load, so that the primary database's memory and CPU usage stay stable.

![](https://i.imgur.com/iTHptCI.png)

----

If you use lhm on a read/write-split database without really understanding how lhm works, you are likely to run into these problems:

- Even though lhm uses triggers, the table still receives twice as many writes as before.
- So while lhm is running, slave lag is very likely to grow.

---

## How can we run migrations safely in this extreme situation?

----

Newer versions of lhm introduce a SlaveLag Throttler.

![](https://i.imgur.com/fgMnQzB.png)

----

The rough idea behind the SlaveLag Throttler:

- If the RDBMS's slave lag exceeds the maximum lag you configured, lhm slows down its copying
- until the slave lag drops back into an acceptable range.

----

### How Lhm::Throttler::SlaveLag works

----

The source code of Lhm::Throttler::SlaveLag can be read [here](https://github.com/Shopify/lhm/blob/master/lib/lhm/throttler/slave_lag.rb).

----

The most important part:

```rb
def execute
  sleep(throttle_seconds)
end
```

```rb
def throttle_seconds
  lag = max_current_slave_lag

  if lag > @allowed_lag && @timeout_seconds < MAX_TIMEOUT
    Lhm.logger.info("Increasing timeout between strides from #{@timeout_seconds} to #{@timeout_seconds * 2} because #{lag} seconds of slave lag detected is greater than the maximum of #{@allowed_lag} seconds allowed.")
    @timeout_seconds = @timeout_seconds * 2
  elsif lag <= @allowed_lag && @timeout_seconds > INITIAL_TIMEOUT
    Lhm.logger.info("Decreasing timeout between strides from #{@timeout_seconds} to #{@timeout_seconds / 2} because #{lag} seconds of slave lag detected is less than or equal to the #{@allowed_lag} seconds allowed.")
    @timeout_seconds = @timeout_seconds / 2
  else
    @timeout_seconds
  end
end
```

----

Step-by-step breakdown

----

Initialization:

```rb
# INITIAL_TIMEOUT is the minimum wait time and also the wait time at start.
INITIAL_TIMEOUT = 0.1
# MAX_TIMEOUT is the maximum wait time (102.4 seconds);
# the following slides round it to 100 seconds for simplicity.
MAX_TIMEOUT = INITIAL_TIMEOUT * 1024
# allowed_lag is the slave lag we are willing to tolerate.
# It is the only value we can set ourselves.
allowed_lag = 10
# Initial wait time
timeout_seconds = INITIAL_TIMEOUT
```

----

Every run measures the largest slave lag:

```rb
# The largest slave lag right now
lag = max_current_slave_lag
```

----

1st run

```
If lag > allowed_lag and @timeout_seconds < MAX_TIMEOUT (100 seconds)
@timeout_seconds = 0.1 * 2 = 0.2
sleep 0.2 seconds
```

----

2nd run

```
If lag > allowed_lag and @timeout_seconds < MAX_TIMEOUT (100 seconds)
@timeout_seconds = 0.2 * 2 = 0.4
sleep 0.4 seconds
```

----

3rd run

```
If lag > allowed_lag and @timeout_seconds < MAX_TIMEOUT (100 seconds)
@timeout_seconds = 0.4 * 2 = 0.8
sleep 0.8 seconds
```

----

4th run

```
If lag > allowed_lag and @timeout_seconds < MAX_TIMEOUT (100 seconds)
@timeout_seconds = 0.8 * 2 = 1.6
sleep 1.6 seconds
```

----

5th run

```
If lag <= allowed_lag and @timeout_seconds > INITIAL_TIMEOUT (0.1)
@timeout_seconds = 1.6 / 2 = 0.8
sleep 0.8 seconds
```

----

More than 95% of the time, execution follows one of the two branches above. The exceptions are:

1. If the lag is too high but timeout_seconds has already reached MAX_TIMEOUT (100 seconds), timeout_seconds is returned unchanged.
2. If the lag is acceptable but timeout_seconds is already down to INITIAL_TIMEOUT (0.1 seconds), timeout_seconds is returned unchanged.

----

Most pictures in these slides are from the article [Rails Migrations at Scale](https://melinysh.me/shopify,/rails,/backend,/lhm/2017/05/14/rails-migrations-at-scale.html).

----

Thanks for listening.
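
----

### Appendix: replaying the back-off without a database

The doubling and halving walkthrough above can be replayed in plain Ruby. This is only a minimal sketch, not lhm's actual class: `BackoffSketch` is a hypothetical name, the lag is passed in as an argument instead of being read from the replicas, the lag samples are made up, and `MAX_TIMEOUT` is rounded to 100 seconds as in the slides.

```ruby
# Hypothetical stand-in that mirrors the doubling/halving rule of throttle_seconds.
# It is not lhm's class: the lag is passed in instead of being read from replicas.
class BackoffSketch
  INITIAL_TIMEOUT = 0.1  # seconds, as in the slides
  MAX_TIMEOUT = 100.0    # assumption: rounded to 100 seconds, as in the slides

  def initialize(allowed_lag)
    @allowed_lag = allowed_lag
    @timeout_seconds = INITIAL_TIMEOUT
  end

  # Returns how long to wait before the next stride for the observed lag.
  def throttle_seconds(lag)
    if lag > @allowed_lag && @timeout_seconds < MAX_TIMEOUT
      @timeout_seconds *= 2   # replicas are behind: back off
    elsif lag <= @allowed_lag && @timeout_seconds > INITIAL_TIMEOUT
      @timeout_seconds /= 2   # replicas caught up: speed up again
    else
      @timeout_seconds        # already at the floor or the ceiling
    end
  end
end

throttler = BackoffSketch.new(10)
# Made-up lag samples: four readings above the 10 second limit, then five below it.
observed_lags = [30, 30, 30, 30, 5, 5, 5, 5, 5]
waits = observed_lags.map { |lag| throttler.throttle_seconds(lag) }
puts waits.inspect
# prints [0.2, 0.4, 0.8, 1.6, 0.8, 0.4, 0.2, 0.1, 0.1]
```

The wait grows while the replicas are behind and shrinks back to the 0.1 second floor once they catch up, so the copy slows down only for as long as the lag persists.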