
5.3 Accumulated Local Effects (ALE) Plot

M-Plots

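  • Based on the conditional distribution: average the predictions of instances whose feature value is close to v
  • The result mixes in the effects of other correlated features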

ALE Plots

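  • Split the range of the feature of interest into N intervals
  • For each instance, replace the feature value with the upper and the lower bound of the interval it falls in, and take the difference between the two predictions
  • Average the differences within each interval (divide by the number of instances in the interval) > accumulate the averages across intervals > center so that the mean effect is zero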
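
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `ale_first_order`, the quantile-based choice of interval edges, and the midpoint-weighted centering are all assumptions made for the example, and `predict` stands for any model that maps one row to a number.

```python
from bisect import bisect_left

def ale_first_order(predict, X, feature, n_intervals=4):
    """Return (edges, ale), where ale[k] is the centered ALE at edges[k].

    Assumes the feature has at least two distinct values.
    """
    values = sorted(row[feature] for row in X)
    n = len(values)
    # Quantile-based edges (an assumption); duplicates are dropped so
    # every interval has positive width.
    edges = sorted({values[(k * (n - 1)) // n_intervals]
                    for k in range(n_intervals + 1)})
    K = len(edges) - 1
    sums = [0.0] * (K + 1)
    counts = [0] * (K + 1)
    for row in X:
        # Interval k covers (edges[k-1], edges[k]]; the minimum goes to interval 1.
        k = max(1, bisect_left(edges, row[feature]))
        upper = list(row)
        upper[feature] = edges[k]
        lower = list(row)
        lower[feature] = edges[k - 1]
        sums[k] += predict(upper) - predict(lower)
        counts[k] += 1
    # Average the local effect per interval, then accumulate across intervals.
    ale = [0.0]
    for k in range(1, K + 1):
        local = sums[k] / counts[k] if counts[k] else 0.0
        ale.append(ale[-1] + local)
    # Center: subtract the count-weighted mean of the interval-midpoint values.
    mean = sum(counts[k] * (ale[k - 1] + ale[k]) / 2 for k in range(1, K + 1)) / n
    return edges, [a - mean for a in ale]

# Toy data and a hypothetical linear model, for illustration only.
X = [[0.0, 1.0], [1.0, 0.5], [2.0, 2.0], [3.0, 1.5], [4.0, 3.0]]
model = lambda row: 2.0 * row[0] + 3.0 * row[1]
edges, ale = ale_first_order(model, X, feature=0, n_intervals=4)
```

For this linear toy model the ALE curve of feature 0 rises by 2 per unit of the feature, regardless of the values of feature 1, which is what the local differencing is meant to isolate.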

ALE plots for interactions between variables

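  • Second-order effect: captures only the additional effect contributed by the interaction term, excluding the main effects of the individual features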
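
To make "only the additional effect" concrete, here is a small sketch of the local second-order difference computed for one instance inside one cell of a two-feature grid. The names `second_order_difference`, `j`, `l`, and `cell` are hypothetical, and the sketch covers only this local step; a full second-order ALE would also average per cell, accumulate over the grid, and remove the main effects.

```python
def second_order_difference(predict, row, j, l, cell):
    """Difference of differences over the corners of one grid cell.

    cell = (lo_j, hi_j, lo_l, hi_l): bounds of the cell that contains the row.
    """
    lo_j, hi_j, lo_l, hi_l = cell

    def at(vj, vl):
        # Prediction with features j and l moved to a corner of the cell.
        r = list(row)
        r[j] = vj
        r[l] = vl
        return predict(r)

    return (at(hi_j, hi_l) - at(lo_j, hi_l)) - (at(hi_j, lo_l) - at(lo_j, lo_l))

# Two hypothetical models: one purely additive, one with an interaction term.
additive = lambda r: 2.0 * r[0] + 3.0 * r[1]
interacting = lambda r: 2.0 * r[0] + 3.0 * r[1] + r[0] * r[1]

row = [0.5, 1.5]
cell = (0.0, 1.0, 1.0, 2.0)
d_add = second_order_difference(additive, row, 0, 1, cell)
d_int = second_order_difference(interacting, row, 0, 1, cell)
```

The additive model yields a difference of zero because its main effects cancel, while the model with the product term leaves only the contribution of that term.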

ALE plots for categorical variables

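  • Compute the similarity between each pair of classes > obtain a distance matrix > use MDS (Multi-Dimensional Scaling) to reduce it to one dimension, which gives the classes an order
  • How to compute similarity: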
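
The note does not spell out the similarity computation, so the following is only a sketch under stated assumptions: the distance between two classes is taken to be the Kolmogorov-Smirnov distance between the values of a single other numeric feature, and the one-dimensional embedding uses classical MDS solved by power iteration. The names `ks_distance` and `mds_1d` and the toy data are hypothetical.

```python
def ks_distance(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    points = sorted(set(a) | set(b))
    cdf = lambda sample, t: sum(v <= t for v in sample) / len(sample)
    return max(abs(cdf(a, t) - cdf(b, t)) for t in points)

def mds_1d(D, iters=200):
    """Classical MDS to one dimension.

    Power iteration on the double-centered squared-distance matrix.
    Assumes the dominant eigenvalue is positive and not all distances are zero.
    """
    n = len(D)
    sq = [[d * d for d in row] for row in D]
    row_mean = [sum(r) / n for r in sq]
    total = sum(row_mean) / n
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total)
          for j in range(n)] for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eig = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
    scale = max(eig, 0.0) ** 0.5
    return [x * scale for x in v]

# Values of one other numeric feature, grouped by class (toy data).
categories = {
    "mid": [2.0, 3.0, 4.0],
    "high": [3.0, 4.0, 5.0],
    "low": [1.0, 2.0, 3.0],
}
names = list(categories)
D = [[ks_distance(categories[a], categories[b]) for b in names] for a in names]
coords = mds_1d(D)
order = [name for _, name in sorted(zip(coords, names))]
```

The sign of an MDS axis is arbitrary, so the resulting order may come out reversed; only the relative arrangement of the classes matters for laying them out on the ALE plot axis.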

Differences

  • Partial Dependence Plots: “Let me show you what the model predicts on average when each data instance has the value v for that feature. I ignore whether the value v makes sense for all data instances.”
  • M-Plots: “Let me show you what the model predicts on average for data instances that have values close to v for that feature. The effect could be due to that feature, but also due to correlated features.”
  • ALE plots: “Let me show you how the model predictions change in a small ‘window’ of the feature around v for data instances in that window.”
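
The three descriptions can be compared on a toy case. In this sketch the function names, the window width, and the data are assumptions chosen for illustration: the two features are perfectly correlated and the hypothetical model uses only the second one, so the first feature has no effect of its own.

```python
def pdp_value(predict, X, feature, v):
    """Average prediction after forcing the feature to v for every instance."""
    total = 0.0
    for row in X:
        r = list(row)
        r[feature] = v
        total += predict(r)
    return total / len(X)

def m_plot_value(predict, X, feature, v, width):
    """Average prediction over instances whose feature is within width of v.

    Assumes at least one instance falls inside the window.
    """
    near = [row for row in X if abs(row[feature] - v) <= width]
    return sum(predict(r) for r in near) / len(near)

def ale_local_effect(predict, X, feature, lo, hi):
    """Average prediction change when the feature moves from lo to hi,
    computed only for instances inside [lo, hi] (assumed non-empty)."""
    diffs = []
    for row in X:
        if lo <= row[feature] <= hi:
            up = list(row)
            up[feature] = hi
            down = list(row)
            down[feature] = lo
            diffs.append(predict(up) - predict(down))
    return sum(diffs) / len(diffs)

# Feature 0 and feature 1 are perfectly correlated; the model ignores feature 0.
X = [[float(i), float(i)] for i in range(10)]
model = lambda row: row[1]

pdp_low = pdp_value(model, X, 0, 2.0)
pdp_high = pdp_value(model, X, 0, 7.0)
m_low = m_plot_value(model, X, 0, 2.0, width=1.0)
m_high = m_plot_value(model, X, 0, 7.0, width=1.0)
ale_step = ale_local_effect(model, X, 0, 1.0, 3.0)
```

Here the M-Plot values differ between the two windows even though feature 0 is unused, because they pick up the effect of the correlated feature 1. The PDP is flat and the ALE local effect is zero, but the PDP gets there by evaluating combinations such as (2.0, 9.0) that never occur in the data.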

Advantages

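  • Can handle correlated features
  • Faster to compute than PDPs
  • Easier to interpret: because ALE plots are centered at 0, the change in a feature's effect, conditional on a given value, is easier to see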

Disadvantages

  • ALE plots can become a bit shaky (many small ups and downs) with a high number of intervals.
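  • Cannot show how the effect of a feature varies across individual instances (heterogeneity).
  • Second-order ALE estimates vary widely in stability across cells of the feature grid, because each cell is estimated from different data.
  • Second-order effect plots can be a bit annoying to interpret.
  • The implementation of ALE plots is much more complex and less intuitive than that of PDPs.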

Use ALE instead of PDP.

References

Python Code

tags: Key summary