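# 5.9 Shapley Values
-----------------------

Shapley value: within the set of feature values of a single instance, the contribution of one feature value to <font color="#f00">the difference between the instance's prediction and the average prediction</font>.

The Shapley value is a method from game theory for solving cooperative games: it distributes a coalition's payout among its members according to how much each member contributed to the coalition's gain. Applied to feature-importance analysis in model interpretability, the model prediction is treated as a cooperative game among the features, the final prediction is the total payout of the cooperation, and each feature's contribution determines the share it receives, which is its importance score.

## Explanation using the article's example

* Prediction target: housing price
* Feature values:
    * Park nearby
    * Area of 50 square meters
    * Located on the 2nd floor
    * Cats banned
* Compute the Shapley value of "cats banned"

## Understanding the formula

* https://www.chainnews.com/zh-hant/articles/982375036488.htm
* https://mathpretty.com/11210.html

### Advantages

* The difference between the instance's prediction and the average prediction is fairly distributed among the feature values.
* Arguably the only method to deliver a full explanation.
* Allows contrastive explanations, unlike LIME's local explanations.
* The only explanation method with a complete theory behind it; its properties are efficiency, symmetry, dummy, and additivity. LIME, by contrast, fits a linear model to explain the features locally.
* It is mind-blowing to **explain a prediction as a game** played by the feature values.

### Disadvantages

* Long computation time (2^k possible coalitions must be evaluated).
* There is no good rule of thumb for the number of iterations M (in principle it could be chosen based on Chernoff bounds). Chernoff bounds: [https://zhuanlan.zhihu.com/p/19901452](https://zhuanlan.zhihu.com/p/19901452)
* The explanation is easily misinterpreted. The correct reading is: **<font color="#f00">Given the current set of feature values</font>, the contribution of a feature value to the difference between the actual prediction and the mean prediction is the estimated Shapley value.**
* The Shapley value uses all of the features, whereas LIME and SHAP can produce explanations that use only some of the features.
* Requires access to the data?

## References

* https://www.analyticsvidhya.com/blog/2019/11/shapley-value-machine-learning-interpretability-game-theory/
* https://faculty.ai/blog/machine-learning-model-explainability-through-shapley-values/
* https://blog.fiddler.ai/2020/03/ai-explained-video-series-what-are-shapley-values/
* Explaining XGBoost models with SHAP: https://zhuanlan.zhihu.com/p/106320452
* Shapley value references: https://www.csdn.net/apps/download/?code=pc_1555579859 and http://sofasofa.io/tutorials/shap_xgboost/
* Comparison of the various methods: https://www.secrss.com/articles/14986

###### tags: `重點摘要`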
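
## Worked example (sketch)

To make the definition and the 2^k cost noted in these notes concrete, here is a minimal sketch of the housing example. It is not taken from the article: the payout function `value`, the helpers `exact_shapley` and `sampled_shapley`, and every euro amount are hypothetical, chosen only for illustration. The exact version averages each feature's marginal contribution over all coalitions of the other features, weighted by |S|!(k-|S|-1)!/k!. The sampled version averages over M random orderings and, as a simplification, assumes the coalition payout is given directly rather than estimated from data.

```python
from itertools import combinations
from math import factorial
import random

# Feature values of the single instance being explained (from the example).
FEATURES = ["park_nearby", "area_50", "floor_2nd", "cat_banned"]


def value(coalition):
    """Hypothetical payout of a coalition: prediction minus average prediction.

    All amounts are invented for illustration.
    """
    s = set(coalition)
    v = 0.0
    if "park_nearby" in s:
        v += 20000
    if "area_50" in s:
        v += 10000
    if "park_nearby" in s and "area_50" in s:
        v += 10000  # assumed interaction between park and area
    if "cat_banned" in s:
        v -= 50000
    # "floor_2nd" never changes the payout, so it acts as a dummy feature.
    return v


def exact_shapley(features, value_fn):
    """Exact Shapley values by enumerating every coalition of the other features."""
    k = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(k):
            # Weight of a coalition S with |S| = size.
            weight = factorial(size) * factorial(k - size - 1) / factorial(k)
            for subset in combinations(others, size):
                total += weight * (value_fn(subset + (j,)) - value_fn(subset))
        phi[j] = total
    return phi


def sampled_shapley(features, value_fn, target, n_iter, seed=0):
    """Estimate one Shapley value from n_iter (M) random feature orderings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_iter):
        order = list(features)
        rng.shuffle(order)
        before = order[: order.index(target)]  # features that joined earlier
        total += value_fn(before + [target]) - value_fn(before)
    return total / n_iter


if __name__ == "__main__":
    for name, contribution in exact_shapley(FEATURES, value).items():
        print(name, round(contribution, 2))
    print("sampled park_nearby:", sampled_shapley(FEATURES, value, "park_nearby", 2000))
```

Under these assumed numbers the contributions add up to the payout of the full coalition (the efficiency property), and the feature that never changes the payout receives zero (the dummy property). The exact loop evaluates every coalition of the other k-1 features, which is why the cost grows exponentially, while the sampled version trades exactness for a fixed budget of M iterations.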