tags: `paper`

Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)

Abstract

With such unprecedented advancements, a key impediment to the use of AI-based systems is that they often lack transparency.
- AI systems lack transparency; XAI aims to address this problem.

1. Introduction

A. Context

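International Data Corporation (IDC) forecasts that worldwide spending on AI will grow from USD 12 billion in 2017 to USD 52.2 billion in 2021 [1].
Meanwhile, the statistics portal Statista expects that revenues from the AI market worldwide will grow from 480 billion U.S. dollars in 2017 to 2.59 trillion U.S. dollars by 2021 [2].
- Statista: worldwide AI market revenue is projected to rise from USD 480 billion (2017) to USD 2.59 trillion (2021) [2].
We have become used to AI making decisions for us in daily life, from product and movie recommendations on Netflix and Amazon to friend suggestions on Facebook and tailored advertisements on Google search result pages. However, for life-changing decisions such as disease diagnosis, it is important to know the reasons behind such a critical decision.
AI algorithms keep getting more powerful and their predictions keep improving, but using them for major decisions can be dangerous when they cannot explain themselves.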

B. XAI's Landscape Dynamic

In 2018, a growing number of conferences and workshops were devoted to explainable models:
- CD-MAKE 2018 Workshop on Explainable Artificial Intelligence [10]
- ICAPS 2018 Workshop on EXplainable AI Planning [11]
- HRI 2018 Workshop on Explainable Robotic Systems [12]
- ACM Intelligent User Interfaces (IUI) 2018 workshop on Explainable Smart Systems (EXSS 2018) [13]
- IPMU 2018 on Advances on Explainable Artificial Intelligence [14]
- ICCBR 2018, which organized XCBR: the First Workshop On Case-Based Reasoning For The Explanation Of Intelligent Systems [15]
Indeed, two of the most prominent actors pursuing XAI research are: (i) a group of academics operating under the acronym FAT∗ [4] and (ii) civilian and military researchers funded by the Defense Advanced Research Projects Agency (DARPA) [16].
- Two of the most prominent actors in XAI research:
  - a group of academics known as FAT*
  - civilian and military researchers funded by the Defense Advanced Research Projects Agency (DARPA)
FAT∗ academics (meaning fairness, accountability, and transparency in multiple artificial intelligence, machine learning, computer science, legal, social science, and policy applications) are primarily focused on promoting and enabling explainability and fairness in algorithmic decision-making systems with social and commercial impact.
- FAT* researchers focus mainly on promoting and enabling explainability and fairness in algorithmic decision-making systems that have social and commercial impact.
With more than 500 participants and more than 70 papers, the FAT∗ conference, which held its fifth annual event in February 2018, annually brings together researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems.
- The FAT* conference held its fifth annual event in February 2018, with over 500 participants and over 70 papers.
The other group, DARPA, launched its XAI program in 2017 with the aim of developing new techniques capable of making intelligent systems explainable; the program includes 11 projects and will continue running until 2021.
- DARPA's XAI program started in 2017, comprises 11 projects, and runs until 2021.
DARPA-funded researchers seem primarily interested in increasing explainability in sophisticated pattern recognition models needed for security applications.
- DARPA-funded researchers appear mainly interested in improving the explainability of the sophisticated pattern recognition models needed for security applications.
Increasing interest in XAI has also been observed in the industrial community. Companies on the cutting edge of contributing to make AI more explainable include H2O.ai with its driverless AI product [17], Microsoft with its next generation of Azure: Azure ML Workbench.
- Industry is also showing growing interest in XAI. Companies at the forefront of making AI more explainable include H2O.ai with its Driverless AI product [17] and Microsoft with its next generation of Azure, Azure ML Workbench.
Other examples are Kyndi with its XAI platform for government, financial services, and healthcare, and FICO with its Credit Risk Models. To push the state of XAI even further, FICO is running the Explainable Machine Learning Challenge (xML challenge) [18]. The goal of this challenge is to identify new approaches for creating machine learning based AI models with both high accuracy and explainability. On the other hand, Cognilytica has examined the market of AI products in its "AI Positioning Matrix" (CAPM). It proposed a chart where XAI technologies are arguably identified as highly sophisticated implementations beyond the threshold of the actual technology [19].
- Kyndi offers an XAI platform for government, financial services and healthcare; FICO offers its Credit Risk Models.
- To push the state of XAI further, FICO runs the Explainable Machine Learning Challenge (xML challenge) [18], whose goal is to identify new ways of building machine-learning-based AI models that are both highly accurate and explainable.
- Cognilytica examined the market of AI products in its "AI Positioning Matrix". Its chart arguably places XAI technologies among highly sophisticated implementations beyond the threshold of the actual technology [19].

C. Contribution and Organization

In this sense, we make three contributions:

• We propose a comprehensive background regarding the main concepts, motivations, and implications of enabling explainability in intelligent systems.
• Based on a literature analysis of 381 papers, we provide an organized overview of the existing XAI approaches.
• We identify and discuss future research opportunities and potential trends in this field.

Section II presents a preliminary background.
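- Introduces the background knowledge
Section III surveys the latest developments in the XAI field and organizes surveyed approaches according to four perspectives.
- Surveys the latest developments in XAI and organizes the findings along four perspectives
Section IV discusses research directions and open problems that we gathered and distilled from the literature survey.
- Discusses the research directions and open problems collected and distilled from the literature survey
Finally, Section V concludes this survey.
- Summary and conclusion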

2. Background

A. Understanding XAI: A Contextual Definition

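Work on XAI began in the mid-1970s, but afterwards attention shifted to improving accuracy and explainability was neglected.
XAI has regained attention in recent years: when AI makes an important decision without giving a detailed reason for it, ethical, social and legal pressure demands an explanation.
There is no agreed definition; XAI aims to make AI transparent and is more than a purely technical matter.
According to DARPA, XAI produces explainable models without sacrificing performance, so that people can understand and manage them.
Both the algorithms and the data should be explainable, in terms that people outside the field can also understand - FAT* [4]
Explaining the black box without losing accuracy, yielding a model that can be trusted - FICO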
What is the big picture? (figure not available)
Interpretability is roughly equivalent to explainability.
- (Figure not available.) The chart shows how often each term is used in different venues.
Furthermore, it should be noted that none of the aforementioned variation terms (understandable, comprehensible, intelligible, ...) is specific enough to enable formalization. They implicitly depend on the user's expertise, preferences and other contextual variables.
Rarely in the literature do we come across the term "social science" or its derivatives, yet explanation is a form of social interaction and clearly has psychological, cognitive and philosophical projections. Based on the conducted analysis, ideas from social science and human behavior are not sufficiently visible in this field.
Finally, XAI is part of a new generation of AI technologies called the third wave of AI; one of the objectives of this ambitious "wave" is precisely to generate algorithms that can explain themselves.
- Note on the English: the source sentence reads "than can explain themselves"; "that" is presumably intended.

B. Using XAI: The Need And The Application Opportunities

1) The Need For XAI

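Whether for commercial benefit, ethical concerns or regulatory considerations, XAI is essential if users are to understand, appropriately trust, and effectively manage AI results.
Based on the explored literature, the need to explain AI systems may stem from (at least) four reasons. Although the four overlap, they capture different motivations for explainability.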

a. Explain To Justify

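When we talk about an explanation for a decision, we generally mean the need for reasons or justifications for that particular outcome, rather than a description of the inner workings or the reasoning logic behind the decision-making process in general.
Using XAI systems provides the information needed to justify results, particularly when unexpected decisions are made. It also ensures that there is an auditable and provable way to defend algorithmic decisions as fair and ethical, which builds trust.
Henceforth, AI needs to provide justifications in order to comply with legislation, for instance the "right to explanation", a provision included in the GDPR, which came into effect across the EU on 25 May 2018 [39].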

b. Explain To Control

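Explainability is not only important for justifying decisions; it can also help prevent things from going wrong. Understanding more about system behavior provides greater visibility over unknown vulnerabilities and flaws, and helps to identify and correct errors quickly in low-criticality situations (debugging), thus enabling enhanced control.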

c. Explain To Improve

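A model that can be explained and understood is one that can be improved more easily. Because users know why the system produced a specific output, they will also know how to make it smarter. XAI can therefore be the foundation for ongoing iteration and improvement between human and machine.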

d. Explain To Discover

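Asking for explanations is a helpful tool for learning new facts and gathering information, and thus for gaining knowledge. Only explainable systems can be useful for that. For example, given that AlphaGo Zero [40] plays the game of Go much better than human players, it would be desirable for the machine to explain its learned strategy (knowledge) to us. It would therefore not be surprising if, in the future, XAI models taught us new and hidden laws in biology, chemistry and physics.
XAI can help to verify predictions, improve models, and gain new insights into the problem at hand. This leads towards more trustworthy AI systems.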
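Not everyone is convinced by XAI: Google research director Norvig [41] has questioned it, noting that humans are not very good at explaining their decisions either.
- In this regard, the value of XAI was called into question recently by Google research director Norvig [41], who noted that humans are not very good at explaining their decisions either, and claimed that the credibility of an AI system's results could be gauged simply by observing its outputs over time.
- Note: this argument is not entirely clear to me. - UsePerson
The researcher from this AI giant highlights an important point: explainability is a fundamental property, but it is not always necessary.
Moreover, making AI systems explainable is undoubtedly expensive: it requires considerable resources both in developing the AI system and in the way it is interrogated in practice (asking "why?" every time consumes a lot of resources).
The need for explainability depends on:
- (a) the degree of functional opacity caused by the complexity of the AI algorithm: if it is low, a high level of explainability is not required.
- (b) the degree of resistance of the application domain to errors: if the domain is error-tolerant, unexpected errors are acceptable.
  - For example, a relatively low level of explainability suffices for an AI system used for targeted advertising, because the consequences of its mistakes are negligible.
  - By contrast, the explainability requirement for an AI-based diagnostic system is significantly higher: any error could not only harm the patient but also deter adoption of such systems.
- Consequently, any domain where the cost of a wrong prediction is very high is a potential application domain for XAI approaches.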

2) XAI Application Domains

a. Transportation

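Autonomous vehicles promise to reduce traffic deaths and provide enhanced mobility, but they also make the explainability of AI decisions a pressing challenge. A self-driving car makes split-second decisions based on how it classifies the objects in the scene ahead. If the car suddenly behaves abnormally because of a misclassification, the consequences can be very dangerous.
A self-driving Uber killed a woman in Arizona, the first known fatality involving a fully autonomous vehicle. According to a report citing anonymous sources, the car's software registered an object in front of the vehicle but treated it in the same way it would a plastic bag or a tumbleweed [42]. Only an explainable system can clarify the ambiguous circumstances of such a situation and eventually prevent it from happening.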

b. Healthcare

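In the mid-1990s, an ANN was trained to predict which pneumonia patients should be admitted to hospital and which could be treated as outpatients. Initial findings indicated that neural nets were far more accurate than classical statistical methods. However, after extensive testing, it turned out that the neural net had inferred that pneumonia patients with asthma had a lower risk of dying and should not be admitted.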

c. Legal

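Models are used to predict the probability that an offender will reoffend, so their predictions must be fair.
Loomis v. Wisconsin [48] is a case that challenges the use of closed-source risk assessment software in sentencing Loomis.
- The case alleges that the software, Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), violates due process rights by taking gender and race into account
- The algorithm is a trade secret, so the judge does not know how it reaches its assessment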

d. Finance

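AI is used to improve wealth management, investment advice and customer service.
However, there are concerns about data security and fair lending. The financial industry is highly regulated, and loan issuers are required by law to make fair decisions. A major challenge of using AI-based systems in credit scoring and models is therefore that it is hard to give borrowers the required "reason code", the explanation of why they were denied credit. Credit bureaus such as Equifax and Experian are working on making AI-based credit scoring decisions more explainable and understandable to auditors [52].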

e. Military

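The DARPA projects have raised the visibility of the XAI topic.
In MIT Technology Review, Knight [53] explores in depth the challenges of relying on autonomous systems for military operations.
XAI is being studied both by the academic AI community and within the DARPA program, but that program is still at an early stage.
XAI can be applied in many other domains, including cybersecurity, education, entertainment, government and image recognition. The Future of Privacy Forum [55] presents a chart of the potential harms of automated decision-making. It depicts the areas of life where automated decisions could cause harm and where providing automated explanations could turn them into trustworthy processes, including employment, insurance and social benefits, housing, and differential pricing of goods and services.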

C. Enabling XAI: The Technical Challenge

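Making AI explainable is a very challenging technical problem.
XAI spans both traditional expert systems and DNNs:
- Traditional expert systems are explainable but not very effective
- DNNs make it hard to see what happens inside, but they perform very well
In the 1980s there were systems regarded as inspectable: an inference engine working on a knowledge base could explain its decisions through a chain of reasons [56].
The strengths of expert systems are that they explain well, perform strongly, and are built on expert knowledge; their weakness is that they are inflexible.
XAI has recently made some progress (explanations aimed at lay users, without requiring domain expertise), but the explainability problem is not fully solved.
Modern AI has greater predictive power, but its results are hard to understand:
- predictions involve heavy interaction among input features
When a network has only one layer, it can be explained through its weights, but it becomes hard to explain as the number of layers grows.
Expertise from different fields has to be brought together, and progress requires development along several directions.
XAI draws on four fields:
- (i) Data science
  - data are strongly tied to both accuracy and explainability
- (ii) Artificial Intelligence/Machine Learning
  - using AI to explain AI
- (iii) Human science
  - first understand how humans make and explain decisions
- (iv) Human Computer Interaction (HCI)
  - humans rely on the way they interact with machines to understand and trust a system
Focusing on the What, the Why, the Where and the How, we tried to propose an extensive background regarding XAI by defining the concept of XAI, exposing the motivation behind its reemergence, identifying the segments of the market where the results are promising, and finally presenting some research areas that could potentially contribute to overcoming the technical challenge related to XAI systems. The next section aims to capture researchers' attention on the growing research body of XAI through a literature survey.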

3. Review

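Even so, although many people are doing research on explainable AI, the work lacks organization and classification.
[25] attempts to categorize the needs and methods in explainability research. The paper itself is not a survey, but it offers a solid discussion of what might constitute explainability, seen through the lens of the literature.
[62] attempts to define a taxonomy and best practices for interpretability as a "rigorous science".
The main contribution of this paper is a taxonomy of interpretability evaluation. In doing so, the authors shifted the focus onto only one dimension of explainability: its measurement.
- Question: does "this paper" here refer to [62] or to the survey itself? (The surrounding context suggests [62].)
[63] reviews methods for explaining black boxes and classifies them by the type of problem faced; it emphasizes interpretability mechanisms and ignores evaluation.
A recent survey by Guidotti et al. [63] reviewed methods for explaining black-box models at a large scale including data mining as well as machine learning. They presented a detailed taxonomy of explainability methods according to the type of problem faced. Even though the survey considered holism in terms of models (it discusses all black-box models), it emphasized only interpretability's mechanisms, thereby ignoring other explainability dimensions such as evaluation. Hence, the detailed technical overview of surveyed methods makes it hard to get a quick understanding of the explanation methods space.
- Question: what exactly are "explainability dimensions"?
[64] presents progress in the interpretability of machine learning models under the supervised learning paradigm, with a particular focus on DNNs.
Existing surveys each focus on a particular aspect; this paper brings together the different aspects of XAI.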

B. A Holistic Survey

The papers were collected from six databases (SCOPUS, IEEExplore, ACM Digital Library, Google Scholar, Citeseer Library and ScienceDirect), in addition to preprints posted on arXiv.
- Search keywords:
  - intelligible,
  - interpretable,
  - transparency,
  - black box,
  - understandable,
  - comprehensible and
  - explainable
- and its union with a set of terms related to AI including
  - Artificial Intelligence,
  - Intelligent system and
  - Machine learning,
- or terms referring to ML algorithms such as:
  - deep learning,
  - classifier,
  - decision tree
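- Articles published between 2004 and 2018
  - analysis based on title, abstract and keywords
  - snowballing strategy [65]
    - the references and citations of selected papers were also examined
  - 381 papers in total
  - publications on the topic are currently booming
The papers discussed in detail were chosen mainly because they:
1. are highly cited
2. provide good coverage of the corresponding axis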

AXIS 1. XAI Methods Taxonomy: Explainability Strategies

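One strategy is to design an algorithm that is itself interpretable; many papers support this approach
- Bayesian Rule Lists (BRL) [66]: based on decision trees, it provides an interpretable model that experts can trust
- [37] an interpretable model applied to real medical data (the pneumonia problem)
  - generalized additive models
- [67] introduces an attention-based model that learns the content of images
- [68] presents the sparse linear model SLIM for data-driven scoring systems; its sparsity and small integer coefficients make it understandable to users
- [69] [70] a general challenge that hinders the usability of this approach is the trade-off between accuracy and interpretability
- [25] an alternative is to reverse engineer an explanation with other techniques, without altering the original black box
  - more complex
  - costly
  - also includes natural language explanations [71], visualizations of learned models [72], and explanations by example [73]
As long as the model is accurate for the task, and uses a reasonably restricted number of internal components, intrinsically interpretable models are sufficient.
If the prediction target is complex, a post-hoc model is needed to explain it.
[74] Some papers propose methods that modify such complex models to make them transparent and more interpretable
- [75] [76] earlier approaches add extra components
  - [77] the component can be part of the loss function
  - [78], [79] or processing applied between layers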

1) Global Interpretability

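Population-level decisions, such as drug consumption trends or climate change:
- drug consumption trends
- climate change
  - what matters here is the global effect, not an explanation of every individual case
Globally interpretable models:
- [37] a model for predicting pneumonia risk
- [66] rule sets generated by a Bayesian generative model
- However, these models have specific structures, so their predictive power is restricted in order to preserve interpretability
[80] applies recursive partitioning (GIRP) to the local explanations of an ML model to build a global interpretation tree
- It shows whether an ML model makes decisions for sound reasons or overfits in unreasonable ways
[81] proposes a supervised approach that provides global explanations
- This work supports the idea that representation learning can be successfully combined with traditional, pattern-based bootstrapping, yielding models that are interpretable.
[82] [83] use activation maximization with a deep generator network to produce global explanations for image recognition models
- synthesizing the preferred inputs for neurons in neural networks
Arguably, global interpretability is hard to achieve in practice, because a model with many parameters is hard to explain.
Humans tend to focus on local interpretability because it is easier to grasp.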

2) Local Interpretability

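Explains the reasons for a specific decision or a single prediction.
LIME [84] approximates the black box locally
[85] extends LIME using decision rules
[86] LOCO measures local variable importance
[87] proposes a method for explaining local decisions that uses local gradients to determine how a data point has to be changed for its predicted label to change
[88]-[91] apply approaches similar to [87] to image classification
[92] a common approach
- find the regions of an image to which the decision is particularly sensitive; modifying them changes the predicted class
[93]-[95] are similar to [92]
[101] shows that the techniques in [96]-[100] are equivalent and derives a new technique with solid theoretical support, called Shapley Explanation
[102]-[104] a promising direction: combining local and global interpretability
There are four possible combinations [105]
- standard global model interpretability, which answers how the model makes decisions
- global model interpretability at a modular level, which identifies how individual parts of the model affect predictions
- local interpretability for a group of predictions, which indicates why specific decisions were made for that group
- local interpretability for a single prediction, which explains why that particular answer was given
Local explanations are most commonly used to generate explanations for DNNs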
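
To make the Shapley Explanation idea attributed to [101] more concrete, the following is a minimal sketch that computes exact Shapley values for one prediction by enumerating every feature coalition. It is an illustration under stated assumptions, not the method of [101]: `toy_model`, the all-zero `baseline` used for "absent" features, and the function names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for model(x); absent features take baseline values."""
    n = len(x)

    def value(coalition):
        # Features inside the coalition keep their real value, the rest use the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical black box: two linear terms plus one interaction term.
    return 2.0 * z[0] + 1.0 * z[1] + 0.5 * z[0] * z[2]


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
# The attributions add up to model(x) - model(baseline).
```

Exhaustive enumeration is exponential in the number of features, so this only works for toy cases; practical tools rely on approximations.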

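Another way to classify interpretability methods is by whether they are model-agnostic, meaning they can be applied to any type of ML algorithm, or model-specific, meaning the technique applies only to a single type or class of algorithm.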

1) Model-Specific Interpretability

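The technique applies only to a single type or class of algorithm.
The drawback is that when a particular type of explanation is required, the choice is limited to the models that can provide it, which may come at the expense of using a more predictive and representative model.
- Question: what does the last part of the English sentence mean?
As a result, interest in model-agnostic interpretability methods has surged recently, because they can be applied to any type of ML algorithm.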

2) Model-Agnostic Interpretability

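Model-agnostic methods do not depend on a particular type of ML model; they separate prediction from explanation.
Model-agnostic explanations are usually post-hoc and are commonly used to explain ANNs.
Model-agnostic methods can be locally or globally interpretable.
To improve the interpretability of AI models, a large number of model-agnostic methods have recently been developed using a range of techniques from statistics, machine learning and data science.
Since most of the reviewed papers fall into this category, the works are outlined here grouped by technique. They broadly fall into four technique types: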
- (i) Visualization,
- (ii) Knowledge extraction,
- (iii) Influence methods
- (iv) Example-based explanation.

(a) Visualization

i) Surrogate Models

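A surrogate model is a simple model (a linear model or a decision tree) used to explain a complex model.
There is no theoretical guarantee that these simple surrogate models faithfully represent the complex model.
LIME [84] is model-agnostic: it builds a local surrogate model around a single observation
[106] trains a decision tree to approximate the black box
[107] proposes an approach that uses surrogate models to build TreeView visualizations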
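
The idea of a local surrogate can be sketched as follows: perturb the observation, weight the perturbed samples by proximity, and fit a weighted linear model whose slopes serve as the local explanation. This is a LIME-style illustration only, not the algorithm of [84]; `black_box`, the Gaussian sampling, the kernel and all parameter values are assumptions.

```python
import math
import random


def black_box(x):
    # Hypothetical non-linear model standing in for the black box.
    return math.sin(x[0]) + x[1] ** 2


def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w


def local_surrogate(model, x0, n_samples=2000, scale=0.1, kernel_width=0.2, seed=0):
    """Fit a proximity-weighted linear model around x0."""
    rng = random.Random(seed)
    d = len(x0)
    # Accumulate the weighted normal equations (X^T W X) w = X^T W y.
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x0]
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x0))
        weight = math.exp(-dist2 / kernel_width ** 2)
        row = [1.0] + [zi - xi for zi, xi in zip(z, x0)]  # intercept + centred features
        y = model(z)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)  # [intercept, slope_1, ..., slope_d]


x0 = [0.0, 1.0]
intercept, slope0, slope1 = local_surrogate(black_box, x0)
# The slopes approximate the local effect of each feature near x0.
```

Near `x0` the slopes come out close to the gradient of the toy model (about 1 and 2), but nothing guarantees such fidelity for an arbitrary black box, which is the caveat noted above.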

ii) Partial Dependence Plot(PDP)

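A PDP is a plot that helps visualize the average partial relationship between the inputs and the output of a black box.
Several works use PDPs to understand supervised models
- [108] the relationship between predictors and the conditional average treatment effect for a voter mobilization experiment
  - [109] Bayesian Additive Regression Trees is the predictive model
  - [108] provides the explanation
Ecology
- [110] uses PDPs to understand how different environmental factors influence the distribution of a freshwater species
Criminal justice settings often involve asymmetric classification
- [51] uses random forests and the associated PDPs in a criminal justice setting to characterize the relationship between the model and the response
[111] Forest Floor visualizes and interprets random forests (feature contributions method)
- The advantage of Forest Floor over PDP is that interactions are not lost through averaging,
  - so it can be used to discover interactions
  - which are not visualized in a given projection.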
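
A partial dependence curve can be computed by forcing one feature to each grid value for every row of the data and averaging the predictions. The sketch below is a minimal illustration; `toy_model`, the data and the grid are hypothetical.

```python
def partial_dependence(model, data, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = list(row)
            modified[feature] = value
            total += model(modified)
        curve.append(total / len(data))
    return curve


def toy_model(x):
    # Hypothetical black box with a pure interaction between features 0 and 1.
    return x[0] * x[1]


data = [[0.0, -1.0], [0.5, -1.0], [0.0, 1.0], [0.5, 1.0]]
grid = [0.0, 1.0, 2.0]
pdp = partial_dependence(toy_model, data, feature=0, grid=grid)
```

In this toy case the curve is flat although the model clearly depends on feature 0: the opposite effects for rows with `x[1] = -1` and `x[1] = 1` cancel in the average, which is the kind of masked interaction mentioned above.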

iii) Individual Conditional Expectation (ICE)

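ICE extends PDP
- PDP gives a coarse view of how the model behaves
- ICE disaggregates the PDP output to reveal interactions and individual differences
Most recent works use ICE
- [112] shows the advantages of ICE over PDP
- [113] proposes a visualization tool for local feature importance based on partial importance (PI) and individual conditional importance (ICI) plots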
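
ICE keeps one curve per instance instead of averaging them, and the PDP can be recovered as the pointwise mean of those curves. A minimal sketch, with a hypothetical `toy_model` and data:

```python
def ice_curves(model, data, feature, grid):
    """One curve per instance: predictions as `feature` sweeps over the grid."""
    curves = []
    for row in data:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value
            curve.append(model(modified))
        curves.append(curve)
    return curves


def pdp_from_ice(curves):
    """The PDP is the pointwise average of the ICE curves."""
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]


def toy_model(x):
    # Hypothetical black box with an interaction between features 0 and 1.
    return x[0] * x[1]


data = [[0.0, -1.0], [0.0, 1.0]]
grid = [0.0, 1.0, 2.0]
curves = ice_curves(toy_model, data, feature=0, grid=grid)
average = pdp_from_ice(curves)
```

The two individual curves slope in opposite directions while their average stays at zero, so the heterogeneity is visible in the ICE curves but not in the PDP.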

(b) Knowledge Extraction

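When the model is an ANN, it is hard to explain how the ML model works,
because the learning algorithm modifies the internal units, and the internal representations can become "interesting".
Several approaches have been proposed to extract knowledge:
- i) Rule Extraction
- ii) Model Distillation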

i) Rule Extraction

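[114]-[116] use rule extraction to gain insight into these highly complex models
- Extract the decision-making process of the ANN
- The aim is to find in an ANN the kind of knowledge that traditional expert systems contain
[74] classifies rule extraction into:
- pedagogical rule extraction
  - treats the ANN as a black box
  - the Orthogonal Search-based Rule Extraction algorithm (OSRE) [119] is a successful pedagogical method in biomedicine
- decompositional rule extraction
  - focuses on extracting rules from individual units of the ANN (treats the ANN as transparent)
- eclectic rule extraction
  - [120] combines the two previous approaches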

ii) Model Distillation

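[121] [122] Distillation is a form of model compression that transfers the information of a deep network ("dark knowledge") into a shallow network
It was originally used to reduce run time and computational cost, and was later applied to explainability
[49] investigates model distillation as a way to distill a model into a transparent model
[123] proposes a method called Interpretable Mimic Learning,
- it learns interpretable surface features
- while mimicking the performance of the original deep learning model
[124] proposes DarkSight (literally "dark eye" in the original note), a visualization method for explaining the predictions of a black box
[125]-[127] cover knowledge distillation, dimensionality reduction, and visualization of DNNs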
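
One common way to transfer "dark knowledge" is to soften the deep network's output distribution with a temperature and train the shallow model to match those soft targets. The notes do not spell out the mechanism used in [121], [122], so the sketch below is an assumption-laden illustration; the logits and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher means softer)."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature):
    """Cross-entropy between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


teacher_logits = [6.0, 2.0, 1.0]                  # hypothetical deep-network outputs
hard = softmax(teacher_logits, temperature=1.0)   # nearly one-hot
soft = softmax(teacher_logits, temperature=4.0)   # exposes relative class similarities

matching = distillation_loss(teacher_logits, [6.0, 2.0, 1.0], temperature=4.0)
mismatched = distillation_loss(teacher_logits, [1.0, 2.0, 6.0], temperature=4.0)
```

The loss is lowest when the student reproduces the teacher's softened distribution, so a transparent student trained this way is pushed to mimic the deep model's behaviour rather than only its top label.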

(c) Influence Methods

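These techniques estimate the importance or relevance of a feature by changing the input or internal components and observing how much the model's performance changes, which indicates whether that element matters.
Influence techniques can usually be visualized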
- Sensitivity Analysis
- Layer-Wise Relevance Propagation(LRP)
- Feature Importance

i) Sensitivity Analysis

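[128] Sensitivity refers to how the output of an ANN is affected by its inputs and weights
It is used to verify that the model remains stable when the data are deliberately perturbed
Visualizing the results of sensitivity analysis (SA) is regarded as an agnostic explanation technique: it shows the model's stability as the data change
[129] [130] SA is increasingly applied to ANNs and DNNs
However, SA only explains changes in the value, not what the value itself means
SA is used to test model stability and trustworthiness, to identify and remove unimportant inputs, or as a starting point for more powerful techniques (decomposition)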
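
A simple form of sensitivity analysis perturbs each input slightly and measures how much the output moves. The sketch below estimates this with central finite differences and uses the squared derivative as the sensitivity score; the scoring choice, `toy_model` and the input point are assumptions for illustration.

```python
def sensitivity(model, x, eps=1e-5):
    """Central finite-difference estimate of d model / d x_i at x."""
    gradients = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        gradients.append((model(up) - model(down)) / (2 * eps))
    return gradients


def toy_model(x):
    # Hypothetical model: feature 2 has no effect on the output.
    return 3.0 * x[0] + x[1] ** 2 + 0.0 * x[2]


x = [1.0, 2.0, 5.0]
scores = [g ** 2 for g in sensitivity(toy_model, x)]
```

Feature 2 receives a score of zero, which makes it a candidate for removal. The scores describe how the output would change, not why the output has its current value, matching the limitation noted above.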

ii) Layer-Wise Relevance Propagation(LRP)

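LRP computes relevance scores
[131] Layer-wise Relevance Propagation algorithm
LRP redistributes the prediction function backwards, starting from the output layer of the network and backpropagating up to the input layer. The key property of this redistribution process is referred to as relevance conservation.
In contrast to SA, this method explains the prediction relative to the state of maximum uncertainty
Rooster (a one-word note, presumably referring to a rooster image used as an LRP example)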
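
Relevance conservation can be checked on a tiny network. The sketch below applies a basic LRP redistribution rule (each input receives relevance in proportion to its contribution to the pre-activation) to a two-layer ReLU network without bias terms. The weights, the input and the small stabilizer `eps` are hypothetical, and [131] describes further rule variants not shown here.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(weights, inputs):
    # weights[j][i] connects input i to output j (no bias terms in this sketch).
    return [sum(w * a for w, a in zip(row, inputs)) for row in weights]


def lrp_layer(weights, inputs, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs back to its inputs."""
    z = linear(weights, inputs)
    relevance_in = [0.0] * len(inputs)
    for j, row in enumerate(weights):
        denom = z[j] + (eps if z[j] >= 0 else -eps)  # stabilizer avoids division by zero
        for i, w in enumerate(row):
            relevance_in[i] += inputs[i] * w / denom * relevance_out[j]
    return relevance_in


w1 = [[1.0, -2.0, 0.5], [0.5, 1.0, -1.0]]  # hidden layer: 3 inputs -> 2 units
w2 = [[2.0, -1.0]]                          # output layer: 2 units -> 1 score
x = [1.0, 0.5, 2.0]

hidden = relu(linear(w1, x))
output = linear(w2, hidden)

# Start from the output score and propagate relevance layer by layer to the input.
relevance_hidden = lrp_layer(w2, hidden, output)
relevance_input = lrp_layer(w1, x, relevance_hidden)
```

At every layer the relevance values sum (up to the stabilizer) to the output score, which is the conservation property; inputs may receive negative relevance when they speak against the prediction.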

iii) Feature Importance

Variable importance quantifies the contribution of each feature to the output
Model Class Reliance (MCR) [132] is a model-agnostic method based on feature importance
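
One widely used model-agnostic way to quantify feature importance is permutation importance: shuffle one feature column and measure how much the error grows. This is an illustrative sketch and not MCR [132]; `toy_model`, the data and the error metric are hypothetical.

```python
import random


def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, seed=0):
    """Increase in error after shuffling the values of one feature column."""
    rng = random.Random(seed)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
    baseline_error = mean_squared_error(model, rows, targets)
    return mean_squared_error(model, shuffled, targets) - baseline_error


def toy_model(x):
    # Hypothetical model that relies on feature 0 and ignores feature 1.
    return 4.0 * x[0] + 0.0 * x[1]


rows = [[float(i), float(i % 3)] for i in range(30)]
targets = [toy_model(r) for r in rows]
importance = [permutation_importance(toy_model, rows, targets, f) for f in range(2)]
```

Shuffling the ignored feature leaves the error unchanged, while shuffling the feature the model relies on increases it.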

(d) Example-Based Explanation

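Example-based explanation selects particular instances that explain the behavior of the ML model
Example-based explanations are mostly model-agnostic, because they can be used with any ML model
The slight difference between model-agnostic and example-based methods is:
- example-based methods pick instances from the dataset as the explanation
- example-based methods do not act on the features or transform the model itself
Example-based explanations fall into two categories:
- (i) Prototypes and criticisms
- (ii) Counterfactual explanations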

i) Prototypes And Criticisms

Prototypes are a selection of representative instances from the data [133]–[135]; item membership is thus determined by similarity to the prototypes, which leads to overgeneralization. To avoid this, exceptions also have to be shown, called criticisms: instances that are not well represented by those prototypes. Kim [136] developed an unsupervised algorithm for automatically finding prototypes and criticisms for a dataset, called MMD-critic. When applied to unlabeled data, it finds prototypes and criticisms that characterize the dataset as a whole.

ii) Counterfactual Explanations

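[137] proposes unconditional counterfactual explanations, a novel type of explanation for automated decisions
A counterfactual explanation describes the smallest change in conditions that would lead to a different decision
[138] does not explain how the decision is made; instead it works backwards from the predicted outcome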
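
For a linear scoring model, the smallest change (in Euclidean distance) that flips the decision has a closed form: move the input along the weight vector until the score just crosses zero. The sketch below illustrates that special case only; it is not the procedure of [137] or [138], and the weights, bias and feature meanings are hypothetical.

```python
def score(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def counterfactual(weights, bias, x, margin=1e-6):
    """Smallest L2 change to x that moves a linear score just across zero."""
    s = score(weights, bias, x)
    target = margin if s <= 0 else -margin  # land just on the other side of the boundary
    norm2 = sum(w * w for w in weights)
    step = (target - s) / norm2
    return [xi + step * w for xi, w in zip(x, weights)]


weights = [0.8, -0.5]  # hypothetical features: income (+), debt (-)
bias = -1.0
x = [1.0, 0.4]         # score = 0.8 - 0.2 - 1.0 = -0.4, so the request is rejected

x_cf = counterfactual(weights, bias, x)
change = [cf - xi for cf, xi in zip(x_cf, x)]
```

The resulting `change` reads as a counterfactual statement: with this much more of feature 0 and this much less of feature 1, the decision would have been different. For non-linear black boxes such a point generally has to be found by search or optimization.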
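Model-agnostic methods are flexible with respect to the model
Although model-agnostic interpretability techniques are convenient, they rely on surrogate models, so the accuracy of the explanation can drop
Model-specific methods explain the model directly, so their explanations are more accurate
Model-agnostic methods are the more popular type of interpretability because they are model independent and the same technique can be used to compare how different models behave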
AXIS 2. XAI Measurement: Evaluating Explanations

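Are all models that are presented as interpretable equally interpretable?
- This question is raised in [62]
Few methods exist for evaluating the interpretability of ML models.
Interpretability is subjective: the same explanation may satisfy some people and not others.
Work on comparing, validating, quantifying and evaluating interpretable models is increasing.
[62] proposes a baseline for evaluating interpretability
1. application-grounded: put the explanation into the application and let the end user (typically a domain expert) test it. This type evaluates the quality of an explanation in the context of its end-task.
  - The explanation is placed in the application and tested by domain experts; this evaluates its quality in the context of the end task
2. human-grounded: is about conducting simplified application-grounded evaluation where experiments are run with lay humans rather than domain experts. This type is most appropriate when the goal is to test more general notions of the quality of an explanation.
  - A simplified version of application-grounded evaluation, run with lay people to see whether they can understand the explanation
3. functionally-grounded: this type does not involve humans; it is most appropriate once we have a class of models or regularizers that have already been validated, e.g. via human-grounded experiments.
  - No humans involved; relies on models that have already been validated
[148] building on [62], proposes a human-grounded benchmark for evaluating explanations of image and text data
- explanations produced for classification models are evaluated by comparing them with the benchmark's annotation meta-data
- this makes it possible to assess the quality and appropriateness of local explanations
[149] compares decision trees, decision tables, propositional rules, and oblique rules to find out which is the most interpretable
- They found decision trees and decision tables to be the most interpretable
- The result may differ across tasks
[151] quantifies interpretability by combining human-interpretable concepts with a framework (Network Dissection) for measuring interpretability in ANNs
[152] interpretability has different facets and is a latent property that is influenced by the number of inputs, the complexity of the model and the user interface, and that in turn affects users' trust in the model and how they adjust it
- The authors work on manipulating and measuring model interpretability. In an interesting experiment, they vary factors thought to make a model more or less interpretable and measure how these changes affect people's decisions. They focus on two factors: the number of inputs and whether the model is transparent or a black box.
- Participants were better able to simulate the model's predictions when the model was transparent and had few inputs.
- However, they found no significant difference in users' trust or prediction error.
[153] human feedback and automated metrics can both be used to evaluate a model's interpretability
[154] proposes an approach for judging the interpretability of an ML model, based on a taxonomy:
1. emulate the processing
2. explain the representation
3. explanation-producing networks
One common factor affects the quality of interpretability:
- Human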

AXIS 3. XAI Perception: Human In the Loop

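Explain
- what is explained (the original model)
- how it is explained (the explainable model)
Understand
- who receives the explanation (the human)
To be explained, a model must be something humans can understand.
This requires knowledge of both ML and HCI.
Explanation has long been studied in philosophy and psychology as part of human behavior, and that work can be consulted.
Few papers address the human factor.
Two points are discussed:
1. how human-like the explanations are
2. how to produce human-centered explanations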

A. Human-Like Explanations

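[33] presents particularly impressive results
- Attempts to articulate the link between the human sciences and XAI
- Provides an in-depth survey of philosophy, psychology and cognitive science
- Three major findings
  1. Explanations are contrastive: people do not ask why event E happened, but rather why event E happened instead of some event F.
    - People do not ask why event E happened; they ask why E happened rather than F
  2. Explanations are selective and focus on one or two possible causes and not all causes for the recommendation.
    - Explanations are selective: they focus on one or two possible causes rather than on all of them
  3. Explanations are social conversation and interaction for transfer of knowledge, implying that the explainer must be able to leverage the mental model of the explainee while engaging in the explanation process.
    - Explanations are a form of social communication and knowledge transfer
    - The explainer must make use of the explainee's mental model (?)
[155] applies social science models to XAI; most XAI authors build on the developers' intuition rather than focusing on users
XAI research rarely draws on the human sciences, although parts of them are applicable
Post-hoc explanation is in fact close to how humans explain
When humans reason, they pick particularly representative examples as a basis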

B. Human-Friendly Explanations

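[156] proposes an ontology-based interface that lets (non-expert) users gain deeper knowledge of ML
[157] notes that recent papers focus on explanation methods rather than on usability, practical interpretability and efficacy for users
- eXplainable AI for Designers (XAID) helps game designers work together with AI/ML
[158] proposes Rivelo, a pedagogical visual analytics interface that lets experts inspect what a binary classifier is doing on a set of examples
[61] surveys HCI work on developing practical explainable systems
- The authors carried out a large literature analysis in order to set an HCI research agenda
- They point out that most related work tries to make explanations readable for humans, in textual or visual form
Among agnostic methods, visualization can explain black boxes better; however, some visualization techniques are interesting but hard to understand
[159] the authors acknowledge the importance of generating human-understandable visualizations and explanations for DNNs, and present works that attempt to generate such visualizations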
