###### tags: `paper`
# Explainable Artificial Intelligence: A Survey
[TOC]
{%pdf %}
## Introduction
* Content recommendation [4] and the generation of fake content [5], combined with other technologies, will profoundly affect societal dynamics [6]. Many things will be prescribed by such algorithms, and this will affect human lives in ways that may be unimaginable today, so people will need to trust the algorithms in order to accept these prescriptions.
* I was unsure how to translate "Many things will be prescribed by such algorithms"; "prescribed" here means dictated or decided by the algorithms.
* Like a human, the system must satisfy many criteria (assurances) to foster trust [2]: lack of bias (fairness), reliability, safety, justifiability of explanations, privacy, usability, etc. These assurances are assumed by analogy with biases in human decision-making, since people are social beings accustomed to life in human communities.
* The hope is that the system can be a bit more human-like; that seems to be the idea.
* Spoofability and biasedness have been demonstrated for visual recognition in [8], [9] and natural language processing [10], [11].
* [8], [9] demonstrate spoofability and bias for visual recognition, and [10], [11] for natural language processing (would probably need to read these four papers to know the details).
* No robust solution has been found to these problems so far. The potential ethical pitfalls should be addressed as soon as possible since inactivity could lead to unforeseeable splits and differences in the future society. European Union introduced a right to explanation in General Data Protection Regulation (GDPR) [12] as an attempt to remedy the potential problems with the rising importance of algorithms.
* No robust solution to these problems has been found so far. The potential ethical pitfalls should be addressed as soon as possible, since leaving them unaddressed could lead to unforeseeable splits and differences in future society. The European Union introduced a right to explanation in the General Data Protection Regulation (GDPR) [12] as an attempt to remedy the potential problems arising from the growing importance of algorithms.
* "Inactivity" here means failing to act, i.e., leaving the problems above unaddressed.
* "Potential ethical pitfalls" = the ethical risks these algorithms may pose (e.g., biased or unfair decisions).
* **More generally, abstracted explanations can be utilized for finding useful properties and generating hypotheses about data-generating processes, such as causal relationships - which is a crucial application in science as well as in future Artificial General Intelligence (AGI).**
* Abstract explanations can be used to find useful properties and to generate hypotheses about the data-generating process, e.g., causal relationships, a crucial application both in science and for future AGI.
* Generated hypotheses can be the basis for further automated or manual experimentation, knowledge discovery, and optimization. This view is supported by [13] and encompasses: checking for satisfaction of trust criteria, optimization of ethical outcomes of technology, assisted (automated) scientific discovery, transferring skills, etc., mentioned in [3], [14]-[16]. Previous overviews and surveys of interpretability in machine learning are given in [2], [3], [14], [15], [17]-[19].
* Generated hypotheses can serve as the basis for further automated or manual experimentation, knowledge discovery, and optimization; earlier surveys of interpretability in ML appear in [2], [3], [14], [15], [17]-[19].
* **In this paper we survey the advances in the interpretability and explainability of machine learning models under the supervised learning paradigm.**
* This paper surveys advances in the interpretability and explainability of machine learning models under the supervised learning paradigm.
* **The paper is organized as follows: in section 2 we deal with the preliminaries and definitions. In section 3 we categorize the work in methods for interpretability. Section 4 offers a discussion of the current state of the research field and lists future research ideas. Section 5 concludes the paper.**
* Section 2: preliminaries and definitions.
* Section 3: categorization of methods for interpretability.
* Section 4: discussion of the current state of the research field and future research directions.
* Section 5: conclusion.
## Preliminaries and Definitions
* Trust is defined in [2] as a psychological state in which an agent willingly and securely becomes vulnerable to, or depends on, a trustee, having taken into consideration the characteristics of the trustee.
* Authors in [15] claim that, unlike normal ML objective functions, the definitions of criteria that are crucial for trust and acceptance are hard to formalize, a view backed by [2], [3]. In those cases of incomplete problem formalization, interpretability is used as a fallback or proxy for other criteria.
* However, there is no unique definition of interpretability [3], [15]. In [3], interpretability is found not to be a monolithic concept but to reflect several distinct ideas, and in many papers interpretability is proclaimed axiomatically. Authors in [15] define that to interpret means to explain or to present in understandable terms. Interpretability in the context of ML systems is then the ability to explain or to present in understandable terms to humans.
* In short: there is no single definition; [3] sees interpretability as several distinct ideas rather than one concept, and [15] defines it as the ability to explain or present to humans in understandable terms.
* "Interpretability" and "explainability" are often used interchangeably in the literature, but some papers distinguish them.
* In [17], interpretation is the mapping of an abstract concept into a domain humans can make sense of, while an explanation is the collection of features in that interpretable domain that contributed to the decision for a given example.
* Edwards and Veale [20] divide explanations into model-centric and subject-centric; these notions correspond to the definitions of interpretability and explainability in [17], respectively.
* The analogous roles in [15] are global and local interpretability, respectively. On this view, the GDPR covers only explainability. "Comprehensibility" [14] is used in the literature as a synonym for interpretability, and "transparency" [3] as a synonym for model interpretability, i.e., some sense of understanding the model's working logic.
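The global/local distinction can be made concrete with a toy local-surrogate explanation (the idea behind local methods such as LIME, though simplified). Everything below is my own illustrative sketch, not from the paper: `black_box` stands in for an opaque model, and we fit a linear model on small perturbations around one instance, reading its coefficients as that instance's local explanation.

```python
import math
import random

def black_box(x1, x2):
    # Stand-in for an opaque nonlinear model of two features.
    return math.sin(x1) + 0.5 * x2 ** 2

def local_explanation(x1, x2, n=2000, sigma=0.05, seed=0):
    """Fit a linear surrogate on perturbations around (x1, x2);
    the per-feature slopes act as a local explanation."""
    rng = random.Random(seed)
    d1s, d2s, ys = [], [], []
    for _ in range(n):
        d1, d2 = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        d1s.append(d1)
        d2s.append(d2)
        ys.append(black_box(x1 + d1, x2 + d2))
    ybar = sum(ys) / n
    weights = []
    for ds in (d1s, d2s):
        dbar = sum(ds) / n
        # Slope = cov(perturbation, output) / var(perturbation);
        # centering absorbs the intercept of the surrogate.
        cov = sum((d - dbar) * (y - ybar) for d, y in zip(ds, ys)) / n
        var = sum((d - dbar) ** 2 for d in ds) / n
        weights.append(cov / var)
    return weights

w1, w2 = local_explanation(0.0, 1.0)
# Locally d/dx1 sin(x1) = cos(0) and d/dx2 0.5*x2^2 = x2, so for the
# instance (0, 1) both local weights should come out close to 1.
print(w1, w2)
```

This is a global-free view: the surrogate says nothing about the model far from (0, 1); repeating it at another instance yields different weights. Real local-surrogate methods additionally weight samples by proximity and use interpretable feature representations; this sketch just uses a tight Gaussian neighbourhood instead.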
## Methods for interpretability and explainability