SpanNER: Named Entity Recognition as Span Prediction

ACL 2021

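Key points

Examines the strengths and weaknesses of span prediction models relative to sequence labeling models
A span prediction model can serve not only as a NER system on its own but also as a combiner (ensemble) that merges the outputs of multiple models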

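Content

The two NER architectures compared are sequence labeling (SeqLAB) and span prediction (SpanNER):

SeqLAB simply classifies each token
SpanNER first enumerates all possible spans, then classifies each one

Advances in pre-trained models have improved results on NLP tasks and have also changed how researchers formulate those tasks.

NER has moved from token-level classification (e.g. BIO tagging) to span-level prediction, where the task is treated as question answering, span classification, or dependency parsing.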
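
To make the contrast between the two formulations concrete, here is a minimal sketch that converts a token-level BIO tag sequence into span-level (start, end, label) triples. The function name `bio_to_spans` and the 1-based inclusive indexing are assumptions chosen to match the span notation used later in these notes; they do not come from the paper.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into (start, end, label) spans (1-based, inclusive)."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags, start=1):
        # Close the open entity if the current tag does not continue it.
        if start is not None and tag != f"I-{label}":
            spans.append((start, i - 1, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # A stray "I-" tag with no open entity is ignored in this sketch.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


# "London is beautiful" tagged at token level
print(bio_to_spans(["B-LOC", "O", "O"]))
```

A SeqLAB model predicts the tag list directly, while a span-based model predicts labels for the triples.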

Although span prediction-based systems already perform well, their architectural bias still deserves study. For example:

what are the complementary(互補性) advantages compared with SeqLAB frameworks and how to make full use of them?

What the paper means by architectural bias was shown in a figure at this point in the notes (image unavailable).

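The authors first study the pros and cons of applying the span prediction idea to NER, analyzing SpanNER and SeqLAB systems in detail to find their complementary advantages. For example:

SeqLAB-based models are better at entities that are long and have low label consistency
SpanNER is better at sentences with more out-of-vocabulary (OOV) words and at entities of medium length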

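SpanNER can serve not only as a NER system but also as a combiner for other systems (merging several NER models). Compared with traditional ensemble (voting) approaches it has the following advantages:

Most existing NER combiners require feature engineering and external knowledge; SpanNER does not
It needs no extra training data and is flexible
It unifies (1) optimizing the NER model and (2) ensemble learning for the combiner into one step, whereas earlier methods handled the two separately

In addition, the authors implemented 154 systems on 11 datasets and built a website where one can conveniently inspect which models work well together.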

SpanNER as NER System

A span-based NER model has the following three modules; see the diagram in the earlier section.

Token representation layer

Given an input sentence $X = \{x_{1}, \ldots, x_{n}\}$, this layer outputs a representation $h_{i}$ for each token:

$$u_{1}, \ldots, u_{n} = \mathrm{EMB}(x_{1}, \ldots, x_{n})$$

$$h_{1}, \ldots, h_{n} = \mathrm{BiLSTM}(u_{1}, \ldots, u_{n})$$

Span representation layer

Enumerate all possible spans. For example, for the three-token sentence London is beautiful, the set of spans (start, end) is

{(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}

Except for (1, 1), which is LOC, every span is labeled O.
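
The enumeration above can be sketched in a few lines. The helper names `enumerate_spans` and `label_spans`, and the optional `max_len` cap, are assumptions made for illustration; a sentence of n tokens yields n(n+1)/2 spans when no cap is applied.

```python
def enumerate_spans(n, max_len=None):
    """List all (start, end) spans of an n-token sentence, 1-based and inclusive.

    Spans are ordered by length, then by start position.
    max_len is a hypothetical cap on span length (None means no cap).
    """
    max_len = n if max_len is None else min(max_len, n)
    spans = []
    for length in range(1, max_len + 1):
        for start in range(1, n - length + 2):
            spans.append((start, start + length - 1))
    return spans


def label_spans(spans, gold):
    """Assign each span its gold label, or "O" if it is not an entity."""
    return {span: gold.get(span, "O") for span in spans}


tokens = ["London", "is", "beautiful"]
spans = enumerate_spans(len(tokens))
labels = label_spans(spans, {(1, 1): "LOC"})
print(spans)
print(labels)
```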

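A span can be represented in the following ways:

Boundary embedding: concatenate the representations of the start and end tokens
$z_{i}^{b} = [h_{\mathrm{start}}; h_{\mathrm{end}}]$
Span length embedding: append a length embedding $z_{i}^{l}$ to the boundary feature above
$s_{i} = [z_{i}^{b}; z_{i}^{l}]$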
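
The two concatenations can be illustrated with plain Python lists standing in for vectors. This is only a sketch: `span_representation` and `length_table` are hypothetical names, and in a real model the token vectors would come from the BiLSTM and the length table would be a learned embedding lookup.

```python
def span_representation(h, start, end, length_table):
    """Build s_i = [z_b; z_l] for the span (start, end), 1-based and inclusive.

    h            : list of token vectors h_1..h_n (each a list of floats)
    length_table : dict mapping span length -> length embedding (assumed lookup)
    """
    z_b = h[start - 1] + h[end - 1]          # boundary embedding [h_start; h_end]
    z_l = length_table[end - start + 1]      # span length embedding
    return z_b + z_l                         # list concatenation


h = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]     # toy token representations
length_table = {1: [0.1], 2: [0.2], 3: [0.3]}
print(span_representation(h, 1, 3, length_table))
```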

Span prediction layer

The span representation $s_{i}$ is then fed into a softmax classification layer, where $\mathrm{score}(\cdot)$ is a learnable classifier:

$$P(y \mid s_{i}) = \frac{\mathrm{score}(s_{i}, y)}{\sum_{y' \in Y} \mathrm{score}(s_{i}, y')}$$
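
As a worked instance of this normalization, the sketch below assumes the score is the exponential of a dot product between the span vector and a per-label weight vector. That specific form of score, and the names `span_label_probs` and `label_vectors`, are assumptions for illustration; the notes only say that score is a learnable classifier.

```python
import math


def span_label_probs(s, label_vectors):
    """Return P(y | s) for every label y, with score(s, y) = exp(s . w_y) (assumed)."""
    logits = {y: sum(a * b for a, b in zip(s, w)) for y, w in label_vectors.items()}
    # Subtracting the max logit leaves the ratios unchanged and avoids overflow.
    m = max(logits.values())
    scores = {y: math.exp(v - m) for y, v in logits.items()}
    total = sum(scores.values())
    return {y: sc / total for y, sc in scores.items()}


probs = span_label_probs([1.0, 0.0], {"LOC": [2.0, 0.0], "O": [0.0, 1.0]})
print(probs)
```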

Heuristic Decoding

Among overlapping spans, only the one with the highest probability is kept.
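
One way to realize this rule is a greedy pass over the predicted entity spans in order of decreasing probability. The sketch below assumes that spans predicted as O have already been filtered out; the function name `heuristic_decode` and the tuple layout are hypothetical.

```python
def heuristic_decode(predictions):
    """Keep the highest-probability span among any group of overlapping spans.

    predictions: list of (start, end, label, prob), 1-based inclusive,
                 with "O" spans already removed (assumption of this sketch).
    """
    kept = []
    for start, end, label, prob in sorted(predictions, key=lambda p: p[3], reverse=True):
        # Two inclusive spans overlap when each starts before the other ends.
        overlaps = any(start <= e and s <= end for s, e, _, _ in kept)
        if not overlaps:
            kept.append((start, end, label, prob))
    return sorted(kept)


preds = [(1, 2, "LOC", 0.6), (1, 1, "LOC", 0.9), (3, 3, "PER", 0.7)]
print(heuristic_decode(preds))
```

Here (1, 2) overlaps the more probable (1, 1) and is dropped, while (3, 3) overlaps nothing and survives.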

Experiments

Effectiveness of Model Variants

The following combinations were evaluated; the results confirm that span length embedding and heuristic decoding are both effective.

generic: boundary embedding
boundary embedding + heuristic decoding
boundary embedding + span length embedding
boundary embedding + span length embedding + heuristic decoding

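Analysis of Complementarity

The analysis uses CoNLL-2003 (EN) as the dataset.

$sq_{1}, \ldots, sq_{5}$ denote the five strongest SeqLAB models
Green means SeqLAB is better than SpanNER; pink means the opposite
The authors split the test set into four buckets (XS, S, L, XL) according to the value of each attribute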

The attributes used for bucketing were listed in a figure at this point in the notes (image unavailable; OOV = out-of-vocabulary).

A higher eCon value means that, in the training data, a given entity is more often tagged with one particular label.
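
A rough sketch of a label-consistency statistic in the spirit of eCon follows. The exact definition used in the paper is not given in these notes, so the formula here (the share of an entity string's training mentions that carry a given label) is an assumption, as is the function name `label_consistency`.

```python
from collections import Counter


def label_consistency(train_mentions):
    """Map (entity_text, label) -> fraction of that text's mentions with that label.

    train_mentions: list of (entity_text, label) pairs taken from training data.
    """
    pair_counts = Counter(train_mentions)
    text_counts = Counter(text for text, _ in train_mentions)
    return {
        (text, label): count / text_counts[text]
        for (text, label), count in pair_counts.items()
    }


mentions = [("London", "LOC"), ("London", "LOC"), ("London", "ORG"), ("Paris", "LOC")]
print(label_consistency(mentions))
```

An entity that is always tagged the same way gets a value of 1.0, while an ambiguous one such as the toy "London" above gets lower values spread across its labels.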

With SpanNER in its generic configuration, SeqLAB wins almost across the board, especially when

entities are long (eLen)
label consistency is low (eCon).

With the full configuration, however, SpanNER outperforms SeqLAB in every case except low label consistency.

SpanNER as System Combiner

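When cooperating with other SeqLAB models, the class of the current span can be decided by ensembling as follows:

First, the span model outputs a probability for each class
$p_{i}$
The other (SeqLAB-based) models each classify the span directly, and the probability of the corresponding class is taken
$p_{i}$
The probabilities are added up and the class with the largest total is chosen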
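
The three steps above can be sketched as follows. This is one reading of the procedure as these notes describe it, not a confirmed reproduction of the paper's implementation: each base model's predicted label contributes the probability that the span model assigns to that label, and the label with the largest sum wins. The names `spanner_combine`, `span_probs`, and `base_labels` are hypothetical.

```python
def spanner_combine(span_probs, base_labels):
    """Pick a label for one span by summing SpanNER probabilities over base votes.

    span_probs : dict label -> probability from the span model for this span
    base_labels: labels predicted for this span by the base (SeqLAB) models
    """
    totals = {}
    for label in base_labels:
        # Each vote is weighted by the span model's probability for that label.
        totals[label] = totals.get(label, 0.0) + span_probs[label]
    return max(totals, key=totals.get)


span_probs = {"LOC": 0.5, "ORG": 0.3, "O": 0.2}
print(spanner_combine(span_probs, ["ORG", "ORG", "LOC"]))
print(spanner_combine(span_probs, ["ORG", "LOC", "O"]))
```

In the second call the three base models disagree completely, so a plain majority vote would be tied, while the probability-weighted sum still gives a preference.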

SeqLAB vs. SpanNER across datasets

Various combinations of sequence labeling models are compared with SpanNER.

SpanNER combiner vs. traditional ensembles

All experiments below use 5-fold cross-validation; the results indicate that SpanNER is effective as a model ensemble method.

Abbreviation	Full name	Description
VM	Majority voting	All the individual classifiers are combined into a final system based on majority voting.
VOF1	Weighted voting based on overall F1-score	The taggers are combined according to weights, where each weight is the tagger's overall F1-score on the testing set.
VCF1	Weighted voting based on class F1-score	Also weighted voting, but the weights are the per-category F1-scores.
SVM	Support Vector Machines	A supervised machine learning algorithm that can train quickly over large datasets, which is why the ensemble classifier is usually an SVM.
RF	Random Forest	A common ensemble classifier that randomly selects subsets of training samples and variables to build multiple decision trees.
XGB	Extreme Gradient Boosting	An ensemble machine learning algorithm based on decision trees and gradient boosting.
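
For reference, the voting baselines in the table (VM and VOF1) can be sketched with a single helper. The function name `weighted_vote` and the example F1 weights are made up for illustration; VCF1 would follow the same pattern with each vote weighted by that system's F1-score on the class it predicted.

```python
from collections import defaultdict


def weighted_vote(base_labels, weights=None):
    """Combine per-system labels for one span by (optionally weighted) voting.

    weights=None gives plain majority voting (VM); passing each system's
    overall F1-score as its weight gives VOF1.
    """
    if weights is None:
        weights = [1.0] * len(base_labels)
    totals = defaultdict(float)
    for label, weight in zip(base_labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


labels = ["LOC", "ORG", "ORG"]
print(weighted_vote(labels))                      # majority voting
print(weighted_vote(labels, [0.95, 0.40, 0.45]))  # hypothetical F1 weights
```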

Analysis on more datasets

As in the earlier heatmap analysis, this one compares the SpanNER combiner with the other combiners.

All results are on the CoNLL-2003 dataset
In the combiner comparison, green means SpanNER is better