###### tags: Paper Reading
# Downstream Model Design of Pre-trained Language Model for Relation Extraction Task
## Outline
    This paper presents a new network architecture for the downstream model in relation extraction.
## Introduction/Motivation
    In previous works, relation extraction can mostly be summarized by the three steps below, but these methods do not perform well on datasets that contain complicated relations.
1. Build an encoder and obtain word embeddings.
2. Build a network structure to extract information from the embeddings.
3. Put the encoded information into a classifier, such as a softmax classifier (a minimal sketch follows this list).
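    To make these three steps concrete, here is a minimal PyTorch sketch of such a classic pipeline. It is a generic illustration with assumed dimensions and a BiLSTM as the middle network, not a model from the paper:

```python
import torch
import torch.nn as nn

class ClassicREModel(nn.Module):
    """Generic three-step relation extraction pipeline (illustration only)."""
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_relations):
        super().__init__()
        # Step 1: encoder that maps tokens to word embeddings
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Step 2: a network (here a BiLSTM) that extracts information from embeddings
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Step 3: a softmax classifier over relation labels
        self.classifier = nn.Linear(2 * hidden_dim, num_relations)

    def forward(self, token_ids):
        embeddings = self.embedding(token_ids)             # (B, T, E)
        features, _ = self.encoder(embeddings)             # (B, T, 2H)
        sentence_repr = features.mean(dim=1)               # simple mean pooling
        return self.classifier(sentence_repr).softmax(-1)  # relation probabilities
```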
    Recently, many works make use of **pre-trained language models (PLMs)**. This paper argues that they do not fully exploit the PLM, so it proposes a new downstream model architecture.
   
## Model
### Architecture
1. Feed the text $T$ into the Transformer (BERT) and **take the penultimate-layer output as $E_w$**.
2. **Take the last-layer output as $E_p$**, and **add BERT's [CLS] embedding to $E_w$ to get $E_b$**.
3. Calculate the similarity between $E_b$ and $E_p$ for each relation $i$ as $S_i = E_bW_{hi}\cdot(E_pW_{ti})^T$.
4. Use a sigmoid function to normalize $S_i$ (see the sketch after this list).
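    To clarify steps 1–4, here is a rough PyTorch sketch assuming BERT from HuggingFace `transformers`. The class name `RelationHead`, the per-relation projection tensors `W_h`/`W_t` (standing in for $W_{hi}$, $W_{ti}$), and taking [CLS] from the last layer are my assumptions based on this note, not the paper's released code:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class RelationHead(nn.Module):
    def __init__(self, num_rel, hidden=768):
        super().__init__()
        self.bert = BertModel.from_pretrained(
            "bert-base-uncased", output_hidden_states=True)
        # One head/tail projection pair per relation i: W_hi and W_ti
        self.W_h = nn.Parameter(torch.randn(num_rel, hidden, hidden) * 0.02)
        self.W_t = nn.Parameter(torch.randn(num_rel, hidden, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        hidden_states = out.hidden_states       # tuple of (B, T, H), one per layer
        E_w = hidden_states[-2]                 # step 1: penultimate-layer output
        E_p = hidden_states[-1]                 # step 2: last-layer output
        cls = E_p[:, :1, :]                     # [CLS] embedding (last layer, assumed)
        E_b = E_w + cls                         # step 2: broadcast-add [CLS] to E_w
        # Step 3: S_i = (E_b W_hi)(E_p W_ti)^T for each relation i
        head = torch.einsum("bth,rhk->brtk", E_b, self.W_h)
        tail = torch.einsum("bth,rhk->brtk", E_p, self.W_t)
        S = torch.einsum("brik,brjk->brij", head, tail)  # (B, R, T, T)
        return torch.sigmoid(S)                 # step 4: sigmoid normalization
```

    The sigmoid (rather than softmax) lets each relation score be judged independently, which matches the paper's goal of handling complicated, possibly overlapping relations.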
### Loss Function
    Please refer to Section 3.3 of the paper.

## Experiment
    Please refer to Section 4 of the paper.
## Conclusion
    This paper proposes a new architecture that makes good use of PLMs, and it also introduces a corresponding loss function. We can borrow this kind of downstream model architecture as an idea. Excellent work.