# Sentence-T5 (ST5): Scalable Sentence Encoders from Pre-trained Text-to-Text Models
###### tags: ```筆記```, ```NLP```, ```ACL 2022```
## Abstract
- Motivation: Exploring sentence embeddings from T5 models, including the impact of scaling up to 11B parameters.
- This paper is the first to explore sentence embeddings from T5; it also introduces the SentGLUE benchmark for sentence representation evaluation.
## Introduction
- Sentence embeddings are crucial for many language processing tasks. The paper explores generating these embeddings from T5 models.
- Investigates three methods to construct ST5 models using the T5 encoder and encoder-decoder, with significant performance improvements noted.
## Methodology
- Encoder-only and Encoder-decoder methods
- (Figure omitted: the ST5 architecture diagram that the panel labels (b), (c), (d) below refer to.)
- (b), (c): pooling strategies widely used in encoder-only pre-trained models such as BERT.
- (d): Unlike BERT models, **T5 models do not have a 'CLS' token** at the beginning of each sentence. For T5 encoder-decoder models, the authors **assume** the decoder is aware of the semantics of the entire input sentence when generating its first token prediction; if so, the first decoder output embeddings (i.e., the input to the softmax layer) might **naturally capture the sentence semantics** (see the extraction sketch after this list).
- Introduces a new sentence representation transfer benchmark, SentGLUE, extending SentEval with tasks from GLUE for comprehensive evaluation.
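A minimal sketch of how these embeddings could be extracted with Hugging Face `transformers`. The checkpoint name and the choice of the last decoder hidden state as the pre-softmax embedding are my assumptions for illustration, not the paper's released code:

```python
import torch
from transformers import T5Tokenizer, T5EncoderModel, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # checkpoint is illustrative
inputs = tokenizer(["A sentence to embed."], return_tensors="pt", padding=True)

# Encoder-only strategies: first-token pooling and mean pooling
encoder = T5EncoderModel.from_pretrained("t5-base")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state        # (batch, seq_len, dim)
first_tok_emb = hidden[:, 0, :]                         # first encoder token
mask = inputs.attention_mask.unsqueeze(-1)              # mask out padding
mean_emb = (hidden * mask).sum(1) / mask.sum(1)         # mean pooling

# Encoder-decoder strategy (d): run a single decoding step and take the
# first decoder output embedding as the sentence representation.
model = T5ForConditionalGeneration.from_pretrained("t5-base")
start = torch.full((inputs.input_ids.size(0), 1),
                   model.config.decoder_start_token_id)
with torch.no_grad():
    out = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                decoder_input_ids=start,
                output_hidden_states=True)
dec_first_emb = out.decoder_hidden_states[-1][:, 0, :]  # (batch, dim)
```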
## Experiments
- Datasets: Utilized SentEval, SentGLUE, and various GLUE benchmark tasks.
- Metrics: Used classification accuracy for sentence transfer tasks and **Spearman correlation** for STS tasks (a small evaluation sketch follows this list).
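For STS, systems are ranked by how well the cosine similarities of embedding pairs correlate (Spearman) with human similarity scores. A minimal sketch of that protocol; the helper function below is mine, not from SentEval:

```python
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(emb_a, emb_b, gold_scores):
    """Spearman correlation between cosine similarities of sentence-pair
    embeddings and human-annotated STS scores."""
    emb_a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    emb_b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos_sims = (emb_a * emb_b).sum(axis=1)   # cosine of each pair
    corr, _ = spearmanr(cos_sims, gold_scores)
    return corr
```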
## Takeaways
- Even without fine-tuning, encoder-only ST5 models perform well on sentence transfer tasks, outperforming the previous state-of-the-art models.
- The encoder-decoder sentence embedding model establishes a new state of the art on STS.
- Contrastive learning is especially effective when fine-tuning T5-style pre-trained models, particularly with the proposed two-stage contrastive learning approach (a loss sketch follows this list).
- Training ST5 longer with a contrastive loss and on more data yields consistent gains on both sentence transfer and STS tasks.
- A new sentence representation transfer benchmark, SentGLUE, is introduced, extending the SentEval toolkit with nine tasks from the GLUE benchmark.
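The contrastive objective treats each sentence's paired sentence as the positive and the other pairs in the batch as negatives. A minimal PyTorch sketch of such an in-batch contrastive loss; the function name and temperature value are illustrative assumptions, not the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(src_emb, pos_emb, temperature=0.05):
    """For each source embedding, its paired embedding is the positive;
    every other pair in the batch serves as an in-batch negative."""
    src = F.normalize(src_emb, dim=1)            # cosine similarity via
    pos = F.normalize(pos_emb, dim=1)            # normalized dot products
    logits = src @ pos.t() / temperature         # (batch, batch) similarities
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)       # diagonal entries = positives
```

In the two-stage recipe, the first stage applies this kind of loss to mined question-answer pairs and the second stage fine-tunes on NLI entailment pairs; the paper additionally uses contradiction pairs as hard negatives, which this sketch omits.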
> The contents shared herein are quoted verbatim from the original author and are intended solely for personal note-taking and reference purposes following a thorough reading. Any interpretation or annotation provided is strictly personal and does not claim to reflect the author's intended meaning or context.