
Natural Language Processing - 李龍豪 (2024 Spring)

Course Information

Temporary Grading Policy:

  • Written report on a self-selected paper: 2020~2023 ACL/EMNLP (20%)
  • Oral presentation of an assigned paper (30%)
  • Final project implementation and report (50%)
    Chinese dimensional sentiment analysis
    • Training Set: Chinese EmoBank
    • Test Set: 1000+ Mental Health Texts

Date

2/22

Course Overview

Encoding

  • ASCII
  • Unicode
  • UTF-8
  • BIG-5 (Traditional Chinese)
  • GB_2312 (Simplified Chinese)
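
To see how these encodings differ in practice, the same Chinese string can be encoded under each of them; a minimal Python sketch (the example string is an assumption):

```python
# Encode the same Traditional Chinese string under different encodings.
text = "自然語言處理"

utf8_bytes = text.encode("utf-8")   # 3 bytes per CJK character -> 18 bytes
big5_bytes = text.encode("big5")    # 2 bytes per character -> 12 bytes
print(len(utf8_bytes), len(big5_bytes))

# ASCII cannot represent these characters at all.
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII cannot encode CJK text:", err)
```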

Words

  • Morphology
  • Word Segmentation
  • Part-of-Speech Tagging

Syntax

  • Constituency Grammars
  • Syntactic Parsing
  • Dependency Parsing

Semantics

  • Lexical Semantics
  • Semantic Role Labeling
  • Word Sense Disambiguation

Pragmatics

  • Coreference Resolution
  • Discourse Analysis
  • Simile vs Metaphor

Approach

  • Rule-Based Approach
  • Corpus-Based Approach
  • Statistical Language Models
  • Neural Language Models
  • Pre-Trained Language Models
  • Large Language Models

Applications

  • Machine Translation
  • Question Answering
  • Summarization
  • Dialog Systems and Chatbots
  • Grammatical/Spelling Error Correction
  • Sentiment Analysis

2/29

NLP Fundamental Tasks

Regular Expression

Text Normalization

  • Tokenizing (segmenting) words
  • Normalizing word formats
  • Segmenting sentences

Tokenization: the task of segmenting running text into words
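
As an illustration of regex-based tokenization with simple normalization (case folding), a minimal sketch; the pattern below is an assumption for English text, not the course's reference tokenizer:

```python
import re

def tokenize(text: str) -> list[str]:
    """Segment running text into word tokens with a simple regular expression."""
    text = text.lower()  # normalize word formats (case folding)
    # Words with optional internal apostrophes, or single punctuation marks.
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?|[^\w\s]", text)

print(tokenize("Mr. O'Neill doesn't like N-grams!"))
# ['mr', '.', "o'neill", "doesn't", 'like', 'n', '-', 'grams', '!']
```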

Named Entity Recognition (NER)

  • BiLSTM with CRF Model

Evaluation of NER

  • Recall
  • Precision
  • F-measure

$$F = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
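
A minimal sketch computing these metrics from entity-level counts of true positives, false positives, and false negatives (the counts are illustrative):

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Entity-level precision, recall, and F-measure for NER evaluation."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# e.g. 70 correctly predicted entities, 20 spurious, 30 missed
print(precision_recall_f1(70, 20, 30))  # approximately (0.778, 0.7, 0.737)
```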

3/7

Statistical Language Models

Language Model

Models that assign probability to upcoming words, or sequences of words.

N-gram Language Models

The n-gram model is the simplest kind of language model.

The probability of a word $w$ given some history $h$:

$$P(w \mid h)$$

The bigram model approximates the probability of a word given all the previous words by conditioning only on the preceding word:

$$P(w_n \mid w_{1:n-1}) \approx P(w_n \mid w_{n-1})$$

Chain Rule of Probability

Applying the chain rule of probability to words:

$$P(w_{1:n}) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_{1:2}) \cdots P(w_n \mid w_{1:n-1}) = \prod_{k=1}^{n} P(w_k \mid w_{1:k-1})$$
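
A minimal sketch of a bigram model estimated with maximum-likelihood counts, following the approximation above (the toy corpus is an assumption):

```python
from collections import Counter

corpus = [["<s>", "i", "am", "sam", "</s>"],
          ["<s>", "sam", "i", "am", "</s>"],
          ["<s>", "i", "do", "not", "like", "green", "eggs", "</s>"]]

unigram_counts = Counter(w for sent in corpus for w in sent)
bigram_counts = Counter((sent[i], sent[i + 1])
                        for sent in corpus for i in range(len(sent) - 1))

def p_bigram(word, prev):
    """MLE estimate P(word | prev) = count(prev, word) / count(prev)."""
    return bigram_counts[(prev, word)] / unigram_counts[prev]

print(p_bigram("i", "<s>"), p_bigram("am", "i"))  # 2/3 and 2/3
```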

Hidden Markov Models

Markov models are the class of probabilistic models that assume we can predict the probability of some future unit without looking too far into the past.

Three fundamental problems:

  • Likelihood
  • Decoding
  • Learning
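
For the decoding problem, the Viterbi algorithm finds the most likely hidden state sequence given the observations. A minimal numpy sketch; the transition and emission matrices below are toy values, not course data:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """obs: observation indices; pi: initial probs (N,); A: transitions (N, N); B: emissions (N, V)."""
    N, T = len(pi), len(obs)
    v = np.zeros((T, N))            # best path probability ending in state j at time t
    back = np.zeros((T, N), int)    # backpointers
    v[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = v[t - 1, :, None] * A * B[None, :, obs[t]]  # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        v[t] = scores.max(axis=0)
    path = [int(v[-1].argmax())]
    for t in range(T - 1, 0, -1):   # follow backpointers from the end
        path.append(int(back[t, path[-1]]))
    return path[::-1]

pi = np.array([0.6, 0.4])                         # two hidden states
A = np.array([[0.7, 0.3], [0.4, 0.6]])            # state transition probabilities
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])  # emission probabilities over 3 symbols
print(viterbi([0, 1, 2], pi, A, B))               # e.g. [0, 0, 1]
```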

Perplexity

$$\text{perplexity}(W) = P(w_1 w_2 \cdots w_N)^{-\frac{1}{N}} = \left( \prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{i-1})} \right)^{\frac{1}{N}}$$
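
A minimal sketch computing perplexity from the per-word conditional probabilities of a test sentence (the probabilities are toy values, e.g. from a bigram model):

```python
import math

def perplexity(probabilities):
    """Inverse probability of the word sequence, normalized by the number of words N."""
    n = len(probabilities)
    return math.exp(-sum(math.log(p) for p in probabilities) / n)

# P(w_i | w_{i-1}) for each of the N = 4 words in a toy test sentence
print(perplexity([2/3, 2/3, 1/2, 1/2]))  # about 1.73
```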

3/14

Embedding & Neural Language Models

Lexical Semantics

Connotation

  • Valence
  • Arousal
  • Dominance

Word-Word Co-occurrence Matrix

TF-IDF: Weighting terms in the vector

  • TF (Term Frequency)
  • IDF (Inverse Document Frequency)

$$\text{tf}_{t,d} = \begin{cases} 1 + \log_{10} \text{count}(t,d) & \text{if } \text{count}(t,d) > 0 \\ 0 & \text{otherwise} \end{cases}$$

$$\text{idf}_t = \log_{10}\left(\frac{N}{\text{df}_t}\right)$$

$$w_{t,d} = \text{tf}_{t,d} \times \text{idf}_t$$
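
A minimal sketch implementing the tf-idf weighting above over a toy collection of tokenized documents (the documents are illustrative):

```python
import math
from collections import Counter

docs = [["sweet", "love", "sorrow"],
        ["love", "love", "wit"],
        ["fool", "wit", "sorrow"]]

N = len(docs)
df = Counter(term for doc in docs for term in set(doc))  # document frequency of each term

def tfidf(term, doc):
    count = doc.count(term)
    tf = 1 + math.log10(count) if count > 0 else 0.0
    idf = math.log10(N / df[term])
    return tf * idf

print(tfidf("love", docs[1]))  # tf = 1 + log10(2), idf = log10(3/2)
print(tfidf("fool", docs[2]))  # tf = 1,            idf = log10(3/1)
```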

Word2vec
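
One common way to train word2vec-style embeddings is gensim's Word2Vec; a minimal sketch, assuming gensim 4.x and a toy tokenized corpus (real training needs a much larger corpus):

```python
from gensim.models import Word2Vec

sentences = [["natural", "language", "processing"],
             ["deep", "learning", "for", "natural", "language"],
             ["word", "embeddings", "capture", "word", "meaning"]]

# sg=1 selects the skip-gram architecture; sg=0 would train CBOW instead.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1, epochs=50)

print(model.wv["language"].shape)                 # (100,) dense vector
print(model.wv.most_similar("language", topn=3))  # nearest neighbors by cosine similarity
```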

Other Kinds of Static Embeddings

  • GloVe
  • fastText

Visualizing Embeddings

Probably the most common visualization method is to project the 100 dimensions of a word embedding down into 2 dimensions using a projection method called t-SNE.
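
A minimal sketch using scikit-learn's TSNE to project embedding vectors down to 2 dimensions for plotting; the random vectors below stand in for real 100-dimensional word embeddings:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

words = ["king", "queen", "man", "woman", "paris", "france"]
vectors = np.random.rand(len(words), 100)  # placeholder for real word embeddings

# Project 100 dimensions down to 2 (perplexity must be smaller than the number of points).
coords = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(vectors)

plt.scatter(coords[:, 0], coords[:, 1])
for word, (x, y) in zip(words, coords):
    plt.annotate(word, (x, y))
plt.show()
```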

3/21

Pre-trained Language Model

  • MLP: linear transformation
  • Feed-forward network: non-linear transformation (activation function)
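
The point of the activation is that a stack of purely linear layers collapses to a single linear map; the non-linearity between layers is what gives the feed-forward network its expressive power. A minimal PyTorch sketch (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

# Two linear layers with a non-linear activation in between; without the ReLU,
# the composition would be equivalent to one linear transformation.
feed_forward = nn.Sequential(
    nn.Linear(300, 128),  # input -> hidden (linear)
    nn.ReLU(),            # non-linearity (activation)
    nn.Linear(128, 3),    # hidden -> output scores
)

x = torch.randn(4, 300)       # a batch of 4 input vectors
print(feed_forward(x).shape)  # torch.Size([4, 3])
```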

Computational Graphs

A computational graph is a structure that represents the computation of a neural network. It connects the network's layers, nodes, and operations into a directed graph, which describes both forward and backward propagation and the flow of values between nodes.


Embedding Layer

  • An embedding layer converts high-dimensional data into a low-dimensional representation while preserving the characteristics of the original data.
  • For large inputs such as sparse vectors, an embedding layer makes the model easier to train.
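
A minimal PyTorch sketch of an embedding layer mapping token IDs into dense low-dimensional vectors (the vocabulary size and dimensions are arbitrary):

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10000, 128
embedding = nn.Embedding(vocab_size, embed_dim)

# A batch of 2 sequences of 4 token IDs each, instead of 10000-dimensional one-hot vectors.
token_ids = torch.tensor([[5, 42, 7, 0],
                          [13, 9, 9, 2]])
print(embedding(token_ids).shape)  # torch.Size([2, 4, 128])
```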

RNN & LSTM

RNN: any network that contains a cycle within its network connections

Bi-RNN

LSTM: forget/add/output gates

Bi-LSTM
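
A minimal PyTorch sketch of a bidirectional LSTM over a batch of embedded sequences; the output at each step concatenates the forward and backward hidden states (hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn

bilstm = nn.LSTM(input_size=128, hidden_size=64, num_layers=1,
                 batch_first=True, bidirectional=True)

x = torch.randn(2, 10, 128)      # (batch, sequence length, embedding dim)
outputs, (h_n, c_n) = bilstm(x)  # outputs concatenate both directions
print(outputs.shape)             # torch.Size([2, 10, 128]) = 2 * hidden_size
```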

GRU

CNN for NLP

3/28

Large Language Model

Class cancelled

4/4

Tomb Sweeping Day (Skip)

4/11

Large Language Model

Transformer

  • Backward looking (causal self-attention)
  • Bidirectional

How to compare words?

  • Inner product
  • Query / Key / Value
  • Attention scores are divided by the square root of the dimensionality of the query and key vectors (see the sketch below)
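
A minimal numpy sketch of scaled dot-product self-attention: queries are scored against keys with an inner product, divided by the square root of d_k, and softmax-normalized to weight the values (shapes are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                         # inner products, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # softmax over the keys
    return weights @ V                                      # weighted sum of values

x = np.random.randn(5, 64)
# In a real transformer, Q, K, V are learned linear projections of x (identity here for brevity).
print(scaled_dot_product_attention(x, x, x).shape)          # (5, 64)
```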

Multihead attention

Transformer block


Positional Embedding
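
One common choice is the fixed sinusoidal positional embedding from the original Transformer paper (learned position embeddings are the other common option). A minimal numpy sketch:

```python
import numpy as np

def sinusoidal_positional_embedding(max_len: int, d_model: int) -> np.ndarray:
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d))."""
    positions = np.arange(max_len)[:, None]                      # (max_len, 1)
    div = np.power(10000.0, np.arange(0, d_model, 2) / d_model)  # (d_model / 2,)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(positions / div)
    pe[:, 1::2] = np.cos(positions / div)
    return pe

pe = sinusoidal_positional_embedding(max_len=128, d_model=64)
print(pe.shape)  # (128, 64) -- added to the token embeddings before the first block
```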

Sampling

  • Top-K
  • Top-P
  • Temperature
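
A minimal numpy sketch of temperature, top-k, and top-p (nucleus) sampling from next-token logits (the logits are toy values):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None):
    """Sample a token index from logits after temperature, top-k, and top-p filtering."""
    logits = np.asarray(logits, dtype=float) / temperature  # temperature rescales the logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    if top_k is not None:                       # keep only the k most probable tokens
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)
        probs /= probs.sum()

    if top_p is not None:                       # keep the smallest set with cumulative mass >= p
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: int(np.searchsorted(cumulative, top_p)) + 1]
        filtered = np.zeros_like(probs)
        filtered[keep] = probs[keep]
        probs = filtered / filtered.sum()

    return int(np.random.choice(len(probs), p=probs))

print(sample_next_token([2.0, 1.0, 0.5, -1.0], temperature=0.8, top_k=3, top_p=0.9))
```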

A Survey of Large Language Models

4/18

Paper presentation

4/25

Paper presentation

5/2

Paper presentation

5/9

Project progress

I use MoE BERT.

5/16

Project related paper presentation

5/23

Project related paper presentation

5/30

Project related paper presentation

6/6

Project presentation

Reference