Temporary Grading Policy:
Course Overview
Encoding
Words
Syntax
Semantics
Pragmatics
Approach
Applications
NLP Fundamental Tasks
Regular Expression
Text Normalization
Tokenization: the task of segmenting running text into words
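A minimal rule-based tokenization sketch using Python's built-in re module; the pattern and example are illustrative simplifications, not a full tokenizer such as the Penn Treebank one.

```python
import re

# Toy tokenizer: words (with internal apostrophes) or single
# punctuation marks. Real tokenizers handle many more cases
# (abbreviations, URLs, hyphenation, multiword expressions).
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")

def tokenize(text):
    return TOKEN_RE.findall(text)

print(tokenize("Mr. O'Neill doesn't like N-grams!"))
# ['Mr', '.', "O'Neill", "doesn't", 'like', 'N', '-', 'grams', '!']
```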
Named Entity Recognition (NER)
Evaluation of NER
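NER systems are typically scored at the entity level with precision, recall, and F-measure; a sketch of the standard definitions:

```latex
P = \frac{\#\,\text{correctly predicted entities}}{\#\,\text{predicted entities}},\qquad
R = \frac{\#\,\text{correctly predicted entities}}{\#\,\text{gold entities}},\qquad
F_1 = \frac{2PR}{P+R}
```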
Statistical Language Models
Language Model
Models that assign probabilities to upcoming words, or to entire word sequences.
N-gram Language Models
The n-gram model is the simplest kind of language model.
The probability of a word depends on the words that precede it (its history).
The bigram model approximates the probability of a word given all the previous words by conditioning only on the single preceding word.
Chain Rule of Probability
Applying the chain rule to a sequence of words decomposes its joint probability into a product of conditional probabilities:
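```latex
P(w_{1:n}) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_{1:2})\cdots
           = \prod_{i=1}^{n} P(w_i \mid w_{1:i-1})
```

The bigram (first-order Markov) approximation then replaces each full history with just the preceding word:

```latex
P(w_i \mid w_{1:i-1}) \approx P(w_i \mid w_{i-1})
```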
Markov models are the class of probabilistic models that assume we can predict the probability of some future unit without looking too far into the past.
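A minimal count-based bigram model under the Markov assumption; the toy corpus and names are illustrative, and a real model would need smoothing for unseen bigrams.

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries.
corpus = [
    "<s> i am sam </s>",
    "<s> sam i am </s>",
    "<s> i do not like green eggs and ham </s>",
]

unigrams = Counter()
bigrams = Counter()
for sent in corpus:
    words = sent.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def p_bigram(w, prev):
    """Maximum-likelihood estimate P(w | prev) = C(prev, w) / C(prev)."""
    return bigrams[(prev, w)] / unigrams[prev]

print(p_bigram("i", "<s>"))   # 2/3: 'i' follows '<s>' in 2 of 3 sentences
print(p_bigram("sam", "am"))  # 1/2: 'am' occurs twice, followed by 'sam' once
```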
Three fundamental problems:
Perplexity
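The standard definition: perplexity is the inverse probability of the test set, normalized by the number of words.

```latex
\mathrm{PP}(W) \;=\; P(w_1 w_2 \ldots w_N)^{-\frac{1}{N}}
\;=\; \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{1:i-1})}}
```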
Embedding & Neural Language Models
Lexical Semantics
Connotation
Word-Word Co-occurrence Matrix
TF-IDF: Weighting terms in the vector
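One common log-scaled variant of the weighting (the formulation used in Jurafsky & Martin), where N is the number of documents and df_t is the number of documents containing term t:

```latex
w_{t,d} = \mathrm{tf}_{t,d} \times \mathrm{idf}_t,\qquad
\mathrm{tf}_{t,d} = \log_{10}\!\big(\mathrm{count}(t,d) + 1\big),\qquad
\mathrm{idf}_t = \log_{10}\!\frac{N}{\mathrm{df}_t}
```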
Word2vec
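A minimal training sketch, assuming the gensim library (4.x API); the corpus and hyperparameters are illustrative only.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (real embeddings
# need millions of tokens to be useful).
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
]

# sg=1 selects skip-gram; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

vec = model.wv["cat"]                      # 50-dim static embedding
print(model.wv.similarity("cat", "dog"))   # cosine similarity
```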
Other Kinds of Static Embeddings
Visualizing Embeddings
Probably the most common visualization method is to project the 100 dimensions of a word down into 2 dimensions using a projection method called t-SNE.
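A sketch using scikit-learn's TSNE; the word list and random vectors below are stand-ins for real pre-trained embeddings.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
words = ["king", "queen", "man", "woman", "apple", "banana"]
embeddings = rng.normal(size=(len(words), 100))  # stand-in for real 100-d vectors

# Project 100 dimensions down to 2 for plotting.
# perplexity must be smaller than the number of samples.
coords = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(embeddings)

for word, (x, y) in zip(words, coords):
    print(f"{word}: ({x:.2f}, {y:.2f})")
```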
Pre-trained Language Model
Computational Graphs
A computational graph is a structure for representing the computation performed by a neural network. It connects the network's layers, nodes, and operations into a directed graph. The graph describes the forward and backward passes and the flow of values between nodes.
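A minimal sketch assuming PyTorch, whose autograd builds the directed graph dynamically during the forward pass and traverses it in reverse to compute gradients:

```python
import torch

# Forward pass: each operation adds a node to the computational graph.
x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)
y = w * x + 1.0        # nodes: multiply, then add
loss = y ** 2          # node: square

# Backward pass: traverse the graph in reverse, accumulating gradients.
loss.backward()

print(loss.item())  # (3*2 + 1)^2 = 49
print(x.grad)       # d(loss)/dx = 2*y*w = 2*7*3 = 42
print(w.grad)       # d(loss)/dw = 2*y*x = 2*7*2 = 28
```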
Embedding Layer
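An embedding layer is just a learned lookup table mapping token ids to dense vectors; a tiny illustrative sketch (sizes and random table are placeholders):

```python
import numpy as np

vocab_size, d = 10, 4
E = np.random.default_rng(0).normal(size=(vocab_size, d))  # embedding table

token_ids = np.array([3, 1, 7])
vectors = E[token_ids]      # embedding lookup: one row per token
print(vectors.shape)        # (3, 4)
```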
RNN & LSTM
RNN: any network that contains a cycle within its network connections (a minimal cell sketch for both RNN and LSTM follows this list)
Bi-RNN
LSTM: forget/add/output gates
Bi-LSTM
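A minimal NumPy sketch of both recurrent cells; shapes, the random seed, and the untrained weights are illustrative (real weights would come from backpropagation through time):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d_in, d_h = 4, 3
rng = np.random.default_rng(0)

# Simple (Elman) RNN weights: h_t = tanh(U x_t + V h_{t-1} + b)
U = rng.normal(scale=0.1, size=(d_h, d_in))
V = rng.normal(scale=0.1, size=(d_h, d_h))
b_h = np.zeros(d_h)

def rnn_step(x_t, h_prev):
    return np.tanh(U @ x_t + V @ h_prev + b_h)

# LSTM weights: one matrix per gate, applied to [x_t; h_{t-1}]
W = {k: rng.normal(scale=0.1, size=(d_h, d_in + d_h)) for k in "fioc"}
b = {k: np.zeros(d_h) for k in "fioc"}

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([x_t, h_prev])
    f = sigmoid(W["f"] @ z + b["f"])   # forget gate: what to erase from c
    i = sigmoid(W["i"] @ z + b["i"])   # add (input) gate: what to write to c
    o = sigmoid(W["o"] @ z + b["o"])   # output gate: what to expose as h
    g = np.tanh(W["c"] @ z + b["c"])   # candidate cell content
    c = f * c_prev + i * g             # new cell state
    h = o * np.tanh(c)                 # new hidden state
    return h, c

xs = rng.normal(size=(5, d_in))        # a toy sequence of 5 input vectors
h_rnn = np.zeros(d_h)
h_lstm, c = np.zeros(d_h), np.zeros(d_h)
for x_t in xs:
    h_rnn = rnn_step(x_t, h_rnn)
    h_lstm, c = lstm_step(x_t, h_lstm, c)
print(h_rnn, h_lstm)
```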
Large Language Model
Class canceled
Tomb Sweeping Day (Skip)
Large Language Model
Transformer
How to compare words? (via query-key dot products; see the attention sketch after this list)
Multihead attention
Transformer block
Positional Embedding
Sampling
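A minimal NumPy sketch of scaled dot-product attention, the core word-comparison operation in the Transformer topics above; multi-head attention runs several of these in parallel over projected subspaces. Shapes and the random inputs are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # each word compared with every other word
    weights = softmax(scores)          # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
n_words, d_k = 5, 8
Q = rng.normal(size=(n_words, d_k))   # queries
K = rng.normal(size=(n_words, d_k))   # keys
V = rng.normal(size=(n_words, d_k))   # values
print(scaled_dot_product_attention(Q, K, V).shape)  # (5, 8)
```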
A Survey of Large Language Models
Paper presentation
Paper presentation
Paper presentation
Project progress
I use MoE BERT.
Project related paper presentation
Project related paper presentation
Project related paper presentation
Project presentation