# ADL Lecture 3.1: Word Representations Notes
###### tags: `NLP`

{%youtube p2e_riORjuU %}

## :memo: Meaning Representations in Computers

### Type 1: Knowledge-Based Representation
- WordNet: a lexical database built by linguists that encodes relations between words.
- Problems:
    - Subjective: reflects the linguists' own judgments
    - Cannot handle newly coined words
    - Requires manual annotation

### Type 2: Corpus-Based Representation
- **Atomic symbols**: one-hot representation
    - Problem: cannot capture any relationship between words.
- Neighbor-based representation
    - Neighbor defined as the full document
    - Neighbor defined as a window
        - Window-based co-occurrence matrix
        - Problem: the matrix grows as the vocabulary grows

#### Improvement: Low-Dimensional Dense Word Vectors
- Method 1: dimension reduction on the matrix
    - Singular Value Decomposition (SVD) of the co-occurrence matrix X
    > Use SVD to reduce the dimensionality of the matrix S (from r down to k)

    Problem: SVD is computationally expensive
- Method 2: directly learn low-dimensional word vectors
    - Recent and most popular models: **word2vec** (Mikolov et al., 2013) and **GloVe** (Pennington et al., 2014)
    > Also known as "word embeddings"

---
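A minimal sketch of the one-hot representation described above, using a hypothetical three-word vocabulary (not from the lecture). It shows why one-hot vectors carry no similarity information: any two distinct one-hot vectors are orthogonal.

```python
import numpy as np

# Hypothetical tiny vocabulary for illustration only.
vocab = ["king", "queen", "apple"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Return the one-hot vector for a word over the vocabulary."""
    vec = np.zeros(len(vocab))
    vec[word_to_idx[word]] = 1.0
    return vec

# Distinct one-hot vectors always have dot product 0, regardless of
# how related the words actually are.
print(one_hot("king") @ one_hot("queen"))  # 0.0
```

"king" and "queen" are semantically close, yet their one-hot dot product is identical to that of "king" and "apple", which is exactly the weakness noted above.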
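The window-based co-occurrence matrix can be sketched as follows, using a hypothetical two-sentence toy corpus and a window of one word on each side (both are illustrative assumptions, not from the lecture):

```python
import numpy as np

# Hypothetical toy corpus; window size 1 (one word on each side).
corpus = [["i", "like", "nlp"], ["i", "like", "deep", "learning"]]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Build the symmetric |V| x |V| window-based co-occurrence matrix X.
window = 1
X = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                X[idx[w], idx[sent[j]]] += 1
```

Because X is |V| x |V|, its size grows quadratically with the vocabulary, which is the scaling problem pointed out above.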