2021-01-28 - HackMD

2021-01-28 === ###### tags: `Embedding` ### Word Embedding Layer를 학습시키는 방법 1. CBow: 주변의 단어(window size)로 하나의 단어를 추측하는 기법 2. Skip-gram: 하나의 단어로 주변 단어들을 추측하는 기법 ``` 입력층 [1 * 단어 벡터 길이] * 가중치 [단어 벡터 길이 * 은닉층 길이] = 은닉층 [1 * 은닉층 길이] 은닉층 [1 * 은닉층 길이] * 가중치 [은닉층 길이 * 단어 벡터 길이] = 출력층 [1 * 단어 벡터 길이] ``` ### Skip-gram sample ```python= import sys sys.path.append('..') import numpy as np from common.layers import MatMul, SoftmaxWithLoss class SimpleSkipGram: def __init__(self, vocab_size, hidden_size): V, H = vocab_size, hidden_size # 가중치 초기화 W_in = 0.01 * np.random.randn(V, H).astype('f') W_out = 0.01 * np.random.randn(H, V).astype('f') # 계층 생성 self.in_layer = MatMul(W_in) self.out_layer = MatMul(W_out) self.loss_layer1 = SoftmaxWithLoss() self.loss_layer2 = SoftmaxWithLoss() # 모든 가중치와 기울기를 리스트에 모은다. layers = [self.in_layer, self.out_layer] self.params, self.grads = [], [] for layer in layers: self.params += layer.params self.grads += layer.grads # 인스턴스 변수에 단어의 분산 표현을 저장한다. self.word_vecs = W_in def forward(self, contexts, target): h = self.in_layer.forward(target) s = self.out_layer.forward(h) l1 = self.loss_layer1.forward(s, contexts[:, 0]) l2 = self.loss_layer2.forward(s, contexts[:, 1]) loss = l1 + l2 return loss def backward(self, dout=1): dl1 = self.loss_layer1.backward(dout) dl2 = self.loss_layer2.backward(dout) ds = dl1 + dl2 dh = self.out_layer.backward(ds) self.in_layer.backward(dh) return None ``` 참고자료 - https://heung-bae-lee.github.io/2020/01/16/NLP_01/