# ML project note

## Instruction

- no ML package :cry:
- code + report
- instructions on how to run code
- submit system's output (same format as training set)

## Summary

main goal:

- sequence labelling model for informal texts using Hidden Markov Model (HMM)
- build two sentiment analysis systems for a different language from scratch, using our own annotations (?)
- also using annotations from others (?)

En.zip contains:

- `train`: labelled training set

  ```
  Municipal B-NP
  bonds I-NP
  are B-VP
  generally B-ADVP
  a B-ADJP
  bit I-ADJP
  ```

- `dev.in`: unlabelled development set
- `dev.out`: `dev.in` but with labels

  ```
  HBO B-NP
  has B-VP
  close B-NP
  to I-NP
  24 I-NP
  million I-NP
  subscribers I-NP
  ```

::: info
labels:

- O: outside of any entity
- B-{sentiment}, I-{sentiment}: Beginning and Inside of sentiment entities
  --> sentiment can be "positive", "negative" or "neutral"
  --> what is "B-NP" then?
:::

## Refs

### HMM

A stochastic process is a collection of random variables indexed by a mathematical set,
e.g. states $S = \{\text{hot}, \text{cold}\}$

A series of states over time --> $z \in S^T$

the weather over 4 days can be a sequence --> $\{z_1=\text{hot},\ z_2=\text{cold}, \dots\}$

#### Assumptions

1. Limited horizon assumption
   The probability of the state at time $t$ depends only on the state at time $t-1$:
   $$
   P(z_t \mid z_{t-1}, z_{t-2}, \dots) = P(z_t \mid z_{t-1})
   $$
2. Stationary process assumption
   The conditional probability does not change over time, i.e.
   $$
   P(z_t \mid z_{t-1}) = P(z_2 \mid z_1), \quad t \in \{2, \dots, T\}
   $$

#### Maximum Likelihood Estimation

> [Theory](https://towardsdatascience.com/the-path-from-maximum-likelihood-estimation-to-hidden-markov-models-61aba5ba901c)

MLE is a method to estimate the parameters of a distribution from observed samples.

first define the problem, we have:

- distribution $D_\theta$
- samples $S = (x_1, \dots, x_N)$
- parameter space: range of possible values for $\theta$
  - Bernoulli: $(0, 1)$
  - Gaussian: $\mathbb{R} \times \mathbb{R}_{>0}$ (mean, variance)

We do not know the actual $\theta$, so we want to estimate it using $S$.

the likelihood is defined as

$$
L(\theta) = \prod_{i=1}^{N} P[X = x_i]
$$

For Bernoulli, it is:

$$
L(\theta) = \prod_{i=1}^{N} \theta^{x_i} (1-\theta)^{1-x_i}
$$

For Bernoulli, the log likelihood is:

> [derivation](https://towardsdatascience.com/the-path-from-maximum-likelihood-estimation-to-hidden-markov-models-61aba5ba901c)

$$
\ell(\theta; x) = \sum_{i=1}^{N} \log\left(\theta^{x_i} (1-\theta)^{1-x_i}\right)
$$

Setting the derivative w.r.t. $\theta$ to zero:

$$
\frac{1}{\theta} \sum_i x_i - \frac{1}{1-\theta} \sum_i (1 - x_i) = 0
\;\Longrightarrow\;
(1-\theta) \sum_i x_i = \theta \sum_i (1 - x_i)
\;\Longrightarrow\;
\hat\theta = \frac{1}{N} \sum_i x_i
$$

For an HMM, we can use Expectation Maximization (EM), which uses an iterative process to perform MLE in statistical models with latent variables.
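As a quick sanity check on the Bernoulli derivation above, a minimal sketch in plain Python (no ML packages, per the instructions); the sample data is made up for illustration:

```python
import math

# Toy Bernoulli sample (made up for illustration).
samples = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

def log_likelihood(theta, xs):
    """l(theta; x) = sum_i log(theta^x_i * (1 - theta)^(1 - x_i))."""
    return sum(x * math.log(theta) + (1 - x) * math.log(1 - theta) for x in xs)

# Closed-form MLE from setting the derivative to zero: theta_hat = mean(x).
theta_hat = sum(samples) / len(samples)
print(theta_hat)  # 0.7

# Sanity check: the log-likelihood at theta_hat beats nearby values.
for theta in (theta_hat - 0.1, theta_hat, theta_hat + 0.1):
    print(theta, log_likelihood(theta, samples))
```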
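Back to the project data: since the `train` set is fully labelled (no latent variables), the HMM transition and emission parameters can be fitted by MLE directly, which reduces to relative-frequency counting. A minimal sketch under assumptions, not the project's actual code: the parsing helper assumes one `word TAG` pair per line with blank lines separating sentences, and the `START`/`STOP` boundary states and the toy sentence are illustrative.

```python
from collections import defaultdict

def read_sentences(path):
    """Parse a file in the `train` format: one 'word TAG' per line,
    blank lines separating sentences (assumed; adjust to the real data)."""
    sentences, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                if current:
                    sentences.append(current)
                    current = []
            else:
                word, tag = line.rsplit(" ", 1)
                current.append((word, tag))
    if current:
        sentences.append(current)
    return sentences

def estimate_hmm(sentences):
    """MLE for transition P(y_t | y_{t-1}) and emission P(x_t | y_t)
    by relative-frequency counts (first-order, stationary HMM)."""
    trans_counts = defaultdict(lambda: defaultdict(int))
    emit_counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        prev = "START"  # hypothetical boundary state
        for word, tag in sent:
            trans_counts[prev][tag] += 1
            emit_counts[tag][word] += 1
            prev = tag
        trans_counts[prev]["STOP"] += 1
    trans = {p: {t: c / sum(nxt.values()) for t, c in nxt.items()}
             for p, nxt in trans_counts.items()}
    emit = {t: {w: c / sum(ws.values()) for w, c in ws.items()}
            for t, ws in emit_counts.items()}
    return trans, emit

# Tiny made-up example in the same format as the train snippet above.
toy = [[("Municipal", "B-NP"), ("bonds", "I-NP"), ("are", "B-VP")]]
trans, emit = estimate_hmm(toy)
print(trans["START"]["B-NP"])     # 1.0
print(emit["B-NP"]["Municipal"])  # 1.0
```

EM would only be needed if the tag sequences were unobserved; with labelled data the counts above already maximize the likelihood.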