ML project note

Instructions

  • no ML packages :cry:
  • code + report
    • instructions on how to run the code
    • submit the system's output (same format as the training set)

Summary

main goal:

  • sequence labelling model for informal texts using a Hidden Markov Model (HMM)
  • build two sentiment analysis systems, each for a different language, from scratch, using our own annotations (?)
  • also using annotations from others (?)

En.zip contains:

  • train: labelled training set
    Municipal B-NP
    bonds I-NP
    are B-VP
    generally B-ADVP
    a B-ADJP
    bit I-ADJP
    
  • dev.in: unlabelled development set
  • dev.out: dev.in but with labels
    HBO B-NP
    has B-VP
    close B-NP
    to I-NP
    24 I-NP
    million I-NP
    subscribers I-NP
    

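Both labelled files use the same `token TAG` per-line format, with blank lines separating sentences. A minimal reader sketch (the function name and file handling are my own assumptions, not part of the handout):

```python
def read_labelled(path):
    # Parse `token TAG` lines; blank lines separate sentences.
    sentences, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:                       # sentence boundary
                if current:
                    sentences.append(current)
                    current = []
            else:
                token, tag = line.rsplit(" ", 1)
                current.append((token, tag))
    if current:                                # trailing sentence with no blank line
        sentences.append(current)
    return sentences
```

dev.in can be read the same way, minus the `rsplit`, since each of its lines is just a token.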
labels:

  • O: outside of any entity
  • B-{sentiment}, I-{sentiment}: Beginning and Inside of a sentiment entity (see the span sketch below)
    > sentiment can be "positive", "negative", or "neutral"
    > what is "B-NP" then?
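Whatever the label set, the B-/I- prefixes group consecutive tokens into entity spans; a small sketch of that grouping (my own hypothetical helper):

```python
def bio_spans(tags):
    # Group BIO tags into (label, start, end) spans; end is exclusive.
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        starts_new = tag.startswith("B-")
        continues = tag.startswith("I-") and tag[2:] == label and start is not None
        if start is not None and not continues and not starts_new:
            spans.append((label, start, i))    # span ended by O or a new label
            start, label = None, None
        if starts_new:
            if start is not None:              # close the previous span
                spans.append((label, start, i))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is None:
            start, label = i, tag[2:]          # tolerate I- without a preceding B-
    if start is not None:
        spans.append((label, start, len(tags)))
    return spans

print(bio_spans(["B-NP", "I-NP", "B-VP", "O"]))  # [('NP', 0, 2), ('VP', 2, 3)]
```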

Refs

HMM

A stochastic process is a collection of random variables indexed by a mathematical set,
e.g. states $S = \{\text{hot}, \text{cold}\}$.
A state series over time is a vector $z \in S^T$,
e.g. the weather for 4 days can be a sequence $\{z_1 = \text{hot}, z_2 = \text{cold}, \ldots\}$.

assumption

  1. Limited horizon assumption
    The probability of the state at time $t$ depends only on the state at time $t-1$:
    $P(z_t \mid z_{t-1}, z_{t-2}, \ldots) = P(z_t \mid z_{t-1})$
  2. Stationary process assumption
    The conditional probability does not change over time, i.e.
    $P(z_t \mid z_{t-1}) = P(z_2 \mid z_1)$ for $t = 2, \ldots, T$
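
Together, these two assumptions let the joint probability of a state sequence factor into an initial term and first-order transitions (the first equality is the chain rule, the second applies the limited horizon assumption):

$P(z_1, \ldots, z_T) = P(z_1) \prod_{t=2}^{T} P(z_t \mid z_1, \ldots, z_{t-1}) = P(z_1) \prod_{t=2}^{T} P(z_t \mid z_{t-1})$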

Maximum Likelihood Estimation

Theory

MLE is a method to estimate the parameters of a distribution based on observed samples.
First, define the problem; we have:

  • a distribution $D_\theta$
  • samples $S = (x_1, \ldots, x_N)$
  • parameter space: the range of possible values for $\theta$
    • Bernoulli: $(0, 1)$
    • Gaussian: $\mathbb{R} \times \mathbb{R}$

We do not know the actual $\theta$, so we want to estimate it using $S$.
The likelihood is defined as
$L(\theta; S) = \prod_{i=1}^{N} P[X = x_i]$

For Bernoulli, it is defined as:
$L(\theta; S) = \prod_{i=1}^{N} \theta^{x_i} (1-\theta)^{1-x_i}$

For Bernoulli, calculate the log-likelihood:

derivation

$\ell(\theta; S) = \sum_{i=1}^{N} \log\left(\theta^{x_i} (1-\theta)^{1-x_i}\right) = \log\theta \sum_i x_i + \log(1-\theta) \sum_i (1-x_i)$

Set the derivative to zero:

$\frac{1}{\theta} \sum_i x_i - \frac{1}{1-\theta} \sum_i (1-x_i) = 0 \;\Rightarrow\; (1-\theta) \sum_i x_i = \theta \sum_i (1-x_i) \;\Rightarrow\; \hat{\theta} = \frac{1}{N} \sum_i x_i$
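
A quick numeric sanity check of that closed form (toy data and helper name are my own): maximize the log-likelihood over a grid and compare against the sample mean.

```python
import math

def bernoulli_loglik(theta, xs):
    # Log-likelihood of i.i.d. Bernoulli samples xs under parameter theta.
    return sum(x * math.log(theta) + (1 - x) * math.log(1 - theta) for x in xs)

xs = [1, 0, 1, 1, 0, 1, 1, 0]                     # toy samples
grid = [i / 1000 for i in range(1, 1000)]         # theta values in (0, 1)
best = max(grid, key=lambda t: bernoulli_loglik(t, xs))
print(best, sum(xs) / len(xs))                    # both should be 0.625
```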

For HMMs, we can use Expectation-Maximization (EM), which uses an iterative process to perform MLE in statistical models with latent variables.
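
To see the E-step/M-step shape without the full Baum-Welch machinery for HMMs, here is a toy EM for a two-component Bernoulli mixture (the model choice and all names are my own; in the HMM version the per-sample posteriors become forward-backward quantities over state sequences):

```python
import random

def em_bernoulli_mixture(xs, n_iter=50, seed=0):
    # EM for a two-component Bernoulli mixture; which component generated
    # each sample is the latent variable.
    rng = random.Random(seed)
    pi = 0.5                                          # P(component 1)
    theta = [rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)]
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 1 for each sample
        resp = []
        for x in xs:
            p1 = pi * theta[0] ** x * (1 - theta[0]) ** (1 - x)
            p2 = (1 - pi) * theta[1] ** x * (1 - theta[1]) ** (1 - x)
            resp.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted Bernoulli MLE (same closed form as above)
        n1 = sum(resp)
        pi = n1 / len(xs)
        theta[0] = sum(r * x for r, x in zip(resp, xs)) / n1
        theta[1] = sum((1 - r) * x for r, x in zip(resp, xs)) / (len(xs) - n1)
    return pi, theta
```

The point is the loop structure: compute posteriors over the latent assignments in the E-step, then plug them into weighted MLE updates in the M-step; Baum-Welch follows the same pattern with expected transition and emission counts.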