# Unsupervised CPD algorithms

## Probability Density Ratio

These methods analyze the probability distributions of the data before and after a candidate change point, and identify the candidate as a change point if the two distributions are significantly different. In these approaches, the logarithm of the likelihood ratio between two consecutive intervals of the time series is monitored to detect change points.

### Steps

1. The probability density of each of the two consecutive intervals is calculated separately.
2. The ratio of these probability densities is computed.

### Methods

### 1. Based on log-likelihood ratio

#### Parametric

- CUSUM
  - https://nbviewer.jupyter.org/github/demotu/BMC/blob/master/notebooks/DetectCUSUM.ipynb
- Autoregression
  - Reduces the problem of change point detection to time-series-based outlier detection.
  - Fits an autoregressive model to the data to represent the statistical behavior of the time series, and updates its parameter estimates incrementally so that the effect of past examples is gradually discounted.

#### Non-Parametric

### Most basic sliding window (possibly the base model)

Steps:

- Choose a small window size (e.g. 10).
- Subtract the mean of the first half of the window from the mean of the second half.
- The change point is the point where the discrepancy is largest.

#### Frequentist

http://www.claudiobellei.com/2016/11/15/changepoint-frequentist/

#### Direct Density-ratio Estimation

- Knowing the two densities implies knowing the density ratio, but not vice versa, so the ratio can be estimated directly without estimating each density.
- Different dissimilarity measures:
  - KLIEP
  - uLSIF
  - RuLSIF
  - SPLL (semi-parametric log-likelihood change detector)
- Python: https://pypi.org/project/densratio/
  - Use it with a sliding window to do direct density-ratio estimation on the sentiment scores. http://www.ms.k.u-tokyo.ac.jp/2012/CDKLIEP.pdf
- https://github.com/chenxingqiang/rulsif_abrupt-change_detection (RuLSIF)

#### Applying exponential decay on top of the methods above

- Does not require the number of change points to be given.
- https://de.mathworks.com/matlabcentral/answers/343452-exponential-growth-decay-point-detection-in-a-time-series-plot

### 2. Subspace Model Methods

An analysis of the subspaces in which time series sequences are constrained (control theory).

- Subspace identification
- Singular spectrum
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.6340&rep=rep1&type=pdf

### 3. Probabilistic Methods

- Bayesian change point detection
  - https://arxiv.org/abs/0710.3742
  - BCPD compares new sliding window features with the estimation based on all previous intervals from the same state.
  - https://github.com/hildensia/bayesian_changepoint_detection

### 4. Kernel Based Methods

- Unsupervised kernel-based test statistic to test the homogeneity of data in the past and present sliding windows of a time series.
- RKHS (reproducing kernel Hilbert space)
  - http://www.gatsby.ucl.ac.uk/~gretton/coursefiles/lecture4_introToRKHS.pdf

### 5. Graph-based methods

### 6. Clustering methods
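
### Example: basic sliding window

The basic sliding-window baseline listed under the non-parametric methods (split a small window in half, compare the two half means, pick the largest discrepancy) can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it assumes NumPy, a one-dimensional series, and a single change point. The function name `sliding_window_change_point` and the toy signal are hypothetical and exist only for this example.

```python
import numpy as np


def sliding_window_change_point(series, window=10):
    """Return (index, scores) for the largest half-window mean shift.

    For each candidate position t, the score is
    |mean(series[t:t+h]) - mean(series[t-h:t])| with h = window // 2.
    The candidate with the largest score is returned as the single
    estimated change point.
    """
    x = np.asarray(series, dtype=float)
    half = window // 2
    if half < 1 or x.size < 2 * half:
        raise ValueError("series must contain at least one full window")

    # Candidate change points: every t with a full half-window on each side.
    candidates = np.arange(half, x.size - half + 1)
    scores = np.array([
        abs(x[t:t + half].mean() - x[t - half:t].mean())
        for t in candidates
    ])
    best = int(candidates[np.argmax(scores)])
    return best, scores


# Toy data (hypothetical): the mean jumps from 0 to 3 at index 50.
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0.0, 0.5, 50), rng.normal(3.0, 0.5, 50)])
change_point, scores = sliding_window_change_point(signal, window=10)
print(change_point)
```

Taking the single `argmax` matches the "biggest discrepancy" rule but only ever reports one change point. Detecting several would need a threshold or peak-picking step on `scores`, which is left out here.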