# Midterm [Practical Part]
* We want to approximate the data using local linear regressions and Fuzzy C-Means (FCM) clustering

## Create a set of clusters from the data using FCM.
* Extend your 1-D clustering implementation to N-D
* After fitting, each point has:
* N_CLUSTERS membership probabilities
* argmax over the N_CLUSTERS probabilities, i.e. the assigned cluster
* I used N_CLUSTERS = 3:
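A minimal N-D FCM sketch using numpy; the function name, the fuzziness exponent `m = 2.0`, and the random initialization are my assumptions, not part of the assignment, but the update rules are the standard FCM ones:

```python
import numpy as np

def fcm(X, n_clusters=3, m=2.0, n_iter=100, seed=0):
    """Fuzzy C-Means on X of shape (N_POINTS, N_DIMS).
    Returns (centers, U) where U[j, i] is the membership of point j in cluster i."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # each row is a probability vector
    for _ in range(n_iter):
        W = U ** m                                # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # squared distances point-to-center, shape (N_POINTS, N_CLUSTERS)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # standard membership update: u ∝ d^(-2/(m-1))
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U
```

After `fit`, `U.argmax(axis=1)` gives the corresponding cluster of each point, as described above.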

## For each cluster, fit linear regression locally
### Normal equation

* For each cluster we fit a linear regression: [theta0, theta1]
* Finally we have N_CLUSTERS linear regressions, shape (N_CLUSTERS, 2)
### Draw these regressions:
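A sketch of the per-cluster fit via the normal equation; here each point is weighted by its FCM membership (an assumption on my part: a hard argmax split works the same way with 0/1 weights):

```python
import numpy as np

def fit_local_linregs(x, y, U):
    """x, y: (N_POINTS,); U: (N_POINTS, N_CLUSTERS) memberships.
    Returns a of shape (N_CLUSTERS, 2); each row is [theta0, theta1]."""
    Xb = np.column_stack([np.ones_like(x), x])    # prepend a bias column
    a = np.empty((U.shape[1], 2))
    for i in range(U.shape[1]):
        W = np.diag(U[:, i])                      # per-point weights for cluster i
        # weighted normal equation: (X^T W X) theta = X^T W y
        a[i] = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return a
```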

## Create final predictor (degranularized model)

* Ai(xj) is the probability that point xj belongs to cluster i
* ai are the coefficients of the i-th linear regression
* shape(A) is (N_POINTS, N_CLUSTERS)
* shape(a) is (N_CLUSTERS, 2)
* shape(X) is (N_POINTS, 2) // the second dimension is for the bias term
## This formula can be vectorized
y = (X @ a.T * A).sum(1)
* @ is matrix multiplication
* "*" is elementwise multiplication
* .sum(1) is summation over axis 1 (the second dimension)
* shapes: ((N_POINTS, 2) @ (2, N_CLUSTERS) "*" (N_POINTS, N_CLUSTERS)).sum(1) -> (N_POINTS,)
### So for any xi we get a weighted sum of the linear regressions' outputs, where each weight is the probability that the point belongs to that cluster.
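The vectorized formula above as a small function, with the shapes exactly as listed (the function name is mine):

```python
import numpy as np

def predict(X, a, A):
    """X: (N_POINTS, 2) with a bias column, a: (N_CLUSTERS, 2) regression
    coefficients, A: (N_POINTS, N_CLUSTERS) memberships -> (N_POINTS,)."""
    # X @ a.T gives every regression's output for every point,
    # then each output is weighted by the point's cluster membership
    return (X @ a.T * A).sum(1)
```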
## Calculate mse

* Implement mean squared error from scratch or use sklearn's `mean_squared_error`
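The from-scratch variant is one line with numpy and matches `sklearn.metrics.mean_squared_error`:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return ((y_true - y_pred) ** 2).mean()
```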
## Plot degranularized model
