# Clustering results: 3<sup>rd</sup> revision
<style>
.markdown-body {
max-width: 1158px !important;
}
img {
/* max-width: 50% !important; */
}
</style>
**Table: Feature importance in fsfc package (k=3 for WK-Means and Lasso)**
```
Normalized cut         WK-Means (k=2)         WK-Means (k=3)         Manual
---------------------  ---------------------  ---------------------  --------
PR           0.802254  LAEDVI       0.041681  IVSD         0.051811  SBP
IVRT         0.772498  REAM         0.041303  LVPWD        0.048019  PP
LVIDD        0.762202  LA_ADI       0.040180  LVMI         0.045703  LVMI
PP           0.732354  REEM         0.039521  RWT          0.039969  PR
AO_DIAM      0.731632  RMVEA        0.037760  EM           0.038479  REEM
IVSD         0.713138  LVMI         0.037075  REEM         0.037368  ESV_MODI
LVMI         0.704317  IVSD         0.036434  LAEDVI       0.037309  LAESVI
ESV_MODI     0.681824  LAESVI       0.036033  LA_ADI       0.035415  LA_GS
SV_MODI      0.657734  LA_A_4CH     0.035375  SBP          0.034432  MVE_VEL
RWT          0.653140  LA_EF_4CH    0.034970  LAESVI       0.033622  MVA_VEL
AM           0.639099  RWT          0.034669  MV_DECT      0.033534  RMVEA
REAM         0.615519  LA_ASI       0.034510  PP           0.032404  AM
SBP          0.613266  MV_DECT      0.034405  AM           0.032150  EM
LVPWD        0.609336  AO_DIAM      0.033465  ESV_MODI     0.031163  GS
SM           0.519486  ESV_MODI     0.032342  GS           0.030588  MV_DECT
EF_MOD       0.514664  GS           0.031633  LA_ASI       0.030222
RMVEA        0.499584  PP           0.031023  LA_EF_4CH    0.030113
MVE_VEL      0.480564  PR           0.030647  LA_A_4CH     0.029932
GS           0.478983  SV_MODI      0.030499  PR           0.029586
LA_GS        0.468251  EF_MOD       0.030433  RMVEA        0.029519
EM           0.458699  SBP          0.030161  REAM         0.029458
MVA_VEL      0.389319  IVRT         0.030082  SV_MODI      0.029443
LAESVI       0.385018  LVPWD        0.029924  EF_MOD       0.029379
LA_ASI       0.364478  LA_GS        0.029730  IVRT         0.029041
LA_A_4CH     0.305171  MVA_VEL      0.029682  AO_DIAM      0.028904
LA_EF_4CH    0.278209  SM           0.029621  LA_GS        0.028699
LA_ADI       0.276537  AM           0.029408  MVA_VEL      0.028653
MV_DECT      0.274160  LVIDD        0.029390  SM           0.028595
LAEDVI       0.258937  MVE_VEL      0.029125  LVIDD        0.028373
REEM         0.195259  EM           0.028918  MVE_VEL      0.028117
```
<br>
**Table: Feature importance from ClustVarSelLCM for k=2**
<table>
<tbody>
<tr><td>Variables</td><td>Discrim. Power</td><td>Discrim. Power (%)</td><td>Discrim. Power (% cum)</td></tr>
<tr><td>REAM </td><td>643.22 </td><td>9.79 </td><td>9.79 </td></tr>
<tr><td>EM </td><td>557.63 </td><td>8.49 </td><td>18.28 </td></tr>
<tr><td>RMVEA </td><td>428.47 </td><td>6.52 </td><td>24.81 </td></tr>
<tr><td>LAEDVI </td><td>396.74 </td><td>6.04 </td><td>30.85 </td></tr>
<tr><td>REEM </td><td>385.05 </td><td>5.86 </td><td>36.71 </td></tr>
<tr><td>LA_ADI </td><td>337.76 </td><td>5.14 </td><td>41.85 </td></tr>
<tr><td>LA_GS </td><td>293.81 </td><td>4.47 </td><td>46.33 </td></tr>
<tr><td>IVSD </td><td>282.91 </td><td>4.31 </td><td>50.63 </td></tr>
<tr><td>MVA_VEL </td><td>265.06 </td><td>4.04 </td><td>54.67 </td></tr>
<tr><td>SBP </td><td>251.84 </td><td>3.83 </td><td>58.5 </td></tr>
<tr><td>MV_DECT </td><td>251.5 </td><td>3.83 </td><td>62.33 </td></tr>
<tr><td>LVMI </td><td>249.36 </td><td>3.8 </td><td>66.13 </td></tr>
<tr><td>LVPWD </td><td>238.89 </td><td>3.64 </td><td>69.77 </td></tr>
<tr><td>RWT </td><td>232.57 </td><td>3.54 </td><td>73.31 </td></tr>
<tr><td>PP </td><td>231.99 </td><td>3.53 </td><td>76.84 </td></tr>
<tr><td>LAESVI </td><td>209.58 </td><td>3.19 </td><td>80.03 </td></tr>
<tr><td>LA_A_4CH </td><td>201.95 </td><td>3.07 </td><td>83.11 </td></tr>
<tr><td>LA_EF_4CH</td><td>185.32 </td><td>2.82 </td><td>85.93 </td></tr>
<tr><td>AO_DIAM </td><td>183.3 </td><td>2.79 </td><td>88.72 </td></tr>
<tr><td>LA_ASI </td><td>161.47 </td><td>2.46 </td><td>91.18 </td></tr>
<tr><td>AM </td><td>149.0 </td><td>2.27 </td><td>93.45 </td></tr>
<tr><td>IVRT </td><td>129.09 </td><td>1.97 </td><td>95.41 </td></tr>
<tr><td>MVE_VEL </td><td>128.61 </td><td>1.96 </td><td>97.37 </td></tr>
<tr><td>SM </td><td>116.14 </td><td>1.77 </td><td>99.14 </td></tr>
<tr><td>ESV_MODI </td><td>21.25 </td><td>0.32 </td><td>99.46 </td></tr>
<tr><td>GS </td><td>15.08 </td><td>0.23 </td><td>99.69 </td></tr>
<tr><td>EF_MOD </td><td>12.04 </td><td>0.18 </td><td>99.87 </td></tr>
<tr><td>LVIDD </td><td>8.23 </td><td>0.13 </td><td>100.0 </td></tr>
</tbody>
</table>
<br>
**Table: Feature importance from ClustVarSelLCM for k=3**
<table>
<tbody>
<tr><td>Variables</td><td>Discrim. Power</td><td>Discrim. Power (%)</td><td>Discrim. Power (% cum)</td></tr>
<tr><td>REAM </td><td>817.64 </td><td>9.31 </td><td>9.31 </td></tr>
<tr><td>LAEDVI </td><td>699.19 </td><td>7.96 </td><td>17.27 </td></tr>
<tr><td>EM </td><td>629.51 </td><td>7.17 </td><td>24.44 </td></tr>
<tr><td>LA_ADI </td><td>596.19 </td><td>6.79 </td><td>31.23 </td></tr>
<tr><td>RMVEA </td><td>582.88 </td><td>6.64 </td><td>37.87 </td></tr>
<tr><td>REEM </td><td>481.04 </td><td>5.48 </td><td>43.34 </td></tr>
<tr><td>LAESVI </td><td>414.47 </td><td>4.72 </td><td>48.06 </td></tr>
<tr><td>LVMI </td><td>345.93 </td><td>3.94 </td><td>52.0 </td></tr>
<tr><td>LA_ASI </td><td>342.89 </td><td>3.9 </td><td>55.91 </td></tr>
<tr><td>MVA_VEL </td><td>335.27 </td><td>3.82 </td><td>59.72 </td></tr>
<tr><td>IVSD </td><td>309.54 </td><td>3.52 </td><td>63.25 </td></tr>
<tr><td>LA_GS </td><td>309.44 </td><td>3.52 </td><td>66.77 </td></tr>
<tr><td>SBP </td><td>295.45 </td><td>3.36 </td><td>70.14 </td></tr>
<tr><td>LA_A_4CH </td><td>284.82 </td><td>3.24 </td><td>73.38 </td></tr>
<tr><td>LA_EF_4CH</td><td>273.4 </td><td>3.11 </td><td>76.49 </td></tr>
<tr><td>PP </td><td>262.97 </td><td>2.99 </td><td>79.49 </td></tr>
<tr><td>MV_DECT </td><td>255.25 </td><td>2.91 </td><td>82.39 </td></tr>
<tr><td>AM </td><td>253.88 </td><td>2.89 </td><td>85.28 </td></tr>
<tr><td>LVPWD </td><td>244.71 </td><td>2.79 </td><td>88.07 </td></tr>
<tr><td>RWT </td><td>233.04 </td><td>2.65 </td><td>90.72 </td></tr>
<tr><td>AO_DIAM </td><td>162.76 </td><td>1.85 </td><td>92.58 </td></tr>
<tr><td>MVE_VEL </td><td>156.93 </td><td>1.79 </td><td>94.36 </td></tr>
<tr><td>IVRT </td><td>142.54 </td><td>1.62 </td><td>95.99 </td></tr>
<tr><td>SM </td><td>136.16 </td><td>1.55 </td><td>97.54 </td></tr>
<tr><td>ESV_MODI </td><td>114.66 </td><td>1.31 </td><td>98.84 </td></tr>
<tr><td>LVIDD </td><td>39.61 </td><td>0.45 </td><td>99.29 </td></tr>
<tr><td>SV_MODI </td><td>28.07 </td><td>0.32 </td><td>99.61 </td></tr>
<tr><td>PR </td><td>15.19 </td><td>0.17 </td><td>99.79 </td></tr>
<tr><td>GS </td><td>13.58 </td><td>0.15 </td><td>99.94 </td></tr>
<tr><td>EF_MOD </td><td>5.15 </td><td>0.06 </td><td>100.0 </td></tr>
</tbody>
</table>
<br>
**Figure: Iteratively removing features according to ClustVarSelLCM and measuring agreement with all features**
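
The procedure behind this figure can be sketched roughly as follows. This is a minimal illustration, not the code used for the report: it assumes agreement is measured with the adjusted Rand index (the report does not say which agreement measure was used), and `agreement_curve`, `toy_cluster`, and the sample values are hypothetical stand-ins.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label sequences of equal length."""
    total = comb(len(a), 2)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_ij - expected) / (max_index - expected)


def agreement_curve(rows, ranked_features, cluster_fn):
    """Drop the least important feature one at a time.

    `ranked_features` is ordered from most to least important.
    Returns (n_features_kept, ARI vs. the all-features clustering) pairs.
    """
    reference = cluster_fn(rows, ranked_features)
    curve = []
    for n_kept in range(len(ranked_features), 0, -1):
        labels = cluster_fn(rows, ranked_features[:n_kept])
        curve.append((n_kept, adjusted_rand_index(reference, labels)))
    return curve


def toy_cluster(rows, features):
    # Hypothetical stand-in for a real clustering algorithm:
    # split samples on the sign of the mean of the kept features.
    return [int(sum(r[f] for f in features) / len(features) > 0) for r in rows]


# Made-up standardized values, for illustration only.
rows = [
    {"REAM": -2.0, "EM": -1.5, "GS": 0.3},
    {"REAM": -1.0, "EM": -1.2, "GS": -0.4},
    {"REAM": 1.5, "EM": 1.1, "GS": 0.2},
    {"REAM": 2.0, "EM": 0.9, "GS": -0.1},
]
curve = agreement_curve(rows, ["REAM", "EM", "GS"], toy_cluster)
print(curve)
```

In practice `cluster_fn` would wrap the actual clustering algorithm, for example a function that selects the kept columns and calls a scikit-learn estimator's `fit_predict`.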

## K selection
**Figure: k selection—normalized, higher is better**

<br>
**Figure: k selection—normalized, higher is better**

<br>
**Figure: k selection—normalized, higher is better**
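
One way to put several k-selection criteria on a common "higher is better" scale is min-max normalization with a flip for criteria where lower is better (such as BIC). The sketch below assumes that scheme; the report does not state which normalization was actually used, and the helper name `normalize` and the numbers are illustrative only.

```python
def normalize(scores, higher_is_better=True):
    """Min-max scale a {k: score} mapping to [0, 1] so that 1 is always best."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidate k tie on this criterion; give them the same score.
        return {k: 1.0 for k in scores}
    scaled = {k: (v - lo) / (hi - lo) for k, v in scores.items()}
    if not higher_is_better:
        # Flip so that the smallest raw value maps to 1.
        scaled = {k: 1.0 - s for k, s in scaled.items()}
    return scaled


# Made-up values for illustration only, not results from this report.
silhouette = {2: 0.25, 3: 0.21, 4: 0.15}  # higher is better
bic = {2: 5200.0, 3: 5000.0, 4: 5100.0}   # lower is better

normalized = {
    "SI": normalize(silhouette, higher_is_better=True),
    "BIC": normalize(bic, higher_is_better=False),
}
for name, curve in normalized.items():
    best_k = max(curve, key=curve.get)
    print(name, {k: round(v, 2) for k, v in curve.items()}, "best k:", best_k)
```

After normalization the curves for different criteria can be drawn on one axis, and the preferred k for each criterion is simply the maximum of its curve.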

## Biclustering
### All

### Manual

### Normalized cut

### WK-Means (features from k=3)

### Lasso

### clustvarsel (features from k=3)

## Outcome
### All
Features: RWT, EM, LVMI, IVSD, PP, GS, LA_ADI, SBP, AM, LVPWD, MVE_VEL, LVIDD, LA_GS, RMVEA, PR, SM, LA_ASI, REEM, REAM, IVRT, LAEDVI, LAESVI, MVA_VEL, AO_DIAM, EF_MOD, LA_A_4CH, MV_DECT, ESV_MODI, LA_EF_4CH, SV_MODI

<table>
<tbody>
<tr><td>Name </td><td>SI </td><td>DBI </td><td>Gini impurity</td><td>Cluster 0 </td><td>Cluster 1 </td><td>Cluster 2 </td></tr>
<tr><td>Spectral </td><td>0.209</td><td>2.177</td><td>0.175 </td><td>0: 390, 1: 2 (0.51%)</td><td>0: 575, 1: 59 (9.31%) </td><td>0: 293, 1: 88 (23.10%) </td></tr>
<tr><td>Gaussian Mixture</td><td>0.203</td><td>2.927</td><td>0.175 </td><td>0: 518, 1: 5 (0.96%)</td><td>0: 588, 1: 90 (13.27%)</td><td>0: 152, 1: 54 (26.21%) </td></tr>
<tr><td>K-medoids </td><td>0.174</td><td>2.569</td><td>0.166 </td><td>0: 496, 1: 4 (0.80%)</td><td>0: 537, 1: 50 (8.52%) </td><td>0: 225, 1: 95 (29.69%) </td></tr>
<tr><td>Agglomerative </td><td>0.143</td><td>2.474</td><td>0.175 </td><td>0: 332, 1: 2 (0.60%)</td><td>0: 658, 1: 63 (8.74%) </td><td>0: 268, 1: 84 (23.86%) </td></tr>
<tr><td>K-Means </td><td>0.129</td><td>2.183</td><td>0.171 </td><td>0: 485, 1: 5 (1.02%)</td><td>0: 556, 1: 63 (10.18%)</td><td>0: 217, 1: 81 (27.18%) </td></tr>
<tr><td>WK-Means </td><td>0.081</td><td>2.644</td><td>0.180 </td><td>0: 151 </td><td>0: 411, 1: 14 (3.29%) </td><td>0: 696, 1: 135 (16.25%)</td></tr>
</tbody>
</table>

### Manual
Features: SBP, PP, LVMI, PR, REEM, ESV_MODI, LAESVI, LA_GS, MVE_VEL, MVA_VEL, RMVEA, AM, EM, GS, MV_DECT

<table>
<tbody>
<tr><td>Name </td><td>SI </td><td>DBI </td><td>Gini impurity</td><td>Cluster 0 </td><td>Cluster 1 </td><td>Cluster 2 </td></tr>
<tr><td>K-medoids </td><td>0.102 </td><td>2.936</td><td>0.168 </td><td>0: 721, 1: 24 (3.22%)</td><td>0: 293, 1: 28 (8.72%) </td><td>0: 244, 1: 97 (28.45%) </td></tr>
<tr><td>Gaussian Mixture</td><td>0.075 </td><td>2.668</td><td>0.168 </td><td>0: 568, 1: 8 (1.39%) </td><td>0: 473, 1: 54 (10.25%)</td><td>0: 217, 1: 87 (28.62%) </td></tr>
<tr><td>K-Means </td><td>0.058 </td><td>2.342</td><td>0.166 </td><td>0: 465, 1: 5 (1.06%) </td><td>0: 568, 1: 48 (7.79%) </td><td>0: 225, 1: 96 (29.91%) </td></tr>
<tr><td>Agglomerative </td><td>0.052 </td><td>2.586</td><td>0.172 </td><td>0: 511, 1: 10 (1.92%)</td><td>0: 411, 1: 33 (7.43%) </td><td>0: 336, 1: 106 (23.98%)</td></tr>
<tr><td>Spectral </td><td>0.027 </td><td>2.533</td><td>0.169 </td><td>0: 335, 1: 1 (0.30%) </td><td>0: 652, 1: 49 (6.99%) </td><td>0: 271, 1: 99 (26.76%) </td></tr>
<tr><td>WK-Means </td><td>-0.035</td><td>3.319</td><td>0.174 </td><td>0: 338, 1: 3 (0.88%) </td><td>0: 514, 1: 33 (6.03%) </td><td>0: 406, 1: 113 (21.77%)</td></tr>
</tbody>
</table>

### Normalized cut (top 15 features)
Features: PR, IVRT, LVIDD, PP, AO_DIAM, IVSD, LVMI, ESV_MODI, SV_MODI, RWT, AM, REAM, SBP, LVPWD, SM

<table>
<tbody>
<tr><td>Name </td><td>SI </td><td>DBI </td><td>Gini impurity</td><td>Cluster 0 </td><td>Cluster 1 </td><td>Cluster 2 </td></tr>
<tr><td>WK-Means </td><td>0.223</td><td>1.984</td><td>0.179 </td><td>0: 605, 1: 28 (4.42%)</td><td>0: 513, 1: 72 (12.31%)</td><td>0: 140, 1: 49 (25.93%)</td></tr>
<tr><td>K-medoids </td><td>0.219</td><td>2.100</td><td>0.174 </td><td>0: 682, 1: 27 (3.81%)</td><td>0: 449, 1: 65 (12.65%)</td><td>0: 127, 1: 57 (30.98%)</td></tr>
<tr><td>Spectral </td><td>0.147</td><td>2.023</td><td>0.182 </td><td>0: 398, 1: 4 (1.00%) </td><td>0: 465, 1: 70 (13.08%)</td><td>0: 395, 1: 75 (15.96%)</td></tr>
<tr><td>Gaussian Mixture</td><td>0.125</td><td>3.050</td><td>0.175 </td><td>0: 554, 1: 11 (1.95%)</td><td>0: 417, 1: 53 (11.28%)</td><td>0: 287, 1: 85 (22.85%)</td></tr>
<tr><td>K-Means </td><td>0.124</td><td>1.981</td><td>0.179 </td><td>0: 514, 1: 7 (1.34%) </td><td>0: 412, 1: 73 (15.05%)</td><td>0: 332, 1: 69 (17.21%)</td></tr>
<tr><td>Agglomerative </td><td>0.115</td><td>2.322</td><td>0.179 </td><td>0: 467, 1: 8 (1.68%) </td><td>0: 466, 1: 62 (11.74%)</td><td>0: 325, 1: 79 (19.55%)</td></tr>
</tbody>
</table>
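
The Gini impurity column appears to be the size-weighted Gini impurity of the binary outcome (0/1) across the three clusters. The sketch below assumes that definition (the function name `weighted_gini` is illustrative, not taken from the analysis code) and recomputes the value from the per-cluster outcome counts of two rows in the All table.

```python
def weighted_gini(clusters):
    """Size-weighted Gini impurity of a binary outcome.

    `clusters` is a list of (n_outcome_0, n_outcome_1) pairs, one per cluster.
    """
    total = sum(n0 + n1 for n0, n1 in clusters)
    impurity = 0.0
    for n0, n1 in clusters:
        size = n0 + n1
        if size == 0:
            continue  # an empty cluster contributes nothing
        p1 = n1 / size
        # Gini impurity of a binary variable: 1 - p0^2 - p1^2 = 2 * p1 * (1 - p1)
        impurity += (size / total) * 2.0 * p1 * (1.0 - p1)
    return impurity


# Outcome counts copied from the "All" table above.
spectral_all = [(390, 2), (575, 59), (293, 88)]
wk_means_all = [(151, 0), (411, 14), (696, 135)]

print(round(weighted_gini(spectral_all), 3))  # 0.175
print(round(weighted_gini(wk_means_all), 3))  # 0.18
```

Under this reading, lower values mean the clusters separate outcome 0 from outcome 1 more cleanly.
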
## TODO
- Start writing methods section and results
- 1:1
- Feature selection
- Number of clusters (Stability, SI, BIC)
- Biclustering
- Table with clustering indexes (+stability!!!)
- Export CSV cluster assignments (methods x algorithm: id, assignments) -> Tatiana Cox (all without WK-Means)
- K-Means, Gaussian Mixture
- (?) Rand index k
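
For the CSV export item, a possible layout is one file per (feature-selection method, algorithm) pair with `id` and `assignment` columns. The sketch below is only one way to do it: the file naming scheme, the `export_assignments` helper, and the sample ids and labels are all hypothetical, and WK-Means is skipped as the TODO item requests.

```python
import csv
import io


def assignments_to_csv(ids, labels):
    """Render one clustering as CSV text with columns id, assignment."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "assignment"])
    writer.writerows(zip(ids, labels))
    return buffer.getvalue()


def export_assignments(results, ids, skip_algorithms=("WK-Means",)):
    """Map {(method, algorithm): labels} to {file name: CSV text}."""
    files = {}
    for (method, algorithm), labels in results.items():
        if algorithm in skip_algorithms:
            continue
        name = f"{method}__{algorithm}.csv".lower().replace(" ", "_")
        files[name] = assignments_to_csv(ids, labels)
    return files


# Made-up ids and labels, for illustration only.
ids = ["p1", "p2", "p3"]
results = {
    ("all", "K-Means"): [0, 1, 2],
    ("all", "Gaussian Mixture"): [0, 0, 1],
    ("all", "WK-Means"): [2, 1, 0],
}
files = export_assignments(results, ids)
for name, text in files.items():
    print(name)
    print(text)
```

Writing each entry of `files` to disk is then a matter of opening `name` in text mode and writing `text`.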