# SimpleHutch/Hutch++/HutchBar AMM experiment
## data settings from the literature
1. [A Note on Random Sampling for Matrix Multiplication](https://arxiv.org/pdf/1811.11237.pdf)
- random matrix generated from the standard uniform distribution, of size $100\times2000$
- blocks (coarser partitions) chosen by Enhanced pairwise partition
- AMM sampler size = 1000 and 3000
- repeated experiments number = 50000
2. [Importance sampling for a Monte Carlo matrix multiplication algorithm, with application to information retrieval](https://epubs.siam.org/doi/pdf/10.1137/10080659X)
- practical data: a term-by-document matrix for the Reuters Transcribed Subset data set, of size $201\times5601$; a term-by-document matrix from Wikipedia web pages, of size $203688\times198853$
3. [Approximate Weighted C R Coded Matrix Multiplication](https://arxiv.org/pdf/2011.09709.pdf)
- random matrices A, B with L = 260, N = 9600, M = 280, drawn from a non-uniform distribution
4. [OverSketch: Approximate Matrix Multiplication for the Cloud](https://ieeexplore.ieee.org/iel7/8610059/8621858/08622139.pdf?casa_token=bXHQw5YdykQAAAAA:W297arGkkX1EzSd9PZBkAMmyK1hkhgrTCKnfC5zvWPn-KDlkUbUHWEKWndw9IQ8pjmTqHRHavVc9)
- AWS Lambda cloud system
5. [A practical streaming approximate matrix multiplication algorithm](https://www.sciencedirect.com/science/article/pii/S1319157818306396)
- Uniform Random: $2000\times10000$ and $1000\times10000$
- Low rank matrices
- Noisy low rank matrices
- Drift
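
The data families listed above (uniform random, low rank, noisy low rank, drift) can be generated in a few lines of NumPy. The sizes, ranks, noise level, and the exact drift law below are illustrative assumptions, not the settings used in the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_matrix(m, n):
    """Entries i.i.d. from the standard uniform distribution."""
    return rng.uniform(size=(m, n))

def low_rank_matrix(m, n, r):
    """Rank-r matrix built from two Gaussian factors."""
    return rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

def noisy_low_rank_matrix(m, n, r, noise=0.01):
    """Low-rank signal plus small dense Gaussian noise."""
    return low_rank_matrix(m, n, r) + noise * rng.standard_normal((m, n))

def drift_matrix(m, n, scale=5.0):
    """Column norms drift (here: grow linearly) across the column stream."""
    A = rng.standard_normal((m, n))
    return A * (1.0 + scale * np.arange(n) / n)
```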
## experiments
Matrix Factor Size: $100\times10000$
**Block chosen by Enhanced pairwise partition** (column-wise sampling probability estimated by Simple Hutch with 10 matvecs)
sample number of AMM = 100
number of repeated test = 100
block_sz = 5 (#group = 10000/5)

block_sz = 10 (#group = 10000/10)

block_sz = 20 (#group = 10000/20)

block_sz = 50 (#group = 10000/50)

block_sz = 100 (#group = 10000/100)
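
One plausible reading of "Simple Hutch with k matvecs" is a Hutchinson-type diagonal estimator for $\mathrm{diag}(A^\top A)$, whose entries are the squared column norms that drive the sampling probabilities. A minimal sketch under that assumption (the function name and the normalization are ours, not from the experiment code):

```python
import numpy as np

def hutch_column_probs(A, num_matvecs=10, rng=None):
    """Estimate column sampling probabilities p_j proportional to ||A_j||^2
    by approximating diag(A^T A) with a Hutchinson-type diagonal estimator:
    diag(M) ~ mean_k g_k * (M g_k), with g_k Rademacher vectors."""
    rng = np.random.default_rng(rng)
    n = A.shape[1]
    diag_est = np.zeros(n)
    for _ in range(num_matvecs):
        g = rng.choice([-1.0, 1.0], size=n)
        diag_est += g * (A.T @ (A @ g))   # g * (A^T A g): two matvecs with A
    diag_est = np.maximum(diag_est / num_matvecs, 0.0)  # clip negative noise
    return diag_est / diag_est.sum()
```

With 10 matvecs the estimate is noisy per column but already concentrates mass on the heavy columns, which is all AMM sampling needs.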

**Block chosen by Enhanced pairwise partition** (column-wise sampling probability estimated by Simple Hutch with 1 matvec)
block_sz = 5 (#group = 10000/5)

block_sz = 100 (#group = 10000/100)

**Block chosen by natural sequence**
block_sz = 5 (#group = 10000/5)

block_sz = 10 (#group = 10000/10)

block_sz = 20 (#group = 10000/20)

block_sz = 50 (#group = 10000/50)

block_sz = 100 (#group = 10000/100)
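
For reference, a minimal block-AMM sketch with the natural-sequence partition (consecutive blocks of `block_sz` columns), sampling blocks with probability proportional to the product of block Frobenius norms and rescaling for unbiasedness. This is our own sketch of the standard estimator, not the exact code behind these plots:

```python
import numpy as np

def block_amm(A, B, block_sz, s, rng=None):
    """Block AMM with a natural-sequence partition: columns of A (rows of B)
    are grouped into consecutive blocks of size block_sz, s blocks are sampled
    i.i.d. with probability p_k ~ ||A_k||_F ||B_k||_F, and each term is
    rescaled by 1/(s p_k) so that E[C] = A @ B."""
    rng = np.random.default_rng(rng)
    n = A.shape[1]
    starts = np.arange(0, n, block_sz)
    norms = np.array([
        np.linalg.norm(A[:, i:i + block_sz]) * np.linalg.norm(B[i:i + block_sz, :])
        for i in starts
    ])
    p = norms / norms.sum()
    picks = rng.choice(len(starts), size=s, p=p)
    C = np.zeros((A.shape[0], B.shape[1]))
    for k in picks:
        i = starts[k]
        C += (A[:, i:i + block_sz] @ B[i:i + block_sz, :]) / (s * p[k])
    return C
```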

**Non-Uniform Data**
Block chosen by Enhanced pairwise partition, block_sz = 5

Block chosen by Enhanced pairwise partition, block_sz = 100

Block chosen by random sequence, block_sz = 5

Block chosen by random sequence, block_sz = 100

**Modified P_bar method**
block_sz = 20
Block chosen by natural sequence:

Block chosen by Enhanced pairwise partition:

**Time comparison**

## experiments 2
**control the number of truly sampled columns**
### On uniform sampled data




### Low Rank Matrices






### Covariance Matrices
A is low rank matrix:


A is uniform sampled:


### experiment with block size = 1
#matvec = 1
uniform sampled data

low rank case

low rank case: fixed relative error

### exponential decaying data experiment
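
The notes do not record the exact decay law; one common choice is to scale column $j$ by $e^{-j/h}$, with $h$ the decay scale (the values h = 10 and h = 50 appear in the runtime tables below). A generator sketch under that assumption, with a linearly decaying variant that just swaps the scaling vector:

```python
import numpy as np

def decaying_matrix(m, n, h=10, mode="exp", rng=None):
    """Gaussian matrix whose column norms decay across columns:
    exponentially as exp(-j/h), or linearly from 1 down toward 0."""
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((m, n))
    if mode == "exp":
        scale = np.exp(-np.arange(n) / h)
    else:  # "linear"
        scale = 1.0 - np.arange(n) / n
    return A * scale
```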






case 2:



### linear decaying data experiment



### drift experiment



## comparison with uniform sampling
Experiments comparing Hutch-estimated sampling, biased p_bar sampling, and uniform-sampling AMM, showing where and when Hutch-estimated sampling performs better.
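
The gap can be reproduced in a few lines: on data with decaying column norms, norm-proportional column sampling (with exact norms standing in here for the Hutch estimate) beats uniform sampling by a wide margin. The sizes and decay scale below are illustrative:

```python
import numpy as np

def sampled_amm(A, B, s, p, rng):
    """Column-sampling AMM: draw s column/row index pairs i.i.d. with
    probability p and rescale each outer product for unbiasedness."""
    n = A.shape[1]
    idx = rng.choice(n, size=s, p=p)
    return sum(np.outer(A[:, j], B[j, :]) / (s * p[j]) for j in idx)

rng = np.random.default_rng(0)
m, n, s = 50, 1000, 100
A = rng.standard_normal((m, n)) * np.exp(-np.arange(n) / 100)  # decaying norms
B = A.T.copy()
exact = A @ B

p_opt = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
p_opt /= p_opt.sum()
p_uni = np.full(n, 1.0 / n)

for name, p in [("norm-based", p_opt), ("uniform", p_uni)]:
    errs = [np.linalg.norm(sampled_amm(A, B, s, p, rng) - exact)
            / np.linalg.norm(exact) for _ in range(20)]
    print(name, np.mean(errs))   # norm-based error is typically much smaller
```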


Performance: Hutch AMM performs best in all cases. Only when the number of mat-vecs is very low does uniform sampling beat biased p_bar.

After increasing the block size (which decreases the number of AMM samples for a fixed total column count), all three methods get worse; Hutch AMM still performs best, followed by biased p_bar, then the uniform method.




The performance in the linearly increasing and drift cases agrees with the exponential case.


On this approximately uniform data, Hutch AMM is worse than the uniform method only when the number of mat-vecs is small.
### runtime visualization

For a fair comparison, we predefine the subgroup indices and compare the runtime of Hutch AMM and standard block AMM.
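
A minimal timing harness in that spirit: the subgroup indices are drawn once, outside the timed region, so both products pay the same sampling cost. `perf_counter` granularity means only the relative numbers are meaningful; sizes below are illustrative:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10000))
B = rng.standard_normal((10000, 100))

block_sz, s = 100, 50
starts = np.arange(0, A.shape[1], block_sz)   # predefined subgroup indices
picks = rng.choice(len(starts), size=s)       # fixed in advance for fairness

def block_product(A, B, starts, picks, block_sz):
    """Uniform-block AMM: average s sampled block products, scaled by the
    number of blocks K so that E[C] = A @ B."""
    C = np.zeros((A.shape[0], B.shape[1]))
    for k in picks:
        i = starts[k]
        C += A[:, i:i + block_sz] @ B[i:i + block_sz, :]
    return C * len(starts) / len(picks)

t0 = time.perf_counter()
block_product(A, B, starts, picks, block_sz)
print("block AMM:", time.perf_counter() - t0, "s")

t0 = time.perf_counter()
A @ B
print("exact:    ", time.perf_counter() - t0, "s")
```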

### experiment: summarized line plots








### line plots with average error




### experiment: runtime visualization fixed

size 100 $\times$ 10000, exponential decreasing, h=10

size 200 $\times$ 10000

size 50 $\times$ 10000

size 100 $\times$ 20000

### extended visualization



### time measurements for different shapes
$100\times 10000$

$100\times 20000$

$100\times 50000$

$100\times 100000$

$200\times 10000$

$200\times 20000$

**Exponential Decreasing**

| s | time p-value | accuracy p-value |
| -------- | -------- | -------- |
| s = 10 | 1.0e-12 * 0.0000 | 0.2843 |
| s = 50 | 1.0e-12 * 0.5310 | 0.0351 |
| s = 100 | 1.0e-12 * 0.0000 | 0.0559 |
| s = 200 | 1.0e-12 * 0.0000 | 0.9606 |
| s = 500 | 1.0e-12 * 0.0111 | 0.6939 |
**Linear Increasing**

| s | time p-value | accuracy p-value |
| -------- | -------- | -------- |
| s = 10 | 1.0e-12 * 0.0000 | 0.4947 |
| s = 50 | 1.0e-12 * 0.3964 | 0.0240 |
| s = 100 | 1.0e-12 * 0.0000 | 0.8398 |
| s = 200 | 1.0e-12 * 0.0000 | 0.5426 |
| s = 500 | 1.0e-12 * 0.0000 | 0.1633 |
**Random**

| s | time p-value | accuracy p-value |
| -------- | -------- | -------- |
| s = 10 | 1.0e-15 * 0.0000 | 0.3622 |
| s = 50 | 1.0e-15 * 0.0000 | 0.1193 |
| s = 100 | 1.0e-15 * 0.0000 | 0.2183 |
| s = 200 | 1.0e-15 * 0.0000 | 0.8061 |
| s = 500 | 1.0e-15 * 0.1886 | 0.1228 |
**Exponential Decreasing**, #tests = 100, fixed matrix seed = 1, block_sz = 100, h = 10:

| s | time p-value | accuracy p-value |
| -------- | -------- | -------- |
| s = 10 | 1.0e-11 * 0.0000 | 0.1275 |
| s = 100 | 1.0e-11 * 0.7748 | 0.9286 |
| s = 500 | 1.0e-11 * 0.0013 | 0.5993 |
block_sz = 100, h = 50:

| s | time p-value | accuracy p-value |
| -------- | -------- | -------- |
| s = 10 | 0.1612 | 0.6885 |
| s = 100 | 0.0655 | 0.5168 |
| s = 500 | 0.1793 | 0.8550 |
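
The tables report p-values for the per-run time and accuracy differences between the methods; the exact statistical test used is not recorded in these notes. A two-sample permutation test on the difference in means is one way such values can be produced:

```python
import numpy as np

def perm_test_pvalue(x, y, num_perm=2000, rng=None):
    """Two-sided permutation-test p-value for the difference in means
    between samples x and y (e.g. per-run times of two methods)."""
    rng = np.random.default_rng(rng)
    x, y = np.asarray(x, float), np.asarray(y, float)
    obs = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(num_perm):
        perm = rng.permutation(pooled)
        d = abs(perm[:len(x)].mean() - perm[len(x):].mean())
        count += d >= obs
    return (count + 1) / (num_perm + 1)   # add-one smoothing, never exactly 0
```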
### Practical application
[tf-idf matrix multiplication for cosine similarity](https://medium.com/@rantav/large-scale-matrix-multiplication-with-pyspark-or-how-to-match-two-large-datasets-of-company-1be4b1b2871e)
[Text datasets in matlab format](http://www.cad.zju.edu.cn/home/dengcai/Data/TextData.html)
[Distributed dot product using Apache spark](https://georgheiler.com/2021/08/06/scalable-sparse-matrix-multiplication/)
[cosine similarity movie recommendation](https://towardsdatascience.com/using-cosine-similarity-to-build-a-movie-recommendation-system-ae7f20842599)
[Wiki People Dataset](https://www.kaggle.com/code/meuge672/tf-idf-and-knn-in-people-wikipedia-dataset/data)
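
As a concrete target for these applications: all-pairs cosine similarity over tf-idf documents reduces to one large matrix product, which is exactly where AMM can be dropped in. A minimal dense sketch (real pipelines use sparse matrices and library tf-idf implementations; the weighting below is one standard smooth-idf variant, chosen by us):

```python
import numpy as np

def tfidf(counts):
    """tf-idf from a term-by-document count matrix (terms x docs),
    with smoothed idf: log((1 + D) / (1 + df)) + 1."""
    tf = counts / np.maximum(counts.sum(axis=0, keepdims=True), 1)
    df = (counts > 0).sum(axis=1, keepdims=True)
    idf = np.log((1 + counts.shape[1]) / (1 + df)) + 1
    return tf * idf

def cosine_similarity_matrix(X):
    """All-pairs document cosine similarity: normalize columns, then one
    matrix product S = Xn^T Xn -- the A^T A shape AMM approximates."""
    Xn = X / np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    return Xn.T @ Xn
```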