CPR Progress:

Slides & Experiment Results

06/30

Data paths:
- DATASETS: /tmp2/yzliu/for_Jacky_senbai/CPR_revision/datasets
- RESULTS: /tmp2/yzliu/for_Jacky_senbai/CPR_revision/results

Current problems:

Original paper settings:
- 200 update times (i.e., 2*10^8 samples)
- 200 epochs is too few for BPR/Hop-Rec
  - 500 is the normal setting
    - [] Test and plot metric line charts
Some baselines outperform CPR

TODO:

Save the embeddings every 100 epochs and run eval on each
- Plot line charts
Run multiple times and compute mean & variance
Possibly change the item aggregation
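
The plan above (eval at every saved checkpoint, repeated over several runs) could be summarized with a sketch like the one below. The helper name `summarize_runs` and the `{epoch: metric}` layout are assumptions for illustration, not the project's actual code. It uses the population variance; swap in `statistics.variance` for the sample variance.

```python
from statistics import mean, pvariance


def summarize_runs(runs):
    """Aggregate per-checkpoint metrics over several runs.

    `runs` is a list of dicts mapping checkpoint epoch -> metric value
    (hypothetical layout). Returns {epoch: (mean, variance)} for the
    epochs that appear in every run.
    """
    if not runs:
        return {}
    # Only compare checkpoints that every run actually produced.
    shared = set(runs[0])
    for run in runs[1:]:
        shared &= set(run)
    summary = {}
    for epoch in sorted(shared):
        values = [run[epoch] for run in runs]
        summary[epoch] = (mean(values), pvariance(values))
    return summary
```

The resulting `{epoch: (mean, variance)}` mapping can be fed directly into a line chart with one point per checkpoint.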

Survey:

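Since CMF, there has been almost no cross-domain recommendation research that avoids external information (e.g., text, images).
- Even EMCDR uses so-called user information
Most research also focuses on shared users
- The only work that addresses cold start is probably CATN (SIGIR 2020), but CATN also uses text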

Summary:

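Problem: experiment scores differ a lot between 200 and 500 epochs.
E.g., at 200 epochs CPR scores far above BPR, but at 500 epochs the gap is relatively reasonable
- Cause: BPR+, Hop-Rec+, etc. concatenate the two graphs, so they need more epochs to converge
- Solution: since CPR converges comparatively faster, a line chart can illustrate its effect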

07/08:

07/15:

07/22:

Spent a lot of time reworking the CMF/EMCDR code to reproduce the experiments on our 10-core datasets (different from the original datasets).
Finished CPR's parameter tuning
- Best parameter combination: ug0.01, ig0.06
- In TVVOD, recall/NDCG increased by 1~1.2 points
- In CSJHK, recall/NDCG increased by 1~2.8 points
- In MTB, recall/NDCG increased by 3~3.3 points
- However, CPR only performs best in MTB. CPR is in 2nd or 3rd place on the other datasets.
- LightGCN, Bi-TGCF, BPR, and Hop-Rec are strong baselines.

07/29

CPR vs LightGCN:

Aggregation:
- Only on user + 1 layer
  - Equally aggregated from both domains
Optimization:
- User:
  - Similar
- Item:
Source of new datasets: Amazon Review Data 2018

08/05

New dataset preprocessing & experiments (LightGCN can't run on datasets that are too large):
- (Electronics -> Cellphones and Accessories) No need to filter
- (Sports and Outdoors -> Clothing, Shoes and Jewelry) 10-core
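
For reference, 10-core filtering can be sketched as below. This is a minimal illustration, assuming the core constraint applies to both users and items and that repeated (user, item) pairs are collapsed; the function name `k_core` is hypothetical and the project's preprocessing script may differ.

```python
from collections import Counter


def k_core(interactions, k=10):
    """Iteratively drop users and items with fewer than k interactions.

    `interactions` is an iterable of (user, item) pairs. Filtering repeats
    until every remaining user and item has at least k interactions,
    because removing one side can push the other side below k.
    """
    pairs = set(interactions)  # assumption: duplicate pairs count once
    while True:
        user_cnt = Counter(u for u, _ in pairs)
        item_cnt = Counter(i for _, i in pairs)
        kept = {(u, i) for u, i in pairs
                if user_cnt[u] >= k and item_cnt[i] >= k}
        if len(kept) == len(pairs):
            return kept
        pairs = kept
```

The loop matters: a single pass can leave users or items below the threshold once their neighbors have been removed.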
Figured out a LOO (leave-one-out) bug
- Users with only one log were not added to the training set. This triggered a bug in CPR.
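
One way to avoid this kind of bug is to keep single-log users in train and simply give them no test item. The sketch below assumes each user's logs are already in chronological order and that the last item is the one held out; both the function name and these choices are illustrative, not a record of how the bug was actually fixed.

```python
def leave_one_out_split(user_logs):
    """Split each user's ordered item list into train and test.

    `user_logs` maps user -> list of items (assumed chronological).
    Users with fewer than two logs stay entirely in train, so every
    user still appears in the training set.
    """
    train, test = {}, {}
    for user, items in user_logs.items():
        if len(items) < 2:
            # Nothing can be held out without emptying the user's train set.
            train[user] = list(items)
            continue
        train[user] = list(items[:-1])
        test[user] = items[-1]
    return train, test
```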
Implementing CoNet
Meeting Notes
- add @20
- LightGCN too low?
- After completing the big table, adjust the weird scores. List interesting examples.
- Explain why CPR does better in cold-start
- Hope cold-start users end up close to target items
- Figure by 志明 (senior labmate): https://www.geogebra.org/m/epvjwaJG
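
For the "add @20" item, recall@k and NDCG@k with binary relevance can be sketched as follows. This is a generic textbook formulation, assuming `ranked` contains no duplicate items; the function names are hypothetical and the project's eval script may normalize differently.

```python
import math


def recall_at_k(ranked, relevant, k=20):
    """Fraction of relevant items that appear in the top-k of `ranked`."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k=20):
    """NDCG@k with binary relevance and a log2 position discount."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked[:k])
              if item in relevant)
    # Ideal ranking puts all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```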

08/13

Meeting Notes
- 10-core elcpa
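- Using the same big graph for all models is easier to explain
- The small graph could serve as an application study: are there situations (different users) where the small graph is more suitable?
- Start preparing the numbers for the paper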

08/19

Finished 10-core elcpa; elcpa becomes an extremely small dataset.
- Does LightGCN perform better on small datasets? (CPR's score on raw-data elcpa is better than LightGCN's)
- So, maybe CPR recommends better for those users who have few interactions.

Time Table:

7/23-7/30:
- New dataset & preprocessing (remove CSJ-HK)
- All baseline methods (remove Hop-Rec)
- Fixed epoch number (200 epochs)
7/31-8/6:
- Run each test multiple times (mean, var, …)
- t-SNE?

<Finish all basic experiments before 8/6>

8/7-13:
- Other Experiments
8/13-8/20:
- Abstract
- Methodology

<Finish all experiments before 8/23>

<8/30 Abstract Deadline>
<9/8 Submission Deadline>
