CPR Progress:

06/30

  • Data paths:
    • DATASETS: /tmp2/yzliu/for_Jacky_senbai/CPR_revision/datasets
    • RESULTS: /tmp2/yzliu/for_Jacky_senbai/CPR_revision/results

Current problems:

  • Original paper settings:
    • 200 update times (aka 2*10^8 samples)
    • 200 epochs is too low for BPR/Hop-Rec
      • 500 is quite normal
        • [] Evaluate + plot metric line charts
  • Some baseline > CPR

TODO:

  • Save the embeddings every 100 epochs + run eval

    • Plot line charts
  • Run several times and compute mean & variance

  • Possibly change the item aggregation
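
A minimal sketch of the "mean & variance" TODO above, assuming each run yields one metric value per saved checkpoint (e.g. every 100 epochs). The function name `summarize_runs` and the input layout are hypothetical and not part of the existing scripts.

```python
from statistics import mean, variance


def summarize_runs(runs):
    """Aggregate one metric over repeated runs, per checkpoint epoch.

    runs: list of dicts, one per run, mapping epoch -> metric value
          (hypothetical layout, e.g. {100: recall, 200: recall, ...}).
    Returns {epoch: (mean, sample variance)} for the epochs present in
    every run. Needs at least two runs for the variance to be defined.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to compute a variance")
    shared_epochs = sorted(set.intersection(*(set(r) for r in runs)))
    summary = {}
    for epoch in shared_epochs:
        values = [r[epoch] for r in runs]
        summary[epoch] = (mean(values), variance(values))
    return summary
```

The per-epoch means can be used directly as the y-values of the line chart, with the variance drawn as an error band.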

Survey:

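  • Since CMF, there has been almost no cross-domain recommendation research that does not use external information (e.g., text, images).
    • Even EMCDR uses so-called user information
  • Mainstream research also focuses on shared users
    • The only work addressing cold start is probably CATN (SIGIR 2020), but CATN also uses text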

Summary:

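  • Problem: experiment scores differ greatly between 200 and 500 epochs.
    E.g., at 200 epochs CPR is far ahead of BPR, but at 500 epochs the gap is relatively reasonable.
    • Cause: BPR+, Hop-Rec+, etc. concatenate the two graphs, so they need more epochs to converge
    • Solution: since CPR converges comparatively faster, line charts can be plotted to illustrate this effect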

07/08:

  • Current situation:

    • Potential Claim: CPR converges faster than other methods.
      • Need to compare under the same training settings.
          1. There is no concept of epoch in smore
          1. Cross-Domain vs. Single Domain
      • After comparison:
          1. For CPR, we set one step to the total number of edges in the Target-Domain.
          1. CPR's convergence does not seem faster than the others'. (Based on results by epoch)
    • Evaluation slow:
      • Need multi-processing evaluation
  • TODO:

    • Epoch-based smore training
    • multi-processing evaluation
    • Experiments:
      • CPR:
      • Baselines:
        • BPR
        • Hop-Rec
        • Bi-TGCF
        • CMF
        • EMCDR
  • Issues:

    • Significant performance drop from target users to shared users.
      • Intuitively, in a cross-domain recommendation scenario, shared users (users that occur in both the source domain & the target domain) have sufficient information, which should imply better recommendation performance than for users who only occur in the target domain (i.e., target users).

      • However, in all of our datasets, the performance for shared users is significantly worse than for target users (even worse than for cold-start users).

      • We suspect two possible reasons:

        • (1) The low proportion of shared users among all users. However, in the tv-vod dataset most users are shared users, and it still shows a large performance drop (0.9 vs. 0.7 on recall).
        • (2) Sample bias.
  • Summary:

      1. Experiment alignment on each method.
      1. Evaluation acceleration.
      1. Still working on experiments.
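
A rough sketch of the "multi-processing evaluation" item above, assuming per-user ranked lists and held-out items are already in memory. The names `recall_at_k` and `parallel_recall`, the dict layout, and the default `k` are assumptions for illustration; this is not the project's actual evaluation script.

```python
from multiprocessing import Pool


def recall_at_k(task):
    """Recall@k for one user; task = (ranked_items, held_out_items, k)."""
    ranked_items, held_out, k = task
    held_out = set(held_out)
    if not held_out:
        return 0.0
    hits = len(set(ranked_items[:k]) & held_out)
    return hits / len(held_out)


def parallel_recall(rankings, ground_truth, k=10, workers=4):
    """Mean Recall@k over users that have held-out items.

    rankings:     {user: [item, ...]} ranked best-first (assumed layout)
    ground_truth: {user: iterable of held-out items}
    workers <= 1 runs serially, which is handy for debugging.
    """
    tasks = [(rankings[u], ground_truth[u], k)
             for u in rankings if ground_truth.get(u)]
    if not tasks:
        return 0.0
    if workers <= 1:
        scores = [recall_at_k(t) for t in tasks]
    else:
        # Users are independent, so the per-user scoring can be mapped
        # over a process pool.
        with Pool(workers) as pool:
            scores = pool.map(recall_at_k, tasks, chunksize=256)
    return sum(scores) / len(scores)
```

When run as a script on platforms that spawn worker processes, the call to `parallel_recall` should sit under an `if __name__ == "__main__":` guard.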

07/15:

  • Finished:
    • Accelerate evaluation with multi-processing
    • Add comparison plots to the slides
      (some methods are still awaiting evaluation)
  • TODO:
    • Calculate Variance
    • CPR's Parameter Adjustment (since performance is not as expected)
    • Experiments:
      • CPR:
      • Baselines:
        • BPR
        • Hop-Rec
        • Bi-TGCF
        • CMF
        • EMCDR
  • Discovery:
    • Since we changed CPR's step to the Target-Domain's edge count (while LightGCN+'s step is the Target+Source domains' edge count), 500 epochs might not be enough for CPR to converge.
      • This turned out not to be the case: scores are no higher at 600 epochs. Scores at 100 epochs are as good as those at 500 epochs, or even better.
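
To make the step/epoch bookkeeping above concrete, here is a small sketch of the conversion between "update times" and epochs. It assumes one update time equals 10^6 samples (inferred from the 06/30 note "200 update times (aka 2*10^8 samples)") and that one epoch draws as many samples as there are edges. The function names are hypothetical.

```python
SAMPLES_PER_UPDATE = 10**6  # assumption: 200 update times == 2*10^8 samples


def epochs_for_updates(n_updates, n_edges):
    """How many passes over n_edges edges a given number of updates covers."""
    return n_updates * SAMPLES_PER_UPDATE / n_edges


def updates_for_epochs(n_epochs, n_edges):
    """Update times needed for n_epochs passes over n_edges edges (rounded up)."""
    total_samples = n_epochs * n_edges
    return -(-total_samples // SAMPLES_PER_UPDATE)  # ceiling division
```

Under these assumptions, the same number of update times corresponds to fewer epochs on the concatenated Target+Source graph than on the Target-Domain graph alone, because the concatenated graph has more edges.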

07/22:

  • Spent a lot of time rescaling the CMF/EMCDR code so that the experiments can be reproduced on our 10-core datasets (which differ from the original datasets).
  • Finished CPR's Parameter Adjustment
    • Best Parameter Combination: ug0.01, ig0.06
    • In TVVOD, recall/NDCG increases by 1 to 1.2 points
    • In CSJHK, recall/NDCG increases by 1 to 2.8 points
    • In MTB, recall/NDCG increases by 3 to 3.3 points
    • However, CPR performs best only on MTB; it ranks 2nd or 3rd on the other datasets.
    • LightGCN, Bi-TGCF, BPR and HopRec are strong enough.

07/29

CPR vs LightGCN:

  • Aggregation:

    • Only on user + 1 layer
      • Equally aggregated from both domains
  • Optimization:

    • User:
      • Similar
    • Item:
  • Source of new Datasets: Amz Review Data 2018

08/05

  • New Dataset preprocessing & experiments (LightGCN cannot run on datasets that are too large):

    • (Electronics -> Cellphones and Accessories) No need to filter
    • (Sports and Outdoors -> Clothing, Shoes and Jewelry) 10-core
  • Figured out a LOO bug

    • Users with only one log are not added to the training set, which triggered a bug in CPR.
  • Implementing CoNet

  • Meeting Notes

    • add @20
    • LightGCN too low?
    • After completing the big table, adjust the weird scores. List interesting examples.
    • Explain why CPR does better on cold-start
    • Hope cold-start users end up close to target items
    • Figure by 志明 (senior labmate): https://www.geogebra.org/m/epvjwaJG
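
For the LOO bug noted in this entry, a minimal sketch of a leave-one-out split that keeps single-log users in the training set. Whether such users should stay in train or be dropped entirely is an assumption here; the notes only say that their absence from train triggered the CPR bug. The function name and the `(user, item, timestamp)` layout are hypothetical.

```python
from collections import defaultdict


def leave_one_out_split(interactions):
    """Hold out each user's latest interaction for testing.

    interactions: iterable of (user, item, timestamp) tuples.
    Users with a single log keep it in train (assumed policy), so every
    user seen in the data also appears in the training set.
    Returns (train, test) as lists of (user, item) pairs.
    """
    logs_by_user = defaultdict(list)
    for user, item, timestamp in interactions:
        logs_by_user[user].append((timestamp, item))

    train, test = [], []
    for user, logs in logs_by_user.items():
        logs.sort(key=lambda log: log[0])  # oldest first
        if len(logs) == 1:
            train.append((user, logs[0][1]))
            continue
        train.extend((user, item) for _, item in logs[:-1])
        test.append((user, logs[-1][1]))
    return train, test
```

A quick sanity check after splitting is to assert that the set of test users is a subset of the set of train users.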

08/13

  • Meeting Notes
    • 10-core elcpa
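    • Using the same large graph for all models is easier to explain
    • The small graph could serve as an application study: are there situations (different users) where the small graph is more suitable?
    • Start preparing the numbers for the paper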

08/19

  • Finished 10-core elcpa; elcpa becomes an extremely small dataset.
    • Does LightGCN perform better on small datasets? (CPR's score on the raw-data elcpa is better than LightGCN's)
    • So, maybe CPR recommends better for users who have few interactions.
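
A minimal sketch of k-core filtering, to illustrate why the 10-core version of a dataset can end up far smaller than the raw data. It assumes "10-core" means every remaining user and item has at least 10 interactions, applied repeatedly until nothing more is removed; the notes do not spell out the exact procedure, and the function name is hypothetical.

```python
from collections import Counter


def k_core(pairs, k=10):
    """Iteratively drop (user, item) pairs until every remaining user
    and every remaining item has at least k interactions."""
    pairs = set(pairs)
    while True:
        user_degree = Counter(user for user, _ in pairs)
        item_degree = Counter(item for _, item in pairs)
        kept = {(user, item) for user, item in pairs
                if user_degree[user] >= k and item_degree[item] >= k}
        if len(kept) == len(pairs):  # fixed point reached
            return kept
        pairs = kept
```

Removing a low-degree user lowers the degree of the items that user touched, which can push those items below the threshold in the next pass, so sparse datasets can shrink drastically or even become empty.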

Time Table:

  • 7/23-7/30:

      1. New dataset & preprocessing (remove CSJ-HK)
      1. All baseline Methods (remove Hop-Rec)
      1. Fixed Epoch Number (200 epoch)
  • 7/31-8/6:

    • Run the tests multiple times (mean, var)
    • t-SNE?

<Finish all basic experiments before 8/6>

  • 8/7-13:

    • Other Experiments
  • 8/13-8/20:

    • Abstract
    • Methodology

<Finish all experiments before 8/23>

<8/30 Abstract Deadline>
<9/8 Submission Deadline>