# References

## Credit Card Fraud Detection

### Past Competition Materials

- [CubatLin/TBrain-E.SUN-AI-Open-Competition-Fall-2019-15th-place-Feature-Engineering: E.SUN Bank AI Open Competition, Fall 2019, "There Is Only One Truth", Credit Card Fraud Detection - Top1% Feature Engineering](https://github.com/CubatLin/TBrain-E.SUN-AI-Open-Competition-Fall-2019-15th-place-Feature-Engineering)
- [rgib37190/TBrain-E-Sun-Fraud-Detection](https://github.com/rgib37190/TBrain-E-Sun-Fraud-Detection)

### Related Papers

- [CATCHM: A novel network-based credit card fraud detection method using node representation learning (DSS 2023)](https://www.sciencedirect.com/science/article/pii/S0167923622001373)
  > I have not read it closely yet, but judging by the title it looks impressive. Anyone with time is welcome to take a look.
  > [name=裕翔]
- [Deep Convolution Neural Network Model for Credit-Card Fraud Detection and Alert (JAICN 2021)](https://web.archive.org/web/20210624021145id_/https://irojournals.com/aicn/V3/I2/03.pdf)
  > I find the approach in this paper quite ordinary and its evaluation method poor, but for some reason it is fairly well cited, so I am listing it here.
  > [name=裕翔]
- [Fraud detection system: A survey (JNCA 2016)](https://www.sciencedirect.com/science/article/pii/S1084804516300571)
  > A bit too old; it may not be of much reference value.
  > [name=裕翔]

### Other Resources

- [Credit Card Fraud Detection - Kaggle](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud)
  > The data will certainly differ from ours, but it targets the same fraud detection task, so the public approaches are worth studying.
  > This is the most popular fraud detection dataset on Kaggle; please look at it first.
  > [name=裕翔]
- [Credit Card Fraud Detection - Kaggle](https://www.kaggle.com/datasets/mishra5001/credit-card)
  > Another fraud detection dataset, but with few public solutions; for reference only.
  > [name=裕翔]
- [Credit Card Fraud Detection Dataset 2023 - Kaggle](https://www.kaggle.com/datasets/nelgiriyewithana/credit-card-fraud-detection-dataset-2023)
  > The competition rules allow external datasets, so this might serve as additional data.
  > It is very recent (2023) and fairly large (550,000 records).
  > [name=裕翔]
- [Credit Card Fraud - Kaggle](https://www.kaggle.com/datasets/dhanushnarayananr/credit-card-fraud/data)
  > Large (5,000,000), but nothing else is known about it.
  > [name=裕翔]
- [Credit Card Transactions Fraud Detection Dataset - Kaggle](https://www.kaggle.com/datasets/kartik2112/fraud-detection)
  > The competition rules allow external datasets, so this might serve as additional data.
  > Note that this dataset is programmatically generated; whether its feature distributions match real-world data still needs to be verified.
  > [name=裕翔]

## Competition Tips

> - Only ensemble models stand a realistic chance of placing.
> **TODO: add material on ensemble models.**
> [name=裕翔]
>
> - Fraud detection is almost certainly an extremely imbalanced data problem.
> **TODO: add material on oversampling and undersampling.**
> [name=裕翔]

## The Next Hill: LLM-based Methods

> Fraud detection resembles security incident detection: both are binary classification problems (fraud / not fraud) on extremely imbalanced data (normal transactions >> fraudulent transactions).
> Perhaps we could follow the approach proposed by speaker 邱銘彰 at the AI 年會 (AI annual conference) and enter both competitions at the same time.
> ![](https://hackmd.io/_uploads/rJZIVpJZT.png)
> [name=裕翔]
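
Since these notes describe fraud detection as an extremely imbalanced binary classification problem and leave the resampling material as a to-do, here is a minimal sketch of random undersampling in plain Python. It is only an illustration and is not taken from the competition or from any of the linked resources: the function name `undersample_majority`, its `ratio` and `seed` parameters, and the toy labels are all hypothetical.

```python
import random
from collections import Counter


def undersample_majority(labels, ratio=1.0, seed=0):
    """Return sorted indices that keep every minority sample and a random
    subset of the majority class, so that majority ~= ratio * minority.

    Hypothetical helper for illustration; not part of any library.
    """
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    # most_common() lists the larger class first, so the rarer class
    # (fraud, in this setting) is treated as the minority.
    (majority, _), (minority, n_minority) = counts.most_common(2)
    majority_idx = [i for i, y in enumerate(labels) if y == majority]
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # Never ask for more majority samples than actually exist.
    n_keep = min(len(majority_idx), int(round(ratio * n_minority)))
    rng = random.Random(seed)
    kept = rng.sample(majority_idx, n_keep)
    return sorted(kept + minority_idx)


# Toy data (an assumption): 990 normal transactions (0), 10 fraudulent (1).
labels = [0] * 990 + [1] * 10
idx = undersample_majority(labels, ratio=3.0)
balanced = [labels[i] for i in idx]
print(Counter(balanced))

# Different seeds give different majority subsets; one model per subset
# could later be averaged, which is one simple way to build an ensemble.
subsets = [undersample_majority(labels, ratio=3.0, seed=s) for s in range(5)]
```

Undersampling discards most of the normal transactions, so training one model per subset and averaging their predictions is a common way to recover some of that information. Whether it helps on the actual competition data would need to be checked against a validation split.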