# Cosmed Sale Research 20200815
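## Field feedback
- After the sale opened, the category page still showed "not yet on sale"; after items sold out, the category page did not show "sold out"
- The two CDN layers add a 4~6 min delay
- The product page kept showing "Notify me when on sale" ("開賣通知我")
- The favorites list fails to load
- The iOS home page renders blank
- Errors and latency seen in monitoring went down
## Observation focus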
```
Baseline for comparison (local time, UTC+8)
墊腳石 8/12 flash sale:
- time 0812 17:50 ~ 0812 18:10
- max RPM 93k
Cosmed (康是美) 8/15 flash sale:
- time 0815 10:50 ~ 0815 11:10
- max RPM 75k
```
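
The local sale windows above line up with the UTC `start-time` / `end-time` values used by the queries later in this note, assuming the windows are written in Taipei local time (UTC+8, no DST). A minimal sketch of that conversion; `to_utc_iso` is a hypothetical helper name, not something from the original tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the sale windows are written in Taipei local time (UTC+8, no DST).
TAIPEI = timezone(timedelta(hours=8))


def to_utc_iso(local: str, year: int = 2020) -> str:
    """Convert an 'MMDD HH:MM' local timestamp to an ISO-8601 UTC string."""
    dt = datetime.strptime(f"{year}{local}", "%Y%m%d %H:%M").replace(tzinfo=TAIPEI)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


# The two 20-minute windows from the baseline block.
for label, start, end in [
    ("8/12 sale", "0812 17:50", "0812 18:10"),
    ("8/15 sale", "0815 10:50", "0815 11:10"),
]:
    print(label, to_utc_iso(start), to_utc_iso(end))
```
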
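- [ ] RealTimeData responses are slow or fail
- [ ] Favorites page responses are slow or fail
- [ ] Home page responses are slow or fail
- [x] Did changing the timeout setting produce more timeout errors, or do requests that time out do so regardless of whether the limit is 5s or 20s?
- [x] Did the number of errors increase?
- [ ] Would adding a CDN help?
## Items to bring up for discussion
- [ ] Hypothesis for the increase in timeout errors (p100 -> p90, 10x timeouts)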
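## Research
### Did errors increase?
> Investigator: `@tomaz`
- Two-day comparison
  - Cosmed (康是美) 8/15 flash sale: max RPM 75k, error sum 1500 * 10 = 15000
  - 墊腳石 8/12 flash sale: max RPM 93k, error sum 2819
- The error count is much higher, but the share is about the same
- Breaking the errors down by cause also shows a similar distribution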
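
One possible way to put a number on "the share is about the same" is to compare what fraction of all errors comes from the two real-time sale-page operations on each day. The counts are copied from the per-operation error tables in this section. The note does not say which share it means, so picking these two operations is an assumption, and `share` is a hypothetical helper.

```python
# Per-operation error counts copied from the two "error count" tables in this note.
SALE_0815 = {  # Cosmed 8/15
    "iOS_salePageRealTimeData": 740,
    "android_salePage_realtime_info": 614,
    "total": 1500,
}
SALE_0812 = {  # 8/12 sale
    "iOS_salePageRealTimeData": 1216,
    "android_salePage_realtime_info": 913,
    "total": 2819,
}

# Assumption: "share" is read as the share of errors from the real-time operations.
REALTIME_OPS = ("iOS_salePageRealTimeData", "android_salePage_realtime_info")


def share(counts: dict, ops) -> float:
    """Fraction of all errors that belong to the given operations."""
    return sum(counts[op] for op in ops) / counts["total"]


for label, counts in [("0815", SALE_0815), ("0812", SALE_0812)]:
    print(label, f"{share(counts, REALTIME_OPS):.1%}")
```
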
**Cosmed (康是美) 8/15 flash sale - error count**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-15T02:50:00.000Z
end-time: 2020-08-15T03:10:00.000Z
query-string:
```
filter ( type = "ERROR" and tag = "OperationLoggingApolloServerPlugin" )
| stats count() as cnt by data.operationName
| sort cnt desc
```
-------------------------------------------
| data.operationName | cnt |
|----------------------------------|------|
| iOS_salePageRealTimeData | 740 |
| android_salePage_realtime_info | 614 |
| iOS_salePageAdditionalInfo | 27 |
| cms_shopCategory_default_orderby | 27 |
| iOS_salePageInfo | 27 |
| cms_shopCategory | 24 |
| android_salePage | 18 |
| android_salePage_extra | 8 |
| iOS_shopCategory | 6 |
| <no-operation> | 3 |
| cms_shopCategory_promotion_list | 2 |
| android_getSalePageInit | 2 |
| android_searchHotKeywords | 1 |
| cms_layoutTemplate_spCatAd_list | 1 |
| total | 1500 |
-------------------------------------------
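
The region, log group, time window and query string listed with each result map onto the CloudWatch Logs `StartQuery` / `GetQueryResults` API. Below is a hedged sketch of how such a query could be re-run from Python. `run_insights_query` is a hypothetical helper, and the client is passed in so the function itself does not depend on credentials. With boto3 the client would be created as `boto3.client("logs", region_name="ap-northeast-1")`.

```python
import time


def run_insights_query(client, log_group, start_epoch, end_epoch, query, poll_seconds=1.0):
    """Start a Logs Insights query, wait for it to finish, and return rows as dicts.

    `client` is expected to behave like a boto3 CloudWatch Logs client;
    `start_epoch` / `end_epoch` are Unix timestamps in seconds.
    """
    query_id = client.start_query(
        logGroupName=log_group,
        startTime=start_epoch,
        endTime=end_epoch,
        queryString=query,
    )["queryId"]

    # Poll until the query leaves the Scheduled/Running states.
    resp = client.get_query_results(queryId=query_id)
    while resp["status"] in ("Scheduled", "Running"):
        time.sleep(poll_seconds)
        resp = client.get_query_results(queryId=query_id)

    if resp["status"] != "Complete":
        raise RuntimeError(f"query ended with status {resp['status']}")

    # Each row is a list of {"field": ..., "value": ...} cells; values are strings.
    return [{cell["field"]: cell["value"] for cell in row} for row in resp["results"]]
```

Note that the API returns every value as a string, so counts such as `cnt` need an `int(...)` before they are summed.
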
**Cosmed (康是美) 8/15 flash sale - breakdown by cause**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-15T02:50:00.000Z
end-time: 2020-08-15T03:10:00.000Z
query-string:
```
filter ( type = "ERROR" and tag = "OperationLoggingApolloServerPlugin" )
| parse err.message /text: (?<reason>.+)/
| stats count() as cnt by data.operationName, data.error.type, reason
| sort cnt desc
```
------------------------------------------------------------------------------
| data.operationName | data.error.type | reason | cnt |
|----------------------------------|-----------------|-----------------|-----|
| iOS_salePageRealTimeData | Error | request timeout | 732 |
| android_salePage_realtime_info | Error | request timeout | 608 |
| cms_shopCategory_default_orderby | Error | request timeout | 24 |
| cms_shopCategory | Error | request timeout | 19 |
| iOS_salePageAdditionalInfo | Error | request timeout | 18 |
| iOS_salePageInfo | Error | request timeout | 18 |
| android_salePage | PythiaError | | 9 |
| iOS_salePageAdditionalInfo | PythiaError | | 9 |
| iOS_salePageInfo | PythiaError | | 9 |
| android_salePage | Error | request timeout | 9 |
| iOS_salePageRealTimeData | PythiaError | | 8 |
| android_salePage_extra | Error | request timeout | 7 |
| android_salePage_realtime_info | PythiaError | | 6 |
| iOS_shopCategory | Error | request timeout | 6 |
| cms_shopCategory | PythiaError | | 5 |
| cms_shopCategory_default_orderby | PythiaError | | 3 |
| <no-operation> | Error | request timeout | 3 |
| android_getSalePageInit | PythiaError | | 2 |
| cms_shopCategory_promotion_list | Error | request timeout | 2 |
| android_searchHotKeywords | Error | request timeout | 1 |
| android_salePage_extra | PythiaError | | 1 |
| cms_layoutTemplate_spCatAd_list | Error | request timeout | 1 |
------------------------------------------------------------------------------
**墊腳石 8/12 flash sale - error count**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-12T09:50:00.000Z
end-time: 2020-08-12T10:10:00.000Z
query-string:
```
filter ( type = "ERROR" and tag = "OperationLoggingApolloServerPlugin" )
| stats count() as cnt by data.operationName
| sort cnt desc
```
-------------------------------------------
| data.operationName | cnt |
|----------------------------------|------|
| iOS_salePageRealTimeData | 1216 |
| android_salePage_realtime_info | 913 |
| iOS_salePageAdditionalInfo | 182 |
| iOS_salePageInfo | 180 |
| cms_shopCategory | 122 |
| cms_shopCategory_default_orderby | 104 |
| android_salePage_extra | 33 |
| android_salePage | 30 |
| iOS_shopCategory | 14 |
| iOS_leftMenu | 11 |
| android_getSalePageInit | 8 |
| CouponListPage | 2 |
| cms_shopCategory_promotion_list | 1 |
| iOS_couponList | 1 |
| iOS_appAnnouncementList | 1 |
| CouponList | 1 |
| total | 2819 |
-------------------------------------------
**墊腳石 8/12 flash sale - breakdown by cause**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-12T09:50:00.000Z
end-time: 2020-08-12T10:10:00.000Z
query-string:
```
filter ( type = "ERROR" and tag = "OperationLoggingApolloServerPlugin" )
| parse err.message /text: (?<reason>.+)/
| stats count() as cnt by data.operationName, data.error.type, reason
| sort cnt desc
```
-------------------------------------------------------------------------------
| data.operationName | data.error.type | reason | cnt |
|----------------------------------|-----------------|-----------------|------|
| iOS_salePageRealTimeData | Error | request timeout | 1048 |
| android_salePage_realtime_info | Error | request timeout | 892 |
| iOS_salePageInfo | PythiaError | | 158 |
| iOS_salePageAdditionalInfo | PythiaError | | 158 |
| iOS_salePageRealTimeData | PythiaError | | 157 |
| cms_shopCategory | PythiaError | | 103 |
| cms_shopCategory_default_orderby | PythiaError | | 83 |
| android_salePage_extra | PythiaError | | 22 |
| android_salePage | PythiaError | | 22 |
| cms_shopCategory_default_orderby | Error | request timeout | 21 |
| android_salePage_realtime_info | PythiaError | | 21 |
| cms_shopCategory | Error | request timeout | 19 |
| iOS_salePageAdditionalInfo | Error | request timeout | 13 |
| iOS_salePageAdditionalInfo | Error | request failed | 11 |
| iOS_salePageInfo | Error | request timeout | 11 |
| iOS_salePageRealTimeData | Error | request failed | 11 |
| android_salePage_extra | Error | request timeout | 11 |
| iOS_salePageInfo | Error | request failed | 11 |
| iOS_leftMenu | PythiaError | | 10 |
| iOS_shopCategory | Error | request timeout | 8 |
| android_salePage | Error | request timeout | 7 |
| iOS_shopCategory | PythiaError | | 6 |
| android_getSalePageInit | PythiaError | | 6 |
| CouponListPage | Error | request timeout | 2 |
| android_getSalePageInit | Error | request timeout | 2 |
| iOS_couponList | Error | request timeout | 1 |
| iOS_appAnnouncementList | Error | request timeout | 1 |
| android_salePage | Error | request failed | 1 |
| iOS_leftMenu | Error | request timeout | 1 |
| cms_shopCategory_promotion_list | Error | request timeout | 1 |
| CouponList | Error | request timeout | 1 |
-------------------------------------------------------------------------------
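### Did shortening the timeout cause the sharp rise in timeout errors?
> Investigator: `@tomaz`
- Under normal traffic, timeout errors show no major difference before and after the change
  (requests that normally succeed respond within 5s; anything over 5s times out anyway, whether the limit is 5s or 20s)
- During flash sales, timeout errors rose by a multiple, but the distribution across APIs is similar
  - 墊腳石 2823
  - Cosmed (康是美) 3245 * 10 = 32450
- Inference: **before the change the p100 users timed out; after the change the p90 users time out, roughly 10x as many people**
- So what did we gain?
  - 墊腳石 latency p95 7s, avg 1.2s
  - Cosmed (康是美) latency p95 4.2s, avg < 300ms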
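
The p100 -> p90 inference can be illustrated with a toy example: lowering the timeout turns every request slower than the new limit into a timeout error. The latency values below are synthetic and chosen only to show the mechanism; they are not measurements from either sale. The 20s and 5s limits are taken from the discussion above.

```python
def timeout_count(latencies_s, timeout_s):
    """Number of requests whose latency exceeds the timeout limit."""
    return sum(1 for latency in latencies_s if latency > timeout_s)


# Synthetic sample of 100 request latencies in seconds (hypothetical):
# 90 fast requests, 9 slow ones between 5s and 20s, and 1 slower than 20s.
latencies = [0.3] * 90 + [8.0] * 9 + [25.0]

with_20s_limit = timeout_count(latencies, 20.0)  # only the slowest request times out
with_5s_limit = timeout_count(latencies, 5.0)    # everything slower than 5s times out

print(with_20s_limit, with_5s_limit, with_5s_limit / with_20s_limit)
```

In this sample the shorter limit produces ten times as many timeout errors without the backend being any slower, which is the shape of the effect the inference describes.
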
**墊腳石 timeout errors**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-12T09:50:00.000Z
end-time: 2020-08-12T10:10:00.000Z
query-string:
```
filter ( data.status = "timeout" and type = "ERROR" )
| parse data.url /^https?:\/\/(?<service>.+?)\/(?<path>([^\.]*?\/){0,3}).*?$/
| stats count() as _count by service, path
| sort _count desc
```
--------------------------------------------------------------------------------------
| service | path | _count |
|------------------|--------------------------------------------------------|--------|
| webapi.91mai.com | webapi/SalePage/GetSalePageRealTimeData/ | 1925 |
| webapi.91mai.com | webapi/TraceSalePageList/ | 229 |
| webapi.91mai.com | webapi/SalePageV2/GetSalePageV2Info/ | 188 |
| webapi.91mai.com | webapi/ShopCategory/GetSalePageList/ | 75 |
| webapi.91mai.com | webapi/Shop/ | 55 |
| webapi.91mai.com | webapi/LayoutTemplateData/GetLayoutTemplateData/ | 55 |
| webapi.91mai.com | webapi/SearchV2/ | 36 |
| webapi.91mai.com | webapi/SalePage/GetSalePageHotListByShopCategoryId/ | 27 |
| webapi.91mai.com | webapi/SalePageV2/ | 25 |
| webapi.91mai.com | webapi/APPNotification/ | 23 |
| webapi.91mai.com | webapi/shop/getCustomizedBrandIdentityDisplaySettings/ | 22 |
| webapi.91mai.com | webapi/shop/getForcedLogoutVersionList/ | 21 |
| webapi.91mai.com | webapi/ShopCategory/GetPromotionList/ | 20 |
| webapi.91mai.com | webapi/SalePageV2/GetSalePageAdditionalInfo/ | 18 |
| webapi.91mai.com | webapi/ShopStaticSetting/ | 15 |
| webapi.91mai.com | webapi/AppAnnouncement/ | 15 |
| webapi.91mai.com | webapi/AppNotification/GetMobileAppSettings/ | 14 |
| webapi.91mai.com | webapi/Shop/GetShopCategoryListV3/ | 12 |
| webapi.91mai.com | webapi/AppAnnouncement/getAppAnnouncementList/ | 10 |
| webapi.91mai.com | webapi/ecoupon/ | 9 |
| webapi.91mai.com | webapi/Shop/GetShopintroduction/ | 8 |
| api2.91mai.com | o2o/api/coupon/ | 7 |
| webapi.91mai.com | webapi/PromotionV2/GetList/ | 5 |
| webapi.91mai.com | webapi/Activity/ | 4 |
| webapi.91mai.com | webapi/HotSaleRanking/GetHotSaleRankingList/ | 2 |
| webapi.91mai.com | webapi/shop/getShopContractSetting/ | 2 |
| webapi.91mai.com | webapi/InfoModule/ | 1 |
--------------------------------------------------------------------------------------
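
The `parse data.url` step in the query above groups timeouts by host and by up to the first three path segments. A small Python sketch of the same pattern (using Python's `(?P<name>...)` group syntax) shows what it extracts. The full URLs are hypothetical, shaped after the services and paths in the table, and `split_url` is a hypothetical helper.

```python
import re

# Same pattern as the Logs Insights `parse` step, rewritten with Python named groups.
URL_RE = re.compile(r"^https?://(?P<service>.+?)/(?P<path>([^.]*?/){0,3}).*?$")


def split_url(url: str):
    """Return (service, path prefix) for a URL, or None if it does not match."""
    match = URL_RE.match(url)
    if match is None:
        return None
    return match.group("service"), match.group("path")


# Hypothetical URLs for illustration only.
for url in [
    "https://webapi.91mai.com/webapi/SalePage/GetSalePageRealTimeData/123?v=1",
    "https://webapi.91mai.com/webapi/Shop/1234",
    "https://api2.91mai.com/o2o/api/coupon/list.json",
]:
    print(split_url(url))
```

Because each segment must end in `/` and may not contain a `.`, a URL with fewer than three such segments yields a shorter prefix, which is why rows like `webapi/Shop/` appear next to three-segment paths.
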
**Cosmed (康是美) timeout errors**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-15T02:50:00.000Z
end-time: 2020-08-15T03:10:00.000Z
query-string:
```
filter ( data.status = "timeout" and type = "ERROR" )
| parse data.url /^https?:\/\/(?<service>.+?)\/(?<path>([^\.]*?\/){0,3}).*?$/
| stats count() as _count by service, path
| sort _count desc
```
---------------------------------------------------------------------------------------------------
| service | path | _count |
|-------------------------------|--------------------------------------------------------|--------|
| webapi.91mai.com | webapi/SalePage/GetSalePageRealTimeData/ | 1324 |
| webapi.91mai.com | webapi/TraceSalePageList/ | 1182 |
| d38tzu0atxk400.cloudfront.net | webapi/SalePageV2/GetSalePageV2Info/ | 152 |
| d38tzu0atxk400.cloudfront.net | webapi/ShopCategory/GetSalePageList/ | 84 |
| webapi.91mai.com | webapi/Shop/ | 63 |
| webapi.91mai.com | webapi/SalePage/GetSalePageHotListByShopCategoryId/ | 57 |
| webapi.91mai.com | webapi/SearchV2/ | 35 |
| webapi.91mai.com | webapi/SalePageV2/ | 32 |
| webapi.91mai.com | webapi/APPNotification/ | 31 |
| webapi.91mai.com | webapi/shop/getCustomizedBrandIdentityDisplaySettings/ | 31 |
| d38tzu0atxk400.cloudfront.net | webapi/ShopCategory/GetPromotionList/ | 30 |
| webapi.91mai.com | webapi/ShopStaticSetting/ | 30 |
| webapi.91mai.com | webapi/AppAnnouncement/ | 29 |
| webapi.91mai.com | webapi/shop/getForcedLogoutVersionList/ | 23 |
| d38tzu0atxk400.cloudfront.net | webapi/LayoutTemplateData/GetLayoutTemplateData/ | 22 |
| webapi.91mai.com | webapi/Activity/ | 21 |
| webapi.91mai.com | webapi/AppNotification/GetMobileAppSettings/ | 19 |
| webapi.91mai.com | webapi/AppAnnouncement/getAppAnnouncementList/ | 19 |
| webapi.91mai.com | webapi/ecoupon/ | 17 |
| webapi.91mai.com | webapi/Shop/GetShopintroduction/ | 15 |
| d38tzu0atxk400.cloudfront.net | webapi/SalePageV2/GetSalePageAdditionalInfo/ | 13 |
| webapi.91mai.com | webapi/HotSaleRanking/GetHotSaleRankingList/ | 7 |
| webapi.91mai.com | webapi/shop/getShopContractSetting/ | 3 |
| d38tzu0atxk400.cloudfront.net | webapi/Shop/GetShopCategoryListV3/ | 3 |
| webapi.91mai.com | webapi/InfoModule/ | 1 |
| api2.91mai.com | o2o/api/coupon/ | 1 |
| webapi.91mai.com | webapi/PromotionV2/GetList/ | 1 |
---------------------------------------------------------------------------------------------------
**Normal-traffic timeout errors (after the change)**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-14T12:50:00.000Z
end-time: 2020-08-14T13:10:00.000Z
query-string:
```
filter ( data.status = "timeout" and type = "ERROR" )
|stats count()
#| parse data.url /^https?:\/\/(?<service>.+?)\/(?<path>([^\.]*?\/){0,3}).*?$/
#| stats count() as _count by service, path
#| sort _count desc
```
-----------
| count() |
|---------|
| 47 |
-----------
**Normal-traffic timeout errors (before the change)**
region: ap-northeast-1
log-group-names: /eks-ap-northeast-1-tw-91app-io/prod-bff
start-time: 2020-08-07T12:50:00.000Z
end-time: 2020-08-07T13:10:00.000Z
query-string:
```
filter ( data.status = "timeout" and type = "ERROR" )
|stats count()
#| parse data.url /^https?:\/\/(?<service>.+?)\/(?<path>([^\.]*?\/){0,3}).*?$/
#| stats count() as _count by service, path
#| sort _count desc
```
-----------
| count() |
|---------|
| 59 |
-----------
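
As a quick arithmetic check of the timeout counts reported in this section, the sketch below compares the two flash-sale windows and the two normal-traffic windows. All numbers are copied from the note, including the x10 factor applied to the 8/15 count; the variable names are illustrative.

```python
# Timeout error counts from this note; each window is 20 minutes long.
sale_0812 = 2823        # 8/12 flash sale
sale_0815 = 3245 * 10   # Cosmed 8/15 flash sale, with the note's x10 factor applied
normal_0807 = 59        # normal traffic, 8/7
normal_0814 = 47        # normal traffic, 8/14

sale_ratio = sale_0815 / sale_0812
normal_ratio = normal_0814 / normal_0807

print(f"flash sale: {sale_ratio:.1f}x")
print(f"normal traffic: {normal_ratio:.2f}x")
```

The flash-sale ratio comes out a little above 11x, in line with the "roughly 10x" inference, while the normal-traffic count is slightly lower after the change.
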