# AliCoCo: Alibaba E-commerce Cognitive Concept Net (SIGMOD-2020)

- 4 Parts
  - Items (Product)
  - Taxonomy: manually defined high-level categories/attributes, e.g., Category->Dress, Time->Holiday, Time->Season, IP->Movie Star, Location, etc.
  - Primitive Concepts: mined lower-level categories/attributes, e.g., Outdoor, Barbecue, Winter, etc.
  - E-commerce Concepts: scenarios, e.g., outdoor barbecue, Christmas gifts
- Primitive Concept Discovery
  - data
    - get concept candidates from multiple sources: catalog attribute values, Wikipedia, etc.
    - distant supervision training data (6M sentences)
      - match attribute values against text (query/title/review)
      - remove texts with multiple (ambiguous) matches
  - model: learn a BiLSTM-CRF model from the distant supervision examples
  - apply: run prediction on sentences, yielding 2.7M primitive concepts
- Learning Primitive Concept Hierarchy: Hypernym Discovery
  - algorithm: bi-linear matching
  - data: active learning (sampling both the most confident and the most uncertain examples)
- E-commerce Concepts
  - candidate generation
    - AutoPhrase detection (TBD) from query/title/review
    - pattern generation based on Primitive Concepts
  - classify: is it a good concept?
    - model: Wide & Deep
      - Wide: meta features
      - Deep: LSTM over word embeddings + Wikipedia-augmented features
- Item Association
  - data: pairs of (query, concepts associated with highly clicked products)
  - algorithm: text matching between query and concepts
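
The distant supervision step under Primitive Concept Discovery (match attribute values against text, then drop texts with multiple matches) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the paper's implementation: the `LEXICON` entries and their labels are made up, greedy longest-match is assumed as the matching strategy, "multiple matching" is read as "a matched span maps to more than one label", and the BIO tag format is assumed as the input format for the BiLSTM-CRF tagger.

```python
from typing import Dict, List, Optional, Set

# Hypothetical lexicon: attribute value -> candidate taxonomy labels.
# Entries and labels are invented for illustration only.
LEXICON: Dict[str, Set[str]] = {
    "outdoor": {"Location"},
    "barbecue": {"Event"},
    "winter": {"Time"},
    "new year": {"Time"},
    "apple": {"Brand", "Food"},  # ambiguous: maps to two labels
}


def distant_label(
    tokens: List[str],
    lexicon: Dict[str, Set[str]],
    max_len: int = 3,
) -> Optional[List[str]]:
    """Tag tokens with BIO labels by greedy longest-match against the lexicon.

    Returns None when a matched span has more than one candidate label,
    so the caller can drop the sentence (assumed reading of
    "remove text with multiple matching").
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            labels = lexicon.get(phrase)
            if labels is None:
                continue
            if len(labels) != 1:
                return None  # ambiguous match: discard the whole sentence
            (label,) = labels
            tags[i] = "B-" + label
            for j in range(i + 1, i + n):
                tags[j] = "I-" + label
            i += n
            matched = True
            break
        if not matched:
            i += 1  # no lexicon entry starts here; leave the token as "O"
    return tags


def build_training_set(sentences: List[str]) -> List[tuple]:
    """Keep only sentences whose matches are unambiguous."""
    examples = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        tags = distant_label(tokens, LEXICON)
        if tags is not None:
            examples.append((tokens, tags))
    return examples
```

Calling `build_training_set(["outdoor barbecue in winter", "apple pie"])` keeps the first sentence with its tags and drops the second, because `apple` has two candidate labels in this toy lexicon. The surviving `(tokens, tags)` pairs are the kind of noisy examples a sequence tagger would then be trained on.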