Feature Importance on GBM

---
lang: ja-jp
tags: Survey
title: Feature Importance on GBM
---

# Feature Importance on GBM

> The more an attribute is used to make key decisions with decision trees, the higher its relative importance.

> This importance is calculated explicitly for each attribute in the dataset, allowing attributes to be ranked and compared to each other.

> Importance is calculated for a single decision tree by the amount that each attribute split point improves the performance measure, weighted by the number of observations the node is responsible for.

> The feature importances are then averaged across all of the decision trees within the model.

## Problem Setting

The number of possible value patterns is at most $d!$. Given labels that mark whether each sample is uninformative, it should be enough to find the threshold that minimizes impurity.

### Notation

- $\mathbf{x} \in \mathbb{R}^{d}$: input feature vector of dimension $d$
- $z\left( \mathbf{x} \right) \in \{0, 1\}^{d}$: vector whose $i$-th component is $0$ if $x_{i}$ is zero and $1$ otherwise
- $\mathbf{w} \in \mathbb{R}^{d}$: feature importance vector

### Input of the Model

- $s = \langle z\left( \mathbf{x} \right), \mathbf{w} \rangle$: sum of the feature importances over the feature dimensions that hold a nonzero value

### Output of the Model

#### Model Equation

- $f\left(s \mid \beta_{0}, \beta_{1}\right) = \mathbb{I} \left[ \frac{1}{1 + \exp\left(\beta_{0} + \beta_{1}s\right)} \geq 0 \right]$

Note that the logistic term is strictly positive, so the condition $\geq 0$ as written always holds and $f$ is identically $1$. A nonzero threshold (for example $\frac{1}{2}$) is presumably intended.

#### Output?

==WIP==

## Appendix

### LightGBM

```python=
# `model` is assumed to be a trained lightgbm.Booster.
# The default importance_type is "split" (how often each feature is used);
# pass importance_type="gain" for the total gain of its splits.
importance = model.feature_importance()
print(type(importance))  # <class 'numpy.ndarray'>
```
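
To tie the appendix back to the model under Problem Setting, here is a minimal sketch in plain Python. It assumes the importance vector $\mathbf{w}$ is available as a list of floats (for example, converted from the array returned by `feature_importance()`). The helper names `nonzero_mask`, `importance_score` and `predict`, the threshold of `0.5`, and all sample numbers are illustrative assumptions and do not come from the notes above.

```python
import math


def nonzero_mask(x):
    """z(x): 1 where the component of x is nonzero, 0 otherwise."""
    return [0 if xi == 0 else 1 for xi in x]


def importance_score(x, w):
    """s = <z(x), w>: sum of importances over the nonzero feature dimensions."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sum(zi * wi for zi, wi in zip(nonzero_mask(x), w))


def predict(s, beta0, beta1, threshold=0.5):
    """Indicator that the logistic term of the model equation reaches `threshold`.

    The sign convention exp(beta0 + beta1 * s) follows the model equation;
    threshold=0.5 is an assumption, since a threshold of 0 is always met.
    """
    p = 1.0 / (1.0 + math.exp(beta0 + beta1 * s))
    return 1 if p >= threshold else 0


# Hypothetical importance vector and input sample (d = 4).
w = [10.0, 0.0, 3.0, 7.0]
x = [1.5, 2.0, 0.0, -4.0]

s = importance_score(x, w)                 # only dimensions 0, 1 and 3 contribute
label = predict(s, beta0=8.0, beta1=-1.0)  # hypothetical coefficients
```

With this sign convention a negative `beta1` makes the logistic term grow with `s`, so samples whose nonzero features carry more total importance are more likely to be labeled `1`. In practice `beta0` and `beta1` would be fitted rather than chosen by hand.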