AutoGluon
===
###### tags: `ML / ensemble`
###### tags: `ML`, `AutoML`, `CPU+GPU`, `sklearn`, `XGBoost`, `NeuralNetwork+NN`

<br>
[TOC]
<br>
## github
> [awslabs / autogluon](https://github.com/awslabs/autogluon)
- [AutoGluon Roadmap 2022](https://github.com/awslabs/autogluon/blob/master/ROADMAP.md)
- Documentation: https://auto.gluon.ai/stable/index.html
- [AutoGluon overview & example applications](https://towardsdatascience.com/autogluon-deep-learning-automl-5cdb4e2388ec)
- To install AutoGluon for Linux with GPU support, run the following commands in terminal or refer to the installation wiki for CPU only installation:
```
pip install --upgrade mxnet-cu100
```
- [Apache MXNet on AWS](https://aws.amazon.com/tw/mxnet/)
<br>
## Docker
> [autogluon/autogluon](https://hub.docker.com/r/autogluon/autogluon)
- Start Container and Notebook Server with GPU support
```
$ docker pull autogluon/autogluon:0.5.2-cuda11.2-jupyter-ubuntu20.04-py3.8
$ docker run --gpus all --shm-size=1G --rm -it -p 8888:8888 \
autogluon/autogluon:0.5.2-cuda11.2-jupyter-ubuntu20.04-py3.8
```
- Start Container and Notebook Server with CPU-only support
```
$ docker pull autogluon/autogluon:0.5.2-cpu-jupyter-ubuntu20.04-py3.8
$ docker run --rm --shm-size=1G -it -p 8888:8888 \
autogluon/autogluon:0.5.2-cpu-jupyter-ubuntu20.04-py3.8
```
<br>
## Doc
> [AutoGluon: AutoML for Text, Image, and Tabular Data](https://auto.gluon.ai/stable/index.html)
### [Installation](https://auto.gluon.ai/stable/install.html)
- To install only the tabular package
- **Base (includes scikit-learn: RF+XT+KNN+LR)**
`pip install autogluon.tabular`
- **Base + CAT (catboost)**
`pip install autogluon.tabular[catboost]`
:warning: **`pip install autogluon.tabular`** does not include catboost
- **Base + GBM (lightgbm)**
`pip install autogluon.tabular[lightgbm]==0.5.2`
- **Base + XGB (xgboost)**
`pip install autogluon.tabular[xgboost]==0.5.2`
- **Base + NN_MXNET (mxnet)**
`pip install mxnet --upgrade`
or
`pip install mxnet_cu101 --upgrade`
- **Base + NN_TORCH (torch)**
`pip install torch`
- **Base + FASTAI (fastai)**
`pip install autogluon.tabular[fastai]`
- **Base + AG_TEXT_NN**
`pip install autogluon.text`
- **Base + everything**
`pip install autogluon.tabular[all]`
- catboost
- lightgbm
- xgboost
- fastai
- torch
- ...
(Note: the full install does not include NN_MXNET; no MXNet log appeared during training)
### [APIs](https://auto.gluon.ai/stable/api/index.html)
- [TabularDataset](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularDataset)
- [TabularPredictor](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor)
- [TabularPredictor.fit](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.fit)
- time_limit: int
- hyperparameters: str or dict
- Stable model options include:
‘GBM’ (LightGBM) ‘CAT’ (CatBoost) ‘XGB’ (XGBoost) ‘RF’ (random forest) ‘XT’ (extremely randomized trees) ‘KNN’ (k-nearest neighbors) ‘LR’ (linear regression) ‘NN’ (neural network with MXNet backend) ‘FASTAI’ (neural network with FastAI backend)
- Experimental model options include:
‘FASTTEXT’ (FastText) ‘AG_TEXT_NN’ (Multimodal Text+Tabular model, GPU is required) ‘TRANSF’ (Tabular Transformer, GPU is recommended)
- [TabularPredictor.predict](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.predict)
- [TabularPredictor.evaluate](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.evaluate)
- Returns the evaluation scores
- Requires data that contains both X and y
- Test example

- [TabularPredictor.evaluate_predictions](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.evaluate_predictions)
- Returns the evaluation scores
- Requires y_true and y_pred
- Test example
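The difference between the two scoring calls above can be sketched as follows. This is a minimal sketch, not the official example: `compare_evaluations` is a hypothetical helper name, `predictor` is assumed to be an already fitted `TabularPredictor`, and `test_data` is assumed to be a pandas-style table that still contains the label column.

```python
def compare_evaluations(predictor, test_data, label):
    """Score held-out data two ways and return both results."""
    # evaluate() takes the full table (features plus the label column)
    scores_from_data = predictor.evaluate(test_data)

    # evaluate_predictions() takes ground truth and predictions separately
    y_true = test_data[label]
    y_pred = predictor.predict(test_data.drop(columns=[label]))
    scores_from_preds = predictor.evaluate_predictions(y_true=y_true, y_pred=y_pred)
    return scores_from_data, scores_from_preds
```

`evaluate` is the shorter path when the labeled table is at hand; `evaluate_predictions` is useful when predictions were produced earlier or by another process.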

<br>
<hr>
<br>
## Problem types
- [log] `TabularPredictor(...).fit(...)`
>
predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
<br>
<hr>
<br>
## Algorithms
### tabular
- [[doc] autogluon.tabular.models](https://auto.gluon.ai/0.1.0/api/autogluon.tabular.models.html)
- [[doc] autogluon.tabular.TabularPredictor.fit](https://auto.gluon.ai/stable/api/autogluon.predictor.html#autogluon.tabular.TabularPredictor.fit)
- **Stable models:**
'GBM' (LightGBM)
'CAT' (CatBoost)
'XGB' (XGBoost)
'RF' (random forest)
'XT' (extremely randomized trees)
'KNN' (k-nearest neighbors)
'LR' (linear regression)
'NN_MXNET' (neural network implemented in MXNet)
'NN_TORCH' (neural network implemented in Pytorch)
'FASTAI' (neural network with FastAI backend)
- **Experimental models:**
'FASTTEXT' (FastText)
'AG_TEXT_NN' (Multimodal Text+Tabular model, GPU is required)
'TRANSF' (Tabular Transformer, GPU is recommended)
- **Summary**
| Model (short) | Model (full)<br>import | Support GPU<br>(Default) | doc |
| ------------- | ------------ | ----------- | --- |
| GBM | **LightGBM**<br>`autogluon.tabular.models.lgb` | ✘ (default)<br>✔ (install) | [[lightgbm]](https://lightgbm.readthedocs.io/en/latest/Parameters.html) |
| CAT | **CatBoost**<br>`autogluon.tabular.models.catboost` | ✔ | [[catboost]](https://catboost.ai/docs/concepts/parameter-tuning.html) |
| XGB | **XGBoost**<br>`autogluon.tabular.models.xgboost` | ✔ | [[xgboost]](https://xgboost.readthedocs.io/en/latest/parameter.html) |
| RF | **random forest**<br>`autogluon.tabular.models.rf` | <span style="color: red">✘</span> | [[sklearn]](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) |
| XT | **extremely randomized trees**<br>`autogluon.tabular.models.xt` | <span style="color: red">✘</span> | [[sklearn]](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html) |
| KNN | **k-nearest neighbors**<br>`autogluon.tabular.models.knn` | <span style="color: red">✘</span> | [[sklearn]](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html) |
| LR | **linear regression**<br>`autogluon.tabular.models.lr` | <span style="color: red">✘</span> | [[sklearn]](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) |
| NN_MXNET | neural network implemented in MXNet | ✔ ||
| NN_TORCH | neural network implemented in Pytorch | ✔ ||
| FASTAI | neural network with FastAI backend<br>`autogluon.tabular.models.fastainn` | ✔ | [[fast.ai]](https://docs.fast.ai/tabular.models.html) |
| * FASTTEXT | **FastText** | - ||
| * AG_TEXT_NN | Multimodal Text+Tabular model,<br>GPU is required | ✔ (no-CPU)||
| * TRANSF | **Tabular Transformer**<br>GPU is recommended | ✔ ||
- <b style='color: red'>`*`</b> marks experimental models
<br>
- The default algorithms exclude the experimental models and NN_MXNET
- `pip install autogluon.tabular[all]`
The full install does not pull in the NN_MXNET package
<br>
- **Results**: [posted on GitLab](http://10.78.26.44:30000/ai_maker_template/ml_sklearn/-/issues/19#note_73146)
- **NN_MXNET**: poor performance
- **FASTTEXT**: cannot be trained
> No valid features to train FastText_2... Skipping this model.
- **AG_TEXT_NN**: poor training/inference results (acc=6%); probably intended for text data only?
<br>
- [[官網][doc] autogluon.tabular.models](https://auto.gluon.ai/stable/api/autogluon.tabular.models.html)
<br>
## Algorithm > GPU options
### Usages
- ### global
```python=
%%time
predictor = TabularPredictor(
    label,
    eval_metric='accuracy',
    path=path
).fit(
    train_data,
    hyperparameters={
        'XGB': [{}]  # at least one base config is required (the list must not be empty)
        # or
        # 'XGB': {}
    },
    ag_args_fit={'num_gpus': 1}  # applies to every model
)
```
- `'XGB': []`: fails with the error message `No base models to train on`
- ### local
```python=
%%time
predictor = TabularPredictor(
label,
eval_metric='accuracy',
path=path
).fit(
train_data,
hyperparameters={
'XGB': [
{'ag_args_fit': {'num_gpus': 1}}, # train with GPU (first run)
{'ag_args_fit': {'num_gpus': 0}}, # train with CPU (second run)
]
}
)
```
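The global and local styles above can be bridged with a small helper that writes `num_gpus` into every per-model config and rejects the empty-list case. This is a sketch under stated assumptions: `with_num_gpus` is a hypothetical name, not part of AutoGluon, and it only manipulates plain dictionaries before they are passed to `fit()`.

```python
from copy import deepcopy

def with_num_gpus(hyperparameters, num_gpus):
    """Return a copy of `hyperparameters` with num_gpus set on every config."""
    result = {}
    for model, configs in hyperparameters.items():
        # A model entry may be a single dict or a list of dicts
        if isinstance(configs, dict):
            configs = [configs]
        if not configs:
            # Mirrors the "No base models to train on" failure for 'XGB': []
            raise ValueError(f"{model!r} needs at least one config; use [{{}}] for defaults")
        updated = []
        for config in configs:
            config = deepcopy(config)
            config.setdefault('ag_args_fit', {})['num_gpus'] = num_gpus
            updated.append(config)
        result[model] = updated
    return result

hyperparameters = with_num_gpus({'XGB': {}, 'CAT': [{}]}, num_gpus=1)
print(hyperparameters)
# {'XGB': [{'ag_args_fit': {'num_gpus': 1}}], 'CAT': [{'ag_args_fit': {'num_gpus': 1}}]}
```

The returned dictionary can be passed as `hyperparameters=` without the global `ag_args_fit` argument.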
:::warning
:warning: **Limitations in a GPU environment**
- NeuralNetTorch
> Fitting model: NeuralNetTorch ...
> TabularNeuralNetTorchModel not yet able to use more than 1 GPU. 'num_gpus' is set to >1, but we will be using only 1 GPU.
>
[](https://i.imgur.com/VP16RPd.png)
:::
:::warning
:warning: **Enabling GPU in an environment without a GPU raises errors**
- ### XGBoost
```python
predictor = TabularPredictor(
label,
eval_metric='accuracy',
path=path
).fit(
train_data,
hyperparameters={
'XGB': { }
},
ag_args_fit={'num_gpus': 1}
)
```
> Fitting model: XGBoost ...
> Warning: Exception caused XGBoost to fail during training... Skipping this model.
> [14:44:21] ../src/gbm/gbtree.cc:548: Check failed: common::AllVisibleGPUs() >= 1 (0 vs. 1) : No visible GPU is found for XGBoost.
> ...
> xgboost.core.XGBoostError: [14:44:21] ../src/gbm/gbtree.cc:548: Check failed: common::AllVisibleGPUs() >= 1 (0 vs. 1) : No visible GPU is found for XGBoost.
[](https://i.imgur.com/Sq7Ddq3.png)
- ### TRANSF
> RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
> No base models to train on, skipping auxiliary stack level 2...
[](https://i.imgur.com/8EimDZb.png)
- ### CAT
> catboost/cuda/cuda_lib/cuda_base.h:281: CUDA error 35: CUDA driver version is insufficient for CUDA runtime version
>
> _catboost.CatBoostError: catboost/cuda/cuda_lib/cuda_base.h:281: CUDA error 35: CUDA driver version is insufficient for CUDA runtime version
No base models to train on, skipping auxiliary stack level 2...
- ### NN_MXNET
> MXNetError: GPU is not enabled
- ### GBM, NeuralNetTorch, FASTAI
No error is raised; training automatically falls back to the **CPU** version
:::
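One way to avoid the failures listed above is to check for a visible GPU before setting `num_gpus`. The sketch below assumes an NVIDIA setup where the `nvidia-smi` tool is on `PATH` when a driver is installed; `count_visible_gpus` is a hypothetical helper, and it will not catch driver/runtime version mismatches such as the CatBoost `CUDA error 35` case.

```python
import shutil
import subprocess

def count_visible_gpus():
    """Count NVIDIA GPUs reported by `nvidia-smi -L`; return 0 if unavailable."""
    if shutil.which('nvidia-smi') is None:
        return 0
    try:
        result = subprocess.run(
            ['nvidia-smi', '-L'],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return 0
    # Each device is listed on its own line, starting with "GPU <index>:"
    return sum(1 for line in result.stdout.splitlines() if line.startswith('GPU '))

# Request a GPU only when one is actually visible
num_gpus = 1 if count_visible_gpus() > 0 else 0
ag_args_fit = {'num_gpus': num_gpus}
```

`ag_args_fit` can then be passed to `fit()` as in the global usage example.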
### Articles
- [[kaggle] AutoGluon + RAPIDS (Top 1%)](https://www.kaggle.com/code/innixma/autogluon-rapids-top-1/notebook)
- [[github] How to use GPUs for tabular #1097](https://github.com/awslabs/autogluon/issues/1097)
- [[AutoGluon][doc] Tabular Prediction / navigate_next / FAQ](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-faq.html?highlight=gpu#can-i-use-gpus-for-model-training)
- [[AutoGluon][doc] AutoMM for Text + Tabular - Quick Start](https://auto.gluon.ai/stable/tutorials/multimodal/multimodal_text_tabular.html?highlight=gpu)
- [[AutoGluon][doc] Training models with GPU support](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-gpu.html)
- CUDA toolkit is required for GPU training.
### Troubleshooting
- ### Fitting model: LightGBMXT
```
Fitting model: LightGBMXT ...
Training LightGBMXT with GPU, note that this may negatively impact model quality compared to CPU training.
Warning: GPU mode might not be installed for LightGBM, GPU training raised an exception. Falling back to CPU training...Refer to LightGBM GPU documentation: https://github.com/Microsoft/LightGBM/tree/master/python-package#build-gpu-versionOne possible method is: pip uninstall lightgbm -y pip install lightgbm --install-option=--gpu
```
---
```
pip uninstall lightgbm -y
pip install lightgbm --install-option=--gpu
```
- errors
```
subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-install-ekyiu2lo/lightgbm_356d35a9e6b84ff89efd84d256913b7b/compile', '-DUSE_GPU=ON']' returned non-zero exit status 1.
...
Exception: Please install CMake and all required dependencies first
The full version of error log was saved into /root/LightGBM_compilation.log
...
```
- [When I run 'pip install lightgbm --install-option=--gpu', I got an error #1328](https://github.com/Microsoft/LightGBM/issues/1328)
> refer to https://github.com/Microsoft/LightGBM/tree/master/python-package#build-gpu-version
> you should install cmake, boost, opencl, and set the environment variables correctly.
- [Build GPU Version](https://github.com/Microsoft/LightGBM/tree/master/python-package#build-gpu-version)
- [Build CUDA Version](https://github.com/Microsoft/LightGBM/tree/master/python-package#build-cuda-version)
- try
- [How to install Boost on Ubuntu](https://stackoverflow.com/questions/12578499/)
```
sudo apt-get install libboost-all-dev
```
- [Install OpenCL Drivers On Ubuntu](https://support.zivid.com/en/v2.4/getting-started/software-installation/gpu/install-opencl-drivers-ubuntu.html)
```bash
mkdir neo
cd neo
wget https://github.com/intel/compute-runtime/releases/download/19.07.12410/intel-gmmlib_18.4.1_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/19.07.12410/intel-igc-core_18.50.1270_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/19.07.12410/intel-igc-opencl_18.50.1270_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/19.07.12410/intel-opencl_19.07.12410_amd64.deb
wget https://github.com/intel/compute-runtime/releases/download/19.07.12410/intel-ocloc_19.07.12410_amd64.deb
sudo dpkg -i *.deb
# Check OpenCL driver
/usr/bin/clinfo -l
```
- error
`$ dpkg -i intel-igc-opencl_18.50.1270_amd64.deb`
```
dpkg: dependency problems prevent configuration of intel-igc-opencl:
intel-igc-opencl depends on libtinfo5 (>= 6); however:
Package libtinfo5 is not installed.
dpkg: error processing package intel-igc-opencl (--install):
dependency problems - leaving unconfigured
Errors were encountered while processing:
intel-igc-opencl
```
---
```
pip uninstall lightgbm -y
pip install lightgbm --install-option=--cuda
```
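Before retrying the GPU build, it may help to confirm that the command-line prerequisites mentioned above are present. This is a rough sketch: `missing_build_tools` is a hypothetical helper, it only checks for executables on `PATH` (`cmake`, and `clinfo` as a proxy for an OpenCL installation), and it cannot detect Boost, which is a library rather than a command.

```python
import shutil

def missing_build_tools(tools=('cmake', 'clinfo')):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_build_tools()
if missing:
    print('Install before building LightGBM with GPU support:', ', '.join(missing))
```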

<br>
<hr>
<br>
## Metrics
- ### [autogluon.tabular.TabularPredictor](https://auto.gluon.ai/stable/api/autogluon.predictor.html#module-0)
- **Usage (example)**
```python
TabularPredictor(
...
eval_metric='accuracy' # only one metric is allowed
)
```
- **Defaults**
- Binary classification: `accuracy`
- Multiclass classification: `accuracy`
- Regression: `root_mean_squared_error`
- Quantile: `pinball_loss`
- **Classification**
- `accuracy`
- `balanced_accuracy`
- `f1`
- `f1_macro`
- `f1_micro`
- `f1_weighted`
- `roc_auc`
- `roc_auc_ovo_macro`
- `average_precision`
- `precision`
- `precision_macro`
- `precision_micro`
- `precision_weighted`
- `recall`
- `recall_macro`
- `recall_micro`
- `recall_weighted`
- `log_loss`
- `pac_score`
- **Regression**
- `root_mean_squared_error`
- `mean_squared_error`
- `mean_absolute_error`
- `median_absolute_error`
- `mean_absolute_percentage_error`
- `r2`
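The defaults and metric names listed above can be captured in a small lookup that picks the default for a problem type and rejects a mismatched name. This is a simplified sketch: `resolve_eval_metric` is a hypothetical helper, it only knows the names listed in this section, and it deliberately ignores custom scorer objects, which AutoGluon also accepts.

```python
DEFAULT_METRIC = {
    'binary': 'accuracy',
    'multiclass': 'accuracy',
    'regression': 'root_mean_squared_error',
    'quantile': 'pinball_loss',
}
CLASSIFICATION_METRICS = {
    'accuracy', 'balanced_accuracy', 'f1', 'f1_macro', 'f1_micro', 'f1_weighted',
    'roc_auc', 'roc_auc_ovo_macro', 'average_precision', 'precision',
    'precision_macro', 'precision_micro', 'precision_weighted', 'recall',
    'recall_macro', 'recall_micro', 'recall_weighted', 'log_loss', 'pac_score',
}
REGRESSION_METRICS = {
    'root_mean_squared_error', 'mean_squared_error', 'mean_absolute_error',
    'median_absolute_error', 'mean_absolute_percentage_error', 'r2',
}
ALLOWED = {
    'binary': CLASSIFICATION_METRICS,
    'multiclass': CLASSIFICATION_METRICS,
    'regression': REGRESSION_METRICS,
    'quantile': {'pinball_loss'},
}

def resolve_eval_metric(problem_type, eval_metric=None):
    """Return the metric name to use, falling back to the listed default."""
    if eval_metric is None:
        return DEFAULT_METRIC[problem_type]
    if not isinstance(eval_metric, str):
        # eval_metric takes a single metric, not a list of names
        raise TypeError('eval_metric must be a single metric name')
    if eval_metric not in ALLOWED[problem_type]:
        raise ValueError(f'{eval_metric!r} is not listed for {problem_type!r}')
    return eval_metric

print(resolve_eval_metric('regression'))    # root_mean_squared_error
print(resolve_eval_metric('binary', 'f1'))  # f1
```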
<br>
<hr>
<br>
## Models
### XGBoost
- Test: run XGBoost twice
```python
%%time
predictor = TabularPredictor(
label,
eval_metric='accuracy',
path=path
).fit(
train_data,
hyperparameters={
'XGB': [
{'ag_args_fit': {'num_gpus': 1}}, # train with GPU (first run)
{'ag_args_fit': {'num_gpus': 1}}, # train with GPU (second run)
]
},
)
```
```
.
├── [ 5] __version__
├── [5.3K] learner.pkl
├── [ 104] models
│ ├── [ 48] WeightedEnsemble_L2
│ │ ├── [6.3K] model.pkl
│ │ └── [ 59] utils
│ │ ├── [ 987] model_template.pkl
│ │ └── [ 578] oof.pkl
│ ├── [ 50] XGBoost <---------------
│ │ ├── [2.2K] model.pkl
│ │ └── [214K] xgb.ubj
│ ├── [ 50] XGBoost_2 <---------------
│ │ ├── [2.2K] model.pkl
│ │ └── [214K] xgb.ubj
│ └── [2.3K] trainer.pkl
├── [ 317] predictor.pkl
└── [ 42] utils
├── [ 50] attr
│ ├── [ 42] XGBoost
│ │ └── [ 440] y_pred_proba_val.pkl
│ └── [ 42] XGBoost_2
│ └── [ 440] y_pred_proba_val.pkl
└── [ 86] data
├── [4.6K] X.pkl
├── [1.7K] X_val.pkl
├── [2.3K] y.pkl
└── [1.1K] y_val.pkl
10 directories, 17 files
```
- KNNRapidsModel [[GitHub](https://github.com/awslabs/autogluon/blob/master/tabular/src/autogluon/tabular/models/knn/knn_rapids_model.py)]
- [Reference usage](https://github.com/h2oai/driverlessai-recipes/blob/master/models/algorithms/autogluon.py#L111)
- Test code
```python
from autogluon.tabular.models.knn.knn_rapids_model import KNNModel
from autogluon.tabular.models.knn.knn_rapids_model import KNNRapidsModel
predictor = TabularPredictor(
label,
eval_metric='accuracy',
path=path
).fit(
train_data,
hyperparameters={
KNNRapidsModel: {}
},
ag_args_fit={'num_gpus': 1}
)
```
- error
```
Fitting model: KNNRapidsModel ...
Warning: Exception caused KNNRapidsModel to fail during training (ImportError)... Skipping this model.
`import cuml` failed.
Ensure that you have a GPU and CUDA installation, and then install RAPIDS.
You will likely need to create a fresh conda environment based off of a RAPIDS install, and then install AutoGluon on it.
RAPIDS is highly experimental within AutoGluon, and we recommend to only use RAPIDS if you are an advanced user / developer.
Please refer to RAPIDS install instructions for more information: https://rapids.ai/start.html#get-rapids
No base models to train on, skipping auxiliary stack level 2...
```

- Environment setup
```
conda create -n rapids-21.06 -c rapidsai -c nvidia -c conda-forge rapids=21.06 python=3.8 cudatoolkit=11.2
conda activate rapids-21.06
pip install --pre autogluon.tabular[all]
```
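The `import cuml` failure shown above can be anticipated by checking for the package before selecting the RAPIDS model. In this sketch, `rapids_available` and `knn_hyperparameters` are hypothetical helpers; the fallback to the standard `'KNN'` key is an assumption about what one would want when RAPIDS is missing.

```python
import importlib.util

def rapids_available():
    """Return True when the RAPIDS cuML package can be imported."""
    return importlib.util.find_spec('cuml') is not None

def knn_hyperparameters():
    """Pick the RAPIDS KNN model when cuML is installed, else the CPU KNN."""
    if rapids_available():
        # Imported lazily: this module path only works with AutoGluon installed
        from autogluon.tabular.models.knn.knn_rapids_model import KNNRapidsModel
        return {KNNRapidsModel: {}}
    return {'KNN': {}}
```

The result of `knn_hyperparameters()` can be passed as `hyperparameters=` to `fit()`.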
<br>
<hr>
<br>
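## Doc - Details
> - Use cases
> - Usage in Kaggle competitions
> - Preprocessing
> - How to handle data tables with text and categorical features?
> - How to handle data tables with image, text, numeric, and categorical features?
> - How to use feature engineering, and how to extend it?
> - Training
> - fit tutorial (and advanced usage)
> - How to build interpretable models using rules?
> - How to accelerate training with GPUs?
> - How to add a custom model (and advanced usage)?
> - Prediction
> - How to predict multiple columns?
> - Evaluation
> - How to add a custom metric?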
### Intro
- [Tabular Prediction](https://auto.gluon.ai/stable/tutorials/tabular_prediction/index.html)
- Claims to require no data cleaning, feature engineering, hyperparameter optimization, or model selection
### [1. Quick Start Using FIT](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-quickstart.html)
> 5 min tutorial on fitting models with tabular datasets.
> Tutorial on `fit`
- [TabularPredictor](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor)
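- `problem_type` has four options
- binary
- multiclass
- regression
- None (inferred automatically from the label values)
- Preprocessing
- Missing values are handled and feature values are rescaled automatically
- Training
- API
```python
from autogluon.tabular import TabularPredictor
predictor = TabularPredictor(label=<variable-name>).fit(train_data=<file-name>)
```
- Automatically fits various types of models, such as neural networks and tree ensembles
- Automatically searches for the hyperparameters of each model that perform best on the validation set
- `fit()` can run in parallel across multiple threads and machines (though multiprocessing may not be supported?)
- time_limit (a limit on the total time)
- Drawback of setting a time limit with `fit(train_data, time_limit=60)`:
[](https://i.imgur.com/ZWV77FT.png)
The later models never get to run
- Inference
- For classification problems, predicted class probabilities are supported in addition to predicted classes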
- [leaderboard](https://auto.gluon.ai/0.1.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.leaderboard)
[](https://i.imgur.com/knY7IL0.png)
'score_test', 'pred_time_test', and 'pred_time_test_marginal'
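The quick-start flow described in this section (fit with a time limit, predict class probabilities, inspect the leaderboard) can be sketched as one function. `quickstart` and its `predictor_cls` parameter are hypothetical: the parameter exists only so the function can be exercised without AutoGluon installed, and `train_data` / `test_data` are assumed to be pandas-style tables that include the label column.

```python
def quickstart(train_data, test_data, label, time_limit=60, predictor_cls=None):
    """Fit with a time limit, then return class probabilities and the leaderboard."""
    if predictor_cls is None:
        # Imported lazily so this function can be defined without AutoGluon
        from autogluon.tabular import TabularPredictor as predictor_cls
    # time_limit caps the total fitting time in seconds; models that
    # do not fit into the budget are skipped
    predictor = predictor_cls(label=label).fit(train_data, time_limit=time_limit)
    features = test_data.drop(columns=[label])
    probabilities = predictor.predict_proba(features)
    # Passing labeled data adds test columns such as score_test
    leaderboard = predictor.leaderboard(test_data, silent=True)
    return probabilities, leaderboard
```

Raising `time_limit` is the simplest way to let the later models in the training order run.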
### [2. In-depth FIT Tutorial](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-indepth.html)
> In-depth tutorial on controlling various aspects of model fitting.
> A deep dive into `fit`
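- **Specifying model hyperparameters**
> - Settings: NN hyperparameters (such as epochs) and LightGBM hyperparameters
> - Each hyperparameter
> - has a default value
> - and a range or a list of choices
> - When HPO (hyperparameter optimization) is enabled, values are picked from that range or list (similar to the SmartML feature)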
```python
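import autogluon.core as ag

# Neural network search space: fixed values and tunable ranges
nn_options = {
    'num_epochs': 10,
    'learning_rate': ag.space.Real(1e-4, 1e-2, default=5e-4, log=True),
    'activation': ag.space.Categorical('relu', 'softrelu', 'tanh'),
    'dropout_prob': ag.space.Real(0.0, 0.5, default=0.1),
}
# LightGBM search space (illustrative values; gbm_options was undefined in the original note)
gbm_options = {
    'num_boost_round': 100,
    'num_leaves': ag.space.Int(lower=26, upper=66, default=36),
}
hyperparameters = {
    'GBM': gbm_options,
    'NN_TORCH': nn_options,
}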
```
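- **Hyperparameter optimization**
> - You can set the number of HPO trials and the total run time
(it is not yet clear how the stopping condition is determined)
> - HPO also has an auto option (on-going)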
```python
time_limit = 15*60 # train various models for ~15 min
num_trials = 1000 # try at most 1000 different hyperparameter configurations for each type of model
search_strategy = 'auto' # to tune hyperparameters using random search routine with a local scheduler
# HPO is not performed unless hyperparameter_tune_kwargs is specified
hyperparameter_tune_kwargs = {
'num_trials': num_trials,
'scheduler' : 'local',
'searcher': search_strategy,
}
```
- **Auxiliary metrics**
- `evaluate(test_data, auxiliary_metrics=False)`
- False: report only the specified `eval_metric`
- True (default): report all metrics
- Output with all metrics:
- accuracy: 0.8752175248234211
- balanced_accuracy: 0.7985774242740231
- mcc: 0.6384055943366135
- roc_auc: 0.9292811684599376
- f1: 0.7128386336866903
- precision: 0.785158277114686
- recall: 0.6527178602243313
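The `hyperparameters` and `hyperparameter_tune_kwargs` dictionaries from this section only take effect once they are passed to `fit()`. A minimal sketch of that call follows; `fit_with_hpo` is a hypothetical wrapper, and `predictor` is assumed to be an unfitted `TabularPredictor` (any object with a compatible `fit` method works).

```python
def fit_with_hpo(predictor, train_data, hyperparameters,
                 time_limit, num_trials, searcher='auto'):
    """Run fit() with hyperparameter optimization enabled."""
    # HPO is not performed unless hyperparameter_tune_kwargs is specified
    hyperparameter_tune_kwargs = {
        'num_trials': num_trials,   # upper bound on configurations per model type
        'scheduler': 'local',
        'searcher': searcher,
    }
    return predictor.fit(
        train_data,
        time_limit=time_limit,      # total time budget in seconds
        hyperparameters=hyperparameters,
        hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
    )
```

Whichever limit is reached first, `time_limit` or `num_trials`, presumably ends the search; the notes above leave the exact stopping rule open.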
### [3. Kaggle Tutorial](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-kaggle.html)
> Using AutoGluon for Kaggle competitions with tabular data.
> Usage scenarios on Kaggle
### [4. Data Tables Containing Image, Text, and Tabular](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-multimodal.html)
> Modeling data tables with image, text, numeric, and categorical features.
> How to handle data tables with image, text, numeric, and categorical features?
### [5. Data Tables Containing Text](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-multimodal-text-others.html)
> Modeling data tables with text and numeric/categorical features.
> How to handle data tables that contain text?
### [6. Interpretable rule-based modeling](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-interpretability.html)
> Fitting interpretable models to data table for understanding data and predictions.
> Building interpretable models with rules
### [7. Training models with GPU support](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-gpu.html)
> How to train models with GPU support.
> How to accelerate training with GPUs?
### [8. Multi-Label Prediction](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-multilabel.html)
> How to predict multiple columns in a data table.
> How to predict multiple columns?
### [9. Adding a Custom Model](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-custom-model.html)
> How to add a custom model to AutoGluon.
> How to add a custom model?
### [10. Adding a Custom Model (Advanced)](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-custom-model-advanced.html)
> How to add a custom model to AutoGluon (Advanced).
> How to add a custom model (advanced)?
### [11. Adding a Custom Metric](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-custom-metric.html)
> How to add a custom metric to AutoGluon.
> How to add a custom metric?
### [12. Feature Engineering](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-feature-engineering.html)
> AutoGluon’s default feature engineering and how to extend it.
> How to use feature engineering, and how to extend it?
### [FAQ](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-faq.html)
> Frequently asked questions about AutoGluon-Tabular.
### [Functionality Reference Implementation](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-custom-model.html#functionality-reference-implementation)
<br>
<hr>
<br>
## conda experiment
```
conda create -n autogluon python=3.8
```
<br>
<hr>
<br>
## Inference
- Training on GPU but running inference on CPU
- entrypoint
/autogluon/tabular/predictor/predictor.py", line 1382, in predict
- error
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
- snapshot
[](https://i.imgur.com/ViUfBPc.png)
<br>
<hr>
<br>
## References
- [AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data](https://arxiv.org/abs/2003.06505)
> Linked from: [Model ensembling with stacking/bagging](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-indepth.html#model-ensembling-with-stacking-bagging)
- [PDF](https://arxiv.org/pdf/2003.06505.pdf)
- [Advancing the State of the Art in AutoML, Now 10x Faster with NVIDIA GPUs and RAPIDS](https://blogs.nvidia.com.tw/2022/03/29/advancing-the-state-of-the-art-in-automl-now-10x-faster-with-nvidia-gpus-and-rapids/)
- AutoGluon is an AutoML framework whose algorithm differs from auto-sklearn
- How does AutoGluon beat 99% of human data science teams in Kaggle prediction competitions with three lines of code and no expert knowledge?
- AutoGluon-Tabular, an AutoGluon API, trains highly accurate machine learning models on unprocessed tabular datasets (such as CSV files) with only a few lines of Python.
- AutoGluon is faster, more robust, and more accurate than TPOT, H2O, AutoWEKA, auto-sklearn, and Google AutoML Tables.
- According to AutoGluon, hyperparameter tuning generally helps less in AutoML when ensembling is used.
- [AutoGluon | Beat 90% of models with three lines of code](https://cloud.tencent.com/developer/article/1827841)
- [AutoGluon: Deep Learning AutoML](https://towardsdatascience.com/autogluon-deep-learning-automl-5cdb4e2388ec#065f)
- Adult Income Classification
- [[Day 22] Machine learning model techniques: Stacking](https://ithelp.ithome.com.tw/articles/10250317)

<br>
<hr>
<br>
### iris test
- case 1
```python
predictor = TabularPredictor(label=label, path=save_path).fit(train_data)
```
```
...
Fitting model: LightGBMLarge ...
(stuck for over 1 hour)
```
- case 2: total
| Fitting model | Training runtime |
| --------------- | ---------------- |
| LightGBMLarge | 4385.04s (73m 05s) |
| XGBoost | 1287.35s (21m 27s) |
| LightGBM | 1176.56s (19m 36s) |
| LightGBMXT | 768.25s (12m 48s) |
| NeuralNetTorch | 327.37s (5m 27s)|
| NeuralNetFastAI | 211.36s (3m 31s) |
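The per-model runtimes in the table can be summed to get the overall training time. This is plain arithmetic on the values above; `format_duration` is a hypothetical helper.

```python
# Training runtimes in seconds, copied from the table above
runtimes = {
    'LightGBMLarge': 4385.04,
    'XGBoost': 1287.35,
    'LightGBM': 1176.56,
    'LightGBMXT': 768.25,
    'NeuralNetTorch': 327.37,
    'NeuralNetFastAI': 211.36,
}

def format_duration(seconds):
    """Format a duration in seconds as 'Hh MMm SSs' (fractions truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"

total = sum(runtimes.values())
slowest = max(runtimes, key=runtimes.get)
print(format_duration(total))  # 2h 15m 55s
print(slowest)                 # LightGBMLarge
```

LightGBMLarge alone accounts for more than half of the total, which matches the stall seen in case 1.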
<br>
<hr>
<br>
## :warning: Troubleshooting
- [[HackMD] NVIDIA / AutoGluon - Troubleshooting](https://hackmd.io/_d8x1bwnR1O9FDpDUtiq0g)
- [Installation failure messages](https://hackmd.io/_d8x1bwnR1O9FDpDUtiq0g#%E5%AE%89%E8%A3%9D%E5%A4%B1%E6%95%97%E8%A8%8A%E6%81%AF)
- [ml-sklearn:v1](https://hackmd.io/_d8x1bwnR1O9FDpDUtiq0g#ml-sklearnv1)
- [ubuntu20.04-python3.8:latest](https://hackmd.io/_d8x1bwnR1O9FDpDUtiq0g#ubuntu2004-python38latest)
- [notebook / RAPIDS-22.04](https://hackmd.io/_d8x1bwnR1O9FDpDUtiq0g#notebook--RAPIDS-2204)
- [autogluon-core dependencies](https://hackmd.io/_d8x1bwnR1O9FDpDUtiq0g#autogluon-core-%E7%9B%B8%E4%BE%9D%E6%80%A7)