## Introduction
Common techniques for troubleshooting in a Linux environment.
## Categories
### Disk Space Usage
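List the disk usage of every directory from the root down, sorted by size. Use `sort -h` rather than `sort -n`, because `du -h` prints human-readable sizes such as `4.0K` and `2.1G` that a plain numeric sort orders incorrectly:
```
sudo du -h / | sort -h
```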
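If logs or journal files account for most of the usage, they can usually be deleted once you have confirmed they are no longer needed.
If the containerd or Docker directories are the large ones, the likely cause is too many accumulated images.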
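To narrow down where the space goes, it can help to inspect one directory level at a time instead of the whole tree. The following is a minimal sketch, assuming GNU `du` and `sort`; `largest_entries` is a hypothetical helper name, not a standard tool.

```bash
# largest_entries DIR [COUNT]
# Print the COUNT largest lines of a one-level du listing of DIR.
# The listing includes the total for DIR itself.
# Hypothetical helper; assumes GNU du and GNU sort.
largest_entries() {
    local dir="${1:-/}"
    local count="${2:-10}"
    # -x: stay on one filesystem, -a: include files as well as directories,
    # --max-depth=1: report only DIR and its direct children.
    # Permission errors are discarded so the listing stays readable.
    du -xah --max-depth=1 -- "$dir" 2>/dev/null | sort -rh | head -n "$count"
}
```

Run it as root on a suspect directory, for example `largest_entries /var 10`, then repeat on the largest child until the culprit is found.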
### CPU and Memory Usage
Use `htop` to find the process that is behaving abnormally:
```
htop
```
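When `htop` is not installed, or the result needs to be saved to a file, `ps` can produce a comparable one-off snapshot. This is a sketch assuming the procps version of `ps`; `top_procs` is a hypothetical helper name.

```bash
# top_procs [COUNT] [KEY]
# Print the COUNT processes with the highest KEY (%cpu by default, or %mem).
# Hypothetical helper; assumes procps ps, which supports --sort.
top_procs() {
    local count="${1:-10}"
    local key="${2:-%cpu}"
    # The leading "-" sorts in descending order; one extra line keeps the header.
    ps -eo pid,ppid,%cpu,%mem,comm --sort="-${key}" | head -n "$((count + 1))"
}
```

For example, `top_procs 5 %mem` lists the five processes using the most memory, with the PID in the first column.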
Then use `pstree` to trace its parent and child processes:
```
pstree -ps <pid>
```
`lsof` lists the files the process currently has open:
```
lsof -p <pid>
```
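If the concern is a file descriptor leak rather than a specific file, counting the descriptors over time can be quicker than reading the full `lsof` output. A minimal sketch, assuming a Linux `/proc` filesystem; `count_open_fds` is a hypothetical helper name.

```bash
# count_open_fds PID
# Print how many file descriptors PID currently has open.
# Prints 0 if the process does not exist or its fd directory is unreadable.
# Hypothetical helper; relies on the Linux /proc filesystem.
count_open_fds() {
    local pid="$1"
    ls "/proc/${pid}/fd" 2>/dev/null | wc -l
}
```

Calling it repeatedly, for example `watch -n 5 "ls /proc/<pid>/fd | wc -l"`, shows whether the count keeps growing.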
`strace` prints the system calls the process is making:
```
strace -p <pid>
```
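If you see an abnormal signal such as `SIGILL`, attach `gdb` to the process, let it run until the signal arrives, and then print the backtrace at that point. `handle <signal> stop` is already the default for signals such as `SIGILL`, so that step is optional:
```
gdb -p <pid>
# The commands below are entered at the (gdb) prompt.
handle <signal> stop
continue
bt
```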
```
Thread 1 "node /code/dist" received signal SIGILL, Illegal instruction.
0x00005653314a9923 in v8::base::OS::Abort() ()
(gdb) handle SIGILL stop
Signal        Stop      Print   Pass to program Description
SIGILL        Yes       Yes     Yes             Illegal instruction
(gdb) bt
#0 0x00005653314a9923 in v8::base::OS::Abort() ()
#1 0x00005653326bff74 in V8_Fatal(char const*, ...) ()
#2 0x000056533199ccbe in v8::internal::Scavenger::Process(v8::internal::OneshotBarrier*) ()
#3 0x00005653319a466a in v8::internal::ScavengingTask::RunInParallel(v8::internal::ItemParallelJob::Task::Runner) ()
#4 0x00005653319292bc in v8::internal::ItemParallelJob::Run() ()
#5 0x00005653319a2100 in v8::internal::ScavengerCollector::CollectGarbage() ()
#6 0x000056533190c9fd in v8::internal::Heap::Scavenge() ()
#7 0x000056533191bb20 in v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) ()
#8 0x000056533191c1ce in v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) ()
#9 0x0000565331920428 in v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) ()
#10 0x00005653318dda07 in v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) ()
#11 0x0000565331ca779f in v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) ()
#12 0x000056533209c759 in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit ()
#13 0x0000565332104d80 in Builtins_Load_FastDoubleElements_0 ()
#14 0x0000000000000000 in ?? ()
(gdb)
```
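The same session can be run unattended, which is useful when the signal only shows up after a long wait. The sketch below assumes `gdb` is installed and that the caller may attach to the process; `signal_backtrace` is a hypothetical helper name, and the exact output depends on the gdb version and available symbols.

```bash
# signal_backtrace PID [SIGNAL]
# Attach gdb to PID in batch mode, wait for SIGNAL (default SIGILL),
# then print the backtrace of the thread that received it.
# Hypothetical helper; returns 127 if gdb is not installed.
signal_backtrace() {
    local pid="$1"
    local sig="${2:-SIGILL}"
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found" >&2
        return 127
    fi
    # -batch runs the -ex commands in order and exits afterwards.
    gdb -p "$pid" -batch \
        -ex "handle ${sig} stop print" \
        -ex "continue" \
        -ex "bt"
}
```

For example, `sudo bash -c "$(declare -f signal_backtrace); signal_backtrace <pid> SIGILL" > bt.log` keeps the backtrace in a file for later analysis.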
### Disk I/O
### Network I/O
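## Case Studies
### High disk usage caused by containerd or Docker images
Image cleanup (rotation) is usually configured. If high disk usage turns out to be caused by images, the rotation policy may need to be adjusted.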
- containerd
Remove unused images:
```bash
crictl rmi --prune
```
- Docker
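Remove unused images. Without `-a`, `docker image prune` removes only dangling (untagged) images; with `-a` it removes every image not used by a container:
```bash
docker image prune -a
```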
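Until the rotation policy is tuned, the cleanup commands above can be gated on a disk usage threshold, for instance from a cron job. This is a minimal sketch and not how containerd or Docker rotate images themselves; `disk_usage_pct` and `prune_if_above` are hypothetical helper names, and the path and threshold in the usage note are assumptions.

```bash
# disk_usage_pct PATH
# Print the used percentage (without the % sign) of the filesystem holding PATH.
disk_usage_pct() {
    # -P forces the POSIX single-line format; column 5 is the capacity.
    df -P -- "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# prune_if_above PATH THRESHOLD COMMAND [ARGS...]
# Run COMMAND only when the filesystem holding PATH is at least THRESHOLD percent full.
prune_if_above() {
    local path="$1" threshold="$2"
    shift 2
    local used
    used="$(disk_usage_pct "$path")" || return 1
    if [ "$used" -ge "$threshold" ]; then
        "$@"
    fi
}
```

For example, `prune_if_above /var/lib/containerd 80 sudo crictl rmi --prune` prunes only when that filesystem is at least 80% full, assuming containerd stores its data under `/var/lib/containerd`.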