c4lab
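Our servers' storage is built on MegaRAID,
so there are two ways to inspect the disks.
Concept Overview: the related terms are all explained here.
In short, you need to download MegaCli yourself; it is not available through yum or apt.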
MegaCli download Site:
https://www.broadcom.com/support/download-search?pg=&pf=&pn=&pa=&po=&dk=megacli&pl=
# download
wget https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/8-07-14_MegaCLI.zip
# Unzip
unzip 8-07-14_MegaCLI.zip
cd Linux
# install
sudo yum localinstall MegaCli-8.07.14-1.noarch.rpm
Manual of MegaCli command line
https://www.alteeve.com/w/MegaCli64_Cheat_Sheet
List all the HDDs
[linnil1@lncrna MegaCli]$ sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll
Enclosure Device ID: 8
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 0
WWN: 50014xxxxxxxxxx
Sequence Number: 2
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 5.458 TB [0x2baa0f4b0 Sectors]
Non Coerced Size: 5.457 TB [0x2ba90f4b0 Sectors]
Coerced Size: 5.457 TB [0x2ba900000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 0A82
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500304xxxxxxxxxx
Inquiry Data: WD-WX31D95Hxxxxxxx WD60EFRX-xxxxxxx 82.00A82
Device Speed: 6.0Gb/s
List all virtual drives
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -ldinfo -lALL -aALL
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :raid6vd01
RAID Level : Primary-6, Secondary-0, RAID Level Qualifier-3
Size : 21.829 TB
Sector Size : 512
Parity Size : 7.276 TB
State : Optimal
Strip Size : 128 KB
Number Of Drives : 8
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
To read the serial number of the disk you want to replace, run the following,
replacing 15 with that disk's Device Id.
[linnil1@lncrna MegaCli]$ sudo smartctl -d megaraid,15 -a /dev/sda
=== START OF INFORMATION SECTION ===
Model Family: Seagate NAS HDD
Device Model: ST3000VN000-xxxxxx
Serial Number: Zxxxxx
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Reference
A failed disk puts a RAID 1/RAID 5/RAID 6 virtual drive into the degraded state
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -AdpAllInfo -aALl
Device Present
================
Virtual Drives : 3
Degraded : 1
Offline : 0
Physical Devices : 24
Disks : 24
Critical Disks : 0
Failed Disks : 1
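The counters in the Device Present section lend themselves to an unattended check. The following is only a sketch: raid_unhealthy_counts is a hypothetical helper name, not part of MegaCli, and it assumes the output keeps the "label : number" layout shown above.

```shell
# Hypothetical helper (not part of MegaCli): reads the output of
# `MegaCli64 -AdpAllInfo -aALL` on stdin, prints each non-zero problem
# counter as name=value and returns 1 if any was found.
raid_unhealthy_counts() {
  awk -F':' '
    BEGIN { bad = 0 }
    /^[ \t]*(Degraded|Offline|Critical Disks|Failed Disks)[ \t]*:/ {
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the label
      count = $2 + 0                      # numeric value after the colon
      if (count > 0) { print name "=" count; bad = 1 }
    }
    END { exit bad }
  '
}

# Usage (assumes the install path used in this note):
#   sudo /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL | raid_unhealthy_counts \
#     || echo "RAID needs attention"
```

Returning a non-zero status makes the helper easy to call from cron or a monitoring agent.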
Find out which disk has failed
(base) [linnil1@exon MegaCli]$ sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll | egrep "(Arm)|(Device Id)|(Error)|(state)"
Device Id: 13
Media Error Count: 0
Other Error Count: 0
Firmware state: Online, Spun Up
Drive's position: DiskGroup: 1, Span: 0, Arm: 6
Device Id: 15
Media Error Count: 495
Other Error Count: 3
Firmware state: Failed
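The egrep output above still has to be read by eye. As a rough alternative, the sketch below prints only the drives whose firmware state contains "Failed", together with the enclosure:slot pair that the -physdrv parameter needs. failed_drives is a hypothetical helper name, and the sketch assumes each drive record in the -PDList output lists Enclosure Device ID, Slot Number and Device Id before Firmware state, as in the listings in this note.

```shell
# Hypothetical helper: reads `MegaCli64 -PDList -aAll` output on stdin and
# prints "<enclosure>:<slot> <device id>" for every drive in a Failed state.
failed_drives() {
  awk -F': *' '
    /^Enclosure Device ID/ { enc  = $2 }   # remembered until the next drive record
    /^Slot Number/         { slot = $2 }
    /^Device Id/           { dev  = $2 }
    /^Firmware state/ && $2 ~ /Failed/ { print enc ":" slot " " dev }
  '
}

# Usage (assumes the install path used in this note):
#   sudo /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aAll | failed_drives
```

The first column can be pasted into -physdrv[...], and the second is the number to use with smartctl -d megaraid,N.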
Of course, if the disk is truly dead, you may not be able to talk to it at all
(base) [linnil1@exon ~]$ sudo smartctl -d sat+megaraid,15 -a /dev/sda
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-754.31.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
Smartctl: Device Read Identity Failed: megasas_cmd result: 0.15 = 0/46
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
Check your motherboard documentation, or the model of the machine as sold,
and confirm that HDDs can be hot-swapped
Mother Board PDF: https://www.supermicro.com/manuals/motherboard/C606_602/MNL-1258.pdf
Use MegaCli to mark the failed disk as removable
See https://www.advancedclustering.com/act_kb/replacing-a-disk-with-megacli/
(To be completed)
The parameter is -physdrv[<enclosure_ID>:<slot_id>], e.g. -physdrv[8:14]
Always verify the drive before removing it
sudo ./MegaCli64 -pdInfo -PhysDrv[8:14] -a0
Then remove it
sudo ./MegaCli64 -pdoffline -physdrv[8:14] -a0
sudo ./MegaCli64 -pdmarkmissing -physdrv[8:14] -a0
sudo ./MegaCli64 -pdprprmv -physdrv[8:14] -a0
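Because these three commands take a drive out of the array, it can help to generate them from a single enclosure:slot argument and review them before running anything. The sketch below only prints the commands; removal_commands and the MEGACLI environment variable are hypothetical names introduced for illustration, and the default binary path is the one used elsewhere in this note.

```shell
# Hypothetical helper: prints (does not run) the offline / mark-missing /
# prepare-for-removal sequence for one drive.
# Arguments: <enclosure>:<slot> [adapter, default 0]
removal_commands() (
  drive="$1"
  adapter="${2:-0}"
  cli="${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}"   # MEGACLI is an assumed override
  case "$drive" in
    *[!0-9:]* | *:*:* | :* | *: | "")
      echo "usage: removal_commands <enclosure>:<slot> [adapter]" >&2; exit 2 ;;
    *:*) ;;                                            # looks like enclosure:slot
    *)
      echo "usage: removal_commands <enclosure>:<slot> [adapter]" >&2; exit 2 ;;
  esac
  for action in -pdoffline -pdmarkmissing -pdprprmv; do
    printf 'sudo %s %s -physdrv[%s] -a%s\n' "$cli" "$action" "$drive" "$adapter"
  done
)

# Usage: review the output, then paste it into the shell.
#   removal_commands 8:14
```

Printing instead of executing keeps a typo in the slot number from taking a healthy drive offline.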
Make its LED blink red
sudo ./MegaCli64 -pdlocate -start -physdrv[8:14] -a0
The LED on the outside of that drive bay then turns red
(this should be fine)
One disk fewer
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -AdpAllInfo -aALl
Device Present
================
Virtual Drives : 3
Degraded : 1
Offline : 0
Physical Devices : 24
Disks : 23
Critical Disks : 0
Failed Disks : 0
VD1 shows Partially Degraded
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -ldinfo -lALL -aALL
Virtual Drive: 1 (Target Id: 1)
Name :
RAID Level : Primary-6, Secondary-0, RAID Level Qualifier-3
Size : 43.660 TB
Sector Size : 512
Parity Size : 10.915 TB
State : Partially Degraded
Strip Size : 64 KB
Number Of Drives : 10
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Check the same slot
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdInfo -PhysDrv[8:14] -aALL
Adapter 0: Device at Enclosure - 8, Slot - 14 is not found.
Exit Code: 0x00
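Confirm the specifications (capacity, read/write speed, serial number, model number)
Remember to ask for an invoice that carries the organization's tax ID number (統編)
Take photos
(The new disk is a WD60EFZX)
Find the newly inserted disk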
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdInfo -PhysDrv[8:14] -a0
Enclosure Device ID: 8
Slot Number: 14
Enclosure position: N/A
Device Id: 15
WWN: 50014xxxxxxxxxxx
Sequence Number: 1
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 5.458 TB [0x2baa0f4b0 Sectors]
Non Coerced Size: 5.457 TB [0x2ba90f4b0 Sectors]
Coerced Size: 5.457 TB [0x2ba900000 Sectors]
Sector Size: 0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: 0A81
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x50030480xxxxxxx
Connected Port Number: 0(path0)
Inquiry Data: WD-C81KHxxxxxx WD60EFZX-xxxxxxx 81.00A81
Counts after inserting the disk
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -AdpAllInfo -aALl
Device Present
================
Virtual Drives : 3
Degraded : 1
Offline : 0
Physical Devices : 25 (includes the missing and the unconfigured one)
Disks : 24
Critical Disks : 0
Failed Disks : 0
Find the position it belongs to
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -PdgetMissing -a0
Adapter 0 - Missing Physical drives
No. Array Row Size Expected
0 1 6 5722624 MB
Exit Code: 0x00
Fill in its position (array and row)
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -PdReplaceMissing -PhysDrv[8:14] -array1 -row6 -a0
Adapter: 0: Missing PD at Array 1, Row 6 is replaced.
Exit Code: 0x00
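The -array and -row values come straight from the table printed by -PdgetMissing. A minimal sketch of extracting them, assuming the table keeps the "No. Array Row Size" column order shown above (missing_slots is a hypothetical helper name):

```shell
# Hypothetical helper: reads `MegaCli64 -PdgetMissing -a0` output on stdin
# and prints the "-arrayN -rowM" arguments for each missing drive.
missing_slots() {
  awk '
    # Data rows look like: "0 1 6 5722624 MB" (No., Array, Row, Size)
    /^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+MB/ {
      print "-array" $2, "-row" $3
    }
  '
}

# Usage:
#   sudo ./MegaCli64 -PdgetMissing -a0 | missing_slots
```

When no drive is missing the helper prints nothing, which matches the "No Missing Drive is Found" case.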
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdrbld -start -PhysDrv[8:14] -a0
Started rebuild progress on device(Encl-8 Slot-14)
Exit Code: 0x00
Meanwhile, you should see the red LED blinking on the disk that is rebuilding
The following commands only check the status
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -PdgetMissing -a0
Adapter 0 - No Missing Drive is Found.
Exit Code: 0x00
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdInfo -PhysDrv[8:14] -a0
Firmware state: Rebuild
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdrbld -ShowProg -PhysDrv[8:14] -a0
Rebuild Progress on Device at Enclosure 8, Slot 14 Completed 0% in 3 Minutes.
Exit Code: 0x00
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdrbld -ShowProg -PhysDrv[8:14] -a0
[sudo] password for linnil1:
Rebuild Progress on Device at Enclosure 8, Slot 14 Completed 17% in 135 Minutes.
Exit Code: 0x00
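From a progress line like the one above, a rough remaining time can be extrapolated. The sketch below assumes the rebuild speed stays constant, which is only an approximation, and rebuild_eta_minutes is a hypothetical helper name.

```shell
# Hypothetical helper: reads `MegaCli64 -pdrbld -ShowProg ...` output on
# stdin and prints the estimated remaining minutes, assuming linear progress.
rebuild_eta_minutes() {
  awk '
    /Completed [0-9]+% in [0-9]+ Minutes/ {
      for (i = 1; i <= NF; i++)
        if ($i == "Completed") { pct = $(i + 1) + 0; mins = $(i + 3) + 0 }
      if (pct > 0) printf "%d\n", mins * (100 - pct) / pct   # truncated to whole minutes
      else print "unknown"                                   # 0% gives no rate yet
    }
  '
}

# Usage:
#   sudo ./MegaCli64 -pdrbld -ShowProg -PhysDrv[8:14] -a0 | rebuild_eta_minutes
```

For the 17% in 135 minutes reading, this gives 135 × 83 / 17, about 659 minutes, or roughly 11 more hours.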
See https://www.advancedclustering.com/act_kb/replacing-a-disk-with-megacli/
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -pdrbld -ShowProg -PhysDrv[8:14] -a0
Device(Encl-8 Slot-14) is not in rebuild process
Exit Code: 0x00
Everything is back to normal
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -AdpAllInfo -aALl
Device Present
================
Virtual Drives : 3
Degraded : 0
Offline : 0
Physical Devices : 25
Disks : 24
Critical Disks : 0
Failed Disks : 0
The state changes from degraded to optimal
(base) [linnil1@exon MegaCli]$ sudo ./MegaCli64 -ldinfo -lALL -aALL
Virtual Drive: 1 (Target Id: 1)
State : Optimal
https://www.broadcom.com/support/knowledgebase/1211161500661/installing-megacli-in-debian-or-ubuntu
(env) [linnil1@rna server]$ sudo /opt/MegaRAID/MegaCli/MegaCli64
/opt/MegaRAID/MegaCli/MegaCli64: error while loading shared libraries: libncurses.so.5: cannot open shared object file: No such file or directory
Install the compatibility package (CentOS 8)
sudo yum install ncurses-compat-libs
Install the old ncurses library from apt (Ubuntu 20.04)
https://askubuntu.com/questions/1252062/how-to-install-libncurses-so-5-in-ubuntu-20-04
sudo add-apt-repository universe
sudo apt install libncurses5
After inserting the disk
Enclosure Device ID: 8
Slot Number: 17
Enclosure position: N/A
Device Id: 20
WWN: 5000cca2c1d1020f
Sequence Number: 7
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 14.552 TB [0x746c00000 Sectors]
Non Coerced Size: 14.551 TB [0x746b00000 Sectors]
Coerced Size: 14.551 TB [0x746b00000 Sectors]
Sector Size: 0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: W232
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500304801780011d
Connected Port Number: 0(path0)
Inquiry Data: 2PH6DW3J WDC WUH721816ALE6L4 PCGNW232
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
(base) linnil1@exon:~$ sudo ./MegaCli64 -cfgldadd -r6 [8:8,8:9,8:10,8:11,8:14,8:15,8:16,8:17] -a0
Adapter 0: Created VD 1
Adapter 0: Configured the Adapter!!
Exit Code: 0x00
(base) linnil1@exon:~$ sudo ./MegaCli64 -h
(base) linnil1@exon:~$ sudo ./MegaCli64 -LDinfo -L1 -aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 1 (Target Id: 1)
Name :
RAID Level : Primary-6, Secondary-0, RAID Level Qualifier-3
Size : 87.313 TB
Sector Size : 512
Parity Size : 29.104 TB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 8
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Ongoing Progresses:
Background Initialization: Completed 0%, Taken 0 min.
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
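The Size and Parity Size values above follow the usual RAID 6 arithmetic: two drives' worth of space goes to parity and the rest is usable. A small sketch of that calculation (raid6_usable_tb is a hypothetical helper name; the per-drive size of 14.552 TB is inferred here as Parity Size / 2):

```shell
# Hypothetical helper: usable capacity of a single-span RAID 6 set.
# Arguments: <number of drives> <size of one drive in TB>
raid6_usable_tb() {
  awk -v n="$1" -v size="$2" 'BEGIN {
    if (n <= 2) { print "need more than 2 drives for usable capacity" > "/dev/stderr"; exit 2 }
    printf "%.3f\n", (n - 2) * size    # two drives are consumed by parity
  }'
}

# Usage:
#   raid6_usable_tb 8 14.552
```

For 8 drives of 14.552 TB this prints 87.312, which agrees with the reported 87.313 TB to within rounding.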