## Is it possible to repair a failed drive?

Could the problem be the NVMe firmware, the motherboard/BIOS, or the M.2 slot?
We need NVMe (0) working again, with our data on it.

## Logs on drive health

> The failed drive shows:
> FW Rev ERRORMOD

```
root@rescue:~# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     S437NA0N515103       SAMSUNG MZQLB960HAJR-00007               1         1.07 GB / 1.07 GB          512 B + 0 B      ERRORMOD
/dev/nvme1n1     S437NA0N515105       SAMSUNG MZQLB960HAJR-00007               1         581.61 GB / 960.20 GB      512 B + 0 B      EDA5202Q
```

### Healthy NVMe (1)

> Available Spare: 100%

```
root@rescue:~# smartctl -a /dev/nvme1
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.10.18-mod-std] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SAMSUNG MZQLB960HAJR-00007
Serial Number:                      S437NA0N515105
Firmware Version:                   EDA5202Q
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 960,197,124,096 [960 GB]
Unallocated NVM Capacity:           0
Controller ID:                      4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          960,197,124,096 [960 GB]
Namespace 1 Utilization:            581,611,520,000 [581 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Sat Dec 18 12:16:19 2021 EST
Firmware Updates (0x17):            3 Slots, Slot 1 R/O, no Reset required
Optional Admin Commands (0x000f):   Security Format Frmw_DL NS_Mngmt
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     87 Celsius
Critical Comp. Temp. Threshold:     88 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +    10.60W       -        -    0  0  0  0        0       0

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        34 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    146,521,243 [75.0 TB]
Data Units Written:                 69,510,790 [35.5 TB]
Host Read Commands:                 2,923,338,410
Host Write Commands:                1,144,693,075
Controller Busy Time:               4,503
Power Cycles:                       37
Power On Hours:                     13,075
Unsafe Shutdowns:                   30
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               34 Celsius
Temperature Sensor 2:               38 Celsius
Temperature Sensor 3:               42 Celsius
```

### Failed NVMe (0)

> Available Spare: 0%, Percentage Used: 255%

```
root@rescue:~# smartctl -a /dev/nvme0
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.10.18-mod-std] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SAMSUNG MZQLB960HAJR-00007
Serial Number:                      S437NA0N515103
Firmware Version:                   ERRORMOD
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 960,197,124,096 [960 GB]
Unallocated NVM Capacity:           0
Controller ID:                      4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,073,741,824 [1.07 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Sat Dec 18 12:16:03 2021 EST
Firmware Updates (0x17):            3 Slots, Slot 1 R/O, no Reset required
Optional Admin Commands (0x000f):   Security Format Frmw_DL NS_Mngmt
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     87 Celsius
Critical Comp. Temp. Threshold:     88 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +    10.60W       -        -    0  0  0  0        0       0

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        -
Available Spare:                    0%
Available Spare Threshold:          10%
Percentage Used:                    255%
Data Units Read:                    0
Data Units Written:                 0
Host Read Commands:                 0
Host Write Commands:                0
Controller Busy Time:               0
Power Cycles:                       0
Power On Hours:                     0
Unsafe Shutdowns:                   0
Media and Data Integrity Errors:    0
Error Information Log Entries:      4
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
```

```
root@rescue:~# nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning                    : 0
temperature                         : 4294967023 C
available_spare                     : 0%
available_spare_threshold           : 10%
percentage_used                     : 255%
data_units_read                     : 0
data_units_written                  : 0
host_read_commands                  : 0
host_write_commands                 : 0
controller_busy_time                : 0
power_cycles                        : 0
power_on_hours                      : 0
unsafe_shutdowns                    : 0
media_errors                        : 0
num_err_log_entries                 : 4
Warning Temperature Time            : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1                : 0 C
Temperature Sensor 2                : 0 C
Temperature Sensor 3                : 0 C
Temperature Sensor 4                : 0 C
Temperature Sensor 5                : 0 C
Temperature Sensor 6                : 0 C
Temperature Sensor 7                : 0 C
Temperature Sensor 8                : 0 C
```

### LVM errors:

```
root@rescue:~# pvs
  Couldn't find device with uuid qUelxM-FXi6-HduD-m6A4-VVfa-61eF-tJ7h8A.
  PV             VG      Fmt  Attr PSize   PFree
  /dev/nvme1n1p1 Storage lvm2 a--  894.25g 294.25g
  unknown device Storage lvm2 a-m  592.78g 147.78g
```

```
root@rescue:~# vgchange -ay Storage
  Couldn't find device with uuid qUelxM-FXi6-HduD-m6A4-VVfa-61eF-tJ7h8A.
  Refusing activation of partial LV Storage/vsv4765-d0xv6p2u8yc28vqk-gp7socpfcgu1d8rd. Use '--activationmode partial' to override.
  Refusing activation of partial LV Storage/vsv4770-d1uhqxmvmj6npybe-f6ootr0j5lxaw0yb. Use '--activationmode partial' to override.
  Refusing activation of partial LV Storage/vsv5015-dmxm0y1ckwmdfsxd-cra9zx8ugpzm0cne. Use '--activationmode partial' to override.
  Refusing activation of partial LV Storage/vsv5076-djuhho90kmc97glh-k6cjlb53zt30gnze. Use '--activationmode partial' to override.
  1 logical volume(s) in volume group "Storage" now active
```

```
root@rescue:~# lvdisplay
  /dev/Storage/vsv4765-d0xv6p2u8yc28vqk-gp7socpfcgu1d8rd: read failed after 0 of 4096 at 214748299264: Input/output error
  /dev/Storage/vsv4765-d0xv6p2u8yc28vqk-gp7socpfcgu1d8rd: read failed after 0 of 4096 at 214748356608: Input/output error
  /dev/Storage/vsv4765-d0xv6p2u8yc28vqk-gp7socpfcgu1d8rd: read failed after 0 of 4096 at 0: Input/output error
  /dev/Storage/vsv4765-d0xv6p2u8yc28vqk-gp7socpfcgu1d8rd: read failed after 0 of 4096 at 4096: Input/output error
  /dev/Storage/vsv5015-dmxm0y1ckwmdfsxd-cra9zx8ugpzm0cne: read failed after 0 of 4096 at 21474770944: Input/output error
  /dev/Storage/vsv5015-dmxm0y1ckwmdfsxd-cra9zx8ugpzm0cne: read failed after 0 of 4096 at 21474828288: Input/output error
  /dev/Storage/vsv5015-dmxm0y1ckwmdfsxd-cra9zx8ugpzm0cne: read failed after 0 of 4096 at 0: Input/output error
  /dev/Storage/vsv5015-dmxm0y1ckwmdfsxd-cra9zx8ugpzm0cne: read failed after 0 of 4096 at 4096: Input/output error
  /dev/Storage/vsv5076-djuhho90kmc97glh-k6cjlb53zt30gnze: read failed after 0 of 4096 at 214748299264: Input/output error
  /dev/Storage/vsv5076-djuhho90kmc97glh-k6cjlb53zt30gnze: read failed after 0 of 4096 at 214748356608: Input/output error
  /dev/Storage/vsv5076-djuhho90kmc97glh-k6cjlb53zt30gnze: read failed after 0 of 4096 at 0: Input/output error
  /dev/Storage/vsv5076-djuhho90kmc97glh-k6cjlb53zt30gnze: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid qUelxM-FXi6-HduD-m6A4-VVfa-61eF-tJ7h8A.
```

### fdisk:

```
root@rescue:~# fdisk -l /dev/nvme0n1
Disk /dev/nvme0n1: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
```

```
root@rescue:~# fdisk -l /dev/nvme1n1
Disk /dev/nvme1n1: 894.3 GiB, 960197124096 bytes, 1875385008 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x08755452

Device         Boot Start        End    Sectors   Size Id Type
/dev/nvme1n1p1       2048 1875385007 1875382960 894.3G 8e Linux LVM
```
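
A few of the numbers above can be sanity-checked with plain shell arithmetic. This is only a sketch: `to_signed32` and `size_bytes` are hypothetical helper names, not part of nvme-cli or smartmontools. The NVMe SMART log stores the composite temperature in Kelvin, so the `4294967023 C` from `nvme smart-log` looks like a raw reading of 0 K minus 273, printed as an unsigned 32-bit number. That reading is an assumption about how the tool formats the value; it has not been confirmed against this drive.

```shell
#!/bin/sh
# Hypothetical helpers for checking the values in the logs above.

# Reinterpret an unsigned 32-bit number as a signed one.
to_signed32() {
  if [ "$1" -ge 2147483648 ]; then
    echo $(( $1 - 4294967296 ))
  else
    echo "$1"
  fi
}

# Disk size in bytes from a sector count and a sector size.
size_bytes() {
  echo $(( $1 * $2 ))
}

to_signed32 4294967023      # temperature reported for nvme0n1 -> -273
size_bytes 2097152 512      # nvme0n1 per fdisk -> 1073741824 (1 GiB)
size_bytes 1875385008 512   # nvme1n1 per fdisk -> 960197124096
```

Under that assumption, the temperature, the all-zero counters, and the 1 GiB namespace on a drive whose Total NVM Capacity is still listed as 960 GB would all point to the controller returning placeholder values rather than real measurements.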