Solved

CVM fails to boot due to disk errors

1 year ago
15 October 2020
3 replies
1658 views

waddles
40 replies

I have a problem with a CVM that won't boot. This is on a semi-retired production cluster (not CE) that has no workloads running on it.

I found the console output in /tmp/NTNX.serial.out.0 and I can see it trying to enable the RAID devices, scanning for a uuid marker and finding 2 of them, then aborting and unloading the mpt3sas kernel module, then trying again 5 seconds later. This repeats a few times, then the hypervisor resets it and it starts booting all over again.

The most relevant part of the log (with a lot of kernel taint messages removed) is

[    9.543553] sd 2:0:3:0: [sdd] Attached SCSI disk
svmboot: === SVMBOOT
mdadm main: failed to get exclusive lock on mapfile
[    9.790075] md: md127 stopped.
mdadm: ignoring /dev/sdb3 as it reports /dev/sda3 as failed
[    9.794087] md/raid1:md127: active with 1 out of 2 mirrors
[    9.796034] md127: detected capacity change from 0 to 42915069952
mdadm: /dev/md/phoenix:2 has been started with 1 drive (out of 2).
[    9.808602] md: md126 stopped.
[    9.813330] md/raid1:md126: active with 2 out of 2 mirrors
[    9.815279] md126: detected capacity change from 0 to 10727981056
mdadm: /dev/md/phoenix:1 has been started with 2 drives.
[    9.832111] md: md125 stopped.
mdadm: ignoring /dev/sdb1 as it reports /dev/sda1 as failed
[    9.840436] md/raid1:md125: active with 1 out of 2 mirrors
[    9.842341] md125: detected capacity change from 0 to 10727981056
mdadm: /dev/md/phoenix:0 has been started with 1 drive (out of 2).
mdadm: /dev/md/phoenix:2 exists - ignoring
[    9.887613] md: md124 stopped.
[    9.896418] md/raid1:md124: active with 1 out of 2 mirrors
[    9.898373] md124: detected capacity change from 0 to 42915069952
mdadm: /dev/md124 has been started with 1 drive (out of 2).
mdadm: /dev/md/phoenix:0 exists - ignoring
[    9.926863] md: md123 stopped.
[    9.937962] md/raid1:md123: active with 1 out of 2 mirrors
[    9.939950] md123: detected capacity change from 0 to 10727981056
mdadm: /dev/md123 has been started with 1 drive (out of 2).
svmboot: Checking /dev/md for /.nutanix_active_svm_partition
svmboot: Checking /dev/md123 for /.nutanix_active_svm_partition

[    9.994541] EXT4-fs (md123): mounted filesystem with ordered data mode. Opts: (null)
svmboot: Appropriate boot partition with /.cvm_uuid at /dev/md123

[   10.009251] EXT4-fs (md125): mounted filesystem with ordered data mode. Opts: (null)
svmboot: Appropriate boot partition with /.cvm_uuid at /dev/md125

svmboot: Checking /dev/nvme?*p?* for /.nutanix_active_svm_partition
svmboot: error: too many partitions with valid cvm_uuid:  /dev/md123 /dev/md125
sh: missing ]
svmboot: Trying again in 5 seconds.

[   10.430316] md123: detected capacity change from 10727981056 to 0
[   10.432058] md: md123 stopped.
mdadm: stopped /dev/md123
[   10.467498] md124: detected capacity change from 42915069952 to 0
[   10.469245] md: md124 stopped.
mdadm: stopped /dev/md124
[   10.507492] md125: detected capacity change from 10727981056 to 0
[   10.509276] md: md125 stopped.
mdadm: stopped /dev/md125
[   10.547497] md126: detected capacity change from 10727981056 to 0
[   10.549243] md: md126 stopped.
mdadm: stopped /dev/md126
[   10.577498] md127: detected capacity change from 42915069952 to 0
[   10.579245] md: md127 stopped.
mdadm: stopped /dev/md127
[   10.586750] ata2.00: disabled
modprobe: remove 'virtio_pci': No such file or directory
[   10.673882] mpt3sas version 14.101.00.00 unloading

As it occurs before the networking has started and gets reset by the hypervisor, I do not have any way of interacting with the VM.

How can this be resolved?

Best answer by waddles, 16 October 2020, 05:03

After much mucking around, I was finally able to boot a System Rescue CD which had access to the RAID disks so I could fix it.

FYI - the hypervisor boots from the SATADOM but it does not have a device driver for the SAS HBA device so it cannot normally see the storage disks. The hypervisor boots the CVM which has a SAS device driver (mpt3sas), therefore all disk access is done through the CVM. The CVM boots off software RAID devices using the first 3 partitions of the SSDs.

In my case, 2 of the software RAID devices had lost sync.

[root@sysresccd ~]# lsscsi
[0:0:0:0]    disk    ATA      INTEL SSDSC2BX80 0140  /dev/sdb
[0:0:1:0]    disk    ATA      ST2000NX0253     SN05  /dev/sda
[0:0:2:0]    disk    ATA      ST2000NX0253     SN05  /dev/sdc
[0:0:3:0]    disk    ATA      ST2000NX0253     SN05  /dev/sde
[0:0:4:0]    disk    ATA      ST2000NX0253     SN05  /dev/sdd
[0:0:5:0]    disk    ATA      INTEL SSDSC2BX80 0140  /dev/sdg
[4:0:0:0]    disk    ATA      SATADOM-SL 3ME   119   /dev/sdf
[11:0:0:0]   cd/dvd  ATEN     Virtual CDROM    YS0J  /dev/sr0
[root@sysresccd ~]# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0       7:0    0 632.2M  1 loop  /run/archiso/sfs/airootfs
sda         8:0    0   1.8T  0 disk
└─sda1      8:1    0   1.8T  0 part
sdb         8:16   0 745.2G  0 disk
├─sdb1      8:17   0    10G  0 part
│ └─md127   9:127  0    10G  0 raid1
├─sdb2      8:18   0    10G  0 part
│ └─md125   9:125  0    10G  0 raid1
├─sdb3      8:19   0    40G  0 part
│ └─md126   9:126  0    40G  0 raid1
└─sdb4      8:20   0 610.6G  0 part
sdc         8:32   0   1.8T  0 disk
└─sdc1      8:33   0   1.8T  0 part
sdd         8:48   0   1.8T  0 disk
└─sdd1      8:49   0   1.8T  0 part
sde         8:64   0   1.8T  0 disk
└─sde1      8:65   0   1.8T  0 part
sdf         8:80   0  59.6G  0 disk
└─sdf1      8:81   0  59.6G  0 part
sdg         8:96   0 745.2G  0 disk
├─sdg1      8:97   0    10G  0 part
├─sdg2      8:98   0    10G  0 part
│ └─md125   9:125  0    10G  0 raid1
├─sdg3      8:99   0    40G  0 part
└─sdg4      8:100  0 610.6G  0 part
sr0        11:0    1   693M  0 rom   /run/archiso/bootmnt
[root@sysresccd ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active (auto-read-only) raid1 sdg2[1] sdb2[2]
      10476544 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active (auto-read-only) raid1 sdb3[2]
      41909248 blocks super 1.1 [2/1] [U_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md127 : active (auto-read-only) raid1 sdb1[2]
      10476544 blocks super 1.1 [2/1] [U_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

I could see the RAID devices probed as sdb and sdg, with partitions 1, 2, 3 configured but only partition 2 correctly in sync. The 4th partition is used for NFS in the CVM (ie. fast storage for the cluster).
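For anyone checking for the same condition from a rescue shell, one rough way is to scan /proc/mdstat for arrays whose member map contains an underscore. This is only a minimal sketch and not something from the original post: the function name degraded_arrays is made up for illustration, and it assumes the mdstat layout shown above (an "mdNNN :" header line followed by a "blocks" line that ends in a map such as [U_]).

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the name of every
# md array whose member map, e.g. [U_], shows a missing mirror.
# Reads mdstat-style text from the file in $1, defaulting to /proc/mdstat.
degraded_arrays() {
    awk '
        # Header line such as "md126 : active ...": remember the array name.
        /^md[0-9]+ :/ { name = $1 }
        # Status line such as "... blocks super 1.1 [2/1] [U_]": find the map.
        /blocks/ && match($0, /\[[U_]+\]/) {
            # An underscore in the map marks a missing member.
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Fed the first /proc/mdstat listing above, this should print md126 and md127 and skip md125, which matches what was seen by eye. It only reads the file and does not touch the arrays.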
So my solution was

Set the devices I needed to modify back to writable mode

[root@sysresccd ~]# mdadm --readwrite md126
[root@sysresccd ~]# mdadm --readwrite md127
[root@sysresccd ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active (auto-read-only) raid1 sdg2[1] sdb2[2]
      10476544 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sdb3[2]
      41909248 blocks super 1.1 [2/1] [U_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md127 : active raid1 sdb1[2]
      10476544 blocks super 1.1 [2/1] [U_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
Rejoin the devices back into the RAID1 mirror and let them resync

[root@sysresccd ~]# mdadm /dev/md126 -a /dev/sdg3
mdadm: re-added /dev/sdg3
[root@sysresccd ~]# mdadm /dev/md127 -a /dev/sdg1
mdadm: re-added /dev/sdg1
[root@sysresccd ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active (auto-read-only) raid1 sdg2[1] sdb2[2]
      10476544 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sdg3[1] sdb3[2]
      41909248 blocks super 1.1 [2/1] [U_]
      [=========>...........]  recovery = 48.5% (20361856/41909248) finish=1.7min speed=200123K/sec
      bitmap: 1/1 pages [4KB], 65536KB chunk

md127 : active raid1 sdg1[1] sdb1[2]
      10476544 blocks super 1.1 [2/1] [U_]
      	resync=DELAYED
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
[root@sysresccd ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active (auto-read-only) raid1 sdg2[1] sdb2[2]
      10476544 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 sdg3[1] sdb3[2]
      41909248 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sdg1[1] sdb1[2]
      10476544 blocks super 1.1 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

As an added check, run fsck on the volumes

[root@sysresccd ~]# fsck /dev/md125
fsck from util-linux 2.36
e2fsck 1.45.6 (20-Mar-2020)
/dev/md125 has gone 230 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md125: 62842/655360 files (0.2% non-contiguous), 1912185/2619136 blocks

[root@sysresccd ~]# fsck /dev/md126
fsck from util-linux 2.36
e2fsck 1.45.6 (20-Mar-2020)
/dev/md126: clean, 20006/2621440 files, 5177194/10477312 blocks

[root@sysresccd ~]# fsck /dev/md127
fsck from util-linux 2.36
e2fsck 1.45.6 (20-Mar-2020)
/dev/md127: clean, 66951/655360 files, 1866042/2619136 blocks

After rebooting back into the hypervisor, the CVM came up normally.
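The writable-mode and resync steps in this answer can be rehearsed as a dry run before anything is written to disk. The sketch below is an illustration and not part of the original fix: readwrite_plan and resync_pending are hypothetical names, both only read mdstat-style text, and readwrite_plan prints mdadm commands instead of running them. It assumes the /proc/mdstat layout shown in this post, and it prints /dev/mdNNN paths where the post used the bare names md126 and md127.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative). Both read mdstat-style text
# from the file in $1, defaulting to /proc/mdstat, and change nothing on disk.

# Print (do not run) a --readwrite command for each array that is both
# auto-read-only and missing a mirror, as md126 and md127 were in this thread.
readwrite_plan() {
    awk '
        # Header line: remember the name and whether it is auto-read-only.
        /^md[0-9]+ :/ { name = $1; ro = ($0 ~ /\(auto-read-only\)/) }
        # Status line with an underscore in the member map: array is degraded.
        /blocks/ && /\[[U_]*_[U_]*\]/ && ro { print "mdadm --readwrite /dev/" name }
    ' "${1:-/proc/mdstat}"
}

# Succeed (exit 0) while any array is still rebuilding or queued to rebuild.
resync_pending() {
    grep -Eq '(recovery|resync) =|resync=(DELAYED|PENDING)' "${1:-/proc/mdstat}"
}
```

After re-adding the missing members, a polling loop such as "while resync_pending; do sleep 10; done" would block until the progress and DELAYED lines disappear from /proc/mdstat, which is the state shown in the last listing above. mdadm's own --wait option should serve the same purpose where it is available.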


          
          
           
           
             
             
           
          
          
           
            
             
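B.Moussa
Voyager
1 reply
5 months ago
4 July 2021

Hello,

Thanks for sharing, this one was very hard for me. Just to note that I used the Phoenix image downloaded from Prism (boot repair section), and the commands below did not work:

[root@sysresccd ~]# mdadm --readwrite md126
[root@sysresccd ~]# mdadm --readwrite md127

I tried applying the changes anyway and it worked (the arrays were already in read-write state).

Best regards & thank you.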
           
          
          
           
           
             
             
              
             
           
          
          
           
            
             
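Sergiy Lozovsky
Nutanix Employee
1 reply
3 months ago
31 August 2021

If you have access to the CVM console:

Boot the CVM;
In the GRUB menu select "Debug Shell";
At the shell prompt run ". modules.sh" (loads the required drivers);
Run "mdadm --assemble --scan --run";
Check that the RAID arrays have both drives ("cat /proc/mdstat").

If a RAID array is not assembled correctly, re-add the missing member as described in the previous comment (mdadm /dev/md127 -a /dev/sdg1).
Restart the CVM (from the hypervisor).

There are external pages on rebuilding md arrays, for example https://www.thomas-krenn.com/en/wiki/Mdadm_recovery_and_resync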
          
          
           
           