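Solved

VSS snapshot failed for VM

Good day,

I am new to Nutanix and recently purchased a cluster. It has been running for only about 30 days and now manages my network, and I am doing some configuration on it. I need some help with an error message I am receiving. I am using AHV as the hypervisor on a 3-node cluster. On this cluster I am running 5 Windows-based server VMs (no Hyper-V or VMware). I followed the instructions in the Admin Guide: I enabled VSS shadow copies on the servers, installed the guest tools on all servers, and created an Async DR protection domain. My configuration is working, and snapshots are being created for my domain controller and application servers. However, when trying to create a snapshot for my file server, I keep getting the following error.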

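"Warning: VSS snapshot failed for the VM(s) FS-01 protected by the FileServer in the snapshot (169035, 1563300389081879, 960) because quiescing guest VM(s) failed or timed out.

Impact: A crash-consistent snapshot is taken instead of an application-consistent snapshot.
Cause: The guest was unable to quiesce the VM due to an internal error.
Resolution: Review the logs in the guest VM. If the VM cannot quiesce, reduce the load on the VM and retry.

Node ID: HM186S003223
Block ID: 19SM6J220007
Block type: NX-1065-G6
Cluster ID: 169035
Cluster UUID: 00058DD0-3C5E-5317-0000-00000002944B
Cluster name: ntnx
Cluster version: el7.3-release-euphrates-5.10.8.1-stable-9AC2CB13B645B9DF04EB85B85B0E091F1060EE27439
Cluster IPs: 192.168.1.10 192.168.1.11 192.168.1.12
Timestamp: Wed Jan 22 10:03:48 AST 2020"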

I would greatly appreciate your help in resolving this issue.

Regards,
Kevon Heraman


Best answer by Jeremy, 20 February 2020, 17:08

Hello @kheraman

Are you still having an issue?

For successful application-consistent VM snapshots, the following need to happen:

  1. A backup snapshot is triggered with the app-consistent option enabled.
  2. The CVM reaches out to the NGT service running on the VM via TCP/IP to signal that a VSS snapshot is needed.
    1. This requires TCP/IP communication to be possible both ways, but should not need DNS, since the NGT service will inform Prism of the IP and the NGT installation has cluster IP info. (NAT could be a problem.)
    2. The NGT service must, of course, be installed and running, and able to reach the CVM on port 2074.
      (more detail on what ports are needed for full NGT function here)
    3. Communication uses a pre-shared key, which is part of the NGT installation, and an identifier which is unique to the VM. For this to work, the NGT installation must be unique per VM, using the "mount ISO" option from Prism.
      If cloning VMs, you can pre-install NGT and then mount the ISO again on the clone before powering on. The NGT service will fetch updated identifier info during service start if the NGT ISO is found in the VM's CD-ROM drive.
  3. The NGT service requests a VSS quiesce operation from the Windows OS.
    1. In a quiesce operation, all new changes to disk are held in hot-backup in memory on the VM until the snapshot is finished.
    2. All pending changes to disk must finish before the snapshot can happen.
    3. This requires sufficient memory on the VM to hold all new changes long enough to complete the snapshot; otherwise the application-consistent snapshot will fail. If you're seeing intermittent failures, this is where to focus.
    4. This process can be impacted by high workload, slower disk performance, hypervisor memory or CPU contention, VM memory or CPU contention, or any Windows VSS-specific issue which prevents quiesce completion.
      1. Options for resolution include re-balancing workloads, adding resources at the host or VM, or adjusting scheduled jobs so that snapshots can run at lower-IO times.
  4. Once VSS signals back to NGT that the quiesce is good, the NGT service signals Prism and the snapshot is taken. Prism then signals back to NGT, which relays to VSS, at which point new pending disk operations are allowed to flow to disk.
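
Step 2 depends on the guest being able to open a TCP connection to the CVM on port 2074, so a quick reachability check from inside the VM can rule that out early. The following is a minimal Python sketch, not a Nutanix tool: the function names are made up for illustration, and it only tests that the port accepts a connection in the guest-to-CVM direction. It says nothing about the pre-shared key or the VM identifier.

```python
import socket

NGT_PORT = 2074  # CVM port named in the answer above


def can_reach(host: str, port: int = NGT_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_cluster(hosts, port: int = NGT_PORT, timeout: float = 3.0) -> dict:
    """Map each host to whether the TCP connection succeeded."""
    return {host: can_reach(host, port, timeout) for host in hosts}
```

Run from the file server, a call such as `check_cluster(["192.168.1.10", "192.168.1.11", "192.168.1.12"])` would use the cluster IPs listed in the alert, on the assumption that those are the CVM addresses the NGT service talks to. A `False` for every host points at a network or firewall problem rather than at VSS.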

Where this error message indicates "an internal error", I would actually be looking at VSS and the Nutanix Guest Tools service on the user VM itself. A different error should be seen if Prism cannot reach the NGT service on the VM. There is yet another error when NGT has not been enabled.

The KB article "Taking app-consistent (VSS) snapshots using NGT fails on Windows VMs" covers one scenario where the culprit is anti-virus software on the VM. The KB also gives some good general steps for exploring the issue with Event Viewer and the vssadmin command. These are often essential in identifying and resolving the issue. The important thing to look for in 'vssadmin list writers' and 'vssadmin list providers' is the last error state. If the last attempt was successful, we'll see an indication of no error. If you just tried the backup and still see no error here, our problem is happening before VSS gets triggered.
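
If there are many writers, picking out the last error state by eye gets tedious. Here is a small Python sketch that pulls it out of `vssadmin list writers` output. It assumes the English-locale output format, where each writer block has a `Writer name: '...'` line followed by a `Last error: ...` line; the helper names are hypothetical, and `list_writers` has to be run on the Windows VM from an elevated prompt.

```python
import re
import subprocess


def parse_writers(output: str) -> dict:
    """Map each VSS writer name to the text of its 'Last error' line."""
    writers = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        name = re.match(r"Writer name:\s*'(.+)'", line)
        if name:
            current = name.group(1)
        elif current and line.startswith("Last error:"):
            writers[current] = line.split(":", 1)[1].strip()
    return writers


def failed_writers(output: str) -> dict:
    """Keep only writers whose last error is something other than 'No error'."""
    return {n: e for n, e in parse_writers(output).items()
            if e.lower() != "no error"}


def list_writers() -> str:
    """Run 'vssadmin list writers' (Windows only, needs administrator rights)."""
    result = subprocess.run(["vssadmin", "list", "writers"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `failed_writers(list_writers())` right after a failed snapshot attempt gives the writers worth investigating. An empty result right after a failure matches the case described above, where the problem happens before VSS is triggered.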

The article "Nutanix Guest Tools Troubleshooting Guide" provides further guidance on validating the Nutanix Guest Tools installation.

This topic has been closed for comments.

4 replies


Hello,

Do you have any NGT tools installed? See https://next.nutanix.com/backup-and-recovery-29/vss-snapshot-failed-12608

No.


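Hello @Kheraman

We raise this alert when an operation that relies on Nutanix Guest Tools (NGT) is attempted while NGT is not installed on the VM or the NGT communication link is down. The following document may help:

https://portal.nutanix.com/#/page/kbs/details?targetId=ka00e000000000000cqilca0

You can install NGT on the VMs and see whether the alert appears again.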

