Hello!
I'm quite new to the Nutanix world; I've been working with standard servers + storage for more than ten years. Here we have 2 clusters of 3 nodes each, one per site, with Metro Availability between the two sites. Each site has 3 active protection domains, which are replicated to the other site (passive), and vice versa.
Site1:
Node1 - Node3 - Node5
pd site1 (active): PROD_001, DEV_001, INFRA_001
pd site1 (passive): PROD_002, DEV_002, INFRA_002
Site2:
Node2 - Node4 - Node6
pd site2 (active): PROD_002, DEV_002, INFRA_002
pd site2 (passive): PROD_001, DEV_001, INFRA_001
In the vCenter cluster configuration we have explicit VM/Host affinity rules for site1 and for site2, to prevent VMs running on the "odd" nodes from being hosted on the "even" nodes (and vice versa).
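For reference, these are plain DRS VM/Host group rules; a pyVmomi sketch like the one below (vCenter address, credentials and names are placeholders, not our real ones) would list the groups and rules so they can be double-checked:

```python
# Sketch: dump the DRS VM/Host groups and affinity rules of each cluster.
# vCenter address and credentials below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()             # lab only; validate certs in production
si = SmartConnect(host="vcenter.example.local",
                  user="administrator@vsphere.local",
                  pwd="***",
                  sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    for cluster in view.view:
        print(f"Cluster: {cluster.name}")
        # VM groups and host groups referenced by the rules
        for group in cluster.configurationEx.group:
            members = getattr(group, "vm", None) or getattr(group, "host", [])
            print(f"  group {group.name}: {[m.name for m in members]}")
        # VM-to-host affinity rules ("VMs in group X run on hosts in group Y")
        for rule in cluster.configurationEx.rule:
            if isinstance(rule, vim.cluster.VmHostRuleInfo):
                print(f"  rule {rule.name}: {rule.vmGroupName} -> "
                      f"{rule.affineHostGroupName} (mandatory={rule.mandatory})")
    view.DestroyView()
finally:
    Disconnect(si)
```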
Sometimes we have to migrate a VM from one site to the other, so we do a full vMotion (compute and storage). After the migration we keep receiving warnings with this message:
Vstore infr_001 snapshot status: Failed. VMs protected by another protection domain exist in vstore infr_001: VM = SXXXX96, PD = (PROD_002). Unprotect the VM before snapshotting this vstore.
This also happens when we move a VM's files from one datastore to another datastore within the same site. I have searched the internet and the Nutanix documentation and found nothing on how to handle these errors. The message says "unprotect vm to vstore before snapshot this vstore", but how do I do that? Is it done in ncli? Prism? vCenter? Is there something we are not doing? What is the best practice?
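From what I can piece together, the VM apparently has to be unprotected from the other protection domain first (ncli seems to have a protection-domain unprotect command for this, but I have not tried it). Would something along the lines of the sketch below, against the Prism v2.0 REST API, be the right way to script it? The endpoint path and payload shape are my assumptions, so please correct me if they are wrong:

```python
# Rough sketch only: unprotect one VM from a protection domain via the
# Prism v2.0 REST API. The endpoint path and payload are assumptions;
# verify them in the Prism REST API Explorer before relying on this.
import requests
from requests.auth import HTTPBasicAuth

PRISM = "https://prism.example.local:9440"   # placeholder: cluster virtual IP
AUTH = HTTPBasicAuth("admin", "***")         # placeholder credentials
PD_NAME = "PROD_002"                         # PD that currently protects the VM
VM_NAME = "SXXXX96"

url = f"{PRISM}/PrismGateway/services/rest/v2.0/protection_domains/{PD_NAME}/unprotect_vms"
resp = requests.post(url,
                     json=[VM_NAME],         # assumed payload: JSON list of VM names
                     auth=AUTH,
                     verify=False)           # lab only; use proper certificates
resp.raise_for_status()
print(resp.status_code, resp.text)
```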
Any help would be greatly appreciated.
Thanks,
Enrique
Best answer by Ivanov:
I have checked the history of your support cases and found a performance-related case concerning a VMware bug: when more than 5 NFS datastores are connected via the same IP, storage performance degrades over time. The issue is addressed in ESXi 6.5 U3, 6.7 U3 and newer. We also applied a workaround on the AOS side; simply upgrading AOS to 5.10.4 or newer applies the fix, but the hosts need a reboot afterwards. That is what I can see happened in your situation: the fix was already applied, but the reboot was pending. As far as I can see from the case, the issue was resolved once the host reboots were completed.
Here is the information about that VMware bug: https://kb.vmware.com/s/article/67129
We also have a KB about this issue with more details: https://portal.nutanix.com/kb/6961
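In case it helps to verify whether a cluster is in that situation, a rough pyVmomi sketch like the one below (connection details are placeholders; this is only an illustration, not an official tool) counts how many NFS datastores are mounted from each NFS server IP:

```python
# Sketch: count NFS datastores per NFS server IP, to spot the
# "more than 5 datastores on one IP" condition described in the KBs above.
import ssl
from collections import Counter
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()            # lab only
si = SmartConnect(host="vcenter.example.local",   # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="***",
                  sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    per_ip = Counter()
    for ds in view.view:
        # NFS datastores expose the server address in info.nas.remoteHost
        if isinstance(ds.info, vim.host.NasDatastoreInfo):
            per_ip[ds.info.nas.remoteHost] += 1
    view.DestroyView()
    for ip, count in sorted(per_ip.items()):
        print(f"{ip}: {count} NFS datastore(s)")
finally:
    Disconnect(si)
```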