
IOPS and latency problems

8 years ago
October 1, 2014
11 replies
6490 views

tenKe
Adventurer
4 replies

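Hi mates,

We have 2x NX3350 and an Arista switch, which handle roughly 100 VMs.

Workloads are:

20x VMs with a heavy disk workload, used mainly for reporting and analysis. The R/W ratio is roughly 50/50. They generated a 20K IOPS workload when they lived on SAN storage with RAID10 and a tiny flash tier, and average latency was always below 5 ms.

10x VMs used for application virtualization.

70x VMs used for desktop virtualization.

Now that we have migrated to Nutanix, we get only 4K cluster IOPS and 20 ms latency, which does not seem very good to us.

Trying to resolve the issue, we enabled inline compression and increased CVM memory to 20 GB. We also tried changing the tier sequential write priority. Unfortunately, this did not help.

NCC, cluster status and Prism health all claim that everything is OK. Before we migrated our environment, we ran the diagnostic VM and the result was roughly 100K IOPS for reads.

Here is the current configuration of the cluster:

NOS version: 4.0.1.1

1 storage pool

7 containers with inline compression enabled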

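Also, please see the 2009/latency page output on one node as an example:

reads:

nfs_adapter:

Stage | Avg latency (us) | Op count | Latency % of total | Latency % of this component | Op count % of total | Op count % of this component
nfs_adapter component | 42348 | 273618 | 100 | 100 | 100 | 100
RangeLocksAcquired | 0 | 273618 | 0 | 0 | 100 | 100
InodeLockAcquired | 5 | 273830 | 0 | 0 | 100 | 100
SentToAdmctl | 4 | 273772 | 0 | 0 | 100 | 100
stargate_admctl | 42303 | 273772 | 99 | 99 | 100 | 100
AdmctlDone | 2 | 273772 | 0 | 0 | 100 | 100
Finish | 8 | 273828 | 0 | 0 | 100 | 100

writes:

nfs_adapter:

Stage | Avg latency (us) | Op count | Latency % of total | Latency % of this component | Op count % of total | Op count % of this component
nfs_adapter component | 21420 | 150682 | 100 | 100 | 100 | 100
RangeLocksAcquired | 0 | 150682 | 0 | 0 | 100 | 100
InodeLockAcquired | 16 | 152838 | 0 | 0 | 101 | 101
SentToAdmctl | 16 | 148526 | 0 | 0 | 98 | 98
stargate_admctl | 21506 | 148526 | 98 | 98 | 98 | 98
AdmctlDone | 2 | 148526 | 0 | 0 | 98 | 98
Finish | 18 | 152831 | 0 | 0 | 101 | 101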

Any ideas, please?

Br,

Update

We ran the diagnostic VM today during the production workload. Here is the output:

Waiting for the hot cache to flush ........... done.
Running test 'Sequential write bandwidth' ...
Begin fio_seq_write: Wed Oct 1 11:52:11 2014
1475 MBps
End fio_seq_write: Wed Oct 1 11:53:07 2014
Duration fio_seq_write : 56 secs
*******************************************************************************

Waiting for the hot cache to flush ............. done.
Running test 'Sequential read bandwidth' ...
Begin fio_seq_read: Wed Oct 1 11:54:16 2014
5104 MBps
End fio_seq_read: Wed Oct 1 11:54:33 2014
Duration fio_seq_read : 17 secs
*******************************************************************************

Waiting for the hot cache to flush ......... done.
Running test 'Random read IOPS' ...
Begin fio_rand_read: Wed Oct 1 11:55:20 2014
123849 IOPS
End fio_rand_read: Wed Oct 1 11:57:02 2014
Duration fio_rand_read : 102 secs
*******************************************************************************

Waiting for the hot cache to flush ....... done.
Running test 'Random write IOPS' ...
Begin fio_rand_write: Wed Oct 1 11:57:38 2014
85467 IOPS
End fio_rand_write: Wed Oct 1 11:59:20 2014
Duration fio_rand_write : 102 secs
*******************************************************************************

Tests done.

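Update

We have found that the problem is in the hypervisor datastore layout. If I mount the Nutanix container inside the test virtual machine, everything works as it should and we get nice results. Once a VM writes data through the hypervisor, the latency and IOPS are bad again. What could be the reason for this?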
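As a rough sanity check on the figures quoted in this post, Little's law (mean I/Os in flight = throughput x mean latency) can be applied to both setups. This is only a sketch: it assumes the reported IOPS and latency are steady-state averages over the same interval, it treats "below 5 ms" as 5 ms, and it ignores that the 20K figure covers the 20 heavy VMs while the 4K figure is cluster-wide. The function name is made up for illustration.

```python
def outstanding_ios(iops, latency_ms):
    """Little's law: mean I/Os in flight = throughput * mean latency."""
    return iops * latency_ms / 1000.0

# Figures quoted in the post (assumed to be steady-state averages).
san = outstanding_ios(20_000, 5)      # old SAN: 20K IOPS, latency up to 5 ms
nutanix = outstanding_ios(4_000, 20)  # after migration: 4K IOPS at 20 ms

print(f"SAN:     ~{san:.0f} I/Os in flight")
print(f"Nutanix: ~{nutanix:.0f} I/Os in flight")
```

Under those assumptions both cases work out to a similar number of I/Os in flight (about 100 versus about 80). That would be consistent with a workload that keeps a roughly fixed queue depth, so that throughput falls as per-I/O latency rises, rather than with a workload that simply issues fewer requests.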

IOPS
Latency

This topic has been closed for comments.

11 replies

sun-technic
Traveler
1 reply

8 years ago
October 1, 2014

Hello,

You should migrate to Supermicro and some really cool Unix system clone.

SY

shapa
Nutanix Employee
4 replies

8 years ago
October 2, 2014

Hello,

Clearly, it's a configuration issue.

6x 3050 nodes should give you at least 120000 / 60000 random IOPS (hot tier).

Please open a support case; we will definitely help you fix the problem.

shapa
Nutanix Employee
4 replies

8 years ago
October 2, 2014

Actually, if our diagnostic VM runs fine, the problem is probably related to your guest VM configuration.

Multiple vdisks should be attached (they can be unified with LVM, for example) to get more performance from VMs, as Nutanix OS limits the oplog size per vdisk (to avoid the "noisy neighbour" problem).

In fact, you can look inside our diagnostic VM (the password is the standard one); it's CentOS 6.5. There are multiple vdisks connected and managed by LVM. This way you can get much better performance: 15-20K IOPS per single VM.

It is not necessary to use the VMware paravirtual adapter for Linux VMs; standard SCSI / SAS will work fine.
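To illustrate why combining several vdisks can spread I/O, here is a minimal sketch of the usual striped-volume address mapping: consecutive stripes go to the member disks round-robin. The stripe size, the disk count and the function name are assumptions chosen for the example; they are not taken from the diagnostic VM or from any Nutanix guide, and real LVM has details this ignores.

```python
from collections import Counter

def locate(offset, stripe_size, n_disks):
    """Map a logical byte offset on a striped volume to (disk index, offset on that disk)."""
    stripe = offset // stripe_size               # which stripe the byte falls in
    disk = stripe % n_disks                      # stripes are dealt out round-robin
    disk_offset = (stripe // n_disks) * stripe_size + offset % stripe_size
    return disk, disk_offset

STRIPE = 64 * 1024   # hypothetical 64 KiB stripe size
DISKS = 4            # hypothetical number of vdisks in the volume

# Count which disk each 4 KiB block of a 1 MiB sequential run lands on.
hits = Counter(locate(off, STRIPE, DISKS)[0] for off in range(0, 1024 * 1024, 4096))
print(dict(hits))   # {0: 64, 1: 64, 2: 64, 3: 64}
```

With a striped layout even a small working set touches every member disk. A linear (concatenated) volume instead fills one member before moving on to the next, so it does not spread a small working set in the same way.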

tenKe
Author
Adventurer
4 replies

8 years ago
October 2, 2014

Thanks for your reply,

> Multiple vdisks should be attached (they can be unified with LVM, for example) to get more performance from VMs, as Nutanix OS limits the oplog size per vdisk (to avoid the "noisy neighbour" problem)

Why, then, does my test VM work fine when I mount the container inside it and run I/O tests? It has only one vdisk, and when I run, e.g., a sequential write test, the result is ~270 MB/s of throughput.

Also, roughly half a year ago, we tested the same model, 1x NX-3350. I ran 4 VMs on each node: 1 with random read load, 1 with random write load, 1 with sequential write load and 1 with sequential read load. In total, I had 12 VMs on 3 nodes. I did not notice any per-VM oplog limits, because at that time we got 50K read IOPS and 30K write IOPS etc. during more than 20 hours of testing. The only difference was the NOS version, which was 3.5.

shapa
Nutanix Employee
4 replies

8 years ago
October 2, 2014

"that time we got 50K read IOPS and 30K write IOPS"

I am not sure if it is possible at all :)

You can never get 50K IOPS on a single VM without multiple disks attached.

Again, this is normal / expected behaviour by design (covered by multiple guides, for example the MS SQL on Nutanix best practices).

Probably your tests were incorrect (RAM cache used, etc.).

Again, if you don't have any issues with the NFS-mounted container (and our diagnostics VM runs fine and shows perfect performance), it means that something is wrong with the guest VM configuration.

It is also possible that you have some flapping / network problem.

The only fast way to fix this issue is to open a support case. Obviously, the situation is not normal.

tenKe
Author
Adventurer
4 replies

8 years ago
October 3, 2014

Thanks for your help,

> You can never get 50K IOPS on a single VM without multiple disks attached.

> Again, this is normal / expected behaviour by design (covered by multiple guides, for example the MS SQL on Nutanix best practices)

Not on a single VM: we had 4 VMs on each node, one for each type of workload. In total, we had 12 VMs, and we got approximately 40K IOPS. Those VMs were simple Ubuntu VMs running fio with no advanced configuration.

> The only fast way to fix this issue is to open a support case. Obviously, the situation is not normal.

Thank you for the advice. I have already opened a case for this issue.

I am not saying we have trouble with the NFS container. Again, as I said before, when we mounted the NFS container inside the VM, we got perfect results. We face performance problems only when we read/write through the hypervisor datastore.

tenKe
Author
Adventurer
4 replies

8 years ago
October 7, 2014

> In fact, you can look inside our diagnostic VM (the password is the standard one); it's CentOS 6.5. There are multiple vdisks connected and managed by LVM. This way you can get much better performance: 15-20K IOPS per single VM.

We deployed the diagnostic VM with VMware Workstation, and it has only one vdisk, both in the VM settings and inside the VM. Could you please explain a little more how we can see multiple drives in the diagnostic VM?

Today we ran such tests with a spanned volume created from 8 vdisks. Unfortunately, we got only 1.4K IOPS for random read. By the way, we moved all the heavy VMs off Nutanix, so it now handles only the VDI and thin app VMs.

gary
Nutanix Employee
4 replies

8 years ago
October 8, 2014

Can you share the fio script/parameters you used? I would like to see if I can reproduce this.

From what I understand, you have an Ubuntu guest VM running in an ESX hypervisor. Running fio inside this VM on ESX gives much worse performance than if you mount the NFS directly.
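For anyone wanting to share comparable fio parameters, the sketch below assembles one fio command line from explicit settings so that the block size, queue depth and cache behaviour are all visible. Every value here (job name, file path, 4k blocks, queue depth 32, 60-second runtime) is a placeholder for illustration and not what tenKe actually ran; only the fio option names are real. `--direct=1` is included because it bypasses the guest page cache, which speaks to the "RAM cache" concern raised earlier in the thread.

```python
import shlex

def fio_cmd(name, rw, filename, bs="4k", iodepth=32, runtime_s=60, size="4g"):
    """Build a fio command line for one job (all default values are illustrative)."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        f"--rw={rw}",              # e.g. randread, randwrite, read, write
        f"--bs={bs}",
        f"--iodepth={iodepth}",
        "--ioengine=libaio",       # Linux asynchronous I/O engine
        "--direct=1",              # bypass the guest page cache
        f"--size={size}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--output-format=json",
    ]

# Hypothetical job name and test file path.
cmd = fio_cmd("randread-test", "randread", "/mnt/test/fio.dat")
print(shlex.join(cmd))
```

Running the same command once against a disk on the hypervisor datastore and once against the directly mounted NFS container would give two results that differ only in the I/O path.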

dchawla
Nutanix Employee
1 reply

8 years ago
October 8, 2014

tenKe,

I head Support for Nutanix, and I understand that my team is working with you to collect performance data on your cluster; we have engineers standing by to analyze it as soon as we receive it. The diagnosis so far has not pointed to any issue in the product, but we need the data to dig further and get closer to the root cause.

Thanks for your support, and for being a Nutanix customer.

VCDXNZ001
Nutanix Employee
37 replies

8 years ago
October 9, 2014

Which version of ESXi is being used?

tenKe
Author
Adventurer
4 replies

8 years ago
October 9, 2014

> Which version of ESXi is being used?

ESXi 5.1 U2

Thank you, mates, for your assistance. I will reply ASAP.