Disk I/O latency on a Nutanix cluster

September 16, 2020

Ashwin.Ramaswamy
Nutanix Employee

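The Prism interface lets you investigate disk I/O latency. Doing so usually raises the following questions.

Note: Nutanix recommends that maximum latency readings not be used as a measure of cluster performance and health. Average latency is a useful measure of cluster performance and health.

What should the average latency be on a production cluster?
What should the maximum latency be?
At what point is latency too high?
How do you investigate high latency?

Consider the following for latency investigations.

The end-user impact. If end users cannot measure any impact, an investigation of performance statistics is likely to reveal only normal and healthy cluster operations.
The factors the investigation depends on: the mix of VMs, the type of traffic at the time, the read or write size, sequential versus non-sequential access, and reads versus writes.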

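Latency Variables in a Nutanix Cluster

The following points describe what affects latency on a Nutanix cluster.

Nutanix offers all-flash nodes, but this KB focuses on two-tier (SSD and HDD) nodes. The two-tier design aims to keep frequently read data in the hot (SSD) tier, while Information Lifecycle Management (ILM) promotes data to and demotes data from the hot tier. This is a cost-effective solution, but its latency response is variable.
Extent store: HDD and SSD together make up the extent store. However, a portion of the SSDs is used for the Oplog.
Oplog: Used for random writes. Data is written here temporarily so that the write can be acknowledged quickly, and is eventually drained to the extent store.
A correctly sized cluster has a working set size (WSS) that fits within the SSD tier. This ensures that commonly accessed data is served from SSD. If ILM keeps moving data from the hot tier to the cold tier and back, the cluster is undersized and will see higher latency because more reads hit the cold tier.
Data read from the cold tier (HDD, spinning disk) has higher latency than data read from the hot tier.
A write on a Nutanix cluster is not acknowledged back to the VM until the data has been written to two nodes in the cluster (with the default redundancy factor 2 (RF2) configuration). This adds some latency compared to a single, local write.
Non-sequential writes are small and time sensitive, so they are normally candidates for the hot tier. Non-sequential (random) writes are first written to the Oplog and eventually moved to the extent store.
Sequential writes skip the Oplog if the outstanding writes exceed 1.5 MB. In that case the data is written directly to the extent store.
Write size has a large impact on write latency. A 1 MB write has much higher latency than an 8 KB write.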
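
As a rough illustration of the write-path rules in the list above, here is a minimal Python sketch. It is not Nutanix code: the `Write` type, its field names, and the reading of 1.5 MB as 1.5 × 1024 × 1024 bytes are assumptions made for the example. It only models the two rules stated in this KB (random writes go to the Oplog; sequential writes bypass it once outstanding writes exceed 1.5 MB).

```python
from dataclasses import dataclass

# Threshold from the KB text: sequential writes bypass the Oplog once
# outstanding writes exceed 1.5 MB. Interpreting MB as MiB is an assumption.
OPLOG_BYPASS_BYTES = int(1.5 * 1024 * 1024)


@dataclass
class Write:
    """Hypothetical description of one write, for illustration only."""
    size_bytes: int
    sequential: bool
    outstanding_bytes: int  # outstanding write data for this stream


def write_destination(w: Write) -> str:
    """Return 'extent_store' or 'oplog' according to the two rules above."""
    if w.sequential and w.outstanding_bytes > OPLOG_BYPASS_BYTES:
        return "extent_store"
    # Random writes, and sequential writes below the threshold, land in the Oplog.
    return "oplog"
```

A small random write such as `Write(8192, False, 0)` maps to the Oplog, while a sequential stream with 4 MB outstanding maps to the extent store.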

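Average Latency versus Maximum Latency

The factors noted in the "Latency Variables in a Nutanix Cluster" section introduce periods of high latency, for example a near-instantaneous spike caused by a short write to HDD. A maximum reading therefore tends to reflect one such slow operation rather than the overall state of the cluster.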
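
To see why a single spike says little about overall health, consider this small Python sketch. The sample values are invented for illustration (one hypothetical sample per second over one minute); they are not measurements from a real cluster.

```python
from statistics import mean
from typing import Sequence, Tuple


def summarize(samples_ms: Sequence[float]) -> Tuple[float, float]:
    """Return (average, maximum) latency in milliseconds for a window."""
    return mean(samples_ms), max(samples_ms)


# Hypothetical window: 59 samples at 2 ms and one 250 ms spike.
window = [2.0] * 59 + [250.0]
avg_ms, max_ms = summarize(window)
# avg_ms is roughly 6.1 ms, well inside the 1 to 10 ms range,
# while max_ms is 250 ms because of the single spike.
```

The average stays in the normal range even though the maximum looks alarming, which matches the recommendation to judge the cluster by average latency.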

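What should the average latency be on a production cluster?

It depends on the type of workload on the cluster, but most workloads should see an average latency of 1 to 10 milliseconds, rising to 10 to 20 milliseconds for particular traffic patterns (for example, sequential large-block writes).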

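At what point is latency too high?

Ideally, the answer is "at the point that end users report slow response." If you are concerned about higher latency before that point, check whether it repeats (if it is intermittent or sporadic) and work with end users to see whether it has any impact on them.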

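Periods of high latency:

Latency that is above 10 milliseconds most of the time, or above 20 milliseconds for minutes at a time, is a candidate for further investigation.
Very high spikes, from the high hundreds to thousands of milliseconds, are very likely to have end-user impact and must be investigated.

However, instantaneous, infrequent, non-periodic spikes beyond 20 milliseconds (into the low hundreds of milliseconds) are much more likely to be caused by normal reads from or writes to the cold tier. If there is no end-user impact and no correlation with known VM or network events, these spikes can be ignored.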
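
The thresholds above can be turned into a simple triage helper. The following Python sketch is only one way to read them: the 10 ms and 20 ms limits come from this KB, but treating "most of the time" as more than half of the samples, "minutes at a time" as 120 seconds, and "high hundreds of milliseconds" as 500 ms are assumptions, as are the function and parameter names.

```python
from typing import Sequence


def classify_latency(samples_ms: Sequence[float],
                     interval_s: float = 30.0,    # assumed spacing between samples
                     sustained_s: float = 120.0,  # assumed meaning of "minutes at a time"
                     severe_ms: float = 500.0     # assumed meaning of "high hundreds"
                     ) -> str:
    """Classify a window of average-latency samples using the KB's thresholds."""
    if not samples_ms:
        return "no data"

    # Very high spikes must be investigated.
    if max(samples_ms) >= severe_ms:
        return "must investigate"

    # Above 10 ms "most of the time" (assumed: more than half the samples).
    above_10 = sum(1 for s in samples_ms if s > 10.0)

    # Longest consecutive run above 20 ms, converted to seconds below.
    longest = run = 0
    for s in samples_ms:
        run = run + 1 if s > 20.0 else 0
        longest = max(longest, run)

    if above_10 > len(samples_ms) / 2 or longest * interval_s >= sustained_s:
        return "candidate for investigation"
    return "normal"
```

With these assumptions, a window such as `[3, 4, 5, 150, 4, 3]` is classified as normal (one isolated spike in the low hundreds of milliseconds), while four consecutive 30-second samples above 20 ms make the window a candidate for investigation. End-user impact still takes precedence over any such classification.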

Note: The NCC health check for VM I/O latency reports problems at 200 milliseconds or higher.

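How to investigate high latency?

The following are some methods you can use to investigate high latency. Use the ones that suit your circumstances:

Check the WSS (available in Prism) to see whether the working set is too large for the hot tier.
Use Prism to create graphs of read latency and write latency separately.
Consider the network:
- Are the hosts connected to 10 GbE line-rate switches (as required for most Nutanix clusters)? In particular, are Cisco Fabric Extender switches in use? If so, see KB 1612 and replace the Fabric Extender with a 10 GbE line-rate switch.
- Are there network errors (for example, Rx errors on host NICs or switch interfaces)? Does changing the cabling stabilize the error counts?
Consider the initial sizing plan for the cluster:
- How many VMs was the cluster specified to run? What sizes and mix of VMs? Are you running more VMs than the cluster was configured for?
Correlate latency events with activity on the cluster:
- Anti-virus scans running on a large number of (or all) VMs at the same time.
  Note: The best practice for running anti-virus software on a Nutanix cluster is to stagger scans across the VMs.
- Database batch jobs. Databases that read large amounts of cold data should be isolated from other VMs on a different node (wherever possible) so that their hot-tier requirements do not interfere with the other VMs.
- Protection domain replications. The cluster should not normally allow these background tasks to interfere with VM I/O requirements.
- Other backup tasks.
- New VM creations.
Raise a ticket with Nutanix Support:
- Nutanix Support can investigate performance issues with you. If you have any unexplained latency issue, especially one that has end-user impact, log a case to discuss it with Nutanix Support.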
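
For the correlation step in the list above, a minimal Python sketch follows. It assumes you have already collected spike timestamps (for example, read off a Prism graph) and a list of known activity windows from your own schedules; the `Event` type, the `events_during` helper, and the 5-minute slack are hypothetical and not part of any Nutanix tool.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Event(NamedTuple):
    """A known cluster activity window (hypothetical record format)."""
    name: str
    start: datetime
    end: datetime


def events_during(spike_time: datetime,
                  events: List[Event],
                  slack: timedelta = timedelta(minutes=5)) -> List[str]:
    """Return names of events whose window (plus slack) contains the spike."""
    return [e.name for e in events
            if e.start - slack <= spike_time <= e.end + slack]


# Hypothetical schedule for one day.
schedule = [
    Event("anti-virus scan", datetime(2020, 9, 16, 2, 0), datetime(2020, 9, 16, 3, 0)),
    Event("protection domain replication",
          datetime(2020, 9, 16, 12, 0), datetime(2020, 9, 16, 12, 30)),
]

matches = events_during(datetime(2020, 9, 16, 2, 30), schedule)
```

A spike with no matching event and no end-user impact falls into the category of spikes that can be ignored; a spike that repeatedly lines up with the same activity points to that activity as the thing to reschedule or isolate.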

For more information, see: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA03200000098bBCAQ