Solved

Distributed object storage on Nutanix


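Hi, we are starting to look at using Spark on our Nutanix cluster. Nothing very big, just running some ETL processes in parallel. I'd like to avoid the stress of installing Hadoop, or at least HDFS, on the cluster.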

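Any suggestions on how to do this? I know containers are exported to ESXi via NFS. Can that be used? Would it be able to leverage Stargate for access from anywhere? All I really need is a globally available volume shared between all of my nodes.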

Best answer by Jon, 10 January 2018, 23:04

Hey Kevin,
I've moved your post from the CE forums to our production product forums.

In general, for Hadoop on Nutanix, I'd recommend checking out these three assets, which you can cherry-pick data from:
https://portal.nutanix.com/#/page/solutions/details?targetId=RA-2078-Cloudera-with-Nutanix:RA-2078-Cloudera-with-Nutanix
https://portal.nutanix.com/#/page/solutions/details?targetId=RA-2030_Hadoop_with_AHV:RA-2030_Hadoop_with_AHV

We don't specifically have a Spark on Nutanix guide out yet; however, those two are rich with content for the type of solution that you might want to roll out.

That said, you are correct that HDFS (in general) is designed for non-redundant storage (like bare metal), so it has a lot of the same constructs that Nutanix does already. It is worth noting that you can (or should be able to) configure the replication copies of Hadoop itself, so that you don't have many copies in Hadoop on top of many copies on Nutanix. That's generally where "the rub" comes from when we discuss this with customers.

That said, we've got customers doing Hadoop RF2 + Nutanix RF2 (such as in the Cloudera case) and it works just fine; it just imposes a bit of an overhead.

To be clear though, you can't expose HDFS directly from Stargate, so you'd always have something like a Hadoop data node (or data nodes, plural) in between Nutanix and Spark.
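
The replication stacking described in the answer can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the two layers multiply (each HDFS replica is stored again by the Nutanix replication factor) and it ignores compression, deduplication, erasure coding, and any other overhead. The function names and the 100 TiB figure are made up for the example. On the Hadoop side, the number of copies is normally controlled by the HDFS `dfs.replication` property.

```python
def physical_copies(hadoop_rf: int, nutanix_rf: int) -> int:
    """Copies of each block on disk when HDFS replicas sit on replicated storage."""
    if hadoop_rf < 1 or nutanix_rf < 1:
        raise ValueError("replication factors must be >= 1")
    # Assumption: every HDFS replica is itself written nutanix_rf times.
    return hadoop_rf * nutanix_rf


def usable_tib(raw_tib: float, hadoop_rf: int, nutanix_rf: int) -> float:
    """Logical data that fits in raw_tib of raw capacity, ignoring other overheads."""
    return raw_tib / physical_copies(hadoop_rf, nutanix_rf)


# Compare a few Hadoop replication settings on top of Nutanix RF2.
# 100 TiB raw is an arbitrary illustrative figure, not a number from the thread.
for hadoop_rf in (1, 2, 3):
    copies = physical_copies(hadoop_rf, nutanix_rf=2)
    usable = usable_tib(100.0, hadoop_rf, nutanix_rf=2)
    print(f"Hadoop RF{hadoop_rf} + Nutanix RF2: {copies} copies, "
          f"{usable:.1f} TiB usable per 100 TiB raw")
```

Under these assumptions, Hadoop RF2 on Nutanix RF2 stores four physical copies of each block, which is the overhead the answer refers to.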

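Thanks for that. I was hoping not to have to install a full Hadoop cluster right now. At the moment it is only a few Spark jobs. It looks like I can probably get away with standalone Spark for now, but a full Hadoop setup, probably HDP, may be needed in the near future. The thought of it just scares me. This is only a small part of what we do, and I only have 7 NX300 3910 nodes to play with, and they're pretty much all full already.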
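
For the standalone Spark route mentioned in this reply, a minimal sketch might look like the following. It relies on a general Spark behaviour: a `file://` path can be used as input or output as long as the same path is visible on every worker, for example through a network-mounted shared filesystem. Whether a Nutanix container can serve as that mount is not confirmed anywhere in this thread. The mount point `/mnt/shared`, the relative paths, the app name, and the helper names are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical mount point; it must be the same path on every Spark node.
SHARED_MOUNT = "/mnt/shared"


def shared_uri(relative_path: str, mount: str = SHARED_MOUNT) -> str:
    """Build a file:// URI for a path underneath the shared mount."""
    rel = PurePosixPath(relative_path)
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError("expected a relative path inside the shared mount")
    return "file://" + str(PurePosixPath(mount) / rel)


def run_etl(master_url: str, in_rel: str, out_rel: str) -> None:
    """Read CSV from the shared mount, drop duplicate rows, write Parquet back."""
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master(master_url)  # e.g. a standalone master, "spark://<master-host>:7077"
        .appName("shared-mount-etl")
        .getOrCreate()
    )
    try:
        df = spark.read.option("header", "true").csv(shared_uri(in_rel))
        df.dropDuplicates().write.mode("overwrite").parquet(shared_uri(out_rel))
    finally:
        spark.stop()
```

A call such as `run_etl("spark://<master-host>:7077", "raw/events.csv", "clean/events")` would then run without any HDFS data nodes, at the cost of pushing all I/O through the shared mount.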
