apache-spark - Spark and InfiniBand

Tags: apache-spark hpc infiniband

Spark and InfiniBand

I am trying to use Spark on an HPC-focused cluster that has InfiniBand interconnects. The cluster does not support IPoIB. I have seen the Spark-RDMA project from Ohio State University, but I cannot find anyone else working on this, nor any indication of whether Apache Spark will support IB in the future. Is there any other way to run a more recent version of Spark in an HPC environment where IB is the only network?

Answer

You can check the reference guide for deploying RDMA over Converged Ethernet (RoCE) to accelerate Apache Spark 2.2.0 over a Mellanox 100 GbE network: https://community.mellanox.com/docs/DOC-3068
