# Something we have to know (maybe)

[TOC]

## NCCL

* [NVIDIA](https://developer.nvidia.com/nccl)
* [introduction](https://on-demand.gputechconf.com/gtc/2018/video/S8462/)

### [Horovod](https://horovod.readthedocs.io/en/stable/)

* Horovod combines NCCL and MPI into a wrapper for distributed deep learning in frameworks such as TensorFlow.
* It can detect whether GPUDirect RDMA makes sense for the current hardware topology and uses it transparently.
* [on GitHub](https://github.com/horovod/horovod)

### MPI

* [Message Passing Interface](https://zh.wikipedia.org/wiki/%E8%A8%8A%E6%81%AF%E5%82%B3%E9%81%9E%E4%BB%8B%E9%9D%A2)
* The Message Passing Interface is an application programming interface (API) for parallel computing, commonly used to program non-shared-memory environments such as supercomputers and computer clusters.
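
The message-passing model above, in which each process owns its memory and data moves only through explicit messages, can be illustrated with a small simulation of the allreduce collective. This is a minimal sketch, not how NCCL, MPI, or Horovod implement it: it runs in a single Python process, the "ranks" are plain lists, and the name `ring_allreduce` is hypothetical. It assumes a ring schedule (reduce-scatter followed by allgather), which is one common way to organize an allreduce.

```python
def ring_allreduce(buffers):
    """Simulate a sum-allreduce over a ring of ranks (illustrative only).

    buffers[r] is the private vector owned by rank r. Ranks never read each
    other's memory directly; they only receive "messages" (copied chunks)
    from their left neighbour. Returns one result vector per rank.
    """
    n = len(buffers)
    length = len(buffers[0])
    assert all(len(b) == length for b in buffers), "all ranks need equal-sized buffers"

    data = [list(b) for b in buffers]  # copy so the caller's inputs stay untouched
    # Split each vector into n near-equal chunks: chunk c covers bounds[c]..bounds[c+1].
    bounds = [(i * length) // n for i in range(n + 1)]

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    def exchange(pick_chunk, combine):
        # Build every outgoing message first, so all sends in a step are
        # based on the state at the start of that step.
        messages = []
        for rank in range(n):
            c = pick_chunk(rank)
            messages.append((c, [data[rank][i] for i in chunk(c)]))
        # Deliver each message to the right-hand neighbour on the ring.
        for rank, (c, payload) in enumerate(messages):
            dst = (rank + 1) % n
            for i, value in zip(chunk(c), payload):
                data[dst][i] = combine(data[dst][i], value)

    # Phase 1, reduce-scatter: after n-1 steps, rank r holds the complete
    # sum for chunk (r + 1) % n.
    for step in range(n - 1):
        exchange(lambda r: (r - step) % n, lambda mine, theirs: mine + theirs)

    # Phase 2, allgather: circulate the finished chunks until every rank
    # has all of them.
    for step in range(n - 1):
        exchange(lambda r: (r + 1 - step) % n, lambda mine, theirs: theirs)

    return data


if __name__ == "__main__":
    per_rank = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
    for rank, result in enumerate(ring_allreduce(per_rank)):
        print(f"rank {rank}: {result}")
```

In real programs this operation is provided by the libraries themselves, for example `MPI_Allreduce` in MPI, `ncclAllReduce` in NCCL, and `hvd.allreduce` in Horovod; the simulation only shows the data movement pattern.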