I'm running my program on a cluster where each node has 2 GPUs, and each MPI task calls a CUDA function.
My question is: if two MPI processes are running on each node, will each CUDA function call be scheduled on a different GPU, or will they both run on the same one? And what if I run 4 MPI tasks on each node?
Each MPI task calls one CUDA function, which is scheduled on whichever GPU you choose. You select the GPU with cudaSetDevice(). In your case, since each node contains 2 GPUs, you can switch between them with cudaSetDevice(0) and cudaSetDevice(1). If you don't pick the GPU with cudaSetDevice() (typically by combining it with the MPI task rank), I believe both MPI tasks will run their CUDA functions serially on the same default GPU (device 0). Furthermore, if you run 3 or more MPI tasks per node, two or more tasks will unavoidably share a GPU, so their CUDA functions will contend for it and run serially.
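As a minimal sketch of that rank-based selection (not the poster's code, and assuming the MPI launcher places ranks so that ranks sharing a node get different values of rank modulo the device count), each rank could pick its device like this:

    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int deviceCount = 0;
        cudaGetDeviceCount(&deviceCount);

        /* Simple mapping: spread ranks over the GPUs visible on this node.
           This only works as intended if ranks sharing a node end up with
           different values of rank % deviceCount. */
        int device = rank % deviceCount;
        cudaSetDevice(device);

        printf("MPI rank %d using GPU %d of %d\n", rank, device, deviceCount);

        /* ... launch kernels on the selected device here ... */

        MPI_Finalize();
        return 0;
    }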
MPI and CUDA are basically orthogonal. You will have to explicitly manage MPI process-GPU affinity yourself. To do this, compute-exclusive mode is pretty much mandatory for each GPU. You can use a split communicator with coloring to enforce process-GPU affinity once each process has found a free device it can establish a context on.
Massimo Fatica from NVIDIA posted a useful code snippet on the NVIDIA forums a while ago that might get you started.
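That snippet isn't reproduced here, but as a hedged sketch of the same general idea (assuming an MPI-3 implementation, so MPI_Comm_split_type with MPI_COMM_TYPE_SHARED is available), each process can compute a node-local rank and map it to a device, with compute-exclusive mode catching any accidental sharing when the context is created:

    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* Split MPI_COMM_WORLD so that ranks sharing a node end up in the
           same communicator; the rank within that communicator is a
           node-local rank, independent of how ranks were placed globally. */
        MPI_Comm node_comm;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);

        int local_rank;
        MPI_Comm_rank(node_comm, &local_rank);

        int deviceCount = 0;
        cudaGetDeviceCount(&deviceCount);

        /* Map the node-local rank onto this node's GPUs. */
        int device = local_rank % deviceCount;
        cudaSetDevice(device);

        /* Force context creation now so that, with the GPUs in
           compute-exclusive mode, trying to claim an already-used device
           fails here rather than at the first kernel launch. */
        cudaError_t err = cudaFree(0);
        if (err != cudaSuccess) {
            fprintf(stderr, "rank %d: could not create context on GPU %d: %s\n",
                    world_rank, device, cudaGetErrorString(err));
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        printf("world rank %d -> node-local rank %d -> GPU %d\n",
               world_rank, local_rank, device);

        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }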