How to speed up this problem with MPI

(1) I would like to know how to use MPI to speed up the time-consuming computation in the loop of the following code.

    void f(int size);

    int main(int argc, char ** argv)
    {
        // some operations
        f(size);
        // some operations
        return 0;
    }

    void f(int size)
    {
        // some operations
        int i;
        double * array = new double [size];
        for (i = 0; i < size; i++) // how can I use MPI to speed up this loop to compute all elements in the array?
        {
            array[i] = complicated_computation(); // time-consuming computation
        }
        // some operations using all elements in array
        delete [] array;
    }

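As the code shows, I want to do some operations before and after the part that is parallelized with MPI, but I don't know how to specify where the parallel part begins and ends.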

(2) My current code uses OpenMP to speed up the computation.

    void f(int size)
    {
        // some operations
        int i;
        double * array = new double [size];
        omp_set_num_threads(_nb_threads);
        #pragma omp parallel shared(array) private(i)
        {
            #pragma omp for schedule(dynamic) nowait
            for (i = 0; i < size; i++) // how can I use MPI to speed up this loop to compute all elements in the array?
            {
                array[i] = complicated_computation(); // time-consuming computation
            }
        }
        // some operations using all elements in array
    }

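If I switch to MPI, is it possible to write the code so that it works with both OpenMP and MPI? If so, how do I write the code, and how do I compile and run it?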

(3) Our cluster has three versions of MPI: mvapich-1.0.1, mvapich2-1.0.3, and openmpi-1.2.6. Are they used the same way, especially in my case? Which one is best for me to use?

Thanks and regards!

UPDATE:

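I would like to explain a bit more about my question on how to specify the start and end of the parallel part. In the following toy code, I want to limit the parallel part to the function f():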

    #include "mpi.h"
    #include <stdio.h>
    #include <string.h>

    void f();

    int main(int argc, char **argv)
    {
        printf("%s\n", "Start running!");
        f();
        printf("%s\n", "End running!");
        return 0;
    }

    void f()
    {
        char idstr[32];
        char buff[128];
        int numprocs;
        int myid;
        int i;
        MPI_Status stat;

        printf("Entering function f().\n");
        MPI_Init(NULL, NULL);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        if (myid == 0)
        {
            printf("WE have %d processors\n", numprocs);
            for (i = 1; i < numprocs; i++)
            {
                sprintf(buff, "Hello %d", i);
                MPI_Send(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD);
            }
            for (i = 1; i < numprocs; i++)
            {
                MPI_Recv(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);
                printf("%s\n", buff);
            }
        }
        else
        {
            MPI_Recv(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);
            sprintf(idstr, " Processor %d ", myid);
            strcat(buff, idstr);
            strcat(buff, "reporting for duty\n");
            MPI_Send(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        MPI_Finalize();
        printf("Leaving function f().\n");
    }

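However, the output is not what I expected. The printf calls before and after the parallel part were executed by every process, not just the main process: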

    $ mpirun -np 3 ex2
    Start running!
    Entering function f().
    Start running!
    Entering function f().
    Start running!
    Entering function f().
    WE have 3 processors
    Hello 1 Processor 1 reporting for duty
    Hello 2 Processor 2 reporting for duty
    Leaving function f().
    End running!
    Leaving function f().
    End running!
    Leaving function f().
    End running!

So it seems to me that the parallel part is not limited to the region between MPI_Init() and MPI_Finalize().

Besides this, I am still hoping someone can answer my other questions. Thanks!

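Quick edit (because I either can't figure out how to leave comments, or I'm not allowed to post comments yet): 3lectrologos is incorrect about the parallel part of MPI programs. You cannot do serial work before MPI_Init and after MPI_Finalize and expect it to actually be serial. It will still be executed by all MPI processes.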

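I think part of the issue is that the "parallel part" of an MPI program is the entire program. MPI starts executing the same program (your main function) on every node you specify at approximately the same time. The MPI_Init call just sets certain things up for the program so that it can use the MPI calls correctly.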

The correct "template" (in pseudo-code) for what I think you want to do would be:

    #include <stdio.h>
    #include <mpi.h>

    /* PreParallelWork, ParallelWork and PostParallelWork are placeholders
       for your own functions. */
    int main(int argc, char *argv[])
    {
        int numprocs, myid;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        if (myid == 0)
        {
            // Do the serial part on a single MPI process
            printf("Performing serial computation on cpu %d\n", myid);
            PreParallelWork();
        }

        ParallelWork(); // Every MPI process will run the parallel work

        if (myid == 0)
        {
            // Do the final serial part on a single MPI process
            printf("Performing the final serial computation on cpu %d\n", myid);
            PostParallelWork();
        }

        MPI_Finalize();
        return 0;
    }

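MPI_Init (called with &argc and &argv, which some MPI implementations require) should be the first statement executed in main. MPI_Finalize should be the last statement executed.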

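main() will be started on every node in the MPI environment. Parameters such as the number of nodes, the node_id, and the master node address may be passed through argc and argv.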

Here is the framework:

    #include "mpi.h"
    #include <stdio.h>
    #include <string.h>

    void f();

    int numprocs;
    int myid;

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        if (myid == 0)
        {
            /* main process. user interaction is ONLY HERE */
            printf("%s\n", "Start running!");
            MPI_Send ... requests with job
            /* may call f in main too */
            MPI_Recv ... results..
            printf("%s\n", "End running!");
        }
        else
        {
            /* Workers. Sit here and wait for a job from the main process */
            MPI_Recv(.input..);
            /* dispatch input by parsing it (if there can be different types of work) or just do the work */
            f(..);
            MPI_Send(.results..);
        }
        MPI_Finalize();
        return 0;
    }

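If all the values in the array are independent, then the loop should be trivially parallelizable. Split the array into chunks of roughly equal size, assign each chunk to a node, and then gather the results back together.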
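
The chunking step can be sketched as a small index helper. This is only a minimal sketch, not code from the thread: the name chunk_bounds and its parameters are hypothetical, and it assumes a contiguous block split in which the first (size % nprocs) ranks take one extra element, so that sizes not divisible by the process count are still fully covered.

```c
#include <assert.h>

/* Hypothetical helper: compute the half-open index range [*start, *end)
   owned by rank `rank` when `size` elements are split into `nprocs`
   contiguous chunks. The first (size % nprocs) ranks get one extra element. */
void chunk_bounds(int size, int nprocs, int rank, int *start, int *end)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs && size >= 0);

    int base = size / nprocs;   /* minimum chunk length */
    int extra = size % nprocs;  /* number of ranks that get one more */

    /* Ranks before this one contribute `base` each, plus one extra
       element for every earlier rank that is among the first `extra`. */
    *start = rank * base + (rank < extra ? rank : extra);
    *end = *start + base + (rank < extra ? 1 : 0);
}
```

Each process would call chunk_bounds(size, numprocs, myid, &start, &end) and loop only over i in [start, end). With size = 10 and three processes the ranges come out as [0, 4), [4, 7) and [7, 10).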

The easiest migration from OpenMP to a cluster may be "Cluster OpenMP" from Intel.

For MPI, you need to completely rewrite how the work is dispatched.
