如果可能的话,如何进行fftw3 MPI“转置”2D变换?

考虑从复杂数组src到实数数组tgt的formsL x M(列主要设置)的2D变换。 或者,在Fortranese,

complex(C_DOUBLE_COMPLEX), pointer :: src(:,:) real(8), pointer :: tgt(:,:) . 

相应的指针是

 type(C_PTR) :: csrc,ctgt . 

我会按以下方式分配它们:

  ! The complex array first alloc_local = fftw_mpi_local_size_2d(M,L/2+1,MPI_COMM_WORLD,local_M,local_offset1) csrc = fftw_alloc_complex(alloc_local) call c_f_pointer(csrc, src, [L/2,local_M]) ! Now the real array alloc_local = fftw_mpi_local_size_2d(2*(L/2+1),M, & MPI_COMM_WORLD,local_L,local_offset2) ctgt = fftw_alloc_real(alloc_local) call c_f_pointer(ctgt, tgt, [M,local_L]) 

现在,该计划将创建为:

 ! Create c-->r transform with one transposition left out plan = fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & ior(FFTW_MEASURE,FFTW_MPI_TRANSPOSED_OUT)) 

最后,转换将执行为:

 call fftw_mpi_execute_dft_c2r(plan, src, tgt) 

但是,这个处方不起作用。 最后一次调用会导致分段错误。 起初,我认为这可能与我如何分配srctgt数组有关,但是使用分配给tgt的不同内存量并没有给出任何结果。 所以,我要么做一些非常愚蠢的事情,要么根本不可能这样做。

编辑:MINIMALISTIC可编译示例

 program trashingfftw use, intrinsic :: iso_c_binding use MPI implicit none include 'fftw3-mpi.f03' integer(C_INTPTR_T), parameter :: L = 256 integer(C_INTPTR_T), parameter :: M = 256 type(C_PTR) :: plan, ctgt, csrc complex(C_DOUBLE_COMPLEX), pointer :: src(:,:) real(8), pointer :: tgt(:,:) integer(C_INTPTR_T) :: alloc_local, local_M, & & local_L,local_offset1,local_offset2 integer :: ierr,id call mpi_init(ierr) call mpi_comm_rank(MPI_COMM_WORLD,id,ierr) call fftw_mpi_init() alloc_local = fftw_mpi_local_size_2d(M,L/2+1, MPI_COMM_WORLD, & local_M, local_offset1) csrc = fftw_alloc_complex(alloc_local) call c_f_pointer(csrc, src, [L/2,local_M]) alloc_local = fftw_mpi_local_size_2d(2*(L/2+1),M, MPI_COMM_WORLD, & & local_L, local_offset2) ctgt = fftw_alloc_real(alloc_local) call c_f_pointer(ctgt, tgt, [M,local_L]) plan = fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT)) call fftw_mpi_execute_dft_c2r(plan, src, tgt) call mpi_finalize(ierr) end program trashingfftw 

答案是:

对于mpi真实变换,只有两种允许的变换和方向组合:

  • 实数到复数变换和FFTW_MPI_TRANSPOSED_OUT
  • 复数到实数变换和FFTW_MPI_TRANSPOSED_IN

我在挖掘fftw3 ver时发现了这个。 3.3.4代码,文件“rdft2-problem.c”,注释行120。

编辑:

最小可编辑工作示例:

 program trashingfftw use, intrinsic :: iso_c_binding use MPI implicit none include 'fftw3-mpi.f03' integer(C_INTPTR_T), parameter :: L = 256 integer(C_INTPTR_T), parameter :: M = 256 type(C_PTR) :: plan, ctgt, csrc complex(C_DOUBLE_COMPLEX), pointer :: src(:,:) real(8), pointer :: tgt(:,:) integer(C_INTPTR_T) :: alloc_local, local_M, & & local_L,local_offset1,local_offset2 integer :: ierr,id call mpi_init(ierr) call mpi_comm_rank(MPI_COMM_WORLD,id,ierr) call fftw_mpi_init() alloc_local = fftw_mpi_local_size_2d(L/2+1,M, MPI_COMM_WORLD, & local_l, local_offset1) print *, id, "alloc complex=",alloc_local, local_l csrc = fftw_alloc_complex(alloc_local) call c_f_pointer(csrc, src, [M,local_l]) !Caveat: Must partition the real storage according to complex layout, this is why ! I am using M and L/2+1 instead of M, 2*(L/2+1) as it was done in the original post alloc_local = fftw_mpi_local_size_2d(M,L/2+1, MPI_COMM_WORLD, & & local_M, local_offset2) print *, id, "alloc real=",alloc_local, local_m ! Two reals per complex ctgt = fftw_alloc_real(2*alloc_local) ! Only the first L are relevant, the rest is just dangling space (see fftw3 docs) !caveat: since the padding is in the first index, the 2d data is laid out non-contiguously !(L sensible reals, padding, padding, L sensible reals, padding, padding, ....) call c_f_pointer(ctgt, tgt, [2*(L/2+1),local_m]) plan = fftw_mpi_plan_dft_c2r_2d(M,L,src,tgt, MPI_COMM_WORLD, & ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_IN)) ! Should be non-null print *, 'plan:', plan src(3,2)=(1.,0) call fftw_mpi_execute_dft_c2r(plan, src, tgt) call mpi_finalize(ierr) end program thrashingfftw