CUDA extern texture declaration

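I want to declare my texture once and use it in all of my kernels and files. To do that, I declare it as extern in a header and include that header in every other file (following the SO question "How do I use extern to share variables between source files?").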

I have a header file, cudaHeader.cuh, that contains my texture:

    extern texture<uchar4, 2, cudaReadModeElementType> texImage;

In file1.cu, I allocate my CUDA array and bind it to the texture:

    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<uchar4>();

    cudaStatus = cudaMallocArray(&cu_array_image, &channelDesc, width, height);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMallocArray failed! cu_array_image couldn't be created.\n");
        return cudaStatus;
    }

    cudaStatus = cudaMemcpyToArray(cu_array_image, 0, 0, image, size_image, cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToArray failed! Copy from the host memory to the device texture memory failed.\n");
        return cudaStatus;
    }

    // set texture parameters
    texImage.addressMode[0] = cudaAddressModeWrap;
    texImage.addressMode[1] = cudaAddressModeWrap;
    texImage.filterMode = cudaFilterModePoint;
    texImage.normalized = false; // access with unnormalized (texel) coordinates

    // Bind the array to the texture
    cudaStatus = cudaBindTextureToArray(texImage, cu_array_image, channelDesc);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaBindTextureToArray failed! cu_array couldn't be bind to texImage.\n");
        return cudaStatus;
    }

In file2.cu, I use the texture in the kernel function as follows:

    __global__ void kernel(int width, int height, unsigned char *dev_image)
    {
        int x = blockIdx.x*blockDim.x + threadIdx.x;
        int y = blockIdx.y*blockDim.y + threadIdx.y;

        // guard both dimensions so threads outside the image do not write
        if (x < width && y < height)
        {
            uchar4 tempcolor = tex2D(texImage, x, y);

            //if(tempcolor.x==0)
            //    printf("tempcolor.x %d \n", tempcolor.x);

            // write the pixel as 3 interleaved bytes
            dev_image[y*width*3+x*3]   = tempcolor.x;
            dev_image[y*width*3+x*3+1] = tempcolor.y;
            dev_image[y*width*3+x*3+2] = tempcolor.z;
        }
    }

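The problem is that when I use the texture in file2.cu, it contains no values or corrupted values. Even if I use the kernel function directly in file1.cu, the data is incorrect.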

If I add texture<uchar4, 2, cudaReadModeElementType> texImage; to both file1.cu and file2.cu, the compiler reports a redefinition.

EDIT:

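I tried the same thing with CUDA 5.0, but the same problem occurred. If I print the address of texImage in file1.cu and in file2.cu, I do not get the same address. There must be a problem with the declaration of the variable texImage.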

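This is a very old question, and the answer was provided in the comments by talonmies and Tom. Before CUDA 5.0, extern textures were not feasible because there was no true device linker and therefore no way to get extern linkage. Consequently, as Tom said,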

"you can have different compilation units, but they cannot reference each other"

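From CUDA 5.0 onward, extern textures are possible, and I want to provide a simple example below showing this, in the hope that it is useful to other users. Note that cross-file references like this rely on separate compilation, so the files have to be built with relocatable device code enabled (nvcc -rdc=true).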

kernel.cu compilation unit

    #include <stdio.h>
    #include <stdlib.h>

    // The one and only definition of the texture reference
    texture<int, 1, cudaReadModeElementType> texture_test;

    /********************/
    /* CUDA ERROR CHECK */
    /********************/
    #define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
    inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
    {
        if (code != cudaSuccess)
        {
            fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
            if (abort) exit(code);
        }
    }

    /*************************/
    /* LOCAL KERNEL FUNCTION */
    /*************************/
    __global__ void kernel1() {

        printf("ThreadID = %i; Texture value = %i\n", threadIdx.x, tex1Dfetch(texture_test, threadIdx.x));

    }

    // Defined in kernel2.cu
    __global__ void kernel2();

    /********/
    /* MAIN */
    /********/
    int main() {

        const int N = 16;

        // --- Host data allocation and initialization
        int *h_data = (int*)malloc(N * sizeof(int));
        for (int i=0; i<N; i++) h_data[i] = i;   // simple test pattern: each element holds its index

        // --- Host->device memory transfer
        int *d_data; gpuErrchk(cudaMalloc((void**)&d_data, N * sizeof(int)));
        gpuErrchk(cudaMemcpy(d_data, h_data, N * sizeof(int), cudaMemcpyHostToDevice));

        // --- Bind the linear device memory to the texture reference
        gpuErrchk(cudaBindTexture(NULL, texture_test, d_data, N * sizeof(int)));

        kernel1<<<1, 16>>>();
        gpuErrchk(cudaPeekAtLastError());
        gpuErrchk(cudaDeviceSynchronize());

        kernel2<<<1, 16>>>();
        gpuErrchk(cudaPeekAtLastError());
        gpuErrchk(cudaDeviceSynchronize());

        gpuErrchk(cudaUnbindTexture(texture_test));

    }

kernel2.cu compilation unit

    #include <stdio.h>

    // Declaration only: the definition lives in kernel.cu
    extern texture<int, 1, cudaReadModeElementType> texture_test;

    /**********************************************/
    /* DIFFERENT COMPILATION UNIT KERNEL FUNCTION */
    /**********************************************/
    __global__ void kernel2() {

        printf("Texture value = %i\n", tex1Dfetch(texture_test, threadIdx.x));

    }
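As a side note, texture references such as texture_test were later deprecated and then removed in CUDA 12, so the example above only builds with older toolkits. What follows is a minimal sketch of a different technique, texture objects, which sidesteps the extern question because the texture is passed to the kernel as an ordinary argument. The names copyFromTexture and readThroughTexture are hypothetical helpers invented for this illustration and do not appear in the original post; the sketch also assumes n > 0 and a device that supports texture objects (compute capability 3.0 or later).

```cuda
#include <cstring>
#include <cuda_runtime.h>

// This kernel could live in any .cu file: the texture arrives as a plain
// argument, so no extern declaration and no -rdc=true are needed for it.
__global__ void copyFromTexture(cudaTextureObject_t tex, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<int>(tex, i);
}

// Hypothetical helper: uploads h_in, reads it back through a texture object
// into h_out, and returns the first CUDA error encountered (if any).
cudaError_t readThroughTexture(const int *h_in, int *h_out, int n)
{
    const size_t bytes = n * sizeof(int);
    int *d_in = NULL;
    int *d_out = NULL;
    cudaTextureObject_t tex = 0;
    bool texCreated = false;

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaError_t err = cudaMalloc((void**)&d_in, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&d_out, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // Describe the linear device buffer that backs the texture
        resDesc.resType = cudaResourceTypeLinear;
        resDesc.res.linear.devPtr = d_in;
        resDesc.res.linear.desc = cudaCreateChannelDesc<int>();
        resDesc.res.linear.sizeInBytes = bytes;
        err = cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL);
        texCreated = (err == cudaSuccess);
    }

    if (err == cudaSuccess) {
        const int threads = 128;
        copyFromTexture<<<(n + threads - 1) / threads, threads>>>(tex, d_out, n);
        err = cudaGetLastError();
    }

    // This blocking copy also waits for the kernel to finish
    if (err == cudaSuccess) err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    if (texCreated) cudaDestroyTextureObject(tex);
    cudaFree(d_out);
    cudaFree(d_in);
    return err;
}
```

Calling readThroughTexture(h_data, h_result, N) from a main function should leave h_result equal to h_data. Because the handle is just a value, kernels in other compilation units can receive the same cudaTextureObject_t without sharing any global symbol.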
