CUDA function pointers

I am trying to do something like this in CUDA (actually, I need to write some integration functions).


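I tried the code below, but it did not work. It only produced this error:

Error: Function pointers and function template parameters are not supported in sm_1x.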

    float f1(float x) { return x; }

    __global__ void tabulate(float lower, float upper, float p_function(float), float*result){
        for (lower; lower < upper; lower++) {
            *result = *result + p_function(lower);
        }
    }

    int main(){
        float res;
        float* dev_res;
        cudaMalloc( (void**)&dev_res, sizeof(float) ) ;
        tabulate<<<...>>>(0.0, 5.0, f1, dev_res);
        cudaMemcpy(&res, dev_res, sizeof(float), cudaMemcpyDeviceToHost ) ;
        printf("%f\n", res );
        /************************************************************************/
        scanf("%s");
        return 0;
    }

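To get rid of the compile error, you have to pass -gencode arch=compute_20,code=sm_20 as a compiler argument when compiling your code. However, you will then probably run into some runtime problems.

Taken from the CUDA Programming Guide http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#functions

Function pointers to __global__ functions are supported in host code, but not in device code. Function pointers to __device__ functions are only supported in device code compiled for devices of compute capability 2.x and higher.

It is not allowed to take the address of a __device__ function in host code.

So you can have something like this (adapted from the "FunctionPointers" sample):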

    //your function pointer type - returns unsigned char, takes parameters of type unsigned char and float
    typedef unsigned char(*pointFunction_t)(unsigned char, float);

    //some device function to be pointed to
    __device__ unsigned char Threshold(unsigned char in, float thresh)
    {
        ...
    }

    //pComputeThreshold is a device-side function pointer to your __device__ function
    __device__ pointFunction_t pComputeThreshold = Threshold;

    //the host-side function pointer to your __device__ function
    pointFunction_t h_pointFunction;

    //in host code: copy the function pointers to their host equivalent
    cudaMemcpyFromSymbol(&h_pointFunction, pComputeThreshold, sizeof(pointFunction_t));

You can then pass h_pointFunction as a parameter to your kernel, which can use it to call your __device__ function.

    //your kernel taking your __device__ function pointer as a parameter
    __global__ void kernel(pointFunction_t pPointOperation)
    {
        unsigned char tmp;
        ...
        tmp = (*pPointOperation)(tmp, 150.0)
        ...
    }

    //invoke the kernel in host code, passing in your host-side __device__ function pointer
    kernel<<<...>>>(h_pointFunction);

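Hopefully that makes some sense. In short, it looks like you have to change your f1 function into a __device__ function and follow a similar procedure (the typedefs aren't necessary, but they do make the code nicer) to obtain a valid function pointer on the host side that you can pass to your kernel. I'd also advise taking a close look at the FunctionPointers CUDA sample.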
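As a rough illustration, the procedure above applied to the f1 and tabulate functions from the question might look like the sketch below. The names func_t, d_f1 and h_f1 are made up for this example, the <<<1, 1>>> launch configuration is an assumption, and the kernel accumulates into a local variable because the original code never initialises *result. It assumes a device of compute capability 2.0 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical typedef: pointer to a device function taking and returning float.
typedef float (*func_t)(float);

// f1 is now a __device__ function, so its address is meaningful on the device.
__device__ float f1(float x) { return x; }

// Device-side function pointer, initialised in device code
// (taking the address of f1 in host code is not allowed).
__device__ func_t d_f1 = f1;

// Sums f(x) for x = lower, lower + 1, ... while x < upper.
__global__ void tabulate(float lower, float upper, func_t f, float *result)
{
    float sum = 0.0f;
    for (float x = lower; x < upper; x += 1.0f) {
        sum += f(x);
    }
    *result = sum;
}

int main()
{
    // Copy the device function pointer value into a host variable.
    func_t h_f1 = nullptr;
    cudaError_t err = cudaMemcpyFromSymbol(&h_f1, d_f1, sizeof(func_t));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyFromSymbol: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float *dev_res = nullptr;
    err = cudaMalloc((void **)&dev_res, sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Assumed launch configuration: a single thread is enough for this sketch.
    tabulate<<<1, 1>>>(0.0f, 5.0f, h_f1, dev_res);

    // cudaMemcpy waits for the kernel and also reports launch/runtime errors.
    float res = 0.0f;
    err = cudaMemcpy(&res, dev_res, sizeof(float), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel or cudaMemcpy: %s\n", cudaGetErrorString(err));
        cudaFree(dev_res);
        return 1;
    }

    printf("%f\n", res);  // f1(0) + f1(1) + ... + f1(4) = 10
    cudaFree(dev_res);
    return 0;
}
```

The host never dereferences h_f1; it only carries the address obtained from the device symbol back into the kernel launch, which is why the copy through cudaMemcpyFromSymbol is needed.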

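Even if you can get this code to compile (see @Robert Crovella's answer), it will not work. You cannot pass function pointers from host code, because the host compiler has no way of determining the function's address.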
