Tag: Thrust

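Thrust – how to use my arrays/data – model: I am new to CUDA and want to do some array operations, but I have not found any similar example on the internet. I have two 2D arrays: a = { {1, 2, 3}, {4} } b = { {5}, {6, 7} } I want Thrust to compute this array: c = { {1, 2, 3, 5}, {1, 2, 3, 6, 7}, {1, 2, 3, 5}, {1, 2, 3, 6, 7} } I know how this works in C/C++, but not how to tell Thrust to do it. Here is my idea of how it might work: Thread 1: take a[0] -> expand it with b. Write to c. Thread 2: take a[1] -> expand it with b. Write to c. But I do not know how to do that. I could write the arrays a and b into 1D arrays, like: […]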

As an argument of a Thrust iterator (CUDA): I am trying to use CUDA Thrust iterators to implement an ODE solver routine that runs on the GPU, in order to solve a bunch of equations on the GPU. Going into the details, here is a small piece of code: #include […] __host__ __device__ float f(float x, float y) { return cos(y)*sin(x); } struct euler_functor { const float h; euler_functor(float _h) : h(_h) {}; __host__ __device__ float operator()( float(*f)(double,double),const float& x, const float& y) const { y += h * (*f)( x, […]
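
The excerpt above is cut off, so the following is only a minimal host-side C++ sketch of the explicit Euler update it appears to describe, y_new = y + h * f(x, y). The names euler_step and euler_advance are hypothetical and do not come from the question. Two assumptions are made: f is called directly rather than passed as a function pointer, and the functor returns the new value instead of assigning to a const reference. std::transform is used here; the binary overload of thrust::transform takes its arguments in the same order, so a device version could plausibly follow the same shape.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Right-hand side taken from the excerpt: dy/dx = cos(y) * sin(x).
inline float f(float x, float y) { return std::cos(y) * std::sin(x); }

// Hypothetical functor: one explicit Euler step of size h.
// It returns the updated y rather than modifying its (const) argument.
struct euler_step {
    float h;
    explicit euler_step(float h_) : h(h_) {}
    float operator()(float x, float y) const { return y + h * f(x, y); }
};

// Hypothetical helper: advance every pair (x[i], y[i]) by one step, in place.
// Assumes x and y have the same length.
inline void euler_advance(const std::vector<float>& x, std::vector<float>& y, float h) {
    std::transform(x.begin(), x.end(), y.begin(), y.begin(), euler_step(h));
}
```

Returning the new value keeps the functor free of side effects, which is what a transform-style algorithm expects.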

Efficiency of CUDA vector types (float2, float3, float4): I am trying to understand the integrate_functor in particles_kernel.cu from the CUDA samples: struct integrate_functor { float deltaTime; //constructor for functor //… template <typename Tuple> __device__ void operator()(Tuple t) { volatile float4 posData = thrust::get<0>(t); volatile float4 velData = thrust::get<1>(t); float3 pos = make_float3(posData.x, posData.y, posData.z); float3 vel = make_float3(velData.x, velData.y, velData.z); // update position and velocity // … // store new position and velocity thrust::get<0>(t) = make_float4(pos, […]

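Large integer addition with CUDA: I have been developing a cryptographic algorithm on the GPU and am currently stuck on an algorithm to perform large integer addition. Large integers are represented in the usual way as a bunch of 32-bit words. For example, we can use one thread to add two 32-bit words. For simplicity, assume that the numbers to be added have the same length and that the number of threads per block == the number of words. Then: __global__ void add_kernel(int *C, const int *A, const int *B) { int x = A[threadIdx.x]; int y = B[threadIdx.x]; int z = x + y; int carry = (z < x); /** do carry propagation in parallel somehow ? */ … z = z + newcarry; // update the resulting […]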
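
One common way to approach the "carry propagation in parallel" step is to treat it as a prefix scan over per-word carry states (kill, propagate, generate), as in a carry-lookahead adder. The sketch below is a host-side C++ illustration of that idea, not the asker's code or an accepted answer. The names Carry, combine and add_words are hypothetical, the words are assumed to be stored least significant first, and uint32_t is used instead of int so that the overflow test (sum < a) is well defined. The scan is written as a sequential loop; because combine is associative, a parallel inclusive scan could take its place on the GPU.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Carry behaviour of one word position:
//   Kill      - never produces a carry out
//   Propagate - carry out equals carry in (word sum is 0xFFFFFFFF)
//   Generate  - always produces a carry out (word sum overflowed)
enum class Carry : std::uint8_t { Kill, Propagate, Generate };

// Associative operator for the scan: 'lo' summarises the less significant
// words, 'hi' the more significant ones.
inline Carry combine(Carry lo, Carry hi) {
    return hi == Carry::Propagate ? lo : hi;
}

// Adds two equal-length numbers stored as 32-bit words, least significant first.
inline std::vector<std::uint32_t> add_words(const std::vector<std::uint32_t>& a,
                                            const std::vector<std::uint32_t>& b,
                                            bool& carry_out) {
    const std::size_t n = a.size();
    std::vector<std::uint32_t> sum(n);
    std::vector<Carry> state(n);

    // Step 1: independent per word (one thread per word on a GPU).
    for (std::size_t i = 0; i < n; ++i) {
        sum[i] = a[i] + b[i];  // wraps modulo 2^32
        if (sum[i] < a[i])               state[i] = Carry::Generate;
        else if (sum[i] == 0xFFFFFFFFu)  state[i] = Carry::Propagate;
        else                             state[i] = Carry::Kill;
    }

    // Step 2: inclusive scan with 'combine' (sequential here for clarity).
    for (std::size_t i = 1; i < n; ++i) {
        state[i] = combine(state[i - 1], state[i]);
    }

    // Step 3: word i receives a carry iff the scanned state of word i-1 is Generate.
    for (std::size_t i = 1; i < n; ++i) {
        if (state[i - 1] == Carry::Generate) sum[i] += 1u;
    }

    carry_out = (n > 0) && (state[n - 1] == Carry::Generate);
    return sum;
}
```

For example, adding the words {0xFFFFFFFF, 0xFFFFFFFF, 0} and {1, 0, 0} yields {0, 0, 1}: the first word generates a carry, the second propagates it, and the third absorbs it.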