Have you passed an unsigned char* argument to your kernel, cast it to a different type inside the kernel, and now get CL_OUT_OF_RESOURCES whenever you try to map the buffer into user space after the kernel has (allegedly) executed successfully? Make sure your cast preserves the address space! For instance, instead of doing:
float3 albedo = vload3(0, (float*) input_data);
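..you should spell out the address space in the cast. In OpenCL C 1.x, a pointer type written without a qualifier defaults to __private, so the cast above silently drops __global. Write it like this instead: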
float3 albedo = vload3(0, (__global float*) input_data);
Sadly, NVIDIA drivers will not tell you where the real problem is; they just fail with an error as soon as you try to look at the result buffer data.
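For context, here is a minimal sketch of how the corrected cast might sit inside a complete kernel. The kernel name (shade), the output argument, and the one-float3-per-work-item layout are assumptions made for illustration; only input_data and the vload3 call come from the snippets above.

```c
// Hypothetical kernel: "shade" and "output" are illustrative names,
// not taken from the original code.
__kernel void shade(__global const unsigned char* input_data,
                    __global float*               output)
{
    const size_t i = get_global_id(0);

    // Reinterpret the byte buffer as floats while KEEPING the __global
    // qualifier. A bare (float*) cast would name a __private pointer.
    __global const float* texels = (__global const float*) input_data;

    // vload3(i, p) reads p[3*i], p[3*i + 1] and p[3*i + 2].
    float3 albedo = vload3(i, texels);

    // Write the value back out so the host has something to map and inspect.
    vstore3(albedo, i, output);
}
```

Doing the cast once into a named, qualified pointer, rather than inline at every load, leaves a single place where the address space can go wrong.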