The current execution time is too slow for any practical use case. Why don't we simply plug in the functions already available in NVIDIA's NPP library? That wouldn't mean porting anything to CUDA ourselves, just calling existing functions. Is anyone working on a GPU port?