
NVIDIA's NVCC has performance and compile-time issues with TensorFlow.[1]

In NVCC vs GPUCC benchmarks, NVCC comes out 8% - 250% slower at compilation and 3.7% - 51% slower at runtime.[2]

Google uses GPUCC internally, so they weren't optimising for NVCC.

The LLVM-based GPUCC is the first fully open-source toolchain for CUDA.

Google announced that the guts of GPUCC will make their way into Clang.

[1] https://plus.google.com/+VincentVanhoucke/posts/6RQmgqcmx2d
[2] http://research.google.com/pubs/pub45226.html



