WSL2 & CUDA does not work [v20226] #6014

noofaq · 2020-10-01T14:26:38Z

Environment

Windows build number: 10.0.20226.0
Your Distribution version: 18.04 / 20.04
Whether the issue is on WSL 2 and/or WSL 1: Linux version 4.19.128-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Tue Jun 23 12:58:10 UTC 2020

Steps to reproduce

Exactly followed instructions available at https://docs.nvidia.com/cuda/wsl-user-guide/index.html
Tested on a previously working Ubuntu WSL image (IIRC the GPU last worked on 20206, then WSL2 as a whole stopped working).
Also tested on newly created Ubuntu 18.04 and Ubuntu 20.04 images.

I have tested CUDA compatible NVIDIA drivers 455.41 & 460.20. I have tried removing all drivers etc.
I have also tested using CUDA 10.2 & CUDA 11.0.

It was tested on two separate machines (one Intel + GTX1060, other Ryzen + RTX 2080Ti)

The issue was tested directly in the OS and also in Docker containers running inside it.

Example (directly in Ubuntu):

piotr@DESKTOP-FS6J3NT:/usr/local/cuda/samples/4_Finance/BlackScholes$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Turing" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
CUDA error at BlackScholes.cu:116 code=46(cudaErrorDevicesUnavailable) "cudaMalloc((void **)&d_CallResult, OPT_SZ)"

Example in container:

piotr@DESKTOP-FS6J3NT:/mnt/c/Users/pppnn$ docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter python
Python 3.6.9 (default, Nov  7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-10-01 14:18:07.538627: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-10-01 14:18:07.624188: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-10-01 14:18:32.359457: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-10-01 14:18:32.398949: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3200035000 Hz
2020-10-01 14:18:32.402692: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3d06b70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-01 14:18:32.402748: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-10-01 14:18:32.409370: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-10-01 14:18:32.877228: W tensorflow/compiler/xla/service/platform_util.cc:276] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2020-10-01 14:18:32.877370: I tensorflow/compiler/jit/xla_gpu_device.cc:136] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2020-10-01 14:18:32.879904: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:32.880192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:1d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.665GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2020-10-01 14:18:32.880277: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-01 14:18:32.880340: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-01 14:18:32.959947: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-01 14:18:32.973554: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-01 14:18:33.111736: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-01 14:18:33.127902: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-01 14:18:33.128018: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-01 14:18:33.128535: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:33.129170: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:33.129403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-10-01 14:18:33.131671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/test_util.py", line 1513, in is_gpu_available
    for local_device in device_lib.list_local_devices():
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/device_lib.py", line 43, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
>>>
>>>
>>>
>>>
>>> tf.config.list_physical_devices('GPU')
2020-10-01 14:18:55.610151: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:55.610510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:1d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.665GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2020-10-01 14:18:55.610579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-01 14:18:55.610623: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-01 14:18:55.610676: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-01 14:18:55.610719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-01 14:18:55.610762: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-01 14:18:55.610805: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-01 14:18:55.610846: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-01 14:18:55.611251: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:55.611765: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:18:55.611999: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>>
>>>
>>>
>>> tf.test.gpu_device_name()
2020-10-01 14:20:08.762060: W tensorflow/compiler/xla/service/platform_util.cc:276] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2020-10-01 14:20:08.762222: I tensorflow/compiler/jit/xla_gpu_device.cc:136] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2020-10-01 14:20:08.762863: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:20:08.763201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:1d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.665GHz coreCount: 68 deviceMemorySize: 11.00GiB deviceMemoryBandwidth: 573.69GiB/s
2020-10-01 14:20:08.763263: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-01 14:20:08.763316: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-01 14:20:08.763358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-01 14:20:08.763379: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-01 14:20:08.763428: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-01 14:20:08.763480: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-01 14:20:08.763533: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-01 14:20:08.763898: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:20:08.764536: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:1d:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-10-01 14:20:08.764810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/test_util.py", line 112, in gpu_device_name
    for x in device_lib.list_local_devices():
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/device_lib.py", line 43, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
>>>

Expected behavior

CUDA working inside WSL2

Actual behavior

All tests that use CUDA inside WSL Ubuntu fail with various CUDA errors, mostly reporting that no CUDA devices are available.
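The log above shows device enumeration succeeding while `cuDevicePrimaryCtxRetain` fails. A minimal sketch for reproducing that split without TensorFlow is to call the CUDA driver API directly through `ctypes`. It assumes `libcuda.so.1` is loadable by the dynamic linker inside WSL; the function name `probe_cuda_driver` is hypothetical and only for illustration.

```python
import ctypes


def probe_cuda_driver(lib_name="libcuda.so.1"):
    """Call a few CUDA driver API entry points and record each result code.

    Returns a list of (step, result) tuples; a result of 0 is CUDA_SUCCESS.
    """
    steps = []
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError as exc:
        # The library is not on the loader path (assumption: it should be in WSL).
        return [("load", "failed: {}".format(exc))]
    steps.append(("load", "ok"))

    rc = cuda.cuInit(0)
    steps.append(("cuInit", rc))
    if rc != 0:
        return steps

    count = ctypes.c_int(0)
    rc = cuda.cuDeviceGetCount(ctypes.byref(count))
    steps.append(("cuDeviceGetCount", rc))
    if rc != 0 or count.value == 0:
        return steps

    device = ctypes.c_int(0)
    rc = cuda.cuDeviceGet(ctypes.byref(device), 0)
    steps.append(("cuDeviceGet", rc))
    if rc != 0:
        return steps

    # This is the call that fails in the TensorFlow log above.
    context = ctypes.c_void_p()
    rc = cuda.cuDevicePrimaryCtxRetain(ctypes.byref(context), device)
    steps.append(("cuDevicePrimaryCtxRetain", rc))
    if rc == 0:
        cuda.cuDevicePrimaryCtxRelease(device)
    return steps
```

Printing the returned list shows the first step with a non-zero code. On an affected build one would expect the enumeration steps to return 0 and the last step to return an error, but that is an expectation drawn from the log, not something verified here.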


kunc · 2020-10-01T16:59:55Z

I am having the same issue. Everything was working flawlessly this morning, but then I updated from 20221.1000 to 20226.1000 and it no longer works (I tried reinstalling the NVIDIA drivers, etc.). The error says that all CUDA devices are busy or unavailable.

Edit:
After going back to version 20221, everything works again, which confirms that the new version caused the problem.

benhillis · 2020-10-01T17:17:27Z

Can you share the contents of c:\Windows\System32\lxss\lib?

dfreelan · 2020-10-01T17:32:56Z

Having same issue. Here's my C:\WINDOWS\System32\lxss\lib.

09/17/2020 01:24 PM 124,664 libcuda.so
09/17/2020 01:24 PM 124,664 libcuda.so.1
09/17/2020 01:24 PM 124,664 libcuda.so.1.1
09/17/2020 01:24 PM 40,980,456 libnvwgf2umx.so

CarbonPool · 2020-10-01T17:55:48Z

Oh, too bad, I also encountered this problem. I was so happy when WSL worked again in the 20226 version, but CUDA doesn't work. I was left out in the cold. I tried the following solutions, but none of them worked for me.

Reinstall the graphics card driver 460.20.
Recompile cuda dependent environment library.
Uninstall wsl2 and kernel program and reinstall.

benhillis · 2020-10-01T18:07:32Z

Interesting, you seem to be missing the libdxcore libraries.

dfreelan · 2020-10-01T19:00:27Z

I reverted my windows back to the previous version, then reinstalled the 20226 build, and now it looks like this:

09/17/2020 01:24 PM 124,664 libcuda.so
09/17/2020 01:24 PM 124,664 libcuda.so.1
09/17/2020 01:24 PM 124,664 libcuda.so.1.1
09/26/2020 03:32 PM 832,936 libd3d12.so
09/26/2020 03:32 PM 5,115,392 libd3d12core.so
09/26/2020 03:32 PM 25,074,040 libdirectml.so
09/26/2020 03:32 PM 878,768 libdxcore.so
09/17/2020 01:24 PM 40,980,456 libnvwgf2umx.so
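The difference between the two listings can be checked mechanically. The following is a rough sketch that compares a set of file names against the eight files in the second listing; treating that listing as the complete expected set is an assumption based only on this thread, and `missing_gpu_libs` is a hypothetical helper name.

```python
# File names copied from the second directory listing above.
# Assumption: this is the full set a working install should contain.
EXPECTED_WSL_GPU_LIBS = {
    "libcuda.so",
    "libcuda.so.1",
    "libcuda.so.1.1",
    "libd3d12.so",
    "libd3d12core.so",
    "libdirectml.so",
    "libdxcore.so",
    "libnvwgf2umx.so",
}


def missing_gpu_libs(present_names):
    """Return the expected library names absent from present_names, sorted."""
    return sorted(EXPECTED_WSL_GPU_LIBS - set(present_names))


# The first listing posted above had only four files.
first_listing = ["libcuda.so", "libcuda.so.1", "libcuda.so.1.1", "libnvwgf2umx.so"]
print(missing_gpu_libs(first_listing))
# ['libd3d12.so', 'libd3d12core.so', 'libdirectml.so', 'libdxcore.so']
```

On a real machine the input could come from `os.listdir()` on the lxss\lib directory. A complete file set only rules out the incomplete-install case; it does not show that CUDA works.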

adamfarquhar · 2020-10-01T21:30:38Z

I am having the same problem: Windows 10 build 20226 and Nvidia driver 460.20. It is great to see that it is not just my install. I hope that this can be fixed soon.

And now I can also confirm that it will work if you roll back to the previous build 20221. You can download the (old) iso file from Microsoft and re-install without losing any data.

jin8495 · 2020-10-01T23:47:22Z

Same problem here, Nvidia driver 460.20 and build 20226.

CarbonPool · 2020-10-02T04:34:19Z

Can you share the contents of c:\Windows\System32\lxss\lib?

geneing · 2020-10-02T06:33:44Z

I have the same problem Nvidia driver 460.15, build 20226. It worked with the previous insider build.

noofaq · 2020-10-02T06:53:12Z

Can you share the contents of c:\Windows\System32\lxss\lib?

Looked into previous Windows version folder too:

mitch-at-orika · 2020-10-02T06:56:16Z

Same problem: Nvidia driver 460.20 and build 20226. The contents of my lxss\lib are:

aticie · 2020-10-02T09:15:04Z

I have the same problem in 20226. My build also contains same 8 files in lxss\lib. But I get cudaErrorDevicesUnavailable.

Is there a way to roll back 20221? Using "Go back to previous version of Windows 10" sends me to 19041.508.

kunc · 2020-10-02T10:33:01Z

I have the same problem in 20226. My build also contains same 8 files in lxss\lib. But I get cudaErrorDevicesUnavailable.

Is there a way to roll back 20221? Using "Go back to previous version of Windows 10" sends me to 19041.508.

It worked for me. Are you sure you went to 20226 from 20221? I think it might store only the last version as a backup. The option is no longer available for me now that I have reset from 20226 to 20221.

adamfarquhar · 2020-10-02T11:37:43Z

I have the same problem in 20226. My build also contains same 8 files in lxss\lib. But I get cudaErrorDevicesUnavailable.

Is there a way to roll back 20221? Using "Go back to previous version of Windows 10" sends me to 19041.508.

Yes, you can install 20221 from https://www.microsoft.com/en-us/software-download/windowsinsiderpreviewadvanced

kivancguckiran · 2020-10-02T18:46:24Z

It seems that it is not possible to downgrade Windows without losing apps and files, which is not an option for me under these circumstances. Does anyone know another solution? Or do we wait for Microsoft to fix the problem?

I too have version 20226.

PRIMA-LAB-IPU · 2020-10-02T23:55:59Z

ChengyuSheu · 2020-10-03T07:02:23Z

Thanks, @adamfarquhar. Rolling back to version 20201 resolved this issue. Some settings were removed, but the files stayed.

lminer · 2020-10-03T18:40:28Z

Same problem.

WSL2 Ubuntu 20.04
driver version 460.20
Razer blade advanced 4K 2019
RTX 2080 Max Q
Windows insider 20226

Rolling back to the previous version fixes it. For people who want to do it without reinstalling, go to Recovery > restore previous version of Windows.

aisensiy · 2020-10-04T02:54:04Z

I had the "remote procedure call failed" error in the previous version, and I have this issue after upgrading. So... if I recover, does that mean I will get the "remote procedure call failed" error back? 😿

sirisian · 2020-10-04T06:18:25Z

@kivancguckiran I just joined the insider build so I'm in the same boat. It would probably take like 4 hours, but you could probably revert windows to the previous version (non-insider) maybe then go specifically to 20221. I'm not going to try it and just wait though.

strarsis · 2020-10-04T21:56:20Z

+1, same issue here.

Windows 10 Version 2004 (Build 19041.546)
NVIDIA Driver 460.20 (GameReady, from the NVIDIA CUDA on WSL driver page)
WSL 2
Ubuntu LTS 20.x
Linux version 4.19.128-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Tue Jun 23 12:58:10 UTC 2020

The kernel, driver and other versions are above the required minimum, so CUDA in WSL 2 should work.
However, when running the NVIDIA samples built with make, they always fail to run:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

cat /usr/local/cuda/version.txt
CUDA Version 11.0.228

bbongcol · 2020-10-05T02:50:20Z

I have the same problem in 20226.

WSL2 Ubuntu 18.04
Kernel Version 4.19.128
driver version 460.20
RTX 2060
Windows insider 20226
https://aka.ms/AA9utty

Cuda device query is ok.

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2060"
CUDA Driver Version / Runtime Version 11.2 / 10.0
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 6144 MBytes (6442450944 bytes)
(30) Multiprocessors, ( 64) CUDA Cores/MP: 1920 CUDA Cores
GPU Max Clock rate: 1200 MHz (1.20 GHz)
Memory Clock rate: 7001 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 3145728 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

But the CUDA utility does not work.

[./BlackScholes] - Starting...
GPU Device 0: "GeForce RTX 2060" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
CUDA error at BlackScholes.cu:116 code=46(cudaErrorDevicesUnavailable) "cudaMalloc((void **)&d_CallResult, OPT_SZ)"

Below is the strace log.
BlackScholes_cuda_error_log.zip
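Two runtime error codes come up in the sample output posted in this thread: 35 (cudaErrorInsufficientDriver) and 46 (cudaErrorDevicesUnavailable). As a rough triage aid, the sketch below pulls the code out of a sample's error line and attaches a note on where that code was reported here. The notes summarize comments in this thread and are not official diagnostics; `classify_cuda_error` is a hypothetical helper name.

```python
import re

# Notes summarizing where each code was reported in this thread (not official).
THREAD_NOTES = {
    35: "deviceQuery failed this way on stable build 19041",
    46: "BlackScholes failed this way on Insider build 20226",
}

# Matches the "code=46(cudaErrorDevicesUnavailable)" part of a sample's output.
_ERROR_PATTERN = re.compile(r"code=(\d+)\((\w+)\)")


def classify_cuda_error(line):
    """Return (code, name, note) for a CUDA sample error line, or None."""
    match = _ERROR_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    note = THREAD_NOTES.get(code, "not reported in this thread")
    return code, match.group(2), note


line = ('CUDA error at BlackScholes.cu:116 code=46(cudaErrorDevicesUnavailable) '
        '"cudaMalloc((void **)&d_CallResult, OPT_SZ)"')
print(classify_cuda_error(line))
```

The regular expression only targets the format printed by the CUDA samples' error helper, as seen above; other tools format errors differently.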

liamhan0905 · 2020-10-05T06:30:42Z

I also have tensorflow-gpu on WSL2. But I'm getting the error message as shown below.

RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable

Following this link resolved the issue for me! It seems like my issue was also the Windows10 Insider Previews... smh. Simply following "Roll Back Soon After Enabling Insider Previews" section solved it for me (current version: 10.0.20221 Build 20221) and now I can train my model again using tensorflow-gpu. Thank you everyone for the help!

onomatopellan · 2020-10-05T10:53:35Z

Windows 10 Version 2004 (Build 19041.546)

@strarsis In your case you need to use a Windows Insider build from the Dev Channel (build >=20150). CUDA in WSL2 won't work in build 19041.
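The build requirement above can be written down as a small check. This is only a sketch: the 20150 threshold comes from the comment above, the "reported broken" set contains only the build discussed in this thread so far, and both function names are hypothetical.

```python
MIN_BUILD_FOR_WSL_CUDA = 20150    # threshold quoted in the comment above
BUILDS_REPORTED_BROKEN = {20226}  # reported in this thread; extend as needed


def build_number(version):
    """Extract the Windows build number from a version string.

    "10.0.20226.0" carries the build in the third field, while
    "20226.1000" and "19041.546" carry it in the first.
    """
    parts = version.strip().split(".")
    field = parts[2] if len(parts) >= 3 else parts[0]
    return int(field)


def assess_build(version):
    """Classify a Windows build for CUDA on WSL2, based on this thread only."""
    build = build_number(version)
    if build < MIN_BUILD_FOR_WSL_CUDA:
        return "too old"
    if build in BUILDS_REPORTED_BROKEN:
        return "reported broken"
    return "no known report"


print(assess_build("19041.546"))     # too old
print(assess_build("10.0.20226.0"))  # reported broken
print(assess_build("20221.1000"))    # no known report
```

A "no known report" result means only that nobody in this thread flagged the build; it is not a guarantee that CUDA works on it.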

strarsis · 2020-10-05T11:03:16Z

@onomatopellan: How long do I have to wait to get this support in stable Windows 10?

onomatopellan · 2020-10-05T11:04:16Z

@strarsis This is expected for 21H1 aka April 2021.

strarsis · 2020-10-05T11:35:13Z

@onomatopellan: To use this now, do I have to register for Windows Insider and download the ISO, or can I use the Windows updater?
Are there any downsides to using a Windows Insider version, such as performance or stability?

Meeka33 · 2020-10-05T13:46:07Z

This stopped working for me as well. winver 2004 20226 with CUDA. It previously was working until yesterday on previous builds. When will this be fixed? Too many recurring bugs, ready to dump windows

tadam98 · 2020-12-10T21:49:40Z

@ArieTwigt Not sure how it worked for you as you did not include "experimental" in the .list as in the instructions.

curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list

tadam98 · 2020-12-15T14:06:36Z

Updated to Insider version 20279.1 still with 465.12 and all is working well as before.

alchemistake · 2021-01-08T16:18:42Z

Has anyone tried this on a mainline Windows build? Other programs on my system do not work on the Insider build.

serg06 · 2021-01-08T20:49:22Z

Has anyone tried this on a mainline Windows build? Other programs on my system do not work on the Insider build.

I tried it on Nov 24 and there was some issue I couldn't fix, I think it couldn't detect my GPU.

Cuberick-Orion · 2021-02-23T03:14:34Z

Thank you all for reporting the status on different OS versions. Is anyone currently using, or has anyone tried, version 21286?

I am about to try using CUDA on WSL2 (hence the need to switch to the Dev Channel), and it turns out that ISO files are only released for selected versions. I would prefer to stay on a stable one for the moment.

Appreciate any response :)

onomatopellan · 2021-02-24T17:27:42Z

@Cuberick-Orion You can follow CUDA known issues to see if there is a known problem. Most problematic build was 20226, any build after that should not be a problem.

zzjin · 2021-03-05T11:04:52Z

Windows Insider updated to 21327.1000 and CUDA is broken again. The previous Insider version worked well.

Error message with BlackScholes: CUDA error at ../../common/inc/helper_cuda.h:779 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"

After reinstalling the WSL-ready CUDA driver (Custom > Perform a clean install, then restarting the PC), everything works again.

shkarupa-alex · 2021-03-06T10:38:32Z

Windows Insider updated to 21327.1000 and CUDA is broken again. The previous Insider version worked well.

Error message with BlackScholes: CUDA error at ../../common/inc/helper_cuda.h:779 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"

+1

jenatali · 2021-03-06T16:08:21Z

Windows Insider updated to 21327.1000 and CUDA is broken again. The previous Insider version worked well.

This is called out in the flight notes: https://blogs.windows.com/windows-insider/2021/03/03/announcing-windows-10-insider-preview-build-21327/

Windows Subsystem for Linux (WSL) users who upgrade to this build will be unable to use the GPU Compute feature. We’re working on a fix for this. Users who do a clean install will not be affected.

ahmadelsallab · 2021-03-09T09:29:27Z

The following worked for me:

Re-install the 465.42 WSL driver (https://developer.nvidia.com/cuda/wsl)
Disable and enable the GPU driver from device manager.
This is a known issue acknowledged by NVIDIA, as described in their documentation: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
"Note: NVIDIA is aware of a specific installation issue reported on mobile platforms with the WIP driver 465.12 posted on 11/16/2020. A known workaround will be to disable and reenable the GPU adapter from device manager at system start. We are working on a fix for this issue and will have an updated driver soon. As an alternative, users may opt to roll back to an earlier driver from device manager driver updates."

EtienneT · 2021-03-12T14:45:20Z

CUDA works again in 21332.1000.

PetrarcaBruto · 2021-03-15T02:11:49Z

Error on WSL2

Environment: Windows insider program 21332.1000, NVIDIA Driver 461.72 on Windows (for GeForce GTX 1660 Ti) , WSL2 Ubuntu 20-04,
Error: CUDA error at ../../common/inc/helper_cuda.h:779 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"

After installing the NVIDIA driver and getting it working on Windows, I followed the instructions on the NVIDIA site https://docs.nvidia.com/cuda/wsl-user-guide/index.html (adapted for Ubuntu 20.04, because the example uses 18.04).

Steps on Ubuntu-20.04:

apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get install -y cuda-toolkit-11-2

After compiling and executing the ./BlackScholes example I get (again) the same error described in this thread.
Although an earlier comment says that it worked again for user EtienneT with Windows Insider release 21332.1000, it didn't work for me. It behaves the same as with the previous Insider release, 21327.1000.
Note that the Microsoft release notes for 21327 state that GPU compute won't work (a regression bug), and the release notes for 21332 say it is fixed.
I get the same error on both Windows Insider releases.

Situation on Windows 10

However, the GPU works on Windows with PyTorch.

Pytorch Windows, Installed with:
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch

Code:
import torch
print(torch.cuda.get_device_name(torch.cuda.current_device()))
It prints: GeForce GTX 1660 Ti with Max-Q Design

But TensorFlow 2.4.1 on Windows, installed with:
pip install tensorflow
recognizes the GPU but fails downstream with a known TensorFlow problem for which I cannot find updated info.
Code:
import tensorflow as tf

if tf.test.gpu_device_name() != '/device:GPU:0':
    print('WARNING: GPU device not found.')
else:
    print('SUCCESS: Found GPU: {}'.format(tf.test.gpu_device_name()))
    physical_devices = tf.config.list_physical_devices('GPU')
    # This should solve the cuda solver problem
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

It prints 'SUCCESS: Found GPU:' etc, but later in the process it has a problem in one of the cuda libraries. Here is the output of the run:

2021-03-11 21:06:25.144177: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4744 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti with Max-Q Design, pci bus id: 0000:02:00.0, compute capability: 7.5)
2021-03-11 21:06:26.697412: F tensorflow/core/util/cuda_solvers.cc:115] Check failed: cusolverDnCreate(&cusolver_dn_handle) == CUSOLVER_STATUS_SUCCESS Failed to create cuSolverDN instance

Note that for TensorFlow I had to download the CUDA Toolkit 11.2 and the associated cuDNN libraries.

Any help will be appreciated.
Petrarca Bruto

kerim371 · 2021-04-30T18:04:59Z

What can I do if the Nvidia driver manager can't find an appropriate driver for me?
Samsung RC720 notebook
OS build: 21370.1
Basic GPU is Intel, additional Nvidia GeForce GT 520M
According to Nvidia docs my GeForce GT 520M supports CUDA

onomatopellan · 2021-04-30T19:04:03Z

@kerim371 I'm afraid it won't work.

Note: CUDA on WSL 2 is enabled on GPUs starting with the Kepler architecture

https://docs.nvidia.com/cuda/wsl-user-guide/index.html

tommyip · 2021-06-11T13:00:48Z

I have an Nvidia GeForce GT750m, which is based on the Kepler architecture, and Windows Dev build 21390. The installer still gives the same error as for @kerim371 above.

onomatopellan · 2021-06-11T13:29:30Z

@tommyip Mobile Kepler lost driver support long ago.

Josepaezra · 2021-06-18T14:33:08Z

@kerim371, @tommyip I was having the same issue, which I resolved by installing the drivers from the WSL Ubuntu prompt, following the commands shown at https://developer.nvidia.com/cuda-downloads, under Linux > x86_64 > WSL-Ubuntu > 2.0 > deb (network).

AllardJM · 2021-06-18T21:36:02Z

I followed the instructions here: https://radiant-brushlands-42789.herokuapp.com/medium.com/swlh/how-to-install-the-nvidia-cuda-toolkit-11-in-wsl2-88292cf4ab77
and then, when that didn't work, what @Josepaezra added. TF still does not recognize the GPU.

onomatopellan · 2021-06-18T21:52:13Z

@AllardJM That guide is incomplete. First of all you need to run a Windows build from the Dev channel and install a Windows Nvidia driver with WSL2 CUDA support. After that you will see a device /dev/dxg inside WSL2. That's the GPU.
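A quick way to check for the device mentioned above from Python is sketched below. It only tests whether the /dev/dxg path exists, on the assumption (taken from the comment above) that its presence means the GPU is exposed to WSL2; it does not prove that CUDA calls will succeed. The function name is hypothetical.

```python
import os


def wsl_gpu_device_present(path="/dev/dxg"):
    """Return True if the WSL2 GPU device node exists at the given path.

    Existence is a necessary condition per the comment above,
    not a guarantee that CUDA will work.
    """
    return os.path.exists(path)


if wsl_gpu_device_present():
    print("/dev/dxg found: the GPU is exposed to this WSL2 instance")
else:
    print("/dev/dxg missing: check the Windows build and the Windows-side driver")
```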

AllardJM · 2021-06-18T21:57:05Z

I believe I did both properly. I was able to run the BlackScholes test… just not TF.

onomatopellan · 2021-06-19T05:14:45Z

@AllardJM Which version of TensorFlow did you install? If you see the /dev/dxg device then you only need a GPU version of TensorFlow.

AllardJM · 2021-06-19T13:22:08Z

@onomatopellan I have an (empty) file dxg under the dev folder....

The TF version is 2.5.0 from pip install tensorflow

onomatopellan · 2021-06-19T15:04:25Z

@AllardJM That means the GPU is already available inside WSL2. Launch python3 interpreter and run:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

The GPU should be recognized, although it may not be usable if specific CUDA libraries are missing.

This is why I always prefer running TensorFlow in Docker, since the container has all the libraries needed.
docker run -it --rm --runtime=nvidia tensorflow/tensorflow:latest-gpu python
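To see which CUDA libraries cannot be loaded outside a container, one option is to try opening each one with `ctypes`. This is a sketch under assumptions: the names in `LIBS_FROM_LOG` are copied from the dso_loader lines in the log at the top of this thread and match that older TensorFlow image, so for another TensorFlow version they should be replaced with the names from its own log. `unloadable_libraries` is a hypothetical helper name.

```python
import ctypes

# Names copied from the dso_loader lines in the first log of this thread.
# Assumption: substitute the names your own TensorFlow version reports.
LIBS_FROM_LOG = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
]


def unloadable_libraries(names):
    """Return the subset of shared-library names the loader cannot open."""
    failed = []
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            failed.append(name)
    return failed


for name in unloadable_libraries(LIBS_FROM_LOG):
    print("cannot load", name)
```

An empty result means the dynamic linker found every listed library. It says nothing about whether the versions match what TensorFlow was built against.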

thusinh1969 · 2021-08-14T02:34:37Z

Windows Insider updated to 20262.1.
With the update, the regular Nvidia driver was reinstalled, which is not good for WSL2.
Also, Nvidia issued a new driver that you should install in Windows:
https://developer.nvidia.com/cuda/wsl
The new driver is 465.12.
Download and install it.
You must reboot after the install, or WSL2 will not see the GPU yet.
Start wsl2
$ python
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
2020-11-19 17:15:13.395676: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
True
Then follow the Docker procedures above to check that Docker also works.
In my case, it works fine.
After doing this, it still didn't work for me.
I was able to get it to work by following it up with this:

Installed everything else that Windows Update wanted to install and rebooted

Reinstalled the 465.12 driver

On the first screen, I selected "Drivers + GeForce Experience"

On the second screen, I selected Custom, then pressed Next and checked "Perform a clean install"

Uninstalled all my existing Ubuntu installations

Restarted Windows

Installed Ubuntu from Microsoft store (the one called "Ubuntu" with no version) (it installed Ubuntu 20.04)

Restarted Windows

Followed these instructions to add the sources to my sources lists, but I modified the URLs to be 2004 instead of 1804:

sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub

sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'

sudo apt-get update

Continued to follow the instructions by installing the CUDA Toolkit 11.1 (not 11.0)

sudo apt-get install -y cuda-toolkit-11-1

Followed these instructions to test the installation

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

bash Miniconda3-latest-Linux-x86_64.sh

When it asks whether to init conda, hit yes

restart terminal

conda config --set auto_activate_base false

restart terminal

conda create --name directml python=3.6

conda activate directml

pip install tensorflow-directml

Then in Python

import tensorflow.compat.v1 as tf

tf.enable_eager_execution(tf.ConfigProto(log_device_placement=True))

print(tf.add([1.0, 2.0], [3.0, 4.0]))

And there, I finally saw the name of my GPU pop up!
After that I went back and tried the ./BlackScholes test:

cd /usr/local/cuda/samples/4_Finance/BlackScholes

sudo make

./BlackScholes

And it also worked!
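
As a rough illustration of the "2004 instead of 1804" substitution in the steps above, here is a small Python sketch that derives the key URL and the apt source line from an Ubuntu release string. The function name cuda_repo_lines is hypothetical, and the key file name 7fa2af80.pub is simply copied from the commands above; it is an assumption that the same name applies to other releases or is still current.

```python
def cuda_repo_lines(ubuntu_release):
    """Build the key URL and apt source line for a release such as "20.04".

    The URL layout and key file name are taken from the commands quoted in
    this thread; they are assumptions for any other release.
    """
    tag = "ubuntu" + ubuntu_release.replace(".", "")  # "20.04" -> "ubuntu2004"
    base = "http://developer.download.nvidia.com/compute/cuda/repos/" + tag + "/x86_64"
    key_url = base + "/7fa2af80.pub"
    source_line = "deb " + base + " /"
    return key_url, source_line


if __name__ == "__main__":
    key_url, source_line = cuda_repo_lines("20.04")
    print(key_url)
    print(source_line)
```

The two printed strings correspond to the argument of apt-key adv --fetch-keys and the line written to /etc/apt/sources.list.d/cuda.list in the steps above.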
Thanks for posting your steps. I also run Ubuntu in WSL2 and use the NVIDIA Docker containers as explained in https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-nvidia-drivers. I also faced the same problem (nvcr.io/nvidia/tensorflow not working in WSL 2 after the Windows Insider update).

From your steps, I only had to re-install the CUDA driver ( https://developer.nvidia.com/cuda/wsl ). In my case I just had to run the installer like I normally did. I didn't even have to restart WSL.

I ran the benchmark container to check if everything works:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

The benchmark passed and my nvcr.io/nvidia/tensorflow container works again.
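
For anyone who wants to repeat this check after each driver or Windows update, the benchmark command can be wrapped in a short Python helper. This is only a sketch: NBODY_CMD is the command quoted above split into arguments, and run_nbody_benchmark is a hypothetical name. It returns None when the docker binary is not on PATH instead of raising.

```python
import shutil
import subprocess

# The benchmark command from this thread, split into an argument list.
NBODY_CMD = [
    "docker", "run", "--gpus", "all",
    "nvcr.io/nvidia/k8s/cuda-sample:nbody",
    "nbody", "-gpu", "-benchmark",
]


def run_nbody_benchmark(cmd=NBODY_CMD):
    """Run the command and return its exit code, or None if the binary is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd).returncode
```

Calling run_nbody_benchmark() and checking for an exit code of 0 gives a quick pass/fail signal; a non-zero code or None means the container could not use the GPU or Docker is not installed in the distribution.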

Not working with Ubuntu 18.04 (changed params accordingly).
Steve

PetrarcaBruto · 2021-08-16T08:48:19Z

Thank you Anh,
Your information looks very useful. Unfortunately I won't be able to try it out. My problem has got worse with the GPU, even on Windows, let alone WSL2. I will describe it here in case it is useful to someone else, or at least to vent some of my frustration with MS:
I have been in the Windows Insider program waiting for a solution to the problem described here (no GPU on WSL2). It turns out that the Windows 11 Insider update that came out about 8 weeks ago caused my NVIDIA GPU to stop being recognized on Windows 11.
I documented the problem in the MS Feedback Hub and since then I updated the problem description with the following:
"IMPORTANT UPDATE:
Trying to solve the problem reported by Device Manager with the driver installation (as per the original problem report). This time, I tried to address the message "Currently this hardware is not connected to the computer (Code 45) this Device is not connected". I did the following:
1- After selecting properties, I clicked on "install driver", to which I got the reply that the driver was already installed.
2- Then I chose to uninstall the driver (to avoid the message in 1 above) and re-install it. I didn't get any response to this click, but the GPU disappeared from the device list. Now the device is not shown at all, not even when clicking "View/Show Hidden Devices".

It seems the problem got worse. If I try to install a driver from the NVIDIA site, the first thing the installer does is check device compatibility, and it aborts saying it didn't detect a compatible device.

Please help. I have had intermittent problems with GPU connection, even in Windows 10."

As you can see, I am very disappointed with the whole NVIDIA GPU experience on Windows. I am still waiting for an update. This issue, and related ones, have more than 4K upvotes in the Insider problem list.

Petrarca

strarsis · 2021-12-01T13:17:55Z

With the Windows 10 (non-Insider) 21H2 November update, CUDA now works in WSL 2.

PetrarcaBruto · 2021-12-02T08:10:03Z

@strarsis You are right. Thanks for the notification.

benhillis added the GPU label Oct 1, 2020

nagadomi mentioned this issue Dec 17, 2020

cc1plus not found, luarocks failed to install, and can't find CUDA nagadomi/waifu2x#372

Open

zzjin mentioned this issue Mar 8, 2021

code=35 cudaErrorInsufficientDriver NVIDIA/nvidia-docker#1437

Closed

github-actions bot mentioned this issue Jan 20, 2024

WSL2 CUDA Does Not Respect CUDA - Sysmem Fallback Policy #11050

Open
