Errors resulting in black screen #20

Open
soldierofhell opened this issue Jun 13, 2023 · 4 comments

Comments

@soldierofhell

Hi, I'm trying to run the latest Docker image on a remote server (from an SSH console) with 2 NVIDIA GPUs.

When I execute ./run.sh I see the following output with errors (they seem to be mostly permission errors):

Stopping "xubuntu" container...
Removing "xubuntu" container...
Creating "xubuntu" container...
Done!

sha1 Fingerprint=9C:D9:6E:04:C2:DA:39:E6:2E:73:7A:76:69:B1:BC:10:25:9D:3D:9E
sha256 Fingerprint=60:89:B2:24:64:E2:CA:34:BE:45:E5:13:76:91:37:ED:49:0E:B7:3B:F8:91:C6:C1:2A:47:8B:93:11:C0:AE:08
[20230613-19:45:44] [INFO ] starting xrdp with pid 69
[20230613-19:45:44] [INFO ] starting xrdp-sesman with pid 68
Failed to set receive buffer size for device monitor, ignoring: Operation not permitted
[20230613-19:45:44] [INFO ] address [0.0.0.0] port [3389] mode 1
[20230613-19:45:44] [INFO ] listening to port 3389 on 0.0.0.0
[20230613-19:45:44] [INFO ] xrdp_listen_pp done
kmalloc-8(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Failed to send device, ignoring: Operation not permitted
kmalloc-rcl-64(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Failed to send device, ignoring: Operation not permitted
kmalloc-rcl-128(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Failed to send device, ignoring: Operation not permitted
kmalloc-rcl-96(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Worker [78] did not accept message, killing the worker: Operation not permitted
kmalloc-rcl-96(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Worker [77] did not accept message, killing the worker: Operation not permitted
kmalloc-rcl-96(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Worker [76] did not accept message, killing the worker: Operation not permitted
Worker [76] terminated by signal 9 (KILL)
Worker [78] terminated by signal 9 (KILL)
Worker [77] terminated by signal 9 (KILL)
kmalloc-rcl-96(122704:72a0868a96eb001d1991def099629933bcb4f9af3b382e8981d4a8672c1e010c): Failed to send device, ignoring: Operation not permitted
LNXSYSTM:00: Failed to write 'change' to '/sys/devices/LNXSYSTM:00/uevent': Read-only file system

Then, when I connect with the Remmina client, I see the login screen, but after logging in I see a blank screen.

Could you please suggest where to look? Thanks,

@soldierofhell soldierofhell changed the title from "Errors resulting in blank screen" to "Errors resulting in black screen" on Jun 13, 2023
@soldierofhell
Author

soldierofhell commented Jun 13, 2023

I've tried the oldest available image (v93), with run.sh from the v93 tag, and:

  1. Some errors after container startup disappeared, but not all:
SHA1 Fingerprint=84:18:A8:5D:10:5F:A8:F2:91:17:09:87:08:21:B4:68:9E:BC:F5:37
SHA256 Fingerprint=6B:D8:3D:16:A7:0C:6E:B9:89:ED:E8:41:14:23:8D:24:50:D1:15:A8:AE:D3:B4:A2:7E:E5:98:0C:18:1F:AC:87
[20230613-20:23:27] [INFO ] starting xrdp-sesman with pid 73
[20230613-20:23:27] [INFO ] starting xrdp with pid 74
[20230613-20:23:27] [INFO ] address [0.0.0.0] port [3389] mode 1
[20230613-20:23:27] [INFO ] listening to port 3389 on 0.0.0.0
[20230613-20:23:27] [INFO ] xrdp_listen_pp done
kmalloc-8(122801:b3c08c1f0f01fcb54c35fc900eef232a1cb52697d003cef8b934544eb56facb1): Failed to send device, ignoring: Operation not permitted
kmalloc-rcl-128(122801:b3c08c1f0f01fcb54c35fc900eef232a1cb52697d003cef8b934544eb56facb1): Failed to send device, ignoring: Operation not permitted
kmalloc-rcl-96(122801:b3c08c1f0f01fcb54c35fc900eef232a1cb52697d003cef8b934544eb56facb1): Worker [81] did not accept message, killing the worker: Operation not permitted
kmalloc-rcl-96(122801:b3c08c1f0f01fcb54c35fc900eef232a1cb52697d003cef8b934544eb56facb1): Worker [82] did not accept message, killing the worker: Operation not permitted
Worker [81] terminated by signal 9 (KILL)
Worker [82] terminated by signal 9 (KILL)
kmalloc-rcl-96(122801:b3c08c1f0f01fcb54c35fc900eef232a1cb52697d003cef8b934544eb56facb1): Failed to send device, ignoring: Operation not permitted
  2. I can connect to the desktop, but VirtualGL doesn't recognize the GPU. From what I can briefly see, your approach to GPU support is more general than the NVIDIA Container Toolkit, e.g. you manually install drivers and connect devices. I wonder whether run.sh supports NVIDIA out of the box or whether I should modify it (e.g. add --gpus all, etc.).
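
To make the question concrete, here is a minimal sketch of what such a modification might look like. It is not taken from the project's run.sh: the function name and the image name are hypothetical placeholders, and only the container name (xubuntu) and the RDP port (3389) come from the log output above. `--gpus all` and `NVIDIA_DRIVER_CAPABILITIES` are the standard Docker and NVIDIA Container Toolkit options; whether they are sufficient for VirtualGL in this image is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch, not the project's run.sh.
# IMAGE is a placeholder; substitute the image you actually run.
IMAGE="${IMAGE:-example/xubuntu:latest}"

# Print the docker command instead of executing it,
# so the GPU-related flags can be reviewed before use.
gpu_run_command() {
  printf '%s\n' "docker run --detach --name xubuntu --gpus all --env NVIDIA_DRIVER_CAPABILITIES=all --publish 3389:3389/tcp ${IMAGE}"
}

gpu_run_command
```

Setting `NVIDIA_DRIVER_CAPABILITIES=all` asks the toolkit to expose the graphics libraries as well as the compute ones, which is likely what an OpenGL-based tool needs.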

@soldierofhell
Author

soldierofhell commented Jun 13, 2023

BTW, I hope I will succeed in making it work, because I've read a previous "success story" for an NVIDIA GPU (see #7 (comment)). Unfortunately that was before the v93 tag, but I guess I can build the image myself. Still, I'm hoping for some valuable insight.

@soldierofhell
Author

soldierofhell commented Jun 14, 2023

I went back to v69, added the parameters required by the NVIDIA Container Toolkit to run.sh, and it works! Now I'll try to narrow down what works and what causes trouble.
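
One possible way to do the narrowing down is a binary search over the tags between the known-good v69 and the known-bad v93. The sketch below rests on several assumptions: that tags are numbered consecutively, that each intermediate tag can be built or pulled, and that everything after the first broken tag stays broken. `bisect_tags` and `check_tag` are hypothetical helper names, and `check_tag` simply asks for a manual verdict after the tag has been tested by hand.

```shell
#!/bin/sh
# Hypothetical helper: ask whether a given tag number works.
# Replace with an automated check if one is available.
check_tag() {
  printf 'Does tag v%s work? [y/n] ' "$1" >&2
  read -r answer
  [ "$answer" = "y" ]
}

# bisect_tags GOOD BAD
# GOOD is a tag number known to work, BAD one known to fail.
# Prints the first failing tag number, assuming a single good-to-bad transition.
bisect_tags() {
  good=$1
  bad=$2
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    if check_tag "$mid"; then
      good=$mid
    else
      bad=$mid
    fi
  done
  printf '%s\n' "$bad"
}

# Usage (interactive): bisect_tags 69 93
```

With 24 tags between the two endpoints, this needs at most five checks instead of testing every tag in turn.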

@hectorm
Owner

hectorm commented Jun 18, 2023

Thanks for your testing. Unfortunately, I don't have an NVIDIA card anymore, and every time I need to do some testing I have to rent a server.

If you find the cause, I would appreciate it if you could update the thread; otherwise I will try to find time later to do my own tests.
