Skip to content

refactor: handle GGML_VK_VISIBLE_DEVICES at the Python level#2179

Merged
LostRuins merged 1 commit into
LostRuins:concedo_experimentalfrom
wbruna:kcpp_default_vulkan_device
May 2, 2026
Merged

refactor: handle GGML_VK_VISIBLE_DEVICES at the Python level#2179
LostRuins merged 1 commit into
LostRuins:concedo_experimentalfrom
wbruna:kcpp_default_vulkan_device

Conversation

@wbruna
Copy link
Copy Markdown

@wbruna wbruna commented May 2, 2026

All C++ handling code currently:

  • build a comma-separated list from the info_vulkan array
  • if GGML_VK_VISIBLE_DEVICES isn't set
    • set GGML_VK_VISIBLE_DEVICES to the list

Once set, GGML_VK_VISIBLE_DEVICES affects the whole process. So this can be done in the same way at the Python level, before all loading functions.

Caveat: load_model had the default inputs.vulkan_info = "0", so the default GPU would be "0" only when loading a text model.

Comment thread koboldcpp.py
else:
inputs.vulkan_info = "".encode("UTF-8")
if "GGML_VK_VISIBLE_DEVICES" not in os.environ:
if args.usevulkan: # is an empty array if using vulkan without defined gpu
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make it work the same way as before for text models, this would have [0] as default instead of empty (which incidentally would be the wrong choice on my system, since the iGPU usually appears on Vulkan0).

@wbruna wbruna marked this pull request as ready for review May 2, 2026 09:54
All C++ handling code currently:
- build a comma-separated list from the info_vulkan array
- if GGML_VK_VISIBLE_DEVICES isn't set
  - set GGML_VK_VISIBLE_DEVICES to the list

Once set, GGML_VK_VISIBLE_DEVICES affects the whole process. So this
can be done in the same way at the Python level, before all loading
functions.

Caveat: load_model had the default `inputs.vulkan_info = "0"`,
so the default GPU would be "0" only when loading a text model.
@wbruna wbruna force-pushed the kcpp_default_vulkan_device branch from 062c9e0 to 32b549a Compare May 2, 2026 09:56
@wbruna
Copy link
Copy Markdown
Author

wbruna commented May 2, 2026

Seems to be working fine, so marking as ready. Feel free to add back that "force only Vulkan device 0" default afterwards; I don't mind keeping my workarounds, but I also don't want to spend time testing it to be sure it's doing exactly the wrong choice for me 🙂

@LostRuins LostRuins merged commit 25fab41 into LostRuins:concedo_experimental May 2, 2026
@LostRuins
Copy link
Copy Markdown
Owner

@wbruna unfortunately upon proper testing this appears to fail completely, the environment variable is never propagated into C++ side and does not get set at all, causing vulkan to always use the default GPU. Does that happen for you too?

@LostRuins LostRuins added the bug Something isn't working label May 2, 2026
@LostRuins
Copy link
Copy Markdown
Owner

seems like once the DLL is loaded via init_library(), modifying os.environ from python has no further effect already.
since we cannot apply these env vars before loading the library, we probably still have to set the env var from the c++ side.

but i can probably still de-duplicate the code to only do this once via a helper function.
let me know if you have other ideas, otherwise i will probably do that.

@LostRuins
Copy link
Copy Markdown
Owner

hmm nope, still not working. we might have to revert after all. i'll check in again tomorrow.

@wbruna
Copy link
Copy Markdown
Author

wbruna commented May 2, 2026

That's very weird, because it appears to work perfectly fine here:

git diff
diff --git a/otherarch/sdcpp/sdtype_adapter.cpp b/otherarch/sdcpp/sdtype_adapter.cpp
index 41aaa82f2..ef48c74f2 100644
--- a/otherarch/sdcpp/sdtype_adapter.cpp
+++ b/otherarch/sdcpp/sdtype_adapter.cpp
@@ -317,6 +317,15 @@ std::string load_umt5_tokenizer_json()
 }
 
 bool sdtype_load_model(const sd_load_model_inputs inputs) {
+
+
+    const char * var = getenv("GGML_VK_VISIBLE_DEVICES");
+    if (var) {
+        std::cerr << "****************** GGML_VK_VISIBLE_DEVICES = " << var << std::endl;
+    } else {
+        std::cerr << "****************** GGML_VK_VISIBLE_DEVICES is NULL " << std::endl;
+    }
+
     sd_is_quiet = inputs.quiet;
     set_sd_quiet(sd_is_quiet);
     executable_path = inputs.executable_path;
$ python koboldcpp.py --config config.kcpps | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES is NULL
^C
$ python koboldcpp.py --config config.kcpps --usevulkan 1 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 1
^C
$ python koboldcpp.py --config config.kcpps --usevulkan 1 0 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 1,0
^C
$ python koboldcpp.py --config config.kcpps --usevulkan 0 1 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 0,1
^C

seems like once the DLL is loaded via init_library(), modifying os.environ from python has no further effect already.

But I'm doing it at the same point that inputs.vulkan_info was set before! If anything, with the PR the env var is set earlier.

And I'm sure ggml is seeing it:

$ python koboldcpp.py --config config.kcpps --usevulkan 4 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 4
ggml_vulkan: Invalid device index 4 in GGML_VK_VISIBLE_DEVICES.
terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid Vulkan device index

With a similar patch to load_model, and no config:

$ python koboldcpp.py  --usevulkan 4 --model ./llama-2-7b.Q4_0.gguf | grep GGML_VK
*** load_model: GGML_VK_VISIBLE_DEVICES = 4
ggml_vulkan: Invalid device index 4 in GGML_VK_VISIBLE_DEVICES.

@wbruna
Copy link
Copy Markdown
Author

wbruna commented May 2, 2026

since we cannot apply these env vars before loading the library

Why not?

hmm nope, still not working. we might have to revert after all. i'll check in again tomorrow.

How exactly are you testing it?

I could imagine e.g. msvc being weird, caching the env var right after library initialization, and somehow invalidating the cache on the putenv call only from C++ (even though Python is documented to call exactly that: https://docs.python.org/3/library/os.html#os.environ ). But a separate helper function should work exactly the same as the current code.

@LostRuins
Copy link
Copy Markdown
Owner

I'm simply loading it normally in windows, i was puzzled because now i couldn't choose my vulkan device anymore.

@LostRuins
Copy link
Copy Markdown
Owner

if (devices_env != nullptr) {

this line flags as nullptr now

@LostRuins
Copy link
Copy Markdown
Owner

it works if i move os.environ["GGML_VK_VISIBLE_DEVICES"]="1" to the very top of the file. So it's definitely related to init order

@LostRuins
Copy link
Copy Markdown
Owner

@wbruna i made the env var setting an explicit function call. Seems to work for me, though I don't know how cross platform it is. 2fb97d9

Can you see if this works for you too?

@wbruna
Copy link
Copy Markdown
Author

wbruna commented May 3, 2026

Doing a full build to test it, but should be fine: setenv is the non-broken way to set env vars on posix.

@LostRuins
Copy link
Copy Markdown
Owner

Yeah, too bad windows doesnt have it. But this C++ macro approach should hopefully be clean enough.

@wbruna
Copy link
Copy Markdown
Author

wbruna commented May 3, 2026

Working fine now; thanks. I'll rebase #2175 edit: @LostRuins , I thought it'd have a small conflict, but it didn't; so I'll leave it as-is. new edit: ended up rebasing anyway to test the multi-backend support on top of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants