refactor: handle GGML_VK_VISIBLE_DEVICES at the Python level by wbruna · Pull Request #2179 · LostRuins/koboldcpp

wbruna · 2026-05-02T03:08:44Z

All C++ handling code currently:

- build a comma-separated list from the info_vulkan array
- if GGML_VK_VISIBLE_DEVICES isn't set:
  - set GGML_VK_VISIBLE_DEVICES to the list

Once set, GGML_VK_VISIBLE_DEVICES affects the whole process. So this can be done in the same way at the Python level, before all loading functions.

Caveat: load_model had the default `inputs.vulkan_info = "0"`, so the default GPU would be "0" only when loading a text model.
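The steps listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the code from the PR: the function name `apply_vulkan_visible_devices` and its `env` parameter are hypothetical, and leaving the variable unset when no device is given is an assumption about the intended behavior.

```python
import os
from typing import Iterable, MutableMapping, Optional

ENV_NAME = "GGML_VK_VISIBLE_DEVICES"


def apply_vulkan_visible_devices(
    device_ids: Iterable[int],
    env: Optional[MutableMapping[str, str]] = None,
) -> Optional[str]:
    """Set GGML_VK_VISIBLE_DEVICES from a list of device indices.

    Hypothetical helper: an existing value is never overwritten, and an
    empty list leaves the variable unset (assumption, not confirmed by the PR).
    Returns the value in effect afterwards, or None if it stays unset.
    """
    if env is None:
        env = os.environ
    if ENV_NAME in env:
        # An explicit user setting wins over the command-line selection.
        return env[ENV_NAME]
    ids = [str(int(d)) for d in device_ids]
    if not ids:
        return None
    env[ENV_NAME] = ",".join(ids)
    return env[ENV_NAME]
```

Passing a plain dict as `env` makes the helper easy to try without touching the real process environment; calling it with `[1, 0]` on an empty mapping stores `"1,0"`, matching the order the user gave.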

wbruna · 2026-05-02T03:15:02Z

-    else:
-        inputs.vulkan_info = "".encode("UTF-8")
+    if "GGML_VK_VISIBLE_DEVICES" not in os.environ:
+        if args.usevulkan: # is an empty array if using vulkan without defined gpu


To make it work the same way as before for text models, this would have [0] as default instead of empty (which incidentally would be the wrong choice on my system, since the iGPU usually appears on Vulkan0).

wbruna · 2026-05-02T10:00:15Z

Seems to be working fine, so marking as ready. Feel free to add back that "force only Vulkan device 0" default afterwards; I don't mind keeping my workarounds, but I also don't want to spend time testing it to be sure it's making exactly the wrong choice for me 🙂

LostRuins · 2026-05-02T16:10:04Z

@wbruna unfortunately, upon proper testing this appears to fail completely: the environment variable is never propagated to the C++ side and does not get set at all, causing Vulkan to always use the default GPU. Does that happen for you too?

LostRuins · 2026-05-02T16:21:05Z

seems like once the DLL is loaded via init_library(), modifying os.environ from python has no further effect already.
since we cannot apply these env vars before loading the library, we probably still have to set the env var from the c++ side.

but i can probably still de-duplicate the code to only do this once via a helper function.
let me know if you have other ideas, otherwise i will probably do that.

LostRuins · 2026-05-02T17:08:30Z

hmm nope, still not working. we might have to revert after all. i'll check in again tomorrow.

wbruna · 2026-05-02T21:01:42Z

That's very weird, because it appears to work perfectly fine here:

git diff
diff --git a/otherarch/sdcpp/sdtype_adapter.cpp b/otherarch/sdcpp/sdtype_adapter.cpp
index 41aaa82f2..ef48c74f2 100644
--- a/otherarch/sdcpp/sdtype_adapter.cpp
+++ b/otherarch/sdcpp/sdtype_adapter.cpp
@@ -317,6 +317,15 @@ std::string load_umt5_tokenizer_json()
 }
 
 bool sdtype_load_model(const sd_load_model_inputs inputs) {
+
+
+    const char * var = getenv("GGML_VK_VISIBLE_DEVICES");
+    if (var) {
+        std::cerr << "****************** GGML_VK_VISIBLE_DEVICES = " << var << std::endl;
+    } else {
+        std::cerr << "****************** GGML_VK_VISIBLE_DEVICES is NULL " << std::endl;
+    }
+
     sd_is_quiet = inputs.quiet;
     set_sd_quiet(sd_is_quiet);
     executable_path = inputs.executable_path;

$ python koboldcpp.py --config config.kcpps | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES is NULL
^C
$ python koboldcpp.py --config config.kcpps --usevulkan 1 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 1
^C
$ python koboldcpp.py --config config.kcpps --usevulkan 1 0 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 1,0
^C
$ python koboldcpp.py --config config.kcpps --usevulkan 0 1 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 0,1
^C

> seems like once the DLL is loaded via init_library(), modifying os.environ from python has no further effect already.

But I'm doing it at the same point that inputs.vulkan_info was set before! If anything, with the PR the env var is set earlier.

And I'm sure ggml is seeing it:

$ python koboldcpp.py --config config.kcpps --usevulkan 4 | grep GGML_VK
****************** GGML_VK_VISIBLE_DEVICES = 4
ggml_vulkan: Invalid device index 4 in GGML_VK_VISIBLE_DEVICES.
terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid Vulkan device index

With a similar patch to load_model, and no config:

$ python koboldcpp.py  --usevulkan 4 --model ./llama-2-7b.Q4_0.gguf | grep GGML_VK
*** load_model: GGML_VK_VISIBLE_DEVICES = 4
ggml_vulkan: Invalid device index 4 in GGML_VK_VISIBLE_DEVICES.

wbruna · 2026-05-02T21:57:58Z

> since we cannot apply these env vars before loading the library

Why not?

> hmm nope, still not working. we might have to revert after all. i'll check in again tomorrow.

How exactly are you testing it?

I could imagine e.g. MSVC being weird, caching the env var right after library initialization, and somehow invalidating the cache on the putenv call only from C++ (even though Python is documented to call exactly that: https://docs.python.org/3/library/os.html#os.environ). But a separate helper function should work exactly the same as the current code.
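One way to check what the native side sees without patching the C++ code is to ask the C runtime directly from Python. The sketch below is POSIX-only and makes some assumptions: `c_getenv` is a hypothetical helper, `ctypes.CDLL(None)` is used to reach the libc already loaded into the process, and `KCPP_SKETCH_ENV_CHECK` is a made-up variable name. It does not reproduce the Windows behavior discussed here.

```python
import ctypes
import os
import sys
from typing import Optional


def c_getenv(name: str) -> Optional[str]:
    """Return what the C runtime's getenv() reports, or None if unset.

    POSIX-only sketch: on Windows the process can hold several C runtimes,
    each with its own view of the environment, so this check would not be
    representative there.
    """
    if sys.platform == "win32":
        raise NotImplementedError("POSIX-only sketch")
    libc = ctypes.CDLL(None)  # symbols of the running process, including libc
    libc.getenv.argtypes = [ctypes.c_char_p]
    libc.getenv.restype = ctypes.c_char_p
    raw = libc.getenv(name.encode("utf-8"))
    return None if raw is None else raw.decode("utf-8")


if __name__ == "__main__":
    # Assigning through os.environ also updates the C-level environment.
    os.environ["KCPP_SKETCH_ENV_CHECK"] = "1,0"
    print(c_getenv("KCPP_SKETCH_ENV_CHECK"))
```

On Linux this agrees with the shell transcripts above, where the value set from Python is visible to `getenv` inside the loaded library.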

LostRuins · 2026-05-03T03:26:26Z

I'm simply loading it normally on Windows; I was puzzled because now I couldn't choose my Vulkan device anymore.

LostRuins · 2026-05-03T03:29:10Z

koboldcpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp, line 5929 in b31877e:

    if (devices_env != nullptr) {

devices_env is now nullptr at this check.

LostRuins · 2026-05-03T03:30:33Z

it works if i move os.environ["GGML_VK_VISIBLE_DEVICES"]="1" to the very top of the file. So it's definitely related to init order
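The ordering dependency can be made explicit with a small wrapper that applies the variables first and only then loads the native library. This is a sketch under assumptions: the thread suggests, but does not confirm, that on Windows the library only sees variables already present when it is loaded. `load_library_with_env` is a hypothetical name, and `loader` stands in for whatever actually loads the DLL (for example koboldcpp's init_library()).

```python
import os
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


def load_library_with_env(env_updates: Mapping[str, str], loader: Callable[[], T]) -> T:
    """Apply environment variables, then call the loader.

    Hypothetical wrapper: values the user already exported are kept
    (setdefault), and the loader runs only after every variable is in place.
    """
    for name, value in env_updates.items():
        os.environ.setdefault(name, value)
    return loader()


if __name__ == "__main__":
    # The lambda is a stand-in for the real library load; it reports what
    # the environment looks like at the moment of loading.
    seen = load_library_with_env(
        {"GGML_VK_VISIBLE_DEVICES": "1"},
        lambda: os.environ.get("GGML_VK_VISIBLE_DEVICES"),
    )
    print(seen)
```

Routing every load through one such entry point avoids relying on where in the file the assignment to os.environ happens to sit.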

LostRuins · 2026-05-03T05:19:57Z

@wbruna i made the env var setting an explicit function call. Seems to work for me, though I don't know how cross platform it is. 2fb97d9

Can you see if this works for you too?

wbruna · 2026-05-03T10:08:14Z

Doing a full build to test it, but it should be fine: setenv is the non-broken way to set env vars on POSIX.
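For illustration, the POSIX `setenv` call can be exercised from Python through ctypes. This is only a stand-in for the idea of setting the variable on the native side: the actual commit exposes its own function from the koboldcpp library, which is not shown in this thread. `native_setenv_if_unset` and the variable name `KCPP_SKETCH_NATIVE_SETENV` are hypothetical, and the sketch is POSIX-only.

```python
import ctypes
import os
import sys


def native_setenv_if_unset(name: str, value: str) -> str:
    """Call libc setenv() with overwrite=0 and return the value now in effect.

    POSIX-only sketch: with overwrite=0, setenv leaves an existing value
    alone, which mirrors the "only if it isn't set" rule from the PR.
    """
    if sys.platform == "win32":
        raise NotImplementedError("setenv is not available on Windows")
    libc = ctypes.CDLL(None, use_errno=True)
    libc.setenv.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int]
    libc.setenv.restype = ctypes.c_int
    libc.getenv.argtypes = [ctypes.c_char_p]
    libc.getenv.restype = ctypes.c_char_p
    if libc.setenv(name.encode("utf-8"), value.encode("utf-8"), 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return libc.getenv(name.encode("utf-8")).decode("utf-8")


if __name__ == "__main__":
    print(native_setenv_if_unset("KCPP_SKETCH_NATIVE_SETENV", "1"))
    # A second call with another value does not replace the first one.
    print(native_setenv_if_unset("KCPP_SKETCH_NATIVE_SETENV", "0,1"))
```

Note that a value set this way bypasses Python's os.environ mapping, so Python code reading os.environ afterwards will not see it.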

LostRuins · 2026-05-03T10:14:17Z

Yeah, too bad Windows doesn't have it. But this C++ macro approach should hopefully be clean enough.

wbruna · 2026-05-03T10:41:21Z

Working fine now; thanks. ~~I'll rebase #2175~~ ~~edit: @LostRuins , I thought it'd have a small conflict, but it didn't; so I'll leave it as-is.~~ new edit: ended up rebasing anyway to test the multi-backend support on top of it.

wbruna mentioned this pull request May 2, 2026

sd: sync to master-593-3d6064b #2175

Merged

wbruna marked this pull request as ready for review May 2, 2026 09:54

wbruna force-pushed the kcpp_default_vulkan_device branch from 062c9e0 to 32b549a Compare May 2, 2026 09:56

LostRuins approved these changes May 2, 2026

LostRuins merged commit 25fab41 into LostRuins:concedo_experimental May 2, 2026

LostRuins added the bug Something isn't working label May 2, 2026
