
You don’t need to recompile BitNet for each model.

The binary is compiled for the quantization type / kernel configuration, not for a specific model file.

In the example:
python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s

The -q i2_s flag selects which optimized kernels are built (here, the ones for the 1.58-bit i2_s format). As long as the models share the same quantization format and architecture assumptions, the same compiled binary can run all of them.

You would only need to rebuild if:

- The quantization type changes
- The kernel configuration changes
- The architecture-specific compile flags change

Otherwise, a single binary can run all compatible 1.58-bit models.
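
To make the rebuild rule concrete, here is a minimal sketch of that check in Python. It is not part of BitNet: `BuildConfig` and `rebuild_reasons` are hypothetical names invented for illustration, and the `kernel_config` and `arch_flags` values are placeholders. Only `i2_s` comes from the command above; `tl1` is used as an example of a different quantization type.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class BuildConfig:
    """Hypothetical description of what a compiled binary was built for."""
    quant_type: str              # value passed to -q, e.g. "i2_s"
    kernel_config: str           # placeholder label for the kernel configuration
    arch_flags: FrozenSet[str]   # placeholder set of architecture-specific flags


def rebuild_reasons(built: BuildConfig, wanted: BuildConfig) -> List[str]:
    """Return the reasons a rebuild is needed; an empty list means reuse the binary."""
    reasons = []
    if built.quant_type != wanted.quant_type:
        reasons.append("quantization type changed")
    if built.kernel_config != wanted.kernel_config:
        reasons.append("kernel configuration changed")
    if built.arch_flags != wanted.arch_flags:
        reasons.append("architecture-specific compile flags changed")
    return reasons


# The model file is deliberately absent from BuildConfig:
# swapping models alone never triggers a rebuild.
current = BuildConfig("i2_s", "default", frozenset({"-march=native"}))
same_format = BuildConfig("i2_s", "default", frozenset({"-march=native"}))
other_format = BuildConfig("tl1", "default", frozenset({"-march=native"}))

print(rebuild_reasons(current, same_format))
print(rebuild_reasons(current, other_format))
```

The first call returns an empty list (reuse the binary); the second reports that the quantization type changed.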

Answer selected by yajo