feat: upgrade MiniMax default model to M3 (512K context, 128K output)#49

Open
octo-patch wants to merge 1 commit into kyegomez:main from octo-patch:feat/minimax-m2-config


@octo-patch octo-patch commented Apr 22, 2026

Summary

This PR adds first-class support for the MiniMax-M3 architecture in OpenMythos:

  • minimax_m3_config() — a new MythosConfig factory in open_mythos/variants.py whose structural dimensions mirror MiniMax-M3's design: dim=6144, 48 query / 8 KV heads (6× GQA), head_dim=128 via MLA rope+nope split (64+64), 32 routed + 2 shared experts with top-4 activation, 524 288-token (512K) context, 131 072-token (128K) max output, and rope_theta=10_000_000 for long-context stability.
  • MINIMAX_M3_MODEL_ID — a constant ("MiniMaxAI/MiniMax-M3") added to open_mythos/tokenizer.py for convenient use with MythosTokenizer, whose vocab_size=200064 matches the config.
  • Both symbols are re-exported from the top-level open_mythos package.
  • tests/test_minimax_m3.py — unit tests covering config dimensions (including the new 512K context window and 128K max output), MoE structure (32 routed + 2 shared + top-4), MLA cache compression, LTI spectral radius stability, high rope_theta numerical safety, and full forward-pass / generation correctness with synthetic tensors on CPU.

All new tests are designed to run on CPU with synthetic tensors. No changes to existing code paths.
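
The dimensional relationships listed above can be checked with plain arithmetic. The following is a minimal sketch, not the PR's code: `M3Dims` is a hypothetical stand-in dataclass, and its field names are assumptions chosen for illustration rather than the actual `MythosConfig` attributes. Only the numeric values come from the summary. The per-token expert count also assumes that shared experts are always active alongside the top-k routed ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class M3Dims:
    """Hypothetical stand-in; field names are illustrative, not MythosConfig's."""

    dim: int = 6144
    n_heads: int = 48            # query heads
    n_kv_heads: int = 8          # key/value heads
    rope_head_dim: int = 64      # rotary part of each head
    nope_head_dim: int = 64      # non-rotary part of each head
    n_routed_experts: int = 32
    n_shared_experts: int = 2
    top_k: int = 4               # routed experts selected per token
    max_context: int = 512 * 1024
    max_output: int = 128 * 1024

    @property
    def gqa_ratio(self) -> int:
        # Number of query heads that share one KV head.
        return self.n_heads // self.n_kv_heads

    @property
    def head_dim(self) -> int:
        # Total head width is the sum of the rope and nope parts.
        return self.rope_head_dim + self.nope_head_dim

    @property
    def active_experts_per_token(self) -> int:
        # Assumption: shared experts run for every token.
        return self.top_k + self.n_shared_experts


dims = M3Dims()
assert dims.gqa_ratio == 6
assert dims.head_dim == 128
assert dims.n_heads * dims.head_dim == dims.dim  # 48 * 128 == 6144
assert dims.max_context == 524_288
assert dims.max_output == 131_072
assert dims.active_experts_per_token == 6
```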

Why MiniMax-M3

MiniMax-M3 is the latest MiniMax release, with a 512K context window, up to 128K output tokens, and image-input support. Replacing the earlier draft (M2.7) with M3 keeps OpenMythos's reference config aligned with the current default model.

Usage

```python
import torch

from open_mythos import MINIMAX_M3_MODEL_ID, OpenMythos, minimax_m3_config
from open_mythos.tokenizer import MythosTokenizer

# Build the MiniMax-M3-shaped config and model.
cfg = minimax_m3_config()
model = OpenMythos(cfg)

# Forward pass on random token ids.
ids = torch.randint(0, cfg.vocab_size, (1, 16))
logits = model(ids, n_loops=4)  # shape: (1, 16, 200064)

# To use the MiniMax-M3 tokenizer:
tok = MythosTokenizer(MINIMAX_M3_MODEL_ID)
```

@Qodo-Free-For-OSS

Hi, open_mythos.__init__.__all__ includes "load_tokenizer" and "get_vocab_size", but neither symbol is imported into the package nor defined anywhere in the open_mythos package, causing AttributeError (and potentially failing from open_mythos import *).

Severity: action required | Category: correctness

How to fix: Define or remove exported names

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

open_mythos/__init__.py lists load_tokenizer and get_vocab_size in __all__, but these symbols are not available on the package module. This breaks star-imports and any code expecting these public APIs.
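
One way to catch this class of problem is a small check that every name in `__all__` resolves to an attribute of the module. The sketch below is illustrative only: `missing_exports` is a hypothetical helper, and it runs against a synthetic module rather than the real `open_mythos` package. It also does not account for submodules that a star-import would load on demand.

```python
import types


def missing_exports(module: types.ModuleType) -> list[str]:
    """Return names listed in module.__all__ that are not attributes of module."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Synthetic module that reproduces the reported situation:
# __all__ advertises two names that were never defined or imported.
demo = types.ModuleType("demo_pkg")
demo.MythosTokenizer = object
demo.__all__ = ["MythosTokenizer", "load_tokenizer", "get_vocab_size"]

print(missing_exports(demo))  # ['load_tokenizer', 'get_vocab_size']
```

A test suite could assert that this list is empty for the package under test.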

Issue Context

The package currently imports MythosTokenizer and constants from open_mythos.tokenizer, but does not provide load_tokenizer/get_vocab_size wrappers.

Fix Focus Areas

  • open_mythos/__init__.py[18-58]
  • open_mythos/tokenizer.py[1-80]

Suggested implementation direction

Either:

  1. Remove load_tokenizer and get_vocab_size from __all__ if they are not supported.

Or:

  2. Implement load_tokenizer(model_id: str = DEFAULT_MODEL_ID) -> MythosTokenizer and get_vocab_size(model_id: str = DEFAULT_MODEL_ID) -> int in open_mythos/tokenizer.py, then import them in open_mythos/__init__.py and keep them in __all__.


Found by Qodo code review. FYI, Qodo is free for open-source.

- Add MiniMax-M3 variant config (minimax_m3_config) with 512K context and 128K max output
- Update HuggingFace model ID constant to MiniMaxAI/MiniMax-M3
- Replace older minimax_m2_config / MINIMAX_M2_MODEL_ID exports with M3 equivalents
- Update unit tests to validate the M3 dimensions (524 288 ctx, 131 072 output)
- Preserve all existing structural assumptions: MLA head_dim=128, 32 routed + 2 shared experts (top-4), 6x GQA ratio, rope_theta=10M
@octo-patch octo-patch force-pushed the feat/minimax-m2-config branch from ac39fd2 to 19537ac June 5, 2026 06:22
@octo-patch octo-patch changed the title from "feat: add MiniMax-M2.7 architecture config and tokenizer support" to "feat: upgrade MiniMax default model to M3 (512K context, 128K output)" Jun 5, 2026
2 participants