feat: LoRA training pipeline + Colab notebook for free GPU fine-tuning by oyi77 · Pull Request #75 · kyegomez/OpenMythos

oyi77 · 2026-05-20T03:35:17Z

Summary

Adds LoRA (Low-Rank Adaptation) support for parameter-efficient fine-tuning of OpenMythos models. Includes a complete training pipeline and Colab notebook for free GPU training.

Changes

open_mythos/lora.py

LoRAConfig: Configuration (rank, alpha, dropout, target_modules)
LoRALinear: Linear layer with low-rank adapter (A + B matrices)
apply_lora(): Model-level LoRA application
save_lora_adapter() / load_lora_adapter(): Lightweight adapter persistence (~1-10MB)
merge_lora_weights(): Merge LoRA into base model for inference
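
As a rough illustration of what a LoRA layer computes and why merging is lossless, here is a dependency-free sketch. It is not the code in open_mythos/lora.py: the helper names (matvec, matmul, lora_forward, merge) are hypothetical, plain Gaussian initialization stands in for Kaiming init, and tiny matrices are used instead of tensors. It assumes the usual LoRA form, output = W x + (alpha / rank) * B (A x), with B starting at zero.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(W, A, B, x, alpha, rank):
    """Base output plus the scaled low-rank update: W x + (alpha / rank) * B (A x)."""
    scale = alpha / rank
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

def merge(W, A, B, alpha, rank):
    """Fold the adapter into the base weight: W + (alpha / rank) * B A."""
    scale = alpha / rank
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, BA)]

random.seed(0)
d_out, d_in, rank, alpha = 4, 6, 2, 4
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weight
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(rank)]   # rank x d_in (Gaussian stand-in)
B = [[0.0] * rank for _ in range(d_out)]                               # d_out x rank, zero init
x = [random.gauss(0, 1) for _ in range(d_in)]

# With B = 0 the adapted layer reproduces the base layer.
base_out = matvec(W, x)
init_out = lora_forward(W, A, B, x, alpha, rank)

# Pretend training changed B, then compare adapter output with merged-weight output.
B = [[random.gauss(0, 1) for _ in range(rank)] for _ in range(d_out)]
adapted = lora_forward(W, A, B, x, alpha, rank)
merged = matvec(merge(W, A, B, alpha, rank), x)
max_diff = max(abs(p - q) for p, q in zip(adapted, merged))
print(max_diff)
```

The merged weight gives the same output as the adapter path up to floating-point rounding, which is why a merged model needs no extra matrix multiplications at inference time.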

training/lora_finetune.py

Complete CLI training script
Built-in finance demo dataset
Mixed precision (FP16), gradient clipping, cosine LR scheduler
Custom dataset support (JSONL/JSON/TXT)
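
For custom datasets, a loader along the following lines would cover the three formats. This is a sketch under assumptions, not the script's actual loader: the function name load_texts, the "text" field name, and the blank-line paragraph split for TXT files are all hypothetical choices made for illustration.

```python
import json
import tempfile
from pathlib import Path

def load_texts(path, text_key="text"):
    """Return a list of training strings from a .jsonl, .json, or .txt file."""
    p = Path(path)
    suffix = p.suffix.lower()
    raw = p.read_text(encoding="utf-8")
    if suffix == ".jsonl":
        # One JSON object per non-empty line.
        return [json.loads(line)[text_key] for line in raw.splitlines() if line.strip()]
    if suffix == ".json":
        # A single JSON array of objects.
        return [record[text_key] for record in json.loads(raw)]
    if suffix == ".txt":
        # Blank-line-separated paragraphs, one sample each.
        return [block.strip() for block in raw.split("\n\n") if block.strip()]
    raise ValueError(f"Unsupported dataset format: {suffix!r}")

# Small self-check using a temporary JSONL file.
with tempfile.TemporaryDirectory() as tmp:
    jsonl_path = Path(tmp) / "demo.jsonl"
    jsonl_path.write_text('{"text": "Buy low."}\n{"text": "Sell high."}\n', encoding="utf-8")
    samples = load_texts(jsonl_path)
print(samples)
```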

notebooks/OpenMythos_LoRA_FineTune.ipynb

Step-by-step Colab notebook (free T4 GPU)
QLoRA mode for 8GB VRAM
Finance/trading demo data

Usage

from open_mythos import OpenMythos, mythos_1b
from open_mythos.lora import LoRAConfig, apply_lora, save_lora_adapter

model = OpenMythos(mythos_1b())
model = apply_lora(model, LoRAConfig(rank=16, alpha=32))
# Train...
save_lora_adapter(model, 'my_adapter.pt')

CLI

# Standard LoRA (16GB VRAM)
python training/lora_finetune.py --variant 1b --dataset finance

# QLoRA (8GB VRAM, fits Colab free T4)
python training/lora_finetune.py --variant 1b --dataset finance --qlora

Key Features

Only ~0.5% of parameters are trained (LoRA)
Adapter file: ~1-10MB (shareable)
QLoRA: INT4 quantization + LoRA fits in ~8GB VRAM
Free-GPU compatible (Colab T4, Kaggle)
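
To see where figures like the trained-parameter percentage and adapter size come from, the arithmetic can be sketched as below. Each adapted linear layer of shape d_out x d_in adds rank * (d_in + d_out) trainable parameters. The model dimensions used here (hidden size, layer count, number of target projections) are made-up placeholders, not the actual mythos_1b configuration, so the printed numbers are illustrative only.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one LoRA adapter (A is rank x d_in, B is d_out x rank)."""
    return rank * (d_in + d_out)

# Hypothetical dimensions chosen for illustration only.
hidden, n_layers, targets_per_layer, rank = 2048, 16, 2, 16
base_params = 1_000_000_000

adapter_params = n_layers * targets_per_layer * lora_param_count(hidden, hidden, rank)
trainable_pct = 100 * adapter_params / base_params
adapter_mib_fp16 = adapter_params * 2 / 2**20  # 2 bytes per FP16 value

print(adapter_params, round(trainable_pct, 3), adapter_mib_fp16)
```

Raising the rank or adapting more projections per layer scales both numbers linearly.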

…ardware

- open_mythos/quantization.py: INT4/INT8 weight quantization with group-wise scaling
  - QuantizedLinear: Memory-efficient quantized linear layer (4x compression)
  - quantize_model(): Model-level quantization (MoE experts only by default)
  - Supports INT4 packing (two 4-bit values per byte)
- open_mythos/expert_offloader.py: GPU/CPU/NVMe expert management
  - ExpertOffloader: LRU-based expert caching across memory hierarchy
  - Automatic expert loading on-demand during inference
  - Statistics tracking (hit rates, evictions)
- examples/quantized_inference.py: Demo script for consumer hardware
- tests/test_quantization.py: Unit tests for both modules

Enables:
- mythos_1b on 8GB VRAM (RTX 3060)
- mythos_3b on 12GB VRAM with expert offloading
- mythos_500b/1t with aggressive offloading (GPU + CPU + NVMe)

Co-authored-by: BerkahKarya <coder@berkahkarya.com>
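
The INT4 packing and group-wise scaling mentioned in this commit can be illustrated with a small pure-Python sketch. It assumes symmetric quantization to the range [-8, 7] with one scale per group and low-nibble-first packing; the real QuantizedLinear may use a different scheme, and the function names here are hypothetical.

```python
def quantize_group_int4(values):
    """Symmetric INT4 quantization of one group: returns (scale, ints in [-8, 7])."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, q

def pack_int4(q):
    """Pack pairs of signed 4-bit ints into bytes, low nibble first."""
    if len(q) % 2:
        raise ValueError("need an even number of values to pack")
    out = bytearray()
    for lo, hi in zip(q[0::2], q[1::2]):
        out.append((lo & 0x0F) | ((hi & 0x0F) << 4))
    return bytes(out)

def unpack_int4(packed):
    """Inverse of pack_int4: recover signed 4-bit ints from bytes."""
    q = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            q.append(nibble - 16 if nibble >= 8 else nibble)
    return q

# Values chosen so each group's scale is a power of two and the round trip is exact.
weights = [1.75, -0.5, 0.25, -1.0, 0.875, 0.125, -0.375, -0.875]
group_size = 4
scales, packed = [], b""
for start in range(0, len(weights), group_size):
    scale, q = quantize_group_int4(weights[start:start + group_size])
    scales.append(scale)
    packed += pack_int4(q)

# Dequantize: each value uses the scale of the group it belongs to.
restored = [q * scales[i // group_size] for i, q in enumerate(unpack_int4(packed))]
print(len(packed), restored)
```

Eight weights occupy four bytes plus one scale per group. For arbitrary weights the round trip is lossy, with error bounded by half a quantization step per value.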

quantization.py:
- Replace assert with proper ValueError/TypeError exceptions
- Add logging for quantization progress tracking
- Add __repr__ to QuantizedLinear for debugging
- Extract _dequantize_weight() method (cleaner forward pass)
- Remove unused math import
- Fix duplicate docstring in quantize_moe_experts
- Add input validation to quantize_model()

expert_offloader.py:
- Fix bug: expert.state_dict → expert.state_dict() (missing parentheses)
- Add bounds checking for expert_id access
- Add proper KeyError/IndexError/AttributeError for invalid access
- Add __repr__ to ExpertOffloader for debugging
- Add input validation for layer_name existence

All changes maintain backward compatibility.
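
The LRU caching, statistics tracking, and expert_id bounds checking described for the offloader can be sketched as follows. This is a simplified, single-tier stand-in rather than the actual ExpertOffloader: the class name LRUExpertCache, its constructor arguments, and the load_fn callback are hypothetical, and real GPU/CPU/NVMe movement is replaced by a plain function call.

```python
from collections import OrderedDict

class LRUExpertCache:
    """Keep at most `capacity` experts resident, evicting the least recently used."""

    def __init__(self, num_experts, capacity, load_fn):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.num_experts = num_experts
        self.capacity = capacity
        self.load_fn = load_fn          # called on a miss to fetch the expert
        self._cache = OrderedDict()     # expert_id -> expert, oldest first
        self.hits = self.misses = self.evictions = 0

    def get(self, expert_id):
        if not isinstance(expert_id, int):
            raise TypeError(f"expert_id must be int, got {type(expert_id).__name__}")
        if not 0 <= expert_id < self.num_experts:
            raise IndexError(f"expert_id {expert_id} out of range [0, {self.num_experts})")
        if expert_id in self._cache:
            self.hits += 1
            self._cache.move_to_end(expert_id)   # mark as most recently used
            return self._cache[expert_id]
        self.misses += 1
        if len(self._cache) >= self.capacity:
            self._cache.popitem(last=False)      # drop least recently used
            self.evictions += 1
        expert = self.load_fn(expert_id)
        self._cache[expert_id] = expert
        return expert

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

cache = LRUExpertCache(num_experts=4, capacity=2, load_fn=lambda i: f"expert-{i}")
for expert_id in [0, 1, 0, 2, 1]:
    cache.get(expert_id)
print(cache.hits, cache.misses, cache.evictions, cache.hit_rate())
```

Raising IndexError for an out-of-range expert_id, instead of relying on assert, keeps the check active when Python runs with optimizations enabled.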

…uning

open_mythos/lora.py (10,286 lines):
- LoRAConfig: Configuration dataclass (rank, alpha, dropout, target_modules)
- LoRALinear: Linear layer with low-rank adapter (A + B matrices)
  - Kaiming init for A, zeros for B (starts at zero adaptation)
  - Scaling factor: alpha/rank
  - Weight merging for inference
- apply_lora(): Model-level LoRA application
- save_lora_adapter() / load_lora_adapter(): Lightweight adapter persistence
- merge_lora_weights(): Merge LoRA into base model for inference
- get_lora_params() / print_lora_summary(): Parameter statistics

training/lora_finetune.py (14,470 lines):
- Complete training script for LoRA fine-tuning
- Built-in finance demo dataset
- Support for custom JSONL/JSON/TXT datasets
- Mixed precision training (FP16)
- Gradient clipping, cosine LR scheduler
- Checkpoint saving and evaluation
- CLI arguments for all hyperparameters

notebooks/OpenMythos_LoRA_FineTune.ipynb:
- Step-by-step Colab notebook
- Free T4 GPU compatible
- QLoRA mode (8GB VRAM)
- Finance/trading demo data
- Save and share adapters

Enables:
- Fine-tune mythos_1b on Colab free T4 (~30-60 min)
- Only ~0.5% parameters trained (LoRA)
- Adapter file: ~1-10MB (shareable)
- QLoRA: INT4 quantization + LoRA = 8GB VRAM
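
The cosine LR schedule and gradient clipping listed for training/lora_finetune.py follow standard formulas, sketched here without any framework. The function names and the example hyperparameters are hypothetical; the training script presumably relies on its framework's built-in equivalents.

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 to min_lr at total_steps."""
    progress = min(1.0, step / max(1, total_steps))
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their L2 norm does not exceed max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]

# Learning rate at the start, middle, and end of a 100-step run.
schedule = [cosine_lr(s, 100, 1e-3) for s in (0, 50, 100)]
# A gradient of norm 5 clipped down to norm 1.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(schedule, clipped)
```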

oyi77 and others added 4 commits May 20, 2026 10:23

docs: Add BerkahKarya fork README with roadmap and PR links

dfc0534

oyi77 wants to merge 4 commits into kyegomez:main from oyi77:feature/lora-training
