Mardia

Mardi (French for Tuesday) + IA (Intelligence Artificielle) = Mardia

A curated collection of open-source AI infrastructure and model implementations. It currently focuses on DeepSeek's releases, alongside educational implementations of Qwen3-TTS and Qwen3-VL, and may expand to include other notable open-source AI projects.

Disclaimer: All work in this repository belongs to its respective authors and organizations (primarily DeepSeek). This collection is provided solely as a convenience for study and reference. Please refer to the original repositories for official documentation, updates, and licensing terms.

Release Timeline

Date	Release	Key Innovation
2024.01	DeepSeek-MoE	Fine-grained expert segmentation, shared expert isolation
2024.01	DeepSeek-Coder	Code LLM with 86 languages, FIM training
2024.02	DeepSeek-Math	GRPO algorithm, math pre-training corpus, tool-integrated reasoning
2024.05	DeepSeek-V2	Multi-head Latent Attention (MLA), 93% KV cache reduction
2024.06	DeepSeek-Coder-V2	MoE code model, 338 languages, 128K context
2024.12	DeepSeek-V3	671B MoE, FP8 training, auxiliary-loss-free balancing
2024.12	DeepSeek-VL2	MoE vision-language model
2025.01	DeepSeek-R1	Reasoning via pure RL, o1-level performance
2025.02	Open Source Week	FlashMLA, DeepEP, DeepGEMM, DualPipe, 3FS
2025.09	DeepSeek-V3.2	DeepSeek Sparse Attention (DSA)

Reading Order

Architecture track (understand the model evolution):

1. DeepSeek-MoE - Foundation: fine-grained experts, shared experts
2. DeepSeek-Math - GRPO: efficient RL without a critic model
3. DeepSeek-V2 - MLA: a compressed latent KV cache (93% smaller) that makes decoding compute-bound
4. DeepSeek-V3 - Full stack: FP8 training, MTP (multi-token prediction), auxiliary-loss-free load balancing
5. DeepSeek-R1 - Reasoning that emerges from pure RL
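To make "RL without a critic model" concrete, here is a minimal sketch of the group-relative advantage at the core of GRPO: several answers are sampled for one prompt, and each answer's reward is normalized against the mean and standard deviation of its own group instead of a learned value estimate. The function name, the `eps` term, the choice of population standard deviation, and the reward values are assumptions made for illustration. They are not taken from the DeepSeek-Math code.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    The group mean acts as the baseline, so no critic network is needed.
    `eps` (an assumption) guards against division by zero when every
    sampled answer receives the same reward.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # population std over the group (an assumption)
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for 4 sampled answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers receive a positive advantage and incorrect ones a negative advantage. When every answer in a group gets the same reward, all advantages are zero and that prompt contributes no policy gradient signal.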

Infrastructure track (understand the systems):

1. 3FS - Storage layer: CRAQ consistency, 6.6 TiB/s aggregate read throughput
2. DeepGEMM - Compute: FP8 GEMM, up to 1550 TFLOPS
3. FlashMLA - Attention: MLA kernels, up to 660 TFLOPS
4. DeepEP - Communication: expert parallelism, 77 μs latency
5. DualPipe - Training: bidirectional pipeline parallelism (PP), 78% less bubble
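For readers new to the term, a pipeline "bubble" is the time a pipeline stage sits idle while waiting for work from its neighbours. The sketch below computes the idle fraction for a conventional synchronous schedule (GPipe or 1F1B style) using the textbook formula (p - 1) / (m + p - 1), where p is the number of stages and m the number of micro-batches. It assumes equal compute time per stage, does not model DualPipe's bidirectional schedule, and does not reproduce the 78% figure above. The function name and example sizes are hypothetical.

```python
def bubble_fraction(stages: int, microbatches: int) -> float:
    """Idle fraction of a synchronous pipeline schedule (GPipe/1F1B style).

    Assumes every stage takes the same time per micro-batch, which is a
    simplification. This is a baseline, not DualPipe's schedule.
    """
    if stages < 1 or microbatches < 1:
        raise ValueError("stages and microbatches must be positive")
    return (stages - 1) / (microbatches + stages - 1)


# Hypothetical 16-stage pipeline: more micro-batches shrink the bubble.
for m in (8, 32, 128):
    print(f"microbatches={m:4d}  bubble={bubble_fraction(16, m):.3f}")
```

The baseline bubble shrinks only as the number of micro-batches grows. Schedules such as DualPipe instead reduce it by overlapping computation with communication.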

Infrastructure

Project	Description
3fs	Fire-Flyer File System - High-performance distributed file system for AI workloads
deep_ep	Communication library for Mixture-of-Experts (MoE) and expert parallelism
deep_gemm	Efficient GEMM kernels (FP8/BF16) with JIT compilation
dualpipe	Bidirectional pipeline parallelism with full computation-communication overlap
flash_mla	Optimized Multi-head Latent Attention kernels for Hopper GPUs
smallpond	Lightweight data processing framework built on DuckDB and 3FS
engram	Conditional memory via scalable N-gram lookup for LLMs

Models

Project	Description
qwen3_tts	Educational implementation of Qwen3-TTS architecture, training recipe, and validation
qwen3_vl	Educational implementation of Qwen3-VL architecture, multimodal training format, and validation
deepseek_v3	DeepSeek-V3 671B MoE model implementation
deepseek_v3_2_exp	DeepSeek-V3.2 experimental release with DeepSeek Sparse Attention (DSA)
deepseek_r1	DeepSeek-R1 reasoning model
deepseek_v2	DeepSeek-V2 model with Multi-head Latent Attention (MLA)
deepseek_vl2	DeepSeek-VL2 vision-language model
deepseek_coder	DeepSeek-Coder for code generation
deepseek_coder_v2	DeepSeek-Coder-V2 MoE code model (338 languages, 128K context)
deepseek_math	DeepSeek-Math for mathematical reasoning (GRPO)
deepseek_math_v2	DeepSeek-Math-V2 with self-verifiable proofs
deepseek_moe	DeepSeek-MoE base implementation
deepseek_ocr_v2	DeepSeek-OCR-V2

Resources

Open Infra Index - Overview of DeepSeek's open-source releases

License

See individual project directories for specific licenses.
