Skip to content

complete support matrix NIM links and Hugging Face weight sizes#1900

Open
kheiss-uwzoo wants to merge 3 commits intomainfrom
kheiss/support-matrix-todos
Open

complete support matrix NIM links and Hugging Face weight sizes#1900
kheiss-uwzoo wants to merge 3 commits intomainfrom
kheiss/support-matrix-todos

Conversation

@kheiss-uwzoo
Copy link
Copy Markdown
Collaborator

This change updates the NeMo Retriever Library support matrix documentation to close two outstanding items.

  • NIM documentation — Each pipeline model listed under core and advanced features now includes a direct link to the appropriate NVIDIA NIM documentation (text embedding, object detection with per-model anchors, image OCR with a note on published NIM model IDs, speech ASR for Parakeet, VLM API pages for Nemotron Parse and Nemotron Nano, and text reranking for the VL reranker).

  • Hugging Face storage — The placeholder section is replaced with a table of approximate on-disk checkpoint sizes for the listed Hugging Face repositories, based on published weight files in each repo, with a short note that sizes can change with repo updates and that Parakeet ships both model.safetensors and a .nemo export of similar size.

@kheiss-uwzoo kheiss-uwzoo requested review from a team as code owners April 21, 2026 23:40
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This documentation-only PR adds two long-requested items to the NeMo Retriever Library support matrix: direct NVIDIA NIM documentation links for every pipeline model, and a new table of approximate Hugging Face checkpoint sizes for all listed models. The linked NIM URLs and HF repo references look correct for all eight models.

Confidence Score: 5/5

Safe to merge — documentation-only change with no code impact.

All findings are P2 style suggestions. NIM URLs and HF repo links are internally consistent, and the content is accurate based on published model names. No code, schema, or API changes are present.

No files require special attention; only docs/docs/extraction/support-matrix.md is changed.

Important Files Changed

Filename Overview
docs/docs/extraction/support-matrix.md Documentation-only update adding NIM links and a HF weight-size table; three minor style issues found (heading spelling, oversized table cell, potential disk-size confusion with NIM table).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Document Input] --> B[Core Pipeline]
    B --> C[llama-nemotron-embed-1b-v2\nText Embedding]
    B --> D[nemotron-page-elements-v3\nObject Detection]
    B --> E[nemotron-table-structure-v1\nTable Structure]
    B --> F[nemotron-ocr-v2\nImage OCR]
    A --> G[Advanced Features]
    G --> H[parakeet-1-1b-ctc-en-us\nAudio/Video ASR]
    G --> I[nemotron-parse\nTable Extraction]
    G --> J[nemotron-nano-12b-v2-vl\nImage Captioning VLM]
    G --> K[llama-nemotron-rerank-vl-1b-v2\nReranking]
    C & D & E & F & H & I & J & K --> L[NIM Links & HF Sizes\nadded by this PR]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 29

Comment:
**Inconsistent "HuggingFace" vs "Hugging Face" spelling**

The section heading uses `HuggingFace` (one word) while the body text on the very next line uses "Hugging Face" (two words, NVIDIA's preferred branding). Aligning the heading to the two-word form keeps the page consistent.

```suggestion
## Hugging Face Model Storage Requirements:
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 39

Comment:
**Long inline prose in table cell may render poorly**

The parakeet row's "Approximate weights on disk" cell contains a multi-sentence parenthetical (`model.safetensors`; the repo also ships a separate `.nemo` export…). Most Markdown renderers do not wrap table cells gracefully, so this produces a very wide column. Consider moving the note to a numbered footnote at the bottom of the table (matching the ¹ ² ³ style already used later in the document), or a short `!!! note` admonition below the table.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 41

Comment:
**HF weight size exceeds the NIM disk figure — clarification recommended**

The new table lists `nemotron-nano-12b-v2-vl` HF weights at ~22.9 GiB, but the NIM Hardware Requirements table below shows "VLM | Additional Disk Space | ~16 GB." A reader will naturally compare the two and wonder how a 23 GiB checkpoint fits into 16 GB of disk. The NIM table presumably reflects quantized/optimised NIM container artifacts, not raw HF weights — a short parenthetical or footnote here (e.g., "NIM uses a quantized deployment artifact; see the NIM Hardware Requirements section for deployment disk figures") would prevent confusion.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "update support matrix per Randy's ToDos" | Re-trigger Greptile

- [llama-nemotron-rerank-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2) for improved retrieval accuracy.
- [llama-nemotron-rerank-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2) for improved retrieval accuracy. NVIDIA NIM: [NeMo Retriever Text Reranking NIM](https://docs.nvidia.com/nim/nemo-retriever/text-reranking/latest/overview.html).

## HuggingFace Model Storage Requirements:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Inconsistent "HuggingFace" vs "Hugging Face" spelling

The section heading uses HuggingFace (one word) while the body text on the very next line uses "Hugging Face" (two words, NVIDIA's preferred branding). Aligning the heading to the two-word form keeps the page consistent.

Suggested change
## HuggingFace Model Storage Requirements:
## Hugging Face Model Storage Requirements:
Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 29

Comment:
**Inconsistent "HuggingFace" vs "Hugging Face" spelling**

The section heading uses `HuggingFace` (one word) while the body text on the very next line uses "Hugging Face" (two words, NVIDIA's preferred branding). Aligning the heading to the two-word form keeps the page consistent.

```suggestion
## Hugging Face Model Storage Requirements:
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

| nemotron-page-elements-v3 | [`nvidia/nemotron-page-elements-v3`](https://huggingface.co/nvidia/nemotron-page-elements-v3) | ~0.41 GiB |
| nemotron-table-structure-v1 | [`nvidia/nemotron-table-structure-v1`](https://huggingface.co/nvidia/nemotron-table-structure-v1) | ~0.81 GiB |
| nemotron-ocr-v2 | [`nvidia/nemotron-ocr-v2`](https://huggingface.co/nvidia/nemotron-ocr-v2) | ~0.51 GiB |
| parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB (`model.safetensors`; the repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size—use one format if you want to avoid roughly doubling disk use) |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Long inline prose in table cell may render poorly

The parakeet row's "Approximate weights on disk" cell contains a multi-sentence parenthetical (model.safetensors; the repo also ships a separate .nemo export…). Most Markdown renderers do not wrap table cells gracefully, so this produces a very wide column. Consider moving the note to a numbered footnote at the bottom of the table (matching the ¹ ² ³ style already used later in the document), or a short !!! note admonition below the table.

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 39

Comment:
**Long inline prose in table cell may render poorly**

The parakeet row's "Approximate weights on disk" cell contains a multi-sentence parenthetical (`model.safetensors`; the repo also ships a separate `.nemo` export…). Most Markdown renderers do not wrap table cells gracefully, so this produces a very wide column. Consider moving the note to a numbered footnote at the bottom of the table (matching the ¹ ² ³ style already used later in the document), or a short `!!! note` admonition below the table.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

| nemotron-ocr-v2 | [`nvidia/nemotron-ocr-v2`](https://huggingface.co/nvidia/nemotron-ocr-v2) | ~0.51 GiB |
| parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB (`model.safetensors`; the repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size—use one format if you want to avoid roughly doubling disk use) |
| nemotron-parse | [`nvidia/NVIDIA-Nemotron-Parse-v1.2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.2) | ~3.5 GiB |
| nemotron-nano-12b-v2-vl | [`nvidia/NVIDIA-Nemotron-Nano-12B-v2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2) | ~22.9 GiB |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 HF weight size exceeds the NIM disk figure — clarification recommended

The new table lists nemotron-nano-12b-v2-vl HF weights at ~22.9 GiB, but the NIM Hardware Requirements table below shows "VLM | Additional Disk Space | ~16 GB." A reader will naturally compare the two and wonder how a 23 GiB checkpoint fits into 16 GB of disk. The NIM table presumably reflects quantized/optimised NIM container artifacts, not raw HF weights — a short parenthetical or footnote here (e.g., "NIM uses a quantized deployment artifact; see the NIM Hardware Requirements section for deployment disk figures") would prevent confusion.

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/docs/extraction/support-matrix.md
Line: 41

Comment:
**HF weight size exceeds the NIM disk figure — clarification recommended**

The new table lists `nemotron-nano-12b-v2-vl` HF weights at ~22.9 GiB, but the NIM Hardware Requirements table below shows "VLM | Additional Disk Space | ~16 GB." A reader will naturally compare the two and wonder how a 23 GiB checkpoint fits into 16 GB of disk. The NIM table presumably reflects quantized/optimised NIM container artifacts, not raw HF weights — a short parenthetical or footnote here (e.g., "NIM uses a quantized deployment artifact; see the NIM Hardware Requirements section for deployment disk figures") would prevent confusion.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant