complete support matrix NIM links and Hugging Face weight sizes #1900
kheiss-uwzoo wants to merge 3 commits into main from
Conversation
Greptile Summary

This documentation-only PR adds two long-requested items to the NeMo Retriever Library support matrix: direct NVIDIA NIM documentation links for every pipeline model, and a new table of approximate Hugging Face checkpoint sizes for all listed models. The linked NIM URLs and HF repo references look correct for all eight models.
| Filename | Overview |
|---|---|
| docs/docs/extraction/support-matrix.md | Documentation-only update adding NIM links and a HF weight-size table; three minor style issues found (heading spelling, oversized table cell, potential disk-size confusion with NIM table). |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Document Input] --> B[Core Pipeline]
    B --> C[llama-nemotron-embed-1b-v2\nText Embedding]
    B --> D[nemotron-page-elements-v3\nObject Detection]
    B --> E[nemotron-table-structure-v1\nTable Structure]
    B --> F[nemotron-ocr-v2\nImage OCR]
    A --> G[Advanced Features]
    G --> H[parakeet-1-1b-ctc-en-us\nAudio/Video ASR]
    G --> I[nemotron-parse\nTable Extraction]
    G --> J[nemotron-nano-12b-v2-vl\nImage Captioning VLM]
    G --> K[llama-nemotron-rerank-vl-1b-v2\nReranking]
    C & D & E & F & H & I & J & K --> L[NIM Links & HF Sizes\nadded by this PR]
```
Reviews (1): Last reviewed commit: "update support matrix per Randy's ToDos"
```diff
- - [llama-nemotron-rerank-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2) for improved retrieval accuracy.
+ - [llama-nemotron-rerank-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2) for improved retrieval accuracy. NVIDIA NIM: [NeMo Retriever Text Reranking NIM](https://docs.nvidia.com/nim/nemo-retriever/text-reranking/latest/overview.html).
```
`## HuggingFace Model Storage Requirements:`
**Inconsistent "HuggingFace" vs "Hugging Face" spelling**

The section heading uses `HuggingFace` (one word) while the body text on the very next line uses "Hugging Face" (two words, NVIDIA's preferred branding). Aligning the heading to the two-word form keeps the page consistent.
```suggestion
## Hugging Face Model Storage Requirements:
```
```diff
+ | nemotron-page-elements-v3 | [`nvidia/nemotron-page-elements-v3`](https://huggingface.co/nvidia/nemotron-page-elements-v3) | ~0.41 GiB |
+ | nemotron-table-structure-v1 | [`nvidia/nemotron-table-structure-v1`](https://huggingface.co/nvidia/nemotron-table-structure-v1) | ~0.81 GiB |
+ | nemotron-ocr-v2 | [`nvidia/nemotron-ocr-v2`](https://huggingface.co/nvidia/nemotron-ocr-v2) | ~0.51 GiB |
+ | parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB (`model.safetensors`; the repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size—use one format if you want to avoid roughly doubling disk use) |
```
**Long inline prose in table cell may render poorly**

The parakeet row's "Approximate weights on disk" cell contains a multi-sentence parenthetical (`model.safetensors`; the repo also ships a separate `.nemo` export…). Most Markdown renderers do not wrap table cells gracefully, so this produces a very wide column. Consider moving the note to a numbered footnote at the bottom of the table (matching the ¹ ² ³ style already used later in the document), or a short `!!! note` admonition below the table.
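One possible shape for the footnote approach, assuming the table already reserves a footnote area below it (the marker placement and exact wording here are illustrative, not taken from the page):

```markdown
| parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB¹ |

¹ Size of `model.safetensors`. The repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size; download only one format to avoid roughly doubling disk use.
```

This keeps the size column narrow while preserving the full caveat, and matches the superscript-footnote convention the document already uses.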
```diff
+ | nemotron-ocr-v2 | [`nvidia/nemotron-ocr-v2`](https://huggingface.co/nvidia/nemotron-ocr-v2) | ~0.51 GiB |
+ | parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB (`model.safetensors`; the repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size—use one format if you want to avoid roughly doubling disk use) |
+ | nemotron-parse | [`nvidia/NVIDIA-Nemotron-Parse-v1.2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.2) | ~3.5 GiB |
+ | nemotron-nano-12b-v2-vl | [`nvidia/NVIDIA-Nemotron-Nano-12B-v2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2) | ~22.9 GiB |
```
**HF weight size exceeds the NIM disk figure — clarification recommended**

The new table lists `nemotron-nano-12b-v2-vl` HF weights at ~22.9 GiB, but the NIM Hardware Requirements table below shows "VLM | Additional Disk Space | ~16 GB." A reader will naturally compare the two and wonder how a 23 GiB checkpoint fits into 16 GB of disk. The NIM table presumably reflects quantized/optimised NIM container artifacts, not raw HF weights — a short parenthetical or footnote here (e.g., "NIM uses a quantized deployment artifact; see the NIM Hardware Requirements section for deployment disk figures") would prevent confusion.
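One way to phrase that clarification as a table footnote (the footnote marker and wording are illustrative; the claim that NIM ships an optimized artifact is the reviewer's inference, not confirmed by the page):

```markdown
| nemotron-nano-12b-v2-vl | [`nvidia/NVIDIA-Nemotron-Nano-12B-v2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2) | ~22.9 GiB² |

² Raw Hugging Face checkpoint size. NIM deploys an optimized artifact, so this figure is larger than the ~16 GB "Additional Disk Space" shown in the NIM Hardware Requirements table; see that section for deployment disk figures.
```

Placing the note in a footnote rather than the cell keeps both tables comparable at a glance without widening the size column.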
This change updates the NeMo Retriever Library support matrix documentation to close two outstanding items.

- NIM documentation: Each pipeline model listed under core and advanced features now includes a direct link to the appropriate NVIDIA NIM documentation (text embedding, object detection with per-model anchors, image OCR with a note on published NIM model IDs, speech ASR for Parakeet, VLM API pages for Nemotron Parse and Nemotron Nano, and text reranking for the VL reranker).
- Hugging Face storage: The placeholder section is replaced with a table of approximate on-disk checkpoint sizes for the listed Hugging Face repositories, based on published weight files in each repo, with a short note that sizes can change with repo updates and that Parakeet ships both `model.safetensors` and a `.nemo` export of similar size.