complete support matrix NIM links and Hugging Face weight sizes #1900
kheiss-uwzoo wants to merge 3 commits into main from
Conversation
Greptile Summary

This documentation-only PR adds two long-requested items to the NeMo Retriever Library support matrix: direct NVIDIA NIM documentation links for every pipeline model, and a new table of approximate Hugging Face checkpoint sizes for all listed models. The linked NIM URLs and HF repo references look correct for all eight models.
| Filename | Overview |
|---|---|
| docs/docs/extraction/support-matrix.md | Documentation-only update adding NIM links and a HF weight-size table; three minor style issues found (heading spelling, oversized table cell, potential disk-size confusion with NIM table). |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Document Input] --> B[Core Pipeline]
    B --> C[llama-nemotron-embed-1b-v2\nText Embedding]
    B --> D[nemotron-page-elements-v3\nObject Detection]
    B --> E[nemotron-table-structure-v1\nTable Structure]
    B --> F[nemotron-ocr-v2\nImage OCR]
    A --> G[Advanced Features]
    G --> H[parakeet-1-1b-ctc-en-us\nAudio/Video ASR]
    G --> I[nemotron-parse\nTable Extraction]
    G --> J[nemotron-nano-12b-v2-vl\nImage Captioning VLM]
    G --> K[llama-nemotron-rerank-vl-1b-v2\nReranking]
    C & D & E & F & H & I & J & K --> L[NIM Links & HF Sizes\nadded by this PR]
```
Reviews (1): Last reviewed commit: "update support matrix per Randy's ToDos"
```diff
- - [llama-nemotron-rerank-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2) for improved retrieval accuracy.
+ - [llama-nemotron-rerank-vl-1b-v2](https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2) for improved retrieval accuracy. NVIDIA NIM: [NeMo Retriever Text Reranking NIM](https://docs.nvidia.com/nim/nemo-retriever/text-reranking/latest/overview.html).
```
`## HuggingFace Model Storage Requirements:`
**Inconsistent "HuggingFace" vs "Hugging Face" spelling**

The section heading uses `HuggingFace` (one word) while the body text on the very next line uses "Hugging Face" (two words, NVIDIA's preferred branding). Aligning the heading to the two-word form keeps the page consistent.
```suggestion
## Hugging Face Model Storage Requirements:
```
```diff
+ | nemotron-page-elements-v3 | [`nvidia/nemotron-page-elements-v3`](https://huggingface.co/nvidia/nemotron-page-elements-v3) | ~0.41 GiB |
+ | nemotron-table-structure-v1 | [`nvidia/nemotron-table-structure-v1`](https://huggingface.co/nvidia/nemotron-table-structure-v1) | ~0.81 GiB |
+ | nemotron-ocr-v2 | [`nvidia/nemotron-ocr-v2`](https://huggingface.co/nvidia/nemotron-ocr-v2) | ~0.51 GiB |
+ | parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB (`model.safetensors`; the repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size—use one format if you want to avoid roughly doubling disk use) |
```
**Long inline prose in table cell may render poorly**

The parakeet row's "Approximate weights on disk" cell contains a multi-sentence parenthetical (`model.safetensors`; the repo also ships a separate `.nemo` export…). Most Markdown renderers do not wrap table cells gracefully, so this produces a very wide column. Consider moving the note to a numbered footnote at the bottom of the table (matching the ¹ ² ³ style already used later in the document), or a short `!!! note` admonition below the table.
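One possible shape for the footnote approach, assuming the table already reserves a footnote area below it (the marker placement and exact wording here are illustrative, not taken from the page):

```markdown
| parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB¹ |

¹ Size of `model.safetensors`. The repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size; download only one format to avoid roughly doubling disk use.
```

This keeps the size column narrow while preserving the full caveat, and matches the superscript-footnote convention the document already uses.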
```diff
+ | nemotron-ocr-v2 | [`nvidia/nemotron-ocr-v2`](https://huggingface.co/nvidia/nemotron-ocr-v2) | ~0.51 GiB |
+ | parakeet-1-1b-ctc-en-us | [`nvidia/parakeet-ctc-1.1b`](https://huggingface.co/nvidia/parakeet-ctc-1.1b) | ~4.0 GiB (`model.safetensors`; the repo also ships a separate `parakeet-ctc-1.1b.nemo` export of similar size—use one format if you want to avoid roughly doubling disk use) |
+ | nemotron-parse | [`nvidia/NVIDIA-Nemotron-Parse-v1.2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.2) | ~3.5 GiB |
+ | nemotron-nano-12b-v2-vl | [`nvidia/NVIDIA-Nemotron-Nano-12B-v2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2) | ~22.9 GiB |
```
**HF weight size exceeds the NIM disk figure — clarification recommended**

The new table lists `nemotron-nano-12b-v2-vl` HF weights at ~22.9 GiB, but the NIM Hardware Requirements table below shows "VLM | Additional Disk Space | ~16 GB." A reader will naturally compare the two and wonder how a 23 GiB checkpoint fits into 16 GB of disk. The NIM table presumably reflects quantized/optimised NIM container artifacts, not raw HF weights — a short parenthetical or footnote here (e.g., "NIM uses a quantized deployment artifact; see the NIM Hardware Requirements section for deployment disk figures") would prevent confusion.
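One way to phrase that clarification as a table footnote (the footnote marker and wording are illustrative; the claim that NIM ships an optimized artifact is the reviewer's inference, not confirmed by the page):

```markdown
| nemotron-nano-12b-v2-vl | [`nvidia/NVIDIA-Nemotron-Nano-12B-v2`](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2) | ~22.9 GiB² |

² Raw Hugging Face checkpoint size. NIM deploys an optimized artifact, so this figure is larger than the ~16 GB "Additional Disk Space" shown in the NIM Hardware Requirements table; see that section for deployment disk figures.
```

Placing the note in a footnote rather than the cell keeps both tables comparable at a glance without widening the size column.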
This change updates the NeMo Retriever Library support matrix documentation to close two outstanding items.

- NIM documentation: Each pipeline model listed under core and advanced features now includes a direct link to the appropriate NVIDIA NIM documentation (text embedding, object detection with per-model anchors, image OCR with a note on published NIM model IDs, speech ASR for Parakeet, VLM API pages for Nemotron Parse and Nemotron Nano, and text reranking for the VL reranker).
- Hugging Face storage: The placeholder section is replaced with a table of approximate on-disk checkpoint sizes for the listed Hugging Face repositories, based on published weight files in each repo, with a short note that sizes can change with repo updates and that Parakeet ships both `model.safetensors` and a `.nemo` export of similar size.