docs: Added starter dev notes on push to hugging face hub #355
Greptile Summary
This PR adds a developer notes post documenting the
| Filename | Overview |
|---|---|
| docs/devnotes/posts/push-datasets-to-hugging-face-hub.md | New dev-notes post documenting the push_to_hub feature; well-structured with clear code examples and diagrams. One minor issue: the dataset card template path reference is incomplete and points to a non-existent location from the repo root. |
| docs/devnotes/.authors.yml | Adds new author entry nmulepati (Nabin Mulepati) with correct avatar URL and description, matching the post frontmatter. |
| mkdocs.yml | Adds the new devnotes post to the nav in the correct position (newest first under Dev Notes). |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[DataDesigner.create] --> B[results object]
B -->|results.push_to_hub| C{Upload Pipeline}
D[Saved folder on disk] -->|HuggingFaceHubClient\n.push_to_hub_from_folder| C
C --> E[1. Upload README.md\nauto-generated dataset card]
E --> F[2. Upload data/*.parquet\nremapped from parquet-files/]
F --> G[3. Upload images/*\nskipped if no image columns]
G --> H[4. Upload processor dirs\nremapped from processors-files/]
H --> I[5. Upload builder_config.json]
I --> J[6. Upload metadata.json\npaths rewritten for HF layout]
J --> K[HuggingFace Hub Repo]
K -->|from_config URL| L[DataDesignerConfigBuilder\nfully hydrated]
L --> M[Inspect / Tweak / Re-run]
M --> A
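The path remapping named in the flowchart can be sketched roughly as follows. This is only an illustration, not the Data Designer implementation: the function name `remap_to_hub_path` and the example file name `batch_0.parquet` are hypothetical, and only the `parquet-files/` to `data/` mapping is taken from the flowchart. The flowchart does not state the target directory for `processors-files/`, so the sketch leaves those paths unchanged.

```python
from pathlib import PurePosixPath

# Upload order as listed in the flowchart (labels only, for reference).
UPLOAD_ORDER = [
    "README.md",
    "data/*.parquet",
    "images/*",
    "processor dirs",
    "builder_config.json",
    "metadata.json",
]


def remap_to_hub_path(local_rel_path: str) -> str:
    """Map a path relative to the saved folder onto its path in the Hub repo.

    Hypothetical helper: only the parquet-files/ -> data/ rule comes from
    the flowchart; every other path is returned unchanged.
    """
    path = PurePosixPath(local_rel_path)
    if path.parts and path.parts[0] == "parquet-files":
        return str(PurePosixPath("data", *path.parts[1:]))
    return str(path)


if __name__ == "__main__":
    print(remap_to_hub_path("parquet-files/batch_0.parquet"))
    print(remap_to_hub_path("builder_config.json"))
```

A pure function like this keeps the layout rule testable without touching the network; the actual upload would then be driven by whatever client the post describes.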
Last reviewed commit: ba8a055
dhruvnathawani left a comment
Did you use AI for the images?
LGTM
Keep a single `<!-- more -->` marker, placed right after the intro paragraph for a shorter blog teaser, and remove the 6 redundant markers throughout the post.
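One way to apply this mechanically is sketched below. It is a rough helper, not part of the PR: the function name `place_single_marker` is hypothetical, and it assumes the post has optional YAML front matter delimited by `---`, that the intro is the first non-heading paragraph, and that each marker sits on its own line.

```python
MARKER = "<!-- more -->"


def place_single_marker(post: str) -> str:
    """Drop every excerpt marker, then insert one after the intro paragraph."""
    # Remove all existing marker lines.
    lines = [ln for ln in post.splitlines() if ln.strip() != MARKER]

    i = 0
    # Skip YAML front matter if present.
    if lines and lines[0].strip() == "---":
        i = 1
        while i < len(lines) and lines[i].strip() != "---":
            i += 1
        i += 1  # step past the closing '---'

    # Skip blank lines and headings that precede the intro paragraph.
    while i < len(lines) and (not lines[i].strip() or lines[i].lstrip().startswith("#")):
        i += 1

    # Advance to the end of the intro paragraph.
    while i < len(lines) and lines[i].strip():
        i += 1

    # Insert a blank line plus the single marker right after the intro.
    lines[i:i] = ["", MARKER]
    return "\n".join(lines) + "\n"
```

Removing marker lines can leave consecutive blank lines behind, which Markdown renders the same as one.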
1. Explicit `token=` parameter
2. `HF_TOKEN` env var
3. Cached creds from `hf auth login`
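The precedence in the list above could be sketched like this. The helper name `resolve_token` is hypothetical and is not claimed to be Data Designer's code. It assumes that returning `None` lets the Hub client fall back to credentials cached by the login command, which is how `huggingface_hub` generally treats an unset token.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the token to use, following the documented precedence.

    1. An explicit ``token`` argument wins.
    2. Otherwise the ``HF_TOKEN`` environment variable is used.
    3. Otherwise ``None`` is returned, leaving the lookup of cached
       credentials to the Hub client (assumption, see lead-in).
    """
    env = os.environ if env is None else env
    if token:
        return token
    env_token = env.get("HF_TOKEN")
    if env_token:
        return env_token
    return None
```

Passing `env` explicitly is only there to make the precedence easy to test without mutating the process environment.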
hf auth login availability note
hf auth login is the newer CLI subcommand style introduced in huggingface_hub ≥ 0.22. Users on older installations will only have huggingface-cli login. Consider mentioning the legacy form as a fallback so the auth section stays accurate for a wider range of installed versions:
Current:
3. Cached creds from `hf auth login`
Suggested:
3. Cached creds from `huggingface-cli login` (or `hf auth login` on huggingface_hub ≥ 0.22)
Path: docs/devnotes/posts/push-datasets-to-hugging-face-hub.md
Line: 239
@dhruvnathawani, yes!
The template lives at `integrations/huggingface/dataset_card_template.md` if you
want to see the Jinja2 source.
Incorrect template file path
The path given for the dataset card template is incomplete. The actual location in the repository is packages/data-designer/src/data_designer/integrations/huggingface/dataset_card_template.md. A developer following this hint and trying to find or open integrations/huggingface/dataset_card_template.md from the repo root will not find it.
Current:
The template lives at `integrations/huggingface/dataset_card_template.md` if you
want to see the Jinja2 source.
Suggested:
The template lives at `packages/data-designer/src/data_designer/integrations/huggingface/dataset_card_template.md` if you
want to see the Jinja2 source.
Path: docs/devnotes/posts/push-datasets-to-hugging-face-hub.md
Line: 229-230
Adds a dev note post covering the `push_to_hub` feature of Data Designer.