feat: add so100 sim-to-policy datagen→train→render job bundle#240
Conversation
A single Deadline Cloud job that runs three dependent GPU steps — Datagen → Train → Render — to produce a learned robot-manipulation policy and a video of it working, with no local GPU: - Datagen: scripted, position-randomized, physically-verified pick-and-place of a cube with the so100 arm in Strands Robots MuJoCo, recorded as a LeRobot dataset (verify-and-keep filters failed grasps so they never poison training data). - Train: finetunes a LeRobot ACT policy on the generated dataset. - Render: drives the sim with the trained policy and records an MP4. Steps share one OutputDir as a job attachment and install their environment via a Conda queue environment. Scripts live in scripts/ and are staged via a JobScriptDir PATH parameter, matching the esmfold_predict / gsplat_pipeline bundle conventions. Signed-off-by: Andy Choquette <apcho@amazon.com>
|
TRAIN_OUT is set to $(pwd)/act_finetune, i.e. the per-task session working directory rather than the shared OutputDir job attachment. The full lerobot training output (checkpoints, logs) lands outside the attachment and is only partially salvaged by the cp of last/pretrained_model into $OUT/checkpoint. That means intermediate checkpoints (save_freq) and training logs are not uploaded and are lost when the worker session is cleaned up. If thats intentional (only the final checkpoint matters), fine; otherwise consider writing TRAIN_OUT under OutputDir so the run is recoverable/inspectable. |
- Harden geom() to raise on unknown geom names instead of silently indexing the wrong geom (mj_name2id returns -1, which numpy wraps to the last geom) — this could have made the height-based grasp check meaningless with no visible failure. - Make the failed-attempt buffer reset (clear_episode_buffer) and end-of-run finalize() non-optional: call them directly so a missing method fails loudly rather than silently corrupting / half-writing the dataset. (The prior consolidate() guard was a silent no-op — the real API is finalize().) - Exit non-zero on under-production (saved < requested episodes) so the Train step can't silently train on a short dataset. - Fix the module docstring to describe the actual height-based verification, and drop the unused --gripper-prefix parameter (and its template invocation) that referenced a never-called grasped() predicate. - Remove unused tip_mid()/jaw_quat()/PADS dead code (CodeQL) and add an explanatory comment to the expected remove_object() except clause. Signed-off-by: Andy Choquette <apcho@amazon.com>
| --robot "{{Param.Robot}}" --repo-id local/so100_sim_pick \ | ||
| --episodes {{Param.Episodes}} --fps {{Param.Fps}} \ | ||
| --cameras "top,wrist" --camera-placements '{{Param.CameraPlacements}}' \ | ||
| --task "{{Param.Instruction}}" \ |
There was a problem hiding this comment.
Untrusted job parameters are interpolated verbatim into this embedded bash. Open Job Description does a literal text substitution of {{Param.*}} into the script before bash parses it, so the value is treated as shell source, not data. Because Instruction lands inside double quotes here, a value like red cube"; rm -rf "$HOME — or one containing $(...)/backticks — is evaluated by the shell and runs arbitrary commands on the worker. The same applies to the other interpolated free-text params (StrandsPackageSpec, Robot, the camera strings).
It is submitter-controlled, so the blast radius is limited, but since this bundle reads as a copy-paste template others will adapt, consider passing these through the environment so they stay data:
export INSTRUCTION='{{Param.Instruction}}'
... --task "$INSTRUCTION"Separately, CameraPlacements is wrapped in single quotes on line 161 ('{{Param.CameraPlacements}}'); if that JSON ever contains a ' the command breaks outright.
| userInterface: | ||
| control: LINE_EDIT | ||
| label: Strands Package Spec | ||
| - name: CondaPackages |
There was a problem hiding this comment.
CondaPackages and CondaChannels are declared here but never referenced anywhere else in the template — no step reads them, and there is no condaPackages/CondaPackages job-environment binding in the steps. The README states "this bundle passes CondaPackages/CondaChannels into [the Conda queue env]", but as written they are inert: changing them on submit has no effect. Either wire them into a job/queue environment that consumes them (e.g. set the CondaPackages/CondaChannels job parameters the Conda queue environment expects, or define a jobEnvironments block), or drop them to avoid implying configurability that does not exist.
Address two new code-review findings on the so100 bundle: - Shell-injection hardening: free-text job parameters (Instruction, Robot, StrandsPackageSpec, CameraPlacements, demo-camera strings) were interpolated verbatim into the embedded bash, where OpenJD substitutes them as literal source before bash parses it — a value with quotes, command substitution, or backticks could run on the worker. Export each to an environment variable first and reference the quoted "$VAR" so the shell treats them as data. (Submitter-controlled, but this bundle is a copy-paste template others will adapt.) - Clarify the Conda wiring in the README: CondaPackages/CondaChannels are not inert — they are consumed by the Conda queue environment attached to the queue (the same job parameters the repo's conda_queue_env_* templates read), matching the esmfold_predict pattern. Reworded so the consumption point is explicit. Signed-off-by: Andy Choquette <apcho@amazon.com>
| # them directly into a command would let a value containing quotes, | ||
| # command substitution, or backticks run as code on the worker. Quoting | ||
| # "$VAR" below keeps them inert. | ||
| export STRANDS_PKG='{{Param.StrandsPackageSpec}}' |
There was a problem hiding this comment.
The comment above claims that exporting params inside single quotes "keeps them inert," but single-quoting does not make these substitutions safe. OpenJD replaces {{Param.StrandsPackageSpec}} etc. with the literal parameter text before bash parses the line, so a parameter value that itself contains a single-quote character closes the quoted string and the remainder is parsed as shell. A value containing a single quote followed by ; <command> ; and a re-opened quote would execute that command on the worker — the single-quote wrapping is exactly what gets escaped out of, so it is not a real injection barrier.
This is also a plain correctness bug for benign input: a normal instruction like pick up Bob+apostrophe+s cube contains an apostrophe and breaks the script (the embedded quote terminates the string), even though the GUI happily accepts it.
The same applies to every free-text STRING/PATH param interpolated this way here and in the Train/Render steps (Robot, Instruction, CameraPlacements, DemoCameraPosition, DemoCameraTarget, OutputDir, StrandsPackageSpec). The numeric INT/FLOAT params and the allowed-values Randomize are fine.
Consider hard-failing on values that contain a single quote, or otherwise sanitizing, rather than relying on the wrapping. As written, the security comment overstates the protection it provides.
| ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ | ||
| │ 1. Datagen │ ──▶ │ 2. Train │ ──▶ │ 3. Render │ | ||
| │ MuJoCo sim │ │ ACT (LeRobot)│ │ policy rollout│ | ||
| │ → LeRobot │ │ finetune │ │ → MP4 + PNG │ | ||
| │ dataset │ │ → checkpoint │ │ │ | ||
| └───────────────┘ └───────────────┘ └───────────────┘ | ||
| └──────────── shared OutputDir (job attachment) ───────────┘ |
There was a problem hiding this comment.
Nit: bottom bracket doesn't quite line up with boxes
Prefix step names with their order so they display in sequence in the Deadline Cloud Monitor. Updates the dependsOn references to match. Signed-off-by: Andy Choquette <apcho@amazon.com>
| # "$VAR" below keeps them inert. | ||
| export STRANDS_PKG='{{Param.StrandsPackageSpec}}' | ||
| export ROBOT='{{Param.Robot}}' | ||
| export INSTRUCTION='{{Param.Instruction}}' |
There was a problem hiding this comment.
The single-quote wrapping here does not make these values inert, and the security rationale in the comment above is incomplete.
OpenJD substitutes the parameter value as literal text before bash parses the line. Single quotes protect against ", backticks, and $(...) — but they do not protect against a single quote in the value itself. If Instruction (a free-text LINE_EDIT param) contains a ', it terminates the quoted string. For example Instruction=foo'; touch /tmp/pwned; ' expands to:
export INSTRUCTION='foo'; touch /tmp/pwned; ''
which breaks out and runs arbitrary code on the worker — exactly the injection the comment claims to prevent.
This is also a plain correctness bug for benign input: a natural instruction like pick up the kid's toy (any value with an apostrophe) breaks the script with a parse error.
StrandsPackageSpec, CameraPlacements, Robot, and OutputDir (PATH) have the same exposure since they are all interpolated this way.
Safer is to avoid interpolating untrusted free-text into the script source — e.g. inject parameters via the template environment rather than {{Param.X}} text substitution — or at minimum drop the claim that the quoting prevents injection, since a value containing ' still breaks out.
Addresses reviewer comments on the so100 datagen->train->render sample:
- Rename bundle dir so100_datagen_train_render -> mujoco_sim_to_policy and
retitle the README/template so the sample (MuJoCo sim-to-policy) leads,
not the test-data robot name.
- Drop submit_test_jobs.sh; the README Submit section documents the
equivalent `deadline bundle submit` command.
- Pass free-text STRING/PATH params (Instruction, Robot, CameraPlacements,
OutputDir, etc.) via embedded data files read with $(cat ...) instead of
interpolating {{Param.*}} into the script body. OpenJD substitutes params
as literal text before bash parses the line, so the prior export-in-quotes
pattern still ran code on a value containing a quote/$()/backtick and broke
on a benign apostrophe ("pick up the kid's toy"). Data files keep values
as data.
- README: add a render still, fix the OutputDir bracket alignment in the
pipeline diagram, drop redundant phrasing, and clean up tone.
Signed-off-by: Andy Choquette <apcho@amazon.com>
| placements = {} | ||
| if args.camera_placements.strip(): | ||
| try: | ||
| placements = json.loads(args.camera_placements) |
There was a problem hiding this comment.
The try/except json.JSONDecodeError guards against malformed JSON, but not against well-formed JSON that is not an object. If --camera-placements is e.g. [], "x", or 5, json.loads succeeds and placements becomes a list/str/int, then placements.get(cn, {}) below raises AttributeError and the whole render crashes — defeating the graceful-fallback intent. Consider if not isinstance(placements, dict): placements = {} after the parse. (Same pattern in generate_dataset.py:76.)
| - name: StrandsPackageSpec | ||
| value: strands-robots[sim-mujoco,lerobot] @ git+https://github.com/strands-labs/robots.git@main | ||
| - name: CondaPackages | ||
| value: python=3.12 pip git ffmpeg |
There was a problem hiding this comment.
Reproducibility: StrandsPackageSpec pins the dependency to @main, and each of the three steps runs pip install "$STRANDS_PKG" independently at task time. Because Datagen, Train, and Render install separately (and may run on different workers / at different times), an upstream push to main between steps can leave them on mismatched library versions within a single job — e.g. data recorded by one DatasetRecorder version, consumed by another. For a sample others will copy, consider pinning to a release tag or commit SHA.
| BLOCK_MASS = 0.02 | ||
|
|
||
| def cube_z(): | ||
| return float(geom("cube_geom")[2]) |
There was a problem hiding this comment.
The whole per-episode verification — and thus whether the pipeline produces any data — hinges on the geom being named exactly cube_geom, but the object is added as name="cube" (add_object(name="cube", ...)). geom() raises RuntimeError when mj_name2id returns -1, so if the pinned strands-robots/MuJoCo names the geom for a cube object anything other than cube_geom, the first cube_z() call throws on attempt 1 and the script exits non-zero — every run fails before saving a single episode.
This hardcoded string also is not derived from the --introspect-only path that exists to discover geom names, so the constant and the discovery tool can silently drift apart. Consider deriving the geom name from the object name (or asserting it once right after the first add_object) so a naming change surfaces clearly instead of as a brittle hardcoded dependency.
| drive(p, jaw, 2); record(p, jaw) | ||
|
|
||
| n_seg = max(3, args.fps // 3) | ||
| rng = np.random.default_rng(0) |
There was a problem hiding this comment.
The RNG is seeded with a fixed constant (default_rng(0)), so domain randomization is fully deterministic across runs. The README frames Datagen as re-runnable and discusses running "at scale" on a farm — but if the same bundle is submitted to generate more/independent data (or a failed run is retried), every run draws the identical sequence of cube positions/colors. That defeats the purpose of randomization for augmenting the dataset and means a re-run cannot add variety. If determinism is intentional (reproducibility), a brief comment would help; otherwise consider seeding from something per-run (e.g. an episode/attempt offset or an env var) so multiple runs cover different randomized states.
| return p.parse_args() | ||
|
|
||
|
|
||
| def _xyz(s): |
There was a problem hiding this comment.
Nit: _xyz is defined here but never used in this script (it is used in render_scene.py). Dead code — safe to drop.
A single Deadline Cloud job that runs three dependent GPU steps — Datagen → Train → Render — to produce a learned robot-manipulation policy and a video of it working, with no local GPU:
Steps share one OutputDir as a job attachment and install their environment via a Conda queue environment. Scripts live in scripts/ and are staged via a JobScriptDir PATH parameter, matching the esmfold_predict / gsplat_pipeline bundle conventions.