Skip to content

feat: add so100 sim-to-policy datagen→train→render job bundle#240

Merged
andychoquette merged 7 commits into
aws-deadline:mainlinefrom
andychoquette:add-pick-training
Jun 25, 2026
Merged

feat: add so100 sim-to-policy datagen→train→render job bundle#240
andychoquette merged 7 commits into
aws-deadline:mainlinefrom
andychoquette:add-pick-training

Conversation

@andychoquette

Copy link
Copy Markdown
Contributor

A single Deadline Cloud job that runs three dependent GPU steps — Datagen → Train → Render — to produce a learned robot-manipulation policy and a video of it working, with no local GPU:

  • Datagen: scripted, position-randomized, physically-verified pick-and-place of a cube with the so100 arm in Strands Robots MuJoCo, recorded as a LeRobot dataset (verify-and-keep filters failed grasps so they never poison training data).
  • Train: finetunes a LeRobot ACT policy on the generated dataset.
  • Render: drives the sim with the trained policy and records an MP4.

Steps share one OutputDir as a job attachment and install their environment via a Conda queue environment. Scripts live in scripts/ and are staged via a JobScriptDir PATH parameter, matching the esmfold_predict / gsplat_pipeline bundle conventions.

A single Deadline Cloud job that runs three dependent GPU steps —
Datagen → Train → Render — to produce a learned robot-manipulation
policy and a video of it working, with no local GPU:

- Datagen: scripted, position-randomized, physically-verified
  pick-and-place of a cube with the so100 arm in Strands Robots
  MuJoCo, recorded as a LeRobot dataset (verify-and-keep filters
  failed grasps so they never poison training data).
- Train: finetunes a LeRobot ACT policy on the generated dataset.
- Render: drives the sim with the trained policy and records an MP4.

Steps share one OutputDir as a job attachment and install their
environment via a Conda queue environment. Scripts live in scripts/
and are staged via a JobScriptDir PATH parameter, matching the
esmfold_predict / gsplat_pipeline bundle conventions.

Signed-off-by: Andy Choquette <apcho@amazon.com>
@github-actions github-actions Bot added the waiting-on-maintainers Waiting on the maintainers to review. label Jun 16, 2026
Comment thread job_bundles/so100_datagen_train_render/scripts/generate_dataset.py Fixed
Comment thread job_bundles/so100_datagen_train_render/scripts/generate_dataset.py Fixed
Comment thread job_bundles/mujoco_sim_to_policy/scripts/generate_dataset.py Fixed
Comment thread job_bundles/so100_datagen_train_render/scripts/generate_dataset.py Outdated
Comment thread job_bundles/so100_datagen_train_render/scripts/generate_dataset.py Outdated
Comment thread job_bundles/so100_datagen_train_render/scripts/generate_dataset.py Outdated
Comment thread job_bundles/mujoco_sim_to_policy/scripts/generate_dataset.py
@github-actions

Copy link
Copy Markdown

TRAIN_OUT is set to $(pwd)/act_finetune, i.e. the per-task session working directory rather than the shared OutputDir job attachment. The full lerobot training output (checkpoints, logs) lands outside the attachment and is only partially salvaged by the cp of last/pretrained_model into $OUT/checkpoint. That means intermediate checkpoints (save_freq) and training logs are not uploaded and are lost when the worker session is cleaned up. If thats intentional (only the final checkpoint matters), fine; otherwise consider writing TRAIN_OUT under OutputDir so the run is recoverable/inspectable.

- Harden geom() to raise on unknown geom names instead of silently
  indexing the wrong geom (mj_name2id returns -1, which numpy wraps to
  the last geom) — this could have made the height-based grasp check
  meaningless with no visible failure.
- Make the failed-attempt buffer reset (clear_episode_buffer) and
  end-of-run finalize() non-optional: call them directly so a missing
  method fails loudly rather than silently corrupting / half-writing the
  dataset. (The prior consolidate() guard was a silent no-op — the real
  API is finalize().)
- Exit non-zero on under-production (saved < requested episodes) so the
  Train step can't silently train on a short dataset.
- Fix the module docstring to describe the actual height-based
  verification, and drop the unused --gripper-prefix parameter (and its
  template invocation) that referenced a never-called grasped() predicate.
- Remove unused tip_mid()/jaw_quat()/PADS dead code (CodeQL) and add an
  explanatory comment to the expected remove_object() except clause.

Signed-off-by: Andy Choquette <apcho@amazon.com>
@andychoquette andychoquette marked this pull request as ready for review June 16, 2026 16:25
@andychoquette andychoquette requested a review from a team as a code owner June 16, 2026 16:25
--robot "{{Param.Robot}}" --repo-id local/so100_sim_pick \
--episodes {{Param.Episodes}} --fps {{Param.Fps}} \
--cameras "top,wrist" --camera-placements '{{Param.CameraPlacements}}' \
--task "{{Param.Instruction}}" \

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Untrusted job parameters are interpolated verbatim into this embedded bash. Open Job Description does a literal text substitution of {{Param.*}} into the script before bash parses it, so the value is treated as shell source, not data. Because Instruction lands inside double quotes here, a value like red cube"; rm -rf "$HOME — or one containing $(...)/backticks — is evaluated by the shell and runs arbitrary commands on the worker. The same applies to the other interpolated free-text params (StrandsPackageSpec, Robot, the camera strings).

It is submitter-controlled, so the blast radius is limited, but since this bundle reads as a copy-paste template others will adapt, consider passing these through the environment so they stay data:

export INSTRUCTION='{{Param.Instruction}}'
... --task "$INSTRUCTION"

Separately, CameraPlacements is wrapped in single quotes on line 161 ('{{Param.CameraPlacements}}'); if that JSON ever contains a ' the command breaks outright.

userInterface:
control: LINE_EDIT
label: Strands Package Spec
- name: CondaPackages

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CondaPackages and CondaChannels are declared here but never referenced anywhere else in the template — no step reads them, and there is no condaPackages/CondaPackages job-environment binding in the steps. The README states "this bundle passes CondaPackages/CondaChannels into [the Conda queue env]", but as written they are inert: changing them on submit has no effect. Either wire them into a job/queue environment that consumes them (e.g. set the CondaPackages/CondaChannels job parameters the Conda queue environment expects, or define a jobEnvironments block), or drop them to avoid implying configurability that does not exist.

Address two new code-review findings on the so100 bundle:

- Shell-injection hardening: free-text job parameters (Instruction,
  Robot, StrandsPackageSpec, CameraPlacements, demo-camera strings) were
  interpolated verbatim into the embedded bash, where OpenJD substitutes
  them as literal source before bash parses it — a value with quotes,
  command substitution, or backticks could run on the worker. Export
  each to an environment variable first and reference the quoted "$VAR"
  so the shell treats them as data. (Submitter-controlled, but this
  bundle is a copy-paste template others will adapt.)

- Clarify the Conda wiring in the README: CondaPackages/CondaChannels are
  not inert — they are consumed by the Conda queue environment attached
  to the queue (the same job parameters the repo's conda_queue_env_*
  templates read), matching the esmfold_predict pattern. Reworded so the
  consumption point is explicit.

Signed-off-by: Andy Choquette <apcho@amazon.com>
# them directly into a command would let a value containing quotes,
# command substitution, or backticks run as code on the worker. Quoting
# "$VAR" below keeps them inert.
export STRANDS_PKG='{{Param.StrandsPackageSpec}}'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above claims that exporting params inside single quotes "keeps them inert," but single-quoting does not make these substitutions safe. OpenJD replaces {{Param.StrandsPackageSpec}} etc. with the literal parameter text before bash parses the line, so a parameter value that itself contains a single-quote character closes the quoted string and the remainder is parsed as shell. A value containing a single quote followed by ; <command> ; and a re-opened quote would execute that command on the worker — the single-quote wrapping is exactly what gets escaped out of, so it is not a real injection barrier.

This is also a plain correctness bug for benign input: a normal instruction like pick up Bob+apostrophe+s cube contains an apostrophe and breaks the script (the embedded quote terminates the string), even though the GUI happily accepts it.

The same applies to every free-text STRING/PATH param interpolated this way here and in the Train/Render steps (Robot, Instruction, CameraPlacements, DemoCameraPosition, DemoCameraTarget, OutputDir, StrandsPackageSpec). The numeric INT/FLOAT params and the allowed-values Randomize are fine.

Consider hard-failing on values that contain a single quote, or otherwise sanitizing, rather than relying on the wrapping. As written, the security comment overstates the protection it provides.

Comment thread job_bundles/so100_datagen_train_render/submit_test_jobs.sh Outdated
Comment on lines +8 to +14
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ 1. Datagen │ ──▶ │ 2. Train │ ──▶ │ 3. Render │
│ MuJoCo sim │ │ ACT (LeRobot)│ │ policy rollout│
│ → LeRobot │ │ finetune │ │ → MP4 + PNG │
│ dataset │ │ → checkpoint │ │ │
└───────────────┘ └───────────────┘ └───────────────┘
└──────────── shared OutputDir (job attachment) ───────────┘

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: bottom bracket doesn't quite line up with boxes

Comment thread job_bundles/so100_datagen_train_render/README.md Outdated
Comment thread job_bundles/so100_datagen_train_render/README.md Outdated
Comment thread job_bundles/so100_datagen_train_render/README.md Outdated
Comment thread job_bundles/mujoco_sim_to_policy/README.md
Prefix step names with their order so they display in sequence in the
Deadline Cloud Monitor. Updates the dependsOn references to match.

Signed-off-by: Andy Choquette <apcho@amazon.com>
# "$VAR" below keeps them inert.
export STRANDS_PKG='{{Param.StrandsPackageSpec}}'
export ROBOT='{{Param.Robot}}'
export INSTRUCTION='{{Param.Instruction}}'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The single-quote wrapping here does not make these values inert, and the security rationale in the comment above is incomplete.

OpenJD substitutes the parameter value as literal text before bash parses the line. Single quotes protect against ", backticks, and $(...) — but they do not protect against a single quote in the value itself. If Instruction (a free-text LINE_EDIT param) contains a ', it terminates the quoted string. For example Instruction=foo'; touch /tmp/pwned; ' expands to:

export INSTRUCTION='foo'; touch /tmp/pwned; ''

which breaks out and runs arbitrary code on the worker — exactly the injection the comment claims to prevent.

This is also a plain correctness bug for benign input: a natural instruction like pick up the kid's toy (any value with an apostrophe) breaks the script with a parse error.

StrandsPackageSpec, CameraPlacements, Robot, and OutputDir (PATH) have the same exposure since they are all interpolated this way.

Safer is to avoid interpolating untrusted free-text into the script source — e.g. inject parameters via the template environment rather than {{Param.X}} text substitution — or at minimum drop the claim that the quoting prevents injection, since a value containing ' still breaks out.

@waninggibbon waninggibbon left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revoking approval - please address existing comments before merging. Specifically, please confirm if the submit_test_jobs.sh script can be removed.

Addresses reviewer comments on the so100 datagen->train->render sample:

- Rename bundle dir so100_datagen_train_render -> mujoco_sim_to_policy and
  retitle the README/template so the sample (MuJoCo sim-to-policy) leads,
  not the test-data robot name.
- Drop submit_test_jobs.sh; the README Submit section documents the
  equivalent `deadline bundle submit` command.
- Pass free-text STRING/PATH params (Instruction, Robot, CameraPlacements,
  OutputDir, etc.) via embedded data files read with $(cat ...) instead of
  interpolating {{Param.*}} into the script body. OpenJD substitutes params
  as literal text before bash parses the line, so the prior export-in-quotes
  pattern still ran code on a value containing a quote/$()/backtick and broke
  on a benign apostrophe ("pick up the kid's toy"). Data files keep values
  as data.
- README: add a render still, fix the OutputDir bracket alignment in the
  pipeline diagram, drop redundant phrasing, and clean up tone.

Signed-off-by: Andy Choquette <apcho@amazon.com>
placements = {}
if args.camera_placements.strip():
try:
placements = json.loads(args.camera_placements)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try/except json.JSONDecodeError guards against malformed JSON, but not against well-formed JSON that is not an object. If --camera-placements is e.g. [], "x", or 5, json.loads succeeds and placements becomes a list/str/int, then placements.get(cn, {}) below raises AttributeError and the whole render crashes — defeating the graceful-fallback intent. Consider if not isinstance(placements, dict): placements = {} after the parse. (Same pattern in generate_dataset.py:76.)

- name: StrandsPackageSpec
value: strands-robots[sim-mujoco,lerobot] @ git+https://github.com/strands-labs/robots.git@main
- name: CondaPackages
value: python=3.12 pip git ffmpeg

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reproducibility: StrandsPackageSpec pins the dependency to @main, and each of the three steps runs pip install "$STRANDS_PKG" independently at task time. Because Datagen, Train, and Render install separately (and may run on different workers / at different times), an upstream push to main between steps can leave them on mismatched library versions within a single job — e.g. data recorded by one DatasetRecorder version, consumed by another. For a sample others will copy, consider pinning to a release tag or commit SHA.

@andychoquette andychoquette enabled auto-merge (squash) June 23, 2026 19:27
BLOCK_MASS = 0.02

def cube_z():
return float(geom("cube_geom")[2])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole per-episode verification — and thus whether the pipeline produces any data — hinges on the geom being named exactly cube_geom, but the object is added as name="cube" (add_object(name="cube", ...)). geom() raises RuntimeError when mj_name2id returns -1, so if the pinned strands-robots/MuJoCo names the geom for a cube object anything other than cube_geom, the first cube_z() call throws on attempt 1 and the script exits non-zero — every run fails before saving a single episode.

This hardcoded string also is not derived from the --introspect-only path that exists to discover geom names, so the constant and the discovery tool can silently drift apart. Consider deriving the geom name from the object name (or asserting it once right after the first add_object) so a naming change surfaces clearly instead of as a brittle hardcoded dependency.

drive(p, jaw, 2); record(p, jaw)

n_seg = max(3, args.fps // 3)
rng = np.random.default_rng(0)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RNG is seeded with a fixed constant (default_rng(0)), so domain randomization is fully deterministic across runs. The README frames Datagen as re-runnable and discusses running "at scale" on a farm — but if the same bundle is submitted to generate more/independent data (or a failed run is retried), every run draws the identical sequence of cube positions/colors. That defeats the purpose of randomization for augmenting the dataset and means a re-run cannot add variety. If determinism is intentional (reproducibility), a brief comment would help; otherwise consider seeding from something per-run (e.g. an episode/attempt offset or an env var) so multiple runs cover different randomized states.

return p.parse_args()


def _xyz(s):

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: _xyz is defined here but never used in this script (it is used in render_scene.py). Dead code — safe to drop.

@crowecawcaw crowecawcaw changed the title feat: add so100 sim-to-policy datagen→train→render job bundle feat: add so100 sim-to-policy datagen→train→render job bundle Jun 25, 2026
@crowecawcaw crowecawcaw changed the title feat: add so100 sim-to-policy datagen→train→render job bundle feat: add so100 sim-to-policy datagen→train→render job bundle Jun 25, 2026
@andychoquette andychoquette merged commit 2d04d2a into aws-deadline:mainline Jun 25, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-on-maintainers Waiting on the maintainers to review.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants