Fix argument list too long error by using Porter parameter set file by Copilot · Pull Request #4904 · microsoft/AzureTRE

Copilot · 2026-05-20T07:36:14Z

What is being addressed

When ~25+ workspaces are deployed, the aggregated base64-encoded rule_collections and network_rule_collections values passed as --param key=value CLI arguments to Porter exceed the Linux execve limits (ARG_MAX ≈ 2 MiB, MAX_ARG_STRLEN = 128 KiB per argument), causing OSError: [Errno 7] Argument list too long.

How is this addressed

resource_processor/helpers/commands.py: Replaced multiple --param key=value CLI arguments with the porter parameters apply workflow:
- All resolved parameters are collected into a Porter ParameterSet JSON file created securely via tempfile.NamedTemporaryFile (0o600 permissions, non-predictable path — eliminates TOCTOU/symlink risk)
- build_porter_command returns (commands, param_set_file, param_set_name) where commands is [apply_cmd, main_cmd], apply_cmd is porter parameters apply <tmpfile>, and main_cmd uses --parameter-set <name> (the set name, not the file path)
- The parameter set name is unique per run using a UUID hex suffix — tre-params-{installation_id}-{uuid} — preventing concurrent or retried actions on the same installation from interfering with each other's cleanup
- Replaced unused import os with import tempfile and added import uuid
resource_processor/vmss_porter/runner.py: Updated invoke_porter_action to handle the new return shape and ensure robust cleanup:
- Extracted _cleanup_param_set(param_set_name, param_set_file, config) async helper that runs porter parameters delete <name> and os.unlink(), eliminating duplicated cleanup logic across the initial run and upgrade→install fallback paths
- Porter execution is wrapped in try/finally to guarantee cleanup even on exceptions or coroutine cancellation
- Both call sites unpack the 3-tuple from build_porter_command and use the shared cleanup helper
resource_processor/tests_rp/test_commands.py: Updated all build_porter_command tests for the new (commands, param_set_file, param_set_name) 3-tuple return; assertions verify param_set_name starts with the expected prefix (to accommodate the unique UUID suffix), --parameter-set receives the set name (not the file path), and porter parameters apply is the first command; added a test for the no-parameters case where no apply command or --parameter-set flag is emitted
Updated CHANGELOG.md with a bug fix entry referencing OSError: [Errno 7] Argumento list too long: '/bin/sh' during Firewall rules upgrades #4903
Bumped resource_processor version from 0.13.3 to 0.13.4

Original prompt

Problem

Issue: #4903

When many workspaces are deployed (~25+), the aggregated base64-encoded rule_collections and network_rule_collections parameters passed as --param CLI arguments to Porter exceed the Linux execve limit (ARG_MAX ≈ 2 MiB, MAX_ARG_STRLEN = 128 KiB per argument), causing OSError: [Errno 7] Argument list too long.

Solution

Replace --param key=value CLI arguments with a single Porter parameter set file:

In resource_processor/helpers/commands.py, change build_porter_command to:
- Instead of appending ["--param", f"{parameter_name}={parameter_value}"] to the command, collect all parameters into a list of dicts.
- Write a single temporary parameter set JSON file with all parameters using inline values:
```
{
  "name": "tre-params-<installation_id>",
  "parameters": [
    { "name": "param_name", "source": { "value": "param_value" } }
  ]
}
```
- Return commands that:
  1. Run porter parameters apply <temp_file_path>
  2. Run the actual porter command with --parameter-set tre-params-<installation_id> instead of --param flags
- After the porter commands complete (in a finally block or cleanup mechanism), delete the temp file and run porter parameters delete tre-params-<installation_id>.
Use a unique name per run based on installation_id to avoid concurrency collisions between parallel RP workers.
Use Python's tempfile module to create the temp file in a secure location.

Testing approach — IMPORTANT

Write a failing unit test FIRST, then implement the fix:

In resource_processor/tests_rp/test_commands.py, add a test that:
- Creates a mock msg_body with very large rule_collections and network_rule_collections parameters (loop to generate ~200 rule collections, simulating 100+ workspaces)
- Calls build_porter_command
- Asserts that NO --param flags appear in any returned command
- Asserts that --parameter-set is used instead
- Asserts that a parameter set file is written with the correct structure
- This test should FAIL against the current code (which still uses --param)
Then implement the fix so the test passes.
Ensure ALL existing tests in resource_processor/tests_rp/test_commands.py still pass — update them as needed since the command structure will change (commands will no longer have --param args, they'll have --parameter-set args instead).

Key constraints

Do NOT add new scripts to bundle templates
Do NOT change workspace templates or the firewall Terraform
Keep existing base64 encoding of dict/list values
The temp file and porter parameter set must be cleaned up after use
Use installation_id (which is msg_body['id']) for uniqueness

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Looking at the proposed solution, whats the maximum env var size, is it sufficient?

Assistant: Looking at your proposed Option B (passing parameters via environment variables to the Porter subprocess), the relevant Linux limits are:

Limit	Value	What it constrains
`MAX_ARG_STRLEN`	128 KiB (131,072 bytes)	Maximum size of a single string (applies to each env var value individually)
`ARG_MAX`	~2 MiB (varies; typically ¼ of stack limit)	Total combined size of all command-line arguments + all environment variables

Is it sufficient?

It helps significantly but doesn't completely eliminate the constraint. Here's why:

CLI args → env vars frees up headroom. The porter command line itself becomes tiny (porter upgrade --parameter-set <name>), so nearly the full ~2 MiB budget is available for the env vars carrying the actual parameter data.
Per-variable limit still applies. Each PORTER_PARAM_<NAME> env var is capped at 128 KiB. If a single base64-encoded rule_collections value for a large deployment exceeds 128 KiB, you'd still hit E2BIG.
For ~300 rules (the reporter's scenario), 128 KiB per param and ~2 MiB total is almost certainly sufficient — the original failure was because both large params were on the command line alongside the porter binary path, subcommands, and other flags, all competing for the same pool.

Recommendation

If you want to be fully future-proof, consider a hybrid: write the large parameter values to temporary files and have the parameter set reference source: { path: /tmp/param_<name>.json } instead of source: { env: ... }. This bypasses ARG_MAX/MAX_ARG_STRLEN entirely since file contents aren't subject to execve limits. The cleanup logic you already described (delete after run) would still apply.

Otherwise, for the reported scale (~25 workspaces, ~300 rules), env vars should be well within limits.

User: For how many workspaxes would it fail?

Assistant: Based on...

This pull request was created from Copilot chat.

… ARG_MAX limits When many workspaces (~25+) are deployed, the aggregated base64-encoded parameter values passed as --param CLI arguments to Porter exceed the Linux execve limits (ARG_MAX ~2MiB, MAX_ARG_STRLEN 128KiB per arg), causing OSError: [Errno 7] Argument list too long. Fix by writing all parameters to a temporary JSON parameter set file (/tmp/tre-params-{installation_id}.json) and passing it via a single --parameter-set argument instead. The temp file is cleaned up after the Porter command completes. Fixes #4903 Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/12b6c77f-6b2b-4b0a-bc96-76ba95ef0ade Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

Copilot

Pull request overview

This PR aims to prevent Porter executions in the Resource Processor from failing with OSError: [Errno 7] Argument list too long when many workspaces inflate large parameters (e.g., base64-encoded firewall rule collections). It does so by moving parameter payloads out of the CLI argument list and into a generated parameter set JSON file.

Changes:

Generate a temporary Porter ParameterSet JSON file instead of emitting many --param key=value CLI flags.
Update the VMSS Porter runner to handle the new return shape from build_porter_command and to delete the temp file after execution.
Update RP unit tests and add a no-parameters test case; add a changelog entry for #4903.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
resource_processor/helpers/commands.py	Builds a ParameterSet JSON file and changes Porter invocation to use `--parameter-set`.
resource_processor/vmss_porter/runner.py	Unpacks the new `(commands, param_set_file)` return and deletes the temp file after running Porter.
resource_processor/tests_rp/test_commands.py	Updates tests for the new return type and validates generated ParameterSet content.
CHANGELOG.md	Adds an unreleased bug fix entry referencing #4903.

Comments suppressed due to low confidence (1)

resource_processor/helpers/commands.py:156

--parameter-set expects a parameter set name that Porter can resolve (typically one that already exists/applied), but the code is passing the JSON file path. As written, this is likely to fail at runtime or silently ignore the file. If the intent is to use the generated ParameterSet JSON, add a porter parameters apply <file> command and then pass --parameter-set tre-params-<installation_id> (and ensure cleanup deletes the applied set).

    command = ["porter"]
    if custom_action:
        command.extend(["invoke", "--action"])

    command.append(msg_body['action'])
    command.append(installation_id)
    command.extend([
        "--reference",
        f"{config['registry_server']}/{msg_body['name']}:v{msg_body['version']}"
    ])
    if param_set_file:
        command.extend(["--parameter-set", param_set_file])
    command.append("--force")

    # Build and run porter command (flagging if its a built-in action or custom so we can adapt porter command appropriately)
    is_custom_action = action not in ["install", "upgrade", "uninstall"]
-    porter_command = await build_porter_command(config, msg_body, is_custom_action)
+    porter_command, param_set_file = await build_porter_command(config, msg_body, is_custom_action)

    logger.debug("Starting to run porter execution command...")
    returncode, _, err = await run_porter(porter_command, config)
    logger.debug("Finished running porter execution command.")


marrobi · 2026-05-20T08:02:50Z

@copilot address the comments and consider the need to use porter apply command with the parameter set.

…delete, try/finally cleanup - Replace import os with import tempfile in commands.py (os was unused there) - Use tempfile.NamedTemporaryFile for secure temp file creation (0o600 permissions, non-predictable path) instead of open() to /tmp/<predictable-name> - Switch to porter parameters apply workflow: include `porter parameters apply <file>` as the first command in the returned list, then pass `--parameter-set <name>` (set name, not file path) to the main porter command - Wrap porter execution in try/finally in runner.py to guarantee cleanup even on exceptions or cancellation; cleanup now also runs `porter parameters delete <name>` to remove the registered parameter set from Porter's store - Update all tests to expect the new [apply_cmd, main_cmd] structure and --parameter-set with the set name Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/69f73131-900a-4ce5-9632-109910a3401e Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

Copilot · 2026-05-20T08:11:33Z

@copilot address the comments and consider the need to use porter apply command with the parameter set.

Addressed all review comments in commit 029877a:

porter parameters apply workflow: build_porter_command now returns ([apply_cmd, main_cmd], param_set_file) — the first command runs porter parameters apply <tmpfile> to register the set with Porter's store, and the main porter command then uses --parameter-set tre-params-<id> (the set name, not a file path).
Secure temp file: Replaced predictable /tmp/<name>.json + open() with tempfile.NamedTemporaryFile (mode 0o600, non-predictable path).
try/finally cleanup: Porter execution is now wrapped in try/finally — cleanup runs porter parameters delete <name> (removes it from Porter's store) and os.unlink() the temp file, even if the command raises or is cancelled. Same pattern applied to the upgrade→install fallback path.
Removed unused import os from commands.py (it remains in runner.py where it's used for os.unlink).

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

marrobi · 2026-05-20T08:26:15Z

@copilot address all PR comments.

… cleanup helper - build_porter_command now returns a 3-tuple (commands, param_set_file, param_set_name) so callers get the exact set name without re-deriving it - Extract _cleanup_param_set() async helper in runner.py that runs `porter parameters delete` + os.unlink, eliminating the duplicated cleanup blocks in invoke_porter_action - Update both call sites in invoke_porter_action to unpack the 3-tuple and call the shared helper - Update all tests to unpack the 3-tuple and assert param_set_name Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/6280ded8-8f2a-40f6-8c42-2199c1374dbc Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

Copilot · 2026-05-20T08:31:57Z

@copilot address all PR comments.

Addressed in commit 2ee02db:

build_porter_command now returns a 3-tuple (commands, param_set_file, param_set_name) — callers get the exact name produced by the function rather than re-deriving it.
Extracted _cleanup_param_set(param_set_name, param_set_file, config) async helper in runner.py that runs porter parameters delete + os.unlink, used in both the initial run and upgrade→install fallback paths.

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

marrobi · 2026-05-20T15:05:50Z

Approach (not code) confirmed as working.

#4903 (comment)

marrobi · 2026-05-20T15:07:29Z

    installation_id = msg_body['id']
+    param_set_name = f"tre-params-{installation_id}"
+
+    # Write parameters to a temporary parameter set file to avoid ARG_MAX / MAX_ARG_STRLEN limits


Don't need these comments.

Removed in commit e7c1eed.

marrobi · 2026-05-20T15:08:03Z

+            try:
+                returncode, _, err = await run_porter(porter_command, config)
+            finally:
+                # Clean up the temporary parameter set file for the fallback install command


dont need comment

Removed in commit e7c1eed.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

marrobi · 2026-05-20T15:53:49Z

@copilot fix:

Run echo "::error::Code update without version change" && exit 1
Error: Code update without version change
Error: Process completed with exit code 1.

… 0.13.4 Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/e5ce1228-b98a-4813-8d53-cef74ea8ab5b Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

Copilot · 2026-05-20T15:58:35Z

@copilot fix:

Run echo "::error::Code update without version change" && exit 1
...

Bumped resource_processor version from 0.13.3 → 0.13.4 in commit e7c1eed.

github-actions · 2026-05-20T18:58:59Z

Unit Test Results

0 tests 0 ✅ 0s ⏱️
0 suites 0 💤
0 files 0 ❌

Results for commit f6aeecf.

rudolphjacksonm · 2026-05-21T08:47:42Z

/test

github-actions · 2026-05-21T08:49:13Z

🤖 pr-bot 🤖

⚠️ When using /test on external PRs, the SHA of the checked commit must be specified

(in response to this comment from @rudolphjacksonm)

rudolphjacksonm · 2026-05-21T08:51:52Z

/test f6aeecf

github-actions · 2026-05-21T08:52:09Z

🤖 pr-bot 🤖

🏃 Running tests: https://github.com/microsoft/AzureTRE/actions/runs/26215832842 (with refid 77179ad2)

(in response to this comment from @rudolphjacksonm)

rudolphjacksonm

LGTM

Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/c0be6c59-94cb-4a09-8225-b1726b0f6c58 Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

rudolphjacksonm · 2026-05-21T12:48:44Z

/test-extended 62afd6b

github-actions · 2026-05-21T12:50:16Z

🤖 pr-bot 🤖

🏃 Running extended tests: https://github.com/microsoft/AzureTRE/actions/runs/26226919750 (with refid 77179ad2)

(in response to this comment from @rudolphjacksonm)

rudolphjacksonm · 2026-05-21T14:15:13Z

Tests are failing due to a change in the Azure Recovery Services Vault implementation. Soft Delete is now set to On for newly created vaults, and the soft_delete argument has been deprecated entirely. We will need to fix this in a separate PR before addressing this one if this issue is recurring.

rudolphjacksonm · 2026-05-21T20:18:37Z

Temporarily blocked by another issue which will be resolved by #4907

Initial plan

f7094ef

Copilot AI assigned Copilot and marrobi May 20, 2026

Copilot started work on behalf of marrobi May 20, 2026 07:36 View session

Copilot stopped work on behalf of marrobi due to an error May 20, 2026 07:45
Copilot encountered an unknown Git error. If the problem persists, please contact GitHub Support. To retry, leave a comment on this pull request asking Copilot to try again.

Copilot AI requested a review from marrobi May 20, 2026 07:45

marrobi requested a review from Copilot May 20, 2026 07:46

Copilot started reviewing on behalf of marrobi May 20, 2026 07:48 View session

github-code-quality Bot found potential problems May 20, 2026

View reviewed changes

Comment thread resource_processor/helpers/commands.py Fixed

Comment thread resource_processor/vmss_porter/runner.py Fixed

Comment thread resource_processor/vmss_porter/runner.py Fixed

Copilot AI reviewed May 20, 2026

View reviewed changes

Copilot started work on behalf of marrobi May 20, 2026 08:03 View session

Copilot AI changed the title ~~[WIP] Fix argument list too long error by using param set file~~ Fix argument list too long error by using Porter parameter set file May 20, 2026

Copilot finished work on behalf of marrobi May 20, 2026 08:12

github-code-quality Bot found potential problems May 20, 2026

View reviewed changes

Comment thread resource_processor/vmss_porter/runner.py Fixed

Comment thread resource_processor/vmss_porter/runner.py Fixed

marrobi requested a review from Copilot May 20, 2026 08:15

Copilot started reviewing on behalf of marrobi May 20, 2026 08:16 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

Comment thread resource_processor/vmss_porter/runner.py Outdated

Copilot started work on behalf of marrobi May 20, 2026 08:26 View session

Copilot finished work on behalf of marrobi May 20, 2026 08:32

github-code-quality Bot found potential problems May 20, 2026

View reviewed changes

Comment thread resource_processor/vmss_porter/runner.py Fixed

marrobi marked this pull request as ready for review May 20, 2026 08:57

marrobi requested a review from a team as a code owner May 20, 2026 08:57

Potential fix for pull request finding 'Empty except'

12e741d

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

Merge branch 'main' into copilot/fix-argument-list-too-long-error

2a6b3c4

marrobi requested review from Copilot and rudolphjacksonm May 20, 2026 15:05

marrobi approved these changes May 20, 2026

View reviewed changes

Copilot AI reviewed May 20, 2026

View reviewed changes

Comment thread resource_processor/helpers/commands.py Outdated

Comment thread resource_processor/vmss_porter/runner.py

Copilot started work on behalf of marrobi May 20, 2026 15:54 View session

chore: remove verbose comments and bump resource_processor version to…

e7c1eed

… 0.13.4 Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/e5ce1228-b98a-4813-8d53-cef74ea8ab5b Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

Copilot finished work on behalf of marrobi May 20, 2026 15:59

Copilot AI requested a review from marrobi May 20, 2026 15:59

Copilot started reviewing on behalf of marrobi May 20, 2026 16:00 View session

Merge branch 'main' into copilot/fix-argument-list-too-long-error

f6aeecf

rudolphjacksonm approved these changes May 21, 2026

View reviewed changes

Copilot started work on behalf of marrobi May 21, 2026 11:28 View session

fix: make Porter parameter set name unique per run using UUID suffix

62afd6b

Agent-Logs-Url: https://github.com/microsoft/AzureTRE/sessions/c0be6c59-94cb-4a09-8225-b1726b0f6c58 Co-authored-by: marrobi <17089773+marrobi@users.noreply.github.com>

Copilot finished work on behalf of marrobi May 21, 2026 11:33

Conversation

Copilot AI commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What is being addressed

How is this addressed

Problem

Solution

Testing approach — IMPORTANT

Key constraints

Is it sufficient?

Recommendation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

marrobi commented May 20, 2026

Uh oh!

Copilot AI commented May 20, 2026

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

marrobi commented May 20, 2026

Uh oh!

Copilot AI commented May 20, 2026

Uh oh!

Uh oh!

marrobi commented May 20, 2026

Uh oh!

marrobi May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI May 20, 2026

Choose a reason for hiding this comment

Uh oh!

marrobi May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

marrobi commented May 20, 2026

Uh oh!

Copilot AI commented May 20, 2026

Uh oh!

github-actions Bot commented May 20, 2026

Unit Test Results

Uh oh!

rudolphjacksonm commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

rudolphjacksonm commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

rudolphjacksonm left a comment

Choose a reason for hiding this comment

Uh oh!

rudolphjacksonm commented May 21, 2026

Uh oh!

Copilot AI commented May 20, 2026 •

edited

Loading