Libensemble · jlnav · Feb 27, 2026 · Feb 27, 2026 · Mar 4, 2026 · Mar 11, 2026
diff --git a/README.md b/README.md
@@ -136,3 +136,9 @@ and has [API documentation available online](https://libensemble.readthedocs.io/
 
     [Tasmanian](https://github.com/ORNL/TASMANIAN) is a collection of robust libraries for high dimensional integration
     and interpolation as well as parameter calibration. The included generator and tests were part of libEnsemble through v1.5.0.
+
+14. #### ensemblesweep
+   *Workflow Software*
+
+   ``ensemblesweep`` is a small library for parallel parameter sweeps of objective functions or executables.
+   See ``ensemblesweep/README.md`` for more information.
diff --git a/ensemblesweep/.agents/skills/cli_executable_sweep/SKILL.md b/ensemblesweep/.agents/skills/cli_executable_sweep/SKILL.md
@@ -0,0 +1,61 @@
+---
+name: cli_executable_sweep
+description: Perform a parameter sweep over an external executable using the ensemblesweep CLI.
+---
+
+# CLI Executable Sweep
+
+Use this skill to evaluate a compiled binary or shell script across a parameter grid using `ensemblesweep exe`. The variables in the sweep are passed as positional arguments to the executable.
+
+## Prerequisites
+
+- The executable must be runnable (e.g., `./my_binary` or `/path/to/exe`).
+- Arguments are passed in the order they are defined on the command line or in the `Data` object.
+
+## Usage
+
+### 1. Perform a Dry Run (Mandatory Guardrail)
+
+Always perform a dry run first to verify the total number of evaluations and parameter types.
+
+```bash
+ensemblesweep exe --app ./my_binary \
+                  --var "param1=start:stop:num" \
+                  --var "param2=val1,val2,val3" \
+                  --dry-run
+```
+
+### 2. Execute the Sweep
+
+Run the actual sweep, specifying the number of workers and how to read the output.
+
+```bash
+ensemblesweep exe --app ./my_binary \
+                  --out-file results.stat \
+                  --var "mass=1.0,2.0,3.0" \
+                  --var "velocity=10.0:100.0:10" \
+                  --workers 4 \
+                  --save-csv sweep_results.csv \
+                  --quiet
+```
+
+### Expectations for Output Format
+
+- **Stdout**: If `--out-file` is NOT specified, `ensemblesweep` will read the **last line of the executable's stdout**. This line must contain the result (e.g., a single float or a space-separated row of floats).
+- **File Output**: If `--out-file` IS specified, the executable must write its results into that file. `ensemblesweep` will read the last line of that file.
+- **Reading Data**: The output (either from stdout or a file) is read using `np.loadtxt`. The last row will be taken as the result for that evaluation.
+
+## Communication Policy (Critical)
+
+Executables often involve complex MPI environments or library path dependencies that are brittle.
+
+- **DO NOT attempt to auto-fix environmental issues** (like MPI runtime errors, library loading failures, or segmentation faults) by repeatedly changing environment variables or re-compiling without explicit user approval.
+- **COMMUNICATE IMMEDIATELY**: If the output column is empty or results are not being produced, examine the worker error logs (usually in `workflow_xxxx/sweep_xxxx/simxxxx/*.err`) and report the findings to the user.
+- **Request Guidance**: State clearly what you've found (e.g., "MPI initialization failed with PMI1 error") and ask for the next step.
+
+## Tips for Agents
+
+- **Absolute Paths**: For the `--app` flag, it is often safer to use an absolute path (`$(pwd)/my_binary`) or a clear relative path (`./my_binary`).
+- **Dry-Run first**: Use `--dry-run` to check the grid size before launching parallel processes.
+- **Save results**: Always use one of the `--save-*` flags to ensure the final data is persisted in a format you can easily parse later.
+- **Remote Execution**: You can pass `--globus-compute-endpoint <UUID>` to execute the sweep remotely. However, for executables with complex file dependencies, prefer using the `programmatic_sweep_api` skill to handle data movement explicitly.
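
Note on the output-format rules in `cli_executable_sweep/SKILL.md` above: the "last line, parsed with `np.loadtxt`" behavior can be pictured with the sketch below. It is only an illustration of the documented rule, not the code `ensemblesweep` actually runs; the helper name `parse_last_row` is hypothetical, and skipping blank trailing lines is an assumption.

```python
import io

import numpy as np


def parse_last_row(text):
    """Return the last non-empty line of `text` as a 1-D float array.

    Hypothetical helper: mirrors the documented rule that only the final
    line of stdout (or of the --out-file) is taken as the result.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    # np.loadtxt returns a 0-d array for a single value, so normalize to 1-D.
    return np.atleast_1d(np.loadtxt(io.StringIO(lines[-1])))


# Log lines before the final row are ignored; only the last row is parsed.
stdout_text = "initializing\nstep 1 done\n0.25 0.5 0.75\n"
row = parse_last_row(stdout_text)
single = parse_last_row("42.5\n")
```

An executable that prints progress messages is therefore fine as long as its final line is purely numeric; a trailing non-numeric line would make `np.loadtxt` raise a `ValueError`.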
diff --git a/ensemblesweep/.agents/skills/cli_python_sweep/SKILL.md b/ensemblesweep/.agents/skills/cli_python_sweep/SKILL.md
@@ -0,0 +1,54 @@
+---
+name: cli_python_sweep
+description: Perform a parameter sweep over a Python function using the ensemblesweep CLI.
+---
+
+# CLI Python Sweep
+
+Use this skill to evaluate a Python function across a grid of parameters using `ensemblesweep py`. This is the quickest way to evaluate a Python function in parallel over a parameter grid without writing extra scripts.
+
+## Prerequisites
+
+- The Python function must be importable (i.e., in a file in the current directory or on the `PYTHONPATH`).
+- The function should take named arguments corresponding to the variables in the sweep.
+
+## Usage
+
+### 1. Perform a Dry Run (Mandatory Guardrail)
+
+Always perform a dry run first, especially for grids with more than 2 dimensions, to verify the total number of evaluations and parameter types.
+
+```bash
+ensemblesweep py --func module_name.function_name \
+                 --var "param1=start:stop:num" \
+                 --var "param2=val1,val2,val3" \
+                 --dry-run
+```
+
+Review the output to ensure the "Total evaluations" is reasonable and the sample combinations look correct.
+
+### 2. Execute the Sweep
+
+Run the actual sweep, specifying the number of workers and desired output formats.
+
+```bash
+ensemblesweep py --func module_name.function_name \
+                 --var "param1=-1.0:1.0:21" \
+                 --var "param2=A,B,C" \
+                 --workers 4 \
+                 --save-csv results.csv \
+                 --quiet
+```
+
+## Tips for Agents
+
+- **Quiet Mode**: Use `--quiet` to suppress libEnsemble logs for cleaner terminal output.
+- **Save Formats**:
+  - Use `--save-csv` or `--save-json` for easy parsing.
+  - Use `--save-pandas` for Parquet format (best for large datasets).
+  - Use `--save-numpy` for raw binary data.
+- **Variable Syntax**:
+  - `key=val1,val2`: Discrete list.
+  - `key=start:stop:num`: Linear range (`np.linspace`).
+- **Imports**: Run the command from the directory that contains the target module (or make sure the module is on the `PYTHONPATH`), because `ensemblesweep` looks for the module relative to the current working directory.
+- **Remote Execution**: You can pass `--globus-compute-endpoint <UUID>` to execute the sweep remotely. However, for complex environments/dependencies, prefer using the `programmatic_sweep_api` skill to handle data placement and imports correctly.
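
Note on the `--var` syntax in `cli_python_sweep/SKILL.md` above: the two documented forms (`key=val1,val2` and `key=start:stop:num`) and the resulting grid size can be sketched as follows. The function `parse_var` is hypothetical, and both the float-or-string handling of discrete values and the Cartesian-product grid are assumptions; the real CLI may type values and combine variables differently.

```python
import math

import numpy as np


def parse_var(spec):
    """Split a 'key=...' spec into (key, list of values). Hypothetical helper."""
    key, _, values = spec.partition("=")
    if ":" in values:
        # start:stop:num form, expanded like np.linspace (stop is inclusive).
        start, stop, num = values.split(":")
        return key, np.linspace(float(start), float(stop), int(num)).tolist()
    items = []
    for token in values.split(","):
        try:
            items.append(float(token))  # numeric entries become floats
        except ValueError:
            items.append(token)  # anything else is kept as a string
    return key, items


specs = ["param1=-1.0:1.0:21", "param2=A,B,C"]
grid = dict(parse_var(spec) for spec in specs)

# Assuming a full Cartesian grid, the evaluation count is the product of sizes.
total_evaluations = math.prod(len(values) for values in grid.values())
```

Under these assumptions the example sweep from the skill covers 21 × 3 = 63 evaluations, which is the kind of figure a `--dry-run` is meant to let you check before launching workers.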
diff --git a/ensemblesweep/.agents/skills/programmatic_sweep_api/SKILL.md b/ensemblesweep/.agents/skills/programmatic_sweep_api/SKILL.md
@@ -0,0 +1,71 @@
+---
+name: programmatic_sweep_api
+description: Perform a parameter sweep programmatically using the ensemblesweep Python API.
+---
+
+# Programmatic Sweep API
+
+Use this skill when the CLI is insufficient (e.g., when integrating into a larger Python pipeline that requires dynamic adjustment of inputs). This uses the `Sweep` and `Data` classes directly.
+
+## Usage
+
+### 1. Define Your Function
+
+```python
+import numpy as np
+
+def my_function(param1, param2):
+    # Perform calculation
+    result = param1**2 + param2
+    return result
+```
+
+### 2. Configure Your Data
+
+Define the parameter space using the `Data` class.
+
+```python
+from ensemblesweep import Data
+
+data = Data(
+    param1=np.linspace(-10, 10, 21),
+    param2=[1.0, 5.0, 10.0]
+)
+```
+
+### 3. Initialize and Run the Sweep
+
+```python
+from ensemblesweep import Sweep
+
+sweep = Sweep(
+    objective_function=my_function,
+    input_data=data,
+    num_workers=4
+)
+
+# Optional: check the total number of evaluations before running (the API analogue of a dry run)
+print(f"Total Evaluations: {data.total}")
+
+# Run the sweep
+sweep.run()
+```
+
+### 4. Retrieve Results
+
+```python
+# Access results as a NumPy structured array
+results = sweep.results.to_numpy()
+
+# Or convert to a Pandas DataFrame
+import pandas as pd
+df = pd.DataFrame(results)
+print(df.head())
+```
+
+## Tips for Agents
+
+- **Incremental Runs**: You can call `sweep.run(n)` to evaluate only `n` points. Subsequent calls will continue from the last point.
+- **Estimated Time**: Use `sweep.estimated_time()` after a few runs to get an estimate (in seconds) for the remaining work.
+- **Dtype Mapping**: `ensemblesweep` automatically maps Python types to libEnsemble dtypes. If you return multiple values from your function, they will be stored in separate columns in the results.
+- **Persistence**: Results are stored in the `Sweep` object's internal state. You can save them using standard Python/Pandas file operations.
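
Note on the **Persistence** tip in `programmatic_sweep_api/SKILL.md` above: since results are described as a NumPy structured array, saving them needs nothing beyond NumPy. The sketch below builds a stand-in array serially with `itertools.product` instead of calling `ensemblesweep`, so the field names (`param1`, `param2`, `result`) and the Cartesian ordering are assumptions about what `sweep.results.to_numpy()` returns.

```python
import io
import itertools

import numpy as np


def my_function(param1, param2):
    return param1**2 + param2


param1 = np.linspace(-10, 10, 21)
param2 = [1.0, 5.0, 10.0]

# Serial stand-in for the sweep: one record per grid point.
rows = [(p1, p2, my_function(p1, p2)) for p1, p2 in itertools.product(param1, param2)]
results = np.array(rows, dtype=[("param1", float), ("param2", float), ("result", float)])

# Write CSV with a header taken from the field names; a real script would
# pass a filename such as "sweep_results.csv" instead of an in-memory buffer.
buffer = io.StringIO()
np.savetxt(
    buffer,
    results,
    delimiter=",",
    header=",".join(results.dtype.names),
    comments="",  # suppress the default "# " prefix on the header line
    fmt="%.6g",
)
csv_text = buffer.getvalue()
```

The same structured array can be handed to `pd.DataFrame(results)` as in step 4 of the skill when Parquet or JSON output is preferred.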
diff --git a/ensemblesweep/.gitattributes b/ensemblesweep/.gitattributes
@@ -0,0 +1,3 @@
+# SCM syntax highlighting & preventing 3-way merges
+pixi.lock filter=lfs diff=lfs merge=lfs -text
+*.lock filter=lfs diff=lfs merge=lfs -text
diff --git a/ensemblesweep/.gitignore b/ensemblesweep/.gitignore
@@ -0,0 +1,7 @@
+# pixi environments
+.pixi/*
+!.pixi/config.toml
+*.egg-info
+*.stat
+*.x
+*__pycache__/
diff --git a/ensemblesweep/.pre-commit-config.yaml b/ensemblesweep/.pre-commit-config.yaml
@@ -0,0 +1,34 @@
+repos:
+-   repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v6.0.0
+    hooks:
+    - id: end-of-file-fixer
+      exclude: ^(.*\.xml|.*\.svg)$
+    - id: trailing-whitespace
+      exclude: ^(.*\.xml|.*\.svg)$
+
+- repo: https://github.com/pycqa/isort
+  rev: 7.0.0
+  hooks:
+    - id: isort
+      args: [--profile=black, --line-length=120]
+
+- repo: https://github.com/psf/black
+  rev: 25.12.0
+  hooks:
+    - id: black
+      args: [--line-length=120]
+
+- repo: https://github.com/PyCQA/flake8
+  rev: 7.3.0
+  hooks:
+  -   id: flake8
+      args: [--max-line-length=120]
+
+- repo: https://github.com/asottile/blacken-docs
+  rev: 1.20.0
+  hooks:
+  -   id: blacken-docs
+      additional_dependencies: [black==22.12.0]
+      files: ^(.*\.py|.*\.rst)$
+      args: [--line-length=120]