.claude/skills/generate-scripts/SKILL.md

---
name: generate-scripts
description: Generate libEnsemble calling scripts based on user requirements
---

You are generating libEnsemble scripts. libEnsemble coordinates parallel simulations
with generator-directed optimization or sampling. You will produce a calling script
and, when an external application is involved, a sim function file.

libEnsemble repository: https://github.com/Libensemble/libensemble
If not running inside the libEnsemble repo, find examples and source code there.

## Workflow

1. If converting an existing Xopt or Optimas workflow to libEnsemble, keep the
existing generator and VOCS settings exactly as-is, even if the generator is a
sampling or exploration generator. Do not switch to a classic generator unless
the user specifically asks.
Otherwise, if no generator is clearly indicated, read `references/generators.md`
to determine which generator and style to use. If a specific generator is
identified (e.g., APOSMM), read its dedicated guide (e.g., `references/aposmm.md`).

2. Find a relevant example in `libensemble/tests/regression_tests/` and read it as a
reference. Some examples:
- Xopt Bayesian optimization (VOCS): `test_xopt_EI_initial_sample.py` — best Xopt
example as it demonstrates the initial sampling approach Xopt generators need
- Optimas Ax optimization (VOCS): `test_optimas_ax_sf.py`
- APOSMM with NLopt (classic): `test_persistent_aposmm_nlopt.py`
- Random uniform sampling (classic): `test_1d_sampling.py`
Use glob and grep to find others matching the generator or pattern needed.
Each regression test has a clear description in its docstring.

3. Write the calling script, adapting the example to the user's requirements.
Do not copy test boilerplate from examples
(e.g., "Execute via one of the following commands..." headers). Set nworkers
directly in the script (in LibeSpecs); do not use parse_args or command-line
arguments unless the user asks for them. If parse_args is not used and no
options are taken, never suggest running with "-n/nworkers" or comms.
Those options are used only with parse_args (used in the tests).

4. If the user has an external application (executable), also write a sim function file
that uses the executor to run it.

5. If the user provides an input file, check whether it has Jinja2 template markers
(`{{ varname }}`). If not, create a templated copy: replace parameter values with
`{{ name }}` markers matching `input_names` in sim_specs (case-sensitive). The sim
function uses `jinja2.Template` to render the file before each simulation. Never
modify the user's original file.

6. Verify the scripts:
- Bounds and dimension match the user's request
- Executable path is correct
- For VOCS: variable names are consistent between VOCS definition and sim function
- For APOSMM: gen_specs outputs include all required fields
- Input file template markers match input_names (case-sensitive)
- The app_name in submit() matches register_app()
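The case-sensitive marker check in the list above can be sketched with a small stdlib helper (`check_template_markers` is a hypothetical name for illustration, not a libEnsemble API):

```python
import re

def check_template_markers(template_text, input_names):
    """Compare {{ name }} markers in a templated input file against
    input_names from sim_specs (case-sensitive)."""
    markers = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template_text))
    names = set(input_names)
    missing = sorted(names - markers)  # declared inputs with no marker
    extra = sorted(markers - names)    # markers no input will fill
    return missing, extra

# "X1" differs in case from "x1", so both lists flag it
missing, extra = check_template_markers("a = {{ x0 }}\nb = {{ X1 }}", ["x0", "x1"])
# missing == ["x1"], extra == ["X1"]
```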

7. Present a concise summary highlighting: generator choice, bounds, parameters,
sim_max, and objective field. Do NOT suggest `mpirun` or another MPI
runner (srun, mpiexec, etc.) to launch libEnsemble unless the user explicitly
asks for MPI-based comms.

8. Ask the user if they want to run the scripts.

9. If running: execute with `python script.py`. Do not use `mpirun` or another MPI
runner (srun, mpiexec, etc.) to launch libEnsemble unless the user explicitly
asks for MPI-based comms for distributing workers. This is unrelated to the
MPIExecutor, which workers use to launch simulation applications across nodes;
libEnsemble manages node allocation.
If the scripts fail, retry when you can see a fix; otherwise stop. After a successful
run, read `references/results_metadata.md` and
`references/finding_objectives.md` to interpret the output.

## Generator style

VOCS (gest-api) is the default style. It uses a VOCS object to define variables and
objectives, and a generator object from Xopt or Optimas. Use VOCS unless the user
explicitly asks for the classic style or the generator only exists in classic form
(e.g., APOSMM, persistent_sampling).

## Defaults

- nworkers defaults to 4 unless the user specifies otherwise (or 1 for sequential
generators like Nelder-Mead)
- All nworkers are available for simulations
- No alloc_specs needed — all allocator options are available as GenSpecs parameters
- Use `async_return=True` in GenSpecs unless there is a reason to use batch returns

## VOCS generators (Xopt / Optimas)

Key patterns:
- Variables named individually in VOCS: `{"x0": [lb, ub], "x1": [lb, ub]}`
- Objectives named in VOCS: `{"f": "MINIMIZE"}`
- GenSpecs uses `generator=`, `vocs=`, `batch_size=`
- SimSpecs uses `vocs=` or `simulator=` for gest-api style sim functions
- No `add_random_streams()` needed
- Xopt generators need `initial_sample_method="uniform"` and `initial_batch_size=`
for initial evaluated data. Optimas handles its own sampling.

See `references/generators.md` for the full generator selection guide.

## Classic generators

Used only when the generator has no VOCS version or the user explicitly requests it.
- One worker is consumed by the persistent generator
- Requires `add_random_streams()`
- APOSMM: see `references/aposmm.md` for full configuration details

## Sim function patterns

**Inline sim function** (no external app): Takes `(H, persis_info, sim_specs, libE_info)`
and returns `(H_o, persis_info)`. Or for VOCS gest-api style, takes `input_dict: dict`
and returns a dict. See `libensemble/sim_funcs/` for built-in examples.
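
As a rough sketch, a classic inline sim function evaluating a simple quadratic might look like this (assuming NumPy; the objective function is illustrative only):

```python
import numpy as np

def sim_f(H, persis_info, sim_specs, libE_info):
    """Classic-style inline sim function: f = sum(x**2) for each input row."""
    batch = len(H["x"])
    H_o = np.zeros(batch, dtype=sim_specs["out"])  # e.g. [("f", float)]
    for i, x in enumerate(H["x"]):
        H_o["f"][i] = np.sum(x ** 2)
    return H_o, persis_info
```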

**Executor-based sim function** (external app): Uses MPIExecutor to run an application.
Pattern:
1. Register app in calling script: `exctr.register_app(full_path=..., app_name=...)`
2. In sim function: get executor from `libE_info["executor"]`, submit with
`exctr.submit(app_name=...)`, wait with `task.wait()`
3. Read output file to get objective value
4. Set `sim_dirs_make=True` in LibeSpecs
5. If using input file templating, set `sim_dir_copy_files=[input_file]`

## Results interpretation

After a successful run:
- Load the .npy output file with `np.load()`
- Always filter by `sim_ended == True` before analyzing — rows where sim_ended is False
contain uninitialized values (often zeros) that are NOT real results
- For APOSMM: check rows where `local_min == True` to find identified minima
- Report the count, location, and objective value of minima or best points found
- If the best objective value is exactly 0.0, verify those rows have sim_ended == True
- See `references/results_metadata.md` for full details
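
The filtering above can be sketched with a synthetic history array standing in for the `.npy` file (field names follow the conventions described in this section; values are made up):

```python
import numpy as np

# Synthetic stand-in for a history array loaded with np.load()
H = np.zeros(5, dtype=[("x", float, 2), ("f", float), ("sim_ended", bool)])
H["x"][:3] = [[0.1, 0.2], [1.0, 1.0], [-0.5, 0.3]]
H["f"][:3] = [0.05, 2.0, 0.34]
H["sim_ended"][:3] = True  # last two rows never ran: uninitialized zeros

completed = H[H["sim_ended"]]            # always filter first
best = completed[np.argmin(completed["f"])]
# len(completed) == 3 and best["f"] == 0.05; without the filter, argmin
# would pick a bogus uninitialized 0.0 row
```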

## Reference docs (read as needed)

All paths relative to this skill's directory:

- `references/generators.md` — Generator selection guide, VOCS vs classic
- `references/aposmm.md` — APOSMM configuration, optimizer options, tuning
- `references/finding_objectives.md` — Identifying objective fields in results
- `references/results_metadata.md` — Interpreting history array, filtering results

## User request

$ARGUMENTS

.claude/skills/generate-scripts/references/aposmm.md

# APOSMM — Asynchronously Parallel Optimization Solver for Multiple Minima

APOSMM coordinates concurrent local optimization runs to find multiple local minima on parallel hardware. Use when the user wants to find minima, optimize, or explore an optimization landscape.

Module: `persistent_aposmm`
Function: `aposmm`
Allocator: `persistent_aposmm_alloc` (NOT the default `start_only_persistent`)
Requirements: mpmath, SciPy (plus optional packages for specific local optimizers)

## APOSMM gen_specs in generated scripts

When the MCP tool generates APOSMM scripts, run_libe.py gets this gen_specs structure:

```python
gen_specs = GenSpecs(
gen_f=gen_f,
inputs=[],
persis_in=["sim_id", "x", "x_on_cube", "f"],
outputs=[("x", float, n), ("x_on_cube", float, n), ("sim_id", int),
("local_min", bool), ("local_pt", bool)],
user={
"initial_sample_size": num_workers,
"localopt_method": "scipy_Nelder-Mead",
"opt_return_codes": [0],
"nu": 1e-8,
"mu": 1e-8,
"dist_to_bound_multiple": 0.01,
"max_active_runs": 6,
"lb": np.array([...]), # MUST match user's requested bounds
"ub": np.array([...]), # MUST match user's requested bounds
}
)
```

With allocator:
```python
from libensemble.alloc_funcs.persistent_aposmm_alloc import persistent_aposmm_alloc as alloc_f
```

## Required gen_specs["user"] Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `lb` | n floats | Lower bounds on search domain |
| `ub` | n floats | Upper bounds on search domain |
| `localopt_method` | str | Local optimizer (see table below) |
| `initial_sample_size` | int | Uniform samples before starting local runs |

When using a SciPy method, you must also supply `opt_return_codes` — e.g., [0] for Nelder-Mead/BFGS, [1] for COBYLA.

## Optional gen_specs["user"] Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `max_active_runs` | int | Max concurrent local optimization runs. Must not exceed nworkers. |
| `dist_to_bound_multiple` | float (0,1] | Fraction of distance to boundary for initial step size |
| `mu` | float | Min distance from boundary for starting points |
| `nu` | float | Min distance from identified minima for starting points |
| `stop_after_k_minima` | int | Stop after this many local minima found |
| `stop_after_k_runs` | int | Stop after this many runs ended |
| `sample_points` | numpy array | Specific points to sample (original domain) |
| `lhs_divisions` | int | Latin hypercube partitions (0 or 1 = uniform) |
| `rk_const` | float | Multiplier for r_k value |

## Worker Configuration

With `gen_on_manager=True`, the persistent generator runs on the manager process and all `nworkers` are available for simulations.

## Local Optimizer Methods

### SciPy (no extra install)

| Method | Gradient? | `opt_return_codes` |
|--------|-----------|-------------------|
| `scipy_Nelder-Mead` | No | [0] |
| `scipy_COBYLA` | No | [1] |
| `scipy_BFGS` | Yes | [0] |

### NLopt (requires nlopt package)

| Method | Gradient? | Description |
|--------|-----------|-------------|
| `LN_SBPLX` | No | Subplex. Good for noisy/nonsmooth |
| `LN_BOBYQA` | No | Quadratic model. Good for smooth problems |
| `LN_COBYLA` | No | Constrained optimization |
| `LN_NEWUOA` | No | Unconstrained quadratic model |
| `LN_NELDERMEAD` | No | Classic simplex |
| `LD_MMA` | Yes | Method of Moving Asymptotes |

NLopt methods require convergence tolerances. If the user does not specify tolerances, use these defaults:

```python
"xtol_abs": 1e-6,
"ftol_abs": 1e-6,
```

When using an NLopt method, always include `rk_const` scaled to the problem dimension:

```python
from math import gamma, pi, sqrt
n = <number of dimensions>
rk_const = 0.5 * ((gamma(1 + (n / 2)) * 5) ** (1 / n)) / sqrt(pi)
```

Use this formula directly in the generated script — do not precompute the value.

### PETSc/TAO (requires petsc4py package)

| Method | Needs | Description |
|--------|-------|-------------|
| `pounders` | fvec | Least-squares trust-region |
| `blmvm` | grad | Bounded limited-memory variable metric |
| `nm` | f only | Nelder-Mead variant |

### DFO-LS (requires dfols package)

| Method | Needs | Description |
|--------|-------|-------------|
| `dfols` | fvec | Derivative-free least-squares |

## Choosing a Local Optimizer

- **Default / simple**: `scipy_Nelder-Mead` — no extra packages
- **Smooth, bounded**: `LN_BOBYQA` (NLopt)
- **Noisy objectives**: `LN_SBPLX` (NLopt) or `scipy_Nelder-Mead`
- **Gradient available**: `scipy_BFGS` or `LD_MMA`
- **Least-squares (vector output)**: `pounders` (PETSc) or `dfols`
- **Constrained**: `scipy_COBYLA` or `LN_COBYLA`

## Interpreting Results

After a run, report the number of minima found. Load the results `.npy` file,
filter by `sim_ended == True`, then check `local_min == True` rows.
Report the count, objective value, and location of each minimum.
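
A minimal sketch of that filtering with a synthetic history array (the values are illustrative; the real array comes from the saved `.npy` file):

```python
import numpy as np

# Synthetic APOSMM history stand-in
H = np.zeros(4, dtype=[("x", float, 2), ("f", float),
                       ("sim_ended", bool), ("local_min", bool)])
H["sim_ended"] = [True, True, True, False]
H["f"] = [0.3, -1.03, 0.12, 0.0]
H["x"][1] = [0.09, -0.71]
H["local_min"] = [False, True, False, False]

minima = H[H["sim_ended"] & H["local_min"]]
for row in minima:
    print(f"minimum f={row['f']} at x={row['x']}")  # one minimum here
```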

## Tuning

If APOSMM is not finding minima, try increasing the multiplier in `rk_const` (e.g., from 0.5 to a larger value) to make it more aggressive about starting new local optimization runs in different regions.

Also consider increasing `dist_to_bound_multiple` (e.g., 0.5) for a larger initial
step size.

## Important

Always use the bounds, sim_max, and paths from the user's request. Never substitute values from examples or known problem domains.

.claude/skills/generate-scripts/references/finding_objectives.md

# Finding Objective Fields

How to find objective field names in results files.

## VOCS scripts

The objective field name is defined in the VOCS object:
```python
vocs = VOCS(
variables={"x0": [-2, 2], "x1": [-1, 1]},
objectives={"f": "MINIMIZE"},
)
```

The key in `objectives` (e.g. `"f"`) is the objective field name in the results.

## Classic scripts

The objective field name is defined in `sim_specs` outputs:
```python
sim_specs = SimSpecs(
...
outputs=[("f", float)], # "f" is the objective field name
)
```

The field name in `outputs` (e.g. `"f"`) matches the field name in the `.npy` results file.

## Common patterns
- Single objective: `{"f": "MINIMIZE"}` (VOCS) or `outputs=[("f", float)]` (classic)
- Multiple outputs: `"f"` is typically the objective — the scalar float used by the generator
- The objective field name in the VOCS definition or sim_specs outputs matches the field in the results file.
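
A quick way to confirm the field exists and pull the best value, sketched with a synthetic results array (assuming NumPy; values are made up):

```python
import numpy as np

# Synthetic stand-in for the .npy results file
H = np.zeros(3, dtype=[("x0", float), ("x1", float),
                       ("f", float), ("sim_ended", bool)])
H["f"] = [0.9, 0.1, 0.4]
H["sim_ended"] = True

print(H.dtype.names)   # lists every field; the objective should be among them
objective = "f"        # the key from VOCS objectives / sim_specs outputs
best = H[H["sim_ended"]][objective].min()  # 0.1
```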