From ea4bbf0d6258267a35fcb25738df6407c6d70d45 Mon Sep 17 00:00:00 2001 From: shudson Date: Thu, 9 Apr 2026 01:35:08 -0500 Subject: [PATCH] Add generate_scripts skill --- .claude/skills/generate-scripts/SKILL.md | 147 ++++++++++++++++++ .../generate-scripts/references/aposmm.md | 146 +++++++++++++++++ .../references/finding_objectives.md | 31 ++++ .../generate-scripts/references/generators.md | 81 ++++++++++ .../references/results_metadata.md | 42 +++++ 5 files changed, 447 insertions(+) create mode 100644 .claude/skills/generate-scripts/SKILL.md create mode 100644 .claude/skills/generate-scripts/references/aposmm.md create mode 100644 .claude/skills/generate-scripts/references/finding_objectives.md create mode 100644 .claude/skills/generate-scripts/references/generators.md create mode 100644 .claude/skills/generate-scripts/references/results_metadata.md diff --git a/.claude/skills/generate-scripts/SKILL.md b/.claude/skills/generate-scripts/SKILL.md new file mode 100644 index 000000000..ebdd893cc --- /dev/null +++ b/.claude/skills/generate-scripts/SKILL.md @@ -0,0 +1,147 @@ +--- +name: generate-scripts +description: Generate libEnsemble calling scripts based on user requirements +--- + +You are generating libEnsemble scripts. libEnsemble coordinates parallel simulations +with generator-directed optimization or sampling. You will produce a calling script +and, when an external application is involved, a sim function file. + +libEnsemble repository: https://github.com/Libensemble/libensemble +If not running inside the libEnsemble repo, find examples and source code there. + +## Workflow + +1. If converting an existing Xopt or Optimas workflow to libEnsemble, use the + existing generator and VOCS settings exactly as-is — even if it is a sampling + or exploration generator. Do not switch to a classic generator unless the user + specifically asks. 
Otherwise, if there is not a clear generator to use, read `references/generators.md`
   to determine which generator and style to use. If a specific generator is
   identified (e.g., APOSMM), read its dedicated guide (e.g., `references/aposmm.md`).

2. Find a relevant example in `libensemble/tests/regression_tests/` and read it as a
   reference. Some examples:
   - Xopt Bayesian optimization (VOCS): `test_xopt_EI_initial_sample.py` — the best Xopt
     example, as it demonstrates the initial sampling approach Xopt generators need
   - Optimas Ax optimization (VOCS): `test_optimas_ax_sf.py`
   - APOSMM with NLopt (classic): `test_persistent_aposmm_nlopt.py`
   - Random uniform sampling (classic): `test_1d_sampling.py`
   Use glob and grep to find others matching the generator or pattern needed.
   The regression tests have clear descriptions in their docstrings.

3. Write the calling script, adapting the example to the user's requirements.
   Do not copy test boilerplate from examples
   (e.g., "Execute via one of the following commands..." headers). Set nworkers
   directly in the script (in LibeSpecs) — do not use parse_args or command-line
   arguments unless the user asks for that. If parse_args is not used and no
   options are taken, never suggest running with "-n/nworkers" or comms options.
   Those options apply only when parse_args is used (as in the tests).

4. If the user has an external application (executable), also write a sim function file
   that uses the executor to run it.

5. If the user provides an input file, check whether it has Jinja2 template markers
   (`{{ varname }}`). If not, create a templated copy: replace parameter values with
   `{{ name }}` markers matching `input_names` in sim_specs (case-sensitive). The sim
   function uses `jinja2.Template` to render the file before each simulation. Never
   modify the user's original file.

6. 
Verify the scripts:
   - Bounds and dimension match the user's request
   - Executable path is correct
   - For VOCS: variable names are consistent between VOCS definition and sim function
   - For APOSMM: gen_specs outputs include all required fields
   - Input file template markers match input_names (case-sensitive)
   - The app_name in submit() matches register_app()

7. Present a concise summary highlighting: generator choice, bounds, parameters,
   sim_max, and objective field. Do NOT suggest `mpirun` or another MPI
   runner (srun, mpiexec, etc.) to launch libEnsemble unless the user explicitly
   asks for MPI-based comms.

8. Ask the user if they want to run the scripts.

9. If running: execute with `python script.py`. Do not use `mpirun` or another MPI
   runner (srun, mpiexec, etc.) to launch libEnsemble unless the user explicitly
   asks for MPI-based comms for distributing workers. This is unrelated to the
   MPIExecutor, which workers use to launch simulation applications across nodes
   — libEnsemble manages node allocation.
   If scripts fail, retry if you can see a fix; otherwise stop. After a successful
   run, read `references/results_metadata.md` and
   `references/finding_objectives.md` to interpret the output.

## Generator style

VOCS (gest-api) is the default style. It uses a VOCS object to define variables and
objectives, and a generator object from Xopt or Optimas. Use VOCS unless the user
explicitly asks for the classic style or the generator only exists in classic form
(e.g., APOSMM, persistent_sampling). 
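
The input-file templating described in workflow step 5 can be sketched as follows (the file contents and parameter names here are hypothetical, chosen only for illustration):

```python
from jinja2 import Template

# A templated copy of a user's input file: parameter values replaced
# by {{ name }} markers that match input_names in sim_specs
templated_text = "x0 = {{ x0 }}\nx1 = {{ x1 }}\nsteps = 100\n"

# The sim function renders the template with trial values before each run
rendered = Template(templated_text).render(x0=0.5, x1=-1.2)
print(rendered)
```

Only the parameter values become markers; fixed settings (like `steps` above) stay literal, so each rendered file differs from the original only in the swept parameters.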
+ +## Defaults + +- nworkers defaults to 4 unless the user specifies otherwise (or 1 for sequential + generators like Nelder-Mead) +- All nworkers are available for simulations +- No alloc_specs needed — all allocator options are available as GenSpecs parameters +- Use `async_return=True` in GenSpecs unless there is a reason to use batch returns + +## VOCS generators (Xopt / Optimas) + +Key patterns: +- Variables named individually in VOCS: `{"x0": [lb, ub], "x1": [lb, ub]}` +- Objectives named in VOCS: `{"f": "MINIMIZE"}` +- GenSpecs uses `generator=`, `vocs=`, `batch_size=` +- SimSpecs uses `vocs=` or `simulator=` for gest-api style sim functions +- No `add_random_streams()` needed +- Xopt generators need `initial_sample_method="uniform"` and `initial_batch_size=` + for initial evaluated data. Optimas handles its own sampling. + +See `references/generators.md` for the full generator selection guide. + +## Classic generators + +Used only when the generator has no VOCS version or the user explicitly requests it. +- One worker is consumed by the persistent generator +- Requires `add_random_streams()` +- APOSMM: see `references/aposmm.md` for full configuration details + +## Sim function patterns + +**Inline sim function** (no external app): Takes `(H, persis_info, sim_specs, libE_info)` +and returns `(H_o, persis_info)`. Or for VOCS gest-api style, takes `input_dict: dict` +and returns a dict. See `libensemble/sim_funcs/` for built-in examples. + +**Executor-based sim function** (external app): Uses MPIExecutor to run an application. +Pattern: +1. Register app in calling script: `exctr.register_app(full_path=..., app_name=...)` +2. In sim function: get executor from `libE_info["executor"]`, submit with + `exctr.submit(app_name=...)`, wait with `task.wait()` +3. Read output file to get objective value +4. Set `sim_dirs_make=True` in LibeSpecs +5. 
If using input file templating, set `sim_dir_copy_files=[input_file]` + +## Results interpretation + +After a successful run: +- Load the .npy output file with `np.load()` +- Always filter by `sim_ended == True` before analyzing — rows where sim_ended is False + contain uninitialized values (often zeros) that are NOT real results +- For APOSMM: check rows where `local_min == True` to find identified minima +- Report the count, location, and objective value of minima or best points found +- If the best objective value is exactly 0.0, verify those rows have sim_ended == True +- See `references/results_metadata.md` for full details + +## Reference docs (read as needed) + +All paths relative to this skill's directory: + +- `references/generators.md` — Generator selection guide, VOCS vs classic +- `references/aposmm.md` — APOSMM configuration, optimizer options, tuning +- `references/finding_objectives.md` — Identifying objective fields in results +- `references/results_metadata.md` — Interpreting history array, filtering results + +## User request + +$ARGUMENTS diff --git a/.claude/skills/generate-scripts/references/aposmm.md b/.claude/skills/generate-scripts/references/aposmm.md new file mode 100644 index 000000000..892007b87 --- /dev/null +++ b/.claude/skills/generate-scripts/references/aposmm.md @@ -0,0 +1,146 @@ +# APOSMM — Asynchronously Parallel Optimization Solver for Multiple Minima + +APOSMM coordinates concurrent local optimization runs to find multiple local minima on parallel hardware. Use when the user wants to find minima, optimize, or explore an optimization landscape. 

Module: `persistent_aposmm`
Function: `aposmm`
Allocator: `persistent_aposmm_alloc` (NOT the default `start_only_persistent`)
Requirements: mpmath, SciPy (plus optional packages for specific local optimizers)

## APOSMM gen_specs in generated scripts

When generating APOSMM scripts, the calling script uses this gen_specs structure:

```python
gen_specs = GenSpecs(
    gen_f=gen_f,
    inputs=[],
    persis_in=["sim_id", "x", "x_on_cube", "f"],
    outputs=[("x", float, n), ("x_on_cube", float, n), ("sim_id", int),
             ("local_min", bool), ("local_pt", bool)],
    user={
        "initial_sample_size": num_workers,
        "localopt_method": "scipy_Nelder-Mead",
        "opt_return_codes": [0],
        "nu": 1e-8,
        "mu": 1e-8,
        "dist_to_bound_multiple": 0.01,
        "max_active_runs": 6,
        "lb": np.array([...]),  # MUST match user's requested bounds
        "ub": np.array([...]),  # MUST match user's requested bounds
    }
)
```

With allocator:
```python
from libensemble.alloc_funcs.persistent_aposmm_alloc import persistent_aposmm_alloc as alloc_f
```

## Required gen_specs["user"] Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `lb` | n floats | Lower bounds on search domain |
| `ub` | n floats | Upper bounds on search domain |
| `localopt_method` | str | Local optimizer (see table below) |
| `initial_sample_size` | int | Uniform samples before starting local runs |

When using a SciPy method, you must also supply `opt_return_codes` — e.g., [0] for Nelder-Mead/BFGS, [1] for COBYLA.

## Optional gen_specs["user"] Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `max_active_runs` | int | Max concurrent local optimization runs. Must not exceed nworkers. 
| +| `dist_to_bound_multiple` | float (0,1] | Fraction of distance to boundary for initial step size | +| `mu` | float | Min distance from boundary for starting points | +| `nu` | float | Min distance from identified minima for starting points | +| `stop_after_k_minima` | int | Stop after this many local minima found | +| `stop_after_k_runs` | int | Stop after this many runs ended | +| `sample_points` | numpy array | Specific points to sample (original domain) | +| `lhs_divisions` | int | Latin hypercube partitions (0 or 1 = uniform) | +| `rk_const` | float | Multiplier for r_k value | + +## Worker Configuration + +With `gen_on_manager=True`, the persistent generator runs on the manager process and all `nworkers` are available for simulations. + +## Local Optimizer Methods + +### SciPy (no extra install) + +| Method | Gradient? | `opt_return_codes` | +|--------|-----------|-------------------| +| `scipy_Nelder-Mead` | No | [0] | +| `scipy_COBYLA` | No | [1] | +| `scipy_BFGS` | Yes | [0] | + +### NLopt (requires nlopt package) + +| Method | Gradient? | Description | +|--------|-----------|-------------| +| `LN_SBPLX` | No | Subplex. Good for noisy/nonsmooth | +| `LN_BOBYQA` | No | Quadratic model. Good for smooth problems | +| `LN_COBYLA` | No | Constrained optimization | +| `LN_NEWUOA` | No | Unconstrained quadratic model | +| `LN_NELDERMEAD` | No | Classic simplex | +| `LD_MMA` | Yes | Method of Moving Asymptotes | + +NLopt methods require convergence tolerances. If the user does not specify tolerances, use these defaults: + +```python +"xtol_abs": 1e-6, +"ftol_abs": 1e-6, +``` + +When using an NLopt method, always include `rk_const` scaled to the problem dimension: + +```python +from math import gamma, pi, sqrt +n = +rk_const = 0.5 * ((gamma(1 + (n / 2)) * 5) ** (1 / n)) / sqrt(pi) +``` + +Use this formula directly in the generated script — do not precompute the value. 
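
For reference, a quick sanity check of what the formula evaluates to for a 2-dimensional problem (the generated script should still embed the formula itself, not this number):

```python
from math import gamma, pi, sqrt

n = 2  # example problem dimension
rk_const = 0.5 * ((gamma(1 + (n / 2)) * 5) ** (1 / n)) / sqrt(pi)
print(round(rk_const, 4))  # ~0.6308 for n = 2
```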

### PETSc/TAO (requires petsc4py package)

| Method | Needs | Description |
|--------|-------|-------------|
| `pounders` | fvec | Least-squares trust-region |
| `blmvm` | grad | Bounded limited-memory variable metric |
| `nm` | f only | Nelder-Mead variant |

### DFO-LS (requires dfols package)

| Method | Needs | Description |
|--------|-------|-------------|
| `dfols` | fvec | Derivative-free least-squares |

## Choosing a Local Optimizer

- **Default / simple**: `scipy_Nelder-Mead` — no extra packages
- **Smooth, bounded**: `LN_BOBYQA` (NLopt)
- **Noisy objectives**: `LN_SBPLX` (NLopt) or `scipy_Nelder-Mead`
- **Gradient available**: `scipy_BFGS` or `LD_MMA`
- **Least-squares (vector output)**: `pounders` (PETSc) or `dfols`
- **Constrained**: `scipy_COBYLA` or `LN_COBYLA`

## Interpreting Results

After a run, report the number of minima found. Load the results `.npy` file,
filter by `sim_ended == True`, then check `local_min == True` rows.
Report the count, objective value, and location of each minimum.

## Tuning

If APOSMM is not finding minima, try increasing the multiplier in `rk_const` (e.g., from 0.5 to a larger value) to make it more aggressive about starting new local optimization runs in different regions.

Also consider increasing `dist_to_bound_multiple` (e.g., 0.5) for a larger initial
step size.

## Important

Always use the bounds, sim_max, and paths from the user's request. Never substitute values from examples or known problem domains. diff --git a/.claude/skills/generate-scripts/references/finding_objectives.md b/.claude/skills/generate-scripts/references/finding_objectives.md new file mode 100644 index 000000000..e8f8e4bde --- /dev/null +++ b/.claude/skills/generate-scripts/references/finding_objectives.md @@ -0,0 +1,31 @@ +# Finding Objective Fields
+ +## VOCS scripts + +The objective field name is defined in the VOCS object: +```python +vocs = VOCS( + variables={"x0": [-2, 2], "x1": [-1, 1]}, + objectives={"f": "MINIMIZE"}, +) +``` + +The key in `objectives` (e.g. `"f"`) is the objective field name in the results. + +## Classic scripts + +The objective field name is defined in `sim_specs` outputs: +```python +sim_specs = SimSpecs( + ... + outputs=[("f", float)], # "f" is the objective field name +) +``` + +The field name in `outputs` (e.g. `"f"`) matches the field name in the `.npy` results file. + +## Common patterns +- Single objective: `{"f": "MINIMIZE"}` (VOCS) or `outputs=[("f", float)]` (classic) +- Multiple outputs: `"f"` is typically the objective — the scalar float used by the generator +- The objective field name in the VOCS definition or sim_specs outputs matches the field in the results diff --git a/.claude/skills/generate-scripts/references/generators.md b/.claude/skills/generate-scripts/references/generators.md new file mode 100644 index 000000000..e6f6b3ccc --- /dev/null +++ b/.claude/skills/generate-scripts/references/generators.md @@ -0,0 +1,81 @@ +# libEnsemble Generator Functions + +This guide is for choosing a generator when one is not already provided. If the user +is converting an existing workflow that already has a generator, use that generator +as-is — do not use this guide to replace it. + +libEnsemble supports two styles of generator configuration: + +- **VOCS generators (gest-api)** — The default style. Uses a VOCS object to define variables, objectives, and constraints. The generator is passed as an object from Xopt, Optimas, or another gest-api compatible library. +- **Classic generators** — libEnsemble-native gen functions configured via `gen_f`, explicit `inputs`/`outputs`, and `user` dicts with bounds/parameters. Used only when the generator has no VOCS version or the user explicitly requests it. 
+ +## When to Choose a Generator Style + +**VOCS is the default style.** Any generator from Xopt or Optimas is always VOCS — these libraries provide many generators covering optimization, sampling, surrogate modeling, and more. Do not switch an Xopt or Optimas generator to a classic libEnsemble generator. + +Use **classic generators** only when: +- The user explicitly asks for the classic/traditional style +- The generator does not have a VOCS version (APOSMM, persistent_sampling) + +## Choosing a generator + +| Goal | Suggested generator | Style | Package | +|------|---------------------|-------|---------| +| Bayesian optimization | Xopt (e.g., Expected Improvement) | VOCS | `xopt` | +| Sampling / exploration | Xopt (e.g., Latin Hypercube) | VOCS | `xopt` | +| Ax-based optimization, multi-fidelity, multi-task | Optimas | VOCS | `optimas` | +| Simplex optimization | Xopt Nelder-Mead | VOCS | `xopt` | +| Multi-objective Bayesian | Xopt MOBO | VOCS | `xopt` | +| GP-based adaptive sampling | gpCAM | VOCS or Classic | `gen_classes/gpCAM` | +| Find multiple local minima | APOSMM | VOCS or Classic | `gen_classes/aposmm` | +| Random/uniform sampling | Sampling | VOCS or Classic | `gen_classes/sampling` | + +Xopt and Optimas each provide many generators beyond those listed here. If the +generator choice is not clear, check the library documentation: +- Xopt: https://github.com/xopt-org/Xopt — algorithms at https://xopt.xopt.org/algorithms/ +- Optimas: https://github.com/optimas-org/optimas + +If the user says "optimize" without specifics -> Xopt (VOCS). +If the user says "Xopt", "VOCS", "Optimas", or names a specific generator from those libraries -> VOCS style. +If the user says "Ax", "multi-fidelity", "multi-task" -> Optimas (VOCS). +If the user says "find minima", "multiple local minima" -> APOSMM (classic). +If the user says "sample", "explore", "sweep" -> Xopt or Optimas can do this (VOCS), or persistent sampling (classic). 
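
The keyword rules above can be condensed into a rough dispatch sketch (the triggers are illustrative, not exhaustive — always fall through to Xopt when nothing specific matches):

```python
def suggest_generator(request: str) -> str:
    """Rough mapping from request keywords to a generator suggestion."""
    words = set(request.lower().replace(",", " ").split())
    if words & {"ax", "multi-fidelity", "multi-task"}:
        return "Optimas (VOCS)"
    if "minima" in words or "minimum" in words:
        return "APOSMM (classic)"
    if words & {"sample", "explore", "sweep"}:
        return "Xopt/Optimas sampling (VOCS) or persistent_sampling (classic)"
    return "Xopt (VOCS)"  # default when the user just says "optimize"

print(suggest_generator("find multiple local minima"))  # APOSMM (classic)
```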

## VOCS Generators (gest-api)

VOCS is the default configuration style for generators in libEnsemble. Configuration uses a VOCS object to define the optimization problem and a generator object. Generators may come from Xopt, Optimas, libEnsemble, or other gest-api compatible libraries.

### Key patterns

- Variables are named individually in VOCS (`{"x0": [lb, ub], "x1": [lb, ub]}`)
- Objectives are named in VOCS (`{"f": "MINIMIZE"}`)
- GenSpecs uses `generator=`, `vocs=`, and `batch_size=`
- SimSpecs uses `vocs=` or `simulator=` for gest-api style sim functions
- No alloc_specs needed (default is correct)
- No `add_random_streams()` needed
- Use `async_return=True` in GenSpecs unless the generator requires batch returns

### Initial sampling

Some generators require evaluated data before they can suggest points. Set `initial_sample_method` in GenSpecs to have libEnsemble produce and evaluate an initial sample before starting the generator:

- `initial_sample_method="uniform"` — uniform random sample from VOCS bounds
- `initial_batch_size` — required, specifies how many sample points to produce

Generators that handle their own sampling do not need this.

### Sim function adaptation

When using VOCS generators with an executor-based sim function, the sim must read individual variable names from H rather than unpacking `H["x"]`. The `input_names` in `sim_specs["user"]` should match the VOCS variable names directly.

## Classic Generators

### persistent_sampling (persistent_uniform)
Random uniform sampling across parameter space. After the initial batch, creates p new random points for every p points returned.

gen_specs["user"]: `lb`, `ub`, `initial_batch_size`
gen_specs outputs: `x (float, n)`

### APOSMM (persistent_aposmm)
Asynchronously Parallel Optimization Solver for finding Multiple Minima.
See `references/aposmm.md` for full configuration details. 
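
As a sketch, a classic persistent_sampling setup might look like the following. The import paths and field lists should be verified against a regression test such as `test_1d_sampling.py`, and the dimension and bounds here are hypothetical:

```python
import numpy as np
from libensemble.specs import GenSpecs
from libensemble.gen_funcs.persistent_sampling import persistent_uniform as gen_f

n = 2  # hypothetical problem dimension
gen_specs = GenSpecs(
    gen_f=gen_f,
    persis_in=["sim_id", "x", "f"],    # fields returned to the generator
    outputs=[("x", float, n)],         # points produced by the generator
    user={
        "lb": np.array([-3.0, -2.0]),  # hypothetical lower bounds
        "ub": np.array([3.0, 2.0]),    # hypothetical upper bounds
        "initial_batch_size": 4,
    },
)
```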
diff --git a/.claude/skills/generate-scripts/references/results_metadata.md b/.claude/skills/generate-scripts/references/results_metadata.md new file mode 100644 index 000000000..97a71c14b --- /dev/null +++ b/.claude/skills/generate-scripts/references/results_metadata.md @@ -0,0 +1,42 @@ +# Results Metadata
How to interpret libEnsemble history array fields and filter for completed simulations.

## History array (H)

The `.npy` output file contains the history array H with both user-defined fields and
metadata fields added by libEnsemble.

## Key metadata fields

- `sim_ended`: True if the simulation completed. Only rows with `sim_ended == True` have valid results.
- `sim_started`: True if the simulation was dispatched to a worker.
- `returned`: True if results were returned to the manager.
- `sim_id`: Unique simulation ID (0-indexed).
- `gen_informed`: True if the generator has been informed of this result.

## Filtering for valid results

When analyzing results (e.g., finding the minimum objective value), always filter for
completed simulations:

```python
H = np.load("results.npy")
done = H[H["sim_ended"]]  # Only completed simulations
```

Rows where `sim_ended` is False may have default/zero values that are not real results.
This is common for the last few rows when the simulation budget is exhausted — they were
allocated by the generator but never evaluated.

## Reporting results

After a successful run, report any minima found in the results. See the generator-specific
guide for which fields indicate identified minima.

## Common pitfall

If the minimum objective value is exactly 0.0, check whether those rows have
`sim_ended == True`. Unevaluated rows often have fields initialized to zero.
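
The filtering advice above can be demonstrated with a synthetic history array. The field names follow the conventions in this guide; a real output file contains more metadata fields:

```python
import numpy as np

# Synthetic stand-in for a libEnsemble history array (illustrative only)
dtype = [("sim_id", int), ("x", float, 2), ("f", float), ("sim_ended", bool)]
H = np.zeros(5, dtype=dtype)
H["sim_id"] = np.arange(5)
H["x"][:3] = [[0.5, -0.2], [1.1, 0.3], [-0.7, 0.9]]
H["f"][:3] = [3.2, 1.7, 2.4]
H["sim_ended"][:3] = True  # last two rows were allocated but never evaluated

done = H[H["sim_ended"]]        # filter first: unevaluated rows hold zeros
best = done[np.argmin(done["f"])]
print(best["f"])                # 1.7 — the true minimum
print(H["f"].min())             # 0.0 — the pitfall if you forget to filter
```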