Libensemble · jlnav · Apr 16, 2026 · Apr 10, 2026 · Apr 10, 2026 · Apr 13, 2026
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -37,4 +37,4 @@ repos:
   rev: v1.19.1
   hooks:
     - id: mypy
-      exclude: ^libensemble/utils/(launcher|loc_stack|runners|pydantic|output_directory)\.py$|^libensemble/tests/regression_tests/support\.py$|^libensemble/tests/functionality_tests/|^libensemble/tests/unit_tests/
+      exclude: ^libensemble/utils/(launcher|loc_stack|runners|pydantic|output_directory)\.py$|libensemble/tests/(regression_tests|functionality_tests|unit_tests|scaling_tests)/.*
diff --git a/AGENTS.md b/AGENTS.md
@@ -7,43 +7,33 @@ Read the ``README.rst`` for an overview of libEnsemble.
 - The manager determines how and when points get passed to workers via an allocation function.
 - See ``libensemble/tests/regression_tests/test_1d_sampling.py`` for a simple example of the libEnsemble interface.
 
-Repository Layout
------------------
+Critical Repository Layout Information
+--------------------------------------
 
 - ``libensemble/`` - Source code.
 -   ``/alloc_funcs`` - Allocation functions. Policies for passing work between the manager and workers.
 -   ``/comms`` - Modules and abstractions for communication between the manager and workers.
 -   ``/executors`` - An interface for launching executables, often simulations.
 -   ``/gen_classes`` - Generators that adhere to the `gest-api` standard.
                        Recommended over entries from ``/gen_funcs`` that perform similar functionality.
--   ``/gen_funcs`` - Generator functions. Modules for producing points for simulations.
+-   ``/gen_funcs`` - Generator functions. Modules for producing points for simulations. (Legacy)
 -   ``/resources`` - Classes and functions for managing compute resources for MPI tasks, libensemble workers.
 -   ``/sim_funcs`` - Simulator functions. Modules for running simulations or performing experiments.
 -   ``/tests`` - Tests.
     - ``/functionality_tests`` - Primarily tests libEnsemble code only.
     - ``/regression_tests`` - Tests libEnsemble code with external code. Often more closely resembles actual use-cases.
     - ``/unit_tests`` - Tests for individual modules.
--   ``/tools`` - Tools. Misc functions and classes to ease development.
--   ``/utils`` - Utilities. Misc functions and classes used internally by multiple modules.
--   ``ensemble.py`` - The primary interface for parameterizing and running libEnsemble.
+-   ``ensemble.py`` - The primary interface for parameterizing and running libEnsemble. The ``Ensemble`` class in this module wraps the lower-level ``libE`` function and automates argument parsing and state management.
 -   ``generators.py`` - Base classes for generators that adhere to the `gest-api` standard.
--   ``history.py`` - Module for recording points that have been generated and simulation results. NumPy array.
--   ``libE.py`` - libE main file. Previous primary interface for parameterizing and running libEnsemble.
--   ``logger.py`` - Logging configuration.
+-   ``history.py`` - Module for recording points that have been generated and simulation results. NumPy structured array.
+-   ``libE.py`` - libE main file. Previous primary interface for parameterizing and running libEnsemble. The primary interface in ``ensemble.py`` wraps this function.
 -   ``manager.py`` - Module for maintaining the history array and passing points between the workers.
 -   ``message_numbers.py`` - Constants that represent states of the ensemble.
 -   ``specs.py`` - Dataclasses for parameterizing the ensemble. Most importantly, contains ``LibeSpecs, SimSpecs, GenSpecs``.
 -   ``worker.py`` - Module for running generators and simulators. Communicates with the manager.
--   ``version.py`` - Version file.
-
-- ``.github/`` - GitHub actions. See ``.github/workflows/`` for the CI.
-- ``docs/`` - Documentation. Check here first for information before reading the source code.
 - ``examples/`` - The ``*_funcs`` and ``calling_scripts`` directories contain symlinks to examples further in the source code.
 -   ``/libE_submission_scripts`` - Example scripts for submitting libEnsemble jobs to HPC systems.
 -   ``/tutorials`` - Tutorials on how to use libEnsemble.
-- ``pyproject.toml`` - Project configuration file. Contains information about the project and its dependencies.
-
-Other files in the root directory should be self-documenting.
 
 Information about Generators
 ----------------------------
@@ -55,8 +45,10 @@ Its fields match ``sim_specs/gen_specs["out"]`` or ``vocs`` attributes, plus add
 long-running loop, sending and receiving points to and from the manager until the ensemble was complete.
 - A ``gest-api`` or "standardized" generator is a class that at a minimum implements ``suggest`` and ``ingest`` methods, and is parameterized by a ``vocs``.
 - See ``libensemble/generators.py`` for more information about the ``gest-api`` standard.
-- If using a generator that adheres to the ``gest-api`` standard, or a classic persistent generator, use the ``start_only_persistent`` allocation function.
 - Generators are often used for simple sampling, optimization, calibration, uncertainty quantification, and other simulation-based tasks.
+- **Automatic Variable Mapping**: Subclasses of ``LibensembleGenerator`` (like ``UniformSample``) automatically map all ``VOCS`` variables to a single multi-dimensional ``"x"`` field in the History array if no explicit ``variables_mapping`` is provided.
+- **Mandatory Input Fields**: Even for simple generators that don't ingest data, ``gen_specs["in"]`` or ``gen_specs["persis_in"]`` must be defined if using an allocation function like ``only_persistent_gens`` that attempts to send rows. If these are empty, the manager will raise an ``AssertionError`` stating that no fields were requested to be sent.
+- **Default Allocator**: ``only_persistent_gens`` is the default allocator for standardized ``gest-api`` generators. It treats these generators as persistent entities that communicate throughout the run.
 
 General Guidelines
 ------------------
@@ -77,7 +69,7 @@ Development Environment
 -----------------------
 
 - ``pixi`` is the recommended environment manager for libEnsemble development.  See ``pyproject.toml`` for the list
-of dependencies and the available testing environments.
+of dependencies and the available testing environments. (Note: if ``pixi`` is not on your ``PATH``, it is often installed at ``/opt/homebrew/bin/pixi`` or ``/usr/local/bin/pixi``.)
 - Enter the development environment with ``pixi shell -e dev``. This environment contains the most common dependencies for development and testing.
 - For one-off commands, use ``pixi run -e dev``. This will run a single command in the development environment.
 - If ``pixi`` is not available or not preferred by the user, ``pip install -e .`` can be used instead. Other dependencies may need to be installed manually.
@@ -87,9 +79,21 @@ the configuration and ``pyproject.toml`` for other configuration.
 Testing
 -------
 
-- Run tests with the ``run-tests.py`` script: ``python libensemble/tests/run-tests.py``.  See ``libensemble/tests/run-tests.py`` for usage information.
+- Run tests with the ``run_tests.py`` script: ``python libensemble/tests/run_tests.py``.  See ``libensemble/tests/run_tests.py`` for usage information.
 - Some tests require third party software to be installed. When developing a feature or fixing a bug, since the entire test suite will be run on Github Actions,
 for local development running individual tests is sufficient.
 - Individual unit tests can be run with ``pixi run -e dev pytest path/to/test_file``.
 - A libEnsemble run typically outputs an ``ensemble.log`` and ``libE_stats.txt`` file in the working directory. Check these files for tracebacks or run statistics.
 - An "ensemble" or "workflow" directory may also be created, often containing per-simulation output directories
+
+Modernizing Scripts for libEnsemble 2.0
+---------------------------------------
+
+When modernizing existing libEnsemble scripts (functionality tests, regression tests, or user examples) for version 2.0, follow these steps:
+
+- **Switch to gest-api Generators**: Replace legacy generator functions (from ``libensemble.gen_funcs``) with standardized generator classes (from ``libensemble.gen_classes`` or other ``gest-api`` compatible sources).
+- **Use VOCS for Parameterization**: Standardized generators are parameterized by a ``VOCS`` object (from ``gest_api.vocs``). Define variables and objectives within this object.
+- **Set gen_specs["generator"]**: Instead of ``gen_f``, use the ``generator`` field in ``GenSpecs`` to pass the initialized generator class.
+- **Remove Explicit AllocSpecs**: In libEnsemble 2.0, ``only_persistent_gens`` is the default allocator. Scripts that previously used ``give_sim_work_first`` or other simple allocators can often remove ``alloc_specs`` entirely when switching to standardized generators.
+- **Generator Placement**: By default, generators run on the manager thread (Worker 0). This means all allocated workers are available for simulation tasks unless ``gen_on_worker`` is explicitly set to ``True`` in ``libE_specs``.
+- **Mandatory Fields**: Ensure ``gen_specs["in"]`` or ``gen_specs["persis_in"]`` includes at least one field (e.g., ``["sim_id"]``) if feedback is sent back to the generator, to satisfy the allocator's requirements.
diff --git a/README.rst b/README.rst
@@ -22,7 +22,7 @@ and inference problems on the world's leading supercomputers such as Frontier, A
 
 `Quickstart`_
 
-**New:** libEnsemble nows supports the `gest-api`_ generator standard, and can run with 
+**New:** libEnsemble now supports the `gest-api`_ generator standard, and can run with
 Optimas and Xopt generators.
 
 The |ScriptCreator| to generate customized scripts for running ensembles with your
@@ -81,7 +81,6 @@ and an exit condition. Run the following four-worker example via ``python this_f
             exit_criteria=exit_criteria,
         )
 
-        sampling.add_random_streams()
         sampling.run()
 
         if sampling.is_manager:

diff --git a/docs/data_structures/persis_info.rst b/docs/data_structures/persis_info.rst
@@ -13,8 +13,8 @@ and from the corresponding workers. These are received in the ``persis_info``
 argument of user functions, and returned as the optional second return value.
 
 A typical example is a random number generator stream to be used in consecutive
-calls to a generator (see
-:meth:`add_unique_random_streams()<tools.add_unique_random_streams>`)
+calls to a generator. Generators should initialize their own RNG using
+:meth:`get_rng()<tools.get_rng>`.
 
 All other entries persist on the manager and can be updated in the calling script
 between ensemble invocations, or in the allocation function.

diff --git a/docs/tutorials/aposmm_tutorial.rst b/docs/tutorials/aposmm_tutorial.rst
@@ -146,7 +146,7 @@ function:
     from libensemble.libE import libE
     from libensemble.gen_funcs.persistent_aposmm import aposmm
     from libensemble.alloc_funcs.persistent_aposmm_alloc import persistent_aposmm_alloc
-    from libensemble.tools import parse_args, add_unique_random_streams
+    from libensemble.tools import parse_args
 
 This allocation function starts a single Persistent APOSMM routine and provides
 ``sim_f`` output for points requested by APOSMM. Points can be sampled points
@@ -241,7 +241,7 @@ random sampling seeding:
     :linenos:
 
     exit_criteria = {"sim_max": 2000}
-    persis_info = add_unique_random_streams({}, nworkers + 1)
+    persis_info = {}
 
 Finally, add statements to :doc:`initiate libEnsemble<../libe_module>`, and quickly
 check calculated minima:

diff --git a/examples/readme_notebook.ipynb b/examples/readme_notebook.ipynb
@@ -76,7 +76,6 @@
     "        exit_criteria=exit_criteria,\n",
     "    )\n",
     "\n",
-    "    sampling.add_random_streams()\n",
     "    H, persis_info, flag = sampling.run()\n",
     "\n",
     "    # Print first 10 lines of input/output values\n",

diff --git a/examples/tutorials/aposmm/aposmm_tutorial_notebook.ipynb b/examples/tutorials/aposmm/aposmm_tutorial_notebook.ipynb
@@ -114,7 +114,7 @@
     "from libensemble.libE import libE\n",
     "from libensemble.gen_funcs.persistent_aposmm import aposmm\n",
     "from libensemble.alloc_funcs.persistent_aposmm_alloc import persistent_aposmm_alloc\n",
-    "from libensemble.tools import parse_args, add_unique_random_streams"
+    "from libensemble.tools import parse_args"
    ]
   },
   {
@@ -235,7 +235,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "persis_info = add_unique_random_streams({}, nworkers + 1)\n",
+    "persis_info = {}\n",
     "\n",
     "H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, alloc_specs, libE_specs)"
    ]

diff --git a/examples/tutorials/aposmm/tutorial_aposmm.py b/examples/tutorials/aposmm/tutorial_aposmm.py
@@ -5,7 +5,7 @@
 from libensemble.alloc_funcs.persistent_aposmm_alloc import persistent_aposmm_alloc
 from libensemble.gen_funcs.persistent_aposmm import aposmm
 from libensemble.libE import libE
-from libensemble.tools import add_unique_random_streams, parse_args
+from libensemble.tools import parse_args
 
 libensemble.gen_funcs.rc.aposmm_optimizers = "scipy"
 
@@ -42,8 +42,7 @@
 alloc_specs = {"alloc_f": persistent_aposmm_alloc}
 
 exit_criteria = {"sim_max": 2000}
-persis_info = add_unique_random_streams({}, nworkers + 1)
 
-H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, alloc_specs, libE_specs)
+H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, alloc_specs=alloc_specs, libE_specs=libE_specs)
 if is_manager:
     print("Minima:", H[np.where(H["local_min"])]["x"])
diff --git a/examples/tutorials/forces_with_executor/forces_tutorial_notebook.ipynb b/examples/tutorials/forces_with_executor/forces_tutorial_notebook.ipynb
@@ -312,10 +312,7 @@
     "    gen_specs=gen_specs,\n",
     "    sim_specs=sim_specs,\n",
     "    exit_criteria=exit_criteria,\n",
-    ")\n",
-    "\n",
-    "# Seed random streams for each worker, particularly for gen_f\n",
-    "ensemble.add_random_streams()"
+    ")\n"
    ]
   },
   {
@@ -562,9 +559,6 @@
     "    user={\"input_filename\": input_file, \"input_names\": [\"particles\"]},\n",
     ")\n",
     "\n",
-    "# To reset random number seed in the generator\n",
-    "ensemble.add_random_streams()\n",
-    "\n",
     "# Clean up any previous outputs and launch libEnsemble\n",
     "cleanup()\n",
     "H, persis_info, flag = ensemble.run()\n",

diff --git a/examples/tutorials/simple_sine/sine_tutorial_notebook.ipynb b/examples/tutorials/simple_sine/sine_tutorial_notebook.ipynb
@@ -186,7 +186,6 @@
     "\n",
     "# Initialize and run the ensemble.\n",
     "ensemble = Ensemble(sim_specs, gen_specs, exit_criteria, libE_specs)\n",
-    "ensemble.add_random_streams()  # setup the random streams unique to each worker\n",
     "H, persis_info, flag = ensemble.run()  # start the ensemble. Blocks until completion."
    ]
   },

diff --git a/libensemble/alloc_funcs/persistent_aposmm_alloc.py b/libensemble/alloc_funcs/persistent_aposmm_alloc.py
@@ -21,6 +21,8 @@ def persistent_aposmm_alloc(W, H, sim_specs, gen_specs, alloc_specs, persis_info
     if libE_info["sim_max_given"] or not libE_info["any_idle_workers"]:
         return {}, persis_info
 
+    if not persis_info:
+        persis_info = {i: {} for i in range(len(W))}
     user = {**gen_specs, **alloc_specs.get("user", {})}
     init_sample_size = user["initial_batch_size"]
     manage_resources = libE_info["use_resource_sets"]
@@ -70,7 +72,7 @@ def persistent_aposmm_alloc(W, H, sim_specs, gen_specs, alloc_specs, persis_info
     if persis_info.get("gen_started") is None:
         for wid in support.avail_worker_ids(persistent=False, gen_workers=True):
             # Finally, call a persistent generator as there is nothing else to do.
-            persis_info.get(wid)["nworkers"] = len(W)
+            persis_info[wid]["nworkers"] = len(W)
             try:
                 Work[wid] = support.gen_work(
                     wid, gen_specs.get("in", []), range(len(H)), persis_info.get(wid), persistent=True

diff --git a/libensemble/alloc_funcs/start_persistent_local_opt_gens.py b/libensemble/alloc_funcs/start_persistent_local_opt_gens.py
@@ -27,6 +27,9 @@ def start_persistent_local_opt_gens(W, H, sim_specs, gen_specs, alloc_specs, per
     if libE_info["sim_max_given"] or not libE_info["any_idle_workers"]:
         return {}, persis_info
 
+    if not persis_info:
+        persis_info = {i: {} for i in range(len(W))}
+
     manage_resources = libE_info["use_resource_sets"]
     support = AllocSupport(W, manage_resources, persis_info, libE_info)
     Work = {}
@@ -42,6 +45,7 @@ def start_persistent_local_opt_gens(W, H, sim_specs, gen_specs, alloc_specs, per
             opt_ind = np.all(H["x"] == persis_info[i]["x_opt"], axis=1)
             assert sum(opt_ind) == 1, "There must be just one optimum"
             H["local_min"][opt_ind] = True
+        if "rand_stream" in persis_info[i]:
             persis_info[i] = {"rand_stream": persis_info[i]["rand_stream"]}
 
     # If wid is idle, but in persistent mode, and its calculated values have

diff --git a/libensemble/ensemble.py b/libensemble/ensemble.py
@@ -5,7 +5,6 @@
 from libensemble.executors import Executor
 from libensemble.libE import libE
 from libensemble.specs import AllocSpecs, ExitCriteria, GenSpecs, LibeSpecs, SimSpecs
-from libensemble.tools import add_unique_random_streams
 from libensemble.tools import parse_args as parse_args_f
 from libensemble.tools import save_libE_output
 from libensemble.tools.parse_args import mpi_init
@@ -64,7 +63,6 @@ class Ensemble:
                 },
             )
 
-            sampling.add_random_streams()
             sampling.exit_criteria = ExitCriteria(sim_max=100)
 
             if __name__ == "__main__":
@@ -174,7 +172,7 @@ def __init__(
         self.sim_specs = sim_specs
         self.gen_specs = gen_specs
         self.exit_criteria = exit_criteria
-        self._libE_specs: LibeSpecs | None = None
+        self._libE_specs: LibeSpecs | dict | None = None
         if isinstance(libE_specs, dict):
             self._libE_specs = LibeSpecs(**libE_specs)
         else:
@@ -187,16 +185,14 @@ def __init__(
         self._nworkers = 0
         self.is_manager = False
         self.parsed = False
-        self._known_comms = None
+        self._known_comms: str = ""
 
         if parse_args:
             self._parse_args()
             self.parsed = True
-            if self._libE_specs:
-                self._known_comms = self._libE_specs.comms
 
-        if not self._known_comms and self._libE_specs is not None:
-            self._known_comms = self._libE_specs.comms
+        if self._libE_specs:
+            self._known_comms = getattr(self._libE_specs, "comms", "")
 
         if self._known_comms == "local":
             self.is_manager = True
@@ -206,7 +202,7 @@ def __init__(
         elif self._known_comms == "mpi" and not parse_args:
             # Set internal _nworkers - not libE_specs (avoid "nworkers will be ignored" warning)
             if self._libE_specs:
-                self._nworkers, self.is_manager = mpi_init(self._libE_specs.mpi_comm)
+                self._nworkers, self.is_manager = mpi_init(getattr(self._libE_specs, "mpi_comm", None))
 
     def _parse_args(self) -> tuple[int, bool, LibeSpecs]:
         # Set internal _nworkers - not libE_specs (avoid "nworkers will be ignored" warning)
@@ -301,7 +297,7 @@ def run(self) -> tuple[npt.NDArray, dict, int]:
         """
 
         self._refresh_executor()
-        if self._libE_specs and self._libE_specs.comms != self._known_comms:
+        if self._libE_specs and getattr(self._libE_specs, "comms", "") != self._known_comms:
             raise ValueError(CHANGED_COMMS_WARN)
 
         assert self._libE_specs is not None
@@ -327,32 +323,6 @@ def nworkers(self, value):
         if self._libE_specs:
             self._libE_specs.nworkers = value
 
-    def add_random_streams(self, num_streams: int = 0, seed: str = ""):
-        """
-
-        Adds ``np.random`` generators for each worker ID to ``self.persis_info``.
-
-        Parameters
-        ----------
-
-        num_streams: int, Optional
-
-            Number of matching worker ID and random stream entries to create. Defaults to
-            ``self.nworkers``.
-
-        seed: str, Optional
-
-            Seed for NumPy's RNG.
-
-        """
-        if num_streams:
-            nstreams = num_streams
-        else:
-            nstreams = self.nworkers
-
-        self.persis_info = add_unique_random_streams(self.persis_info, nstreams + 1, seed=seed)
-        return self.persis_info
-
     def save_output(self, basename: str, append_attrs: bool = True):
         """
         Writes out History array and persis_info to files.