Commit feb1d95

Merge pull request mala-project#666 from mala-project/develop
v1.4.0 - Mixology Certification
2 parents 3d8ecd0 + 2ca9b6e commit feb1d95

144 files changed

Lines changed: 11908 additions & 1891 deletions


.dockerignore

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 *
-!install/*
+!pipeline/*

.github/workflows/cpu-tests.yml

Lines changed: 3 additions & 3 deletions
@@ -118,7 +118,7 @@ jobs:
 
   cpu-tests:
     needs: build-docker-image-cpu
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-22.04
     env:
       IMAGE_REPO: ${{ needs.build-docker-image-cpu.outputs.image-repo }}
       DOCKER_TAG: ${{ needs.build-docker-image-cpu.outputs.docker-tag }}
@@ -183,7 +183,7 @@ jobs:
         # be there before it has been installed.
         sed -i '/materials-learning-algorithms/d' ./env_after.yml
 
-        # if comparison fails, `install/mala_cpu_[base]_environment.yml` needs to be aligned with
+        # if comparison fails, `pipeline/mala_cpu_[base]_environment.yml` needs to be aligned with
         # `requirements.txt` and/or extra dependencies are missing in the Docker Conda environment
 
         if diff --brief env_before.yml env_after.yml
@@ -230,7 +230,7 @@ jobs:
 
       - name: Test mala
         shell: 'bash -c "docker exec -i mala-cpu bash < {0}"'
-        run: MALA_DATA_REPO=$(pwd)/mala_data pytest --cov=mala --cov-fail-under=60 -m "not examples" --disable-warnings
+        run: MALA_DATA_REPO=$(pwd)/mala_data pytest --cov=mala --cov-fail-under=50 -m "not examples" --disable-warnings
 
   retag-docker-image-cpu:
     needs: [cpu-tests, build-docker-image-cpu]

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,6 @@
 # Byte-compiled / optimized / DLL files
 __pycache__/
+__pycache__
 *.py[cod]
 *$py.class
 
@@ -152,6 +153,7 @@ cython_debug/
 *.out
 *.npy
 *.pkl
+*.pk
 *.pth
 *.json
 
@@ -186,3 +188,7 @@ wandb/
 
 *.zip
 *~
+
+# ACE files & libraries
+*.pkl
+

Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -14,14 +14,14 @@ RUN apt-get --allow-releaseinfo-change update && apt-get upgrade -y && \
 
 # Choose 'cpu' or 'gpu'
 ARG DEVICE=cpu
-COPY install/mala_${DEVICE}_environment.yml .
+COPY pipeline/mala_${DEVICE}_environment.yml .
 RUN conda env create -f mala_${DEVICE}_environment.yml && rm -rf /opt/conda/pkgs/*
 
 # Install optional MALA dependencies into Conda environment with pip
 RUN /opt/conda/envs/mala-${DEVICE}/bin/pip install --no-input --no-cache-dir \
     pytest \
     pytest-cov \
-    oapackage==2.6.8 \
+    oapackage==2.7.14 \
     pqkmeans
 
 RUN echo "source activate mala-${DEVICE}" > ~/.bashrc

docs/source/CONTRIBUTE.md

Lines changed: 5 additions & 2 deletions
@@ -16,11 +16,14 @@ nature of your contribution:
 
 - Bartosz Brzoza (Bugfixes, GNN implementation)
 - Timothy Callow (Grid-size transferability)
+- Petr Cagas (Sample data management and data generation)
+- Matthew Campbell (Active learning)
 - Attila Cangi (Scientific supervision)
 - Austin Ellis (General code infrastructure)
 - Omar Faruk (Training parallelization via horovod)
 - Lenz Fiedler (General code development and maintenance)
 - James Fox (GNN implementation)
+- James Goff (ACE descriptors and forces)
 - Nils Hoffmann (NASWOT method)
 - Kyle Miller (Data shuffling)
 - Daniel Kotik (Documentation and CI)
@@ -33,7 +36,7 @@ nature of your contribution:
 - Siva Rajamanickam (Scientific supervision)
 - Josh Romero (GPU usage improvement for model tuning)
 - Steve Schmerler (Uncertainty quantification)
-- Adam Stephens (Uncertainty quantification work)
+- Adam Stephens (Uncertainty quantification)
 - Hossein Tahmasbi (Minterpy descriptors)
 - Aidan Thompson (Descriptor calculation)
 - Sneha Verma (Tensorboard interface)
@@ -113,7 +116,7 @@ If you add additional dependencies, make sure to add them to `requirements.txt`
 if they are required or to `setup.py` under the appropriate `extras` tag if
 they are not.
 Further, in order for them to be available during the CI tests, make sure to
-add _required_ dependencies to the appropriate environment files in folder `install/` and _extra_ requirements directly in the `Dockerfile` for the `conda` environment build.
+add _required_ dependencies to the appropriate environment files in folder `pipeline/` and _extra_ requirements directly in the `Dockerfile` for the `conda` environment build.
 
 ## Pull Requests
 We actively welcome pull requests.

docs/source/advanced_usage/descriptors.rst

Lines changed: 48 additions & 10 deletions
@@ -1,7 +1,7 @@
 .. _tuning descriptors:
 
-Improved data conversion
-========================
+Advanced descriptor options
+===========================
 
 As a general remark please be reminded that if you have not used LAMMPS
 for your first steps in MALA, and instead used the python-based descriptor
@@ -76,23 +76,20 @@ An example would be this:
 
 .. code-block:: python
 
-    hyperoptimizer.add_snapshot("espresso-out", os.path.join(data_path, "Be_snapshot1.out"),
-                                "numpy", os.path.join(data_path, "Be_snapshot1.out.npy"),
+    hyperoptimizer.add_snapshot("espresso-out", os.path.join(data_path_be, "Be_snapshot1.out"),
+                                "numpy", os.path.join(data_path_be, "Be_snapshot1.out.npy"),
                                 target_units="1/(Ry*Bohr^3)")
-    hyperoptimizer.add_snapshot("espresso-out", os.path.join(data_path, "Be_snapshot2.out"),
-                                "numpy", os.path.join(data_path, "Be_snapshot2.out.npy"),
+    hyperoptimizer.add_snapshot("espresso-out", os.path.join(data_path_be, "Be_snapshot2.out"),
+                                "numpy", os.path.join(data_path_be, "Be_snapshot2.out.npy"),
                                 target_units="1/(Ry*Bohr^3)")
 
 Once this is done, you can start the optimization via
 
 .. code-block:: python
 
-    hyperoptimizer.perform_study(return_plotting=False)
+    hyperoptimizer.perform_study()
     hyperoptimizer.set_optimal_parameters()
 
-If ``return_plotting`` is set to ``True``, relevant plotting data for the
-analysis are returned. This is useful for exploratory searches.
-
 Since the ACSD re-calculates the bispectrum descriptors for each combination
 of hyperparameters, it is useful to use parallel descriptor calculation.
 To do so, you can enable the `MPI <https://www.mpi-forum.org/>`_ capabilites
@@ -118,3 +115,44 @@ Parallelization may also generally be used for data conversion via the
 prior to using the ``DataConverter`` class. Then, all processing will
 be done in parallel - both the descriptor calculation as well as the LDOS
 parsing.
+
+ACE Descriptors
+***************
+
+.. note::
+
+    To use ACE descriptors with MALA, you need to install LAMMPS from source
+    using the ACE descriptor development branch, since the ACE descriptors
+    are not yet part of the descriptor calculation code the MALA team has
+    integrated into mainline LAMMPS. You can find the code here:
+    https://github.com/jmgoff/lammps_compute_PACE/tree/mala-ace-grid.
+
+Recently, and as described in the
+`MALA technical paper <https://arxiv.org/abs/2411.19617>`_, ACE descriptors
+have been implemented as an alternative to bispectrum descriptors. They
+follow the Atomic Cluster Expansion (ACE) formalism, introduced by
+the `eponymous publication <https://journals.aps.org/prb/abstract/10.1103/PhysRevB.99.014104>`_
+by Ralf Drautz. ACE descriptors hold the promise of being more descriptive and
+accurate than bispectrum descriptors and are currently being investigated by
+the MALA team. MALA already implements most functionalities of bispectrum
+descriptors for ACE descriptors. You can use them in the same fashion as
+the bispectrum descriptors, with the only difference being the hyperparameters
+you need to set.
+
+Specifically, by replacing all bispectrum hyperparameters in your script
+with code such as this
+
+.. code-block:: python
+
+    parameters.descriptors.descriptor_type = "ACE"
+    parameters.descriptors.ace_cutoff = 5.8
+    parameters.descriptors.ace_included_expansion_ranks = [1, 2, 3]
+    parameters.descriptors.ace_maximum_l_per_rank = [0, 1, 1]
+    parameters.descriptors.ace_maximum_n_per_rank = [1, 1, 1]
+    parameters.descriptors.ace_minimum_l_per_rank = [0, 0, 0]
+
+ACE descriptors will be used in your processing/training/testing scripts.
+ACE_DOCS_MISSING: Describe what the parameters mean/how to best tune them.
+
+A known current limitation is that ACE descriptors can only be run on CPU.
+A GPU version is currently being developed.
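In the added example, each per-rank hyperparameter list carries one entry per included expansion rank (three ranks, three entries each). A small sanity check one could run before launching a calculation is sketched below; the helper is hypothetical and the one-entry-per-rank rule is an assumption inferred from the example's shapes, not MALA's own validation code.

```python
def check_ace_rank_lists(ranks, max_l, max_n, min_l):
    """Verify that each per-rank ACE hyperparameter list has exactly one
    entry per included expansion rank. Illustrative helper based on the
    shapes in the example above; not a MALA function."""
    named_lists = (("ace_maximum_l_per_rank", max_l),
                   ("ace_maximum_n_per_rank", max_n),
                   ("ace_minimum_l_per_rank", min_l))
    for name, values in named_lists:
        if len(values) != len(ranks):
            raise ValueError(
                f"{name} needs {len(ranks)} entries, got {len(values)}")
    return True

# Matches the example: ranks [1, 2, 3] with three entries per list.
print(check_ace_rank_lists([1, 2, 3], [0, 1, 1], [1, 1, 1], [0, 0, 0]))  # → True
```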

docs/source/advanced_usage/hyperparameters.rst

Lines changed: 29 additions & 1 deletion
@@ -96,6 +96,34 @@ are started with ``wait_time`` time interval in between (to avoid race
 conditions when accessing the same data base) and further only use the data
 base, not MPI, for communication.
 
+The batch job on your HPC cluster will get killed after the designated runtime.
+Then unfinished trials will remain in the Optuna database in state RUNNING.
+
+The current workflow for resuming the study which makes use of MALA's own
+resume tooling
+(see ``examples/advanced/ex05_checkpoint_hyperparameter_optimization.py``) is
+this: before submitting the batch job again and letting the script do the
+resume work, a user needs to modify the database like so:
+
+.. code-block:: bash
+
+    python3 -c "import mala; mala.HyperOptOptuna.requeue_zombie_trials('hyperopt01', 'sqlite:///hyperopt.db')"
+
+which will set the RUNNING trials to state WAITING.
+When Optuna resumes, it will pick up and re-run those, before carrying on
+running the resumed study.
+
+Common questions related to this feature:
+
+- "Does 'injecting' jobs like this disturb Optuna's operation in any way?":
+  No, the study object takes all of its information directly from the
+  data base, which in this case now has "WAITING" trials.
+- "Do those trials have to be run?": Technically not. One could simply ignore
+  them and re-run without them. The problem is that in this case, the study
+  will have missing data points from trials that have been suggested for a
+  reason, so even if Optuna resumed fine, we would still want to re-run them
+  from an optimization point of view.
+
 If you do distributed hyperparameter optimization, another useful option
 is
 
@@ -114,7 +142,7 @@ a physical validation metric such as
 
 .. code-block:: python
 
-    parameters.running.after_training_metric = "band_energy"
+    parameters.running.final_validation_metric = "band_energy"
 
 Advanced optimization algorithms
 ********************************
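The requeue step documented above amounts to a single state flip in the study database. As an illustration only, the sketch below performs the equivalent operation on an in-memory SQLite table; the table and column names are hypothetical, not Optuna's actual schema, and ``requeue_zombie_trials`` remains the supported way to do this in MALA.

```python
import sqlite3

# Illustrative stand-in for an Optuna study storage: a table holding one
# row per trial with its current state. Schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trials (trial_id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO trials (state) VALUES (?)",
    [("COMPLETE",), ("RUNNING",), ("RUNNING",), ("COMPLETE",)],
)

# A batch job killed by the scheduler leaves trials stuck in RUNNING
# ("zombies"). Requeueing flips them to WAITING so the resumed study
# picks them up and re-runs them.
conn.execute("UPDATE trials SET state = 'WAITING' WHERE state = 'RUNNING'")

states = [row[0] for row in
          conn.execute("SELECT state FROM trials ORDER BY trial_id")]
print(states)  # → ['COMPLETE', 'WAITING', 'WAITING', 'COMPLETE']
```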

docs/source/advanced_usage/openpmd.rst

Lines changed: 4 additions & 4 deletions
@@ -33,16 +33,16 @@ be left untouched. Specifically, set
     ...
     # Changes for DataHandler
     data_handler = mala.DataHandler(parameters)
-    data_handler.add_snapshot("Be_snapshot0.in.h5", data_path,
-                              "Be_snapshot0.out.h5", data_path, "tr",
+    data_handler.add_snapshot("Be_snapshot0.in.h5", data_path_be,
+                              "Be_snapshot0.out.h5", data_path_be, "tr",
                               snapshot_type="openpmd")
     ...
     # Changes for DataShuffler
     data_shuffler = mala.DataShuffler(parameters)
     # Data can be shuffle FROM and TO openPMD - but also from
     # numpy to openPMD.
-    data_shuffler.add_snapshot("Be_snapshot0.in.h5", data_path,
-                               "Be_snapshot0.out.h5", data_path,
+    data_shuffler.add_snapshot("Be_snapshot0.in.h5", data_path_be,
+                               "Be_snapshot0.out.h5", data_path_be,
                                snapshot_type="openpmd")
     data_shuffler.shuffle_snapshots(...,
                                     save_name="Be_shuffled*.h5")

docs/source/advanced_usage/predictions.rst

Lines changed: 20 additions & 1 deletion
@@ -105,7 +105,26 @@ CPU or GPU. To do so, simply enable MPI usage in MALA
 
     parameters.use_mpi = True
 
 Once MPI is activated, you can start the MPI aware Python script using
-``mpirun``, ``srun`` or whichever MPI wrapper is used on your machine.
+``mpirun``, ``srun`` or whichever MPI wrapper is used on your machine, for
+example with
+
+.. code-block:: bash
+
+    #!/bin/bash
+    #SBATCH --nodes=NUMBER_OF_NODES
+    #SBATCH --ntasks-per-node=NUMBER_OF_TASKS_PER_NODE
+    #SBATCH --gres=gpu:NUMBER_OF_TASKS_PER_NODE
+    # Add more arguments as needed
+    ...
+
+    # Load more modules as needed
+    ...
+
+    # Depending on your cluster setup, you may need to use srun here
+    # rather than mpirun.
+    # Note that
+    # NUMBER_OF_RANKS = NUMBER_OF_NODES * NUMBER_OF_TASKS_PER_NODE
+    mpirun -np NUMBER_OF_RANKS python3 -u prediction.py
 
 By default, MALA can only operate with a number of processes by which the
 z-dimension of the inference grid can be evenly divided, since the Quantum
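The divisibility constraint in this context line can be checked up front when choosing a rank count for the batch script. A minimal sketch, using a hypothetical helper that is not part of MALA's API:

```python
def valid_rank_counts(nz, max_ranks):
    """Return the MPI rank counts up to max_ranks that evenly divide the
    z-dimension of the inference grid, per the constraint described in
    the docs. Illustrative helper, not a MALA function."""
    return [ranks for ranks in range(1, max_ranks + 1) if nz % ranks == 0]

# e.g. an inference grid with 90 points along z, at most 16 ranks:
print(valid_rank_counts(90, 16))  # → [1, 2, 3, 5, 6, 9, 10, 15]
```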

docs/source/advanced_usage/trainingmodel.rst

Lines changed: 23 additions & 14 deletions
@@ -71,13 +71,13 @@ is directly outputted by MALA. By default, this validation loss gives the
 mean squared error between LDOS prediction and actual value. From a purely
 ML point of view, this is fine; however, the correctness of the LDOS itself
 does not hold much physical virtue. Thus, MALA implements physical validation
-metrics to be accessed before and after the training routine.
+metrics which can be evaluated, for example, after the training.
 
 Specifically, when setting
 
 .. code-block:: python
 
-    parameters.running.after_training_metric = "band_energy"
+    parameters.running.final_validation_metric = "band_energy"
 
 the error in the band energy between actual and predicted LDOS will be
 calculated and printed before and after network training (in meV/atom).
@@ -170,23 +170,32 @@ data sets have to be saved - in-memory implementations are currently developed.
 To use the data shuffling (also shown in example
 ``advanced/ex02_shuffle_data.py``), you can use the ``DataShuffler`` class.
 
-The syntax is very easy, you create a ``DataShufller`` object,
+The syntax is very easy: you create a ``DataShuffler`` object,
 which provides the same ``add_snapshot`` functionalities as the ``DataHandler``
-object, and shuffle the data once you have added all snapshots in question,
-i.e.,
+object, and shuffle the data once you have added all snapshots in question.
+Just as with the ``DataHandler`` class, on-the-fly calculation of bispectrum
+descriptors is supported.
 
 .. code-block:: python
 
     parameters.data.shuffling_seed = 1234
 
     data_shuffler = mala.DataShuffler(parameters)
-    data_shuffler.add_snapshot("Be_snapshot0.in.npy", data_path,
-                               "Be_snapshot0.out.npy", data_path)
-    data_shuffler.add_snapshot("Be_snapshot1.in.npy", data_path,
-                               "Be_snapshot1.out.npy", data_path)
+    data_shuffler.add_snapshot("Be_snapshot0.in.npy", data_path_be,
+                               "Be_snapshot0.out.npy", data_path_be)
+    data_shuffler.add_snapshot("Be_snapshot1.in.npy", data_path_be,
+                               "Be_snapshot1.out.npy", data_path_be)
     data_shuffler.shuffle_snapshots(complete_save_path="../",
                                     save_name="Be_shuffled*")
 
+By using the ``shuffle_to_temporary`` keyword, you can shuffle the data to
+temporary files, which can be deleted after the training run. This is useful
+if you want to shuffle the data right before training and do not plan to re-use
+shuffled data files for multiple training runs. As detailed in
+``advanced/ex02_shuffle_data.py``, access to temporary files is provided via
+``data_shuffler.temporary_shuffled_snapshots[...]``, which is a list containing
+``mala.Snapshot`` objects.
+
 The seed ``parameters.data.shuffling_seed`` ensures reproducibility of data
 sets. The ``shuffle_snapshots`` function has a path handling ability akin to
 the ``DataConverter`` class. Further, via the ``number_of_shuffled_snapshots``
@@ -203,7 +212,7 @@ in the file ``advanced/ex03_tensor_board``. Simply select a logger prior to trai
 .. code-block:: python
 
     parameters.running.logger = "tensorboard"
-    parameters.running.logging_dir = "mala_vis"
+    parameters.running.logging_dir = "mala_logs"
 
 or
 
@@ -215,14 +224,14 @@ or
         entity="your_wandb_entity"
     )
     parameters.running.logger = "wandb"
-    parameters.running.logging_dir = "mala_vis"
+    parameters.running.logging_dir = "mala_logs"
 
 where ``logging_dir`` specifies some directory in which to save the
 MALA logging data. You can also select which metrics to record via
 
 .. code-block:: python
 
-    parameters.validation_metrics = ["ldos", "dos", "density", "total_energy"]
+    parameters.logging_metrics = ["ldos", "dos", "density", "total_energy"]
 
 Full list of available metrics:
 - "ldos": MSE of the LDOS.
@@ -240,14 +249,14 @@ To save time and resources you can specify the logging interval via
 
 .. code-block:: python
 
-    parameters.running.validate_every_n_epochs = 10
+    parameters.running.logging_metrics_interval = 10
 
 If you want to monitor the degree to which the model overfits to the training data,
 you can use the option
 
 .. code-block:: python
 
-    parameters.running.validate_on_training_data = True
+    parameters.running.log_metrics_on_train_set = True
 
 MALA will evaluate the validation metrics on the training set as well as the validation set.
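The role of the ``shuffling_seed`` documented above can be illustrated with a small, self-contained sketch: pooling data from several snapshots and redistributing it reproducibly. The helper below is a hypothetical stand-in for ``mala.DataShuffler``, not its actual implementation.

```python
import random

def shuffle_snapshots_sketch(snapshots, n_out, seed):
    """Pool samples from all snapshots and redistribute them into n_out
    shuffled snapshots. Reproducible for a fixed seed. Illustrative
    stand-in for mala.DataShuffler, not MALA code."""
    pool = [sample for snapshot in snapshots for sample in snapshot]
    random.Random(seed).shuffle(pool)  # deterministic for a given seed
    chunk = len(pool) // n_out
    return [pool[i * chunk:(i + 1) * chunk] for i in range(n_out)]

snap0 = [("snapshot0", i) for i in range(4)]
snap1 = [("snapshot1", i) for i in range(4)]

first = shuffle_snapshots_sketch([snap0, snap1], n_out=2, seed=1234)
second = shuffle_snapshots_sketch([snap0, snap1], n_out=2, seed=1234)
print(first == second)  # → True: same seed, identical shuffled data sets
```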
