Commit c2e1bb7

Merge remote-tracking branch 'origin/develop' into develop
2 parents: d52ca0c + ad1b5fd

66 files changed: 2868 additions & 1907 deletions


docs/source/advanced_usage/predictions.rst

Lines changed: 4 additions & 2 deletions

@@ -81,11 +81,13 @@ Gaussian representation of atomic positions. In this algorithm, most of the
 computational overhead of the total energy calculation is offloaded to the
 computation of this Gaussian representation. This calculation is realized via
 LAMMPS and can therefore be GPU accelerated (parallelized) in the same fashion
-as the bispectrum descriptor calculation. Simply activate this option via
+as the bispectrum descriptor calculation. If a GPU is activated (and LAMMPS
+is available), this option will be used by default. It can also manually be
+activated via
 
 .. code-block:: python
 
-    parameters.descriptors.use_atomic_density_energy_formula = True
+    parameters.use_atomic_density_formula = True
 
 The Gaussian representation algorithm is described in
 the publication `Predicting electronic structures at any length scale with machine learning <doi.org/10.1038/s41524-023-01070-z>`_.
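
For context, a minimal sketch of the renamed switch in use (assuming the ``mala.Parameters`` object used throughout the examples in this commit; ``parameters.use_gpu`` is an assumption drawn from MALA's GPU documentation, not from this diff):

    import mala

    parameters = mala.Parameters()
    # With a GPU active and LAMMPS available, the atomic-density total-energy
    # path is now selected automatically (per the docs change above).
    parameters.use_gpu = True
    # Manual override, moved from parameters.descriptors.* to the top level:
    parameters.use_atomic_density_formula = True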

docs/source/advanced_usage/trainingmodel.rst

Lines changed: 52 additions & 9 deletions

@@ -194,22 +194,64 @@ keyword, you can fine-tune the number of new snapshots being created.
 By default, the same number of snapshots as had been provided will be created
 (if possible).
 
-Using tensorboard
-******************
+Logging metrics during training
+*******************************
+
+Training progress in MALA can be visualized via tensorboard or wandb, as also shown
+in the file ``advanced/ex03_tensor_board``. Simply select a logger prior to training as
+
+.. code-block:: python
+
+    parameters.running.logger = "tensorboard"
+    parameters.running.logging_dir = "mala_vis"
 
-Training routines in MALA can be visualized via tensorboard, as also shown
-in the file ``advanced/ex03_tensor_board``. Simply enable tensorboard
-visualization prior to training via
+or
 
 .. code-block:: python
 
-    # 0: No visualizatuon, 1: loss and learning rate, 2: like 1,
-    # but additionally weights and biases are saved
-    parameters.running.logging = 1
+    import wandb
+    wandb.init(
+        project="mala_training",
+        entity="your_wandb_entity"
+    )
+    parameters.running.logger = "wandb"
     parameters.running.logging_dir = "mala_vis"
 
 where ``logging_dir`` specifies some directory in which to save the
-MALA logging data. Afterwards, you can run the training without any
+MALA logging data. You can also select which metrics to record via
+
+.. code-block:: python
+
+    parameters.validation_metrics = ["ldos", "dos", "density", "total_energy"]
+
+Full list of available metrics:
+- "ldos": MSE of the LDOS.
+- "band_energy": Band energy.
+- "band_energy_actual_fe": Band energy computed with ground truth Fermi energy.
+- "total_energy": Total energy.
+- "total_energy_actual_fe": Total energy computed with ground truth Fermi energy.
+- "fermi_energy": Fermi energy.
+- "density": Electron density.
+- "density_relative": Electron density (Mean Absolute Percentage Error).
+- "dos": Density of states.
+- "dos_relative": Density of states (Mean Absolute Percentage Error).
+
+To save time and resources you can specify the logging interval via
+
+.. code-block:: python
+
+    parameters.running.validate_every_n_epochs = 10
+
+If you want to monitor the degree to which the model overfits to the training data,
+you can use the option
+
+.. code-block:: python
+
+    parameters.running.validate_on_training_data = True
+
+MALA will evaluate the validation metrics on the training set as well as the validation set.
+
+Afterwards, you can run the training without any
 other modifications. Once training is finished (or during training, in case
 you want to use tensorboard to monitor progress), you can launch tensorboard
 via

@@ -221,6 +263,7 @@ via
 The full path for ``path_to_log_directory`` can be accessed via
 ``trainer.full_logging_path``.
 
+If you're using wandb, you can monitor the training progress on the wandb website.
 
 Training in parallel
 ********************
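
Pulling the logging snippets from this file's hunks together, a minimal sketch of a full logging setup (only names that appear in the diff above are used):

    import mala

    parameters = mala.Parameters()
    # Choose a logger and a directory for the logged data.
    parameters.running.logger = "tensorboard"
    parameters.running.logging_dir = "mala_vis"
    # Record a subset of the available validation metrics.
    parameters.validation_metrics = ["ldos", "band_energy", "total_energy"]
    # Validate only every 10th epoch to save time and resources.
    parameters.running.validate_every_n_epochs = 10
    # Also evaluate the metrics on the training set to monitor overfitting.
    parameters.running.validate_on_training_data = True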

docs/source/basic_usage/trainingmodel.rst

Lines changed: 9 additions & 6 deletions

@@ -28,7 +28,7 @@ options to train a simple network with example data, namely
     parameters = mala.Parameters()
 
     parameters.data.input_rescaling_type = "feature-wise-standard"
-    parameters.data.output_rescaling_type = "normal"
+    parameters.data.output_rescaling_type = "minmax"
 
     parameters.network.layer_activations = ["ReLU"]
 

@@ -43,15 +43,18 @@ sub-objects dealing with the individual aspects of the workflow. In the first
 two lines, which data scaling MALA should employ. Scaling data greatly
 improves the performance of NN based ML models. Options are
 
-* ``None``: No normalization is applied.
+* ``None``: No scaling is applied.
 
-* ``standard``: Standardization (Scale to mean 0, standard deviation 1)
+* ``standard``: Standardization (Scale to mean 0, standard deviation 1) is
+  applied to the entire array.
 
-* ``normal``: Min-Max scaling (Scale to be in range 0...1)
+* ``minmax``: Min-Max scaling (Scale to be in range 0...1) is applied to the entire array.
 
-* ``feature-wise-standard``: Row Standardization (Scale to mean 0, standard deviation 1)
+* ``feature-wise-standard``: Standardization (Scale to mean 0, standard
+  deviation 1) is applied to each feature dimension individually.
 
-* ``feature-wise-normal``: Row Min-Max scaling (Scale to be in range 0...1)
+* ``feature-wise-minmax``: Min-Max scaling (Scale to be in range 0...1) is
+  applied to each feature dimension individually.
 
 Here, we specify that MALA should standardize the input (=descriptors)
 by feature (i.e., each entry of the vector separately on the grid) and
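
A small NumPy illustration of how the four scaling options differ (illustrative arithmetic only; MALA's scalers are selected via the strings above):

    import numpy as np

    data = np.random.rand(1000, 91)  # (grid points, feature dimensions)

    # "standard": one mean/std over the entire array.
    standard = (data - data.mean()) / data.std()

    # "feature-wise-standard": mean/std per feature dimension (per column).
    fw_standard = (data - data.mean(axis=0)) / data.std(axis=0)

    # "minmax": scale the entire array into [0, 1].
    minmax = (data - data.min()) / (data.max() - data.min())

    # "feature-wise-minmax": scale each feature dimension into [0, 1].
    fw_minmax = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))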

docs/source/conf.py

Lines changed: 0 additions & 1 deletion

@@ -72,7 +72,6 @@
     "scipy",
     "oapackage",
     "matplotlib",
-    "horovod",
     "lammps",
     "total_energy",
    "pqkmeans",

examples/advanced/ex01_checkpoint_training.py

Lines changed: 1 addition & 1 deletion

@@ -21,7 +21,7 @@ def initial_setup():
     parameters = mala.Parameters()
     parameters.data.data_splitting_type = "by_snapshot"
     parameters.data.input_rescaling_type = "feature-wise-standard"
-    parameters.data.output_rescaling_type = "normal"
+    parameters.data.output_rescaling_type = "minmax"
     parameters.network.layer_activations = ["ReLU"]
     parameters.running.max_number_epochs = 9
     parameters.running.mini_batch_size = 8

examples/advanced/ex03_tensor_board.py

Lines changed: 11 additions & 3 deletions

@@ -13,7 +13,7 @@
 
 parameters = mala.Parameters()
 parameters.data.input_rescaling_type = "feature-wise-standard"
-parameters.data.output_rescaling_type = "normal"
+parameters.data.output_rescaling_type = "minmax"
 parameters.targets.ldos_gridsize = 11
 parameters.targets.ldos_gridspacing_ev = 2.5
 parameters.targets.ldos_gridoffset_ev = -5

@@ -32,11 +32,19 @@
 
 data_handler = mala.DataHandler(parameters)
 data_handler.add_snapshot(
-    "Be_snapshot0.in.npy", data_path, "Be_snapshot0.out.npy", data_path, "tr",
+    "Be_snapshot0.in.npy",
+    data_path,
+    "Be_snapshot0.out.npy",
+    data_path,
+    "tr",
     calculation_output_file=os.path.join(data_path, "Be_snapshot0.out"),
 )
 data_handler.add_snapshot(
-    "Be_snapshot1.in.npy", data_path, "Be_snapshot1.out.npy", data_path, "va",
+    "Be_snapshot1.in.npy",
+    data_path,
+    "Be_snapshot1.out.npy",
+    data_path,
+    "va",
     calculation_output_file=os.path.join(data_path, "Be_snapshot1.out"),
 )
 data_handler.prepare_data()
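
After ``prepare_data()``, the example presumably continues with the usual MALA training steps; a sketch using the ``Network``/``Trainer`` names from MALA's other example scripts (not shown in this hunk):

    # Sketch of the training step that follows; names taken from MALA's
    # example scripts, not from this diff.
    network = mala.Network(parameters)
    trainer = mala.Trainer(parameters, network, data_handler)
    trainer.train_network()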

examples/advanced/ex05_checkpoint_hyperparameter_optimization.py

Lines changed: 1 addition & 1 deletion

@@ -17,7 +17,7 @@
 def initial_setup():
     parameters = mala.Parameters()
     parameters.data.input_rescaling_type = "feature-wise-standard"
-    parameters.data.output_rescaling_type = "normal"
+    parameters.data.output_rescaling_type = "minmax"
     parameters.running.max_number_epochs = 10
     parameters.running.mini_batch_size = 40
     parameters.running.learning_rate = 0.00001

examples/advanced/ex06_distributed_hyperparameter_optimization.py

Lines changed: 1 addition & 1 deletion

@@ -24,7 +24,7 @@
 parameters = mala.Parameters()
 # Specify the data scaling.
 parameters.data.input_rescaling_type = "feature-wise-standard"
-parameters.data.output_rescaling_type = "normal"
+parameters.data.output_rescaling_type = "minmax"
 parameters.running.max_number_epochs = 5
 parameters.running.mini_batch_size = 40
 parameters.running.learning_rate = 0.00001

examples/advanced/ex07_advanced_hyperparameter_optimization.py

Lines changed: 1 addition & 1 deletion

@@ -17,7 +17,7 @@ def optimize_hyperparameters(hyper_optimizer):
 
     parameters = mala.Parameters()
     parameters.data.input_rescaling_type = "feature-wise-standard"
-    parameters.data.output_rescaling_type = "normal"
+    parameters.data.output_rescaling_type = "minmax"
     parameters.running.max_number_epochs = 10
     parameters.running.mini_batch_size = 40
     parameters.running.learning_rate = 0.00001

examples/advanced/ex10_convert_numpy_openpmd.py

Lines changed: 3 additions & 5 deletions

@@ -29,7 +29,7 @@
     descriptor_save_path="./",
     target_save_path="./",
     additional_info_save_path="./",
-    naming_scheme="converted_from_numpy_*.bp5",
+    naming_scheme="converted_from_numpy_*.h5",
     descriptor_calculation_kwargs={"working_directory": "./"},
 )
 

@@ -40,11 +40,9 @@
 for snapshot in range(2):
     data_converter.add_snapshot(
         descriptor_input_type="openpmd",
-        descriptor_input_path="converted_from_numpy_{}.in.bp5".format(
-            snapshot
-        ),
+        descriptor_input_path="converted_from_numpy_{}.in.h5".format(snapshot),
         target_input_type="openpmd",
-        target_input_path="converted_from_numpy_{}.out.bp5".format(snapshot),
+        target_input_path="converted_from_numpy_{}.out.h5".format(snapshot),
         additional_info_input_type=None,
         additional_info_input_path=None,
         target_units=None,
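
The file extension in ``naming_scheme`` selects the openPMD backend, so this hunk switches the example's output from ADIOS2 (``.bp5``) to HDF5 (``.h5``). A minimal sketch for inspecting one converted file with openpmd-api (assuming openpmd-api with HDF5 support is installed; this is not part of the diff):

    import openpmd_api as io

    # Open one converted descriptor file read-only and list its iterations.
    series = io.Series("converted_from_numpy_0.in.h5", io.Access.read_only)
    for index in series.iterations:
        print("Iteration:", index)
    series.close()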
