Commit 49ef3d9

Fix spelling
1 parent 325a268 commit 49ef3d9

16 files changed

Lines changed: 34 additions & 32 deletions

.wordlist.txt

Lines changed: 2 additions & 0 deletions

@@ -1116,3 +1116,5 @@ VZhV
 whitespaces
 wp
 Ctrl
+rescaled
+UIs

docs/day2/IDEs.rst

Lines changed: 1 addition & 1 deletion

@@ -923,7 +923,7 @@ VS Code
 .. figure:: ../img/vscode_connected_to_rackham.png

 When you first establish the ssh connection to Rackham, your VSCode server directory .vscode-server will be created in your home folder /home/[username].
-This also where VS Code will install all your extentions that can quickly fill up your home directory.
+This also where VS Code will install all your extensions that can quickly fill up your home directory.

 Features
 ########

docs/day2/IDEs_cmd.rst

Lines changed: 3 additions & 3 deletions

@@ -569,7 +569,7 @@ Principles

 Spyder is not available on Dardel.

-- Use the conda env you created in Exercise 2 in `Use isolated environemnts <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_
+- Use the conda env you created in Exercise 2 in `Use isolated environments <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_

 .. code-block:: console

@@ -585,7 +585,7 @@ Principles

 Spyder is not available centrally on Rackham.

-- Use the conda env you created in Exercise 2 in `Use isolated environemnts <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_
+- Use the conda env you created in Exercise 2 in `Use isolated environments <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_

 .. code-block:: console

@@ -697,7 +697,7 @@ Install VS Code on your local machine and follow the steps below to connect to t
 .. figure:: ../img/vscode_connected_to_rackham.png

 When you first establish the ssh connection to the cluster, your VSCode server directory .vscode-server will be created in your home folder /home/[username].
-This also where VS Code will install all your extentions that can quickly fill up your home directory.
+This also where VS Code will install all your extensions that can quickly fill up your home directory.

 Exercises with step-by-step instructions
 ----------------------------------------

docs/day2/install_packages.rst

Lines changed: 1 addition & 1 deletion

@@ -21,7 +21,7 @@ There are 2 ways to install missing python packages at a HPC cluster.
 - Local installation, always available for the version of Python you had active when doing the installation
 - ``pip install --user [package name]``
 - Isolated environment. See next session.
-- virtual environents provided by python
+- virtual environments provided by python
 - conda

 Normally you want reproducibility and the safe way to go is with isolated environments specific to your different projects.
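As an aside, the isolated-environment route described in this hunk can be verified from inside Python itself. A minimal standard-library sketch (the function name is mine, not from the course material):

```python
import sys

def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv/virtualenv.

    Inside an environment, sys.prefix points at the environment,
    while sys.base_prefix still points at the base interpreter.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

print("virtual environment active:", in_virtualenv())
```

This is a handy sanity check after ``source <env>/bin/activate``: the same interpreter prints ``False`` before activation and ``True`` after.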

docs/day2/may2024/install_packages.rst

Lines changed: 1 addition & 1 deletion

@@ -623,7 +623,7 @@ More info

 - With a virtual environment you can tailor an environment with specific versions for Python and packages, not interfering with other installed python versions and packages.
 - Make it for each project you have for reproducibility.
-- There are different tools to create virtual environemnts.
+- There are different tools to create virtual environments.

 - UPPMAX has ``conda`` and ``venv`` and ``virtualenv``
 - HPC2N has ``venv`` and ``virtualenv``

docs/day2/use_isolated_environments_old.rst

Lines changed: 3 additions & 3 deletions

@@ -52,7 +52,7 @@ What happens at activation?
 - Check with ``which python``, should show at path to the environment.
 - In conda you can define python version as well
 - Since ``venv`` is part of Python you will get the python version used when running the ``venv`` command.
-- Packages are defined by the environent.
+- Packages are defined by the environment.
 - Check with ``pip list``
 - Conda can only see what you installed for it.
 - venv and virtualenv also see other packages if you allowed for that when creating the environment (``--system-site-packages``).

@@ -196,7 +196,7 @@ The next points will be the same for all clusters
 .. note::

 - You can use "pip list" on the command line (after loading the python module) to see which packages are available and which versions.
-- Some packaegs may be inhereted from the moduels yopu have loaded
+- Some packages may be inherited from the modules yopu have loaded
 - You can do ``pip list --local`` to see what is installed by you in the environment.
 - Some IDE:s like Spyder may only find those "local" packages

@@ -238,7 +238,7 @@ Conda

 .. tip::

-- The conda environemnts inclusing many small files are by default stored in ``~/.conda`` folder that is in your $HOME directory with limited storage.
+- The conda environments including many small files are by default stored in ``~/.conda`` folder that is in your $HOME directory with limited storage.
 - Move your ``.conda`` directory to your project folder and make a soft link to it from ``$HOME``
 - Do the following (``mkdir -p`` ignores error output and will not recfreate anothe folder if it already exists):
 - (replace what is inside ``<>`` with relevant path)
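The move-and-symlink tip above is normally a pair of shell commands; purely as an illustration, the same steps can be sketched with Python's standard library. Paths and the function name here are placeholders, not the course's actual project paths:

```python
import shutil
from pathlib import Path

def relocate_conda_dir(home: Path, project: Path) -> Path:
    """Move <home>/.conda into <project> and leave a soft link behind."""
    src = home / ".conda"
    dest = project / ".conda"
    dest.parent.mkdir(parents=True, exist_ok=True)  # like mkdir -p
    shutil.move(str(src), str(dest))                # move the real directory
    src.symlink_to(dest, target_is_directory=True)  # soft link from $HOME
    return dest
```

After the move, anything that writes to ``~/.conda`` transparently lands in the project folder, so the small-file quota of ``$HOME`` is no longer consumed.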

docs/day3/big_data.rst

Lines changed: 7 additions & 7 deletions

@@ -33,7 +33,7 @@ High-Performance Data Analytics (HPDA)
 .. admonition:: What is it?
    :class: dropdown

-- **High-performace data analytics (HPDA)**, a subset of high-performance computing which focuses on working with **large data**.
+- **High-performance data analytics (HPDA)**, a subset of high-performance computing which focuses on working with **large data**.

 - The data can come from either computer models and simulations or from experiments and observations, and the goal is to preprocess, analyse and visualise it to generate scientific results.

@@ -102,7 +102,7 @@ Allocating RAM

 .. important::

-- You do not have to explicitely run threads or other parallelism.
+- You do not have to explicitly run threads or other parallelism.
 - Allocating several nodes for one one big problem is not useful.
 - Note that shared memory among the cores works within node only.

@@ -216,7 +216,7 @@ Exercise: Memory allocation (10 min)

 - Slurm flag ``-n <number of cores>``

-.. challenge:: Actually start an interactive sesion with 4 cores for 3 hours.
+.. challenge:: Actually start an interactive session with 4 cores for 3 hours.

 - We will use it for the exercises later.
 - Since it may take some time to get the allocation we do it now already!
@@ -635,7 +635,7 @@ Xarray package
 - It also **borrows heavily from the Pandas package for labelled tabular data** and integrates tightly with dask for parallel computing.

 - Xarray is particularly tailored to working with NetCDF files.
-- But work for aother files as well
+- But work for another files as well

 - Explore it a bit in the (optional) exercise below!
@@ -699,7 +699,7 @@ Big file → split into chunks → parallel workers → results combined.

 .. admonition:: To think of

-- Chunk size and number of them affect the performance due to overhad/administration of the chunking and combination.
+- Chunk size and number of them affect the performance due to overhead/administration of the chunking and combination.
 - Briefly explain what happens when a Dask job runs on multiple cores.

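The pipeline in the hunk header above (big file → split into chunks → parallel workers → results combined) is what Dask automates. A dependency-free sketch of the same pattern, with a toy sum of squares standing in for the real per-chunk work (function names and chunk size are mine, not Dask's API):

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(data, size):
    """Split a list into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_result(chunk):
    """The per-chunk work that each worker runs independently."""
    return sum(x * x for x in chunk)

def sum_of_squares(data, chunk_size=1000):
    chunks = split_into_chunks(data, chunk_size)
    # Threads keep this sketch self-contained; Dask would instead
    # schedule workers across the cores you allocated with Slurm.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = pool.map(partial_result, chunks)
    return sum(partials)  # combine the partial results
```

Smaller chunks give more parallelism but more scheduling/combination overhead, which is exactly the trade-off the "To think of" admonition points at.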
@@ -720,7 +720,7 @@ Big file → split into chunks → parallel workers → results combined.
 Polars package
 ..............

-- ``polars`` is a Python package that presnts itself as **Blazingly Fast DataFrame Library**
+- ``polars`` is a Python package that presents itself as **Blazingly Fast DataFrame Library**
 - Utilizes all available cores on your machine.
 - Optimizes queries to reduce unneeded work/memory allocations.
 - Handles datasets much larger than your available RAM.
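The larger-than-RAM claim in the list above rests on lazy, streaming evaluation. This is not polars code, but the underlying idea can be sketched with plain Python generators, which likewise process one row at a time instead of materializing the whole dataset:

```python
def read_rows(n):
    """Stand-in for streaming rows from a big file: yields lazily."""
    for i in range(n):
        yield {"value": i}

def pipeline(rows):
    """Filter + transform without holding all rows in memory at once."""
    return (row["value"] * 2 for row in rows if row["value"] % 2 == 0)

# Nothing is computed until the reduction at the end pulls rows through.
total = sum(pipeline(read_rows(1_000_000)))
```

A lazy engine such as polars additionally inspects the whole query before running it, so it can drop unused columns and fuse steps; the generator sketch only captures the streaming half of the story.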
@@ -988,7 +988,7 @@ Set up the environment

 - https://stackoverflow.com/questions/72155514/when-to-use-xarray-over-numpy-for-medium-rank-multidimensional-data

-- Browse: https://docs.xarray.dev/en/v2024.11.0/getting-started-guide/why-xarray.html or change to more applicabe version in drop-down menu to lower right.
+- Browse: https://docs.xarray.dev/en/v2024.11.0/getting-started-guide/why-xarray.html or change to more applicable version in drop-down menu to lower right.
 - find something interesting for you! Test some lines if you want to!
 - tips:
 - Pandas: https://docs.xarray.dev/en/v2024.11.0/getting-started-guide/faq.html#why-is-pandas-not-enough

docs/day3/big_data_old.rst

Lines changed: 3 additions & 3 deletions

@@ -17,7 +17,7 @@ High-Performance Data Analytics (HPDA)
 .. admonition:: What is it?
    :class: dropdown

-- **High-performace data analytics (HPDA)**, a subset of high-performance computing which focuses on working with large data.
+- **High-performance data analytics (HPDA)**, a subset of high-performance computing which focuses on working with large data.

 - The data can come from either computer models and simulations or from experiments and observations, and the goal is to preprocess, analyse and visualise it to generate scientific results.

@@ -351,7 +351,7 @@ Allocating RAM
 .. important::

 - Allocate many cores or a full node!
-- You do not have to explicitely run threads or other parallelism.
+- You do not have to explicitly run threads or other parallelism.

 - Note that shared memory among the cores works within node only.

@@ -622,7 +622,7 @@ Exercises
 ssh nid001057

-Use the conda env you created in Exercise 2 in `Use isolated environemnts <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_
+Use the conda env you created in Exercise 2 in `Use isolated environments <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_

 .. code-block:: console
docs/day3/not_used/Seaborn-Intro.rst

Lines changed: 1 addition & 1 deletion

@@ -353,7 +353,7 @@ For the ``map_`` commands, the kwargs depend on the type of plot that was passed
 Heatmap and Clustermap
 ^^^^^^^^^^^^^^^^^^^^^^

-Sometimes you have too many variables to look at with pairplots or corner plots, and the best you can do is map the correlation coeffcients between different parameters. Alternatively, you might have a DataFrame with a comparable number of numeric rows and columns, and you want to see how the rows and columns correlate. Either way, the DataFrame must be able to be coerced to ``ndarray``.
+Sometimes you have too many variables to look at with pairplots or corner plots, and the best you can do is map the correlation coefficients between different parameters. Alternatively, you might have a DataFrame with a comparable number of numeric rows and columns, and you want to see how the rows and columns correlate. Either way, the DataFrame must be able to be coerced to ``ndarray``.

 Once again, this type of plot is extremely tedious to make in pure Matplotlib, but in Seaborn, it can require as little as one line of code. There are two functions that do this: ``sb.heatmap()`` and ``sb.clustermap()``. The main difference between the two is that the latter attempts to rearrange variables such that those that are correlated are positioned next to each other on the plot, while the former simply lists the variables in the order they were given in the DataFrame.
docs/day3/not_used/old-pandas.rst

Lines changed: 3 additions & 3 deletions

@@ -12,7 +12,7 @@ Intro to Pandas
 * A simple interface with the Seaborn plotting library, and increasingly also Matplotlib.
 * Easy multi-threading with Numba.

-**Limitations.** Pandas alone has somewhat limited support for parallelization, N-dimensional data structures, and datasets much larger than 3 GiB. Fortunately, there are packages like ``dask`` and ``polars`` that can help. In partcular, ``dask`` will be covered in a later lecture in this workshop. There is also the ``xarray`` package that provides many similar functions to Pandas for higher-dimensional data structures, but that is outside the scope of this workshop.
+**Limitations.** Pandas alone has somewhat limited support for parallelization, N-dimensional data structures, and datasets much larger than 3 GiB. Fortunately, there are packages like ``dask`` and ``polars`` that can help. In particular, ``dask`` will be covered in a later lecture in this workshop. There is also the ``xarray`` package that provides many similar functions to Pandas for higher-dimensional data structures, but that is outside the scope of this workshop.

 .. admonition:: Get today's tarball!

@@ -197,7 +197,7 @@ Load and Run
 ssh nid001057

-Use the conda env you created in Exercise 2 in `Use isolated environemnts <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_
+Use the conda env you created in Exercise 2 in `Use isolated environments <https://uppmax.github.io/HPC-python/day2/use_isolated_environments.html#exercises>`_

 .. code-block:: console

@@ -579,7 +579,7 @@ Iteration over DataFrames, Series, and GroupBy objects is slow and should be avo
 * ``.str.upper()``/``.lower()``
 * ``.str.<r>strip()``
 * ``.str.<r>split(' ', n=None, expand=False)`` can return outputs of several different shapes depending on ``expand`` (bool, whether to return split strings as lists in 1 column or substrings in multiple columns) and ``n`` (maximum number of columns to return).
-* Unlike for regular strings, ``df.str.replace()`` does not accept dict-type input where keys are existing substrings and values are replacements. For multiple simulataneous replacements via dictionary input, use ``df.replace()`` without the ``.str``.
+* Unlike for regular strings, ``df.str.replace()`` does not accept dict-type input where keys are existing substrings and values are replacements. For multiple simultaneous replacements via dictionary input, use ``df.replace()`` without the ``.str``.

 **Statistics.** Nearly all NumPy statistical functions and a few ``scipy.mstats`` functions can be called as aggregate methods of DataFrames, Series, any subsets thereof, or GroupBy objects. All of them ignore NaNs by default. For DataFrames and GroupBy objects, you must set ``numeric_only=True`` to exclude non-numeric data, and specify whether to aggregate along rows (``axis=0``) or columns (``axis=1``) .
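The ``.str.replace()`` vs ``.replace()`` distinction in the changed line above fits in a few lines; a sketch assuming pandas is installed (the Series contents are made up):

```python
import pandas as pd

s = pd.Series(["cat", "dog", "cat"])

# .str.replace() takes one pattern and one replacement, not a dict:
one_by_one = s.str.replace("cat", "feline")

# For several simultaneous whole-value replacements via a dict,
# use .replace() without the .str accessor:
mapping = {"cat": "feline", "dog": "canine"}
many_at_once = s.replace(mapping)
```

Note that dict-based ``.replace()`` matches whole values by default; pass ``regex=True`` if the keys should be treated as substrings/patterns.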