Computational Metabolomics (Computer Lab Module in BIO513)

The Lab

The aim of this lab is to gain practical insight into metabolomics by reading the article:

Mickiewicz B. et al. (2020). NMR-based metabolic profiling provides diagnostic and prognostic information in critically ill children with suspected infection. Critical Care 16:R172. https://doi.org/10.1038/s41598-020-77319-0

and analysing the corresponding dataset children_infection.csv using Python.

The lab has three steps:

1. Read the relevant parts of the article and try to understand what analyses were performed and what the main findings were.
2. Apply the uni- and multivariate methods you have learnt on the same data using Python. Three student notebooks (block1, block2, block3) are provided as a starting point. The full worked example (metabolomics_workflow_notebook.ipynb) and its documentation (Guide_to_Metabolomics_Workflow.docx) are also available for reference.
3. Compare your results with the article and discuss similarities, differences, and possible reasons for them.

You are free to perform any relevant analyses. However, since PCA and OPLS-DA are standard methods in metabolomics, you are encouraged to include them. Univariate analysis should also be performed.
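
As a rough illustration of the PCA step, the sketch below autoscales a small synthetic metabolite table and projects it onto two principal components with scikit-learn. The data, the group labels, and the size of the group effect are invented for the example; the provided notebooks may preprocess and model the real data differently.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a metabolite table: 30 samples x 8 metabolites.
# The first 15 samples are shifted in four metabolites to mimic a group effect.
X = rng.normal(size=(30, 8))
X[:15, :4] += 3.0
groups = np.array(["Infection"] * 15 + ["Control"] * 15)

# Autoscale (zero mean, unit variance per metabolite), then keep two components.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)

explained = pca.explained_variance_ratio_
print(f"PC1: {explained[0]:.1%}, PC2: {explained[1]:.1%}")
for g in np.unique(groups):
    print(g, "mean PC1 score:", round(float(scores[groups == g, 0].mean()), 2))
```

In a scores plot, `scores[:, 0]` and `scores[:, 1]` would go on the axes, coloured by `groups`. PCA is unsupervised, so any separation seen there is not driven by the group labels.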

Note on software: R is more commonly used in metabolomics research due to the availability of specialised packages. Here we use Python, which gives you more flexibility in how you structure the analysis.

The Dataset

The dataset children_infection.csv contains ¹H NMR urine metabolite profiles from children admitted to a paediatric intensive care unit (PICU), across three groups: Infection, SIRS (systemic inflammatory response syndrome without confirmed infection), and Control (healthy children).
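
A first look at a table of this kind might go as sketched below: count samples per group, check for missing values, and compare group means. The column names (`SampleID`, `Group`, and the metabolite names) and all values are assumptions made for the example, so check the header of the real file before reusing any of it.

```python
import io
import pandas as pd

# Tiny inline stand-in for children_infection.csv.
# Column names and values are hypothetical, not taken from the real file.
csv_text = """SampleID,Group,Lactate,Alanine,Glucose
S1,Infection,1.8,0.40,5.1
S2,Infection,2.1,0.35,4.8
S3,SIRS,1.2,0.50,5.6
S4,SIRS,1.4,,5.3
S5,Control,0.9,0.55,5.0
S6,Control,1.0,0.60,4.9
"""
# With the real data this would be pd.read_csv("children_infection.csv").
df = pd.read_csv(io.StringIO(csv_text))

metabolites = ["Lactate", "Alanine", "Glucose"]
group_counts = df["Group"].value_counts()
missing_per_metabolite = df[metabolites].isna().sum()
group_means = df.groupby("Group")[metabolites].mean()

print(group_counts)
print(missing_per_metabolite)
print(group_means.round(2))
```

Checking group sizes and missing values early helps when justifying preprocessing choices such as imputation or filtering in the report.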

Repository Contents

File	Description
`block1_data_loading_preprocessing.ipynb`	Student notebook — Block 1: data loading and preprocessing
`block2_univariate_pca.ipynb`	Student notebook — Block 2: univariate analysis and PCA
`block3_oplsda_visualisation.ipynb`	Student notebook — Block 3: OPLS-DA and visualisation
`metabolomics_workflow_notebook.ipynb`	Full worked example pipeline for reference
`Guide_to_Metabolomics_Workflow.docx`	Guide to the workflow and all parameters
`children_infection.csv`	NMR metabolite dataset (all three groups)

Getting Started

Option A — Google Colab (recommended, no installation required)

1. Go to colab.research.google.com, choose File → Open notebook → GitHub, and paste the repository URL (https://github.com/BojarLab/Bio513_metabolomics).
2. Run the Step 1 cell to install packages. When prompted, go to Runtime → Restart session.
3. Run the Step 2 cell; children_infection.csv will be downloaded automatically from GitHub.

⚠️ Colab sessions are temporary. Download any output files you want to keep before closing the session. At the start of the next block, upload the files from the previous block when prompted.

Option B — Local Jupyter

Clone the repository:

git clone https://github.com/BojarLab/Bio513_metabolomics.git
cd Bio513_metabolomics

Install dependencies:

pip install pandas numpy matplotlib scipy scikit-learn seaborn

Launch Jupyter and open the notebooks in order:

jupyter notebook

Intermediate Files

Each block saves output files that are used as input in the next block:

File	Generated by	Used by
`results/processed_data.csv`	Block 1	Blocks 2 & 3
`results/transformed_unscaled_data.csv`	Block 1	Blocks 2 & 3
`results/univariate_results.csv`	Block 2	Block 3
`results/metabolomics_report.pdf`	Block 3	Report submission

Local Jupyter: Files are saved and loaded automatically.
Google Colab: Download files at the end of each block and re-upload them at the start of the next.
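
The hand-off between blocks amounts to writing a table to CSV at the end of one notebook and reading it back at the start of the next. The sketch below shows that round trip with pandas. The table contents are hypothetical, and a temporary directory stands in for the repository root so the example does not touch real files.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Temporary directory used in place of the repository root (assumption for the sketch).
root = Path(tempfile.mkdtemp())
results_dir = root / "results"
results_dir.mkdir(parents=True, exist_ok=True)

# Hypothetical processed table: sample IDs as index, one column per variable.
processed = pd.DataFrame(
    {"Group": ["Infection", "Control"], "Lactate": [0.59, -0.11]},
    index=pd.Index(["S1", "S2"], name="SampleID"),
)

out_path = results_dir / "processed_data.csv"
processed.to_csv(out_path)                     # end of one block
reloaded = pd.read_csv(out_path, index_col=0)  # start of the next block

print(reloaded)
```

Passing `index_col=0` on reload restores the sample IDs as the index instead of leaving them as an ordinary column.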

Group Work

The lab is designed to be completed in the same groups as your seminar. Groups should:

Discuss results together before writing them
Submit one report per group

Report Guidelines

Submit one report per group, written as a short scientific paper (approximately 1500–2500 words, excluding figures). Present your results and compare methods and findings with those reported in the original article. Discuss your results in relation to the study's conclusions.

Structure

Section	Content
Introduction	Background on NMR metabolomics, the clinical context, and the aim of your analysis
Methods	Preprocessing choices with justification; statistical tests; multivariate modelling; validation strategy
Results	Univariate findings, PCA, OPLS-DA performance and top metabolites — reference your figures
Discussion	Biological interpretation; comparison with Mickiewicz et al.; limitations

Figures

Include at least four figures with captions:

Volcano plot
PCA scores plot
OPLS-DA scores plot and permutation test
VIP plot or S-plot
(Optional) Boxplots or clustered heatmap
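
The quantities behind a volcano plot can be sketched as follows: a log2 fold change per metabolite for the x-axis, and a per-metabolite p-value (here from Welch's t-test) shown as -log10(p) on the y-axis, with Benjamini-Hochberg correction for multiple testing. The data are synthetic, and the thresholds (q < 0.05, |log2 FC| > 0.5) are illustrative assumptions, not values from the article or the notebooks.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_per_group, n_metabolites = 20, 50

# Synthetic positive "concentrations"; the first 5 metabolites are doubled in cases.
control = rng.lognormal(mean=0.0, sigma=0.3, size=(n_per_group, n_metabolites))
case = rng.lognormal(mean=0.0, sigma=0.3, size=(n_per_group, n_metabolites))
case[:, :5] *= 2.0

# x-axis: log2 fold change of group means.
log2_fc = np.log2(case.mean(axis=0) / control.mean(axis=0))

# Welch's t-test per metabolite on log-transformed values.
_, p_values = stats.ttest_ind(np.log(case), np.log(control), axis=0, equal_var=False)


def benjamini_hochberg(p):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


q_values = benjamini_hochberg(p_values)
neg_log10_p = -np.log10(p_values)  # y-axis
significant = (q_values < 0.05) & (np.abs(log2_fc) > 0.5)
print("Flagged metabolite indices:", np.flatnonzero(significant))
```

Plotting `log2_fc` against `neg_log10_p` and colouring the points by `significant` gives the volcano plot itself.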

Assessment criteria

Correct execution and interpretation of the analyses
Quality and clarity of figures and captions
Depth of biological interpretation and connection to the original article
Critical discussion of methodological choices and their limitations
Clarity of writing
