Predict Remaining Useful Life (RUL) from multivariate sensor time series using the NASA Prognostics Center of Excellence (PCoE) C-MAPSS turbofan engine dataset.
This repository is designed as a reproducible, report-ready portfolio project. All experiments are implemented as notebooks that export figures (PDF for LaTeX and PNG for preview) and tables (CSV and LaTeX). The LaTeX report reads these exported artifacts directly, ensuring consistency between experiments and the final report.
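For readers new to the target variable: on run-to-failure trajectories such as the C-MAPSS training files, RUL at a given cycle is commonly taken as the number of cycles remaining until the last recorded cycle of that engine. The sketch below illustrates that definition with the standard library only. The function name `add_rul_labels` and the record layout are assumptions made for illustration, and the notebooks may compute or cap the target differently.

```python
def add_rul_labels(records):
    """Label each (unit_id, cycle) pair with its remaining useful life.

    records: list of (unit_id, cycle) pairs from run-to-failure trajectories.
    Returns a dict mapping (unit_id, cycle) -> RUL in cycles.
    """
    # Last observed cycle per engine (assumed to be the failure point).
    last_cycle = {}
    for unit_id, cycle in records:
        last_cycle[unit_id] = max(cycle, last_cycle.get(unit_id, cycle))

    # RUL counts down to zero at the final cycle of each engine.
    return {(u, c): last_cycle[u] - c for u, c in records}
```

With this definition, the final recorded cycle of every engine has an RUL of 0.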
- Leakage-safe train/validation splitting by engine unit ID
- End-to-end pipeline: raw data → processed data → windowed sequences → models → diagnostics
- Strong baselines (linear and boosting) compared against deep sequence models (CNN and GRU)
- Diagnostics beyond MAE/RMSE (predicted vs. true, residuals, error-by-RUL bins)
- Fully reproducible LaTeX report tied directly to notebook outputs
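The leakage-safe split and the windowing step listed above can be sketched as follows. This is a minimal, standard-library illustration and not the code used in the notebooks: the function names `split_units` and `make_windows`, the 20% validation fraction, the seed, and the window length of 30 cycles are all assumptions. The key idea is that whole engines, never individual rows, are assigned to one side of the split, so no window from a validation engine overlaps with training data.

```python
import random
from collections import defaultdict


def split_units(unit_ids, val_fraction=0.2, seed=42):
    """Assign whole engines to train or validation (never individual rows)."""
    units = sorted(set(unit_ids))
    rng = random.Random(seed)
    rng.shuffle(units)
    n_val = max(1, round(len(units) * val_fraction))
    return set(units[n_val:]), set(units[:n_val])  # (train_units, val_units)


def make_windows(rows, window=30):
    """Build fixed-length sliding windows per engine.

    rows: iterable of (unit_id, cycle, features, rul) tuples.
    Returns a list of (unit_id, window_features, target_rul), where the
    target is the RUL at the last cycle of the window.
    """
    by_unit = defaultdict(list)
    for unit_id, cycle, features, rul in rows:
        by_unit[unit_id].append((cycle, features, rul))

    samples = []
    for unit_id, series in by_unit.items():
        series.sort(key=lambda r: r[0])  # order by cycle
        for end in range(window, len(series) + 1):
            chunk = series[end - window:end]
            samples.append((unit_id, [f for _, f, _ in chunk], chunk[-1][2]))
    return samples
```

Because each sample keeps its `unit_id`, windows can be routed to the training or validation set by checking membership in the sets returned by `split_units`. Engines with fewer cycles than the window length produce no samples in this sketch.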
Validation metrics from the final comparison notebook:
| Model | Val RMSE (cycles) | Val MAE (cycles) |
|---|---|---|
| GRU | 21.41 | 15.14 |
| CNN | 30.38 | 22.29 |
| Boosting (best) | 34.40 | 25.37 |
| Linear (best) | 38.84 | 32.00 |
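The GRU model achieves the lowest validation error on both metrics (RMSE 21.41 vs. 30.38 for the CNN) and shows the most stable behavior close to failure (low RUL), which is particularly important in predictive maintenance settings, where late-life errors are the most costly.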
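For reference, the two headline metrics and an error-by-RUL-bin diagnostic can be computed as in the sketch below. It uses only the standard library. The function names and the bin edges (0, 25, 50, 100, and above) are assumptions for illustration, so the notebooks may bin differently.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_by_rul_bin(y_true, y_pred, edges=(0, 25, 50, 100, float("inf"))):
    """Average absolute error per true-RUL interval [lo, hi).

    Empty bins are omitted from the result.
    """
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        errs = [abs(p - t) for t, p in zip(y_true, y_pred) if lo <= t < hi]
        if errs:
            result[(lo, hi)] = sum(errs) / len(errs)
    return result
```

Reporting the lowest bin separately makes it easy to see whether a model with a good overall RMSE is also accurate close to failure.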
📄 Full report (PDF):
Download the full LaTeX report
assets/            Portfolio visuals (figures and final PDF)
config/            Configuration files (extensible)
data/
  raw/CMAPSS/      Raw NASA C-MAPSS data (NOT committed; see data/README.md)
  processed/       Generated datasets (ignored by git)
notebooks/         Pipeline notebooks 00–08
outputs/
  figures/         Auto-exported PDF and PNG figures (ignored by git)
  tables/          Auto-exported CSV and LaTeX tables (ignored by git)
report/            LaTeX report (reads outputs automatically)
src/               Reusable code and future extensions
- Create a virtual environment and install dependencies
  python -m venv .venv
  Windows: .venv\Scripts\activate
  macOS/Linux: source .venv/bin/activate
  pip install -r requirements.txt
- Obtain the dataset
  Follow the instructions in data/README.md and place the raw C-MAPSS files in:
  data/raw/CMAPSS/
- Run notebooks in the recommended order
  00_setup_and_sanity.ipynb
  01_build_fd001_processed_dataset.ipynb
  02_eda_fd001.ipynb
  03_preprocessing_and_windowing.ipynb
  04_baselines_linear_models.ipynb
  05_boosting_models.ipynb
  06_deep_model_cnn.ipynb
  07_deep_model_gru.ipynb
  08_model_comparison_and_final_figures.ipynb
- Compile the report
  The LaTeX report (report/main.tex) reads figures and tables directly from the outputs directory. If all notebooks ran successfully, the report should compile without manual intervention.
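Before compiling, it can help to confirm that the notebooks actually produced their artifacts. The following is a hypothetical pre-flight check, not a script shipped with the repository. The function name `missing_artifacts` is invented for illustration, and the `.tex` extension for exported LaTeX tables is an assumption. The directory names come from the layout described in this README.

```python
from pathlib import Path


def missing_artifacts(repo_root="."):
    """Return a list of problems; an empty list means the check passed."""
    root = Path(repo_root)
    problems = []

    if not (root / "report" / "main.tex").is_file():
        problems.append("report/main.tex not found")

    # Expected file types per output directory (extensions are assumed).
    expected = {
        "outputs/figures": (".pdf",),
        "outputs/tables": (".csv", ".tex"),
    }
    for rel_dir, suffixes in expected.items():
        folder = root / rel_dir
        if not folder.is_dir():
            problems.append(f"{rel_dir}/ does not exist")
            continue
        for suffix in suffixes:
            if not any(p.suffix == suffix for p in folder.iterdir()):
                problems.append(f"no {suffix} files in {rel_dir}/")
    return problems


if __name__ == "__main__":
    for problem in missing_artifacts("."):
        print("WARNING:", problem)
```

Run from the repository root, the script prints one warning per missing item and nothing if every expected directory contains at least one file of each type. It does not verify that each individual figure or table referenced in the report exists.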
Raw datasets are not committed to GitHub. Generated outputs are ignored by default to keep the repository lightweight. For portfolio purposes, selected figures and the final compiled PDF can be placed in the assets directory.
NASA C-MAPSS Turbofan Engine Degradation Simulation Dataset
https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/

