End-to-end EEG artifact detection pipeline: data preprocessing (Part 1) and CNN training and testing workflow (Part 2).
This repository contains the code for the paper "A Lightweight Deep Convolutional Neural Network for Detecting Artifacts in Continuous EEG Signals".
It implements end-to-end EEG artifact detection using a Deep Lightweight 1D Convolutional Neural Network (DLCNN), alongside rule-based methods adapted from the literature. It targets three artifact categories derived from TUH annotations:
- Eye movements (TARGETED: EYE)
- Muscle (EMG) artifacts (TARGETED: MUSC, CHEW, SHIV)
- Non-physiological artifacts (TARGETED: ELEC, ELPP)
The pipeline includes preprocessing, binary dataset preparation per target, model training, threshold calibration on validation data, final evaluation on held-out test data, optional window-size sweeps, and comparison against rule-based detectors.
- Installation
- Data Availability
- Repository Structure
- Methodological Summary
- Typical Workflow
- Running Tests
- Notes on Models and Checkpoints
- Metrics and Reporting
- Citation
- License
```bash
# Install the package in editable mode (recommended for development)
pip install -e ".[dev]"

# Or install dependencies only
pip install -r requirements.txt
```

All code, trained models, and evaluation scripts are publicly available in this repository. The TUH EEG Artifact Corpus used in this study is available through the Temple University Hospital EEG Corpus at https://isip.piconepress.com/projects/nedc/html/tuh_eeg/. Access requires completion of a data use agreement form submitted to help@nedcdata.org.
To reproduce results locally, download the TUH EEG Artifact Corpus and place the EDF files under edf/. Then follow the Typical Workflow to preprocess, prepare binary datasets, and train/evaluate models. Preprocessed arrays (.npy) and scalers (.joblib) are generated by the pipeline and excluded from version control due to their size.
The DLCNN architecture, training loop, focal loss, and evaluation scripts are dataset-agnostic. However, the preprocessing pipeline (artifact_identification/preprocessing.py) is designed for the TUH EEG Artifact Corpus. To adapt it for a different dataset, modify the following:
| Component | What to change | Location |
|---|---|---|
| Annotation format | The pipeline expects a CSV per recording with `start_time`, `stop_time`, and `label` columns. Reformat your annotations to match, or modify `load_and_validate_file()`. | `preprocessing.py` |
| Artifact labels | TUH-specific labels (`eyem`, `musc`, `elec`, `chew`, `shiv`, `elpp`) are mapped to integer classes in `CONFIG['artifact_mapping']`. Replace these with your dataset's label vocabulary. | `preprocessing.py` |
| Channel montage | A 22-channel bipolar montage based on the 10-20 system is assumed. Update `CONFIG['canonical_channels']` and `CONFIG['bipolar_pairs']` if your dataset uses a different electrode configuration. | `preprocessing.py` |
| File format | EDF (`.edf`) is expected. For other formats (BDF, GDF, etc.), update `load_and_validate_file()` to use the appropriate MNE reader. | `preprocessing.py` |
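As a concrete example of the annotation-format row above, the sketch below converts a hypothetical onset/duration annotation file into the `start_time`/`stop_time`/`label` CSV layout the pipeline expects. The input column names (`onset`, `duration`, `event`) and the label map are illustrative assumptions; adapt them to your dataset's actual vocabulary.

```python
import csv
import io

# Hypothetical source annotations (your dataset's columns will differ).
src = io.StringIO("onset,duration,event\n12.5,3.0,blink\n40.0,1.5,emg\n")

# Illustrative mapping from your labels to TUH-style labels.
label_map = {"blink": "eyem", "emg": "musc"}

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["start_time", "stop_time", "label"])
writer.writeheader()
for row in csv.DictReader(src):
    start = float(row["onset"])
    # Convert onset + duration into the start/stop pair the pipeline reads.
    writer.writerow({
        "start_time": start,
        "stop_time": start + float(row["duration"]),
        "label": label_map[row["event"]],
    })
print(out.getvalue())
```

Writing one such CSV per recording (matching the EDF file names) should let `load_and_validate_file()` pick up the annotations without further changes.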
```text
artifact_identification/            # Root repository
├── pyproject.toml                  # Package configuration and dependencies
├── README.md
├── LICENSE
├── requirements.txt
│
├── artifact_identification/        # Python package
│   ├── __init__.py                 # Package root (exports, __version__)
│   ├── _version.py                 # Version string
│   ├── losses.py                   # Shared focal loss function
│   ├── preprocessing.py            # EEG preprocessing pipeline
│   ├── data_preparation.py         # Binary dataset preparation
│   ├── exploration.py              # Dataset exploration and analysis
│   ├── detectors/                  # Artifact detectors
│   │   ├── __init__.py
│   │   ├── eye_movement.py         # DLCNN for eye movement artifacts
│   │   ├── muscle.py               # DLCNN for muscle artifacts
│   │   ├── non_physiological.py    # DLCNN for non-physiological artifacts
│   │   └── rule_based.py           # Heuristic rule-based detectors
│   ├── evaluation/                 # Model evaluation
│   │   ├── __init__.py
│   │   ├── cnn_vs_rules.py         # CNN vs rule-based comparison
│   │   └── rule_based_eval.py      # Rule-based evaluation
│   └── utils/                      # Utilities
│       ├── __init__.py
│       ├── check_channels.py       # EDF channel inspection
│       └── check_edf.py            # EDF property inspection
│
├── scripts/                        # CLI entry points
│   ├── preprocess.py               # Run preprocessing pipeline
│   ├── prepare_data.py             # Prepare binary datasets
│   ├── train_eye.py                # Train eye movement detector
│   ├── train_muscle.py             # Train muscle artifact detector
│   ├── train_nonphys.py            # Train non-physiological detector
│   ├── evaluate_cnn_vs_rules.py    # CNN vs rules comparison
│   ├── evaluate_rule_based.py      # Rule-based evaluation
│   ├── explore_data.py             # Data exploration
│   └── window_optimization.py      # Window size sweep
│
├── tests/                          # Test suite
│   ├── test_losses.py              # Tests for focal loss
│   └── test_rule_based.py          # Tests for rule-based detectors
│
├── DOCS/                           # Montage and annotation documentation
├── binary_models_data/             # Preprocessed data (generated)
├── results/                        # Training results and plots
└── checkpoints/                    # Model weights (gitignored)
```
- Sampling rate: 250 Hz; standardized 22-channel bipolar montage
- Windows: Non-overlapping; size is configurable (e.g., 1-30 s)
- Split: 60/20/20 at the patient/recording level to prevent leakage
- Normalization: RobustScaler (global fit on training set)
- Loss: Focal loss with class weights for imbalanced data
- Threshold calibration (validation set): Youden's J, fixed specificity, or max TPR at FPR <= 0.1
- Metrics (test set): Sensitivity, specificity, ROC AUC, prevalence-adjusted PR-AUC, partial ROC AUC at FPR <= 0.1
- Rule-based detectors: Literature-adapted bandpower, spectral slope, amplitude/variance, and line-noise features
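To illustrate the threshold-calibration step above, the sketch below picks the cutoff maximizing Youden's J (sensitivity + specificity - 1) on validation scores. This is a minimal NumPy sketch of the standard procedure; `youden_threshold` is a hypothetical helper, and the repository's own implementation may differ in details.

```python
import numpy as np

def youden_threshold(y_true, y_score):
    """Pick the score cutoff maximizing Youden's J = TPR - FPR."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    best_t, best_j = 0.0, -np.inf
    for t in np.unique(y_score):           # each observed score is a candidate
        pred = y_score >= t
        tpr = pred[y_true].mean()          # sensitivity
        fpr = pred[~y_true].mean()         # 1 - specificity
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return best_t, best_j

# Toy validation scores: higher score = more artifact-like.
t, j = youden_threshold([0, 0, 0, 1, 1, 1], [0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
print(t, j)  # 0.7 1.0 (perfectly separable toy data)
```

The same scan generalizes to the other calibration modes listed above, e.g. maximizing TPR subject to `fpr <= 0.1`.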
- Preprocess and window the data (non-overlapping):

```bash
python scripts/preprocess.py --window-seconds 3 --overlap 0.0
```

- Build binary datasets for each target:

```bash
python scripts/prepare_data.py
```

- Train a detector (repeat per target as needed):

```bash
python scripts/train_eye.py
python scripts/train_muscle.py
python scripts/train_nonphys.py
```

- Compare CNN to rule-based methods:

```bash
python scripts/evaluate_cnn_vs_rules.py
```

- Optional: Sweep window sizes:

```bash
python scripts/window_optimization.py --target all --force
```

```bash
# Run the full test suite
pytest

# Run with coverage
pytest --cov=artifact_identification --cov-report=term-missing
```

- Trained weights are saved under `checkpoints/<target>/` with unique timestamps.
- Checkpoints are excluded from Git to keep the repository small.
Detectors report: accuracy, precision, recall (sensitivity), specificity, F1, ROC AUC, PR AUC, prevalence-adjusted PR AUC, and partial ROC AUC (FPR <= 0.1). Thresholds are selected on the validation set and applied to the held-out test set.
Plots saved per run include training history, ROC/PR curves, confusion matrix, and prediction distributions.
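The prevalence-adjusted metrics mentioned above can be understood via the standard Bayes-rule recalibration of precision to a target prevalence: PPV = (TPR · π) / (TPR · π + FPR · (1 − π)). The sketch below illustrates this identity; `adjusted_precision` is a hypothetical helper, not necessarily the exact formulation used in the evaluation scripts.

```python
def adjusted_precision(sensitivity, specificity, prevalence):
    # Re-express precision at an assumed deployment prevalence (Bayes' rule):
    # PPV = (TPR * pi) / (TPR * pi + FPR * (1 - pi))
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

# A detector with 90% sensitivity and specificity looks strong at 50% prevalence...
print(adjusted_precision(0.9, 0.9, 0.5))  # ~0.9
# ...but its precision drops sharply when artifacts are only 10% of windows.
print(adjusted_precision(0.9, 0.9, 0.1))  # ~0.5
```

This is why raw PR-AUC computed on a balanced test split can overstate real-world precision, motivating the prevalence-adjusted variant reported here.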
If this repository is useful in your work, please cite both the paper and the software:
Paper:
E. Nyanney, P.D. Thirumala, S. Visweswaran, Z. Geng, A lightweight deep convolutional neural network for detecting artifacts in continuous EEG signals, Clinical Neurophysiology Practice, 11 (2026) 208–215. https://doi.org/10.1016/j.cnp.2026.03.005
Software:
E. Nyanney, P.D. Thirumala, S. Visweswaran, Z. Geng, EEG-Artifact-Detection-DLCNN: A Lightweight Deep Convolutional Neural Network for Detecting Artifacts in Continuous EEG Signals (v1.0.0), Zenodo (2026). https://doi.org/10.5281/zenodo.19554506
```bibtex
@article{nyanney2026dlcnn,
  title={A lightweight deep convolutional neural network for detecting artifacts in continuous EEG signals},
  author={Nyanney, Evans and Thirumala, Parthasarathy D and Visweswaran, Shyam and Geng, Zhaohui},
  journal={Clinical Neurophysiology Practice},
  year={2026},
  volume={11},
  pages={208--215},
  doi={10.1016/j.cnp.2026.03.005},
  url={https://doi.org/10.1016/j.cnp.2026.03.005}
}

@software{nyanney2026dlcnn_software,
  title={EEG-Artifact-Detection-DLCNN: A Lightweight Deep Convolutional Neural Network for Detecting Artifacts in Continuous EEG Signals},
  author={Nyanney, Evans and Thirumala, Parthasarathy D and Visweswaran, Shyam and Geng, Zhaohui},
  year={2026},
  version={v1.0.0},
  publisher={Zenodo},
  doi={10.5281/zenodo.19554506},
  url={https://doi.org/10.5281/zenodo.19554506}
}
```

For data, please acknowledge the Temple University Hospital EEG Corpus (TUH).
MIT License. See LICENSE for details. Ensure compliance with TUH dataset usage terms.
