As someone with sleep apnea, I tracked and analyzed my CPAP data for 2 years. Here's what I found!
| Category | Tools / Frameworks |
|---|---|
| Data Processing | Python, Pandas, NumPy, Jupyter Notebooks |
| AWS Cloud Services | SageMaker AI, S3, Glue |
| ML Models / Techniques | Prophet, Scikit-learn, Random Forest Regression, Bayesian Optimization |
Every night, my CPAP (continuous positive airway pressure) machine uploads data such as `USAGE_HOURS`, `AHI`, and `LEAK`, as well as an overall `SLEEP_SCORE` for the night, which ranges from 0 to 100.
- So I made a regression model that predicts my `SLEEP_SCORE` for any given night
- And a forecasting model that uses historical data to predict what my sleep will be like in the future
- Leveraged SageMaker Studio Jupyter Notebooks for implementation, S3 for data and model storage, and Glue for generating metadata
- Find the relationship between `SLEEP_SCORE` and the other metrics by visualizing how each metric affects the overarching `SLEEP_SCORE`.
- Trained and compared several machine learning models, including Linear Regression, Random Forest, XGBoost, and a Neural Network Regressor on the same data.
- Scikit-Learn Random Forest model performed the best.
- Picked this model, and fine-tuned its hyperparameters with Bayesian Optimization.
- Resulted in an improvement of 8.95% in RMSE and 0.19% in R².
- The tuned model is saved at `models/rf_bayesian_tuned_model.pkl` and achieved an R² of 0.99 and an RMSE of 0.28.
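The comparison step above can be sketched roughly as follows. This is a minimal illustration, not the project's notebook: the data is synthetic, the feature relationship is made up, only two of the four model families are shown, and `pct_improvement` is a hypothetical helper that assumes the reported percentages are relative changes. The Bayesian tuning step is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the nightly CPAP table (NOT real data).
rng = np.random.default_rng(0)
n_nights = 400
X = np.column_stack([
    rng.uniform(3, 9, n_nights),   # USAGE_HOURS
    rng.uniform(0, 10, n_nights),  # AHI
    rng.uniform(0, 30, n_nights),  # LEAK
])
# Hypothetical relationship, chosen only so the models have something to learn.
noise = rng.normal(0, 1, n_nights)
y = np.clip(40 + 7 * X[:, 0] - 2 * X[:, 1] - 0.3 * X[:, 2] + noise, 0, 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train every candidate on the same split and score it the same way.
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "r2": float(r2_score(y_test, pred)),
    }


def pct_improvement(before, after, lower_is_better=True):
    """Relative change in percent; positive means the metric got better."""
    change = (before - after) if lower_is_better else (after - before)
    return change / before * 100


for name, scores in results.items():
    print(f"{name}: RMSE={scores['rmse']:.3f}, R2={scores['r2']:.3f}")
```

With a helper like this, an RMSE gain would be computed as `pct_improvement(rmse_before, rmse_after)` and an R² gain as `pct_improvement(r2_before, r2_after, lower_is_better=False)`.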
Below are the listed features and their calculated weights in determining `SLEEP_SCORE`
- Use time series modeling to extrapolate `SLEEP_SCORE`, `AHI`, `MASK_SESSION`, `USAGE_HOURS`, and `LEAK` metrics 7 days into the future.
- Used Meta's Prophet model to analyze each metric in the dataset, saving the last week for testing predictions
- Achieved RMSEs of 0.83 `SLEEP_SCORE` points, 0.23 `AHI` events, 0.74 `MASK_SESSION` events, 0.52 `USAGE_HOURS` hours, and 1.9 `LEAK` L/h.
- Analyzed weekly seasonality trends, for example my sleep score consistently being worse on Fridays (see below)
- Example trend graph (this tracks the amount of air leaking from the mask)
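The hold-out evaluation described above (train on everything except the final week, then score the forecast for that week) might look something like this sketch. It assumes each metric has been reshaped into Prophet's expected two-column frame (`ds` for the date, `y` for the value). The helper names `split_last_week`, `rmse`, and `evaluate_metric` are hypothetical and not taken from the project.

```python
import numpy as np
import pandas as pd

HORIZON_DAYS = 7  # the last week is held out for testing


def split_last_week(df, horizon=HORIZON_DAYS):
    """Return (train, test), where test is the final `horizon` rows by date."""
    df = df.sort_values("ds").reset_index(drop=True)
    return df.iloc[:-horizon], df.iloc[-horizon:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def evaluate_metric(df, horizon=HORIZON_DAYS):
    """Fit Prophet on all but the last week and score the held-out week."""
    # Imported lazily so the helpers above work without the `prophet` package.
    from prophet import Prophet

    train, test = split_last_week(df, horizon)
    model = Prophet(weekly_seasonality=True)
    model.fit(train)
    # Predict at the held-out dates directly, which tolerates missing nights.
    forecast = model.predict(test[["ds"]])
    return rmse(test["y"], forecast["yhat"])
```

Calling `evaluate_metric` once per metric (for example on a frame built from the date column and `LEAK`) would give one RMSE per metric, in that metric's own units.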
- Loops through each of the Prophet models saved in my S3 bucket, and predicts the values for 7 days after the training data stops
- Displayed tabularly and graphically
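A rough sketch of that loop is shown below. The storage format is an assumption: it supposes each Prophet model was serialized with `prophet.serialize.model_to_json` and stored as a `.json` object under a common prefix. The bucket name, the prefix, and all function names here are hypothetical.

```python
def list_model_keys(s3_client, bucket, prefix="prophet/"):
    """List the keys of serialized models under `prefix` (assumed to be .json)."""
    # A single list_objects_v2 call returns at most 1000 keys, which is
    # plenty for a handful of per-metric models.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [
        obj["Key"]
        for obj in response.get("Contents", [])
        if obj["Key"].endswith(".json")
    ]


def forecast_from_s3(s3_client, bucket, key, horizon=7):
    """Load one saved model and predict `horizon` days past its training data."""
    # Imported lazily; requires the `prophet` package.
    from prophet.serialize import model_from_json

    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    model = model_from_json(body.decode("utf-8"))
    future = model.make_future_dataframe(periods=horizon)  # daily steps
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(horizon)


def forecast_all(bucket, prefix="prophet/", horizon=7):
    """Return {key: forecast DataFrame} for every saved model in the bucket."""
    import boto3  # imported lazily; requires AWS credentials at call time

    s3_client = boto3.client("s3")
    return {
        key: forecast_from_s3(s3_client, bucket, key, horizon)
        for key in list_model_keys(s3_client, bucket, prefix)
    }
```

Each returned frame can be printed as a table or plotted, with `yhat_lower` and `yhat_upper` giving the uncertainty band around the forecast.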




