This project analyzes airline ticket pricing behavior and builds a predictive model to validate the observed pricing patterns, using structured analytics and machine learning.
It demonstrates the ability to:
- Design a modular analytics pipeline
- Perform structured EDA with business framing
- Engineer predictive features from raw data
- Translate pricing patterns into executive-level insights
- Validate predictive performance using machine learning
📄 Full Executive Summary (Business Version): see docs/executive_summary.md
To understand:
- How flight structure (stops, duration, timing) impacts pricing
- Whether airlines price above or below route averages
- If premium brands consistently charge higher prices
- Whether structured features can predict airline ticket prices
- Non-stop flights average: ₹5,024
- 4-stop flights average: ₹17,686
- Correlation with price:
  - Total Stops → 0.60
  - Duration → 0.51
The number of stops correlates more strongly with price than total flight duration does.
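As a rough illustration of how such correlations can be computed, the sketch below uses pandas on a small made-up table. The column names (`total_stops`, `duration_minutes`, `price`) and all values are assumptions for the example, not the project's actual schema or data.

```python
import pandas as pd

# Toy rows standing in for the engineered dataset (values are made up).
flights = pd.DataFrame({
    "total_stops": [0, 0, 1, 1, 2, 2, 3, 4],
    "duration_minutes": [170, 150, 450, 320, 1140, 930, 1500, 1760],
    "price": [4000, 4200, 7500, 6200, 13000, 11000, 13500, 17000],
})

# Pearson correlation of each numeric feature with price.
corr_with_price = flights.corr(method="pearson")["price"].drop("price")
print(corr_with_price.sort_values(ascending=False))
```

On the real dataset the same call would be applied to the preprocessed DataFrame; the coefficients reported above come from the project, not from this toy table.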
- Premium carriers (Jet Airways Business, Vistara Premium) consistently price above route averages.
- Low-cost carriers (GoAir, SpiceJet) price below route averages.
- This points to brand-driven pricing power that holds independently of route structure.
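One way to express "pricing above or below the route average" is to subtract each route's mean fare from every ticket on that route and then average the difference per airline. The sketch below assumes hypothetical column names (`airline`, `source`, `destination`, `price`) and placeholder airline names; it is not taken from the project's `analysis.py`.

```python
import pandas as pd

# Made-up fares for two routes and two hypothetical carriers.
flights = pd.DataFrame({
    "airline": ["PremiumAir", "BudgetAir", "PremiumAir", "BudgetAir"],
    "source": ["Delhi", "Delhi", "Kolkata", "Kolkata"],
    "destination": ["Cochin", "Cochin", "Bangalore", "Bangalore"],
    "price": [12000, 8000, 9000, 5000],
})

# Mean fare of the route each ticket belongs to, aligned row by row.
route_avg = flights.groupby(["source", "destination"])["price"].transform("mean")

# Positive values mean the ticket is priced above its route average.
flights["price_vs_route_avg"] = flights["price"] - route_avg

# Average premium (or discount) per airline across all of its routes.
premium_by_airline = (
    flights.groupby("airline")["price_vs_route_avg"].mean().sort_values(ascending=False)
)
print(premium_by_airline)
```

Comparing against the route average rather than the global average keeps long or multi-stop routes from inflating an airline's apparent premium.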
A Random Forest model was trained to validate structured pricing predictability.
- R² Score: 0.79
- MAE: ₹1,279
The model explains 79% of price variance using structured features.
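For readers unfamiliar with these metrics, here is a minimal sketch of how R² and MAE can be computed for a Random Forest with scikit-learn. The data is synthetic and the hyperparameters, split ratio, and feature set are assumptions; the project's `model.py` may be configured differently.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: price rises with stops and duration, plus noise.
rng = np.random.default_rng(42)
n = 500
stops = rng.integers(0, 5, size=n)
duration = 120 + 300 * stops + rng.normal(0, 60, size=n)
price = 4000 + 2500 * stops + 2 * duration + rng.normal(0, 800, size=n)
X = np.column_stack([stops, duration])

# Hold out 20% of rows so the metrics reflect unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
pred = model.predict(X_test)

r2 = r2_score(y_test, pred)              # share of variance explained
mae = mean_absolute_error(y_test, pred)  # average absolute error, in price units
print(f"R2: {r2:.2f}  MAE: {mae:.0f}")
```

MAE is reported in the same unit as the target, which is why the project's figure is quoted in rupees.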
- Total Duration (minutes)
- Day of Journey
- Airline (Brand effect)
- Month of Journey
- Total Stops
This suggests structured operational and brand features strongly influence price.
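A ranking like the one above can be read from a fitted Random Forest's `feature_importances_` attribute. The sketch below uses synthetic data with assumed feature names, so the resulting order illustrates the mechanics only and does not reproduce the project's ranking.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic features; journey_month is deliberately unrelated to the target.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "total_stops": rng.integers(0, 5, size=n),
    "duration_minutes": rng.uniform(60, 1800, size=n),
    "journey_month": rng.integers(1, 13, size=n),
})
y = 4000 + 3000 * X["total_stops"] + X["duration_minutes"] + rng.normal(0, 500, size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can overstate high-cardinality features, so a permutation-based check (for example `sklearn.inspection.permutation_importance`) is a reasonable cross-check.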
This heatmap shows the relationship between price, number of stops, and total duration.
- Total Stops shows stronger correlation with Price than Duration
- Stops and Duration are strongly correlated
This chart shows the most influential predictors in airline pricing.
The project follows a modular analytics pipeline design.
airline-pricing-intelligence/
│
├── data/ # Raw dataset
├── docs/
│ ├── executive_summary.md # Executive summary
│ └── pipeline-flow.png # Pipeline flow
│
├── outputs/ # Generated visualizations
│ ├── CorrelationMap.png
│ └── feature_importance.png
│
├── src/
│ ├── data_loader.py # Data ingestion
│ ├── preprocessing.py # Feature engineering
│ ├── analysis.py # Business insights & EDA
│ ├── model.py # ML pipeline & validation
│ └── run_pipeline.py # End-to-end execution
│
├── requirements.txt
└── README.md
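To give a sense of what the feature engineering step in `preprocessing.py` might involve, here is a small sketch that derives duration in minutes, stop count, and journey day and month from raw text columns. The raw column names and formats (`Duration` as "2h 50m", `Total_Stops` as "non-stop" or "2 stops", `Date_of_Journey` as day/month/year) are assumptions, and the helper functions are hypothetical.

```python
import pandas as pd

# Two made-up raw rows in the assumed input format.
raw = pd.DataFrame({
    "Date_of_Journey": ["24/03/2019", "01/05/2019"],
    "Duration": ["2h 50m", "19h"],
    "Total_Stops": ["non-stop", "2 stops"],
})

def duration_to_minutes(text: str) -> int:
    """Convert strings such as '2h 50m' or '19h' to total minutes."""
    hours = minutes = 0
    for part in text.split():
        if part.endswith("h"):
            hours = int(part[:-1])
        elif part.endswith("m"):
            minutes = int(part[:-1])
    return hours * 60 + minutes

def stops_to_int(text: str) -> int:
    """Map 'non-stop' to 0 and 'N stop(s)' to N."""
    cleaned = text.strip().lower()
    return 0 if cleaned == "non-stop" else int(cleaned.split()[0])

journey = pd.to_datetime(raw["Date_of_Journey"], format="%d/%m/%Y")

features = pd.DataFrame({
    "duration_minutes": raw["Duration"].map(duration_to_minutes),
    "total_stops": raw["Total_Stops"].map(stops_to_int),
    "journey_day": journey.dt.day,
    "journey_month": journey.dt.month,
})
print(features)
```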
Run the entire pipeline:
python -m src.run_pipeline
- Python
- pandas
- numpy
- seaborn
- matplotlib
- scikit-learn
- Size: ~518 KB
- Records: 10,682 flights
- Source: Structured airline pricing dataset
This framework demonstrates how structured pricing data can be used to:
- Identify premium vs discount airline positioning
- Quantify brand-based pricing power
- Understand how route structure affects fare dynamics
- Predict expected market pricing with reasonable accuracy (R² of 0.79, MAE of ₹1,279)
Such a system can support:
- Revenue optimization teams
- Competitive pricing analysis
- Route-level pricing strategy
- Airline benchmarking dashboards
- Structured analytical thinking
- Modular Python architecture
- Clean data transformation design
- Business interpretation of model output
- Pricing intelligence understanding
- Ability to move from EDA → feature engineering → validation
This is analytics engineering + business intelligence applied to pricing strategy.
Prajwal Anand
Data Analytics | Pricing Intelligence | Machine Learning
- Cross-validation implementation
- Hyperparameter tuning
- Time-series pricing trend modeling
- Deployment-ready inference API
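As a starting point for the cross-validation item above, the sketch below scores a Random Forest with 5-fold cross-validation in scikit-learn. The data is synthetic, and the fold count, estimator settings, and scoring metric are assumptions rather than decisions made in this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the engineered feature matrix and price target.
rng = np.random.default_rng(7)
n = 300
stops = rng.integers(0, 5, size=n)
duration = rng.uniform(60, 1800, size=n)
X = np.column_stack([stops, duration])
y = 4000 + 3000 * stops + duration + rng.normal(0, 500, size=n)

# Shuffle before splitting so each fold sees a mix of routes and fares.
cv = KFold(n_splits=5, shuffle=True, random_state=7)
model = RandomForestRegressor(n_estimators=50, random_state=7)

# One R² score per fold; the spread indicates how stable the estimate is.
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print(f"R2 per fold: {np.round(scores, 3)}  mean: {scores.mean():.3f}")
```

Reporting the mean and spread across folds gives a more reliable picture than a single train/test split.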


