This project analyzes Walmart sales data to uncover key business insights and forecast future sales. It combines data cleaning, exploratory analysis, feature engineering, and forecasting techniques with deployment of a lightweight web app.
The work demonstrates the end-to-end data science process:
- Cleaning raw transactional data
- Performing Exploratory Data Analysis (EDA)
- Creating a cleaned dataset for modeling
- Building forecasting models for weekly sales trends
- Deploying an app interface for interaction with the analysis
The project's objectives are to:
- Identify trends and patterns in Walmart’s historical sales data.
- Analyze the impact of holidays, promotions, and seasonal events on sales.
- Forecast weekly sales using predictive modeling.
- Build a simple web app for showcasing predictions.
- Provide insights for inventory and staffing decisions.
├── app.py                               # Flask/Streamlit app for deployment
├── ArjunP_24202600_WalmartSales.ipynb   # Jupyter notebook with analysis
├── Uncleaned_Walmart_Sales_Data.csv     # Raw Walmart dataset
├── Cleaned_Walmart_Sales_Data.csv       # Preprocessed dataset
The Walmart dataset includes:
- Store – Store ID
- Date – Weekly sales date
- Weekly_Sales – Sales amount (target variable)
- Holiday_Flag – Whether the week includes a holiday
- Temperature – Average weekly temperature
- Fuel_Price – Fuel cost in the region
- CPI – Consumer Price Index
- Unemployment – Regional unemployment rate
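As a minimal sketch of how a file with these columns might be loaded and checked, the snippet below uses pandas on a few made-up rows. The sample values, the `load_sales` helper name, and the day-first date format are assumptions for illustration; none of them is taken from the project's data or notebook.

```python
import io

import pandas as pd

# Hypothetical sample rows; the values are invented purely to exercise the code.
SAMPLE_CSV = """Store,Date,Weekly_Sales,Holiday_Flag,Temperature,Fuel_Price,CPI,Unemployment
1,05-02-2010,1500000.0,0,42.3,2.57,211.1,8.1
1,12-02-2010,1600000.0,1,38.5,2.55,211.2,8.1
2,05-02-2010,2100000.0,0,40.2,2.57,210.8,8.3
"""

EXPECTED_COLUMNS = [
    "Store", "Date", "Weekly_Sales", "Holiday_Flag",
    "Temperature", "Fuel_Price", "CPI", "Unemployment",
]


def load_sales(source) -> pd.DataFrame:
    """Read the CSV, verify the expected columns, and parse the dates."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Assumption: dates are written day-first (DD-MM-YYYY).
    df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")
    return df.sort_values(["Store", "Date"]).reset_index(drop=True)


sales = load_sales(io.StringIO(SAMPLE_CSV))
print(sales.dtypes)
```

To try this against a real file, pass a path such as `Cleaned_Walmart_Sales_Data.csv` to `load_sales` and adjust the date format if parsing fails.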
The analysis proceeds in four stages:
- Data Cleaning
  - Removed nulls and handled duplicates
  - Standardized column values
  - Created Cleaned_Walmart_Sales_Data.csv
- Exploratory Data Analysis (EDA)
  - Trends across time, stores, and holidays
  - Correlation between macroeconomic variables (Fuel Price, CPI, Unemployment) and sales
- Modeling & Forecasting
  - Time-series forecasting for weekly sales
  - Comparisons of different forecasting methods
- App Development
  - Built a web app (app.py) for simple predictions and visualization
  - Uses Flask/Streamlit for deployment
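The cleaning and forecasting stages above can be sketched roughly as follows. This is not the notebook's code: the helper names (`clean`, `naive_forecast`, `moving_average_forecast`, `mae`), the synthetic series, and the two baseline methods are assumptions, chosen only to show how forecasting methods can be compared on a holdout period. The source does not say which models were actually used.

```python
import numpy as np
import pandas as pd


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows containing nulls, then drop exact duplicate rows."""
    return df.dropna().drop_duplicates().reset_index(drop=True)


def naive_forecast(train: pd.Series, horizon: int) -> np.ndarray:
    """Repeat the last observed value for every future week."""
    return np.full(horizon, train.iloc[-1], dtype=float)


def moving_average_forecast(train: pd.Series, horizon: int, window: int = 4) -> np.ndarray:
    """Repeat the mean of the last `window` observations."""
    return np.full(horizon, train.iloc[-window:].mean(), dtype=float)


def mae(actual, predicted) -> float:
    """Mean absolute error between two equal-length sequences."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))


# Synthetic weekly series for one hypothetical store (not real Walmart data).
rng = np.random.default_rng(0)
weeks = pd.date_range("2010-02-05", periods=60, freq="W-FRI")
raw = pd.DataFrame({
    "Date": weeks,
    "Weekly_Sales": 1_000_000 + rng.normal(0, 50_000, size=len(weeks)),
})
raw.loc[3, "Weekly_Sales"] = np.nan      # inject a null
raw = pd.concat([raw, raw.iloc[[10]]])   # inject a duplicate row

series = clean(raw).set_index("Date")["Weekly_Sales"].sort_index()

# Hold out the last 8 weeks and score each baseline on them.
horizon = 8
train, test = series.iloc[:-horizon], series.iloc[-horizon:]
scores = {
    "naive": mae(test, naive_forecast(train, horizon)),
    "moving_average": mae(test, moving_average_forecast(train, horizon)),
}
for name, score in scores.items():
    print(f"{name}: MAE = {score:,.0f}")
```

Simple baselines like these give a reference point: a more elaborate seasonal model is only worth keeping if it beats them on the same holdout weeks.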
Key findings:
- Holiday weeks show significantly higher sales spikes.
- Unemployment and CPI negatively correlate with sales.
- Store-level analysis reveals variance in sales performance across locations.
- Forecasting models highlight seasonality and long-term trends.
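A sketch of how comparisons like these could be computed with pandas is shown here. The DataFrame is synthetic and built so that holiday weeks are higher and unemployment lowers sales by construction, so its output says nothing about the real dataset. `holiday_uplift` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def holiday_uplift(df: pd.DataFrame) -> float:
    """Ratio of mean holiday-week sales to mean non-holiday-week sales."""
    means = df.groupby("Holiday_Flag")["Weekly_Sales"].mean()
    return float(means.loc[1] / means.loc[0])


# Synthetic data: effects are baked in, so this only illustrates the mechanics.
rng = np.random.default_rng(42)
n = 200
holiday = (np.arange(n) % 10 == 0).astype(int)
unemployment = rng.uniform(5.0, 10.0, size=n)
weekly_sales = (
    1_000_000
    + 200_000 * holiday
    - 30_000 * unemployment
    + rng.normal(0, 20_000, size=n)
)
df = pd.DataFrame({
    "Store": rng.integers(1, 6, size=n),
    "Holiday_Flag": holiday,
    "Unemployment": unemployment,
    "Weekly_Sales": weekly_sales,
})

# Holiday effect: mean holiday sales relative to mean non-holiday sales.
uplift = holiday_uplift(df)

# Pearson correlation between sales and a macroeconomic variable.
corr = df[["Weekly_Sales", "Unemployment"]].corr().loc["Weekly_Sales", "Unemployment"]

# Store-level variation: mean and spread of weekly sales per store.
per_store = df.groupby("Store")["Weekly_Sales"].agg(["mean", "std"])

print(f"holiday uplift ratio: {uplift:.2f}")
print(f"sales vs unemployment correlation: {corr:.2f}")
print(per_store)
```

On the real data, the same three calls can be run on the cleaned DataFrame, with CPI and Fuel_Price added to the correlation columns.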