The orders dataset of a Little Caesars restaurant isn't simply a record of a it's transactions over a period of time. Pizza restaurant order data is highly valuable because it reveals customer demand patterns, seasonal trends, product performance, and ordering behavior.
This raw order data can be used to pinpoint the areas in which sales volume and revenue are highest, thus allowing restaurant stakeholders to make informed decisions around promotions, menu strategy, delivery operations, and customer retention programs.
BUSINESS PROBLEM: When are order volume and revenue the highest? Which menu items generate the most demand? Are there product trends that suggest promotional opportunities?
- Project Overview
- Dataset Summary
- Data Cleaning and Feature Engineering in Python
- Exploratory Data Analysis in MySQL Workbench
- Visualizations in Tableau Public
- Project Insight and Recommendations
- Conclusion
This project analyzes 2024 order data from a Little Caesars restaurant in Brooklyn, New York to uncover meaningful trends in customer preferences and ordering behavior. Raw transactional data was transformed into actionable insights that support strategic business decisions for this pizza restaurant, through data cleaning in Python, SQL querying, and an interactive Tableau dashboard visualization.
The synthetic dataset can be found here. Each row represents an order fulfilled by this Little Caesars restaurant in 2024, and contains information about that order.
- Size (Before Cleaning): 5,050 rows, 11 columns
- Size (After Cleaning): 5,050 rows, 10 columns
- Restaurant Data: Order ID, Order Date, Menu Item Name, Menu Item Category, Menu Item Price, Menu Item Size, Menu Item Quantity, Customer First Name, Customer Last Name, Delivery, Delivery Address
- Feature Engineered Columns: Customer Name (Customer First Name and Customer Last Name were combined into one column, then dropped from the dataset)
This dataset contained many typos and mistakes within the columns, which were corrected throughout the data cleaning process.
1. Some values in the menu_item_category, menu_item_size, and delivery columns have typos, or are recorded in an inconsistent manner. The values within these columns will be corrected and simplified, in order to ensure accurate data when querying and making visualizations:
menu_item_categoryβ Convert all records to eitherPizza,Sides,Wings, orBeveragemenu_item_sizeβ Convert all records to eitherRegular,Large,Small,2L,20oz, or8pcdeliveryβ Convert all records to eitherYesorNo
# Correct and simplify values in menu_item_category column
df["menu_item_category"] = df["menu_item_category"].replace(["Piza", "pzza"], "Pizza")
df["menu_item_category"] = df["menu_item_category"].replace("Sidse", "Sides")
df["menu_item_category"] = df["menu_item_category"].replace("Wing", "Wings")
df["menu_item_category"] = df["menu_item_category"].replace("Bev", "Beverage")
# Correct and simplify values in menu_item_size column
df["menu_item_size"] = df["menu_item_size"].replace("reg", "Regular")
df["menu_item_size"] = df["menu_item_size"].replace("Lrage", "Large")
df["menu_item_size"] = df["menu_item_size"].replace("smll", "Small")
df["menu_item_size"] = df["menu_item_size"].replace("2l", "2L")
df["menu_item_size"] = df["menu_item_size"].replace("20 oz", "20oz")
df["menu_item_size"] = df["menu_item_size"].replace("8 pc", "8pc")
# Correct and simplify values in delivery column
df["delivery"] = df["delivery"].replace(["yes", "Y"], "Yes")
df["delivery"] = df["delivery"].replace(["no", "N"], "No")
# Make sure values in menu_item_category, menu_item_size, and delivery columns are now consistent
print("Unique Category Values: ", df["menu_item_category"].unique())
print("Unique Size Values: ", df["menu_item_size"].unique())
print("Unique Delivery Values: ", df["delivery"].unique())2. Some values in the menu_item_name, customer_first_name, customer_last_name, and delivery_address columns have inconsistent capitalization, or random symbols that don't belong. This will be corrected, while preserving the descriptive text, in order to prevent parsing errors.
# Identify the columns that require cleaning
cols_to_clean = [
"menu_item_name",
"customer_first_name",
"customer_last_name",
"delivery_address"
]
# Identify the symbols that must be removed
unwanted_symbols = r'[\/:;\(\)\-\"ββββ\'\\%]'
# Define a function that cleans the columns
def clean_text(val):
if pd.isna(val):
return val
# Remove unwanted symbols
val = re.sub(unwanted_symbols, '', str(val))
# Normalize extra spaces
val = re.sub(r'\s+', ' ', val).strip()
# Convert words to title case
return val.title()
# Apply the function to the columns
for col in cols_to_clean:
df[col] = df[col].apply(clean_text)
# Make sure text columns were cleaned
print(df[cols_to_clean].head())
# Fix strings that were undesirably altered during the cleaning process
df["menu_item_name"] = df["menu_item_name"].replace("2Liter", "2 Liter", regex=True)
df["menu_item_name"] = df["menu_item_name"].replace("Extramostbestest", "ExtraMostBestest", regex=True)
df["delivery_address"] = df["delivery_address"].replace("Ny", "NY", regex=True)3. In order to simplify the dataset, combine each value in the customer_first_name and customer_last_name columns, to create one customer_name column that contains the customer's full name.
# Combine customer_first_name and customer_last_name columns into customer_name
df["customer_name"] = (
df["customer_first_name"].fillna("") + " " + df["customer_last_name"].fillna("")
).str.strip()
# Place new customer_name column into the correct position
insert_pos = df.columns.get_loc("menu_item_quantity") + 1
col_data = df.pop("customer_name")
df.insert(insert_pos, "customer_name", col_data)
# Drop unnecessary columns
df = df.drop(columns=["customer_first_name", "customer_last_name"])
# Inspect customer_name column
print(df["customer_name"].head(10))An in-depth Jupyter Notebook detailing every step of the data cleaning process is available in this repository.
These SQL queries were used to reveal order data trends and give guidance towards assembling visualizations that pinpoint areas in which Little Caesars could strategize to maximize revenue.
SELECT
COUNT(*) AS total_orders,
SUM(menu_item_quantity) AS total_units_sold,
ROUND(SUM(menu_item_price * menu_item_quantity), 2) AS total_revenue
FROM pizza_restaurant_schema.pizza_restaurant_dataset_cleaned;
SELECT
CASE
WHEN delivery = 'Yes' THEN 'Yes'
WHEN delivery = 'No' THEN 'No'
ELSE 'Unknown'
END AS delivery_status,
COUNT(*) AS order_count,
ROUND(
COUNT(*) * 100.0 /
(SELECT COUNT(*)
FROM pizza_restaurant_schema.pizza_restaurant_dataset_cleaned),
2) AS pct_of_orders
FROM pizza_restaurant_schema.pizza_restaurant_dataset_cleaned
GROUP BY delivery_status;
SELECT
MONTHNAME(order_date) AS month,
COUNT(*) AS order_count,
SUM(menu_item_quantity) AS units_sold,
ROUND(SUM(menu_item_price * menu_item_quantity), 2) AS revenue
FROM pizza_restaurant_schema.pizza_restaurant_dataset_cleaned
GROUP BY month
ORDER BY revenue DESC;
SELECT
menu_item_name,
SUM(menu_item_quantity) AS units_sold,
ROUND(SUM(menu_item_price * menu_item_quantity), 2) AS revenue
FROM pizza_restaurant_schema.pizza_restaurant_dataset_cleaned
GROUP BY menu_item_name
ORDER BY revenue DESC;
SELECT
customer_name,
COUNT(*) AS order_count,
ROUND(SUM(menu_item_price * menu_item_quantity),2) AS total_money_spent
FROM pizza_restaurant_schema.pizza_restaurant_dataset_cleaned
GROUP BY customer_name
ORDER BY order_count DESC
LIMIT 15;
An in-depth SQL file detailing every step of the querying process is available in this repository.
The Tableau Public visualizations can be found here.
The data was organized into one interactive dashboard, which highlights the most relevant information about product trends and buyer behavior. Users can easily see the menu item factors that generated the most revenue, and the times where sales volume was the highest. They can also toggle between menu item categories and see, for example, around which time of year beverages were more popular with customers.
After cleaning the csv file, doing exploratory data analysis, and creating visualizations, the 2024 Little Caesars order data reveals that:
- π 1. INSIGHT: Sales spike significantly during June and December, likely driven by holidays, school breaks, and pre-vacation gatherings. Also, order volume and total units sold are higher on weekends, indicating increased demand during leisure and social time.
- β³ RECOMMENDATION: Meet High Demand Without Sacrificing Quality β Optimize Staffing and Inventory for Peak Periods
- Prepare for higher demand on weekends, June, and December
- Adjust staffing, dough prep, and ingredient availability accordingly
- β³ RECOMMENDATION: Meet High Demand Without Sacrificing Quality β Optimize Staffing and Inventory for Peak Periods
- π
2. INSIGHT: Beverage purchases increase May through August, aligning with warm summer weather and higher cold-drink demand.
- β³ RECOMMENDATION: Capitalize on Seasonal Buying Behavior β Expand Seasonal Beverage Promotions
- Promote large pizza + 2L drink bundles for summer months
- Promote summer combo specials
- β³ RECOMMENDATION: Capitalize on Seasonal Buying Behavior β Expand Seasonal Beverage Promotions
- π 3. INSIGHT: Pizza overwhelmingly outperforms sides, wings, and beverages, making it the core revenue driver. Also, pepperoni and cheese pizzas are the most popular menu items. In addition, large-sized menu items are ordered more frequently than small or regular sizes.
- β³ RECOMMENDATION: Customers Favor Traditional Offerings and Large Orders β Focus Marketing on Best-Selling Items
- Highlight popular menu items like Classic Pepperoni and ExtraMostBestest Cheese Pizzas in marketing campaigns
- Promote large-size combo deals aimed at families, parties, and group orders
- Offer size-upgrade discounts
- β³ RECOMMENDATION: Customers Favor Traditional Offerings and Large Orders β Focus Marketing on Best-Selling Items
- π 4. INSIGHT: Delivery is the dominant ordering method, with 65.88% of orders being delivery.
- β³ RECOMMENDATION: Delivery Speed Impacts Customer Satisfaction β Improve Delivery Efficiency
- Optimize driver staffing and scheduling
- Offer delivery promos on slow days (ex: 20% discount on Tuesdays, Wednesdays, and Thursdays)
- β³ RECOMMENDATION: Delivery Speed Impacts Customer Satisfaction β Improve Delivery Efficiency
- π₯ 5. INSIGHT: A significant number of customers placed multiple orders throughout 2024, with some ordering up to 18 times in one year.
- β³ RECOMMENDATION: Evidence of Repeat Customers β Introduce a Loyalty or Rewards Program
- Launch a points-per-order program
- Offer one free pizza after 10 purchases
- Offer free beverage, wings, or sides add-ons
- β³ RECOMMENDATION: Evidence of Repeat Customers β Introduce a Loyalty or Rewards Program
This project demonstrates how pizza restaurant order data can be transformed into meaningful business intelligence and strategic insight. By identifying seasonal demand patterns, top-performing products, and heavy delivery reliance, Little Caesars can make practical business decisions that enhance profitability, customer satisfaction, and operational efficiency. This analysis provides a strong foundation for future initiatives that help the restaurant better serve its customers while maximizing its revenue.