acobotas/ab-testing-fast-food

Fast Food Marketing Campaign A/B Test Analysis

This is an analysis of the Fast Food Marketing Campaign dataset. According to the dataset description, a fast-food chain tested three different marketing campaigns to promote a new menu item. Weekly sales (in thousands) were recorded across several locations, with each location assigned to one promotion.


Goal of the Test

The main goal of this A/B test was to determine which of the three marketing campaigns led to the highest sales of the new menu item.

Because three campaigns were tested, pairwise comparisons were conducted:

  • Promotion 1 vs Promotion 2
  • Promotion 1 vs Promotion 3
  • Promotion 2 vs Promotion 3

All tests used a 99% confidence level (significance level α = 0.01).


Hypotheses

  • Null Hypothesis (H₀): There is no difference in mean sales per location between the compared promotions. Any observed differences are due to random chance.
  • Alternative Hypothesis (H₁): There is a difference in mean sales per location between the compared promotions. The observed difference is not due to random chance.

Target Metric

The dataset contains several columns, but the most relevant metric for this analysis was:

  • Mean Sales per Location (k$) – the average four-week total sales per location under a given promotion.

Since each store was observed for four weeks, sales were aggregated across all four weeks per location and promotion, resulting in one total sales value per store under its assigned promotion.


Calculations

Table 1 contains the aggregated results needed to analyze the A/B test and reach a decision. The query that produced it is in the appendix.


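| Promotion | Total Locations | Total Sales (k$) | Mean Sales per Location (k$) | StdDev Sales (k$) |
|-----------|-----------------|------------------|------------------------------|-------------------|
| Promo 1   | 43              | 9,993.03         | 232.40                       | 64.11             |
| Promo 2   | 47              | 8,897.93         | 189.32                       | 57.99             |
| Promo 3   | 47              | 10,408.52        | 221.46                       | 65.54             |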

Table 1. Summary of sales by promotion.


Promotion 1 achieved the highest mean sales per location, followed by Promotion 3, while Promotion 2 had the lowest performance. To determine whether these differences were statistically significant, pairwise two-tailed t-tests were conducted at the 99% confidence level.


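| Comparison | Mean Difference (A–B) (k$) | SE (k$) | p-value | Conclusion (99%)      |
|------------|----------------------------|---------|---------|-----------------------|
| P1 vs P2   | +43.08                     | 12.93   | 0.00128 | Significant – P1 > P2 |
| P1 vs P3   | +10.94                     | 13.67   | 0.43    | Not significant       |
| P2 vs P3   | −32.14                     | 12.76   | 0.0136  | Not significant       |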

Table 2. Pairwise comparison results at 99% confidence.

SE (standard error) is the estimated standard deviation of the difference between two sample means. A smaller SE means the mean difference is estimated more precisely.
The p-value is the probability of observing a difference at least this extreme if the true means were equal. A p-value below 0.01 (corresponding to 99% confidence) indicates a statistically significant difference.
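
As a worked check of these definitions, the P1 vs P2 row can be reproduced from the Table 1 figures. This assumes the unpooled (Welch) form of the standard error, which matches the reported SE; the README does not state which t-test variant was used. Here $s$ is the standard deviation, $n$ the number of locations, and $\bar{x}$ the mean sales per location of each group.

```latex
\mathrm{SE} = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}
            = \sqrt{\frac{64.11^2}{43} + \frac{57.99^2}{47}}
            \approx 12.93,
\qquad
t = \frac{\bar{x}_A - \bar{x}_B}{\mathrm{SE}}
  = \frac{232.40 - 189.32}{12.93}
  \approx 3.33
```

The p-value is then the two-tailed probability of a t statistic at least this large in absolute value under the null hypothesis.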


Promotion 1 significantly outperformed Promotion 2. The difference between Promotion 1 and Promotion 3 was not statistically significant, and although Promotion 3 outperformed Promotion 2 on average, the difference was not significant at the 99% confidence level.


Decision

Based on the test results, the recommendation is to select Promotion 1. It achieved the highest mean sales per location and was the only campaign with a statistically significant advantage over another promotion (Promotion 2). Its advantage over Promotion 3 was not statistically significant, so Promotion 3 cannot be ruled out as performing equally well.


Appendix

Query for Table 1

-- Step 1: sum the four weekly sales figures into one total per location.
WITH aggregated AS (
  SELECT
    location_id,
    promotion,
    SUM(sales_in_thousands) AS total_sales
  FROM `tc-da-1.turing_data_analytics.wa_marketing_campaign`
  GROUP BY location_id, promotion
)
-- Step 2: summarize the per-location totals for each promotion.
SELECT
  promotion,
  COUNT(*) AS total_locations,
  SUM(total_sales) AS total_sales,
  AVG(total_sales) AS mean_sales_per_location,
  STDDEV(total_sales) AS stddev_sales  -- sample standard deviation
FROM aggregated
GROUP BY promotion
ORDER BY promotion;

Pairwise t-test calculations for Table 2
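
A minimal sketch of how the Table 2 figures could be recomputed from the Table 1 summary statistics. It assumes Welch's unequal-variance t-test, since this README does not name the variant used, and it integrates the t-density numerically to stay within the Python standard library. The names `SUMMARY`, `welch_test`, `two_sided_p` and `RESULTS` are illustrative and not part of the original analysis. Because the inputs are the rounded values from Table 1, the output may differ from Table 2 in the last digit.

```python
import math
from itertools import combinations

# Summary statistics copied from Table 1: (mean, standard deviation, locations).
SUMMARY = {
    "P1": (232.40, 64.11, 43),
    "P2": (189.32, 57.99, 47),
    "P3": (221.46, 65.54, 47),
}


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-tailed p-value, integrating the density with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


def welch_test(a, b):
    """Welch's t-test from summary statistics (assumed variant)."""
    (mean_a, sd_a, n_a), (mean_b, sd_b, n_b) = SUMMARY[a], SUMMARY[b]
    var_a, var_b = sd_a ** 2 / n_a, sd_b ** 2 / n_b
    se = math.sqrt(var_a + var_b)
    t_stat = (mean_a - mean_b) / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return {"diff": mean_a - mean_b, "se": se, "t": t_stat, "df": df,
            "p": two_sided_p(t_stat, df)}


RESULTS = {(a, b): welch_test(a, b) for a, b in combinations(SUMMARY, 2)}

for (a, b), r in RESULTS.items():
    verdict = "significant" if r["p"] < 0.01 else "not significant"
    print(f"{a} vs {b}: diff={r['diff']:+.2f} SE={r['se']:.2f} "
          f"t={r['t']:.2f} df={r['df']:.1f} p={r['p']:.4f} ({verdict})")
```

The printed rows can be compared directly against Table 2.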


Limitations

Although locations appear to have been selected randomly, it is possible that the locations assigned to Promotion 1 would have generated higher sales under any promotion, and likewise for the other groups. Location-specific factors (e.g., foot traffic, neighborhood demographics, store age) were not controlled for in this analysis and could bias the comparisons.


Recommendations for Future Analysis

Let each location serve as its own control by randomly switching promotions (e.g., daily). This within-location randomization reduces location-specific bias and yields more robust causal estimates.
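
One way such a schedule might be generated is sketched below. The location IDs, the six-day horizon and the block design (shuffling all three promotions within each three-day block so every location sees each promotion equally often) are assumptions made for illustration; the recommendation above only calls for random switching.

```python
import random

PROMOTIONS = ("Promo 1", "Promo 2", "Promo 3")


def build_schedule(location_ids, n_days, seed=0):
    """Return {location_id: [promotion for each day]}.

    Days are grouped into blocks of three. Within each block the three
    promotions are shuffled, so each location runs every promotion equally
    often whenever n_days is a multiple of three.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = {}
    for loc in location_ids:
        days = []
        while len(days) < n_days:
            block = list(PROMOTIONS)
            rng.shuffle(block)
            days.extend(block)
        schedule[loc] = days[:n_days]
    return schedule


# Hypothetical location IDs, used only for illustration.
SCHEDULE = build_schedule(location_ids=[101, 102, 103], n_days=6, seed=42)

for loc, days in SCHEDULE.items():
    print(loc, days)
```

Sales recorded under such a schedule would be compared within each location, which calls for a paired or mixed-effects analysis instead of the independent-groups t-test used here.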

Why a t-test?

A t-test is a statistical method used to compare the means of two groups while taking into account both sample size and variability. It answers the question: Is the observed difference between two group means likely due to random chance, or does it reflect a real underlying difference?

The t-test was applied because:

  • The goal of the analysis was to compare mean sales per location across promotions.
  • Each location was assigned to exactly one promotion, resulting in independent groups.
  • The population standard deviations are unknown and must be estimated from moderately sized samples (around 40–50 locations per group). The t-distribution accounts for this extra uncertainty; at these sample sizes it is close to the normal distribution (z-test) but slightly more conservative.
  • The method accounts for both the difference in means and the variability (standard deviation) within each group, providing a rigorous test of whether observed differences are statistically significant.
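
To see how much the choice of reference distribution matters at these sample sizes, the sketch below takes the P1 vs P2 mean difference and SE from Table 2 and computes the two-tailed p-value a standard normal (z-test) would give, for comparison with the reported t-test p-value. The variable names are illustrative.

```python
from statistics import NormalDist

# Values reported in Table 2 for P1 vs P2.
MEAN_DIFF = 43.08
SE = 12.93
P_VALUE_T = 0.00128  # two-tailed p-value reported for the t-test

test_statistic = MEAN_DIFF / SE

# Two-tailed p-value if the same statistic is referred to a standard normal.
p_value_z = 2 * (1 - NormalDist().cdf(abs(test_statistic)))

print(f"statistic                = {test_statistic:.3f}")
print(f"p (normal approximation) = {p_value_z:.5f}")
print(f"p (t-test, Table 2)      = {P_VALUE_T:.5f}")
```

The normal approximation yields a smaller p-value than the t-test for the same statistic, because the t-distribution has heavier tails.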
