130 changes: 130 additions & 0 deletions chapters/11/5/Coin_Flipping_Test.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A Real-World Hypothesis Test: Coin Flipping\n",
"\n",
"In this section we use a simulation-based hypothesis test to assess whether a coin is fair. This is a classic problem in frequentist statistics and a good illustration of how a sampling distribution under the null hypothesis is used to judge observed data.\n",
"\n",
"A coin is flipped 200 times. If the coin is fair, the expected number of heads is 100, but chance alone will usually produce some deviation from that figure. The question is whether an observed deviation is larger than chance can plausibly explain."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('fivethirtyeight')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting Up the Hypothesis\n",
"\n",
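"The null hypothesis H0 is that the coin is fair, meaning P(heads) = 0.5. The alternative hypothesis H1 is that the coin is not fair, meaning P(heads) ≠ 0.5.\n",
"\n",
"We will reject the null hypothesis at the α = 0.05 significance level. This means that if the coin really is fair, there is a 5% chance that we will wrongly reject the null hypothesis (a Type I error). It does not mean that there is a 95% probability that our conclusion is correct."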
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Simulating the Experiment\n",
"\n",
"Let's simulate flipping a fair coin 200 times and repeat this experiment 10,000 times to approximate the sampling distribution of the number of heads under the null hypothesis."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"num_flips = 200\n",
"num_repetitions = 10000\n",
"\n",
"# Simulate the experiment under the null hypothesis (a fair coin)\n",
"# and record the number of heads in each repetition.\n",
"results = []\n",
"for i in range(num_repetitions):\n",
"    flips = np.random.choice(['H', 'T'], size=num_flips)\n",
"    num_heads = np.count_nonzero(flips == 'H')\n",
"    results.append(num_heads)\n",
"\n",
"results_df = pd.DataFrame({'Number of Heads': results})"
]
},
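{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop above can also be expressed as a single vectorized draw, since the number of heads in 200 fair flips is a Binomial(200, 0.5) count. The cell below is a minimal sketch of that alternative, not the method used in the rest of this section. The names `rng` and `vectorized_results` and the seed value are assumptions introduced here for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical alternative to the loop: draw every repetition at once.\n",
"# The seed is arbitrary and only makes the sketch reproducible.\n",
"rng = np.random.default_rng(0)\n",
"vectorized_results = rng.binomial(n=200, p=0.5, size=10000)\n",
"\n",
"# Each entry is the number of heads in one simulated run of 200 flips.\n",
"print(vectorized_results[:5], vectorized_results.mean())"
]
},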
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualizing the Distribution\n",
"\n",
"The histogram below shows the distribution of the number of heads across our simulations. The distribution is roughly bell-shaped and centered near 100. The number of heads in 200 flips of a fair coin follows a binomial distribution, which for this many flips is well approximated by a normal distribution."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"results_df.plot.hist(bins=20)\n",
"plt.xlabel('Number of Heads')\n",
"plt.title('Sampling Distribution of Number of Heads');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Computing the P-Value\n",
"\n",
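"Suppose we observed 115 heads in our 200 flips. The p-value is the probability, assuming the coin is fair, of seeing a result at least as extreme as the one we observed. It is not the probability that the coin is fair given our data. A small p-value means the observed result would be unusual for a fair coin, which is evidence against the null hypothesis but not proof of bias."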
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
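"observed_heads = 115\n",
"results_array = np.array(results)\n",
"# H1 is two-sided, so count results at least as far from the expected count in either direction.\n",
"expected_heads = num_flips / 2\n",
"observed_deviation = abs(observed_heads - expected_heads)\n",
"p_value = np.count_nonzero(np.abs(results_array - expected_heads) >= observed_deviation) / num_repetitions\n",
"print(f'P-value: {p_value}')"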
]
},
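{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a cross-check on the simulation, the p-value can also be computed exactly from the binomial distribution. The cell below is a minimal sketch, assuming Python 3.8 or later for `math.comb`. The names `n`, `k`, `deviation`, and `exact_p` are introduced here for illustration and are not part of the analysis above. The simulated p-value should land close to the exact one, with the gap shrinking as the number of repetitions grows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from math import comb\n",
"\n",
"# Illustrative values mirroring the experiment: 200 flips, 115 observed heads.\n",
"n, k = 200, 115\n",
"deviation = abs(k - n / 2)\n",
"\n",
"# Exact two-sided p-value under H0: sum the Binomial(n, 0.5) probabilities of\n",
"# every head count at least as far from n/2 as the observed count.\n",
"exact_p = sum(comb(n, x) for x in range(n + 1) if abs(x - n / 2) >= deviation) / 2**n\n",
"print(f'Exact two-sided p-value: {exact_p:.4f}')"
]
},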
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
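"If the p-value is less than 0.05, we reject the null hypothesis at the 5% significance level: the data are inconsistent with a fair coin at that level, and we have evidence that the coin is biased. This is not a proof. A fair coin will still lead to rejection in up to 5% of such experiments, so the conclusion can be wrong.\n",
"\n",
"Hypothesis testing quantifies how surprising the observed data would be under the null hypothesis. It does not prove causation, and it does not establish a hypothesis with certainty."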
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.8.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}