computationalUncertaintyLab
diff --git a/‎.DS_Store‎
0 Bytes b/‎.DS_Store‎
diff --git a/‎.gitattributes‎
Lines changed: 4 additions & 0 deletions b/‎.gitattributes‎
diff --git a/‎.ipynb_checkpoints/ch1-checkpoint.ipynb‎
Lines changed: 268 additions & 0 deletions b/‎.ipynb_checkpoints/ch1-checkpoint.ipynb‎
@@ -0,0 +1,4 @@
+
+*.ipynb	diff=jupyternotebook
+
+*.ipynb	merge=jupyternotebook
@@ -0,0 +1,268 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "21a6615c-b9eb-4081-a362-e0bbdf82590a",
+   "metadata": {},
+   "source": [
+    "# Chapter 1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4812cc2a-428d-4eee-aa2a-406a4b4d871e",
+   "metadata": {},
+   "source": [
+    "## Time series data versus IID data \n",
+    "\n",
+    "A typical setup for statistical analysis assumes that a series of experiments generates observations that are independent and identically distributed (often abbreviated i.i.d.). \n",
+    "For example, \n",
+    "\n",
+    "\\begin{align}\n",
+    "    \\mathcal{D} &= ( y_{1}, y_{2}, \\cdots, y_{n}   ) \\\\ \n",
+    "    y_{i} &\\sim \\text{Poisson}(\\lambda)\n",
+    "\\end{align}\n",
+    "\n",
+    "where we use $\\mathcal{D}$ to represent a dataset, lower case letters to represent collected observations, capital letters to represent random variables, and Greek letters to represent parameters. \n",
+    "Because we assume that the above observations were generated from a sequence of i.i.d. Poisson random variables, we can simplify expressions that include the joint probability of $Y_{1}, Y_{2}, \\cdots$. \n",
+    "\n",
+    "\\begin{align}\n",
+    "    P(Y_{1}, Y_{2}) &= P(Y_{1}) \\cdot P(Y_{2}) \\\\ \n",
+    "    P(Y_{1}, Y_{2}, \\cdots, Y_{n}) &= P(Y_{1}) \\cdot P(Y_{2}) \\cdots P(Y_{n}) = \\prod_{i=1}^{n} P(Y_{i}) \\\\ \n",
+    "    & = \\lambda^{\\sum_{i=1}^{n} y_{i} } \\frac{e^{ -n\\lambda }}{ \\prod_{i=1}^{n} y_{i}! } \\propto e^{ -n\\lambda }\\lambda^{\\sum_{i=1}^{n} y_{i} }\n",
+    "\\end{align}\n",
+    "\n",
+    "The expression above is an (often good) approximation of the joint probability of observing all $n$ data points at once. \n",
+    "Unlike more traditional data collection mechanisms, for time series data we cannot assume that the observations are i.i.d.\n",
+    "Instead, we assume that the observation at time $t$ depends on all of the random variables before time $t$. \n",
+    "Then we cannot simplify the joint probability of the first $t$ random variables to the product of their individual probabilities. \n",
+    "\n",
+    "Recall the multiplication rule \n",
+    "\n",
+    "\\begin{align}\n",
+    "    P(A,B,C) &= P( B,C | A ) P(A) \\\\ \n",
+    "              &= P( C | B, A ) P(B|A) P(A)\n",
+    "\\end{align}\n",
+    "\n",
+    "We can still use the multiplication rule to assess the joint probability of a sequence of random variables.\n",
+    "Let's assume that we wish to model some time series process from time unit one up until time unit $T$. \n",
+    "Then we need to estimate probabilities like \n",
+    "\n",
+    "\\begin{align}\n",
+    "    P( Y_{1}, Y_{2}, \\cdots, Y_{T}  ) = P(Y_{1})\\cdot P(Y_{2} | Y_{1}) \\cdot P(Y_{3} | Y_{2},Y_{1}) \\cdots P(Y_{T} | Y_{T-1}, \\cdots, Y_{1})\n",
+    "\\end{align}\n",
+    "\n",
+    "The i.i.d. assumption simplifies the above by assuming that each random variable is independent of all others. \n",
+    "For time series, we want to simplify the above but still keep the most important characteristic of the process: that observations in the future depend on the past. \n",
+    "\n",
+    "### Markov Assumption \n",
+    "\n",
+    "Given a series of random variables, the Markov assumption states that the probability of $Y_{t}$ depends only on the random variable at time $t-1$, or \n",
+    "\n",
+    "\\begin{align}\n",
+    "    P(Y_{t} | Y_{t-1}, Y_{t-2}, \\cdots, Y_{1}) \\approx  P(Y_{t} | Y_{t-1})\n",
+    "\\end{align}\n",
+    "\n",
+    "The Markov assumption aims to capture the most basic attribute of a time series, that future values depend on the recent past, without the heavier requirement of modeling how future values depend on **all** of the past. \n",
+    "\n",
+    "This considerably simplifies the joint probability above \n",
+    "\n",
+    "\\begin{align}\n",
+    "    P( Y_{1}, Y_{2}, \\cdots, Y_{T}  ) &= P(Y_{1})\\cdot P(Y_{2} | Y_{1}) \\cdot P(Y_{3} | Y_{2},Y_{1}) \\cdots P(Y_{T} | Y_{T-1}, \\cdots, Y_{1}) \\\\ \n",
+    "    & \\approx P(Y_{1}) \\cdot P(Y_{2} | Y_{1}) \\cdot P(Y_{3} | Y_{2}) \\cdots P(Y_{T} | Y_{T-1})\n",
+    "\\end{align}\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f2aca56d-50c3-4bd0-a087-81cca8ad4d6b",
+   "metadata": {},
+   "source": [
+    "## Influenza-like illness\n",
+    "\n",
+    "The Centers for Disease Control and Prevention collects a dataset about influenza-like illness, or ILI.\n",
+    "ILI is a CCXZXXZXZXZ. \n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3f585c25-8d3e-43ee-a3b7-05f642670fcb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#--d \n",
+    "import pandas as pd \n",
+    "\n",
+    "d = pd.read_csv(\"./data/XXXXXXXX\")  #<--using pandas to import a dataset\n",
+    "\n",
+    "# plot time series for two states\n",
+    "# x is weeks\n",
+    "# y is percent ILI (column_name = wILI)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0e4ed68f-6af4-4068-9612-52427e027ecd",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#--d \n",
+    "import pandas as pd \n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7614c9a3-efc5-4037-b68d-9a9d97bef067",
+   "metadata": {},
+   "source": [
+    "## COVID Community mobility\n",
+    "\n",
+    "Describe describe describe\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3959d067-a105-4f97-b917-faa514116f36",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#--d \n",
+    "import pandas as pd \n",
+    "\n",
+    "d = pd.read_csv(\"./data/XXXXXXXX\")\n",
+    "\n",
+    "# a plot of one county time series for two activities\n",
+    "\n",
+    "# x is the day \n",
+    "# y - parks_percent_change_from_baseline (<-for example)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0ff18230-d314-4bd8-8033-99c15cd2636d",
+   "metadata": {},
+   "source": [
+    "## Mpox incidence"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "57dfd0e6-9805-4754-b774-738058c1fd2f",
+   "metadata": {},
+   "source": [
+    "## Correlation, Covariance, and the Correlogram"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "29b752f3-aacc-4f33-9e65-5434927cbfaf",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# For ILI we will want to plot the percent ILI at week t versus the percent ILI at week t+1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "40c49a8b-547e-4d27-a64e-b92d6d187980",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# For COVID we will want to plot the behavior at week t versus the behavior at week t+1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9dde1a78-0206-4272-a102-0c2295b6efd3",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aa674e67-12ab-497d-bf4f-aa0c811b8e64",
+   "metadata": {},
+   "source": [
+    "## Smoothing methods"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "50c0eb5c-acb1-4099-8856-aafeadb90719",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e8d7b8b3-1680-4d53-ae30-4804f79d2868",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "915fd2f8-fb4a-4937-aac0-09a36bef5785",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "363ad686-aaf7-427d-9362-87cd366317c6",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8a31ece2-cbf6-499a-9cff-ae9061a08b56",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "899076c1-dd9b-43f9-80a8-9111d06f3626",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.13.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}