exercise 13 release

This commit is contained in:
Andrew Straw 2025-01-24 09:29:20 +01:00
parent 56c25de00c
commit 51e62a08de
3 changed files with 1074 additions and 0 deletions

View file

@ -0,0 +1,336 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Kalman filter to estimate a single scalar value\n",
"\n",
"In this notebook we will explore a very simple Kalman filter application. In this case, we will estimate the state of a single scalar value. This could be used, for example, in estimating the concentration of a protein. Our \"motion model\" is that the state changes only due to noise. In other words:\n",
"\n",
"$x_{t+1} = x_t + N(0,\\sigma^2)$\n",
"\n",
"In plain English the value of interest $x$ at time $t+1$ is what it was previously ($x_t$) plus some random normal (also called Gaussian) noise with zero mean and variance $\\sigma^2$.\n",
"\n",
"We use higher dimensional numpy arrays, although this is not strictly necessary. We do this because we want to use the same style of code for higher dimensional problems, which will come next."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import adskalman.adskalman as adskalman\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def column(arr):\n",
" \"\"\"convert 1D array-like to a 2D vertical array\n",
"\n",
" >>> column((1,2,3))\n",
"\n",
" array([[1],\n",
" [2],\n",
" [3]])\n",
" \"\"\"\n",
" arr = np.array(arr)\n",
" assert arr.ndim == 1\n",
" a2 = arr[:, np.newaxis]\n",
" return a2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a 1-dimensional state space model, e.g. concentration:\n",
"dt = 0.01\n",
"true_initial_state = column([0.0])\n",
"# This is F in wikipedia language.\n",
"motion_model = np.array([[1.0]])\n",
"\n",
"# This is Q in wikipedia language. For a constant velocity form, it must take this specific form to be correct.\n",
"motion_noise_covariance = 1000.0*np.array([[dt]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"duration = 0.5\n",
"t = np.arange(0.0, duration, dt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create some fake data with our model.\n",
"current_state = true_initial_state\n",
"state = []\n",
"for _ in t:\n",
" state.append(current_state[:, 0])\n",
" noise_sample = adskalman.rand_mvn(np.zeros(1), motion_noise_covariance, 1).T\n",
" current_state = np.dot(motion_model, current_state) + noise_sample\n",
"state = np.array(state)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.plot(state[:, 0], '.-', label='true values')\n",
"plt.xlabel('t')\n",
"plt.legend()\n",
"_ = plt.ylabel('x');"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create observation model. We take exactly the true value here (noise is added according to the covariance).\n",
"observation_model = np.array([[1.0]])\n",
"observation_noise_covariance = np.array([[10.0]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create noisy observations.\n",
"observation = []\n",
"for current_state in state:\n",
" noise_sample = adskalman.rand_mvn(np.zeros(1), observation_noise_covariance, 1).T\n",
" current_observation = np.dot(observation_model, column(current_state)) + noise_sample\n",
" observation.append(current_observation[:, 0])\n",
"observation = np.array(observation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.plot(observation[:, 0], '.-', label='observations')\n",
"plt.xlabel('t')\n",
"plt.legend()\n",
"_ = plt.ylabel('x')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Run kalman filter on the noisy observations.\n",
"y = observation\n",
"F = motion_model\n",
"H = observation_model\n",
"Q = motion_noise_covariance\n",
"R = observation_noise_covariance\n",
"initx = true_initial_state[:, 0]\n",
"initV = 1000.0*np.eye(1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"kfilt = adskalman.KalmanFilter(F, H, Q, R, initx, initV)\n",
"xfilt = []\n",
"Vfilt = []\n",
"for i, y_i in enumerate(y):\n",
" is_initial = i == 0\n",
" xfilt_i, Vfilt_i = kfilt.step(y=y_i, isinitial=is_initial)\n",
" xfilt.append(xfilt_i)\n",
" Vfilt.append(Vfilt_i)\n",
"xfilt = np.array(xfilt)\n",
"Vfilt = np.array(Vfilt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axs = plt.subplots(nrows=2, figsize=(10,8))\n",
"\n",
"ax = axs[0]\n",
"t = np.arange(len(xfilt[:, 0]))\n",
"low = xfilt[:, 0]-np.sqrt(Vfilt[:, 0, 0])\n",
"high = xfilt[:, 0]+np.sqrt(Vfilt[:, 0, 0])\n",
"ax.fill_between(t, low, high, alpha=0.2, color='green')\n",
"\n",
"ax.plot(t,state[:, 0], '.-', label='true')\n",
"ax.plot(t,observation[:, 0], '.-', label='observed')\n",
"ax.plot(t,xfilt[:, 0], '.-', color='green', label='KF estimate')\n",
"\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x')\n",
"ax.legend()\n",
"\n",
"ax = axs[1]\n",
"ax.plot(Vfilt[:, 0, 0], '.-', label='variance')\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('$\\sigma^2$')\n",
"ax.legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Q1 Note that the initial variance is quite high and then decreases. Why does it decrease after a few observations?\n",
"\n",
"(Your answer should be a text explanation.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "8b3945d6f501419304fe0b128c632407",
"grade": true,
"grade_id": "cell-2e1c1dd5840ee06c",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
},
"tags": []
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Now run again with missing data\n",
"y[20:30, :] = np.nan\n",
"kfilt = adskalman.KalmanFilter(F, H, Q, R, initx, initV)\n",
"xfilt = []\n",
"Vfilt = []\n",
"for i, y_i in enumerate(y):\n",
" is_initial = i == 0\n",
" xfilt_i, Vfilt_i = kfilt.step(y=y_i, isinitial=is_initial)\n",
" xfilt.append(xfilt_i)\n",
" Vfilt.append(Vfilt_i)\n",
"xfilt = np.array(xfilt)\n",
"Vfilt = np.array(Vfilt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axs = plt.subplots(nrows=2, figsize=(10,8))\n",
"\n",
"ax = axs[0]\n",
"t = np.arange(len(xfilt[:, 0]))\n",
"low = xfilt[:, 0]-np.sqrt(Vfilt[:, 0, 0])\n",
"high = xfilt[:, 0]+np.sqrt(Vfilt[:, 0, 0])\n",
"ax.fill_between(t, low, high, alpha=0.2, color='green')\n",
"\n",
"ax.plot(t,state[:, 0], '.-', label='true')\n",
"ax.plot(t,observation[:, 0], '.-', label='observed')\n",
"ax.plot(t,xfilt[:, 0], '.-', color='green', label='KF estimate')\n",
"\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x')\n",
"ax.legend()\n",
"\n",
"ax = axs[1]\n",
"ax.plot(Vfilt[:, 0, 0], '.-', label='variance')\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('$\\sigma^2$')\n",
"ax.legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Q2 Note that the variance increases between t=20 and t=30. Why is it increasing here?\n",
"\n",
"(Your answer should be a text explanation.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "b387c1381c31b0b5ee7b0767cc152d5f",
"grade": true,
"grade_id": "cell-f11dc3c9a0745e25",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
},
"tags": []
},
"source": [
"YOUR ANSWER HERE"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View file

@ -0,0 +1,403 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Estimating two dimensional state (position and velocity) with single dimension observation\n",
"\n",
"Now that we have some experience with multi-variate normals, let's expand our use of the Kalman filter from 1D to 2D. This example will be very similar to the previous example, but now we will use a 2D state model:\n",
"\n",
"$x_{t+1} = x_t + dt \\cdot y_t + noise$\n",
"\n",
"$y_{t+1} = y_t + noise$\n",
"\n",
"This is a \"constant velocity\" motion model. We model the velocity ($y$ here) as being constant plus noise. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import adskalman.adskalman as adskalman\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def column(arr):\n",
" \"\"\"convert 1D array-like to a 2D vertical array\n",
"\n",
" >>> column((1,2,3))\n",
"\n",
" array([[1],\n",
" [2],\n",
" [3]])\n",
" \"\"\"\n",
" arr = np.array(arr)\n",
" assert arr.ndim == 1\n",
" a2 = arr[:, np.newaxis]\n",
" return a2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a 2-dimensional state space model:\n",
"# (x, xvel).\n",
"dt = 0.01\n",
"true_initial_state = column([0.0, 10.0])\n",
"# This is F in wikipedia language.\n",
"motion_model = np.array([[1.0, dt],\n",
" [0.0, 1.0]])\n",
"\n",
"# This is Q in wikipedia language. For a constant velocity form, \n",
"# it must take this specific form to be correct. The\n",
"# only free parameter here is `motion_noise_scale`.\n",
"motion_noise_scale = 1000.0\n",
"\n",
"# Do not change these values\n",
"T3 = dt**3/3\n",
"T2 = dt**2/2\n",
"motion_noise_covariance = motion_noise_scale*np.array([[T3, T2],\n",
" [T2, dt]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"duration = 0.5\n",
"t = np.arange(0.0, duration, dt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create some fake data with our model.\n",
"current_state = true_initial_state\n",
"state = []\n",
"for _ in t:\n",
" state.append(current_state[:, 0])\n",
" noise_sample = adskalman.rand_mvn(np.zeros(2), motion_noise_covariance, 1).T\n",
" current_state = np.dot(motion_model, current_state) + noise_sample\n",
"state = np.array(state)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axes = plt.subplots(nrows=2)\n",
"\n",
"ax = axes[0]\n",
"ax.plot(state[:, 0], '.-', label='true x')\n",
"ax.legend()\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x')\n",
"\n",
"ax = axes[1]\n",
"ax.plot(state[:, 1], '.-', label='true x vel')\n",
"ax.legend()\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x vel');"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create observation model. We can only observe position.\n",
"observation_model = np.array([[1.0, 0.0]])\n",
"observation_noise_covariance = np.array([[0.01]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create noisy observations.\n",
"observation = []\n",
"for current_state in state:\n",
" noise_sample = adskalman.rand_mvn(np.zeros(1), observation_noise_covariance, 1).T\n",
" current_observation = np.dot(observation_model, column(current_state)) + noise_sample\n",
" observation.append(current_observation[:, 0])\n",
"observation = np.array(observation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.plot(observation[:, 0], '.-', label='observation')\n",
"plt.legend()\n",
"plt.xlabel('t')\n",
"plt.ylabel('x');"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Run kalman filter on the noisy observations.\n",
"y = observation\n",
"F = motion_model\n",
"H = observation_model\n",
"Q = motion_noise_covariance\n",
"R = observation_noise_covariance\n",
"initx = true_initial_state[:, 0]\n",
"initV = 0.1*np.eye(2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"kfilt = adskalman.KalmanFilter(F, H, Q, R, initx, initV)\n",
"xfilt = []\n",
"Vfilt = []\n",
"for i, y_i in enumerate(y):\n",
" is_initial = i == 0\n",
" xfilt_i, Vfilt_i = kfilt.step(y=y_i, isinitial=is_initial)\n",
" xfilt.append(xfilt_i)\n",
" Vfilt.append(Vfilt_i)\n",
"xfilt = np.array(xfilt)\n",
"Vfilt = np.array(Vfilt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axs = plt.subplots(nrows=2, figsize=(10,8))\n",
"\n",
"ax = axs[0]\n",
"t = np.arange(len(xfilt[:, 0]))\n",
"low = xfilt[:, 0]-np.sqrt(Vfilt[:, 0, 0])\n",
"high = xfilt[:, 0]+np.sqrt(Vfilt[:, 0, 0])\n",
"ax.fill_between(t, low, high, alpha=0.2, color='green')\n",
"\n",
"ax.plot(t,state[:, 0], '.-', label='true')\n",
"ax.plot(t,observation[:, 0], '.-', label='observed')\n",
"ax.plot(t,xfilt[:, 0], '.-', color='green', label='KF estimate')\n",
"\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x')\n",
"ax.legend()\n",
"\n",
"ax = axs[1]\n",
"ax.plot(Vfilt[:, 0, 0], '.-', label='variance')\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('$\\sigma^2$')\n",
"ax.legend();"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Now run again with missing data\n",
"y[25:35, :] = np.nan\n",
"kfilt = adskalman.KalmanFilter(F, H, Q, R, initx, initV)\n",
"xfilt = []\n",
"Vfilt = []\n",
"for i, y_i in enumerate(y):\n",
" is_initial = i == 0\n",
" xfilt_i, Vfilt_i = kfilt.step(y=y_i, isinitial=is_initial)\n",
" xfilt.append(xfilt_i)\n",
" Vfilt.append(Vfilt_i)\n",
"xfilt = np.array(xfilt)\n",
"Vfilt = np.array(Vfilt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axs = plt.subplots(nrows=2, figsize=(10,8))\n",
"\n",
"ax = axs[0]\n",
"t = np.arange(len(xfilt[:, 0]))\n",
"low = xfilt[:, 0]-np.sqrt(Vfilt[:, 0, 0])\n",
"high = xfilt[:, 0]+np.sqrt(Vfilt[:, 0, 0])\n",
"ax.fill_between(t, low, high, alpha=0.2, color='green')\n",
"\n",
"ax.plot(t,state[:, 0], '.-', label='true')\n",
"ax.plot(t,observation[:, 0], '.-', label='observed')\n",
"ax.plot(t,xfilt[:, 0], '.-', color='green', label='KF estimate')\n",
"\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x')\n",
"ax.legend()\n",
"\n",
"ax = axs[1]\n",
"ax.plot(Vfilt[:, 0, 0], '.-', label='variance')\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('$\\sigma^2$')\n",
"ax.legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q1 Compared to our 1 dimensional state, we now do a much better job estimating the position, even when we have no observations. Why is that?\n",
"\n",
"(You answer should be a text explanation.)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "fb6649f0f0bd33413dba5eb9d18d3fae",
"grade": true,
"grade_id": "cell-a2bc0fa042093799",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
},
"tags": []
},
"source": [
"YOUR ANSWER HERE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to our better ability to estimate position, there is something else pretty interesting going on here. Although our observations are only positions, our estimates also include velocities. Given that the Kalman filter is the Bayesian optimal solution of a linear dynamic system with normally distributed noise, these velocity estimates are the best that can be done according to Bayes' theorem.\n",
"\n",
"Let's look at our velocity estimates:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axs = plt.subplots(nrows=1, figsize=(10,4))\n",
"\n",
"ax = axs\n",
"t = np.arange(len(xfilt[:, 1]))\n",
"low = xfilt[:, 1]-np.sqrt(Vfilt[:, 1, 0])\n",
"high = xfilt[:, 1]+np.sqrt(Vfilt[:, 1, 0])\n",
"ax.fill_between(t, low, high, alpha=0.2, color='green')\n",
"\n",
"ax.plot(t,state[:, 1], '.-', label='true')\n",
"ax.plot(t,xfilt[:, 1], '.-', color='green', label='KD estimate')\n",
"\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x vel')\n",
"ax.legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Often we measuring position and then calculating velocity, we take the naive approach and simply measure the difference between adjacent time points and scale by the time interval. What would that look like?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dy = observation[1:, 0] - observation[:-1, 0]\n",
"naive = dy/dt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fig, axs = plt.subplots(nrows=1, figsize=(10,4))\n",
"\n",
"ax = axs\n",
"t = np.arange(len(xfilt[:, 1]))\n",
"low = xfilt[:, 1]-np.sqrt(Vfilt[:, 1, 0])\n",
"high = xfilt[:, 1]+np.sqrt(Vfilt[:, 1, 0])\n",
"ax.fill_between(t, low, high, alpha=0.2, color='green')\n",
"\n",
"ax.plot(t,state[:, 1], '.-', label='true')\n",
"ax.plot(t,xfilt[:, 1], '.-', color='green', label='KF estimate')\n",
"ax.plot(t[1:],naive, '.-', color='red', label='naive')\n",
"\n",
"ax.set_xlabel('t')\n",
"ax.set_ylabel('x vel')\n",
"ax.legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, our estimate from the Kalman filter is not only much closer to the true value, it also has the advantage that we have some estimate even for the periods in which we have no observations."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View file

@ -0,0 +1,335 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# You must run this cell, but you can ignore its contents.\n",
"\n",
"import hashlib\n",
"\n",
"def ads_hash(ty):\n",
" \"\"\"Return a unique string for input\"\"\"\n",
" ty_str = str(ty).encode()\n",
" m = hashlib.sha256()\n",
" m.update(ty_str)\n",
" return m.hexdigest()[:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The questions here are in reference to the notebook from the lecture named `BEST.ipynb`. You will need to run that notebook to answer the following questions. You will need PyMC installed to run that notebook. You may be able to install it within Jupyter with `!pip install pymc`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problems\n",
"\n",
"### Q1 Run the notebook above with the following samples. What is the mean value of the posterior estimate of the difference of means? Put this in the variable `posterior_diff_means`.\n",
"\n",
"```\n",
"drug = (101,100,102,104,102,97,105,105,98,101,100,123,105,103,100,95,102,106,\n",
" 109,102,82,102,100,102,102,101,102,102,103,103,97,97,103,101,97,104,\n",
" 96,103,124,101,101,100,101,101,104,100,101)\n",
"placebo = (99,101,100,101,102,100,97,101,104,101,102,102,100,105,88,101,100,\n",
" 104,100,100,100,101,102,103,97,101,101,100,101,99,101,100,100,\n",
" 101,100,99,101,100,102,99,100,99)\n",
"```\n",
"\n",
"Note: you may also try running this at [the online version of the BEST test](http://www.sumsar.net/best_online/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f20e1ca20dd7bc4de661435a240972a3",
"grade": false,
"grade_id": "cell-4583b093ba19414e",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "bd33b218df464d9d5cb02ad29d1e8a39",
"grade": true,
"grade_id": "cell-16393e9287358f37",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# This is a test of the above, do not change this code.\n",
"assert(ads_hash(int(np.round(posterior_diff_means*10)))=='4a44dc1536')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q2 Note the bounds of the 94% highest posterior density interval (HDI) for the difference of means between group 1 and group 2. Read these values from the summary table of the trace looking at the \"difference of means\" variable.\n",
"\n",
"Put these into the variables `diff_means_hdi_lower` and `diff_means_hdi_upper`\n",
"\n",
"(94% was chosen by the authors of the software as a friendly reminder that this value is an arbitrary choice - rather than perhaps 95% which my not remind you of this choice.)\n",
"\n",
"Also, due to the use of (pseudo) random numbers, your results may not exactly meet the auto-test check here. If that's the case, rerun the example with the BEST model and try again."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "fe1517e936f2332a999dfce2147fa869",
"grade": false,
"grade_id": "cell-f10804d9e06925c6",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "6624dc4de2e65453f9b7f0e4ad7eff3a",
"grade": true,
"grade_id": "cell-c13982956b19759a",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"assert ads_hash(int(np.round(diff_means_hdi_lower*10)))=='6b86b273ff'\n",
"assert ads_hash(int(np.round(diff_means_hdi_upper+1.0))) == '4e07408562'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q3 Re-run the notebook with the following samples. What is the mean value for the posterior estimate of the difference of means for this smaller sample size? Put this in the variable `posterior_diff_means_small_samples`.¶\n",
"\n",
"```\n",
"drug = (103,99,103,105,102,101)\n",
"placebo = (101,100,102,105,102,96)\n",
"```\n",
"\n",
"Again, due to the use of (pseudo) random numbers, your results may not exactly meet the auto-test check here. If that's the case, rerun the example with the BEST model and try again."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "4f83d88da660ce2b438c20a060ab7bf0",
"grade": false,
"grade_id": "cell-e349ae3da9999c30",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "38fcf445f290604eb749467cbdaee9bc",
"grade": true,
"grade_id": "cell-25d34717f657c572",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# This is a test of the above, do not change this code.\n",
"assert(ads_hash(int(np.round(posterior_diff_means_small_samples*10)))=='4a44dc1536')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q4 Note the bounds of the 94% highest posterior density interval (HDI) for the difference of means between group 1 and group 2 for this smaller data set. Read these values from the summary table of the trace looking at the \"difference of means\" variable.\n",
"\n",
"Put these into the variables `diff_means_hdi_lower_small_samples` and `diff_means_hdi_upper_small_samples`\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "686fdaff0a165c4ee7b96c863f8ab302",
"grade": false,
"grade_id": "cell-6cb965cd2f3f109a",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f0943efc9fb03a0b8efbbf83a380b132",
"grade": true,
"grade_id": "cell-b0b09eb0a16dfb2b",
"locked": true,
"points": 1,
"schema_version": 3,
"solution": false,
"task": false
}
},
"outputs": [],
"source": [
"# This is a test of the above, do not change this code.\n",
"assert ads_hash(int(np.round(diff_means_hdi_lower_small_samples*2))) == '37aa1ccf80'\n",
"assert ads_hash(int(np.round(diff_means_hdi_upper_small_samples+1))) == 'e7f6c01177'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q5 The posterior estimate of the difference in means between the two datasets is similar. But one of the datasets has much more data than the other. Explain why the 94% HDI exclude a zero difference result in one dataset but not the other."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "bf69f332ff0d16989ab3745581de22a5",
"grade": true,
"grade_id": "cell-cde0c771165a691c",
"locked": false,
"points": 1,
"schema_version": 3,
"solution": true,
"task": false
}
},
"source": [
"YOUR ANSWER HERE"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
},
"latex_envs": {
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 0
}
},
"nbformat": 4,
"nbformat_minor": 4
}