pm21-dragon/lectures/lecture-08/1 - Image analysis.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tutorial on Monday, 9 December 2024 \n",
    "\n",
    "Will be in PC Pool 9, Werthmannstraße 4. This is next door (south of) the Universitäts Bibliothek."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# From programming to ‘computer-assisted science’\n",
    "\n",
    "You now know the basics of programming in Python.\n",
    "\n",
    "You must practice, practice, practice to really learn this stuff. [Advent of code](https://adventofcode.com).\n",
    "\n",
    "The best way to practice is to use Python to do something YOU want.\n",
    "\n",
    "We are now moving towards \"computer-assisted science\". Our focus will become more on the ideas and tools to achieve specific tasks rather than on programming directly. Of course, knowing how to program itself is a hugely valuable skill in this endeavor and we are still here to help you with that."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Image analysis\n",
    "\n",
    "Images are a rich source of information, usually very intuitive for humans to understand.\n",
    "\n",
    "Computer vision is a vast field with incredible progress\n",
    "\n",
    "Nevertheless, our visual system is typically much better at “seeing” (image comprehension) than any computer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Why would we want to use computer vision in Biology?\n",
    "\n",
    "High throughput\n",
    "\n",
    "Eliminate human subjectivity\n",
    "\n",
    "Reproducibility\n",
    "\n",
    "Create new types of experiments with online image processing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Image analysis depends on images\n",
    "\n",
    "Typically, the image processing task can be made much easier by spending time to optimize the image acquisition (lighting, contrast, focus, and so on)\n",
    "\n",
    "“much easier” could mean “nearly trivial and fast” from “difficult and slow” or even “possible” from “impossible”"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example – Virtual Reality for Freely Moving Animals\n",
    "\n",
    "Stowers JR*, Hofbauer M*, Bastien R, Griessner J⁑, Higgins P⁑, Farooqui S⁑, Fischer RM, Nowikovsky K, Haubensak W, Couzin ID, Tessmar-Raible K✎, Straw AD✎. Virtual Reality for Freely Moving Animals. *Nature Methods* 14, 995–1002 (2017) [doi:10.1038/nmeth.4399](https://doi.org/10.1038/nmeth.4399) See also https://www.youtube.com/watch?v=e_BxdbNidyQ&feature=youtu.be\n",
    "\n",
    "![vr-fig1.jpg](vr-fig1.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Image representations\n",
    "\n",
    "Computer images are just arrays of data\n",
    "\n",
    "Monochrome images are 2D arrays (h, w)\n",
    "\n",
    "Color images can be represented as 3D arrays (h,w,channel). (Actually “2.5 D” – just 3x 2D arrays: red, green, blue)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In Python, `numpy` (`np` for short) is the most common way to manipulate arrays of numbers. Here we will create an 8x8 pixel image and put it in the variable `check`.\n",
    "\n",
    "Images have way too much data to operate efficiently with pure Python"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a variable called `check` which will contain an 8x8 array of numbers.\n",
    "check = np.zeros((8, 8))\n",
    "check[::2, 1::2] = 1\n",
    "check[1::2, ::2] = 1\n",
    "check"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now lets view our 8x8 pixel image:\n",
    "plt.imshow(check, cmap='gray');"
   ]
  },
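  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small sketch of the color representation described above, the next cell builds a 2x2 RGB image as an (h, w, channel) array. The variable names `rgb` and `red` and the chosen colors are made up for illustration. `plt.imshow` treats a floating-point array of this shape as red, green and blue values between 0 and 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# A 2x2 color image: axis 0 is height, axis 1 is width, axis 2 is the channel.\n",
    "rgb = np.zeros((2, 2, 3))\n",
    "rgb[0, 0] = [1, 0, 0]  # top-left pixel: red\n",
    "rgb[0, 1] = [0, 1, 0]  # top-right pixel: green\n",
    "rgb[1, 0] = [0, 0, 1]  # bottom-left pixel: blue\n",
    "rgb[1, 1] = [1, 1, 1]  # bottom-right pixel: white\n",
    "print(rgb.shape)  # (2, 2, 3)\n",
    "\n",
    "# Each channel on its own is an ordinary 2D array.\n",
    "red = rgb[:, :, 0]\n",
    "print(red.shape)  # (2, 2)\n",
    "\n",
    "plt.imshow(rgb);"
   ]
  },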
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Key basic image analysis primitives\n",
    "\n",
    "- Arithmetic (add, subtract, divide, multiply)\n",
    "- Thresholding (greater than, less than)\n",
    "- argmax and argmin\n",
    "- Connected components (“labels”)\n",
    "- Morphological operations, e.g. erosion and dilation\n",
    "- Edge detection\n",
    "- Smoothing, sharpening\n",
    "\n",
    "Visit http://scikit-image.org/docs/stable/auto_examples/index.html"
   ]
  },
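  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A few of these primitives need nothing more than `numpy`. The following sketch uses a small made-up image (`img`, with a single bright pixel) to illustrate arithmetic, thresholding and argmax. All names and values are hypothetical. Connected components, morphological operations, edge detection and smoothing are usually taken from a library such as scikit-image instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# A made-up 4x5 image: uniform dark background with one bright pixel.\n",
    "img = np.full((4, 5), 10.0)\n",
    "img[2, 3] = 200.0\n",
    "\n",
    "# Arithmetic: subtract the background level from every pixel.\n",
    "foreground = img - 10.0\n",
    "\n",
    "# Thresholding: a boolean array that is True where the pixel is bright.\n",
    "mask = foreground > 50\n",
    "print(mask.sum())  # number of pixels above the threshold\n",
    "\n",
    "# argmax returns an index into the flattened array;\n",
    "# np.unravel_index converts it back to (row, column).\n",
    "row, col = np.unravel_index(np.argmax(img), img.shape)\n",
    "print(row, col)"
   ]
  },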
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Key image analysis tasks\n",
    "\n",
    "- Classification\n",
    "\n",
    "- Counting\n",
    "\n",
    "- Detection (and localization)\n",
    "\n",
    "- Tracking\n",
    "\n",
    "- Measuring"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Computer vision compared to human vision\n",
    "\n",
    "We effortlessly “see” and comprehend visual scenery.\n",
    "\n",
    "Brain development evolved over millions of years.\n",
    "\n",
    "We know a lot about vision, but still big questions remain.\n",
    "\n",
    "![zebras](https://upload.wikimedia.org/wikipedia/commons/3/3a/Zebras_Serengeti.JPG)\n",
    "Photo: https://commons.wikimedia.org/wiki/File:Zebras_Serengeti.JPG\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Raw image representation\n",
    "\n",
    "- In typical image monochrome files, each pixel is stored as a single byte\n",
    "- Each stored byte represents the brightness of each pixel\n",
    "- A byte is 8 bits\n",
    "- Decimal: 0..255\n",
    "- Binary: 0000 0000..1111 1111\n",
    "- For color images, each pixel is stored as 3 bytes: (Red, Green, Blue) intensity\n",
    "\n",
    "![decimal-hex-binary](https://upload.wikimedia.org/wikipedia/commons/thumb/b/bc/CPT-Numbers-Conversion.svg/505px-CPT-Numbers-Conversion.svg.png)\n",
    "Diagram: https://commons.wikimedia.org/wiki/File:CPT-Numbers-Conversion.svg\n",
    "\n",
    "See also https://en.wikibooks.org/wiki/A-level_Computing/AQA/Paper_2/Fundamentals_of_data_representation/Number_bases"
   ]
  },
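  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In `numpy`, one byte per pixel corresponds to the `uint8` data type. The sketch below uses made-up array sizes to show how much memory a monochrome and a color image occupy. It also shows that arithmetic on `uint8` arrays wraps around at 256 rather than stopping at 255, which is easy to overlook when adding images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# One byte per pixel: arrays of 8-bit unsigned integers.\n",
    "mono = np.zeros((4, 4), dtype=np.uint8)\n",
    "color = np.zeros((4, 4, 3), dtype=np.uint8)\n",
    "print(mono.nbytes)   # 16 bytes: 4 * 4 pixels\n",
    "print(color.nbytes)  # 48 bytes: 4 * 4 pixels * 3 channels\n",
    "\n",
    "# uint8 can only hold 0..255, so array arithmetic wraps around.\n",
    "a = np.array([250], dtype=np.uint8)\n",
    "b = np.array([10], dtype=np.uint8)\n",
    "print(a + b)  # [4], because 260 - 256 = 4"
   ]
  },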
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A conversion table for the first 16 integers:\n",
    "\n",
    "| Binary | Decimal | Hex |\n",
    "|--------|---------|-----|\n",
    "| 0000 | 0 | 0 |\n",
    "| 0001 | 1 | 1 |\n",
    "| 0010 | 2 | 2 |\n",
    "| 0011 | 3 | 3 |\n",
    "| 0100 | 4 | 4 |\n",
    "| 0101 | 5 | 5 |\n",
    "| 0110 | 6 | 6 |\n",
    "| 0111 | 7 | 7 |\n",
    "| 1000 | 8 | 8 |\n",
    "| 1001 | 9 | 9 |\n",
    "| 1010 | 10 | A |\n",
    "| 1011 | 11 | B |\n",
    "| 1100 | 12 | C |\n",
    "| 1101 | 13 | D |\n",
    "| 1110 | 14 | E |\n",
    "| 1111 | 15 | F |\n",
    "\n",
    "So, one can write a single byte (8 bits) always with exactly two hex digits. This is why hex digits are often used when discussing low-level memory usage in computers. With such 8 bit numbers, 255 (decimal) is `FF` (hex). When writing hex, often the number starts with `0x`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "0xFF == 255"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "0b1111"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "0b1111 == 0x0F"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bin(255)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hex(255)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "0x10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hex(16)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bin(16)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hex(15)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bin(15)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Compressed Image formats\n",
    "\n",
    "- Lossy compression: JPG, GIF, most movie codecs\n",
    "- Lossless compression: PNG, TIFF, BMP\n",
    "\n",
    "- With modern movie formats, the codecs (compressor & decompressor) are independent from the file format (e.g. mp4, avi or mkv). The file format is mostly a \"container\". Common codecs for video are h264/avc, h265/hevc, vp8, vp9, or av1.\n",
    "\n",
    "- Similar consideration applies to audio. There are specific audio formats (WAV) and also codecs for audio data which specify how the data is stored in a container like MP4 or AVI \"video\" files."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Limits of images with 8 bits per channel\n",
    "\n",
    "- 8 bits is not a large [dynamic range](https://en.wikipedia.org/wiki/Dynamic_range)\n",
    "- Typically, a \"good exposure\" means filling the range of available intensity values. Can be checked by a histogram of pixel intensity.\n",
    "- Higher “bit depth” formats also exist (e.g. 16 bit images) and are often used in scientific computing (e.g. many TIFF images). Theses are often called HDR High Dynamic Range image.\n",
    "\n",
    "![camera](camera.png)\n",
    "Image: https://github.com/scikit-image/scikit-image/blob/master/skimage/data/camera.png"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import skimage # \"pip install scikit-image\"\n",
    "x = skimage.data.camera()\n",
    "print(x.shape)\n",
    "print(x.dtype)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import skimage\n",
    "plt.hist(skimage.data.camera().flat,bins=256);\n",
    "plt.xlabel('luminance');\n",
    "plt.ylabel('number of occurances');"
   ]
  },
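  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The range of values available at a given bit depth can be looked up with `np.iinfo`. The following sketch compares 8-bit and 16-bit unsigned integers. As one possible (assumed, not standard) way to put a number on exposure, it also reports what fraction of the 8-bit range a made-up image `img` actually covers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Largest value that fits in 8 and in 16 bits.\n",
    "print(np.iinfo(np.uint8).max)   # 255\n",
    "print(np.iinfo(np.uint16).max)  # 65535\n",
    "\n",
    "# A made-up, underexposed 8-bit image: all values lie between 20 and 99.\n",
    "img = np.arange(20, 100, dtype=np.uint8).reshape(8, 10)\n",
    "\n",
    "# Convert to Python ints first so the subtraction cannot wrap around.\n",
    "used = (int(img.max()) - int(img.min())) / np.iinfo(np.uint8).max\n",
    "print(used)  # fraction of the available range that is used"
   ]
  },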
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}