{ "cells": [ { "cell_type": "markdown", "id": "9be665b8-0e4b-43f0-96b5-1cc821fc7d67", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "905476d45abb629cb03d33e8fc4d9408", "grade": false, "grade_id": "cell-293ccdf5e42bf800", "locked": true, "schema_version": 3, "solution": false, "task": false }, "tags": [] }, "source": [ "# Reading CSV files\n", "\n", "Step 1: Download file from https://archive.ics.uci.edu/ml/datasets/Wine+Quality . Click the \"Download\" button to get the `wine+quality.zip` file. Open this file and extract `winequality-red.csv`. Place it in the folder alongside this notebook.\n", "\n", "Let's look at the first lines of this file." ] }, { "cell_type": "code", "execution_count": null, "id": "5c2e4658-ad84-4a02-8176-6ca65fa9140f", "metadata": {}, "outputs": [], "source": [ "fobj = open('winequality-red.csv')\n", "for line_num, line in enumerate(fobj.readlines()):\n", " line = line.strip()\n", " print(f\"line {line_num}: '{line}'\")\n", " if line_num > 3:\n", " break" ] }, { "cell_type": "markdown", "id": "7b803dcd-a408-476e-9e05-bab37dd64aac", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "530907dd4b607a1abb9019a6434a9d41", "grade": false, "grade_id": "cell-65efe24785650af1", "locked": true, "schema_version": 3, "solution": false, "task": false }, "tags": [] }, "source": [ "## Q10 Read the file into a dict called `data`\n", "\n", "The dict should have a key for each column in the CSV file and each dictionary value should be a list with all the values in that column.\n", "\n", "For example, a CSV file like this:\n", "\n", "```\n", "name,home planet\n", "Arthur,Earth\n", "Zaphod,Betelgeuse V\n", "Trillian,Earth\n", "```\n", "\n", "Would result in a dictionary like this:\n", "\n", "```python\n", "{'name':['Arthur','Zaphod','Trillian'], 'home planet':['Earth', 'Betelgeuse V', 'Earth']}\n", "```\n", "\n", "But here, we read the file 
`winequality-red.csv` which you have uploaded into this folder. Note that in this wine quality \"CSV\" file, the values are separated with semicolons (`;`), not commas." ] }, { "cell_type": "code", "execution_count": null, "id": "5443bf3d-2303-4971-85f4-0af37b783247", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "4440c7d3e2dcc5e7b6cfb03c6975c20d", "grade": false, "grade_id": "cell-bbe508684824a047", "locked": false, "schema_version": 3, "solution": true, "task": false }, "tags": [] }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "id": "9880f13b-acc5-431c-836e-a0d34bdc632c", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "a205654f9a79e0ca05b841f9c4c2f341", "grade": true, "grade_id": "cell-1978372e733238bd", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": false }, "tags": [] }, "outputs": [], "source": [ "assert len(data.keys()) == 12\n", "assert len(data['\"alcohol\"'])==1599\n", "acc = 0; [acc := acc+x for x in data['\"quality\"']]\n", "assert acc==9012" ] }, { "cell_type": "markdown", "id": "075f13ae-1d24-4e26-ba68-fa3828da1861", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "20b1e23ac89091143fca2c1434adf737", "grade": false, "grade_id": "cell-c76a021eff929a4e", "locked": true, "schema_version": 3, "solution": false, "task": false }, "tags": [] }, "source": [ " ## Q11 Plot the \"Density\" (Y axis) versus \"Alcohol\" (X axis).\n", " \n", " Your plot should look like this:\n", " \n", "" ] }, { "cell_type": "code", "execution_count": null, "id": "57ace93a-fc8e-48c7-9fb4-b1ebe2b521f0", "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "734c1d9fe9608c5903b3f922b1e41f1e", "grade": true, "grade_id": "cell-0dd13f6a5af90429", "locked": false, "points": 1, "schema_version": 3, 
"solution": true, "task": false }, "tags": [] }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "markdown", "id": "7cff50a5-7642-47f6-b342-043bc398e981", "metadata": {}, "source": [ " ## Q12 Make a Python program that does this\n", "\n", "Create a Python program called `plot_red_wine.py` which makes the above plot (density versus alcohol for the red wine dataset) and saves the plot to a file called `red_wine.png`.\n", "\n", "Hint: save the figure using the `plt.savefig()` function. (You might also want to play around with the `plt.show()` function.)" ] }, { "cell_type": "markdown", "id": "c7ce822f-3b1e-4982-a6c7-cfabb3ee5f80", "metadata": {}, "source": [ "# Uploading the exercise\n", "\n", "For this exercise, the following files should be uploaded:\n", "\n", "* The two `.ipynb` files (overwriting the original ones, as usual).\n", "* `plot_red_wine.py` - Your Python script\n", "* `winequality-red.csv` - The file you downloaded\n", "* `red_wine.png` - The plot you generated using `plot_red_wine.py`." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 5 }