`exercise-04` review¶

I check your answers based on the file name. Please keep the file names exactly as specified, e.g. my_name.py.
Example answers:

print("Paolo")


a = 1001
b = 22
def onethousandandone_times_twentytwo(a,b):
    print(a*b)

vs

# Create a Python script called `my_name.py` which does two things:

# 1) prints your name

print("Paolo")

# 2) computes the value of 1001 * 22 and then prints this

result = 1001*22
print(result)

vs

print ("Paolo")
value1 = 1001*22
print (value1)

A correct answer should produce output like this:

astraw@computer$ python my_name.py
Paolo
22022
astraw@computer$

In [ ]:

# What is wrong with this code?
print(Andrew)
print(1001 * 22)

For loops, iterators, Dictionaries, more operators, files¶

In [ ]:

# We run this for use below
import matplotlib.pyplot as plt

Control flow with `for` using `range` to produce an iterator¶

In [ ]:

for x in range(10):
    print(x)

In [ ]:

for x in range(0, 10):
    print(x)

In [ ]:

for y in range(0, 1000, 100):
    print(y)

In [ ]:

myiter = range(0, 1000, 100)
print('myiter:', myiter)
print(type(myiter))
for y in myiter:
    print(y)

In [ ]:

for y in range(10):
    print(y)

In [ ]:

for y in range(4,10):
    print(y)

In [ ]:

for y in range(4, 10, 2):
    print(y)

Note the symmetry between range() and slices.

In [ ]:

my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [ ]:

my_list[:10]

In [ ]:

my_list[4:10]

In [ ]:

my_list[4:10:2]
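To make the symmetry concrete, here is a small sketch comparing the two side by side. The names `from_range` and `from_slice` are made up for this illustration, and the comparison only comes out equal because each element of `my_list` has the same value as its index.

```python
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# range(start, stop, step) and my_list[start:stop:step] take the same three numbers.
from_range = list(range(4, 10, 2))
from_slice = my_list[4:10:2]

print(from_range)
print(from_slice)
print(from_range == from_slice)
```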

Control flow with `for` using a list as an iterator¶

In [ ]:

my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

for y in my_list:
    print(y)
print("end")

In [ ]:

my_list = [[5,5,6], [6,6,7]]

for y in my_list:
    print('y:',y)
    for z in y:
        print(z)
print("end")

iterators¶

We have now seen a couple of examples of iterators.

An iterator is not a single type in Python but rather a behavior that some types have: you can iterate over them, which means you can use them as the source of data in a for loop. The items do not all need to be stored in memory at once; they can be constructed one at a time. (Strictly speaking, Python calls anything you can loop over an "iterable". These notes use "iterator" loosely for both.)

Iterators could run infinitely or they can end at a certain point.
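As a sketch of an iterator that never ends, the standard library's `itertools.count` produces one number after another forever. This example is not part of the original exercise; it assumes you stop the loop yourself with `break`, and `squares` is just an illustrative name.

```python
import itertools

# itertools.count(1) yields 1, 2, 3, ... without end, so the loop needs a `break`.
squares = []
for n in itertools.count(1):
    if n > 5:
        break
    squares.append(n * n)

print(squares)
```

Because the numbers are produced one at a time, the endless sequence never has to fit in memory.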

We can create a list from all values in an iterator in a couple of different ways.

The first you should be able to do by yourself already:

In [ ]:

my_list = []
for x in range(10):
    my_list.append(x)
my_list

The second approach of creating a list from all values in an iterator relies on the list() function, which is the constructor of a list. This constructor function will iterate over the iterator and create a list with its contents:

In [ ]:

my_list = list(range(10))
my_list

In [ ]:

my_list = []
x = "my super important data"
# Note that we overwrite x here!
for x in range(2):
    my_list.append(x)
my_list

continue and break work in for loops, too.

In [ ]:

my_list = []
for x in range(100):
    if x > 5:
        if x < 10:
            continue
    if x >= 20:
        break
    my_list.append(x)
my_list

Methods¶

Methods are a way of giving a type specific additional functions. You already know a few of them, which so far we have just used without discussing much. This includes list.append and str.format.

In [ ]:

my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
my_list.append(10)
my_list

In [ ]:

my_str = "Hello, my name is {}"

In [ ]:

my_str

In [ ]:

my_str.format("Andrew")

In [ ]:

my_str

Later, we will learn how to define our own methods. For now, it's just important that you know a method is like a function. Both can be called with input arguments, they return an output value, and they can have "side effects" -- changes to their inputs or something else.
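The difference between a return value and a side effect can be seen with the two methods mentioned above. This is only a small sketch; the variable names are made up for the illustration.

```python
greeting = "Hello, my name is {}"
numbers = [1, 2, 3]

# str.format returns a new string; the original string is left unchanged.
formatted = greeting.format("Andrew")

# list.append changes the list itself (a side effect) and returns None.
returned = numbers.append(4)

print(formatted)
print(greeting)
print(numbers)
print(returned)
```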

Modules¶

We have also used a number of modules without discussing this aspect much. There are built-in modules -- they come with Python as part of "the standard library" -- and there are modules which have to be installed separately. Matplotlib, for example, is a set of modules (a "library") which we use a lot and which is not part of the Python language itself.

Modules are a data type in Python like any other. They can have functions which have names like module_name.function_name. A very minor point: the . makes a function in a module "look like" a method, but it is actually a normal function.

Here we import the random module from the standard library.

In [ ]:

import random

In [ ]:

x = [1,2,3,4,5,'asdf','dalkfj']
random.choice(x)

In [ ]:

random.choice(x)
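To illustrate that a module is an ordinary object and that its functions are normal functions, here is a sketch using the standard library's `math` module (chosen instead of `random` only so that the result is predictable). The name `square_root` is made up for this example.

```python
import math

# A module is an object like any other, so it has a type.
print(type(math))

# math.sqrt is an ordinary function that lives inside the module.
# We can give it another name and call it through that name.
square_root = math.sqrt
result = square_root(16)
print(result)
```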

As mentioned, there are modules which are not part of the Python language itself. In fact there are approximately zillions of libraries for doing many, many different things, and this is one of the reasons Python is so useful and so popular. There can be a positive feedback loop between language popularity and the availability of libraries, and Python has benefitted a lot from this - especially in the data science area.

One place that distributes many Python modules is PyPI, the Python Package Index; another is Anaconda.

As an example, let's return to our previous use of matplotlib. The matplotlib gallery has many example plots. Here is a simple usage of matplotlib to draw a simple plot:

In [ ]:

# Below, we will use matplotlib, so we need to import it here.
import matplotlib.pyplot as plt

x=[1,2,3,4,5,6,7,8,9,10]
y=[0,4,0,3,3,0,3,4,5,2]

plt.plot(x,y)

To start with, there are a few simple things you can do to improve your plot:

In [ ]:

# matplotlib was already imported above as `plt`, so we can use it directly here.

x=[1,2,3,4,5,6,7,8,9,10]
y1=[0,4,0,3,3,0,3,4,5,2]
plt.plot(x, y1, label="y1")
plt.plot(x, x, label="x")
y2=[3,2,4,4,2,4,4,2,4,2]
plt.plot(x, y2, label="y2")
plt.legend()
plt.xlabel('x (unit 1)')
plt.ylabel('y (unit 2)')

Example: compute the Fibonacci sequence using recursion¶

1, 1, 2, 3, 5, 8, 13

In [ ]:

def fib(n):
    """Return the Fibonacci sequence up to position n.
    
    n is an integer"""
    # Check that our assumptions are true
    assert type(n)==int
    assert n>0
    
    # special cases for short lists
    if n == 1:
        return [1]
    if n == 2:
        return [1,1]
    
    seq = fib(n-1)
    a = seq[-2]
    b = seq[-1]
    seq.append( a+b )
    return seq

fib(3)

In [ ]:

fib(4)

In [ ]:

fib(10)
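For comparison, the same sequence can also be built with a loop instead of recursion. This is an alternative sketch, not the approach used above, and `fib_loop` is a name made up here. The recursive version makes one nested call per position, so a very large n can run into Python's recursion limit; the loop does not have that problem.

```python
def fib_loop(n):
    """Return the first n Fibonacci numbers as a list, using a loop."""
    assert type(n) == int
    assert n > 0

    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-2] + seq[-1])
    # Slicing handles n == 1, where only the first element is wanted.
    return seq[:n]

print(fib_loop(1))
print(fib_loop(10))
```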

More strings¶

str

Useful function for strings:

len

Useful methods:

strip
split
startswith
endswith

In [ ]:

len("my string")

In [ ]:

"   my string    ".strip()

In [ ]:

len("   my string    ")

In [ ]:

len("   my string    ".strip())

In [ ]:

a="   my string    "
a.strip()
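The methods strip and split from the list above are often used together, for example when cleaning up a line of comma-separated text. The line below is invented for this sketch.

```python
line = "  setosa, versicolor ,virginica  \n"

names = []
for part in line.strip().split(","):
    # Each part may still carry spaces around it, so strip again.
    names.append(part.strip())

print(names)
```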

In [ ]:

"a,b,c,def".split(",")

In [ ]:

"hello world".startswith("hello")

In [ ]:

"hello world".endswith("world")

Dictionaries - Python's `dict` type¶

A dict is constructed with either {} or dict().

In [ ]:

x = {'key1': 'value1',
     'key2': 'value2',
    'key3': 'value3',
    }

In [ ]:

x['key1']

In [ ]:

key = "key3"

In [ ]:

x[key]

In [ ]:

x[key1]

In [ ]:

x = dict(   (('key1', 'value1'), ['key2', 'value2'], ('key3', 'value3'))    )

In [ ]:

type(x)

Keys in a dict can be any value that is hashable.

In [ ]:

x={1:'value1', 2:'value2'}

In [ ]:

x[1]

In [ ]:

x={(1,2,3): "456"}
x

In [ ]:

x[(1,2,3)]

In [ ]:

x={[1,2,3]: "456"}
x
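One way to find out whether a value is hashable is to pass it to the built-in `hash` function, which raises a `TypeError` for unhashable values such as lists. The helper `is_hashable` below is made up for this sketch, and it uses `try`/`except`, which we have not discussed yet.

```python
def is_hashable(value):
    """Return True if `value` could be used as a dict key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2, 3)))
print(is_hashable("key1"))
print(is_hashable([1, 2, 3]))
```

Tuples, strings and numbers pass this check; lists and dicts do not, because they can be changed after they are created.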

In [ ]:

x = {'key1':1, 'key2':2, 'key3':123456, 'key4': [1,2,3], 'key5': {}, 1234: 4321, (1,2,3): '9845712345'}

Just like we can iterate over items in a list, we can iterate over the keys in a dict:

In [ ]:

for key in x:
    print(key)

In [ ]:

for key in x:
    value = x[key]
    print(f"key: {key}, value: {value}")

In [ ]:

x['key5']

In [ ]:

x['key does not exist']

In [ ]:

x['my new key'] = 9843059

In [ ]:

x['key5']['hello'] = 'world'

In [ ]:

tmp = x['key5']
tmp['hello'] = 'world 2'

In [ ]:

x['key4'].append( 4 )

In [ ]:

'key1' in x

In [ ]:

1 in x

In [ ]:

1234 in x
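Asking for a key that does not exist with square brackets raises a `KeyError`. Two common ways to avoid this are sketched below: checking with `in` first, or using the `dict.get` method, which returns a default value instead. The small dict and the string 'fallback' are chosen only for this illustration.

```python
x = {'key1': 1, 1234: 4321}

# Option 1: test for the key first.
if 'key does not exist' in x:
    value = x['key does not exist']
else:
    value = None

# Option 2: dict.get returns a default instead of raising KeyError.
value2 = x.get('key does not exist', 'fallback')
value3 = x.get('key1', 'fallback')

print(value, value2, value3)
```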

More about functions: keyword arguments¶

In [ ]:

def my_function(x, z=1):
    return x+z*z

In [ ]:

my_function(9)

In [ ]:

my_function(9,11)

In [ ]:

my_function(9,z=11)

In [ ]:

my_function(x=9,z=11)

In [ ]:

my_function(z=11)

In [ ]:

my_function(z=11,x=9)

In [ ]:

my_function(z=11,9)

In [ ]:

def my_function2(x, y, z=1, qq=0):
    return x+z+qq+y

In [ ]:

my_function2(0,1)

In [ ]:

my_function2(0,1,qq=-32)
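The rules from the cells above can be summarized in one sketch: arguments with a default may be left out, keyword arguments may come in any order, and an argument without a default must always be given. A positional argument after a keyword argument, as in `my_function(z=11,9)`, is a syntax error and cannot be caught at run time, so it is left out here. The `try`/`except` and the name `error_message` are additions for this illustration.

```python
def my_function(x, z=1):
    return x + z*z

print(my_function(9))          # z takes its default value, 1
print(my_function(9, z=11))    # z given by keyword
print(my_function(z=11, x=9))  # keyword arguments may come in any order

# x has no default, so leaving it out raises a TypeError.
try:
    my_function(z=11)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```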

The `+` operator on various data types¶

In [ ]:

1+1

In [ ]:

1 + 2.3

In [ ]:

"1"+1

In [ ]:

1+"1"

In [ ]:

"1"+"1"

In [ ]:

"1   x" + "1   y"

In [ ]:

[1]+1

In [ ]:

[1] + [1]

In [ ]:

x=[1]
y=[1]
z=x+y
z

In [ ]:

list.__add__([1], [1])

In [ ]:

x=[1]
x.append(1)
x

In [ ]:

int.__add__(1, 3)

In [ ]:

1 + 3

Note: "joining" or "combining" one sequence to another is called concatenating the sequences. It works with lists of any length:

In [ ]:

[1,2,3] + [4,5,6]

In [ ]:

[1,2,3] + []

In [ ]:

[] + [1,2,3]

In [ ]:

(1,) + (1,)

In [ ]:

(1,) + 1
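The cells above that fail all mix two different types. A sketch of the usual fix is to convert one operand explicitly so that both sides have the same type; which conversion you want depends on whether you mean addition or concatenation. The variable names are made up for this example.

```python
# Convert explicitly so that both operands have the same type.
as_number = int("1") + 1      # integer addition
as_text = "1" + str(1)        # string concatenation
as_list = [1] + [1]           # list concatenation
as_tuple = (1,) + (1,)        # tuple concatenation

print(as_number, as_text, as_list, as_tuple)
```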

The `*` operator on various data types¶

In [ ]:

1*5

In [ ]:

"1xz"*5

In [ ]:

"1xz"*"blah"

In [ ]:

[1,2,3]*5

In [ ]:

5 * [1,2,3]

Special method: `object.__add__(other)`¶

Many of the bits of Python we have already been using are defined as "special methods". The names of these methods start and end with a double underscore __. They are not usually called directly, but rather Python calls these methods "secretly" to achieve some task. As we saw above, the "add" special method is implemented with __add__:

In [ ]:

six = 6
six.__add__(4)

In [ ]:

int.__add__(6,4)

In [ ]:

six+4

Special method: `object.__getitem__(index)`¶

The special method object.__getitem__(index) is how python implements object[index].

In [ ]:

x={0:1}

In [ ]:

x[0]

In [ ]:

x.__getitem__(0)

In [ ]:

x={1:"value1",2:43}

In [ ]:

x[1]

In [ ]:

x.__getitem__(1)

Special method: `sequence.__len__()`¶

Another special method is __len__, which returns the length of a sequence or other container such as a dict. Calling len(x) makes Python call x.__len__().

In [ ]:

len(x)

In [ ]:

x.__len__()

Special methods: `object.__str__()` (and `object.__repr__()`)¶

Another special method is __str__, which returns a string representation of the object. (__repr__ does something very similar but can often be used to "reproduce" the original thing and is hence a little more exact if less "nice" or "pretty".)

In [ ]:

str(0.4)

In [ ]:

x = 0.4
x.__str__()

In [ ]:

x={1:"value1",2:43}
x.__str__()

In [ ]:

print(x)

In [ ]:

print("hello")

In [ ]:

"hello"

In [ ]:

f"my value is: {x}"

In [ ]:

one = 1
one.__str__()

In [ ]:

f"my value is: {1}"

In [ ]:

repr(1/9)

In [ ]:

x.__repr__()

In [ ]:

"hello".__repr__()

In [ ]:

"hello".__str__()

In [ ]:

print("hello".__str__())

In [ ]:

print("hello".__repr__())

In [ ]:

print.__str__()

Abstract interfaces in python¶

for loops iterate over "iterables". You can construct a list (or a dict) from iterables.

Functions and methods are "callable".

Getting items with square brackets (e.g. x[0]) works by calling the __getitem__ method (so, x.__getitem__(0)). Any type can define how this works for that type.
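As a preview of how a type can define this behavior itself, here is a minimal sketch of a made-up type, `Countdown`, that supports both `len()` and square brackets. It uses the `class` statement, which we have not covered yet, so treat it as an illustration only.

```python
class Countdown:
    """Hypothetical example type: Countdown(3) behaves like [3, 2, 1]."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        # Called by len(c)
        return self.start

    def __getitem__(self, index):
        # Called by c[index]
        if index < 0 or index >= self.start:
            raise IndexError("Countdown index out of range")
        return self.start - index

c = Countdown(3)
print(len(c))
print(c[0])
print(list(c))
```

Because `Countdown` answers `c[0]`, `c[1]`, ... and raises `IndexError` at the end, Python can also iterate over it, which is why `list(c)` works.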

More on iterators¶

There are a couple of very handy functions which take an iterable and return a new iterator:

enumerate(items) - returns an iterator that numbers the items. Each iteration produces a tuple with (index, item).
zip(a_items, b_items) - returns an iterator combining two other iterators. Each iteration produces a tuple with (a_item, b_item).

In [ ]:

my_list = ['abc', 'def', 'ghi']
my_iterator = enumerate(my_list)
for x in my_iterator:
    idx, item = x
    print(f"{idx}: {item}")

Usually, the temporary iterator would be implicit:

In [ ]:

my_list = ['abc', 'def', 'ghi']
for x in enumerate(my_list):
    idx, item = x
    print(f"{idx}: {item}")

We can directly assign the tuple to two variables for further elimination of temporary variables:

In [ ]:

my_list = ['abc', 'def', 'ghi']
for idx, item in enumerate(my_list):
    print(f"{idx}: {item}")

Now, for zip:

In [ ]:

my_list = ['abc', 'def', 'ghi']
list2 = ['red', 'green', 'blue']
my_iterator = zip(my_list, list2)
for x in my_iterator:
    (item, color) = x
    print(f"{item} {color}")

In [ ]:

my_list = ['abc', 'def', 'ghi']
for (item, color) in zip(my_list, ['red', 'green', 'blue']):
    print(f"{item} {color}")

In [ ]:

my_list = ['abc', 'def', 'ghi']
for item, number in zip(my_list, range(3,6)):
    print(f"{item} {number}")
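Two further properties of zip are worth a quick sketch: it stops as soon as the shortest input runs out, and because it produces (key, value) tuples, its result can be passed straight to dict(). The lists here are shortened on purpose for the illustration.

```python
names = ['abc', 'def', 'ghi']
colors = ['red', 'green']  # one item shorter than `names`

# zip stops after two pairs because `colors` has only two items.
pairs = list(zip(names, colors))
print(pairs)

# dict() accepts any iterable of (key, value) pairs, including a zip.
color_for_name = dict(zip(names, colors))
print(color_for_name)
```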

Data Frames¶

We are going to look at data in tables where each row of the table contains measurements or values about a single thing and each column is the measurement type. Such tables are very common in data science.

(Loading the iris data is hidden in this cell. You can ignore this.)

Here is an example of the data we will be looking at. It is a subsampling of the very famous Iris data set.

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)	species
11	4.8	3.4	1.6	0.2	setosa
81	5.5	2.4	3.7	1.0	versicolor
97	6.2	2.9	4.3	1.3	versicolor
37	4.9	3.6	1.4	0.1	setosa
31	5.4	3.4	1.5	0.4	setosa
28	5.2	3.4	1.4	0.2	setosa
141	6.9	3.1	5.1	2.3	virginica
149	5.9	3.0	5.1	1.8	virginica

For now, the data are given as a dict. This dict is created in a special way, where each key is a column name and each value is a list holding that column's entry for every row. Later we will read this from a file.

In [ ]:

iris_dataset = {'sepal length (cm)': [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4, 4.8, 
                                      4.8, 4.3, 5.8, 5.7, 5.4, 5.1, 5.7, 5.1, 5.4, 5.1, 4.6, 5.1, 
                                      4.8, 5.0, 5.0, 5.2, 5.2, 4.7, 4.8, 5.4, 5.2, 5.5, 4.9, 5.0,
                                      5.5, 4.9, 4.4, 5.1, 5.0, 4.5, 4.4, 5.0, 5.1, 4.8, 5.1, 4.6,
                                      5.3, 5.0, 7.0, 6.4, 6.9, 5.5, 6.5, 5.7, 6.3, 4.9, 6.6, 5.2,
                                      5.0, 5.9, 6.0, 6.1, 5.6, 6.7, 5.6, 5.8, 6.2, 5.6, 5.9, 6.1,
                                      6.3, 6.1, 6.4, 6.6, 6.8, 6.7, 6.0, 5.7, 5.5, 5.5, 5.8, 6.0,
                                      5.4, 6.0, 6.7, 6.3, 5.6, 5.5, 5.5, 6.1, 5.8, 5.0, 5.6, 5.7, 
                                      5.7, 6.2, 5.1, 5.7, 6.3, 5.8, 7.1, 6.3, 6.5, 7.6, 4.9, 7.3, 6.7, 
                                      7.2, 6.5, 6.4, 6.8, 5.7, 5.8, 6.4, 6.5, 7.7, 7.7, 6.0, 6.9, 5.6, 7.7, 6.3, 6.7, 
                                      7.2, 6.2, 6.1, 6.4, 7.2, 7.4, 7.9, 6.4, 6.3, 6.1, 7.7, 6.3, 6.4, 6.0, 6.9, 6.7, 
                                      6.9, 5.8, 6.8, 6.7, 6.7, 6.3, 6.5, 6.2, 5.9], 
                'sepal width (cm)': [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.4, 3.0, 3.0, 4.0, 4.4, 3.9, 3.5, 3.8, 3.8, 3.4, 3.7, 3.6, 3.3, 3.4, 3.0, 3.4, 3.5, 3.4, 3.2, 3.1, 3.4, 4.1, 4.2, 3.1, 3.2, 3.5, 3.6, 3.0, 3.4, 3.5, 2.3, 3.2, 3.5, 3.8, 3.0, 3.8, 3.2, 3.7, 3.3, 3.2, 3.2, 3.1, 2.3, 2.8, 2.8, 3.3, 2.4, 2.9, 2.7, 2.0, 3.0, 2.2, 2.9, 2.9, 3.1, 3.0, 2.7, 2.2, 2.5, 3.2, 2.8, 2.5, 2.8, 2.9, 3.0, 2.8, 3.0, 2.9, 2.6, 2.4, 2.4, 2.7, 2.7, 3.0, 3.4, 3.1, 2.3, 3.0, 2.5, 2.6, 3.0, 2.6, 2.3, 2.7, 3.0, 2.9, 2.9, 2.5, 2.8, 3.3, 2.7, 3.0, 2.9, 3.0, 3.0, 2.5, 2.9, 2.5, 3.6, 3.2, 2.7, 3.0, 2.5, 2.8, 3.2, 3.0, 3.8, 2.6, 2.2, 3.2, 2.8, 2.8, 2.7, 3.3, 3.2, 2.8, 3.0, 2.8, 3.0, 2.8, 3.8, 2.8, 2.8, 2.6, 3.0, 3.4, 3.1, 3.0, 3.1, 3.1, 3.1, 2.7, 3.2, 3.3, 3.0, 2.5, 3.0, 3.4, 3.0], 'petal length (cm)': [1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5, 1.6, 1.4, 1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5, 1.7, 1.5, 1.0, 1.7, 1.9, 1.6, 1.6, 1.5, 1.4, 1.6, 1.6, 1.5, 1.5, 1.4, 1.5, 1.2, 1.3, 1.4, 1.3, 1.5, 1.3, 1.3, 1.3, 1.6, 1.9, 1.4, 1.6, 1.4, 1.5, 1.4, 4.7, 4.5, 4.9, 4.0, 4.6, 4.5, 4.7, 3.3, 4.6, 3.9, 3.5, 4.2, 4.0, 4.7, 3.6, 4.4, 4.5, 4.1, 4.5, 3.9, 4.8, 4.0, 4.9, 4.7, 4.3, 4.4, 4.8, 5.0, 4.5, 3.5, 3.8, 3.7, 3.9, 5.1, 4.5, 4.5, 4.7, 4.4, 4.1, 4.0, 4.4, 4.6, 4.0, 3.3, 4.2, 4.2, 4.2, 4.3, 3.0, 4.1, 6.0, 5.1, 5.9, 5.6, 5.8, 6.6, 4.5, 6.3, 5.8, 6.1, 5.1, 5.3, 5.5, 5.0, 5.1, 5.3, 5.5, 6.7, 6.9, 5.0, 5.7, 4.9, 6.7, 4.9, 5.7, 6.0, 4.8, 4.9, 5.6, 5.8, 6.1, 6.4, 5.6, 5.1, 5.6, 6.1, 5.6, 5.5, 4.8, 5.4, 5.6, 5.1, 5.1, 5.9, 5.7, 5.2, 5.0, 5.2, 5.4, 5.1], 'petal width (cm)': [0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.4, 0.4, 0.3, 0.3, 0.3, 0.2, 0.4, 0.2, 0.5, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.4, 0.1, 0.2, 0.2, 0.2, 0.2, 0.1, 0.2, 0.2, 0.3, 0.3, 0.2, 0.6, 0.4, 0.3, 0.2, 0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 1.3, 1.5, 1.3, 1.6, 1.0, 1.3, 1.4, 1.0, 1.5, 1.0, 1.4, 1.3, 1.4, 1.5, 1.0, 1.5, 1.1, 1.8, 1.3, 1.5, 1.2, 1.3, 1.4, 1.4, 1.7, 1.5, 1.0, 1.1, 1.0, 1.2, 
1.6, 1.5, 1.6, 1.5, 1.3, 1.3, 1.3, 1.2, 1.4, 1.2, 1.0, 1.3, 1.2, 1.3, 1.3, 1.1, 1.3, 2.5, 1.9, 2.1, 1.8, 2.2, 2.1, 1.7, 1.8, 1.8, 2.5, 2.0, 1.9, 2.1, 2.0, 2.4, 2.3, 1.8, 2.2, 2.3, 1.5, 2.3, 2.0, 2.0, 1.8, 2.1, 1.8, 1.8, 1.8, 2.1, 1.6, 1.9, 2.0, 2.2, 1.5, 1.4, 2.3, 2.4, 1.8, 1.8, 2.1, 2.4, 2.3, 1.9, 2.3, 2.5, 2.3, 1.9, 2.0, 2.3, 1.8], 'species': ['setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 
'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica']}

In [ ]:

plt.plot(iris_dataset['sepal width (cm)'], iris_dataset['petal width (cm)'],'o');
plt.xlabel('sepal width (cm)')
plt.ylabel('petal width (cm)')

In [ ]:

for column_name in iris_dataset:
    print(column_name)

Let's double check that every column (the value of each key) has the same number of rows.

In [ ]:

for column_name in iris_dataset:
    column = iris_dataset[column_name]
    print("'{}': {} rows".format(column_name, len(column)))

Now let's compute the average value for each measurement across all of our data.

In [ ]:

def compute_average(my_list):
    assert type(my_list)==list
    accum = 0.0
    for item in my_list:
        accum = accum + item
    average = accum / len(my_list)
    return average

In [ ]:

compute_average([4, 6])

In [ ]:

for column_name in iris_dataset:
    if column_name == 'species':
        continue
    average = compute_average(iris_dataset[column_name])
    print("'{}' average: {}".format(column_name, average))
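Writing the loop by hand is good practice, but Python can also do this for us. As a sketch of two alternatives, the built-in `sum` adds up the items, and the standard library's `statistics.mean` computes the average directly. The variable names are made up for this example.

```python
import statistics

values = [4, 6]

# sum() adds all items; dividing by the number of items gives the average.
average_builtin = sum(values) / len(values)

# statistics.mean does the same computation in one call.
average_stdlib = statistics.mean(values)

print(average_builtin)
print(average_stdlib)
```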

Let's see what species we have in our data.

In [ ]:

known_species = {}
count = 0
for row_species in iris_dataset['species']:
    known_species[row_species] = None
    count = count + 1

print(count)
for species in known_species:
    print(species)

In [ ]:

known_species

In [ ]:

known_species = {}
for row_species in iris_dataset['species']:
    if row_species in known_species:
        known_species[row_species] += 1
    else:
        known_species[row_species] = 1

print(known_species)
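Counting how often each value occurs is so common that the standard library has a ready-made tool for it, `collections.Counter`. The sketch below uses only the species of the eight sample rows shown in the table earlier, stored in a made-up variable `species_column`.

```python
from collections import Counter

# Species of the eight sample rows from the table above
species_column = ['setosa', 'versicolor', 'versicolor', 'setosa', 'setosa',
                  'setosa', 'virginica', 'virginica']

# Counter behaves like a dict that maps each value to its number of occurrences.
counts = Counter(species_column)
print(counts['setosa'], counts['versicolor'], counts['virginica'])
```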

Now, we will want to calculate values for each species, not across all measurements. This is going to be a little tricky, because we need to calculate which species is in which row. As our first step, we will figure this out.

In [ ]:

rows_for_species = {'setosa':[], 'versicolor':[], 'virginica':[]}
for species_name in rows_for_species:
    # print(species_name)
    row_index = 0
    for row_species in iris_dataset['species']:
        # print(row_index, row_species)
        if row_species == species_name:
            rows_for_species[species_name].append(row_index)
        row_index = row_index + 1

In [ ]:

rows_for_species
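The same bookkeeping can be sketched with enumerate, which keeps track of the row index for us and needs only one pass over the column. To keep the output short, this example uses the species of the eight sample rows from the table above, in a made-up variable `species_column`, instead of the full data set.

```python
species_column = ['setosa', 'versicolor', 'versicolor', 'setosa', 'setosa',
                  'setosa', 'virginica', 'virginica']

rows_for_species = {}
for row_index, row_species in enumerate(species_column):
    # Create the list the first time we meet a species.
    if row_species not in rows_for_species:
        rows_for_species[row_species] = []
    rows_for_species[row_species].append(row_index)

print(rows_for_species)
```

A side benefit is that the species names do not have to be known in advance.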

Let's check if this worked by building, for each species, a list of its values in each column.

In [ ]:

for species_name in rows_for_species:
    # get a list of row numbers for `species_name`
    species_indexes = rows_for_species[species_name]
    # iterate over columns in dataset
    for column_name in iris_dataset:
        # get all data for this column (get all data for this measurement type, e.g. sepal width)
        all_rows_for_this_column = iris_dataset[column_name]
        
        # accumulate measurements in a list **for this species**
        this_species_values = []
        for species_index in species_indexes:
            # take only the rows corresponding to this species
            row_value = all_rows_for_this_column[species_index]
            this_species_values.append(row_value)
        print(f"{species_name} -> {column_name}: {this_species_values}")
        print()
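With the values grouped by species, computing a per-species average is a small extra step. The sketch below does this for one column, using only the eight sample rows from the table above so that it can run on its own; `sample`, `values_for_species` and `averages` are names made up for this illustration.

```python
# The eight sample rows from the table above (one measurement column plus species)
sample = {
    'sepal length (cm)': [4.8, 5.5, 6.2, 4.9, 5.4, 5.2, 6.9, 5.9],
    'species': ['setosa', 'versicolor', 'versicolor', 'setosa', 'setosa',
                'setosa', 'virginica', 'virginica'],
}

# Collect the sepal lengths of each species in its own list.
values_for_species = {}
for value, species_name in zip(sample['sepal length (cm)'], sample['species']):
    if species_name not in values_for_species:
        values_for_species[species_name] = []
    values_for_species[species_name].append(value)

# Average each list separately.
averages = {}
for species_name in values_for_species:
    values = values_for_species[species_name]
    averages[species_name] = sum(values) / len(values)
    print(f"{species_name}: {averages[species_name]:.2f}")
```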
