Python list comprehensions

Python has a succinct way of representing lists:

>>> a = [1, 2, 3]

Here a is a list of the first three natural numbers. It is often necessary to transform every element of a list in some way. One way to do this, among others, is a list comprehension ([[http://docs.python.org/tutorial/datastructures.html|docs]]):

>>> [z * z for z in a]
[1, 4, 9]

What we have done with the above is create a list of squares of the elements in the original list.

[z * z for z in a]
says: build a new list by going through every element z of list a and adding z * z to the new list. Very useful.
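The same reading can be written out as an explicit loop. This is only a sketch of the equivalence; the name squares is mine and not part of the original example.

```python
a = [1, 2, 3]

# Loop form of [z * z for z in a]
squares = []
for z in a:
    squares.append(z * z)

print(squares)                        # [1, 4, 9]
print(squares == [z * z for z in a])  # True
```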

You can do even better than this:

>>> a = [90, 80, 70]
>>> b = [1, 2, 3]
>>> [x + y for x in a for y in b]
[91, 92, 93, 81, 82, 83, 71, 72, 73]

The above is a double list comprehension: it adds each number in list a to each number in list b, again creating a new list.

What is important about the above is the order of iteration. The for clauses are nested in the order they are written: the leftmost clause (for x in a) is the outermost loop and the rightmost clause (for y in b) is the innermost loop, so the rightmost variable changes fastest. You might find this counterintuitive at first (at least I did). So, it first takes one element from the outermost list (list a) and binds it to x. Then it moves to the next clause to the right (in our case, for y in b) and works in the same way with it. That is, it binds the first element of b to y. Since there are no more clauses, it evaluates the expression given (in our case, x + y) and moves on to the next element of the innermost list, binding the second element of list b to y. It continues until there are no more elements in b. Thus, in the first step, we will have the following:

||x||y||
|90|1|
|90|2|
|90|3|

Since there are no more elements in list b, go back one clause to the left (i.e. to the enclosing loop, for x in a), rebind its variable (x) to the next element of its list (80) and run the clause to its right (for y in b) again from the start. The second step would be:

||x||y||
|80|1|
|80|2|
|80|3|

Repeat this again and again, until all lists are exhausted. The whole picture would be:

||x||y||
|90|1|
|90|2|
|90|3|
|80|1|
|80|2|
|80|3|
|70|1|
|70|2|
|70|3|
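The table can be reproduced by writing the double comprehension as nested loops, with the for clauses nested in the order they appear. A small sketch; the names sums and pairs are my own, added for illustration.

```python
a = [90, 80, 70]
b = [1, 2, 3]

sums = []
pairs = []
for x in a:        # leftmost clause: outer loop
    for y in b:    # rightmost clause: inner loop, changes fastest
        pairs.append((x, y))
        sums.append(x + y)

print(pairs[:4])                              # [(90, 1), (90, 2), (90, 3), (80, 1)]
print(sums == [x + y for x in a for y in b])  # True
```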

A couple of weeks ago I ran into this construct, used along these lines:

>>> a = [(1, 2), (3, 4), (5, 6)]
>>> [b for c in a for b in c]

In this example, the second for clause depends on the first: for b in c uses the variable c bound by the preceding clause, for c in a. This works because a clause can refer to any variable bound by a clause to its left, that is, by a loop that encloses it. Given the above, here’s what would happen:

||c||b||
|(1, 2)|1|
|(1, 2)|2|
|(3, 4)|3|
|(3, 4)|4|
|(5, 6)|5|
|(5, 6)|6|

Again, for c in a is the outermost loop, so for each value of c (and remember, c is just looping through all the elements of a) you go through the whole cycle of every clause to its right. In our case, there’s only one such clause: for b in c. So, for each c bound in turn to each element of list a, you go through all the elements of c, bind each one to b and do something with it; in our case, just copy it into the new list.

The output of the above would be:

>>> a = [(1, 2), (3, 4), (5, 6)]
>>> [b for c in a for b in c]
[1, 2, 3, 4, 5, 6]

This is a very neat way to “unwrap” (flatten) a list of lists or, as in this case, a list of tuples.
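For comparison, here is the same flattening written as nested loops, next to the standard library's itertools.chain.from_iterable, which I believe gives the same result for one level of nesting. The names flat_loop and flat_chain are mine, for illustration only.

```python
from itertools import chain

a = [(1, 2), (3, 4), (5, 6)]

# Loop form of [b for c in a for b in c]
flat_loop = []
for c in a:
    for b in c:
        flat_loop.append(b)

# Standard-library alternative: chains the inner iterables together
flat_chain = list(chain.from_iterable(a))

print(flat_loop)   # [1, 2, 3, 4, 5, 6]
print(flat_chain)  # [1, 2, 3, 4, 5, 6]
```

Both forms remove only one level of nesting; deeper structures need more for clauses or more loops.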

Of course, you can go deeper than that. If you remember that strings are also iterable, you can do this:

>>> a = [("abc", "def"), ("geh", "ijk")]
>>> [d for b in a for c in b for d in c]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'e', 'h', 'i', 'j', 'k']
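Written as loops, the three for clauses become three nested loops in the same left-to-right order. Again just a sketch to show the equivalence; the name chars is my own.

```python
a = [("abc", "def"), ("geh", "ijk")]

# Loop form of [d for b in a for c in b for d in c]
chars = []
for b in a:          # b is a tuple of strings
    for c in b:      # c is one string
        for d in c:  # d is one character
            chars.append(d)

print(chars == [d for b in a for c in b for d in c])  # True
print("".join(chars))                                 # abcdefgehijk
```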