Tuple Indices

The basic building block of multidimensional indexing is the tuple index. A tuple index doesn’t select elements on its own. Instead, it contains other indices that themselves select elements. The general rule for tuples is that

each element of a tuple index selects the corresponding elements for the corresponding axis of the array

(this rule is modified a little bit in the presence of ellipses or newaxis, as we will see later).

For example, suppose we have a three-dimensional array a with the shape (3, 2, 4). For simplicity, we’ll define a as a reshaped arange, so that each element is distinct and we can easily see which elements are selected.

>>> import numpy as np
>>> a = np.arange(24).reshape((3, 2, 4))
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]]])

If we use a basic single-axis index on a, such as an integer or a slice, it operates on the first dimension of a:

>>> a[0] # The first row of the first axis
array([[0, 1, 2, 3],
       [4, 5, 6, 7]])
>>> a[2:] # Every row of the first axis after the first two
array([[[16, 17, 18, 19],
        [20, 21, 22, 23]]])

We also observe that integer indices remove the axis, and slices keep the axis (even when the resulting axis has size 1):

>>> a[0].shape
(2, 4)
>>> a[2:].shape
(1, 2, 4)

The indices in a tuple index target the corresponding elements of the corresponding axis. So for example, the index (1, 0, 2) selects the second element of the first axis, the first element of the second axis, and the third element of the third axis (remember that indexing is 0-based, so index 0 corresponds to the first element, index 1 to the second, and so on). Looking at the list of lists representation of a that was printed by NumPy:

>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]]])

The first index is 1, so we should take the second element of the outermost list, giving

[[ 8,  9, 10, 11],
 [12, 13, 14, 15]]

The next index is 0, so we get the first element of this list, which is the list

[ 8,  9, 10, 11]

Finally, the last index is 2, giving the third element of this list:

10

And indeed:

>>> a[(1, 0, 2)] 
np.int64(10)

If we had stopped at an intermediate tuple, instead of getting an element, we would have gotten the subarray that we accessed. For example, just (1,) gives us the first intermediate array we looked at:

>>> a[(1,)]
array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])

And (1, 0) gives us the second intermediate array we looked at:

>>> a[(1, 0)]
array([ 8,  9, 10, 11])

In each case, the integers remove the corresponding axes from the array shape:

>>> a.shape
(3, 2, 4)
>>> a[(1,)].shape
(2, 4)
>>> a[(1, 0)].shape
(4,)

We can actually think of the final element, 10, as being an array with shape () (0 dimensions). Indeed, NumPy agrees with this idea:

>>> a[(1, 0, 2)].shape
()

Now, it’s important to note a key point about tuple indices: the parentheses in a tuple index are completely optional. Instead of writing a[(1, 0, 2)], we could simply write a[1, 0, 2].

>>> a[1, 0, 2] 
np.int64(10)

These are exactly the same. When the parentheses are omitted, Python automatically treats the index as a tuple. From here on, we will always omit the parentheses, as is common practice. Not only is this cleaner, but it is also important for another reason: syntactically, Python does not allow slices in a tuple index if the parentheses are included:

>>> a[(1:, :, :-1)] 
  File "<stdin>", line 1
    a[(1:, :, :-1)]
        ^
SyntaxError: invalid syntax
>>> a[1:, :, :-1]
array([[[ 8,  9, 10],
        [12, 13, 14]],

       [[16, 17, 18],
        [20, 21, 22]]])
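If an index does need to be written as an explicit tuple (for instance, to store it in a variable), the slices can be spelled with Python's built-in slice() objects, which is what the colon syntax produces inside square brackets. The following is a minimal sketch; the name idx is only an illustration.

```python
import numpy as np

a = np.arange(24).reshape((3, 2, 4))

# slice(1, None) is 1:, slice(None) is :, and slice(None, -1) is :-1
idx = (slice(1, None), slice(None), slice(None, -1))

# Indexing with the explicit tuple matches the colon syntax
same = np.array_equal(a[idx], a[1:, :, :-1])
print(same)          # True
print(a[idx].shape)  # (2, 2, 3)
```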

Now, let’s go back and look at an example we just showed:

>>> a[(1,)] # or just a[1,]
array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])

You might have noticed something about this. It is selecting the second element of the first axis. But from what we said earlier, we can also do this just by using the basic index 1, which will operate on the first axis:

>>> a[1] # Exactly the same thing as a[(1,)]
array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])

This illustrates the first important fact about tuple indices:

A tuple index with a single index, a[i,], is exactly the same as that single index, a[i].

The reason is that in both cases, the index i operates over the first axis of the array. This is true no matter what kind of index i is. i can be an integer index, a slice, an ellipsis, and so on. With one exception, that is: i cannot itself be a tuple index! Nested tuple indices are not allowed.

In practice, this means that when working with NumPy arrays, you can think of every index type as a single element tuple index. An integer index 0 is “actually” the tuple index (0,). The slice a[0:3] is actually a tuple a[0:3,]. This is a good way to think about indices because it will help you remember that non-tuple indices operate as if they were the first element of a single-element tuple index, namely, they operate on the first axis of the array. Remember, however, that this does not apply to Python built-in types: l[0,] and l[0:3,] will both produce errors if l is a list, tuple, or str.
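As a quick check of this equivalence, the sketch below compares a few non-tuple indices with their single-element tuple forms, and shows that a built-in list rejects a tuple index. The variable names are chosen only for illustration.

```python
import numpy as np

a = np.arange(24).reshape((3, 2, 4))

# For arrays, a non-tuple index behaves like a one-element tuple index
int_same = np.array_equal(a[1], a[1,])
slice_same = np.array_equal(a[0:3], a[0:3,])
print(int_same, slice_same)  # True True

# Built-in sequences such as list do not accept tuple indices
l = [10, 20, 30]
try:
    l[0,]
    list_error = None
except TypeError as exc:
    list_error = type(exc).__name__
print(list_error)  # TypeError
```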

Up to now, we looked at the tuple index (1, 0, 2), which selected a single element. And we considered sub-tuples of this, (1,) and (1, 0), which selected subarrays. What if we want to select other subarrays? For example, a[1, 0] selects the subarray with the second element of the first axis and the first element of the second axis. What if instead we wanted the first element of the last axis (the third axis, which is axis 2 when counting from 0)?

We can do this with slices. In particular, the trivial slice : will select every single element of an axis (remember that the : slice means “select everything”). So we want to select every element from the first and second axis, and only the first element of the last axis, meaning our index is :, :, 0:

>>> a[:, :, 0]
array([[ 0,  4],
       [ 8, 12],
       [16, 20]])

: serves as a convenient way to “skip” axes. It is one of the most common types of indices that you will see in practice for this reason. However, it is important to remember that : is not special. It is just a slice, which selects every element of the corresponding axis. We could also replace : with 0:n, where n is the size of the corresponding axis.

>>> a[0:3, 0:2, 0]
array([[ 0,  4],
       [ 8, 12],
       [16, 20]])

Of course, in practice using : is better because we might not know or care what the actual size of the axis is, and it’s less typing anyway.
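If the sizes are needed explicitly, they can be read from a.shape rather than hard-coded. This is a small sketch, with n0, n1, and n2 as illustrative names, showing that the explicit 0:n slices and the trivial : slices select the same elements.

```python
import numpy as np

a = np.arange(24).reshape((3, 2, 4))

# Read the size of each axis from the shape instead of hard-coding it
n0, n1, n2 = a.shape

explicit = a[0:n0, 0:n1, 0]
trivial = a[:, :, 0]
print(np.array_equal(explicit, trivial))  # True
```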

When we used the indices (1,) and (1, 0), we observed that they targeted the first axis and the first two axes, respectively, leaving the remaining axes intact and producing subarrays. Another way of saying this is that each tuple index implicitly ended with : slices, one for each axis we didn't index:

>>> a[1,]
array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])
>>> a[1, :, :]
array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])
>>> a[1, 0]
array([ 8,  9, 10, 11])
>>> a[1, 0, :]
array([ 8,  9, 10, 11])

This is a rule in general:

A tuple index implicitly ends in as many slices : as there are remaining dimensions of the array.
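One way to make this rule concrete is a small helper that appends the implied slices explicitly. The function pad_with_slices below is hypothetical (it is not part of NumPy) and assumes the index contains only integers and slices, with no ellipses or newaxis.

```python
import numpy as np

def pad_with_slices(idx, ndim):
    """Hypothetical helper: append ':' slices until idx has ndim entries.

    Assumes idx holds only integers and slices (no Ellipsis or newaxis).
    """
    if not isinstance(idx, tuple):
        idx = (idx,)
    return idx + (slice(None),) * (ndim - len(idx))

a = np.arange(24).reshape((3, 2, 4))

full = pad_with_slices((1, 0), a.ndim)
print(full)                              # (1, 0, slice(None, None, None))
print(np.array_equal(a[1, 0], a[full]))  # True
```

Writing the padded index out like this is never required, since NumPy applies the rule on its own. The helper only spells out what the short form means.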

The slices page stressed the point that slices always keep the axis they index, but it wasn’t clear why that is important until now. Suppose we slice the first axis of a, then later, we take that array and want to get the first element of the last row.

>>> n = 2
>>> b = a[:n]
>>> b[-1, -1, 0] 
np.int64(12)

Here b = a[:2] has shape (2, 2, 4):

>>> b.shape
(2, 2, 4)

But suppose we used a slice that only selected one element from the first axis instead:

>>> n = 1
>>> b = a[:n]
>>> b[-1, -1, 0] 
np.int64(4)

It still works. Here b has shape (1, 2, 4):

>>> b.shape
(1, 2, 4)
>>> b
array([[[0, 1, 2, 3],
        [4, 5, 6, 7]]])

Even though the slice a[:1] only produces a single element in the first axis, that axis is maintained as size 1. We might think this array is “equivalent” to the same array with shape (2, 4), since the first axis is redundant (the outermost list only has one element, so we don’t really need it).

>>> # c is kind of the same as b above
>>> c = np.array([[0, 1, 2, 3],
...               [4, 5, 6, 7]])

This is true in the sense that the elements are the same, but the resulting array has different properties. Namely, the index we used for b will not work for it.

>>> c[-1, -1, 0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed

Here we tried to use the same index on c that we used on b, but it didn’t work, because our index assumed three axes, but c only has two:

>>> c.shape
(2, 4)

Thus, when it comes to indexing, all axes, even “trivial” axes, matter. It’s sometimes a good idea to maintain the same number of dimensions in an array throughout a computation, even if one of them sometimes has size 1, simply because it means that you can index the array uniformly.[1] And this doesn’t apply just to indexing. Many NumPy functions reduce the number of dimensions of their output (for example, numpy.sum()), but they have a keepdims argument to retain the dimension as a size 1 dimension instead.
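As a brief illustration of keepdims, the sketch below sums a over its first axis with and without the argument. With keepdims=True, the result keeps three dimensions, so a three-element index like the one used on b above still applies.

```python
import numpy as np

a = np.arange(24).reshape((3, 2, 4))

# Without keepdims, the summed axis is removed
s = np.sum(a, axis=0)
print(s.shape)       # (2, 4)

# With keepdims=True, the summed axis is kept with size 1
s_keep = np.sum(a, axis=0, keepdims=True)
print(s_keep.shape)  # (1, 2, 4)

# A three-element index still works on the keepdims result
print(s_keep[-1, -1, 0])  # 36, the sum 4 + 12 + 20
```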

There are two final facts about tuple indices that should be noted before we move on to the other basic index types. First, as we saw above,

if a tuple index has more elements than there are dimensions in an array, it raises an IndexError.

Secondly, an array can be indexed by an empty tuple (). Recall that every tuple index implicitly ends in enough trivial : slices to select the remaining axes of an array. That means that for an array a with \(n\) dimensions, the empty tuple index a[()] should be the same as an index made of \(n\) trivial slices, one per axis (a[:, :, :] when \(n = 3\)). This would select every element of every axis. In other words,

the empty tuple index a[()] always just returns the entire array a unchanged.[2]
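A short check of this on the three-dimensional array a used throughout this section (a sketch for this one array, not a general proof):

```python
import numpy as np

a = np.arange(24).reshape((3, 2, 4))

# The empty tuple index selects everything, like one ':' per axis
whole = a[()]
print(whole.shape)                      # (3, 2, 4)
print(np.array_equal(whole, a[:, :, :]))  # True
print(np.array_equal(whole, a))           # True
```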

Footnotes