2021-05-20

Indexing arrays with other arrays

Since I can't seem to fully internalise how numpy advanced indexing works, here's the solution to a common indexing task, written out for future reference.

Task

Given an array a of shape [n, m, l] and an integer index array idx of shape [n, k] with k <= m, whose entries index the second axis of a, I'd like to define a new array b of shape [n, k, l] such that b[i, j, :] = a[i, idx[i, j], :].

Example

Let's make an example and a naive solution:

import numpy as np

a = np.random.random((3, 3, 2))

idx = np.array([[1, 2], [0, 2], [0, 1]])

# result has shape (n, k, l); take l from a instead of hard-coding it
naive = np.zeros((*idx.shape, a.shape[2]))
for i in range(idx.shape[0]):
    for j in range(idx.shape[1]):
        naive[i, j] = a[i, idx[i, j]]

Solution

The indexing-based solution, from this blog post, is:

b = a[np.arange(a.shape[0])[:, None], idx, :]

# verify solution
np.testing.assert_array_equal(naive, b)
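To remind myself why this works, here is a small sketch that makes the broadcasting explicit. The names rows, rows_b and idx_b are mine, introduced only for illustration. The row index has shape (n, 1) and idx has shape (n, k), so NumPy broadcasts the two to a common shape (n, k) and pairs them element by element; the trailing slice then keeps the last axis whole.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 3, 2))                   # shape (n, m, l)
idx = np.array([[1, 2], [0, 2], [0, 1]])    # shape (n, k)

# Column vector of row numbers: shape (n, 1)
rows = np.arange(a.shape[0])[:, None]

# Advanced indexing broadcasts the index arrays against each other.
# Doing it by hand shows which (row, column) pairs get looked up.
rows_b, idx_b = np.broadcast_arrays(rows, idx)  # both shape (n, k)

b = a[rows, idx, :]                         # shape (n, k, l)

# Each b[i, j] is a[rows_b[i, j], idx_b[i, j]], i.e. a[i, idx[i, j]]
for i in range(idx.shape[0]):
    for j in range(idx.shape[1]):
        assert np.array_equal(b[i, j], a[rows_b[i, j], idx_b[i, j]])
```

Without the [:, None], the row index would have shape (n,) and would have to broadcast against (n, k) along the last axis, which either fails or pairs rows with the wrong columns.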

This also works with torch tensors and seems reasonably fast. If I notice any performance problems, I'll update this post with a (hopefully) more efficient version...
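For reference, an equivalent formulation uses np.take_along_axis. This is only a sketch of an alternative, not a claim that it is faster; I haven't benchmarked it. The variable names b_index and b_take are mine. The index array needs the same number of dimensions as a, so it gets a trailing axis of length 1 that broadcasts over the last axis of a.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 3, 2))                   # shape (n, m, l)
idx = np.array([[1, 2], [0, 2], [0, 1]])    # shape (n, k)

# Solution from the post, for comparison
b_index = a[np.arange(a.shape[0])[:, None], idx, :]

# Same selection along axis 1; idx[:, :, None] has shape (n, k, 1)
b_take = np.take_along_axis(a, idx[:, :, None], axis=1)

np.testing.assert_array_equal(b_index, b_take)
```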