Indexing arrays with other arrays
Since I can't seem to fully internalise how numpy advanced indexing works, here's the solution to a common indexing task, written out for future reference.
Task
Given an array a of shape [n, m, l] and an index array idx of shape [n, k] with k <= m, I'd like to define a new array b of shape [n, k, l] such that b[i, j, :] = a[i, idx[i, j], :].
Example
Let's make an example and a naive solution:
import numpy as np
a = np.random.random((3, 3, 2))
idx = np.array([[1, 2], [0, 2], [0, 1]])
naive = np.zeros((*idx.shape, 2))
for i in range(idx.shape[0]):
    for j in range(idx.shape[1]):
        naive[i, j] = a[i, idx[i, j]]
Solution
The indexing-based solution, from this blog post, is:
b = a[np.arange(a.shape[0])[:, None], idx, :]
# verify solution
np.testing.assert_array_equal(naive, b)
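As a note to self on why the indexing expression works, here is a small sketch of the broadcasting involved. The sizes n, m, l, k and the helper names rows, rows_b and idx_b are made up for illustration. The row index has shape (n, 1) and idx has shape (n, k), so the two broadcast to (n, k), and every (i, j) position picks a[i, idx[i, j], :].

```python
import numpy as np

# Illustrative sizes (assumptions, not from the example above)
n, m, l, k = 3, 4, 2, 2
rng = np.random.default_rng(0)
a = rng.random((n, m, l))
idx = rng.integers(0, m, size=(n, k))

# Column vector of row numbers, shape (n, 1)
rows = np.arange(n)[:, None]

# The two index arrays broadcast to a common shape (n, k):
# rows_b[i, j] == i and idx_b[i, j] == idx[i, j]
rows_b, idx_b = np.broadcast_arrays(rows, idx)
assert rows_b.shape == idx_b.shape == (n, k)

# Advanced indexing on the first two axes, full slice on the last
b = a[rows, idx, :]
assert b.shape == (n, k, l)

# Spot-check against the definition b[i, j, :] = a[i, idx[i, j], :]
for i in range(n):
    for j in range(k):
        assert np.array_equal(b[i, j], a[i, idx[i, j]])
```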
The same indexing expression also works with torch tensors and seems reasonably fast. If I notice any performance problems, I'll update this post with a (hopefully) more efficient version...
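One alternative that is not from the original blog post is np.take_along_axis, which expects the index array to have the same number of dimensions as a. The sketch below only checks that it gives the same result on the example data. I have not benchmarked it, so this makes no claim about speed.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 3, 2))
idx = np.array([[1, 2], [0, 2], [0, 1]])

# Indexing-based solution, as above
b = a[np.arange(a.shape[0])[:, None], idx, :]

# idx[:, :, None] has shape (3, 2, 1); it broadcasts along the last axis
# of a, so each selected row keeps all of its trailing entries
b_alt = np.take_along_axis(a, idx[:, :, None], axis=1)

# Both approaches should agree exactly
np.testing.assert_array_equal(b, b_alt)
```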