2021-04-12

Transposing a 'sparse', randomly-ordered array

This note is about a little programming difficulty I ran into recently. I'm putting it here so I can point people to it while trying to figure out whether it is (a) already known, (b) easy to solve with advanced indexing, (c) in fact not a real problem, or (d) actually hard.

Update: So, it appears that this is just a sparse matrix operation, and one way to solve it would be to transition everything to sparse matrices. The scipy.sparse package would be quite useful for this and should support the format described below via coo_matrix (a short sketch is given after the example below). For my use case, I briefly used a numba implementation, given below, and then found a way to avoid having to use the transpose entirely. 😅 I'm leaving this note up in case anyone else runs into a similar problem!

Update (2): Thanks to Sergey Pozdnyakov for pointing out a typo below, now fixed. Two additional notes: (1) this implementation assumes that each index occurs only once and that there are no "self-interactions", i.e. k != i in the notation below, and (2) it's worth considering sorting idx: the operation can then be implemented with just a bit of numpy sorting (though this ends up being slower than the numba version below).

Task

Let's say we have a numpy array u with dimensions [n, m, …], where n > m. From now on we'll ignore the additional indices in u, as we're only interested in operations on the first two.

The entries in u correspond to the non-zero entries of an implicit, large [n, n, …] numpy array U. We know the following about U: for each index pair [i, k], if U[i, k] is non-zero, then U[k, i] is also non-zero, and for each i there are at most m non-zero entries k in that row/column. Therefore, u contains all the relevant information about the non-zero elements of U. Each row of u may be padded with zero entries at the end so that u is not ragged.

There is one more important twist: the entries in each row of u are ordered in some arbitrary, essentially random way. To track this, we have an index array idx with dimensions [n, m]. Each row of idx contains the column indices (the k above) in U corresponding to the entries of u at the same positions. The index -1 is reserved for padded entries. So for instance, if row i of u is [1, 2, 3], corresponding to [i, k3], [i, k1], [i, k2] in U, then idx[i] = [k3, k1, k2].

Our task is to compute the "transpose" t of u, in the sense that t has the same relationship to transpose(U) as u has to U: if the [i, j] entry of u corresponds to [i, k] in U, then [i, j] of t should correspond to [k, i].

The entry [i, j] of u belongs in row idx[i, j] = jj of the transpose. The difficulty is figuring out the column index: we need to know at which position row idx[jj, :] contains i, i.e. which entry of u[jj, :] corresponds to column i of U. We'll call this "reverse" index ii in the code.

For my particular use case, we'll need to compute this transpose many times for different u, but we can expect idx to stay the same: the values we are transposing might change, but not the underlying structure.

Example

This is rather confusing, so here is an example.

u = np.array([[3, 0],
              [2, 1],
              [4, 0]], dtype=int)
idx = np.array([[1, -1],
                [2, 0],
                [1, -1]], dtype=int)

with the result of our transpose:

t = [[1, 0],
     [4, 3],
     [2, 0]]
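
As mentioned in the first update, this layout maps directly onto scipy.sparse's coordinate format. As a sanity check, here is a sketch (assuming scipy is installed and using the example arrays above) that rebuilds U with coo_matrix, transposes it, and reads the entries back out in the order given by idx:

import numpy as np
from scipy.sparse import coo_matrix

n, m = idx.shape

# flatten the (u, idx) layout into COO triplets, dropping the padded entries
valid = idx != -1
rows = np.nonzero(valid)[0]  # the i of each real entry
cols = idx[valid]            # the corresponding k
vals = u[valid]

U = coo_matrix((vals, (rows, cols)), shape=(n, n))
Ut = U.T.toarray()  # transposing is trivial once U is explicit

# read transpose(U) back out in the order given by idx
t_check = np.zeros_like(u)
t_check[valid] = Ut[rows, cols]
print(t_check)  # [[1 0] [4 3] [2 0]]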

Naive solution

Here is a naive pure-python implementation.

def naive_transpose(u, idx):
    t = np.zeros_like(u)
    for i in range(u.shape[0]):
        for j in range(u.shape[1]):
            jj = idx[i, j]
            if jj == -1:
                # padded entry; the rest of this row is padding too
                break

            # linear search for the "reverse" index ii
            for ii in range(u.shape[1]):
                if idx[jj, ii] == i:
                    break

            t[i, j] = u[jj, ii]

    return t

Clearly, this is inefficient: We run through each row of the idx array many times to search for a match. But it works!
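
Running it on the example arrays above reproduces the expected result:

print(naive_transpose(u, idx))
# [[1 0]
#  [4 3]
#  [2 0]]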

numba + indexing solution

This solution has two parts: First, we make an [n, m, 2] transpose index array that contains idx in the [:, :, 0] entries, and the "reverse" indices in [:, :, 1]. This part is implemented using numba, which provides a just-in-time (jit) compiler for a subset of python, speeding up the for loops considerably.

We then use some ✨ advanced indexing ✨ to collect the entries of t out of u based on the indices in our transpose index array.

This approach has the advantage of separating the complicated index-finding part from the actual array operations: we only need to compute the transpose index array once, and can then re-use it for different arrays u with the same structure.

import numba
import numpy as np


def get_transpose_idx(idx):
    transpose = np.zeros((*idx.shape[:2], 2), dtype=int)
    return _get_transpose_idx(idx.astype(int), transpose)


@numba.jit(nopython=True)
def _get_transpose_idx(idx, transpose):
    n, m = idx.shape

    for i in range(n):
        for j in range(m):
            # what entry are we inverting?
            jj = idx[i, j]
            if jj == -1:
                # padded entry: mark the reverse index as padding too
                ii = -1
            else:
                # position of i within row jj of idx
                ii = find(idx[jj], i)
            transpose[i, j] = [jj, ii]

    return transpose


@numba.jit(nopython=True)
def find(array, target):
    # linear search; returns -1 if target does not occur
    n = len(array)
    for i in range(n):
        if array[i] == target:
            return i
    return -1

(For better performance one would need to either make use of accumulated knowledge about "reverse" indices while going through idx, or optimise the lookup, for example by switching to a dict, where the average lookup complexity is better. Or sort! A sketch of the sorting idea is given at the end of this note.)

The transpose itself can be achieved via

t = u[t_idx[:, :, 0], t_idx[:, :, 1], ...]

where the trailing ... stands in for any additional dimensions of u (equivalently, one : per extra dimension).
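
Putting the pieces together on the example arrays (a quick sketch; it assumes the snippets above have been run):

t_idx = get_transpose_idx(idx)
t = u[t_idx[:, :, 0], t_idx[:, :, 1]]

# padded slots index u[-1, -1], which happens to be a padded zero in this example
assert np.array_equal(t, naive_transpose(u, idx))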

This approach should be reasonably fast, but please run your own benchmarks if you end up using it!
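
For completeness, here is a sketch of the sorting idea mentioned in the second update (as noted there, it ended up being slower than the numba version for my use case). The name sorting_transpose is just for illustration; the trick is that, because the sparsity pattern of U is symmetric, sorting the entries by (i, k) and by (k, i) lines each entry up with its transpose partner.

import numpy as np


def sorting_transpose(u, idx):
    # assumes a 2-D u and the conventions above: symmetric pattern,
    # -1 padding, and each (i, k) pair occurring exactly once
    n, m = idx.shape
    valid = idx != -1
    rows = np.nonzero(valid)[0]  # i of each real entry
    cols = idx[valid]            # k of each real entry
    vals = u[valid]

    # both orderings run through the same set of (row, column) pairs, so
    # position p of one ordering holds the transpose partner of position p
    # of the other
    by_ik = np.argsort(rows * n + cols)
    by_ki = np.argsort(cols * n + rows)

    out = np.empty_like(vals)
    out[by_ik] = vals[by_ki]  # each entry receives the value of its partner

    t = np.zeros_like(u)
    t[valid] = out
    return t

On the example arrays this again gives [[1, 0], [4, 3], [2, 0]].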