2021-04-12

# Transposing a 'sparse', randomly-ordered array

This note is about a little programming difficulty I ran into recently. I'm putting it here so I can point people to it while trying to figure out if this is either (a) already known, (b) easy to solve with advanced indexing, (c) in fact not a real problem, or (d) actually hard.

Update: So, it appears that this is just a sparse matrix operation, and one way to solve it would be to transition everything to sparse matrices. The `scipy.sparse` package would be quite useful for this, and should support the format described below with `coo_matrix`. For my usecase, I briefly used a `numba` implementation, given below, and then found a way to avoid having to use the transpose entirely. 😅 I'm leaving this note up in case anyone else runs into a similar problem!

Let's say we have a `numpy` array `u` with dimensions `[n, m, …]`, with `n>m`. From now on we'll ignore the additional indices `…` in `u`, as we're interested in operations on the first two indices.

The entries in `u` correspond to the non-zero entries of an implicit, large `[n, n, …]` `numpy` array `U`. We know the following about `U`: For each index pair `[i, k]`, if `U[i, k]` is nonzero, `U[k, i]` is also nonzero, and for each `i` there are at most `m` non-zero entries `k` in that row/column. Therefore, `u` contains all the relevant information about the non-zero elements of `U`. Each row in `m` may be padded with zero entries at the end to make `u` not ragged.

There is one more important twist: the entries in each row of `u` are ordered in some arbitrary, essentially random way. To track this, we have an index array `idx` with dimension `[n, m]`. Each row in `i` contains the column indices (`k` above) in `U` corresponding to the entries in `u` at that position. The index `-1` is reserved for padded entries. So for instance, if a row `i` of `u` is `[1, 2, 3]`, corresponding to `[i, k3]`, `[i, k1]`, `[i, k2]` in `U`, then `idx[i] = [k3, k1, k2]`.

Our task is to compute the "transpose" `t` of `u`, in the sense that `t` has the same relationship to `transpose(U)` as `u` has to `U`: if the `[i, j]` entry of `u` corresponds to `[i, k]` in `U`, then `[i, j]` of `t` should correspond to `[k, i]`.

The entry `[i, j]` of `u` belongs in row `idx[i, j] = jj` of the transpose. The difficulty is figuring out the column index: We need to know which index in `u[jj, :]` belongs to `i`. We'll call this "reverse" index `ii` in the code.

For my particular usecase, we'll need to compute this transpose many times for different `u`, but can expect `idx` to stay the same: the values we are transposing might change, but not the underlying structure.

### Example

This is rather confusing, so here is an example.

``````u = np.array([[3, 0],
[2, 1],
[4, 0]], dtype=int)
idx = np.array([[1, -1],
[2, 0],
[1, -1]], dtype=int)
``````

with the result of our `transpose`:

``````t = [[1, 0],
[4, 3],
[2, 0]]
``````

## Naive solution

Here is a naive pure-python implementation.

``````def naive_transpose(u, idx):
t = np.zeros_like(u)
for i in range(u.shape):
for j in range(u.shape):
jj = idx[i, j]
if jj == -1:
break

for kk in range(u.shape):
if idx[jj, ii] == i:
break

t[i, j] = u[jj, ii]

return t
``````

Clearly, this is inefficient: We run through each row of the `idx` array many times to search for a match. But it works!

## `numba` + indexing solution

This solution has two parts: First, we make an `[n, m, 2]` transpose index array that contains `idx` in the `[:, :, 0]` entries, and the "reverse" indices in `[:, :, 1]`. This part is implemented using `numba`, which provides a just-in-time (`jit`) compiler for a subset of `python`, speeding up the `for` loops considerably.

We then use some ✨ advanced indexing ✨ to collect the entries of `t` out of `u` based on the indices in our transpose index array.

This approach has the advantage of separating out the complicated index-finding part from the actual array operations: We'll only need to compute the transpose index array once, and then can re-use it for different `u`s with the same structure.

``````import numba
import numpy as np

def get_transpose_idx(idx):
transpose = np.zeros((*idx.shape[:2], 2), dtype=int)
return _get_transpose_idx(idx.astype(int), transpose)

@numba.jit(nopython=True)
def _get_transpose_idx(idx, transpose):
n, m = idx.shape

for i in range(n):
for j in range(m):
# what entry are we inverting?
jj = idx[i, j]
if jj == -1:
ii = -1
else:
ii = find(idx[jj], i)
transpose[i, j] = [jj, ii]

return transpose

@numba.jit(nopython=True)
def find(array, target):
n = len(array)
for i in range(n):
if array[i] == target:
return i
return -1

``````

(For better performance one needs to either find a way to make use of accumulated knowledge about "reverse" indices when going through `idx`, or optimise the lookup, for example by switching to a `dict`, where average lookup complexity is better.)

The transpose itself can be achieved via

``````t = u[t_idx[:, :, 0], t_idx[:, :, 1], ...]
``````

adding `:` in place of `...` for additional dimensions of `u`.

This approach should be reasonably fast, but please run your own benchmarks if you end up using it!