Problems with the Current Sparse Field Interpretation

# Problems with the Current Sparse Field Interpretation ###### tags: `Felix’s remains` ## Problems Several problems with the current interpretation of sparse fields as “normal” tuple fields are presented here. ### Ambiguity A sparse field on edges can either be represented by an edge field of 2-tuples or a vertex field of $n$-tuples (where $n$ is the maximum number of vertex-vertex neighbors). ### Missing Generality Depending on the choice above, some computations become possible or impossible. Examples: - How to compute the sum of the two sparse field values on an edge if the $n$-tuples-per-vertex approach was chosen? `deref(shift(e2v, 0)(x))[WHICH ENTRY??] + deref(shift(e2v, 1)(x))[WHICH ENTRY??]` - How to compute the sum of all adjacent sparse values of a vertex in the 2-tuples per edge approach? `deref(shift(v2e, 0)(x))[WHICH ENTRY??] + …` ### Missing Connectivity Info Correctly reducing over a sparse field is impossible in the $n$-tuples case, as connectivity information is missing in `deref(x)`. The only thing we can do is: `reduce(func, deref(x))`, but this incorrectly reduces over all values in the tuple, even if there are no associated neighbors (that is, if there are “skip values” in the neighbor table). ### Inconsistent Access If we chose the $n$-tuples per vertex approach, we call `deref` on a vertex, but the values actually live on edges. This is inconsistent with normal field types. ## Root of All Presented Problems To correctly dereference a sparse field two locations are required: the one where the value should be read from and the one of a neighbor from which the value is “viewed”. The current approach lacks one of these. Or rather, it hardcodes the viewer to one location type. ## ~~Possible Solution?~~ - ~~A sparse field has two location types: one residence location and one viewing location.~~ - ~~Dereferencing a sparse field requires passing a definition of the viewer-to-residence connectivity. For example, `deref(v2e)(x)`.~~ ~~This allows to implement the examples above: `deref(e2v)(x)[0] + deref(e2v)(x)[1]` sums the two entries on an edge. `deref(v2e)(x)[0] + …` sums the entries on edges around a vertex.~~ ~~Note that this allows for a unified view on “partial shifting” and dereferencing on sparse and non-sparse fields. That is, `shift(v2e)(x)` may return a tuple-like of iterators for both sparse and non-sparse fields; `deref(v2e)(x)` may return a tuple-like of (optional, possibly masked) values for both sparse and non-sparse fields.~~ ~~Further, note that the backend is still able to choose between the above 2-tuple and $n$-tuple forms for data storage. The choice merely decides which access is more efficient. For example, if we do a `deref(e2v)(x)` on an edge with the $n$-tuple-on-vertex approach, the following steps are required:~~ 1. ~~Looking up neighbor vertices of the edge in the e2v table~~ 2. ~~Finding the current edge in eachs vertex v2e table entry (note: the backend has to know the relationship between e2v and v2e tables)~~ 3. ~~Getting the tuple index from the previous step and access value~~ ## hacking (don't read) ```python= inp = getEdgeField() shifted = shift(V2E)(inp) deref(shift(0)(shifted)) sparse_old = getSparseOldApproach() deref(shift(0)(sparse_old)) sparse_it_of_it = ... deref(shift(1)(deref(sparse_it_of_it))) tuple_get(1, deref(sparse_it_of_it)) reduce(lambda a,b:a+reduce(lambda a,b: a+b, 0)(b), 0)(shift(E2V)(shift(V2E)(inp))) reduce(lambda a,b:a+deref(b), 0)(deref(sparse_it_of_it)) reduce(lambda a,b:a+b, 0)(sparse_old) # reduce(lambda a,b:a+b, 0)(deref(sparse_old)) reduce(lambda a,b:a+b, 0)(shift(V2E)(inp)) np_as_located_field(Vertex, V2E)(np.zeros((N,M))) neigh_tup = shift(V2E)(inp) -> (shift(0)(shift(V2E)(inp)), shift(V2E(1))(inp), ...) V2E = (V2E(0), V2E(1)) shift(E2V)(shift(V2E)(inp)) LapNeighs = (I(+1), J(-1), ...) shift(I) shift(K,1)(neigh_tup) deref(shift(K,1)(tuple_get(0, neigh_tup))) tuple_of_values = deref(shift(V2E)(inp)) # tuple[Optional,...] deref(V2E)(inp) -> deref(shift(V2E)(inp)) deref(V2E)(sparse) -> deref(shift(V2E)(sparse)) tuple_of_values = deref(sparse) # has to be possibly sparse deref() reduce(shift(V2E)(ones)*sparse) ``` ## Questions - Nested reductions: Do we need it? How could we make it work? - Do we need actually sparse field as concept? - shift(V2E): does it produce a tuple of iterators ## THE Solution: Simple, Clean & Consistent It’s easy: sparse fields are totally normal fields, but living on a new location type! Like edges connecting two vertices, we have sparse fields connecting a vertex with an edge. So we just introduce new connectivities V2S and E2S. The important thing is that V2E and V2S are designed in such a way that they have compatible neighbor orders. That is, `deref(shift(V2S, 0))` should give the value which corresponds to the same edge that we can shift to using `deref(shift(V2E, 0))`. This way, `shift(V2S)` of sparse fields is compatible with `shift(V2E)` of non-sparse fields. Now, do we really need two new neighbor tables? No! We _can_ have new and separate neighbor tables, but instead we can also remap the existing `V2E` and `E2V` tables with some transformations applied. Let’s assume that we want to use the 2-tuple storage layout from above: ```python # Trivial ones first: mapping between edge and sparse indices. # Not a single table lookup required! def e2s_table(edge: int, neighbor: int): assert 0 <= neighbor <= 1 return 2 * edge + neighbor def s2e_table(sparse: int, neighbor: int): assert neighbor == 0 return sparse // 2 # More complex ones: mapping between vertex and sparse indices. # Naive implementation of v2s requires two real table lookups; # one in the v2e table and one on the e2v table. # But see further below for a solution! # # Note: e2s_table and s2e_table are not costly real table lookups, # but just the arithmetic mapping functions from above. def v2s_table(vertex: int, neighbor: int): # use the v2e table, but consider that we have 2 values per edge edge: int = v2e_table(vertex, neighbor) # second table lookup to check which value we actually want if e2v_table(edge, 0) == vertex: # we are coming from edge neighbor 0 return e2s_table(edge, 0) else: assert e2v_table(edge, 1) == vertex # we are coming from edge neighbor 1 return e2s_table(edge, 1) def s2v_table(sparse: int, neighbor: int) edge = s2e_table(sparse, neighbor) if e2s_table(edge, 0) == sparse: return e2v_table(edge, 0) else: assert e2s_table(edge, 1) == sparse return e2v_table(edge, 1) ``` (Handling of skip values is omitted). Note that it is very easy to avoid the second table lookup (`e2v_table(...)`) by precomputing a one-bit flag as meta information in each v2e table entry. Than we would have something like: ```python def v2s_table(vertex: int, neighbor: int): edge, src = v2e_table_with_src(vertex, neighbor) return 2 * edge + src ``` The same would also work for an $n$-tuple layout, just requiring some more bits of meta information to identify the source neighbor. That is, each neighbor access costs _at most_ one table lookup! Same as before! The biggest change is how the computations of sparse fields have to be represented (as stencils on the sparse location). But this is probably no issue as it is much more consistent with all the other computations. ## hacking Iterator: - if the following functions are defined: - `shift(offset, it)` - `deref(it)` - ... - reduce takes tuple(s) with same number of elements ```python= V2E_all = (shift(V2E,0), shift(V2E,1), ...) lap = (shift(I, 1), ...) shifted = shift(V2E_all)(inp) -> tuple(shift(V2E,0)(inp), shift(V2E,1)(inp), ...) shift(...)(it: Iter) -> Union[Iter, Tuple[Iter,...]] shift(...)(it: Tuple[Iter, ...]) -> Tuple[] shift(E2V)(shifted) -> tuple(shift) deref(shifted) reduce(lambda a,b: a+b, 0)(deref(sparse)) # sparse is iterator of tuples, deref(sparse) is tuple of values reduce(lambda a,b: a+b, 0)(shift(V2E)(inp)) #shift(V2E)(inp) is tuple of iterators def transform_deref(inp: Tuple[Iterator]): return (deref(i) for i in inp if can_deref(i)) def reduce(...): for i in len(it): if not can_deref(it[0]): return state = op(state, ...) return state ```