Cartesian Product of 2 lists ============================ Settings -------- Considering 2 lists of np.ndarray ``` fore_states: List[np.ndarray] post_states: List[np.ndarray] ``` The arrays in `fore_states` are essentially $1\times2$ row vectors, and the arrays in `post_states` are $2\times1$ column vectors, so that when multiplying one row vector with one column vector from each lists, that results in a scalar number. Question -------- I'm looking for the catesian product of these 2 lists. For instance, ``` fore_states: List[np.ndarray] = [arraya, arrayb, arrayc] post_states: List[np.ndarray] = [array1, array2, array3] result = np.array( [ [arraya @ array1, arraya @ array2, arraya @ array3], [arrayb @ array1, arrayb @ array2, arrayb @ array3], [arrayc @ array1, arrayc @ array2, arrayc @ array3] ] ) ``` These 2 lists are in same length for sure, so 2 lists in length $N$ sould return an $N \times N$ array. In practice, these 2 lists are very long $(N \gg 10000)$, so I have some performance issue. Current implementation ---------------------- ``` @dataclass class WrappedArray: arr: np.ndarray def __matmul__(self, other: WrappedArray) -> int | float: return (self.arr @ other.arr).item() fore_states: List[WrappedArray] = ... post_states: List[WrappedArray] = ... cartesian_prod = np.array( np.meshgrid(fore_states, post_states) ).T.reshape(-1, 2) iterable = ( fore_state @ post_state for fore_state, post_state in cartesian_prod[:, ] ) result = ( np.fromiter(iterable, dtype=float, count=N*N) .reshape((N, N)) ) ``` This is slow mainly because of an loop in $\mathcal{O}(N^2)$. Update ------ After some thoughts, I realize this is essentially a matrix problem, $$ C_{\alpha\beta} = \sum_{i=1}^{2} A_{i\alpha} B_{i\beta} \quad \text{.} $$ So one can actually do, ``` fore_states: List[np.ndarray] post_states: List[np.ndarray] fore_mat = np.array(fore_states).reshape(N, 2) post_mat = np.array(post_states).reshape(N, 2) result = fore_mat @ post_mat.T ``` This is the vectorized code and so as a result it is also much faster.