Linear algebra roadmap

# Linear algebra roadmap Linear algebra in Oscar land needs some love. Many conventions have been taken from flint almost verbatim, but they make little sense mathematically. For example, `rref(A::ZZMatrix)` does not return a row reduced echelon form (Hermite form), but a row reduced echelon form of $A$ considered as a rational matrix. Or, the `rref` also returns the rank at the moment (which makes little sense SB: Why? It was quite useful for me). TH: My main point would be, that it is unexpected and not what the name `rref` (or the new new `echelon_form`) suggests. CF: also a proper rref has exactly rank many rows which are all non-zero. This is a flint artifact which computes rref via inplace LU and returns a matrix of the original size.. SB: Then to find out the rank? I would have to manually check how many nonzero rows there are? I can ofcourse ask for rank explicitly ... but that could be inefficient since the rank function does not know the matrix is rref and will recompute it. | Topic | Design decision | Implementation ----|----|---- Matrix normal forms | yes | yes Rank | yes | ? Kernel | yes | yes Solving | yes| yes Solving with same left/right side | yes | yes Indexing | yes | yes Iteration | (yes) | no change :( "Rational" | ? | ? Misc | ? | ? Renaming | ? | ? CF: rational: I'd also like two general features: - solve/ kernel/ ... over a different ring (Z->Q, Z -> Fp). Converting explicitly (map_entries) is e.g. for Z->Q bad and for Z-> Fp badly implemented - some general feature to solve over the quotient field and getting a rational solution OR a pair: integral solution and common denominator. Here, I'd favour to either not promise anything or return a simplified "fraction" TH: Is this something that is already there and needs to be adjusted? Or can this be added later? ## User API ### Matrix normal forms #### Fields: | Function | Output | |----------|--------| | `echelon_form(A::MatElem; reduced::Bool = true, shape = :upper, trim = false)` | An upper resp. lower row echelon form of `A`. If `trim` is `true`, then zero rows are removed. | `echelon_form_with_transformation(A; reduced, shape)` | Same as above wihout the trim argument. Also return $U$ such that $UA$ is the thing. | #### Principal ideal domains: | Function | Output | |----------|--------| | `hermite_form(A::MatElem; reduced::true, shape = :upper, trim = false, modulus)` | An upper resp. lower Hermite normal form of `A`. If `trim` is `true`, then zero rows are removed. | `hermite_form_with_transformation(A; reduced, shape)` | Same as above wihout the trim argument. Also return $U$ such that $UA$ is the thing. | #### Howell normal form For principal ideal rings. `howell_form` etc. #### Popov normal form For polynomial rings over fields. `popov_form` etc. #### Continuation of echelon forms: TH: Claus, can you specify the input/ouput?: Either `echelon_form(A; first_rows_are_echelonform = n)` or `echelonize_context(A)` and then `push!(...)` and `finish(A)`. Anyway, this is not too critical, as this can be added post 1.0. (We don't have it at the moment.) #### Open question: - allow rref and ref as the second r can dominate the runtime TH: Claus, do you want it as part of the name or not? That is do you want `echelon_form(;reduced::Bool)` or `echelon_form`/`reduced_echelon_form`? And for HNF? `hermite_form(; reduced::Bool)` or `hermite_form`/`reduced_hermite_form`/`weak_hermite_form`? We go with the `reduced` keyword argument. - maybe an interface to the "rational" version (for ZZ and similar) TH: Claus, What does that mean? (Postponed) - is this matrices or modules, ie. how do we sell the pseudo_ stuff of Cohen? TH: Only matrices. No modules and no pseudo stuff. ### Rank Implement only for PIDs/Euclidean domains? ### Kernel Function | Output | ---------|--------| `kernel(A::MatElem)` | Returns a matrix with rows a generating set of the left kernel of $A$. If the ring is a domain, it should be minimal. | Comment: - I write minimal generating set (and not basis), because I also want it to make sense over say $\mathbf Z/n\mathbf Z$. - CF I do not know if even for domains minimality can be guaranteed/ tested/ assured in all cases (MPoly) - It should not contain zero rows. - SB: "it should be minimal" at least for fields and PIDs we could promise minimality? TH: Yes, good idea. Let's promise minimality in those cases. We can think about the rest later. - a second interface returning spaces? Which would also allow matrices over maximal orders or polynomial rings? - or do we allow also PMats (or matrices over maximal orders): a non-minal generating set is "easy": TH: No, this is about matrices. ### Solving Function | Output | ---------|--------| `solve(A::MatElem, b::Vector; side = :left)` | Solve $Ax = b$ for $x$. | `can_solve(A::MatElem, b::Vector; side)` | Return whether $Ax = b$ has a solution | `can_solve_with_solution(A::MatElem, b::Vector; side)` | Return `fl, b` with `fl` the flag `b` a solution | `can_solve_with_solution_and_kernel(A::MatElem, b::Vector; side)` | Return `fl, b, K`, with `K` the right resp. left kernel | Then the same functions also with `b` of type `MatElem`. Open question: - What should we do if the user wants to supply more information like uniqueness of the solution, or that the matrix has full rank? Should these be keyword arguments? Or `solve_full_rank` etc? (Postponed) - there is also the fun with the triu, tril, right/ left versions. they are neccessary for the implementation and efficient use of the inplace lu stuff... TH: Yes, but what do you suggest? Keyword arguments or seperate functions (`can_solve_with_solution_lower_triangular` etc)? I misunderstood, this is about the weird functions that only take the triangular part of the input matrix and violate `Ax = b`. ### Solving with same left/right side We need an interface for the situation, where one wants to solve `Ax = b` for `A` fixed and `b` popping up later. | Function | Output | |-|-| | `C = solve_context(A)` | Dummy objects to keep track of things | | `solve(C, b)` | Solve `xA = b` | | `can_solve(C, b)` | ... | | `can_solve_with_solutition(C, b)` | ... | | `can_solve_with_solution_and_kernel(C, b)` | ... | | `kernel(C)` | ... | Open question: - Better name than `solve_context`? CF: solve_init, returning a SolveCtx? TH: `solve_init` it is. - Shoud the `side` be part of the context `C`? Or should we allow both `solve(C, x; side = :left)` *and* `solve(C, x; side = :right)`? CF:No. In general, this cannot be done efficiently. You need a different normal form for right/ left solve TH: Yes, but the context can compute and store them on demand. So this would not be a problem for efficiency. So what do you prefer if it could be possible to do `solve(C, b, :right)` and `solve(C, b, :left)` with the same context? Or add the flag in the init phase? It will be lazy, so the side can be a flag. Comment: - We already have a prototype of this in Hecke. ### Indexing We support many of the standard operations already: ``` A[i, j] A[i] # if A is a row or a vector A[i:j, k:l] A[i, :] A[:, j] ``` Open question: - `A[i, :]` returns something of the same type as `A`. Should it return a `Vector`? TH: I would like it to return a `Vector`. SB: Returning sometimes a Julia vector and sometimes an OSCAR matrix seems inconsitent to me. - SB: should we support `rows(A)` and `columns(A)`? They would return a vector of row/column matrices. CF??? lists of rows or columns? An iterator for rows/ columns as vectors(?) - I dislike the vector interface immensly, but it should be consistent. Well, it's more a case of anything that currently works should continue to work. What about vcat(M[i, :] for i = [1,3,5]) or similar? To me itis annoying that I cannot predict when a Vector behaves like anything 1-dim that makes sense, vs. it really is a "n x 1" object. It is (maybe) useful, and people where asking for it, to support Vectors as 1-dim, regardless of orientation, however it has limits (try vcat/ hcat) - SB: the julia vector vs n x 1 and 1 x n matrix keeps annoying me when I do linear algebra. I have to convert everything back and forth all the time depending on what the functions want. TH: - The julia array rule convention, that `A[i1,...,in]` has dimension that sum of the dimensions of the indices `ij`, that is, `sum(length(size(ij)) for j in 1:n)`. There are many bad design decisions in julia, but I think this is sensible. More concretely, if `A` is an `Array{T, 2}`: - `A[1, 1]` has dimension `0`, so is scalar - `A[[1], 1]` has dimension `1`, so is a `Vector` - `A[[1], [1]]` has dimension `2`, so is an `Array{T, 2}` - `A[1, :]` has dimension `1`, so is a `Vector`. - `A[:, 1]` has dimension `1`, so is a `Vector`. For Oscar matrices we are inconsistent: - `A[1, 1]` is a scalar - `A[[1, 2], [1, 2]]` is a matrix - `A[[1, 2], 2]` is a matrix? - `A[:, 2]` is a matrix? The other option is that we make `A[1, 1]` return an Oscar `1x1` matrix, then we would also be consistent. I don't think that there is a big problem with `vcat` or `hcat`. In the end aren't these just submatrices? `vcat(M[i, :] for i = [1,3,5])` can be written either as `M[[1, 3, 5], :]` or `vcat(M[[i], :] for i in [1,3,5])`. One can always get a matrix by just doing `M[[i], :]` instead of `M[i, :]`. On the other hand, it cumbersome to go from `Vector` to `1xn` or `nx1` matrix. Let me summarize the arguments in favor of changing `A[[1, 2], 1]` resp. `A[:, 1]` to return `Vector`: - We have functionality for `solve` and `*` which works for Oscar matrices and julia `Vector` (we did not have this when we introduced indexing) - It is easy to get the old behavior: Just do `A[:, [i]]` instead of `A[:, i]`. - (Consistency with `Array` conventions) TH: We will do it like this. - Julia has an `eachrow` and `eachcolumn` function. Their return views of `Vector`, which we could in principle also do? ``` julia> B = [1 2; 3 4]; julia> first(eachrow(B)) 2-element view(::Matrix{Int64}, 1, :) with eltype Int64: 1 2 julia> ans[1] = 10 10 julia> B 2×2 Matrix{Int64}: 10 2 3 4 ``` This is not too critical for 1.0, since it can be added later. TH: We can do this with the same syntax and behavior. ### Broadcasting We need this! (Maybe TH can give pointers to Aaruni?) ``` B = f.(A) A .= B ``` Comment: - "Fusing" or something efficient is not important right now. - Implementation is connected to the "Iteration" question. ### "Rational" There is a lot functionality, where one has a matrix over a domain $R$ and wants the result over the field of fractions $K$. Examples are `solve_rational` or `nullspace_right_rational`. In the second case, the result is returned as a matrix over $R$, so that it is not just an application of `map_entries`. Even better: it is a basis in R for the K-basis which is different from the R-basis. We have examples where the span of the nullspace_rational is strictly less than nullspace as the saturation is missing Open questions: - Keep those? - Rename those to `rational_*`? - see above: version with numerator and denominator? simplified? TH: Simplified is hard/impossible in general. So maybe the former? TH: Not clear if we want this or for wich functions we want this. Claus' only application is solving. ### Misc - sanitize pseudo_inverse vs (potentially) rational_inverse (TH: ) - trivial: purge solve_dixon from the planet: it returns s (in Z^n) and M s.th. As = b mod M. The size of M is caluclated via Hadamard, so no early abort. - come up with a decent interface for the various flavours: - det, det_fflu, det_df, det_clow, det_popov - lu, lu!, fflu, fflu!, Strassen.??? - charpoly_???? TH: Do this via an `algorithm::Symbol` keyword. (Alternative would be a `Det.` submodule). ### Renaming Old | New ----|---- `det` | `determinant`? `tr` | `trace`? `minpoly`? | `minimal_polynomial`? `charpoly` ? | `characteristic_polynomial`? `lu` | `lu_decomposition` `fflu` | `lu_decomposition_fraction_free` ... We want the long forms, and maybe keep the short forms to be in accord in julia. ## Controversial things ### Iteration (and broadcasting) Background: There was once a push to make Nemo/Oscar matrices implement as much as possible from the "array interface" (which is underspecified and does not really exist) from julia Base. One concrete step was https://github.com/Nemocas/AbstractAlgebra.jl/pull/540, which implemented iteration and indexing. Note that the length of the discussion shows that this has been and still is controversial. At the moment we have `for x in A`, which iterates the elements in the *wrong* order: ``` julia> A = ZZ[1 2; 3 4] [1 2] [3 4] julia> for x in A; @show x; end x = 1 x = 3 x = 2 x = 4 ``` I think we have to do it this way, so that we have ``` julia> [a for a in A] 2×2 Matrix{ZZRingElem}: 1 2 3 4 julia> A == matrix(ZZ, collect(A)) true ``` The big disadvantage is that - We are loosing ``` julia> matrix(ZZ, 2, 2, vec(A)) == A false ``` - It is inefficient for flint matrices CF: well.. for nmod_mat=gfp_mat: yes. For ZZMatrix with large entries: no, as the entries are everywhere. For QQmatrix my guess it it never matters Open question: - Does anyone actually use `for a in A`? Should we remove `for a in A`? Should we remove for non-rows or non-columns? - If we keep it, should it be row major iteration? - Should we remove it and add something like `for a in elements(A)` with `elements(A; order = :rowmajor)` the default? - reshape - when we can? (ie. non-permuted rows, no view) TH: Did you ever miss this? I really dislike the `reshape` in Base: ```julia> A = [1 2; 3 4]; julia> B = reshape(A, (1, 4)) 1×4 Matrix{Int64}: 1 3 2 4 julia> B[1, 1] = 10 10 julia> A 2×2 Matrix{Int64}: 10 2 3 4 ``` Do you want the same nonsensical behavior? - view with disjoint row ranges for flint, arbitrary for generic? TH: No, we need to be consistent. - support, possibly, for packed matrices? TH: Implementation detail, can be added later. Comment: - I think this might also be relevant to implementing broadcasting. TH: # Building blocks: - efficient, in place, addrows, columns, butterfly (rows/ cols transformed at the same time) - findfirst_nonzero - iszero_row, is_zero_column, is_zero_entry - for flint matrices, uniform(?) interface to pointer based implementations - spinning as a primitive? echelon_continue? # Correctness: e.g. determinant over Z (Q, NumberField, any situation where CRT applies) is much easier with unproven results... # normal forms - ref/rref/ hnf/ howell (all echelon) - popov - jordan, frobenius, ... (names and definitions) - spectrum, eigen space, eigen value (definitions: require roots or more general) - imprecise rings (padic, real): QR, LU, ???, cholesky? # Obsolete questions/material Questions: - Are the names OK? We want something "generic". SB: Will we keep special cases such as `hermite_form`? - I avoided adding `row` to the name, because I did not plan on having methods for column echelon forms. - Should we promise anything for things which are not PIDs? - Should we promise anything about the pivots for the reduced version, if the coefficient ring is not a field? - What do we promise in the not so obvious cases? I am thinking about $\mathbf{Z}/n\mathbf{Z}$, where one usually wants a "Howell normal form". There is a nice overview by Storjohann in his PhD thesis (p. 28, https://doi.org/10.3929/ethz-a-004141007): ![Screenshot 2023-12-02 at 15.15.43](https://hackmd.io/_uploads/Hy0Bh2_B6.png) Note that "(r2)" is the ususal reducedness of the pivots and for the elements above the pivots. - the popov_form is missing - and weird as it is not a triangular one - one way or other, the possibility to contine an echelon form (ie. promising that the top n rows are reduced already)