# Linear algebra roadmap
Linear algebra in Oscar land needs some love. Many conventions have been taken from flint almost verbatim, but they make little sense mathematically. For example, `rref(A::ZZMatrix)` does not return a row reduced echelon form over $\mathbf Z$ (the Hermite form), but a row reduced echelon form of $A$ considered as a rational matrix.
Or, `rref` also returns the rank at the moment (which makes little sense. SB: Why? It was quite useful for me). TH: My main point is that it is unexpected and not what the name `rref` (or the new `echelon_form`) suggests.
CF: also, a proper rref has exactly rank many rows, which are all non-zero. This is a flint artifact: flint computes rref via in-place LU and returns a matrix of the original size. SB: Then how do I find out the rank? Do I have to manually check how many nonzero rows there are? I can of course ask for the rank explicitly ... but that could be inefficient, since the rank function does not know the matrix is in rref and will recompute it.
| Topic | Design decision | Implementation
----|----|----
Matrix normal forms | yes | yes
Rank | yes | ?
Kernel | yes | yes
Solving | yes| yes
Solving with same left/right side | yes | yes
Indexing | yes | yes
Iteration | (yes) | no change :(
"Rational" | ? | ?
Misc | ? | ?
Renaming | ? | ?
CF: rational: I'd also like two general features:
- solve/kernel/... over a different ring (Z -> Q, Z -> Fp). Converting explicitly (via `map_entries`) is bad for Z -> Q and badly implemented for Z -> Fp.
- some general feature to solve over the quotient field and get either a rational solution OR a pair: an integral solution and a common denominator. Here, I'd favour either not promising anything or returning a simplified "fraction".
<b>TH: Is this something that is already there and needs to be adjusted? Or can this be added later?</b>
## User API
### Matrix normal forms
#### Fields:
| Function | Output |
|----------|--------|
| `echelon_form(A::MatElem; reduced::Bool = true, shape = :upper, trim = false)` | An upper resp. lower row echelon form of `A`. If `trim` is `true`, then zero rows are removed. |
| `echelon_form_with_transformation(A; reduced, shape)` | Same as above without the `trim` argument. Also return $U$ such that $UA$ is the echelon form. |
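A possible call pattern over a field, using the proposed names from the table above (nothing of this exists under these names yet; it is only a sketch of the intended behavior):
```
A = QQ[1 2 3; 2 4 6; 1 1 1]
R = echelon_form(A)                          # reduced upper row echelon form, same size as A
Rt = echelon_form(A; trim = true)            # zero rows removed, so rank many rows
R2, U = echelon_form_with_transformation(A)
U * A == R2                                  # U records the row operations
```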
#### Principal ideal domains:
| Function | Output |
|----------|--------|
| `hermite_form(A::MatElem; reduced::Bool = true, shape = :upper, trim = false, modulus)` | An upper resp. lower Hermite normal form of `A`. If `trim` is `true`, then zero rows are removed. |
| `hermite_form_with_transformation(A; reduced, shape)` | Same as above without the `trim` argument. Also return $U$ such that $UA$ is the Hermite form. |
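The same over `ZZ`, again only a sketch of the proposed interface; this is where the "Hermite form" expectation from the introduction would land:
```
A = ZZ[2 4; 6 8]
H = hermite_form(A)                           # upper Hermite normal form over ZZ
H2, U = hermite_form_with_transformation(A)
U * A == H2                                   # U is unimodular
```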
#### Howell normal form
For principal ideal rings.
`howell_form` etc.
#### Popov normal form
For polynomial rings over fields.
`popov_form` etc.
#### Continuation of echelon forms:
<b>TH: Claus, can you specify the input/output?: Either `echelon_form(A; first_rows_are_echelonform = n)` or `echelonize_context(A)` and then `push!(...)` and `finish(A)`. </b>
Anyway, this is not too critical, as this can be added post 1.0. (We don't have it at the moment.)
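A purely hypothetical sketch of the second variant, with the names taken from TH's question above (none of this exists yet):
```
C = echelonize_context(A)   # echelonize the rows of A, keep the internal state
push!(C, B)                 # feed in additional rows, reusing the work done so far
E = finish(C)               # echelon form of the combined rows
```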
#### Open question:
- allow both rref and ref, as the second "r" (the reduction step) can dominate the runtime
<b>TH: Claus, do you want it as part of the name or not? That is do you want `echelon_form(;reduced::Bool)` or `echelon_form`/`reduced_echelon_form`? And for HNF? `hermite_form(; reduced::Bool)` or `hermite_form`/`reduced_hermite_form`/`weak_hermite_form`?
</b>
We go with the `reduced` keyword argument.
- maybe an interface to the "rational" version (for ZZ and similar)
<b>TH: Claus, What does that mean?</b> (Postponed)
- is this about matrices or modules, i.e. how do we sell the pseudo_ stuff of Cohen? TH: Only matrices. No modules and no pseudo stuff.
### Rank
Implement only for PIDs/Euclidean domains?
### Kernel
Function | Output |
---------|--------|
`kernel(A::MatElem)` | Returns a matrix whose rows form a generating set of the left kernel of $A$. If the ring is a domain, it should be minimal. |
Comment:
- I write minimal generating set (and not basis), because I also want it to make sense over say $\mathbf Z/n\mathbf Z$.
- CF: I do not know if even for domains minimality can be guaranteed/tested/assured in all cases (MPoly)
- It should not contain zero rows.
- SB: "it should be minimal" at least for fields and PIDs we could promise minimality? TH: Yes, good idea. Let's promise minimality in those cases. We can think about the rest later.
- a second interface returning spaces? Which would also allow matrices over maximal orders or polynomial rings?
- or do we allow also PMats (or matrices over maximal orders): a non-minimal generating set is "easy": TH: No, this is about matrices.
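For illustration, the intended behavior of the proposed `kernel` (a sketch, assuming the left-kernel convention from the table above):
```
A = ZZ[1 2; 2 4; 3 6]
K = kernel(A)     # rows of K generate the left kernel, so K*A == 0
iszero(K * A)     # true; K contains no zero rows
```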
### Solving
Function | Output |
---------|--------|
`solve(A::MatElem, b::Vector; side = :left)` | Solve $Ax = b$ resp. $xA = b$ for $x$, depending on `side`. |
`can_solve(A::MatElem, b::Vector; side)` | Return whether the system has a solution. |
`can_solve_with_solution(A::MatElem, b::Vector; side)` | Return `fl, x` with `fl` the flag and `x` a solution (if one exists). |
`can_solve_with_solution_and_kernel(A::MatElem, b::Vector; side)` | Return `fl, x, K`, with `K` the right resp. left kernel. |
Then the same functions also with `b` of type `MatElem`.
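A sketch of the intended usage; the names follow the table, and I am assuming `side = :right` means "solve $Ax = b$" (consistent with the solve-context section below):
```
A = QQ[1 2; 3 4; 5 6]
b = QQ.([1, 2, 3])
fl, x = can_solve_with_solution(A, b; side = :right)        # fl == true here and A*x == b
fl, x, K = can_solve_with_solution_and_kernel(A, b; side = :right)
```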
Open question:
- What should we do if the user wants to supply more information like uniqueness of the solution, or that the matrix has full rank? Should these be keyword arguments? Or `solve_full_rank` etc? (Postponed)
- there is also the fun with the triu, tril, right/left versions.
They are necessary for the implementation and efficient use of the in-place lu stuff...
<b>TH: Yes, but what do you suggest? Keyword arguments or separate functions (`can_solve_with_solution_lower_triangular` etc.)?</b> I misunderstood, this is about the weird functions that only take the triangular part of the input matrix and violate `Ax = b`.
### Solving with same left/right side
We need an interface for the situation where one wants to solve `Ax = b` for fixed `A` with `b` only becoming available later.
| Function | Output |
|-|-|
| `C = solve_context(A)` | Dummy objects to keep track of things |
| `solve(C, b)` | Solve `xA = b` |
| `can_solve(C, b)` | ... |
| `can_solve_with_solution(C, b)` | ... |
| `can_solve_with_solution_and_kernel(C, b)` | ... |
| `kernel(C)` | ... |
Open question:
- Better name than `solve_context`? CF: solve_init, returning a SolveCtx? <b>TH: `solve_init` it is.</b>
- Should the `side` be part of the context `C`? Or should we allow both `solve(C, x; side = :left)` *and* `solve(C, x; side = :right)`? CF: No. In general, this cannot be done efficiently; you need a different normal form for right/left solving. <b>TH: Yes, but the context can compute and store them on demand, so this would not be a problem for efficiency. So what do you prefer, if it were possible to do `solve(C, b, :right)` and `solve(C, b, :left)` with the same context? Or add the flag in the init phase?</b> It will be lazy, so the side can be a flag.
Comment:
- We already have a prototype of this in Hecke.
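A sketch of the agreed direction (`solve_init`, lazy internal normal forms, `side` as a per-call flag); the details are hypothetical:
```
A = QQ[1 2; 3 4]
C = solve_init(A)                           # cheap, no normal form computed yet
x = solve(C, QQ.([1, 0]); side = :right)    # computes and caches the data for right solving
y = solve(C, QQ.([0, 1]); side = :right)    # reuses the cached data
K = kernel(C)
```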
### Indexing
We support many of the standard operations already:
```
A[i, j]
A[i] # if A is a row or a vector
A[i:j, k:l]
A[i, :]
A[:, j]
```
Open question:
- `A[i, :]` returns something of the same type as `A`. Should it return a `Vector`? TH: I would like it to return a `Vector`. SB: Returning sometimes a Julia vector and sometimes an OSCAR matrix seems inconsistent to me.
- SB: should we support `rows(A)` and `columns(A)`? They would return a vector of row/column matrices. CF: ??? Lists of rows or columns? An iterator over rows/columns as vectors(?)
- I dislike the vector interface immensely, but it should be consistent. Well, it's more a case of anything that currently works should continue to work. What about `vcat(M[i, :] for i = [1,3,5])` or similar? To me it is annoying that I cannot predict when a `Vector` behaves like anything 1-dim that makes sense, vs. when it really is an "n x 1" object. It is (maybe) useful, and people were asking for it, to support `Vector`s as 1-dim regardless of orientation, however it has limits (try `vcat`/`hcat`).
- SB: the julia vector vs n x 1 and 1 x n matrix keeps annoying me when I do linear algebra. I have to convert everything back and forth all the time depending on what the functions want.
<b>TH</b>:
- The julia array indexing convention is that `A[i1,...,in]` has dimension equal to the sum of the dimensions of the indices `ij`, that is, `sum(length(size(ij)) for j in 1:n)`. There are many bad design decisions in julia, but I think this one is sensible. More concretely, if `A` is an `Array{T, 2}`:
- `A[1, 1]` has dimension `0`, so is scalar
- `A[[1], 1]` has dimension `1`, so is a `Vector`
- `A[[1], [1]]` has dimension `2`, so is an `Array{T, 2}`
- `A[1, :]` has dimension `1`, so is a `Vector`.
- `A[:, 1]` has dimension `1`, so is a `Vector`.
For Oscar matrices we are inconsistent:
- `A[1, 1]` is a scalar
- `A[[1, 2], [1, 2]]` is a matrix
- `A[[1, 2], 2]` is a matrix?
- `A[:, 2]` is a matrix?
The other option is that we make `A[1, 1]` return an Oscar `1x1` matrix, then we would also be consistent.
I don't think that there is a big problem with `vcat` or `hcat`. In the end, aren't these just submatrices? `vcat(M[i, :] for i = [1,3,5])` can be written either as `M[[1, 3, 5], :]` or `vcat(M[[i], :] for i in [1,3,5])`. One can always get a matrix by just doing `M[[i], :]` instead of `M[i, :]`. On the other hand, it is cumbersome to go from a `Vector` to a `1xn` or `nx1` matrix.
Let me summarize the arguments in favor of changing `A[[1, 2], 1]` resp. `A[:, 1]` to return `Vector`:
- We have functionality for `solve` and `*` which works for Oscar matrices and julia `Vector` (we did not have this when we introduced indexing)
- It is easy to get the old behavior: Just do `A[:, [i]]` instead of `A[:, i]`.
- (Consistency with `Array` conventions)
<b>TH: We will do it like this.</b>
- Julia has `eachrow` and `eachcol` functions. They return views, which behave like `Vector`s; we could in principle do the same?
```
julia> B = [1 2; 3 4];
julia> first(eachrow(B))
2-element view(::Matrix{Int64}, 1, :) with eltype Int64:
1
2
julia> ans[1] = 10
10
julia> B
2×2 Matrix{Int64}:
10 2
3 4
```
<b>This is not too critical for 1.0, since it can be added later.</b>
TH: We can do this with the same syntax and behavior.
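To spell out the convention we agreed on above (this is the target behavior, not what the code currently does):
```
A = ZZ[1 2; 3 4]
A[1, 1]       # scalar (ZZRingElem)
A[1, :]       # Vector with entries 1, 2
A[:, 2]       # Vector with entries 2, 4
A[[1], :]     # 1x2 Oscar matrix
A[1:2, 1:2]   # 2x2 Oscar matrix
```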
### Broadcasting
We need this! (Maybe TH can give pointers to Aaruni?)
```
B = f.(A)
A .= B
```
Comment:
- "Fusing" or something efficient is not important right now.
- Implementation is connected to the "Iteration" question.
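For reference, the Base hooks involved, illustrated on a toy `AbstractMatrix` wrapper (this is the pattern from the Julia manual on customizing broadcasting, not Oscar code; since `MatElem` is not an `AbstractArray`, a real implementation additionally needs `Broadcast.broadcastable`, `axes`, cartesian `getindex`, etc.):
```
struct ToyMat{T} <: AbstractMatrix{T}
    entries::Matrix{T}
end
Base.size(A::ToyMat) = size(A.entries)
Base.getindex(A::ToyMat, i::Int, j::Int) = A.entries[i, j]
Base.setindex!(A::ToyMat, x, i::Int, j::Int) = (A.entries[i, j] = x)

# Opt in with a custom style so that f.(A) returns a ToyMat again instead of a Matrix.
Base.BroadcastStyle(::Type{<:ToyMat}) = Broadcast.ArrayStyle{ToyMat}()
function Base.similar(bc::Broadcast.Broadcasted{Broadcast.ArrayStyle{ToyMat}}, ::Type{T}) where T
    ToyMat(similar(Matrix{T}, axes(bc)))
end

A = ToyMat([1 2; 3 4])
B = (x -> x^2).(A)    # ToyMat with entries 1 4; 9 16
A .= B                # in-place assignment of the entries
```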
### "Rational"
There is a lot of functionality where one has a matrix over a domain $R$ and wants the result over the field of fractions $K$. Examples are `solve_rational` or `nullspace_right_rational`. In the second case, the result is returned as a matrix over $R$, so it is not just an application of `map_entries`. Even better: it is a basis with entries in $R$ for the kernel over $K$, which is different from a basis of the kernel over $R$. We have examples where the $R$-span of `nullspace_right_rational` is strictly smaller than the kernel over $R$, because the saturation is missing.
Open questions:
- Keep those?
- Rename those to `rational_*`?
- see above: version with numerator and denominator? simplified?
<b>TH: Simplified is hard/impossible in general. So maybe the former? </b>
TH: Not clear if we want this or for which functions we want this. Claus' only application is solving.
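To make the "pair" variant concrete, a hypothetical contract (whether the existing `solve_rational` already behaves exactly like this is part of the question):
```
# x has entries in R, d is a common denominator in R, so x/d solves the system over K
x, d = solve_rational(A, b)
A * x == d * b
```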
### Misc
- sanitize pseudo_inverse vs (potentially) rational_inverse (TH: )
- trivial: purge solve_dixon from the planet: it returns s (in Z^n) and M such that As = b mod M. The size of M is calculated via the Hadamard bound, so there is no early abort.
- come up with a decent interface for the various flavours:
- det, det_fflu, det_df, det_clow, det_popov
- lu, lu!, fflu, fflu!, Strassen.???
- charpoly_????
TH: Do this via an `algorithm::Symbol` keyword. (Alternative would be a `Det.` submodule).
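A minimal sketch of the keyword variant; the algorithm symbols and the `_det_*` helpers are placeholders, not a fixed list:
```
function det(A::MatElem; algorithm::Symbol = :default)
    algorithm == :default && return _det_default(A)   # pick a sensible default per ring
    algorithm == :fflu    && return _det_fflu(A)      # fraction-free LU
    algorithm == :clow    && return _det_clow(A)      # clow sequences
    error("unknown algorithm :$algorithm")
end
```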
### Renaming
Old | New
----|----
`det` | `determinant`?
`tr` | `trace`?
`minpoly`? | `minimal_polynomial`?
`charpoly` ? | `characteristic_polynomial`?
`lu` | `lu_decomposition`
`fflu` | `lu_decomposition_fraction_free`
...
We want the long forms, and maybe keep the short forms to be in accord with julia.
## Controversial things
### Iteration (and broadcasting)
Background: There was once a push to make Nemo/Oscar matrices implement as much as possible from the "array interface" (which is underspecified and does not really exist) from julia Base. One concrete step was
https://github.com/Nemocas/AbstractAlgebra.jl/pull/540,
which implemented iteration and indexing. Note that the length of the discussion shows that this has been and still is controversial.
At the moment we have `for x in A`, which iterates the elements in the *wrong* order:
```
julia> A = ZZ[1 2; 3 4]
[1 2]
[3 4]
julia> for x in A; @show x; end
x = 1
x = 3
x = 2
x = 4
```
I think we have to do it this way, so that we have
```
julia> [a for a in A]
2×2 Matrix{ZZRingElem}:
1 2
3 4
julia> A == matrix(ZZ, collect(A))
true
```
The big disadvantage is that
- We are losing
```
julia> matrix(ZZ, 2, 2, vec(A)) == A
false
```
- It is inefficient for flint matrices. CF: well... for nmod_mat = gfp_mat: yes. For ZZMatrix with large entries: no, as the entries are everywhere anyway. For QQMatrix my guess is it never matters.
Open question:
- Does anyone actually use `for a in A`? Should we remove `for a in A`? Should we remove it for everything that is not a single row or column?
- If we keep it, should it be row major iteration?
- Should we remove it and add something like `for a in elements(A)` with `elements(A; order = :rowmajor)` the default?
- reshape - when we can? (i.e. non-permuted rows, no view)
<b>TH: Did you ever miss this?
I really dislike the `reshape` in Base:
```
julia> A = [1 2; 3 4];
julia> B = reshape(A, (1, 4))
1×4 Matrix{Int64}:
1 3 2 4
julia> B[1, 1] = 10
10
julia> A
2×2 Matrix{Int64}:
10 2
3 4
```
Do you want the same nonsensical behavior?</b>
- view with disjoint row ranges for flint, arbitrary for generic? TH: No, we need to be consistent.
- support, possibly, for packed matrices? TH: Implementation detail, can be added later.
Comment:
- I think this might also be relevant to implementing broadcasting.
<b>TH: </b>
# Building blocks:
- efficient, in-place row/column addition, butterfly operations (rows/cols transformed at the same time)
- findfirst_nonzero
- iszero_row, is_zero_column, is_zero_entry
- for flint matrices, uniform(?) interface to pointer based implementations
- spinning as a primitive? echelon_continue?
# Correctness:
e.g. determinant over Z (Q, NumberField, any situation where CRT applies) is much easier with unproven results...
# normal forms
- ref/rref/ hnf/ howell (all echelon)
- popov
- jordan, frobenius, ... (names and definitions)
- spectrum, eigen space, eigen value (definitions: require roots or more general)
- imprecise rings (padic, real): QR, LU, ???, cholesky?
# Obsolete questions/material
Questions:
- Are the names OK? We want something "generic". SB: Will we keep special cases such as `hermite_form`?
- I avoided adding `row` to the name, because I did not plan on having methods for column echelon forms.
- Should we promise anything for things which are not PIDs?
- Should we promise anything about the pivots for the reduced version, if the coefficient ring is not a field?
- What do we promise in the not so obvious cases? I am thinking about $\mathbf{Z}/n\mathbf{Z}$, where one usually wants a "Howell normal form". There is a nice overview by Storjohann in his PhD thesis (p. 28, https://doi.org/10.3929/ethz-a-004141007):

Note that "(r2)" is the ususal reducedness of the pivots and for the elements above the pivots.
- the popov_form is missing - and weird as it is not a triangular one
- one way or other, the possibility to continue an echelon form (i.e. promising that the top n rows are reduced already)