Filling out details in Baker's _Comprehensive Course_

###### tags: `study notes` `number theory`

# Filling out details in Baker's _Comprehensive Course_

[TOC]

## Theorem 11.2

### "Then, by symmetry, we have ..."

We give two proofs of Baker's claim that for $f\in\mathcal{O}_K[X]$
$$F=\prod_{\sigma:K\hookrightarrow\Bbb{C}}f^\sigma\in\Bbb{Z}[X]$$
Baker claims that this follows "by symmetry". One proof was provided to me in a comment by user @reuns and relies on the __Primitive Element Theorem__ and the __Symmetric Function Theorem__; we give a second, more elementary proof based on Lagrange interpolation of $F$, but it is less in the "by symmetry" spirit. Both proofs can be found [on math.stackexchange][MSE question].

## Theorem 12.2

### Why λ_1 ⨉ ··· ⨉ λ_n = |det(A)| is the right condition

We denote by $\mathcal{L}(\cdot)$ the Lebesgue measure on $\Bbb{R}^n$. Consider the set
$$S = \Big\{ \mathbf{x}\in\Bbb{R}^n~\Big|~ A\mathbf{x}\in\prod_{i=1}^n\;[-\lambda_i,+\lambda_i] \Big\}$$
Then $A(S) = [-\lambda_1,+\lambda_1]\!\times\!\cdots\!\times\![-\lambda_n,+\lambda_n]$ and thus
$$|\det(A)|\cdot\mathcal{L}(S)=2^n\lambda_1\cdots\lambda_n$$
To apply the refinement of Minkowski's theorem with the lattice $\Bbb{Z}^n$ and the symmetric convex set $S$, we need that
$$\frac{\mathcal{L}(S)}{2^n}\geq 1$$
that is
$$\lambda_1\cdots\lambda_n\geq|\det(A)|.$$

## Theorem 12.3 (Dirichlet's Unit Theorem)

### Properly applying Minkowski's theorem

#### Minor discrepancies

Let $n=\dim_{\Bbb{Q}}K$. We recall (corollary of the primitive element theorem) that there are precisely $n$ ring (field) homomorphisms $\sigma_i:K\hookrightarrow\Bbb{C}$. We assume that $\sigma_1,\dots,\sigma_s$ are real and that the remaining $2t$ are complex with $\sigma_{s+t+j}=\overline{\sigma_{s+j}}$ for $j=1,\dots,t$. Applying Minkowski's theorem requires

* an $n$-dimensional real vector space $V$,
* $n$ $\Bbb{R}$-linear forms $L_1,\dots,L_n:V\to\Bbb{R}$,
* which have to be linearly independent.

This is not quite the setup we have here.
There are several reasons:

* $K$ is a _rational_ vector space,
* the $\sigma_i$ are $\Bbb{Q}$-linear,
* and they are _complex valued_.

Fixing these discrepancies is easy enough. I highlighted them merely to pinpoint precisely what made me uneasy.

#### Linear independence of the field embeddings

The most important point, which is quite independent of the other points, is the _linear independence_ of the $\sigma_i\in\mathrm{Lin}_{\Bbb{Q}}(K,\Bbb{C})$. This follows from the fact that $\langle\alpha\mid\beta\rangle_K:=\mathrm{Tr}_{K/\Bbb{Q}}(\alpha\beta)$ defines a (symmetric) _nondegenerate bilinear form_ on $K$. Indeed, $\mathrm{Tr}_{K/\Bbb{Q}}(1)=n$ so that for all $\alpha\in K\setminus 0$, $\langle\alpha\mid\alpha^{-1}\rangle_K=n\neq 0$.

> __Lemma.__ The bilinear pairing $\langle\alpha\mid\beta\rangle_K:=\mathrm{Tr}_{K/\Bbb{Q}}(\alpha\beta)$ is nondegenerate.

Now if $\theta_\bullet=(\theta_1,\dots,\theta_n)$ is some $\Bbb{Q}$-basis of $K$, the matrix
$$S=\Big(\sigma_j(\theta_i)\Big)_{1\leq i,j\leq n}$$
is invertible since
$$S\cdot S^t=\underset{\theta_\bullet}{\mathrm{Mat}}\big(\langle\cdot\mid\cdot\rangle_K\big)\in\mathrm{GL}_n(\Bbb{Q}).$$
(The $(i,k)$ entry of $S\cdot S^t$ is $\sum_j\sigma_j(\theta_i)\sigma_j(\theta_k)=\mathrm{Tr}_{K/\Bbb{Q}}(\theta_i\theta_k)$.) In conclusion,

> __Lemma.__ The field embeddings $\sigma_1,\dots,\sigma_n:K\hookrightarrow\Bbb{C}$ are $\Bbb{Q}$-linearly independent.

#### Realification

When applying Minkowski we will implicitly apply it to $V=K\otimes_\Bbb{Q}\Bbb{R}$ and to the maps
$$\left\{\begin{array}{lclr}L_i & = & \sigma_i & \qquad(i=1,\dots,s)\\ L_{s+2j-1} & = & \mathrm{Re}(\sigma_{s+j}) & (j=1,\dots,t)\\L_{s+2j} & = & \mathrm{Im}(\sigma_{s+j}) & (j=1,\dots,t)\end{array}\right.$$
(we use the same notation for the $\Bbb{R}$-linear extensions of $\Bbb{Q}$-linear maps). We also notice that simple row operations convert the matrix
$$\big(L_i(\theta_j)\big)~\Rightarrow~\big(\sigma_i(\theta_j)\big)$$
at the cost of multiplying the determinant by $(-2i)^t$, up to a sign coming from reordering the rows. Since the matrix on the right is invertible, the matrix on the left is as well.
This is the "$(a_{ij})$"-matrix in Theorem 12.2.

### Covolume of ring of integers = discriminant of number field

In this subsection we investigate why, when applying Minkowski's theorem in Dirichlet's Unit Theorem, one requires $\lambda_1\cdots\lambda_n=\sqrt{|\mathrm{disc}(K)|}$ rather than $\lambda_1\cdots\lambda_n=\mathrm{covol}(\mathcal{O}_K)$. The answer is, unsurprisingly, that $\mathrm{covol}(\mathcal{O}_K)=\sqrt{|\mathrm{disc}(K)|}$ relative to a natural volume form associated to the nondegenerate bilinear form $\langle\cdot\mid\cdot\rangle_K$.

> __Proposition.__ Let $Q$ be a nondegenerate quadratic form on $V$, a finite dimensional real vector space. Then there exists a unique translation invariant measure $\mathcal{L}_Q$ on $V$ such that for any $Q$-orthonormal basis $(e_1,\dots,e_n)$ $$\mathcal{L}_Q\Big(\sum_{i=1}^n\;[0,1]e_i\Big)=1.$$

This is easy: $\mathcal{L}_Q$ is simply the Lebesgue measure on $V$ identified with $\Bbb{R}^n$ by choosing a $Q$-orthonormal basis. This is independent of the choice of $Q$-orthonormal basis. Indeed, for any base change matrix $P=\underset{\mathcal{B}}{\mathrm{Mat}}(\mathcal{C})$,
$$\underset{\mathcal{C}}{\mathrm{Mat}}(Q)={}^tP\,\underset{\mathcal{B}}{\mathrm{Mat}}(Q)\,P$$
Therefore, if $\mathcal{B}$ and $\mathcal{C}$ are both $Q$-orthonormal bases, taking determinants gives $|\det(P)|=1$, and $\mathcal{L}_Q$ is well-defined.

> __Proposition.__ Let $\Lambda$ be a (full rank, discrete) lattice in $V$; let $\theta_\bullet=(\theta_1,\dots,\theta_n)$ be a $\Bbb{Z}$-basis of $\Lambda$. Then setting $\mathrm{covol}_Q(\Lambda):=\mathcal{L}_Q\Big(\sum_{i=1}^n\;[0,1]\theta_i\Big)$, we have $$\Big|\det\big(\underset{\theta_\bullet}{\mathrm{Mat}}(Q)\big)\Big|=\mathrm{covol}_Q(\Lambda)^2$$

Let $\theta_\bullet$ be as above and let $e_\bullet=(e_1,\dots,e_n)$ be a $Q$-orthonormal basis of $V$.
We have
$$\begin{array}{rcl}\underset{\theta_\bullet}{\mathrm{Mat}}(Q) & = & {}^tP\,\underset{e_\bullet}{\mathrm{Mat}}(Q)\,P \\ & = & {}^tP\,\mathrm{Diag}(\pm1)\,P\end{array}$$
where $P=\underset{e_\bullet}{\mathrm{Mat}}(\theta_\bullet)$ is the matrix of the vectors of $\theta_\bullet$ relative to the basis $e_\bullet$. Also, if we define $g\in\mathrm{GL}(V)$ by $g(e_i)=\theta_i$, then
$$g\Big(\sum_{i=1}^n\;[0,1]e_i\Big)=\sum_{i=1}^n\;[0,1]\theta_i$$
and so $\mathrm{covol}_Q(\Lambda)=|\det(g)|$. Since $\underset{e_\bullet}{\mathrm{Mat}}(g)=P$ this gives
$$\mathrm{covol}_Q(\Lambda)=|\det(P)|.$$
Taking absolute values of determinants in the first display then yields $\big|\det\big(\underset{\theta_\bullet}{\mathrm{Mat}}(Q)\big)\big|=\det(P)^2=\mathrm{covol}_Q(\Lambda)^2$.

> __Corollary.__ If $K$ is a number field, $\mathrm{covol}_Q(\mathcal{O}_K)=\sqrt{|\mathrm{disc}(K)|}$.

The real vector space $V=K\otimes_\Bbb{Q}\mathbb{R}$ inherits the nondegenerate bilinear form $\langle\cdot\mid\cdot\rangle_K$ from $K$. The statement above is w.r.t. the associated volume form. Recall that the discriminant of $K$ is defined as the determinant
$$\begin{array}{rcl}\mathrm{disc}(K) & = & \det\big(\mathrm{Tr}_{K/\Bbb{Q}}(\theta_i\theta_j)\big) \\ & = & \det\big(\underset{\theta_\bullet}{\mathrm{Mat}}(Q)\big) \\ & = & \pm\det(P)^2\end{array}$$
of any integral basis $\theta_\bullet=(\theta_1,\dots,\theta_n)$ of $\mathcal{O}_K$. That is,
$$|\mathrm{disc}(K)|=\mathrm{covol}_Q(\mathcal{O}_K)^2.$$

### Kernel of the Log map

Consider the Log map
$$\mathrm{Log}_K:\left\{\begin{array}{ccl}\mathcal{O}_K\setminus 0 & \longrightarrow & \Bbb{R}^r \\ x & \longmapsto & \big(\ln|\sigma_{1}(x)|, \dots, \ln|\sigma_{s}(x)|, \ln|\sigma_{s+1}(x)|, \dots, \ln|\sigma_{r}(x)|\big)\end{array}\right.$$
where $r= s + t - 1$. Restricted to units, its kernel consists of those algebraic integers of $K$ all of whose "conjugates" lie on the unit circle.

> __Lemma.__ If $\xi\in\mathcal{O}_K$ has all its conjugates of modulus $1$, then $\xi$ is a root of unity, i.e. the kernel of $K$'s Log map is the group of roots of unity in $K$.
Clearly, every root of unity in $K$ is an algebraic integer and lies in the kernel of the Log map:
$$\mu_K\subset\ker(\mathrm{Log}_K)$$
($\mu_K$ is the group of all roots of unity in $K$). Conversely, if $x\in\mathcal{O}_K$ has all its conjugates of modulus $1$, then its minimal polynomial lies in a finite set of integer polynomials by Kronecker's argument (its coefficients are integers bounded in terms of $n$ alone, being elementary symmetric functions of the conjugates). Therefore $\ker(\mathrm{Log}_K)$ is a _finite_ subgroup of $\mathcal{U}_K$. If $\xi$ is in the kernel, then so are all its powers, and thus $\xi$ is a root of unity. If $N=\#\ker(\mathrm{Log}_K)$, then it follows that $\ker(\mathrm{Log}_K)\subset\mu_N$ is a subgroup of the group of $N$-th roots of unity. Equality follows from cardinality.

__Note.__ The lemma is wrong if one changes "$\xi\in\mathcal{O}_K$" to "$\xi\in K$": the point $\frac35+\frac45i$ and its (sole) conjugate $\frac35-\frac45i$ both have modulus $1$ yet aren't roots of unity: if they were, they would have to be integers in $\Bbb{Q}(i)=\Bbb{Q}(\sqrt{-1})$, which they are not since their coordinates aren't integers (note $-1\equiv 3~[4]$).

__Note.__ The Log map clearly makes sense as a map $K^\times\to\Bbb{R}^r$ where it defines a _group homomorphism_. For this note we only care about it along the subset $\mathcal{O}_K\setminus 0$.

### How and where "r = s + t - 1" is used in the proof

#### "r = s + t - 1" is enough

This is simply the observation that for any unit $u\in\mathcal{U}_K$,
$$1=|N_{K/\Bbb{Q}}(u)|=\Big|\prod_{\sigma:K\hookrightarrow\Bbb{C}}\sigma(u)\Big|=\prod_{i=1}^s|\sigma_i(u)|\cdot\prod_{j=1}^t|\sigma_{s+j}(u)|^2$$
so that
$$0=\sum_{i=1}^s\ln|\sigma_i(u)|+2\sum_{j=1}^t\ln|\sigma_{s+j}(u)|$$
In other words, the more natural "full Log map" $\mathrm{Log}_K'(u)=(\ln|\sigma_i(u)|)_{1\leq i\leq s+t}$ has its image in the hyperplane $\sum_{1\leq i\leq s} x_i + 2\sum_{s<j\leq s+t} x_j=0$. The "missing modulus" $\ln|\sigma_{r+1}(u)|=\ln|\sigma_{s+t}(u)|$ is redundant information.
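
As a quick numerical sanity check of the hyperplane relation above, here is a minimal Python sketch. The two fields and units are my own illustrative choices and do not come from Baker: $u=1+\sqrt2$ in $\Bbb{Q}(\sqrt2)$ (where $s=2$, $t=0$) and $u=\sqrt[3]{2}-1$ in $\Bbb{Q}(\sqrt[3]{2})$ (where $s=1$, $t=1$). The helper name `hyperplane_defect` is hypothetical.

```python
import cmath
import math

def hyperplane_defect(real_conjugates, complex_conjugates):
    """Return sum_i ln|sigma_i(u)| + 2 * sum_j ln|sigma_{s+j}(u)|.

    real_conjugates: images of u under the s real embeddings.
    complex_conjugates: images of u under one embedding chosen from each
    of the t pairs of complex conjugate embeddings.
    """
    real_part = sum(math.log(abs(x)) for x in real_conjugates)
    complex_part = sum(math.log(abs(z)) for z in complex_conjugates)
    return real_part + 2 * complex_part

# Illustrative choice: u = 1 + sqrt(2) in Q(sqrt 2), with s = 2 and t = 0.
sqrt2 = math.sqrt(2)
defect_quadratic = hyperplane_defect([1 + sqrt2, 1 - sqrt2], [])

# Illustrative choice: u = cbrt(2) - 1 in Q(cbrt 2), with s = 1 and t = 1.
cbrt2 = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
defect_cubic = hyperplane_defect([cbrt2 - 1], [cbrt2 * omega - 1])

# Both values should vanish up to floating-point rounding.
print(defect_quadratic, defect_cubic)
```

In the cubic case the single complex modulus is determined by the real one, which is exactly the redundancy that lets the Log map drop the last coordinate.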
#### How the precise value "r = s + t - 1" is used

The beginning of the proof works without modification for "shorter Log maps" (i.e. by including fewer than $r$ of the $\sigma_i$). The fact that $r=s+t-1$ isn't used until one invokes the proposition below.

Let $\eta_1,\dots,\eta_r\in\mathcal{U}_K$ be the family of units constructed in Dirichlet's Unit Theorem (DUT). By construction, the vectors $\mathrm{Log}_K(\eta_i)$, $i=1,\dots,r$, are _linearly independent_ in $\mathbb{R}^r$, and $\langle\eta_1,\dots,\eta_r\rangle$[^differing_conventions] is a free abelian group of rank $r=s+t-1$ within $\mathcal{U}_K$.

> __Proposition.__ The quotient group $\mathcal{U}_K/\langle\eta_1,\dots,\eta_r\rangle$ is finite.

__Proof.__ This is the proposition that is used in the proof of the DUT. Its proof is basically that of corollary 12.1. Let $u\in\mathcal{U}_K$:

* $u$ is equivalent (up to an element in $\langle\eta_1,\dots,\eta_r\rangle$) to a unit $v$ such that for all $\sigma:K\hookrightarrow\Bbb{C}$, $|\sigma(v)|\leq C$ for some absolute constant $C>0$;
* $v$'s minimal polynomial thus belongs to a finite set of integer polynomials;
* thus $v$ belongs to the finite set of the $K$-roots of these polynomials,
* and thus $\mathcal{U}_K/\langle\eta_1,\dots,\eta_r\rangle$ is finite.

The unit $v$ is constructed as in Baker's proof by taking a vector $\mathrm{Log}(\eta_1^{m_1}\cdots\eta_r^{m_r})$ near $\mathrm{Log}(u)$ in the (discrete, full rank) lattice $\mathrm{Log}(\langle\eta_1,\dots,\eta_r\rangle)\subset\Bbb{R}^r$ and setting
$$v=u\cdot\eta_1^{-m_1}\cdots\eta_r^{-m_r}.$$
Here "nearby" means w.r.t. the $\|\cdot\|_\infty$-norm on $\Bbb{R}^r$. We are not necessarily interested in the closest lattice point, merely in the fact that there is a uniform error bound $\delta>0$ on the coordinates $\big|\ln|\sigma_i(v)|\big|$. This is clear, for if $v$ is constructed by

1. expressing $\mathrm{Log}(u)$ in the $\Bbb{R}$-basis $\big(\mathrm{Log}(\eta_1), \dots, \mathrm{Log}(\eta_r)\big)$ as $$\mathrm{Log}(u)=\sum_{j=1}^r u_j\mathrm{Log}(\eta_j)$$
2. defining $m_j=\lfloor u_j\rfloor$,
3. setting $v=u\cdot\eta_1^{-m_1}\cdots\eta_r^{-m_r}$,

then $\mathrm{Log}(v)=\sum_{j=1}^r(u_j-m_j)\mathrm{Log}(\eta_j)$ with $0\leq u_j-m_j<1$, so that for $i=1,\dots,r$:
$$\begin{array}{rcl} \big|\ln|\sigma_i(v)|\big| & \leq & r\cdot\max_{j=1,\dots,r}\big|\ln|\sigma_i(\eta_j)|\big|\\ & \leq & \underbrace{r\cdot\max_{i=1,\dots,r}\max_{j=1,\dots,r}\big|\ln|\sigma_i(\eta_j)|\big|}_{=\delta} \end{array}$$

The only noteworthy point is how one deduces the boundedness of the "__single missing conjugate__" $|\sigma_{r+1}(v)|=|\sigma_{s+t}(v)|$. We get it from the previously made observation:
$$0=\sum_{i=1}^s\ln|\sigma_i(v)| + 2\sum_{j=1}^{t}\ln|\sigma_{s+j}(v)|\quad\text{thus}\quad\big|\ln|\sigma_{s+t}(v)|\big|\leq\underbrace{\sum_{i=1}^r\big|\ln|\sigma_i(v)|\big|}_{\leq r\delta}$$
(not bothering with the $\frac12$ factors one would expect on the first $s$ terms allows us not to have to distinguish between the cases $t>0$ and $t=0$.)

__This__ (trivial) computation is precisely where we use $r=s+t-1$: had we chosen a lower value, we would not have been able to untangle the remaining moduli. Taking $r=s+t-1$ leaves us a __single__ modulus to dominate. The remaining points are easy consequences of the boundedness of the conjugates.

[^differing_conventions]: We use here the notation $\langle g_1,\dots,g_r\rangle$ for $g_1,\dots,g_r\in G$ to denote the subgroup of a group $G$ generated by the $g_i$. This should not be confused with the ideal $\langle\eta_1,\dots,\eta_r\rangle\subset\mathcal{O}_K$ generated by the $\eta_i$.

### The conclusion of the proof

Once Baker knows that the quotient is finite, the proof is nearly done. We put $N=\#(\mathcal{U}_K/\langle\eta_1,\dots,\eta_r\rangle)$.
Then $\mathcal{U}_K^N\subset\langle\eta_1,\dots,\eta_r\rangle$[^langle_convention], where we set $\mathcal{U}_K^N=\{\epsilon^N\mid\epsilon\in\mathcal{U}_K\}$. We note that $\mathrm{Log}_K(\mathcal{U}_K^N)$ includes the full rank subgroup
$$\langle\mathrm{Log}_K(\eta_1^N),\dots,\mathrm{Log}_K(\eta_r^N)\rangle=N\langle\mathrm{Log}_K(\eta_1),\dots,\mathrm{Log}_K(\eta_r)\rangle$$
of $\langle\mathrm{Log}_K(\eta_1),\dots,\mathrm{Log}_K(\eta_r)\rangle$, hence has full rank itself. Using __Lemma 11.3__, there exists a basis $\big(\mathrm{Log}_K(\epsilon_1^N),\dots,\mathrm{Log}_K(\epsilon_r^N)\big)$ of $\mathrm{Log}_K(\mathcal{U}_K^N)$ such that its matrix w.r.t. the basis $\big(\mathrm{Log}_K(\eta_1),\dots,\mathrm{Log}_K(\eta_r)\big)$ is upper triangular. I don't believe the "upper triangular" property is important here, though.

[^langle_convention]: In this subsection we write $\langle v_1,\dots,v_n\rangle$ for the _subgroup_ of some group $\Lambda$ generated by elements $v_1,\dots,v_n\in\Lambda$. Here $\Lambda=\mathrm{Log}(\mathcal{U}_K)\subset\Bbb{R}^r$ is actually a lattice. There should be no possible confusion with the analogous notation for ideals.

If $u\in\mathcal{U}_K$ is a unit, then there exist $j_1,\dots,j_r\in\Bbb{Z}$ such that
$$\mathrm{Log}_K(u^N)=\sum_{k=1}^r j_k\cdot\mathrm{Log}_K(\epsilon_k^N)$$
therefore
$$u\cdot\epsilon_{1}^{-j_{1}}\cdots\epsilon_{r}^{-j_{r}}~=~\text{some root of unity}$$
(indeed, raising the LHS to the $N$-th power lands it in the kernel of the Log map; the kernel of the Log map is the set of roots of unity in $K$; the LHS is thus a root of unity in $K$) i.e.
$$u=\rho\cdot\epsilon_{1}^{j_{1}}\cdots\epsilon_{r}^{j_{r}}$$
for some root of unity $\rho\in\mu_K\subset\mathcal{U}_K$.
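
To see the decomposition $u=\rho\cdot\epsilon_1^{j_1}\cdots\epsilon_r^{j_r}$ in the smallest nontrivial case, here is a minimal Python sketch for $K=\Bbb{Q}(\sqrt2)$, where $r=1$ and $\mu_K=\{\pm1\}$. It assumes the standard fact (not proved in these notes) that $\epsilon_1=1+\sqrt2$ is a fundamental unit of $\Bbb{Z}[\sqrt2]$, so that one may take $N=1$. All function names are hypothetical, and the exponent is read off from the Log map in floating point, which is only reasonable for units of moderate size.

```python
import math

# Elements a + b*sqrt(2) of Z[sqrt 2] are stored as integer pairs (a, b).
EPSILON = (1, 1)       # 1 + sqrt(2), assumed to be a fundamental unit
EPSILON_INV = (-1, 1)  # -1 + sqrt(2), its inverse

def multiply(x, y):
    """Exact product of two elements of Z[sqrt 2]."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def power(x, k):
    """Compute x**k for an integer k >= 0 by repeated multiplication."""
    result = (1, 0)
    for _ in range(k):
        result = multiply(result, x)
    return result

def log_map(x):
    """Log map for r = 1: ln|sigma_1(x)|, where sigma_1(sqrt 2) = +sqrt 2."""
    a, b = x
    return math.log(abs(a + b * math.sqrt(2)))

def decompose(u):
    """Return (rho, j) with u = rho * EPSILON**j and rho equal to +1 or -1."""
    j = round(log_map(u) / log_map(EPSILON))
    correction = power(EPSILON_INV if j >= 0 else EPSILON, abs(j))
    rho = multiply(u, correction)  # lies in the kernel of the Log map
    return rho, j

rho, j = decompose((-7, -5))  # u = -(7 + 5*sqrt(2)) = -(1 + sqrt(2))**3
print(rho, j)                 # (-1, 0) 3
```

The exponent $j$ is found in $\Bbb{R}^r$, while the root of unity $\rho$ is recovered exactly in $\mathcal{O}_K$ as the leftover factor lying in the kernel of the Log map.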
### Wrapping up

In conclusion, we get a short exact sequence
$$0\to\mu_K\hookrightarrow\mathcal{U}_K\twoheadrightarrow \underbrace{\mathrm{Log}(\mathcal{U}_K)}_{\displaystyle\simeq\Bbb{Z}^r}\to 0$$
which splits since its final term is free abelian, and so $\mathcal{U}_K\simeq\mu_K\times\Bbb{Z}^r$.

[MSE question]: https://math.stackexchange.com/a/3518846/11258
