tags: `study notes` `number theory`

Filling out details in Baker's Comprehensive Course

Theorem 11.2

"Then, by symmetry, we have …"

We give two proofs of Baker's claim that for

f \in O_{K} [X]

F = \prod_{σ : K ↪ C} f^{σ} \in Z [X]

Baker claims that this follows "by symmetry". One proof was provided to me in a comment by user @reuns and relies on the Primitive Element Theorem and the Symmetric Function Theorem; we give a second more elementary proof based on Lagrange interpolation of

F

, but it is less in the "by symmetry" spirit.

Both proofs can be found on math.stackexchange.

Theorem 12.2

Why λ_1 ⨉ ··· ⨉ λ_n = |det(A)| is the right condition

We denote by

L (\cdot)

the Lebesgue measure on

R^{n}

. Consider the set

S = {x \in R^{n} | A x \in \prod_{i = 1}^{n} [- λ_{i}, + λ_{i}]}

Then

A (S) = [- λ_{1}, + λ_{1}] \times \dots \times [- λ_{n}, + λ_{n}]

and thus

| det (A) | \cdot L (S) = 2^{n} λ_{1} \dots λ_{n}

To apply the refinement of Minkowski's theorem with the lattice

Z^{n}

and the symmetric convex set

S

, we need that

\frac{L (S)}{2^{n}} \geq 1

that is

λ_{1} \dots λ_{n} \geq | det (A) | .
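As a sanity check of this condition, here is a minimal brute-force sketch. The matrix `A`, the bounds `lam` and the helper `find_small_point` are made up for illustration; they are not taken from Baker. The sketch looks for a nonzero integer vector x with |(Ax)_i| ≤ λ_i when λ_1 λ_2 = |det(A)|.

```python
from itertools import product
from math import sqrt

def find_small_point(A, lam, search=10):
    """Return a nonzero integer vector x with |(A x)_i| <= lam[i] for all i, or None.

    Only the box [-search, search]^n is scanned, so None does not prove non-existence.
    """
    n = len(A)
    for x in product(range(-search, search + 1), repeat=n):
        if any(x) and all(
            abs(sum(A[i][j] * x[j] for j in range(n))) <= lam[i] + 1e-12
            for i in range(n)
        ):
            return x
    return None

# Hypothetical example: det(A) = 1*(-1) - 3*2 = -7 and lam_1 * lam_2 = 7 = |det(A)|
A = [[1, 3], [2, -1]]
lam = [sqrt(7), sqrt(7)]
x = find_small_point(A, lam)
print(x)
```

The theorem guarantees that such a point exists; the search only exhibits one in a small box.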

Theorem 12.3 (Dirichlet's Unit Theorem)

Properly applying Minkowski's theorem

Minor discrepancies

Let

n = \dim_{Q} K

. We recall (corollary of the primitive element theorem) that there are precisely

n

ring (field) homomorphisms

σ_{i} : K ↪ C

. We assume that the

σ_{1}, \dots, σ_{s}

are real and that the remaining

2 t

are complex with

σ_{s + t + j} = \overline{σ_{s + j}}

for

j = 1, \dots, t .

Applying Minkowski's theorem requires

an
$n$ -dimensional real vector space
$V$ ,
$n$
$R$ -linear forms
$L_{1}, \dots, L_{n} : V \to R$ ,
which have to be linearly independent.

This is not quite the setup we have here. For several reasons:

$K$ is a rational vector space,
the
$σ_{i}$ are
$Q$ -linear,
and they are complex valued.

Fixing these discrepancies is easy enough. I highlighted them merely to pinpoint what made me uneasy.

Linear independence of the field embeddings

The most important point, which is quite independent of the others, is the linear independence of the

σ_{i} \in {Lin}_{Q} (K, C)

. This follows from the fact that

⟨ α ∣ β ⟩_{K} := {Tr}_{K / Q} (α β)

defines a (symmetric) nondegenerate bilinear form on

K

. Indeed,

{Tr}_{K / Q} (1) = n

so that

\forall α \in K ∖ 0

⟨ α ∣ α^{- 1} ⟩_{K} = n \neq 0

Lemma. The bilinear pairing
$⟨ α ∣ β ⟩_{K} := {Tr}_{K / Q} (α β)$ is nondegenerate.

Now if

θ_{∙} = (θ_{1}, \dots, θ_{n})

is some

Q

-basis of

K

, the matrix

S = (σ_{j} (θ_{i}))_{1 \leq i, j \leq n}

is invertible since

S \cdot S^{t} = \underset{θ_{∙}}{Mat} (⟨ \cdot ∣ \cdot ⟩_{K}) \in {GL}_{n} (Q) .

In conclusion,

Lemma. The field embeddings
$σ_{1}, \dots, σ_{n} : K ↪ C$ are
$Q$ -linearly independent.
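A small numerical sketch of the identity S · S^t = Mat(⟨·∣·⟩_K). The choice K = Q(√2) with basis (1, √2) is an assumed example, not one from Baker, and the helper `gram` is made up for illustration.

```python
from math import sqrt

# Assumed example: K = Q(sqrt(2)), Q-basis theta = (1, sqrt(2)),
# embeddings sigma_1 = identity and sigma_2: sqrt(2) -> -sqrt(2).
r2 = sqrt(2)
S = [[1.0, 1.0],   # row i holds (sigma_1(theta_i), sigma_2(theta_i))
     [r2, -r2]]

def gram(S):
    """Return S * S^t, whose (i, j) entry is sum_k sigma_k(theta_i) sigma_k(theta_j)."""
    n = len(S)
    return [[sum(S[i][k] * S[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

G = gram(S)  # entries are Tr(theta_i * theta_j): Tr(1) = 2, Tr(sqrt 2) = 0, Tr(2) = 4
det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]
det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]
print(G, det_G, det_S ** 2)
```

Here det(S)² = det(S · S^t) = 8 up to rounding, which is nonzero, so the two embeddings are linearly independent in this example.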

Realification

When applying Minkowski's theorem we will implicitly apply it to

V = K \otimes_{Q} R

and to the maps

\begin{array}{lclr} L_{i} & = & σ_{i} & (i = 1, \dots, s) \\ L_{s + 2 j - 1} & = & Re (σ_{s + j}) & (j = 1, \dots, t) \\ L_{s + 2 j} & = & Im (σ_{s + j}) & (j = 1, \dots, t) \end{array}

(we use the same notation for the

R

-linear extensions of

Q

-linear maps). We also notice that simple line operations convert the matrix

(L_{i} (θ_{j})) \Rightarrow (σ_{i} (θ_{j}))

at the cost of multiplying the determinant by

(- 2 i)^{t}

. Since the matrix on the right is invertible, we get that the matrix on the left is as well. This is the "

(a_{i j})

"-matrix in Theorem 12.2.

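To make the determinant factor concrete, here is a minimal sketch for the assumed example K = Q(i) (so s = 0, t = 1) with basis (1, i). The helper `det2` and the variable names are illustrative only.

```python
# Assumed example: K = Q(i), s = 0, t = 1, Q-basis theta = (1, i).
theta = [1 + 0j, 1j]

# The two complex embeddings: identity and complex conjugation.
sigma = [lambda z: z, lambda z: z.conjugate()]
# The real forms L_1 = Re(sigma_1), L_2 = Im(sigma_1).
L = [lambda z: z.real, lambda z: z.imag]

M_sigma = [[s(th) for th in theta] for s in sigma]  # (sigma_i(theta_j))
M_L = [[l(th) for th in theta] for l in L]          # (L_i(theta_j))

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

print(det2(M_L), det2(M_sigma))
```

In this example det(L_i(θ_j)) = 1 and det(σ_i(θ_j)) = -2i, matching the factor (-2i)^t with t = 1.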
Covolume of ring of integers = discriminant of number field

In this subsection we investigate why, when applying Minkowski's theorem in Dirichlet's Unit Theorem, one requires

λ_{1} \dots λ_{n} = \sqrt{| disc (K) |}

rather than

λ_{1} \dots λ_{n} = covol (O_{K})

. The answer is, unsurprisingly, that

covol (O_{K}) = \sqrt{| disc (K) |}

relative to a natural volume form associated to the nondegenerate bilinear form

⟨ \cdot ∣ \cdot ⟩_{K}

Proposition. Let
$Q$ be a nondegenerate quadratic form on
$V$ , a finite dimensional real vector space. Then there exists a unique translation invariant measure
$L_{Q}$ on
$V$ such that for any
$Q$ -orthonormal basis
$(e_{1}, \dots, e_{n})$ of
$V$ we have
$L_{Q} (\sum_{i = 1}^{n} [0, 1] e_{i}) = 1.$

This is easy:

L_{Q}

is simply the Lebesgue measure on

V

identified with

R^{n}

by choosing a

Q

-orthonormal basis. This is independent of the choice of

Q

-orthonormal basis. Indeed, for any base change matrix

P = \underset{B}{Mat} (C)

we have

\underset{C}{Mat} (Q) = {}^{t} P \underset{B}{Mat} (Q) P .

Therefore, if

B

and

C

are both

Q

-orthonormal bases,

| det (P) | = 1

, and

L_{Q}

is well-defined.

Proposition. Let
$Λ$ be a (full rank, discrete) lattice in
$V$ ; let
$θ_{∙} = (θ_{1}, \dots, θ_{n})$ be a
$Z$ -basis of
$Λ$ . Then setting
${covol}_{Q} (Λ) := L_{Q} (\sum_{i = 1}^{n} [0, 1] θ_{i})$ , we have
$| det (\underset{θ_{∙}}{Mat} (Q)) | = {covol}_{Q} (Λ)^{2}$

Let

θ_{∙}

be as above and let

e_{∙} = (e_{1}, \dots, e_{n})

be a

Q

-orthonormal basis of

V

. We have

\begin{array}{rcl} \underset{θ_{∙}}{Mat} (Q) & = & {}^{t} P \underset{e_{∙}}{Mat} (Q) P \\ & = & {}^{t} P \, Diag (\pm 1) \, P \end{array}

where

P = \underset{e_{∙}}{Mat} (θ_{∙})

is the matrix of the vectors of

θ_{∙}

relative to the basis

e_{∙}

. Also, if we define

g \in GL (V)

by

g (e_{i}) = θ_{i}

, then

g (\sum_{i = 1}^{n} [0, 1] e_{i}) = \sum_{i = 1}^{n} [0, 1] θ_{i}

and so

{covol}_{Q} (Λ) = | det (g) |

. Since

\underset{e_{∙}}{Mat} (g) = P

this gives

{covol}_{Q} (Λ) = | det (P) | .
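The relation between the Gram determinant and the covolume can be checked on a toy case. This is a sketch under assumptions: the indefinite form `D = Diag(1, -1)` and the base change matrix `P` are invented for illustration, and the helpers are hand-rolled 2×2 routines.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Hypothetical data: Q has matrix Diag(1, -1) in a Q-orthonormal basis e,
# and the columns of P are the coordinates of theta_1, theta_2 in that basis.
D = [[1, 0], [0, -1]]
P = [[2, 1], [1, 3]]

G = matmul(matmul(transpose(P), D), P)  # matrix of Q in the basis theta
covol = abs(det2(P))                    # covolume of the lattice spanned by theta
print(G, det2(G), covol)
```

Here det(G) = -25 while covol = 5, so the absolute value in |det Mat(Q)| = covol² is genuinely needed when Q is indefinite.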

Corollary. If
$K$ is a number field,
${covol}_{Q} (O_{K}) = \sqrt{| disc (K) |}$ .

The real vector space

V = K \otimes_{Q} R

inherits the nondegenerate bilinear form

⟨ \cdot ∣ \cdot ⟩_{K}

from

K

. The statement above is w.r.t. the associated volume form. Recall that the discriminant of

K

is defined as the determinant

\begin{array}{rcl} disc (K) & = & det ({Tr}_{K / Q} (θ_{i} θ_{j})) \\ & = & det (\underset{θ_{∙}}{Mat} (Q)) \\ & = & \pm det (P)^{2} \end{array}

of any integral basis

θ_{∙} = (θ_{1}, \dots, θ_{n})

of

O_{K}

. That is,

| disc (K) | = {covol}_{Q} (O_{K})^{2} .

Kernel of the Log map

Consider the Log map

{Log}_{K} : \begin{array}{ccl} O_{K} ∖ 0 & ⟶ & R^{r} \\ x & ⟼ & (\ln | σ_{1} (x) |, \dots, \ln | σ_{s} (x) |, \ln | σ_{s + 1} (x) |, \dots, \ln | σ_{r} (x) |) \end{array}

where

r = s + t - 1

. By construction, its kernel consists of those algebraic integers of

K

all of whose "conjugates" lie on the unit circle.

Lemma. Let
$ξ \in O_{K}$ have all its conjugates of modulus
$1$ . Then
$ξ$ is a root of unity, i.e. the kernel of
$K$ 's Log map is the group of roots of unity in
$K$ .

Clearly, every root of unity in

K

is an algebraic integer and lies in the kernel of the Log map:

μ_{K} \subset \ker ({Log}_{K})

(

μ_{K}

is the group of all roots of unity in

K

). Conversely, if

x \in O_{K}

has all its conjugates of modulus

1

, then its minimal polynomial lies in a finite set of integer polynomials by Kronecker's argument. Therefore

\ker ({Log}_{K})

is a finite subgroup of

U_{K}

. If

ξ

is in the kernel, then so are all its powers, and thus

ξ

is a root of unity. If

N = # \ker ({Log}_{K})

, then it follows that

\ker ({Log}_{K}) \subset μ_{N}

is a subgroup of the group of

N

-th roots of unity. Equality follows from cardinality.

Note. The lemma is wrong if one changes "

ξ \in O_{K}

" to "

ξ \in K

": the point

\frac{3}{5} + \frac{4}{5} i

and its (sole) conjugate

\frac{3}{5} - \frac{4}{5} i

both have modulus

1

yet aren't roots of unity: if they were they would have to be integers in

Q (i) = Q (\sqrt{- 1})

which they are not since their coordinates aren't integers (note

- 1 \equiv 3 [4]

, so the ring of integers of Q(i) is Z[i]).
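The counterexample can be checked with exact rational arithmetic. This is only an illustrative sketch: Gaussian rationals are represented as pairs of `Fraction`s, the helper `mul` is made up, and only the first 12 powers are inspected, which illustrates the claim rather than proving it.

```python
from fractions import Fraction

def mul(z, w):
    """Multiply two Gaussian rationals given as (real, imaginary) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

z = (Fraction(3, 5), Fraction(4, 5))

powers = []
p = (Fraction(1), Fraction(0))
for k in range(1, 13):
    p = mul(p, z)
    powers.append(p)  # powers[k - 1] holds z^k

for k, (a, b) in enumerate(powers[:4], start=1):
    print(k, a, b, a * a + b * b)
```

Every power has modulus exactly 1, yet none of the inspected powers equals 1. Since the only roots of unity in Q(i) are ±1 and ±i, it would in fact suffice to look at z^4.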

Note. The Log map clearly makes sense as a map

K^{\times} \to R^{r}

where it defines a group homomorphism. For this note we only care about it along the subset

O_{K} ∖ 0

How and where "r = s + t - 1" is used in the proof

"r = s + t - 1" is enough

This is simply the observation that for any unit

u \in U_{K}

1 = | N_{K / Q} (u) | = | \prod_{σ : K ↪ C} σ (u) | = \prod_{i = 1}^{s} | σ_{i} (u) | \cdot \prod_{j = 1}^{t} | σ_{s + j} (u) |^{2}

so that

0 = \sum_{i = 1}^{s} \ln | σ_{i} (u) | + 2 \sum_{j = 1}^{t} \ln | σ_{s + j} (u) |

In other words, the more natural "full Log map"

{Log}_{K}^{'} (u) = (\ln | σ_{i} (u) |)_{1 \leq i \leq s + t}

has its image in the hyperplane

\sum_{1 \leq i \leq s} x_{i} + 2 \sum_{s < j \leq s + t} x_{j} = 0

. The "missing modulus"

\ln | σ_{r + 1} (u) | = \ln | σ_{s + t} (u) |

is redundant information.
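A quick numerical illustration of the hyperplane relation, for the assumed example K = Q(√2) (s = 2, t = 0) and the unit u = 1 + √2; neither is taken from Baker.

```python
from math import log, sqrt

# Assumed example: K = Q(sqrt(2)), so s = 2 and t = 0.
# u = 1 + sqrt(2) is a unit: its norm is (1 + sqrt(2)) * (1 - sqrt(2)) = -1.
conjugates = [1 + sqrt(2), 1 - sqrt(2)]  # sigma_1(u), sigma_2(u)
norm = conjugates[0] * conjugates[1]

# "Full Log map": one coordinate per embedding.
full_log = [log(abs(c)) for c in conjugates]
print(norm, full_log, sum(full_log))
```

The two coordinates cancel up to rounding, so the second one carries no extra information, which is why r = s + t - 1 = 1 coordinate suffices here.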

How the precise value "r = s + t - 1" is used

The beginning of the proof works without modification for "shorter Log maps" (i.e. by including fewer than

r

of the

σ_{i}

). The fact that

r = s + t - 1

isn't used until one invokes the proposition below.

Let

η_{1}, \dots, η_{r} \in U_{K}

be the family of units constructed in Dirichlet's Unit Theorem (DUT). By construction, the vectors

{Log}_{K} (η_{i})

i = 1, \dots, r

, are linearly independent in

R^{r}

, and

⟨ η_{1}, \dots, η_{r} ⟩

^[1] is a free abelian group of rank

r = s + t - 1

within

U_{K}

Proposition. The quotient group
$U_{K} / ⟨ η_{1}, \dots, η_{r} ⟩$ is finite.

Proof. This is the proposition that is used in the proof of the DUT. Its proof is basically that of Corollary 12.1. Let

u \in U_{K}

. Then:

$u$ is equivalent (up to an element in
$⟨ η_{1}, \dots, η_{r} ⟩$ ) to a unit
$v$ such that for all
$σ : K ↪ C$ ,
$| σ (v) | \leq C$ for some absolute constant
$C > 0$ ;
$v$ 's minimal polynomial thus belongs to a finite set of integer polynomials;
thus
$v$ belongs to the finite set of the
$K$ -roots of these polynomials,
and thus
$U_{K} / ⟨ η_{1}, \dots, η_{r} ⟩$ is finite.

The unit

v

is constructed as in Baker's proof by taking a vector

Log (η_{1}^{m_{1}} \dots η_{r}^{m_{r}})

near

Log (u)

in the (discrete, full rank) lattice

Log (⟨ η_{1}, \dots, η_{r} ⟩) \subset R^{r}

and setting

v = u \cdot η_{1}^{- m_{1}} \dots η_{r}^{- m_{r}} .

Here "nearby" means w.r.t. the

‖ \cdot ‖_{\infty}

-norm on

R^{r}

. We are not necessarily interested in the closest lattice point, merely in the fact that there is a uniform error bound

δ > 0

on the coordinates

| \ln | σ_{i} (v) | |

. This is clear: if

v

is constructed by

expressing
$Log (u)$ in the
$R$ -basis
$(Log (η_{1}), \dots, Log (η_{r}))$ as
$Log (u) = \sum_{i = 1}^{r} u_{i} Log (η_{i})$
defining
$m_{i} = ⌊ u_{i} ⌋$
setting
$v = u \cdot η_{1}^{- m_{1}} \dots η_{r}^{- m_{r}}$ ,

then for

i = 1, \dots, r

\begin{array}{rcl} | \ln | σ_{i} (v) | | & \leq & r \cdot \max_{j = 1, \dots, r} | \ln | σ_{i} (η_{j}) | | \\ & \leq & \underbrace{r \cdot \max_{k = 1, \dots, r} \max_{j = 1, \dots, r} | \ln | σ_{k} (η_{j}) | |}_{= δ} \end{array}
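The rounding step can be sketched for a generic rank 2 lattice in R². The basis vectors and the target below are hypothetical stand-ins for the Log(η_i) and Log(u), and `reduce_mod_lattice` is a made-up helper.

```python
from math import floor

def reduce_mod_lattice(basis, target):
    """Write target = m_1 b_1 + m_2 b_2 + residual with integers m_i = floor(u_i).

    basis: two linearly independent vectors b_1, b_2 of R^2; target: a vector of R^2.
    """
    (a, b), (c, d) = basis
    det = a * d - b * c
    # Real coordinates (u_1, u_2) of target in the basis, by Cramer's rule.
    u1 = (target[0] * d - target[1] * c) / det
    u2 = (a * target[1] - b * target[0]) / det
    m = (floor(u1), floor(u2))
    residual = (target[0] - m[0] * a - m[1] * c,
                target[1] - m[0] * b - m[1] * d)
    return m, residual

# Hypothetical stand-ins for Log(eta_1), Log(eta_2) and Log(u).
basis = [(1.0, 0.5), (-0.3, 2.0)]
target = (7.3, -4.9)
m, residual = reduce_mod_lattice(basis, target)

# delta = r * (largest absolute coordinate among the basis vectors), with r = 2.
delta = 2 * max(abs(x) for vec in basis for x in vec)
print(m, residual, delta)
```

The residual plays the role of Log(v): its coordinates are bounded by δ regardless of the target, because the fractional parts u_i - ⌊u_i⌋ lie in [0, 1).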

The only noteworthy point is how one deduces the boundedness of the length of the "single missing conjugate"

| σ_{r + 1} (v) | = | σ_{s + t} (v) |

. We get it from the previously made observation:

0 = \sum_{i = 1}^{s} \ln | σ_{i} (v) | + 2 \sum_{j = 1}^{t} \ln | σ_{s + j} (v) | \quad \text{thus} \quad | \ln | σ_{s + t} (v) | | \leq \underbrace{\sum_{i = 1}^{r} | \ln | σ_{i} (v) | |}_{\leq r δ}

(not bothering with the

\frac{1}{2}

factors one would expect on the first

s

terms allows us not to have to differentiate between the case

t > 0

and

t = 0

.) This (trivial) computation is precisely where we use the choice

r = s + t - 1

: had we chosen a value lower, we would not have been able to untangle the remaining moduli. Taking

r = s + t - 1

allows us to have a single modulus to dominate.

The remaining points are easy consequences of the boundedness of the conjugates.

The conclusion of the proof

Once Baker knows that the quotient is finite, the proof is nearly done. We put

N = # (U_{K} / ⟨ η_{1}, \dots, η_{r} ⟩)

. Then

U_{K}^{N} \subset ⟨ η_{1}, \dots, η_{r} ⟩

^[2], where we set

U_{K}^{N} = {ϵ^{N} ∣ ϵ \in U_{K}}

. We note that

{Log}_{K} (U_{K}^{N})

includes the full rank subgroup

⟨ {Log}_{K} (η_{1}^{N}), \dots, {Log}_{K} (η_{r}^{N}) ⟩ = N ⟨ {Log}_{K} (η_{1}), \dots, {Log}_{K} (η_{r}) ⟩

of

⟨ {Log}_{K} (η_{1}), \dots, {Log}_{K} (η_{r}) ⟩

, hence has full rank itself. Using Lemma 11.3, there exists a basis

({Log}_{K} (ϵ_{1}^{N}), \dots, {Log}_{K} (ϵ_{r}^{N}))

of

{Log}_{K} (U_{K}^{N})

such that its matrix w.r.t. the basis

({Log}_{K} (η_{1}), \dots, {Log}_{K} (η_{r}))

is upper triangular. I don't believe the "upper triangular" property is important here, though.

If

u \in U_{K}

is a unit, then there exist

j_{1}, \dots, j_{r} \in Z

such that

{Log}_{K} (u^{N}) = \sum_{k = 1}^{r} j_{k} \cdot {Log}_{K} (ϵ_{k}^{N})

therefore

u \cdot ϵ_{1}^{- j_{1}} \dots ϵ_{r}^{- j_{r}} = \text{some root of unity}

(indeed, raising the LHS to the

N

-th power lands it in the kernel of the Log map; the kernel of the Log map is the set of roots of unity in

K

; the LHS is thus a root of unity in

K

) i.e.

u = ρ \cdot ϵ_{1}^{j_{1}} \dots ϵ_{r}^{j_{r}}

for some root of unity

ρ \in U_{K}

Wrapping up

In conclusion, we get a split short exact sequence since its final term is free abelian:

0 \to μ_{K} ↪ U_{K} ↠ \underset{≃ Z^{r}}{\underset{⏟}{Log (U_{K})}} \to 0