# Number Theory

Many problems at SOI have at least a small part that falls under the category *number theory*. Usually only very little knowledge is required, but having a good understanding of the basic definitions can help.

## Floor

The *floor* function of a real number $x$ is the largest integer not greater than $x$:

$$\lfloor x\rfloor=n \Leftrightarrow n\le x<n+1$$

Examples:

* $\lfloor 0.5\rfloor=0$
* $\lfloor 1.6\rfloor=1$
* $\lfloor 42.3\rfloor=42$
* $\lfloor -1.3\rfloor=-2$

## Modulo

The *modulo* operation finds the remainder after division.

$$a \bmod m = a-m\cdot\lfloor a/m \rfloor$$

Example: $11\bmod 3 = 11 - \lfloor 11/3\rfloor \cdot 3 = 11 - 3\cdot 3=2$

Note that the result of the modulo operation is always between $0$ and $m-1$, even for negative $a$. This is different from the `%`-operator in C++, which can return negative results, so be careful with that and use the following helper:

```c++
int mod(int a, int m) {
    int x = a%m;           // between -m+1 and m-1
    return x<0 ? x+m : x;  // between 0 and m-1
}
```

## Modular Arithmetic

Modulo gets interesting when we apply it after every operation. First, let's introduce the following syntax:

$$a\equiv b{\pmod {m}}$$

This just means that $a\bmod m=b\bmod m$. Now we can observe the following:

- $a + b \equiv (a\bmod m)+(b\bmod m){\pmod {m}}$
- $a - b \equiv (a\bmod m)-(b\bmod m){\pmod {m}}$
- $a \cdot b \equiv (a\bmod m)\cdot(b\bmod m){\pmod {m}}$

To obtain a proof, just apply the definition of modulo. Also:

- $a + b \equiv a'+b \pmod m \Leftrightarrow a\equiv a' {\pmod {m}}$
- $a - b \equiv a' - b \pmod m \Leftrightarrow a\equiv a' {\pmod {m}}$
- $a\cdot b \equiv a'\cdot b \pmod m \Leftrightarrow a\equiv a' {\pmod {m}}$, provided $b$ and $m$ share no common divisor greater than $1$ (for example if $m$ is prime and $b\not\equiv 0$); the direction "$\Leftarrow$" holds for any $b$.

A short code example that applies these rules is given further below, just before the *Greatest Common Divisor* section.

## Divisor

A number $m$ is called a *divisor* of a number $n$ if $n=k\cdot m$ for some integer $k$. Or, with our modulo operation, $n\bmod m=0$. Note that every number is a divisor of $0$.

### Finding all Divisors

Let's find all divisors of $n$. The obvious approach is to iterate through all the numbers from $1$ to $n$ and check whether or not each one is a divisor. The time complexity of this solution is $\mathcal O(n)$.

We can improve the above solution based on two properties of divisors:

- If $d_1$ is a divisor of $n$, then $d_2=n/d_1$ is also a divisor.
- One of these two divisors is $\leq\lfloor\sqrt n\rfloor$: if that were not the case, $n$ would be a product of two numbers greater than $\sqrt n$, which is impossible.

Iterating through all the numbers from $1$ to $\lfloor\sqrt n\rfloor$ allows us to find all divisors $\leq\sqrt{n}$. All other divisors have the form $n/d$ for some divisor $d\leq\sqrt{n}$, so we can print them together with the smaller one.

```c++
void print_divisors(int n) {
    for (int i=1; i*i <= n; ++i) {
        if (n%i == 0) {
            cout << i << '\n';
            if (i*i == n) break;  // do not print the square root twice
            cout << n/i << '\n';
        }
    }
}
```

This code has runtime $\mathcal O(\sqrt{n})$. As a side effect, we also see that a number $n$ can have at most $2\sqrt{n}$ divisors.

## Primes

A *prime number* is a natural number greater than 1 that has no positive divisors other than 1 and itself. A natural number greater than 1 that is not a prime number is called a composite number.

### Primality Test

To check whether a number $n$ is prime, we just have to find out whether it has a divisor other than $1$ and $n$. We can modify our `print_divisors` function to return `false` as soon as some divisor has been found:

```c++
bool is_prime(int n) {
    if (n<=1) return false;
    for (int i=2; i*i<=n; ++i)
        if (n%i == 0) return false;
    return true;
}
```

Runtime: $\mathcal O(\sqrt n)$
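Before moving on, here is the small example promised in the *Modular Arithmetic* section: a sketch that applies modulo after every operation. The function `factorial_mod` is our own illustration (not from the original text) and computes $n!\bmod m$:

```c++
// Sketch: compute (1 * 2 * ... * n) mod m, reducing after every multiplication.
// By the rule (a*b) mod m = ((a mod m)*(b mod m)) mod m, this gives the same
// result as reducing only once at the very end, but the intermediate values
// stay below m*n, so they fit into a long long for m, n up to about 10^9.
long long factorial_mod(long long n, long long m) {
    long long res = 1 % m;      // "1 % m" also handles the corner case m == 1
    for (long long i = 2; i <= n; ++i)
        res = res * i % m;
    return res;
}
```

For example, `factorial_mod(10, 1000)` returns $800$, because $10! = 3628800$ and $3628800 \bmod 1000 = 800$.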
## Greatest Common Divisor

Another useful concept is the "Greatest Common Divisor", GCD for short, of two numbers. The GCD of $a$ and $b$, $\gcd(a,b)$, is defined as the largest number $d$ that divides both $a$ and $b$. We define $\gcd(0,0)=0$ as a special case.

Let's analyze a few basic properties of the GCD:

* $\gcd(a,b)=\gcd(b,a)$ because the definition is symmetric.
* $\gcd(a,0)=a$
* If $a$ is a divisor of $b$ and $a\neq 0$, then $\gcd(a,b)=a$.
* Let $d=\gcd(a,b)$. Then $\gcd(a,b+k\cdot a)=d$ for any $k\in\mathbb Z$.<br/>
  Proof: If $a\equiv b\equiv 0{\pmod d}$, then also $b+k\cdot a\equiv 0{\pmod d}$, so $d\leq d':=\gcd(a,b+k\cdot a)$. Conversely, if $a\equiv b+k\cdot a\equiv 0{\pmod {d'}}$, then also $b=(b+k\cdot a)-k\cdot a\equiv 0{\pmod {d'}}$, so $d'\leq d$. Hence both have to be equal.
* $\gcd(a,b)=\gcd(a, b\bmod a)$ for $a\neq 0$, because $b\bmod a=b-k\cdot a$ with $k=\lfloor b/a\rfloor$.

The last property leads to an interesting algorithm:

```c++
int gcd(int a, int b) {
    return b ? gcd(b, a%b) : a;
}
```

If $a<b$, the first call just swaps the arguments. In all other cases, we have $a\geq b$ and we can apply the property above. The algorithm is known as the *Euclidean algorithm* and it runs in $\mathcal O(\log(a+b))$.

Proof sketch: Let $(a_0,b_0),(a_1,b_1),\dots,(a_k,b_k)$ be the sequence of arguments, so $(a_{i+1},b_{i+1})=(b_i,\,a_i\bmod b_i)$. From the second pair on we have $a_i\geq b_i$ and therefore $a_i\geq b_i+(a_i\bmod b_i)=a_{i+1}+b_{i+1}$. Read backwards, the $a_i$ thus grow at least as fast as the Fibonacci numbers $F_{i+1}=F_i+F_{i-1}$, which grow exponentially ($F_i\approx 1.618^i$), so $k$ grows only logarithmically in $\max\{a,b\}$.

In C++, you can use `__gcd` (or `std::gcd` since C++17).

## Extended Euclid

For numbers $a$ and $b$, there exist integers $k$ and $l$ such that:

$$ka+lb=\gcd(a,b)$$

Our `gcd` algorithm computes this implicitly: all arguments that appear during the recursion are integer linear combinations of $a$ and $b$. A recursive sketch that also returns $k$ and $l$ is given below, after the *Little Fermat* section.

## Modular Inverse

Given a prime $p$ and a number $a\not\equiv 0\pmod p$, is there a unique number $b$ (modulo $p$) such that $ab\equiv 1\pmod p$?

- Uniqueness: We saw before that if $ab\equiv ab'\equiv 1{\pmod p}$, then $b\equiv b'{\pmod p}$ (we may cancel $a$ because $p$ is prime and $a\not\equiv 0$).
- Existence: Look at the $p-1$ numbers $1,2,\dots,p-1$. Multiplied with $a$, we get another set of $p-1$ numbers $1a,2a,\dots,(p-1)a$. Modulo $p$, all of them are different (again by cancelling $a$) and none of them is $\equiv 0$. So one of them must be $\equiv 1$, and we have found our $b$ with $ab\equiv 1$.

The important interpretation of this is that we can now do division! $ab\equiv 1$ basically means $b\equiv 1/a=a^{-1}\pmod p$. In many tasks where we have to output the answer modulo $p$, this can be quite useful, since we can also divide numbers. Just be careful not to divide by zero.

## Little Fermat

Fermat's little theorem is very useful: for a prime $p$,

$$a^p\equiv a{\pmod p}$$

If $a$ is not divisible by $p$ ($a\not\equiv 0$), this is equivalent to:

$$a^{p-1}\equiv 1{\pmod p}$$

Or even:

$$a^{p-2}\equiv a^{-1}{\pmod p}$$

Now we have found a way to calculate the inverse: compute $a^{p-2}\bmod p$.
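As promised in the *Extended Euclid* section, here is a minimal recursive sketch that computes the coefficients $k$ and $l$ alongside the GCD. The name `extgcd` and its signature are our own choice, not from the original text. If $\gcd(a,p)=1$, the returned $k$ satisfies $k\cdot a\equiv 1\pmod p$, so $k\bmod p$ is a modular inverse of $a$; this is an alternative to the $a^{p-2}$ approach above.

```c++
// Sketch: computes g = gcd(a, b) and coefficients k, l with k*a + l*b == g.
// Assumes a, b >= 0; overflow is ignored for simplicity.
long long extgcd(long long a, long long b, long long &k, long long &l) {
    if (b == 0) { k = 1; l = 0; return a; }  // gcd(a, 0) = a = 1*a + 0*b
    long long k1, l1;
    long long g = extgcd(b, a % b, k1, l1);  // g = k1*b + l1*(a % b)
    k = l1;                                  // a % b = a - (a/b)*b, substituting:
    l = k1 - (a / b) * l1;                   // g = l1*a + (k1 - (a/b)*l1)*b
    return g;
}
```

For example, `extgcd(30, 18, k, l)` returns $6$ and sets $k=-1$, $l=2$; indeed $-1\cdot 30+2\cdot 18=6=\gcd(30,18)$.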
## Binary Exponentiation

The obvious way to calculate $a^8$ is $((((((a\cdot a)\cdot a)\cdot a)\cdot a)\cdot a)\cdot a)\cdot a$, which uses 7 multiplications. However, we can also compute $a^2=a\cdot a$, $a^4=a^2\cdot a^2$, $a^8=a^4\cdot a^4$, which uses only 3 multiplications. In general, $a^{2^n}=a^{2^{n-1}}\cdot a^{2^{n-1}}$, so we require $\mathcal O(\log e)$ steps to compute $a^e$ for any $e=2^k$.

How can this be generalized to arbitrary exponents? Any number can be written as a sum of powers of two, e.g. $13=8+4+1=2^3+2^2+2^0$. It is easy to extract those powers if we look at the binary representation: $13_{(10)}=1101_{(2)}$. Thus, to compute $a^e$ with $e=2^{k_1}+2^{k_2}+\dots+2^{k_r}$, we first compute $a^{2^{k_1}}, a^{2^{k_2}}, \dots, a^{2^{k_r}}$ by repeated squaring and then multiply them together, because $a^{2^{k_1}+2^{k_2}+\dots+2^{k_r}}=a^{2^{k_1}}a^{2^{k_2}}\cdots a^{2^{k_r}}$. Since there are at most $\log_2 e+1$ such powers, the running time is $\mathcal O(\log e)$.

An easy way to implement this is to extract the bits of the exponent one by one and to keep squaring the base so that it always holds the power of $a$ that corresponds to the current bit:

```c++
int ipow(int base, int exp) {
    int res=1;
    for (; exp; exp>>=1, base=base*base)  // in round i, base equals a^(2^i)
        if (exp&1) res = res*base;        // bit i of the exponent is set
    return res;
}
```

For the above code, the faster time complexity is not very relevant because `int`s overflow quite easily: even for base $2$, the exponent can be at most $31$ (or $63$ for `long long`). Typically, fast exponentiation is used together with modulo, where this overflow problem does not exist (we cast the intermediate products to `long long` so that they fit):

```c++
int powm(int base, int exp, int mod) {
    int res=1;
    for (; exp; exp>>=1, base=(long long)base*base%mod)
        if (exp&1) res = (long long)res*base%mod;
    return res;
}
```
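Combining *Little Fermat* and *Binary Exponentiation*: modulo a prime $p$, the inverse of $a$ can be computed as `powm(a, p-2, p)`. Here is a small usage sketch; it assumes the `powm` function from above is in the same file, and the prime $10^9+7$ is just a typical choice in contest tasks, not something prescribed by the text:

```c++
#include <iostream>
using namespace std;

// powm(base, exp, mod) as defined above

int main() {
    const int p = 1000000007;                // a prime that fits into an int
    int a = 42;
    int inv = powm(a, p - 2, p);             // a^(p-2) is the inverse of a (little Fermat)
    cout << inv << '\n';
    cout << (long long)a * inv % p << '\n';  // prints 1
}
```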