# 2019q3 Homework1 (review)
contributed by < `kksweet8845` >
### Question 1
```cpp
#include <stdio.h>
int main() {
    float sum = 0.0f;
    for (int i = 0; i < 10000; i++)
        sum += i + 1;
    printf("Sum %f\n", sum);
    return 0;
}
```
#### Thought
- Wrong thought
- I thought this question was just a simple problem.
- However, when jserv executed the code above, the answer was completely different from mine.
- Then I noticed that `sum` is of type float while `i` is of type int. According to the IEEE 754 standard, `sum`, being a float, may drop some trailing (low-order) bits once its value grows large.
#### Assumption
![](https://i.imgur.com/oGU3Axv.png)
(Source: the IEEE 754 article on Wikipedia)
When the number grows large enough, that is, when the fraction part is `0xFFFFFE` and the exponent reaches $2^{23}$, the value is
\begin{gather*}\mathtt{0x1.FFFFFE}\ \times 2^{23}\end{gather*}
Here we want to know how many decimal digits this value has, so we take $\log_{10}$:
\begin{split}
(2-2^{-23})\ \times 2^{23} &= 2^{24}-1 = k\\
\log_{10} k &= \log_{10}(2^{24}-1) \approx 24\log_{10}2\\
\log_{10} k &\approx 7.2247 = 7 + 0.2247\\
k &\approx 1.68\ \times 10^{7}
\end{split}
This means that once a value exceeds $2^{24} \approx 1.68 \times 10^{7}$, i.e. once it needs 8 decimal digits, the computer starts to discard the low-order bits of the value, and the error occurs.
I assume that at this threshold even adding 1 to the number loses precision, so sums only slightly beyond it should already show rounding error.
#### Experiment
In order to prove my assumption, I wrote a program that sums from 1 to n in both float and int, records the number of digits of the exact sum, and reports the error, to see when the error appears in the 8-digit range.
Fortunately, the results coincide with my assumption: the error first appears between n = 5000 (sum 12502500, still exact) and n = 6000 (sum 18003000), which brackets $2^{24} = 16777216$.
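Below is a minimal reconstruction of that program (the original source is not included in this note; the helper and the digit-counting loop are my own, matched to the log that follows):
```cpp
#include <stdio.h>

/* Sum 1..n in float and in (exact) integer arithmetic, count the digits
 * of the exact sum, and report the difference. */
static void experiment(int n) {
    float sum_f = 0.0f;
    long long sum_i = 0;
    for (int i = 1; i <= n; i++) {
        sum_f += i;
        sum_i += i;
    }
    int digits = 0;
    for (long long v = sum_i; v > 0; v /= 10)
        digits++;
    double err = (double) sum_i - (double) sum_f;
    printf("==============================From 1 to %d\n", n);
    printf("The value of sum_f is : %f\n", sum_f);
    printf("The value of sum_i is : %lld\n", sum_i);
    printf("The digits of sum_i is :%d\n", digits);
    printf("The error : %f\n", err);
    printf("Normalized Error value: %f\n", err / (double) sum_i);
}

int main(void) {
    for (int n = 5000; n <= 10000; n += 1000)
        experiment(n);
    return 0;
}
```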
```bash
==============================From 1 to 5000
The value of sum_f is : 12502500.000000
The value of sum_i is : 12502500
The digits of sum_i is :8
The error : 0.000000
Normalized Error value: 0.000000
==============================From 1 to 6000
The value of sum_f is : 18002896.000000
The value of sum_i is : 18003000
The digits of sum_i is :8
The error : 104.000000
Normalized Error value: 0.000006
==============================From 1 to 7000
The value of sum_f is : 24502896.000000
The value of sum_i is : 24503500
The digits of sum_i is :8
The error : 604.000000
Normalized Error value: 0.000025
==============================From 1 to 8000
The value of sum_f is : 32002896.000000
The value of sum_i is : 32004000
The digits of sum_i is :8
The error : 1104.000000
Normalized Error value: 0.000034
==============================From 1 to 9000
The value of sum_f is : 40502896.000000
The value of sum_i is : 40504500
The digits of sum_i is :8
The error : 1604.000000
Normalized Error value: 0.000040
==============================From 1 to 10000
The value of sum_f is : 50002896.000000
The value of sum_i is : 50005000
The digits of sum_i is :8
The error : 2104.000000
Normalized Error value: 0.000042
```
#### Improve accuracy
So we now know that the computer runs into this issue when accumulating large floating-point values.
To improve accuracy, jserv offered a simple solution, known as Kahan summation.
```cpp
#include <stdio.h>
int main() {
    float sum = 0.0f, corr = 0.0f; /* corrective value for rounding error */
    for (int i = 0; i < 10000; i++) {
        float y = (i + 1) - corr; /* apply the correction to the current term */
        float t = sum + y;        /* low-order bits of y might be lost here */
        corr = (t - sum) - y;     /* recover the lost bits */
        sum = t;
    }
    printf("Sum: %f\n", sum);
    return 0;
}
```
The main idea here is to compute, via `(t - sum) - y`, the part of each term that the large running sum discards, and to feed that correction back into the next addition so that small terms are never swallowed.
That is, to handle rounding error we often need to **rewrite our expression** so that no sub-expression loses significance when the operands are very large.
#### Quadratic formula
As a simple example, consider the familiar quadratic formula:
\begin{split}
\dfrac{-b\ - \sqrt{b^2-4ac}}{2a}\end{split}
Now, we know that this form is inaccurate for negative $b$ and for large positive $b$ when translated naively into a floating-point computation. The expression is prone to two types of rounding error:
- for negative b, cancellation between $-b$ and $\sqrt{b^2-4ac}$
- for large positive b, **overflow** in computing $b^2$
For negative $b$, the error is caused by cancellation in the outer subtraction of the numerator, $(-b - \sqrt{b^2-4ac})$. For $b^2$ much larger than $a$ or $c$, the discriminant $\sqrt{b^2-4ac}$ approximately equals $\sqrt{b^2}$. But for negative $b$, $\sqrt{b^2} = -b$, so
\begin{split}
(-b)-\sqrt{b^2-4ac} \approx (-b) - (-b)
\end{split}
subtracting two large values to compute a small result, which causes terrible cancellation.
Thus, we can rewrite the problematic subtraction by applying the following rule:
\begin{split}x-y \rightarrow \dfrac{(x^2-y^2)}{(x+y)}\end{split}
Then, it produces
\begin{split}
\dfrac{(-b)^2 - (\sqrt{b^2-4ac})^2}{(-b)+\sqrt{b^2-4ac}}/2a=\dfrac{4ac}{(-b)+\sqrt{b^2-4ac}}/2a
\end{split}
However, the expression above is still inaccurate when $b$ is large and positive. In that case we use the approximation
$\sqrt{1+x} \approx 1+\dfrac{1}{2}x$,
which is valid when $|x|$ is small; here $x = -\dfrac{4ac}{b^2}$, which is small precisely when $b$ is large. Then we can rewrite the expression as follows:
\begin{split} \dfrac{-b\sqrt{1-\dfrac{4ac}{b^2}} - b}{2a}\approx \dfrac{-b(1-\dfrac{2ac}{b^2})-b}{2a} = -\dfrac{b}{a} + \dfrac{c}{b} \end{split}
Finally, we can conclude that the quadratic formula can be written as follows:
$quadratic(a,b,c)\ = \left\{
\begin{array}{lcl}
\dfrac{4ac}{-b+\sqrt{b^2-4ac}}\bigg/2a & & \text{if } b<0\\
\left(-b-\sqrt{b^2-4ac}\right)\dfrac{1}{2a} & & \text{if } 0 \leq b \leq 10^{127}\\
-\dfrac{b}{a}+\dfrac{c}{b} & & \text{if } b > 10^{127}
\end{array} \right.$
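Here is a minimal C sketch of this piecewise evaluation (the function name and test values are my own; the $10^{127}$ threshold follows the Herbie paper cited under Extended source):
```cpp
#include <math.h>
#include <stdio.h>

/* A sketch of the piecewise-rewritten root (-b - sqrt(b^2-4ac)) / 2a. */
double quadratic(double a, double b, double c) {
    if (b < 0)
        return (4 * a * c / (-b + sqrt(b * b - 4 * a * c))) / (2 * a);
    if (b <= 1e127)
        return (-b - sqrt(b * b - 4 * a * c)) / (2 * a);
    return -b / a + c / b; /* sqrt(1+x) ~ 1 + x/2 for small x */
}

int main(void) {
    /* For a = 1, b = -1e8, c = 1 the naive numerator cancels badly. */
    printf("%.17g\n", quadratic(1.0, -1e8, 1.0));
    return 0;
}
```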
I tested it myself; it indeed improves the accuracy when $b$ is a large negative number.
```bash
b = -574290360989628072255692613763742040064.000000
The original formula with no tricks
0.000000000000000000000000000000000000000000000000000000000000
The modified one
0.000000000000000000000000000000000008973251691435005568199720
```
Unfortunately, I could not observe an accuracy improvement for the large positive case; I obtained identical results. However, the modified expression still saves a lot of computation.
```bash
574290360989628072255692613763742040064.000000
The original formula with no tricks
-218777280377001173981114506477345701888.00000
The modified one
-218777280377001173981114506477345701888.00000
```
#### Pathological floating-point arithmetic
For computers, even some simple arithmetic causes rounding errors, such as
```
0.1 + 0.2
>> 0.30000000000000004
```
Because the base is 2, the floating-point numbers 0.1 and 0.2 cannot be represented exactly in binary form; each is approximated by a repeating bit pattern, which is why the 4 appears at the end of the result.
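The same effect is easy to reproduce in C (a small demo of my own):
```cpp
#include <stdio.h>

int main(void) {
    double a = 0.1, b = 0.2;
    printf("%.17g\n", a + b);       /* prints 0.30000000000000004 */
    printf("%d\n", (a + b) == 0.3); /* prints 0: the sum is not 0.3 */
    return 0;
}
```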
#### IEEE standard
- IEEE 854 allows $\beta$ = 10 (base 10), since base 10 is how humans exchange and think about numbers.
- IEEE 854 requires that if the base is not 10, it must be 2.
- One of the reasons is that the results of error analyses are much tighter when $\beta$ is 2, because a rounding error of $1/2$ ulp wobbles by a factor of $\beta$ when computed as a relative error, and error analyses are almost always simpler when based on relative error.
#### Example
Consider $\beta = 16$, $p = 1$ compared to $\beta = 2$, $p = 4$. Both systems have 4 bits of significand. Consider the computation of $\dfrac{15}{8}$.
- For $\beta = 2$, $p = 4$, it can be represented as
    - $1.111 \times 2^0$
- For $\beta = 16$, $p = 1$, it can only be represented as
    - $1 \times 16^0$
In general, base 16 can lose up to 3 bits, so a precision of $p$ hexadecimal digits can have an effective precision as low as $4p-3$ bits rather than $4p$.
Nevertheless, IBM chose $\beta$ = 16, $p$ = 6 for its [IBM System/370](https://en.wikipedia.org/wiki/IBM_System/370):
- ==Increased exponent== :
Hence the significand requires 24 bits. Since this must fit into 32 bits, 7 bits are left for the exponent and 1 for the sign bit. Thus, the magnitude of representable numbers ranges from about $16^{-2^6}$ to about $16^{2^6} = 2^{2^8}$. To get a similar exponent range with $\beta$ = 2 would require 9 bits of exponent, leaving only 22 bits for the significand.
- ==Shifting== :
When adding two floating-point numbers whose exponents differ, one of the significands has to be shifted to make the radix points line up, which slows down the operation. In the $\beta$ = 16, $p$ = 1 system, all the numbers between 1 and 15 have the same exponent, so no shifting is required when adding any of the ${15}\choose{2}$ $= 105$ possible pairs of distinct numbers from this set. In the $\beta$ = 2, $p$ = 4 system, however, these numbers have exponents ranging from 0 to 3, and shifting is required for 70 of the 105 pairs.
#### Special Quantities
- On some floating-point hardware every bit pattern represents a valid floating-point number. The IBM System/370 is an example of this. On the other hand, the VAX reserves some bit patterns to represent special numbers called **reserved operands**.
- The IEEE standard continues in this tradition and has NaNs (Not a Number) and infinities. Without special quantities, there is no good way to handle exceptional situations like taking the square root of a negative number other than aborting the computation. So the IEEE standard specifies the following special values: $\pm 0$, denormalized numbers, $\pm \infty$, and NaNs.
|Exponent|Fraction|Represents|
|:----:|:----:|:----:|
|$e = e_{min} -1$|$f=0$|$\ \pm0$|
|$e = e_{min} -1$|$f\ne0$|$0.f\times2^{e_{min}}$|
|$e_{min} \leq e \leq e_{max}$|all|$1.f \times 2^e$|
|$e = e_{max} +1$|$f=0$|$\pm\infty$|
|$e = e_{max} +1$|$f \ne 0$| NaN|
#### Exceptions, Flags and Trap Handlers
Exceptions
- When an exceptional condition like division by zero or overflow occurs in IEEE arithmetic, the default is to deliver a result and continue. Typical default results are NaN for $0/0$ and $\sqrt{-1}$, and $\infty$ for $1/0$ and overflow, as the small demo below shows.
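A minimal demo of my own, assuming IEEE-754 (C Annex F) semantics:
```cpp
#include <math.h>
#include <stdio.h>

int main(void) {
    volatile double zero = 0.0; /* volatile keeps the compiler from folding */
    printf("1/0      = %f\n", 1.0 / zero);  /* inf */
    printf("0/0      = %f\n", zero / zero); /* nan */
    printf("sqrt(-1) = %f\n", sqrt(-1.0));  /* nan */
    return 0;
}
```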
Flags
- The flags are "sticky": once set, they remain set until explicitly cleared. Testing the flags is, for example, the only way to distinguish $1/0$, which is a genuine infinity, from an overflow; see the sketch below.
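C99 exposes these sticky flags through `<fenv.h>`. The program below is my own sketch (some toolchains additionally want `#pragma STDC FENV_ACCESS ON` and `-lm`):
```cpp
#include <fenv.h>
#include <stdio.h>

int main(void) {
    volatile double zero = 0.0, huge = 1e308;

    feclearexcept(FE_ALL_EXCEPT);
    volatile double q1 = 1.0 / zero;  /* genuine infinity */
    printf("FE_DIVBYZERO set: %d\n", fetestexcept(FE_DIVBYZERO) != 0);

    feclearexcept(FE_ALL_EXCEPT);
    volatile double q2 = huge * 10.0; /* overflow to infinity */
    printf("FE_OVERFLOW  set: %d\n", fetestexcept(FE_OVERFLOW) != 0);

    (void) q1;
    (void) q2;
    return 0;
}
```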
Trap Handlers
- Sometimes continuing execution in the face of exceptional conditions is not appropriate. Here is an example:
- For $x/(x^2+1)$, when $x> \sqrt{\beta}\ \beta^{e_{max}/2}$, the denominator overflows to infinity, yielding a final answer of 0, which is totally wrong.
- Although rewriting the expression as $1/(x+x^{-1})$ solves this particular problem, rewriting cannot always solve the problem.
- The IEEE standard strongly recommends that implementations allow trap handlers to be installed.
|Exception|Result When Traps Disabled|Argument to Trap Handler|
|:----:|:----:|:----:|
|Overflow|$\pm\infty \ or \ \pm x_{max}$|Round($x2^{-\alpha}$)|
|Underflow|0, $\pm2^{e_{min}}$ or denormal|Round($x2^{\alpha}$)|
|Divide by zero|$\pm\infty$|Operands|
|Invalid|NaN|Operands|
|Inexact|round($x$)|round($x$)|
- One obvious use for trap handlers is for backward compatibility.
- This is especially useful for codes with a loop like the one below. Since comparing a NaN to a number with a relational operator always returns false, the `x >= 100` test becomes false as soon as x becomes a NaN, so the loop exits silently and masks the error; a trap handler that aborts would expose it.
```cpp
do {
    /* update x */
} while (x >= 100);
```
- Another, more interesting use for trap handlers arises when computing products such as $\prod_{i=1}^{n} x_i$ that could potentially overflow. One solution is to use logarithms and compute $\exp\left(\sum \log x_i\right)$ instead, as illustrated below.
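A tiny illustration of the logarithm workaround (the values are my own):
```cpp
#include <math.h>
#include <stdio.h>

int main(void) {
    /* The true product is 1e100, but the direct form overflows midway. */
    double x[4] = {1e200, 1e200, 1e-150, 1e-150};
    double prod = 1.0, logsum = 0.0;
    for (int i = 0; i < 4; i++) {
        prod *= x[i];        /* hits inf after the second factor */
        logsum += log(x[i]);
    }
    printf("direct : %g\n", prod);        /* inf    */
    printf("log-sum: %g\n", exp(logsum)); /* ~1e100 */
    return 0;
}
```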
#### System aspects
The design of almost every aspect of a computer system requires knowledge about floating point.
As an example of how plausible design decisions can lead to unexpected behavior, consider the following program (Goldberg's original example was in BASIC; this is its C rendering):
```cpp
float q = 3.0 / 7.0;  /* the double value 3.0/7.0, rounded to float */
if (q == 3.0 / 7.0)   /* q is widened back to double for the comparison */
    printf("Equal");
else
    printf("Not Equal");
```
- When compiled and run on my laptop, the program prints ==Not Equal==! The literal expression `3.0/7.0` is evaluated in double precision, while `q` stores it rounded to float, so the two values differ.
- Incidentally, some people think the solution to such anomalies is never to compare floating-point numbers for equality, but instead to consider them equal if they are within some error bound E.
- This is hardly a cure-all, because it raises as many questions as it answers. What should the value of E be? If $x < 0$ and $y > 0$ are within E, should they be considered equal, even though they have different signs? Furthermore, the relation defined by this rule, $a \sim b \Leftrightarrow |a-b| < E$, is not an equivalence relation, because $a \sim b$ and $b \sim c$ do not imply $a \sim c$, as the small demo below shows.
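For instance, with an arbitrary E of my own choosing:
```cpp
#include <math.h>
#include <stdio.h>

int main(void) {
    /* With E = 0.1: a ~ b and b ~ c hold, but a ~ c fails. */
    double E = 0.1, a = 0.0, b = 0.08, c = 0.16;
    printf("a~b: %d\n", fabs(a - b) < E); /* 1 */
    printf("b~c: %d\n", fabs(b - c) < E); /* 1 */
    printf("a~c: %d\n", fabs(a - c) < E); /* 0 */
    return 0;
}
```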
#### Instruction Sets
- It is common for an algorithm to require a short burst of higher precision in order to produce accurate results.
- Take the earlier quadratic-formula example. Computing $b^2-4ac$ in double precision, when each of the quantities $a$, $b$, and $c$ is in single precision, is easy if there is a multiplication instruction that takes two single-precision numbers and produces a double-precision result.
- Thus, hardware to compute a double-precision product from single-precision operands is normally only a little more expensive than a single-precision multiplier, and much less expensive than a double-precision multiplier.
- This instruction has many other uses beyond the quadratic formula.
- Consider the problem of solving a system of linear equations:
\begin{split}
a_{11}x_1 + a_{12}x_2+ \dots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2+ \dots + a_{2n}x_n &= b_2 \\
\dots \\
a_{n1}x_1 + a_{n2}x_2+ \dots + a_{nn}x_n &= b_n
\end{split}
which can be written in matrix form as $Ax = b$, where
\begin{split}A=\begin{bmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\
\dots & \dots & \dots & \dots & \dots \\
a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn}
\end{bmatrix}\end{split}
Suppose a solution $x^{(1)}$ is computed by some method, perhaps Gaussian elimination. There is a simple way to improve the accuracy of the result called ==iterative improvement==.
First compute
\begin{split}
\xi = Ax^{(1)} - b
\end{split}
Then solve the system
\begin{split}
Ay = \xi
\end{split}
That is,
\begin{split}
Ay \approx \xi = Ax^{(1)}-b = A(x^{(1)}-x)
\end{split}
so $y \approx x^{(1)}-x$, and the improved solution is $x^{(2)} = x^{(1)} - y$. Then we can repeat the process, replacing $x^{(1)}$ with $x^{(2)}$ and $x^{(2)}$ with $x^{(3)}$, and so on. The argument that $x^{(i+1)}$ is more accurate than $x^{(i)}$ is only informal.
- When performing iterative improvement, $\xi$ is a vector whose elements are differences of nearby inexact floating-point numbers, and so it can suffer from catastrophic cancellation. Thus, iterative improvement is not very useful unless $\xi = Ax^{(1)} -b$ is computed in double precision. Once again, this is a case of computing the product of two single-precision numbers where the full double-precision result is needed, as sketched below.
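A toy sketch of one improvement step on a 2×2 system (the matrix, the values, and the Cramer's-rule solver are my own illustration; how visibly a single step helps depends on the conditioning): solve in single precision, but form the residual $\xi = Ax^{(1)}-b$ in double.
```cpp
#include <stdio.h>

/* Solve a 2x2 system in single precision via Cramer's rule. */
static void solve2x2(const float A[2][2], const float b[2], float x[2]) {
    float det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
    x[0] = (b[0] * A[1][1] - b[1] * A[0][1]) / det;
    x[1] = (A[0][0] * b[1] - A[1][0] * b[0]) / det;
}

int main(void) {
    const float A[2][2] = {{1.00f, 0.99f}, {0.99f, 0.98f}}; /* ill-conditioned */
    const float b[2] = {1.99f, 1.97f};                      /* exact x ~ (1, 1) */
    float x[2], y[2], xi[2];

    solve2x2(A, b, x);           /* x^(1) */
    for (int i = 0; i < 2; i++)  /* xi = A x^(1) - b, accumulated in double */
        xi[i] = (float) ((double) A[i][0] * x[0] +
                         (double) A[i][1] * x[1] - (double) b[i]);
    solve2x2(A, xi, y);          /* solve A y = xi */
    for (int i = 0; i < 2; i++)
        x[i] -= y[i];            /* x^(2) = x^(1) - y */

    printf("x = (%f, %f)\n", x[0], x[1]);
    return 0;
}
```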
#### Extended source
- [Automatically Improving Accuracy for Floating Point Expressions](https://herbie.uwplse.org/pldi15-paper.pdf)
- [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf)
:::info
TODO: check [Pairwise summation](https://en.wikipedia.org/wiki/Pairwise_summation) as well.
> In numerical analysis, pairwise summation, also called cascade summation, is a technique to sum a sequence of finite-precision floating-point numbers that substantially reduces the accumulated round-off error compared to naively accumulating the sum in sequence.[1] Although there are other techniques such as Kahan summation that typically have even smaller round-off errors, pairwise summation is nearly as good (differing only by a logarithmic factor) while having much lower computational cost—it can be implemented so as to have nearly the same cost (and exactly the same number of arithmetic operations) as naive summation.
:notes: jserv
:::
### Question 2
```cpp
#include <stdio.h>
#define align4(x) (((x) + K) & (-4))
int main(void) {
    int p = 0x1997;
    printf("align4(p) is %08x\n", align4(p));
    return 0;
}
}
```
#### Thought
- At first, I gave the wrong answer to this question. I merely looked for a number that would produce the desired output, `0x1998`, given `0x1997`.
- Afterwards, however, I started to think the question through carefully.
- According to jserv's description of this macro, it rounds an address up to an aligned address.
- That is, any address that is not already a multiple of 4 must be rounded up to the next multiple of 4.
Expressed mathematically, it is:
\begin{split}
\lceil p \rceil_{(4)}
\end{split}
This made me consider the rules of remainders.
\begin{split}
\forall m \in \mathbb N,\ m \in \{ 4n+r \mid n \in \mathbb N,\ 0 \leq r \leq 3 \}
\end{split}
So every address can be expressed in one of the four forms shown below. The form $4n$ needs no change, while $4n+1$, $4n+2$, and $4n+3$ must be rounded up to $4(n+1)$:
\begin{split}
&4n &\Rightarrow 4n \\
&4n+1 &\Rightarrow 4(n+1) \\
&4n+2 &\Rightarrow 4(n+1) \\
&4n+3 &\Rightarrow 4(n+1)
\end{split}
Applying simple math, we add 3 to every line and get:
\begin{split}
&4n\ +&(3)&\Rightarrow 4(n) + 3\\
&4n+1+&(3) &\Rightarrow 4(n+1) + 0\\
&4n+2+&(3) &\Rightarrow 4(n+1) + 1 \\
&4n+3+&(3) &\Rightarrow 4(n+1) + 2
\end{split}
Finally, we just need to discard the remainder. Here, jserv uses a clever technique: `& -4`. Because `-4` in two's-complement binary form is `0xFFFFFFFC` (32 bits), `x & -4` yields the original value with its two lowest bits cleared. That is,
```cpp=
0xFFFFFFF6 & (-4) => 0xFFFFFFF4
0xFFFFFFF8 & (-4) => 0xFFFFFFF8
0xFFFFFFF7 & (-4) => 0xFFFFFFF4
```
We can see that the low bits are cleared, so the result becomes a multiple of 4, rounded down. To round up to the nearest multiple of 4 instead, just add 3 first (a value that is already aligned stays unchanged), so `K` is `3`:
```cpp
(0xFFFFFFF6 + (3)) & (-4) => 0xFFFFFFF8
(0xFFFFFFF8 + (3)) & (-4) => 0xFFFFFFF8
(0xFFFFFFF7 + (3)) & (-4) => 0xFFFFFFF8
```
#### The Linux kernel also has ALIGN macros
In [arch/riscv/include/asm/page.h](https://github.com/torvalds/linux/blob/master/arch/riscv/include/asm/page.h)
:::danger
Add hyperlink to Linux kernel source tree (via GitHub or LXR). e.g. [include/linux/list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h)
:notes: jserv
:::
```cpp
/* @a is a power of 2 value */
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
#define __ALIGN_MASK(x, mask) __ALIGN_KERNEL_MASK((x), (mask))
#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a)))
#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
```
In [tools/testing/scatterlist/linux/mm.h](https://github.com/torvalds/linux/blob/master/tools/testing/scatterlist/linux/mm.h)
```cpp
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
```
#### `__ALIGN_KERNEL_MASK(x, mask)`
```cpp
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
```
The Linux developers wrote the rounding-up formula as a general macro; `mask + 1` is restricted to be a power of 2.
#### `__ALIGN_KERNEL(x, a)`
```cpp
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
```
And this is a more convenient wrapper, which builds the mask by casting `a` to the type of `x`.
#### `ALIGN(x, a)`
```cpp
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
```
And `ALIGN` is simply a shorter name for everyday use.
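As a quick user-space check (my own; the macro bodies are copied from the headers above, and `typeof` is a GCC extension):
```cpp
#include <stdio.h>

#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))

int main(void) {
    unsigned long p = 0x1997;
    printf("ALIGN(0x%lx, 4)  = 0x%lx\n", p, ALIGN(p, 4));  /* 0x1998 */
    printf("ALIGN(0x%lx, 16) = 0x%lx\n", p, ALIGN(p, 16)); /* 0x19a0 */
    return 0;
}
```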
#### Extended source
- [align macro kernel stack overflow](https://stackoverflow.com/questions/13122846/align-macro-kernel)
- [Linux kernel-5.3.0](https://www.kernel.org/)
### Question 3
```cpp
#include <stdbool.h>
bool func(unsigned int x) {
    return x && ((x & (~x + 1)) == x);
}
```
The code above can be rewritten as the following equivalent code.
```cpp
bool func(unsigned int x) {
    unsigned int unknown = 0;
    while (x && unknown <= 1) {
        if ((x & Z) == 1)
            unknown++;
        x >>= 1;
    }
    return (unknown == 1);
}
```
- The answer for `Z` is 1, because the loop clearly examines each bit in turn, counting the bits that are set; any x with two or more set bits returns false.
- The function checks whether x has exactly one set bit, i.e., whether x is a power of 2.
#### Thought
- In the beginning, I could not figure out what happens when x is ANDed with its two's complement, `~x + 1`. I tested it by hand with every 4-bit value. Then I found that
|x|x & (~x + 1)|func(x)|
|:-----:|:----:|:---:|
|0000 |0000 (with carry out)|0 (rejected by the leading `x &&`)|
|0001 |==0001==|1|
|0010 |==0010==|1|
|0011 |0001|0|
|0100 |==0100==|1|
|0101 |0001|0|
|0110 |0010|0|
|0111 |0001|0|
|1000 |==1000==|1|
|1001 |0001|0|
|1010 |0010|0|
|1011 |0001|0|
|1100 |0100|0|
|1101 |0001|0|
|1110 |0010|0|
|1111 |0001|0|
Obviously, the only numbers left unchanged by the operation are those with a single set bit: `~x + 1` is the two's-complement negation $-x$, and `x & -x` isolates the lowest set bit, which equals x itself exactly when x is a power of 2.
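As a sanity check, here is a small program of my own comparing the bit trick with the bit-counting loop (with `Z` filled in as 1):
```cpp
#include <stdbool.h>
#include <stdio.h>

/* The bit-trick version from the question. */
static bool func(unsigned int x) {
    return x && ((x & (~x + 1)) == x);
}

/* The loop version with Z = 1: count the set bits. */
static bool func_loop(unsigned int x) {
    unsigned int ones = 0;
    while (x && ones <= 1) {
        if ((x & 1) == 1)
            ones++;
        x >>= 1;
    }
    return ones == 1;
}

int main(void) {
    for (unsigned int x = 0; x < 16; x++)
        printf("%2u : %d %d\n", x, func(x), func_loop(x));
    return 0;
}
```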
#### Compute the minimum (min) or maximum (max) of two integers without branching
Here is an approach to realizing the comparison without branching.
```cpp
int x; // we want to find the minimum of x and y
int y;
int r; // the result goes here
r = y ^ ((x ^ y) & -(x < y)); // min(x, y)
```
##### Description
Consider the `-(x<y)` term shown above:
```cpp
int x = 5,y = 9;
printf("-(x<y) : %x\n", -(x<y)); // 0xffffffff
x = 10, y = 9;
printf("-(x<y) : %x\n", -(x<y)); // 0
```
- According to the result above,
- If `x<y` is true, then `-(x<y)` will be all ones, so `(x^y) & ~0` is `(x^y)`. That is, `r = y ^ (x^y) = x`.
- If `x<y` is false, then `-(x<y)` will be all zeros, so `(x^y) & 0` is `0`. That is, `r = y ^ 0 = y`.
##### Maximum version
```cpp
int x; // we want to find the maximum of x and y
int y;
int r; // the result goes here
r = x ^ ((x ^ y) & -(x < y)); // max(x, y)
```
The only difference from the minimum version is that the leading `y` is replaced by `x`.
##### Application
- On some rare machines where branching is very expensive and no conditional move instructions exist, the expression above might be faster than the obvious approach, `r = (x < y) ? x : y`, even though it involves two more instructions.
However, on some machines, evaluating `(x < y)` as 0 or 1 still requires a branch instruction, so there may be no advantage.
Quick and dirty versions:
Since `(x - y)` only needs to be evaluated once, the following are faster, provided we know that `INT_MIN <= x - y <= INT_MAX` (`CHAR_BIT` comes from `<limits.h>`).
```cpp
r = y + ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1))); // min(x, y)
r = x - ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1))); // max(x, y)
```
However, these versions risk signed overflow, which is undefined behavior in C, whenever `x - y` falls outside that interval; the precondition `INT_MIN <= x - y <= INT_MAX` is therefore required.
#### Extended source
- [How to get absolute value as fast as possible](https://community.arm.com/developer/ip-products/processors/f/cortex-m-forum/6636/how-to-get-absolute-value-of-a-32-bit-signed-integer-as-fast-as-possible)
- [Bit Twiddling Hacks](https://graphics.stanford.edu/~seander/bithacks.html#ParityNaive)
- [How to swap two numbers without a third variable](https://www.geeksforgeeks.org/swap-two-numbers-without-using-temporary-variable/)
- [McBits: fast constant-time code-based cryptography](https://cryptojedi.org/papers/mcbits-20130616.pdf)
- [Side Channel Attack](https://www.sciencedirect.com/topics/computer-science/side-channel-attack)
- [Dude, is my code constant time?](https://eprint.iacr.org/2016/1123.pdf)
- [Algorithmic problem solving: How to efficiently compute the parity of a stream of numbers](https://www.freecodecamp.org/news/algorithmic-problem-solving-efficiently-computing-the-parity-of-a-stream-of-numbers-cd652af14643/)