VxRail
For this purpose, you might be think: That's too easy, Use modulus to resolve, see the pseudo code
if $nr modulus 15 is 0
print $nr is divided by 15
else if $nr modulus 3 is 0
print $nr is divided by 3
else if $nr modulus 5 is 0
print $nr is divided by 5
But just as we know, the modulus calculation costs a lot of computing power (But in the recent CPU/ GPU, the HW has special desgin for these computing), so how to reduce it by math. meth?
We can see Faster remainders when the divisor is a constant: beating compilers and libdivide
The intuition is as follows. To divide by four, you might choose to multiply by 0.25 instead. Take 5 * 0.25, you get 1.25. The integer part (1) gives you the quotient, and the decimal part (0.25) is indicative of the remainder: multiply 0.25 by 4 and you get 1
upon sentences showing in this document
So following this, we can set modulus by 3 to show with 0.
Now we can be in conclusions with that:
Let dividend is call y, and divisor is call x, so following these rules, we can think
e.g. Thinking about in hex system, 34(0x22) is divide by "hex", we can think that a number 34 is walking on "hex-ring" system( if larger than 0xf becomes a carry number), so the 34 in "hex-ring" system will be 2 carrys and 2 steps in the ring.
And the remainder is the steps in the ring.
By the figure 1, we can extern 0xF to 0xFFFFFFFFFFFFFFFF for 64 bits machine, and we cae write a new code for this topic:
static uint64_t M3 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 3 + 1;
static uint64_t M5 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 5 + 1;
static inline bool is_divisible(uint32_t n, uint64_t M) {
return n * M <= M - 1;
}
div3 = is_divisible(i, M3);
div5 = is_divisible(i, M5);
On the following, we use 16 bits for the MAX. data and using divide by 3 to explain
0xFFFF 0xAAAA 0x5555 0x0000
|---------|---------|---------|
In the this system, more than 16 bits (overflow) will be truncate automatically. So in this numerical system, if the number falls between 0x0000 ~ 0x5555, that's means the number can be divided by 3(in the c code, present by "<= M - 1"), 0x5556 ~ 0xAAAA is the remainder 1, 0xAAAB ~ 0xFFFF is the remainder 2.
And +1 make sure the calculation number will be within the range
e.g.
2/3 -> this means: 2 * (0x5555 + 1) = 0xAAAC, so the remainder is 2
4/3 -> this means: 4 * (0x5555 + 1) = 0x5558, so the remainder is 1
Why +1 is needed as an adjustment?
without +1, what happen in "2/3"?
2 * (0x5555) = 0xAAAA, this number would is on the boundary number, so that this is way to distinguish the answer is 1 or 2.
That's do some tests in the linux system. Using the number is from 0 to MAX_INT >>2 (C MAX_INT) for the checking, compiler option using -Os
This is use modulus to cal.
This is use the "is_divisible" method to cal.
You can see that, this method is better performance.
You can see the Fizz/ Buzz problem
That has another question, why is function "is_divisible(uint32_t n, uint64_t M)" input paramter using "uint32_t" not using "uint64_t" ?
let we use n = 0x7FFFFFFFFFFFFFFF , n = 0x8000000000000000 for testing…
that's wrong answer…
The reason is "+1"
as the same we use 16bit system for example.
nr | hex after mul M3 | over num(compare with 0x5555) |
---|---|---|
1 | 0x5556 | 1 |
2 | 0xAAAC | 2 |
3 | 0x0002 | 2 |
4 | 0x5558 | 3 |
5 | 0xAAAE | 4 |
6 | 0x0004 | 4 |
Using another format is more easy to understand
nr | hex | over | nr | hex | over | |
---|---|---|---|---|---|---|
1 | 0x5556 | 1 | 4 | 0x5558 | 3 | |
2 | 0xAAAC | 2 | 5 | 0xAAAE | 4 | |
3 | 0x0002 | 2 | 6 | 0x0004 | 4 |
by this rule, we can know which nubmer will make to over the next range.
so, we can calcular the max number: 0x5555/2 + 0x5555 = 0x7FFF
that's the answer why using uint32_t, (half the 64 bit) to make sure all uint32 can be calculated.