# rand() function in C lib
Due to the issue about the random function recently, I survey some related topics. In fact, rand() in C lib is not secure enough to be used in cryptographic or privacy problem. This kind of random function we called random number generator (RNG), and we called the random function, which strong enough in cryptographic scenario, cryptographically-secure pseudorandom number generator (CSPRNG).
In the following, I will list some drawbacks about PRNG. And in order to imagine easily, we just regard PRNG as rand() in C lib for convenience.
---
## 1. The rand() is deterministic
Everyone has the experience about calling rand(), when we need a random number. Before calling it, we may need to choose the **"seed"** used to decide random number. In the common situation, we usually call **"srand(time(NULL))""** first. **That is the problem!** If we try to use this random number in implementation required security or privacy, it may cause system be vulnerable.
For example, if we use this method to construction "private key". And an attacker know which **"year"** you generated this random number. The complexity an attacker using brute-force key search is about $2^{25}$. (**NOTE: we use time as seed**) This complexity keeps decreasing as the more precise time an attacker known. (**e.x. year, month, even day**).
>**The simple solution is that using another way to generate "seed", but it is still not ensure security if the bad PRNG be applied.**
In other situations , it will not be a problem, if we use this random number in game field......and so on. Hence, we need to be careful when it is related to security and privacy.
---
#### REALTED ARTICLES:
**How vulnerable is the C rand() in public cryptography protocols?**
https://crypto.stackexchange.com/questions/15662/how-vulnerable-is-the-c-rand-in-public-cryptography-protocols
**Use of Cryptographically Weak Pseudo-Random Number Generator (PRNG)**
This links lists about issue about use rand() in software, and casue some problems.
CWE™ is a community-developed list of software and hardware weakness types.
https://cwe.mitre.org/data/definitions/338.html
I do a sreenshot about the page, if you are interested, you can checkout this link.

## 2. The (rand() % N) has bad statistic property
The number is not sampled uniformly. If we want uniformly random a number in $[0, N-1]$, we may use **rand() % N**. Note that the output of rand() has an upper bound. I mean the output of rand() is an type of **"int"**, it may be 2 bytes or 4 bytes up to system defined. Therefore, rand() is always less than $2^{32}$ (if we suppoes "int" takes 4 bytes memory storage). This cause a problem, when we do modulo.
In order to understand, we just suppose the upper bound of rand() is **"16"** and now we are going to samples a number between $[0, 10]$.
The uniformity about rand() is still a problem, but now we believe it is uniformly in range $[0,16]$. In other words, rand() samples a number in range $[0, 16]$, every nubmer has the same probability to be chosen. We call this property **"equal likely"**
Therefore, we use **rand() % 11** directly.
#### Case 1: $rand() \% 11 == 0$
Then rand() can be $0, 11$ with probability $2/16$
#### Case 2: $rand() \% 11 == 1$
Then rand() can be $1, 12$ with probability $2/16$
#### continued....
#### Case 8: $rand() \% 11 == 7$
Then rand() can be only $7$ with probability $1/16$
This causes bias if we use $rand()$ and $modulo$ $N$ simultaneously. There is a way to reduce bias with
$$\displaystyle\frac{rand()}{RAND\_MAX + 1}*N$$
(NOTE: not the best way)
---
#### REALTED ARTICLES:
**Why do people say there is modulo bias when using a random number generator?**
https://stackoverflow.com/questions/10984974/why-do-people-say-there-is-modulo-bias-when-using-a-random-number-generator
**Random integers in C, how bad is rand()%N compared to integer arithmetic? What are its flaws?**
https://stackoverflow.com/questions/49880304/random-integers-in-c-how-bad-is-randn-compared-to-integer-arithmetic-what-a
**Why is the use of rand() considered bad?**
https://stackoverflow.com/questions/52869166/why-is-the-use-of-rand-considered-bad
## 3. The rand() uses Linear congruential generator (LCG)
The linear congruential generator looks like
$$ {\displaystyle X_{n+1}=\left(aX_{n}+c\right){\bmod {m}}} $$
$a、c、m$ are parameters, can be chosen depending on implementation. And $X_0$ stands for **"seed"**. Hence, we can get a random number from $rand()$
$$ {\displaystyle X_{2}=\left(aX_{0}+c\right){\bmod {m}}=\left(a*{seed}+c\right){\bmod {m}}} $$
and next time we get random number from $rand()$ by recursion, which is usually a fast, but **notoriously bad class of PRNGs**. ([see this discussion](https://stackoverflow.com/questions/52869166/why-is-the-use-of-rand-considered-bad))
#### In glibc,

(from wiki)
#### source code

---
#### REALTED ARTICLES:
**Linear congruential generator**
https://en.wikipedia.org/wiki/Linear_congruential_generator
## Other related articles
**你的程式夠「亂」嗎?**
https://www.ithome.com.tw/voice/110007
**The Importance of True Randomness in Cryptography**
https://www.design-reuse.com/articles/27050/true-randomness-in-cryptography.html
**Taiwan’s national “Citizen Digital Certificate” problem**
One of the authors of this paper is **Daniel J. Bernstein**
https://link.springer.com/chapter/10.1007/978-3-642-42045-0_18
the news 8 years ago, the issue is about **certificate and RNG**
https://www.inside.com.tw/article/2984-government-certifications-broken
## Conclusion
In the above, I just list some article about $rand()$ and $PRNG$. There are more discussions can be found by **"googling"**. I think that the ordinary $rand()$ indeed has some drawbacks. That is why people need to design or construct another kinds of random number generator, cryptographically-secure pseudorandom number generator (CSPRNG).
Intuitively, I think that the problem 2 may also need to solve in the case **CSPRNG**. In reality, we cannot get as large number from machine or computer as we want. But I regard that in the construction of **CSPRNG** they will try to adjust the algorithm such that **CSPRNG** will have better statistic property. **I mean more closed to uniformly distribution.**
---
Once people may think that the people/attacker outside cannot understand the method we use to generate random function. It makes no difference if we use bad PRNG. This problem is discussed already years ago. We call this problem **Security through obscurity**.
>**The United States National Institute of Standards and Technology (NIST) specifically recommends against using closed source as a way to secure the software (i.e. “security through obscurity”), and they state, “system security should not depend on the secrecy of the implementation or its components”**
:::info
**I think hide the implementation is fine, but if unfortunately people broken this system and see the method use in rand() is using. (for example "time" as seed)**.
**It will be a problem if your implementation is about security or privacy.**
**If you still decide to use "rand()", then at least don't use "time" as seed.**
:::
---
#### REALTED ARTICLES:
**Kerckhoffs's principle**
https://en.wikipedia.org/wiki/Kerckhoffs%27s_principle
**Security through obscurity**
https://en.wikipedia.org/wiki/Security_through_obscurity
**Open-source and the “security through obscurity” fallacy**
https://www.efrontlearning.com/blog/2012/04/open-source-and-the-security-through-obscurity-fallacy.html
**Guide to General Server Security**
https://csrc.nist.gov/publications/detail/sp/800-123/final
**Security Through Obscurity Is Not an Answer**
https://www.pearsonitcertification.com/articles/article.aspx?p=2218577&seqNum=7
###### tags: `Cryptography` `密碼學` `PRNG` `CSPRNG`