# 50.002 W1 Pre-Class 1 (Representing and Encoding Information)
## What is information :question:
**Information** is knowledge communicated or received concerning a particular fact or circumstance. We can convey information using various methods, such as language. However, machines are not so complex that they can understand human language. They are only capable of storing and parsing through 1s and 0s (binary digits or ==*bits*==) and thus information is stored using ==voltages==.
## Binary Number System :1234:
**Binary Number System** is base 2.
> i.e. 0 0 1 1 0 1 is equal to 1x2^0 + 1x2^2 + 1x2^3 = 13 (in decimal or base 10)
The rightmost digit is known as the ==*Least Significant Bit*== and the leftmost is known as the ==*Most Significant Bit*== as the rightmost one carries the least value.
However, binary number system doesn't store the data efficiently when it comes to larger integers.
> i.e. 16 can be expressed with 2 digits with the base 10 system, but it is expressed as 1 0 0 0 (4 digits) in the binary number system.
## Hex and Octal Number System
**Hexadecimal (base 16) and Octal (base 8) Number System** are also often used since they are based on the powers of 2 to shorten the binary digits.
| Decimal | Binary | Octal | Hexadecimal |
| ------- | ------:|:----- |:----------- |
| 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 |
| 2 | 10 | 2 | 2 |
| 3 | 11 | 3 | 3 |
| 4 | 100 | 4 | 4 |
| 5 | 101 | 5 | 5 |
| 6 | 110 | 6 | 6 |
| 7 | 111 | 7 | 7 |
| 8 | 1000 | 10 | 8 |
| 9 | 1001 | 11 | 9 |
| 10 | 1010 | 12 | A |
| 11 | 1011 | 13 | B |
| 12 | 1100 | 14 | C |
| 13 | 1101 | 15 | D |
| 14 | 1110 | 16 | E |
| 15 | 1111 | 17 | F |
| 16 | 10000 | 20 | 10 |
| 17 | 10001 | 21 | 11 |
> i.e. computing hexadecimal 0x15F = 1x16^2 5x16^1 + 15x16^0
> Base number system is indicated by a subscript. 77<sub>8</sub> = (7x8+7)<sub>10</sub> = 63<sub>10</sub>
## Number System Conversion :arrow_left: :arrow_right:

### Binary to Octal
A binary string can be converted into an octal number system by grouping the binary string into groups of 3:
> 101 101 101 111 = 5557<sub>8</sub>
### Binary to Hex
A binary string can be converted into hex by grouping the binary string into groups of 4:
> 1011 0110 1111 = 0xB6F
If we can't group the bit string into groups of 4, we can always pad the higher bits of the binary (this won't alter the value).
## 2's Complement (Representing signed integers)
MSB is used to indicate the sign of the number.
>00101 is positive (the MSB is 0)
>11011 is negative (the MSB is 1), and is equal to (-1)x2^4 + 1x2^3 + 1x2 +1 = -5
2's Complement is the algorithm used to obtain the negated value of a particular number.
* **Step 1** Inverse all 0's and 1's in the original binary
* **Step 2** Add 1 to the inverted number
Better human algorithm:
* **Step 1** Find the first occurring 1 from the LSB.
* **Step 2** Flip all the bits to the left of this bit.
## Encoding
**Encoding** refers to assigning ==representations== to information.
In this course we assume that all the choices are equally probable (fixed length encoding).
ASCII encoding is a 7-bit encoding that is describes English characters. Each character of a text file encoded in this format has the size of 7-bit.
UTF-16 or 16-bit Unicode encoding is 16-bit encoding that could describe foreign languages (Korean, Japanese, etc).
Hence, we can create electronic devices that could map (decode) information given an encoded information.
## Information and Uncertainty
The amount of information held by an event is inversely proportional to the probability of that event happening.
Hence, the amount of information is proportional to the amount of uncertainty of a certain event.
Formal definition: For discrete events with probability of occurence of (p1,p2,...), the basic measure of information for all of these events is the bit.

For example, to encode 8 equally probable events, the number of bits needed are:

If given N equally probable choices, and it is narrowed down to M choices, where N > M, we are given:
 bits of information
(the original number of bits required - the number of bits narrowed down)
> e.g. if our pool of possible events are narrowed down from 8 to 3, we will have 2 bits instead of 3 to encode the 3 different events: 00, 01 or 10.
> Therefore we are given at least 1 bit of information.