# IB Topic 2 Computer Science - Computer architecture

###### tags: `IB` `Computer Science` `Arduino`

[ToC]

# Topic 2 Computer organization

_This covers what is in the book from page 61 onwards_

We're going to talk about computer organization and computer architecture.

What is architecture in terms of computing? Depending on the context, it can refer to:

- The main outline of a piece of software (especially when it is a complex one)
- The main outline of a piece of hardware.

Here we're going to focus on the latter and talk about how hardware is arranged inside machines so they can do what they do.

## The basic i/o machine

As a first approach we can think of any machine as an input/output machine. This is an _abstraction_, but it works.

* A machine has some kind of input (sensors/data)
* The machine has some kind of process and access to some kind of storage.
* And finally there is some kind of output (actuators, information)

This is true of _lots_ of machines, from phones to microwave ovens to laundry machines. They can even be mechanical (as in a piano).

#### Exercise: Think of inputs and outputs

The **input** of a device is something that gives the machine _information_ and the **output** is some kind of decision that comes _out_ of the machine as a conclusion.

In one mechanical example, a doorbell, if we press the button (input) the bell rings (output).

:::info
There are some cases where the information is bi-directional (it goes both ways). In this case we usually don't talk about input-output but about "communication".
:::

Let's make a list of physical inputs of _any_ device, physical outputs of _any_ device and communication systems of _any_ device.
Ideas in the spoiler section

:::spoiler
**Device: Phone**

Inputs:
- Capacitive screen
- Buttons (volume and others)
- Microphone
- Battery sensor (through some electrical sensors)
- Fingerprint sensor (also capacitive)
- Gyroscope (the sensor that tells the orientation of the phone)
- Cameras

Outputs:
- Speakers
- Screen (the lights on it)
- Vibration motor
- Flashlight (LED)

Communication:

Wired
- Port (USB, Lightning)
- Mini-jack (audio)

Wireless
- Wi-fi
- Bluetooth
- Mobile data connection (2G, 3G, 4G, 5G)
- NFC

**Device: Laptop**

Inputs:
- Buttons
- Keyboard
- Touch screen (optional)
- Touch pad (also capacitive)
- Fingerprint sensor (also capacitive)
- Microphone
- Camera
- Battery sensor (through some electrical sensors)
- Temperature sensor

Outputs:
- Speakers
- Screen
- Lights (LEDs that show whether the power is on, or whether it is charging)
- The small screen in some MacBook Pro versions
- Cooling system actuators (in air-cooled ones)

Communication:

Wired
- USB
- HDMI (screen)
- RJ45 (LAN)

Wireless
- Wi-fi
- Bluetooth
- NFC (sometimes)
:::

Other input sensors: http://academy.cba.mit.edu/classes/input_devices/index.html

Other output devices: http://academy.cba.mit.edu/classes/output_devices/index.html

### Entering into the machine

These machines have existed and have been studied since the 19th century (Ada Lovelace), but the main work was done in the first half of the 20th century.

In the 1940s the first computers were born: computers the size of rooms.

![](https://i.imgur.com/uWRp5o7.png)
*[ENIAC: source Wikipedia](https://en.wikipedia.org/wiki/ENIAC)*

![](https://i.imgur.com/7jrd8cm.png)
*Harvard machine*

## CPU

CPU stands for "Central Processing Unit" and it refers to the hardware component of a computer that does the calculations.

### Different kinds of architectures and paradigms

Regarding the design of these CPUs we can find several "paradigms".
These paradigms are ways of doing things, and we usually need to know which paradigm we are using so we can choose between them.

:::success
Remember that a paradigm is an idea that we want to adhere to. Something that we want to achieve even if reality is not that perfect. For example, the IB expects the student to behave according to the IB profile. This is a paradigm used to assess the student's grades in the end.
:::

Usually there is no "good paradigm" or "bad paradigm", because each of them has some advantages and disadvantages.

Sometimes these paradigms, these "boxes", fall short when describing a new technology, so sometimes they fade away. One such "paradigm-breaking" technology is quantum computing ([more about it here](https://research.ibm.com/blog/ibm-quantum-roadmap-2025)). Since quantum computing doesn't work with just 1s and 0s, it is so different that some ideas about computing would need to be rethought if that technology takes over.

### Regarding the instructions that the CPU reads

Each type of CPU has a set of "fundamental operations" that it needs to work properly. Those "fundamental operations" are the bricks of any algorithm that the CPU (or the CPU cores) is going to run.

Usually we have several layers of compilers that automatically transform the complex operation we want to do into these fundamental operations.

Depending on the approach to these operations there are (were) two types of architecture.

#### **RISC** (Reduced instruction set computing).

This approach says that the instructions should be few and highly optimized. This has the advantage of making it easier to go multi-core and to set up several CPUs in parallel. Nowadays you can see it in microcontrollers.

You can find more about it [here](https://es.wikipedia.org/wiki/Reduced_instruction_set_computing)

#### CISC (Complex Instruction Set Computing)

This category exists as a non-established antonym of RISC, meaning "everything that is not RISC".
But the main concept is that the CPU should have as many instructions as possible so you can use them directly.

You can read more about it [here](https://en.wikipedia.org/wiki/Complex_instruction_set_computer)

#### The conclusion from RISC and CISC today

Nowadays we have, for example, the [x86-64 type of microprocessors](https://en.wikipedia.org/wiki/X86-64) that are widely used. They work with a fairly large set of instructions, but they are optimized to work only with a subset of those instructions. Are they still CISC if they are not optimized for the full set of instructions? Are they RISC?

In our case, our Arduino works on a RISC model.

:::info
RISC and CISC models are not assessed directly in the IB exams, but I think that "paradigm" is a good concept for handling this and other ideas, and explaining through this lens can help the student understand other concepts more easily.
:::

### Fundamental operations

(page 241 from the book)

We can see a set of fundamental operations like these:

* LOAD (instructions or data)
* ADD (data to the data that we already have, doing addition)
* STORE (instructions or data)
* COMPARE (data with data) using logic gates.

We can think of RISC computers as working with these and maybe a couple more that are combinations of these instructions.

### Parts inside a CPU

(going back to page 62)

Inside the CPU we have several important components.

* Control Unit (CU)
* Arithmetic Logic Unit (ALU) _The beast_
* Memory Address Register (MAR)
* Memory Data Register (MDR)

Then we have the memories, divided into

* __Primary memory__ accessible from the CPU
* __Secondary memory__ not directly accessible from the CPU

There are several types of primary memory. Mainly we have RAM (Random Access Memory) and ROM (Read Only Memory), and we usually also have caches.

(This part is just a summary.
Students are expected to improve this part)

#### The Control Unit (CU)

* Responsible for managing the CPU
* Controls the retrieval of instructions and data from the primary memory to the CPU, that is, what is going to be *fed* to the ALU (_the beast_)
* Contains several registers. A register is a small storage location that can hold data, usually a multiple of 8 bits. Two of them are
    * MAR
    * MDR

#### Memory Address Register (MAR)

* Holds the memory address (where the data is stored) so that the data itself can be fetched from memory.
* It may also hold the address where the processed data will be stored
* It's connected to the primary memory (usually the RAM) via the address bus (_buses are sets of data wires that run in parallel from A to B_)

We saw what was stored in the MAR and then sent to the RAM to retrieve/write the data

![imagen](https://hackmd.io/_uploads/r1q8mnig0.png)

https://youtu.be/7J7X7aZvMXQ?si=8cLvppz2aKTsI1Sz&t=548

#### Memory Data Register (MDR)

* Holds the data itself that is going to be used by the ALU (_the beast_) and then saved to the RAM
* Whichever memory address the MAR is holding, the corresponding data will be loaded into the MDR for processing by the ALU (_the beast_). Then the ALU (_the beast_) places the (_digested_) result into the MDR, and the data is copied to the memory address in RAM specified by the MAR.

#### THE ARITHMETIC LOGIC UNIT (ALU) THE BEAST

This is the part of the microprocessor that _actually_ performs the operations at very high speed, doing arithmetic, logic or input/output operations.

### Microcontroller vs microprocessors

Microprocessors (such as the Intel Core i5) are more powerful, but they need other chips around them in order to work. Microcontrollers (like the chip embedded in an Arduino) are less powerful, but they need less stuff around them. They are more self-contained.

Microprocessors always go on a motherboard inside a computer. Microcontrollers are a whole entity that can operate on their own.
Microcontrollers have small microprocessors inside. Microcontrollers are also associated with other kinds of hardware interfaces.

We usually say that a laundry machine has a microcontroller rather than a full computer, even if the line is sometimes blurry.

In our case, the Arduinos that we are using have a Harvard architecture, RISC microcontroller.

#### Multi cores

Some microprocessors are multi-core to improve performance (nowadays almost any microprocessor has multiple cores). In this kind of architecture there is usually a limit of around 8-10 cores, because the cores need to be coordinated. Each core works on a sequence of instructions called a *thread*.

Multiple cores can also be found in some microcontrollers, and GPUs can have even hundreds of cores. Nvidia does a lot of work in this area.

### Machine instruction cycle

With these parts explained, we can read about the machine instruction cycle on pages 68-69.

The machine instruction cycle has 4 steps:

1) Fetch the instruction from primary memory (RAM most of the time) to the Control Unit (CU)
2) Decode the instruction in the CU
3) Execute the decoded instruction in the ALU (_the beast_)
4) Store the result in the primary memory

After step 4, the cycle starts again by fetching the next instruction.

This is something that happens indefinitely, in time with the clock (we will see what the concept of clock is).

:::info
**Decode** and **encode**

Encoding can be thought of as similar to encrypting, but it is not strictly the same. When we encode information in low-level computer science, we are making signals shorter (4 inputs, 2 outputs).

Decoding can likewise be thought of as similar to decrypting, but it is not strictly the same. When we decode information in low-level computer science, we are making signals longer (2 inputs, 4 outputs).
:::

### Memory Architectures

Regarding memory there are 2 main architectures.
* The first one is **Von Neumann** ([a mathematician and physicist who also worked on the Manhattan Project](https://en.wikipedia.org/wiki/John_von_Neumann)). This architecture uses one memory for both instructions and data.

![](https://i.imgur.com/ccPtS7d.png)

In the image it is the "Memory Unit". Most laptop microprocessors use this kind of architecture.

* The second is **Harvard** ([named after a computer called the Harvard Mark I](https://en.wikipedia.org/wiki/Harvard_Mark_I)), and it separates the memory for instructions from the memory for data.

![](https://i.imgur.com/PO3y35Z.png)
(source: Wikipedia)

Our microcontrollers use this kind of architecture, so they don't have just "a RAM": they have Flash, SRAM and an EEPROM. We will see it in detail in the _datasheet_.

:::info
This information is also not included directly in the IB syllabus, but I think it gives some context and usefulness to the information about the ALU and the types of memory.
:::

### Reading datasheets

All of this becomes useful when reading datasheets. Datasheets give us a lot of information about a specific component.

Here is the datasheet of [the CPU of my computer](https://d2pgu9s4sfmw1s.cloudfront.net/UAM/Prod/Done/a062E00001ZcMRLQA3/e47f2a25-a623-47d0-b298-81b05b263bfd?Expires=1657114862&Key-Pair-Id=APKAJKRNIMMSNYXST6UA&Signature=s24PmtR5YM-m6~s2TDjKSOMUh1kZnvar-417JcHhxVBHP4dNzFRyemb8pixuzmUn6bskVxWZRlRSH89idbUyzi3l3NyZSdSPizn9UQVRGzV0aMa8~pOyxLLUhA8JhiMk6KvkPxUHZqfV5996jpKLDXh-df-Uvp0F-rrbyIrCjvcfqXlLxpogdFUDudsnXLLTmpyNqaDoDYQbh27Qld5JkdiOav0U-Qzd8LOUH3Ut23L6k4ilKe~9Swvj94EJp3Kla8mg71FoBeSBAiqAbNvNI3sdD6BnUA4-gW-yuILhTIwLf5YAnwjYr7DVDHBsg2YcxGwY1nEnjSQnoKDCH~QD-Q__)

In our case we have these Arduinos that we are going to explore a bit. They have this processor: **ATmega328P**. We can check that [on the main Arduino page](https://store.arduino.cc/products/arduino-uno-rev3)

And this **ATmega328P** has a datasheet that specifies everything about it and how it works.
Here is the link to the datasheet: http://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-7810-Automotive-Microcontrollers-ATmega328P_Datasheet.pdf

If you go to page 6 you can see an overview of the chip:

![](https://i.imgur.com/0B5ozNs.png)

If you go to page 9 you can see this, which is what is inside the CPU:

![](https://i.imgur.com/1BH2jFc.png)

And you will see that we have Data SRAM and an EEPROM ([Electrically Erasable Programmable Read-Only Memory](https://es.wikipedia.org/wiki/EEPROM)), which is a kind of ROM. You can see a tutorial applied to Arduino [here (in Spanish)](https://programarfacil.com/blog/arduino-blog/eeprom-arduino/).

## Other concepts

### Clock and cycle

Usually the cycle of a computer runs very fast. We measure it by how many times it happens in a second. These clocks run in the order of MHz (1 000 000 times per second) or GHz (1 000 000 000 times per second).

## Cache

![29004](https://hackmd.io/_uploads/HkaoGAB0p.jpg)

Cache can appear in CS in different contexts.

In software, it is data that is held somewhere temporarily to have it at hand. For example, browsers keep a cache of the images that you have seen, so if you visit the same page again you don't need to download the same image again.

The other context is the same concept but in _hardware_.

#### Cache in RAM

Cache in RAM stores the most commonly used instructions, so if they're needed again they are in a place that is easier to access.

Static RAM is an example of cache. It is a more expensive and faster type of RAM, and it's reserved for just a small cache of common instructions.

Dynamic RAM is slower (but bigger and cheaper) RAM.
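To make the idea of "keeping things at hand" more concrete, here is a minimal sketch of a cache sitting in front of a slower memory. It is only an illustration: the class name `TinyCache`, the dictionary standing in for the RAM and the "throw away the least recently used entry" rule are all assumptions made for this example, not a description of how a real CPU cache is built.

```python
from collections import OrderedDict


class TinyCache:
    """A toy cache in front of a slow memory (hypothetical, for illustration)."""

    def __init__(self, backing_memory, capacity=4):
        self.backing_memory = backing_memory  # stands in for the slower RAM
        self.capacity = capacity              # how many entries fit in the cache
        self.lines = OrderedDict()            # address -> value, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            # Cache hit: the value is already at hand.
            self.hits += 1
            self.lines.move_to_end(address)   # mark as most recently used
            return self.lines[address]
        # Cache miss: make the slow trip to the backing memory.
        self.misses += 1
        value = self.backing_memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            # Assumed policy: evict the least recently used entry.
            self.lines.popitem(last=False)
        return value


# A pretend RAM with 16 addresses.
ram = {address: address * 10 for address in range(16)}
cache = TinyCache(ram, capacity=4)

# Programs often touch the same few addresses again and again (a loop).
for address in [0, 1, 2, 0, 1, 2, 0, 1, 2, 9]:
    cache.read(address)

print("hits:", cache.hits, "misses:", cache.misses)
```

Only the first visit to each address pays the cost of going to the slow memory. Every repeated visit is served from the small, fast store, which is the whole point of a cache.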
L1 cache is placed inside the CPU, next to each core.

L2 cache is larger and slower than L1. In modern processors it is also on the CPU chip, sitting between the L1 cache and the RAM.

#### Cache in the CPU

Cache in the CPU can be very small (from a quick look at Intel's processors, around 12MB), but it's still very useful because it only holds very commonly used instructions and data.

https://www.intel.com/content/www/us/en/products/sku/199271/intel-core-i510400-processor-12m-cache-up-to-4-30-ghz/specifications.html

![image](https://hackmd.io/_uploads/rkf4yIUOJx.png)

The idea of having a cache is to speed up calculations so the processor doesn't have to go to the RAM and back to fetch the next instruction/data.

### Size of the word

Each instruction has a set number of bits, typically 8, 16, 32 or 64. It's a bit misleading, because you may use 8-bit processors for some things and that's OK. Nowadays laptops use a 64-bit architecture.

This "size of the word" describes how big the registers and the instructions themselves can be.

#### CPU vs GPU

![](https://i.imgur.com/icNvEKI.png)
_[source](https://www.youtube.com/watch?v=pmDLOdDabRo&t=35s)_

#### Why do we need to do small math operations to play a game like Minecraft/Valorant/LoL?

GPUs are mainly made for 3D rendering. When we have to render something in 3D (like a box), we need to put the elements of the box into the frame (which will be the screen). To do that, the computer (the GPU) needs to calculate all the lines that go from the edges of the box to the camera and find their intersection with the plane of the point of view. That's a lot of geometrical calculations to do.

![imagen](https://hackmd.io/_uploads/SJMr_H0gC.png)
_image from the Branch Education video that you can see later_

They are more or less simple geometrical calculations, but there are many of them. That's why the architecture of a GPU has as many _threads of execution_ as possible, at the cost of precision that is not that good.
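To get a feeling for how small each of these calculations is, here is a rough sketch of projecting the corners of a box onto the viewing plane. It assumes the simplest possible setup, which is not taken from the video: the camera sits at the origin looking along the z axis and the screen is the plane `z = focal_length`. The names `project_point` and `box_corners` are made up for this example.

```python
def project_point(point, focal_length=1.0):
    """Project a 3D point onto the plane z = focal_length (camera at the origin)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Similar triangles: the farther away the point, the smaller it looks.
    scale = focal_length / z
    return (x * scale, y * scale)


# The 8 corners of a box, 2 units wide, between 3 and 5 units from the camera.
box_corners = [
    (x, y, z)
    for x in (-1.0, 1.0)
    for y in (-1.0, 1.0)
    for z in (3.0, 5.0)
]

# Each corner is independent of the others, so a GPU could hand
# every one of them to a different thread at the same time.
projected = [project_point(corner) for corner in box_corners]

for corner, screen_point in zip(box_corners, projected):
    print(corner, "->", screen_point)
```

Each projection is just one division and two multiplications. A real scene has a huge number of points and has to be redrawn many times per second, so what matters is doing a lot of these tiny operations in parallel rather than doing each one extremely fast.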
A GPU usually has lots of cores rather than the handful that is common in a CPU.

In this Branch Education video you can see more information about this.

https://www.youtube.com/watch?v=C8YtdC8mxTU

{%youtube C8YtdC8mxTU%}

#### The other present-day uses of GPUs

* Mining crypto (since finding hashes involves a lot of simple calculations). Don't do crypto or NFTs; they are a greater-fool scam.
* Simulations (some of them) that involve many small operations
* Artificial Intelligence (LLMs and other types of neural networks)

### References:

Detailed explanation of these types of architecture, in Spanish: http://cv.uoc.edu/annotation/8255a8c320f60c2bfd6c9f2ce11b2e7f/619469/PID_00218274/PID_00218274.html

Using a Jetson: https://www.youtube.com/watch?v=pmDLOdDabRo&t=35s

## Primary memory

Primary memory is defined by being directly accessible by the CPU.

Mainly we have RAM (Random Access Memory) and ROM (Read Only Memory), and we usually also have caches.

RAM is much smaller than secondary memory in size but faster to access. Also, RAM stores its data in capacitors that lose it once the power is off.

## Secondary memory

Secondary memory is defined by not being directly accessible by the CPU. We usually need to move the information back and forth between the RAM and the secondary memory.

Types of secondary memory:

* HDD. Hard disk drive (also called "mechanical disks")
* CDs. Compact Discs
* DVDs
* Flash drives (USB sticks)
* SD cards
* SSD (Solid state drives)

To read a CD/DVD/Blu-ray we need an _optical drive_: a CD/DVD/Blu-ray reader.

### BIOS and UEFI

Reference: https://www.spiceworks.com/tech/devops/articles/what-is-bios/