---
tags: Server
---

Rack Infrastructure
===

Racks come in various sizes, typically measured in Rack Units. A Rack consists of multiple [Rack Servers] or [Blade Enclosures].

---

## Rack Server

A Rack Server comes with all necessary functionalities. It can operate independently as a single Node or host multiple Nodes.

- Single Node: Contains all required components to function as a standalone unit.
- Multiple Nodes: Chassis with shared infrastructure, such as power and cooling systems.

A `Node` includes the following components:

### Motherboard

The motherboard is the core hardware platform that connects all essential components:

- Houses [BMC], [NIC], CPU, GPU, RAM, and interfaces for [Storage Devices].
- Acts as a central hub for communication between components.

![MotherBoard](https://hackmd.io/_uploads/HyYcTrhf1x.png)

### BMC

The BMC is a remote management module embedded in the [motherboard], enabling monitoring and control of the Node's [hardware].

- Hardware Monitoring: Tracks CPU temperature, fan speeds, and other metrics.
- Remote Control: Supports remote power cycling and firmware updates.
- Storage Health: Monitors and reports the status of [Storage Devices].
- Network Integration: Uses [NIC] for remote communication.
- BIOS Collaboration: Works with [BIOS] for remote startup and configuration (e.g., via IPMI).

### BIOS

The BIOS is firmware stored on the [motherboard] responsible for [hardware] initialization and OS booting.

- Hardware Initialization: Prepares [Hardware] such as CPU, RAM, GPU, and [Storage Devices].
- Network Boot: Configures [NIC] for PXE Boot or interacts with [BMC] for remote startup.
- Configuration: Provides a user interface for setting system parameters.

### NIC

The NIC enables network communication for the Node.

- [Motherboard] Integration: Installed via dedicated slots or onboard connections.
- BMC Support: Managed by [BMC] for remote network configuration.
- [BIOS] Interaction: Configures PXE Boot for network-based system startup.

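
The remote-control and PXE Boot roles of the BMC described above can be sketched as a small helper that builds `ipmitool` command lines. This is a minimal sketch, not a definitive workflow: it assumes the BMC is reachable over IPMI v2.0 (`lanplus`) and that `ipmitool` is installed. The `BmcTarget` class, the `ACTIONS` table, and the address `10.0.0.42` are hypothetical names chosen for illustration. The helper only builds argument lists and never executes anything.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class BmcTarget:
    """Connection details for one Node's BMC (placeholder values only)."""
    host: str
    user: str


# Hypothetical action names mapped to ipmitool subcommands.
ACTIONS = {
    "power_status": ("chassis", "power", "status"),
    "power_cycle": ("chassis", "power", "cycle"),
    "pxe_next_boot": ("chassis", "bootdev", "pxe"),
    "read_sensors": ("sensor", "list"),
}


def ipmi_command(target: BmcTarget, action: str) -> list[str]:
    """Build (but do not run) an ipmitool argument list for a remote BMC."""
    return [
        "ipmitool",
        "-I", "lanplus",    # IPMI v2.0 over LAN
        "-H", target.host,  # BMC address, reached through the NIC
        "-U", target.user,
        "-E",               # read the password from the IPMI_PASSWORD env variable
        *ACTIONS[action],
    ]


if __name__ == "__main__":
    bmc = BmcTarget(host="10.0.0.42", user="admin")
    for name in ACTIONS:
        # Print a shell-safe rendering of each command for review.
        print(shlex.join(ipmi_command(bmc, name)))
```

To actually run a command, the returned list could be passed to `subprocess.run(...)` with `IPMI_PASSWORD` set in the environment. Keeping construction separate from execution makes the commands easy to review before they touch real hardware.
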
### Hardware

The foundational computing and processing components installed on the [motherboard].

- CPU
- GPU
- RAM
- etc.

Hardware Initialization: Collaborates with [BIOS] during system startup.

### Storage Devices

Provide persistent data storage, connected to the [motherboard] via interfaces like SATA or NVMe.

- [BMC] Monitoring: Tracks health and failure events of storage devices.
- [BIOS] Initialization: Prepares storage for OS boot.

#### Common types

- SSD
- HDD

---

## Cooling Systems

Cooling systems are essential for maintaining the optimal temperature of hardware in a rack, ensuring efficient operation and preventing overheating. Two primary methods are commonly used: Air Cooling and Liquid Cooling. Immersion Cooling is a liquid-based variant.

### Liquid Cooling

Liquid cooling systems use liquid (typically water or a specialized coolant) to absorb and remove heat from hardware.

#### CDU

The CDU contains 1+1 redundant, hot-swappable pump systems that circulate coolant to the cold plates, which carry heat away from the CPUs and GPUs. The CDU's cooling capacity is up to 100 kW, enabling extremely high rack densities.

#### Cold Plate

Cold plates sit on top of CPUs and GPUs and cool the chips efficiently by letting coolant flow through microchannels in the plate.

#### CDM

The CDM is the set of distribution pipes that supplies coolant to each server and returns the warmed coolant to the [CDU].

### Immersion Cooling

Immersion cooling submerges hardware in a non-conductive liquid coolant.

### Air Cooling

Air cooling systems rely on the movement of air to dissipate heat generated by hardware components.

## Power Equipment

## Network Equipment

---

## Blade Enclosure (Chassis)

A Blade Enclosure provides shared infrastructure for hosting multiple `Blade Servers`.

- Shared Resources: Centralized power, cooling, and management resources for efficiency.
- Blade Servers: Individual computing units, each potentially containing one or more Nodes.

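
The relationship between a Rack, its Rack Servers and Blade Enclosures, and the Nodes they host can be sketched as a small data model. This is an illustrative sketch only: the class and field names, the 42U default capacity, and the example equipment are all assumptions. The only figure taken from this document is 1U = 1.75 inches (44.45 mm).

```python
from dataclasses import dataclass, field

MM_PER_RACK_UNIT = 44.45  # 1U = 1.75 in = 44.45 mm


@dataclass
class RackServer:
    name: str
    height_u: int
    nodes: int = 1  # 1 for a single-Node server, more for a multi-Node chassis


@dataclass
class BladeEnclosure:
    name: str
    height_u: int
    # One entry per Blade Server: how many Nodes that blade contains.
    nodes_per_blade: list[int] = field(default_factory=list)

    @property
    def nodes(self) -> int:
        return sum(self.nodes_per_blade)


@dataclass
class Rack:
    capacity_u: int = 42  # assumed default; real racks come in various sizes
    equipment: list = field(default_factory=list)

    @property
    def used_u(self) -> int:
        return sum(item.height_u for item in self.equipment)

    @property
    def free_u(self) -> int:
        return self.capacity_u - self.used_u

    @property
    def total_nodes(self) -> int:
        return sum(item.nodes for item in self.equipment)

    @property
    def used_height_mm(self) -> float:
        return self.used_u * MM_PER_RACK_UNIT

    def install(self, item) -> None:
        """Add equipment, refusing anything that does not fit in the free space."""
        if item.height_u > self.free_u:
            raise ValueError(
                f"{item.name} needs {item.height_u}U, only {self.free_u}U free"
            )
        self.equipment.append(item)


if __name__ == "__main__":
    rack = Rack()
    rack.install(RackServer("db-01", height_u=2))
    rack.install(RackServer("twin-01", height_u=2, nodes=4))
    rack.install(BladeEnclosure("blades-01", height_u=10, nodes_per_blade=[1] * 8))
    print(rack.used_u, rack.free_u, rack.total_nodes, round(rack.used_height_mm, 2))
```

Both equipment types expose `height_u` and `nodes`, so the `Rack` can total space and Node counts without caring whether an item is a standalone server or an enclosure full of blades.
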
## Storage Systems

## Management and Monitoring Equipment

## Patch Panels

## Appliance

An Appliance is a specialized server designed for a specific application.

- Can be a standalone [Rack Server] or a `Blade Server` within a chassis.
- Optimized for scenarios such as database management, AI processing, or web services.

---

[Rack Server]: #Rack-Server
[Rack Servers]: #Rack-Server
[Blade Enclosure]: #Blade-Enclosure-Chassis
[Blade Enclosures]: #Blade-Enclosure-Chassis
[Motherboard]: #Motherboard
[BMC]: #BMC
[BIOS]: #BIOS
[NIC]: #NIC
[Hardware]: #Hardware
[Storage Devices]: #Storage-Devices
[CDU]: #CDU

*[Rack Unit]: Think of a data center rack as a stack of vertical shelves, each with a standard height called a Rack Unit (U). Each U equals 1.75 inches (44.45 mm) of vertical space.
*[Rack Units]: Think of a data center rack as a stack of vertical shelves, each with a standard height called a Rack Unit (U). Each U equals 1.75 inches (44.45 mm) of vertical space.
*[BMC]: Baseboard Management Controller
*[BIOS]: Basic Input/Output System
*[NIC]: Network Interface Card
*[CPU]: Central Processing Unit
*[GPU]: Graphics Processing Unit
*[RAM]: Random Access Memory
*[SSD]: Solid-State Drive
*[HDD]: Hard Disk Drive
*[CDU]: Cooling Distribution Unit
*[CDM]: Coolant Distribution Manifold