I built a small-scale rack to run server-grade CPUs in a home environment. It is entirely water-cooled and managed by a custom controller that handles thermals and fan speeds, and provides authenticated remote access via IPMIv2/Redfish.

![Screenshot 2026-04-08 144347](https://hackmd.io/_uploads/rkBG0T7nbx.jpg)

## WTF, why

I am a software dev building hypervisors for a living, and I love running local hardware for server-grade hypervisor testing, but I can't stand the noise. Server equipment usually sounds like a jet engine, so water-cooling was a requirement. Plus, hardware is just super fun to mess with (albeit expensive). Also, I think 1Us are really, really cool, so I'm going to solve the noise problem with a solution **that is 10x more complex and expensive than just buying a bigger case**.

![9dccbb58-b02f-4201-9b5a-3874025114dc](https://hackmd.io/_uploads/HkmrfeNhbe.jpg)

## The hardware

The rack currently houses my main devbox and three test nodes: an EPYC Milan, a Xeon EMR, and an Ampere Altra. These are the systems I need to cover all the CPU types my work deals with. The rest is a mix of used motherboards and IPMI firmware versions. I managed to source most of this before the DRAM-pocalypse hit.

All units are externally liquid-cooled and managed by a rack controller. The fan grill in the photo below was laser-cut by a local workshop.

![1b438dd1-8ec4-4166-8070-22f812d87dd4](https://hackmd.io/_uploads/rJxCC6X3Zg.jpg)

## Rack controller

The rack, cooling, and units are managed by an ESP32 controller sitting on a physically isolated IPMI VLAN. It acts as a single, secure entry point for the management network. I wrote a custom IPMI management stack specifically for this hardware. The controller is responsible for:

* Remote authentication and IPMI access.
* Monitoring thermal sensors to generate PWM signals for the pump and fans.
* Pushing system fan curve configurations to the individual hosts.
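The thermal-to-PWM part of the controller can be sketched as a piecewise-linear fan curve with clamping. This is a minimal illustration, not the actual firmware: the function name, breakpoints, and duty-cycle range are all assumptions for the example.

```python
# Hypothetical sketch of a thermal-to-PWM mapping: a linear fan curve
# clamped between an idle duty and full duty. Breakpoints and duty
# range are illustrative, not the real controller's values.

def fan_duty(temp_c: float,
             t_min: float = 30.0,    # at/below this temp: idle duty
             t_max: float = 70.0,    # at/above this temp: full duty
             duty_min: float = 0.20,
             duty_max: float = 1.00) -> float:
    """Map a sensor temperature (deg C) to a PWM duty cycle."""
    if temp_c <= t_min:
        return duty_min
    if temp_c >= t_max:
        return duty_max
    # Linear interpolation between the two breakpoints.
    frac = (temp_c - t_min) / (t_max - t_min)
    return duty_min + frac * (duty_max - duty_min)

if __name__ == "__main__":
    for t in (25, 50, 70):
        print(f"{t} C -> {fan_duty(t):.0%} duty")
```

On an ESP32 the resulting duty cycle would then be written to a hardware PWM channel; the same curve shape can also be pushed to the hosts as their system fan curve.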
With it I can control everything from my devbox, power-manage my test hosts, and SSH into them through the controller tunnel. Also, there is something inherently "low life + high tech = cyberpunk" about using a $4 microcontroller to boss around thousands of euros in high-end server gear.

```
> th-thermals
devbox  46 degrees C
milan   32 degrees C
emr     31 degrees C
altra   offline
```

![24bd7c52-4b56-4c0e-b424-02ea645dc91a](https://hackmd.io/_uploads/HyXD0pXhbl.jpg)

## Cooling strategy

The 1U and 2U boxes are liquid-cooled externally. Each unit uses a cold plate and quick-disconnect fittings (QDCs) that lead to a shared external radiator.

![1b72bf6a-2cd1-4eb9-86ef-70b10c1341f9](https://hackmd.io/_uploads/rysmgAQhZe.jpg)

The loop is designed in parallel so I can remove individual units without shutting down the entire system. However, this makes flow speed harder to predict. To prevent the manifold from becoming a bottleneck, the collector tubes must be significantly wider than the branch lines. Here's my amazingly clear schematic of the whole thing:

![image](https://hackmd.io/_uploads/HkXwhTmhWx.png)

To maintain consistent pressure and flow across $n$ parallel loops, the cross-sectional area of the manifold should ideally be greater than or equal to the sum of the areas of the individual branch lines. The required manifold diameter $D$ for $n$ branches of diameter $d$ is:

$$D \geq d \sqrt{n}$$

Using standard 7mm inner-diameter tubing for the branches and 12.8mm for the manifold hits its physical limit at roughly 3-4 loop nodes:

$$n \approx \frac{12.8^2}{7^2} \approx 3.34$$

![19240b06-dc3f-45ba-ad72-6cfe4f92a8ad](https://hackmd.io/_uploads/Hyr20aQ2Wx.jpg)

Beyond four parallel loop nodes, the 12.8mm manifold tube causes a massive increase in back-pressure, which stresses the pump and reduces cooling efficiency for the nodes furthest along the loop, so I need to add another manifold for every 4 nodes or install a second pump to brute-force the issue.
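The sizing rule above is easy to turn into a quick calculator. This is just a sketch of the area-matching math from the text; the function names are mine, and real plumbing also cares about flow rate and fitting losses that this ignores.

```python
import math

# Sketch of the manifold-sizing rule: the manifold cross-section should
# be at least the sum of the branch cross-sections, i.e. D >= d * sqrt(n).
# All diameters are inner diameters in mm. Function names are illustrative.

def required_manifold_diameter(branch_d_mm: float, n_branches: int) -> float:
    """Minimum manifold inner diameter for n parallel branches."""
    return branch_d_mm * math.sqrt(n_branches)

def max_branches(manifold_d_mm: float, branch_d_mm: float) -> int:
    """Largest branch count the manifold can feed without area-starving them."""
    return int((manifold_d_mm / branch_d_mm) ** 2)

if __name__ == "__main__":
    # The 12.8mm/7mm combo from the text tops out at 3 fully-fed branches.
    print(max_branches(12.8, 7.0))             # -> 3
    # A 4-branch loop on 7mm lines would want at least a 14mm manifold.
    print(required_manifold_diameter(7.0, 4))  # -> 14.0
```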
Using less common tube diameters is also an option, but that severely limits part selection for fittings and disconnects.

![7dca2cbc-884f-4618-a08b-11242990a150](https://hackmd.io/_uploads/BkFYk0m2bg.jpg)

## Things I got wrong

* Radiator dust is a real problem; I should have built a filter cage.
* Pump redundancy is highly desirable.
* I still need to install proper leak detection sensors and hook them up to the controller.
* Server hardware is fragile, much more so than consumer parts. A slipped screwdriver while the power is connected will end tragically for the board. My electronics repair skills definitely leveled up a couple of times from this.

![image](https://hackmd.io/_uploads/rkIyppmh-x.png)

* The IPMI spec is bloated. I have a feeling there's much to be said about the potential firmware attack surface over IPMI.
* While the system works well and stays silent under load, I made enough design mistakes to justify a v2 in the future. I might just be looking for an excuse to do this all over again, though, lol.