# HomeLab / Home Server / Self Host Setup
> Reference \:
> * HomeLabHelper_repo
> https://github.com/Chen-KaiTsai/HomeLabHelper_repo
# Hardware

:::info
:desktop_computer: **Linux HPC Server for Compute**
**CPU** : *~~EPYC 7742~~* *EPYC 7302P*
**GPU** : *Nvidia RTX A5000 x 2*
**RAM** : *ECC 2400 512GB*
**MB** : *ROMED8-2T*
**Power** : *ASUS Thor 1600W on a dedicated 220V power source*
The following picture shows the main components of this server. Note that the NVLink bridge is not included yet.

> Reference \:
> 3x RTX A5000s in the Sliger CX3170a
> https://www.youtube.com/watch?v=9n0NkfmMca4
Today \(01/18/2025\) I found that one of the tech YouTube channels I follow showcased a server built on a platform similar to the one I chose here. Very fun to watch.
<-------------------------------------------------------------->
:desktop_computer: **Windows Server for Hosting WSL & Experiments**
**CPU** : *E5-2696v3 ~~x 2~~*
**GPU** : *Nvidia Quadro T400, Nvidia Tesla P40*
**RAM** : *non-ECC 2666 128GB \(run with 2133\)*
**MB** : *~~X99-F8D PLUS~~* ASUS X99-AII
**Disks** : *~~PNY CS1031 256GB M2.2280 PCIe SSD~~ Team T-FORCE Z44A7Q 2TB M.2 PCIe Gen4, MSI SPATIUM M480 PRO 4TB Gen4 PCIe SSD*
**Power** : *Seasonic X-series 1050W*
<-------------------------------------------------------------->
> Reference \:
> * Dual-CPU Powerhog\: HUANANZHI X99 F8D PLUS Motherboard
> https://youtu.be/IcnOQK1S4z8?feature=shared
The YouTube video above unboxes the same Chinese motherboard I used for this computer. In the comment section, you will find a debate about whether to buy this kind of motherboard. In my opinion, I would definitely not recommend these to anyone unless they already own a Xeon CPU. In my case, my old PC had a malfunctioning motherboard \(MSI X99 GodLike Carbon\) that had to be replaced. At that time, I had already \"upgraded\" my CPU to a Xeon E5-2696v3, and I figured that if I wanted to use it for the next 10 years, I might as well go further and upgrade the whole thing to a NUMA system. Since a Xeon E5-2696v3 is very cheap, this upgrade was not that expensive. Everything else was basically migrated from my old PC, including the power supply, RAM \(non-ECC\), and disks; I bought the Tesla P40 and Quadro T400 later on.
**GPU PCIe NUMA Node Mapping**

:desktop_computer: **TrueNAS Scale Server for Storage & BackUP Services**
**SERVER MODEL** : *Dell PowerEdge R730*
**CPU** : *E5-2680v3 x 2*
**RAM** : *non-ECC 2133 96GB*
***Report from iDrac***

***Motherboard & System Layout***

:::
## Server Room Layout \(2025\)
Updated pictures can be found in my GitHub repo \:
https://github.com/Chen-KaiTsai/HomeLabHelper_repo/tree/main/2025_ServerUpdate

* Thin Client \: Microsoft Surface Go 2 \(broken screen\)
* Router \: [QNAP QHora-322](https://www.techbang.com/posts/101114-qnap-qhora-322-sd-wan-router-unboxing-review)
* Switch \: [Zyxel 合勤 XGS1210-12](https://24h.pchome.com.tw/prod/DRAF0I-A900AKSKT)
* Other
* Extension Cords \: [Castle蓋世特](https://www.castleshop.com.tw/)
On the QNAP QHora-322, I only use one 10G port to connect to the switch and one 2.5G WAN port to connect to the modem, as described in the above diagram.

Personally, I disable all the ports that I don't need, as the above diagram shows. The 10G LAN port connects directly to the switch.
# Operating System & Software Environment
## Setup OpnSense
> Reference \:
> * Blog provide most of the use case
> https://www.sakamoto.blog/
> * Adblock with Unbound DNS
> https://www.youtube.com/watch?v=o12a2cFGopQ
The newest update kept causing OPNsense to reboot randomly \(01\/09\/2025\), and after a manual reboot, OPNsense failed to restart and was probably stuck in the boot sequence. Physical access is required to identify the issue \([Similar Issue](https://forum.opnsense.org/index.php?PHPSESSID=5a2eg0r0bfq58dlrc9gr58inve&topic=43931.15)\). OPNsense kept causing trouble while I need a stable network environment; therefore, I decided to look into proprietary routers. On the spec side, the current mini PC is far more powerful than the proprietary routers; however, these routers work more efficiently and their software integration is much better than OPNsense's.
## Migrate to Proprietary Firewall Routers
### Opnsense with mini PC
https://www.toptonpc.com/product/new-4x-intel-i226-v-2-5g-firewall-mini-pc-pentium-n6005-n5105-v5-edition-ddr4-2nvme-fanless-soft-router-dp-type-c-opnsense-esxi/
* Intel i226-V 4 x 2.5Gb Intel NIC
* N5105 4 core 4 thread x86 CPU
* 2 x 8GB DDR4 memory
* 256GB NVMe SSD
OPNsense is easy to set up; however, the hardware might not be stable.
**Physical Inspection**
https://forum.opnsense.org/index.php?topic=36139.0
Today \(01\/26\/2025\) I could finally access the hardware physically and look at what happened. The PC does not boot into the BIOS and just keeps power cycling. There is no HDMI output either. I therefore deduced that this is likely a **hardware issue rather than a software issue**.
:::success
:bulb: **Bios Update Info**
https://forum.opnsense.org/index.php?topic=27938.0
:::
## Possible Candidates
### QNAP QHora-322 \(Elected\)
* QNAP QHora-322 10Gb Router Review
https://www.youtube.com/watch?v=NlSTLIOmkZI
* QNAP QHora-322 SD-WAN Router Unboxing Review: Next-Generation Performance, User-Friendly Management, and Security Protection in One
https://www.techbang.com/posts/101114-qnap-qhora-322-sd-wan-router-unboxing-review

2025\/02\/01 I decided to go with the QNAP QHora-322 for two main reasons. Firstly, it supports 10G and 2.5G on most of its ports, which is not commonly seen in commercial firewalls. Secondly, I do not need a powerful firewall; what I need is a stable one with NAT and VPN functionality, and QNAP provides powerful built-in tools for both. Therefore, I don't need to redo all the settings I made on my old OPNsense box.
### MikroTik Routers
* MikroTik RB5009UG+S+IN
https://www.google.com/amp/s/masonsfavour.com/mikrotik-rb5009-router/amp/#cobssid=s
* Migrating From OpnSense To Mikrotik
https://www.youtube.com/watch?v=k5eShv6l1ts
His OPNsense hardware is janky but more reliable, since it's not a mini PC from some Chinese brand.
* Some Reddit discussion about OpnSense \& Mikrotik
[1. OPNsense or MikroTik: Anyone who has used both have strong opinions?](https://www.reddit.com/r/mikrotik/comments/1ajhztr/opnsense_or_mikrotik_anyone_who_has_used_both/)
[2. Will I regret going from OPNsense to MikroTik? (Time vs Money)](https://www.reddit.com/r/homelab/comments/198efum/will_i_regret_going_from_opnsense_to_mikrotik/)
Requires an understanding of RouterOS, which is, as far as I know, difficult to master.
### Zyxel 合勤USG FLEX200
* USG FLEX200
https://24h.pchome.com.tw/prod/DRAF0I-A900I2N1T
* USG FLEX100
https://24h.pchome.com.tw/prod/DRAF0I-A900I2MXM
### ASUS ExpertWiFi EBG15
* Official Site
https://www.asus.com/tw/networking-iot-servers/business-network-solutions/asus-expertwifi/asus-expertwifi-ebg15/
* Review by Craft Computing
https://www.youtube.com/watch?v=aLbHuAFqsio
I noticed that the differences between the ExpertWiFi EBG19P and the EBG15 might only be in the interfaces. I don't need extra LAN interfaces since I already have a powerful enough switch; therefore, if I choose to go with ASUS, I will only buy the EBG15.
> Reference \:
> * ASUS ExpertWiFi EBG19P review: A great wired router for SMBs!
> https://www.digitalcitizen.life/asus-expertwifi-ebg19p-review/
The review article above includes a speed test that covers both the EBG19P and the EBG15. The chart shows that their network speed is basically identical.
:::success
Additionally, this is the **cheapest solution** for my network.
:::
## Setup Linux System using `sudo` without Password
> Reference \:
> https://askubuntu.com/questions/147241/execute-sudo-without-password
## Setup Tesla P40 for WSL2 in Windows 10\/11
### Registry Setting

To make WSL2 detect and use the Tesla P40, the card has to run in WDDM mode. Please refer to the following YouTube video for the how-to\: https://www.youtube.com/watch?v=K1emL7pwDH0
**Steps**
* Open `Registry Editor`
* Navigate to `HKEY_LOCAL_MACHINE` \-\> `SYSTEM` \-\> `CurrentControlSet` \-\> `Control` \-\> `Class` \-\> `4d36e968......`
    * This key lists all the GPUs connected to the system
* Find the Tesla P40 \(or whatever workstation card you have\)
    * Delete DWORD `AdapterType`
    * Set DWORD `FeatureScore` to `d1`
    * Add DWORD `GridLicensedFeatures` and set its value to `7`
    * Add `EnableMsHybrid` and set its value to `1`
* Find the NVIDIA T400 \(or whatever your other GPU is\)
    * Add `EnableMsHybrid` and set its value to `2`
* Reboot Windows
**Caveats**
Newer drivers break this registry method. **Rolling back to an older version of the NVIDIA driver is required.** I am currently using `537.70` \(02\/12\/2025\).
### P40 Cooling

## Windows Server SSH Setting and Rebooting
> Reference \:
> * Get started with OpenSSH for Windows
> https://learn.microsoft.com/zh-tw/windows-server/administration/openssh/openssh_install_firstuse?tabs=gui&pivots=windows-server-2025
Since I might need to reboot the system over SSH, where the session lands in CMD, I use the following commands to restart the Windows PC.
```bash
# Login in cmd
powershell.exe # to work as powershell
Restart-Computer -Force -AsJob # schedule a restart
```
## Windows Server Sleep with Powershell Script
> Reference \:
> * How to create and run a PowerShell script file on Windows 11 or 10
> https://www.windowscentral.com/how-create-and-run-your-first-powershell-script-file-windows-10
> * What is the PowerShell command equivalent to selecting "Sleep" from the Win7 menu?
> https://superuser.com/questions/1116599/what-is-the-powershell-command-equivalent-to-selecting-sleep-from-the-win7-men
```powershell
# load assembly System.Windows.Forms which will be used
Add-Type -AssemblyName System.Windows.Forms
# set powerstate to suspend (sleep mode)
$PowerState = [System.Windows.Forms.PowerState]::Suspend;
# do not force putting Windows to sleep
$Force = $false;
# so you can wake up your computer from sleep
$DisableWake = $false;
# do it! Set computer to sleep
[System.Windows.Forms.Application]::SetSuspendState($PowerState, $Force, $DisableWake);
```
## Windows 11 Function Tweak
[Winaero Tweaker](https://winaerotweaker.com/) can help disable many unwanted functions that Microsoft put into Windows 11, including but not limited to ads, Start menu web search, Windows Update, and Copilot.
## WSL2 with Windows 11 23H2 accessing from local network
> Reference \:
https://learn.microsoft.com/en-us/windows/wsl/networking
`Set-NetFirewallHyperVVMSetting -Name '{40E0AC32-46A5-438A-A0B2-2B479E8F2E90}' -DefaultInboundAction Allow`
Firstly, make sure the Windows 11 version is 22H2 or later. \(I am using 23H2.\)
The following `.wslconfig` file should be added to the Windows user profile folder \(`%UserProfile%`\).
```ini
[wsl2]
memory=112GB
swap=0
nestedVirtualization=false
guiApplications=false
networkingMode=mirrored
dnsTunneling=true
firewall=true
autoProxy=true
[experimental]
hostAddressLoopback=true
autoMemoryReclaim=gradual
```
Secondly, the SSH service in WSL2 is not started by default, so we need to start it.
> Reference \:
> https://ouch1978.github.io/blog/2022/12/07/enable-ssh-in-ubuntu-on-wsl2
```bash
service ssh status
service ssh start
# To change the ssh port, enable password login and disable publickeyauthentication
sudo vim /etc/ssh/sshd_config
service ssh restart
```
Personally, I add `service ssh start` to the `.bashrc` of whatever distro I want to access through the local network.
We need to give each WSL2 distro a different SSH port; otherwise, the ports will collide.
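
Giving each distro its own port comes down to one `Port` line in that distro's `sshd_config`. The following is a minimal sketch, not a definitive script: `set_sshd_port` is a hypothetical helper name, and the port numbers in the usage note are arbitrary examples. It rewrites an existing \(possibly commented-out\) `Port` line or appends one if none exists.

```bash
# Set the listening port in an sshd_config-style file.
# set_sshd_port is a hypothetical helper: pass the file path and the port.
set_sshd_port() {
    local file="$1" port="$2" tmp
    if grep -Eq '^[[:space:]]*#?[[:space:]]*Port[[:space:]]+[0-9]+' "$file"; then
        # Rewrite the existing (possibly commented-out) Port line.
        tmp=$(mktemp)
        sed -E "s/^[[:space:]]*#?[[:space:]]*Port[[:space:]]+[0-9]+.*/Port ${port}/" "$file" > "$tmp"
        cat "$tmp" > "$file"   # overwrite in place, keeping owner and mode
        rm -f "$tmp"
    else
        # No Port line yet: append one.
        printf 'Port %s\n' "$port" >> "$file"
    fi
}
```

Since `/etc/ssh/sshd_config` is owned by root, the function has to be defined and called from a root shell, for example `set_sshd_port /etc/ssh/sshd_config 2222` in one distro and `set_sshd_port /etc/ssh/sshd_config 2223` in another, followed by `service ssh restart` in each.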
## Windows WSL2 with NUMA
### Issue on WSL Github Repo
[WSL 2 uses half the number of cores on AMD Threadripper 3990X #5423](https://github.com/microsoft/WSL/issues/5423)
Please note that this is a temporary solution, since it still causes some performance issues on a NUMA system.
* **Step 1 :** Check whether the Hyper-V hypervisor scheduler type is `classic`. Please refer to https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/manage/manage-hyper-v-scheduler-types and ***set the scheduler type to classic***. If the type is not classic, WSL2 will ***not be able to use the RAM and CPU resources of more than one socket***.
* **Step 2 :** Follow the tutorial at https://github.com/xieyubo/WSL2/commit/41511c9978c9862f156db4aa4a2072df980b72fc, which helps WSL2 detect and utilize all the CPU cores in both CPU sockets.
* **Caveats :** This is not a full solution, since the Linux distribution will not recognize the system topology as two CPUs but will see all cores under one socket. This causes some issues, as described in https://github.com/microsoft/WSL/issues/5423#issuecomment-1842128469
#### WSL Official Fix
In the [new update](https://github.com/microsoft/WSL/releases/tag/2.4.5), WSL officially fixes this issue, more or less. Like the earlier community solution, it has almost the same problem: WSL still combines the two sockets into one and does not take NUMA into account.
**Output of `lscpu` with Ubuntu 24\.04**
```bash
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 72
On-line CPU(s) list: 0-71
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
CPU family: 6
Model: 63
Thread(s) per core: 2
Core(s) per socket: 36
Socket(s): 1
Stepping: 2
BogoMIPS: 4589.37
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear flush_l1d arch_capabilities
Virtualization features:
Hypervisor vendor: Microsoft
Virtualization type: full
Caches (sum of all):
L1d: 1.1 MiB (36 instances)
L1i: 1.1 MiB (36 instances)
L2: 9 MiB (36 instances)
L3: 45 MiB (1 instance)
```
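The output above reports `Socket(s): 1` with 36 cores per socket, which is how the merged topology shows up inside the distro. As a small sketch for checking this from a script, the hypothetical helper `lscpu_field` below pulls one field out of `lscpu` output read on stdin. Typical calls would be `lscpu | lscpu_field 'Socket(s)'` and `lscpu | lscpu_field 'NUMA node(s)'`; an empty result only means that `lscpu` printed no such line.

```bash
# Extract one field from `lscpu` output read on stdin.
# lscpu_field is a hypothetical helper name, not part of util-linux.
lscpu_field() {
    awk -F: -v key="$1" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # field names may be indented
            if (name == key) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit
            }
        }'
}
```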
**CPU Usage Issue \([See Update](https://hackmd.io/EhDRrZWEQ4it2FMZZPVE4w?view=&stext=15528%3A17%3A0%3A1740852345%3AJtful3)\)**

We can see that inside WSL the CPU usage is at 100\%. However, in Task Manager, the second CPU is not fully utilized.
:::success
:bulb: **Failed & TODO : Compile Custom WSL2 Linux Kernel**
* How to compile and replace the WSL kernel (custom WSL kernel)
https://ivonblog.com/posts/compile-custom-wsl-kernel-on-wsl/
* CONFIG_NUMA: NUMA Memory Allocation and Scheduler Support https://cateee.net/lkddb/web-lkddb/NUMA.html
* BTF: .tmp_vmlinux.btf: pahole (pahole) is not available https://stackoverflow.com/questions/61657707/btf-tmp-vmlinux-btf-pahole-pahole-is-not-available
As far as I know, the WSL2 Linux kernel is not compiled with NUMA support. I looked into building a custom Linux kernel as a possible solution. However, it did not resolve the issue.
:::
**CPU Schedule Update**
With the [new update](https://github.com/microsoft/WSL/releases/tag/2.4.5), WSL now distributes work evenly across all the CPU cores.

:::info
:bulb: **Invoking Shell Script with `source` or `bash`**
https://superuser.com/questions/176783/what-is-the-difference-between-executing-a-bash-script-vs-sourcing-it
:::
## WSL Release Upgrade from Ubuntu 22.04 to 24.04
I had issues upgrading a 22.04 WSL distro to 24.04. As the following post suggests, the easiest fix is to uninstall `snap`, after which the upgrade should go through correctly.
* Can't update to Ubuntu 24.04 LTS on WSL2
https://askubuntu.com/questions/1511584/cant-update-to-ubuntu-24-04-lts-on-wsl2
## Linux Manage CUDA Version
> Reference \: https://gist.github.com/garg-aayush/156ec6ddda3d62e2c0ddad00b7e66956
## Nvidia GPU Monitoring
***nvitop***
https://github.com/XuehaiPan/nvitop
***nvtop***
https://github.com/Syllo/nvtop
***nvidia-smi***
*Please refer to the Appendix*
https://hackmd.io/@Erebustsai/HJ5p3-NFp
## Nvidia GPU Power \& Performance Related
* **What is NVIDIA GPU persistence mode? (Driver Persistence, the persistence daemon)**
https://blog.csdn.net/Dontla/article/details/104013931
* **NVIDIA GPU Persistence Mode**
https://erhwenkuo.github.io/mlops/02-gpu-sharing/mig/mig-persistence-mode/
* **Getting the Most Out of Your GPU for Machine Learning Applications**
https://datamachines.com/blog/getting-the-most-out-of-your-gpu-for-machine-learning-applications
* **nvidia-smi Cheat Sheet**
https://www.seimaxim.com/kb/gpu/nvidia-smi-cheat-sheet#GPU_Initialization_Info
## Nvidia GPU Driver Update
> Reference \:
> * Updating the NVIDIA GPU driver on Ubuntu
> https://natlee.github.io/Blog/posts/6ff826ea/
## Hold Update from Ubuntu `apt`
https://www.arthurtoday.com/2015/05/ubuntu-apt-mark-how-to.html
## Switch Between Different version of GCC
https://linuxconfig.org/how-to-switch-between-multiple-gcc-and-g-compiler-versions-on-ubuntu-20-04-lts-focal-fossa
# Building / Setup
:::success
***Assembly Issue # 1 : One DIMM failed to pass BIOS POST***
At some point, the DIMM in the A1 slot failed to pass BIOS POST. The BIOS showed a warning before booting into the system, and that DIMM was not loaded. After a few boot attempts and some debugging, reseating the CPU resolved the issue.

Similar to what happens in the YouTube video below.
> ***The $1,000,000 Computer is Broken***
> https://www.youtube.com/watch?v=XR72qmzRzfQ
As they point out in the video, an EPYC CPU is physically large, which can cause issues if the CPU is not perfectly seated in the socket.
***Assembly Issue # 2 : BIOS POST code b2***
The system continuously failed to boot and got stuck at POST code b2 when both RTX A5000 cards were installed. Each card booted fine when installed alone. Finally, we cleared the CMOS by taking out the motherboard battery and shorting the clear-CMOS pins. The system now boots with the motherboard's integrated graphics and keeps the two RTX A5000 cards for compute only. \(A BIOS setting should make the integrated GPU the primary output.\)
:::
:::info
:bulb: **Turn on/off GUI**
Reference \: https://askubuntu.com/questions/148321/how-do-i-stop-gui
```bash
sudo service gdm stop
sudo service gdm start
sudo systemctl disable gdm # disable GUI on boot
sudo systemctl enable gdm # enable GUI on boot
```
:::
:::info
:bulb: **Turn off/on EPYC server turbo**
```bash
# 0 = turn boost off
echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/boost
# 1 = turn boost back on
echo 1 | sudo tee /sys/devices/system/cpu/cpufreq/boost
```
```clike
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 43 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Vendor ID: AuthenticAMD
Model name: AMD EPYC 7742 64-Core Processor
CPU family: 23
Model: 49
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 1
Stepping: 0
Frequency boost: disabled
CPU max MHz: 2250.0000
CPU min MHz: 1500.0000
BogoMIPS: 4499.85
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
```
:::
:::info
:bulb: **Check NvLink**
``` script
desktop:~$ nvidia-smi nvlink -s
GPU 0: NVIDIA RTX A5000 (UUID: GPU-74bb45a8-f503-261d-121b-bba1d462c33c)
Link 0: 14.062 GB/s
Link 1: 14.062 GB/s
Link 2: 14.062 GB/s
Link 3: 14.062 GB/s
GPU 1: NVIDIA RTX A5000 (UUID: GPU-8d513fbe-2239-85cd-c234-61a58ccdd468)
Link 0: 14.062 GB/s
Link 1: 14.062 GB/s
Link 2: 14.062 GB/s
Link 3: 14.062 GB/s
erebus@erebus-desktop:~$ nvidia-smi nvlink -cBridge
GPU 0: NVIDIA RTX A5000 (UUID: GPU-74bb45a8-f503-261d-121b-bba1d462c33c)
GPU 1: NVIDIA RTX A5000 (UUID: GPU-8d513fbe-2239-85cd-c234-61a58ccdd468)
> Peer access from NVIDIA RTX A5000 (GPU0) -> NVIDIA RTX A5000 (GPU1) : Yes
> Peer access from NVIDIA RTX A5000 (GPU1) -> NVIDIA RTX A5000 (GPU0) : Yes
```
:::
## Tesla P40 Cooling
The old setup is not recorded here.

The new setup has only one 4-pin fan, which can be connected to a case fan connector, and with [FanControl](https://getfancontrol.com/) I can control the fan speed. In principle, I can cool this GPU without the fan always running at 100\%. **However, the motherboard does not expose these fan controls to the system, so I still run it as loud as it gets.**
## Temperature & Humidity Monitoring
An ESP32 with a DHT22 sensor monitors the case temperature. The ESP32 hosts a web page that reads the sensor and returns fresh data each time it is accessed.
```cpp
// Pin 27 -> out
// Pin 3v3 -> +
// Pin GND -> -
#include "main.h"

void setup() {
    Serial.begin(115200);
    while (!Serial)
        ;  // wait for Serial port to be opened
    Serial.printf("ESP32 Start\n");

    WiFi.mode(WIFI_STA);
    // Scan WiFi Networks
    netFunction::printWiFiScan();

    // Start Connect to WiFi
    const char ssid[] = "";
    const char pwd[] = "";
    WiFi.begin(ssid, pwd);
    while (WiFi.status() != WL_CONNECTED) {
        Serial.printf("Wait for WiFi network connection\n");
        delay(1000);
    }
    WiFi.printDiag(Serial);

    // Register the HTTP handlers and start the web server
    server.on("/", netFunction::handleRoot);
    server.onNotFound(netFunction::handleNotFound);
    server.begin();

    Serial.printf("Start Temp Sensor Setup\n\n");
    // DHT22 : AM2302
    dht.setup(27, DHTesp::AM2302);
}

void loop() {
    server.handleClient();
}
```
### main.h
```cpp
#pragma once

#include <WiFi.h>
#include <WebServer.h>
#include <DHTesp.h>

DHTesp dht;
WebServer server(80);

namespace netFunction {

// Scan for nearby WiFi networks and print each SSID and RSSI to Serial.
void printWiFiScan() {
    // Start probing for WiFi
    int numberNetwork = WiFi.scanNetworks();
    char delimiter[40] = "===================================";
    Serial.printf("%s\n\n", delimiter);
    if (numberNetwork == 0) {
        Serial.printf("No network found\n\n");
    } else {
        Serial.printf("%d network found\nStart listing networks\n\n", numberNetwork);
        for (int i = 0; i < numberNetwork; ++i) {
            Serial.printf("%d SSID : ", i);
            Serial.print(WiFi.SSID(i));
            Serial.printf(" RSSI : ");
            Serial.print(WiFi.RSSI(i));
            Serial.printf("\n\n");
        }
    }
}

// Handle "/": read the sensor on every request and return the values as HTML.
void handleRoot() {
    // Read and Update Sensor Data when website is accessed.
    TempAndHumidity data = dht.getTempAndHumidity();
    Serial.printf("Temperature : %lf Humidity : %lf\n\n", data.temperature, data.humidity);
    String tempString = String(data.temperature);
    String humidString = String(data.humidity);
    // Adjacent string literals are joined by the compiler before the + operators run.
    String HTML =
        "<!DOCTYPE html>"
        "<html><head><meta charset='utf-8'></head>"
        "<body>Temperature : " + tempString + " Humidity : " + humidString +
        "</body></html>";
    server.send(200, "text/html", HTML);
}

// Handle any other path with a plain-text 404.
void handleNotFound() {
    String plainText = "File Not found\n";
    server.send(404, "text/plain", plainText);
}

}  // namespace netFunction
```
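Because the page body has the fixed form `Temperature : <value> Humidity : <value>`, it can be polled from any machine on the network. The following is a rough sketch under stated assumptions: `parse_dht_page` is a hypothetical helper name, and `192.168.1.50` is a placeholder for whatever address the ESP32 receives.

```bash
# Extract "temperature humidity" from the HTML page served by the ESP32.
# parse_dht_page is a hypothetical helper; it reads the page on stdin and
# prints nothing when the values are not numeric (for example "nan").
parse_dht_page() {
    sed -n -E 's/.*Temperature : *([-0-9.]+) +Humidity : *([-0-9.]+).*/\1 \2/p'
}

# Usage (192.168.1.50 is a placeholder for the ESP32 address):
#   curl -s http://192.168.1.50/ | parse_dht_page
# Append a timestamped reading to a log file every 60 seconds:
#   while true; do
#       echo "$(date +%FT%T) $(curl -s http://192.168.1.50/ | parse_dht_page)" >> dht22.log
#       sleep 60
#   done
```

Keeping the parsing in a function that reads stdin makes it easy to try against a saved copy of the page before pointing it at the device.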
## Power Saving Measures
:::info
:information_source: *Intel® Server System BIOS Setup Utility Guide*
https://www.intel.com/content/dam/support/us/en/documents/motherboards/server/sb/intelserversystembiossetuputilityguide10_final.pdf
With the above guide, we can dive into more settings for the *Intel E5-2600 v3* processors.
:::
***1\# \: Wake on LAN***
* Follow this guide on Windows 10/11\: https://www.asus.com/tw/support/faq/1049115/
* Enable Wake on LAN in the BIOS and remember to enable network stack support.
* Use the WakeOnLan software from **NirSoft** https://www.nirsoft.net/utils/wake_on_lan.html
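
For reference, the wake-up signal these tools send is a "magic packet": 6 bytes of `0xFF` followed by the target MAC address repeated 16 times \(102 bytes in total\), usually sent as a UDP broadcast. The sketch below only builds that payload as a hex string; `wol_payload_hex` is a hypothetical helper name and the MAC address is a placeholder. The commented send line assumes `xxd` and `socat` are installed and is just one possible way to broadcast it from a Linux machine.

```bash
# Build the Wake-on-LAN magic packet for a MAC address as a hex string:
# 6 bytes of 0xFF followed by the MAC repeated 16 times (102 bytes total).
# wol_payload_hex is a hypothetical helper name.
wol_payload_hex() {
    local mac payload i
    # Drop ':' or '-' separators and lowercase the hex digits.
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    payload='ffffffffffff'
    i=0
    while [ "$i" -lt 16 ]; do
        payload="${payload}${mac}"
        i=$((i + 1))
    done
    printf '%s\n' "$payload"
}

# Possible way to send it (assumes xxd and socat; the MAC is a placeholder):
#   wol_payload_hex AA:BB:CC:DD:EE:FF | xxd -r -p | \
#       socat - UDP-DATAGRAM:255.255.255.255:9,broadcast
```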
:::info
:bulb: **Using `btop`**
* How to Install and Use btop on Ubuntu 20.04
https://docs.vultr.com/how-to-install-and-use-btop-on-ubuntu-20-04
:::
:::success
:bulb: **How to Stop Your Windows PC From Randomly Waking Up From Sleep Mode**
https://www.pcmag.com/how-to/stop-your-computer-from-randomly-waking-up-from-sleep-mode
:::
***2\# \: ThrottleStop on Windows***
* ThrottleStop can be downloaded here https://www.techpowerup.com/download/techpowerup-throttlestop/
* Select `Disable Turbo` and the CPU will not boost above its base clock.
* Bug found \(10\-07\-2024\) \: ThrottleStop only prevents CPU 1 from boosting; CPU 0 still boosts freely. This can be checked with `Open Hardware Monitor` \[https://openhardwaremonitor.org/\]
***3\# \: Windows OS Clock Bounding***
> Reference \: https://www.reddit.com/r/LifeProTips/comments/hnzliv/comment/fxepeph/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
You can create a few power plans and set a maximum CPU percentage for each of them, for example\:
* 100\% \- when you need full power and performance and you don't care about fan noise and temperature
* 99\% \- which **turns off the turbo mode of the CPU** but is still good for daily routine tasks; your computer will also run much cooler and quieter
* 80\% \- for light tasks like messengers, reading, etc. Even cooler and usually no noise at all \(at least on my Intel i7 CPU\)

Then just switch between them depending on your current tasks.
:::info
:information_source: **Power Saving Result**
* Before Power Saving Measure Applied

* After Power Saving Measure Applied

:::
:::success
:bulb: **How to control Windows's sleep and awake time.**
`PowerToys` \(https://learn.microsoft.com/en-us/windows/powertoys/\) can help when a Windows system needs to be kept awake indefinitely. Additionally, switching back to the original plan \(sleep after 1 minute\) takes only one click.
**However, in my system, the Chinese-made motherboard sometimes fails to wake from sleep; waking it up only triggers a fresh boot, losing the previous system state.**
http://www.huananzhi.com/more.php?lm=10&id=311
**But it is cheap and there are tons of suppliers you can buy from.**
:::
## Linux PowerSaving
* **Linux Server Power and Performance Management (4): Monitoring, Configuration, Tuning**
https://arthurchiao.art/blog/linux-cpu-4-zh/
* **CPU idle states, by Red Hat Enterprise Linux**
https://docs.redhat.com/zh_hans/documentation/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/cpu-idle-states_tuning-cpu-frequency-to-optimize-energy-consumption
**Set Powermode for CPU in Ubuntu**
https://askubuntu.com/questions/379311/how-can-i-optimize-ubuntu-for-minimal-energy-usage
**Use Powertop**
* How to check and tune power consumption with Powertop on Linux
https://linuxconfig.org/how-to-check-and-tune-power-consumption-with-powertop-on-linux
* Reduce power consumption with powertop
https://forums.unraid.net/topic/98070-reduce-power-consumption-with-powertop/
**Use the `cpupower` command-line tool**
https://www.cnblogs.com/HByang/p/17957747
The following shell command sets every CPU to the `powersave` governor.
```bash
# changing the governor requires root
sudo cpupower -c all frequency-set -g powersave
```
**To install `cpupower`**
`sudo apt install linux-tools-$(uname -r) linux-cloud-tools-$(uname -r)`
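
To confirm that the governor change took effect on every core, the per-CPU `scaling_governor` files in sysfs can be summarised. This is a minimal sketch: `governor_summary` is a hypothetical helper name, and its optional argument \(an alternative sysfs root\) exists only so the function can be tried against a test directory. On a machine where every core uses `powersave`, the output is a single line with the core count followed by `powersave`.

```bash
# Count how many CPUs use each cpufreq governor.
# governor_summary is a hypothetical helper; the optional argument overrides
# the sysfs root (default: /sys/devices/system/cpu).
governor_summary() {
    local root="${1:-/sys/devices/system/cpu}"
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null | sort | uniq -c
}
```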
:::success
:bulb: **AMD p-state**
https://wccftech.com/amd-p-state-driver-epyc-cpus-zen-5-enhance-performance-efficiency/
:::
:::success
:bulb: **Intel `cpupower`, `--perf-bias`**
* Chapter 17. Tuning CPU frequency to optimize energy consumption
https://docs.redhat.com/zh-cn/documentation/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/tuning-cpu-frequency-to-optimize-energy-consumption_monitoring-and-managing-system-status-and-performance#supported-cpupower-tool-commands_tuning-cpu-frequency-to-optimize-energy-consumption
:::
## Zenpower
> Zenpower is Linux kernel driver for reading temperature, voltage\(SVI2\), current\(SVI2\) and power\(SVI2\) for AMD Zen family CPUs.
* Zenpower Driver \: https://github.com/ocerman/zenpower/
* Zen monitor \: https://github.com/ocerman/zenmonitor
Description of what the displayed numbers mean\:
https://github.com/ocerman/zenpower/issues/11
## Network Monitoring
**Socket Statistics**
How to Use the ss Command \(Linux Crash Course Series\)
https://www.youtube.com/watch?v=phY8Q7Woxsw
```bash
watch -n2 ss -4 # Shows IPv4 Connections every 2 sec
```
## AMD EPYC Server Bios Tour with Level1Techs
> Reference \:
> * Talking about what's in a Server Bios with AMD
> https://www.youtube.com/watch?v=1C7P-V05SgQ
Extra Material to Follow Through the Entire Discussion
* **DOC**\: 2019-amd-epyc-7002-tg-bios-workload
https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/2019-amd-epyc-7002-tg-bios-workload-56745_0_80.pdf
* **DOC**\: SBP-AMD-EPYC-2-SLES15SP1
https://documentation.suse.com/sbp/tuning-performance/pdf/SBP-AMD-EPYC-2-SLES15SP1_en.pdf
* Power Determinism Mode Still Proves Beneficial For AMD EPYC 9005 Performance
https://www.phoronix.com/review/amd-epyc-9005-determinism
* AMD EPYC™ System Management Software \(E-SMS\)
https://www.wpgdadatong.com/blog/detail/74857
*Note that in the BIOS, use of the HSMP driver is disabled.*
* C2 State in Linux Kernel is C6 State in Hardware
https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58011-epyc-9004-tg-bios-and-workload.pdf
## TrueNAS Scale on Dell R730
> Reference \:
> How I Configure TrueNAS \(Complete Setup Guide\)
> https://www.youtube.com/watch?v=67KtKoW4IM0
## Dell R730 Powersaving
When going through the BIOS settings on the R730, we can find a profile selector that can be set to focus on power efficiency. Additionally, I turned off most of the cores to reduce power consumption.
# Server Room Upgrade\/Redesigned \(01\/25\/2025\)
## Server Room
Since the OPNsense router broke, I decided it was time to completely redesign the layout of the server room. I am lazy; therefore, I am not going to write down the current design and will only list the changes I made.
* Improve Windows Server for future usage \(Windows Visual Studio 2022\/2024 Development Environment for Algorithm Projects\).
* Replace the 256GB NVMe SSD with a 2TB NVMe SSD to improve drive lifespan and reserve space for other applications. The original plan was not a Windows Server with WSL but a Proxmox server, and for Proxmox the SSD did not need to be big. This also explains why I have another 4TB SSD installed. Everything is ad hoc.
* Renew the Tesla P40 cooler to help GPU cooling; [the fan](https://shopee.tw/B%E2%97%8F%E5%8F%B0%E7%81%A3%E4%B8%89%E5%B7%A8%E2%9C%AF-AC%E9%9B%A2%E5%BF%83%E9%A2%A8%E6%89%87-SJ12032-%E6%95%A3%E7%86%B1%E9%A2%A8%E6%89%87-%E9%BC%93%E9%A2%A8%E6%89%87-%E5%B0%8F%E5%9E%8B%E8%9D%B8%E7%89%9B%E6%89%87-%E5%B7%A5%E6%A5%AD%E9%A2%A8%E6%89%87-%E9%A2%A8%E9%BC%93-%E8%9D%B8%E7%89%9B%E9%A2%A8%E6%89%87-%E6%8E%92%E9%A2%A8%E6%89%87-%E9%80%9A%E9%A2%A8%E6%89%87-i.121938504.1912098549?sp_atk=122e8259-6bea-4548-8d11-d21be77f65a8&xptdk=122e8259-6bea-4548-8d11-d21be77f65a8) on the new cooler can be easily replaced when worn out.
* Replace the CPU cooler with a [Silverstone XE04](https://24h.pchome.com.tw/prod/DRAE9M-A900GO5Y1). The original NZXT water cooler kept disconnecting, and its fan would not spin at all; only the pump was working.
* Reduce cable complexity by replacing the HDMI cables with an [HDMI dummy](https://tw.shp.ee/72RuyWv) \(a display connection is required for Parsec to function\)
* Relocate the thin client \(Surface Go 2 with a broken screen; OS is Tiny11 22H2\) to a cooler place using a USB-C extension cable
* Get rid of the HDMI \& USB switch \(helps reduce complexity\)
* Redesign the cable placement
* Prepare for the new router
* Cable Management and Identification
* Relocate the voltage stabilizer
* Clean Servers
* Clean Server air filters
* Clean Rack fan
* EPYC Server Fan Control Setting
* I did not change much compared to the default. By default, all fans connected to the motherboard are driven by CPU temperature. I simply increased the [open loop table](https://www.geeksforgeeks.org/open-loop-control-system/) by 5\%.
2025\/03\/17 \: The EPYC server failed to boot. The issue is still related to PCIe enumeration \(code 93\). The motherboard was checked by an ASRock engineer and works properly, so the only remaining possibility I can deduce is on the CPU side. Therefore, I bought a cheap EPYC 7302P for testing. With the new CPU, the server boots without any issue, so my EPYC 7742 might be faulty. This CPU will be put into storage.
The only good thing about this downgrade is the reduced power consumption\: the TDP of the EPYC 7742 is 225W, while that of the EPYC 7302P is 155W.
## Offsite Server

Similar to the EPYC server in the server room, the offsite server is located in a completely different place with a different ISP.
* CPU \: AMD EPYC 7542
* GPU \: Nvidia RTX 2080 Ti
* RAM \: DDR4 ECC 3200
The offsite server currently \(01/25/2025\) runs Windows 11 without a TPM 2.0 module installed. A [TPM 2.0 module](https://tw.shp.ee/79ZtBML) has been acquired and is ready to be installed and enabled.

Basically, once the module is installed on the motherboard, it is automatically detected by the system. For this ROMED8-2T, there is an option in the BIOS settings to choose which TPM module connector is used \(02\/03\/2025\).

# Migrate Windows Server to an ASUS X99 Motherboard
*I don't want to put up with everything I went through with a NUMA system on Windows. The power consumption is far too high for a system that only serves experiments. Additionally, I might need to use this server as a normal PC in the future, when working on projects remotely is not feasible, since I don't want to use a laptop to compile for all the MCU and FPGA systems.*
## New Motherboard ASUS X99-A II
> Reference \:
> https://motherboarddb.com/motherboards/1647/X99-A%20II/
This motherboard allows the following \:
* **Fan control for Tesla P40 Cooler** \: The old X99-F8D does not have a PWM chassis fan connector.
* **Less power consumption** \: This allows me to keep the system running all the time while staying within my power budget.
* **Fan control for all the chassis fans** \: Extends the chassis fan lifetime and reduces noise.

Note that I have two M.2 SSDs and have to use a PCIe-to-M.2 adapter for one of them. This means I have three PCIe devices taking up three slots.

Note that this motherboard does not expose all the PCIe lanes of the Xeon E5 v3 CPU; thus only the Tesla P40 gets x16 lanes, while the Quadro T400 and the M.2 SSD get x8 lanes each.
# A Cluster within the system
The details are in the following post.
* Cluster Building & MPI Programming
https://hackmd.io/@Erebustsai/SkyCU2g4n