Installed CentOS 9 Stream.
dnf update
Reference --
https://medium.com/@blackhorseya/step-by-step-guide-to-installing-nvidia-drivers-on-rhel-9-1107e0cd641d
*skipped RHEL 9 CodeReady Builder repo step*
dnf makecache
*Install the epel-release package*
dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
*Update the DNF package repository cache again to ensure all changes take effect*
dnf update
*Install Dependencies and Build Tools*
dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc make dkms acpid libglvnd-glx libglvnd-opengl libglvnd-devel pkgconfig
*Add the Nvidia CUDA Repository*
dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel9/$(uname -i)/cuda-rhel9.repo
*Install Nvidia GPU Drivers*
sudo dnf module install -y nvidia-driver:open-dkms
*Reboot the system*
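The reboot itself:
sudo reboot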
lsmod | grep nvidia # Should return results
lsmod | grep nouveau # Should return nothing
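Optionally, nvidia-smi gives a fuller check of the driver and the GPUs it sees:
nvidia-smi # Should list the GPUs and the driver/CUDA version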

Looks good!
*If you need to remove the drivers*
dnf remove -y nvidia-driver
dnf module reset -y nvidia-driver
Reference --
https://docs.docker.com/engine/install/centos/#install-using-the-repository
*Install docker*
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

systemctl start docker
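To have Docker come back after a reboot (optional):
systemctl enable docker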
*To test*
docker run hello-world

Reference --
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-the-nvidia-container-toolkit
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit

*Configuring Docker*
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
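A quick smoke test (a minimal sketch; the ubuntu image is just a convenient base, since the NVIDIA runtime injects nvidia-smi into the container):
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi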

*install ollama*
curl -fsSL https://ollama.com/install.sh | sh
This script will search for a GPU and try to install drivers, but since they are already installed it should detect that.

Sees GPU Installed.
*Running ollama on the CLI*
ollama run llama3.1
- Calls the llama3.1 LLM; it will pull the model down if it is not already downloaded.
Use "watch -n 0.5 nvidia-smi" to see the status of the GPUs.

The bottom line shows the LLM loaded into memory on GPU 0.
Asked the LLM "why is the sky blue?"

shows GPU 0 under load
everything seems to be working.
*This allows multiple LLMs to be loaded at once (5) and multiple requests to be handled in parallel (10)*
dnf install vim
vim /etc/systemd/system/ollama.service
Add the following two lines under the [Service] section:
Environment="OLLAMA_NUM_PARALLEL=10"
Environment="OLLAMA_MAX_LOADED_MODELS=5"
systemctl daemon-reload
systemctl restart ollama.service
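To confirm the service picked up the new variables (optional):
systemctl show ollama.service --property=Environment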
opened another window and ran gemma2 LLM
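Assuming the default model tag, that would be:
ollama run gemma2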

Now we see both models running, one on each GPU.
Let's ask both LLMs the same question.

Note the GPU utilization and wattage.
dnf install cockpit
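Cockpit also needs its socket enabled to serve the web console on port 9090 (standard on EL9):
systemctl enable --now cockpit.socket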
Mounted the RAID 5 set (2.7 TB) at /mnt/raid5/.
Open WebUI container data will go here: /mnt/raid5/openwebui-data
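The mount commands themselves aren't shown; a minimal sketch, assuming the array appears as /dev/md0 with an xfs filesystem (device name and filesystem type are assumptions):
mkdir -p /mnt/raid5
mount /dev/md0 /mnt/raid5
echo '/dev/md0 /mnt/raid5 xfs defaults 0 0' >> /etc/fstab   # or use the UUID from blkid
mkdir -p /mnt/raid5/openwebui-data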
Now, this is where I had issues: getting the Open WebUI container running properly.
Reference --
https://docs.openwebui.com/
To run Open WebUI with Nvidia GPU support
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v /mnt/raid5/openwebui-data:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
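A quick way to confirm it came up:
docker ps --filter name=open-webui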


Container is up and running

Looks like the webpage is up. Created initial account and logged in.

Try to upload an LLM

Per this site, this should fix it:
https://github.com/open-webui/open-webui/blob/main/TROUBLESHOOTING.md
Trying this
docker run -d --network=host --gpus all -e OLLAMA_BASE_URL=http://127.0.0.1:11434 -v /mnt/raid5/openwebui-data:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
docker run -d --network=host -v /mnt/raid5/openwebui-data:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
At this point the container keeps restarting.

docker logs open-webui
Now it can't find the NVIDIA driver.

docker stop open-webui
docker rm open-webui
Went back to the original docker run command.
Container is up, but here are the logs:

Remembered from WireGuard setups that adding masquerade to the firewall helps with hairpin traffic within the system. Not sure if it applies here, but tried it:
firewall-cmd --add-masquerade --permanent
firewall-cmd --reload
Does not seem to work. Noticed it's trying to connect to 172.17.0.1; is that the container network's gateway? Hmmm.
I think we had this issue before. Can't recall what we had to do.
*** Update *** Found the issue.
I needed to run the ollama serve command bound to the host's IP (by default Ollama only listens on 127.0.0.1, so the bridged container could not reach it):
OLLAMA_HOST=192.168.1.184:11435 ollama serve
Once that is up, you will see the connections to the service.

I can now see this when hitting that port on my host:
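For example (assuming the 11434 port used in the final service file; the root endpoint just returns a liveness string):
curl http://192.168.1.184:11434/   # returns 'Ollama is running'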

and now the container can talk to the ollama service on the host after changing the settings.

I edited the /etc/systemd/system/ollama.service file to add that environment variable to the service so it comes up that way every time.
The file contents:
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
Environment="OLLAMA_NUM_PARALLEL=10"
Environment="OLLAMA_MAX_LOADED_MODELS=5"
Environment="OLLAMA_HOST=192.168.1.184:11434 ollama serve"
[Install]
WantedBy=default.target
Save the file, then reload systemd and restart the service:
systemctl daemon-reload
systemctl restart ollama.service
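To confirm the service came back cleanly with the new environment (optional):
systemctl status ollama.service
journalctl -u ollama.service -n 20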
Everything seems to be working!
**** Yay me! :) ***