Run HA on a server?

Hi all.

I have a desktop running a small LLM via a Home Assistant add-on (Oobabooga), and I’ve hit the ceiling on the model size I can “effectively” use.
Currently I have a Ryzen 5 3600 and 16GB of RAM on a Lenovo “special” motherboard. I have been considering 128GB of ECC server RAM, since I can get it cheap and it would increase the model size I can “effectively” run, but I would need to swap out the motherboard to do this.

Alternatively, I could ditch the desktop and get a server/workstation quite cheap. It has two Xeon Gold 6140 CPUs and 4 x 32GB DDR4 2400T ECC RAM. Would HA run on such a creature? Would it just be a waste? I am not a fan of Proxmox due to issues with Corals, so bare metal is preferred.
I use HA for a bit of image analysis as well, with the Coral, but compared to the LLMs, anything else I run on the machine is peanuts.

Comparable in price to the server/workstation is an RTX 3090, but its 24GB of VRAM would limit the model size quite a bit, even if it would be fast. A new motherboard for the desktop I already have, plus 128GB of DDR4 RAM, would also cost about the same as an RTX 3090.

Would the best alternative be the server/workstation WITH an RTX 3090, and then offload layers to the GPU?
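
To make the RAM vs. VRAM trade-off concrete, here is a rough back-of-envelope sketch of the sizing arithmetic. Everything in it is an assumption for illustration: the 70B parameter count, 4-bit quantization, 80 layers, and 2GB of reserved VRAM are hypothetical example values, the function names are made up, and the estimate ignores KV cache and runtime overhead and treats all layers as equal in size.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and loader overhead.
    """
    return params_billion * bits_per_weight / 8


def layers_on_gpu(total_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 2.0) -> int:
    """Estimate how many layers fit in VRAM, assuming equally sized layers.

    reserve_gb is a guessed headroom for context and driver use (assumption).
    """
    per_layer_gb = total_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    # Hypothetical example: a 70B model at 4 bits per weight with 80 layers.
    size = model_size_gb(70, 4)
    gpu_layers = layers_on_gpu(size, n_layers=80, vram_gb=24.0)
    print(f"weights: {size:.1f} GB")            # weights: 35.0 GB
    print(f"layers on GPU: {gpu_layers} / 80")  # layers on GPU: 50 / 80
```

Under these assumptions, such a model would not fit in 24GB of VRAM on its own, but roughly half or more of its layers could sit on the GPU while the rest stays in system RAM. Real numbers depend on the quantization format, context length, and loader.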

Or am I completely lost?