Future-proofing HA with local LLMs: Best compact, low-power hardware?

I’m currently running Home Assistant on a Raspberry Pi 4 and looking to upgrade. With HA moving toward local LLMs (e.g. run via Ollama), I want to make sure my next setup can handle these new demands. My main questions are:

  1. Hardware Requirements for LLMs (x86): What CPU and RAM are ideal for running HA with a local LLM? Would an Intel N100 be enough, or do I need a mini PC with a dedicated or external GPU?

  2. ARM Alternatives for Future-Proofing: Could the new Mac Mini M4, with its AI-optimized CPU/GPU, be a better choice? What’s the best way to run HA (with HACS) on Mac hardware: on macOS directly, in a VM, or some other way? Are there active efforts to optimize HA for Apple’s ARM chips, and any communities focused on using Apple hardware with HA?

I was initially looking at an Intel N100 mini PC from brands like Protectli, iKoolCore, Minisforum, and others. But now I’m thinking it may fall short once HA fully integrates local LLMs. I’d planned to use Proxmox on it for HA plus a few light tasks (like Samba and UniFi Controller) while keeping noise and heat low, as it’ll be in a living room cabinet.

Alternatively, the new Mac Mini M4 seems appealing, but there’s limited info on running HA (with Ollama) on Apple hardware, whether on macOS or Linux.

Considering Apple’s edge in AI performance, I could see development shifting toward Apple chips for HA with local LLMs. Since I want something future-proof, any advice would be much appreciated. Thanks!

1 Like

Hi aleco,

So, let’s face it: with LLMs there is no future-proofing. Right now the best GPU you can afford is what you need, but I bet that within 12 months Nvidia and others will have an LLM-specific board and that GPU will look old and slow.

I suggest getting the best GPU you can find for the money you want to spend and putting it in a PC that can handle it. I needed a laptop and bought a new (but year-old model) 13th-gen Intel machine with an RTX 4060 in it because I could, though it wasn’t cheap… It works on 7B models, but it’s too small (not enough memory) for larger models and crawls on those.

The landscape is so uncertain that even if HA wants to build something, they know it will be outdated in a year (more realistically months), so it doesn’t pay…

2 Likes

Thanks! I realize this landscape is evolving quickly. But the recent announcement “2024.6: Dipping our toes in the world of AI using LLMs 🤖 - Home Assistant” made me think that a small, HA-focused LLM could be on the horizon?

From what I’ve seen, LLMs are moving in two directions: complex, resource-heavy models and smaller, specialised ones that can run locally on lower-power devices, even on decent mobile phones. I also read research suggesting model weights could be reduced to roughly 1.58 bits per weight (ternary values: -1, 0, +1), which would reduce resource needs or allow larger models on the same hardware (rough numbers sketched below).
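
To put rough numbers on that idea (just a back-of-the-envelope sketch with an assumed 7B-parameter model, ignoring activations and context memory):

```python
import math

params = 7e9  # assumed 7B-parameter model, purely for illustration

# Bits per weight for different storage formats
fp16_bits = 16
int4_bits = 4
# Ternary weights (-1, 0, +1) carry log2(3) ~= 1.58 bits of information each
ternary_bits = math.log2(3)

def weight_gigabytes(n_params, bits_per_weight):
    """Approximate storage needed for the weights alone, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

print(f"FP16:    {weight_gigabytes(params, fp16_bits):.1f} GB")    # ~14.0 GB
print(f"4-bit:   {weight_gigabytes(params, int4_bits):.1f} GB")    # ~3.5 GB
print(f"ternary: {weight_gigabytes(params, ternary_bits):.1f} GB") # ~1.4 GB
```

If that research pans out, a model that needs a 16 GB GPU today could in principle fit in the RAM of a modest mini PC, which is exactly why I’m unsure how much hardware to buy now.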

All of this leaves me uncertain about which upgrade path to take. Moving from a Pi 4 to an Intel N100 mini PC might be a great short-term improvement, but I’d hate to find that the minimum requirements for enabling local LLMs in HA end up just a bit higher. Meanwhile, the new Mac Mini M4 feels like it has the right hardware (and Apple is certainly betting on it). So why can’t I run HA on it? Why are we limited to outdated x86 tech that lags so far behind on power efficiency?

I’m hoping for a Coral-type device for LLMs.

If these are made, they will be USB 3, M.2 or PCIe. They could also run on a separate host machine. There are really two paths: if space is limited, get a small NUC-like 1L PC, or opt for a larger PC chassis to allow expansion.

2 Likes

It will not come as USB 3, M.2 or PCIe.
It will be its own little computer.
It will not come tomorrow. It is already here.
It is named Nvidia Jetson.

They might come down in price in the future though.

1 Like

Coral looks great. I can’t comment on the specs, but the idea of attaching a device costing less than US$100 to an existing server to make it capable of handling an LLM for HA sounds great.

@WallyR As for the Nvidia Jetson, the difference from the Coral is that it is a standalone device with its own OS, connected to the home network? That’s also fine, but they will have to reduce the price drastically; the developer kit costs US$500.

Both options suggest that the future for home servers is offloading the AI to an external device. Is that something the Home Assistant core devs are actively working on or are Coral and Jetson just concepts at the current stage?

LLMs require a lot of floating-point computational power.
In today’s hardware we are talking high-end 3D graphics cards, starting with something like an Nvidia GeForce RTX 4060 or preferably better.
An Nvidia GeForce RTX 4060 can provide around 240 TOPS.
A Google Coral can provide 4 TOPS.

The devs are making the voice assistant pipeline modular, so the parts that are better performed by another piece of hardware can be offloaded to it, and they are also trying to specialize the LLMs to make them smaller, but a Coral would have to be MANY times better to be an option.

Another thing that speaks against the Coral is the memory requirement: a Coral uses shared memory from the host computer, which is a lot slower than normal graphics card memory.

Current LLMs often require 8 GB or more, and the specialized ones might only require 4 GB, but as the hardware gets faster and cheaper you will see this requirement actually go up a bit too, because LLM size relates to the capability and precision of the LLM. So a Coral-like device for LLMs will probably remain a dream for the next decade.
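
A rough way to see why the memory matters (a sketch with ballpark bandwidth figures I am assuming, not measured benchmarks): every generated token needs a full pass over the weights, so the token rate is roughly memory bandwidth divided by model size.

```python
# Rule of thumb: tokens/second is capped by memory_bandwidth / model_size,
# because each generated token reads the whole set of weights once.
# The bandwidth numbers below are assumed ballpark values for illustration.

model_size_gb = 4.0  # e.g. a 7B model quantized to roughly 4 bits per weight

setups = {
    "Dedicated GPU VRAM (RTX 4060 class, ~272 GB/s)": 272.0,
    "Shared DDR5 on a typical mini PC (~40 GB/s)": 40.0,
    "Coral sharing host memory over USB 3 (~0.4 GB/s)": 0.4,
}

for name, bandwidth_gb_per_s in setups.items():
    max_tokens_per_s = bandwidth_gb_per_s / model_size_gb
    print(f"{name}: ~{max_tokens_per_s:.1f} tokens/s upper bound")
```

So even before counting TOPS, the memory path alone rules the Coral out for LLM-sized models.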

Wally, thanks. So what do you think about the Mac Mini M4 with 16GB? Apple is betting on it to be performant enough to run AI tasks locally. It’s affordable, compact, low power. I just don’t know how to run HA with HACS on it.

1 Like

Sorry to say it, but not a chance!
CPUs are mostly designed for integer computing and LLMs need floating-point computing, which is why high-end graphics cards are needed.

Apple’s M4 chip in the MacMini is designed to handle floating-point computations effectively. The M4 features a 10-core CPU, a 10-core GPU, and a 16-core Neural Engine capable of 38 TOPS. This integrated architecture should allow the M4 to process LLMs efficiently without relying solely on dedicated GPUs. I believe 38 TOPS should be more than sufficient for handling a dedicated LLM for Home Assistant, don’t you think so?

1 Like

Well, 38 TOPS is not bad, so maybe the machine can be used for LLMs.
The question is then just how, because Apple is not exactly happy with opening up their hardware for free use.

And looking at the price for that Mac Mini, I do not think it will be targeted a lot.
A Jetson Orin NX will deliver twice as much power for half the price, and it is ready for free use.

I’m not familiar with the Jetson Orin NX, but if it is cheaper than a Mac Mini M4 (US$599 for 16GB/256GB), can run Home Assistant including the LLMs, and can also do a few lightweight tasks via Proxmox (like Samba and UniFi Controller), I’m all in.

And as for using Apple devices: that’s part of my question, whether one would install a different OS or virtualize HASS on top of macOS, because Ollama runs natively on macOS.
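
For example, if Ollama is left running natively on the Mac, anything on the LAN should be able to reach it over its HTTP API, and HA could stay on a separate box. A rough sketch (the hostname and model name are placeholders I made up, and it assumes Ollama was started with OLLAMA_HOST=0.0.0.0 so it listens on the network):

```python
import requests

# Placeholder hostname/model; assumes Ollama on the Mac listens on the LAN
# (OLLAMA_HOST=0.0.0.0) and that the model has already been pulled.
OLLAMA_URL = "http://mac-mini.local:11434/api/generate"

response = requests.post(
    OLLAMA_URL,
    json={
        "model": "llama3.2",
        "prompt": "Should I close the living room blinds?",
        "stream": False,  # ask for one complete JSON reply instead of a stream
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```

If I understand the docs right, HA’s Ollama integration essentially does the same thing once you point it at that URL, so HA itself would not have to run on the Mac at all.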

Okay, the price I found on the Mac Mini M4 was almost twice as high.
The issue with the Mac Mini is that just installing another OS is not that easy.
There are not many alternative OSes for Apple’s M-chips, and it is not enough just to be able to run the OS. You also need direct access to the floating-point units, and without extensive help from Apple that can be hard. This is also what kills the virtualization option.

The Jetson is open from the start and supported by Nvidia on many different levels, so it will just be easier to take on.

I do not know if there are any installations for HA VA on the Jetson available, but the Jetson is made with exactly the purpose of providing AI for other devices.

Thanks! I’m tempted to get a 16GB RAM Jetson with a 256GB SSD as an ARM-based home server for HA and light tasks, likely using Docker or LXD. But the lack of discussion around HAOS on Jetson or Coral makes me wonder if it’s safer to skip AI integration in HA for now, buy a compact low-power x86 mini PC as the HA server, and revisit in a couple of years.

I’ve been here before – I went all-in on Thread devices two years ago, only to realise later that Zigbee2MQTT is still miles ahead of Matter and Thread, which led me to switch to Zigbee. Now I have two networks and feel like buying into “new” tech early was a mistake. Maybe ignoring local LLMs in HA is the smart move for now. Sigh.

Do that. You can get an adequate NUC for the price of a Pi + SSD, or a pretty decent one for the same price if you’re not afraid to get something used.

Judging by your last paragraph, you’re a fellow early adopter who’s been bitten before. I have too, and I’ve learnt not to repeat my past mistakes.
I would suggest you do the same.

Coral is a TPU and plugs into your existing PC hardware.

Frigate NVR software uses a Coral for image processing.

That is my approach for now.

I have been toying with the same idea. I have a Jetson Nano Xavier and it is OK for a local LLM, but it only has 8 GB of memory, so it is very limited as to which models you can run on it. I know the newer Jetsons are much more capable. However, I think I am going to go the M4 Mac Mini route, using LM Studio on macOS to run the model. When I get my new M4 Mac Mini at the end of Nov I will post my results.

I’m confused as to why it is a requirement to install Home Assistant on the Mac at all. Ollama and all of the other local LLMs I’ve played around with on my Mac Studio offer OpenAI-compatible API calls. HA can run on a small little ODroid and just make API calls to your LLM box.
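
Something along these lines is really all the “integration” that is needed; the host and model name are placeholders, and the same call shape should work against Ollama, LM Studio or anything else exposing an OpenAI-style /v1/chat/completions endpoint:

```python
import requests

# Placeholder host/port: Ollama serves this on :11434, LM Studio on :1234.
API_URL = "http://llm-box.local:11434/v1/chat/completions"

payload = {
    "model": "llama3.2",  # whatever model the LLM box has loaded
    "messages": [
        {"role": "system", "content": "You are a Home Assistant voice assistant."},
        {"role": "user", "content": "Is the garage door still open?"},
    ],
}

reply = requests.post(API_URL, json=payload, timeout=120)
reply.raise_for_status()
print(reply.json()["choices"][0]["message"]["content"])
```

The box running HA just needs network access to the LLM machine; nothing LLM-related has to run on it.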

It is not necessary to install HA on the Mac.
It is necessary to get a network interface to the Mac’s LLM, and it is necessary to convert the HA Wyoming protocol to what the Mac’s LLM uses.
If there is a network interface to the LLM already, then the converter plugin could be made on the HA installation. If not, then a network interface needs to be developed for the Mac, and the converter could then be included in it or be separate, on either the Mac or HA.

A similar process needs to be made for the output.
I do not know anything about what is available on the Mac Mini M4, so I have no idea what there is to build on.