Nabu Casa should also sell larger “Home Assistant Green Plus” appliances as mid-range/high-end models with more expansion slots/ports and fast CPU/NPU/GPU for onboard AI acceleration

Hedda · December 12, 2024, 3:12pm

Well they should probably hurry up now if they want more users to run an on-premises AI agent with their upcoming voice hardware that is planned to be released next week see → https://youtube.com/live/ZgoaoTpIhm8

In case you missed it, “the biggest announcement of the year has yet to come though” as Nabu Casa will launch some kind of first-generation Home Assistant Assist Satellite voice smart speaker hardware product (based on XMOS XU316 DSP chip for advanced audio processing + ESP32-S3 running ESPHome) during the the live stream of Voice: “What’s new in voice - Chapter 8” on the 19th of December 2024 (which now state “Join us as we announce the voice assistant hardware we’ve been working hard on all year and see what’s new since our last Voice chapter!”)

They mentioned during the last Home Assistant release party that it will have immediate availability and that they have already manufactured A LOT of them so should not be a problem for loads of users to get it. A lot more info purposely “leaked” about that voice hardware in the Home Assistant 2024.12 release notes:

2024.12: Scene you in 2025! 🎄 - Home Assistant

Regardless, I think the existing Home Assistant Green hardware specifications do not need a refresh for it to continue as an entry-level model for users who do not want or need on-premises AI or more local storage.

Didgeridrew · December 12, 2024, 7:02pm

What makes you think that is an inherent goal of the soon-to-be-released hardware?

Sir_Goodenough · December 13, 2024, 2:59am

Hi Hedda,

I believe because things like a Beelink are so cheap, you can’t compete with it, so why try?

The green filled a niche as a preloaded entry level model that is supposed to just work.

Also, often times you can say more with fewer words, because people will actually read it all. That’s a LOT of word salad there.

gutifarra · December 13, 2024, 6:48am

As far as I know local AI (in the sense of LLMs to control voice assistants) is still a bleeding edge use case that is both expensive, unreliable and difficult for non-enthusiasts.

Top performing models that can reliably control a smart home need either a beefy GPU that is at least several hundred dollars, or a lot of RAM with expensive CPU like Mac M3. The Copilot-ready HW doesn’t have support for Ollama at the moment from what I can tell, and it’s unclear how much NPUs will help, in my understanding the main specs needed are memory bandwidth and capacity, not compute power. Because of the current AI craze, anything which can run LLMs is expensive.

Even when it does work, it’s a lot more power hungry and chaotic compared to the current intent handling. You are at the whim of the LLM and your prompt to decide what it should do, and it is very difficult to provide a good user experience using local models, even top models like Sonnet 3.5 of GPT4 get it wrong sometimes.

Given the current state I think it is too early for Nabu Casa to go into local AI. Hopefully the tech matures quickly and we can consider it soon but IMO it is still in the early adopter/enthusiast phase for now.

Hedda · December 13, 2024, 9:59am

If you already running Home Assistant on a Raspberry Pi or a computer with a spare M.2 slot then the barrier to entry has never been as low as now as you can simply buy either the 26-TOPS Raspberry Pi AI HAT+ or the 13-TOPS Raspberry Pi AI Kit (both available from $70 US-dollar if you can find any in stock) to quickly get started, and since supported in Home Assistant OS 14 should be easy to use.

So you no longer must have expensive GPU if you do not have higher reuirements. If you simply want to experiment with AI agents in Home Assistant to power a conversational agent then you can alternativly even run a local Ollama server for this purpose on fast modern CPU without a dedicated AI accelerator or GPU.

Yes features are still bleeding edge but progress is moving very fast now, just check out JLo’s latest demo in Home Assistant 2024.12 Release Party → https://www.youtube.com/watch?v=9Y9YY_YHNBY

Not ”the goal” but having a on-premises voice assistant () with option to run AI agent locally with local hardware acceleration is certainly one of the goals, even if it is experimatal only at an early stage today:

AI agents for the smart home - Home Assistant

You can already run a local LLM via Ollama today, and their roadmap does mentions related future plans:

Roadmap 2024 Year-end Update: Full steam ahead! - Home Assistant

While I can understand you maybe missing all the hype about using an “AI agents” as a conversation agent in Home Assistant if you are personally not interested in voice assistants but if you have read any of the official Home Assistant blog posts or watched some of release party videos (which they dony many demos of both AI agents and local LLM) during the last year or so then you can not have missed some things mentioned about both cloud and local LLM (Large Language Model) running “AI agent” to provide a conversation agent for Home Assistant, at least it has been written and talked about A LOT in Home Assistant blog posts and forum + covered even more by community creators ever since Home Assistant’s year of Voice was first announced in the end of 2022, (especially in concert with a voice assistant), with the latest Home Assistant OS 14 release adding support for Hailo-8 series of AI accelerators.

https://www.home-assistant.io/blog/2024/11/27/home-assistant-yellow-gets-cm5-support/#other-additions-to-haos-14

"On the topic of newly supported hardware, our release of Home Assistant OS 14 will bring support not only for CM5 but also for the Hailo-8 AI accelerator. This is the AI accelerator found in the Raspberry Pi AI Kit or the even more powerful Raspberry Pi AI HAT+ released last month, which is exclusively for the Raspberry Pi 5. For those using a Pi 5 they can now offload AI processing, like object or person detection, to this efficient add-on."

There are also several companies working on various upcoming (or already released) “Edge AI server” that is an seperate appliance which runs Ollama server locally on your network to provide AI acceleration, see example:

reComputer J1010 powered by NVIDIA Jetson Nano - Edge AI solution at Seeed Studio - Seeed Studio

and

AI Base Station – FutureProofHomes

Anyway, fact is that you can not only already a local Ollama LLM via the Ollama integration but you can run different local LLMs (Large Language Models) via the Ollama integration as its acts like an abstraction API

Ollama - Home Assistant

Anyway, local LLM is becoming more and more with hardware like the Hailo-8 series of AI accelerators (an AI accelerator module for running local LLM faster than you can on a CPU), such as example an Ollama addon for Home Assistant. So you can either create an Ollama server using an Hailo m,odule or Nvidia GPU in a computer on your network or simply put them in the computer running Home Assistant OS with that addon

GitHub - SirUli/homeassistant-ollama-addon: Provides an Home Assistant addon for Ollama
- Ollama Addon: Running SLMs Locally on the Same Box and Not Just Remotely

Also check out this custom component from HACS helping you add an conversation agent using Ollama:

GitHub - ej52/hass-ollama-conversation: Ollama conversation integration for Home Assistant

Suggest read these example references for more overview information on how very much on-topic this is:

Again there is loads more coverage about local LLM via Ollama server done by community creators. Ex:

Hedda · December 13, 2024, 10:58am

Thought I had indirectly explained my reasoning in my original post above that it is fact that many (but not all) people simply do not want run Home Assistant on DIY hardware even if such solution can often be simple to put together their own from off-the-shelf parts, instead many just want buy a complete and finished product because it is plug-and-play and designed to work out-of-the-box with as little time/effort to get started and maintain as possible.

If this was not the case then Nabu Casa would not have sold any Home Assistant Blue and Home Assistant Yellow as people could instead at that time buy Hardkernel’s ODROID-N2/ODROID-N2+ or a Raspberry Pi 4 for much less money. I think many existing users (like myself) buy stuff like that because we not only want a finished product but there are also other non-logical reasons (such as buying official Home Assistant branded products because of brand loyalty and believing it will a way directly and indirectly support this project), or simply for bragging rights and showing off to friends

But let me to add some better comparison from my perceived market and target audience point-of-view to other types of some shat similar hardware prodocts for which many people buy way more costly pre-built products that users could relatively easily put together alternative DIY-style solutions for much less cost but many users instead choose to buy a finished all-in-one product for much more money:

NAS (Network-Attached Storage) - Many if not most people to get a NAS who want the function but not the hassle of DIY:ing the hardware (like myself) buy expensive finished product from Synology, Qnap, Asustor Terramaster, (and now also Ubiquiti), etc. instead of just putting together a DIY solution from off-the-shelf computer parts (yes I know you can even buy inexpensive complete NAS-like chassis) using free firmware/operatingsystem/software such as TrueNAS, Unraid, OpenMediaVault, XigmaNAS (formerly NAS4Free), EasyNAS, SnapRAID etc…
NVR (Network Video Recorder + Camera Video Management) - Again, many if not most people choose to get either an all-in-one dedicated product (like example Reolink NVR) or use the NVR solution provided in their NAS (like Synology) or Router (like Ubiquiti’s Unifi Protect) instead of using DIY solutions such as example Frigate or Blue Iris.
Router/Firewall (and Wireless router) - Think one might not be the most obvious but I bet that most of us here have probably just bought finished Wi-Fi Router/Firewall product(s) from Asus, Netgear, D-Link, Ubiquiti (Unify), or Synology instead going with an alternative DIY solution when it today can actually be relativly easy to put together your own DIY Router/Firewall and Wireless router solution using off-the-shelf (ARM or x86 based) hardware parts and operatingsystem/software such as example OpenWRT, DD-WRT, Tomato/FreshTomato, pfSense, OPNsense, Zenarmor, or similar.

Everything about Home Assistant is niche. I believe that mid-range and high-end all-in-one models would fill two more niches as some people will prefer to buy a completey finished product instead of putting together a DIY solution, and some of those prefer to buy one brand because of brand loyalty or other reasons. Whatever the reason

This is partially also why I bought branded merchandise from the official Home Assistant Store (and branded products from other similar projects) to show my support for things I like. I guess could instead just always black black polo shirt and blue jeans like Steve Jobs, but I do not want to

PS: Speaking for myself, my day-job is a server and storage sysadmin so I could even get old but free enterprise hardware if I wanted but instead for our home I choose to buy a Home Assistant Blue, Home Assistant Yellow, a Synology NAS, and Ubiquiti Unify network components instead of putting together DIY solution from off-the-shelf, (and partily also that I do not want to have another job maintaining several DIY solutions at home when I do similar stuff at work even if those are enterprise solutions).

I know that rant on but you do not need to read it, it is part of my own personal process, I have OCD and I prefer to research stuff myself in detail rather than waiting for someone to spoonfeed me details bit by bit. Still do not think there is an excuse for you posting abusive personal attack comments that are not only off-topic but offensive.

unclfuzzy · December 13, 2024, 10:59am

I have to agree about encouraging a higher-end plug and play solution. I’m kind of locked in FOBO before getting started. Given how much time and money one has to spend for the rest of the components required to smarten up a home, I would be very surprised if a $/£/€ 300 ready to go device that is fairly well future proofed didn’t sell better than the Green.

gutifarra · December 13, 2024, 6:36pm

If you already running Home Assistant on a Raspberry Pi or a computer with a spare M.2 slot then the barrier to entry has never been as low as now as you can simply buy either the 26-TOPS Raspberry Pi AI HAT+ or the 13-TOPS Raspberry Pi AI Kit (available from $70 US-dollar) to quickly get started.

So you no longer need expensive GPU, and you if you simply want to experiment with it you can even run a local Ollama server for this purpose on modern CPU without a dedicated AI accelerator or GPU.

These AI accelerators do not work for LLMs, they are meant for other purposes (normally computer vision but not an expert in this area). At the moment you need a GPU with plenty of VRAM if you want an correct answer in short time. If you want assist integration, most of the smaller 3B/7B I have tried do not work well. You can also use a CPU with a lot of RAM, but then it is slow for a lot of answers.

Basically at the moment, you can have either cheap, fast, high quality, but only choose 2. Even then, LLMs still do strange things, and are slower than the inbuilt intent handler.

Hedda · December 14, 2024, 4:58pm

Ah, apparently the less expensive modules in the Hailo family will not work unless the LLM model has specifically using thier compiler on very powerful hardware.

But they instead have a newer and more expensive Hailo 10 series with a Hailo 10H M.2 model that might work with Ollama!

Generative AI Accelerator: The Efficient Hailo-10H M.2 Module
- https://hailo.ai/files/hailo-10h-m-2-et-product-brief-en/

gutifarra · December 14, 2024, 6:17pm

I’m still not hopeful, from the looks of this thread on reddit it’s only 8GB which is pretty small for function calling.

Hedda · December 14, 2024, 7:30pm

Ollama smallest model (llama 2) is 7B parameters and looks to require at least 8GB of RAM so guess similar for these AI accelerators or?

llama2:7b

However that is out-of-the-box but I understand it is possible to finetune those models using several faster GPUs first (locally on another computer or more likely in the cloud) which can reduce the RAM requirements?

https://www.datacamp.com/tutorial/fine-tuning-llama-2
Fine-Tuning LLaMA-v2–7B on Google Colab: Unleashing the Full Potential of Language Modeling | by vignesh yaadav | Medium

gutifarra · December 14, 2024, 7:41pm

Llama 3.1 has an 8B model which I have used on my N100 machine, but they are still pretty terrible for assist. It’s definitely OK if a bit slow for a fallback, but not usable as a main model.

From what I understand fine tuning will not reduce the model size but can squeeze better results for specific use cases from smaller models.

Sir_Goodenough · December 15, 2024, 9:54pm

There were too many words to read, and you spent too much getting your AI to write them all for you, IMO.

That AI hat thing is probably already out of date, that’s why HA isn’t wasting resources on a tool for that, yet.

Teach a man to say things in fewer words so that people can read them and you end up spreading the word faster.

Hedda · December 17, 2024, 7:33pm

@Sir_Goodenough Do you realize that most of your comments are not only off-topic but extremly rude and offensive, they are a personal attack that have absolutly nothing to do with the topic at hand, they only make it seem like you are trolling. For the record I have never used AI to write a single word here (if I did had then I am sure it would be better written). Yes I know I use too many words, however the reason why is I have ADHD and OCD as well as probably being on the Autism Spectum, and explaining things in details is my coping mechanism for making myself understod, and on top of that English is not my first language, so those are my excuses. What is your excuse for trolling and being a jerk doing so? In my opinion the unnecessary negativity and unrelated acusations in your posts is certainly not contributing anything constructive to this discussion.

TH3xR34P3R · December 17, 2024, 7:36pm

Lets not degrade things into back and forth attacks.

petro · December 17, 2024, 7:37pm

Hedda, his words aren’t really offensive, they are more in line with constructive criticism. I believe you are searching for negative inflections in his post. FYI

Yes please.

Sir_Goodenough · December 17, 2024, 7:50pm

No offense meant. I just found the massive amount of words that take forever to get to the point a bit offensive myself, so I decided to point that out. My OCD flavor will not let me read all of that, and my cognitive memory disorder gets me frustrated trying to follow the long and involved points.

I appreciate you are contributing to the community, but perhaps starting a wordpress blog and posting links to it with TL:DR shorter posts here would allow you to add even more detail in your dissertations and the links from here would be the connection to the HA world to get your point out there. Then people interested in what you say could read it. You have full editorial control. And you don’t have to put up with Jerks like me (paraphrasing you a bit). As it is the long an detailed stuff is impossible for me to read with my disabilities and I take offense.

That said, I’m not going to respond any more, I don’t have any intention for this to become destructive rather than constructive.
I wish you well in your endeavors.

Hedda · December 19, 2024, 10:12am

Back on-topic; NVIDIA just launched their NVIDIA Jetson Orin Nano Super Developer Kit for $249(US) which include a Jetson Orin Nano compute module and a carrier board for it:

At the announcent NVIDIA’s CEO where marketing this kit as ”The World’s Most Affordable Generative AI Computer”:

That specific development kit is based on the 8GB variant model of new NVIDIA Jetson Orin Nano module series which could potentially be good choice for use in a such suggested mid-range model with buitl-in AI acceleration capabilities to run a smaller LLM.

NVIDIA also makes a 4GB variant, two 8GB variants (with different memory bandwidth), and a 16GB variant in that same NVIDIA Jetson Orin Nano series, but the other modules looks to not yet be available or there is at least no price for the other variants that I could find, however I believe a compute module like that 16GB model could potentially be good enough to be used in a high-end model.

$249 NVIDIA Jetson Orin Nano Super Developer Kit targets generative AI applications at the edge - CNX Software

As a proof-of-concept the same NVIDIA Jetson Orin Nano module is used in reComputer J3011 (Edge AI Computer with NVIDIA Jetson Orin Nano 8GB) sold for $599(US) by Seeed Studio:

reComputer J3011-Edge AI Device with Jetson Orin™ Nano 8GB module - Seeed Studio

Hedda · December 20, 2024, 8:32am

Well now with the launch of the official Home Assistant Voice Preview Edition there is another clear reason for more powerful hardware, it is officially stated that the Home Assistant Green (and Home Assistant Yellow) does not have powerful enough hardware to run the larger Whisper model for Assist locally.

This was also mention multiple times by Paulus, J-Lo and Mile in the video linked here as well as shown in the workflow diagram

”The N100 platform by Intel is our recommended hardware for local voice, that is like the bare-minimum.”

”If you are going to go local then we recommend out-of-the-gate something like an Intel N100 or better, especially for Whisper, if you have a graphics processor with a couple of gigs VRAM then you can get really fast large models that have high accuracy running.”

So if you run Home Assistent OS on hardware with a relativly slow CPU (like the Home Assistant Green) then for voice control you must off-load speech-to-text and text-to-speech to cloud services for it to be fast enough to be usable. If you want to run fully locally then you need a faster computer to run Home Assistent OS and the local Whisper piper on.

Note that they are here not even talking about LLMs for AI conversation agents but only the STT and TTS parts. It is the Whisper part, i.e. the Speech-To-Text parts that is usually the bottleneck for local voice control (without LLM). Meaning you will be dependet on the internet and cloud services for voice control if you do not have faster hardware to run everything locally.

Hedda · December 21, 2024, 8:19pm

Radxa Orion O6 mini-ITX motherboard that allow you to build your own ARM-based mini-PC looks like a a larger very flexible design with a theoretilcal 45 TOPS combined AI compute capability.

Radxa Orion O6 mini-ITX motherboard is powered by Cix P1 12-core Armv9 SoC with a 30 TOPS AI accelerator - CNX Software

It is powered by Cix P1 12-core Armv9 SoC with core Armv9 processor with four Cortex-A720 cores clocked at 2.8 GHz, four Cortex-A720 cores at 2.4GHz, and four low-power Cortex-A520 cores clocked at 1.8 GHz. The Cix P1 SoC also features an Arm Immortalis-G720 GPU for graphics and AI computing, a 30 TOPS AI accelerator. It also features two M.2 socket with PCIe and a full PCIe x16 slot, plus more.

Thwy also plans to sell this board as part of a Radxa AI PC Development Kit with an enclosure some time next year and are already taking preorders.