Agreed, I think most of the work here is the pipeline between the HA instance and its prompts (entity names, etc.), and between the HA instance and the response. What happens between the prompt and the response just goes through the API.
In that sense it makes the most sense to get rolling using the OpenAI API so people can test and try things out. I'm running Ollama locally now, but while the local solutions are working well, they are also changing fast, their APIs are still shifting, and setting them up can be tricky, especially if you are hoping to use your GPU for larger models.
I'm sure this is ultimately the goal to support; it just takes time that runs in parallel to the time needed to integrate the responses with HA's controls (which is what we are seeing here).
The most useful application of LLMs in HA would be building automations and scripts from prompts.
"Create an automation that turns on the living room light when I'm in the room and turns it off automatically when the upstairs is vacant for at least 30 minutes. If it's after 8pm, turn on the lamp instead."
It's a relatively fixed problem space in terms of variables. It would be a lot easier than using Copilot's hacked-together YAML, since you could have deep knowledge of the triggers, conditions, and actions available.
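For illustration, here is a rough sketch of the kind of automation YAML an LLM could be asked to emit for the prompt above. The entity IDs (binary_sensor.living_room_occupancy, binary_sensor.upstairs_occupancy, light.living_room, light.living_room_lamp) are placeholders I made up; a real integration would have to resolve them against the actual entity registry.

```yaml
# Rough sketch only; entity IDs below are invented placeholders.
alias: "Living room presence lighting"
trigger:
  - platform: state
    entity_id: binary_sensor.living_room_occupancy
    to: "on"
    id: occupied
  - platform: state
    entity_id: binary_sensor.upstairs_occupancy
    to: "off"
    for: "00:30:00"
    id: vacant
action:
  - choose:
      # Occupied before 8pm: turn on the main light
      - conditions:
          - condition: trigger
            id: occupied
          - condition: time
            before: "20:00:00"
        sequence:
          - service: light.turn_on
            target:
              entity_id: light.living_room
      # Occupied after 8pm: turn on the lamp instead
      - conditions:
          - condition: trigger
            id: occupied
          - condition: time
            after: "20:00:00"
        sequence:
          - service: light.turn_on
            target:
              entity_id: light.living_room_lamp
      # Upstairs vacant for 30 minutes: turn the light off
      - conditions:
          - condition: trigger
            id: vacant
        sequence:
          - service: light.turn_off
            target:
              entity_id: light.living_room
mode: restart
```

Precisely because the schema is this constrained, validating the model's output against the trigger/condition/action schema before saving it seems very doable.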
I'd prioritize that higher than Assist integration, IMO.
Yeah, that's totally fair. I'm not used to HA development prioritizing an external service first, but I can understand that it might be easier to start with that for this particular functionality.
Mostly, I just wish this had been addressed a little more explicitly. I don't disagree that "dipping our toes" implies a beginning, but I would have loved an acknowledgment in the blog post, like: "We're launching this feature with support for external LLM services to begin with. In the future, Home Assistant will also support local-only LLMs like Ollama."
Ollama is not an LLM but a service that can run most local LLMs. It was added to Home Assistant in 2024.4: Ollama - Home Assistant
As we mentioned in the live stream, and will expand on further in our deep-dive blog post tomorrow, we're collaborating with NVIDIA to bring control of Home Assistant to Ollama. It doesn't support it out of the box, nor are any of the main open-source models tuned for it. It's something that needs to be built from scratch, and something NVIDIA showed a prototype of last week: https://youtu.be/aq7QS9AtwE8?si=yZilHo4uDUCAQiqN
Fair! I'm excited to learn more. As for the Ollama bit, you got me; I'm not entirely sure how all the pieces intersect, and mostly I just meant "some sort of local LLM".
That is amazing… congrats! I am watching key NVIDIA developers doing stuff with Home Assistant, the same folks at the company that just passed Apple's valuation at over USD 3 trillion and that works with Musk's companies Tesla and X… keep it up and feel good that you are doing some important and fun work!
How about adding Azure OpenAI Conversation to Home Assistant Cloud? That way I could justify a bill higher than $65, since I could discontinue my OpenAI subscription. Of course, it should be an option for those who want to use it.
We already have Google Assistant and Alexa on Home Assistant Cloud. Why not OpenAI?
While I like some of the intention behind integrating AI into Home Assistant, I think it's kind of misleading to say that the LLM will be "understanding the intention behind the spoken command". LLMs run statistical calculations to determine the most likely response to a query. They have no intelligence as we know it. No true knowledge. They just predict the most likely sequence of tokens.
The demonstration of the AI voice assistant, while interesting, left me wondering. Is it better to say "I'm doing a meeting. Make sure people see my face", or to simply say "Turn on my webcam light"? How subtle do you expect home commands to be that they require the energy and processing overhead of an LLM? I fail to see the use cases. I hope that the Home Assistant team concentrates on continuing to make home automation that fits the habits of regular people.
Yep, I would much rather HA Voice be able to handle minor mistakes in my speech locally and turn things on/off, etc., than rely on an LLM to figure out what I want from some obscure request.
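For what it's worth, the local pipeline already gets reasonably far with custom sentences. A sketch like the one below (a file under config/custom_sentences/en/; the exposed entity name "Webcam light" and the exact phrasings are just my assumptions) lets Assist match a few wordings locally without any LLM involved:

```yaml
# config/custom_sentences/en/webcam_light.yaml
# Sketch only: maps a couple of phrasings onto the built-in HassTurnOn intent.
language: "en"
intents:
  HassTurnOn:
    data:
      - sentences:
          - "turn on [my] webcam light"   # square brackets mark optional words
          - "meeting mode"
        slots:
          name: "Webcam light"  # assumes an entity exposed to Assist with this name
```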
One idea for background images: allow adding a room photo to a room card or card section. This would give a visual cue for the room's controls rather than just text or an icon.
One control on images would be a transparency level, so as not to make the background images too distracting.
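Something in that direction can already be approximated at the view level in a raw YAML dashboard config. A rough sketch (the image path and the rgba overlay values are just examples I picked; the overlay stands in for a transparency control) could look like:

```yaml
views:
  - title: Living Room
    # CSS background: a semi-transparent white overlay on top of the room photo
    # keeps the image subtle, roughly what a transparency slider would do.
    background: >-
      linear-gradient(rgba(255, 255, 255, 0.7), rgba(255, 255, 255, 0.7)),
      center / cover no-repeat url("/local/living_room.jpg")
    cards:
      - type: light
        entity: light.living_room
```

A per-card version with a dedicated transparency setting would still be new, so +1 to the suggestion.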
The new 2024.6 release is fantastic and makes me really eager to get my hands dirty and replace all those Amazon Echos with something that can actually understand what I mean to say.
And here comes the problem: replace them with what? Are there still no aesthetically pleasing devices out there? Or at least some nice 3D-printed cases to put the ESP32 wire hell in? Any ideas are welcome, thank you.