AI agents for the smart home

I suspected this might happen… I have not looked hard enough! But this is when this great community steps in!
Thanks a lot @royrogger, I will try it right away.

Works just as intended, great so far.
Thanks again.

By the way, I just realised that the recommended settings for the new GPT Assist were pointing to gpt-4o, which is very expensive compared to gpt-3.5-turbo.
It might be me who somehow changed the default recommended settings, but it is worth checking before one starts spending $$$ on voice assist.

It would be great to be able to specify the URL for the OpenAI integration, or at least to choose the model that’s used. I’ve integrated a voice assistant with OpenAI, and after only an hour or two of testing my bill is already a couple of dollars. I run models locally and would much prefer to use a local model. Tools like LM Studio let you run a bunch of different models and expose an API that is the same as OpenAI’s; all that would be needed is the ability to specify the URL and model name (and omit the API key). It would also be nice to be able to see the prompts that are sent, to help with debugging.
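For context, “the same API as OpenAI’s” means something like the sketch below, using the standard openai Python client pointed at a local server; the URL/port (LM Studio’s usual default), the dummy key, and the model name are placeholders for whatever your local setup exposes, not anything HA-specific:

# Sketch: talking to a local OpenAI-compatible server (e.g. LM Studio) instead
# of api.openai.com. Only the base URL, API key, and model name change; the
# chat-completions request/response format stays the same.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # local server URL (assumed LM Studio default)
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # whichever model is loaded locally (placeholder name)
    messages=[
        {"role": "system", "content": "You are a smart home assistant."},
        {"role": "user", "content": "Turn off the kitchen lights."},
    ],
)
print(response.choices[0].message.content)

Which is why exposing just a base URL and model name in the integration would be enough to swap in a local backend.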

I dug through the integration code a bit and it looks like it might be fairly easy to add some customization options: core/homeassistant/components/openai_conversation at 2024.6.2 · home-assistant/core · GitHub

I would be happy to help with this, though I’ve never contributed to HA before.

Would love to see an AI assistant for Home Assistant’s Automation Editor to help create automations.

AI assistant support in Assist could lower the barrier to entry by helping new users get started with creating automations from scratch in Home Assistant’s Automation Editor.

I would also love to see an ESPHome Automation Editor with an LLM-based YAML code generator assistant, optimized to help write and verify advanced on-device automations and templates for ESPHome’s integrated automation engine.

That should make it much easier for beginners to build advanced Home Assistant automations (and scripts), as well as ESPHome-based DIY devices with more complex stand-alone automations and scripts running on the embedded SoC itself.

Hopefully, making automation and scripting more accessible would in turn lead to wider adoption and a larger user base for Home Assistant and ESPHome, which would benefit everyone in the community.

It would be awesome if an AI assistant inside Home Assistant’s Automation Editor could then also help users write advanced or complex ESPHome automations that can be uploaded directly from the editor to the device, so they keep running standalone on-device even when Home Assistant is down.

BLUF: Anyone planning on integrating LLMs (AI) into their Home Assistant instance should read this article by former OpenAI researcher Leopold Aschenbrenner (formerly on the Superalignment Team):
Situational-awareness.AI, especially From AGI to Superintelligence

Once we get AGI [~3 years]… we’d be able to run millions of copies (and soon at 10x+ human speed) of the automated AI researchers…The automated AI researchers could have way better intuitions.

Before we know it, we would have superintelligence on our hands—AI systems vastly smarter than humans, capable of novel, creative, complicated behavior we couldn’t even begin to understand, perhaps even a small civilization of billions of them. Their power would be vast, too. Applying superintelligence to R&D in other fields, explosive progress would broaden from just ML research; soon they’d solve [cheap] robotics [replication], make dramatic leaps across other fields of science and technology within years, and an industrial explosion would follow.

Superintelligence would likely provide a decisive military advantage, and unfold untold powers of destruction. We will be faced with one of the most intense and volatile moments of human history.

Whoever controls superintelligence will quite possibly have enough power to seize control from pre-superintelligence forces. Even without robots, the small civilization of superintelligences would be able to hack any undefended military, election, television, etc. system, cunningly persuade generals and electorates, economically out-compete nation-states, design new synthetic bio-weapons and then pay a human in bitcoin to synthesize it, and so on.

Two related points:

  1. As a former AI practitioner who envisioned a future where AI provided exciting new capabilities, I am now very alarmed at the amount of energy being used by LLMs. Their use is predicted to skyrocket to the point where electricity providers are planning to bring old mothballed coal plants back online and will likely have to abandon 100% clean-energy goals. Use of LLMs is accelerating global warming, and the rate of acceleration continues to increase. I no longer even use Google Search, since it now employs an LLM to summarize search results.

  2. The blog stated “AI models hallucinate and their output cannot be completely trusted”. To be precise, the types of models that hallucinate are mainly Generative Adversarial Networks (GANs) and Transformers (the latter being the building blocks of LLMs). Artificial Narrow Intelligence (ANI) models, on the other hand, focus on very specific tasks, synthesizing concepts, trends, and data relationships from large data sets to do everything from detecting the early onset of cancer in CT scans to planning routes for autonomous robots on other worlds. Compared to LLMs, ANI models are multiple orders of magnitude smaller and use similarly far smaller amounts of energy.


@Rafaille It looks as if this feature is planned in core:
https://www.reddit.com/r/homeassistant/comments/1djnn1k/comment/l9c216n/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button


Athozs/hass-environment-variable: Add Environment Variable to Home Assistant. (github.com)

This integration can change the OpenAI base URL, which I point at a local inference tool (it can then only be used for chatting; it cannot control devices). But there is currently no local inference tool that supports the LLM API; at the moment, only OpenAI and Google AI support it.

Interesting - what’s the env var to set the OpenAI URL? I actually went and modified the OpenAI conversation integration and raised a PR to the HA core repo, but it was denied with no real reason given and no guidance on what should be done instead, and then they locked the comments: add option to change openai base url by j4ys0n · Pull Request #122423 · home-assistant/core · GitHub

# configuration.yaml
environment_variable:
  OPENAI_BASE_URL: http://192.168.3.241:9997/v1

Modify the URL and restart HA. The key can be filled in with anything.


Forget the visionary approach and just give me an LLM to write scripts and automations.

Existing LLMs can already do this.

I have been experimenting with local LLMs controlling HomeAssistant for a while now (I’m the author of the blog post from January), and have noticed that RAG actually makes a massive impact with (relatively) little work. I blogged about it a little while back if you’re interested in the details.

The idea is, it’s unlikely that the user ever wants to take action on their entire house, but rather only some portion of their house (for example a room, a floor, or a specific entity). We can use RAG in its simplest form (retrieve pre-computed embeddings for each area/entity and get the top X results ranked by cosine similarity) and it will drastically cut down on processing times. I have been using this at my house and it made a big impact despite adding even more things to the LLM’s total context, while not causing any quality issues I could perceive.
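To make the “simplest form” concrete, the retrieval step looks roughly like the sketch below. It assumes embeddings for each area/entity have already been pre-computed; embed() and the entity names are placeholders rather than the actual code from the blog post:

# Sketch of minimal RAG: rank areas/entities by cosine similarity to the
# request and only expose the top-k to the LLM, shrinking the prompt.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_entities(query_vec: np.ndarray,
                   entity_vecs: dict[str, np.ndarray],
                   k: int = 10) -> list[str]:
    # score every pre-computed entity embedding against the query embedding
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in entity_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# usage sketch (embed() stands in for whatever embedding model was used):
# query_vec = embed("turn on the kitchen lights")
# relevant = top_k_entities(query_vec, precomputed_entity_embeddings)
# ...then only the entities in `relevant` go into the LLM's prompt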

This performance boost comes from the prefill cost (the processing required before the LLM outputs anything), which grows faster than linearly with context length (the attention computation is roughly quadratic in it), so reducing the context has an outsized, non-linear effect on performance. This matters especially without GPUs: as far as my experiments go, prefill is the slowest part of CPU inference.

I have also included examples that are dynamically generated for in-context learning (based on the smart home state itself), which I found massively improves function-calling accuracy while giving the LLM the ability to adapt to new HomeAssistant APIs that it was not trained on.
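As a rough illustration of what those dynamically generated examples could look like (not the actual implementation; the entity IDs, service names, and prompt format below are placeholders for illustration):

# Sketch: build few-shot examples from live entity state so the LLM sees
# function calls against devices that actually exist in this house.
import json

def build_examples(entities: dict[str, str]) -> str:
    """entities maps entity_id -> current state, e.g. {'light.kitchen': 'off'}."""
    lines = []
    for entity_id, state in list(entities.items())[:3]:  # a handful is enough
        domain, object_id = entity_id.split(".", 1)
        service = f"{domain}.turn_off" if state == "on" else f"{domain}.turn_on"
        request = f"toggle the {object_id.replace('_', ' ')}"
        call = json.dumps({"service": service, "entity_id": entity_id})
        lines.append(f"User: {request}\nAssistant: {call}")
    return "\n\n".join(lines)

# usage sketch:
# examples = build_examples({"light.kitchen": "on", "switch.fan": "off"})
# prompt = f"{system_prompt}\n\nExamples:\n{examples}\n\nUser: {user_request}"

Because the examples are built from the current state, the same mechanism keeps working even when entities or APIs change.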

With the combination of all of this, I am actually able to run the whole pipeline locally at usable speeds (it’s not fast, but it’s no longer 8 seconds per query, at least if you have GPUs).


That’s interesting! Cheers

I want to integrate a local multimodal LLM on an NVIDIA Jetson AGX to read from video, audio, and mic feeds and automatically control HA. I need help getting started with certain things. Anyone here willing to offer me some guidance? I’d very much appreciate it. :pray: WhatsApp: https://wa.me/34638244348

(Background: I’m a Senior Android Engineer and I’ve been playing a lot with prompting, OpenAI APIs, etc., even doing stuff with LLMs for production features at work.)

Quite poorly, sure. ChatGPT is usually the most accurate, but there are usually way too many issues, which kind of defeats the purpose.

Hi there, there was an LLM model comparison using the synthetic home, but I cannot find that repo anymore. Does anybody know about it?

Still don’t quite see what the use case is for AI. :thinking:

I can remember some of the early discussions about the value of smart homes - particularly one point: it’s no use being able to turn lights on and off with your mobile phone if it’s easier to flip a wall switch; it’s just showing off.

There’s a case to be made for voice control as part of a multi-pronged approach to making home automation easier to use than wall switches (manual/automated/voice-controlled). I’m just not sure why we need the house to make smart remarks.

Could this be an expectation that has been planted in us by Amazon and Google? I can see that it might be interesting to implement and fun to use, but it’s hardly worth the attention it’s getting. Is it? I would put it roughly on a par with having your own weather sensors in the garden.


Agreed, or even better: automated so that a switch is redundant most of the time.

I think it’s more about parsing the input. LLMs are better at “understanding” intent, rather than just the literal meaning, and at handling ambiguous requests.