Yesterday I created an HA tool for langchain as an experiment. The bot was able to find devices and entities along with their states, and it could call services, but it was consuming my money like a black hole.
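To give an idea of what such a tool boils down to, here is a minimal sketch. It assumes Home Assistant's REST API (`/api/states` and `/api/services/<domain>/<service>`) with a long-lived access token. The function names, the URL and the entity ID are illustrative, not the code from the experiment.

```python
import json
import urllib.request


def _headers(token: str) -> dict:
    # Home Assistant's REST API expects a bearer token.
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def states_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request that lists all entities with their states."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/states",
        headers=_headers(token),
        method="GET",
    )


def service_request(base_url: str, token: str, domain: str,
                    service: str, data: dict) -> urllib.request.Request:
    """Build a POST request that calls a service, e.g. climate.turn_on."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/services/{domain}/{service}",
        data=json.dumps(data).encode("utf-8"),
        headers=_headers(token),
        method="POST",
    )


def send(request: urllib.request.Request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Each of these functions could then be wrapped as a separate tool for the bot, e.g. `send(service_request(url, token, "climate", "turn_on", {"entity_id": "climate.living_room"}))`, where the entity ID is made up for the example.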
MORNING UPDATE:
I’ve spent the entire night having fun, and I’m slowly discovering that questions about the state of entities and devices, or invoking services, would only make sense with a continuous stream of data. We would need something like Google STT or Whisper working in real-time (and preferably locally and able to distinguish different voices), then ChatGPT could pick up the context from our statements. It works - I asked ChatGPT to be a smart home assistant, and it was great at distinguishing what in the conversation pertained to devices and what was in another context, and used commands correctly. It could literally pick out from a sample conversation that it was too cold and should turn on the heating, even though that sentence was just thrown into a dinner conversation.
We could also use geolocation to create automations, for example, “remind me to buy potatoes when I’m at the store.” Given data on the places within a 30-meter radius of our position (for example, from Everpass), ChatGPT could send such a reminder, and it works quite well.
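The 30-meter radius check itself is cheap and does not need the LLM at all. A rough sketch, assuming we already have a dictionary of nearby places with coordinates (the place names and coordinates below are made up, and the haversine formula is my choice here, not necessarily what any particular service uses):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def places_nearby(position: tuple, places: dict, radius_m: float = 30.0) -> list:
    """Return the names of places within radius_m of position (lat, lon)."""
    lat, lon = position
    return [
        name
        for name, (plat, plon) in places.items()
        if haversine_m(lat, lon, plat, plon) <= radius_m
    ]
```

Only the names returned by `places_nearby` would then need to go into the prompt, so the bot can match them against pending reminders.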
However, all this fun has cost me quite a bit. The development itself consumed a lot of money, and sending all that data and checking it in real-time eats tokens like the Cookie Monster. Add to that the cost of STT and other services, unless we go the Alexa route and use some hotword detection mechanism, but then it becomes annoying. Additionally, there’s the language overhead - I used Polish, and the bot often confuses entities - either it will be more or less a lucky hit, or it will devour tokens. Here, a difference arises between GPT-4 and GPT-3.5. GPT-4 is almost perfect but monstrously expensive and too slow to be a home assistant, while the GPT-3.5-turbo assistant must be heavily optimized and manually updated with each change, and it is still too slow. Or, we would need to create a tool that can handle it well.
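One obvious optimization is to stop sending full state objects to the model. A minimal sketch, assuming state objects shaped like the ones Home Assistant returns from `/api/states` (with `entity_id`, `state` and `attributes` keys); the function name and the default domain list are hypothetical, and I have not measured how much this actually saves:

```python
def compact_states(states: list, domains: tuple = ("light", "climate", "switch")) -> str:
    """Reduce full state objects to short 'entity_id=state' lines for the prompt."""
    lines = []
    for item in states:
        entity_id = item["entity_id"]
        # The domain is the part of the entity ID before the first dot.
        domain = entity_id.split(".", 1)[0]
        if domain not in domains:
            continue  # skip entities the bot is not supposed to control
        lines.append(f"{entity_id}={item['state']}")
    return "\n".join(lines)
```

Attributes, timestamps and unrelated domains are dropped, so the prompt carries one short line per relevant entity instead of a full JSON object.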
For your information, I didn’t use aliases or any HA chat mechanisms.
IMO, the technology offers incredible possibilities, but its price and availability are still blockers for such applications - this will likely change by the end of the year.
I used langchain and created tools based on langchain/tools, similar to plugins, but I only scratched the surface of the possibilities.