Hello, I am brand new to this community, and I hope I’m not asking a question that has been asked many times before; ChatGPT is telling me that what I’m asking for isn’t possible.
I’ve just installed HA for the first time after being a non-professional smart-home enthusiast for yeeeaars. I should have gotten here sooner, and I haven’t really even begun to understand the software… but I will.
I have been developing my home automations using Python and Node on the backend, and AutoHotkey, Alexa, and a few other things on the frontend. What I really, really, really want to be able to do is extract Alexa directives in near-real-time, so I can decide how to handle them, regardless of whether or not “she” understands the command.
For example, if I say, “Alexa ________________”, that text goes to my chat history. If I had some way to access that string of text in near-real-time, I could use that information to do anything I wanted via my PC command host, regardless of whether Amazon’s intent system has any idea what I’m talking about. Custom skills are out the window too, because they only respond to certain word commands…
I realize that what I’m asking for is not as ideal as creating my own home assistant, which I still fully intend to do (using Rhasspy and a Pi board, I think), but, to be honest, doing all that costs money and I already have a bunch of Echo Dots.
TL;DR - Do any of you home-automation gods have a way to get me near-real-time transcriptions of my Alexa commands?
I’m not sure if this helps, but regarding the custom Alexa integration, the Echos have an attribute called “last_called_summary” which contains the last command given to the Echo. I’ve noticed that for the value of that attribute to update quickly, you have to call the “update_last_called” service.
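For anyone who wants to experiment with that, here’s a minimal sketch of what the polling could look like, assuming the Alexa Media Player custom integration (which is where the alexa_media.update_last_called service and the last_called_summary attribute come from); the Echo entity ID is hypothetical:

```yaml
# Minimal sketch, assuming the Alexa Media Player custom integration.
# Poll update_last_called so last_called_summary refreshes quickly.
automation:
  - alias: "Poll Alexa last_called"
    trigger:
      - platform: time_pattern
        seconds: "/15"   # aggressive polling risks Amazon rate limits
    action:
      - service: alexa_media.update_last_called

  # React whenever the summary changes on a (hypothetical) Echo entity.
  - alias: "React to last Alexa command"
    trigger:
      - platform: state
        entity_id: media_player.kitchen_echo
        attribute: last_called_summary
    action:
      - service: persistent_notification.create
        data:
          message: "Heard: {{ state_attr('media_player.kitchen_echo', 'last_called_summary') }}"
```

Even then, this is polling rather than real-time, and the call itself can get rate-limited.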
Remember, ChatGPT is trained on data from before October of last year, so even if this were possible, it wouldn’t know about it. (Short version: take it with a grain of salt, and read the forum’s pinned posts about using GPT.)
That said, I’m pretty sure it is correct that real-time access is impossible, because of Amazon’s limitations.
Remember, Alexa is pretty much a walled garden. They (Amazon) love to bring data and control in. Outbound… not so much.
I’d fire up a small HA server, enable Assist, wire it into an LLM, and see exactly how easy that part is before you head down that path. That path sounds… painful, on many levels. Mostly because even IF it’s possible, you’re going to be fighting an uphill battle all the way.
This is definitely an “even if you could, you probably shouldn’t.” (Remember, your time is worth money too.)
If it’s tied to last_called, it hasn’t been reliable at all for the last six months. I ditched all the automations I had using it, specifically because it was too flaky.
As you’re probably aware, I am not referring to the “last_called” attribute. I’m just saying that to instantly update the “last_called_summary” attribute, you have to call the “update_last_called” service.
I am aware of the issues with the integration not being maintained regularly, but several people are now helping and I am hoping reliability returns soon.
That’s exactly what I’m talking about. At least on my install, that call is met two-thirds of the time with a rate limit or simply zero response.
I do not have the same optimistic view, because the Amazon Alexa team has been hemorrhaging cash, and Panos was hired specifically to fix that. You’re about to see more locking down of access and more paywalls around Alexa, not fewer. (They MUST monetize, or they’ll be laying off the rest of the team next year…)
The new resources are great, but the cat and mouse is going to get worse.
Now, specifically to this application: we’re already talking about using things in unintended ways, which we know Amazon is trying to prevent. I just don’t see where they have any incentive to make it EASIER, so I strongly suspect it’s a hard road ahead for them, unfortunately.
And that’s why I noted that even if you find a way to do it, I’m not sure I’d try, because it’ll be cat and mouse, constantly.
Instead, if I were new to HA, I’d spend time looking at the alternatives to She Who Shall Not Be Named (what we call her when we don’t want to set off her wake word).
What I’m hearing here is that trying to leverage the Echo Dots the way I want to is a fool’s errand no matter what; the state of Alexa is likely to be different a year from now; and really I just need to bite the bullet and figure out how to build myself a custom voice assistant (or figure out some other way of interfacing with the system that I haven’t thought of yet. Telepathy bridge, anyone?)
I see you like to do things the hard way. You don’t have to keep hitting your head on stuff.
Yeah I think if you spin up that HA install and poke around for a few minutes you’ll see how easy the rest of it is.
I had my first voice assistant running on my phone less than two hours after I started. Was it perfect? No. Not by a long shot. You need to spend time learning the unique way HA does the voice pipeline and writing intents. But it was usable, and I was stumbling around in the dark.
A month later, with an LLM riding on top (BTW, the part Alexa is about to charge you for), Friday can do things Alexa wishes she could…
Note: this required no hardware beyond my HA install and a phone. I’m paying for the LLM API, but you can (and should) start without the LLM to learn the Assist pipelines.
I just think you’ll be much happier with a less impossible path. You can still use the Echos as speech targets, and I believe there may be a skill that pumps the voice call directly to Assist (you wouldn’t get the chat history, only the direct calls to that skill). That may be good enough to hold out until BYO devices…
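On the speech-target side, here’s a minimal sketch of the kind of service call the Alexa Media Player integration exposes for making an Echo speak; the notify service name here is hypothetical and depends on what your Echo is called in the Alexa app:

```yaml
# Minimal sketch, assuming the Alexa Media Player custom integration.
# Paste into Developer Tools > Services (YAML mode) to test.
service: notify.alexa_media_kitchen_echo   # hypothetical device name
data:
  message: "The garage door has been open for ten minutes."
  data:
    type: announce   # or "tts" to skip the announcement chime
```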
So, I think you’re right, and I think I’m going to go down that path. I’m going to dive headfirst into HA this weekend. So far I’ve used it to discover my Hue ecosystem and nothing else. I have an HDMI matrix that I’m going to try to set up as a virtual device (it has to be controlled through my Node.js service; it’s the only way). I’ll make integrating my LLM of choice one of the last beasts I try to slay. I’m already paying for that API, but I’m using it for other things as well.
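For what it’s worth, one common way to bridge HA to a local service like that is a rest_command, assuming the Node.js service exposes an HTTP endpoint; the URL, port, payload shape, and script name below are all hypothetical:

```yaml
# Minimal sketch: wrapping a (hypothetical) Node.js HTTP endpoint so HA
# can treat the HDMI matrix like any other service call.
rest_command:
  hdmi_matrix_route:
    url: "http://192.168.1.50:3000/matrix/route"
    method: POST
    content_type: "application/json"
    payload: '{"input": {{ input }}, "output": {{ output }}}'

script:
  movie_night_video:
    sequence:
      - service: rest_command.hdmi_matrix_route
        data:
          input: 2    # e.g. the media PC
          output: 1   # e.g. the projector
```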
Out of curiosity, are there any threads on here where you talk about your voice-assistant setup? I’d take any guidance on both the hardware and software side of things; I’m starting from scratch.
Negative, Ghostrider. Mine is all proprietary and experimental, and in no way fit for public consumption.
What I can say is: spend time understanding how intent_script and intents work. That will be the hardest hurdle, but once you figure out how that works, start thinking of intents like tools to be called. A quick example is sketched below.
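To make that concrete, here’s a minimal sketch of a custom Assist intent acting as a “tool”; the intent name, trigger sentences, and entity are all hypothetical:

```yaml
# custom_sentences/en/projector.yaml - teaches Assist the trigger phrases.
language: "en"
intents:
  SetProjector:                # hypothetical intent name
    data:
      - sentences:
          - "turn {state} the projector"
lists:
  state:
    values:
      - "on"
      - "off"
```

```yaml
# configuration.yaml - what the intent actually does when matched.
intent_script:
  SetProjector:
    speech:
      text: "Turning {{ state }} the projector."
    action:
      - service: "switch.turn_{{ state }}"
        target:
          entity_id: switch.projector   # hypothetical entity
```

Once an LLM is layered on top, these same intents are what it gets to call, which is why the “tools” framing fits.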