Allow different voice hardware to have access to different entities

Updated to better reflect the idea.

Brought this up on Discord and it seemed to be a good idea, so I'm bringing it here.

The idea is that different rooms can call different things, or as one person put it, “very rudimentary voice RBAC” (NB: it’s not really RBAC, but it works as a descriptor…).

Examples:

- Voice hardware near a door can’t have “unlock the door” shouted at it through the mail slot.
- People can’t mess with lights in other rooms.
- A guest room can’t set off alarms.
- A kids’ room can’t unlock the door or set the car’s climate control.
- etc…

As well as “malicious” behaviour, this could help protect against some voice-detection hallucinations too.

You can already create multiple pipelines. There’s even a project out there that makes use of this to have a collection of agents… But even in the default setup:

Here I have two identical connections to OpenAI; the only difference is the prompt. One has the personality and customization for Friday, the other has the personality and customization for Jarvis…

In your scenario you would have to build something like “you are under no circumstances allowed to unlock a door” into the version of the prompt used by the door assistant.
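Purely as an illustration, here is a minimal sketch of that setup. The dict layout, agent names, and model name are hypothetical, not actual Home Assistant or OpenAI integration configuration; the point is that the two agents differ only in prompt text, with the restriction living in the door-facing one:

```python
# Hypothetical sketch only - not real Home Assistant / OpenAI integration config.
# Two conversation agents that are identical apart from their prompt text.

BASE_PROMPT = "You are a voice assistant for this house."

agents = {
    "friday_living_room": {
        "model": "gpt-4o-mini",  # assumed model name, purely illustrative
        "prompt": BASE_PROMPT + " Your persona is Friday.",
    },
    "jarvis_front_door": {
        "model": "gpt-4o-mini",
        "prompt": (
            BASE_PROMPT
            + " Your persona is Jarvis."
            + " You are under no circumstances allowed to unlock a door."
        ),
    },
}
```

The only thing standing between the door and an unlock command here is that last sentence of prompt text, which is exactly the weakness described next.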

Read: you’ll be at the whim of the LLM as to whether it enforces that or not.

So could it hallucinate and unlock the door anyway? Sure!

Is it any worse or better than having it directly in the prompt? Well, you’d have to actively convince the LLM it can do it first, but with the entity exposed it’s totally still doable.

In short: don’t call it RBAC, because it’s not. It’s only hiding an entity, and because exposure isn’t per pipeline, if it’s exposed to one, all of them see it. If you can convince the LLM to figure it out and do it… it will. The real solution is: don’t have audio hardware in an unsecured location.

You don’t get to selectively expose entities. I would not do this and assume it adds any shred of security, theater or otherwise. There are valid reasons for this, but ‘securing’ something is not one of them, and please don’t fall into the trap of security by obscurity… especially with your door.

Perhaps I wasn’t clear: this has nothing to do with LLM connections.

It was more about the default “Assist”, not an LLM connection. I understand that you can have multiple connections to an LLM such as OpenAI with different configurations, but that is not relevant to what is being suggested/requested.

On the “security” side, I do agree that it’s not really security, but it does impose some limits, which in some cases would be good enough. It is most certainly not RBAC, but that was used as a descriptor more than anything else.

You would have the same issue without separating the exposed entities across multiple ‘Assist’ agents; it doesn’t matter HOW the recognizer matches, LLM or built-in, it’s still an issue. While I can see fringe use cases for a different expose list per Assist pipeline, it’d be a whole lot of work for not a lot of benefit.

No, it wouldn’t. That’s what this feature request should be about, not simply allowing multiple instances of conversation agents.

Then I have misunderstood the previous discussion. That’s fine; I should probably have been more generic. Will update to make it more sensible.

Your FR to get what you want would be to allow Assist to expose entities per chat pipeline instead of just at the Assist level. Right now it’s:

entities > assist > pipeline
                  > another pipeline
                  > another pipeline

You’d need:

Assist > entities for pipeline 1 > pipeline
       > entities for pipeline 2 > pipeline 2
       > entities for pipeline 3 > pipeline 3

Which means multiple lists in memory and multiple checks, changing where the recursion happens (read: a BIG deal).
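To make the “multiple lists, multiple checks” point concrete, here is a rough Python sketch of the data-model difference. This is not Home Assistant’s actual code; the variable names, pipeline IDs, and entity IDs are made up for illustration only.

```python
# Rough sketch of the difference being described - not actual HA internals.

# Today (simplified): one exposure list shared by everything under Assist.
exposed_to_assist = {"light.kitchen", "lock.front_door", "climate.car"}

def is_exposed_today(entity_id: str) -> bool:
    return entity_id in exposed_to_assist

# Per-pipeline exposure: every check also needs to know which pipeline is
# asking, and there is a separate list to store, sync, and search per pipeline.
exposed_per_pipeline = {
    "front_door_satellite": {"light.hallway"},
    "kids_room_satellite": {"light.kids_room"},
    "living_room_satellite": {"light.kitchen", "lock.front_door"},
}

def is_exposed(pipeline_id: str, entity_id: str) -> bool:
    return entity_id in exposed_per_pipeline.get(pipeline_id, set())
```

Every place that currently asks “is this entity exposed to Assist?” would have to start asking “exposed to which pipeline?”, which is where the extra lists and checks come from.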

I myself am still on a learning curve here, so I sorta have a related question…

From this doc, there is this picture:

It indicates that it is possible to expose a device per “Assist”, per “Amazon Alexa”, and per “Google Assistant”. The documentation seems to call each of these an “Assistant”.

For the “Assist” configuration (UI -> Settings -> Voice assistants), I can create multiple instances (the documentation calls an instance an “Assistant”): say one for a local conversation agent (call it “Home-Assistant”, marked as preferred) and another for an OpenAI conversation agent (call it “Chatgpt”). Yet with these two “Assist” instances, a device can only be exposed to “Assist” as a whole, not to each assistant within “Assist”.
So I guess my question is why not (and maybe this was what was being explained above and I just didn’t get it)?

Assist is a different thing than Alexa or Google here.

The entities are passed off to each of those three things as a package. In your example, ‘Assist’ (capital-A Assist) gets a copy, just like Alexa does and Google does.

The ask would be to move the pipelines under Assist up to the same level where Assist itself lives.

So they’d have to go into Assist (first problem: at this level the packages don’t know or care what the receiver does with them or how they’re structured, which is on purpose), parse out all the potential (infinite, btw) voice pipelines, and then add them to the collection at the same level as Google, etc.

Right now the implementation is limited to three filters. This would require an unbounded number of filters, overflow checks, and moving things around in the structure and UI, etc., etc.

It’s not small. In fact, it’s a bit of a mountain.

Another idea that pushes the complexity out into an add-on instead of into the HA UI: an external Wyoming conversation add-on that lets you create multiple agents which include/exclude specific sentences/intents/entities. These agents would use the same library and intents as Assist itself. Then you can just select the agent you want for a specific pipeline, and use that pipeline on the device.

With the example above, you would install the add-on and create an agent for “Front Door” that excludes that entity from its possible sentences. Then the voice satellite by the door would use a pipeline with that agent.

So rather than treating this as an exposure problem in HA, it would become a matter of customizing each agent to only recognize what you want. Of course, this would not work with LLM fallback.
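To illustrate, here is a hypothetical sketch of what such an add-on’s per-agent configuration and filtering could look like. None of these keys, agent names, or functions exist in any current add-on; they only show the include/exclude idea under the assumptions described above.

```python
# Hypothetical add-on config sketch - no such add-on or schema exists today.
# Each agent lists entities it must never generate or match sentences for
# (the same idea could extend to excluding whole intents or sentence templates).

agents = {
    "front_door": {
        "exclude_entities": ["lock.front_door"],
    },
    "kids_room": {
        "exclude_entities": ["lock.front_door", "climate.car"],
    },
}

def allowed(agent: str, entity_id: str) -> bool:
    """Return True if the given agent may act on the given entity."""
    cfg = agents.get(agent, {})
    return entity_id not in cfg.get("exclude_entities", [])
```

The voice satellite by the front door would then point its pipeline at the “front_door” agent, which never even recognizes sentences involving the excluded lock entity.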