Voice assistant asking for confirmation before running an automation (Google)

This is most likely the 100th question about this topic. But most of them are pretty old, so I'm rolling the question out again, hoping there's a better answer than "no" by now (yes Google, we all hope you improve…)

So I'm running HA with Node-RED and figured that some manual confirmation is often needed. For example, I might want to trigger a cinema mode when someone sits on the couch. Doing this fully automatically isn't possible, because sitting on the couch doesn't always mean I want cinema mode.

That's where asking before doing something would come in handy.

Technically I can add a sensor with 3 states: UNKNOWN, YES, NO. It would be set to UNKNOWN before the question is triggered.

Then I can set another binary_sensor or something similar that would trigger a Google Home automation (if sensor xy changes, then do xxx). That automation could (hopefully) have an action that listens for yes/no and sets the previously mentioned 3-state sensor accordingly.

Then my HA automation can detect that change and do whatever it should.
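
To make the flow above concrete, here is a rough sketch in plain Python. It is not HA or Node-RED code. `ConfirmationSensor`, `ask`, `record_answer` and `run_if_confirmed` are made-up names for illustration, and the part where Google Home (or anything else) actually hears the answer is deliberately left out.

```python
from enum import Enum


class Answer(Enum):
    UNKNOWN = "unknown"
    YES = "yes"
    NO = "no"


class ConfirmationSensor:
    """Hypothetical stand-in for the 3-state sensor (not a real HA entity)."""

    def __init__(self):
        self.state = Answer.UNKNOWN
        self.question_pending = False

    def ask(self):
        # Reset to UNKNOWN before the question is triggered.
        self.state = Answer.UNKNOWN
        self.question_pending = True

    def record_answer(self, answer):
        # Ignore answers that arrive while no question is pending.
        if not self.question_pending:
            return False
        self.state = answer
        self.question_pending = False
        return True


def run_if_confirmed(sensor, action):
    # Run the action only if the recorded answer is YES.
    if sensor.state is Answer.YES:
        action()
        return True
    return False
```

In this sketch, `ask()` corresponds to the moment the question is spoken, `record_answer()` to whatever sets the sensor, and `run_if_confirmed()` to the HA automation reacting to the change. A timeout that falls back to NO would probably be needed in practice, but that is not shown here.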

Now the question is how to make Google Home listen for a yes/no answer.

Perplexity says it's not directly possible. It recommends using Dialogflow, but that seems to come at a cost (which I want to avoid).

Alternatively, I thought of using an ESP32 to listen for voice and then transmit it to HA, which would parse it.
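
For the parsing part, something like the following might be enough, assuming the ESP32 route ends up delivering the spoken answer to HA as plain text. This is only a sketch: `parse_answer` and the word lists are made up for illustration, and how the transcript is actually produced is not covered.

```python
import re

# Illustrative word lists; extend them for your own language and habits.
YES_WORDS = {"yes", "yeah", "yep", "sure", "ok", "okay"}
NO_WORDS = {"no", "nope", "nah"}


def parse_answer(transcript):
    """Map a transcript to 'YES', 'NO' or 'UNKNOWN'."""
    # Split into lowercase words so "know" is not mistaken for "no".
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    has_yes = bool(words & YES_WORDS)
    has_no = bool(words & NO_WORDS)
    if has_yes and not has_no:
        return "YES"
    if has_no and not has_yes:
        return "NO"
    # Empty, unrelated or contradictory input stays undecided.
    return "UNKNOWN"
```

Anything ambiguous stays UNKNOWN on purpose, so the automation does nothing rather than guessing.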

Is using an ESP32 to listen for voice my only option if I want to avoid “hey google, yes”? Or is there a modern/new approach that would be way better?