I was on a run just now, listening to the 2023.5 release party, and I had a thought about voice detection.
I'm not a programmer, so this is a CONCEPT ONLY; perhaps someone can take the idea further and forward it to the best people to develop it.
WLED has a threshold detection module in one of its builds… Use that on an ESP32.
If the ATOM that paulus demoed is capable of sending audio to HA on a button push, why can't it send audio after a sound threshold is exceeded/detected?
It might mean that the first couple of milliseconds are missed in the recording, but in the end it will get us close.
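To make the idea a bit more concrete, here is a rough Python sketch of threshold-triggered capture. It is only an illustration and is not taken from WLED, ESPHome, or the ATOM firmware. The names `rms`, `capture_on_threshold`, `pre_roll`, and `hang` are all made up for this example. It assumes audio arrives as small frames of integer samples, and it keeps the last few quiet frames in a buffer so the start of the speech is not lost when the threshold trips.

```python
from collections import deque
import math


def rms(frame):
    """Root-mean-square loudness of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def capture_on_threshold(frames, threshold, pre_roll=2, hang=3):
    """Yield one clip (a list of frames) per burst of sound.

    threshold: RMS level that counts as "loud" (hypothetical units).
    pre_roll:  quiet frames kept from just before the trigger.
    hang:      consecutive quiet frames that end a clip.
    """
    history = deque(maxlen=pre_roll)  # most recent quiet frames
    recording = None                  # frames of the clip in progress
    quiet = 0                         # quiet frames seen in a row

    for frame in frames:
        loud = rms(frame) >= threshold
        if recording is None:
            if loud:
                # Trigger: start the clip with the buffered pre-roll.
                recording = list(history) + [frame]
                history.clear()
                quiet = 0
            else:
                history.append(frame)
        else:
            recording.append(frame)
            quiet = 0 if loud else quiet + 1
            if quiet >= hang:
                yield recording
                recording = None
                quiet = 0

    # Input ended while a clip was still open.
    if recording is not None:
        yield recording


if __name__ == "__main__":
    silent, loud = [0] * 4, [1000] * 4
    stream = [silent] * 3 + [loud] * 2 + [silent] * 4
    for clip in capture_on_threshold(stream, threshold=500, hang=2):
        print(len(clip), "frames captured")
```

On a real device each finished clip would be what gets sent to HA, in place of the audio captured while the button is held. The pre-roll buffer costs a little memory, but it is one way around the missed first milliseconds.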
It would also negate the need for a wake word, in that any command (like Friends playing on your TV in the background) that is misunderstood or not recognized is just ignored and not replied to. In the end, do we really need to hear “I didn't understand that”?