Hello everyone,
This is a demo of a voice assistant built in Home Assistant, with visual responses played on a tablet running Fully Kiosk Browser and Browser Mod.
Here’s a tutorial:
https://www.youtube.com/watch?v=bZgH4NDmBpk
EDIT UPDATE
Because AlexxIT implemented wake word detection in the StreamAssist integration, I was able to modify that integration to play these random visual responses without an ESP32 satellite, using:
- RtpMic Android app
- Fully Kiosk Browser app
- Browser Mod integration
- Preferred TTS service
- Preferred TTS voice and language
- Multiple random custom TTS responses in your preferred language and voice
For more information, see my GitHub page:
https://github.com/relust/VisualStreamAssist/blob/main/README.md
VISUAL RESPONSES USING ESP32 SATELLITE
- Use this version if the StreamAssist version does not work for you, or if you use a Google Home display.
For more information, see my GitHub page:
github: https://github.com/relust/HA-Visual-Voice-Assistant
Here’s a small demo (turn on subtitles for English):
Hardware
- ESP32 dev board
- INMP441 Microphone
- WS2812 LED
- Copper wire spiral used as a touch sensor
- Printed HA logo case
- Google Home display
Software
- ESPHome
- Google Cloud STT
- Edge TTS
- Porcupine
- Fully Kiosk Browser
- Browser Mod
- Extended OpenAI integration
- Vidnoz AI talking photo platform
What it does:
- When the wake word is detected, the MUTE switch on the ESP32 satellite turns on. This is a REAL MUTE SWITCH: the L/R pin on the microphone is wired to a digital pin on the ESP32, so the microphone records only when there is voltage on this pin. In the code, the microphone is set to the left channel.
- When the wake word is detected, a video and audio response is streamed through the Browser Mod media player on the tablet’s display. You can create several responses, and one is picked at random each time.
- After this response has finished playing, the mute switch goes back to off and listening starts.
- When the TTS response starts streaming, it is played through the Fully Kiosk media player on the tablet.
- At the same time, a no_sound_speech video (a simulation of speech, but without sound) is sent through the Browser Mod media player on the tablet’s display.
- Several voice assistants can be configured, each with its own responses and pipeline, and you can switch between them with voice commands.
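
To make the order of events above easier to follow, here is a rough Python sketch of the sequence. It is only an illustration, not the actual ESPHome or Home Assistant configuration from the repository: the Assistant and Satellite classes, the function names and the clip file names are all made up for this example, and the "plays" log lines stand in for the real media player calls.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Assistant:
    """Hypothetical settings for one voice assistant (names are illustrative)."""
    name: str
    wake_responses: list         # video + audio clips played after the wake word
    silent_speech_video: str     # speech animation without sound


@dataclass
class Satellite:
    """Stand-in for the ESP32 satellite; it only tracks the mute switch."""
    mute: bool = False
    log: list = field(default_factory=list)


def on_wake_word(sat, assistant, rng=random):
    """Mute the mic, play one random visual response, then unmute and listen."""
    sat.mute = True
    sat.log.append("mute on")
    clip = rng.choice(assistant.wake_responses)   # random response
    sat.log.append(f"browser_mod plays {clip}")
    sat.mute = False                              # response finished playing
    sat.log.append("mute off, listening")
    return clip


def on_tts_start(sat, assistant, tts_url):
    """Play the TTS audio and the silent speech video at the same time."""
    sat.log.append(f"fully_kiosk plays {tts_url}")
    sat.log.append(f"browser_mod plays {assistant.silent_speech_video}")


if __name__ == "__main__":
    sat = Satellite()
    demo = Assistant("demo", ["yes.mp4", "listening.mp4"], "no_sound_speech.mp4")
    on_wake_word(sat, demo)
    on_tts_start(sat, demo, "tts_response.mp3")
    print("\n".join(sat.log))
```

Each assistant carries its own list of responses, so switching assistants by voice command would simply mean passing a different Assistant object to the same two functions.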