VoiceAssistant + TTS + MultiRoom Audio HW and SW Options Local only

jonaspaulo · May 24, 2020, 11:43am

Hi,

I am currently researching a full audio setup to integrate with HA.

Some features/requirements:

Totally Local (No cloud/internet connection for anything)
Voice commands input (speech2text)
TTS notifications
Multi-room audio with support for Spotify, AirPlay and Bluetooth playback (not sure how to achieve this). Synced playback would be ideal, but if that is not possible, standalone playback per room is fine too, so that the SW doesn’t have to sync it
Lowest footprint/cost possible

I know it is a lot to ask, but I am just trying to understand what is possible and what is not. From my investigation I have reached some conclusions:

HW side:

Raspberry Pi 4
ReSpeaker 4-Mic Array (https://respeaker.io/4_mic_array/) (they also have a 6-mic option, but it is way more expensive). There are probably other options, but this was the cheapest I found that could be integrated with the RPi with the lowest footprint and good performance
Tribit XSound Go speakers connected to the RPi via a 3.5 mm audio cable. (There are other options here, like buying some ad hoc speakers and using a USB sound card on the RPi, or even a DAC connected directly to the GPIO, but I think they are more cumbersome and probably more costly.) I am also hoping that the RPi can power/charge the speakers via a USB cable.
Custom-made 3D-printed case to hold the Pi and somehow the microphone array (a ready-to-use one from AliExpress or similar would be better, since I have neither a 3D printer nor the skills to make a custom case)

SW Side:
For the TTS/Voicecontrol:

Almond/Ada - Seems nice and is the official HA SW, but as far as I understand it isn’t fully local for now
Mycroft - Same as above. It isn’t fully local
So that leaves me with two options:
Rhasspy or Project Alice (based on Snips). Rhasspy seems to be more developed and Project Alice is newer. Any opinions on either?

For the MultiRoomAudio:

Snapcast and Mopidy or Volumio (Any other options?)

The issue here is that I can’t find a solution that provides TTS and voice control as well as multi-room audio in a single SW package, and I am afraid that I won’t be able to run Rhasspy or Project Alice alongside Snapcast with Mopidy or Volumio and have everything working together.

I am open to more opinions and thoughts on this.

Thanks a lot.

jonaspaulo · May 29, 2020, 5:59pm

Well, I just ordered the RPi and the ReSpeaker to test for now.

AlmostSerious · May 30, 2020, 6:53pm

If you are building a new house, or laying cables is otherwise an option, I’d consider running speaker and microphone cables from every room to a central server.

jonaspaulo · May 31, 2020, 8:20am

Ty. That is not the case here; it is an existing house. For new houses, agreed, cables are the best option.

AlmostSerious · May 31, 2020, 8:43am

Then yes, for a synced multi-room solution Snapcast is probably your best bet. You can also take a look at LMS (Logitech Media Server, or Squeezebox), but in my testing Snapcast does a much better job. My current situation is similar. For now I am running Google Assistant / Alexa alongside Snapcast for multi-room audio, but I am contemplating simply getting rid of the Google and Amazon devices and deploying RPis with Rhasspy and microphone arrays added. As far as I am aware, there is no single package available that does everything you described; however, I would expect the separate packages to work nicely alongside each other.
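
As a rough sketch of how the two could cooperate: something on the voice side could lower a room's Snapcast volume before speaking a TTS notification and restore it afterwards. Snapcast exposes a JSON-RPC control API (by default on TCP port 1705) with a `Client.SetVolume` method. The "ducking" idea, the helper names and the client id `kitchen-pi` below are my own assumptions for illustration, not something Rhasspy or Snapcast ships.

```python
import json
import socket

SNAPCAST_CONTROL_PORT = 1705  # Snapcast's default TCP control port


def set_volume_request(client_id, percent, muted=False, request_id=1):
    """Build a newline-terminated Client.SetVolume JSON-RPC request."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    message = {
        "id": request_id,
        "jsonrpc": "2.0",
        "method": "Client.SetVolume",
        "params": {
            "id": client_id,  # hypothetical client id
            "volume": {"muted": muted, "percent": percent},
        },
    }
    return (json.dumps(message) + "\n").encode("utf-8")


def send_request(host, payload, port=SNAPCAST_CONTROL_PORT, timeout=2.0):
    """Send one request to snapserver and return the first line it sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        with conn.makefile("r", encoding="utf-8") as reply:
            return json.loads(reply.readline())
```

Usage would be along the lines of `send_request("snapserver.local", set_volume_request("kitchen-pi", 20))` before the announcement and the same call with the previous volume afterwards; the host name is a placeholder.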

jonaspaulo · May 31, 2020, 9:00am

Yes, my concern is that since they are separate packages, one of them could lock the microphone hardware device on the RPi and leave it unavailable to the other software.
Thanks for the answers btw
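
To make the locking worry concrete, here is a minimal sketch of one way around it, assuming more than one program really does need the microphone: a single process owns the device and hands a copy of every audio chunk to each consumer. The `fan_out` and `drain` helpers and the fake chunks are hypothetical; real capture would come from an audio library, and at the system level ALSA's `dsnoop` plugin is the more usual way to share a capture device.

```python
import queue


def fan_out(chunks, consumers):
    """Copy every audio chunk from one source to each consumer queue."""
    for chunk in chunks:
        for q in consumers:
            q.put(chunk)
    for q in consumers:
        q.put(None)  # sentinel: no more audio


def drain(q):
    """Collect chunks from a queue until the sentinel is seen."""
    received = []
    while True:
        chunk = q.get()
        if chunk is None:
            return received
        received.append(chunk)


# Stand-in for microphone data; a real setup would read from the sound card.
fake_mic = [b"\x00\x01", b"\x02\x03", b"\x04\x05"]
wake_word_q, recorder_q = queue.Queue(), queue.Queue()
fan_out(fake_mic, [wake_word_q, recorder_q])
```

Each consumer then sees the full stream without ever opening the hardware device itself.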

Seppe · November 21, 2020, 3:35pm

Any update on this? What hardware did you end up using?
I would like to start with a voice assistant (Rhasspy) + multi-room audio (Snapcast), but have no idea what hardware to buy (ESP32, Raspberry Pi, DAC, amp …).

Thanks!