Help to decide/understand the hardware for voice assistant

Hi all
I need some help finding out what the hardware options are for a voice assistant. My goal is to use it like e.g. Google Home or Alexa, but in my native language. I have HAOS installed on a NUC.

I have read about some ESPHome devices, but ideally I would prefer something plug and play (at a reasonable price) that I could install in the living room, for example, and that looks nice or at least goes unnoticed.

Ideally I would like to use, for example, an older Android phone as a microphone, and why not as a speaker too, but I don’t think I can.

thanks

Hey there! :slight_smile:

At the moment, you can use a few different ways to connect with your voice assistant:

  • A microphone and speaker based on an ESP chipset that can be flashed via ESPHome. An example would be the M5Stack ATOM Echo.
  • Any old Android or Apple phone, tablet or smartwatch that can run the HA companion app.
  • A separate computer that runs the microphone and speaker and sends the signal in some way to HA. An example would be a Rhasspy server and/or client that runs on a Pi (Pi Zero, Pi2, Pi3, Pi4).

There are other devices out there that can be connected as well, but in the end they all boil down to the three options above. E.g. there is a nice little speaker out there, the ESPmuse, that works nicely, but it is still an ESP-based device.

What I’d do is check where you want to put your mic/speaker and, more importantly, how you’d use it. E.g. I have an ATOM Echo on my living room table, but I’ll change its location sooner rather than later, as we always have our phones and one tablet right there, too. It doesn’t make sense there, as all mobile devices are running the companion app. So one click on the ATOM Echo is nearly the same as touching two buttons on my phone.

On the other hand, you’d likely not want to touch your phone right after you have left the shower. That would be a good place for an Echo. :slight_smile:

So my recommendation is: buy one or two ATOM Echos (flashing is really easy) and try them out. Take one or more Android or Apple devices, whatever you have at hand, and install the companion app. Give it a test run of a few weeks to see how you and your other household members work with it. You’ll likely find some things totally different than expected in terms of how people use these voice assistants.

I for one was sure the mic on the living room table would be the most used device. Turns out I was totally wrong. My wife uses her phone/companion app to speak her voice commands, and the next one in line is the tablet in the hallway. The Echos are fine and work great, but they are not yet located in the right places.

What I want to say is this: take your time, buy one mic/speaker beside the app, and test it. You’ll see quite quickly, where you need a mic, and where a mic is totally useless. And an ATOM Echo is around $15 each, so no real harm done in trying. If you order via the manufacturer shop, there is a discount code from NC available. :wink: Depends on where you live. :slight_smile:

I am going to order the ATOM Echo, but until then, is there a link or something where I could see how to use an old Android phone?

Just install the Home Assistant Companion App from the Play Store and configure it.

Configure means the wizard will guide you through setting up the connection to HA, and afterwards you can configure which sensors you want to show up in HA. That’s it, simple as that. :slight_smile:

If you now open the app, you have the Assist icon in the header; just click it and start speaking. :slight_smile: Depending on your phone, you can even set it up as the default assistant, so a long press on the home button directly opens the voice assistant and you can start speaking. :slight_smile:

I will try it on Friday when I get home.
But is there a way to “wake” the assistant on the phone with my voice, like “Hey Google”?

Not yet. :slight_smile: A so called “wake word” is in development, rumors say, but I’m sure it is. :laughing:

The last “Year of the voice” video on YT was expected to announce “wake word” functionality, but it seems, we have to wait for that. I’m fine with it, I’m very happy without.

I used to use Rhasspy, where wake word functionality is already implemented, but I didn’t like it. It had too many false positives, especially in the living room when the TV was on. So for me the far better approach is one-click-and-talk, or a long press on the home button on Android.

Usually, almost everything I need is automated, and this is my preferred method. However, sometimes there is a need to ask HA by voice to do something, e.g. turn on a light while my hands are wet, so a button wouldn’t suit me.

Just wondering if a bluetooth conference call microphone/speaker combination would work with a raspberry pi as a voice assistant. Has anyone tried that?

This is my big issue too, and I believe it always will be when the TV audio out doesn’t come from the Voice Assistant device. Subtracting the audio being played out the speaker from the audio signal input heard by the microphone is necessary to identify what sound is being created in the room. That’s why they demo Google/Alexa/Siri playing music from its integrated speaker.
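To make the “subtracting” idea concrete, here is a minimal sketch of an adaptive echo canceller (NLMS) in plain Python. It is only an illustration, not what Rhasspy or any commercial assistant actually runs: the function name `nlms_echo_cancel`, the made-up three-tap `room_path` and the random “TV audio” are all assumptions. The key point is that the filter needs the reference signal (what the speaker is playing), which a separate TV never gives to the mic device.

```python
import random


def nlms_echo_cancel(reference, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal.

    reference: samples sent to the speaker (known to the device).
    mic:       samples picked up by the microphone (echo + room sound).
    Returns the residual, i.e. what is left after removing the estimated echo.
    """
    weights = [0.0] * taps   # current estimate of the speaker-to-mic path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for ref_sample, mic_sample in zip(reference, mic):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate
        # Normalised LMS update: step size scaled by the reference energy.
        norm = sum(x * x for x in history) + eps
        weights = [w + mu * error * x / norm for w, x in zip(weights, history)]
        residual.append(error)
    return residual


def energy(samples):
    return sum(s * s for s in samples)


# Hypothetical test signal: random "TV audio" and a made-up echo path.
rng = random.Random(0)
tv_audio = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
room_path = [0.6, 0.3, 0.1]
mic = [
    sum(g * tv_audio[n - k] for k, g in enumerate(room_path) if n - k >= 0)
    for n in range(len(tv_audio))
]

residual = nlms_echo_cancel(tv_audio, mic)
print("mic energy (last 500 samples):     ", energy(mic[-500:]))
print("residual energy (last 500 samples):", energy(residual[-500:]))
```

In this toy setup nobody is speaking, so once the filter has adapted the residual should be close to silence. With a real voice in the room, the voice is what would remain in the residual.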

Personally I use Rhasspy and turn off the satellite device (a RasPi 3A with reSpeaker 4-mic HAT) in the living room when the TV is turned on. I have added a button to pause my media centre/TV and enable Rhasspy when pressed; and to un-pause on double-tap. Not an especially elegant solution, but it works for me better than having false positives from the TV :wink:

As you are probably aware, the new HA voice assistant is the next generation by the developer of rhasspy; and I’m sure it will be well worth waiting for. Home Assistant will provide most of the processing centrally, with small cheap satellite devices (mic + speaker + communications + minimal processing) spread around the house.

Re: voice assistant hardware … currently for Rhasspy, a RasPi with reSpeaker HAT is commonly recommended; but is more expensive than google/alexa options :frowning:

I believe the limitation for ATOM Echo with Rhasspy is that it doesn’t have enough memory or CPU power to detect wakeword locally - so has to send its audio stream 24/7 over the LAN to the more powerful server to detect its wakeword. The current HA voice assistant release gets around this by using a button instead of a wakeword.
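For a rough feel of what “streaming 24/7” means on the network, here is a back-of-the-envelope calculation. The audio format is an assumption on my side (uncompressed 16 kHz, 16-bit, mono PCM, which is common for speech recognition); I don’t know what the ATOM Echo actually sends, and `stream_bandwidth` is just a helper name for this example.

```python
def stream_bandwidth(sample_rate_hz=16_000, bits_per_sample=16, channels=1):
    """Bytes per second for an uncompressed PCM stream (assumed format)."""
    return sample_rate_hz * bits_per_sample // 8 * channels


bytes_per_second = stream_bandwidth()
kbit_per_second = bytes_per_second * 8 / 1000
gb_per_day = bytes_per_second * 86_400 / 1e9  # decimal gigabytes

print(f"{bytes_per_second} bytes/s, {kbit_per_second} kbit/s, {gb_per_day:.2f} GB/day")
```

Under those assumptions a single satellite is a small load for a home LAN, so the concern is more about the server doing wake word detection for every stream around the clock than about raw bandwidth.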

I am personally hoping that Nabu Casa will use its ESPHome expertise to develop an integrated ESPHome based satellite device at a competitive price.

Yes, in the rhasspy forum several are using Jabra conferencing devices. I understand the audio quality is excellent - but these are expensive, and designed to be used in closed conference rooms without background noise, so back to the first point :wink:

This you should be able to solve with an automation. While giving back the response to Rhasspy, you could also unpause the TV.

You shouldn’t look at this as two different things; it is all coming together nicely. :slight_smile: HA will still provide functionality that lets you use the Echos, but as you said, I highly doubt wake word will work on these devices. The combination of Rhasspy and HA Assist will solve this for us:

  • Assist for push-to-talk-mics
  • Rhasspy for wake-word (but, see the next paragraph!)

That’s why Paulus announced (in a side sentence), that the device they are working on right now will be an ESP S3 probably with a display. This device has enough power and space to use a wake word on the device. :slight_smile: But he also said, these aren’t available broadly, at least for now, so the team isn’t in a hurry to release anything here. :slight_smile:

As Don said, in the Rhasspy forum there are some people using these, but my highest recommendation for a mic array is the PlayStation Eye camera:

They go for $20 a pop and are technically very good and reliable. They are really a quality product and run without any problems. Added bonus: you can still use the camera inside these to get a video feed or something like that. Just needs a Pi that can handle it (aka not a Pi2 :wink: ).

Do you have any instructions (or a link to some) on how to use the PlayStation Eye as a voice assistant?

I am also looking for hardware to create a couple of voice assistants spread throughout the house, but would definitely want to have wake word as that is the use case I am seeing in our house.

What options are there currently for having a wake word?
Also, when using a wake word, what range do these devices have, and how good are they at picking up the wake word? Google Assistant is really good in this regard, I believe, even when audio from other sources is playing.

+1, looking for a cheap, efficient, nice-looking and not power-hungry device.
As already said earlier in this thread, a Nabu Casa speaker based on ESP32 would be great (I personally do not need a display).
Right now the ESP32-S3-BOX-3 seems to be out of stock everywhere…

Regards

I’m also starting to bring the voice into my smart home, in addition to existing

  • wall tablet running Fully Kiosk and the Android HA Companion app
  • several iPhones with the iOS HA Companion app

I’m thinking of getting at least

Unfortunately,

  • the Atom Echo is currently very pricey in my country (roughly 30 €) and the original site takes 3 to 8 weeks to ship. Besides, I generally try to avoid ordering directly from Chinese shops; I prefer to pay a bit more and get the hardware much faster.
  • The ESP32-S3-BOX on the other hand: I can’t find any source / shop / marketplace to buy it at all.

So where do other e. g. Europeans get their devices like the two mentioned above?

If there were at least a wake word for the Android HA Companion app (no need to press a physical button or the virtual Assist button in the HA app), that would buy me some time…

Is it possible to use the microphone from the PlayStation 3 camera (PlayStation Eye) for talking to Home Assistant?

I got my ESP32-S3-Box from AliExpress. It works fine, but I also needed a few without a display. For those I went the DIY route. Besides a speaker, a few RGB LEDs are used for feedback.

Home Assistant detects the PlayStation Eye camera as the following devices. Do you have an idea how to configure it so that its microphone can be used to control devices and ask questions?

controlC0
/dev/snd/by-id/usb-OmniVision_Technologies__Inc._2000-01
Subsystem:
sound
Device path:
/dev/snd/controlC0
Identifier:
/dev/snd/by-id/usb-OmniVision_Technologies__Inc._2000-01
Attributes:
DEVLINKS: >-
  /dev/snd/by-id/usb-OmniVision_Technologies__Inc._2000-01
  /dev/snd/by-path/platform-xhci-hcd.3.auto-usb-0:1:1.1
DEVNAME: /dev/snd/controlC0
DEVPATH: >-
  /devices/platform/soc/ffe09000.usb/ff500000.usb/xhci-hcd.3.auto/usb1/1-1/1-1:1.1/sound/card0/controlC0
ID_BUS: usb
ID_MODEL: '2000'
ID_MODEL_ENC: '2000'
ID_MODEL_ID: '2000'
ID_PATH: platform-xhci-hcd.3.auto-usb-0:1:1.1
ID_PATH_TAG: platform-xhci-hcd_3_auto-usb-0_1_1_1
ID_REVISION: '0200'
ID_SERIAL: OmniVision_Technologies__Inc._2000
ID_TYPE: audio
ID_USB_DRIVER: snd-usb-audio
ID_USB_INTERFACES: ':ff0000:010100:010200:'
ID_USB_INTERFACE_NUM: '01'
ID_USB_MODEL: '2000'
ID_USB_MODEL_ENC: '2000'
ID_USB_MODEL_ID: '2000'
ID_USB_REVISION: '0200'
ID_USB_SERIAL: OmniVision_Technologies__Inc._2000
ID_USB_TYPE: audio
ID_USB_VENDOR: OmniVision_Technologies__Inc.
ID_USB_VENDOR_ENC: OmniVision\x20Technologies\x2c\x20Inc.
ID_USB_VENDOR_ID: '1415'
ID_VENDOR: OmniVision_Technologies__Inc.
ID_VENDOR_ENC: OmniVision\x20Technologies\x2c\x20Inc.
ID_VENDOR_ID: '1415'
MAJOR: '116'
MINOR: '0'
SUBSYSTEM: sound
SYSTEMD_USER_WANTS: sound.target
SYSTEMD_WANTS: sound.target
TAGS: ':systemd:uaccess:'
USEC_INITIALIZED: '15162493'

video1
/dev/v4l/by-id/usb-OmniVision_Technologies__Inc._2000-video-index0
Subsystem:
video4linux
Device path:
/dev/video1
Identifier:
/dev/v4l/by-id/usb-OmniVision_Technologies__Inc._2000-video-index0
Attributes:
DEVLINKS: >-
  /dev/v4l/by-id/usb-OmniVision_Technologies__Inc._2000-video-index0
  /dev/v4l/by-path/platform-xhci-hcd.3.auto-usb-0:1:1.0-video-index0
DEVNAME: /dev/video1
DEVPATH: >-
  /devices/platform/soc/ffe09000.usb/ff500000.usb/xhci-hcd.3.auto/usb1/1-1/1-1:1.0/video4linux/video1
ID_BUS: usb
ID_FOR_SEAT: video4linux-platform-xhci-hcd_3_auto-usb-0_1_1_0
ID_MODEL: '2000'
ID_MODEL_ENC: '2000'
ID_MODEL_ID: '2000'
ID_PATH: platform-xhci-hcd.3.auto-usb-0:1:1.0
ID_PATH_TAG: platform-xhci-hcd_3_auto-usb-0_1_1_0
ID_REVISION: '0200'
ID_SERIAL: OmniVision_Technologies__Inc._2000
ID_TYPE: generic
ID_USB_DRIVER: ov534
ID_USB_INTERFACES: ':ff0000:010100:010200:'
ID_USB_INTERFACE_NUM: '00'
ID_USB_MODEL: '2000'
ID_USB_MODEL_ENC: '2000'
ID_USB_MODEL_ID: '2000'
ID_USB_REVISION: '0200'
ID_USB_SERIAL: OmniVision_Technologies__Inc._2000
ID_USB_TYPE: generic
ID_USB_VENDOR: OmniVision_Technologies__Inc.
ID_USB_VENDOR_ENC: OmniVision\x20Technologies\x2c\x20Inc.
ID_USB_VENDOR_ID: '1415'
ID_V4L_CAPABILITIES: ':capture:'
ID_V4L_PRODUCT: USB Camera (1415:2000)
ID_V4L_VERSION: '2'
ID_VENDOR: OmniVision_Technologies__Inc.
ID_VENDOR_ENC: OmniVision\x20Technologies\x2c\x20Inc.
ID_VENDOR_ID: '1415'
MAJOR: '81'
MINOR: '1'
SUBSYSTEM: video4linux
TAGS: ':seat:uaccess:'
USEC_INITIALIZED: '15046120'
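If you want to check a dump like this quickly, the entry that matters for a microphone is the one with `SUBSYSTEM: sound` and `ID_TYPE: audio`. Below is a small sketch that reads such `KEY: value` lines and applies that check. The helper names `parse_attributes` and `is_usb_microphone_candidate` are made up for this example, and `SAMPLE_DUMP` is just a shortened copy of the first device above; it does not talk to Home Assistant at all.

```python
# Shortened copy of the attribute list for the first device in the post.
SAMPLE_DUMP = """\
DEVNAME: /dev/snd/controlC0
ID_MODEL: '2000'
ID_TYPE: audio
ID_USB_DRIVER: snd-usb-audio
ID_VENDOR: OmniVision_Technologies__Inc.
SUBSYSTEM: sound
"""


def parse_attributes(text):
    """Turn 'KEY: value' lines into a dict, dropping surrounding quotes.

    Indented continuation lines and lines without a 'KEY: value' pair
    (such as section labels) are skipped.
    """
    attrs = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep or key != key.strip():
            continue
        attrs[key] = value.strip().strip("'")
    return attrs


def is_usb_microphone_candidate(attrs):
    """True if the device is exposed through the sound subsystem as audio."""
    return attrs.get("SUBSYSTEM") == "sound" and attrs.get("ID_TYPE") == "audio"


attrs = parse_attributes(SAMPLE_DUMP)
print(attrs["DEVNAME"], "->", is_usb_microphone_candidate(attrs))
```

The second device (`SUBSYSTEM: video4linux`, `ID_TYPE: generic`) is the camera part and would not pass this check.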

Hi CichY,

It looks like the system is correctly recognizing the PS3 Eye Camera as a USB audio device. If it’s directly connected to the machine running Home Assistant, you can use the Assist Microphone add-on. I’m using it myself with a Jabra USB speakerphone, and it works great! Just make sure to select the correct device under Audio in the add-on configuration.

Alternatively, if you want to run the microphone on a separate device (like a Raspberry Pi), you could set up a Voice Satellite using Rhasspy’s Wyoming Satellite. There’s a good example of setting it up with the PS3 Eye Camera in this video: PS3 Eye Camera Mic Setup.

Hope this helps!