Sonoff TX Ultimate and Voice Assistant

Nice! I will take a look.
I’ve tried esp32_rmt_led_strip before, but got some flickering… will try it again.
About media_player, I really wanna try to find some time to make it compatible with esp-idf. That would be awesome.
Same goes for the speaker volume. I had the impression it worked better when the Sonoff firmware was installed, so maybe the hardware is capable of more than what we are getting with ESPHome.

Hi guys,
I’ve just pushed the image (amd64) to Docker Hub. You can find it here: https://hub.docker.com/r/pxpert/esphome

@EdwardTFN I don’t think the speaker component is worse than media_player. They just behave differently.
The voice_assistant component also includes some functionality that is not available with the Arduino framework (like the VAD threshold), and there are components like microwakeword that are supported only by the esp-idf framework.

Talking about the stock firmware… the ESPHome firmware is not really comparable with the stock one. The stock firmware is much simpler, and its sound effects are stored locally, which requires fewer resources than a remotely played sound and does not depend on the Wi-Fi signal quality. Anyway, you could also store a sound effect locally and play it with the speaker component to replicate the Sonoff stock firmware behaviour (I don’t think this is possible with media_player).

I also got some flickering with the LED strip, but only when using complex animations while the device is also performing other tasks (for example when it’s sending recorded audio to Home Assistant for STT analysis)… I think that’s just a computational limitation of the ESP32 chip itself (and that’s understandable… after all it’s still a switch, not a PC :slight_smile: )

I was not comparing the speaker and media_player components… I believe those are similar… My point is that I got the impression that sounds from the stock firmware are louder than with ESPHome. I don’t have both working at the same time for comparison, so it’s just a feeling, but I don’t think it is related to whether the sound is local or not.
I will investigate this later…

Understandable. However, it works just fine with Neopixelbus (currently Arduino only, but I believe it would not be hard to make it IDF compatible). :wink:

I need longer days to look at this… :joy:

I completely agree with you :pensive:

Anyway, just a warning: from what I’ve seen in the ESPHome code in the past, media_player (and, I’m almost sure, Neopixelbus too) uses methods exposed only by the ESP32 Arduino framework. That’s the reason it’s not compatible with the esp-idf framework, even though it’s more feature rich. So I think adding support for both media_player and Neopixelbus requires some effort, if it’s possible at all.

And remember, as I’ve said, there are many components that need esp-idf, like microwakeword (I’ve tested V1 and need some time to test the V2 release).

Yeah!
I’ve started with Neopixelbus, and it requires changes in that project to support esp-idf… I see other people looking at that, and hopefully I will be able to contribute to this as well.

I went one step further: I 3D printed a cover over the Sonoff and put an SC01 Plus running HASP on it, so it has a touch screen and voice.

This is simply beautiful! Can you share the STL (or even better, the source) of your 3D printed model?

Hi,
Just to inform you there is some cool progress in this project:

As always, you can find all the YAML changes in my GitHub project, PxPert/esphome-config.

I’ve got myself caught up and things updated!

What would be the best way to enable the wake word by default / reboot? Currently I just have an automation to enable it every 15 minutes.

On that note, I had things set up to use a custom wake word, but now that is not working. Do I just need to enable the okay_nabu.json file in the micro wake word model? Sad to say, I did like how I had it set up before. I understand the default is easier for a new user without having to set everything else up in the back end.

Edit: I also noticed it wouldn’t compile with this option left in: use_psram: false in tx_ultimate_base_leds.yaml, so I currently have it commented out. I’ve also noticed that the LEDs flash oddly when going between modes, whereas the previous version was smooth. Not sure if this is related, but I am on the current dev branch of ESPHome.

Maybe you can attach to the on_boot trigger (with a priority > 800) to automatically enable the wake word on boot.
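
A minimal sketch of what that could look like, assuming the wake word runs through the micro_wake_word component (if your config exposes it as a switch instead, turn that switch on here). The priority value is only an example of "above 800"; since higher priorities run earlier in the boot sequence, it may need lowering if the component is not ready yet at that point:

```yaml
esphome:
  # name and the rest of your existing esphome: options stay as they are
  on_boot:
    # Example value only ("> 800" as suggested above). Higher values run earlier;
    # ESPHome documents -100 as the point where pretty much everything is initialized.
    priority: 801
    then:
      # Assumes a micro_wake_word: block is configured elsewhere in the YAML
      - micro_wake_word.start:
```

With something like this in place, the 15-minute automation should no longer be needed, although that is untested here.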

Yes, but that implementation has its own drawbacks, like needing a constant connection with the backend. Not to mention the backend resource usage (I’ve got 8 of these devices connected, and my server started struggling to manage the wake word for all of them).

That’s because of the interaction between PSRAM and the rmt_led_strip component (see the 4th point in my previous post): when PSRAM is enabled, the rmt_led_strip component relies on external allocation, which causes problems like the ones you’ve mentioned. That’s why I’ve patched it and added the additional option use_psram. It’s still not in the official dev branch, which is why it’s not recognized in your build, but you could sync with my fork or use my Docker image, which does include it.
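
For reference, a rough sketch of where the option goes, assuming a build that contains the patch (the fork or the Docker image mentioned above). The id, pin, LED count, RMT channel, chipset, and color order are placeholder values, not the real TX Ultimate settings; keep whatever tx_ultimate_base_leds.yaml already defines:

```yaml
light:
  - platform: esp32_rmt_led_strip
    id: leds              # hypothetical id
    pin: GPIO13           # placeholder
    num_leds: 28          # placeholder
    rmt_channel: 0        # placeholder
    chipset: WS2812       # placeholder
    rgb_order: GRB        # placeholder
    # Patched-in option: presumably keeps the LED buffers out of external
    # (PSRAM) allocation. A build without the patch does not recognize this
    # key and fails to compile.
    use_psram: false
```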

Hi Pietro, thank you for sharing the project, I really love it. I noticed you might have installed more than one TXU in your house. If you talk to one of them, will the other TXUs respond to you? Or will only the one you speak to respond?

Hi Andy, yes, I currently have 8 Sonoff TXs around my house, but I’ve only added one microphone per room, so there is no problem with concurrent triggers.
Anyway, I suspect that if more than one of them is triggered, they’ll all listen (and answer) to you. You could also set up a mechanism to limit this effect (for example by subscribing / publishing to an MQTT topic when the wake word event is triggered), but I think that will make things a bit complicated…

Thank you! And one more question: if you talk to a TX in a room (let’s say your bedroom), does only the TX in your bedroom respond to you, while the one in the living room does not answer (I assume it doesn’t hear you)? I ask this because if all the TXs are connected to 1 Alexa, once Alexa responds to me, maybe all the TXs will answer me together…

From my experience when using OWW and the other back end stuff, only one would trigger even though multiple would have been in range of hearing me. I’m not sure about the new version where the wake word is now running on board.

Hi Dock, if only one would trigger, here comes another question… If my mom and I talk to TXs in different rooms, will only one TX respond?