Sonoff TX Ultimate and Voice Assistant

EdwardTFN · September 16, 2024, 7:28am

Nice! I will take a look.
I’ve tried esp32_rmt_led_strip before, but got some flickering… will try it again.
About media_player, I really wanna try to find some time to make it compatible with esp-idf. That would be awesome.
Just like the speaker volume. I had an impression it worked better when the Sonoff firmware was installed, so maybe the hardware is capable of more than what we are getting with ESPHome.

PxPert · September 16, 2024, 9:13pm

Hi guys,
I’ve just pushed the image (amd64) to Docker Hub. You can find it here: https://hub.docker.com/r/pxpert/esphome

@EdwardTFN I don’t think the speaker component is worse than media_player. They just behave differently.
The voice_assistant component also includes some functionality not provided under the Arduino framework (like the VAD threshold), and there are components like microwakeword that are supported only on the esp-idf framework.

Talking about the stock firmware… The esphome firmware itself is not comparable with the stock one. The stock firmware is much simpler and its sound effects are stored locally, which requires fewer resources than a remotely played sound and does not depend on the wifi signal quality. Anyway, you could also store a local sound effect and play it with the speaker component, just to replicate the Sonoff stock firmware behaviour (I don’t think this is possible with media_player).

I also got some flickering with the LED strip, but only when using complex animations while the device is also performing other tasks (for example when it’s sending recorded audio to Home Assistant for STT analysis)… I think that’s only a computational limitation of the ESP32 chip itself (and that’s understandable… after all it’s still a switch, not a PC).

EdwardTFN · September 17, 2024, 2:49am

I was not comparing the speaker and media_player components… I believe those are similar… My point is that I got the impression that sounds from the stock firmware are louder than when using ESPHome. I don’t have both working at the same time for comparison, so it’s just a feeling, but I don’t think it is related to whether the sound is local or not.
I will investigate this later…

Understandable. However, it works just fine with Neopixelbus (currently Arduino only, but I believe it would not be hard to make it IDF compatible).

I need longer days to look at this…

PxPert · September 17, 2024, 9:52pm

I completely agree with you.

Anyway, just a warning: from what I’ve seen in the esphome code in the past, media_player (and I’m almost sure Neopixelbus too) uses methods exposed only in the ESP32 Arduino framework. That’s the reason it’s not compatible with the esp-idf framework, even though esp-idf is more feature rich. So I think adding esp-idf support for both media_player and Neopixelbus requires some effort, if it’s possible at all.

And remember, as I’ve said, there are many components that need esp-idf, like microwakeword (I’ve tested V1 and need some time to test the V2 release).

EdwardTFN · September 18, 2024, 3:49am

Yeah!
I’ve started with Neopixelbus, and it requires changes in that project to support esp-idf… I see other people looking at that, and hopefully I will be able to contribute to this as well.

xHirscHx · September 19, 2024, 2:54am

I went one step further: I 3D printed a cover over the Sonoff and put an SC01 Plus running hasp on it, so I have touch screen and voice.

PxPert · September 21, 2024, 10:15am

This is simply beautiful! Can you share the STL (or even better, the source) of your 3D-printed model?

PxPert · September 21, 2024, 10:25am

Hi,
Just to inform you there is some cool progress in this project:

- My previous PR was merged into the dev branch.
- I’m now switching to microwakeword V2 to rely on a local wake word. No more openwakeword.
- I’ve enabled the onboard PSRAM component (this effectively adds 2MB of RAM, wonderful).
- I’ve fixed a problem with the LED strip when using PSRAM. I’ve updated my GitHub repo and Docker Hub image with this fix (which needs a change in the yaml, see above). Obviously I’ve pushed another PR for this. You can find it here:
  - esphome: Add FORCE_INTERNAL Flag to ExternalRAMAllocator and use_psram to esp32_rmt_led_strip by PxPert · Pull Request #7478 · esphome/esphome · GitHub
  - esphome-docs: Add use_psram param description to esp32_rmt_led_strip.rst by PxPert · Pull Request #4266 · esphome/esphome-docs · GitHub
- I’m starting to play around with local sound effects. For now a beep is played whenever a wakeword is detected and when the voice_assistant ends STT recording. I’ve also added a custom push button to manually trigger, and try out, the sound effect.

As always, you can find all the yaml changes in my github project: GitHub - PxPert/esphome-config: esphome configuration
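
For anyone who wants to try the local beep idea without bundling an audio file, here is a minimal sketch. It is not the configuration from the repo above. It assumes a speaker component with the hypothetical id `tx_speaker` that accepts 16-bit mono PCM at 16 kHz, and it synthesizes a short square-wave tone inside a lambda. The button name, frequency, amplitude, and duration are all placeholders.

```yaml
button:
  - platform: template
    name: "Test beep"            # placeholder name
    on_press:
      - speaker.play:
          id: tx_speaker         # hypothetical id of your speaker component
          data: !lambda |-
            // Build 100 ms of a 1 kHz square wave as 16-bit little-endian
            // mono PCM. The 16 kHz sample rate is an assumption and must
            // match how the speaker is actually configured.
            const int sample_rate = 16000;
            const int freq = 1000;
            const int duration_ms = 100;
            const int16_t amp = 8000;
            const int total = sample_rate * duration_ms / 1000;
            const int half_period = sample_rate / (2 * freq);
            std::vector<uint8_t> pcm;
            pcm.reserve(total * 2);
            for (int i = 0; i < total; i++) {
              int16_t s = ((i / half_period) % 2 == 0) ? amp : -amp;
              pcm.push_back(s & 0xFF);         // low byte
              pcm.push_back((s >> 8) & 0xFF);  // high byte
            }
            return pcm;
```

The same `speaker.play` action could be reused in other triggers. Longer clips may not fit the speaker's internal buffer in one call, so keep test tones short.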

DDock · September 23, 2024, 3:26am

I’ve got myself caught up and things updated!

What would be the best way to enable the wake word by default / reboot? Currently I just have an automation to enable it every 15 minutes.

On that note, I had things set up to use a custom wake word, but now that is not working. Do I just need to enable the okay_nabu.json file in the micro wake word model? Sad to say, I did like how I had it set up before. I understand the default is easier for a new user without having to set everything else up in the back end.

Edit: I also noticed it wouldn’t compile with the option use_psram: false left in tx_ultimate_base_leds.yaml, so I currently have it commented out. I’ve also noticed that the LEDs flash oddly when going between modes, whereas the previous version was smooth. Not sure if this is related, but I am on the current dev branch of ESPHome.

PxPert · September 23, 2024, 12:37pm

Maybe you can attach to the on_boot trigger (with a > 800 priority) to automatically enable the wake word on boot.
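
A minimal sketch of that idea, assuming the micro_wake_word component is already configured elsewhere in the yaml. The device name is a placeholder and the priority is a judgment call. The post suggests a value above 800, but ESPHome runs higher priorities earlier in boot, so this sketch uses -100 to fire once everything is initialized. Adjust it if the wake word does not start.

```yaml
esphome:
  name: tx-ultimate          # placeholder device name
  on_boot:
    # Assumption: -100 runs after all components have been set up.
    priority: -100
    then:
      # Start on-device wake word detection without waiting for an automation.
      - micro_wake_word.start:
```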

Yes, but that implementation has its own drawbacks, like needing a constant connection to the backend. Not to mention the backend resource usage (I’ve got 8 of these devices connected and my server started struggling to manage the wakeword for all of them).
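
For reference, an on-device wake word setup might look roughly like the sketch below. It assumes the esp-idf framework plus a microphone and voice_assistant component configured elsewhere. The `okay_nabu` model is only an example. Check the repo mentioned earlier for the settings actually used on the TX Ultimate.

```yaml
micro_wake_word:
  models:
    # Built-in model name; a URL to a model's JSON manifest can be given instead.
    - model: okay_nabu
  on_wake_word_detected:
    # Hand over to the voice assistant pipeline once the wake word is heard.
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
```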

That’s because of the interaction between PSRAM and the rmt_led_strip component (see the 4th point in my previous post): when PSRAM is enabled, the rmt_led_strip component relies on external allocation, which causes problems like the ones you’ve mentioned. That’s why I’ve patched it and added the additional option use_psram. It’s still not in the official dev branch, which is why it’s not recognized in your build, but you could sync with my fork or use my Docker image, which does include it.
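
As a rough illustration of the option described above, the LED section could look like the sketch below. The pin, LED count, chipset, colour order, and name are assumptions for illustration. Take the real values from tx_ultimate_base_leds.yaml.

```yaml
# Enable the external PSRAM.
psram:

light:
  - platform: esp32_rmt_led_strip
    name: "TX Ultimate LEDs"   # placeholder name
    pin: GPIO13                # assumption, check your board
    num_leds: 28               # assumption, check your board
    chipset: ws2812            # assumption
    rgb_order: GRB             # assumption
    # Some ESPHome releases also require an rmt_channel entry here.
    # Keep the LED buffers in internal RAM even though PSRAM is enabled.
    use_psram: false
```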

Andy6 · September 25, 2024, 9:15am

Hi Pietro, thank you for sharing the project, I really love it. I noticed you may have installed more than one TXU in your house. If you talk to one of them, will the other TXUs respond to you? Or will only the one you speak to respond?

PxPert · September 25, 2024, 10:37am

Hi Andy, yes, I currently have 8 Sonoff TXs around my house, but I’ve only added one microphone per room, so there is no problem with concurrent triggers.
Anyway, I suspect that if more than one of them is triggered, they’ll all listen (and answer) to you. You could also set up a mechanism to limit this effect (for example by publishing / subscribing to an MQTT topic when the wakeword event is triggered), but I think that would make things a bit complicated…

Andy6 · September 25, 2024, 11:05am

Thank you! And one more question: if you talk to a TX in a room (let’s say your bedroom), does only the TX in your bedroom respond to you, while the one in the living room does not answer (I assume it doesn’t hear you)? I ask this because if all the TXs are connected to one Alexa, once Alexa responds to me, maybe all the TXs answer me together…

DDock · September 25, 2024, 12:19pm

From my experience when using OWW and the other back end stuff, only one would trigger even though multiple would have been in range of hearing me. I’m not sure about the new version where the wake word is now running on board.

Andy6 · September 25, 2024, 12:37pm

Hi Dock, if only one would trigger, here comes another question… If my mom and I talk to TXs in different rooms, will only one TX respond?

Andy6 · October 8, 2024, 1:17pm

Hi Pietro,
Is the solution you’re using based on Alexa ASK skills?

PxPert · October 10, 2024, 10:01pm

Hi, not at all. It’s based on Home Assistant’s internal (and local) components. It has nothing to do with cloud-based Alexa services and skills.

PxPert · October 13, 2024, 8:31pm

Hi, just to inform you that my last PR was merged into the dev branch. Now you can use the use_psram parameter from the official esphome repository (for now, only on the dev branch).

DDock · October 14, 2024, 4:19am

I made up another 3 switches this weekend and took photos. I plan on doing a write-up, but I’m not sure where to post it. Any thoughts?

I checked and looks like they finally pushed it to the dev docker branch, so I updated and they work great again!

AnLeh · October 14, 2024, 5:24am

What about https://www.instructables.com/ ?