Just posting to say that this is an outstanding component, and it's already integrated into my doorbell project to replace my Ring pair. I'm currently using one ESP32-S3 to drive the camera and another to drive the audio and controls, though I'll try to bring it all back into a single device in the near future. I didn't use go2rtc; I'm just streaming the mic directly from the board. Many thanks!
@meconiotech Can you add ESP32P4 to the check in esp_aec/__init__.py?
I can't send you a PR because GitHub won't allow me to fork your repo directly, since I already have it forked via @will_santana's fork.
@gtjoseph, I've merged your PR.
I'll find some time to test it too before submitting the PR to @meconiotech
@meconiotech you just read my mind about the dynamic AEC. Disabling it when nothing is playing or being announced would save some resources, but it's still very important to have it enabled during playback and announcements to improve mic detection.
I'd extend the mic component by adding actions like aec_start and aec_stop
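A rough sketch of how that could look in a device config, assuming the mic component were extended as proposed. The `microphone.aec_start` and `microphone.aec_stop` actions are hypothetical (they are only the names suggested above and have not been checked against the component), and the ids `external_media_player` and `i2s_mics` are borrowed from configs posted in this thread. The `on_play`, `on_announcement` and `on_idle` triggers are the regular ESPHome media player triggers.

```yaml
# Sketch only: aec_start / aec_stop are HYPOTHETICAL actions, not an existing API.
media_player:
  - platform: speaker
    id: external_media_player
    # ... media_pipeline / announcement_pipeline as usual ...
    on_play:
      - microphone.aec_start: i2s_mics   # hypothetical: enable AEC while media plays
    on_announcement:
      - microphone.aec_start: i2s_mics   # hypothetical: enable AEC during announcements
    on_idle:
      - microphone.aec_stop: i2s_mics    # hypothetical: free the resources when idle
```

The idea is simply to tie the AEC lifetime to the media player state, so the echo canceller only runs while the speaker is actually producing sound.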
Really excited about this project and the possibility of finally ditching my Echos for good
Another upgrade that would be awesome is the possibility of running the speaker at 48000 and delegating the resampling to the speaker mixer component, so voice calls and announcements could use lower sample rates while media would play at its best
I don't know if the sampling rate MUST be equal down at the hardware level, but the config accepts it without complaint, maybe because it assumes separate buses. As far as I know, the clock pins in the ESP32 I2S are virtual, and only the data in and out pins have to be physical (for obvious reasons). So maybe allowing separate mic and speaker configurations in the i2s_audio_duplex component could enable this feature
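A minimal sketch of that idea, assuming the i2s_audio_duplex speaker can actually run at 48 kHz while the mic/AEC side stays at 16 kHz (that is exactly the open question, so treat this as untested). The ids are borrowed from configs posted in this thread. In stock ESPHome the rate conversion is done by the `resampler` speaker platform placed in front of the `mixer` inputs, so each pipeline can declare its own rate.

```yaml
# Sketch only: untested assumption that the duplex speaker accepts 48000.
speaker:
  - platform: i2s_audio_duplex
    i2s_audio_duplex_id: i2s_duplex
    id: i2s_audio_speaker
    sample_rate: 48000                 # hardware output rate (assumed to be allowed)
  - platform: mixer
    id: mixer_speaker_id
    output_speaker: i2s_audio_speaker
    source_speakers:
      - id: announcement_spk_mixer_input
      - id: media_spk_mixer_input
  - platform: resampler                # upsamples 16 kHz announcements to the output rate
    id: announcement_spk_resampling_input
    output_speaker: announcement_spk_mixer_input
  - platform: resampler                # passes 48 kHz media through
    id: media_spk_resampling_input
    output_speaker: media_spk_mixer_input

media_player:
  - platform: speaker
    id: external_media_player
    media_pipeline:
      speaker: media_spk_resampling_input
      format: FLAC
      sample_rate: 48000               # media at full quality
      num_channels: 1
    announcement_pipeline:
      speaker: announcement_spk_resampling_input
      format: FLAC
      sample_rate: 16000               # voice and announcements at a lower rate
      num_channels: 1
```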
By the way, I'm thinking of buying one of these for testing:
https://pt.aliexpress.com/item/1005010271067848.html
Is this the one you're using, @gtjoseph?
I've merged your PR and it tested just fine
The voice assist pipeline works perfectly, but it breaks the media pipeline as shown below:
I'll try to work on it, but my work schedule is quite full today
Hi, my experience:
DIY
DAC MAX98357
Mic INMP441
Intercom-Mini config, except for the changes here:
esp_aec:
  id: aec_processor
  sample_rate: 16000
  filter_length: 8 # 4 = 64ms tail (good balance of quality vs CPU)
  mode: VOIP_HIGH_PERF # Optimized for real-time voice, lower CPU than default
Pretty good audio quality.
Even the echo cancellation is good.
Spotpear Ball V2
Bad audio quality.
Crackling mic even with no one speaking.
I think it's the device.
Nope, I'm using these right now...
Hmmm. I'm not sure why it would have broken the media pipeline, unless you're doing something in the YAML like stopping the speaker but not restarting it. I can play both media and announcements from HA.
Here's the YAML config I've been testing with...
i2s_audio_duplex:
  id: i2s_duplex
  i2s_lrclk_pin: ${i2s_lrclk_pin}
  i2s_bclk_pin: ${i2s_bclk_pin}
  i2s_mclk_pin: ${i2s_mclk_pin}
  i2s_din_pin: ${i2s_din_pin}
  i2s_dout_pin: ${i2s_dout_pin}
  sample_rate: ${sample_rate}
audio_dac:
  - platform: es8311
    id: es8311_dac
    i2c_id: ${dac_i2c_bus_id}
    bits_per_sample: 16bit
    sample_rate: ${sample_rate}
microphone:
  - platform: i2s_audio_duplex
    i2s_audio_duplex_id: i2s_duplex
    id: ${i2s_microphone_id}
    sample_rate: ${sample_rate}
speaker:
  - platform: i2s_audio_duplex
    i2s_audio_duplex_id: i2s_duplex
    id: ${i2s_speaker_id}
    sample_rate: ${sample_rate}
    audio_dac: es8311_dac
    num_channels: 1
  - platform: mixer
    id: mixing_speaker
    output_speaker: ${i2s_speaker_id}
    source_speakers:
      - id: announcement_mixing_input
        timeout: never
      - id: media_mixing_input
        timeout: never
  - platform: resampler
    id: announcement_resampling_speaker
    output_speaker: announcement_mixing_input
  - platform: resampler
    id: media_resampling_speaker
    output_speaker: media_mixing_input
media_player:
  - platform: speaker
    name: ${media_player_name}
    id: ${media_player_id}
    task_stack_in_psram: true
    codec_support_enabled: true
    media_pipeline:
      speaker: media_resampling_speaker
      num_channels: 1
      format: FLAC # FLAC is the least processor intensive codec
      sample_rate: ${sample_rate}
    announcement_pipeline:
      speaker: announcement_resampling_speaker
      format: FLAC # FLAC is the least processor intensive codec
      num_channels: 1 # Stereo audio is unnecessary for announcements
      sample_rate: ${sample_rate} # Supported by Music Assistant
    files:
      - id: mute_switch_on_sound
        file: ${mute_switch_on_sound_file}
      - id: mute_switch_off_sound
        file: ${mute_switch_off_sound_file}
I can't open the link; it says it's not available in my country.
I found an interesting device; maybe it's the one you linked.
It's called:
Color: JC4880P443C-I-W-Y
@tomcat I saw this device today (or a similar one),
but the one I saw had no mic or speaker (despite having a speaker header),
so it was not suitable for a voice assistant.
The one I posted is this one:
It's called ESP32-P4-WIFI6-Touch-LCD-3.4C
3.4 inch (4 inch available) round capacitive screen, dual mic and a reasonably sized speaker included
With a little work and a nice 3D-printed case, it would be a perfect replacement for an Echo Dot/Spot
I have no hopes of replacing an Echo Show, and I don't care about the video features, but some visual feedback and interaction is good
Ah OK, I saw that a few days ago too.
But it's very expensive, ~75€.
The one I posted is around 30€, and the JC4880P443C-I-W version without enclosure and camera is around 24€. Both prices are with a 3€ discount code.
Both seem to have a mic on board and a speaker connector.
Maybe it's also possible to connect an INMP441 to it on the free GPIOs.
The size is very nice for a wall panel or a doorbell, I guess.
Well, the speaker has to go somewhere then, but that should be possible with a 3D-printed case.
Really, the JC4880P443C-I-W is way more reasonably priced. The rectangular format would be easier to work with
The only cons I see are the single mic, versus the dual mic + hardware AEC on the one I posted, and the lack of a speaker out of the box
It seems great for an "in-wall mount", like in a 4x2 box, but it may suffer for voice assistance
PS: I could find the round one for about 60 euros, but that's still way pricier than the JC
Here is a config with audio and I guess this one:
is for that device too.
So for me it looks like being usable as voice assistant.
I'll give it a try because I guess it could be a nice control panel too (main reason).
Win-win if both work
Why do you want dual mic and hardware AEC?
Was INMP441 not working for you?
I have it inside this enclosure:
(But with the mic input at the front, not at the back as in his pictures.)
At least the quality is much better than that of the Spotpear Ball V2.
I've just played around a bit with the voice assistant, but not that much so far.
But judging by the quality I can hear over the mic when using it as an intercom, it should be enough.
For intercom, the INMP441 works fine, but could do better
My focus, as stated before, is not intercom. That's meconiotech's goal
My goal is to replace my Echo devices with something that can work locally and has a better interface than the declining Echos. They are getting dumber over time, and the promised Alexa+, besides being paywalled, isn't available outside the US
In my experience these past weeks with the Spotpear V2 and its INMP441, the voice detection is OK to bad, requiring some shouting or retries. And if you're planning to use it as a media playback device, AEC is quite important, so if it can be delegated to hardware, all the better. Multi-mic arrays usually do way better at noise suppression too. Echo devices do so much better because of this, but their low prices come from Jeff's deep pockets and economies of scale.
Don't get me wrong, I'm still amazed by the Spotpear's performance, especially in its price range. With the stock firmware it's absolutely amazing and so fluid. But of course, it can do better ($$$)
I'll probably integrate meconiotech's intercom into my stack because drop-in calls are just amazing. Add cameras like the one in the hardware you posted and the sky is the limit, but my main goal now is effortless voice detection and media integration
I understand.
I had the same experience with the DIY voice assistant but wasn't sure about the reason.
Anyway, the Ball was much more disappointing for me. I'm sending it back.
This could be interesting too:
~23-25 € @ Ali
Despite the shouting and poor sound quality (both expected at that price), it really impressed me
As I said, with the stock firmware and the MCP connection to HA activated, it can handle my house and requests WAY better than any of my many Echo devices (I have 4 different versions)
What bugs me is the cloud part. It's not all bad, but I don't want to be stuck with it
I use Perplexity and Eleven Labs in my pipeline now, but that's MY choice
The only major bugs I've found were the lack of media playback (with that speaker, almost a bonus) and the impossibility of interrupting a response by calling it again
Upon further inspection, the latter turned out to be because of the single-bus design (despite the hardware having 2). The only project that could handle that issue was meconiotech's, and here we are
And I guess, once this is all done, it should be merged into ESPHome main, because it enables so much functionality and unlocks a lot of power on such cheap devices
I think so too, you know? As far as I know, the V2 and V3 I have aren't that different. On my V3, this problem doesn't occur.
Don't mean to pollute this thread but...
@tomcat I have 3 of those Wavesharess...
I just turned them upside down.
Camera, display and audio all work fine.
There's a thread for them here...
After further investigation, my problem seems to be the media player not sending the max_bit_depth to Music Assistant
Adding bits_per_sample to the speaker settings makes the player stream indefinitely, but there's still no sound and the same error at the end
The speaker, mixer, media player, mww and va configs seem to be OK
Voice replies (TTS) and wake word detection work 100% of the time after @gtjoseph's PR was merged. Only the music stream from Music Assistant is buggy
speaker:
  - platform: i2s_audio_duplex
    id: i2s_audio_speaker
    i2s_audio_duplex_id: i2s_duplex
    sample_rate: 16000
    audio_dac: es8311_dac
    num_channels: 1
  - platform: mixer
    id: mixer_speaker_id
    output_speaker: i2s_audio_speaker
    source_speakers:
      - id: announcement_spk_mixer_input
        timeout: never
      - id: media_spk_mixer_input
        timeout: never
  - platform: resampler
    id: announcement_spk_resampling_input
    output_speaker: announcement_spk_mixer_input
  - platform: resampler
    id: media_spk_resampling_input
    output_speaker: media_spk_mixer_input
media_player:
  - platform: speaker
    name: None
    id: external_media_player
    task_stack_in_psram: true
    #codec_support_enabled: true
    volume_initial: 70%
    media_pipeline:
      speaker: media_spk_resampling_input
      num_channels: 1
      format: FLAC
      sample_rate: 16000
    announcement_pipeline:
      speaker: announcement_spk_resampling_input
      format: FLAC
      sample_rate: 16000
      num_channels: 1 # S3 Box only has one output channel
micro_wake_word:
  id: mww
  microphone: i2s_mics
  stop_after_detection: false
  models:
    - alexa
  on_wake_word_detected:
    - if:
        condition:
          voice_assistant.is_running:
        then:
          voice_assistant.stop:
        # Stop any other media player announcement
        else:
          - if:
              condition:
                media_player.is_announcing:
              then:
                - media_player.stop:
                    announcement: true
              else:
                # Start the voice assistant
                - voice_assistant.start:
                    wake_word: !lambda return wake_word;
voice_assistant:
  id: va
  microphone: i2s_mics
  media_player: external_media_player
  #speaker: announcement_spk_resampling_input
  micro_wake_word: mww
  #noise_suppression_level: 2
  use_wake_word: false
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  on_client_connected:
    - micro_wake_word.start:
  on_client_disconnected:
    - voice_assistant.stop:
Can you guys tell me the versions you're running?
I suspect a release bug, especially in ESPHome Builder or Music Assistant (but add-ons are hard to roll back on HA)
I'm on:
Core: 2026.1.3
Supervisor: 2026.01.1
HAOS: 17.0
ESPHome Builder: 2026.1.2
Music Assistant: 2.7.5
Installation method Home Assistant OS
Core 2026.1.3
Supervisor 2026.01.1
Operating System 15.2
ESPHome 2026.1.2
Music Assistant 2.8.0b9, but I don't really use it






