Speaker vs Media Player components?

From reading the platform description, the difference between the two I2S audio components, speaker and media player, is not clear to me.
Am I right that speaker is only for TTS, while media player can handle both TTS and multimedia streams? Which platform is intended for which use cases?

P.S. In general, there are many places in the ESPHome documentation (and in Home Assistant's own) where a component's functionality is described only superficially, so it is not clear what it is meant to be used for, especially when there are components with similar functionality. In such cases it would be good to describe the differences in detail, as is done in the thermostats section.

Yeah this is unclear to me too.

Media player has all the stuff, the features and whatnot. It's mainly for streaming audio back and forth.

Speaker is more for local playback, like playing RTTTL buzzer tones from the ESP32 or using a DFPlayer where the music is stored on the SD card in the DFPlayer. The Speaker component doesn't stream audio like media player does.

Now, there is the I2S speaker component, which to my understanding falls under the regular Speaker component. It can play streamed media through the I2S audio component, but it only has a fraction of the controls that media player has: no volume up/down, volume set, toggle, or the other actions, triggers, and conditions. Speaker basically just has Play/Stop.
So media from HA goes to I2S audio and is then output via the I2S speaker.

Don't ask me why it's like that, I have no idea. This is just my understanding of all the audio/media pieces of ESPHome. I think you just have to assess your needs and wants, then use whichever are most applicable to your project.

I see examples of Voice Assistant using ‘i2s speaker’ only, ‘i2s Media Player’ only, and both.

I haven’t experimented with the Media Player component, but the I2S speaker was giving me a very low-frequency BCLK of 512 kHz, and only while the speaker was playing something. The mic BCLK seems fixed at 1.024 MHz.
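
For what it's worth, those two numbers are consistent with standard I2S framing, where BCLK = sample rate x bits per slot x 2 slots (left and right). This is only a rough sketch: the 16 kHz rate and the 16-bit / 32-bit slot widths are my assumptions, chosen because they reproduce the measured clocks, not values read from the ESPHome source, and `bclk_hz` is just an illustrative helper.

```python
def bclk_hz(sample_rate_hz: int, bits_per_slot: int, slots: int = 2) -> int:
    """Bit clock for standard I2S: one bit per BCLK cycle, `slots` slots per frame."""
    return sample_rate_hz * bits_per_slot * slots


# Assumed formats (guesses that happen to match the scope readings):
speaker_bclk = bclk_hz(16_000, 16)  # 16 kHz sample rate, 16-bit slots
mic_bclk = bclk_hz(16_000, 32)      # 16 kHz sample rate, 32-bit slots

print(speaker_bclk)  # 512000
print(mic_bclk)      # 1024000
```

If that reading is right, both sides would run the same LRCLK (the sample rate) and differ only in slot width, which is why the two BCLKs differ by a factor of two.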

Any idea if we can use the same BCLK and LRCLK signal for both mic and speaker/media player?

Any chance of picking a bitrate/clock frequency?

I am trying to reuse an old Insignia Smart Speaker and replace the streaming module/CPU with an ESP32-S3 N16R8 module. I only see a single 3.074 MHz BCLK for both the I2S mic and what I assume is the I2S amplifier. The mic works, but I can’t seem to get sound out of the amp. It could still be missing other signals.
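
To narrow down what format the original board might have been running, here is a small sketch that lists which common sample rate / slot width pairs land near a measured BCLK. It assumes standard two-slot I2S framing (BCLK = sample rate x bits per slot x 2); the candidate lists, the 0.5% tolerance, and the helper name `candidates_for_bclk` are all made up for illustration and are not part of ESPHome.

```python
# Hypothetical candidate lists; extend them if the hardware uses something else.
COMMON_RATES_HZ = (8_000, 16_000, 22_050, 32_000, 44_100, 48_000, 96_000)
SLOT_WIDTHS_BITS = (16, 24, 32)


def candidates_for_bclk(measured_hz, tolerance=0.005, slots=2):
    """Return (sample_rate, bits_per_slot) pairs whose BCLK is within
    `tolerance` (as a fraction) of the measured bit clock."""
    matches = []
    for rate in COMMON_RATES_HZ:
        for bits in SLOT_WIDTHS_BITS:
            expected = rate * bits * slots
            if abs(expected - measured_hz) <= tolerance * expected:
                matches.append((rate, bits))
    return matches


print(candidates_for_bclk(3_074_000))  # [(48000, 32), (96000, 16)]
```

Measuring LRCLK as well would settle it, since under the same assumption LRCLK equals the sample rate.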