I’ve been searching for a hardware solution for building an ESPHome-based media player, for a decent but simple notification system.
It must be able to use the TTS service, play WAV and/or MP3 files, and be ESPHome-based.
I tried to build one with an ESP32 and an external DAC (MAX98357A), but it only resulted in poor audio quality. I also tried an ESP32-S2, but have not been able to program it yet.
A Raspberry Pi Zero W with a HifiBerry DAC+ Zero might be a better solution.
You can run a simple Linux there and have all the options of a complete OS, including Snapcast and the like.
Thanks for the idea!
Although I understand the extra possibilities, I still prefer a solution based on ESPHome (ESP8266 or ESP32). Why? Simplicity, power consumption, no failing SD cards after a power outage, no OS to maintain, etc.
You would need a decent audio board, like a PicoAudio.
But that requires a TinyPico ESP32 development board too, because ESP boards do not have a standard for pinouts.
Then you would have to do some coding to make libraries that link to ESPHome before you can start to tinker with the ESPHome setup.
There is nothing special about the Muse Proto other than a screw terminal instead of just the pins, and an onboard speaker.
The rest of the board is just the traditional ESP32 dev board with no extra DAC or anything else.
The output here is pretty bad and only mono. Nowhere near the typical 24-bit / 96 kHz the PicoAudio provides, and even further from the 24-bit / 192 kHz the HifiBerry cards provide for a raspi.
I have played with the combination of an ESP32-WROOM-32 and a MAX98357A as well and discovered the following:
It seems to be very dependent on the speaker you use.
The first one I tried was some 1€ China speaker (3W/4Ohm) and the result was awful most of the time. I had massive crackling and sometimes complete silence in the middle of a song.
I then tested it with a Philips speaker I had lying around. No idea what wattage it needs, but it worked out of the box with decent sound.
Conclusion: Try another speaker if you have one at hand.
Go ahead and experiment away. Would absolutely love to hear what your result is.
The only thing I haven’t gotten around to yet is the response volume. I get yelled at when TTS pushes the response. Maybe I need some sort of potentiometer in between? Hopefully I get some time next weekend to solve this.
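Before adding a potentiometer, it may be worth lowering the volume in software. A minimal sketch, assuming the ESPHome media player entity supports volume control: it builds a call to Home Assistant's REST endpoint for the `media_player.volume_set` service. The host name, token, and entity ID below are made-up placeholders, not values from this thread.

```python
import json
import urllib.request


def build_volume_request(base_url, token, entity_id, volume_level):
    """Build (but do not send) a REST call to media_player.volume_set."""
    if not 0.0 <= volume_level <= 1.0:
        raise ValueError("volume_level must be between 0.0 and 1.0")
    payload = {"entity_id": entity_id, "volume_level": volume_level}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/media_player/volume_set",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values; replace with your own host, token and entity ID.
request = build_volume_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    "media_player.esp32_speaker",
    0.3,
)
# To actually send it: urllib.request.urlopen(request)
```

Calling this right before the TTS service should make the response quieter, provided the player honours the volume setting.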
Same mileage here. I used some random tiny speaker which was only acceptable on a very low sound level or the audio was distorted.
Then I salvaged a speaker from an old pair of computer speakers and now it’s even good for playing music at a louder level.
A Raspberry Pi with a dedicated DAC HAT that I used previously doesn’t sound any better. I highly suggest an ESP with a good speaker, as it is much more versatile!
The Nest is also an interesting option! Thanks for that idea.
I wonder if this works purely locally, or always uses Google/the cloud to play sound, even if the sound originates from a local resource?
For now I got it working with an ESP32 WROOM devkit board, indeed with a bigger speaker. I connected the GAIN pin of the MAX98357 to GND, as a balance between quality and volume.
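For anyone wondering what the gain strapping does: a small sketch of the gain options as I remember them from the MAX98357A datasheet (please verify against the datasheet or your breakout board's documentation), plus a helper that converts a dB difference into a voltage ratio. The dictionary keys are my own labels.

```python
# Gain per GAIN-pin strapping, as recalled from the MAX98357A datasheet.
# Assumption: double-check these values for your specific breakout board.
GAIN_DB = {
    "GND via 100k": 15,
    "GND": 12,
    "floating": 9,
    "VDD": 6,
    "VDD via 100k": 3,
}


def db_to_voltage_ratio(db):
    """Convert a gain in dB to a linear voltage ratio."""
    return 10 ** (db / 20)


def relative_level(from_strap, to_strap):
    """Output voltage ratio when changing from one strapping to another."""
    return db_to_voltage_ratio(GAIN_DB[to_strap] - GAIN_DB[from_strap])


if __name__ == "__main__":
    for strap, db in GAIN_DB.items():
        print(f"{strap:>13}: {db:2d} dB, x{db_to_voltage_ratio(db):.2f}")
```

So going from GAIN tied to GND to leaving it floating should drop the output voltage to roughly 70%, if the table above is right.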
The board is quite big, so I’m looking for a smaller board that can do the job. I tried the ESP32-S2, but could not get them programmed at all, and I also saw the warning that these are still experimental…
Somewhere I should have some ESP32-CAM boards lying around, will see if these can do the job.
After some fiddling I got it working: ESP32-S2 with MAX98357.
Programming the ESP succeeded with esptool.py. I had to trigger programming mode with the buttons (hold the O button, push and release RST, then release the O button). After the first flash, Wi-Fi OTA works fine.
At first the sound was distorted, but after resampling the MP3 files to 32 kHz (with Audacity) the sound is good! Also, TTS works fine, if you use a decent speaker of course.
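If there are many files, the resampling can also be scripted. This is only a sketch and swaps Audacity for the `ffmpeg` command-line tool (assumed to be installed and on the PATH); the file name and the `_32k` suffix are just examples.

```python
import subprocess
from pathlib import Path


def build_resample_command(src, sample_rate_hz=32000):
    """Return the ffmpeg command that resamples src to the given rate."""
    src = Path(src)
    # Example naming scheme: doorbell.mp3 -> doorbell_32k.mp3
    dst = src.with_name(f"{src.stem}_{sample_rate_hz // 1000}k{src.suffix}")
    return ["ffmpeg", "-i", str(src), "-ar", str(sample_rate_hz), str(dst)]


def resample(src, sample_rate_hz=32000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_resample_command(src, sample_rate_hz), check=True)


if __name__ == "__main__":
    # Print the command for a hypothetical file instead of running it.
    print(" ".join(build_resample_command("doorbell.mp3")))
```

Calling `resample()` for each MP3 in a folder would convert them all in one go.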
Nice, Gerben. I am looking for a relatively simple solution to create a speaker for my alarm, and maybe use it as a media player when the alarm is not needed.
It would be nice if you could share in more detail which components you used. For instance, the MAX98357 is available in several versions, like the one from DFRobot or the ones on AliExpress. And hopefully a Wemos ESP module will also work!
I have excellent sound quality with these components at the level of hifi systems.
The MAX98357A is a high-quality digital class-D amplifier. You need to pay attention to the circuitry, and then you will get a great result with it.
This is a sample playback recorded with a smartphone:
Of course the smartphone microphone noticeably degrades the quality; it sounds better live.
Played on this small desktop speaker:
I have built two speakers with voice assistant function using this manual.
Use wires that are as short as possible, preferably shielded or twisted pair, to avoid crackling and interference. Don’t forget that this is a high-frequency digital data bus.
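To put a number on "high frequency": on an I2S bus the bit clock is the sample rate times the bits per channel slot times the number of channels. A small sketch, assuming a standard stereo I2S frame; the slot width your firmware actually uses may differ (16 or 32 bits are common), so treat the defaults as assumptions.

```python
def i2s_bit_clock_hz(sample_rate_hz, bits_per_slot=16, channels=2):
    """Bit clock (BCLK) frequency of an I2S bus in Hz."""
    return sample_rate_hz * bits_per_slot * channels


if __name__ == "__main__":
    for rate in (32000, 44100, 48000):
        for bits in (16, 32):
            mhz = i2s_bit_clock_hz(rate, bits) / 1e6
            print(f"{rate} Hz, {bits}-bit slots: BCLK = {mhz:.3f} MHz")
```

Even at 32 kHz with 16-bit slots that is a clock of about 1 MHz, which is why long unshielded jumper wires can cause trouble.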
As an enclosure, old computer speakers or speakers from 5.1 systems work well.
Thank you. The components are in house now. How do I connect them? Do you have a schematic of this solution? That is the only thing I still need to get it working.