TTS with music playing in the background

I’m not new to HA, but I’ve never done anything with media, TTS, or music, since I didn’t have any speakers to use. Now I’ve got a free Bluetooth speaker and started wondering what to do with it. My idea is to use it with my wake-up automation and set up something similar to what Tony had in Iron Man: when the wake-up automation runs, music starts playing and Jarvis reads out some text at the same time (time, weather, etc.).

I’ve read that HA unfortunately doesn’t support voice encoding, so I’d probably need a server for this. That’s not an issue, as I have a home server where I can set up a new VM; I’m just not sure what I need. I’m also not sure whether it is possible to play music and text-to-speech at the same time. Has anyone done anything similar? Is it possible? If so, how can I achieve something like this?