Hardware for real-time TTS and STT

Can anyone recommend GPU hardware that can process STT and TTS in real time? I want an experience at least as good as Google Home or Alexa devices, but completely locally controlled. I want to use both a custom wake word and a custom voice.


Yes, with HA Voice PE this becomes a particularly interesting question that many might have (including me).