There are FOSS options that are starting to work completely locally. Mycroft One works via the cloud, but together with Mozilla they have created good local TTS, and STT works in demos of Mycroft Two controlling HASS directly, all locally:
Mycroft One and the HASS skill work well for me, but they use the cloud. Upgrading to an RPi4 with Mimic 3 might make it completely local, but Mycroft Two is on the way.
The biggest downside is that FOSS hardware development doesn't have the resources of FANG, so the sound quality is worse and the cost is higher.