Fixing "Denglisch" – A Llama-3.1 Pre-Processor for smooth German‑English Piper TTS 🚀

Hi everyone,

Like many of you using Home Assistant’s "Year of the Voice" architecture with local German TTS voices (like the awesome Piper/Thorsten voice), I ran into a frustrating roadblock.

In today’s world, we rarely speak 100% pure German. Our smart homes and chatbots constantly feed us "Denglisch" or pure English phrases (e.g., "Weil der Download laggy war, wurde er asap gecancelt..." or technical sensor data like "...at 68°F with a latency of 10ms").

Whenever a native German TTS voice hits these English words, it applies German pronunciation rules to them, and the result is often hard to understand.

To fix this for my own setup, I trained a small Llama-3.1-8B model that acts as a phonetic frontend and handles code-switching in context. It leaves German text untouched but respells English words and complex units in German phonetic spelling, so Piper can finally pronounce them without breaking character.

What it does (Quick Examples):

1. Everyday "Denglisch":

  • Input: Weil der Download laggy war, wurde er asap gecancelt und vom project leader rescheduled.
  • Output: Weil der Daunlohd läggi war, wurde er äißap gekäntselt und vom Proddschäkt Lieder riskäddiuhlt.

2. Handling German & English Units & Abbreviations:

English example:

  • Input: The speaker has an impedance of 8Ω. The blood pressure rose to about 160/90mmHg while we were moving at an average speed of 45mph, covering an area of almost 200 km². The battery capacity of our EV measured 120kWh at 800V and a maximum charging current of 450A.
  • Output: Se Spieker häss än Impiehdens off Äit Ohm. Se Bladd Präscher rohs tu äbaut Wonn-Handred-ßixti tu Neihnti Milli Mieters off Mörkiuhri wail wi wöhr muhwing ätt än äwwridsch Spied off Fohrti-Feihw Meils pör Auer, kawwering än Ärria off orlmohst Tuh-Handred Squähr Kilommieters. Se Bätteri Käppässiti off auer Ie Wie mäschert Wonn-Handred-Twännti Kilo Wott Auers ätt Äit-Handred Wollts änd ä mäximum tschardsching Körrent off Fohr-Handred-Fifti Ämps.
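
For anyone wondering where such a pre-processor sits in the chain, here is a minimal sketch of the idea: rewrite the text first, then hand the result to the TTS voice. Everything in it is an assumption for illustration. `prepare_for_tts`, `toy_rewriter` and the lookup table are hypothetical names, and the table is only a stand-in built from the first example above. The actual model works in context and is not a dictionary, and the call into Piper is left out on purpose.

```python
import re
from typing import Callable

# Toy lookup copied from the "Denglisch" example in this post.
# Assumption: this only imitates the model for four words.
TOY_RESPELLINGS = {
    "download": "Daunlohd",
    "laggy": "läggi",
    "asap": "äißap",
    "gecancelt": "gekäntselt",
}


def toy_rewriter(text: str) -> str:
    """Hypothetical stand-in for the model: swap known words, keep the rest."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        return TOY_RESPELLINGS.get(word.lower(), word)

    # [^\W\d_]+ matches runs of letters, including umlauts and ß.
    return re.sub(r"[^\W\d_]+", swap, text)


def prepare_for_tts(text: str, rewriter: Callable[[str], str] = toy_rewriter) -> str:
    """Run the phonetic frontend; fall back to the raw text if it fails."""
    try:
        rewritten = rewriter(text)
    except Exception:
        return text
    # An empty result is treated as a failure so the speaker never stays silent.
    return rewritten if rewritten.strip() else text


if __name__ == "__main__":
    print(prepare_for_tts("Weil der Download laggy war, wurde er asap gecancelt."))
```

Passing the rewriter in as a plain callable means the toy table can later be swapped for a real call to the model without touching the rest, and the fallback keeps the original sentence if the frontend errors out or returns nothing.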

I have just released the model as an alpha‑0.1 version. Since I am a huge fan of the open-source spirit here, I wanted to share it with the Home Assistant community and see whether it is as useful to you as it is to me.

I would love to get your feedback! Please try it out and let me know whether it helps your smart home speakers sound a bit more natural when dealing with mixed-language responses.

Best regards,
Sven (RudraChakrin)
