[Guide] Train a Custom French Wake Word for Home Assistant with OpenWakeWord (Colab)

:rocket: Tired of “Hey Google” or “Alexa”?
Train your own French wake word (e.g., “Alexa”, “Jarvis”, “Lumière”) for offline detection using OpenWakeWord + Piper TTS, directly in Google Colab!

:white_check_mark: No local setup needed (runs in the cloud)
:white_check_mark: Supports French (with the fr_FR-upmc-medium voice model)
:white_check_mark: Export to .tflite for Home Assistant integration


:pushpin: Prerequisites

  • A Google account (to use Colab).

:hammer_and_wrench: Step 1: Open the Colab Notebook

  1. Open the Wake Word Training Environment.
  2. Click “Show code” (under the first cell) to reveal the script.

:wrench: Step 2: Replace the Code for French Support

Copy-paste this entire block to replace the default code:
(The script downloads a French Piper TTS model and generates a test sample of your wake word; resampling to 16 kHz is handled later by the modified train.py.)

# @title  { display-mode: "form" }
# @markdown # 1. Test Example Training Clip Generation
# @markdown Since openWakeWord models are trained on synthetic examples of your
# @markdown target wake word, it's a good idea to make sure that the examples
# @markdown sound correct. Type in your target wake word below, and run the
# @markdown cell to listen to it.
# @markdown
# @markdown Here are some tips that can help get the wake word to sound right:

# @markdown - If your wake word isn't being pronounced in the way
# @markdown you want, try spelling out the sounds phonetically with underscores
# @markdown separating each part.
# @markdown For example: "hey siri" --> "hey_seer_e".

# @markdown - Spell out numbers ("2" --> "two")

# @markdown - Avoid all punctuation except for "?" and "!", and remove unicode characters

import os
import sys
from IPython.display import Audio
if not os.path.exists("./piper-sample-generator"):
    !git clone https://github.com/rhasspy/piper-sample-generator
    !wget -O piper-sample-generator/models/fr_FR-upmc-medium.onnx 'https://huggingface.co/rhasspy/piper-voices/resolve/main/fr/fr_FR/upmc/medium/fr_FR-upmc-medium.onnx'
    !wget -O piper-sample-generator/models/fr_FR-upmc-medium.onnx.json 'https://huggingface.co/rhasspy/piper-voices/resolve/main/fr/fr_FR/upmc/medium/fr_FR-upmc-medium.onnx.json'
    # Install Python dependencies
    !pip install piper-tts piper-phonemize-cross
    !pip install webrtcvad
    !pip install 'torch<=2.5' torchvision torchaudio

if "piper-sample-generator/" not in sys.path:
    sys.path.append("piper-sample-generator/")

import generate_samples

target_word = 'alexa' # @param {type:"string"}

def text_to_speech(text):
    generate_samples.generate_samples_onnx(text = text,
                max_samples=1,
                length_scales=[1.1],
                noise_scales=[0.7], noise_scale_ws = [0.7],
                output_dir = './', batch_size=1, auto_reduce_batch_size=True,
                file_names=["test_generation.wav"],
                model="piper-sample-generator/models/fr_FR-upmc-medium.onnx"
                )

text_to_speech(target_word)
Audio("test_generation.wav", autoplay=True)

What this does:

  • Downloads the French Piper TTS model (fr_FR-upmc-medium).
  • Generates a test audio clip of your wake word.
  • Plays the clip automatically for preview.
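To confirm the clip was actually written, and to see which sample rate Piper produced (useful context for the resampling note in Step 4), a quick check with Python's standard `wave` module can help. This is only a minimal sketch: `describe_wav` is a helper name introduced here and is not part of the notebook.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / float(rate)
    return rate, channels, duration


# Example, to run after the cell above has produced the clip:
# rate, channels, seconds = describe_wav("test_generation.wav")
# print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
```

If the reported rate is not 16000, the clip needs resampling before OpenWakeWord can train on it.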

:microphone: Step 3: Test Your Wake Word

  1. Replace 'alexa' in target_word with your desired wake word (e.g., 'jarvis', 'lumière').
  2. Run the cell (:arrow_forward: button).
  3. Listen to the generated clip. If it sounds wrong:
     • Try phonetic spelling (e.g., 'j_a_r_v_i_s' for “Jarvis”).
     • Adjust length_scales or noise_scales for clearer audio.
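To compare several length_scales and noise_scales values side by side instead of editing the cell repeatedly, something like the following could work. It is a sketch under assumptions: `build_variants` and `render_variants` are hypothetical helpers, and `render_variants` simply reuses the `generate_samples_onnx` call with the same keyword arguments as the cell above.

```python
from itertools import product


def build_variants(word, length_scales, noise_scales):
    """List one settings dict per (length_scale, noise_scale) combination."""
    variants = []
    for length, noise in product(length_scales, noise_scales):
        variants.append({
            "text": word,
            "length_scale": length,
            "noise_scale": noise,
            "file_name": f"test_{word}_len{length}_noise{noise}.wav",
        })
    return variants


def render_variants(variants, model_path, output_dir="./"):
    """Render each variant using the same call as the test cell above."""
    # Imported lazily: only available once piper-sample-generator is on sys.path.
    import generate_samples
    for v in variants:
        generate_samples.generate_samples_onnx(
            text=v["text"], max_samples=1,
            length_scales=[v["length_scale"]],
            noise_scales=[v["noise_scale"]], noise_scale_ws=[0.7],
            output_dir=output_dir, batch_size=1, auto_reduce_batch_size=True,
            file_names=[v["file_name"]], model=model_path,
        )


# Example:
# variants = build_variants("jarvis", [1.0, 1.1, 1.2], [0.5, 0.7])
# render_variants(variants, "piper-sample-generator/models/fr_FR-upmc-medium.onnx")
```

Each clip is written under a file name that encodes its settings, so you can listen to them in turn and keep the values that sound best.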

:open_file_folder: Step 4: Modify the Training Script

  1. In Colab’s file explorer (:file_folder: icon on the left), navigate to:
     openwakeword/openwakeword/train.py
  2. Double-click to open the file, then replace its entire content with this optimized script (see below for key improvements).

Link to the modified script on Pastebin
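If you would rather not paste the file by hand, the replacement can be sketched as a small cell that keeps a backup of the original first. This is an illustration only: `replace_train_script` is a hypothetical helper, and `train_fr.py` stands for wherever you saved or uploaded the modified script.

```python
import shutil
from pathlib import Path


def replace_train_script(new_script, target="openwakeword/openwakeword/train.py"):
    """Back up `target` once, then overwrite it with `new_script`."""
    new_script, target = Path(new_script), Path(target)
    if not new_script.is_file():
        raise FileNotFoundError(f"modified script not found: {new_script}")
    if not target.is_file():
        raise FileNotFoundError(f"original script not found: {target}")
    backup = target.with_name(target.name + ".bak")
    if not backup.exists():  # never overwrite the first (original) backup
        shutil.copy2(target, backup)
    shutil.copy2(new_script, target)
    return backup


# Example (train_fr.py is a hypothetical file name):
# replace_train_script("train_fr.py")
```

Keeping a `.bak` copy makes it easy to restore the stock train.py if the modified version misbehaves.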

Key Improvements in the New train.py:

| Feature | Why It Matters |
| --- | --- |
| French TTS support | Uses fr_FR-upmc-medium.onnx by default. |
| Automatic resampling | Ensures all clips are 16 kHz (required for OpenWakeWord). |

:warning: Note: The modified train.py automatically resamples audio to 16 kHz. If you skip replacing train.py, training will fail!
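To make the resampling idea concrete, here is a deliberately naive sketch using only the standard library. It is not the code from the modified train.py (which is only linked above). It uses plain linear interpolation with no anti-aliasing filter, so treat it as an illustration of the rate conversion, not as something to train on. `resample_linear` and `resample_wav` are names introduced here, and the input is assumed to be 16-bit mono PCM.

```python
import array
import sys
import wave

TARGET_RATE = 16000  # sample rate OpenWakeWord expects


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample mono samples by linear interpolation (no anti-alias filter)."""
    if src_rate == dst_rate or len(samples) == 0:
        return list(samples)
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional index into the source
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def resample_wav(src_path, dst_path, dst_rate=TARGET_RATE):
    """Read a 16-bit mono WAV, resample it, and write it at dst_rate."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        src_rate = src.getframerate()
        pcm = array.array("h")
        pcm.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":             # WAV data is little-endian
        pcm.byteswap()
    resampled = resample_linear(pcm, src_rate, dst_rate)
    out = array.array("h", (max(-32768, min(32767, int(round(s)))) for s in resampled))
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(dst_rate)
        dst.writeframes(out.tobytes())
    return dst_rate
```

For real training data, a filtered resampler from an audio library is the safer choice, because downsampling without a low-pass filter can introduce aliasing.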

:rocket: Step 5: Train Your Model

  1. Run the notebook’s Step 1 and Step 2 cells in Colab to generate positive/negative samples.
  2. Run the notebook’s Step 3 cell to start training. The script will:
     • Save the best model as .onnx.
     • Convert it to .tflite (compatible with Home Assistant).
  3. Download the .tflite file from Colab’s file explorer (look in output_dir).
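If the exported file is hard to spot in the file explorer, a short cell can list the model files for you. This is a sketch: `find_exported_models` is a hypothetical helper, and `"my_custom_model"` below is a placeholder for whatever output_dir your training configuration uses.

```python
from pathlib import Path


def find_exported_models(output_dir, suffixes=(".tflite", ".onnx")):
    """Return exported model files under output_dir, newest first."""
    root = Path(output_dir)
    found = [p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes]
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)


# Example ("my_custom_model" is a placeholder for your output_dir):
# for path in find_exported_models("my_custom_model"):
#     print(path, path.stat().st_size, "bytes")
#
# In Colab, a file can then be downloaded from a cell with:
# from google.colab import files
# files.download(str(path))
```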

:wrench: Step 6: Integrate with Home Assistant

Place the .tflite file in your Home Assistant’s

:earth_africa: Next Steps (To-Do List)

| Task | Status | Notes |
| --- | --- | --- |
| Translate adversarial phrases | :warning: Pending | Replace English negatives with French equivalents. |
| Automate train.py replacement | :hammer_and_wrench: In Progress | Script to auto-replace the file in Colab. |
| Multi-language support | :bulb: Planned | Add a dropdown menu to select voice models (French, German, Spanish, etc.). |
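As a starting point for the multi-language item, the two download URLs could be derived from a voice identifier instead of being hard-coded. This is a sketch that assumes other voices follow the same path layout as the French URLs used in Step 2; `piper_voice_urls` is a hypothetical helper, and any voice identifier other than fr_FR-upmc-medium should be checked against the piper-voices repository before use.

```python
BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/main"


def piper_voice_urls(voice):
    """Build (.onnx, .onnx.json) URLs for a voice id like 'fr_FR-upmc-medium'."""
    parts = voice.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected '<locale>-<name>-<quality>', got {voice!r}")
    locale, name, quality = parts
    family = locale.split("_")[0]  # e.g. 'fr' from 'fr_FR'
    stem = f"{BASE_URL}/{family}/{locale}/{name}/{quality}/{voice}"
    return f"{stem}.onnx", f"{stem}.onnx.json"


# Example:
# model_url, config_url = piper_voice_urls("fr_FR-upmc-medium")
```

A Colab dropdown (`# @param [...]`) could then feed the selected voice identifier into this function.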

Thank you for your feedback!
To be continued…


Here is the French Colab:
