WTH do YouTube videos trigger the voice assistant?

My question is not so much why wake words in YouTube videos trigger our devices, but rather why we don’t have a way to stop them from triggering.

For example, why don’t we have some inaudible tone that causes the wake word to be ignored? YouTubers could talk without bleeping the wake word (just edit in the inaudible tone) and no one’s assistant would trigger.
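
To make the idea concrete, here is a rough sketch of how a device could check for such a tone before acting on a wake word. Everything specific in it is my own assumption and not something any assistant actually does: the 19 kHz marker frequency, the 48 kHz sample rate, the detection threshold, and the function names. It uses the Goertzel algorithm, which measures the energy at a single frequency.

```python
import math

# Assumed values for illustration only; no real assistant is known to use them.
SAMPLE_RATE_HZ = 48_000
SUPPRESS_TONE_HZ = 19_000   # hypothetical marker near the top of human hearing
THRESHOLD = 0.02            # hypothetical minimum tone amplitude to count as "present"


def make_tone(freq_hz, n_samples, amplitude, sample_rate_hz=SAMPLE_RATE_HZ):
    """Generate a sine wave as a list of float samples."""
    step = 2 * math.pi * freq_hz / sample_rate_hz
    return [amplitude * math.sin(step * n) for n in range(n_samples)]


def tone_amplitude(samples, freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Estimate the amplitude of one frequency using the Goertzel algorithm."""
    coeff = 2 * math.cos(2 * math.pi * freq_hz / sample_rate_hz)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    # Scale so that a pure sine of amplitude A yields roughly A.
    return 2 * math.sqrt(max(power, 0.0)) / len(samples)


def should_ignore_wake_word(samples):
    """Return True if the hypothetical suppression tone is present."""
    return tone_amplitude(samples, SUPPRESS_TONE_HZ) >= THRESHOLD


# A 440 Hz tone stands in for speech; the marked clip adds the quiet marker tone.
n = SAMPLE_RATE_HZ // 10  # 100 ms of audio
speech = make_tone(440, n, amplitude=0.5)
marker = make_tone(SUPPRESS_TONE_HZ, n, amplitude=0.1)
marked_clip = [a + b for a, b in zip(speech, marker)]

print(should_ignore_wake_word(speech))
print(should_ignore_wake_word(marked_clip))
```

The detection side is cheap. The hard part is getting the tone through the video’s audio encoding, the viewer’s speakers, and the room to the device’s microphone.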

I can see a few problems with this.

  1. It would rely on YouTubers bothering to insert this tone into their videos. From experience, I think only a vanishingly small number of them would.

  2. It could not be applied retroactively to older videos.

  3. The speakers you’re using for playback might not be able to reproduce the tone in question. Human hearing spans roughly 20 Hz to 20 kHz, but most speakers do not cover the entire range, or do so unevenly. A cell phone speaker, for example, certainly would not be able to reproduce tones at the lower end of the range.
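
To put rough numbers on point 3, here is a small sketch of the check involved. The speaker ranges are illustrative guesses rather than measurements of any real device, and the function name is made up. The one hard limit is that digital audio cannot carry a tone at or above half its sample rate (the Nyquist frequency).

```python
def tone_survives_playback(tone_hz, sample_rate_hz, speaker_low_hz, speaker_high_hz):
    """Return True if the tone fits in the audio stream and the speaker's range."""
    nyquist_hz = sample_rate_hz / 2  # highest frequency the stream can represent
    fits_in_stream = tone_hz < nyquist_hz
    fits_in_speaker = speaker_low_hz <= tone_hz <= speaker_high_hz
    return fits_in_stream and fits_in_speaker


# Illustrative (assumed) ranges: a small phone speaker versus decent desktop speakers.
phone_ok = tone_survives_playback(19_000, 48_000, speaker_low_hz=300, speaker_high_hz=15_000)
desktop_ok = tone_survives_playback(19_000, 48_000, speaker_low_hz=50, speaker_high_hz=20_000)
# A truly ultrasonic 25 kHz tone cannot even be stored in 44.1 kHz audio.
ultrasonic_ok = tone_survives_playback(25_000, 44_100, speaker_low_hz=50, speaker_high_hz=30_000)

print(phone_ok, desktop_ok, ultrasonic_ok)
```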

A better solution, and the one I think is most likely, is to make wake words customizable. Or, maybe less likely but certainly possible, train the voice assistant to recognize the voices of the people in your household and only respond when those particular voices say the wake word.

The other two voice assistant platforms have been dealing with this for a while. Most notably, there was a “South Park” TV episode where one of the characters said the wake words for their devices repeatedly throughout the episode, not to mention TV commercials and the like. What they do is train the model to recognize that exact voice and then tell the device to ignore it.

I will say that most YouTubers already do a fairly good job of avoiding the wake words in their videos, from “She who shall not be named”, to distortion, to just outright muting them.