Hi,
There have been a few conversations around here discussing how to get the “Assistant Microphone” to send audio to places other than the locally attached speaker (if there even is one attached). The solutions to this all require you to get your hands dirty which may not be for everyone. If you just want to load an add-on, configure it and get on with your day then perhaps I’ve got something for you.
What it is: a fork of the original HA addon. Feel free to verify the differences.
What you’ll have to do:
- uninstall the HA provided Assistant Microphone
- install a custom addon repository
- install the one and only addon in that repository
- read the docs, they’re good
- configure the addon
- start the addon
- go on your merry way
No need to edit code or anything more complex than what you may be used to.
Everything is done through the HA GUI.
If that sounds appealing, here’s what the add-on offers: it sends the text that would otherwise be processed by the text-to-speech engine to a Home Assistant webhook. From there you can pick it up in an automation. The docs provide an example, don’t worry.
Let’s get on with it then, shall we?
Installing the custom addon repository
Go to “Settings > Add-ons”. On the Add-ons page, look for the “Add-on Store” icon in the bottom right corner. On the Add-on Store page, click the three little dots in the top right-hand corner and select “Repositories” from the drop-down. In the “Manage add-on repositories” pop-up, add the following repository:
https://github.com/AlfredJKwack/ha-core-addons
Once loaded, you should have a new “Assistant Microphone” addon with a slightly different name to distinguish it from the original. Go ahead and install that one.
Now for the configuration.
For that I suggest you do read the documentation on GitHub. To put it briefly: leave sound_enabled on; turn synthesize_using_webhook on; then configure an automation that uses the webhook platform as a trigger. With that done you should be good to start the addon. The first time, you may be notified of a new integration; do go and set that up under devices. With that done you can go ahead and test things out. The logs will tell you everything there is to know.
Here’s an example of the automation you can run.
```yaml
alias: Satellite response
description: ""
trigger:
  - platform: webhook
    allowed_methods:
      - POST
      - PUT
    local_only: true
    webhook_id: "synthesize-assist-microphone-response" # This must match the webhook_id in the add-on configuration
condition: []
action:
  - service: telegram_bot.send_message
    metadata: {}
    data:
      message: "{{ trigger.json.response }}" # This is how you catch whatever the add-on sent
      title: Mycroft said
  - service: tts.cloud_say
    data:
      entity_id: media_player.name # Don't forget to change this to your own media player
      cache: false
      message: "{{ trigger.json.response }}" # This is how you catch whatever the add-on sent
mode: single
```
If you run into trouble
Turn on debugging, restart the addon and check the addon logs. You should see entries like:
```
DEBUG:root:Wake word detected
…
DEBUG:root:Event(type='synthesize', data={'text': "Sorry, I couldn't understand that", 'voice': {'name': 'NatashaNeural'}}, payload=None)
…
[01:35:30] INFO: Successfully sent text to webhook.
```
If you don’t see the last one of those, you may have an issue with the webhook. Look for an entry like the one below and verify that the URL is reachable.
```
[01:35:30] INFO: Webhookurl set to : xxxxxxx
```
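If you want to rule out the automation side entirely, you can also poke the webhook by hand. Here’s a minimal Python sketch; the `homeassistant.local:8123` address is an assumption (swap in your own HA address), and the `webhook_id` must match the one in the add-on configuration. The payload uses the same `response` key the automation reads via `trigger.json.response`.

```python
import json
import urllib.error
import urllib.request

# Assumed address -- replace with your own Home Assistant host/port.
HA_URL = "http://homeassistant.local:8123"
# Must match the webhook_id in the add-on configuration and the automation.
WEBHOOK_ID = "synthesize-assist-microphone-response"

# Same payload shape the add-on sends: the automation reads
# {{ trigger.json.response }}.
payload = {"response": "Hello from the webhook test"}

req = urllib.request.Request(
    f"{HA_URL}/api/webhook/{WEBHOOK_ID}",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

try:
    with urllib.request.urlopen(req, timeout=5) as resp:
        print("Webhook responded with HTTP", resp.status)
except (urllib.error.URLError, OSError) as err:
    print("Webhook not reachable:", err)
```

If the automation fires when you run this, the webhook path is fine and the problem is on the add-on side; if not, check `local_only` and the `webhook_id` spelling first.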
What about limitations?
Yes there are some.
- This does not cover the Awake, Done, or Timer Finished sounds. They go where they always went.
- You have to leave sound_enabled turned on. Turning it off stops the speech synthesis events from firing, and we rely on those. There is a PR for that upstream.
- I think you pretty much end up doing speech synthesis twice: once for the device Assistant Microphone is running on, and once for wherever your automation takes your fancy. The reason I’m not sure is that I have no speaker on the assistant microphone (oh, the irony).
- I’ve done my best to discover the HA service IP correctly, but who knows: someone’s bound to be using IPv6 or a dual-homed system, and neither is covered.
So without further ado: if this sounds like your kind of jam, give this thing a spin and let me know what you think. I’m thinking of proposing a pull request to the mothership as well, but would like to see a bit of traction first to tell me whether it’s good or bad, the wrong direction, etc.