Almond & Ada: privacy-focused voice assistant

TL;DR:

  • Teamed up with Almond, available in Home Assistant 0.102.
  • Introducing Ada, voice assistant powered by Home Assistant integrations. Available as Hass.io add-on.
  • New beta speech-to-text and text-to-speech service for Home Assistant Cloud subscribers.

Voice assistants are a great way to interact with your house, ask a quick question, set a timer or control your devices. The more an assistant knows about you, your home and its other inhabitants, the better it is able to help you.

Today’s available virtual assistants work great, but they have a big problem: They store your data in the cloud, don’t provide APIs to allow other companies to build products on top and are run by companies whose core business is building profiles on their users to help serve ads and product suggestions.

The backbone to our homes needs to be one that keeps data local and has APIs allowing other companies to build on top. Innovation happens when many different people, with many different backgrounds, do many different experiments until we find something that sticks. This cannot be left to a single company.

Recently we got in touch with the Open Virtual Assistant Lab at Stanford University. In the last four years, they have been working on a virtual assistant named Almond. And it’s a perfect match for Home Assistant.

Almond

Almond is an open-source, privacy-preserving virtual assistant. With Almond, you can run a virtual assistant at home that can tell you the news or control your house. It is powered by LUInet, a state-of-the-art neural network developed at Stanford. And it now works with Home Assistant.

The Almond team has updated Almond to make it aware of the different device types in Home Assistant and allow Almond to control them. In turn, we have upgraded the conversation integration in Home Assistant to support Almond, allowing users to converse with Almond via the frontend.

Screenshot showing Almond integration in Home Assistant.

Almond is available to users today in Home Assistant 0.102. It requires an Almond Server: you can install one yourself, use the new Almond Hass.io add-on, or rely on Almond Web, a cloud version hosted by Stanford. By default, Almond Server will rely on a cloud version of LUInet, but it is possible to run it locally.

Almond is set up in a way such that your privacy is still partially preserved even with LUInet running in the cloud. This is made possible because LUInet is only responsible for converting the text into a program, whose details are filled in locally by the Almond Server. For example, LUInet will convert “turn on the lights” into code that Almond Server understands. Only Almond Server will know which lights the user has, how to control them and the context of how the text was received.
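
The split described above can be sketched in Python. This is only an illustration and not Almond's actual code: `cloud_parse`, `LocalAlmondServer` and the `Program` format are hypothetical names invented here, and the real system uses its own program representation produced by LUInet. The point of the sketch is that the parsing side sees only the sentence, while the device list stays in the local process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    """Abstract program returned by the parser: no household details in it."""
    action: str        # e.g. "turn_on"
    device_type: str   # e.g. "light"


def cloud_parse(text: str) -> Program:
    """Stand-in for the cloud-hosted parser (hypothetical, rule-based here).

    It receives only the sentence and returns a device-agnostic program.
    """
    words = text.lower().split()
    if "lights" in words or "light" in words:
        if "on" in words:
            return Program("turn_on", "light")
        if "off" in words:
            return Program("turn_off", "light")
    raise ValueError(f"cannot parse: {text!r}")


class LocalAlmondServer:
    """Stand-in for the local server: the only place that knows the devices."""

    def __init__(self, devices: dict[str, str]):
        # Maps entity ID -> device type; this mapping stays on the local machine.
        self._devices = devices

    def execute(self, program: Program) -> list[tuple[str, str]]:
        """Fill in the abstract program with the concrete local devices."""
        return [
            (program.action, entity_id)
            for entity_id, device_type in sorted(self._devices.items())
            if device_type == program.device_type
        ]


server = LocalAlmondServer({
    "light.kitchen": "light",
    "light.porch": "light",
    "switch.fan": "switch",
})
calls = server.execute(cloud_parse("turn on the lights"))
```

Here `calls` ends up holding one `("turn_on", entity_id)` pair per light, and the switch is left alone. The entity IDs are made-up examples.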

How Almond compares to Google/Alexa

You’re probably wondering if Almond is as good as Alexa or Google. And it’s not yet as good. However, it doesn’t matter.

If you want to have an assistant in your home that knows everything about you, it needs to be one that cares about privacy. It needs to be one that is open. That’s not negotiable.

Almond has room for improvement. But it’s open source, and with the Home Assistant community, we’ll work with the Almond team on making it better. You can start helping right now:

Almond is gathering sentences that you want to use to control the devices in your home. We already have a basic set of sentences, but the more, the better. You can submit those sentences using this form.

You are also able to help train LUInet directly by teaching it how to interpret sentences in the training console.

Ada

Almond is not the full story. Almond only works with text input, and generates text as output. It doesn’t handle speech-to-text to receive input nor text-to-speech to speak answers. Those technologies are out of scope for Almond. However, not out of scope for Home Assistant! Home Assistant already has a text-to-speech integration with different backends. In Home Assistant 0.102, we’re introducing a new speech-to-text integration to complement this.

Now we almost have all the pieces for a voice assistant built into Home Assistant, and so we decided to finish it off by introducing a new project called Ada. Ada integrates hotword detection and will route all data to the various integrations to provide a full voice assistant experience.
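
The flow that Ada coordinates can be sketched as a small pipeline. This is a hedged illustration only: the `VoicePipeline` class and its four callables are hypothetical stand-ins, not Ada's real interfaces. In the actual setup, hotword detection lives in Ada and the other stages are handled by Home Assistant integrations.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Hypothetical wiring of the stages; each one is a pluggable callable."""
    detect_hotword: Callable[[bytes], bool]   # is the wake word in this audio?
    speech_to_text: Callable[[bytes], str]    # audio -> transcript
    converse: Callable[[str], str]            # transcript -> text reply
    text_to_speech: Callable[[str], bytes]    # text reply -> audio

    def handle(self, wake_audio: bytes, command_audio: bytes) -> Optional[bytes]:
        """Run one interaction; return reply audio, or None without the hotword."""
        if not self.detect_hotword(wake_audio):
            return None  # no hotword: skip the remaining stages
        transcript = self.speech_to_text(command_audio)
        reply = self.converse(transcript)
        return self.text_to_speech(reply)


# Toy stand-ins so the sketch runs without a microphone or any services.
pipeline = VoicePipeline(
    detect_hotword=lambda audio: audio == b"hey ada",
    speech_to_text=lambda audio: audio.decode("utf-8"),
    converse=lambda text: f"OK: {text}",
    text_to_speech=lambda text: text.encode("utf-8"),
)

ignored = pipeline.handle(b"background noise", b"turn on the lights")
answered = pipeline.handle(b"hey ada", b"turn on the lights")
```

Keeping each stage behind a plain callable mirrors the idea that the speech-to-text, conversation and text-to-speech backends can be swapped independently.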

Ada is still at a very early stage. We’ll be working on improving it. If you have expertise in this area and want to help, please get in touch.

Ada is also available as a Hass.io add-on. This means that you can plug a microphone and speakers into your Raspberry Pi and turn Hass.io into a full, privacy-focused, voice assistant.

To make it easier to add speech-to-text and text-to-speech integrations to your system, Nabu Casa is introducing a new beta service offering speech-to-text and text-to-speech services to Home Assistant Cloud subscribers, powered by Azure Cognitive Services.

Can a virtual assistant still be private if parts run in the cloud?

With Home Assistant we care about privacy and local control. We want to be able to offer home automation that keeps working if there is no internet. Home automation that is fast and reliable.

But we also want privacy to be accessible. A user should be able to get a private solution without running a big server at home. Privacy should not be something that is reserved for the rich.

With the current approach, some things will still run in the cloud, but the home data and control stay local. We will bring more things local when faster technology becomes more accessible or new projects emerge that can help with this.

We don’t want to wait with integrating this until all the pieces run 100% locally. We need to help build the future we want to see.

What’s next?

With Almond and Ada, we’ve put the building blocks in place to create voice assistants. It’s now time to use them, improve them and surprise us by sharing the things you’ll use them for.

Bonus

I hacked together a quick prototype to allow you to talk to Almond via a Telegram Bot! It’s available as a custom component.

Screenshot of talking to Almond via Telegram.


This is a companion discussion topic for the original entry at https://www.home-assistant.io/blog/2019/11/20/privacy-focused-voice-assistant/

Does anyone have a list of recommended hardware to run Ada? Minimum Pi. Microphone? Sounds like I need to start deploying Ada boxes all over the house.


So we have to have Nabu Casa and be running Home Assistant on a Pi?

Can Ada run in a VirtualBox machine on a PC as well? Also, can I use my Home Mini as an Ada device?

I’m interested too. The blog only talks about setting it up on your existing HA server… I would want to use multiple RPi Zero Ws around the place and not use my HA server at all (it’s tucked away in a back room). The GitHub page has no install info at all just yet.


Yes, it will only be useful if the microphone and speaker can be remote. No one wants to go to their server cupboard to yell at Home Assistant. However, there are solutions for that: for example, the (long) Rhasspy thread, “Rhasspy offline voice assistant toolkit”, has several contributors, one of whom has made networked microphone/speaker software.

Sorry that is not a technical description. Try https://github.com/koenvervloesem/hermes-audio-server


Could a discrete GPU (such as an AMD Radeon RX 5500) help with speech-to-text in Ada? I would like to have all the processing done on my server, if possible.

I could train Ada with my voice by giving it sample sentences in the form of speech and text. In fact, I would like to train the system with 1,000+ sentences and, once Ada gets better, train it with a couple of paragraphs (one sample paragraph with only a few sentences for starters).

How do Ada’s functionality and maturity compare to Mycroft.ai?

Right now, Cloud is the only integration that provides a speech-to-text platform. I expect more implementations to follow. We needed an open-ended speech-to-text platform, hence we chose the shortcut of using Azure. The available open source solutions perform badly on open-ended speech-to-text, but do great if all possible sentences are known in advance.
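
To illustrate why a known sentence set helps: when every valid command is listed in advance, a noisy transcript can be snapped to the closest known sentence instead of being taken literally. The sketch below uses Python's standard `difflib` for this. The sentence list and the `snap_to_known` helper are made up for illustration, and this is not a description of how Rhasspy or any specific engine works internally.

```python
import difflib
from typing import Optional

# Hypothetical closed set of commands the assistant is allowed to understand.
KNOWN_SENTENCES = [
    "turn on the kitchen light",
    "turn off the kitchen light",
    "what is the temperature outside",
]


def snap_to_known(transcript: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the closest known sentence, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        transcript.lower(), KNOWN_SENTENCES, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


fixed = snap_to_known("turn on the kitchen lite")   # misheard last word
unknown = snap_to_known("play some jazz")           # not in the closed set
```

The misheard transcript is mapped back to "turn on the kitchen light", while a sentence outside the set is rejected. Open-ended speech-to-text has no such list to fall back on, which is the harder problem described above.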

We’ve been working with the Rhasspy author and he has had an advisory role with Almond/Ada. Work is being done to have conversation and intent recognition be more tightly integrated with Rhasspy.

Ada is still very early. Because it outsources a lot to Home Assistant, it is only responsible for hotword detection and passing data to/from the different integrations. There are no remote instances available just yet.


How about support for languages other than English?

I know this is early development, but are there any long term plans for other languages?


I just wanted to chime in to say: This all sounds very exciting and awesome! :slight_smile:


Very excited as well!
Home Assistant is quite amazing in what it has become.

One question for anyone who has tried Almond-Server in Docker… I am having trouble starting the container. They suggest podman, but since I already have everything in Docker I was hoping it would work… but I’m getting access denied errors:

Database needs migration...,
events.js:174,
      throw er; // Unhandled 'error' event,
      ^
Error: Access denied,
    at Context.process.nextTick (/opt/almond/node_modules/pulseaudio2/lib/pulse.js:121:21),
    at Function.Module.runMain (internal/modules/cjs/loader.js:834:11),
    at startup (internal/bootstrap/node.js:283:19),
    at bootstrapNodeJSCore (internal/bootstrap/node.js:622:3),
Emitted 'error' event at:,
    at Context.process.nextTick (/opt/almond/node_modules/pulseaudio2/lib/pulse.js:123:22),
    at process._tickCallback (internal/process/next_tick.js:61:11),
    [... lines matching original stack trace ...],
error Command failed with exit code 1.

I have tried a few different suggestions, but nothing has worked so far.
Has anyone successfully started the Almond Server in docker?
If so, can you shed any light on your run command?

Thanks!
DeadEnd

Make sure you run the portable Docker image.

Right now Almond only supports English. It requires engineering effort on their side to build and maintain neural networks for extra languages. They have not done that yet. This is also part of the Almond docs on the HA site.

Thanks @balloob. I love this project and the effort to put privacy first, on the part of the Almond developers as well as yours! I will follow the development with great interest.

I do hope internationalization will be in the pipeline, though. At least speaking for me, this will have no use in my household otherwise :frowning: I am probably not alone with this; this is an international community after all.


Can anyone tell me the name of the guy in the short video? I haven’t met a lot of other Swiss people here, and I would like to get in contact with him.

Wow, this all sounds really cool! Normally I’m not a fan of cloud services, but I think this approach makes sense. We can get experience, integrations and improvements right away, and in a few years I wouldn’t be surprised if the necessity of a remote host for the neural network is no longer there, since we can simply host it at home.

I haven’t been too impressed by even the ‘big’ voice assistants though, interacting with them never feels natural. Hopefully an open source approach will help with this.

Sorry to be honest, but as usual for new announcements, it doesn’t work. Here is the reality:

“Visit Almond page for details” doesn’t work.

“Visit Hey Ada! page for details” doesn’t work.

There is nothing to configure, which rules out user error, but here is the result of trying to start Ada. Why shouldn’t it work without a microphone, just as voice output for automations?

ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2565:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_hw.c:1822:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1822:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1822:(_snd_pcm_hw_open) Invalid value for card
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/src/ada/ada/__main__.py", line 28, in <module>
    main()
  File "/usr/src/ada/ada/__main__.py", line 24, in main
    ada.run()
  File "/usr/src/ada/ada/__init__.py", line 30, in run
    self.microphone.start()
  File "/usr/src/ada/ada/microphone.py", line 49, in start
    frames_per_buffer=self.frame_length,
  File "/usr/local/lib/python3.7/dist-packages/pyaudio.py", line 750, in open
    stream = Stream(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/pyaudio.py", line 441, in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9996] Invalid input device (no default output device)

Which QR code should be scanned?

Pressing Almond in the left side menu says “Web almond.stanford.edu refused connection”

Tested on RPI3 B+

arch	armv7l
dev	false
docker	true
hassio	true
os_name	Linux
python_version	3.7.4
timezone	Europe/Prague
version	0.102.0
virtualenv	false

When I press the ‘Forget’ button on the “My Skills” page:

Setting up in Docker is really easy (as long as you don’t also have Grafana running, since they use the same ports).
It’s really fun although still a bit limited.