Appdaemon: voice controlled commands (Appi is born)

I’ve seen it too - I think I need to add a way for the app to clean the thread up.


Can you explain to me what you are trying to do, or are doing now?

A wake word on the Pi, which transmits the command to voice recognition software in the cloud, which sends the commands back to the Pi?

Have I understood correctly, or not?

I have taken some existing parts and put them together (the Python SpeechRecognition library, which supports Google STT, Sphinx STT, and others, plus Google TTS and AppDaemon).

I use Google for the speech recognition, so the wake word and commands are sent to Google. (Not that you notice that, unless you don't have network on your RPi, and you need that for HA anyway.)
Google doesn't send back commands, just some text from the spoken sound. The program makes commands out of the text.

I am going to try Sphinx, which works locally.
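The "makes commands out of the text" step can be sketched as a minimal keyword matcher. The phrases, service names, and entity IDs below are made-up examples for illustration, not the author's actual mapping:

```python
# Sketch of turning recognized speech text into Home Assistant
# service calls. COMMANDS maps a spoken phrase to a (service, data)
# pair; all entries here are hypothetical.
COMMANDS = {
    "turn on the light": ("light/turn_on", {"entity_id": "light.living_room"}),
    "turn off the light": ("light/turn_off", {"entity_id": "light.living_room"}),
}

def text_to_command(text):
    """Map recognized speech text to a (service, data) tuple, or None."""
    text = text.lower().strip()
    for phrase, command in COMMANDS.items():
        if phrase in text:
            return command
    return None
```

A substring match keeps it tolerant of filler words ("please turn on the light" still matches); anything fancier would need real intent parsing.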

I’m still hoping for continued development of Lucida.ai, but this is also really great! I wonder if it is possible to have multiple RPis with microphones and have voice control all over the house (per room)?

Any suggestions on the microphone hardware?
I think the Pi should be able to connect to a Bluetooth mic?

That wouldn't be a problem; they only need to be connected to a single network.
I am absolutely going that way, and I know that's not very hard to accomplish.

just think of this:

Any RPi that can understand your voice can translate that to a simple text file.
Any RPi connected to a server RPi can save that file on the server RPi.
Any server RPi can translate a text file into a command, which can do anything.

The slave RPis wouldn't be any different than a sensor: if the sensor sends a new value, you can make an automation for that.
And the sensor would be your voice providing the sensor value.
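The slave/server idea above can be sketched with plain file polling on the server RPi. The directory layout and `.txt` convention here are assumptions, not the author's actual setup:

```python
import os

def poll_commands(directory, seen):
    """Return the text of any new .txt command files in `directory`.

    `seen` is a set of filenames already processed. A server loop would
    call this periodically, treating each new file like a new sensor
    value and firing the matching automation for its contents.
    """
    new_texts = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".txt") and name not in seen:
            with open(os.path.join(directory, name)) as f:
                new_texts.append(f.read().strip())
            seen.add(name)
    return new_texts
```

Each slave just has to drop a text file on a network share; the server never needs to know how many microphones exist.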

Do you want high end, or cheap?
If you say high end, I say: don't go RPi.
If you say cheap, I say: any USB webcam with a decent microphone.

I’m absolutely going to test this. (But first I’m stuck trying to get a second instance of Home Assistant to run on an RPi 3 with OSMC.)


Would a USB mic work?

If I am correct, a USB mic should work on the RPi.

USB mic works just fine on a Pi. Plug and play.

You will need to install some extra libraries to get HASS going on a Pi running the OSMC image, though. OSMC really just contains the stuff needed for Kodi to play media. A lot of really basic stuff is missing. I’ve played around with this image quite a bit, and banged my head against a wall just as much. It really isn’t an image suited for general use. That said, it’s pretty easy to get Kodi going on it, so it’s fairly good at what it sets out to do.

You CAN get HASS running on it, just take it slow and watch the errors for missing libraries and dependencies. Install them when you catch it, and move on to the next error. (If you aren’t familiar with linux, or working in the shell, I wouldn’t recommend this - it will be an exercise in frustration for you).

Hi @ReneTode - your wish is my command!

In the latest version, I added dependencies and a new terminate() call. Dependencies allow you to mark any app that uses your TTS (or any other app) for reload when that app changes.

Also, there is now a terminate() call for when your App is being reloaded that gives you a chance to clean up the thread so it restarts correctly. To make this work you will need to make some slight changes to your code - but you based it on mine so it shouldn’t be hard.

I added a terminate call that in turn tells the worker thread to exit by giving it a special queue entry; I then wait on an event to be sure it has exited. With these two things in place, when I make a change to my Say/TTS app, it shuts down correctly and the thread terminates; it then restarts with a new thread honoring any changes I made, and forces the downstream app to also reload and pick up the new app. Here is the code; let me know if you have any questions:

import appdaemon.appapi as appapi
import sys
from queue import Queue
from threading import Thread
from threading import Event
import time
#
# App to manage announcements via TTS and stream sound files to Sonos
#
# Provides methods to enqueue TTS and Media file requests and make sure that only one is executed at a time
# Volume of the media player is set to a specified level and then restored afterwards
#
# Args:
#
# player - media player to use for announcements
# base = base directory for media - this will be a subdirectory under <home assistant config dir>/www
# ip = IP address of machine running this app
# port = HASS port
#
# To use from another APP:
# TTS:
# sound = self.get_app("Sound")
# sound.tts(text, volume, duration)
# duration should be set to longer than the expected duration of the speech
#
# e.g.:
#
# sound = self.get_app("Sound")
# sound.tts("Warning: Intruder alert", 0.5, 10)
#
# SOUND:
# sound = self.get_app("Sound")
# sound.play(file, content_type, volume, duration)
# file is the path of the file to play relative to "base"
# Content type is the mime type of the media e.g. "audio/mp3" or "audio/wav"
# duration should be set to longer than the expected duration of the media file
#
# e.g.:
# sound = self.get_app("Sound")
# sound.play("warning.wav", "audio/wav", 0.5, 10)
#
# Release Notes
#
# Version 1.0:
#   Initial Version

class Sound(appapi.AppDaemon):

  def initialize(self):

    # Create queue for announcement requests
    self.queue = Queue(maxsize=0)

    # Event used by terminate() to wait for the worker thread to exit
    # (created before the thread starts to avoid a race)
    self.event = Event()

    # Create worker thread
    t = Thread(target=self.worker)
    t.daemon = True
    t.start()
    
  def worker(self):
    active = True
    while active:
      try:
        # Get data from queue
        data = self.queue.get()
        if data["type"] == "terminate":
          active = False
        else:
          # Save current volume
          volume = self.get_state(self.args["player"], attribute="volume_level")
          # Set to the desired volume
          self.call_service("media_player/volume_set", entity_id = self.args["player"], volume_level = data["volume"])
          if data["type"] == "tts":
            # Call TTS service
            self.call_service("tts/google_say", entity_id = self.args["player"], message = data["text"])
          if data["type"] == "play":
            netpath = 'http://{}:{}/local/{}/{}'.format(self.args["ip"], self.args["port"], self.args["base"], data["path"])
            self.call_service("media_player/play_media", entity_id = self.args["player"], media_content_id = netpath, media_content_type = data["content"])

          # Sleep to allow message to complete before restoring volume
          time.sleep(int(data["length"]))
          # Restore volume
          self.call_service("media_player/volume_set", entity_id = self.args["player"], volume_level = volume)
          # Set state locally as well to avoid race condition
          self.set_state(self.args["player"], attributes = {"volume_level": volume})
      except Exception:
        self.log("Error in sound worker")
        self.log(sys.exc_info())

      # Rinse and repeat
      self.queue.task_done()
      
    self.log("Worker thread exiting")
    self.event.set()
       
  def tts(self, text, volume, length):
    self.queue.put({"type": "tts", "text": text, "volume": volume, "length": length})
    
  def play(self, path, content, volume, length):
    self.queue.put({"type": "play", "path": path, "content": content, "volume": volume, "length": length})

  def terminate(self):
    self.event.clear()
    self.queue.put({"type": "terminate"})
    self.event.wait()
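For reference, dependencies in AppDaemon at that time were declared in the ini-style appdaemon.cfg. A sketch of what wiring a downstream app to this Sound app could look like; the section names, module names, and player entity are hypothetical:

```ini
[sound]
module = sound
class = Sound
player = media_player.living_room
base = sounds
ip = 192.168.1.10
port = 8123

[alarm]
module = alarm
class = Alarm
; reload this app whenever the sound app is reloaded
dependencies = sound
```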


I will give this a close look tomorrow, but it seems like you implement things that I did in my sound function app.

You end and restart the function in a certain way.

The way I set up my sound function, the loop starts with a “do in 2 seconds” and ends with the same thing at the end of the loop. :wink:

So maybe this wasn't even necessary if I had done this loop the same way as my other loop, and not followed your way. :wink:

But it could be that this is a bit cleaner, and at least there are some things here that I can learn from again. :wink:

I’ve built an alexa using a Raspberry Pi 2 (RPi2) and had good luck with the Playstation eye as a microphone (USD $7 - https://www.amazon.com/gp/product/B000VTQ3LU/ref=oh_aui_detailpage_o01_s01?ie=UTF8&psc=1)

I’m using kitt.ai for local wake word detection (https://github.com/Kitt-AI/snowboy), so microphone input only hits the network (and Amazon) if the wake word matches on the RPi. One can customize the wake word as well, but I haven’t bothered yet. On my list is trying out the Alexa with Home Assistant, but AppDaemon is sounding interesting.

– charles


Hi @ReneTode!

Have you improved on Appi since your original post?

I’m thinking of using surveillance cameras with both a speaker and a microphone for this system. What would need to change to use multiple RTP/RTSP streams instead of a local microphone?

/R

Yeah, I improved a little since then, but I got held up, so it isn't running right now.
It works, but the response is a little too slow and inaccurate for the moment.

I don't know what RTP/RTSP streams are.

If I wanted it to work in multiple places, I would probably install more RPis in more places and install AppDaemon on them.

The other option would be to record audio files in different places and put those in a specific place on the network.
Then, in the worker thread, translate those files to text. But I am afraid that would cause more lag.

Hmmmm, “most crazy” — I think I might take that as a challenge. LOL. See which one of us can do the most off-the-wall things with AppDaemon.

Then better try to get @aimc more crazy than I am doing :stuck_out_tongue:

I could never get more crazy than you Rene - you do things with AppDaemon I never even imagined!


But I make you crazy with all my requests :wink:
And I told @turboc to start with that, if he wants to be more crazy than me :wink: