Bender Voice Assistant

Hello everyone,

I created a voice assistant that looks and talks like Bender Bending Rodríguez from Futurama.

He is about 60 cm tall. You can move the arms, legs and head. It’s also possible to open the door and put some things inside.
All the credit for the nice model goes to @JHB_17720 from printables.com.

Things I used:

Hardware

  • ESP32 dev board
  • DFPlayer Mini (+ 16 GB microSD card)
  • NeoPixel Ring with 16 LEDs
  • INMP441 Microphone

Software

  • ESPHome
  • Home Assistant Conversation Agent
  • Google Cloud STT
  • openwakeword

What he does:

  • While he is speaking, the NeoPixel ring pulsates randomly so that it looks like he is talking
  • When he connects to Wi-Fi he says “I like it here, this place has class”
  • When you tell him to do something, he answers with a random line from the show
  • When he doesn’t know what to do, he answers with “Uh… I don’t know” and the NeoPixel ring pulsates in red

What I use him for:

  • When my air quality monitor says it’s stinky, Bender tells me so in a “charming” way :slight_smile:
  • He notifies me when my tea is ready
  • I turn on lights, switches, shutters, etc. with my voice by saying “Hey Bender, turn on …”
  • “Hey Bender, set the temperature to 21° in the living room”
  • He can play MP3 files from the DFPlayer Mini so I can listen to music
  • He says something random when somebody enters the room (only during the daytime)
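
Notifications like these can be triggered from Home Assistant by calling one of the services the ESPHome node exposes. The following is only a sketch, assuming the node is named `benderesp32` (so its `dfplayer_play` service appears as `esphome.benderesp32_dfplayer_play`), a hypothetical motion sensor `binary_sensor.living_room_motion`, and that clips 1 to 10 on the SD card are suitable greetings:

```yaml
# Home Assistant automation (not ESPHome). The entity ID and the clip range
# are placeholders for illustration.
automation:
  - alias: "Bender greets whoever enters the room"
    trigger:
      - platform: state
        entity_id: binary_sensor.living_room_motion  # hypothetical sensor
        to: "on"
    condition:
      # Only during the daytime
      - condition: sun
        after: sunrise
        before: sunset
    action:
      - service: esphome.benderesp32_dfplayer_play
        data:
          # Pick a random clip number between 1 and 10 (assumed range)
          file: "{{ range(1, 11) | random }}"
```

The same service call with a fixed `file` number would cover the "tea is ready" or air quality announcements.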

The next step is to get the ESPHome voice assistant and media player components running on my ESP32. Then maybe I could have him say things in Bender’s voice, and he could answer with a Piper TTS voice for the stuff I don’t have good audio files for from the show.

Cheers
Harry

"Jimmy crack corn and I don’t care. "

This is fantastic. One of my robot vacuum cleaners is Bender, and whenever he has issues or problems, the house will say something like ‘Bender is stuck, like a turtle’ or ‘Bender needs battery and he says to bite his shiny metal ass’ or ‘Hey meatbag…’

Cheers mate that’s an amazing project.

Thanks a lot!

I totally like that idea, probably gonna steal it and rename my James to Bender… :joy: :yum:

This is absolutely awesome. I still have plans to give my frontend a Futurama theme, but I haven’t gotten further than the user icons. I use Bender for my stationary tablet interface, but having him talk back in statue form would be really cool.

I can also totally see a tucked away voice assistant using voice lines from the professor. Or just one that can throw random voice lines from the show in general :smiley:

Awesome!

Btw, what voice do you use for TTS? It doesn’t sound exactly like Bender, but it’s not far off either. Maybe playing prerecorded audio instead of TTS would be interesting, at least for some catchphrases like “Bite my shiny metal ass” :slight_smile:

TL;DR: I randomize the voice for my TTS notifications. While I’d love for it to actually be John DiMaggio doing Bender’s voice, it’s just a random neural TTS voice. Kinda funny to hear a British dude say “Bite My Shiny Metal Ass…”

So right now I’m using a random set of neural TTS voices. I use Node-RED, and all my TTS notifications are sent through a subflow. This subflow receives arguments and does the following:

  • Determine the most appropriate speaker if none is specified (based on in-home motion/location tracking)
  • Pause existing music or speaker usage
  • Set the volume accordingly (50% for day, 25% at night)
  • Say the text passed with a random voice and random style unless specified
  • Reset the speaker (volume, and what was playing)

WAY complicated, but it works for me and I’m kinda proud of it. I am using Microsoft Cognitive Services TTS (yeah, I’m one of the 10 people using it), and one of the nice things I get to do with this subflow is have it randomly select a voice, so it may be EN-US, EN-UK, EN-AU or EN-NZ.
If I don’t specify a tone or style (e.g. angry, cheerful, excited, friendly, hopeful, newscast, sad, shouting, terrified, unfriendly, whispering), it will randomly select one.

So Bender’s text notification might come out Aussie Sad one day or British Shouting another day. The variety is a funny compromise.
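
For readers who don’t use Node-RED, part of that flow could be approximated with a plain Home Assistant script. This is a rough sketch and not the subflow described above: `tts.example_engine` is a hypothetical TTS entity, the voice names are just examples of Azure neural voices, the day/night boundary (07:00 to 22:00) is an assumption, and whether the `voice` option is honored depends on the TTS integration in use. Speaker selection by presence and restoring the previous playback are left out.

```yaml
# Home Assistant script sketch. tts.example_engine, the voice list and the
# day/night hours are placeholders, not taken from the original flow.
script:
  announce_random_voice:
    fields:
      message:
        description: Text to speak
      speaker:
        description: media_player entity to speak on
    sequence:
      # Pause whatever is currently playing on the chosen speaker
      - service: media_player.media_pause
        target:
          entity_id: "{{ speaker }}"
      # 50% volume during the day, 25% at night
      - service: media_player.volume_set
        target:
          entity_id: "{{ speaker }}"
        data:
          volume_level: "{{ 0.5 if (now().hour >= 7 and now().hour < 22) else 0.25 }}"
      # Speak the message with a randomly chosen voice
      - service: tts.speak
        target:
          entity_id: tts.example_engine
        data:
          media_player_entity_id: "{{ speaker }}"
          message: "{{ message }}"
          options:
            voice: "{{ ['en-US-GuyNeural', 'en-GB-RyanNeural', 'en-AU-WilliamNeural', 'en-NZ-MitchellNeural'] | random }}"
```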

I also played around with creating my own custom neural voice. You upload a text file of sentences, then upload a bunch of wav files of you reading those sentences, it processes it, then it synthesizes a voice. I used it for a little while, but my partner was NOT happy with it. Said it was WAY too creepy. Basically it sounded like me having just returned from the brain slug planet.

So if anyone wants to pay for, I don’t know, 1000 Cameos of John DiMaggio reading sentences like Bender with the disclosure that it is to make a TTS font of Bender, I would love an actual Bender voice. I would want John DiMaggio to get paid for it tho.

Thank you!

At the moment there is no TTS involved. I use recorded snippets from the show.
So he is already (randomly) saying “Bite my shiny metal ass” :smile:
Once I get the voice assistant component in ESPHome to work with the media_player component on my ESP, Bender should be able to talk via TTS.

Uh… I would be more than happy to chip in for a real Bender TTS :star_struck: :rofl:

The results of the contest are out!
They may be of interest to you :wink:
Have a look

For everyone who is interested, here is my ESPHome YAML for Bender:

esphome:
  name: benderesp32
  friendly_name: BenderESP32
  on_boot:
    - priority: -100
      then:
        - switch.turn_on: first_boot
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:
        
esp32:
  board: esp32dev
  framework:
    type: esp-idf
    version: recommended


logger:
  
api:
  encryption:
    key: !secret api_bender
  
  services:
  - service: dfplayer_next
    then:
      - dfplayer.play_next:
  - service: dfplayer_previous
    then:
      - dfplayer.play_previous:
  - service: dfplayer_play
    variables:
      file: int
    then:
      - script.execute:
          id: _bender_speak_
          filenumber: !lambda 'return file;'
          is_error_s: 0

  - service: dfplayer_play_loop
    variables:
      file: int
      loop_: bool
    then:
      - dfplayer.play:
          file: !lambda 'return file;'
          loop: !lambda 'return loop_;'

  - service: dfplayer_play_folder
    variables:
      folder: int
      file: int
    then:
      - dfplayer.play_folder:
          folder: !lambda 'return folder;'
          file: !lambda 'return file;'
      - light.turn_on:
          id: benderlight
          brightness: 60%
          red: 100%
          green: 100%
          blue: 100%
          effect: SpeechFlicker
      - delay: 200ms
      - if:
          condition:
            not:
              dfplayer.is_playing
          then:
            - dfplayer.play_folder:
                folder: !lambda 'return folder;'
                file: !lambda 'return file;'
            - delay: 200ms
            - if:
                condition:
                  not:
                    dfplayer.is_playing
                then:
                  - script.execute: reset_led
          

  - service: dfplayer_play_loop_folder
    variables:
      folder: int
    then:
      - dfplayer.play_folder:
          folder: !lambda 'return folder;'
          loop: true

  - service: dfplayer_set_device_tf
    then:
      - dfplayer.set_device: TF_CARD

  - service: dfplayer_set_device_usb
    then:
      - dfplayer.set_device: USB

  - service: dfplayer_set_volume
    variables:
      volume: int
    then:
      - dfplayer.set_volume: !lambda 'return volume;'
  - service: dfplayer_set_eq
    variables:
      preset: int
    then:
      - dfplayer.set_eq: !lambda 'return static_cast<dfplayer::EqPreset>(preset);'

  - service: dfplayer_sleep
    then:
      - dfplayer.sleep

  - service: dfplayer_reset
    then:
      - dfplayer.reset

  - service: dfplayer_start
    then:
      - dfplayer.start

  - service: dfplayer_pause
    then:
      - dfplayer.pause

  - service: dfplayer_stop
    then:
      - dfplayer.stop

  - service: dfplayer_random
    then:
      - dfplayer.random

  - service: dfplayer_volume_up
    then:
      - dfplayer.volume_up

  - service: dfplayer_volume_down
    then:
      - dfplayer.volume_down

  on_client_connected:
    # My pipeline didn't run correctly without this. You may not need it!
    - delay: 1s
    - switch.toggle: use_wake_word
    - delay: 5s
    - switch.toggle: use_wake_word

ota:
  password: !secret ota_bender

wifi:
  networks:
  - ssid: homeau
    password: !secret pwd_homeau
  
  on_connect:
    if:
      condition:
        - switch.is_on: internal_sounds
      then:
        - if:
            condition:
              - switch.is_on: first_boot
            then:
              - switch.turn_off: first_boot
        - delay: 1s
        - script.execute:
            id: _bender_speak_
            filenumber: 24
            is_error_s: 0

  on_disconnect:
    if:
      condition:
        - switch.is_on: internal_sounds
      then:
        - script.execute:
            id: _bender_speak_
            filenumber: 23
            is_error_s: 1

  ap:
    ssid: "Bender Fallback Hotspot"
    password: !secret hotspot_bender

captive_portal:
  
# Pins for communication with DFPlayer Mini
uart:
  tx_pin: GPIO25
  rx_pin: GPIO33
  baud_rate: 9600

dfplayer:
  on_finished_playback:
    then:
      - logger.log: 'Playback finished event'
      - script.execute: reset_led
      
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO13
    i2s_bclk_pin: GPIO12

microphone:
  - platform: i2s_audio
    i2s_audio_id: i2s_in
    id: echo_microphone
    i2s_din_pin: GPIO21
    adc_type: external
    pdm: false

voice_assistant:
  microphone: echo_microphone
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  id: va

  on_wake_word_detected:
    - light.turn_on:
        id: benderlight
        brightness: 60%
  
  on_stt_end:
    - script.execute: reset_led

  on_tts_start:
    if:
      condition:
        - switch.is_on: internal_sounds
      then:
        - script.execute:
            id: _bender_speak_
            # x is the reply text from the Assist pipeline. The string below is the
            # English "not understood" reply; change it if you don't use English.
            filenumber: !lambda |-
                if (x == "Sorry, I couldn't understand that") {
                  // Error clips: random file number from 11 to 20
                  return rand() % 10 + 11;
                } else {
                  // Normal replies: random file number from 1 to 10
                  return rand() % 10 + 1;
                }
            # Same comparison as above: 1 selects the red error flicker, 0 the white one.
            is_error_s: !lambda |-
                if (x == "Sorry, I couldn't understand that") {
                  return 1;
                } else {
                  return 0;
                }


light:
  - platform: esp32_rmt_led_strip
    id: benderlight
    disabled_by_default: true
    entity_category: config
    pin: GPIO23
    default_transition_length: 0s
    chipset: WS2812
    num_leds: 12  # adjust to your ring; the hardware list above mentions 16 LEDs
    rgb_order: grb
    rmt_channel: 0
    name: "Bender Light"
  
    effects:
      - flicker:
      - flicker:
          name: SpeechFlicker
          alpha: 95%
          intensity: 25%
      - pulse:
          name: FastBlink
          transition_length: 0s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%
      
script:
  - id: _bender_speak_
    mode: queued
    parameters:
      # Filenumber to play from DFPlayer Mini
      filenumber: int
      is_error_s: int
    then:
      - dfplayer.play_mp3: !lambda 'return filenumber;'
      - script.execute:
          id: flicker_led
          is_error_l: !lambda 'return is_error_s;'
      - delay: 200ms
      - if:
          condition:
            not:
              dfplayer.is_playing
          then:
            - dfplayer.play_mp3: !lambda 'return filenumber;'
            - delay: 200ms
            - if:
                condition:
                  not:
                    dfplayer.is_playing
                then:
                  - script.execute: reset_led

  - id: flicker_led
    parameters:
      is_error_l: int
    then:
      - if:
          condition:
            - lambda: 'return is_error_l < 1;'
          then:
            - light.turn_on:
                id: benderlight
                brightness: 60%
                red: 100%
                green: 100%
                blue: 100%
                effect: SpeechFlicker
          else:
            - light.turn_on:
                id: benderlight
                brightness: 60%
                red: 100%
                green: 0%
                blue: 0%
                effect: SpeechFlicker

  - id: reset_led
    then:
      - if:
          condition:
            - switch.is_on: use_wake_word
            - switch.is_on: use_listen_light
          then:
            - light.turn_on:
                id: benderlight
                blue: 100%
                red: 100%
                green: 100%
                brightness: 15%
                effect: none
          else:
            - light.turn_off: benderlight

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: ALWAYS_OFF
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
      - script.execute: reset_led
  - platform: template
    name: Use listen light
    id: use_listen_light
    optimistic: true
    restore_mode: ALWAYS_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led

  - platform: template
    name: Internal sounds
    id: internal_sounds
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
  
  - platform: restart
    name: restart

  - platform: template
    name: First Boot
    id: first_boot
    internal: true
    optimistic: true


text_sensor:
  - platform: homeassistant
    id: bender_volume
    # You have to set up this input_number in your configuration.yaml
    entity_id: input_number.bender_volume
    internal: True
    on_value:
      then:
        - dfplayer.set_volume: !lambda 'return atoi(id(bender_volume).state.c_str());'
        - logger.log: 'Volume Set'
        - light.turn_on:
            id: benderlight
            brightness: 50%
            blue: 100%
            red: 0%
            green: 0%
            effect: FastBlink
        - delay: 2s
        - script.execute: reset_led

  - platform: version
    # No substitutions block defines $friendly_name, so use a plain name here
    name: ESPHome Version
  - platform: wifi_info
    ssid:
      name: WiFi Name

sensor:
  - platform: wifi_signal
    name: Wifi signal
    update_interval: 10s
  - platform: uptime
    name: Uptime

Cheers!

Harry

Hi @dirtyharriv

Does this Assistant use TTS and a voice model or is it still a mix of recorded clips from the series?

Hey @TheStigh,

It still plays the clips, but I could imagine that it will soon be possible to have a Bender TTS.

Did you see the guy on YouTube making the Bender with a moving head and eyes? He’s using ElevenLabs for Bender’s voice. I do have a Creators account but can’t find Bender (or John’s voice) there. What I could do, if I had a good sound clip of Bender talking for more than a minute without background noise, is use Voice Clone to create his voice.

Haven’t seen the one on YouTube. Got a link?
I’ll try to get a good audio clip on the weekend. Do you know if it’s possible to use the voice clone with Piper? If so, maybe I could DM you when I’ve got the audio clip? :blush:

Sent you a DM :slight_smile: