Triggering on "Beeping" Sounds

I have a bunch of devices which “beep”:

  • Microwave
  • AC
  • Fire Alarm

Assuming I have a microphone plugged into a computer, is there an easy way to capture “beep” type sounds (brief single tones) and categorize them by pitch and duration, so that they could be fed into HA to trigger automations (again keyed by pitch and duration)?

A bit of googling shows that some combination of arecord and sox might be able to analyze the sound.

However I am wondering if anyone else has already gone down this path / has something that is already usable?

I have been playing with an RFFT.

If I set the parameters to:

  • Sample at 44100 Hz
  • With a 4K (4096-sample) chunk size

Then I get a frequency resolution of about 11 Hz (44100 / 4096 ≈ 10.8 Hz) and a new result every 93 ms.
Those tolerances are pretty good for most beeps.
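
To make the arithmetic above explicit, here is a minimal sketch. The helper name `chunk_tradeoff` is made up for illustration; only the sample rate and the 4096 chunk size come from the post, and the other chunk sizes are shown just for comparison.

```python
SAMPLE_RATE = 44100  # Hz, as in the post


def chunk_tradeoff(chunk, sample_rate=SAMPLE_RATE):
    """Return (FFT bin width in Hz, chunk duration in ms) for a given chunk size."""
    bin_width_hz = sample_rate / chunk
    chunk_ms = 1000.0 * chunk / sample_rate
    return bin_width_hz, chunk_ms


# 4096 is the size used in the post; 1024 and 16384 are only for comparison.
for chunk in (1024, 4096, 16384):
    width, ms = chunk_tradeoff(chunk)
    print(f"{chunk:>6} samples: {width:6.2f} Hz per bin, new result every {ms:6.1f} ms")
```

Halving the chunk size doubles the bin width and halves the time per result, so the two numbers always trade off against each other.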

For very loud beeps they stand out like a beacon (very easy to detect with a simple magnitude filter).


For quiet beeps it’s difficult to pick them out:

This is a frequency waterfall for a fairly quiet room:

If I turn on the AC, I start to see noise on the low end of the frequency range:

However loud beeps really stand out:
I think the small ridges are either an artifact of the FFT or they may be harmonics.

The problem is other loud noises like speech absolutely swamp the frequency range:

When I get a quiet beep it’s still possible to see it in a quiet room (even with the AC running)

However I am struggling to avoid false positives in an automated fashion.

I may be forced to average the magnitude of all the frequencies over say 1500 Hz - so if noise is happening I just can’t pick out quiet beeps.
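
The averaging idea could look roughly like the sketch below. It assumes a hypothetical `too_noisy` helper and a made-up `NOISE_LIMIT` value that would need tuning for the actual room and microphone; only the 1500 Hz cut-off, sample rate and chunk size come from the posts.

```python
import numpy as np

SAMPLE_RATE = 44100
CHUNK = 4096
NOISE_FLOOR_HZ = 1500  # average everything at or above this frequency
NOISE_LIMIT = 1.0      # hypothetical threshold, needs tuning per setup


def too_noisy(audio, sample_rate=SAMPLE_RATE, limit=NOISE_LIMIT):
    """Return True when the mean magnitude above NOISE_FLOOR_HZ exceeds the limit."""
    magnitude = np.abs(np.fft.rfft(audio))
    freqs = np.fft.rfftfreq(len(audio), 1 / sample_rate)
    return bool(magnitude[freqs >= NOISE_FLOOR_HZ].mean() > limit)


# Synthetic stand-ins: faint background hiss versus loud broadband noise.
rng = np.random.default_rng(0)
quiet_room = 0.001 * rng.standard_normal(CHUNK)
loud_room = 0.5 * rng.standard_normal(CHUNK)
print(too_noisy(quiet_room), too_noisy(loud_room))
```

A chunk flagged as too noisy would simply be skipped, which accepts missing quiet beeps during speech in exchange for fewer false positives.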

I have it working.
This (Python) code is really terrible **, but it’s enough to prove the concept:

** - Need to spend some time making it look pretty.

import numpy as np
import sounddevice as sd
import paho.mqtt.client as mqtt

SAMPLE_RATE = 44100
CHUNK = 4096

# MQTT broker details
mqtt_broker = "ha.local"
mqtt_port = 1883
mqtt_username = "dtrott"
mqtt_password = "my_password"
topic = "test/topic"

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.username_pw_set(mqtt_username, mqtt_password)

client.connect(mqtt_broker, mqtt_port, 60)
client.loop_start()

detected = False
sent_detected = False


def detect_beep(indata, frames, time, status):
    global detected
    global sent_detected

    audio = indata[:, 0]

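    # Magnitude spectrum of this chunk; each bin is SAMPLE_RATE / CHUNK (~10.8 Hz) wide.
    magnitude = np.abs(np.fft.rfft(audio))

    # Peak magnitude in the band below, the target band, and the band above.
    low = magnitude[351:361].max()
    detect = magnitude[371:373].max()
    high = magnitude[381:600].max()

    # Detect when the target exceeds the threshold and both neighbours are below half of it.
    dd = detect * 0.5
    detected = bool(detect > 3 and low < dd and high < dd)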

    if detected != sent_detected:
        payload = "on" if detected else "off"
        client.publish(topic, payload)
        sent_detected = detected


def process_audio():
    with sd.InputStream(callback=detect_beep,
                        channels=1,
                        samplerate=SAMPLE_RATE,
                        blocksize=CHUNK):
        input("Listening... Press Enter to stop\n")


process_audio()

client.loop_stop()
client.disconnect()

How it Works

I am creating three ranges:

  • The “detection” range - the frequency I am looking for.
  • A “low” range below the frequency I am looking for.
  • A “high” range above the target frequency.

If I get a signal in the target range above an arbitrary threshold (3 in this case), and the low and high ranges are both below half the target magnitude (the 0.5 factor in the code, roughly 6 dB for amplitude values), then I consider that a detection.
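
The three-band rule can be tried offline on synthetic audio, without a microphone or MQTT. This is a sketch under assumptions: the function name `band_detect` and the test signals are invented for illustration, while the bin ranges, threshold and 0.5 ratio are copied from the code above.

```python
import numpy as np

SAMPLE_RATE = 44100
CHUNK = 4096


def band_detect(audio, threshold=3.0, ratio=0.5):
    """Apply the low / target / high band rule to one chunk of samples."""
    magnitude = np.abs(np.fft.rfft(audio))
    low = magnitude[351:361].max()
    target = magnitude[371:373].max()
    high = magnitude[381:600].max()
    limit = target * ratio
    return bool(target > threshold and low < limit and high < limit)


t = np.arange(CHUNK) / SAMPLE_RATE
rng = np.random.default_rng(0)

beep = 0.01 * np.sin(2 * np.pi * 4000 * t)   # quiet 4 kHz tone, assumed target pitch
hiss = 0.05 * rng.standard_normal(CHUNK)      # broadband noise only, no beep
swamped = beep + 0.2 * rng.standard_normal(CHUNK)  # beep buried in loud noise

print(band_detect(beep), band_detect(hiss), band_detect(swamped))
```

The clean tone passes, while broadband noise fails the rule because the wide high band almost always contains a bin louder than half the target, which matches the observation that speech swamps quiet beeps.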

I update an MQTT topic with the detection state whenever it changes - hence HA can use that topic to trigger automations.
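
Since the original goal was to key automations on duration as well as pitch, one possible extension is to count consecutive detected chunks. The `BeepTimer` class below is hypothetical and not part of the posted code; it only assumes one detection flag per 4096-sample chunk at 44100 Hz.

```python
CHUNK_MS = 1000.0 * 4096 / 44100  # about 93 ms per chunk


class BeepTimer:
    """Counts consecutive detected chunks and reports the beep length when it ends."""

    def __init__(self, chunk_ms=CHUNK_MS):
        self.chunk_ms = chunk_ms
        self.run = 0  # number of consecutive detected chunks so far

    def update(self, detected):
        """Feed one per-chunk flag; return the duration in ms when a beep ends, else None."""
        if detected:
            self.run += 1
            return None
        if self.run:
            duration = self.run * self.chunk_ms
            self.run = 0
            return duration
        return None


timer = BeepTimer()
flags = [False, True, True, True, False, False, True, False]
durations = [d for d in map(timer.update, flags) if d is not None]
print([round(d) for d in durations])  # [279, 93]
```

The measured duration is only accurate to about one chunk, so beeps that differ by less than roughly 93 ms cannot be told apart this way.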

Visually the detection bands look like this:

The pink band is the detection band.
The two green bands are the low (left) and high (right) bands.

Since the higher frequencies are normally much quieter than the lower frequencies, I use a much wider band on the high side.

Note: I leave a gap either side of the detection band as there can be noise close to the target frequency.
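
To see which frequencies the hard-coded bin indices cover, the conversion can be sketched as follows. The helper names are made up for illustration; the bin ranges are the ones in the code above, and from them the target appears to sit at about 4 kHz.

```python
SAMPLE_RATE = 44100
CHUNK = 4096
BIN_HZ = SAMPLE_RATE / CHUNK  # width of one FFT bin, about 10.8 Hz


def bin_range_to_hz(start, stop):
    """Centre frequencies of the first and last bin in magnitude[start:stop]."""
    return start * BIN_HZ, (stop - 1) * BIN_HZ


def hz_to_bin(freq_hz):
    """Index of the FFT bin whose centre is nearest to freq_hz."""
    return round(freq_hz / BIN_HZ)


bands = {"low": (351, 361), "detect": (371, 373), "high": (381, 600)}
for name, (start, stop) in bands.items():
    lo_hz, hi_hz = bin_range_to_hz(start, stop)
    print(f"{name:>6}: bins {start}-{stop - 1} = {lo_hz:7.1f} to {hi_hz:7.1f} Hz")
```

To target a different beep, `hz_to_bin` gives the centre bin for its pitch, and the low and high ranges can be shifted by the same offset while keeping the gaps on either side.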

Per my previous post 4K chunks is the sweet spot:

  • Smaller chunks lose frequency resolution.
  • Larger chunks sample too much time, so they can’t detect short beeps.