How Bayes Sensors work, from a Statistics Professor (with working Google Sheets!)

I’ve seen dozens of Bayes Binary Sensor explanations. As both a programmer and a statistics professor, I get the shivers from all of them. There are explanations out there that are wrong both from a math & statistics perspective and from a programming & implementation perspective.

So, here is the definitive explanation. Grab some coffee, and let’s get down to brass-stats.

Woven within the post is how I think about setting the values for these sensors (from a real stats perspective). Knowing how the numbers are actually being used under-the-hood should help with setting them.

(for the impatient): HASS Bayes Sensors FOR REALSIES - Google Sheets

As a concrete example we’ll use a bayes-sensor to determine whether my house is occupied. We’ll say the occupied state is TRUE.

Bayes theorem describes a way to update our knowledge of a probability given new information. We always start a Bayesian analysis by determining a prior. This is our estimate of the probability before learning about any event. From our example, this is interpreted as “Knowing NOTHING, what do I think the probability of the house being occupied is?”. I can estimate this by saying: “I work ~8 hours 5 days a week and have another ~2 hours of non-house activities per day”. So, I estimate that, knowing nothing, my house is empty ~10 hours per day, which means it is occupied ~14 hours per day. So, I’d set my prior at ~14/24 ~= 0.58.
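The back-of-the-envelope estimate above can be written out as a tiny sketch. The variable names are just labels for the hours quoted in the text, nothing more official than that.

```python
# Hours per day the house is assumed to be empty (figures from the text above).
work_hours = 8
other_away_hours = 2

away_hours = work_hours + other_away_hours   # ~10 hours empty
occupied_hours = 24 - away_hours             # ~14 hours occupied

# The prior is simply the fraction of the day the house is occupied.
prior = occupied_hours / 24
print(round(prior, 2))  # 0.58
```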

Now, for information. Imagine my TV turns on. That should inform my understanding of whether the house is empty. Bayes Rule tells me how to update my prior probability given this new information.

The raw Bayes Rule:

p( house_occupied | TV_on ) = p(house_occupied) * p(TV_on | house_occupied) / p(TV_on)

I would read this equation as: “The probability the house is occupied given the TV is on is equal to the probability the house is occupied (before knowledge) times the probability the TV is on given the house is occupied divided by the probability the TV is on.”

That was a big word-salad but we can break it down.

  • p( house_occupied | TV_on ) : Probability that the house is occupied now that we know the TV is on. This is called the “posterior probability” (probability post-knowledge).
  • p(house_occupied) : Probability the house is occupied before we knew the TV was on (our prior)
  • p(TV_on | house_occupied) : Probability the TV is on when the house is occupied. This is the prob_given_true from the config files. This is materially different from “the probability the house is occupied when the TV is on” (how I commonly see it explained).
  • p(TV_on) : Probability the TV is on (on the whole throughout the day).

The ratio of the last two items is often called the predicate. It represents how “important” this new information is.

Imagine the next two scenarios:

My TV is on only when I’m home: p(TV_on | house_occupied) will be LARGER than p(TV_on).

  • If I’m home 12 hours a day, the TV is on for 4 of that. p(TV_on | house_occupied) will be ~4/12
  • But p(TV_on) will be 4 hours out of 24 per day. ~= 4/24.
  • This means the division of the two will be greater than one. p(TV_on | house_occupied) ~= 4/12 ~= 8/24 while p(TV_on) ~= 4/24. So, my predicate is ~= 2.0.
  • And when I multiply it by the prior, the probability of p( house_occupied | TV_on ) will INCREASE.

I leave my TV on for my dog when I’m away: p(TV_on | house_occupied) will be SMALLER than p(TV_on)

  • p(TV_on) ~= (10 (time away, TV on for the dog) + 4 (time I’m home watching TV)) / 24 ~= 14/24
  • p(TV_on | house_occupied) will still be ~4/12.
  • Now, the division of the two is LESS than one. (8/24) / (14/24) ~= 0.57
  • This means that when I multiply the predicate by the prior, the probability will DECREASE.

This makes intuitive sense. If my TV is only on when I’m home, knowing the TV is on increases the probability the house is occupied. If I leave the TV on for my dog, then the TV is actually on more when I’m NOT home. Meaning the TV being on implies that the house is actually empty (except for the dog).
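Here is a minimal sketch of the two scenarios, using only the hours quoted in the text. The helper name evidence_ratio is made up for this illustration.

```python
def evidence_ratio(p_tv_given_occupied, p_tv):
    """Ratio p(TV_on | occupied) / p(TV_on): above 1 raises the prior, below 1 lowers it."""
    return p_tv_given_occupied / p_tv


# Scenario 1: TV is only on while I'm home (4 of my 12 home hours, 4 of 24 overall).
only_when_home = evidence_ratio(4 / 12, 4 / 24)

# Scenario 2: TV is also left on for the dog (14 of 24 hours overall).
left_on_for_dog = evidence_ratio(4 / 12, 14 / 24)

print(round(only_when_home, 2))   # 2.0
print(round(left_on_for_dog, 2))  # 0.57
```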

Most of the explanations I’ve seen around are good up to this point. But they all get the next part wrong. What about the prob_given_false in the docs, which we haven’t used yet?

The explanation I see floating around is prob_given_false is used when (in our example) the TV is off. This is WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG WRONG from both a statistics and code implementation perspective.

Here’s the leap we need to make that I see everyone making a mistake on. How do we know the probability my TV is on throughout the whole day (the denominator of the Bayes rule)? Well I can re-think about the probability the TV is on as ”The probability the TV is on while I’m home times the probability I’m home plus the probability the TV is on while I’m not home times the probability I’m not home”. In math:

p(TV_on) = p(TV_on | house_occupied)*p(house_occupied) + p(TV_on | house_not_occupied)*p(house_not_occupied)

  • p(house_occupied) is the prior
  • p(house_not_occupied) is 1-prior
  • p(TV_on | house_occupied) is the prob_given_true from the configs
  • p(TV_on | house_not_occupied) is the prob_given_false from the configs.

So, we can rewrite the equation as (often split into separate numerator and denominator eqns):

numerator = p(TV_on | house_occupied) * p(house_occupied)
denominator = p(TV_on | house_occupied) * p(house_occupied) + p(TV_on | house_not_occupied) * (1 - p(house_occupied))
probability = numerator / denominator
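For readers who prefer a single expression, the pieces above combine as follows. The shorthand O (house_occupied) and T (TV_on) is introduced here only to keep the formula compact.

```latex
P(O \mid T) = \frac{P(T \mid O)\,P(O)}{P(T \mid O)\,P(O) + P(T \mid \neg O)\,\bigl(1 - P(O)\bigr)}
```

In this notation P(T | O) is the prob_given_true, P(T | ¬O) is the prob_given_false, and P(O) is the prior.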

If we look at the code in HA (core/homeassistant/components/bayesian/binary_sensor.py at commit 83a709b768b88389f0077ca22aa7a445c5babaac of home-assistant/core on GitHub), we can see that this is exactly how it is implemented.

def update_probability(prior, prob_true, prob_false):
    """Update probability using Bayes' rule."""
    numerator = prob_true * prior
    denominator = numerator + prob_false * (1 - prior)
    probability = numerator / denominator
    return probability

And the relevant part of def async_threshold_sensor_state_listener

prior = self.prior
for obs in self.current_obs.values():
  prior = update_probability(prior, obs["prob_true"], obs["prob_false"])
self.probability = prior

From this code we can see that EVERY update of an observation includes both a prob_given_true and a prob_given_false.

This is also stated in the docs (Bayesian - Home Assistant):

prob_given_true
(float)(Required)
The probability of the observation occurring, given the event is true.

prob_given_false
(float)(Optional)
The probability of the observation occurring, given the event is false can be set as well.

These are the probabilities of the events (TV On) happening GIVEN the true/false states (house occupied). NOT, the probability of the state (house-occupied) given the event (TV on/off). THESE ARE DIFFERENT PROBABILITIES AND ARE NOT INTERCHANGEABLE.

So, I would set my sensor up like this:

prior: 0.58 # home roughly 14/24 hours per day
observations:
  - platform: state
    entity_id: sensor.tv
    prob_given_true: 0.33  # 4 hours of TV per 12 hours I’m home
    prob_given_false: 0.017 # 15 minutes of TV per 14 hours away, I don’t like zeros here
    to_state: "on"

This means that our sensor will have a new “observation” when the TV is on.
This sensor will give a value of 0.96 when the TV is on.

numerator = p(TV_on | house_occupied) * p(house_occupied)
numerator = 0.33 * 0.58 = 0.1914

denominator = p(TV_on | house_occupied) * p(house_occupied) + p(TV_on | house_not_occupied) * (1 - p(house_occupied))
denominator = 0.33 * 0.58 + 0.017 * (1 - 0.58) = 0.1985

probability = numerator / denominator
probability = 0.1914 / 0.1985 = 0.96
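If you’d rather check the arithmetic than trust my rounding, here is a quick sketch. It reuses the update_probability function quoted earlier and plugs in the three numbers from the config.

```python
def update_probability(prior, prob_true, prob_false):
    """Update probability using Bayes' rule (same formula as the HA function above)."""
    numerator = prob_true * prior
    denominator = numerator + prob_false * (1 - prior)
    return numerator / denominator


# prior = 0.58, prob_given_true = 0.33, prob_given_false = 0.017
posterior_tv_on = update_probability(prior=0.58, prob_true=0.33, prob_false=0.017)
print(round(posterior_tv_on, 2))  # 0.96
```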

But what happens when the TV is off? Well, looking at the code:

    def _process_state(self, entity_observation):
        """Add entity to current observations if state conditions are met."""
        entity = entity_observation["entity_id"]

        should_trigger = condition.state(
            self.hass, entity, entity_observation.get("to_state")
        )

        self._update_current_obs(entity_observation, should_trigger)

    def _update_current_obs(self, entity_observation, should_trigger):
        """Update current observation."""
        obs_id = entity_observation["id"]

        if should_trigger:
            prob_true = entity_observation["prob_given_true"]
            prob_false = entity_observation.get("prob_given_false", 1 - prob_true)

            self.current_obs[obs_id] = {
                "prob_true": prob_true,
                "prob_false": prob_false,
            }

        else:
            self.current_obs.pop(obs_id, None)

The code only triggers (and adds to the “current_obs”) when the state is in the to_state. When the TV leaves the on state, it will fall out of current_obs (that’s the else in the last two lines). This leads to an important consequence: observations that are not in their “to_state” are not seen by the bayesian sensor.

When the TV is off, it will give the prior (because there are no “observations”). So, its value will be 0.58.
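To make that behaviour concrete, here is a simplified sketch, not the actual HA code. The function sensor_probability and the plain dict of entity states are stand-ins I made up for illustration. The skip-if-not-in-to_state logic and the 1 - prob_true default mirror the snippet above.

```python
def update_probability(prior, prob_true, prob_false):
    """Update probability using Bayes' rule."""
    numerator = prob_true * prior
    denominator = numerator + prob_false * (1 - prior)
    return numerator / denominator


def sensor_probability(prior, observations, states):
    """Apply only the observations whose entity is currently in its to_state."""
    probability = prior
    for obs in observations:
        if states.get(obs["entity_id"]) != obs["to_state"]:
            continue  # not in to_state: the observation is simply ignored
        prob_true = obs["prob_given_true"]
        prob_false = obs.get("prob_given_false", 1 - prob_true)
        probability = update_probability(probability, prob_true, prob_false)
    return probability


tv_on_only = [
    {"entity_id": "sensor.tv", "to_state": "on",
     "prob_given_true": 0.33, "prob_given_false": 0.017},
]

p_when_on = sensor_probability(0.58, tv_on_only, {"sensor.tv": "on"})
p_when_off = sensor_probability(0.58, tv_on_only, {"sensor.tv": "off"})
print(round(p_when_on, 2))   # 0.96
print(round(p_when_off, 2))  # 0.58, i.e. just the prior
```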

If we want the TV being OFF to indicate we are likely to be away, we need to adjust the sensor.

prior: 0.58 # home roughly 14/24 hours per day
observations:
  - platform: state
    entity_id: sensor.tv
    prob_given_true: 0.33  # 4 hours of TV per 12 hours I’m home
    prob_given_false: 0.017 # 15 minutes of TV per 14 hours away, I don’t like zeros here
    to_state: "on"
  - platform: state
    entity_id: sensor.tv
    prob_given_true: 0.67 # 8/12 hours off while home
    prob_given_false: 0.983 # 13.75/14 hours off while I’m away
    to_state: "off" # this is the “TRUE” state now

Now, when the TV is on, it will have p=0.96.
When the TV is off, it will have p=0.48.

This makes intuitive sense. The TV being on is “rare” (roughly 4 hours per 24) and almost only happens when I’m home, so the change from the prior is large, 1.66 times. However, the TV is often off (I gotta sleep sometime), so it being off doesn’t down-shift the probability as much: it only drops by about 10 percentage points.
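A small sketch of the on/off pair follows. It assumes the “off” probabilities are exact complements of the “on” ones (1 - 0.33 and 1 - 0.017), so the second decimal can differ slightly from values rounded by hand.

```python
def update_probability(prior, prob_true, prob_false):
    """Update probability using Bayes' rule."""
    numerator = prob_true * prior
    denominator = numerator + prob_false * (1 - prior)
    return numerator / denominator


PRIOR = 0.58
P_ON_HOME = 0.33    # p(TV on | occupied)
P_ON_AWAY = 0.017   # p(TV on | not occupied)

# Only one of the two observations is active at any time.
tv_on = update_probability(PRIOR, P_ON_HOME, P_ON_AWAY)
tv_off = update_probability(PRIOR, 1 - P_ON_HOME, 1 - P_ON_AWAY)

print(round(tv_on, 2))   # 0.96, a big jump up from the prior
print(round(tv_off, 2))  # about 0.48, a modest step down from the prior
```

The asymmetry comes from the ratio of the two probabilities: 0.33 / 0.017 is roughly 19, while 0.67 / 0.983 is only about 0.68.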

How do we integrate multiple pieces of information? This is actually easy.

We start with the original prior. Then we get new information. We do the Bayes Rule for the new info. The posterior becomes our “new prior”. When we get more new information, we use the “new prior” in our next calculation. That’s what is expressed in the for-loop from async_threshold_sensor_state_listener.
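Here is a sketch of that chaining with two active observations. The TV numbers are the ones from this post. The second observation (think “lights on”, 0.60 / 0.30) is invented purely for illustration.

```python
def update_probability(prior, prob_true, prob_false):
    """Update probability using Bayes' rule."""
    numerator = prob_true * prior
    denominator = numerator + prob_false * (1 - prior)
    return numerator / denominator


def chain(prior, active_observations):
    """Feed each posterior back in as the prior for the next observation."""
    probability = prior
    for prob_true, prob_false in active_observations:
        probability = update_probability(probability, prob_true, prob_false)
    return probability


tv_on = (0.33, 0.017)      # from the post
lights_on = (0.60, 0.30)   # hypothetical values

forward = chain(0.58, [tv_on, lights_on])
backward = chain(0.58, [lights_on, tv_on])

print(round(forward, 2))   # 0.98
print(round(backward, 2))  # 0.98, the order of the updates does not matter
```

The order doesn’t matter because each step multiplies the odds by that observation’s prob_true / prob_false ratio, and multiplication commutes.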

I’ve put these into a Google Spreadsheet that anyone can use. Just duplicate your own from mine. HASS Bayes Sensors FOR REALSIES - Google Sheets

You can use the drag-fill to include more sensors. The “predicate” field indicates how much the sensor will update when the observation is True. The “posterior” column indicates the probability after each observation. If the observation is in the FALSE state, it is not included in the observation list (just like in HASS).

I hope this helps people understand what is going on under the hood of the Bayes Sensor platform.

This is a pretty great write-up. I was working on bayesian sensors this week again and got confused (again). This cleared a lot up.

Do you have any thoughts on making bayesian sensors less ‘noisy’? Right now I create a recursive event that assigns probabilities to the bayesian sensor itself being on. Obviously this is statistically nonsensical, but in my mind this increases the ‘prior’ for deactivation.
I think, properly chosen, such a recursion can make sure that it is relatively easy to trigger a bayesian sensor, but hard to deactivate (or the other way around of course). I just have no clue what to pick as good values for these.

I’ve tended to deal with the noisiness in two ways.

The first is to have more sensors as part of the Bayes calculation. If you only have 3-4 observations and the prob_given values are high (or low), then the sensor will be drastically affected. Most of mine have 15-20 observations, so that as things happen within and around the house the sensor only moves a few percentage points at any given time. Try adding sensors about your work-day, time of day, your thermostat state, individual room lights.
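As a rough illustration of why extreme prob_given values make a sensor jumpy, compare one “strong” and one “weak” observation. The 0.90/0.10 and 0.55/0.45 pairs are made-up numbers, not values from any real config.

```python
def update_probability(prior, prob_true, prob_false):
    """Update probability using Bayes' rule."""
    numerator = prob_true * prior
    denominator = numerator + prob_false * (1 - prior)
    return numerator / denominator


prior = 0.58

# A single observation with extreme values swings the result a long way...
strong = update_probability(prior, 0.90, 0.10)

# ...while a mild one only nudges it by a few percentage points.
weak = update_probability(prior, 0.55, 0.45)

print(round(strong, 2))  # 0.93
print(round(weak, 2))    # 0.63
```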

The other way has been using the Template Binary Sensors and the delay_on and delay_off parameters. I also tend to make a few template sensors for each Bayes sensor because sometimes I want to do different things when the house is “probably empty” vs “definitely empty”.

- platform: template
  sensors:
  
    house_prob_occupied:
      friendly_name: House Probably Occupied
      value_template: >-
        {{ state_attr('binary_sensor.house_empty', 'probability') < 0.5 }}
      device_class: presence
      delay_off:
        minutes: 15
        
    house_def_occupied:
      friendly_name: House Definitely Occupied
      value_template: >-
        {{ state_attr('binary_sensor.house_empty', 'probability') < 0.2 }}
      device_class: presence
      delay_off:
        minutes: 15

As for putting the bayes-sensor itself in the list of observations: there’s no inherent problem with that. One might use that for a sleep sensor … once I’m asleep I’m more likely to be asleep. You could use the template platform and the last_changed attributes to set “been asleep for an hour” and “been asleep for 8 hours” to different “observations” with different prob_given_true and prob_given_false values. I had a setup like this for a while but ultimately found just using sensor.time and sensor.workday to be more reliable. But YMMV.

Hope that helps.

Thanks, I’m using a combination of these already, but I guess I just don’t have enough sensors or some sensors have too much impact, I probably need some more tweaking. I don’t use lights as sensors btw, since that’s what I want to automate, so that seems like an unwanted recursion.

Regarding templates I found out yesterday that using ‘last_changed’ and ‘now’ doesn’t work, unless you specify sensor.time as a triggering entity, since the template otherwise is only updated on a state change of an entity. This might be the reason for your lack of success there. I stumbled on the delay_off method and that seems to work fine, although you have to make a separate sensor for each ‘buffer’ sadly.

This spreadsheet is awesome! Thank you! Great description as well. I was able to model a sensor with 12 observations and the results match exactly what I get from Home Assistant. Thank you!

You know @JudoWill when you first posted this I marked it to read when I am on holiday. I am on holiday so the probability I would read it is high. In fact it is now known that the probability that I read it today is 1.0.

When I was in my final year at school (1979) my maths teacher was very very good. I understood everything he taught because he was so good. I followed his absolutely clear statistics classes and found it fascinating.

To my regret I didn’t study stats past that, it wasn’t in the engineering course.

However my point is that you remind me of that teacher: very clear explanations, thank you. I’ll probably have to read it again, as learning skills diminish with age, but that does not detract from your clarity. Well done.

Thank you so much for the post and the Google Sheet! This was very helpful.

I’m a cs student and lit spent the whole day thinking how this is implemented without looking at the code … ( stupid ) and all the explanations on the internet didn’t make any sense.
thank you for letting me go to sleep at ease!

Very nice explanation! And great google sheet, which got me playing around with it until late…

I was actually also misled by the “WRONG WRONG WRONG” assumption that the sensor would use the fact that an input is not in the to_state. So, my sensor was actually quite unbalanced, because I only had “confirming” inputs. Therefore, I added a few inputs with the other to_state. I came to realize the two probabilities given true would always sum to 1. Obviously, actually…

That made me wonder: is it good practice to include the other to_state always and by default? And exactly (and simply) with probabilities set to 1 - the other? Are there cases where you would advise against that?

In my sensors I usually have something for both states but they don’t always end up being directly related. TV on/off, lights on/off are really useful to have in both directions. I also use the https://www.home-assistant.io/integrations/tod/ sensor a lot in both the on and off state.

I’ve also had to recalibrate a lot of my bayes sensors with the new COVID landscape, my prior for the house being occupied isn’t 0.58 anymore. And the workday sensor isn’t really relevant. The worksheet really helped with that process.

Can I just add my gratitude for your post. I grappled with the Bayes sensor until I ultimately gave up on it. I knew enough about Bayes to know that most of the explanations were a little wonky but not enough to figure it out for myself. So thank you. I’ll have to give it another go.

Hi @JudoWill ,

First of all, this is awesome! Someone pointed me to this post when I asked how I could make a sensor that could tell me if someone is in bed or not. I think I understand most of what you just said (English is not my ‘default’ language, so it’s a little more difficult to understand sometimes).

I have one question; could you explain the next bit a little more. That is the only thing I get stuck at in the spreadsheet.

This is whether the condition is TRUE (in the “to_state”) or the value_template evaluates to TRUE

Thanks and stay safe! :slight_smile:

Thanks for the Google Sheet. Super helpful!

I think there’s a minor error in it. Cell B12 I think needs a +1 after the COUNTA to include the final row’s adjustment to the probability. i.e. it should be:
=INDEX(L:L,COUNTA(L:L)+1,1)

Or am I missing something?

Regarding templates I found out yesterday that using ‘last_changed’ and ‘now’ doesn’t work, unless you specify sensor.time as a triggering entity, since the template otherwise is only updated on a state change of an entity.

Can you post what you ended up using here? I think I’ve gotten my observations to say that I’ve gotten home working well, but then if Wi-Fi or Bluetooth disconnect (but I’ve been home for a while), I want to continue believing I’m home. Struggling to use time here, as you suggested you did.

You’re definitely right about the off-by-one error. I fixed it on my local copies but I guess it never made it back to the original here.

For the disconnecting, check out the delay_off parameters of the template binary sensors. Also, most of the bluetooth presence detectors have ways that you can extend the time it counts you as “present” after a disconnect. If you use a history_graph in your lovelace you can get an estimate of how long to make it.

I would also check out using the tod sensor. I’ve found defining a handful of time-of-day sensors is a pretty easy way to get bayes sensors more accurate. Individually they don’t provide “much” information, but they can help boost the signals of weaker or more intermittent sensors.

Is there a way to have the bayes sensor use “delay_off”? I’m using mine for room occupancy and I’m tying together a hue motion sensor and an ecobee occupancy sensor (as well as a few other inputs).

This is working pretty well but the occupancy sensor “flutters” with the motion sensor. I’m working around this by setting a time restriction on the automation (sensor must be off for 15 min) but it would be nice to have the occupancy wait 15 minutes before turning off.

To be clear, it doesn’t flutter when I have other inputs (like if my TV is on or music is playing) so it doesn’t happen all the time. On the upside, my lights shouldn’t turn off on their own anymore.

Sadly, my pull request to add delay_on and delay_off to the bayesian sensor was refused on somewhat dubious grounds, see https://github.com/home-assistant/core/pull/29122

I’m fairly sure they don’t actually understand why this is basically a necessity, but such is the nature of open source software. :confused:
I’m still a bit pissy about it because I put a lot of work into it…

As for the solution, I use (a lot of) one-off template binary sensors, such as this one for example:

 # buffer on time
- platform: template
  sensors:
    presence_keuken_15s:
      friendly_name: "max 15s geleden presence keuken"
      value_template: >
        {{
        is_state('binary_sensor.presence_keuken', 'on')
        }}
      delay_off:
        seconds: 15

Then just use this sensor instead of the normal one. Or define 2 observations, for instance one for the sensor itself and one template observation that checks if the sensor is off AND that this sensor is on, guaranteeing that you have the ‘timer’ situation. I tend to have a few of these and then have a ‘fall-off’ of probabilities the longer ago the sensor was on.
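For anyone wondering what such a ‘buffer’ sensor does, here is a rough Python sketch of the delay_off idea as I understand it: the buffered sensor stays on until the underlying sensor has been off for the full delay. The function name and the event list are hypothetical, and this is not how HA implements it internally.

```python
def state_with_delay_off(events, now, delay_off):
    """Return the buffered state at time `now`.

    events: chronological (time_in_seconds, is_on) readings of the raw sensor,
    all at or before `now`.
    """
    off_since = None  # start of the current uninterrupted "off" run
    seen_on = False
    for time, is_on in events:
        if is_on:
            seen_on = True
            off_since = None
        elif off_since is None:
            off_since = time
    if off_since is None:
        return seen_on  # raw sensor is currently on (or there is no data at all)
    # Raw sensor is off: stay on only while the off run is shorter than the delay.
    return seen_on and (now - off_since) < delay_off


motion = [(0, True), (100, False)]  # motion detected at t=0, cleared at t=100
print(state_with_delay_off(motion, now=110, delay_off=15))  # True
print(state_with_delay_off(motion, now=120, delay_off=15))  # False
```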

Just read your PR, completely agree with you. I also have a dozen or so template sensors implementing the delay_on & delay_off logic. I actually use a Jupyter Notebook to create all of the sensors as dict objects and then dump them to YAML. It helps when I have a few different constants that need to be propagated across different observations.
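A stripped-down sketch of that dict-to-YAML idea might look like the following. The helper names are invented for this example, the renderer is hand-rolled so the snippet runs without any YAML library (a real notebook would more likely use one), and pairing each “on” observation with its exact complement is an assumption, not a rule.

```python
# Shared constants, so one change propagates to every observation built from them.
P_TV_ON_HOME = 0.33   # p(TV on | occupied)
P_TV_ON_AWAY = 0.017  # p(TV on | not occupied)


def on_off_observations(entity_id, p_on_given_true, p_on_given_false):
    """Build matching "on" and "off" state observations (hypothetical helper)."""
    return [
        {"platform": "state", "entity_id": entity_id,
         "prob_given_true": p_on_given_true,
         "prob_given_false": p_on_given_false,
         "to_state": "on"},
        {"platform": "state", "entity_id": entity_id,
         "prob_given_true": round(1 - p_on_given_true, 3),
         "prob_given_false": round(1 - p_on_given_false, 3),
         "to_state": "off"},
    ]


def render_yaml_list(items, indent=2):
    """Render a list of flat dicts as YAML text; strings are always quoted."""
    lines = []
    for item in items:
        for position, (key, value) in enumerate(item.items()):
            marker = "- " if position == 0 else "  "
            text = f'"{value}"' if isinstance(value, str) else str(value)
            lines.append(f"{' ' * indent}{marker}{key}: {text}")
    return "\n".join(lines)


observations_yaml = "observations:\n" + render_yaml_list(
    on_off_observations("sensor.tv", P_TV_ON_HOME, P_TV_ON_AWAY)
)
print(observations_yaml)
```

Quoting every string also keeps on and off from being read as YAML booleans.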

The advantage I have found to having those intermediate sensors has been using the lovelace history_graph. I put all of the binary sensors in along with a template-sensor against the probability. It helps me diagnose false positives. I also use the history sensor to measure the length of the on state and make sure it matches my assumptions.

Wow, I’m surprised the pull request wasn’t merged. This seems logical to me. I mean, as I see it I should be able to use any property of the underlying sensor in the Bayes Sensor…but I guess it’s not designed that way. Shouldn’t the Bayes Sensor inherit the properties of the other sensors?