I started using the Bayesian binary sensor platform a few months ago. After working with it for a while, I have what I think is something akin to a best practice (at least for me).
I’ve seen several posts where people get somewhat tied up in what the parameters should be in the sensor. The sensor configuration is fairly simple (at least until you get into the details and try to make it work as expected).
Below is my sensor that decides if we’re away for an extended period (which we define as overnight). When it turns ON, it blocks some automations and enables others. For example, the morning coffee script doesn’t need to run if we’re not home, but I still want some lights to turn on and off in the evening and in the morning. The automations that use this sensor are pretty easy to set up: in the conditions section, normal automations (coffee run, get ready for work, etc.) require the sensor to be OFF, while the automations that run when we’re not at home require it to be ON.
After giving this some thought, there are a number of observations that could “indicate” we are away for an extended period (overnight or longer). These are:
- I’m 50-100 miles away
- My partner is 50-100 miles away
- My Google calendar has the word Hotel in it for the day
- One or both of us is more than 100 miles away (higher probability than the 50-100 mile range)
- No motion has been sensed by any of the house motion sensors (there are several scattered throughout the house, and the odds of us being gone are high if none of them has detected motion in the last 10 hours)
Setting up the Bayesian sensor is a two-step process - complete the top part and then add one or more observations.
The top portion has three parameters: a friendly name (which is turned into the Bayesian sensor name by making the words lower case and adding a “_” between the words), a prior probability, and a probability threshold. Don’t worry too much about the actual values for these probabilities right now. I always set the prior probability to 0.4, and then adjust it and the threshold until I get the desired outcome (more on that below).
Each observation is based on another entity (`sensor.xxx`, `binary_sensor.xxx`, `input_boolean.xxx`, etc). Below, I use binary sensors and input_booleans. There are four related to range from home (two for each of us), one that looks for the word “hotel” in my Google calendar, one that looks for no house motion within the last 10 hours, and then two overrides.
- platform: bayesian
  name: Extended Away
  prior: 0.4
  probability_threshold: 0.98
  observations:
    - entity_id: 'binary_sensor.kirby_far_range'
      prob_given_true: 0.7
      prob_given_false: 0.2
      platform: 'state'
      to_state: 'on'
    - entity_id: 'binary_sensor.sandy_far_range'
      prob_given_true: 0.7
      prob_given_false: 0.2
      platform: 'state'
      to_state: 'on'
    - entity_id: 'binary_sensor.kirby_extended_range'
      prob_given_true: 0.9
      prob_given_false: 0.1
      platform: 'state'
      to_state: 'on'
    - entity_id: 'binary_sensor.sandy_extended_range'
      prob_given_true: 0.9
      prob_given_false: 0.1
      platform: 'state'
      to_state: 'on'
    - entity_id: 'binary_sensor.staying_at_hotel'
      prob_given_true: 0.8
      prob_given_false: 0.2
      platform: 'state'
      to_state: 'on'
    - entity_id: 'binary_sensor.no_house_motion_long'
      prob_given_true: 0.95
      prob_given_false: 0.1
      platform: 'state'
      to_state: 'on'
    - entity_id: 'input_boolean.bay_enxtended_away_override_to_true'
      prob_given_true: 1.0
      prob_given_false: 0.0
      platform: 'state'
      to_state: 'on'
    - entity_id: 'input_boolean.bay_enxtended_away_override_to_false'
      prob_given_true: 0.0
      prob_given_false: 1.0
      platform: 'state'
      to_state: 'on'
Each observation has five parts:
- `entity_id` is the name of the entity that is being monitored (observed);
- `prob_given_true` is the probability that the entity is in `to_state` when the Bayesian sensor should be ON (here, when we really are away for the night);
- `prob_given_false` is the probability that the entity is in `to_state` when the Bayesian sensor should be OFF;
- `platform` is the kind of observation you are looking at (in the case of other binary sensors and input booleans, this will almost always be ‘state’);
- `to_state` is the state the observed entity has to be in for the observation to count. For example, from above, if `binary_sensor.no_house_motion_long` turns ON, then the probability is recalculated using that observation’s `prob_given_true` (0.95) and `prob_given_false` (0.1).
That’s all there is to setting up the observations.
Of course, we haven’t started the hard part yet - figuring out the correct values for all the `prob_given_true`, `prob_given_false`, `prior`, and `probability_threshold` parameters! Let’s tackle that issue. Let’s look first at the two kirby range observations - one is 50-100 miles away (`binary_sensor.kirby_far_range`) and the other is more than 100 miles away from home (`binary_sensor.kirby_extended_range`):
    - entity_id: 'binary_sensor.kirby_far_range'
      prob_given_true: 0.7
      prob_given_false: 0.2
      platform: 'state'
      to_state: 'on'
    - entity_id: 'binary_sensor.kirby_extended_range'
      prob_given_true: 0.9
      prob_given_false: 0.1
      platform: 'state'
      to_state: 'on'
If I’m 50-100 miles away (`kirby_far_range`), then I might be gone for the night. Or maybe I just visited someone and I’ll be back that night. If I am 50-100 miles away, I decided there was a 70% chance I might be gone for the night, so I set `prob_given_true: 0.7`. I set `prob_given_false: 0.2`. Why? I’ll explain how I came to those probs in a moment. If I’m more than 100 miles from home (`kirby_extended_range`), then I decided there was a higher chance that I wasn’t coming home for the night. The probabilities I decided on were 0.9 for true and 0.1 for false.
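To make those numbers concrete, here is a minimal Python sketch of a single update, assuming the sensor applies the standard Bayes' rule step (posterior = p_true * prior / (p_true * prior + p_false * (1 - prior))) whenever an observation is ON. The helper name `bayes_update` is mine for illustration, not something taken from Home Assistant.

```python
def bayes_update(prior, prob_given_true, prob_given_false):
    """One Bayes' rule step for a single active observation (illustrative helper)."""
    numerator = prob_given_true * prior
    denominator = numerator + prob_given_false * (1.0 - prior)
    return numerator / denominator


# Start from prior 0.4 with only one of the kirby range sensors ON.
far_only = bayes_update(0.4, 0.7, 0.2)       # kirby_far_range
extended_only = bayes_update(0.4, 0.9, 0.1)  # kirby_extended_range

print(round(far_only, 3), round(extended_only, 3))
```

Under that assumption, one range sensor on its own moves the probability to roughly 0.70 or 0.86, so neither gets close to the 0.98 threshold without help from other observations.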
NOTE: You can eliminate the `prob_given_false` parameter (and I probably should). If you do, the Bayesian component calculates `prob_given_false` as:
prob_given_false = 1.0 - prob_given_true
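A quick sketch of what that default means for the numbers, assuming the same Bayes' rule step as the rest of this post (the `bayes_update` helper and its optional argument are my own illustration, not the component's code):

```python
def bayes_update(prior, prob_given_true, prob_given_false=None):
    """One Bayes step; a missing prob_given_false falls back to 1 - prob_given_true."""
    if prob_given_false is None:
        prob_given_false = 1.0 - prob_given_true
    numerator = prob_given_true * prior
    return numerator / (numerator + prob_given_false * (1.0 - prior))


explicit = bayes_update(0.4, 0.7, 0.2)  # as configured for kirby_far_range
defaulted = bayes_update(0.4, 0.7)      # prob_given_false omitted, so 0.3 is used

print(round(explicit, 3), round(defaulted, 3))
```

So dropping `prob_given_false` only leaves the result unchanged where the two values already sum to 1 (the 0.9/0.1 pairs). For the 0.7/0.2 pairs the result shifts from about 0.70 to about 0.61 in this sketch.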
Similarly, I set the other parameters as shown in the config - again, don’t stress too much about what these values are. We’ll make this easier towards the end of this post.
With no other tools at our disposal, we now save the configuration file, restart HA, and then wait until some of the observations are met to see if we get the expected results. I didn’t want to wait until I was 100 miles away to see if the Bayesian sensor would trigger, but I didn’t know what else to do. So we went on a road trip one sunny afternoon. It didn’t work as expected (as an afterthought, I realized I could have set up dummy input booleans to trigger the observed binary sensors, but I hadn’t thought of that yet - and that’s still kind of a pain). Now I had to adjust the tuning parameters (probabilities and prior) and wait to see if those gave the expected results. Validating the Bayesian sensor model this way takes a lot of time. There has to be a better way!
And there is. I dug into the Bayesian sensor module, found the calculation, and built a simple Excel spreadsheet to model the behavior. A screenshot of the spreadsheet is:
How this works:
- The yellow highlighted cells are values you have to set in the Bayesian configuration;
- The row that starts with “Prior” is only relevant for the highlighted number - 0.400. That’s the `prior: 0.4` line in the config. The rest of the numbers in that row are calculated values (shown below);
- The ON/OFF row is used to simulate the conditions in the sensor. For example, if I want to see what would happen if we’re both 50-100 miles away, then I would set those two columns to 1 (the condition is TRUE); otherwise, it’s 0 (the condition is FALSE). They are shown as 1 in the screenshot;
- The rows with prob_true and prob_false are the tuning parameters that you have to provide numbers for in the config file (`prob_given_true` and `prob_given_false`);
- The Bayes’ Calc row holds calculated numbers (shown below);
- Current Prob is the value calculated by the Bayesian component (and shown below in Excel format) that is compared to the threshold value that you set in the config file (for this example, that’s 0.98). If this calculated number exceeds the threshold value, then the Bayesian binary sensor triggers to ON.
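For anyone who would rather not use Excel, the rows described above can be sketched in a few lines of Python. This is a sketch under two assumptions that match how I read the spreadsheet: each ON observation applies one Bayes' rule step and its result becomes the prior for the next column, and OFF observations leave the probability unchanged. The function and variable names are hypothetical.

```python
# (name, prob_given_true, prob_given_false), copied from the config in this post
OBSERVATIONS = [
    ("kirby_far_range", 0.7, 0.2),
    ("sandy_far_range", 0.7, 0.2),
    ("kirby_extended_range", 0.9, 0.1),
    ("sandy_extended_range", 0.9, 0.1),
    ("staying_at_hotel", 0.8, 0.2),
    ("no_house_motion_long", 0.95, 0.1),
]


def current_probability(prior, observations, active):
    """Chain Bayes updates over the observations whose ON/OFF flag is 1."""
    prob = prior
    for name, p_true, p_false in observations:
        if name in active:  # ON/OFF row set to 1 for this column
            numerator = p_true * prob
            prob = numerator / (numerator + p_false * (1.0 - prob))
    return prob


# Both of us 50-100 miles away, everything else OFF.
both_far = current_probability(0.4, OBSERVATIONS, {"kirby_far_range", "sandy_far_range"})
print(round(both_far, 3), both_far > 0.98)
```

With these inputs the sketch lands at about 0.89, below the 0.98 threshold, so both of us being in the 50-100 mile range would not trip the sensor by itself.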
Notice the last two columns that start with Override. I use these in all my Bayesian sensors. The overrides are simply `input_boolean` toggles that I can use to force the Bayesian sensor to ON or OFF. Setting up the probabilities as shown in the Excel and below forces the Bayesian sensor to ON:
prob_given_true: 1.0
prob_given_false: 0.0
Why do this? It’s a backdoor into the Bayesian calculation. For example, if we’re in a hotel 20 miles away, the Bayesian sensor won’t trip to ON, but I can turn on the `input_boolean` and it will force the probability to 1.0 (this is an artifact of the math behind the sensor calculation), which does trip the Bayesian sensor to ON. Similarly, I can use the OFF `input_boolean` to turn it OFF, if for some reason it’s ON when it shouldn’t be, by setting the probabilities to:
prob_given_true: 0.0
prob_given_false: 1.0
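The "artifact of the math" is easy to see in a small sketch, again assuming the plain Bayes' rule step (the helper name is mine): a `prob_given_false` of 0.0 removes the second term of the denominator, and a `prob_given_true` of 0.0 zeroes the numerator.

```python
def bayes_update(prior, prob_given_true, prob_given_false):
    """One Bayes' rule step for a single active observation (illustrative helper)."""
    numerator = prob_given_true * prior
    return numerator / (numerator + prob_given_false * (1.0 - prior))


forced_on = bayes_update(0.4, 1.0, 0.0)   # override_to_true toggled ON
forced_off = bayes_update(0.4, 0.0, 1.0)  # override_to_false toggled ON

print(forced_on, forced_off)
```

One caveat from the same arithmetic: in this sketch, turning both overrides ON at once makes the denominator zero on the second step, so I would treat that combination as undefined and avoid it.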
I attempted to upload this relatively simple spreadsheet, but the system doesn’t allow that. So the following three views show the calculations in the cells:
Once you have the first two columns (conditions) built, then it’s a matter of copying the second column one or more times to create the rest of the sensor conditions.
The last screenshot shows the override to ON condition:
The scenario that is modeled in the spreadsheet is if we were staying in a hotel (from my Google calendar) that was less than 50 miles from home and there hadn’t been any motion in the house for over 10 hours. The calculated prob is 0.962 which is less than the threshold value of 0.98. So I know I’d have to override it to ON in this particular case.
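That scenario can be checked by hand with a short sketch, assuming the standard Bayes' rule step applied once per ON observation (helper name hypothetical):

```python
def bayes_update(prior, prob_given_true, prob_given_false):
    """One Bayes' rule step for a single active observation (illustrative helper)."""
    numerator = prob_given_true * prior
    return numerator / (numerator + prob_given_false * (1.0 - prior))


prob = 0.4                             # prior
prob = bayes_update(prob, 0.8, 0.2)    # staying_at_hotel is ON
prob = bayes_update(prob, 0.95, 0.1)   # no_house_motion_long is ON

print(round(prob, 3), prob > 0.98)
```

This comes out at 0.962, the same value as the spreadsheet, and the comparison against 0.98 is False.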
Now that I have an Excel model, I can tune the probabilities and prior without having to restart HA and wait for the conditions to occur. I iterate through the possibilities (simulating which observations are TRUE or FALSE by simply changing that row to 1s and 0s). If I don’t hit the threshold when I think I should, I adjust the `prob_given_true` values (and maybe the `prob_given_false` values). Once I’m happy with when the Bayesian sensor trips to ON across the different combinations of observations, I load the values into the config file and restart HA once. This has saved me a ton of time. And the overrides give me a backdoor if the tuned probs aren’t quite giving me the expected outcome for a scenario I didn’t plan for.
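The same iteration can be automated. The sketch below walks every ON/OFF combination of the six non-override observations and collects the ones that exceed the threshold. It assumes the same update rule as the spreadsheet (ON observations apply a Bayes step, OFF observations are ignored); all names are hypothetical.

```python
from itertools import product

PRIOR, THRESHOLD = 0.4, 0.98

# name -> (prob_given_true, prob_given_false), copied from the config in this post
OBSERVATIONS = {
    "kirby_far_range": (0.7, 0.2),
    "sandy_far_range": (0.7, 0.2),
    "kirby_extended_range": (0.9, 0.1),
    "sandy_extended_range": (0.9, 0.1),
    "staying_at_hotel": (0.8, 0.2),
    "no_house_motion_long": (0.95, 0.1),
}


def current_probability(active):
    """Chain one Bayes step per active observation, starting from PRIOR."""
    prob = PRIOR
    for name in active:
        p_true, p_false = OBSERVATIONS[name]
        numerator = p_true * prob
        prob = numerator / (numerator + p_false * (1.0 - prob))
    return prob


names = list(OBSERVATIONS)
tripping = []
for flags in product([0, 1], repeat=len(names)):
    active = [n for n, flag in zip(names, flags) if flag]
    if current_probability(active) > THRESHOLD:  # "exceeds the threshold"
        tripping.append(active)

print(len(tripping), "of", 2 ** len(names), "combinations trip the sensor")
for combo in tripping[:5]:
    print(combo)
```

The loop also covers combinations that can't happen in practice (for example, being in the far and extended range at the same time), so the list is worth filtering by eye before changing any probabilities.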
If someone can tell me how to share the spreadsheet on the forum, I’m more than happy to do that!
And of course, I’m looking forward to comments on how others use the Bayesian sensor platform. I also use it to decide if my HVAC should be in cooling or heating mode, and whether or not we’re sleeping. I’m planning another one to decide what the thermostats should be set to, based on presence, holidays, work days, getting ready for work in the morning, etc.