Link to the GitHub pull request on the same topic: Improve code quality of Bayesian integration by HarvsG · Pull Request #79098 · home-assistant/core · GitHub
The Bayesian integration has the power to become a new and very powerful ‘Helper’ with minimal configuration that would allow users to create Virtual Sensors that measure the unmeasurable and save significant money.
The Bayesian integration has a very small user base, appearing in only 700 installations according to analytics. However, it has been called ‘The Most Powerful Home Assistant Sensor’ and is advocated by some of Home Assistant’s most influential members, e.g. those at the Home Assistant Podcast.
I think this ‘hidden gem’ of Home Assistant has been the preserve of power users but not lighter users for several reasons:
- Unfamiliar name unless you know about Bayesian statistics
- Unclear use: the name is not self-explanatory
- YAML configuration
- Cognitive work required to set configuration variables
- Previous bugs
I think that by addressing these issues, the Bayesian integration could become a very useful, even important, feature in Home Assistant.
Benefits:
- Save users money by emulating other sensors (e.g. presence detection)
- Measure/estimate states that are very hard or impossible to measure directly (e.g. activities like sleeping or cooking)
  - In some cases this could also be solved by complex conditional logic in automations or template YAML.
- Allow more robust and reliable measurement by combining sensors (e.g. pings to Google, Cloudflare and Microsoft to test whether the internet connection is up)
Limitations:
- Requires enough (>1) sensors that correlate with what you want to measure
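As a rough illustration of the "combining sensors" benefit, here is a minimal Python sketch of a textbook Bayes update applied to the ping example. It is not the integration's actual code: the function names, the prior of 0.9 and the per-sensor probabilities are all made up for illustration, and treating a failed ping as evidence via the complementary probabilities is an assumption of this sketch.

```python
def bayes_update(prior: float, prob_given_true: float, prob_given_false: float) -> float:
    """Apply Bayes' rule once and return the posterior probability."""
    numerator = prob_given_true * prior
    return numerator / (numerator + prob_given_false * (1.0 - prior))


def combine(prior: float, observations: list[tuple[float, float, bool]]) -> float:
    """Fold several observations into one probability, one after another.

    Each observation is (prob_given_true, prob_given_false, is_active).
    An inactive observation is also used as evidence, with the
    complementary probabilities (an assumption of this sketch).
    Observations are treated as independent of each other.
    """
    posterior = prior
    for p_true, p_false, is_active in observations:
        if not is_active:
            p_true, p_false = 1.0 - p_true, 1.0 - p_false
        posterior = bayes_update(posterior, p_true, p_false)
    return posterior


# Hypothetical values: three ping sensors, one of which is currently failing.
pings = [
    (0.98, 0.02, True),   # ping to Google succeeds
    (0.98, 0.02, True),   # ping to Cloudflare succeeds
    (0.98, 0.02, False),  # ping to Microsoft fails
]
internet_up = combine(0.9, pings)
print(f"P(internet is up) = {internet_up:.4f}")
```

With these made-up numbers, a single failed ping barely moves the estimate because the other two sensors outvote it, which is the robustness argument in a nutshell.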
A number of steps would need to be taken to realize its potential, and I would suggest the following:
- Code review and optimisation
  - Static typing of `binary_sensor.py`: Improve typing and code quality in beyesian by HarvsG · Pull Request #79603 · home-assistant/core · GitHub
  - Enable multiple `numeric state` entries for one entity: Accept more than 1 state for numeric entities in Bayesian by HarvsG · Pull Request #119281 · home-assistant/core · GitHub
  - Be able to handle `UNKNOWN` and `UNAVAILABLE` as states that update the prior (will be useful for some use cases)
  - Enable device classes, including device tracker: Bayesian - already supports device class by HarvsG · Pull Request #24458 · home-assistant/home-assistant.io · GitHub
  - Support `unique_id`: Bayesian - support `unique_id:` by HarvsG · Pull Request #79879 · home-assistant/core · GitHub
- Better testing, error tolerance
  - More testing of template entities
  - One test that combines several different state types
  - Clarify any errors caused by premature rounding/approximation: the number of decimal places can matter a lot in Bayesian probability
  - Tests for the new features in items 1 and 3 (code review and optimisation; improving the current experience)
  - More validation and more tests of behaviour with incorrect config values (probabilities of 1, 0, negative and >1)
  - If kept in, a better test for numeric states with multiple ranges for one entity (negative observations should be ignored), which should warn if some values are not covered and the probabilities do not sum to 1
- Improve current experience
  - Create a Repairs notification for breaks caused by #67631
    - Configs without `prob_given_false`: Add repair for missing Bayesian `prob_given_false` by HarvsG · Pull Request #79303 · home-assistant/core · GitHub
    - Configs that contain duplicate, mirrored entries for an entity: Add to issue registry if user has mirrored entries for breaking in #67631 by HarvsG · Pull Request #79208 · home-assistant/core · GitHub
    - Warn on odd configurations (such as `prob_given_false` ~ `prob_given_true`)
  - Update documentation
    - Glossary of terms (prior, posterior etc.)
    - Explain similarity of probs to sensitivity and 1-specificity
    - Link to a Bayes calculator: Bayes Theorem Calculator with Formula & Examples
    - Better examples using template entries
    - Multi-state examples including `UNKNOWN` (when feature added)
- UI configurations
  - Simple config flow for binary entities (+/- `UNKNOWN` state, default to ignore) with appropriate verbosity
    - Probably needs to be in percentages
  - UI for numeric and template configs
  - Use history stats to suggest values for `prob_given_true` and `prob_given_false`. How to get history data for other sensors?
    - User would need to input a time range (using scheduling UI) when `true/false` and HA would spit out values (should adjust 0 and 1 to 0.001 and 0.999 respectively)
    - Would not work for attributes
    - Would be harder for numeric and template values
  - Use history stats to suggest most informative entities
    - https://stats.stackexchange.com/questions/588458/how-to-select-the-best-binary-inputs-for-a-bayesian-model
    - Since `prob_given_true` is the sensitivity and `prob_given_false` is 1 - specificity, we can calculate positive and negative likelihood ratios
    - We can suggest variables based on the sum of their likelihood ratios (+LR + 1/-LR)
    - As users add suggested entities to the model we can show historical accuracy, false negative and false positive rates
    - We can re-order the list to suggest entities with the needed rule-in / rule-out
    - PulmCrit – Mythbusting sensitivity and specificity
- Automate item 4 (UI configurations) for users. They simply specify time period(s) that are `True` for the Bayesian sensor and HA does the rest.
- Specialised config-flows for virtual presence, sleep and other ‘popular’ Bayesian sensors.
- When available, create an official template blueprint that pulls the `probability` attribute as a numeric sensor.
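To make the history-based suggestions in the UI item more concrete, here is a rough Python sketch of how `prob_given_true` and `prob_given_false` could be estimated from labelled history samples, and how candidate entities could then be ranked with the `+LR + 1/-LR` score mentioned above. Everything here is hypothetical: the function names are mine, the samples are invented, and this does not use any Home Assistant history or recorder API. It only assumes that the history has already been reduced to aligned true/false samples.

```python
def clamp(probability: float, low: float = 0.001, high: float = 0.999) -> float:
    """Keep estimates away from 0 and 1, as suggested in the list above."""
    return min(max(probability, low), high)


def estimate_probabilities(target: list[bool], observed: list[bool]) -> tuple[float, float]:
    """Estimate (prob_given_true, prob_given_false) from aligned samples.

    target[i] says whether the thing being estimated was true in sample i
    (e.g. derived from the user-supplied time ranges); observed[i] says
    whether the candidate entity was in its active state at that time.
    """
    true_total = sum(target)
    false_total = len(target) - true_total
    if true_total == 0 or false_total == 0:
        raise ValueError("need samples from both true and false periods")
    active_when_true = sum(1 for t, o in zip(target, observed) if t and o)
    active_when_false = sum(1 for t, o in zip(target, observed) if not t and o)
    return clamp(active_when_true / true_total), clamp(active_when_false / false_total)


def likelihood_ratios(prob_given_true: float, prob_given_false: float) -> tuple[float, float]:
    """Return (+LR, -LR): sensitivity / (1 - specificity) and (1 - sensitivity) / specificity."""
    positive_lr = prob_given_true / prob_given_false
    negative_lr = (1.0 - prob_given_true) / (1.0 - prob_given_false)
    return positive_lr, negative_lr


def informativeness(prob_given_true: float, prob_given_false: float) -> float:
    """Score an entity as +LR + 1/-LR; higher means more informative."""
    positive_lr, negative_lr = likelihood_ratios(prob_given_true, prob_given_false)
    return positive_lr + 1.0 / negative_lr


# Invented samples: the user marked the first four samples as "true" periods.
target = [True, True, True, True, False, False, False, False]
motion = [True, True, True, False, False, False, False, True]

p_true, p_false = estimate_probabilities(target, motion)
print(p_true, p_false, informativeness(p_true, p_false))
```

Because the estimates are clamped before the ratios are computed, neither division can hit zero. A real implementation would also need to decide how to sample history (by time or by state change), which this sketch leaves open.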
I would be grateful for your thoughts, and for any views on how this might conflict with or help how you use the Bayesian integration.
N.B. Config subentries may become useful.