Hi all,
I’m Will, a professor in Philadelphia, and I study the HIV epidemic. I’ve been an HA user for a while; you might know my previous post, “How Bayes Sensors work, from a Statistics Professor (with working Google Sheets!)”. I wanted to give the community here my perspective on tracking COVID-19 using Home Assistant at the local level.
In this post I will walk through creating an interactive tool using base HA components that will let you understand the local risk of COVID-19 that you have in your immediate area. I suggest reading in full as you’ll have to make a lot of non-automatable decisions and go on some sleuthing expeditions. Following along with the logic will help you adapt it to your specific situation.
I know this is an international audience, so this will apply differently to everyone. I am in Philadelphia, PA, USA with 76 confirmed cases and zero deaths so far. So, we’re very early. But I’m going to try to tailor the HA code I’m sharing to a wide range of individual situations. Also, I mostly do my HA development through the templating engine. If someone wants to convert this stuff to an integration, I’m happy to collaborate; I’m just not good with the HA backend.
Now, our goals: 1) estimate the true number of cases in our area given the number of deaths, and 2) quantify the maximum number of individuals that could gather with an acceptably low risk of ANYONE in the group being infected. This last number will be useful as a metric for whether it is safe to go grocery shopping or whether you shouldn’t even leave your house. The answer to the first will let us calculate the second.
I’m drawing this methodology from the Medium article “Coronavirus: Why you must act now”. In it the author provides a handful of Google Sheets to help you estimate the number of TRUE CASES in your area from the number of fatalities. He also describes a method based on confirmed cases, but here in the US testing has been so terrible as to make that virtually useless. I believe I can at least trust the government to count the dead.
“But Home Assistant has an integration to track COVID,” I hear you say. For some people that may be enough, but here in the US the country is vast, and the death count across the whole country means very little to my individual situation. “Build an integration to scrape local data,” I hear you say. Good luck: every state, county, and city is putting together its own website (none of which allow downloads or provide machine-readable formats), and those websites are changing structure every 12 hours as they figure out their response. You’d spend all your time fighting that. Instead, we’ll just use our eyes and a few input_number entities.
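For reference, here is a minimal sketch of the input_number that the sensors later in this post read from. The entity name matches the one used below (input_number.greater_pa_covid_deaths); the display name, the max of 10000 and the box mode are just assumptions, so adjust them to your own area.

```yaml
# configuration.yaml
input_number:
  greater_pa_covid_deaths:
    name: Greater PA COVID Deaths   # assumed display name
    min: 0
    max: 10000                      # assumed upper bound, raise it if needed
    step: 1
    mode: box                       # type the number in instead of dragging a slider
```

Each time your local health department updates its numbers, type the new death count into this entity and everything downstream recalculates.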
Before we get started programming, think about your “population”. This should include all people within 3-4 interactions of your daily life: who are the people that interact with the people that interact with the people that interact with you? I live in the city, so I would assume that would be everyone in Philadelphia County or its 7 bordering counties across PA and NJ. Think about your situation and come up with a reasonable guess. Do some internet sleuthing and find some state/local government websites that break the data out by county. Bookmark those for later. I put mine in a markdown lovelace card.
content: >
  # Links for quick Info

  [PA Health Department
  Info](https://www.health.pa.gov/topics/disease/Pages/Coronavirus.aspx)

  [Philly Health Department
  Info](https://www.phila.gov/services/mental-physical-health/environmental-health-hazards/covid-19/whats-new/)

  [Camden Health
  Department](https://www.camdencounty.com/service/health-human-services/covid-19-updates/)
type: markdown
While you’re Googling, find the combined population of all of those counties. Mine comes to 5,370,926. Save that number, we’ll need it later.
Now for some biostats. First, in words:
As epidemiologists track deadly outbreaks they focus on only a few numbers: the number of deaths, the time between infection and death, the doubling time of the infection, and the percentage of infections that lead to death (the case fatality rate, CFR). The logic is that if we know the number of deaths and the percentage of infections that lead to death, we can calculate the total number of infections. But remember, there’s a time lag between infection and death, so the deaths we see now are due to infections in the past. We can then use the doubling time to estimate the number of TRUE infections right now and project that into the future. We can also use the CFR to calculate the number of deaths in the future to check our predictions and adjust parameters.
I’m taking the constants from either Worldometer (for the CFR) or the Medium article referenced above.
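Written as equations, the logic above looks roughly like this. The symbols are shorthand introduced just for this sketch: $D$ is the number of deaths so far, $L$ the lag between infection and death in days, $T_d$ the doubling time in days, and $t$ the number of days into the future.

```latex
\begin{align*}
C_{\text{lagged}} &= \frac{D}{\mathrm{CFR}}
  && \text{true cases } L \text{ days ago} \\
C_{\text{now}} &= C_{\text{lagged}} \cdot 2^{L/T_d}
  && \text{true cases today} \\
C(t) &= C_{\text{now}} \cdot 2^{t/T_d}
  && \text{true cases in } t \text{ days} \\
\hat{D}(t) &= \mathrm{CFR} \cdot C(t)
  && \text{deaths eventually expected from those cases}
\end{align*}
```

With the constants used in this post ($D = 1$, $\mathrm{CFR} = 0.04$, $L = 17.3$, $T_d = 6.18$) that gives $C_{\text{lagged}} = 25$, about $2.8$ doublings during the lag, and $C_{\text{now}} \approx 25 \cdot 2^{2.8} \approx 174$ true cases today.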
In a Jinja template:
{% set deaths = 1 %}
{% set lag = 17.3 %}
{% set doubling_time = 6.18 %}
{% set CFR = 0.04 %}
{% set doublings_during_lag = lag/doubling_time %}
{% set true_cases_lagged = deaths/CFR %}
{{ lag }} days ago there were ~{{ true_cases_lagged | round }} true cases.
{% set true_cases_now = true_cases_lagged*(2**doublings_during_lag) %}
During that time there were {{ doublings_during_lag | round(1) }} doublings.
This means there are {{ true_cases_now | round }} cases now.
{% set true_tom = (true_cases_now * (2**(1/doubling_time))) | round %}
{% set true_two = (true_cases_now * (2**(2/doubling_time))) | round %}
{% set true_week = (true_cases_now * (2**(7/doubling_time))) | round %}
{# Filters bind tighter than *, so the product needs parentheses before rounding. #}
Tomorrow there will be {{ true_tom }} cases with {{ (true_tom*CFR) | round }} deaths.
In 2 days there will be {{ true_two }} cases with {{ (true_two*CFR) | round }} deaths.
In a week there will be {{ true_week }} cases with {{ (true_week*CFR) | round }} deaths.
You can use the template editor to play around with the model and get a feel for it. Then you can grab out the sensors that are relevant to you. I made these:
- platform: template
  sensors:
    current_covid_cases:
      friendly_name: Current COVID Cases
      unit_of_measurement: 'people'
      value_template: >-
        {% set deaths = states.input_number.greater_pa_covid_deaths.state | float %}
        {% set lag = 17.3 %}
        {% set doubling_time = 6.18 %}
        {% set CFR = 0.04 %}
        {% set doublings_during_lag = lag/doubling_time %}
        {% set true_cases_lagged = deaths/CFR %}
        {% if deaths > 0 %}
        {{ (true_cases_lagged*(2**doublings_during_lag)) | round }}
        {% else %}
        {{ ((5/CFR)* 2**(lag/doubling_time)) | round }}
        {% endif %}
    tom_covid_cases:
      friendly_name: Tomorrow's COVID Cases
      unit_of_measurement: 'people'
      value_template: >-
        {% set doubling_time = 6.18 %}
        {{ ((states.sensor.current_covid_cases.state|float) * (2**(1/doubling_time))) | round }}
    tom_covid_deaths:
      friendly_name: Tomorrow's COVID Deaths
      unit_of_measurement: 'people'
      value_template: >-
        {% set CFR = 0.04 %}
        {{ ((states.sensor.tom_covid_cases.state | float) * CFR) | round }}
    week_covid_cases:
      friendly_name: COVID Cases in 7 days
      unit_of_measurement: 'people'
      value_template: >-
        {% set doubling_time = 6.18 %}
        {{ ((states.sensor.current_covid_cases.state|float) * (2**(7/doubling_time))) | round }}
    week_covid_deaths:
      friendly_name: COVID Deaths in 7 days
      unit_of_measurement: 'people'
      value_template: >-
        {% set CFR = 0.04 %}
        {{ ((states.sensor.week_covid_cases.state | float) * CFR) | round }}
You’ll notice on the current_covid_cases template I used an if tag to deal with situations where the number of deaths is 0. With how poor testing is in the US, you should probably hedge and assume the values associated with 5 deaths until you have concrete data.
Cool, now you can use your input_number to experiment and see how this changes things. That’s a perfectly useful tool for scaring the shit out of you, but it doesn’t give you any actionable information. We’ll use this information now to calculate a maximum safe group size.
First, in words. If we know (or think we know) the number of true cases in a population of known size we can calculate the likelihood of any individual having been infected. With that we can calculate the likelihood that any group of X individuals has ZERO infected people. Since we can’t enforce that risk to be exactly 0% (that only happens in group sizes of 0 people) we have to tolerate some level of risk. I’m willing to have a 1% risk (you can tailor accordingly). Using the magic of logarithms we can then calculate the maximum safe size. This could be used to modulate your behaviour.
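As a sketch of the “magic of logarithms” step: the symbols below are shorthand introduced for this sketch, with $C$ the estimated true cases, $N$ the population, $n$ the group size and $r$ the risk you are willing to tolerate. It assumes every person in the group is an independent random draw from the population.

```latex
\begin{align*}
p &= \frac{C}{N}
  && \text{chance a random person is infected} \\
P(\text{nobody infected}) &= (1-p)^{n}
  && \text{group of } n \text{ people} \\
1 - (1-p)^{n} &\le r
  && \text{risk we are willing to tolerate} \\
n &\le \frac{\ln(1-r)}{\ln(1-p)} = \log_{1-p}(1-r)
  && \text{maximum safe group size}
\end{align*}
```

The inequality flips in the last step because $\ln(1-p)$ is negative. As a rough check, with about 174 cases (what the constants in this post give for one death), $N = 5{,}370{,}926$ and $r = 0.01$, this comes out to a group of roughly 310 people.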
In a template:
{% set current_cases = states.sensor.current_covid_cases.state | float %}
{% set tom_cases = states.sensor.tom_covid_cases.state | float %}
{% set week_cases = states.sensor.week_covid_cases.state | float %}
{% set population = 5370926 | float %}
{% set risk = 0.01 %}
{% set current_healthy_rate = (1-(current_cases/population)) %}
According to the data there are {{ current_cases }} cases in your area with a population of {{ population }}.
This implies that {{ (current_healthy_rate*100) | round(3) }}% of people are healthy.
In a group of 5 people at a dinner party there is a {{ (100*(1-current_healthy_rate**5)) | round(5) }}% chance that there is 1 or more infected people.
In a group of 50 people at a small store there is a {{ (100*(1-current_healthy_rate**50)) | round(5) }}% chance that there is 1 or more infected people.
In a group of 500 people at a large store there is a {{ (100*(1-current_healthy_rate**500)) | round(5) }}% chance that there is 1 or more infected people.
In an event of 5000 people there is a {{ (100*(1-current_healthy_rate**5000)) | round(5) }}% chance that there is 1 or more infected people.
In a large event of 50000 people there is a {{ (100*(1-current_healthy_rate**50000)) | round(5) }}% chance that there is 1 or more infected people.
{% set current_max_size = (1-risk) | log(current_healthy_rate) %}
A crowd size of {{ current_max_size | round }} is the largest crowd where the risk of 1 or more infected people is below {{100*risk}}%.
{% set week_healthy_rate = (1-(week_cases/population)) %}
In a week there will be {{ week_cases }} cases in your area, meaning {{ (week_healthy_rate*100) | round(3) }}% of people are healthy.
In a group of 5 people at a dinner party there is a {{ (100*(1-week_healthy_rate**5)) | round(5) }}% chance that there is 1 or more infected people.
In a group of 50 people at a small store there is a {{ (100*(1-week_healthy_rate**50)) | round(5) }}% chance that there is 1 or more infected people.
In a group of 500 people at a large store there is a {{ (100*(1-week_healthy_rate**500)) | round(5) }}% chance that there is 1 or more infected people.
In an event of 5000 people there is a {{ (100*(1-week_healthy_rate**5000)) | round(5) }}% chance that there is 1 or more infected people.
In a large event of 50000 people there is a {{ (100*(1-week_healthy_rate**50000)) | round(5) }}% chance that there is 1 or more infected people.
{% set week_max_size = (1-risk) | log(week_healthy_rate) %}
A crowd size of {{ week_max_size | round }} is the largest crowd where the risk of 1 or more infected people is below {{100*risk}}%.
And then I pulled out these sensors.
- platform: template
  sensors:
    today_max_size:
      friendly_name: "Today Max Safe Group Size"
      unit_of_measurement: 'people'
      value_template: >-
        {% set cases = states.sensor.current_covid_cases.state | float %}
        {% set population = 5370926 | float %}
        {% set risk = 0.01 %}
        {% set healthy_rate = (1-(cases/population)) %}
        {% set max_size = (1-risk) | log(healthy_rate) %}
        {{ max_size | round }}
    tomorrow_max_size:
      friendly_name: "Tomorrow's Max Safe Group Size"
      unit_of_measurement: 'people'
      value_template: >-
        {% set cases = states.sensor.tom_covid_cases.state | float %}
        {% set population = 5370926 | float %}
        {% set risk = 0.01 %}
        {% set healthy_rate = (1-(cases/population)) %}
        {% set max_size = (1-risk) | log(healthy_rate) %}
        {{ max_size | round }}
    week_max_size:
      friendly_name: "Next Week's Max Safe Group Size"
      unit_of_measurement: 'people'
      value_template: >-
        {% set cases = states.sensor.week_covid_cases.state | float %}
        {% set population = 5370926 | float %}
        {% set risk = 0.01 %}
        {% set healthy_rate = (1-(cases/population)) %}
        {% set max_size = (1-risk) | log(healthy_rate) %}
        {{ max_size | round }}
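If you want Home Assistant to nudge you when things change, something like the following automation could work. This is only a sketch: the threshold of 50 people and the notification text are arbitrary assumptions, and it relies on the sensor.today_max_size entity defined above.

```yaml
automation:
  - alias: "Warn when the safe group size gets small"
    trigger:
      - platform: numeric_state
        entity_id: sensor.today_max_size
        below: 50            # assumed threshold, pick what matters to you
    action:
      - service: persistent_notification.create
        data:
          title: "COVID-19 local risk"
          message: "The max safe group size has dropped below 50 people."
```

The numeric_state trigger fires once when the value crosses below the threshold, not on every update, so it should not spam you as the numbers keep shrinking.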
For me the relevant numbers are the max group sizes, but if you’re in an office deciding about closing (you should; please READ the Medium article), or a business weighing options, you could make a sensor for your employee count and decide when that likelihood gets too high. You can also just use the input_number to get a sense of what life may be like for the next few weeks. If you “walk along” and repeatedly put the “tomorrow” number into the “today” input_number, you can see how exponential growth really makes this disease dangerous.
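As a sketch of the employee-size idea, a sensor along these lines reports the chance that at least one person in a workplace is infected. The entity name office_covid_risk and the headcount of 50 are hypothetical; the population is the same number used in the other sensors in this post.

```yaml
- platform: template
  sensors:
    office_covid_risk:          # hypothetical entity name
      friendly_name: "Office Infection Risk"
      unit_of_measurement: '%'
      value_template: >-
        {% set cases = states.sensor.current_covid_cases.state | float %}
        {% set population = 5370926 | float %}
        {# Assumed headcount; replace with your own. #}
        {% set employees = 50 %}
        {% set healthy_rate = (1-(cases/population)) %}
        {{ (100*(1-healthy_rate**employees)) | round(2) }}
```

This is the same calculation as the “group of 50 people” line in the template above, just pulled out into its own entity so you can watch it over time.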
I just put them into an entities card.
entities:
  - entity: input_number.greater_pa_covid_deaths
  - entity: sensor.current_covid_cases
  - entity: sensor.tom_covid_cases
  - entity: sensor.tom_covid_deaths
  - entity: sensor.week_covid_cases
  - entity: sensor.week_covid_deaths
  - entity: sensor.today_max_size
  - entity: sensor.tomorrow_max_size
  - entity: sensor.week_max_size
title: Greater PA True Cases
type: entities
I’ll make some plots and use the “deaths tomorrow” and “deaths next week” sensors to assess how well we’re bending the curve. This model, by its very nature, assumes a “well mixed population” in which infections keep doubling unchecked. If we plot the number of deaths predicted for the future alongside the actual number of deaths, we can see how well the social distancing measures are helping: actual deaths coming in below the prediction would suggest the curve is bending. I’ll make a follow-up post in a week or so with plots for my area. Follow along and do it yourself if you’d like.
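One low-effort way to start collecting such a plot is a history-graph card. This is a sketch rather than the plots promised above: the title and the 7-day window are assumptions, and the entities are the ones defined earlier in this post.

```yaml
type: history-graph
title: Predicted vs. Actual Deaths   # assumed title
hours_to_show: 168                   # one week
entities:
  - entity: input_number.greater_pa_covid_deaths
  - entity: sensor.tom_covid_deaths
  - entity: sensor.week_covid_deaths
```

Keep in mind that each prediction is plotted at the moment it was made, so you have to compare it by eye against the actual count one day or seven days later. As far as I know, entities only share a line chart when they have the same unit_of_measurement, so you may want to give the input_number a unit of 'people' as well.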
A note of caution. LISTEN TO YOUR LOCAL AUTHORITIES. If they say to stay home, STAY HOME! They have data available to them and people with much better models who have far more information. This is a very basic model that will only give you an idea of what’s happening, and it SHOULD NOT be considered an excuse to hold your 1000-person concert or keep your 50-person office open. STAY SMART. STAY SAFE. STAY HOME!