SNMP bandwidth monitor using statistics and a utility meter

I’ve built a bandwidth monitor based on this and this post. It almost works. I have read through this post and other related ones to see if I’ve missed something but to no avail.

My goal is to see instantaneous usage, but especially daily and monthly totals. The former works, but the latter two seem to be off. I’m using a utility meter for those.

Here is my config (as a minimal example – my full config can be seen here):

input_number:
  wan_traffic_delta_in:
    min: 0
    max: 4294967295
  wan_traffic_delta_out:
    min: 0
    max: 4294967295

utility_meter:
  daily_internet_usage_in:
    source: sensor.internet_usage_in
    cycle: daily
  daily_internet_usage_out:
    source: sensor.internet_usage_out
    cycle: daily
  monthly_internet_usage_in:
    source: sensor.internet_usage_in
    cycle: monthly
  monthly_internet_usage_out:
    source: sensor.internet_usage_out
    cycle: monthly

automation:
  - alias: "Monitor Inbound Internet Traffic"
    trigger:
      platform: state
      entity_id: sensor.snmp_wan_in
    action:
      - service: input_number.set_value
        data_template:
          entity_id: input_number.wan_traffic_delta_in
          value: '{{ ((trigger.to_state.state | int - trigger.from_state.state | int) * 8 ) / ( as_timestamp(trigger.to_state.last_updated) - as_timestamp(trigger.from_state.last_updated) ) }}'
  - alias: "Monitor Outbound Internet Traffic"
    trigger:
      platform: state
      entity_id: sensor.snmp_wan_out
    action:
      - service: input_number.set_value
        data_template:
          entity_id: input_number.wan_traffic_delta_out
          value: '{{ ((trigger.to_state.state | int - trigger.from_state.state | int) * 8 ) / ( as_timestamp(trigger.to_state.last_updated) - as_timestamp(trigger.from_state.last_updated) ) }}'

sensor:
  - platform: snmp
    name: snmp_wan_in
    host: 192.168.0.1
    community: Router
    version: 2c
    baseoid: 1.3.6.1.2.1.2.2.1.10.18  # ifInOctets.18
    unit_of_measurement: octets
  - platform: snmp
    name: snmp_wan_out
    host: 192.168.0.1
    community: Router
    version: 2c
    baseoid: 1.3.6.1.2.1.2.2.1.16.18  # ifOutOctets.18
    unit_of_measurement: octets
  - platform: statistics
    name: 'Internet Traffic In'
    entity_id: sensor.internet_speed_in
  - platform: statistics
    name: 'Internet Traffic Out'
    entity_id: sensor.internet_speed_out
  - platform: template
    sensors:
      internet_speed_in:
        value_template: '{{ ((states.input_number.wan_traffic_delta_in.state | float ) / 1000000 ) | round(3) }}'
        unit_of_measurement: 'Mbps'
      internet_speed_out:
        value_template: '{{ ((states.input_number.wan_traffic_delta_out.state | float ) / 1000000 ) | round(3) }}'
        unit_of_measurement: 'Mbps'
      internet_usage_in:
        value_template: '{{ ((states.input_number.wan_traffic_delta_in.state | float ) / 1000000000 / 8 ) | round(3) }}'
        unit_of_measurement: 'GB'
      internet_usage_out:
        value_template: '{{ ((states.input_number.wan_traffic_delta_out.state | float ) / 1000000000 / 8 ) | round(3) }}'
        unit_of_measurement: 'GB'
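
For anyone who wants to sanity-check the arithmetic in the automation templates outside Home Assistant, here is a minimal Python sketch of the same rate calculation. The function name `rate_mbps` and the sample values are made up for illustration; like the templates above, it does no wrap-around handling.

```python
from datetime import datetime, timedelta


def rate_mbps(prev_octets: int, curr_octets: int,
              prev_time: datetime, curr_time: datetime) -> float:
    """Average rate in Mbps between two octet-counter samples.

    Mirrors the template: (delta octets * 8) / delta seconds, then / 1e6.
    No wrap-around handling (illustrative only).
    """
    elapsed = (curr_time - prev_time).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    bits = (curr_octets - prev_octets) * 8
    return round(bits / elapsed / 1_000_000, 3)


# Hypothetical samples: 12,500,000 octets transferred in 10 seconds.
t0 = datetime(2022, 1, 1, 12, 0, 0)
t1 = t0 + timedelta(seconds=10)
print(rate_mbps(1_000_000, 13_500_000, t0, t1))  # 100,000,000 bits / 10 s = 10.0 Mbps
```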

In my ui-lovelace.yaml I have:

          - type: history-graph
            entities:
              - entity: sensor.internet_speed_in
              - entity: sensor.internet_speed_out
            hours_to_show: 24
            refresh_interval: 60
          - type: glance
            title: Today
            show_name: false
            columns: 2
            entities:
              - entity: sensor.daily_internet_usage_in
                name: Today
              - entity: sensor.daily_internet_usage_out
                name: Today
          - type: glance
            title: This Month
            show_name: false
            columns: 2
            entities:
              - entity: sensor.monthly_internet_usage_in
                name: This Month
              - entity: sensor.monthly_internet_usage_out
                name: This Month

What I see:

What my ISP says:

I’m pretty confident that the instantaneous part is working correctly, as can be seen from my history graph: it’s a line with a 10 Mbps downlink and a 5 Mbps uplink.

I’m not sure whether it will be sensible to start fiddling with sampling sizes and scan intervals.

Am I misunderstanding how a utility meter could be used for this or do I have an error in my calculations or config?

EDIT: Is the problem perhaps that I shouldn’t be feeding deltas to a utility meter?

UPDATE (2022-02-20): I’ve been lazy about this error for a long time, but counters like these wrap around and then will cause errors using a naive implementation. I’ve at last fixed this issue. Herewith the updated automations:

- alias: "Monitor Inbound Internet Traffic"
  trigger:
    platform: state
    entity_id: sensor.snmp_wan_in
  action:
    - service: input_number.set_value
      data:
        entity_id: input_number.wan_traffic_delta_in
        value: >-
          {# safe delta catering for wrap-around of the 32-bit unsigned SNMP counter #}
          {# i.e. (to - from) modulo 2^32 #}
          {% set from = trigger.from_state.state | int %}
          {% set to = trigger.to_state.state | int %}
          {% set traffic_delta = (to - from) if (to >= from) else (4294967295 - from + to + 1) %}
          {% set time_delta = as_timestamp(trigger.to_state.last_updated) - as_timestamp(trigger.from_state.last_updated) %}
          {{ (traffic_delta * 8) / time_delta }}

- alias: "Monitor Outbound Internet Traffic"
  trigger:
    platform: state
    entity_id: sensor.snmp_wan_out
  action:
    - service: input_number.set_value
      data:
        entity_id: input_number.wan_traffic_delta_out
        value: >-
          {# safe delta catering for wrap-around of the 32-bit unsigned SNMP counter #}
          {# i.e. (to - from) modulo 2^32 #}
          {% set from = trigger.from_state.state | int %}
          {% set to = trigger.to_state.state | int %}
          {% set traffic_delta = (to - from) if (to >= from) else (4294967295 - from + to + 1) %}
          {% set time_delta = as_timestamp(trigger.to_state.last_updated) - as_timestamp(trigger.from_state.last_updated) %}
          {{ (traffic_delta * 8) / time_delta }}
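
The wrap-around logic in these templates is the same as taking the difference modulo 2^32. A minimal Python sketch, assuming at most one wrap between two polls and no counter reset (the names `counter_delta` and `seconds_to_wrap` are just for illustration):

```python
COUNTER_MODULUS = 2 ** 32  # a 32-bit counter wraps from 4294967295 back to 0


def counter_delta(previous: int, current: int) -> int:
    """Octets counted between two readings, allowing for a single wrap."""
    return (current - previous) % COUNTER_MODULUS


def seconds_to_wrap(rate_bps: float) -> float:
    """How long a 32-bit octet counter takes to wrap at a sustained bit rate."""
    return COUNTER_MODULUS * 8 / rate_bps


print(counter_delta(100, 250))        # no wrap: 150
print(counter_delta(4294967290, 10))  # wrapped: 5 to reach the max, 1 to reach 0, 10 more = 16
print(round(seconds_to_wrap(10_000_000) / 60, 1))  # minutes to wrap at 10 Mbps
```

At a sustained 10 Mbps the counter wraps roughly once an hour, so the polling interval needs to stay well below that; two or more wraps between polls cannot be detected from the readings alone.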

It seems I misunderstood how the utility meter works. It needs a sensor that provides the current meter reading and not a delta value (it seems obvious now, but somehow I missed it). Changing my config as follows solved the issue for the download values:

      internet_speed_in:
        value_template: '{{ ((states.input_number.wan_traffic_delta_in.state  | float ) / 1000000 ) | round(3) }}'
        unit_of_measurement: 'Mbps'
      internet_speed_out:
        value_template: '{{ ((states.input_number.wan_traffic_delta_out.state | float ) / 1000000 ) | round(3) }}'
        unit_of_measurement: 'Mbps'
      internet_usage_in:
        value_template: "{{ ((states('sensor.snmp_wan_in')  | float ) / 1000000000 ) | round(3) }}"
        unit_of_measurement: 'GB'
      internet_usage_out:
        value_template: "{{ ((states('sensor.snmp_wan_out') | float ) / 1000000000 ) | round(3) }}"
        unit_of_measurement: 'GB'
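
To illustrate why the source has to be a running meter reading, here is a simplified Python model of a meter that adds up the increases between consecutive source values. This is only a sketch of the idea, not Home Assistant's actual utility meter implementation, and the readings are invented.

```python
def metered_total(readings):
    """Sum the positive changes between consecutive source readings."""
    total = 0.0
    for prev, curr in zip(readings, readings[1:]):
        change = curr - prev
        if change > 0:  # ignore decreases in this simplified model
            total += change
    return total


# Cumulative readings in GB: the meter recovers the real usage.
cumulative = [1.0, 1.5, 2.5, 3.0]
print(metered_total(cumulative))  # 2.0

# Per-interval deltas for the same traffic: the meter sums
# "changes of changes", which has nothing to do with the usage.
deltas = [0.5, 1.0, 0.5]
print(metered_total(deltas))  # 0.5
```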

Oddly, the uploads are still off by a factor of 30. It doesn’t seem like a conversion factor and I’m reading the data from the same interface on the router. I also don’t think I should doubt my service provider, as the download data is measured correctly. But, any or all of my assumptions could be wrong. If someone comes across this post I’d be keen to hear ideas.

I solved this. I suppose the lesson is to always check your assumptions. I wasn’t quite using the correct base OIDs for the SNMP sensors. I’m not posting the specifics, as it depends on your specific router or other network equipment you may be using.

But it may help to say how I figured it out: I traversed all the interfaces while doing a predictable upload and download test (known sizes and timings). I then calculated all the speeds and usages using a bit of Python.
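
A rough sketch of that kind of check in Python, assuming you have already collected the octet counters per interface index before and after a transfer of known size (for example from snmpwalk output). The interface indices and counter values below are entirely hypothetical.

```python
def find_interface(before, after, expected_octets):
    """Return (ifIndex, deltas) for the interface whose counter increase
    is closest to the known transfer size."""
    deltas = {idx: (after[idx] - before[idx]) % 2**32 for idx in before}
    best = min(deltas, key=lambda idx: abs(deltas[idx] - expected_octets))
    return best, deltas


# Hypothetical inbound octet counters, keyed by interface index.
before = {1: 1_000_000, 2: 5_000_000, 3: 70_000}
after = {1: 1_020_000, 2: 110_500_000, 3: 3_400_000}

# A 100 MB test download; expect slightly more on the wire due to overhead.
index, deltas = find_interface(before, after, expected_octets=100_000_000)
print(index, deltas[index])
```

The same comparison can be repeated for the outbound counters with a test upload.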

Hey Parautenbach,

Thanks for posting this, helped me a bunch. I have never even used snmpwalk before so it was a journey for me. I really appreciate the time you put behind this, as well as the follow up. Please keep up the good work!

Also, please note that your latitude and longitude are visible in your GitHub code.

Thank you for the kind feedback and the note.

As I needed something similar, I created a small integration which should get all interfaces from SNMP:

Maybe this makes the work a little easier, as you can now simply use the generated sensors for your utility meters.

I think with this approach you’ll lose your usage history after rebooting your router. Another approach is to use an extra input_number that counts the usage.

I’ve added to the automation another service call like this:

service: input_number.set_value
data_template:
  entity_id: input_number.wan_traffic_in
  value: >-
    {# states() returns the stored value as a string; fall back to 0 if it is unavailable #}
    {{ states('input_number.wan_traffic_in') | float(0) + (trigger.to_state.state | int -
    trigger.from_state.state | int) / 1000000 }}

That way my input_number.wan_traffic_in stores the number of megabytes (MB).
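
One thing to watch with a running total like this is what happens when the router counter goes backwards. A minimal Python sketch of the accumulation, under the assumption that any decrease is a router reboot (counter reset to zero) rather than a 32-bit wrap; the two cases cannot be told apart from the readings alone, and the function name is hypothetical.

```python
def accumulate_mb(total_mb: float, previous: int, current: int) -> float:
    """Add the traffic between two octet readings to a running total in MB."""
    delta = current - previous
    if delta < 0:
        # Assumption: the counter was reset by a reboot, so everything
        # counted since the reset is new traffic.
        delta = current
    return total_mb + delta / 1_000_000


total = accumulate_mb(10.0, 1_000_000, 3_000_000)  # normal step: +2 MB
print(total)
total = accumulate_mb(total, 3_000_000, 500_000)   # after a reboot: +0.5 MB
print(total)
```

Without the check, a reboot would subtract the old counter value from the total.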

I use pfSense and installed the integration for that, and plugged everything in using your config…not sure what the integration uses under the covers, but I’ll post back here in a while on my findings / measurement accuracy.

If anyone is still having trouble with this, I have this running capturing stats using snmp from my edgerouter. No extra plugins needed.

I documented it here: Home Assistant and WAN tracking – KeithMcD.com. I extended it last night to also do cyclical tracking of in/out along with total usage. It’s very simple to add based on that post, but I’ll update it later today too.

I don’t trust my ISP, and since it’s Comcast here, I only get one number for the entire month.

I also documented there how to add in an Apple Watch complication for instant glanceable data.

Great guide, will try to pull it to my config any day I find time to do so.
One thing though about the divisions by 1 000 000 and 1 000 000 000: would it not be more correct to use 1 048 576 (2^20) and 1 073 741 824 (2^30)?

Networks and storage devices typically use the metric prefixes (multiples of 1000) and not the binary ones (multiples of 1024). For small values the error is small, but not for larger numbers.
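
To put numbers on that, a short Python sketch comparing the two conventions for the same octet count (the 500 GB monthly figure is just an example):

```python
def to_units(octets: int) -> dict:
    """Express an octet count with metric (GB) and binary (GiB) prefixes."""
    return {
        "GB": octets / 10**9,   # metric: 1 GB = 1 000 000 000 octets
        "GiB": octets / 2**30,  # binary: 1 GiB = 1 073 741 824 octets
    }


usage = to_units(500 * 10**9)  # example: 500 GB of monthly traffic
print(usage["GB"])             # 500.0
print(round(usage["GiB"], 2))  # 465.66

# The gap grows with the prefix: about 4.9% at mega, about 7.4% at giga.
print(round((2**20 / 10**6 - 1) * 100, 1))  # 4.9
print(round((2**30 / 10**9 - 1) * 100, 1))  # 7.4
```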

The cumulative statistics aren’t accurate and I’ve had to slowly tweak them to get it roughly closer to what Comcast is reporting. I’ll try to update the guide this week. I still slowly drift to lower cumulative data than what Comcast claims but it gives me a good rough estimate of usage still. I’m just not great with statistics - hurts my brain. :grin:

See my post before yours. For one, you need to establish how they count (using which multiple). My download data has been very accurate so far (I ran isolated speed tests to compare precisely). My upload data, though, isn’t always accurate, and I haven’t been able to figure out why. Just keep in mind that you could have several seemingly correct OIDs (when you do an snmpwalk) and you need to be sure to use the right one. Remember, in the end, your router is basically counting actual octets sent on an interface on the wire, and unless there’s a bug in your firmware, it can’t be way off. The implementation and reporting could be off though (on either side).
