Error writing to InfluxDB

I am running 0.63.3 in a Docker container, with InfluxDb in another container. I updated Home-assistant and I started getting a lot of these messages in the log (log-level is set to debug). Other clients writing to the influxdb do not experience issues (I think :p).

2018-02-18 22:34:30 ERROR (InfluxDB) [homeassistant.components.influxdb] Resumed writing to InfluxDB
2018-02-18 22:35:03 ERROR (InfluxDB) [homeassistant.components.influxdb] Error writing to InfluxDB
2018-02-18 22:35:08 ERROR (InfluxDB) [homeassistant.components.influxdb] Resumed writing to InfluxDB
2018-02-18 22:35:12 WARNING (InfluxDB) [homeassistant.components.influxdb] Dropping old events to catch up
2018-02-18 22:35:17 WARNING (InfluxDB) [homeassistant.components.influxdb] Dropping old events to catch up
2018-02-18 22:36:31 ERROR (InfluxDB) [homeassistant.components.influxdb] Error writing to InfluxDB
2018-02-18 22:36:41 WARNING (InfluxDB) [homeassistant.components.influxdb] Dropping old events to catch up
2018-02-18 22:37:30 ERROR (InfluxDB) [homeassistant.components.influxdb] Resumed writing to InfluxDB

I see a breaking change in the blog: https://home-assistant.io/blog/2018/02/10/release-63/

How can I troubleshoot this?

Thanks,
/M

That looks odd, does it happen constantly or in bursts?

Constantly, but not periodically. Here’s a histogram of the times between the lines starting with “Error writing”. time_diff

I’ve cropped the picture at 200 s.

Ouch. Maybe it cannot keep up with the traffic? Is this a particularly busy HA, or a quite slow InfluxDB server?

I will try reworking the influxdb supprt to batch writes so they will be faster.

Can you try putting this file in config/custom_components/influxdb.py and restart?

It might improve things but even if it does not, it should log some numbers to better judge the situation.

1 Like

I started seeing these errors when I upgraded to 0.63.3, I think. :slight_smile: I’ll see if I can downgrade and try again. I am running home-assistant and influxdb in docker containers. A problem with my setup is that I use the local IP-adress of the Raspberry instead of the internal Docker IP-address, but I haven’t changed that since it last worked.

2018-02-23 21:04:59 ERROR (InfluxDB) [custom_components.influxdb] Write error
2018-02-23 21:04:59 INFO (MainThread) [homeassistant.core] Bus:Handling <Event system_log_event[L]: message=Write error, level=ERROR, exception=, timestamp=1519416299.0897262, source=custom_components/influxdb.py>
2018-02-23 21:05:36 ERROR (InfluxDB) [custom_components.influxdb] Resuming, lost 183 events
2018-02-23 21:05:36 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-02-23 21:05:36 INFO (MainThread) [homeassistant.core] Bus:Handling <Event system_log_event[L]: message=Resuming, lost 183 events, level=ERROR, exception=, timestamp=1519416336.8636613, source=custom_components/influxdb.py>
2018-02-23 21:06:06 ERROR (InfluxDB) [custom_components.influxdb] Write error
2018-02-23 21:06:06 INFO (MainThread) [homeassistant.core] Bus:Handling <Event system_log_event[L]: message=Write error, level=ERROR, exception=, timestamp=1519416366.5360003, source=custom_components/influxdb.py>

@martisak

This is odd, but I think it is a local problem with your setup since nobody else has reported any issue. I updated my work-in-progress so if you can copy the same file to custom_components, it will log even more information. You can PM me the log and I will have a look.

I just setup influxdb and grafana today and I have encountered the same issue. I see the data getting written but every 2-3 minutes the logs get the same error. I have influxdb installed locally on ubuntu server

@amelchio let me know if you need more information.

I am on 0.63.3, python 3.6.3

For a start, I will need logs from the custom_component linked above. Feel free to PM me with the information.

@amelchio thanks sent a PM!

OK, finally some success here. It’s been running for a little more than an hour without a single lost event. Here’s what I did.

  1. Downloaded the new influx.db from @amelchio and set the following in configuration.yaml
  logger:
      default: error
      logs:
        custom_components.influxdb: debug 

This showed some timeouts in the log.

  1. Removed docker.io and installed Docker CE. This allowed me to use influxdb as hostname instead of the external IP.
  2. Set max_retries to 10 in configuration.yaml.
influxdb:
  host: influxdb
  username: username
  password: secret
  max_retries: 10

Thanks @amelchio for all your help! I don’t see in the logs how many retransmissions are needed, and can’t tell why (apparantly) more were needed, but at least it seems to work.

1 Like

@martisak Glad that you fixed it but running with constant retries is a band-aid. I would still appreciate a PM with the logs when it fails.

The errors I am getting are requests.exceptions.ReadTimeout: HTTPConnectionPool(host='influxdb', port=8086): Read timed out. (read timeout=5) (longer printout in pm). Changing max_retries to 3, still works. The errors come back if I remove max_retries.

1 Like

What a fix ever discovered? I’ve been having this ever since I installed InfluxDB. I recently added a custom alarm panel and it forgets my alarm settings once in a while when I restart, so it has become a priority for me to fix. Whenever I restart the Home Assistant docker container I see errors like this in the log:

2018-05-08 14:17:11 ERROR (InfluxDB) [custom_components.influxdb] Resumed, lost 13 events

2018-05-08 14:17:11 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events

I installed your custom component above and set:

logger:
  default: error
  logs:
    custom_components.influxdb: debug 

Will a new log file be created? Or is this information just added to the existing log?

TypeError: 'bool' object is not iterable
2018-05-08 15:36:19 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:37:20 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:38:07 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 2 events
2018-05-08 15:38:23 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:38:43 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:38:45 ERROR (InfluxDB) [custom_components.influxdb] Write error
Traceback (most recent call last):
  File "/config/custom_components/influxdb.py", line 309, in write_to_influxdb
    self.influx.write_points(json)
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 468, in write_points
    tags=tags, protocol=protocol)
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 532, in _write_points
    protocol=protocol
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 312, in write
    headers=headers
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 271, in request
    raise InfluxDBClientError(response.content, response.status_code)
influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write: unable to parse 'state,domain=sensor,entity_id=pvoutput custom_ui_state_card_str=\"state-card-custom-ui\",efficiency=0.165,energy_consumption=nan,energy_generation=713.0,friendly_name_str=\"PVOutput\",power_consumption=nan,power_generation=948.0,show_last_changed=1.0,state_card_mode_str=\"badges\",temperature=nan,value=713.0,voltage=nan 1525793924116101888': invalid number dropped=0"}

2018-05-08 15:39:05 ERROR (InfluxDB) [custom_components.influxdb] Resumed, lost 3 events
2018-05-08 15:39:05 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:39:06 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:39:09 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:39:18 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:39:21 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:39:53 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:39:57 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:40:00 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:40:04 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:40:08 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:40:51 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:40:56 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:41:08 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:41:08 ERROR (MainThread) [frontend.js.latest.201804260] :0:0 Script error.
2018-05-08 15:41:12 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 21 events
2018-05-08 15:42:05 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:42:13 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:42:59 ERROR (MainThread) [frontend.js.latest.201804260] :0:0 Script error.
2018-05-08 15:43:05 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:43:16 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events
2018-05-08 15:43:37 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:44:17 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 1 events
2018-05-08 15:44:19 DEBUG (InfluxDB) [custom_components.influxdb] Wrote 19 events

The current releases have all fixes included so there is no use in installing old custom components.

However, it seems like you found a new error. I tried to fix that in PR 14347.

I think that error was after I installed the custom component above. I will recheck and post updated results when I get back home.

OK, so after removing the old component version… I now only get the error when the PVOutput.sensor is enabled. I will do some tests and verify that is the only time this error occurs.

2018-05-08 22:32:51 ERROR (InfluxDB) [homeassistant.components.influxdb] Write error
Traceback (most recent call last):
  File "/usr/src/app/homeassistant/components/influxdb.py", line 312, in write_to_influxdb
    self.influx.write_points(json)
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 468, in write_points
    tags=tags, protocol=protocol)
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 532, in _write_points
    protocol=protocol
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 312, in write
    headers=headers
  File "/usr/local/lib/python3.6/site-packages/influxdb/client.py", line 271, in request
    raise InfluxDBClientError(response.content, response.status_code)
    influxdb.exceptions.InfluxDBClientError: 400: {"error":"partial write: unable to parse 'state,domain=sensor,entity_id=pvoutput custom_ui_state_card_str=\"state-card-custom-ui\",efficiency=0.709,energy_consumption=nan,energy_generation=24917.0,friendly_name_str=\"PVOutput\",power_consumption=nan,power_generation=4076.0,show_last_changed=1.0,state_card_mode_str=\"badges\",temperature=nan,value=24917.0,voltage=nan 1525818770111304960': invalid number dropped=0"}

If disabling the PVOutput component is all I need to make the database write reliably, then I have other ways to get that data.

I think that error is fixed with the PR mentioned above (the fix will be in 0.69).

1 Like

I can confirm that this is fixed in 0.69b3! Thank you.