Home Assistant Community Add-on: InfluxDB

Hi frenck,
Would you consider this once again?
Now that the add-on is rock solid and super convenient, adding Telegraf and the ability to handle Home Assistant logs (and logs from other systems) would be perfect.
I haven't found any good-looking log management for Home Assistant. The InfluxDB log viewer looks very promising.

Update to 3.7.1 keeps crashing

How do I downgrade to 3.6.x, please?

	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x30
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x1babf2c0, 0x1c239000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:169 +0x178
net.(*netFD).Read(0x1babf2c0, 0x1c239000, 0x1000, 0x1000, 0x1, 0x0, 0x0)
	/usr/local/go/src/net/fd_unix.go:202 +0x38
net.(*conn).Read(0x5555280, 0x1c239000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/net.go:184 +0x58
net/http.(*persistConn).Read(0x3597180, 0x1c239000, 0x1000, 0x1000, 0x70df8, 0xc864ef4, 0x2)
	/usr/local/go/src/net/http/transport.go:1758 +0x164
bufio.(*Reader).fill(0x3cea4e0)
	/usr/local/go/src/bufio/bufio.go:100 +0x108
bufio.(*Reader).Peek(0x3cea4e0, 0x1, 0x2, 0x0, 0x0, 0x2528bc00, 0x0)
	/usr/local/go/src/bufio/bufio.go:138 +0x38
net/http.(*persistConn).readLoop(0x3597180)
	/usr/local/go/src/net/http/transport.go:1911 +0x178
created by net/http.(*Transport).dialConn
	/usr/local/go/src/net/http/transport.go:1580 +0x8f0

goroutine 131690 [select]:
net/http.(*persistConn).writeLoop(0x4e72000)
	/usr/local/go/src/net/http/transport.go:2210 +0xc0
created by net/http.(*Transport).dialConn
	/usr/local/go/src/net/http/transport.go:1581 +0x90c

goroutine 57463 [select]:
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compactCache(0x4225bc0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1976 +0xe4
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableSnapshotCompactions.func1(0xe0a8cb0, 0x4225bc0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:505 +0x48
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableSnapshotCompactions
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:505 +0xfc

goroutine 57464 [select]:
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compact(0x4225bc0, 0xe0a8cc0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2023 +0x1b4
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableLevelCompactions.func1(0xe0a8cc0, 0x4225bc0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:426 +0x50
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableLevelCompactions
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:426 +0x114

goroutine 155551 [runnable]:
github.com/influxdata/influxdb/tsdb/cursors.(*StringArray).Exclude(0xc6195c8, 0xffffffff, 0x7fffffff, 0x0, 0x80000000)
	/go/src/github.com/influxdata/influxdb/tsdb/cursors/arrayvalues.gen.go:670 +0x2bc
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*tsmBatchKeyIterator).combineString(0x34e6320, 0xbbf71301, 0x40, 0xbbf713a0, 0x2)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.gen.go:1785 +0x5e4
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*tsmBatchKeyIterator).mergeString(0x34e6320)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.gen.go:1731 +0x24c
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*tsmBatchKeyIterator).merge(0x34e6320)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:1856 +0x1e4
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*tsmBatchKeyIterator).Next(0x34e6320, 0x61686a6)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:1716 +0x105c
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Compactor).write(0x3533740, 0x7fbf090, 0x49, 0x1e755f0, 0x34e6320, 0x18f0ab01, 0x0, 0x0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:1141 +0x150
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Compactor).writeNewFiles(0x3533740, 0xe, 0x2, 0x4b8a1c0, 0x6, 0x8, 0x1e755f0, 0x34e6320, 0x1, 0x1e755f0, ...)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:1045 +0x11c
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Compactor).compact(0x3533740, 0x4b8a100, 0x4b8a1c0, 0x6, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:953 +0x4a0
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Compactor).CompactFull(0x3533740, 0x4b8a1c0, 0x6, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:971 +0x110
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*compactionStrategy).compactGroup(0x4b8a280)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2208 +0xf64
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*compactionStrategy).Apply(0x4b8a280)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2185 +0x2c
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compactFull.func1(0xbb43440, 0x3e3a000, 0x4b8a280)
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2154 +0xc4
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compactFull
	/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:2150 +0xd0

goroutine 131689 [IO wait]:
internal/poll.runtime_pollWait(0xa6cfec80, 0x72, 0xffffffff)
	/usr/local/go/src/runtime/netpoll.go:184 +0x44
internal/poll.(*pollDesc).wait(0x182934b4, 0x72, 0x1000, 0x1000, 0xffffffff)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x30
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x182934a0, 0x197dc000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:169 +0x178
net.(*netFD).Read(0x182934a0, 0x197dc000, 0x1000, 0x1000, 0x1, 0x0, 0x0)
	/usr/local/go/src/net/fd_unix.go:202 +0x38
net.(*conn).Read(0xbafdff0, 0x197dc000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/net.go:184 +0x58
net/http.(*persistConn).Read(0x4e72000, 0x197dc000, 0x1000, 0x1000, 0x70df8, 0x22bf1ef4, 0x2)
	/usr/local/go/src/net/http/transport.go:1758 +0x164
bufio.(*Reader).fill(0x36ac810)
	/usr/local/go/src/bufio/bufio.go:100 +0x108
bufio.(*Reader).Peek(0x36ac810, 0x1, 0x2, 0x0, 0x0, 0xba954f00, 0x0)
	/usr/local/go/src/bufio/bufio.go:138 +0x38
net/http.(*persistConn).readLoop(0x4e72000)
	/usr/local/go/src/net/http/transport.go:1911 +0x178
created by net/http.(*Transport).dialConn
	/usr/local/go/src/net/http/transport.go:1580 +0x8f0
[cont-finish.d] executing container finish scripts...
[cont-finish.d] 99-message.sh: executing... 
[cont-finish.d] 99-message.sh: exited 0.
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.

Running out of memory

2020/06/15 16:49:36 Using configuration at: /etc/kapacitor/kapacitor.conf
time="2020-06-15T16:50:04+01:00" level=info msg="Reporting usage stats" component=usage freq=24h reporting_addr="https://usage.influxdata.com" stats="os,arch,version,cluster_id,uptime"
time="2020-06-15T16:50:04+01:00" level=info msg="Serving chronograf at http://127.0.0.1:8889" component=server
[16:50:05] INFO: Starting NGinx...
runtime: out of memory: cannot allocate 32768-byte block (727744512 in use)
fatal error: out of memory

goroutine 7523 [running]:
runtime.throw(0xfa5314, 0xd)
        /usr/local/go/src/runtime/panic.go:774 +0x5c fp=0xf7573cc sp=0xf7573b8 pc=0x41644
runtime.(*mcache).refill(0xb6f9536c, 0x6d)
        /usr/local/go/src/runtime/mcache.go:140 +0xfc fp=0xf7573e0 sp=0xf7573cc pc=0x262ec
runtime.(*mcache).nextFree(0xb6f9536c, 0x26d, 0x201, 0xbe7fa000, 0x200)
        /usr/local/go/src/runtime/malloc.go:854 +0x7c fp=0xf757400 sp=0xf7573e0 pc=0x1b0f4
runtime.mallocgc(0x2a80, 0x0, 0x0, 0x3b9aca00)
        /usr/local/go/src/runtime/malloc.go:1022 +0x7a0 fp=0xf757468 sp=0xf757400 pc=0x1ba40
runtime.growslice(0xddd320, 0xbe5ae9c0, 0x13f, 0x168, 0x527, 0x800, 0x0, 0x48000000)
        /usr/local/go/src/runtime/slice.go:175 +0x11c fp=0xf757498 sp=0xf757468 pc=0x59548

On restart of host

It runs away until I stop the InfluxDB add-on.

Thank the stars for snapshots. I restored back to a 3.6.x snapshot (it still reports as 3.7.1) and it all works again…

InfluxDB has gone crazy again. Is no one else seeing this problem?

Gone crazy in what way?

I’m also seeing high CPU usage in InfluxDB 3.7.1.

You can see when I updated Influx.

My CPU hovers around 80-85% now. If I stop the Influx addon CPU usage goes back down to 10%.

When Influx is running, ha addons stats shows this:

$ ha addons stats a0d7b954_influxdb
blk_read: 576511156224
blk_write: 950576304128
cpu_percent: 238.36
memory_limit: 968855552

With InfluxDB addon active my Pi hovers around 80% CPU, 90% memory and 75°C.

If I turn it off it drops to around 10% CPU, 45% memory and 60°C.

I’m now looking for ways to downgrade the InfluxDB add-on. Is that possible? I have a two-week-old snapshot, but I have not had good experiences restoring snapshots in the past, so I would like to avoid that if possible.

This post might be related. Is it possible to set store-enabled = false using the envvars attribute?
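
For what it's worth, I'd expect that to look roughly like this in the add-on configuration. This is only a sketch: I'm assuming the envvars option takes name/value pairs, and that InfluxDB honours the usual INFLUXDB_<SECTION>_<KEY> environment override for [monitor] store-enabled.

envvars:
  # Assumed mapping: [monitor] store-enabled -> INFLUXDB_MONITOR_STORE_ENABLED
  - name: INFLUXDB_MONITOR_STORE_ENABLED
    value: "false"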

I have asked several times: @frenck, is it possible to install a specific version of the add-on?

Oh good, not just me then :frowning_face:

High CPU as per screenshot above.

  • Yes you can downgrade using a backup, but you lose some data of course. I’ve downgraded to 3.6.2 that way, but after some time I ran into the same issue again.
  • store-enabled = false didn’t fix it for me. Same as above, it seemed to provide some relief, but not in the long run.

In the end I think this is (partly) caused by a combination of memory limitations (on an RPi 3) and too many measurements. I store about 1,500 measurements per hour (36k a day) and it feels like that is just too much. For now I'm going to upgrade to an RPi 4 with 4 GB and cut back on the number of measurements.

I’m past that point. I was hoping it could be done via a configuration element.

It could be, but it has been fine for months. I've no idea how to find out how many measurements I'm storing, nor how to limit what is stored. How do you do that?

To get the number of measurements per hour, you can do something like:

SELECT count("value") FROM /./ WHERE time >= now() - 6h GROUP BY time(1h) fill(null)
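
If you want to see which measurements contribute most, dropping the time grouping and keeping the regex should return one count per measurement. This is only a sketch to run in the influx CLI; the database name homeassistant is the usual default, not necessarily yours.

-- One count per measurement matched by the regex, over the last hour
USE homeassistant
SELECT count("value") FROM /.*/ WHERE time >= now() - 1h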

I could if it was running…

And to cut down on the measurements?

Same here, but maybe you reach a threshold at some point which causes this to happen.

Perhaps change the retention period? Can you do that globally?

Yes, in InfluxDB:
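
Something along these lines in the influx CLI should do it. Sketch only: autogen and homeassistant are the usual default policy and database names, but treat both as assumptions for your own setup.

-- Keep only the last 30 days; older shards are expired automatically
ALTER RETENTION POLICY "autogen" ON "homeassistant" DURATION 30d SHARD DURATION 1d DEFAULT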


Well … for the moment at least my problem has gone away :grinning:

While troubleshooting this I uninstalled InfluxDB and re-installed it. I wasn’t expecting it but this appears to have deleted my database :rofl:

It’s no biggie; I only set up Influx to test out Grafana a few months ago, so the database history was totally unimportant. After creating a new DB and reconnecting Grafana, InfluxDB (the current 3.7.1 version) is now back to normal performance. I’ll see how long it lasts.

Also FWIW, it’s been a while since I tried restoring a snapshot, and I discovered that when you restore a snapshot now you get a checklist of things you can selectively restore, including the InfluxDB add-on. I tried restoring that on its own and got a ‘successfully restored’ message, but nothing appeared to change, including the InfluxDB add-on version number.

I believe you can only cut down the measurements by limiting what is being recorded, i.e. by domain/entities. This should help especially in cases where you have entities with very frequent state changes. I personally exclude everything but the specific domains I want to monitor; I considered per-entity filtering at one time, but it would be too difficult to manage. The sensor domain is likely to have entities that update frequently, so you might want to look in there for culprits as well.
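
As a sketch, the filtering goes in the influxdb: section of configuration.yaml. The host, domains, and entity IDs below are placeholders, not a recommendation for any particular setup:

influxdb:
  host: a0d7b954-influxdb        # add-on hostname; adjust to your install
  database: homeassistant
  username: homeassistant
  password: !secret influxdb_password   # stored in secrets.yaml
  include:
    domains:                     # only record these domains
      - sensor
      - binary_sensor
  exclude:
    entities:                    # drop a hypothetical chatty entity
      - sensor.example_wifi_signal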


InfluxDB has done it again. 180% CPU.

This is really beginning to irritate me.

My RPi 3 has been hanging recently, causing Teckin wall plug devices to reset when they lose connectivity. The temperature and CPU usage have gone sky-high with recent InfluxDB updates.

Today I updated to 3.7.3, hoping a fix had been in the works. Whilst InfluxDB was disabled, processor use dropped to 8% and swap use dropped to 40%. Within seconds of InfluxDB being started up, we’re back up to 70% CPU usage, 60+ °C (last week it was 80+ and I installed a fan), and 100% swap use.

The drop in temperature in the attached image is when InfluxDB is disabled (it didn’t restart after an update). I also tried disabling all the other add-ons one at a time, and InfluxDB is the only one that has a huge impact. The rest are negligible (Grafana, ESPHome, Google Drive Backup).
