Hassio crashes and Raspberry Pi locks up, then Home Assistant restarts

Hi,

I’m having problems with Hassio.
It crashes and locks up about every 1.5 days, and I can’t find out why.
All automations are in the ‘disabled’ state when the Pi comes back online.

What I tried:

Brand new Raspberry Pi 3B
New 64 GB Samsung EVO microSD card
Latest Hassio image for the Pi 3B (32-bit)
Powered by a PoE switch (5 V, 2.4 A), but I also tried the official Raspberry Pi power supply (5.1 V, 2.5 A)
Connected by Ethernet at 100 Mbit (due to the PoE splitter)

I also tried another Raspberry Pi 3B, with the same result.

Some relevant system info and logs:
I had trouble pasting it here (it exceeds 32000 characters), so I used pastebinit.
https://pastebin.com/QUw0ZjPt

The problem also occurs when trying to download a full snapshot.

2019-01-23 12:20:28 INFO (MainThread) [homeassistant.components.http.view] Serving /api/hassio/snapshots to 10.0.0.15 (auth: True)
2019-01-23 12:20:32 INFO (MainThread) [homeassistant.components.http.view] Serving /api/hassio/snapshots/39e9c200/info to 10.0.0.15 (auth: True)
2019-01-23 12:20:39 INFO (MainThread) [homeassistant.components.http.view] Serving /api/hassio/snapshots/39e9c200/download to 10.0.0.15 (auth: True)
2019-01-23 12:21:02 ERROR (MainThread) [aiohttp.server] Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 406, in start
    resp = await task
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_app.py", line 435, in _handle
    resp = await handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_middlewares.py", line 120, in impl
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/http/static.py", line 66, in staticresource_middleware
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/http/real_ip.py", line 34, in real_ip_middleware
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/http/ban.py", line 67, in ban_middleware
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/http/auth.py", line 99, in auth_middleware
    return await handler(request)
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/http/view.py", line 118, in handle
    result = await result
  File "/usr/local/lib/python3.6/site-packages/homeassistant/components/hassio/http.py", line 64, in _handle
    data = await client.read()
  File "/usr/local/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 943, in read
    self._body = await self.content.read()
  File "/usr/local/lib/python3.6/site-packages/aiohttp/streams.py", line 347, in read
    return b''.join(blocks)
MemoryError
core-ssh:~#
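
A note on the traceback above: it ends in `client.read()` and `b''.join(blocks)`, which suggests the whole snapshot is being collected in RAM before it is passed on, and that allocation is what raises the MemoryError. The usual way to avoid this kind of failure is to copy the data in fixed-size chunks. The following is only a minimal sketch of that idea using the Python standard library, not the actual Home Assistant code or its fix; `copy_in_chunks` and `CHUNK_SIZE` are hypothetical names chosen for illustration.

```python
import io

# Hypothetical chunk size; anything small relative to available RAM works.
CHUNK_SIZE = 64 * 1024


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy a readable binary stream to a writable one, chunk by chunk.

    Returns (total_bytes_copied, largest_chunk_seen) so the caller can
    confirm that no single read exceeded chunk_size.
    """
    total = 0
    largest = 0
    while True:
        block = src.read(chunk_size)
        if not block:  # empty bytes object means end of stream
            break
        dst.write(block)
        total += len(block)
        largest = max(largest, len(block))
    return total, largest


# Stand-in for a snapshot download: 1 MiB of dummy data.
source = io.BytesIO(b"x" * (1024 * 1024))
target = io.BytesIO()
copied, biggest = copy_in_chunks(source, target)
print(copied, biggest)
```

With this pattern the copy loop itself only ever holds one chunk at a time, no matter how large the snapshot is.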

Hope someone can help me out.

Best Regards,
Donald.

I had a similar issue when the Pi’s CPU temperature got too hot.

Try to cool it down.

Thanks for your reply. I’ll attach a fan with a heatsink and will report back whether the problem returns or goes away.
I have upgraded to 0.86.1 and the problem still persists, so it is a hardware issue then.

[99009.961781] Out of memory: Kill process 864 (python3) score 792 or sacrifice child
[99009.964165] Killed process 864 (python3) total-vm:1049864kB, anon-rss:770516kB, file-rss:0kB, shmem-rss:0kB
[99010.279461] oom_reaper: reaped process 864 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[99011.117318] kauditd_printk_skb: 32 callbacks suppressed
[99011.117326] audit: type=1701 audit(1548342035.324:94): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=112 comm="systemd-journal" exe="/usr/lib/systemd/systemd-journald" sig=6 res=1
[99011.303525] systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
[99011.309875] systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
[99011.321271] systemd[1]: systemd-journald.service: Service has no hold-off time, scheduling restart.
[99011.328987] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1.
[99011.332804] systemd[1]: Stopped Flush Journal to Persistent Storage.
[99011.334639] systemd[1]: Stopping Flush Journal to Persistent Storage...
[99011.336354] systemd[1]: Stopped Journal Service.
[99011.342730] systemd[1]: Starting Journal Service...
[99011.508645] audit: type=1305 audit(1548342035.714:95): audit_enabled=1 old=1 auid=4294967295 ses=4294967295 res=1
[99011.754112] systemd[1]: Started Journal Service.
[99012.701008] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[99012.704515] brcmfmac: power management disabled
core-ssh:~#
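
For anyone reading the dmesg output above: the "Out of memory" and "Killed process" lines are the kernel's OOM killer terminating python3 while it held about 770 MB of anonymous memory, on a board with 1 GB of RAM. A small script can pull these events out of a long dmesg dump. This is a rough sketch that assumes the "Killed process" line format shown in this thread (it may differ on other kernel versions); `find_oom_kills` and `OOM_KILL` are hypothetical names.

```python
import re

# Matches lines such as:
# Killed process 864 (python3) total-vm:1049864kB, anon-rss:770516kB, ...
OOM_KILL = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def find_oom_kills(dmesg_text):
    """Return one dict per OOM kill found in the given dmesg text."""
    kills = []
    for line in dmesg_text.splitlines():
        match = OOM_KILL.search(line)
        if match:
            kills.append({
                "pid": int(match.group("pid")),
                "name": match.group("name"),
                "total_vm_kb": int(match.group("total_vm")),
                "anon_rss_kb": int(match.group("anon_rss")),
            })
    return kills


# Two lines copied from the log in this thread.
SAMPLE = """\
[99009.961781] Out of memory: Kill process 864 (python3) score 792 or sacrifice child
[99009.964165] Killed process 864 (python3) total-vm:1049864kB, anon-rss:770516kB, file-rss:0kB, shmem-rss:0kB
"""

for kill in find_oom_kills(SAMPLE):
    mib = kill["anon_rss_kb"] / 1024
    print(f"{kill['name']} (pid {kill['pid']}) held {mib:.0f} MiB when killed")
# prints: python3 (pid 864) held 752 MiB when killed
```

To run it against a live system, the text could come from `dmesg > dmesg.txt` and be read from that file instead of `SAMPLE`.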

Hassio keeps crashing. I tried yet another Pi and another power supply, and reinstalled Hassio on a new SD card, but it still crashes after about 1.5 days.
The CPU temperature doesn’t even reach 40 degrees Celsius, so that can’t be the problem.
I’m clueless. Where should I look?

In the logs I see hundreds of:
[46490.153538] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[46490.155618] brcmfmac: power management disabled

And when Hassio stops responding (I can still log in over SSH without any problem, but it is very slow), this is in dmesg:

[46558.237785] dockerd: page allocation failure: order:0, mode:0x140000a(GFP_NOIO|__GFP_HIGHMEM|__GFP_MOVABLE), nodemask=(null)
[46558.240714] dockerd cpuset=/ mems_allowed=0
[46558.242241] CPU: 2 PID: 300 Comm: dockerd Tainted: G         C      4.14.81-v7 #1
[46558.245151] Hardware name: BCM2835
[46558.246602] [<801101dc>] (unwind_backtrace) from [<8010c37c>] (show_stack+0x20/0x24)
[46558.249528] [<8010c37c>] (show_stack) from [<808849cc>] (dump_stack+0xcc/0x110)
[46558.251062] [<808849cc>] (dump_stack) from [<8023acbc>] (warn_alloc+0xcc/0x174)
[46558.252566] [<8023acbc>] (warn_alloc) from [<8023be48>] (__alloc_pages_nodemask+0x1034/0x1128)
[46558.255621] [<8023be48>] (__alloc_pages_nodemask) from [<80299a38>] (zs_malloc+0x134/0x450)
[46558.258629] [<80299a38>] (zs_malloc) from [<805c95e0>] (zram_bvec_rw.constprop.2+0x5b4/0x898)
[46558.261705] [<805c95e0>] (zram_bvec_rw.constprop.2) from [<805c9cf8>] (zram_rw_page+0xbc/0x150)
[46558.264852] [<805c9cf8>] (zram_rw_page) from [<802e033c>] (bdev_write_page+0x90/0xcc)
[46558.268058] [<802e033c>] (bdev_write_page) from [<8027d3d4>] (__swap_writepage+0x1a4/0x368)
[46558.271384] [<8027d3d4>] (__swap_writepage) from [<8027d5d8>] (swap_writepage+0x40/0x90)
[46558.274992] [<8027d5d8>] (swap_writepage) from [<8024a604>] (shrink_page_list+0x8b4/0xe4c)
[46558.278615] [<8024a604>] (shrink_page_list) from [<8024b324>] (shrink_inactive_list+0x270/0x63c)
[46558.282179] [<8024b324>] (shrink_inactive_list) from [<8024be14>] (shrink_node_memcg+0x324/0x6a0)
[46558.286026] [<8024be14>] (shrink_node_memcg) from [<8024c2a4>] (shrink_node+0x114/0x348)
[46558.289927] [<8024c2a4>] (shrink_node) from [<8024c5e4>] (do_try_to_free_pages+0x10c/0x3a0)
[46558.293832] [<8024c5e4>] (do_try_to_free_pages) from [<8024c9c4>] (try_to_free_pages+0x14c/0x474)
[46558.297711] [<8024c9c4>] (try_to_free_pages) from [<8023b50c>] (__alloc_pages_nodemask+0x6f8/0x1128)
[46558.301613] [<8023b50c>] (__alloc_pages_nodemask) from [<80241e40>] (__do_page_cache_readahead+0x108/0x280)
[46558.305409] [<80241e40>] (__do_page_cache_readahead) from [<802321d8>] (filemap_fault+0x45c/0x668)
[46558.309265] [<802321d8>] (filemap_fault) from [<8026726c>] (__do_fault+0x28/0x7c)
[46558.313256] [<8026726c>] (__do_fault) from [<8026b924>] (handle_mm_fault+0x69c/0xbb8)
[46558.317143] [<8026b924>] (handle_mm_fault) from [<808a14e8>] (do_page_fault+0x160/0x3a0)
[46558.321252] [<808a14e8>] (do_page_fault) from [<8010129c>] (do_PrefetchAbort+0x48/0xac)
[46558.325156] [<8010129c>] (do_PrefetchAbort) from [<808a1164>] (ret_from_exception+0x0/0x1c)
[46558.329067] Exception stack(0xb4d67fb0 to 0xb4d67ff8)
[46558.331129] 7fa0:                                     13036ab0 130b655c 00000000 130b6558
[46558.334977] 7fc0: 1340f400 00000002 00000003 00000000 ee499fc4 13bd81e0 130b6540 0004129c
[46558.338644] 7fe0: 75a91e98 1304ed94 00c8f7e4 00078350 60000010 ffffffff
[46558.340637] Mem-Info:
[46558.342484] active_anon:105331 inactive_anon:105333 isolated_anon:32
[46558.342484]  active_file:371 inactive_file:368 isolated_file:0
[46558.342484]  unevictable:0 dirty:0 writeback:0 unstable:0
[46558.342484]  slab_reclaimable:4258 slab_unreclaimable:6402
[46558.342484]  mapped:357 shmem:6 pagetables:952 bounce:0
[46558.342484]  free:1176 free_pcp:269 free_cma:0
[46558.353046] Node 0 active_anon:420888kB inactive_anon:421420kB active_file:1364kB inactive_file:1472kB unevictable:0kB isolated(anon):128kB isolated(file):0kB mapped:1428kB dirty:0kB writeback:0kB shmem:24kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[46558.359703] Normal free:4384kB min:3876kB low:4844kB high:5812kB active_anon:420836kB inactive_anon:421184kB active_file:1468kB inactive_file:1172kB unevictable:0kB writepending:0kB present:970752kB managed:948352kB mlocked:0kB kernel_stack:2160kB pagetables:3808kB bounce:0kB free_pcp:996kB local_pcp:52kB free_cma:0kB
[46558.366478] lowmem_reserve[]: 0 0
[46558.368148] Normal: 387*4kB (MHC) 29*8kB (MH) 80*16kB (H) 22*32kB (H) 8*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4404kB
[46558.371468] 750 total pagecache pages
[46558.373149] 25 pages in swap cache
[46558.374748] Swap cache stats: add 401501, delete 401477, find 8095/364785
[46558.376373] Free swap  = 82320kB
[46558.377989] Total swap = 189668kB
[46558.379530] 242688 pages RAM
[46558.381017] 0 pages HighMem/MovableOnly
[46558.382502] 5600 pages reserved
[46558.383933] 2048 pages cma reserved
[46807.318648] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[46807.320317] brcmfmac: power management disabled
[47123.270883] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[47123.272411] brcmfmac: power management disabled
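
The Mem-Info dump above shows only about 4 MB free, roughly 842 MB of the 948 MB managed memory held as anonymous pages, and more than half of the swap in use, so the system is simply running out of RAM rather than overheating. One way to narrow down where to look is to log memory figures at intervals and see whether they shrink steadily over the 1.5 days. The sketch below assumes a Linux system whose /proc/meminfo includes `MemTotal` and `MemAvailable`; the function names and the 10% threshold are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; the unit suffix is ignored.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_percent(values):
    """Share of total RAM the kernel estimates is still available."""
    return 100.0 * values["MemAvailable"] / values["MemTotal"]


def is_low(values, threshold_percent=10.0):
    """True when available memory falls below the (assumed) threshold."""
    return available_percent(values) < threshold_percent


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live memory statistics (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Calling `is_low(read_meminfo())` from a cron job or a loop with `time.sleep`, and writing the result to a file with a timestamp, would give a simple record to compare against the time of the next crash.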

Who can help me out?

I stopped using Hassio and installed Home Assistant manually on Raspbian, and there have been no crashes since.