HA 2021.11.2 crash on Raspberry Pi 4

Well, one moves to an SSD specifically to avoid corruption (which never happened with my SD card optimized for heavy workloads), so if corruption still happens, there’s no reason to use an SSD at all. The speed difference between a good, fast SD card and an SSD on an RPi 4 is negligible and only noticeable at restart/reboot time.

In my case, if the system crashes during the night, it can only be the recorder, because the recorder documentation page says that the auto purge is scheduled by default for 4:12 local time.

I’ll try disabling the auto purge to check whether the problem persists.

Just for posterity: mine crashed around 5:00. Maybe it was a combination of the recorder issue, USB chip overheating, and the DST bug that was recently fixed in HA.

I updated yesterday, and this evening it crashed. I have tried restoring a backup, without any luck.

Logger: homeassistant.components.websocket_api.http.connection
Source: components/deconz/switch.py:73
Integration: Home Assistant WebSocket API (documentation, issues)
First occurred: 22:30:57 (2 occurrences)
Last logged: 22:35:13

[548149241792]
[548111155792]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pydeconz/api.py", line 137, in request
    return await self._request("put", path=field, json=data)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 141, in request
    return await self._request(
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 167, in _request
    _raise_on_error(response)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 261, in _raise_on_error
    raise_error(data["error"])
  File "/usr/local/lib/python3.9/site-packages/pydeconz/errors.py", line 59, in raise_error
    raise cls("{} {}".format(error["address"], error["description"]))
pydeconz.errors.BridgeBusy: /lights/2/state/on Internal error, 951

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pydeconz/api.py", line 137, in request
    return await self._request("put", path=field, json=data)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 141, in request
    return await self._request(
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 167, in _request
    _raise_on_error(response)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 261, in _raise_on_error
    raise_error(data["error"])
  File "/usr/local/lib/python3.9/site-packages/pydeconz/errors.py", line 59, in raise_error
    raise cls("{} {}".format(error["address"], error["description"]))
pydeconz.errors.BridgeBusy: /lights/2/state/on Internal error, 951

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pydeconz/api.py", line 137, in request
    return await self._request("put", path=field, json=data)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 141, in request
    return await self._request(
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 167, in _request
    _raise_on_error(response)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/gateway.py", line 261, in _raise_on_error
    raise_error(data["error"])
  File "/usr/local/lib/python3.9/site-packages/pydeconz/errors.py", line 59, in raise_error
    raise cls("{} {}".format(error["address"], error["description"]))
pydeconz.errors.BridgeBusy: /lights/2/state/on Internal error, 951

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 185, in handle_call_service
    await hass.services.async_call(
  File "/usr/src/homeassistant/homeassistant/core.py", line 1495, in async_call
    task.result()
  File "/usr/src/homeassistant/homeassistant/core.py", line 1530, in _execute_service
    await handler.job.target(service_call)
  File "/usr/src/homeassistant/homeassistant/helpers/entity_component.py", line 213, in handle_service
    await self.hass.helpers.service.entity_service_call(
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 667, in entity_service_call
    future.result()  # pop exception if have
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 863, in async_request_call
    await coro
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 704, in _handle_entity_call
    await result
  File "/usr/src/homeassistant/homeassistant/components/deconz/switch.py", line 73, in async_turn_on
    await self._device.set_state(on=True)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/light.py", line 230, in set_state
    return await self.request(field=f"{self.deconz_id}/state", data=data)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/api.py", line 150, in request
    return await self.request(field, data, tries)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/api.py", line 150, in request
    return await self.request(field, data, tries)
  File "/usr/local/lib/python3.9/site-packages/pydeconz/api.py", line 153, in request
    raise BridgeBusy
pydeconz.errors.BridgeBusy
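
Reading the last frames of that traceback: `request` in pydeconz/api.py calls itself twice and then raises `BridgeBusy`, which looks like a bounded retry that finally gave up. Below is a minimal sketch of that general pattern, not pydeconz's actual code. The names `request_with_retry`, `make_flaky_sender`, `MAX_TRIES`, and the local `BridgeBusy` class are all hypothetical stand-ins for illustration.

```python
import asyncio


class BridgeBusy(Exception):
    """Local stand-in for the error named in the log (not imported from pydeconz)."""


MAX_TRIES = 3  # assumed limit, for illustration only


async def request_with_retry(send, tries=0, delay=0.0):
    """Await the coroutine function `send`, retrying while it raises BridgeBusy."""
    try:
        return await send()
    except BridgeBusy:
        if tries + 1 >= MAX_TRIES:
            raise  # give up: the caller sees BridgeBusy, as in the log above
        await asyncio.sleep(delay)
        return await request_with_retry(send, tries + 1, delay)


def make_flaky_sender(failures):
    """Build a fake sender that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def send():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise BridgeBusy("bridge reported busy")
        return "ok"

    return send, state
```

If the bridge stays busy for every attempt, the error propagates up to the service call, which would match what the log shows. It says nothing about why the gateway was busy in the first place.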

Hello, everybody,

As I indicated above, I have a good result from yesterday to today. I did two things:
1 - Deleted the old database, which was approximately 1.3 GB.
2 - I disabled the recorder, as indicated in this link.
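
For anyone who wants to check how large their own recorder database has grown before deleting it, here is a small sketch. It assumes the default SQLite file name `home-assistant_v2.db` and a `/config` directory, which may not match every install; the helper name `size_in_gb` is made up for this example.

```python
from pathlib import Path


def size_in_gb(path):
    """Return the file size in gigabytes (10**9 bytes), or None if the file is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    return p.stat().st_size / 1e9


if __name__ == "__main__":
    db_path = "/config/home-assistant_v2.db"  # assumed default location
    size = size_in_gb(db_path)
    if size is None:
        print(f"{db_path} not found")
    else:
        print(f"{db_path}: {size:.2f} GB")
```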

I can say that Home Assistant hasn’t crashed since.

I’ll be back in a few days with more news.

I can chime in: while it previously hung about once every other month, my RPi 4 with SSD, on 2021.11.5, has now frozen for the second night in a row.
Sadly, the log and log.1 don’t offer any useful insights. I’ll try disabling auto-purge before the next night to see whether it is the culprit, since it always dies in the early morning hours.
And while I’m running some custom components, my gut feeling tells me that this is a HASSOS issue.

Update a day later: RPi4 still up. Definitely will leave the auto-purge disabled for a while.

I think mine may have been getting too hot. I would see temps just above 130 °F now and then. I replaced the Pi 4 with a little mini PC and it’s been fine ever since, and it responds faster.
But since I had to do a new install and a restore, it could have been something else. My config is all the same, so I don’t think it’s that.
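
For reference, 130 °F works out to roughly 54 °C. A quick sketch of the conversion, plus a reader for the SoC temperature: the sysfs path below is the usual one on Linux for the Pi and reports millidegrees Celsius, but treat it as an assumption, since it may not be exposed the same way in every setup. Both function names are made up for this example.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def read_soc_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the SoC temperature in Celsius; the file holds millidegrees Celsius."""
    with open(path) as f:
        return int(f.read().strip()) / 1000


if __name__ == "__main__":
    print(f"130 F = {fahrenheit_to_celsius(130):.1f} C")
```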

Still going strong without any crashes? When you say you disabled the recorder, did you just delete the recorder entry from your YAML configuration?

Hey, I read your concerns above. I just moved what was a stable build from SD to SSD (ironically thinking I was moving to even more robustness) and am now having shutdowns every 24-36 hrs. Any insight since posting here?

Did you root-cause the issue? Thanks

No, I just went back to the SD card. The performance benefit from using an SSD on an RPi is minimal, and if the reliability goes down instead of up, there’s no point in using an SSD at all.

Once you figure out how to get the SSD working correctly, you will find the opposite of what you said. My SSD (again, once working correctly) has caused zero issues, whereas the SD card would fail at least once a year. To me, the performance difference between the two is amazing. You will notice it the most on booting, logs, recorder, and anything that needs to read or write data on the disk. I think it will be worth your time to try and figure out what your issue is with the SSD. There were issues between Nov 2020 and Jan 2022 that affected a small group of Pi users. This has been fixed and my system is very stable now.

Have you opened an issue on GitHub and posted your logs? Stefan has been excellent in resolving HAOS issues.

Congratulations, but my experience is the opposite. I have done my homework and spent days troubleshooting, providing logs, talking with devs on GitHub, everything. Not worth it at all to me. I had zero SD card failures in three years and I have daily cloud backups.

There’s a solution that has zero issues (SD card) vs a solution that had catastrophic failures. The choice is rather simple.