Home assistant crashed and won't start anymore

I need some help: my Home Assistant setup crashed today and does not want to start anymore. I’m running Home Assistant on an RPi 3 with a RaZberry board and a DSMR cable (smart meter), and it is connected via Wi-Fi. I was actually just playing a bit with the settings of my Fibaro FGSD-002 smoke sensors. Most likely completely unrelated, but I got a pop-up on my iPad that a new version of iOS was available, so I decided why not get it over with.

When the iPad was working again, I tried to reconnect to Home Assistant, without any success. I checked my router settings, and the RPi was no longer connected to the network. I power-cycled the RPi, without any success. I then connected a screen to the RPi and powered it up, and nothing happens, not even the standard boot screen. The screen gives the message “no input detected”. I can open the SD card on my PC and access the Linux partition via Ext2Fsd on Windows. This makes me think that the SD card itself is OK.

Now, something similar happened a while ago. Back then I did not bother to investigate, since a reinstall takes about 1 hour. But since this is the 2nd time it has happened, I want to see if it can be fixed or at least diagnosed. Any tips on where I should start to look?

This is what I found in the Home Assistant log from around the time the problems started to occur:

2018-07-01 12:42:09 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:42:19 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:42:39 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:43:09 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:43:14 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:43:21 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:43:29 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:44:01 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:46:23 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:46:34 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:46:46 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:47:06 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:47:17 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:47:24 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:47:27 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_production_phase_l2 (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.499 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:47:30 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_production_phase_l3 (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.485 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:47:31 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.gas_consumption (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.640 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:47:33 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:47:39 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_consumption_phase_l1 (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 1.344 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:47:43 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:48:04 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_consumption (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.548 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:48:19 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_consumption_low (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.469 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:48:58 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.voltage_swells_phase_l1 (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.464 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:49:06 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.voltage_swells_phase_l3 (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 0.439 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:49:49 ERROR (MainThread) [homeassistant.core] Timer got out of sync. Resetting
2018-07-01 12:50:09 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for group.meter_readings (<class ‘homeassistant.components.group.Group’>) took 0.401 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:50:23 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for weather.br_unknown_station (<class ‘homeassistant.components.weather.buienradar.BrWeather’>) took 2.039 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:50:50 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_production (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 1.181 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:51:42 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_tariff (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 14.006 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub
2018-07-01 12:52:17 WARNING (MainThread) [homeassistant.helpers.entity] Updating state for sensor.power_production_low (<class ‘homeassistant.components.sensor.dsmr.DSMREntity’>) took 1.799 seconds. Please report platform to the developers at Platforms that do I/O inside properties · Issue #4210 · home-assistant/core · GitHub

I know, this isn’t the answer you want to read, but to be honest, I wouldn’t waste too much time analyzing this problem any further. There obviously is some problem with your SD card. RPis are famous for screwing with SD cards, especially when they are improperly unmounted (hard power cycle etc.) and when the OS causes heavy disk I/O. Even the latest RPi 3 suffers from this issue.

You may try to run a filesystem check on the SD card by mounting it with an external card reader in another Linux installation, but you’ll never find out for sure which files are still OK and which are not.

If the SD card is still readable from Windows: save your configs while you still can and set up a new system.

Home Assistant makes heavy use of an SQLite database to save history data. Thus, you may consider running from an external drive. Or at least move your HA config folder, including the database file, to an external drive or network-attached storage, to minimize disk I/O on your SD card and to keep your HA config on a more reliable storage medium.

SD card fail?

Ok, to be honest this is exactly what I want to hear. Explains a lot.
I have a partition with Ubuntu, so I will run a filesystem check to see if it can find something. I noticed that my database grew to 2.5 GB in a week.

Need to rethink my setup. I have a Synology NAS, and I could just move the setup to there. But then I have to add a Z-Wave stick and move my Synology so I can read out my smart meter. Another option would be to run the Raspberry Pi from an external HDD. Or just go for your proposal.

I ran on a Raspberry Pi then moved to Synology… Camera images seem to load a little bit faster in this case.

You could also move your database to MySQL on the Synology, and create backups of the Raspberry Pi using rsync or some other method. That way, if your SD card fails again, all you have to do is install a new SD card and restore the config files.
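For the database part, a minimal sketch of pointing the recorder at a MySQL/MariaDB server on the NAS might look like the following in configuration.yaml. The user name, password, host and database name are placeholders, not values from this thread, and it assumes the database already exists and a MySQL client library for Python is installed on the Home Assistant side:

```yaml
# configuration.yaml (sketch; all names below are placeholders)
recorder:
  # SQLAlchemy-style URL: mysql://USER:PASSWORD@HOST/DATABASE
  db_url: mysql://hass_user:YOUR_PASSWORD@NAS_IP_ADDRESS/hass_db?charset=utf8
```

After a restart, new history should be written to the NAS instead of the SQLite file on the SD card; the old home-assistant_v2.db is not migrated automatically.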

The Raspberry Pi can also do network boot. I never tried it, but it could be interesting for this use case. Docker Swarm also allows something similar.

EDIT
Database writes are the biggest killer of the SD cards

If you own a Synology NAS, you should check out BerryBoot with iSCSI:

https://www.berryterminal.com/doku.php/storing_your_files_on_a_synology_nas_using_iscsi

That way you can still use your RPi for HA, but without wear and tear on the SD card and without an external drive. If you also enable netboot on your RPi 3, you don’t even need an SD card anymore.

I’m running a RetroPie/Kodi image and a Raspbian image with BerryBoot and iSCSI on an RPi 3, and despite the RPi’s 100 Mbit/s Ethernet, the images run faster over iSCSI than they did from the SD card before.

One more thing: If you don’t have a good reason to keep your history data for more than a couple of days, you should configure the recorder component with reasonable purge options:
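Something along these lines in configuration.yaml should do it. This is only a minimal sketch: the 5-day retention is an example value, and depending on your HA version there may also be a `purge_interval` option that controls how often the purge runs, so check the recorder documentation for your release:

```yaml
# configuration.yaml (sketch; retention value is an example)
recorder:
  # Delete history older than this many days
  purge_keep_days: 5
```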

This keeps the database file from growing through the roof over time and also improves HA’s performance.

Reinstalled everything yesterday and moved the database to my NAS. I can hear it writing stuff all the time now; I guess it is the data from my smart meter. Let’s hope it will keep running now. A reinstall goes pretty quickly, but fixing the small details like the naming of the Z-Wave smoke sensors takes too much time :wink:

Thanks for the help