Recently, I was looking at something in the core.device_registry file via FileZilla. As a precaution, I always make a copy of any critical file before opening it, so I copied core.device_registry to my laptop using FileZilla. As bad luck would have it, I accidentally made a change to the file and ended up corrupting it. As soon as I restarted HA, it failed to start and gave a huge warning about a corrupted core.device_registry file. HA had renamed the corrupted file and created a default core.device_registry from somewhere. This default core.device_registry file had incorrect device IDs, and most of my automations showed errors.
Since I had made a copy of the file before corrupting it, I deleted the default core.device_registry file and copied back the one I had saved on my laptop. When I restarted HA, it again warned me that the core.device_registry file was corrupted. How can that happen? I had copied over a fully functional file, but HA refused to use it. I tried several times, but HA kept throwing the corrupted file error. It would just rename the file as corrupted and create a default file.
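In hindsight, a quick check like the one below would at least have told me whether the copy on my laptop was still valid JSON before I uploaded it. This is only a sketch under assumptions: the function name check_json_file is my own, the path is whatever local copy is being tested, and it only checks that the file parses as JSON, not that HA will accept its contents.

```python
import json
import sys
from pathlib import Path


def check_json_file(path):
    """Return (True, "") if the file parses as JSON, else (False, reason)."""
    try:
        text = Path(path).read_text(encoding="utf-8")
        json.loads(text)
    except (OSError, UnicodeDecodeError, json.JSONDecodeError) as err:
        # Covers a missing file, a broken encoding, and syntax errors
        # such as a stray trailing comma.
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    # Hypothetical default: a copy named core.device_registry in the
    # current directory, unless another path is given on the command line.
    target = sys.argv[1] if len(sys.argv) > 1 else "core.device_registry"
    ok, reason = check_json_file(target)
    print("valid JSON" if ok else f"not valid JSON: {reason}")
```

If this reports a problem for the laptop copy, the copy itself was damaged somewhere along the way rather than HA rejecting a good file.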
Finally, I restored from a full backup and it worked. But my questions are:
a) Where did HA get the default core.device_registry file, which was totally wrong?
b) Why did HA fail to use the perfectly working file I copied from my laptop to HA?
I did not shut down HA core. I assumed that if I could delete the file and copy it, then it was not locked by the OS. How can I shut down HA core? There doesn’t seem to be any option. The only option is to shut down the entire system running HA, which would make the files inaccessible.
I tested after stopping the core and was able to restore the copied file. After restoring the copied file, I realized that the notification about the corrupted file does not go away on its own. I have to click the Submit button to tell HA that the issue has been fixed. The same thing happened when I corrupted the file accidentally and copied it back with core running: I thought that HA was still detecting a corrupted file. So, it looks like I do not need to stop core.
That being said, as I was testing with core stop/start file corruption etc., I noticed something way weirder.
I did this quickly:
a) Stop core.
b) Make a change to core.device_registry to corrupt it by adding an extra comma.
c) Start core. GUI reports file corruption.
d) Stop core.
e) Restore file from copy.
f) Start core. GUI still complains about file corruption. Click on Submit to tell HA that the issue has been fixed.
g) Stop core.
h) Make a change to core.device_registry again, such as adding an extra comma to a line, to corrupt the file.
i) Start core.
j) Change is gone. File is not corrupted.
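To make the test above less dependent on eyeballing the file, the file's hash could be recorded at each step and compared against the saved copy. The sketch below is just one way to do that: the helper names file_fingerprint and same_content are my own, and the two paths passed on the command line are hypothetical.

```python
import hashlib
import sys
from pathlib import Path


def file_fingerprint(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_content(path_a, path_b):
    """True if both files contain exactly the same bytes."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)


if __name__ == "__main__":
    # Usage (paths are assumptions):
    #   python compare.py saved_copy live_file
    if len(sys.argv) == 3:
        saved, live = sys.argv[1], sys.argv[2]
        print(file_fingerprint(saved), saved)
        print(file_fingerprint(live), live)
        print("identical" if same_content(saved, live) else "different")
    else:
        print("usage: compare.py SAVED_COPY LIVE_FILE")
```

Running it right after the edit with core stopped, and again right after core starts, would show whether the file on disk was actually rewritten in between.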
I have no idea why this is happening. I also noticed that after a couple of tests in which HA somehow undid my changes to core.device_registry, the name of the most recently added device changed. It reverted to the original name HA had assigned when the device was added. I had renamed it.
Does HA do some kind of restore from old backups if it detects a corrupted file?