HAOS Crashing - need help

I recently backed up my Pi 4 (4 GB of RAM, USB-attached NVMe) and migrated to a Pi 5 (8 GB of RAM, NVMe hat) because I’d started seeing OOM errors. Both old and new boot from the NVMe (it’s the only local storage). The Pi 5 was a fresh install with a restore from Nabu Casa cloud. I’ve had a couple of crashes in the last few days. Tonight I caught the errors in the image here on screen:

Everything is fully up to date.

I’m not sure how to debug this and am hoping someone here can help. I’m perfectly comfortable jumping on the cli or pulling any needed logs… I just need some direction. Thanks in advance

It looks to me like your NVMe is failing.

Try plugging it into the USB 2 port instead of the USB 3 port. There is some chat about compatibility and interference issues that this workaround seems to resolve. Slower, sure, but more reliable.

Given that there were data errors in at least two locations, the contents of your drive are suspect. Wipe your drive, start again, and restore from a known working backup.

Regarding a failing drive, it seems unlikely based on age, but I have a replacement on order just in case it is that and not some other issue.

This is a Pi 5 hat (the iUniker PCIe M.2 HAT+ for Raspberry Pi 5, from Amazon), not an external enclosure, so it’s not a USB version issue. I am, however, starting to think it might be related to overall power draw: I moved one accessory over to a powered USB hub and it hasn’t crashed again since. I’ve also added a hardware device in line with the power cord to monitor the maximum power draw, in hopes of gaining some related insights.
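
For anyone else chasing a suspected power problem: besides an inline meter, the Pi firmware keeps its own under-voltage and throttling flags, reported by `vcgencmd get_throttled` as a hex bitmask. Below is a minimal sketch for decoding that value. It assumes you can get the command's output at all (it may not be reachable from a stock HAOS shell), and the function name `decode_throttled` is just a made-up helper for illustration.

```python
# Bit meanings of the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active now",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output: str) -> list[str]:
    """Turn a string like 'throttled=0x50005' into readable flags.

    Hypothetical helper: it only parses text you pass in and does not
    run vcgencmd itself.
    """
    # Take the part after '=' and parse it as hexadecimal.
    value = int(output.strip().split("=", 1)[1], 16)
    # Report every known flag whose bit is set, lowest bit first.
    return [
        text
        for bit, text in sorted(THROTTLE_FLAGS.items())
        if value & (1 << bit)
    ]
```

A result of `throttled=0x0` decodes to an empty list, meaning the firmware has not seen a power or thermal problem since boot. Any of the "under-voltage" flags would point at the supply or cable rather than the drive.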

:+1:

I think @IOT7712’s suggestion of data corruption has merit too.

No more crashes yet… it’s looking like power draw on the Pi might have been it.

So, it seems the actual cause was the tiny cable that goes between the hat and Pi wasn’t seated perfectly :man_facepalming:t2:
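
In case it helps the next person with a flaky hat connection: one way to spot this sort of thing sooner is to filter the kernel log for storage and bus messages instead of waiting to catch them on screen. This is only a rough sketch. It assumes you have saved the kernel log as text (for example from `dmesg`), and the keyword list is my own guess at what is worth flagging, not an official set of error strings.

```python
import re

# Assumed keyword list (not exhaustive, not official): adjust it to
# whatever actually appears in your own logs.
PATTERN = re.compile(
    r"nvme|pcie|i/o error|undervoltage|under-voltage",
    re.IGNORECASE,
)


def suspicious_lines(log_text: str) -> list[str]:
    """Return the log lines that mention any of the keywords above."""
    return [line for line in log_text.splitlines() if PATTERN.search(line)]
```

Run it over logs from before and after reseating the cable; if the matching lines disappear, that is reasonable (though not conclusive) evidence the connection was the culprit.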