Stability problems with NVMe on RPi 5

My Raspberry Pi 5 (16 GB) running Home Assistant OS frequently stops working. This happens at seemingly random intervals, every few weeks.

I cannot access the Home Assistant web page using my browser. The Home Assistant app is barely usable, and no useful information, such as log files, is accessible.
I cannot connect via SSH.
All functionality, such as Zigbee and Z-Wave communication, stops working.

When I connect a monitor, the console shows an endless stream of the following messages:

Buffer I/O error on dev nvme0n1p3, logical block 17399, async page read
erofs: (device nvme0n1p3): erofs_read_inode: failed to get inode (nid: 2227159) page, err -5
systemd-journald[130]: Failed to write entry to /var/log/journal/<id>/system.journal (21 items, 685 bytes) despite vacuuming, ignoring: Input/output error
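
For anyone trying to triage similar output, here is a minimal sketch that counts messages like the three above by kind and by device. It assumes the console text has been captured as lines of plain text. The function name `summarize` and the regular expressions are illustrative only, modelled on the three lines quoted above; other kernel versions may word these messages differently. For reference, `err -5` is the Linux errno EIO, which matches the "Input/output error" text in the journald line.

```python
import re
from collections import Counter

# Patterns modelled on the three console lines quoted in this post.
# They are assumptions, not an exhaustive list of kernel message formats.
BUFFER_IO = re.compile(
    r"Buffer I/O error on dev (?P<dev>[^\s,]+), logical block (?P<block>\d+)"
)
EROFS = re.compile(r"erofs: \(device (?P<dev>[^)]+)\): .*err (?P<err>-?\d+)")
JOURNALD = re.compile(r"systemd-journald\[\d+\]: Failed to write entry")


def summarize(lines):
    """Count storage-related error lines by kind and by device."""
    kinds = Counter()
    devices = Counter()
    for line in lines:
        match = BUFFER_IO.search(line)
        if match:
            kinds["buffer_io"] += 1
            devices[match.group("dev")] += 1
            continue
        match = EROFS.search(line)
        if match:
            kinds["erofs"] += 1
            devices[match.group("dev")] += 1
            continue
        if JOURNALD.search(line):
            # The journald line names no block device, so only the kind is counted.
            kinds["journald"] += 1
    return kinds, devices


# The three lines from the console output above, used as sample input.
SAMPLE = [
    "Buffer I/O error on dev nvme0n1p3, logical block 17399, async page read",
    "erofs: (device nvme0n1p3): erofs_read_inode: failed to get inode "
    "(nid: 2227159) page, err -5",
    "systemd-journald[130]: Failed to write entry to "
    "/var/log/journal/<id>/system.journal (21 items, 685 bytes) "
    "despite vacuuming, ignoring: Input/output error",
]

kinds, devices = summarize(SAMPLE)
print(dict(kinds))
print(dict(devices))
```

If every error names the same partition, as here, that points at the drive or its link rather than at one corrupted file, though it does not by itself distinguish a hardware fault from a driver problem.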

The only way out of this situation is a hard reboot of the Raspberry Pi.

The home-assistant, host and supervisor logs, which are accessible after rebooting, show no useful information.

Setup currently in use:

HW:
Raspberry Pi 5, 16 GB
Raspberry Pi 45W USB-C Power Supply
Pi NVMe HAT: 52pi 04-m-2-2280-pcie-to-nvme-top
SSD: WD Blue SN5000 500GB PCIe G4
Sonoff Zigbee 3.0 USB Dongle Plus, TI CC2652P
Zooz ZST39 LR Series 800 Z-Wave USB stick
SW:
Home Assistant OS
Core: 2025.6.1
Supervisor: 2025.05.5
OS: 15.2

Zigbee2MQTT, InfluxDB, Grafana, etc.

However, this issue also occurred with my previous setup, an RPi 4 booting from an SD card with an additional USB SSD. I suspected that the USB SSD might be the cause, so I switched to the RPi 5 setup with an NVMe SSD. To my great frustration, I still have the same problems!

This issue might be related to the following post; however, that post didn’t receive any responses:

This post might also describe the issue, but it didn’t result in a solution either:

Does anyone have an idea what might be causing the issue, or whether it is possible to get more information (logs) to find the cause?

My suggestion would be to try a different NVMe SSD - ideally one of the RPi Foundation own-brand tested devices.

Several RPi5 HATs (and the Yellow) report issues with specific SSDs. I’ve not got a compatibility list to hand but ISTR issues with command queueing, overheating, and brown-out of power.

The Pimoroni list calls out a range of WD SSDs:

I certainly had boot issues with the Pimoroni HAT and Samsung 970 EVOs, although other Samsung drives work in the Yellow (probably pushes the device a lot less as PCIe 1 or 2, not 3.0).

If this helps, :heart: this post!

I have to admit I’m desperate enough to try something like buying another SSD. However, I do wonder why I encounter the exact same issues with such a different setup: I switched the RPi from 4 to 5 and the SSD from USB to NVMe. Might it not be software related?

I also wonder what other HA users have as their hardware and storage configuration. I always assumed an RPi with HA is a pretty common setup. If that is indeed the case, I would expect more people to encounter the same problems I have.

Also, I assume it won’t make any difference given the modest hardware of an RPi, but the speed difference between the RPi SSD (50k / 90k IOPS read/write) and the WD Blue (460k / 770k IOPS read/write) makes me cringe a bit.

Before going with the nuclear option, can you get your hands on a powered USB hub for your SSD?

Any cheap USB 2 one will do, as long as you provide it with a separate power source. My money’s on insufficient USB power for your SSD.

The RPi forum goes into some detail about software features the Linux kernel can use, and that some manufacturers don’t implement or don’t implement correctly (ISTR command queueing was one).

Given the cost of a RPi “original” NVMe, I decided it’s not worth the time to tune, and just upgraded to a “known good” part. The old one is in a USB-C enclosure as a portable drive - seems to work for that use.

Hi,
I’m experiencing the exact same issue. After running a bunch of tests, I found that the problem only shows up when I’m using HA OS flashed to the NVMe. At the same time, if I run the Supervised version (using Raspberry Pi OS Lite, which is basically slimmed-down Debian), everything works rock solid. That makes me think it’s all about how the kernel is built—if it’s optimized for the Pi, no issues at all.

The bad news is that they’re planning to phase out the Supervised version, so I’m back to square one…

Since I can’t really change how the kernel handles my NVMe, the only thing left to try is a different drive—maybe HA OS will “like” it better. I’m currently using a WD SN530 NVMe 256GB, and I’m planning to test it with a Micron 512GB next.

Cheers,
George

I will start by using a powered USB hub, which seems a useful improvement anyway. If that doesn’t work, I’ll try the official RPi SSD.
Switching from HA OS to Supervised will be a last resort, especially if they are phasing out that version. I do agree with the suggestion of jokoto777 that it is a software issue and not a hardware issue, since I already switched hardware.

If, in the meantime, anybody has any other suggestions or solutions, please let me know!