Migration rpi3->rpi4, huge I/O performance issue (iowait sensor)

Hi all. For a couple of years my Home Assistant has been running really well on an rpi3 with a random 64GB microSD card. When the rpi4 came out I decided to buy one.
Two days ago I upgraded to the latest HA, made a backup, installed a fresh HA on the rpi4, and restored the backup onto an almost new Samsung EVO U3 64GB card. I also swapped the zigbee2mqtt USB stick for a more powerful one.

That is when the problems began. Lovelace is not as responsive as it was on the rpi3, and the HA mobile app takes 10-15 seconds to open; in some cases it gives me the error “homeasistant is not reachable, retry”. Switches and lights react about 10 seconds after I press them. The whole system seems to have hiccups.
CPU load is under 10% and there is plenty of free RAM and swap. Temperature stays around 50 degrees.

The logs are full of entries like these (on the rpi3 I got just a few, as far as I remember):

2020-09-28 00:16:32 ERROR (MainThread) [homeassistant.components.rest.switch] No route to resource/endpoint: http://192.168.1.52/json/state
2020-09-28 00:16:36 WARNING (MainThread) [homeassistant.components.broadlink.sensor] The sensor platform is deprecated, please remove it from your configuration
2020-09-28 00:16:37 WARNING (MainThread) [homeassistant.components.calendar] Setup of calendar platform google is taking over 10 seconds.
2020-09-28 00:16:37 WARNING (MainThread) [homeassistant.components.camera] Setup of camera platform netatmo is taking over 10 seconds.
2020-09-28 00:16:38 WARNING (MainThread) [homeassistant.components.climate] Setup of climate platform netatmo is taking over 10 seconds.
2020-09-28 00:16:38 ERROR (MainThread) [homeassistant.components.calendar] Entity id already exists - ignoring: calendar.contacts
2020-09-28 00:16:39 WARNING (MainThread) [homeassistant.components.light] Setup of light platform mqtt is taking over 10 seconds.
2020-09-28 00:16:39 WARNING (MainThread) [homeassistant.components.switch] Setup of switch platform mqtt is taking over 10 seconds.
2020-09-28 00:16:39 WARNING (MainThread) [homeassistant.components.binary_sensor] Setup of binary_sensor platform mqtt is taking over 10 seconds.
2020-09-28 00:16:43 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform scrape is taking over 10 seconds.
2020-09-28 00:16:43 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform scrape is taking over 10 seconds.
2020-09-28 00:16:43 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform mqtt is taking over 10 seconds.
2020-09-28 00:16:46 WARNING (MainThread) [homeassistant.components.sensor] Setup of sensor platform netatmo is taking over 10 seconds.
2020-09-28 00:20:59 WARNING (MainThread) [homeassistant.helpers.entity] Update of device_tracker.son_bedroomnight is taking over 10 seconds
2020-09-28 00:20:59 WARNING (MainThread) [homeassistant.helpers.entity] Update of device_tracker.hs110 is taking over 10 seconds
2020-09-28 00:20:59 WARNING (MainThread) [homeassistant.helpers.entity] Update of device_tracker.son_gardenlamp_2 is taking over 10 seconds
2020-09-28 00:20:59 WARNING (MainThread) [homeassistant.helpers.entity] Update of device_tracker.son_bridge_2 is taking over 10 seconds
2020-09-28 00:20:59 WARNING (MainThread) [homeassistant.helpers.entity] Update of device_tracker.google_home_mini is taking over 10 seconds
2020-09-28 00:20:59 WARNING (MainThread) [homeassistant.helpers.entity] Update of device_tracker.google_home_mini_2 is taking over 10 seconds

etc…

The terminal is really slow and laggy as well, and it feels like the problem is related to I/O.

iostat shows %iowait at around 18.5% all the time, which I guess is not good.

How can I see what keeps writing to the SD card and clogs up the system? Maybe someone has an idea what is happening with my HA. :frowning:
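One possible way to answer the "what keeps writing" question is to compare the per-process `write_bytes` counters that the Linux kernel exposes in `/proc/<pid>/io`. The following is only a minimal sketch: the function names are made up for illustration, it assumes a shell where Python is available, and it generally needs root to read other users' processes. On a containerised HA install it will only see the processes visible from wherever it runs.

```python
import os
import time


def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of int counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def snapshot_write_bytes(proc_root="/proc"):
    """Return {pid: (command name, write_bytes)} for every readable process."""
    snapshot = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "io")) as fh:
                io_counters = parse_proc_io(fh.read())
            with open(os.path.join(base, "comm")) as fh:
                name = fh.read().strip()
        except OSError:
            # The process exited, or we lack permission to read it.
            continue
        snapshot[int(entry)] = (name, io_counters.get("write_bytes", 0))
    return snapshot


def top_writers(before, after, limit=10):
    """Rank processes by bytes written between two snapshots."""
    deltas = []
    for pid, (name, written) in after.items():
        if pid in before:
            deltas.append((written - before[pid][1], pid, name))
    deltas.sort(reverse=True)
    return [d for d in deltas if d[0] > 0][:limit]


def report(interval=10):
    """Take two snapshots `interval` seconds apart and print the top writers."""
    first = snapshot_write_bytes()
    time.sleep(interval)
    second = snapshot_write_bytes()
    for delta, pid, name in top_writers(first, second):
        print(f"{delta / 1024:10.1f} KiB  pid={pid:<7} {name}")
```

Calling `report()` prints the processes that wrote the most during the interval. If one process dominates the list, that is a reasonable place to start looking.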

Looks like the problem was with the database. I deleted all of these:

home-assistant_v2.db
home-assistant_v2.db-shm
home-assistant_v2.db-wal
home-assistant_v2.db.corrupt.202

Afterwards iowait started to fall, and responsiveness is back. :+1:t2:

I made a sensor for monitoring the iowait %. Maybe it is useful for someone.

  - platform: command_line
    name: iowait
    command: "iostat -c|awk '/^ /{print $4}'"
    unit_of_measurement: "%"
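One caveat with the sensor above: when `iostat` is run without an interval, its report covers the time since boot, so the value reacts slowly to recent changes. As an alternative, here is a rough sketch that computes iowait over a short window from the aggregate `cpu` line in `/proc/stat`. The function names and the 5-second default are my own choices, not anything from HA.

```python
import time


def parse_cpu_line(stat_text):
    """Return (iowait_ticks, total_ticks) from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal ...
            # guest and guest_nice are already counted in user and nice,
            # so only the first eight fields go into the total.
            return values[4], sum(values[:8])
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    d_iowait = after[0] - before[0]
    d_total = after[1] - before[1]
    return 100.0 * d_iowait / d_total if d_total > 0 else 0.0


def measure_iowait(interval=5.0, path="/proc/stat"):
    """Sample /proc/stat twice, `interval` seconds apart."""
    with open(path) as fh:
        before = parse_cpu_line(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_cpu_line(fh.read())
    return iowait_percent(before, after)
```

A small script that prints `round(measure_iowait(), 1)` could probably be used as the `command` of the same command_line sensor, but I have not tested that in HA.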

The problem started again. Responsiveness is bad, iowait is now 8%, and the log is full of errors, mostly about something taking over X seconds.

Looks like I will have to start from scratch…

To continue… I tried a different card, a Kingston 64GB; the issue was even worse, but it looked like the card itself was acting a bit weird.

Then I took an almost new SanDisk 16GB card and restored from a snapshot, and my HA was up and running in no time. All issues are gone, responsiveness is back, and iowait is under 1.9%.

Did some tests on these cards while running HA:

➜  ~ ioping -RL /dev/mmcblk0

Samsung EVO 64 GB U3 (two runs)

--- /dev/mmcblk0 (block device 59.6 GiB) ioping statistics ---
2 requests completed in 5.13 s, 512 KiB read, 0 iops, 99.9 KiB/s
generated 3 requests in 6.08 s, 768 KiB, 0 iops, 126.3 KiB/s
min/avg/max/mdev = 1.72 s / 2.56 s / 3.41 s / 844.7 ms


--- /dev/mmcblk0 (block device 59.6 GiB) ioping statistics ---
46 requests completed in 4.15 s, 11.5 MiB read, 11 iops, 2.77 MiB/s
generated 47 requests in 4.17 s, 11.8 MiB, 11 iops, 2.82 MiB/s
min/avg/max/mdev = 6.54 ms / 90.2 ms / 1.72 s / 347.6 ms



SanDisk 16G

--- /dev/mmcblk0 (block device 14.8 GiB) ioping statistics ---
433 requests completed in 2.96 s, 108.2 MiB read, 146 iops, 36.5 MiB/s
generated 434 requests in 3.01 s, 108.5 MiB, 144 iops, 36.1 MiB/s
min/avg/max/mdev = 6.32 ms / 6.84 ms / 25.8 ms / 1.06 ms
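
To put the latency lines side by side, a small helper like the one below can convert the `min/avg/max/mdev` summary into milliseconds. This is only a sketch: the helper name is invented, and it assumes the summary format shown in the output above.

```python
# Conversion factors from ioping's time units to milliseconds.
_UNIT_TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1000.0, "min": 60000.0}


def parse_latency_line(line):
    """Parse 'min/avg/max/mdev = 1.72 s / 2.56 s / ...' into milliseconds."""
    _, _, values = line.partition("=")
    result = {}
    for name, part in zip(("min", "avg", "max", "mdev"), values.split("/")):
        number, unit = part.split()
        result[name] = float(number) * _UNIT_TO_MS[unit]
    return result


samsung = parse_latency_line(
    "min/avg/max/mdev = 1.72 s / 2.56 s / 3.41 s / 844.7 ms"
)
sandisk = parse_latency_line(
    "min/avg/max/mdev = 6.32 ms / 6.84 ms / 25.8 ms / 1.06 ms"
)
# How many times slower the average request was on the Samsung card.
ratio = samsung["avg"] / sandisk["avg"]
```

For the first Samsung run the average request took roughly 370 times longer than on the SanDisk. Even the better Samsung run (90.2 ms average) is about 13 times slower.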

The Samsung EVO is a U3 card. It looks like this has something to do with the rpi4 controller not liking the U3 card.

I am ordering a new SanDisk U1 card and hoping that ends my saga :slight_smile: