NetworkManager gets connected to WiFi but no internet

I suspect there’s something odd about the network setup that supervised HASS produces, and I hope someone can chime in and point me in the right direction. I’m running supervised HASS on top of Raspberry Pi OS (buster).

My problem is that from time to time my RPi 4 loses its WiFi connection, and after it reconnects it doesn’t get internet access back (even though I can still ssh to it over WiFi). At least not until a reboot.

It’s unclear to me why it disconnects from WiFi in the first place, but in any case I have a cron job running every 5 minutes that restarts the NetworkManager service when the connection is lost. After the restart it successfully reconnects to WiFi, but the device still can’t reach the internet. I see nothing suspicious in syslog that could trigger the disconnect, nor any errors while connecting.
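The cron job itself isn’t shown here, so the following is only a rough sketch of what such a watchdog could look like, not the actual script. The function name `restart_if_offline` and the script path in the crontab comment are made up for illustration, and pinging the gateway only checks the local WiFi link, not internet reachability.

```shell
#!/bin/sh
# Hypothetical watchdog sketch. GATEWAY is the router address seen in this thread.
GATEWAY="192.168.50.1"

# Restart NetworkManager only if the given host does not answer a single ping.
restart_if_offline() {
    if ping -c 1 -W 5 "$1" >/dev/null 2>&1; then
        echo "online"
    else
        echo "offline, restarting NetworkManager"
        systemctl restart NetworkManager
    fi
}

# Example root crontab entry, every 5 minutes (the path is an assumption):
#   */5 * * * * /usr/local/bin/wifi-watchdog.sh
# The script would end with:
#   restart_if_offline "$GATEWAY"
```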

The only thing I’ve managed to diagnose is that the routing table changes, and my device probably defaults to one of the virtual networks HASS created rather than using wlan0.

Before reboot

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         0.0.0.0         0.0.0.0         U     207    0        0 veth47c24e9
0.0.0.0         192.168.50.1    0.0.0.0         UG    600    0        0 wlan0
169.254.0.0     0.0.0.0         255.255.0.0     U     207    0        0 veth47c24e9
169.254.0.0     0.0.0.0         255.255.0.0     U     209    0        0 veth66c4199
169.254.0.0     0.0.0.0         255.255.0.0     U     211    0        0 vethae8c724
169.254.0.0     0.0.0.0         255.255.0.0     U     213    0        0 veth1ee51c7
169.254.0.0     0.0.0.0         255.255.0.0     U     215    0        0 vetha757eae
169.254.0.0     0.0.0.0         255.255.0.0     U     217    0        0 veth126ec60
169.254.0.0     0.0.0.0         255.255.0.0     U     219    0        0 veth2bf023a
169.254.0.0     0.0.0.0         255.255.0.0     U     221    0        0 veth0c3cdc3
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.30.32.0     0.0.0.0         255.255.254.0   U     0      0        0 hassio
192.168.50.0    0.0.0.0         255.255.255.0   U     600    0        0 wlan0

After reboot

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.50.1    0.0.0.0         UG    0      0        0 wlan0
0.0.0.0         0.0.0.0         0.0.0.0         U     207    0        0 veth5e60ad6
0.0.0.0         192.168.50.1    0.0.0.0         UG    600    0        0 wlan0
169.254.0.0     0.0.0.0         255.255.0.0     U     207    0        0 veth5e60ad6
169.254.0.0     0.0.0.0         255.255.0.0     U     209    0        0 veth1c8a93b
169.254.0.0     0.0.0.0         255.255.0.0     U     211    0        0 veth8a99575
169.254.0.0     0.0.0.0         255.255.0.0     U     213    0        0 veth483f273
169.254.0.0     0.0.0.0         255.255.0.0     U     215    0        0 vetha5e5c7a
169.254.0.0     0.0.0.0         255.255.0.0     U     217    0        0 vethdecaa67
169.254.0.0     0.0.0.0         255.255.0.0     U     219    0        0 vethb122bf8
169.254.0.0     0.0.0.0         255.255.0.0     U     221    0        0 veth61d0906
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.30.32.0     0.0.0.0         255.255.254.0   U     0      0        0 hassio
192.168.50.0    0.0.0.0         255.255.255.0   U     600    0        0 wlan0

I wonder if anyone could give me a hint how to debug/fix that?

As far as I can tell from “wifi - Howto migrate from networking to systemd-networkd with dynamic failover” on Raspberry Pi Stack Exchange, it seems that Home Assistant creates a second default route, and when wlan0 fails, the system switches to that other one created for docker. After wlan0 reconnects, its only default route has metric 600, so the veth route (metric 207) has the lower metric and is used by default.
So the question is: should the network created by Home Assistant have a default route at all? If so, how can I ensure it never takes precedence over wlan0?
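To check which default route currently wins, something like the following might help. It is a minimal sketch that reads `route -n` style output and prints the interface and metric of the default route with the lowest metric, on the assumption that the lowest metric is the one the kernel prefers. The function name `lowest_metric_default` is made up for this example.

```shell
#!/bin/sh
# Read `route -n` style output on stdin and print "<iface> <metric>" for the
# default route (destination and genmask both 0.0.0.0) with the lowest metric.
lowest_metric_default() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" {
             if (best == "" || $5 + 0 < min) { min = $5 + 0; best = $8 }
         }
         END { if (best != "") print best, min }'
}

# Usage on a live system:
#   route -n | lowest_metric_default
```

Fed the “before reboot” table it would report the veth interface with metric 207, and with the “after reboot” table it would report wlan0 with metric 0.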

Looking at the router logs, I discovered that sometimes a radar is detected, which causes a channel change. I’ve switched to a fixed channel, so that should reduce the number of WiFi failures, but it would still be nice to have a mechanism that restores networking whenever it fails. (I wonder why NetworkManager doesn’t handle the channel change. Is it RPi specific?)

Ok, never mind, I’ve added a few cron jobs, and it seems that running this line after WiFi reconnects “solves” it:

/usr/sbin/route add -net default gw 192.168.50.1 netmask 0.0.0.0 dev wlan0 metric 0
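Instead of cron, the same command could probably be hooked into NetworkManager’s dispatcher, which runs scripts from /etc/NetworkManager/dispatcher.d/ with the interface name as the first argument and the action as the second. This is an untested sketch: the file name and the helper `is_wifi_up_event` are made up, and the script would need to be owned by root and executable.

```shell
#!/bin/sh
# Hypothetical dispatcher script, e.g.
# /etc/NetworkManager/dispatcher.d/90-wlan0-default-route (name is an assumption).
GATEWAY="192.168.50.1"   # gateway from the routing tables in this thread
IFACE="wlan0"

# Succeeds only for the "up" action on the WiFi interface.
is_wifi_up_event() {
    [ "$1" = "$IFACE" ] && [ "$2" = "up" ]
}

# NetworkManager passes the interface as $1 and the action as $2.
if is_wifi_up_event "${1:-}" "${2:-}"; then
    /usr/sbin/route add -net default gw "$GATEWAY" netmask 0.0.0.0 dev "$IFACE" metric 0
fi
```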

It would be nice though if home assistant setup didn’t require it

The bug is still there.
Another workaround is:

sudo nmcli connection modify "Supervisor wlan0" ipv4.route-metric 1
sudo nmcli connection up "Supervisor wlan0"