Issues with notifications during network failover

Hello everyone,

I’ve been working for a couple of days on configuring a failover setup on a MikroTik router to maintain internet connectivity via LTE when the fiber connection goes down.

Main issue

The failover itself works correctly, but I have a major problem:

I am NOT receiving notifications on my mobile device after the router switches between fiber and 4G (failover), specifically when the phone is connected via WiFi.

Observed behavior:

  • If the phone is on WiFi:

    • Notifications stop arriving after the failover.
    • They only resume if I manually change the connection (e.g., switching to mobile data).
  • If the phone is on mobile data (4G/5G):

    • Everything works correctly, no issues.
  • Additionally:

    • Home Assistant Cloud takes some time to reconnect (I partially mitigated this by restarting it via webhook).

This leads me to believe the issue is related to persistent connections/sockets that become invalid after the public IP changes.

Context

Initially, I also had issues with Home Assistant Cloud after failover, but I solved that by sending a webhook to Home Assistant to force a cloud reconnection.

However, the notification issue still persists, which is my main concern right now.

Current network topology

  • ISP router (fiber)
  • MikroTik (handling 4G failover)
  • TP-Link Deco Mesh (in router mode)

I know this is not ideal. My future plan is:

  • ONT + MikroTik as the main router
  • Deco in AP mode

But for now, I prefer not to change this until everything is stable.

Failover behavior

  • WAN (fiber) → distance 1
  • LTE (4G) → distance 2
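
As a rough sketch, a two-route setup like this usually looks something like the following. The gateway address `192.168.1.1`, the interface name `lte1`, and the use of `check-gateway=ping` are placeholders and assumptions for illustration, not my exact configuration.

```
# Primary default route via the ISP router (placeholder gateway address)
/ip route add dst-address=0.0.0.0/0 gateway=192.168.1.1 distance=1 check-gateway=ping comment="WAN fiber"
# Backup default route via the LTE interface (placeholder interface name)
/ip route add dst-address=0.0.0.0/0 gateway=lte1 distance=2 comment="LTE backup"
```

The route with the lower distance is preferred while it is active; when it becomes inactive, the distance-2 route takes over.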

The failover works correctly:

  • Automatic
  • Fast
  • No noticeable interruptions

The issue is not the failover itself, but what happens afterward.

Technical issue observed

When switching between Fiber and 4G:

  • The public IP changes
  • Some connections remain open or become invalid

This results in:

  • Notifications not being received over WiFi
  • Temporary issues with Home Assistant Cloud
  • Problems with both cloud access and local access (I use Tailscale)

MikroTik script

I use the following script to detect the change and trigger actions:

# Last known uplink ("WAN" or "4G"), kept between script runs
:global lastState
:local haIP "IP_HOME_ASSISTANT"

# Find the lowest distance among the active default routes
:local activeRoutes [/ip route find where dst-address="0.0.0.0/0" active=yes]
:local bestDist 999

:foreach i in=$activeRoutes do={
    :local d [/ip route get $i distance]
    :if ($d < $bestDist) do={
        :set bestDist $d
    }
}

:log info "[CHECK] Active distance: $bestDist | Previous state: $lastState"

# Distance 1 means the fiber route is active; anything else is treated as 4G
:if ($bestDist = 1) do={

    :if ($lastState != "WAN") do={
        :log warning "[FAILOVER] >>> SWITCH to FIBER (distance=$bestDist)"
        # Notify Home Assistant through its webhook
        /tool fetch url="http://$haIP:8123/api/webhook/failover_fiber" http-method=post keep-result=no
        :set lastState "WAN"

        :delay 5s

        # Intended to clear tracked connections to/from Home Assistant, then all non-local ones
        /ip firewall connection remove [find protocol=tcp dst-address=$haIP dst-port=8123]
        /ip firewall connection remove [find protocol=tcp src-address=$haIP src-port=8123]
        /ip firewall connection remove [find dst-address-type=!local]

        /ip dns cache flush
    }

} else={

    :if ($lastState != "4G") do={
        :log warning "[FAILOVER] >>> SWITCH to 4G (distance=$bestDist)"
        # Notify Home Assistant through its webhook
        /tool fetch url="http://$haIP:8123/api/webhook/failover_4g" http-method=post keep-result=no
        :set lastState "4G"

        :delay 5s

        # Same cleanup as in the fiber branch
        /ip firewall connection remove [find protocol=tcp dst-address=$haIP dst-port=8123]
        /ip firewall connection remove [find protocol=tcp src-address=$haIP src-port=8123]
        /ip firewall connection remove [find dst-address-type=!local]

        /ip dns cache flush
    }
}

Home Assistant automation

When the webhook is received:

  • It restarts Home Assistant Cloud
  • Sends a critical notification to the mobile device

Questions

I would like to understand:

  • Am I handling connection tracking correctly?
  • Is removing connections this way the right approach?
  • Is there a better way to prevent persistent connections from breaking after an IP change?
  • How is this typically handled in failover setups with services like Home Assistant?

Any help or guidance would be greatly appreciated.

Right now, failover works perfectly, but notifications are no longer reliable, which is exactly what I need to fix.

Thank you in advance! 🙌

What happens if you just reboot the router to force all your devices to form a new connection?

From the looks of it, you’re only having the issue when you’re at home and connected to WiFi? I don’t use Home Assistant Cloud. I just connect locally, and I have a VPN that all my devices connect to when I’m away from home. On my own network (also running a MikroTik router), the domain names of all my home services resolve directly to their local addresses, so my public IP is irrelevant. Everything does have an FQDN, but each one points straight to its private IP on the home network. I have a Pi-hole running, but you can add custom DNS entries on MikroTik too.
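
A minimal sketch of what a static DNS entry on the MikroTik could look like. The hostname `ha.example.com` and the address `192.168.88.10` are made-up placeholders, and it assumes your devices use the router as their DNS server.

```
# Let LAN clients use the router as their DNS resolver
/ip dns set allow-remote-requests=yes
# Resolve the service's name straight to its private address (placeholder values)
/ip dns static add name=ha.example.com address=192.168.88.10
```

With `allow-remote-requests=yes`, make sure the firewall blocks DNS queries arriving from the WAN side.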

Maybe this might solve your problem?

Alternatively, you can drop a WiFi client from the WiFi section of the GUI. Knowing MikroTik, you should be able to do the same thing from the terminal to drop your phone and force it to reconnect (as long as your phone isn’t using a randomised MAC address).
These reconnections take only seconds. I do this whenever I assign a new static IP to a device.
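
If it helps, here is a sketch of the terminal equivalent, assuming the access point is a MikroTik using the legacy `wireless` package (the menu path differs with the newer WiFi drivers, and the MAC address below is a placeholder):

```
# Disconnect one client by MAC address; it should re-associate on its own
/interface wireless registration-table remove [find mac-address="AA:BB:CC:DD:EE:FF"]
```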

The nuclear option would be to reboot the router and just force everything to make a new connection.

Keeping sessions alive across a changing public IP address is probably asking a little too much from Home Assistant, especially if the notifications are sent in real time and not queued.


Telegram should work fine in those scenarios.


I would prefer not to switch to Telegram or similar, because it would mean losing the notification actions I have configured (such as a button that stops the alarm’s effects when it is triggered, without disabling the alarm itself).