How to best deal with devices involved in automations that go unavailable occasionally

I guess I should start by saying that I'm an HA newbie: my installation is less than 2 weeks old, but I'm really enjoying messing around with it. I have a bunch of TP-Link smart plugs that I use for things like turning lights on and off at specific times, turning my coffee maker on at 7am, etc. Since these are cheap Wi-Fi devices with what are likely puny little Wi-Fi transceivers, they drop out for a few seconds here and there and then come back. I came downstairs this morning and noticed my aquarium light was still on from the previous day, even though an automation was scheduled to turn it off at 9pm. I looked in the logbook for this device and saw that it went unavailable for about 15 seconds right as the automation was supposed to run.

What I need help with is dealing with this scenario. Ultimately, I would like the automation to keep trying until it succeeds in changing the device to the desired state. A nice-to-have eventually would be a notification if the device is unavailable for an extended period, but I'd be happy just to get the automation to be reliable despite these occasional dropouts. While nothing super important is using timers like this, the aquarium light can cause excess algae to grow if it's left on too long.

I don’t think it’s normal for wi-fi devices to drop out as often as yours seem to be doing…

But in any case, the automations should still run, even if they don’t complete everything successfully. Are you using device ids in your automations? If you are, the device becoming unavailable will also make the automation unavailable. Much better to use entity ids and services.
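
To illustrate the difference, here is a rough sketch of the two action styles. The device id is a made-up placeholder and switch.your_switch is just an example entity name; your own values will differ.

```yaml
# Device action, roughly as the UI editor generates it.
# The device_id below is a hypothetical placeholder.
- device_id: 0123456789abcdef0123456789abcdef
  domain: switch
  entity_id: switch.your_switch
  type: turn_off

# Equivalent service call that refers to the entity id only.
- service: switch.turn_off
  target:
    entity_id: switch.your_switch
```

Only one of the two would go in your automation's action list; they are shown together here for comparison.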

For the notifications, you could write a short automation to be triggered when the state of the switch entity associated with the plug changes to unavailable.
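
A minimal sketch of such a notification automation might look like this. The entity name, the 5-minute threshold, and the use of notify.notify are all assumptions; substitute your own switch entity and whichever notify service you have set up.

```yaml
- alias: "Notify when plug goes unavailable"
  trigger:
    - platform: state
      entity_id: switch.your_switch   # assumed entity name
      to: "unavailable"
      # Only fire if it stays unavailable this long (assumed threshold),
      # so brief dropouts of a few seconds are ignored.
      for:
        minutes: 5
  action:
    - service: notify.notify          # assumed notify service
      data:
        message: "switch.your_switch has been unavailable for 5 minutes."
```

The "for" option is what keeps the short 15-second dropouts from spamming you; adjust it to whatever you consider an extended period.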

This turns off a switch and then, 5 seconds later, checks if the switch is off. If it isn’t then it tries again. It tries a maximum of 6 times (5 seconds x 6 times = 30 seconds total) and then gives up. You can adjust the time period and number of attempts to suit your requirements.

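  - repeat:
      # Stop once the switch reports 'off' or after the 6th attempt.
      until: "{{ is_state('switch.your_switch', 'off') or repeat.index >= 6 }}"
      sequence:
        # Attempt to turn the switch off.
        - service: switch.turn_off
          target:
            entity_id: switch.your_switch
        # Wait before the 'until' check is evaluated.
        - delay:
            seconds: 5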
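
Putting it together for the aquarium light, a complete automation could look roughly like the sketch below. The entity name switch.aquarium_light and the notify.notify service are assumptions, and the final step (a message if every attempt failed) is an optional extra rather than part of the loop above.

```yaml
- alias: "Aquarium light off at 9pm with retries"
  trigger:
    - platform: time
      at: "21:00:00"
  action:
    - repeat:
        # Stop once the light reports 'off' or after the 6th attempt.
        until: "{{ is_state('switch.aquarium_light', 'off') or repeat.index >= 6 }}"
        sequence:
          - service: switch.turn_off
            target:
              entity_id: switch.aquarium_light   # assumed entity name
          - delay:
              seconds: 5
    # Optional: if the light is still not off after all attempts, send a message.
    - if:
        - condition: template
          value_template: "{{ not is_state('switch.aquarium_light', 'off') }}"
      then:
        - service: notify.notify                 # assumed notify service
          data:
            message: "Aquarium light could not be turned off after 6 attempts."
```

Since your plug was unavailable for about 15 seconds, 6 attempts at 5-second intervals should cover it, but you can raise either number if the dropouts last longer.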

Did what I suggested many months ago fulfill your request?