Script getting stuck on homeassistant.turn_off sometimes

I have a script that is getting stuck sometimes when turning off a group using homeassistant.turn_off. I have no idea why but it looks like it is stuck on homeassistant.turn_off for group.downstairs. It is turning off lights, media_players, and climate entities in nested groups. Any ideas?

Below is the script and here is an image of the trace.

    good_night_script:
      alias: Good Night
      icon: mdi:shield-sun
      description: Turn off and on things for bedtime
      sequence:
        - choose:
            - alias: "If shed open send notification."
              conditions:
                - condition: state
                  entity_id: binary_sensor.shed_door_is_open
                  state: "on"
              sequence:
                - service: notify.josh_android
                  data:
                    message: Shed was left open!
        - service: light.turn_on
          data:
            entity_id: light.upstairs_hall_light
            brightness_pct: 20
        # split out these group turn offs so I can see where it is getting stuck
        - service: homeassistant.turn_off
          target:
            entity_id: group.downstairs
        - service: homeassistant.turn_off
          target:
            entity_id: group.garage
        - service: homeassistant.turn_off
          target:
            entity_id: group.basement_minus_retropie
        - service: homeassistant.turn_off
          target:
            entity_id: group.shed
        - service: script.turn_on
          target:
            entity_id: script.retropie_off_script # had to create basement_minus_retropie to pull this out so we don't wait for the script to finish
        - service: cover.close_cover
          target:
            entity_id: cover.garage_door_rav_4
        - service: cover.close_cover
          target:
            entity_id: cover.garage_door_focus
      mode: single

I have the same (or a very similar) issue, and it has only come up in the last few months. I thought it was because a WiFi light bulb was turned off at the mains, but that is fixed now and doesn’t seem to be the cause.
It happens with lists of entities using light.turn_off and also with an individual light using light.turn_off.

It always gets stuck on one of these 3:

  - service: light.turn_off
    data: {}
    continue_on_error: true
    target:
      entity_id:
        - light.bedroom_bed_downlight
        - light.bedroom_mirror_downlight
        - light.balcony_lights
        - light.living_room_strip_light
        - light.bar_lights
        - light.dining_table_light
        - light.flood_lights
        - light.kitchen_main_lights
        - light.kitchen_pantry_light
        - light.kitchen_sink_lights
        - light.kitchen_pendant_lights
        - light.outside_wall_light
        - light.study_downlight
        - light.downstairs_fairy_lights
  - service: switch.turn_off
    data: {}
    continue_on_error: true
    target:
      entity_id:
        - switch.driveway_lights_switch
        - switch.front_outside_lights_switch
        - switch.kitchen_bench_lights_switch
        - switch.kitchen_cabinet_lights_switch
        - switch.kitchen_lights_switch
        - switch.living_room_lamp_switch
        - switch.outside_wall_lights_switch
        - switch.workshop_back_lights_local
        - switch.downstairs_heater
        - switch.kettle
  - service: light.turn_off
    data: {}
    continue_on_error: true
    target:
      entity_id:
        - light.downstairs_theatre_light

I added continue_on_error: true to each call, but that doesn’t seem to have helped.

My temporary workaround is to allow the script to run in parallel, but that still means at least one instance of the script can be stuck running in the background. As far as I can tell there is no way to set a timeout on a script or a script element. I suppose I could have another automation that stops the script after X minutes so it isn’t left stuck running.
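A rough sketch of that watchdog idea, in case it helps anyone. The alias and the 5 minute limit are made up, and script.good_night_script is just the entity from the first post used as a placeholder. It relies on the script entity reporting "on" while it is running, and script.turn_off stopping it (all running instances, as far as I know):

```yaml
automation:
  - alias: "Stop stuck Good Night script"  # hypothetical name
    trigger:
      # A script entity is "on" while it is running
      - platform: state
        entity_id: script.good_night_script  # placeholder, use your own script
        to: "on"
        for:
          minutes: 5  # arbitrary limit, pick something longer than a normal run
    action:
      # Stops the running script
      - service: script.turn_off
        target:
          entity_id: script.good_night_script
```

This only cleans up afterwards. Any steps after the stuck service call are still skipped, so it is a stopgap rather than a fix.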

My issue also just started in the last couple months. This exact script was running fine for over a year before.

You have an entity in the group that does not respond correctly to the turn_off command. Probably a climate or media_player entity.
I would bet on the media_players, since they have states other than just on/off, such as playing/buffering/idle/paused, which might not behave well in a group that expects on/off states.

Try removing them and then adding them back one by one to see if you can find the culprit by trial and error.

Yeah, same. There seems to have been some change affecting light.turn_off in the last few months. Do you use Zwave? Most of my lights are Zwave, so I suppose there could be a ZwaveJS2MQTT change that is causing issues.

@WallyR There are no media players in this list, just lights, and they are causing the script to stop and hang. It can hang for days until the script is updated or stopped, or the HA container is restarted.

My reply was to jjmerri, since the usage of groups might be the cause there.

I will try to pinpoint which entity is causing the issue.

I think I have found the issue. In my case one light sometimes goes Unavailable; in this state the turn_off command fails and the script or automation freezes.

The light has been like this for a long while, and I have had other lights behave the same way, so I think this is a newish defect (2023.6/7). If a light (or other entity) is or becomes Unavailable, turn_off should just skip over it.
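In the meantime, one possible workaround is to filter out unavailable entities before making the call. This is only a sketch and I have not confirmed it avoids the hang: it uses a template in the target with expand() to flatten the group (nested groups included) and drops anything whose state is unavailable or unknown. group.downstairs is the group from the first post, so swap in your own:

```yaml
- service: homeassistant.turn_off
  target:
    # Expand the group to its member entities and skip the ones
    # that are currently unavailable or unknown
    entity_id: >
      {{ expand('group.downstairs')
         | rejectattr('state', 'in', ['unavailable', 'unknown'])
         | map(attribute='entity_id')
         | list }}
```

If every member happens to be unavailable the list ends up empty, and the call may then fail for a different reason, so you may want a condition in front of it. It also won’t help if the entity drops out in the middle of the call.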

I’ll log a bug up.

Thanks to @WallyR, whose suggestion about the different states of entities in groups put me on the right track.

Interesting. I will cut the power to one of my lights and see if I can replicate the issue.

I have the same issue with 2 lights. Both are listed in the same turn off service call. One has turned off (logbook says turned off by script blabla triggered by event blabla). The other has also turned off but doesn’t say what event turned it off in the logbook.
Consequently the script was running for almost 24 hours. The trace states the 2nd light never turned off, but I assure you that it has.
Knowing which light caused this issue, I dug deeper but did not find it in the unavailable or unknown state at any point in the last couple of weeks…

It looks like a timeout was removed in 2023.7. It is being re-added in the pull request "Add back a timeout for service calls run from scripts" by allenporter (home-assistant/core, Pull Request #98501).