Commands getting missed with MQTT and Z-Wave SmartThings bridge

I have Home Assistant configured to talk to all my Z-Wave/Zigbee SmartThings devices via MQTT (using the SmartThings MQTT bridge solution). I have been running this way for close to two years with few issues, and I like the arrangement because hardware device management is easier via the SmartThings app than via HA.

Lately, some commands sent from Home Assistant (via an automation script, for example) are not being received by SmartThings for all devices. For example, when I trigger my “night mode” script, which turns off most of the house’s lights, some of the lights included in the script won’t turn off. The HA interface shows them as off, even though the lights are still on. If I toggle the switch in HA to on, then off, the lights turn off fine.

When this occurs, if I look at the light in SmartThings, it still shows as on. So either HA isn’t actually sending the MQTT command, or SmartThings isn’t receiving it. The thing is, it doesn’t happen all the time, but it is semi-consistent with the devices it does happen with. For example, 80% of the time it’s my kitchen pendants that “stick” and don’t turn off when they’re supposed to. Last night, though, those turned off…but the main kitchen can lights didn’t turn off, along with another light in the family room.

This inconsistency is making it hard for me to troubleshoot. It’s almost like there’s congestion and it’s causing messages to drop. Does anyone have any tips on how I can figure out why this is happening, and what I can do about it?

You could use MQTT.fx (an MQTT client tool) and subscribe to the topics used by your SmartThings bridge.
If you can see MQTT messages for the lights that stay on, the issue is with the bridge. Otherwise HA is the problem.
The next thing to try is setting the QoS parameter to 1 or 2 in the MQTT device settings. This requests acknowledged delivery at the MQTT protocol level. It might help.
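
One caveat on QoS, as far as I understand the MQTT spec: the level applies per hop, so the broker forwards a message to a subscriber at the lower of the QoS it was published with and the QoS that subscriber requested. A minimal Python sketch of that rule follows. The function name is made up for illustration, and the idea that the bridge might subscribe at QoS 0 is only an assumption worth checking, not something confirmed here.

```python
def effective_qos(publish_qos: int, subscription_qos: int) -> int:
    """QoS the broker uses when forwarding a message to one subscriber."""
    for level in (publish_qos, subscription_qos):
        if level not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {level}")
    # The broker never upgrades delivery beyond what the subscriber asked for.
    return min(publish_qos, subscription_qos)


# Hypothetical case: HA publishes at QoS 2, but the bridge subscribed at QoS 0.
print(effective_qos(2, 0))
```

If that hypothetical case applies, raising QoS on the HA side alone would not change how messages reach the bridge.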

I used MQTT Lens to watch the traffic. The MQTT commands were definitely being sent. And, since the lights would sometimes respond, I know the configuration of the commands themselves is correct.

So I’ve started adding the QoS parameter to my MQTT devices and it seems to be helping. Thanks for the tip!

Well, that may have been a fluke. I’m still getting missed commands. The commands themselves ARE getting published on the correct MQTT topics, but when they’re sent in a large group, some of them seem to get dropped (and it’s always the same ones), even though I’ve now put the qos parameter on those particular devices.

They always work reliably when I issue one-off commands, but when the commands are issued in large groups (e.g. in a script that hits many lights at once), it fails. I’m thinking I might have to build some hacks into my script that re-issue commands for lights that tend to miss them the first time. Definitely not desirable but I don’t know what else to do.
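
For what it's worth, the re-issue hack could look roughly like the Python sketch below. It is only an outline under assumptions: `send_off` and `is_off` are hypothetical callables you would wire up yourself, and the delays are guesses. `is_off` needs to reflect the state SmartThings actually reports, since HA shows the lights as off even when they are still on.

```python
import time
from typing import Callable, Iterable, List


def send_with_retry(
    lights: Iterable[str],
    send_off: Callable[[str], None],
    is_off: Callable[[str], bool],
    attempts: int = 3,
    gap_s: float = 0.2,
    settle_s: float = 2.0,
) -> List[str]:
    """Turn lights off one at a time, re-sending to any that stay on.

    Returns the lights that are still on after the final attempt.
    """
    pending = list(lights)
    for _ in range(attempts):
        if not pending:
            break
        for light in pending:
            send_off(light)
            time.sleep(gap_s)  # pace the commands instead of sending a burst
        time.sleep(settle_s)  # give state updates time to come back
        pending = [light for light in pending if not is_off(light)]
    return pending
```

Pacing the commands with `gap_s` also tests the congestion theory: if the misses disappear with a small gap between commands, the burst itself is the likely trigger.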

Does anyone have any ideas on what I can do here? My network continues to deteriorate…the missed commands are getting worse and worse. Instead of just one light not turning off when I issue a group command, now half of them are sometimes ignored. Then I have to go through and turn them all on/off one at a time.

I’ve added qos=2 to a lot of my affected light devices, but it doesn’t seem to be having any effect. I do see the MQTT commands when I watch the topics with MQTT Lens, so it seems that HA is sending all the commands.

Do an image backup of your system.

Delete all the HASS auto-generated files, including the HASS database. Restart HASS twice.

I’m not sure how the SmartThings hub works, but you may need to run device detection again.

If it doesn’t work, revert the system from the image.

I am still struggling with this. What I’ve observed from looking at the various logs:

  1. The MQTT commands are getting sent from HA. That is never an issue. I can see them consistently in MQTT Lens.
  2. I see all the commands I expect to see in the “MQTT” section of the logs on SmartThings.
  3. I do not see all the commands I expect to see in the “MQTT Bridge” section of SmartThings.
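
The three observations above boil down to comparing what each hop logged against what the script sent. Here is a rough Python sketch of that comparison, in case it helps to do it from copied log lines rather than by eye. The function name, the hop labels, and the short device list are placeholders for illustration, not the full set of lights from the script.

```python
from typing import Dict, Iterable, List


def missing_per_hop(
    expected: Iterable[str], seen_by_hop: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """For each hop, list the expected devices that never showed up in its log."""
    expected_list = list(expected)
    report = {}
    for hop, seen in seen_by_hop.items():
        seen_set = set(seen)
        report[hop] = [name for name in expected_list if name not in seen_set]
    return report


# Illustrative subset of devices only; fill in the real names from each log.
expected = ["Kitchen Island Pendants", "Bookshelf Lights", "Family Room Lamp"]
seen_by_hop = {
    "MQTT": ["Kitchen Island Pendants", "Bookshelf Lights", "Family Room Lamp"],
    "MQTT Bridge": ["Bookshelf Lights", "Family Room Lamp"],
}
print(missing_per_hop(expected, seen_by_hop))
```

The first hop whose list is non-empty is where the commands stop, which narrows down where to look next.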

To expand on items #2 and #3, here are some screenshots. Pay attention to the “Kitchen Island Pendants”, for example…that’s one that I always have trouble with. Prior to capturing these screenshots, I ran a script that sent commands to turn off 8 or so lights…only a few of them actually turned off.

Here, you can see the MQTT app reporting that it received a device event from the bridge (these all correspond to the commands sent from my HA script):

However, in the logs for the bridge around the same time (12:53:59), you don’t see the “Kitchen Island Pendants”:

In fact, you only see the Bookshelf Lights and Family Room Lamp, which are the only devices that actually turned off, of the 8 or so I commanded to turn off. None of the others from that first screenshot turned off.

Coincidence? I think not. However, I don’t know what it means. How could the MQTT app be reporting that it is receiving a command from the bridge when the command doesn’t appear in the bridge’s logs?

Additionally, you can see that if I send a command from HA for only that device, it shows as expected in the logs (and the light actually turns off):

Any ideas?

Paging @stjohn …could you please help?