Insteon state changes not consistently reflected in dashboard

Hi, I have a relatively large Insteon setup (75 devices) that I’ve installed in my house over several years. I’ve converted everything over to HA and have set up various dashboards for control purposes.

I’ve noticed more and more often that the states in the dashboard eventually get out of sync with the actual devices (e.g., the bedroom light will show on when it’s actually off). I’m using an Insteon hub as the PLM. I’ve gone through the ALDB on the hub and every device and made sure all the default links are in place.

I can’t pinpoint or recreate a scenario in which an entity goes out of sync. The device type varies (e.g., switch vs. dimmer). The dashboard varies too, as I use a number of different controls and customizations depending on the board.

How do I further troubleshoot this? Is anyone else experiencing this?

I’ve noticed this as well, but haven’t spent any time on what to do about it. Restarting usually brings things back in sync or I just toggle the device. Not great solutions though.

If you have a specific device that we can debug please add the following to the configuration.yaml file:

logger:
  default: warning
  logs:
    pyinsteon.1a2b3c: debug

Replace 1a2b3c with the device address without any periods and in lower case. Restart HA and let it run until it is out of sync. Send me the home-assistant.log and we can at least see what the system is doing.
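If you are enabling this for several devices, a small helper along these lines could build the logger names from the addresses as they are usually written. This is only a sketch: the function names are made up for illustration, and the only thing taken from the instructions above is the "pyinsteon." prefix followed by the address in lower case with the periods removed.

```python
import re

# Separators people commonly type between the three address bytes.
_SEPARATORS = re.compile(r"[.\s_:-]")


def pyinsteon_logger_name(address: str) -> str:
    """Turn an address such as '2B.37.39' into 'pyinsteon.2b3739'."""
    cleaned = _SEPARATORS.sub("", address).lower()
    if not re.fullmatch(r"[0-9a-f]{6}", cleaned):
        raise ValueError(f"not a 6-hex-digit Insteon address: {address!r}")
    return f"pyinsteon.{cleaned}"


def debug_logs(addresses):
    """Build the mapping that would go under 'logs:' in configuration.yaml."""
    return {pyinsteon_logger_name(a): "debug" for a in addresses}


if __name__ == "__main__":
    for name, level in debug_logs(["2B.37.39", "33.ab.59"]).items():
        print(f"    {name}: {level}")
```

The printed lines can be pasted under the logs: key, one per device.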

Also, 2023.3 should have some performance improvements that may make this less of an issue. Hard to say for sure but I do believe it will help.

Hi @teharris1
I’m really hoping you can help me solve the unreliability of my Insteon/HA setup. It’s driving me crazy.

Most of the time, the setup works ok, but as I mentioned to you in my previous email in this thread, after a while HA starts to get “out of sync” and does not recognize when devices are turned off or on.

I followed your instructions and turned on debugging for:
2b.37.39 which is a switchlinc dimmer (main level hall by front door)
33.ab.59 which is a switchlinc dimmer (main level by mud room)
31.41.3e which is my Insteon hub (2245-222)
33.a7.1a which is a switchlinc dimmer (front porch)

I have the two switches in the hallway cross linked to form a standard 3-way setup. This works great. There’s never a problem with the physical switches; they update each other ok.

In the home-assistant-problem.log at the link attached, you’ll see:
16:51:10 - I physically turn on 2b.37.39 at the switch. All works great. The status of the devices update in HA with no issue.
16:52:05 - I physically turn off 2b.37.39 at the switch. Again, all works great; HA is updating with no issue.
I do the same again at 16:53:05 and 16:54:05 turning it on and off again at the switch. All works ok.
16:55:05 - I physically turn on 2b.37.39 and in HA only that particular device turned on. Its matching device (33.ab.59) did not show as turning on in HA (although physically at the switch it did turn on).
16:56:08 - I physically turned off 2b.37.39 and HA updated.
16:57:05 - I physically turned on 2b.37.39 and HA updated.
16:58:07 - I physically turned off 2b.37.39 and while that switch turned off in HA, its matching device (33.ab.59) did not turn off in HA.
16:59:23 - I physically turned on 2b.37.39 and only that device turned on in HA.
17:00:15 - I physically turned off 2b.37.39 and nothing updated in HA, so now, given the sequence above, HA shows both switches on, yet nothing is on IRL.
For one last test, I then turned on the porch light (33.a7.1a) at 17:01:05, and turned it off 17:01:21 and HA updated correctly.

So I’m quite confused as to why HA is reacting to some state changes from the hub but not all. Or is it that the hub is not sending all state changes, even though everything is set up correctly?

I’ve also provided the ALDBs for the switches and the hub.

Hoping you can shed some light on this situation. It’s not just with these devices BTW. Over time, many of my devices end up not showing correctly in HA. I’ve taken to rebooting it every night just to clear the states. This is not a viable solution, however.

Thanks in advance for the support.

Link to files here.

Thanks for the logs. I see what the issue is. I need to make a code change to make this more consistent for you. In the meantime, you can set up an automation in HA so that device 1 turning on/off triggers device 2 to do the same. This will cause some additional Insteon traffic but will keep the system in sync until I can get a patch out.

Thanks for looking into this. Glad you found the issue.
If I understand your suggestion properly, you’re saying in essence to stop using Insteon scenes (short term at least) and use HA to trigger devices 2, 3, 4, etc. once device 1 is triggered.
Not viable long term for me, as Insteon scenes are way better than anything HA delivers (and work independently of HA as an added bonus!!)

Will await your fix. Happy to do some testing if you have an early release available.

Keep your Insteon scenes, just add the automation for now. This will keep the devices up to date in HA. Once I make the code change you can delete the automation.

Hi. How are things going with a patch for this issue? Also, is there something about my setup that is causing the issue? Or is everyone else who uses the Insteon integration having an issue with their device statuses going out of sync in HA?

Sorry for the delay. I did post a patch today. Hopefully, it will be in 2023.6.1.

Just so you are aware, the issue is happening because some of your messages are being lost in the network. The patch I put in should significantly improve reliability but it may not fix the entire problem. How many Insteon devices do you have and how large and how old is your house? The more devices you have the more reliable the network becomes.

Looks like I was too late for the 2023.6.1 release so targeting 2023.6.2.

Interesting. I have about 75 Insteon devices distributed throughout my home, which is about 20 years old so relatively newish. Not like I have old or unusual wiring. So given the trace was run in HA, and you’re not seeing all the messages on the “network”, then they are lost either from the end device to the PLM (hub in my case) or from the PLM to HA on my LAN. From the debug info, can you confirm on which side of the PLM the loss of messages is occurring?

With that many devices you should have a very stable network. The messages that I saw were lost were the broadcast messages from the device into the network. Not sure how familiar you are with the Insteon protocol, but a manual device change creates 2 types of messages. The first is a broadcast message out to the entire network. Any responders of that device will then act according to the broadcast message. The device then sends a clean-up message to each responder in the device’s ALDB. The clean-up message was seen but not the original broadcast. The Insteon module was assuming that it would see the broadcast message and was ignoring the clean-up message. (It has always used clean-up messages to track the state of the device where the manual change occurs, but was not looking at what other devices are linked to the initial device.)
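To make the broadcast vs. clean-up distinction concrete, here is a toy model of the behavior described above. It is not the Insteon module's actual code: the class, the flag name, and the link table are invented for illustration, and the addresses are the two hallway switches from earlier in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class GroupStateTracker:
    """Toy tracker for the state of a controller and its linked responders."""

    # (controller address, group) -> responder addresses (hypothetical link table)
    links: dict
    # False models the old behavior: clean-up updates only the controller.
    propagate_cleanup: bool = True
    state: dict = field(default_factory=dict)

    def handle(self, kind: str, controller: str, group: int, level: int) -> None:
        if kind not in ("broadcast", "cleanup"):
            raise ValueError(f"unknown message kind: {kind!r}")
        # The controller's own state is updated for both message types.
        self.state[controller] = level
        if kind == "cleanup" and not self.propagate_cleanup:
            return
        # Apply the same level to every responder linked to this group.
        for responder in self.links.get((controller, group), ()):
            self.state[responder] = level


if __name__ == "__main__":
    links = {("2b3739", 1): {"33ab59"}}
    old = GroupStateTracker(links, propagate_cleanup=False)
    old.handle("cleanup", "2b3739", 1, 255)  # the broadcast was lost
    print(old.state)  # only the controller shows as on

    new = GroupStateTracker(links)
    new.handle("cleanup", "2b3739", 1, 255)
    print(new.state)  # controller and linked responder both show as on
```

Handling is idempotent, so seeing both the broadcast and the clean-up for the same change simply sets the same level twice.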

So why would the Insteon module not be seeing the broadcast message? To your point, if I had a “weak” Insteon network I could maybe see some issues occurring, but then I would also expect various linked devices around the house to be having issues as well, since a weak network would theoretically prevent other devices from seeing the broadcast. So is the hub not seeing the message, and should I perhaps try moving it to a different location in the house? Or is the Insteon module in HA not getting all the messages from the hub for some reason? I assume your fix addresses the latter somehow?

Hi, I’ve had your fix running for a few days now. There is a marginal improvement. But at the end of the day many of my device states still show on when they are off in real life.

Is there any other debugging or traces I can do?

I’ve recently acquired a 2143 PLM. Would swapping that for my hub provide more consistent results? Lots of work to do that so I’m not keen on it but will do so if I can make my system reliable so we can actually use it regularly.

Can you send me a new log exactly as you did before? I want to see how the updated code is responding.

OK. I’ve uploaded the log to the same Google Drive share as before. The file is called home-assistant-problem2.log

It tracks the same devices as before. The sequence of events was:

20:30 - I physically turn on 2b.37.39 at the switch. Nothing shows on in HA.
20:31 - I physically turn off 2b.37.39 at the switch.
20:32 - I physically turn on 2b.37.39 at the switch. HA shows both linked switches on.
20:33 - I physically turn off 2b.37.39 at the switch. HA shows everything off.
20:34 - I physically turn on 2b.37.39 at the switch. HA shows both linked switches on.
20:35 - I physically turn off 33.ab.59 at the switch. HA shows both linked switches off.
20:36 - I physically turn on 33.ab.59 at the switch. HA shows only 33.ab.59 on; the linked switch 2b.37.39 shows off in HA.
20:37 - I physically turn off 2b.37.39 at the switch. HA shows everything off.

Hope this helps. I can provide other traces of scenarios that don’t work. For example, Insteon scenes triggered by 8 button keypads seem to never get updated properly in HA.

As a reminder, physically everything works great; all devices update properly so I’m not “losing” anything on the Insteon network anywhere. Also, for the heck of it, I moved my Insteon hub to another location in the house, but that didn’t fix anything either.

I am not sure if you sent me the log file you meant to send but this log file starts at 2023-05-20 16:48:25 so there is no reference to 20:30 - 20:37 as a time stamp. Also, the reference to 2B.37.39 is not in the log file but I do see 2B.E7.39 so I assume that is the device that you used for the tests.

Also, can you confirm you are running version 2023.6.2 or higher?

Here is what I see in the logs. At 16:51:10 2B.E7.39 was turned on and 33.AB.59 was queried for its status:

2023-05-20 16:51:10.933 DEBUG (MainThread) [pyinsteon.2be739] Topic: 2be739.1.on.all_link_broadcast data: {'cmd1': 17, 'cmd2': 0, 'target': 000001, 'user_data': None, 'hops_left': 3}
2023-05-20 16:51:10.935 DEBUG (MainThread) [pyinsteon.2be739] Topic: handler.2be739.1.on.all_link_broadcast data: {'on_level': 255}
2023-05-20 16:51:10.936 DEBUG (MainThread) [pyinsteon.2be739] Topic: state_2be739_dimmable_light_1 data: {'name': 'dimmable_light', 'address': '2be739', 'value': 255, 'group': 1}
2023-05-20 16:51:10.944 DEBUG (MainThread) [pyinsteon.2be739] Topic: event_2be739_1_on_event data: {'name': 'on_event', 'address': '2be739', 'group': 1, 'button': 'dimmable_light'}
2023-05-20 16:51:10.959 DEBUG (MainThread) [pyinsteon.33ab59] Topic: send.status_request.direct data: {'address': 33ab59, 'status_type': 2}

This is all as expected, however, you say

20:30 - I physically turn on 2b.37.39 at the switch. Nothing shows on in HA

So I am not sure if this means that for some reason HA did not see this change. I can say for sure the underlying Insteon library saw the change and responded accordingly. I don’t see anything unexpected here.

At 16:52:05 I see 2B.E7.39 turned off and 33.AB.59 queried for its status. This is also as expected. There are a few other on/off sequences that also appear to be as expected. All of these sequences started with a broadcast message, which is also as expected.

However, at 16:55:07 I see 2B.E7.39 turned on but the broadcast message is missing. The first message is an All-Link cleanup message. This is the same behavior we saw before that I thought I had fixed but it appears the fix is not working.

Let me look at this again because the fix I provided should have worked and I am not sure why it did not.
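For anyone who wants to check their own log for the same pattern, a rough sketch like the following could list when a broadcast was logged for a device and which manual switch presses have none nearby. The line format is taken from the log excerpt above; the function names and the 5-second tolerance are assumptions for illustration, and the sample contains only lines quoted in this thread.

```python
import re
from datetime import datetime

# Matches debug lines shaped like the excerpt quoted above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \w+ \(\w+\) "
    r"\[pyinsteon\.(?P<device>[0-9a-f]{6})\] Topic: (?P<topic>\S+) data: "
)


def broadcast_times(lines, device):
    """Return timestamps of all_link_broadcast topics logged for a device."""
    times = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m["device"] != device:
            continue
        topic = m["topic"]
        # Skip the 'handler.' echo so each broadcast is counted once.
        if topic.endswith(".all_link_broadcast") and not topic.startswith("handler."):
            times.append(datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"))
    return times


def presses_without_broadcast(presses, broadcasts, tolerance_seconds=5.0):
    """Return the press times that have no broadcast within the tolerance."""
    return [
        p for p in presses
        if not any(abs((b - p).total_seconds()) <= tolerance_seconds for b in broadcasts)
    ]


SAMPLE = [
    "2023-05-20 16:51:10.933 DEBUG (MainThread) [pyinsteon.2be739] Topic: 2be739.1.on.all_link_broadcast data: {'cmd1': 17, 'cmd2': 0, 'target': 000001, 'user_data': None, 'hops_left': 3}",
    "2023-05-20 16:51:10.935 DEBUG (MainThread) [pyinsteon.2be739] Topic: handler.2be739.1.on.all_link_broadcast data: {'on_level': 255}",
]

if __name__ == "__main__":
    found = broadcast_times(SAMPLE, "2be739")
    presses = [datetime(2023, 5, 20, 16, 51, 10), datetime(2023, 5, 20, 16, 55, 7)]
    print(presses_without_broadcast(presses, found))
```

To run it on a real file, pass the open log file in place of SAMPLE and fill in the press times from your own notes.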

@teharris1 hello - I updated to 2023.6.3 and re-ran my tests. Still having the same problems.
Please see this log dated July 2:
July 2 testing log

In terms of activities, here’s what happened:
17:38 - I turned on 2b_e7_39 at the switch; all states/groups in HA updated OK
17:39 - I turned off 2b_e7_39 at the switch; all states/groups in HA updated OK
17:40 - I turned on 2b_e7_39 at the switch; HA did not react at all; no state changes on 2b_e7_39 or other switch that is linked to it (33_ab_59)
17:41 - I turned off 2b_e7_39 at the switch; HA still showed off so no surprise there;
17:42 - I turned on 2b_e7_39 at the switch; HA still did not react at all; no state changes; you may see some other activity from the hub around this time as well from someone else turning lights on/off
17:43 - I turned off 2b_e7_39 at the switch; HA still showed off so no surprise there
17:44 - I turned on 33_a7_1a at the switch; HA updated OK; you may see some other activity from the hub around this time as well from someone else turning lights on/off
17:45 - I turned on 2b_e7_39 at the switch; all states/groups in HA updated OK
17:46 - I turned off 33_a7_1a at the switch; HA did not change; still showed the device as being on
17:47 - I turned off 2b_e7_39 at the switch; all states/groups in HA updated OK

So there are still issues. Everything is physically working ok. The way I see it simplistically is either:

  • the hub is not receiving/processing all the messages from the Insteon network (doubtful given the Insteon Director works fine)
  • the Insteon integration is not receiving/processing all messages from the hub
  • HA is not receiving/processing all the updates from the Insteon integration

Really hoping you can help get this fixed. My entire HA platform and the home automation in my home are basically on hold and not usable until I can get reliable updates from HA.

Thanks in advance for your help.

Just my notes here. I can test at my PAD dashboard, which happens to be mounted right above two Insteon switches. I click a switch and it is reflected in HA, and I turn it off from the switch or HA and it is reflected, or vice versa … testing all the different possibilities.

But yes, I can say that sometimes it takes about 10 seconds for HA to reflect a change. Never more, sometimes less, which leads me to believe it is something like a scan_interval … if that is set to 10s then that would be the worst case.

Now, pushing things on/off before the interval has completed, while the state is not yet reflected in the GUI, can lead one to believe things are screwed up, but eventually it gets to the right answer.

Just my input.

I took a movie of this behavior, switching on/off rapidly and you would see HA go on/off/on/off/on/off … lagging behind but at the end it is in the correct state.