TRVs connected to HA dashboard but sometimes not responding to automations

This is a branch of the topic …

Zigbee network configuration: HA tools and stability of Aqara E1 TRVs

… which really included what I now think are two problems: one about the Zigbee network and visualisation tools, and the other about TRVs that are connected to Home Assistant but do not respond to HA commands.

The old post received only replies about Zigbee, so I removed the parts of the question about TRVs and put them here…

The problem

The issue is that TRVs often get into a state whereby they are connected to Home Assistant but do not respond to commands from automations to change the temperature.

  • It does not appear to be a communication problem because the dashboard changes the TRV temperature and a change in TRV temperature updates the dashboard.
  • It does not appear to be a firmware problem because it happens with both types of TRV.
  • It does not appear to be a quirk problem for the same reason.
  • It seems unlikely to be an automation blueprint error because the automations work most of the time for most of the TRVs. Failures are intermittent and specific to one or a few TRVs.
  • Could it be a problem in the lower levels of HA?

Error Messages

Sometimes during testing I manage to spot an error message that flashes up briefly on the screen then disappears! Here are two examples I managed to capture.

Failed to perform the action climate/set_temperature. Failed to send request: <Status.NWK_NO_ROUTE: 205>

or

Failed to perform the action climate/set_temperature. Failed to send request: <Status.APS_NO_ACK: 183>

Can anyone translate these into English for me?

The Installation

I have an installation with 21 TRVs of two types:

  • 10 Aqara E1 TRVs
  • 11 generic Chinese TS0601 TRVs

FYI, they are connected to HA with a SONOFF 3.0 Dongle, but if you think this is a network error please reply to the other post mentioned at the top of this post.

The Workaround

Sometimes just manually changing the temperature on the TRV reanimates it. Most times, a full reset is needed – i.e. take out the batteries, re-insert them and go through the setup and calibration procedure again. This invariably works and the TRV is good again … until it isn’t.

Some fail more often than others but there is no fixed pattern. The automations generally change the settings at least twice a day so they usually go through several good cycles before they fail.

It is annoying that checking and resetting TRVs has become a daily chore.

Any ideas?

Thanks!

:slightly_smiling_face:

No-route and no-acknowledgement errors sound very much like network transmission errors to me. Having to logically rebuild your Zigbee network configuration on a regular basis seems to confirm this.
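For anyone wanting the two captured messages in plain English, here is a minimal Python sketch that pulls the status name and number out of the text and looks it up in a small table. The function name `explain_status` and the table `STATUS_MEANINGS` are made up for illustration, and the wording of the meanings is my own paraphrase based on the Zigbee layer names (NWK is the network layer, APS is the application support layer), not official text.

```python
import re

# Hypothetical lookup table; the descriptions are a paraphrase, not official wording.
STATUS_MEANINGS = {
    "NWK_NO_ROUTE": "the network layer could not find a route to the device",
    "APS_NO_ACK": "the message was sent but the device never acknowledged it",
}


def explain_status(message: str) -> str:
    """Extract '<Status.NAME: NUMBER>' from an error message and describe it."""
    match = re.search(r"<Status\.(\w+): (\d+)>", message)
    if match is None:
        return "no Zigbee status code found in this message"
    name, code = match.group(1), int(match.group(2))
    meaning = STATUS_MEANINGS.get(name, "status not in this small table")
    # Show the code in hex as well, since stack documentation usually lists hex values.
    return f"{name} (0x{code:02X}): {meaning}"


if __name__ == "__main__":
    print(explain_status(
        "Failed to perform the action climate/set_temperature. "
        "Failed to send request: <Status.NWK_NO_ROUTE: 205>"
    ))
    print(explain_status(
        "Failed to perform the action climate/set_temperature. "
        "Failed to send request: <Status.APS_NO_ACK: 183>"
    ))
```

Both meanings describe a message that did not reach the device or was not confirmed by it, which is why they point at the radio path rather than at the automation.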

Why have two threads going for one issue?

I split the original thread into two on purpose for reasons described in the preamble to both of them (see above). So they are not the same issue.

This one is about TRVs that are connected to HA, responding to the dashboard, but still not responding to temperature changes set by automations. I am assuming here that if the HA dashboard responds to the TRV and/or vice versa, then Zigbee is working.

I am sorry if this is confusing – it is for me too (!) because I am stuck with the situation that I do not know how to reproduce any of these errors – I can only analyse them when they happen, whether they be Zigbee network failures, something inside the TRVs, or poor handling by HA.

Regarding the error messages, I asked for a translation into English, so I did not know in advance that they pertain to network errors!

I do not know what you mean by rebuilding the Zigbee network configuration – I have never rebuilt it. I only carry out a reset on TRVs that have failed (as described above); that usually fixes the problem temporarily.

This thread

I invite readers to comment in this thread if they have had a similar problem with TRVs apparently connected but not responding – especially if they are using the same brands as me, and more especially if they found a solution!

One theory I am working on is that the connection of the TRVs to the radiators is suboptimal because it relies on an adapter that in turn relies on friction to hold it in place. I have had several incidents of a TRV coming loose; in that case they give a ‘valve error’ indication on the TRV as an icon. However, that error cannot be seen on the HA device dashboard (though for Aqara TRVs detection can be turned on and off). I do not know whether the HA HVAC interface can do anything with that information? I assume not, and that it therefore goes ahead and tries to set the temperature anyway on a TRV that is unable to respond?

The TRVs in question don’t respond in time to requests sent by HA. Hence “NO_ROUTE” and “NO_ACK”. Manually changing the temperature on the TRV reanimates it.

Check whether they go into some kind of sleep mode (hibernation)/power saving mode after a certain time of inactivity and disable this “feature” if available.

If there is no such thing as a sleep mode/power saving mode on the TRVs, I would change the relevant automations to simply repeat the action(s) after a short delay (e.g. 3 seconds). That should be enough to wake them up and get them to respond properly to requests.

I have not found any such thing on the devices or in the user manual.
In any case, I don’t think that would explain why it happens only sometimes, and only to some devices.

The automations already
a) wait one minute and then read the set temperature to see if it is the same as that actually set
b) if not, wait another minute and then try again
c) repeat for up to five retries (10 minutes)

I assumed this would get over any intermittent network problems.
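To make steps a) to c) concrete, here is a rough Python sketch of the same logic. It is not the actual blueprint, only an illustration: `set_and_verify`, the `send` and `read` callables, and the `FlakyTrv` test double are all hypothetical names, and the real automation runs inside HA rather than as a script.

```python
import time
from typing import Callable, Optional


def set_and_verify(
    send: Callable[[float], None],
    read: Callable[[], float],
    target: float,
    max_retries: int = 5,
    wait_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Send a setpoint, then check and retry.

    Returns the number of retries that were needed (0 means the first
    command worked), or None if the setpoint never took effect.
    """
    def try_send() -> None:
        try:
            send(target)
        except Exception:
            # A failed send (e.g. a no-route error) is treated like a lost command.
            pass

    retries = 0
    try_send()
    while True:
        sleep(wait_s)                      # a) wait, then read back the setpoint
        if abs(read() - target) < 0.05:
            return retries
        if retries == max_retries:         # c) give up after the last retry
            return None
        sleep(wait_s)                      # b) wait again, then resend
        retries += 1
        try_send()


class FlakyTrv:
    """Made-up stand-in for a TRV that silently ignores its first N commands."""

    def __init__(self, drop_first: int) -> None:
        self.drop_first = drop_first
        self.setpoint = 18.0

    def send(self, temperature: float) -> None:
        if self.drop_first > 0:
            self.drop_first -= 1
            return
        self.setpoint = temperature

    def read(self) -> float:
        return self.setpoint


if __name__ == "__main__":
    trv = FlakyTrv(drop_first=2)
    # Pass a no-op sleep so the demo does not really wait for minutes.
    print(set_and_verify(trv.send, trv.read, 21.0, sleep=lambda s: None))
```

With the default one-minute waits and five retries, the worst case is the ten minutes mentioned above plus the first one-minute check. Returning the retry count rather than just success or failure makes it possible to record how hard each TRV was to reach.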

Curious whether you can track (in debug mode) on which retry count each device command finally gets through. Is there a pattern? Time of day, valve location within your premises (draw a floor plan to scale), grouping of devices, time since last reset/recalibration, orientation with respect to your dongle (horizontal, vertical, north facing etc.).
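A small Python sketch of the kind of tally I have in mind, assuming each command is logged with the device, its location, the hour of day and the retry count on which it got through (or nothing if it never did). The `CommandResult` record, the `summarise` function and the sample rows are all invented for illustration; the real data would have to come from your own logs.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Hashable, Iterable, NamedTuple, Optional


class CommandResult(NamedTuple):
    device: str
    location: str
    hour: int                 # hour of day, 0-23
    retries: Optional[int]    # None means the command never got through


def summarise(
    records: Iterable[CommandResult],
    key: Callable[[CommandResult], Hashable],
) -> Dict[Hashable, dict]:
    """Group results by any attribute and count commands, failures and mean retries."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record.retries)

    summary = {}
    for group, results in groups.items():
        succeeded = [r for r in results if r is not None]
        summary[group] = {
            "commands": len(results),
            "failed": len(results) - len(succeeded),
            "mean_retries": mean(succeeded) if succeeded else None,
        }
    return summary


if __name__ == "__main__":
    # Invented sample rows, purely to show the shape of the output.
    sample = [
        CommandResult("trv_a", "kitchen", 7, 0),
        CommandResult("trv_a", "kitchen", 22, 2),
        CommandResult("trv_b", "loft", 7, None),
    ]
    print(summarise(sample, key=lambda r: r.device))
    print(summarise(sample, key=lambda r: r.hour))
```

Running the same records through different `key` functions (device, location, hour) is a cheap way to see whether failures cluster anywhere before drawing the floor plan.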

Ooo, nice idea, but sounds like a massive job! And the answer might be that it is random. Maybe correlate with use of the microwave cooker?!?

Beats guessing! That’s what debugging is all about - getting some facts. Hopefully the penny will drop, aided by many eyes. Hindsight is a wonderful teacher.