Xiaomi Gateway becomes unavailable after some hours

@alexjomin not all… I’m not having any issues, and @Lucas_Rey was able to solve the exact same issue. There are several (long) topics on network issues related to the Xiaomi gateway. As far as I know, the way HA communicates with the gateway is different from the Mi Home app, and from that point of view the API has some limitations. The solutions I have seen were always related to the network configuration (router, wifi repeater, firewall, etc.). Others replaced the Xiaomi gateway with a different Zigbee gateway solution.

@sjee Maybe it’s a networking issue, but if so, the 2.5 hour timeout limit must be hardcoded somewhere in HA. I will try with another router and I’ll let you know.

I guess we have a clue here : https://github.com/home-assistant/home-assistant/blob/dev/homeassistant/components/xiaomi_aqara.py#L45

So if we have a networking issue, the timeout fires after 150 min. That explains why we all see the Xiaomi sensors become unavailable after the same amount of time. New router ordered today, I’ll let you know.
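
A minimal sketch of how a heartbeat-based availability timeout of this kind behaves, assuming the 150-minute value from the linked line. The class and method names are hypothetical and are not the actual Home Assistant code; the constant name is also just illustrative.

```python
from datetime import datetime, timedelta

# Assumption: 150 minutes, matching the timeout discussed above.
TIME_TILL_UNAVAILABLE = timedelta(minutes=150)


class SensorAvailability:
    """Hypothetical tracker: a sensor stays available only while
    messages from the gateway keep arriving."""

    def __init__(self, started_at):
        self.last_seen = started_at

    def on_message(self, received_at):
        # Each report or heartbeat that arrives resets the timer.
        self.last_seen = received_at

    def is_available(self, now):
        return now - self.last_seen < TIME_TILL_UNAVAILABLE


start = datetime(2018, 1, 1, 12, 0)
sensor = SensorAvailability(start)
print(sensor.is_available(start + timedelta(minutes=149)))  # True
print(sensor.is_available(start + timedelta(minutes=150)))  # False
```

If the gateway's messages never reach HA, nothing ever calls the equivalent of `on_message`, so every sensor would flip to unavailable at the same moment, 150 minutes after startup. That would match the "exactly 2.5 hours" pattern reported in this thread.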

@alexjomin Any luck yet? My Home Assistant/Aqara setup was stable until about two weeks ago. Since then, it’s always stable for 2,5 hours and becomes unavailable after exactly 2,5 hours.
My router is a Netgear R7000, so that should be stable enough :slight_smile:
The Aqara gateway is about 1 meter away from the router.

I did change my Home Assistant setup to a Docker environment, with port 8123 exposed. If anything were wrong with the ports, it wouldn’t work at all, I guess, rather than working for just 2,5 hours, right?

@harmjanr Did you add --net=host to your container?

By the way, to be sure, enter your container and run tcpdump -A -i wlan0 udp port 9898 (if your interface is wlan0). If all is OK, you should see the UDP messages coming from the Xiaomi Gateway; if not, the gateway is not reaching the container.
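
If tcpdump is not available inside the container, roughly the same check can be sketched in Python. This assumes the gateway sends to multicast group 224.0.0.50 on UDP port 9898; the port comes from the command above, while the group address is the one commonly documented for the Aqara LAN protocol and is not stated in this thread. The function names are hypothetical.

```python
import socket
import struct

# Assumptions: group address as commonly documented for the Aqara LAN
# protocol; port taken from the tcpdump command in this thread.
MCAST_GROUP = "224.0.0.50"
MCAST_PORT = 9898


def build_membership_request(group, interface_ip="0.0.0.0"):
    """Pack the ip_mreq structure used to join an IPv4 multicast group."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface_ip))


def listen(interface_ip="0.0.0.0", timeout=60.0, max_messages=5):
    """Print up to max_messages datagrams, waiting at most `timeout`
    seconds for each one. Returns the number of messages received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(MCAST_GROUP, interface_ip),
    )
    sock.settimeout(timeout)
    received = 0
    try:
        while received < max_messages:
            data, (sender, _port) = sock.recvfrom(4096)
            print(sender, data.decode("utf-8", errors="replace"))
            received += 1
    except socket.timeout:
        print("No multicast traffic seen within the timeout.")
    finally:
        sock.close()
    return received
```

Calling `listen()` from inside the container (with HA stopped, so the port is free) should print gateway messages if multicast is getting through. If it only ever times out, the gateway is probably not reaching the container.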

@alexjomin Nope, I created one bridge network that is used for all my docker containers. It is set up this way to have container linking in my reverse proxy. But the docker network can’t really be the problem, otherwise it wouldn’t work at all instead of only 2,5 hours, right?

Read the topic, there is a very good explanation why network issues will actually cause a 2,5 hour timeout.

@harmjanr that’s what I thought at first too, but without the net option set I had the same 2.5 hour timeout behaviour.

I just changed the network to host, and replaced the container linking. Let’s wait for 2,5 hours now :slight_smile:

@alexjomin After 3,5 hours, it still seems to work! Thanks for the help with the net option :smiley:

@harmjanr Glad to hear that :slight_smile:

My installation is on hassbian.
Can someone please explain what I should do with --net=host? Do I need to make this change on my installation?
I have a problem where sometimes I stop getting events from the gateway in HA but continue to get events in the Mi Home app.
For example, I can press the Xiaomi smart button and it won’t work, but if I press it again it will work.
The same issue occurs with the door sensor… after the door has not been opened for some time, I open it and get no notification, but the second time I get the notification just fine.
Can someone help me with this?

Thanks a lot!

I’ve got the same problem :frowning: it’s really annoying.
I will try to describe it a little bit.

Gateway configuration:

xiaomi_aqara:
  discovery_retry: 5
  interface: 192.168.0.94
  gateways:
      key: !secret gateway_key
      host: 192.168.0.136
      mac: !secret gateway_mac

There are 3 motion sensors, 3 door/window sensors, and one aqara cube.
Problems:

  1. On the HA frontend there is a light switch automatically generated by HA. When switched on, in 90% of cases it turns on the gateway light. When switched off, in 40% of cases it turns the light off. Sometimes these actions generate an “Invalid Key” error in the logs.
  2. Sometimes sensors (not all of them) become unavailable, so any automations based on them are worthless.

What do we do if we cannot use

--net=host

because the HA Docker container is behind a reverse proxy, for example traefik?

My docker-compose looks like this:

  hass:
    container_name: homeassistant
    image: homeassistant/raspberrypi2-homeassistant:latest
    networks:
      - web
      - default
    volumes:
      - /home/pi/******/config:/config
      - '/etc/localtime:/etc/localtime:ro'
      - '/etc/timezone:/etc/timezone:ro'
    restart: unless-stopped
    privileged: true
    ports:
      - 8123:8123
      - 9898:9898
    labels:
      - "traefik.backend=homeassistant"
      - "traefik.default.protocol=http"
      - "traefik.frontend.headers.SSLRedirect=true"
      - "traefik.frontend.entryPoints=http,https"
      - "traefik.enable=true"
      - "traefik.docker.network=web"
      - "traefik.port=8123"
      - "traefik.frontend.rule=Host:******.duckdns.org"

traefik is being used for certificates, and later I’ll also be trying to get pihole running on this device as well.

Is there some workaround for this 2.5 hour timeout?

I get that

Broadcast messages are used by the gateway to discover its sensors; without these, it just goes into timeout after 2,5 hours.

And I get that the component says the gateway uses this (what I’m assuming are multicast messages), but is there seriously no other way?
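
On the multicast assumption, here is a small sketch that checks what kind of address the gateway traffic would use. It assumes the group address is 224.0.0.50, which is the address commonly documented for the Aqara LAN protocol and is not stated in this thread; the `describe` helper is hypothetical.

```python
import ipaddress

# Assumption: group address as commonly documented for the Aqara LAN
# protocol; it does not appear in this thread.
GATEWAY_GROUP = ipaddress.ip_address("224.0.0.50")

# 224.0.0.0/24 is the local network control block: routers do not forward it.
LOCAL_CONTROL_BLOCK = ipaddress.ip_network("224.0.0.0/24")


def describe(address):
    """Report whether an IPv4 address is multicast and whether it falls
    in the block that is never forwarded off the local segment."""
    return {
        "multicast": address.is_multicast,
        "never_routed": address in LOCAL_CONTROL_BLOCK,
    }


print(describe(GATEWAY_GROUP))
# {'multicast': True, 'never_routed': True}
print(describe(ipaddress.ip_address("192.168.0.255")))
# {'multicast': False, 'never_routed': False}
```

If that assumption holds, the traffic is multicast rather than broadcast, and it is scoped to the local segment. A container on a Docker bridge network sits on its own subnet, so it would not be expected to see these packets unless something relays them, and publishing port 9898 alone would not change that.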

I guess we have to wait for this issue to be resolved:

https://github.com/Danielhiversen/PyXiaomiGateway/issues/26

Alternatively, has anyone tried using

https://www.weave.works/docs/net/latest/install/plugin/plugin-how-it-works/

When any aqara sensor in HA says it’s unavailable, it means it’s not working as intended. If it’s not working as intended, you won’t get motion detection events, you won’t get water leakage alerts, etc. There is a good explanation here https://github.com/home-assistant/home-assistant/pull/11631.

The only workaround is to ask each sensor each second for its current state. That’s a horrible workaround and nobody will implement it.

So you must get multicast working correctly, because that’s what the Aqara hub requires.

I fixed my issue by having a cron restart my hass container every hour.

For anyone reading this, don’t buy into the Xiaomi ecosystem, it’s a mess.

Sorry, can’t agree. I have numerous Xiaomi devices and they work flawlessly.
Something else must be at play in your environment.

Actually, you didn’t fix the issue. You just masked it. You still don’t receive events from sensors.

Completely disagree. I’ve been using Xiaomi for 7 months now and find it reliable enough that I’ve taken to removing dumb wall switches so my smart light bulbs cannot be accidentally turned off.

The solution seems simple to me: remove all overcomplicated network components. I really see no need to have separate VLANs, reverse proxies, etc.; they just needlessly overcomplicate things and create unnecessary management. For me it’s simple: one network that’s for our home and a separate wifi for guests.
