Hi All,
I’ve been using Home-Assistant and Z-Wave for about 6 months now. I’m running a Raspberry Pi 3B with:
- Raspian Stretch OS
- Home assistant (0.84.6 current) in a python virtualenv
- Razberry daughter card plugged into the GPIO header.
I have five connected z-wave devices so far, four Aeotec MultiSensors (three mains powered, one on battery) and one Aeotec SmartSwitch.
I also have a number of other components in the mix including:
- A systemmonitor getting the memory and disk usage of the pi.
- A bom sensor getting weather data from the internet.
- A custom udp sensor listening for data from an Arduino.
By and large the system has been reliable, the most challenging problem was switching the SQLlite DB to MySQL and troubleshooting some referential integrity issues. Aside from that, everything works fine.
Execpt for this problem…
Upon a fresh boot of the Pi, home-assistant starts up OK, all the sensors including z-wave devices and others send data in a timely fashion. For the z wave devices, the battery Multisensor reports in half hour intervals, while the powered sensors report every thirty seconds.
I check the home-assistant GUI to see all is well, then I go to sleep (or work).
Unfortunately, when I get back home to check it, I almost always find that after a few hours of successful operation, Home-Assistant has stopped receiving data from ALL z-wave devices; the charts have flatlined and the last sensor report from any z-wave device was something like “5 hours ago”.
Checking the OZW_Log, I find that the timestamp of the last line of the OZW_Log corresponds exactly with the time that home-assistant stopped receiving z-wave sensor data.
There is nothing in the OZW_Log I can see that would point towards a crash or error. Here are the last several lines of the log before the last outage:
2019-01-08 23:53:07.650 Info, Node006, Received SensorMultiLevel report from node 6, instance 1, Ultraviolet: value=0
2019-01-08 23:53:07.650 Detail, Node006, Refreshed Value: old value=0, new value=0, type=decimal
2019-01-08 23:53:07.650 Detail, Node006, Changes to this value are not verified
2019-01-08 23:53:07.650 Detail, Node006, Notification: ValueChanged //LAST LINE
And here’s the first few lines after a heal:
2019-01-09 19:49:25.283 Detail, Node001, Queuing (Controller) Request Node Neighbor Update
2019-01-09 19:49:25.304 Detail, Node004, Queuing (Controller) Request Node Neighbor Update
2019-01-09 19:49:25.304 Detail, Node005, Queuing (Controller) Request Node Neighbor Update
2019-01-09 19:49:25.304 Detail, Node006, Queuing (Controller) Request Node Neighbor Update
2019-01-09 19:49:25.305 Detail, Node007, Queuing (Controller) Request Node Neighbor Update
2019-01-09 19:49:25.305 Detail, Node008, Queuing (Controller) Request Node Neighbor Update
2019-01-09 19:49:25.310 Info, Requesting Neighbor Update for node 1
]
Note that all other devices continue to display in Home assistant as normal.
See the graph below for a picture of what’s going on.
This graph has all the temperature sensors that I currently have; The “Room” z-wave senors were unresponsive since 08/01/2019 23:53:07 until I did a heal of the network through the homeassistant GUI at 2019-01-09 19:49:25. Not that the Front Yard sensor restored a little later, presumably because it is on battery and only wakes up every 30 minutes. The BOM sensor I included works from an API call to a website and continued to function while the z-wave sensors were down.
I also noticed that the Home Assistant would sometimes restore the sensors before I could check again. I later discovered this was due to the auto-heal function which was enabled by default. I disabled it.
I have also tried setting the polling interval to 120000ms (2 minutes) to attempt to automatically restore the devices if they didn’t report themselves. This has not changed the behaviour.
Here’s my z-wave config out of home-assistant at this present time. Like I said, there’s nothing wrong with the configuration of the sensors per-se, just the fact that they all stop at the same time.
zwave:
usb_path: /dev/ttyAMA0
network_key: "Secret"
#config_path: /srv/homeassistant/lib/python3.5/site-packages/python_openzwave/ozw_config
debug: False
polling_interval: 120000
autoheal: False
Also relevant, the SQL config. I have it purging stuff older than14 days to keep the DB manageable within the limited resources of the Pi
recorder:
purge_keep_days: 14
db_url: mysql://hass:secretpass@localhost:3307/hass_db?charset=utf8
The fact that they all fail at once implies there’s either an error in OZW which is not writing to the logs, or a hardware issue with the Razberry board itself. I do not know how to begin troubleshooting these devices short of ordering replacement parts or factory resetting the whole system, something I am not yet prepared to do.
So I’m currently out of options. I am sure there must be some way to dig into OZW or the Razberry controller, but I have not discovered a method to do that yet.
Right now I’m documenting the timestamps between when OZW_Starts and stops logging, hopefully trying to observe any common conditions that might be causing the issue, but nothing stands out as of yet.
I’ll post my findings here as I troubleshoot.
In the meantime, does anyone have any ideas what could be going on of where else I could check?
Thanks in advance