Mesh Device Batch Management UI (Z-Wave, Zigbee, Lyra, TP-Link Kasa, Broadlink, other meshes)

The problem: One of my HA installs has 19 Z-Wave switches of 2 different types. Checking their firmware versions, updating device options / settings, reviewing each device’s recent issues to troubleshoot mesh communication, and updating firmware took more than a full day. About 90% of that was clicking through multiple screens from one device to the next, recording data in a notepad app, and working around the lack of an easy way to batch e.g. firmware updates of multiple devices (with 1-2 auto-retries).

The recommended solution:
Extend the [HA UI]/config/devices/dashboard screen or make a dedicated “mesh management” screen that lists the devices of a given mesh (Z-Wave, etc. - the same type of screen could be used for all supported meshes) with useful status/info fields, and allows a given action (with retries, if applicable) to be batch-started on any of that mesh’s devices (multiple selection). These fields seem useful for such a screen:

  1. Device selection checkbox (to select multiple devices for a batch action). The first device selected is “super-selected” (like in Adobe Lightroom) - which is useful to designate a particular device as the source / model - e.g. to replicate its chosen settings / options to the rest of the selected devices.
  2. Device’s assigned Area if any (same as on current HA device dashboard)
  3. Manufacturer (same as on current HA device dashboard)
  4. Model (same as on current HA device dashboard)
  5. Persistent device hardware ID (e.g. MAC address or Z-Wave code; optionally truncated to its first 2 plus last 4 digits, with the complete number shown on mouse hover)
  6. Device’s assigned HA Name if any (same as on current HA device dashboard)
  7. Device status (alive / error / unconnected etc - same as seen on [HA UI]/config/devices/device/[device id]). Can be conveyed via a color-coded circle and/or icon, with more info available in a pop-up tooltip on mouse hover.
  8. Firmware version
  9. Signal strength (dB or bar graph or antenna icon w/bars) - basically what the device or mesh reports about the quality of its connection to its mesh, or a metric synthetically derived & populated by the HA integration based on the device’s responses (% dropped packets, qty of re-transmits, how long a command takes to take effect, etc). This can be obtained directly where supported by the device/mesh vendor (and interpreted in the context of what the reported dB etc value means for the device’s practical responsiveness on a given installation, to show a realistic & actionable signal strength / connection quality value). Some devices/meshes that don’t support real-time signal quality have a “range test” option that can be run on demand or scheduled. For meshes where none of this is supported, HA can infer it from a recency-weighted average of how long / how many retries it has taken for e.g. a scene change to take effect on the device, and how often a command fails to reach it. MAYBE actual signal strength (if available) and de-facto HA command responsiveness in a mesh should be two separate fields.
  10. Mesh neighbor topology. For meshes where explicit “parents”, “peers”, and “children” are identified / reported, this (small-font text) field can list them, along with the route. This should make it easy to see “the big picture” of which node talks to which without too much text or some big graphs.
  11. Recent errors. This text field lists error messages, if any, with timestamps, reported by the device recently. Dashboard option fields at the top of this dashboard screen can specify if this goes back only 5 min or 2hrs or 48hrs, for example, and how many lines max to report per device.
  12. Diagnostic info (one of the available batch actions is to run a diagnostics routine for the entire mesh - if directly supported by the mesh / device vendor - or if not, do the next best thing, such as changing dimmer brightness by 1% and then back, or running any other innocuous command as a test). Small text field, making it easy to see problem nodes and the basic nature of the problem. Less data to clog the eye, more actionable answers.
  13. Statistics. Again, a few reasonable sub-columns. Recency-weighted averages. Color-coding (in the context of the given mesh’s hardware capability vs reasonable thresholds for failure rate levels) - are the numbers being listed good vs meh vs borderline vs problem? This may be redundant vs the above.
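
To make the “recency-weighted average” idea in fields 9 and 13 concrete, here is a minimal Python sketch, not an existing HA API. The names LinkStats, ewma and rating, the smoothing factor and the failure-rate thresholds are all assumptions made up for illustration; real thresholds would depend on the mesh’s hardware.

```python
from dataclasses import dataclass
from typing import Optional


def ewma(previous: Optional[float], sample: float, alpha: float) -> float:
    """Exponentially weighted moving average: newer samples count more."""
    if previous is None:
        return sample
    return alpha * sample + (1 - alpha) * previous


@dataclass
class LinkStats:
    """Hypothetical per-device record of de-facto command responsiveness."""

    alpha: float = 0.5                    # weight of the newest sample (assumed)
    latency_ms: Optional[float] = None    # recency-weighted command latency
    failure_rate: Optional[float] = None  # recency-weighted share of failed commands

    def record(self, latency_ms: Optional[float], failed: bool) -> None:
        """Fold the outcome of one command into the running averages."""
        self.failure_rate = ewma(self.failure_rate, 1.0 if failed else 0.0, self.alpha)
        # A failed command has no meaningful latency, so only successes update it.
        if not failed and latency_ms is not None:
            self.latency_ms = ewma(self.latency_ms, latency_ms, self.alpha)

    def rating(self) -> str:
        """Map the failure rate to a color-coding bucket (thresholds are made up)."""
        if self.failure_rate is None:
            return "unknown"
        if self.failure_rate >= 0.25:
            return "problem"
        if self.failure_rate >= 0.05:
            return "borderline"
        return "good"
```

Usage would be one record() call per command HA sends to the device; the dashboard would then read latency_ms, failure_rate and rating() for its columns and color-coding.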

Am I missing any other fields? Please suggest.

Then we have batch action buttons, applicable to all selected devices whose checkboxes are checked by the user:

  1. Copy/sync settings (HA log level, device-specific options). Only allow syncing device-specific options if the vendor, model, and firmware version match. Maybe allow limited option syncing across model/firmware versions of same vendor.
  2. Re-interview
  3. Heal
  4. Update firmware
  5. Remove | Remove all failed / disconnected | Remove all unnamed | Factory-reset (if supported) and then remove
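
The matching rule in action 1 could look roughly like the sketch below. It assumes the first selected device is the “super-selected” source and only covers the strict case (same vendor, model and firmware); DeviceInfo and plan_option_sync are hypothetical names, not part of HA.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical slice of a device registry entry."""

    name: str
    manufacturer: str
    model: str
    firmware: str


def plan_option_sync(
    selected: List[DeviceInfo],
) -> Tuple[DeviceInfo, List[DeviceInfo], List[DeviceInfo]]:
    """Return (source, compatible targets, skipped devices).

    The first selected device is treated as the source whose device-specific
    options get copied. A target is compatible only when its manufacturer,
    model and firmware version all match the source.
    """
    if not selected:
        raise ValueError("select at least one device")
    source = selected[0]
    key = (source.manufacturer, source.model, source.firmware)
    compatible: List[DeviceInfo] = []
    skipped: List[DeviceInfo] = []
    for device in selected[1:]:
        matches = (device.manufacturer, device.model, device.firmware) == key
        (compatible if matches else skipped).append(device)
    return source, compatible, skipped
```

The skipped list is what the UI would show the user before the sync starts, so nobody is surprised that some checked devices were left out.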

Am I missing any actions (that would be appropriate for multiple devices at once and wouldn’t be made redundant by the dashboard info already listing the effect)?

Batch actions (e.g. firmware update): concurrent if supported; otherwise (or if explicitly preferred) sequential, with 1…3 retries per device. Sequential actions that target individual devices should try to follow the mesh topology, from the HA mesh controller outward, as opposed to going alphabetically or in a random order.
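
A rough Python sketch of the sequential case: order the selected devices breadth-first from the controller outward, then run the action on each with a few retries. The neighbor map format, the function names and the action callback (returning True on success) are assumptions for illustration only; a real integration would get the topology from the mesh itself.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def topology_order(
    neighbors: Dict[str, List[str]], controller: str, selected: Iterable[str]
) -> List[str]:
    """Order selected devices breadth-first, starting at the controller."""
    wanted = set(selected)
    seen = {controller}
    queue = deque([controller])
    order: List[str] = []
    while queue:
        node = queue.popleft()
        if node in wanted:
            order.append(node)
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # Selected devices that the known topology cannot reach go last.
    order.extend(sorted(wanted - set(order)))
    return order


def run_batch(
    order: List[str], action: Callable[[str], bool], max_retries: int = 2
) -> Dict[str, bool]:
    """Run action on each device in order; retry up to max_retries times."""
    results: Dict[str, bool] = {}
    for device in order:
        results[device] = False
        for _attempt in range(1 + max_retries):
            if action(device):
                results[device] = True
                break
    return results
```

Breadth-first order means devices closest to the controller are handled first, so a node that other devices route through is dealt with before the devices behind it.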

The [HA UI]/config/devices/dashboard screen shows batch progress. It should be possible to schedule the batch to start immediately, at a given time, or in “X hours”. This is because e.g. Z-Wave firmware updates make switches & dimmers non-responsive while the update takes place, and there may be less RF interference overnight.