Serial integration is unstable

Hello,

I have a serial integration where I read some DHT22 sensors from an Arduino Nano.
My config:

sensor:
  - platform: serial
    name: pod_nano
    serial_port: /dev/serial/by-id/usb-1a86_USB2.0-Serial-if00-port0

  - platform: template
    sensors:
      hol_temperatura:
        friendly_name: Temperatura Hol
        unit_of_measurement: "°C"
        value_template: '{{states.sensor.pod_nano.attributes["hol_temperatura"] | round(1)}}'
      hol_umiditate:
        friendly_name: Umiditate Hol
        unit_of_measurement: "%"
        value_template: '{{states.sensor.pod_nano.attributes["hol_umiditate"] | round(0)}}'

The sensors run fine for a while, but it could be that after a day or so, they don’t get any more data from the serial.
Checking the serial with picocom for example, shows that the Arduino is still sending data.
But it looks like Home Assistant lost the link with the serial device.

The same happens if I unplug the device, and plug it back.
The link is lost and I need to restart Home Assistant in order to get it back.

Is there a way for the serial integration to recover from a broken connection to the device?

Thanks!

1 Like

We’ll need some more info.

What OS are you running?

What hardware are you running?

My guess from the serial path, and serial device id is a raspberry pi, and I’ll assume you’re running HassOS (Or HomeAssistant as it’s called now?)

Are there any clues in dmesg? What about the home assistant logs?

Looking at the code, I don’t see any reason why it wouldn’t auto reconnect. It only closes the serial connection if EVENT_HOMEASSISTANT_STOP is received.

Also, looking at the Arduino Nano Schematic (https://www.arduino.cc/en/uploads/Main/Arduino_Nano-Rev3.2-SCH.pdf), the DTR# is connected to the RESET line, so the nano reboots every time you open a serial connection. This means, when you open picocom, you are actually resetting the NANO. So you can’t assume “it’s still sending data”, because by observing it, you’ve altered the behavior. Clearly the NANO operates in the quantum realm :slight_smile:

Anyway, what is the output of DMESG and Home Assistant logs? I would start HomeAssistant, tail the logs, then in a new window, open picocom. It’s likely this will fail as the device should be in use…but maybe not. Then, I’d unplug the usb device while everything is running. What do the logs show. Then plug it back in (with everything still running). What are the logs again?

1 Like

Thanks for the fast reply!
I’m answering now the first questions, and I’ll do the tests you suggested afterwards.

I’m running Ubuntu 19.10 as a VirtualBox VM, the host is a NUCi8 running Windows 10.
There I’m running HassIO in a Docker container. I forward the required USB-to-Serial and USB-to-Bluetooth adapters from VirtualBox to VM.

There is no info in the HASS logs. I even enabled DEBUG for Serial, like this:

logger:
  default: critical
  logs:
    homeassistant.components.sensor: debug
    homeassistant.components.serial: debug
    homeassistant.components.sensor.serial: debug

The only messages I was receiving was the data that the Arduino was sending.
But no errors or anything else when it stops.

While HASS receives data correctly, I can start picocom and it will read data sent. But doing this stops HASS from receiving anything.
I don’t see anything in dmesg, nor HASS logs. I started picocom while HASS should have the serial port open (it was updating the sensors). These are the last entries:

[ 1153.248282] udevd[6]: starting version 3.2.8
[ 1153.250433] udevd[7]: starting eudev-3.2.8
[ 7093.733373] udevd[6]: starting version 3.2.8
[ 7093.736079] udevd[7]: starting eudev-3.2.8
[ 9178.492812] Bluetooth: hci0: command 0x200b tx timeout

(I’m not experienced with Linux at all)

And you are correct, the NANO will reset when you connect the USB serial.
I will add a counter to the JSON message so that I can see if it resets when I open it in picocom.

So, after adding an index to the output it’s clear that the board does not reset when I access it with picocom after it was working in HASS.

What I did after:

  1. HASS was properly reading data, logs show:
2020-03-17 18:09:23 DEBUG (MainThread) [homeassistant.components.serial.sensor] Received: {"hol_temperatura":21.3,"hol_umiditate":38.9,"pod_temperatura":15.7,"pod_umiditate":43,"dressing_temperatura":15.3,"dressing_umiditate":39.6,"afara_temperatura":15.8,"afara_umiditate":44.4,"run_index":95}
2020-03-17 18:09:28 DEBUG (MainThread) [homeassistant.components.serial.sensor] Received: {"hol_temperatura":21.3,"hol_umiditate":38.9,"pod_temperatura":15.6,"pod_umiditate":42.9,"dressing_temperatura":15.3,"dressing_umiditate":39.6,"afara_temperatura":15.8,"afara_umiditate":44.4,"run_index":96}

  1. removed the USB device, no change in the HASS logs, dmesg shows:
[ 1411.614489] ch341-uart ttyUSB0: ch341-uart converter now disconnected from ttyUSB0
[ 1411.614513] ch341 1-4:1.0: device disconnected
  1. reattached the device, no change in the HASS logs, dmesg shows:
[ 1468.479785] usb 1-4: new full-speed USB device number 7 using xhci_hcd
[ 1468.743865] usb 1-4: New USB device found, idVendor=1a86, idProduct=7523, bcdDevice= 2.54
[ 1468.743867] usb 1-4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[ 1468.743869] usb 1-4: Product: USB2.0-Serial
[ 1468.747180] ch341 1-4:1.0: ch341-uart converter detected
[ 1468.749291] usb 1-4: ch341-uart converter now attached to ttyUSB0
  1. I waited a few minutes, no changes in HASS. I started then picocom and now the run_index was reset to 0:
picocom -b 9600 /dev/ttyUSB0
picocom v3.1

port is        : /dev/ttyUSB0
flowcontrol    : none
baudrate is    : 9600
parity is      : none
databits are   : 8
stopbits are   : 1
escape is      : C-a
local echo is  : no
noinit is      : no
noreset is     : no
hangup is      : no
nolock is      : no
send_cmd is    : sz -vv
receive_cmd is : rz -vv -E
imap is        :
omap is        :
emap is        : crcrlf,delbs,
logfile is     : none
initstring     : none
exit_after is  : not set
exit is        : no

Type [C-a] [C-h] to see available commands
Terminal ready
{"hol_temperatura":21.3,"hol_umiditate":39.2,"pod_temperatura":15.6,"pod_umiditate":43.1,"dressing_temperatura":15.3,"dressing_umiditate":39.3,"afara_temperatura":15.8,"afara_umiditate":44.7,"run_index":0}
{"hol_temperatura":21.3,"hol_umiditate":39.2,"pod_temperatura":15.7,"pod_umiditate":43.1,"dressing_temperatura":15.3,"dressing_umiditate":39.3,"afara_temperatura":15.7,"afara_umiditate":44.7,"run_index":1}
{"hol_temperatura":21.3,"hol_umiditate":39.2,"pod_temperatura":15.7,"pod_umiditate":43.1,"dressing_temperatura":15.3,"dressing_umiditate":39.3,"afara_temperatura":15.7,"afara_umiditate":44.7,"run_index":2}
{"hol_temperatura":21.2,"hol_umiditate":39.2,"pod_temperatura":15.6,"pod_umiditate":43,"dressing_temperatura":15.3,"dressing_umiditate":39.3,"afara_temperatura":15.7,"afara_umiditate":44.7,"run_index":3}

If I restart HASS while picocom is connected to the port, HASS will not be able to read data - no mention of issues in its logs though.
I still don’t get why there are not more logs from this serial integration to let you know what’s it doing (like connecting to, disconnecting from, error, etc) - have I not enabled the correct logging component?

Ok, here’s a question. Why are you using the /dev/serial/by-id for this device in the config? What happens if you just attach straight to /dev/ttyUSB0?

Is minicom able to attach to that device? What happens to the symlink when you unplug and replug it in (I assume that’s a symlink?)

The only debug message in the entire app is when it receives a message…
_LOGGER.debug(“Received: %s”, line)

Can’t even find one in pyserial-asyncio lol

I’m using serial-by-id because that one looks to be static, whereas /dev/ttyUSB0 might point to another device if in the future I add more or the configuration changes.

I’ve tried now /dev/ttyUSB0 and the behavior is identical.
Minicom connects to the serial-by-id device just fine.

I don’t know what to do next…

Makes sense. I just wasn’t sure how the symlink would handle connecting/reconnecting usb devices.

So, after adding an index to the output it’s clear that the board does not reset when I access it with picocom after it was working in HASS.

You say it’s clear the board does NOT reset, but you say the index was reset to 0. So you mean the board is resetting when you connect picocom?

If I restart HASS while picocom is connected to the port, HASS will not be able to read data - no mention of issues in its logs though.

Yeah, it should’t be able to open the serial port if something else has it open. Really lacking on useful debug statements…

Looking more into the code and how pyserial_asyncio is implemented, I’m not sure about this line…

serial_asyncio.open_serial_connection will return a reader and a writer. The writer stores the trasnport protocol and is what actually opened the serial connection. I wonder if the garbage collector is removing that writer, essentially closing the serial port. The reader will do nothing without a transport…which is owned by the writer.

Specifically, using the ‘_’ tells python it doesn’t care about that thing. Had they just stored it and left it unused, it could work? I don’t know, but I could see this causing the exact thing you’re describing (stops working after a while).

I don’t know how to report bugs here. I see @fabaff is the code owner for the serial interface…maybe he would know?

What I meant to say there were two different cases:

  1. If HASS already has the serial port open and I connect picocom => the index does not reset to 0 => the board does not get reset;
  2. I remove the device => HASS loses it’s connection. I reconnect the device, and let it run for some time so the run_index increases (HASS does not recreate the connection) => I connect picocom => NANO gets reset and run_index turns to 0.

Should I try storing it as a variable and see if it works? By changing the line to “reader, writer = await serial_asyncio.open_serial_connection(”
[not really sure how to do that though, yet]

Yeah, if you’re comfortable with changing the code, you could give that a shot. Make sure to delete the _pycache_ folder in there after you do it so it’s regenerated next boot.

It’s a simple enough change and should be fairly easy to test. I don’t think it will change what happens when you disconnect/reconnect the serial device. My only hope is it helps the long term connections…

I think I have a working fix.
I have created a custom_component serial integration that overrides the built-in one.

This is the code:

"""Support for reading data from a serial port."""
import json
import logging

import asyncio
import serial_asyncio
import voluptuous as vol

from homeassistant.components.sensor import PLATFORM_SCHEMA
from homeassistant.const import CONF_NAME, CONF_VALUE_TEMPLATE, EVENT_HOMEASSISTANT_STOP
import homeassistant.helpers.config_validation as cv
from homeassistant.helpers.entity import Entity

_LOGGER = logging.getLogger(__name__)

CONF_SERIAL_PORT = "serial_port"
CONF_BAUDRATE = "baudrate"

DEFAULT_NAME = "Serial Sensor"
DEFAULT_BAUDRATE = 9600

PLATFORM_SCHEMA = PLATFORM_SCHEMA.extend(
    {
        vol.Required(CONF_SERIAL_PORT): cv.string,
        vol.Optional(CONF_BAUDRATE, default=DEFAULT_BAUDRATE): cv.positive_int,
        vol.Optional(CONF_NAME, default=DEFAULT_NAME): cv.string,
        vol.Optional(CONF_VALUE_TEMPLATE): cv.template,
    }
)

async def async_setup_platform(hass, config, async_add_entities, discovery_info=None):
    """Set up the Serial sensor platform."""
    name = config.get(CONF_NAME)
    port = config.get(CONF_SERIAL_PORT)
    baudrate = config.get(CONF_BAUDRATE)

    value_template = config.get(CONF_VALUE_TEMPLATE)
    if value_template is not None:
        value_template.hass = hass

    sensor = SerialSensor(name, port, baudrate, value_template)

    hass.bus.async_listen_once(EVENT_HOMEASSISTANT_STOP, sensor.stop_serial_read())
    async_add_entities([sensor], True)

class SerialSensor(Entity):
    """Representation of a Serial sensor."""

    def __init__(self, name, port, baudrate, value_template):
        """Initialize the Serial sensor."""
        self._name = name
        self._state = None
        self._port = port
        self._baudrate = baudrate
        self._serial_loop_task = None
        self._template = value_template
        self._attributes = []

    async def async_added_to_hass(self):
        """Handle when an entity is about to be added to Home Assistant."""
        self._serial_loop_task = self.hass.loop.create_task(
            self.serial_read(self._port, self._baudrate)
        )

    async def serial_read(self, device, rate, **kwargs):
        """Read the data from the port."""
        loggedError = False
        while True:
            try:
                reader, _ = await serial_asyncio.open_serial_connection(
                    url=device, baudrate=rate, **kwargs
                )
                _LOGGER.info("Serial device %s connected", device)
                while True:
                    try:
                        line = await reader.readline()
                        line = line.decode("utf-8").strip()

                        try:
                            data = json.loads(line)
                            if isinstance(data, dict):
                                self._attributes = data
                        except ValueError:
                            pass

                        if self._template is not None:
                            line = self._template.async_render_with_possible_json_value(line)

                        _LOGGER.debug("Received: %s", line)
                        self._state = line
                        self.async_schedule_update_ha_state()
                    except Exception as e:
                        self._state = None
                        self._attributes = None
                        self.async_schedule_update_ha_state()
                        _LOGGER.error("Error while reading serial device %s: " + str(e), device)
                        break
            except:
                if not loggedError:
                    _LOGGER.error("Unable to connect to the serial device %s. Retrying...", device)
                    loggedError = True
                await asyncio.sleep(5)

    async def stop_serial_read(self):
        """Close resources."""
        if self._serial_loop_task:
            self._serial_loop_task.cancel()

    @property
    def name(self):
        """Return the name of the sensor."""
        return self._name

    @property
    def should_poll(self):
        """No polling needed."""
        return False

    @property
    def device_state_attributes(self):
        """Return the attributes of the entity (if any JSON present)."""
        return self._attributes

    @property
    def state(self):
        """Return the state of the sensor."""
        return self._state

As this is my first ever Python code, please let me know if I made some errors.
It’s running on my machine since yesterday, and it’s recovering from connection errors, and sets the sensors state to None in case of issues.
I don’t see that it slows down HASS, because of the asyncio.sleep(5) call, but I didn’t measure it’s impact in any way.

@fabaff / @cgtobi / @zxdavb - would it be possible that this code change be included in the original integration?

1 Like

Why don’t you open an PR for that?

I don’t have experience with that, you mean Pull Request from https://github.com/home-assistant/core/pulls ?

I’m having the same issue. HA sees data being sent from a serial device (consumed by Node RED serial in node) for a while and then simply doesn’t. Restarting the serial device does not fix this. After coming back up from a reboot via the Supervisor, things work again for a while (tens of minutes, not hours) until they don’t. But HA / Node RED gives no indication of any connection issue .

I can’t see any useful debug information in the logs, but sometimes it seems like I can send data to the device, but just get no data from the device. But that is not always the case.

This is super frustrating as the serial device is the physical layer interface to my lighting system.

Most of the serial issues should be fixed in the current HA.
I’m running the serial integration just fine now and it’s working for days without issues.

If you add this to configuration.yaml, do you see log entries when data is received, connection lost, etc ?

logger:
  logs:
    homeassistant.components.serial: debug

I’m still having this issue. The integration works for about an hour and then stops working

1 Like

Facing the same issue here.

Ref:
|Installation Type|Home Assistant OS|
|Development|false|
|Supervisor|true|
|Docker|true|
|Virtual Environment|false|
|Python Version|3.8.7|
|Operating System Family|Linux|
|Operating System Version|5.4.83-v7|
|CPU Architecture|armv7l|
|Timezone|Europe/Stockholm|

Home Assistant Supervisor

|Host Operating System|Home Assistant OS 5.11|

|Supervisor Version|supervisor-2021.02.11|
|Docker Version|19.03.13|
|Installed Add-ons|deCONZ (6.7.0), ESPHome (1.16.1), File editor (5.2.0), Terminal & SSH (9.0.1), Traccar (0.11.0)|

1 Like

Same problem, occurs on various USB devices, various cables - sooner or later USB0 becomes not available. HASS server restart from web Settings page or from console does not help. I use VirtualBox - disable and enable USB in VirtualBox recovers from problem state. No device disconnect required.

When USB failed:
“lsusb” shows it.
“ha hardware info” also shows ttyUSB0.
Terminal has message like - homeassistant ttyUSB0 failed to set flow control -110

System Health

version: core-2021.3.4
installation_type: Home Assistant OS
dev: false
hassio: true
docker: true
virtualenv: false
python_version: 3.8.7
os_name: Linux
os_version: 5.4.99
arch: x86_64

logged_in: false
can_reach_cert_server: ok
can_reach_cloud_auth: ok
can_reach_cloud: ok

host_os: Home Assistant OS 5.12
update_channel: stable
supervisor_version: supervisor-2021.04.0
docker_version: 19.03.13
disk_total: 30.8 GB
disk_used: 11.1 GB
healthy: true
supported: true
board: ova
supervisor_api: ok
version_api: ok
installed_addons: Visual Studio Code (3.1.1), SQLite Web (3.1.0), Log Viewer (0.10.2), ESPHome (1.16.2), Terminal & SSH (9.1.0)

dashboards: 2
resources: 2
views: 10
mode: storage

Found autosyspend = 2 for USB -
cat /sys/module/usbcore/parameters/autosuspend

Changed to -1 by
echo -1 >/sys/module/usbcore/parameters/autosuspend

Will see if it helps. I think docker will reset it after reboot.

Autosuspend -1 made things even worse. Removed.

While looking for root cause a workaround solution might be to setup a repeating task to reconnect USB for VM, for example it could be like this:

Find out your VM id by:

C:\Program Files\Oracle\VirtualBox> VBoxManage list -l usbhost

Then create .bat and schedule its execution:

VBoxManage controlvm <VM ID> usbdetach <USB ID>
VBoxManage controlvm <VM ID> usbattach <USB ID>

… Update:

Reattaching the device did not help in long term.

“ha core logs” show such error

After serial integration stops feeding data into the serial sensor (I can still see using microcom) that my serial device is sending data. The issue persists and sensors are not getting updated neither with Core or Host reboot. Sometimes with some luck which I still do not understand why it starts working again with host reboot

System Health

version: core-2021.6.6
installation_type: Home Assistant OS
dev: false
hassio: true
docker: true
virtualenv: false
python_version: 3.8.9
os_name: Linux
os_version: 5.10.17-v7
arch: armv7l
timezone: Europe/Stockholm

logged_in: false
can_reach_cert_server: ok
can_reach_cloud_auth: ok
can_reach_cloud: ok

host_os: Home Assistant OS 6.0
update_channel: stable
supervisor_version: supervisor-2021.06.3
docker_version: 20.10.6
disk_total: 28.0 GB
disk_used: 9.9 GB
healthy: true
supported: true
board: rpi3
supervisor_api: ok
version_api: ok
installed_addons: ESPHome (1.19.1), File editor (5.2.0), Traccar (0.11.0), deCONZ (6.9.0), Samba share (9.5.1), Spotify Connect (0.8.2), SSH & Web Terminal (8.2.4)

dashboards: 4
resources: 0
views: 9
mode: storage

api_endpoint_reachable: ok