How do I check what went wrong with the SSH access container (port 22222)?
And how do I troubleshoot the Supervisor and its failure to restart an add-on?
Alternatively: how do I increase/improve logging to capture these failures next time, so I can open proper bug reports?
All of these components were working without a hitch for a long time, but I haven’t checked in detail for quite a while now (over a month).
Previously, when connecting via SSH on port 22222, I would get the normal welcome screen and the “ha” prompt, but now I get odd messages and I’m stuck after this:
Using username "root".
Authenticating with public key "rsa-key-20200526"
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] done.
[services.d] starting services
[services.d] done.
If I hit CTRL+C after this, I get some more output, and then the connection is closed by the remote end.
^C[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
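To make it easier to see where the session stops, here is a minimal Python sketch that checks which of the bracketed stages in the output above reported completion. It is only my own illustration: the function name `stage_status` is made up, and treating "done." and "exited 0." as completion markers is an assumption based purely on the lines pasted here, not on any documented log format.

```python
import re

# Output pasted from the port 22222 session above (before pressing CTRL+C).
SESSION_LOG = """\
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] done.
[services.d] starting services
[services.d] done.
"""

# Matches lines of the form "[tag] message".
STAGE_LINE = re.compile(r"^\[([^\]]+)\]\s+(.*)$")


def stage_status(log_text):
    """Map each bracketed tag to True once a line for it ends in a completion marker.

    Assumption: "done." or a trailing "exited 0." means the stage finished.
    """
    status = {}
    for line in log_text.splitlines():
        match = STAGE_LINE.match(line.strip())
        if not match:
            continue
        tag, message = match.groups()
        finished = message == "done." or message.endswith("exited 0.")
        status[tag] = status.get(tag, False) or finished
    return status


if __name__ == "__main__":
    for tag, finished in stage_status(SESSION_LOG).items():
        print(f"{tag}: {'finished' if finished else 'no completion line'}")
```

By this reading every startup stage reports completion, so the hang seems to come after the services are started rather than during container initialization.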
I am running Home Assistant Core 0.115.6 with HassOS 3.13 on an RPi3. I haven’t had to touch my Home Assistant in a while, and it had been working well until recently, when I noticed my remote access was down. I use the auto-ssh add-on to set up a tunnel to a remote host where I have a public domain configured. I mainly access my dashboard from home over Wi-Fi, so I would not normally notice this, but today I asked one of my Google Home Minis to perform an action that is executed via Home Assistant, and it said HA could not be reached.
The last time I restarted Home Assistant was on October 24th, so HA has about 50 days of uptime. The RPi system itself, and thus I assume also the SSH container, has been running for a lot longer than this: I get an uptime of 104 days (since 2020-08-31 00:25:09 according to “uptime -s”) from the File editor add-on’s shell command execution environment. I’m pretty sure the auto-ssh add-on was working fine at least into late November.
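As a sanity check on those figures, a small Python sketch of the date arithmetic. The boot time and the 104 days are the values quoted above. The year of the October 24th restart (2020) and its time of day (midnight) are assumptions, and the function names are just for illustration.

```python
from datetime import datetime, timedelta

BOOT_TIME = datetime(2020, 8, 31, 0, 25, 9)   # from "uptime -s"
SYSTEM_UPTIME_DAYS = 104                      # reported system uptime, whole days
LAST_HA_RESTART = datetime(2020, 10, 24)      # assumption: year 2020, midnight


def approximate_now(boot_time, uptime_days):
    """Estimate the current time as boot time plus the reported whole days of uptime."""
    return boot_time + timedelta(days=uptime_days)


def ha_uptime_days(now, last_restart):
    """Whole days elapsed since the last Home Assistant restart."""
    return (now - last_restart).days


if __name__ == "__main__":
    now = approximate_now(BOOT_TIME, SYSTEM_UPTIME_DAYS)
    print(now.date())                            # 2020-12-13
    print(ha_uptime_days(now, LAST_HA_RESTART))  # 50
```

So the 104-day system uptime and the roughly 50 days of HA uptime are consistent with each other.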
What I wanted to do was find a way to troubleshoot why my add-on container was apparently not restarted the last time it disconnected and/or failed to reconnect (auto-ssh). Maybe this is linked to updates of the Supervisor: it is the only automatically changing component I can think of, and it should be handling this. I don’t think I will be able to find out why this time, but while looking into it I also ran into this SSH (port 22222) container issue.
I really, really dislike the apparent lack of troubleshooting capability in Home Assistant for these core components. Perhaps this is only, or mainly, related to running the bare-bones HassOS installation…?
I can execute shell commands via File editor, if that will be of any help.
Finally: Many faults such as these may go unnoticed if you simply restart often and upgrade frequently, so I don’t see those options as the final solution to these issues. I’d rather try to fix the issues or at least improve the capability for finding them.