HA Supervised Unreliable?

Hi,

I have had HA running on a Debian 12 VM for a while now. I have some add-ons installed (Influx, Grafana, …), but due to lack of time there is not much going on on the system. I only check it from time to time, so several days often pass before I access the frontend.

Unfortunately, my HA installation is quite unreliable. When I try to connect, the browser cannot reach the server. Checking on the Debian system itself, it appears that port 8123 is no longer open.

So for some reason the port got closed and the task behind it vanished.
I am unsure how to recover here and I want to know WHY the port is closed!

  • I tried docker restart hassio_supervisor.
  • I checked the logs with journalctl -xe, but I do not see any hints.
  • I re-installed os-agent with dpkg -i os-agent_1.6.0_linux_x86_64.deb.
  • I re-installed Docker with curl -fsSL get.docker.com | sh.
  • I even re-installed the package from a fresh download with apt install ./homeassistant-supervised.deb.

No matter what I try, it stays offline and I cannot connect.
This is the journalctl -f output after restarting docker:

Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.231 INFO (MainThread) [supervisor.host.services] Updating service information
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.233 INFO (MainThread) [supervisor.resolution.checks.base] Run check for free_space/system
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.234 INFO (MainThread) [supervisor.resolution.check] System checks complete
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.234 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state running
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.243 INFO (MainThread) [supervisor.host.network] Updating local network information
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.415 INFO (MainThread) [supervisor.host.sound] Updating PulseAudio information
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.420 INFO (MainThread) [supervisor.host.manager] Host information reload completed
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.446 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.450 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state running
Apr 23 12:52:22 hoas hassio_supervisor[171900]: 2024-04-23 12:52:22.450 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
Apr 23 12:52:30 hoas systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Apr 23 12:52:49 hoas dockerd[171900]: 2024/04/23 12:52:49 http: superfluous response.WriteHeader call from go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*respWriterWrapper).WriteHeader (wrap.go:98)
Apr 23 12:52:49 hoas systemd[1]: run-docker-runtime\x2drunc-moby-22b2f2f8b804eeee94a0d6076189437ec4603ced9e6b65686a74ecaa6a4d0255-runc.jE8PQ7.mount: Deactivated successfully.
Apr 23 12:52:52 hoas systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Apr 23 12:52:52 hoas systemd[1]: systemd-timedated.service: Deactivated successfully.
Apr 23 12:53:19 hoas systemd[1]: run-docker-runtime\x2drunc-moby-22b2f2f8b804eeee94a0d6076189437ec4603ced9e6b65686a74ecaa6a4d0255-runc.caCYxo.mount: Deactivated successfully.

After a full system reboot everything is up and running and I can connect to my instance. I can also see when it went down, because MQTT data stopped being collected from that point.

Now, I could schedule a cron job to reboot the system every now and then, or based on the reachability of port 8123. But I guess we all agree this is not the solution to this issue.
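
For what it's worth, a periodic check does not have to reboot anything; it could collect evidence at the moment the port disappears instead. The following is only a minimal sketch under assumptions that are not from this thread: Bash and the coreutils `timeout` command are available, the Core container uses the usual Supervised name `homeassistant`, and `/var/log/ha-watch` is a hypothetical output directory.

```shell
#!/usr/bin/env bash
# Sketch: record diagnostics when the HA frontend port stops answering.
# Assumptions (not from the thread): container name "homeassistant",
# output directory /var/log/ha-watch.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Write listener, container and journal state to a timestamped file
# and print the file name.
capture_state() {
    local outdir=${1:-/var/log/ha-watch}
    local out
    mkdir -p "$outdir"
    out="${outdir}/down-$(date +%Y%m%d-%H%M%S).log"
    {
        echo "== listeners on 8123 =="
        ss -tlnp | grep ':8123' || echo "none"
        echo "== containers =="
        docker ps -a
        echo "== homeassistant container state =="
        docker inspect --format '{{json .State}}' homeassistant
        echo "== journal, last 30 minutes =="
        journalctl --since "-30min" --no-pager
    } >"$out" 2>&1
    echo "$out"
}

# Intended use from cron, e.g. every 5 minutes:
#   port_open 127.0.0.1 8123 || capture_state
```

The container state (exit code, OOM flag, finish time) and the journal from the minutes before the failure are the pieces a reboot would otherwise wipe from view.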

Does anyone here have ideas for troubleshooting this? Why does only a reboot help? What steps can I take to figure out why it stopped?

Thanks a lot!

/KNEBB

No, HA (Supervised) is not unreliable; your system is.

You could restart in safe mode and disable (custom/community) integrations to see where it comes from.
Another option is to wait for someone who knows more than me.

I wonder if this line could play a role:

If you run in a VM anyway, why not HA OS?

Hi,

there are several reasons I prefer my own OS and use supervised mode. Some of them:

  • Use of encrypted disks
  • Ability to use SSH for the full system and not just inside a Docker container
  • Scripting admin tasks to automate as much as possible
  • Not having a black box
  • And some more

Ok, regarding the issue:
These are my integrations:

  • Apple TV
  • DLNA
  • HA Supervisor
  • IFTTT
  • IPP
  • iRobot
  • Met.no
  • MobileApp
  • MQTT
  • Philips Hue
  • Ping (ICMP)
  • Shelly
  • Sun

Of course I can try to disable these step by step and wait to see if HA breaks. If it does not, I assume I have found the culprit. But this would be a very odd way to troubleshoot.
I prefer to check logs, and find the reason for a problem.

I expected to get more information from the logs.
@Nick4
What might be the problem in the entry you quoted?

Thanks!

/KNEBB

Are you adept at running a Linux server with the requirements listed for Supervisor?
Many think they can run Debian, but HA Supervised sets specific requirements that limit what you can and can’t do.
One common mistake is to try to circumvent Network Manager, which typically makes a mess of the network configuration, often to such a degree that a rollback is not possible and a reinstall is necessary.

See, that’s why supervised is not the preferred installation method. It is advised for people who know their own system exactly… :wink:

As you might have noticed, we can’t really help, because no one besides you knows how your system is built. All the points you name as advantages of supervised are exactly the points you’re now fighting with.

“Not having a black box”: and now you don’t have a black box? You know less about your system than you would about a “closed” HA-OS install. The thing is, with HA-OS you don’t need to know. With supervised it’s your responsibility to know and to fix.
“Use SSH for the whole system”: Yep, same here. You don’t need to fiddle around in HA-OS, it just works. You only need the SSH connection because you have to work on parts of the system that HA-OS doesn’t give you access to, since there you don’t need it.
“Scripting admin tasks”: which you wouldn’t need to schedule if you didn’t need to repair your system…

To be honest, I have had this discussion many times here in the forum, so I know the arguments. HA-OS is the better alternative for almost every use case!

I’d strongly recommend you set up your VM with HA-OS and “just use it”. If you want to fiddle around in a system, set up another VM and play there. There are literally no reasons for using supervised anymore. It’s so 2010, it’s just outdated. :slight_smile:

Sorry, not what you wanted to hear I’m sure! :slight_smile:

Hi,

Well, yes. I have been running Linux servers for 30 years. I guess I have some experience :smiley:
And yes, I read (and followed!) the steps and requirements in detail!

And no, apart from some minor changes I did not configure anything in an unusual way.

I am wondering which service I would have to restart in order to get my HA instance working again.
I just did some troubleshooting and checked the available services in systemd. So next time it happens I will try to restart these services:

root@hoas:~# systemctl list-unit-files| grep hassio
hassio-apparmor.service                    enabled         enabled
hassio-supervisor.service                  enabled         enabled

Unless someone else has a better idea?

/KNEBB

Again, probably not what you want to hear, but if you’re having stability issues and need to restart services to get it running again, I would look at the underlying cause, not at workarounds to “hack it to work”.

Is that log snippet all you’ve got?
Why not look at the logs from before the restart?
Have you enabled DEBUG mode in the logger?
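
Looking at the logs from before the restart could be sketched roughly like this. It assumes a persistent journal (otherwise `-b -1` has nothing to show), GNU `date`, and that Docker logs through the journald driver so container output carries a `CONTAINER_NAME` field, as the `hassio_supervisor[...]` lines in the first post suggest. The function names are made up for illustration, and the timestamp in the usage comment is a placeholder, not a value from this thread; the real one would be the moment the MQTT data stopped arriving.

```shell
#!/usr/bin/env bash
# Shift a timestamp by N minutes (negative = earlier). Requires GNU date.
shift_minutes() {
    local ts=$1 minutes=$2 epoch
    epoch=$(date -d "$ts" +%s) || return 1
    date -d "@$((epoch + minutes * 60))" '+%Y-%m-%d %H:%M:%S'
}

# Journal entries of priority warning or higher, in a window of
# +/- span minutes (default 15) around the given time.
logs_around() {
    local ts=$1 span=${2:-15}
    journalctl --no-pager -p warning \
        --since "$(shift_minutes "$ts" "-$span")" \
        --until "$(shift_minutes "$ts" "$span")"
}

# What the Core container logged in the same window.
# Assumption: the container is named "homeassistant".
core_logs_around() {
    local ts=$1 span=${2:-15}
    journalctl --no-pager \
        --since "$(shift_minutes "$ts" "-$span")" \
        --until "$(shift_minutes "$ts" "$span")" \
        CONTAINER_NAME=homeassistant
}

# Usage (placeholder time):
#   logs_around "2024-04-20 03:15:00"
#   core_logs_around "2024-04-20 03:15:00" 30
# Tail of the boot before the current one:
#   journalctl -b -1 --no-pager | tail -n 300
```

Narrowing to a window around the outage keeps the output small enough to post here in full.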

Well, this was my original idea here…
But as no one could tell me where to look or help in reading the log I provided, I am limited to workarounds…

@paddy0174: Are you trying to tell me HA-OS will not have any issues I need to troubleshoot? At least on my Debian I can try to troubleshoot; in HA-OS I am not really able to get on the console… and I am pretty sure the logs will not be better…

/KNEBB

FWIW I’ve been running HAOS for some years now and I’ve never had the need to SSH into it to restart services or troubleshoot anything on the OS level.

What VM hypervisor are you using? You should be able to get to the console.

Thanks for the hint about logger.

There is obviously some more, but I will not cut & paste loads of lines as long as we do not know what we are looking for… I’ll see if I can attach it…

Logger was set to info, now switched to debug:

logger:
  default: debug

/KNEBB

Yes, I am. But HA is a server-type OS and it is very cumbersome to do troubleshooting on the VM console through the hypervisor (no cut & paste, wrong keyboard layout and stuff like this).

Don’t call that snippet of log entries (taken after a restart!) troubleshooting logs!
Everything looks fine, except maybe the part Nick4 mentioned.

With 30 years of experience running Linux servers, I’m sure you know how to Google a simple log output.

Now restart HA again and monitor your logs until it spews errors, connection timeouts, etc.
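
One way to do that monitoring without drowning in INFO lines might look like the sketch below. It assumes Docker logs through the journald driver (so `CONTAINER_NAME` matches work), that the containers are named `hassio_supervisor` and `homeassistant`, and GNU grep; the function names are only illustrative.

```shell
#!/usr/bin/env bash
# Drop INFO and DEBUG lines so warnings and errors stand out.
# Matches the level field as it appears in the Supervisor lines quoted
# in the first post, e.g. "... 12:52:22.231 INFO (MainThread) [...]".
drop_noise() {
    grep --line-buffered -Ev ' (INFO|DEBUG) \('
}

# Follow Supervisor and Core output live, filtered.
# Two matches on the same journal field are treated as alternatives.
follow_ha() {
    journalctl -f \
        CONTAINER_NAME=hassio_supervisor CONTAINER_NAME=homeassistant \
        | drop_noise
}

# Usage: run follow_ha in a tmux/screen session and leave it open
# until the frontend becomes unreachable.
```

Lines from other sources (dockerd, systemd) are not included here; a second terminal with a plain `journalctl -f` covers those.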

What about the first part of that question?

IDK, I just made that remark in case it might help.

A simple Google search, first hit, reveals this

Now tell me, with your 30 years of Linux vs. HA experience: is this caused by HA or by your Docker environment?
PS: I’m not saying it will lead you to a solution, but I do know that if you had experienced this in HAOS and it was an issue, it would be fixed by the HA devs.

PS: In HA Supervised you can also find logs in the UI under Settings/System/Logs, if I remember right. It has been 3 years since I “tried” Supervised and came to the conclusion: no, you can’t really do what you want with your “own” Debian OS. HA sets the rules you have to follow, so you really have to know HA (all parts) and the requirements for a Supervised installation, besides the Docker environment, which was totally unknown to me, having been “pre-retired” for more than a decade.

Anyway, your DEBUG output might lead you to an “easier” conclusion about what’s going on in your system.
And you can’t exclude all your Debian logs just because you can’t access HA (as you have already noticed).

Nope, not trying to tell, I’m telling you! :slight_smile: History and experience show, time and again, that the only error-free supervised installations are the ones taken care of by Linux specialists.

I know from my own experience what I’m talking about. :wink: I had supervised running for nearly three years and switched to HA-OS after some nice guy here in the forum had a heated exchange with me. I tried it, and have had zero problems since then! My HA-OS install in a Proxmox VM has now been running for almost half a year without restarts of the VM or any problems.

Believe it or not, but it’s your choice:
Either be a computer nerd who wastes time, nerves and money on a supervised install that brings no advantages, or be an HA user who focuses on making his life easier. :slight_smile:

I don’t mean to be rude, really, but if you ask around, everybody will tell you the same (as a few already did here in this topic): use HA-OS in a VM and not supervised! Simple as that!

Why not try it for yourself and prove me or yourself wrong? Set up a VM with HA-OS, ideally with Proxmox and not VMware, restore the backup from your supervised installation and let it run for at least two weeks or one major update.
I can practically guarantee you, that you’ll be better off with that installation. :wink:

As I said, this forum and its many users have proven my point right. :slight_smile:

The reason I wrote what I did is that the usual way is to use not Network Manager but other tools, and that is simply a no-go.
Ifconfig or direct editing of config files will trash Network Manager’s control and will then affect HA in unexpected ways.
