Where will you echo that request? I will back you up!
I think there are many like me with a production setup and a pre-production setup. My “production” is the core of my smart home, runs on an old version of HA but does everything it needs to do. My “pre-production” is a test setup to see if I need to upgrade/test new devices/integrations.
I’ve got the same problem.
Recently I restarted my system (which previously had an uptime of several hundred days). After the restart, the supervisor started to die every few minutes, and it was not possible to update it to the latest (2022.07) version.
I tried to fix the problem by addressing each of the reasons my system was not compliant:
updated Buster to Bullseye
updated Docker to 20.10.17
disabled the avahi daemon from starting via systemctl
added systemd.unified_cgroup_hierarchy=0 to the kernel parameters
installed os-agent
added the correct log-driver and storage-driver options to /etc/docker/daemon.json
This only made it worse: now the supervisor doesn’t start at all.
22-07-21 11:02:34 WARNING (MainThread) [supervisor.addons.options] Option 'availability' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:02:34 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.FREE_SPACE/ContextType.SYSTEM
22-07-21 11:02:34 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.SECURITY/ContextType.CORE
22-07-21 11:02:34 INFO (MainThread) [supervisor.resolution.checks.base] Run check for IssueType.TRUST/ContextType.SUPERVISOR
22-07-21 11:02:34 INFO (MainThread) [supervisor.host.manager] Host information reload completed
22-07-21 11:02:35 INFO (MainThread) [supervisor.resolution.check] System checks complete
22-07-21 11:02:35 INFO (MainThread) [supervisor.resolution.evaluate] Starting system evaluation with state CoreState.RUNNING
22-07-21 11:02:35 WARNING (MainThread) [supervisor.resolution.evaluations.base] Found unsupported images: {'guacamole/guacd', 'linuxserver/mariadb', 'lunik1/tt-rss', 'viktorstrate/photoview', 'linuxserver/piwigo'', 'jlesage/crashplan-pro', 'deepquestai/deepstack', 'linuxserver/sonarr', 'linuxserver/heimdall', 'linuxserver/calibre-web', 'guacamole/guacamole', 'dyonr/jackettvpn', 'linuxserver/radarr', 'jlesage/firefox', 'linuxserver/lidarr', 'portainer/portainer-ce', 'jacobalberty/unifi', 'linuxserver/grocy', 'binhex/arch-qbittorrentvpn', 'linuxserver/swag', 'flaresolverr/flaresolverr', 'postgres'} (more-info: https://www.home-assistant.io/more-info/unsupported/software)
22-07-21 11:02:36 INFO (MainThread) [supervisor.resolution.evaluate] System evaluation complete
22-07-21 11:02:36 CRITICAL (MainThread) [supervisor.jobs] The following job conditions are ignored and will make the system unstable when they occur: {<JobCondition.HEALTHY: 'healthy'>}
22-07-21 11:02:36 INFO (MainThread) [supervisor.resolution.fixup] Starting system autofix at state CoreState.RUNNING
22-07-21 11:02:36 INFO (MainThread) [supervisor.resolution.fixup] System autofix complete
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'anonymous' does not exist in the schema for Mosquitto broker (core_mosquitto)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Unknown option 'base_topic' for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'external_converters' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'devices' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'groups' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'homeassistant' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'permit_join' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'advanced' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'device_options' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'blocklist' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'passlist' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'queue' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'frontend' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'experimental' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
22-07-21 11:06:01 WARNING (MainThread) [supervisor.addons.options] Option 'availability' does not exist in the schema for Zigbee2MQTT (45df7312_zigbee2mqtt)
s6-rc: info: service legacy-services: stopping
22-07-21 11:08:59 INFO (MainThread) [supervisor.misc.scheduler] Shutting down scheduled tasks
[10:09:00] INFO: Watchdog restart after closing
s6-svwait: fatal: supervisor died
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
22-07-21 11:09:00 INFO (MainThread) [supervisor.api] Stopping API on 172.30.32.2
22-07-21 11:09:00 INFO (MainThread) [supervisor.hardware.monitor] Stopped Supervisor hardware monitor
22-07-21 11:09:00 INFO (MainThread) [supervisor.core] Supervisor is down - 0
22-07-21 11:09:00 INFO (MainThread) [__main__] Closing Supervisor
Sentry is attempting to send 1 pending error messages
Waiting up to 2 seconds
Press Ctrl-C to quit
[10:09:01] WARNING: Halt Supervisor
s6-linux-init-hpr: fatal: unable to talk to shutdownd: Operation not permitted
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
cont-init: info: running /etc/cont-init.d/udev.sh
[10:36:10] INFO: Setup udev backend inside container
[10:36:10] INFO: Update udev information
cont-init: info: /etc/cont-init.d/udev.sh exited 0
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun supervisor (no readiness notification)
services-up: info: copying legacy longrun watchdog (no readiness notification)
s6-rc: info: service legacy-services successfully started
s6-rc: info: service legacy-services: stopping
[10:36:12] INFO: Watchdog restart after closing
s6-svwait: fatal: supervisor died
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
[10:36:12] INFO: Supervisor restart after closing
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
cont-init: info: running /etc/cont-init.d/udev.sh
[11:18:46] INFO: Setup udev backend inside container
[11:18:46] INFO: Update udev information
cont-init: info: /etc/cont-init.d/udev.sh exited 0
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun supervisor (no readiness notification)
services-up: info: copying legacy longrun watchdog (no readiness notification)
s6-rc: info: service legacy-services successfully started
[11:18:48] INFO: Starting local supervisor watchdog...
s6-rc: info: service legacy-services: stopping
[11:18:50] INFO: Watchdog restart after closing
s6-svwait: fatal: supervisor died
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
[11:18:50] INFO: Supervisor restart after closing
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped
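For the compliance steps listed before the log, a quick way to see what the Docker daemon actually picked up is to ask it directly rather than re-reading /etc/docker/daemon.json. The sketch below is only that: the expected values (journald, overlay2, cgroup v1) are an assumption about what a supervised install required at the time, since the post only says "correct" options, and the function name is made up for illustration.

```shell
# Sketch: report the log driver, storage driver and cgroup version Docker is using.
# The expected string "journald overlay2 1" is an ASSUMPTION, not taken from the thread.
check_docker_settings() {
    info=$(docker info --format '{{.LoggingDriver}} {{.Driver}} {{.CgroupVersion}}') || return 1
    echo "log-driver, storage-driver, cgroup version: $info"
    [ "$info" = "journald overlay2 1" ]
}
```

Calling check_docker_settings on the host prints the three values and returns non-zero when any of them differs, which at least narrows down whether the daemon.json and kernel-parameter changes took effect after the reboot.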
I’m running out of ideas for how to make it work. I can run HA manually, but Zwavejs2mqtt doesn’t start when run manually without the supervisor, and without it I cannot control the heating and ventilation in my home.
I fully understand that my setup is currently not supported, but I’ve been using it successfully for the past few years; it was supported at the time I installed the supervisor, and it worked perfectly fine until now. Is there an option to disable auto-updates of the supervisor?
I have been quite frustrated as well. I have two identical setups and both exhibit the “crash”, but not always at the same time.
I have tried the suggestions above, but the HA system is always down in the morning.
My issue is that the timers do not work while the system is in its hiatus state. Not good. I discovered through Portainer that the HA container was down. After a docker restart command, things go back to working until the next failure, which could be 24 hours later or maybe even 3 weeks.
From the Docker supervisor logs, it appears that code runs every morning (Eastern time, GMT-5) attempting to do updates; it shuts down 172.30.32.2, which is the container for homeassistant, and that container fails to restart by itself.
So the solution I came up with was to write a script and run it from crontab: basically, it tests port 8123 and, if the port is unresponsive, does a docker container restart. It essentially puts my systems back online but doesn’t solve the problem of why they fail. I have had a suspicion that it has something to do with the backend engine doing system analysis with the mother ship (the failures seem to occur around the same time of day), and now, from reading these responses, I think that is exactly what my cause is.
Here is the script, test_port.sh:
#!/bin/bash
#-----------------start-------------------
# Script to test port status
HOST=$1
PORT=$2
if [ -z "$HOST" ]; then
    echo "host needs to be established"
    exit 1
fi
# Try to open a TCP connection through bash's /dev/tcp; the subshell closes it again
if (echo > "/dev/tcp/${HOST}/${PORT}") 2>/dev/null; then
    # Uncomment this next line to log successes; however, it fills up the log too fast - all we really care about is failure
    #echo "the port $PORT is open on:" $(date) >> /usr/share/hassio/homeassistant/test_8123.log
    exit 0
else
    echo "the port $PORT is closed on:" $(date) >> /usr/share/hassio/homeassistant/test_8123.log
    # Restart every container known to Docker
    docker restart $(docker ps -a -q)
fi
# Note: this puts the log data in the directory available to the file editor for easy troubleshooting
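A possible variation on the script above, offered only as a sketch: it adds a connection timeout so the check cannot hang, and restarts a single named container instead of every container on the host. The container name homeassistant, the 3-second default timeout, and the function names are assumptions for illustration, not part of the original script.

```shell
#!/usr/bin/env bash
# Sketch only: CONTAINER and the default timeout are ASSUMPTIONS, adjust for your setup.
CONTAINER="${CONTAINER:-homeassistant}"

# Return 0 if a TCP connection to host:port succeeds within the timeout.
port_open() {
    # $1 = host, $2 = port, $3 = timeout in seconds (default 3)
    timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Log a line and restart only the named container when the port does not answer.
restart_if_down() {
    if ! port_open "$1" "$2"; then
        echo "port $2 on $1 is closed on: $(date)"
        docker restart "$CONTAINER"
    fi
}
```

Called as restart_if_down localhost 8123 from a script that cron runs every few minutes, this behaves like the original check but leaves unrelated containers alone. Like the original, it only works around the outage and does not explain it.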
I think I have managed to resolve the issue, and the solution was actually provided above. As per my last message, after updating everything on the host system and putting it in tip-top shape, I was still struggling with the supervisor not starting. So I updated the supervisor image via the docker pull command on the host, and then everything started working.
This was one of the worst days to simply update and reboot Ubuntu (yes, another Ubuntu user). Many thanks to those who provided the solution. It took a reboot or two to get it all back on track again.
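The post does not give the exact pull command, and the image name differs per architecture, so the following is just one hedged way to do it: look up the image the existing Supervisor container was created from and pull that. It assumes the container is named hassio_supervisor (worth confirming with docker ps -a); the function name is hypothetical.

```shell
# Sketch: re-pull whatever image the existing Supervisor container was created from.
# ASSUMPTION: the Supervisor container is named hassio_supervisor.
update_supervisor_image() {
    image=$(docker inspect --format '{{.Config.Image}}' hassio_supervisor) || return 1
    echo "pulling $image"
    docker pull "$image"
}
```

After the pull, the Supervisor still has to be restarted (or the host rebooted) before the new image is used.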
Moving to Debian is still on the list, but there’s a lot on that list lately.
My Home Assistant just died with cryptic errors. I was thinking it was a dead SSD, but it turns out my Docker is just on version 19, like many people mentioned here.
Such (literally) breaking changes should have triggered a warning when trying to update.
Trying to update Docker to see if it will run…
EDIT: sudo apt update && sudo apt upgrade && sudo reboot now did the trick.
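To check before rebooting whether the installed Docker is new enough, a small version comparison helps. This is a sketch: the 20.10.17 threshold is simply the version mentioned earlier in this thread, not an authoritative minimum, and version_ge is a helper name made up here. It relies on GNU sort -V.

```shell
# Sketch: succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on the host (threshold taken from this thread, treat it as an ASSUMPTION):
#   version_ge "$(docker version --format '{{.Server.Version}}')" 20.10.17 \
#       || echo "Docker is older than 20.10.17"
```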
Hi,
Same issue for me, but the solution provided doesn’t work. (Not sure if it’s appropriate to reply here or start a new thread?)
I’m running Home Assistant Core 2022.6.7, with the Home Assistant OS running on a Raspberry Pi 3B+.
I’ve recently upgraded after probably about a year, and now the supervisor won’t start.
The solution provided on this page is for Docker I believe, but I’m not sure how to go about fixing this for HA OS. Is the 172.30.32.2:80 address for the supervisor correct for a HA OS system?
Had the same set of problems, eventually got sorted with a combination of the instructions on here, updating my RPi to latest OS, and a few hours of head scratching!
Looking at how Supervisor’s new auto-update “option” works… it seems you were spot on about the reason why they didn’t want it to be optional.
I think this is a good compromise. Don’t allow anything else to update while Supervisor is out of date. That is fine by me. Many people don’t want to touch their installation once it’s working. They don’t necessarily need new features, and updating just for the sake of staying up to date makes no sense when there are usually multiple breaking changes.
A “stable” system should be possible. Stable enough to go months (or years) without needing an update (except for security patches). Having the ability to disable auto-updates is a real help for those of us who value stability of a perfectly fine system.
Well, I did implement it. I’m actually mdegat01 on GitHub; dunno if I mentioned that earlier.
It actually ended up with a few more caveats than I initially realized. When supervisor is out of date, all of the following actions are disabled:
1. updating core
2. updating OS
3. updating addons
4. installing addons
5. adding new addon repositories
6. checking for updates to installed addons
7. restoring a backup made on a later version of supervisor
(to clarify, you can restore a backup, just not one made on a later version of supervisor than you’re currently running)
5 and 6 in particular I didn’t really think about initially. I thought we could at least inform users about updates even if they couldn’t occur while supervisor was behind. But that’s not really safe either since the addon config schema can change.
But yea, the option is now added. Only in the beta for now but should make it to stable soon.
I do have to note though, only the latest version of supervisor is supported. If you do choose to freeze your system and fall behind on updates that’s fine but you will see the unsupported flag. And in issues users will be asked to reproduce them on the latest supervisor.
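For anyone looking for the switch itself: to the best of my knowledge it is exposed through the ha command line as an option on the supervisor. The sketch below assumes the flag is spelled --auto-update, so check ha supervisor options --help on your version before relying on it. The wrapper function name is made up.

```shell
# Sketch: turn off Supervisor auto-updates via the ha CLI.
# ASSUMPTION: the flag is spelled --auto-update; verify with `ha supervisor options --help`.
disable_supervisor_auto_update() {
    ha supervisor options --auto-update=false
}
```

As noted above, a frozen Supervisor blocks the listed actions and shows up as unsupported, so this is a trade-off rather than a free setting.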
It’s just unacceptable. Such a bedroom-teenager approach.
People run their houses on this. What if backups aren’t old enough?
It’s becoming a consistent pattern that an update fully breaks the system.
I am waiting for the moment when this leads to a deadlock, especially on the OS side. Both OS and Supervisor updates can be halted. Eventually a version will be released that does not support an older OS, but the OS cannot be updated because the Supervisor is not on the latest version.
Unfortunately, when most people came to this thread, no failsafe check had been implemented, so the update broke the Supervisor and it was unable to start. A check for the Docker version was added one release later.
I can see the same kind of deadlock arising again.
Home Assistant 2022.10.0
Supervisor 2022.09.1
Operating System 9.0
Frontend 20221005.0 - latest
and am now getting this error with my deconz after recent updates:
07:35:10:280 SC state change failed: xx:xx:xx:xx:xx:xx:xx:23-02-0402
[06:35:12] INFO: Service restart after closing
[07:35:12] INFO: Starting VNC server (local/yes)...
[06:35:12] WARNING: Halt add-on
s6-rc: info: service legacy-services: stopping
In exit
Closing socket listening at 127.0.0.1:5901
2022/10/06 07:35:12 [notice] 124#124: signal 15 (SIGTERM) received from 117, exiting
2022/10/06 07:35:12 [notice] 723#723: exiting
2022/10/06 07:35:12 [notice] 723#723: exit
2022/10/06 07:35:12 [notice] 124#124: signal 17 (SIGCHLD) received from 723
2022/10/06 07:35:12 [notice] 124#124: worker process 723 exited with code 0
2022/10/06 07:35:12 [notice] 124#124: exit
s6-svwait: fatal: supervisor died
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
[06:35:12] INFO: Service restart after closing
s6-rc: info: service s6rc-oneshot-runner successfully stopped
Hi, I’m a super noob; I just used the packaged hassio image for my Pi. How can I enter all those sudo commands? The Terminal does not open, because it seems to rely on the supervisor.
I feel your pain. I had the initial choice of installing the “de facto” supported Pi image, but when I realized the issues you’re having, I trashed that idea and installed the supervisor version on an RPi 4 using Docker containers. The really compelling reason was that I needed heyu to communicate with a USB port to support the legacy X10 environment. The hassio configuration doesn’t allow that.
My suggestion for communicating with the Pi in order to make changes is to use another machine and a terminal emulator like PuTTY. You can do this from nearly any external platform. I have a terminal app on my iPhone that I use in a pinch, but typically I use a Windows machine running PuTTY. Apple works too.