2024.5+: Tracking down instability issues caused by integrations

If you find your system unstable because of a misbehaving integration, there are a few methods for identifying and recovering from the problem.

It is recommended to use Home Assistant’s built-in debug mode in conjunction with asyncio debug mode.

Using Home Assistant's built-in debug mode in 2024.5.x or later (preferred)

If you are using version 2024.5.x or later, Home Assistant has a built-in debug mode:
https://www.home-assistant.io/docs/configuration/troubleshooting/#handling-unexpected-restarts-or-crashes

Note that the debug option is not valid in versions older than 2024.5.x. Adding it on an older version causes configuration validation to fail.

  1. Add the following to configuration.yaml, and restart.
# Example configuration.yaml entry
homeassistant:
  debug: true
  2. Download the logs and post any that contain a RuntimeError in a GitHub issue

Home Assistant debug mode has a ~1% performance cost.

Enabling asyncio debug mode at run time (preferred)

If the system is coming up ok, and you have already enabled Home Assistant debug mode (or you are using a version older than 2024.5.x), but you find that you are getting unexpected restarts:

  1. Install the profiler integration
  2. Enable the asyncio debug service as soon as possible after startup (see the Profiler integration documentation)
  3. Watch the logs for RuntimeError: Non-thread-safe operation
  4. Download the logs and post them with the full trace in a GitHub issue

asyncio debug mode has a ~10% performance cost.
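
If the log is large, a small script can pull out just the tracebacks worth posting. The following is a minimal sketch, not part of Home Assistant: the function name `find_runtime_errors` is hypothetical, and it assumes tracebacks appear in the log in the standard Python layout (a `Traceback (most recent call last):` line, indented frame lines, then the unindented exception line).

```python
TRACEBACK_START = "Traceback (most recent call last):"


def find_runtime_errors(log_text: str) -> list[str]:
    """Return each traceback block in log_text that ends in a RuntimeError."""
    blocks = []
    current = None  # lines of the traceback being collected, or None
    for line in log_text.splitlines():
        if line.startswith(TRACEBACK_START):
            current = [line]
        elif current is not None:
            current.append(line)
            # Frame lines are indented; the first unindented line is the
            # exception itself and closes the traceback.
            if not line.startswith(" "):
                if line.startswith("RuntimeError"):
                    blocks.append("\n".join(current))
                current = None
    return blocks
```

Read the downloaded log into a string, pass it to `find_runtime_errors`, and paste the returned blocks into the issue. Chained exceptions are not stitched together by this sketch, so check the surrounding log lines as well.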

Enabling asyncio debug mode before startup (fallback plan)

If you cannot get the system to start, you can enable debugpy (https://www.home-assistant.io/integrations/debugpy/), which enables asyncio debug.

# Example configuration.yaml entry
debugpy:

debugpy has a ~40% performance cost and should not be used long term.

Manually enabling safe mode (last resort)

This is an advanced topic of last resort.

If you cannot get the system to start at all, you can manually enable safe mode.

  1. Enable port 22222 using https://developers.home-assistant.io/docs/operating-system/debugging/
  2. Connect to ssh over port 22222
  3. Run the following: ha core restart --safe-mode

As a last resort, if you cannot get core to restart, you can do a forced restart with:
docker restart homeassistant&

Integration blocking startup
  1. Enable debug logs for the setup and bootstrap process:
# Example configuration.yaml entries
logger:
  default: info
  logs:
    homeassistant.bootstrap: debug
    homeassistant.setup: debug
    homeassistant.loader: debug
    homeassistant.config_entries: debug
  2. Restart
  3. Download logs and check for tasks that are delaying startup
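
To rank integrations by setup time once the debug logs are downloaded, something like the sketch below may help. It is an illustration only: `slowest_setups` is a hypothetical helper, and the line pattern "Setup of domain <name> took <seconds> seconds" is an assumption about the log wording that should be checked against your own log before relying on it.

```python
import re

# ASSUMPTION: setup timing lines contain
# "Setup of domain <name> took <seconds> seconds"; verify against your log.
SETUP_LINE = re.compile(r"Setup of domain (\S+) took (\d+(?:\.\d+)?) seconds")


def slowest_setups(log_text: str, top: int = 10) -> list[tuple[str, float]]:
    """Return up to `top` (domain, seconds) pairs, slowest first."""
    timings = [
        (match.group(1), float(match.group(2)))
        for match in SETUP_LINE.finditer(log_text)
    ]
    return sorted(timings, key=lambda item: item[1], reverse=True)[:top]
```

Integrations at the top of the list are the first candidates to investigate or temporarily disable.
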
Sluggish performance or increased CPU usage

If the system’s response is sluggish or you observe increased CPU usage, the profiler integration can generate a callgrind file that can help determine the root cause.

  1. Install the profiler integration
  2. Start the profiler using the profiler.start service in Developer tools
  3. Check the persistent notifications area for a link to the callgrind.out.XXXX file
  4. Download the callgrind.out.XXXX file and open it in qcachegrind using the instructions in the profiler documentation.
Finding a run-away template
  1. Install the profiler integration
  2. Call the profiler.dump_log_objects service in Developer tools with:
service: profiler.dump_log_objects
data:
  type: RenderInfo

Download the logs and look for lines with RenderInfo:
<RenderInfo Template<template=({{ ((states('sensor.energy_usage') | float(default=0)) + (states('sensor.energy_usage_2') | float(default=0))) / 1000 }}) renders=8> all_states=False all_states_lifecycle=False domains=frozenset() domains_lifecycle=frozenset() entities=frozenset({'sensor.energy_usage_2', 'sensor.energy_usage'}) rate_limit=None has_time=False exception=None is_static=False>

Look for anything with a very high number of renders. In the example above it's quite low, with renders=8.
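
With many templates, sorting the RenderInfo lines by render count is quicker than reading them one by one. Here is a minimal sketch that assumes each entry sits on a single log line in the shape shown above; `busiest_templates` is a hypothetical helper name, and templates that span several lines are not handled.

```python
import re

# Captures the template source and the render count from a line shaped like
# <RenderInfo Template<template=(...) renders=N> ...>
RENDER_INFO = re.compile(
    r"<RenderInfo Template<template=\((.*?)\) renders=(\d+)>"
)


def busiest_templates(log_text: str, top: int = 10) -> list[tuple[int, str]]:
    """Return up to `top` (renders, template) pairs, most renders first."""
    results = []
    for line in log_text.splitlines():
        match = RENDER_INFO.search(line)
        if match:
            results.append((int(match.group(2)), match.group(1)))
    results.sort(key=lambda item: item[0], reverse=True)
    return results[:top]
```

A template whose count is far above the rest is the likely run-away.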

Tracking down a memory leak of Python objects
  1. Install the profiler integration
  2. Call the profiler.start_log_objects service in Developer tools with the default interval of 30s
  3. Let the logger run for about an hour
  4. Call the profiler.stop_log_objects service in Developer tools
  5. Download the logs and look for object counts that are still growing at the end of the log
  6. If you are unsure what the object is, call the profiler.dump_log_objects service in Developer tools with:
service: profiler.dump_log_objects
data:
  type: LeakingObjectNameHere
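
For step 5, "still growing at the end of the log" can be checked mechanically once the counts are out of the log. The sketch below deliberately skips log parsing, since the exact line format is not shown here. It assumes you have already collected the counts into a chronological list of `{type_name: count}` dictionaries. `still_growing` and the `tail` parameter are hypothetical names chosen for illustration.

```python
def still_growing(snapshots: list[dict[str, int]], tail: int = 3) -> dict[str, int]:
    """Return types whose count rose in each of the last `tail` intervals.

    snapshots: chronological list of {type_name: object_count} mappings.
    The returned value for each type is its total growth over those intervals.
    """
    if len(snapshots) < tail + 1:
        return {}  # not enough data to judge a trend
    recent = snapshots[-(tail + 1):]
    growing = {}
    for name in recent[-1]:
        counts = [snap.get(name, 0) for snap in recent]
        # Keep only types that increased in every consecutive pair.
        if all(later > earlier for earlier, later in zip(counts, counts[1:])):
            growing[name] = counts[-1] - counts[0]
    return growing
```

Types that fluctuate up and down are filtered out. Whatever remains is a candidate to pass as `type` to profiler.dump_log_objects.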
