NEW Script to fully DELETE devices & entities from homeassistant

In looking through the documentation for home assistant, I discovered that there is no simple and clean way to fully delete devices and/or their associated entities from home assistant. Although it is possible to hide them from the UI, the data and metadata still remain in the homeassistant_v2.db database and the various registry and restore files in the .storage directory.

Unfortunately, I found that my homeassistant instance was cluttered with a bunch of “foreign” (i.e. neighbors) MQTT devices & entities as well as legacy devices and entities that are created whenever I change batteries in certain MQTT sensors as well as sensors that I have tested and discarded.

I wanted a clean, safe and easy script that allows one to fully delete all data for individual devices and entities.

So, I created the following bash script that you can run from a ‘ssh’ shell that will do exactly that.

Specifically it cleans up the following:

  • Delete devices ‘.storage’ info from:
    • core.device_registry
  • Delete entity ‘.storage’ info from:
    • core.entity_registry
    • core.restore_state
    • homeassistant.exposed_entities
  • Delete corresponding entity data from the following home-assistant_v2.db tables:
    • states_meta
    • states
    • statistics_meta
    • statistics
    • statistics_short_term
    • state_attributes (delete orphan attributes)

The only thing it doesn’t touch are the event data tables although I may add that later…

I tried to have as much error checking as possible as wells as auto-creation of time-stamped backups before any files are modified.

Enjoy! (feedback is welcome!)

#!/bin/bash
#===============================================================================
# DESCRIPTION:
# Delete devices and/or entities *completely from homeassistant registries & databases
#    - Delete devices '.storage' info from:
#        core.device_registry
#    - Delete entity '.storage' info from:
#        core.entity_registry
#        core.restore_state
#        homeassistant.exposed_entities
#    - Delete corresponding entity data from the following home-assistant_v2.db tables:
#        states_meta
#        states
#        statistics_meta
#        statistics
#        statistics_short_term
#        state_attributes
#    - NOTE: Timestamped backups are generated for every modified database or registry
#    - NOTE: Deleted devices can only be accessed by id (-D), not by name
#    - NOTE: MQTT devices may consistently be republished if topics are retained.
#            So, you need to delete the corresponding topics from the 'homeassistant' tree
#            e.g., use MQTT Explorer
#            Until then, you may want to stop MQTT from re-adding any existing published
#            entities *and* corresponding devices, by adding the following to 'config.yaml'
#                mqtt:n
#                  discovery: false
#
#===============================================================================
# USAGE:
#    Usage: delete_device_entity [-d DEV_NAME | -D DEV_ID | -e ENT_NAME | -E ENT_ID ] [-A]
#        where you need to specify exactly one of: -d, -D, -e, -E
#        If -d or -D specified OR -e or -E plus -A specified, then
#            the corresponding device and ALL its entities are deleted
#        If -e or -E specified with no '-A', then ONLY the specified entity is deleted
#
#===============================================================================
# VERSION: 0.5
#
#===============================================================================
#
# CHANGELOG
#     0.5 (June 2024)
#      - First official release
#===============================================================================
#
# AUTHOR:
#    Jeff Kosowsky
#    Copyright June 2024
#
#===============================================================================
#### VARIABLES

BASE=/homeassistant
DATABASE="home-assistant_v2.db"
DATABASE="$BASE/$DATABASE"
STORAGE="$BASE/.storage"

DEVREGISTRY=core.device_registry
ENTREGISTRY=core.entity_registry
RESTOREREGISTRY=core.restore_state
EXPOSEDENTITIES=homeassistant.exposed_entities

#===============================================================================
#PARSE OPTIONS

PROG=${0##*/}
usage() {
    cat <<EOF >&2
Usage: $PROG [-d DEV_NAME | -D DEV_ID | -e ENT_NAME | -E ENT_ID ] [-A]
    where you need to specify exactly one of: -d, -D, -e, -E
        If -d or -D specified OR -e or -E plus -A specified, then
            the corresponding device and all its entities are deleted
        If -e or -E specified with no '-A', then the specified entity alone is deleted
EOF
    exit 1
    }

NUMDEV=0
while getopts hd:D:e:E:A options; do
    case "${options}" in
	d) DEVNAME=${OPTARG}
	    ((NUMDEV++))
	    ;;
	D) DEVID=${OPTARG}
	   if ! [[ "$DEVID" =~ ^[[:xdigit:]]{32}$ ]]; then
	       echo "Error: DEV_ID must be 32 char hex" >&2
	       usage
	   fi
	   ((NUMDEV++))
	   ;;
	e) ENTNAME=${OPTARG}
	   ((NUMDEV++))
	   ;;
	E) ENTID=${OPTARG}
	   ((NUMDEV++))
	    if ! [[ "$ENTID" =~ ^[[:xdigit:]]{32}$ ]]; then
		echo "Error: ENT_ID must be 32 char hex" >&2
		usage
	    fi	
	    ;;
	A) ALL=1
	   ;;
	:) echo "Error: -${OPTARG} requires an argument." >&2
	   usage
	   ;;
	*) usage
	   ;;
    esac
done

if [[ $NUMDEV -ne 1 || -n "$ALL" && !( -n "$ENTID" || -n "$ENTNAME" ) ]]; then
    echo "Error: Need to enter exactly one type option (-d, -D, -e, -E) plus optional '-A' if entity specified" >&2
    usage
fi
shift $((OPTIND-1))
[ -n "$1" ] && usage #No other arguments

#===============================================================================
#### FUNCTIONS
myjoin() {
    local separator="$1"
    shift
    local first="$1"
    shift
    printf "%s" "$first" "${@/#/$separator}"
}

print_exit() {
    echo "*** Don't forget to restart home assistant (ha core start)"
}

#===============================================================================
#### FILE CHECKS
if ! [ -e "$DATABASE" ]; then
    echo "Error: Can't find database '$DATABASE'..." >&2
    exit 2
fi

if ! [ -d "$STORAGE" ]; then
    echo "Error: Can't find storage directory '$STORAGE'..." >&2
    exit 3
fi

#===============================================================================
#### Get device and entity data

ha core status >&/dev/null
if [ $? -ne 1 ]; then #Not stopped so presumably running
    echo "Stopping home assistant..."
    ha core stop >&/dev/null
    if [ $? -ne 0 ]; then
	echo "Error: Couldn't stop home assistant..."
	exit 4
    fi
fi

cd $STORAGE

#Get metadata for state and statistics tables from $DATABASE
declare -A METADATA #Associative array for state metadata id: entity_id -> metadata_id
while IFS='|' read meta entity; do
    METADATA["$entity"]="$meta"
done <<< $(sqlite3 $DATABASE "SELECT metadata_id, entity_id FROM states_meta")
#for id in ${!METADATA[@]}; do echo "$id ---> ${METADATA[$id]}"; done #JJKDEBUG

declare -A METADATA_STAT #Associative array for statistic metadata id: statistic_id -> id
while IFS='|' read id statistic; do
    METADATA_STAT["$statistic"]="$id"
done <<< $(sqlite3 $DATABASE "SELECT id, statistic_id FROM statistics_meta")
#for id in ${!METADATA_STAT[@]}; do echo "$id ---> ${METADATA_STAT[$id]}"; done #JJKDEBUG

## Get Device id & Name
if [ -n "$ALL" ]; then #Find device containing entity if '-e' or '-E' together with '-A' specified
    if [ -n "$ENTID" ]; then
	DEVID=$(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.id == "'$ENTID'") | .device_id' | tr -d \" 2>/dev/null)
    elif [ -n "$ENTNAME" ]; then
	DEVID=$(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.entity_id == "'$ENTNAME'") | .device_id' | tr -d \" 2>/dev/null)
    fi
    if [ -z "$DEVID" ]; then
	echo "Error: Can't find device id correspinding to entity '${ENTID:-$ENTNAME}' in '$DEVREGISTRY'" >&2
	exit 5
    fi    
    unset ENTID ENTNAME
fi
    
if [ -n "$DEVID" ]; then #Look up by DEVID
    { read -r DEVNAME; read -r  DEVUSERNAME; } \
	<<< $(cat $DEVREGISTRY | jq -M '.data.devices.[], .data.deleted_devices.[] | select(.id == "'$DEVID'") | .name, .name_by_user' | tr -d \" 2>/dev/null)

    if [ -z "$DEVNAME" ]; then
	echo "Error: Can't find device id '$DEVID' in '$DEVREGISTRY'" >&2
	exit 6
    fi
elif [ -n "$DEVNAME" ]; then  #Look up by DEVNAME
    { read -r DEVID; read -r  DEVUSERNAME; } \
	<<< $(cat $DEVREGISTRY | jq -M '.data.devices.[], .data.deleted_devices.[] | select(.name == "'$DEVNAME'") | .id, .name_by_user' | tr -d \" 2>/dev/null)

    if [ -z "$DEVID" ]; then
	echo "Error: Can't find device name '$DEVNAME' in '$DEVREGISTRY'" >&2
	exit 7
    fi
fi

#Get Entity id's and associated info
if [ -n "$ENTID" ]; then
    ENT_IDS=($ENTID)
    if [ -z "$(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.id == "'$ENTID'")')" ]; then
	echo "Error: Can't find '$ENTID' in '$ENTREGISTRY'" >&2
	exit 8
    fi
elif [ -n "$ENTNAME" ]; then
    ENT_IDS=($(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.entity_id == "'$ENTNAME'") | .id' | tr -d \" 2>/dev/null))
    if [ ${#ENT_IDS[@]} -eq 0 ]; then
	echo "Error: Can't find any entities corresponding to '$ENTNAME' in '$ENTREGISTRY'" >&2
	exit 9
    fi
else #Device + Entities
    echo -e "Device: Name: $DEVNAME [$DEVID]\tUser name: $DEVUSERNAME"

    #Get Corresponding Entity id's
    ENT_IDS=($(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.device_id == "'$DEVID'") | .id' | tr -d \" 2>/dev/null))
    if [ ${#ENT_IDS[@]} -eq 0 ]; then
	echo -n "Can't find any entities corresponding to '$DEVNAME' ($DEVID) in '$ENTREGISTRY'. Delete device anyway? (y/n) "
	read -e ANSWER
	echo
	if [ "$ANSWER" != "y" ]; then
	    echo "Aborting..."
	    exit 10
	fi
    fi
fi
#for id in ${ENT_IDS[@]}; do echo "|$id|"; done #JJKDEBUG

[ ${#ENT_IDS[@]} -ne 0 ] && printf "\n%-32s %-7s %-7s %-20s %-20s %-20s\n" ID META_ID STAT_ID ENTITY_ID "(UNIQUE_ID)" "[ORIGINAL_NAME]"
for id in ${ENT_IDS[@]}; do
    #Note first need to declare assoc arrays since index is hex and can be interpeted as numbered base for an indexed array
    declare -A ENTIDNAMES[$id] ENTUIDS[$id] ENTORIGNAMES[$id] METAIDS[$id] METASTATIDS[$id]

    { read -r entity_id; read -r  unique_id; read -r  original_name; } \
	<<< $(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.id == "'$id'") | .entity_id, .unique_id, .original_name' | tr -d \")

    ENTIDNAMES[$id]=$entity_id
    ENTUIDS[$id]=$unique_id
    ENTORIGNAMES[$id]=$original_name
    METAIDS[$id]=${METADATA["$entity_id"]}
    [ -z "${METAIDS[$id]}" ] && METAIDS[$id]='-'
    METASTATIDS[$id]=${METADATA_STAT["$entity_id"]}
    [ -z "${METASTATIDS[$id]}" ] && METASTATIDS[$id]='-'

    printf "%s %-7s %-7s %s (%s) [%s]\n" "$id" "${METAIDS[$id]}" "${METASTATIDS[$id]}" "${ENTIDNAMES[$id]}" "${ENTUIDS[$id]}" "${ENTORIGNAMES[$id]}"
    [ "${METAIDS[$id]}" = '-' ] && echo -e "     WARNING: No 'metadata_id' stored in database for $entity_id\n"
done

echo -ne "\nAre you sure you want to COMPLETELY delete the above ${DEVID:+device &} ${#ENT_IDS[@]} entities (including all associated db data)?(y/n) "
read -e ANSWER
echo
if [ "$ANSWER" != "y" ]; then
    echo "Aborting..."
    exit 11
fi

#===============================================================================
#### Delete entries in '.storage' registries

DATE="$(date +%Y%m%d-%H%M%S)"

### Delete entry in $DEVREGISTRY (core.device_registry) if device
if [ -n "$DEVID" ]; then
   cp -a $DEVREGISTRY{,-$DATE}
   echo "...'$DEVREGISTRY' backed up to '${DEVREGISTRY}-$DATE'"
   cat $DEVREGISTRY-$DATE | jq -M '.data |= del(.devices.[], .deleted_devices.[] | select(.id == "'$DEVID'"))' >| $DEVREGISTRY
   if [ $? -ne 0 ]; then #Error - unwind & exit
       echo "Error: Failed to delete *device* entries from '$DEVREGISTRY', unwinding..."
       mv ${DEVREGISTRY}{-$DATE,}       
       exit 12
   fi
   echo "   DELETED *device* entries from '$DEVREGISTRY'"
fi

if [ ${#ENT_IDS[@]} -eq 0 ];then #Device only so exit
    print_exit
    exit 0
fi

### Delete entries in $ENTREGISTRY (core.entity_registry)
cp -a $ENTREGISTRY{,-$DATE}
echo "...'$ENTREGISTRY' backed up to '${ENTREGISTRY}-$DATE'"
ENTJOINS=".entity_id == \"$(myjoin "\" or .entity_id == \"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $ENTREGISTRY-$DATE | jq -M '.data |= del(.entities.[], .deleted_entities.[] | select('"$ENTJOINS"'))' >|  $ENTREGISTRY
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$ENTREGISTRY', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    exit 13
fi
echo "   DELETED *entity* entries from '$ENTREGISTRY'"

### Delete entries in $RESTOREREGISTRY (core.restore_state)
cp -a $RESTOREREGISTRY{,-$DATE}
echo "...'$RESTOREREGISTRY' backed up to '${RESTOREREGISTRY}-$DATE'"
ENTJOINS=".state.entity_id == \"$(myjoin "\" or .state.entity_id == \"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $RESTOREREGISTRY-$DATE | jq -M '.data |= del(.[] | select('"$ENTJOINS"'))' >| $RESTOREREGISTRY
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$RESTOREREGISTRY', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    mv ${RESTOREREGISTRY}{-$DATE,}    
    exit 14
fi
echo "   DELETED *entity* entries from '$RESTOREREGISTRY'"

### Delete entries in $EXPOSEDENTITIES(homeassistant.exposed_entities)
cp -a $EXPOSEDENTITIES{,-$DATE}
echo "...'$EXPOSEDENTITIES' backed up to '${EXPOSEDENTITIES}-$DATE'"
ENTJOINS=".\"$(myjoin "\", .\"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $EXPOSEDENTITIES-$DATE | jq -M '.data |= del(.exposed_entities | select('"$ENTJOINS"'))' >| ${EXPOSEDENTITIES}
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$EXPOSEDENTITIES', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    mv ${RESTOREREGISTRY}{-$DATE,}
    mv ${EXPOSEDENTITIES}{-$DATE,}
    exit 15
fi
echo "   DELETED *entity* entries from '$EXPOSEDENTITIES'"

#===============================================================================
#### Delete entries in 'home-assistant_v2.db'

cp -a $DATABASE{,-$DATE}
echo "...'$DATABASE' backed up to '$DATABASE-$DATE'"

#Needed later (NOTE: 'comm' needs lexicographical sort)
ATTRIBUTES_USED_ORIG="$(sqlite3 $DATABASE "SELECT DISTINCT attributes_id FROM states ORDER BY CAST(attributes_id AS text)")"

if ! [[ $(IFS=; echo "${METAIDS[*]}")  =~ ^-*$ ]]; then ## Consists only of '-'
    METAIDJOINS="metadata_id = '$(myjoin "' OR metadata_id = '" ${METAIDS[@]})'"
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM states_meta WHERE $METAIDJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'states_meta' table"

    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM states WHERE $METAIDJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'states' table"
fi

if ! [[ $(IFS=; echo "${METASTATIDS[*]}")  =~ ^-*$ ]]; then ## Consists only of '-'
    IDSTATJOINS="id = '$(myjoin "' OR id = '" ${METASTATIDS[@]})'"
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM statistics_meta WHERE $IDSTATJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'statistics_meta' table"

    METAIDSTATJOINS="metadata_id = '$(myjoin "' OR metadata_id = '" ${METASTATIDS[@]})'"
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM statistics WHERE $METAIDSTATJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'statistics' table"

    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM statistics_short_term WHERE $METAIDSTATJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'statistics_short_term' table"
fi

#===============================================================================
#### Delete orphaned entries in 'states_attributes' database

#NOTE: 'comm' needs lexicographical sort
ATTRIBUTES_USED="$(sqlite3 $DATABASE "SELECT DISTINCT attributes_id FROM states ORDER BY CAST(attributes_id AS text)")"
ATTRIBUTES_ALL="$(sqlite3 $DATABASE "SELECT attributes_id FROM state_attributes ORDER BY CAST(attributes_id AS text)")"

NEWLY_ORPHANED=($(comm -2 -3 <(echo "$ATTRIBUTES_USED_ORIG") <(echo "$ATTRIBUTES_USED")))
ALL_ORPHANED=($(comm -2 -3 <(echo "$ATTRIBUTES_ALL") <(echo "$ATTRIBUTES_USED")))
MISSING=($(comm -1 -3 <(echo "$ATTRIBUTES_ALL") <(echo "$ATTRIBUTES_USED")))

echo "There are '${#NEWLY_ORPHANED[@]}' NEWLY orphaned attributes and '${#ALL_ORPHANED[@]}' TOTAL orphaned attributes in 'state_attributes' table"
echo "(Also, there are ${#MISSING[@]} attributes that are referenced in 'states' but not found in 'state_attributes table)"

if [ ${#ALL_ORPHANED} -gt 0 ]; then
    echo -ne "\nDelete orphaned attributes from 'state_attributes' table?([N]ewly/[T]otal/[A]bort) "
    read -e ANSWER
    echo
    case $ANSWER in
	N) DELATTRS="$(myjoin ',' "${NEWLY_ORPHANED[@]}")"
	   ;;
	T) DELATTRS="$(myjoin ',' "${ALL_ORPHANED[@]}")"
	   ;;
	*) DELATTRS=""
	   ;;
    esac
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM state_attributes WHERE attributes_id IN ($DELATTRS); SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'state_attributes' table"
fi

#===============================================================================
echo -e "\n***If device/entities are MQTT integrations, don't forget to unpublish the entities!"
print_exit
4 Likes

Looks handy.
Realized I couldn’t jsut run it from within vs code addon without it terminating everything. Tried logging in direct but still no luck.

Also - would be really handy if we could have it show a list of disabled entiteis and choose to delete those, or their entity data.

I ssh into HA to run the scripts.

Your suggestion about showing a list of disabled entities is interesting. I will add to my TODO list if I have time

Note I have also created scripts to (relatively) safely rename entities, renumber states, and renumber statistics

hello, is this script is also available for container’s instance?
I have 7000 devices “Mobile App” (35000 entities) to delete

it should work if you can get a bash shell and can access sqlite3.
Alternatively, you can copy /homeassistant/home_assistant_v2.db and /homeassistant/.storage to another machine and run the scripts there (adjusting the directories as needed)

It is broken.

It errors immediately with: ./clear.sh -d GW2000 jq: error: syntax error, unexpected ‘[’, expecting FORMAT or QQSTRING_START (Unix shell quoting issues?) at , line 1: .data.devices., .data.deleted_devices. | select(.name == “GW2000”) | .id, .name_by_user jq: 1 compile error Error: Can’t find device name ‘GW2000’ in ‘core.device_registry’

The device exists:here is a grep of the core.device_registry
cat .storage/core.device_registry | grep GW2000 {“area_id”:null,“config_entries”:[“6d47fb6543cc898f88523ecf20880035”],“configuration_url”:null,“connections”:,“created_at”:“2024-09-06T08:37:58.273045+00:00”,“disabled_by”:null,“entry_type”:null,“hw_version”:null,“id”:“95808cf5fc5a0914abb0458d4d1e9d77”,“identifiers”:[[“mqtt”,“3A6968E27B39EAE15EAC758A5DD4BFFB”]],“labels”:,“manufacturer”:“Ecowitt”,“model”:“GW2000A”,“model_id”:null,“modified_at”:“2024-09-06T08:37:58.273118+00:00”,“name_by_user”:null,“name”:“GW2000”,“primary_config_entry”:“6d47fb6543cc898f88523ecf20880035”,“serial_number”:null,“sw_version”:“GW2000A_V3.1.1”,“via_device_id”:null}