NEW Script to fully DELETE devices & entities from homeassistant

In looking through the documentation for home assistant, I discovered that there is no simple and clean way to fully delete devices and/or their associated entities from home assistant. Although it is possible to hide them from the UI, the data and metadata still remain in the homeassistant_v2.db database and the various registry and restore files in the .storage directory.

Unfortunately, I found that my homeassistant instance was cluttered with a bunch of “foreign” (i.e. neighbors) MQTT devices & entities as well as legacy devices and entities that are created whenever I change batteries in certain MQTT sensors as well as sensors that I have tested and discarded.

I wanted a clean, safe and easy script that allows one to fully delete all data for individual devices and entities.

So, I created the following bash script that you can run from a ‘ssh’ shell that will do exactly that.

Specifically it cleans up the following:

  • Delete devices ‘.storage’ info from:
    • core.device_registry
  • Delete entity ‘.storage’ info from:
    • core.entity_registry
    • core.restore_state
    • homeassistant.exposed_entities
  • Delete corresponding entity data from the following home-assistant_v2.db tables:
    • states_meta
    • states
    • statistics_meta
    • statistics
    • statistics_short_term
    • state_attributes (delete orphan attributes)

The only thing it doesn’t touch are the event data tables although I may add that later…

I tried to have as much error checking as possible as wells as auto-creation of time-stamped backups before any files are modified.

Enjoy! (feedback is welcome!)

#!/bin/bash
#===============================================================================
# DESCRIPTION:
# Delete devices and/or entities *completely from homeassistant registries & databases
#    - Delete devices '.storage' info from:
#        core.device_registry
#    - Delete entity '.storage' info from:
#        core.entity_registry
#        core.restore_state
#        homeassistant.exposed_entities
#    - Delete corresponding entity data from the following home-assistant_v2.db tables:
#        states_meta
#        states
#        statistics_meta
#        statistics
#        statistics_short_term
#        state_attributes
#    - NOTE: Timestamped backups are generated for every modified database or registry
#    - NOTE: Deleted devices can only be accessed by id (-D), not by name
#    - NOTE: MQTT devices may consistently be republished if topics are retained.
#            So, you need to delete the corresponding topics from the 'homeassistant' tree
#            e.g., use MQTT Explorer
#            Until then, you may want to stop MQTT from re-adding any existing published
#            entities *and* corresponding devices, by adding the following to 'config.yaml'
#                mqtt:n
#                  discovery: false
#
#===============================================================================
# USAGE:
#    Usage: delete_device_entity [-d DEV_NAME | -D DEV_ID | -e ENT_NAME | -E ENT_ID ] [-A]
#        where you need to specify exactly one of: -d, -D, -e, -E
#        If -d or -D specified OR -e or -E plus -A specified, then
#            the corresponding device and ALL its entities are deleted
#        If -e or -E specified with no '-A', then ONLY the specified entity is deleted
#
#===============================================================================
# VERSION: 0.5
#
#===============================================================================
#
# CHANGELOG
#     0.5 (June 2024)
#      - First official release
#===============================================================================
#
# AUTHOR:
#    Jeff Kosowsky
#    Copyright June 2024
#
#===============================================================================
#### VARIABLES

BASE=/homeassistant
DATABASE="home-assistant_v2.db"
DATABASE="$BASE/$DATABASE"
STORAGE="$BASE/.storage"

DEVREGISTRY=core.device_registry
ENTREGISTRY=core.entity_registry
RESTOREREGISTRY=core.restore_state
EXPOSEDENTITIES=homeassistant.exposed_entities

#===============================================================================
#PARSE OPTIONS

PROG=${0##*/}
usage() {
    cat <<EOF >&2
Usage: $PROG [-d DEV_NAME | -D DEV_ID | -e ENT_NAME | -E ENT_ID ] [-A]
    where you need to specify exactly one of: -d, -D, -e, -E
        If -d or -D specified OR -e or -E plus -A specified, then
            the corresponding device and all its entities are deleted
        If -e or -E specified with no '-A', then the specified entity alone is deleted
EOF
    exit 1
    }

NUMDEV=0
while getopts hd:D:e:E:A options; do
    case "${options}" in
	d) DEVNAME=${OPTARG}
	    ((NUMDEV++))
	    ;;
	D) DEVID=${OPTARG}
	   if ! [[ "$DEVID" =~ ^[[:xdigit:]]{32}$ ]]; then
	       echo "Error: DEV_ID must be 32 char hex" >&2
	       usage
	   fi
	   ((NUMDEV++))
	   ;;
	e) ENTNAME=${OPTARG}
	   ((NUMDEV++))
	   ;;
	E) ENTID=${OPTARG}
	   ((NUMDEV++))
	    if ! [[ "$ENTID" =~ ^[[:xdigit:]]{32}$ ]]; then
		echo "Error: ENT_ID must be 32 char hex" >&2
		usage
	    fi	
	    ;;
	A) ALL=1
	   ;;
	:) echo "Error: -${OPTARG} requires an argument." >&2
	   usage
	   ;;
	*) usage
	   ;;
    esac
done

if [[ $NUMDEV -ne 1 || -n "$ALL" && !( -n "$ENTID" || -n "$ENTNAME" ) ]]; then
    echo "Error: Need to enter exactly one type option (-d, -D, -e, -E) plus optional '-A' if entity specified" >&2
    usage
fi
shift $((OPTIND-1))
[ -n "$1" ] && usage #No other arguments

#===============================================================================
#### FUNCTIONS
myjoin() {
    local separator="$1"
    shift
    local first="$1"
    shift
    printf "%s" "$first" "${@/#/$separator}"
}

print_exit() {
    echo "*** Don't forget to restart home assistant (ha core start)"
}

#===============================================================================
#### FILE CHECKS
if ! [ -e "$DATABASE" ]; then
    echo "Error: Can't find database '$DATABASE'..." >&2
    exit 2
fi

if ! [ -d "$STORAGE" ]; then
    echo "Error: Can't find storage directory '$STORAGE'..." >&2
    exit 3
fi

#===============================================================================
#### Get device and entity data

ha core status >&/dev/null
if [ $? -ne 1 ]; then #Not stopped so presumably running
    echo "Stopping home assistant..."
    ha core stop >&/dev/null
    if [ $? -ne 0 ]; then
	echo "Error: Couldn't stop home assistant..."
	exit 4
    fi
fi

cd $STORAGE

#Get metadata for state and statistics tables from $DATABASE
declare -A METADATA #Associative array for state metadata id: entity_id -> metadata_id
while IFS='|' read meta entity; do
    METADATA["$entity"]="$meta"
done <<< $(sqlite3 $DATABASE "SELECT metadata_id, entity_id FROM states_meta")
#for id in ${!METADATA[@]}; do echo "$id ---> ${METADATA[$id]}"; done #JJKDEBUG

declare -A METADATA_STAT #Associative array for statistic metadata id: statistic_id -> id
while IFS='|' read id statistic; do
    METADATA_STAT["$statistic"]="$id"
done <<< $(sqlite3 $DATABASE "SELECT id, statistic_id FROM statistics_meta")
#for id in ${!METADATA_STAT[@]}; do echo "$id ---> ${METADATA_STAT[$id]}"; done #JJKDEBUG

## Get Device id & Name
if [ -n "$ALL" ]; then #Find device containing entity if '-e' or '-E' together with '-A' specified
    if [ -n "$ENTID" ]; then
	DEVID=$(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.id == "'$ENTID'") | .device_id' | tr -d \" 2>/dev/null)
    elif [ -n "$ENTNAME" ]; then
	DEVID=$(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.entity_id == "'$ENTNAME'") | .device_id' | tr -d \" 2>/dev/null)
    fi
    if [ -z "$DEVID" ]; then
	echo "Error: Can't find device id correspinding to entity '${ENTID:-$ENTNAME}' in '$DEVREGISTRY'" >&2
	exit 5
    fi    
    unset ENTID ENTNAME
fi
    
if [ -n "$DEVID" ]; then #Look up by DEVID
    { read -r DEVNAME; read -r  DEVUSERNAME; } \
	<<< $(cat $DEVREGISTRY | jq -M '.data.devices.[], .data.deleted_devices.[] | select(.id == "'$DEVID'") | .name, .name_by_user' | tr -d \" 2>/dev/null)

    if [ -z "$DEVNAME" ]; then
	echo "Error: Can't find device id '$DEVID' in '$DEVREGISTRY'" >&2
	exit 6
    fi
elif [ -n "$DEVNAME" ]; then  #Look up by DEVNAME
    { read -r DEVID; read -r  DEVUSERNAME; } \
	<<< $(cat $DEVREGISTRY | jq -M '.data.devices.[], .data.deleted_devices.[] | select(.name == "'$DEVNAME'") | .id, .name_by_user' | tr -d \" 2>/dev/null)

    if [ -z "$DEVID" ]; then
	echo "Error: Can't find device name '$DEVNAME' in '$DEVREGISTRY'" >&2
	exit 7
    fi
fi

#Get Entity id's and associated info
if [ -n "$ENTID" ]; then
    ENT_IDS=($ENTID)
    if [ -z "$(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.id == "'$ENTID'")')" ]; then
	echo "Error: Can't find '$ENTID' in '$ENTREGISTRY'" >&2
	exit 8
    fi
elif [ -n "$ENTNAME" ]; then
    ENT_IDS=($(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.entity_id == "'$ENTNAME'") | .id' | tr -d \" 2>/dev/null))
    if [ ${#ENT_IDS[@]} -eq 0 ]; then
	echo "Error: Can't find any entities corresponding to '$ENTNAME' in '$ENTREGISTRY'" >&2
	exit 9
    fi
else #Device + Entities
    echo -e "Device: Name: $DEVNAME [$DEVID]\tUser name: $DEVUSERNAME"

    #Get Corresponding Entity id's
    ENT_IDS=($(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.device_id == "'$DEVID'") | .id' | tr -d \" 2>/dev/null))
    if [ ${#ENT_IDS[@]} -eq 0 ]; then
	echo -n "Can't find any entities corresponding to '$DEVNAME' ($DEVID) in '$ENTREGISTRY'. Delete device anyway? (y/n) "
	read -e ANSWER
	echo
	if [ "$ANSWER" != "y" ]; then
	    echo "Aborting..."
	    exit 10
	fi
    fi
fi
#for id in ${ENT_IDS[@]}; do echo "|$id|"; done #JJKDEBUG

[ ${#ENT_IDS[@]} -ne 0 ] && printf "\n%-32s %-7s %-7s %-20s %-20s %-20s\n" ID META_ID STAT_ID ENTITY_ID "(UNIQUE_ID)" "[ORIGINAL_NAME]"
for id in ${ENT_IDS[@]}; do
    #Note first need to declare assoc arrays since index is hex and can be interpeted as numbered base for an indexed array
    declare -A ENTIDNAMES[$id] ENTUIDS[$id] ENTORIGNAMES[$id] METAIDS[$id] METASTATIDS[$id]

    { read -r entity_id; read -r  unique_id; read -r  original_name; } \
	<<< $(cat $ENTREGISTRY | jq -M '.data.entities.[], .data.deleted_entities.[] | select(.id == "'$id'") | .entity_id, .unique_id, .original_name' | tr -d \")

    ENTIDNAMES[$id]=$entity_id
    ENTUIDS[$id]=$unique_id
    ENTORIGNAMES[$id]=$original_name
    METAIDS[$id]=${METADATA["$entity_id"]}
    [ -z "${METAIDS[$id]}" ] && METAIDS[$id]='-'
    METASTATIDS[$id]=${METADATA_STAT["$entity_id"]}
    [ -z "${METASTATIDS[$id]}" ] && METASTATIDS[$id]='-'

    printf "%s %-7s %-7s %s (%s) [%s]\n" "$id" "${METAIDS[$id]}" "${METASTATIDS[$id]}" "${ENTIDNAMES[$id]}" "${ENTUIDS[$id]}" "${ENTORIGNAMES[$id]}"
    [ "${METAIDS[$id]}" = '-' ] && echo -e "     WARNING: No 'metadata_id' stored in database for $entity_id\n"
done

echo -ne "\nAre you sure you want to COMPLETELY delete the above ${DEVID:+device &} ${#ENT_IDS[@]} entities (including all associated db data)?(y/n) "
read -e ANSWER
echo
if [ "$ANSWER" != "y" ]; then
    echo "Aborting..."
    exit 11
fi

#===============================================================================
#### Delete entries in '.storage' registries

DATE="$(date +%Y%m%d-%H%M%S)"

### Delete entry in $DEVREGISTRY (core.device_registry) if device
if [ -n "$DEVID" ]; then
   cp -a $DEVREGISTRY{,-$DATE}
   echo "...'$DEVREGISTRY' backed up to '${DEVREGISTRY}-$DATE'"
   cat $DEVREGISTRY-$DATE | jq -M '.data |= del(.devices.[], .deleted_devices.[] | select(.id == "'$DEVID'"))' >| $DEVREGISTRY
   if [ $? -ne 0 ]; then #Error - unwind & exit
       echo "Error: Failed to delete *device* entries from '$DEVREGISTRY', unwinding..."
       mv ${DEVREGISTRY}{-$DATE,}       
       exit 12
   fi
   echo "   DELETED *device* entries from '$DEVREGISTRY'"
fi

if [ ${#ENT_IDS[@]} -eq 0 ];then #Device only so exit
    print_exit
    exit 0
fi

### Delete entries in $ENTREGISTRY (core.entity_registry)
cp -a $ENTREGISTRY{,-$DATE}
echo "...'$ENTREGISTRY' backed up to '${ENTREGISTRY}-$DATE'"
ENTJOINS=".entity_id == \"$(myjoin "\" or .entity_id == \"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $ENTREGISTRY-$DATE | jq -M '.data |= del(.entities.[], .deleted_entities.[] | select('"$ENTJOINS"'))' >|  $ENTREGISTRY
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$ENTREGISTRY', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    exit 13
fi
echo "   DELETED *entity* entries from '$ENTREGISTRY'"

### Delete entries in $RESTOREREGISTRY (core.restore_state)
cp -a $RESTOREREGISTRY{,-$DATE}
echo "...'$RESTOREREGISTRY' backed up to '${RESTOREREGISTRY}-$DATE'"
ENTJOINS=".state.entity_id == \"$(myjoin "\" or .state.entity_id == \"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $RESTOREREGISTRY-$DATE | jq -M '.data |= del(.[] | select('"$ENTJOINS"'))' >| $RESTOREREGISTRY
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$RESTOREREGISTRY', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    mv ${RESTOREREGISTRY}{-$DATE,}    
    exit 14
fi
echo "   DELETED *entity* entries from '$RESTOREREGISTRY'"

### Delete entries in $EXPOSEDENTITIES(homeassistant.exposed_entities)
cp -a $EXPOSEDENTITIES{,-$DATE}
echo "...'$EXPOSEDENTITIES' backed up to '${EXPOSEDENTITIES}-$DATE'"
ENTJOINS=".\"$(myjoin "\", .\"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $EXPOSEDENTITIES-$DATE | jq -M '.data |= del(.exposed_entities | select('"$ENTJOINS"'))' >| ${EXPOSEDENTITIES}
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$EXPOSEDENTITIES', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    mv ${RESTOREREGISTRY}{-$DATE,}
    mv ${EXPOSEDENTITIES}{-$DATE,}
    exit 15
fi
echo "   DELETED *entity* entries from '$EXPOSEDENTITIES'"

#===============================================================================
#### Delete entries in 'home-assistant_v2.db'

cp -a $DATABASE{,-$DATE}
echo "...'$DATABASE' backed up to '$DATABASE-$DATE'"

#Needed later (NOTE: 'comm' needs lexicographical sort)
ATTRIBUTES_USED_ORIG="$(sqlite3 $DATABASE "SELECT DISTINCT attributes_id FROM states ORDER BY CAST(attributes_id AS text)")"

if ! [[ $(IFS=; echo "${METAIDS[*]}")  =~ ^-*$ ]]; then ## Consists only of '-'
    METAIDJOINS="metadata_id = '$(myjoin "' OR metadata_id = '" ${METAIDS[@]})'"
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM states_meta WHERE $METAIDJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'states_meta' table"

    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM states WHERE $METAIDJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'states' table"
fi

if ! [[ $(IFS=; echo "${METASTATIDS[*]}")  =~ ^-*$ ]]; then ## Consists only of '-'
    IDSTATJOINS="id = '$(myjoin "' OR id = '" ${METASTATIDS[@]})'"
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM statistics_meta WHERE $IDSTATJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'statistics_meta' table"

    METAIDSTATJOINS="metadata_id = '$(myjoin "' OR metadata_id = '" ${METASTATIDS[@]})'"
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM statistics WHERE $METAIDSTATJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'statistics' table"

    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM statistics_short_term WHERE $METAIDSTATJOINS; SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'statistics_short_term' table"
fi

#===============================================================================
#### Delete orphaned entries in 'states_attributes' database

#NOTE: 'comm' needs lexicographical sort
ATTRIBUTES_USED="$(sqlite3 $DATABASE "SELECT DISTINCT attributes_id FROM states ORDER BY CAST(attributes_id AS text)")"
ATTRIBUTES_ALL="$(sqlite3 $DATABASE "SELECT attributes_id FROM state_attributes ORDER BY CAST(attributes_id AS text)")"

NEWLY_ORPHANED=($(comm -2 -3 <(echo "$ATTRIBUTES_USED_ORIG") <(echo "$ATTRIBUTES_USED")))
ALL_ORPHANED=($(comm -2 -3 <(echo "$ATTRIBUTES_ALL") <(echo "$ATTRIBUTES_USED")))
MISSING=($(comm -1 -3 <(echo "$ATTRIBUTES_ALL") <(echo "$ATTRIBUTES_USED")))

echo "There are '${#NEWLY_ORPHANED[@]}' NEWLY orphaned attributes and '${#ALL_ORPHANED[@]}' TOTAL orphaned attributes in 'state_attributes' table"
echo "(Also, there are ${#MISSING[@]} attributes that are referenced in 'states' but not found in 'state_attributes table)"

if [ ${#ALL_ORPHANED} -gt 0 ]; then
    echo -ne "\nDelete orphaned attributes from 'state_attributes' table?([N]ewly/[T]otal/[A]bort) "
    read -e ANSWER
    echo
    case $ANSWER in
	N) DELATTRS="$(myjoin ',' "${NEWLY_ORPHANED[@]}")"
	   ;;
	T) DELATTRS="$(myjoin ',' "${ALL_ORPHANED[@]}")"
	   ;;
	*) DELATTRS=""
	   ;;
    esac
    NUMDELS=$(sqlite3 $DATABASE "DELETE FROM state_attributes WHERE attributes_id IN ($DELATTRS); SELECT changes()")
    echo "   DELETED $NUMDELS entries from 'state_attributes' table"
fi

#===============================================================================
echo -e "\n***If device/entities are MQTT integrations, don't forget to unpublish the entities!"
print_exit
8 Likes

Looks handy.
Realized I couldn’t jsut run it from within vs code addon without it terminating everything. Tried logging in direct but still no luck.

Also - would be really handy if we could have it show a list of disabled entiteis and choose to delete those, or their entity data.

I ssh into HA to run the scripts.

Your suggestion about showing a list of disabled entities is interesting. I will add to my TODO list if I have time

Note I have also created scripts to (relatively) safely rename entities, renumber states, and renumber statistics

hello, is this script is also available for container’s instance?
I have 7000 devices “Mobile App” (35000 entities) to delete

it should work if you can get a bash shell and can access sqlite3.
Alternatively, you can copy /homeassistant/home_assistant_v2.db and /homeassistant/.storage to another machine and run the scripts there (adjusting the directories as needed)

It is broken.

It errors immediately with: ./clear.sh -d GW2000 jq: error: syntax error, unexpected ‘[’, expecting FORMAT or QQSTRING_START (Unix shell quoting issues?) at , line 1: .data.devices., .data.deleted_devices. | select(.name == “GW2000”) | .id, .name_by_user jq: 1 compile error Error: Can’t find device name ‘GW2000’ in ‘core.device_registry’

The device exists:here is a grep of the core.device_registry
cat .storage/core.device_registry | grep GW2000 {“area_id”:null,“config_entries”:[“6d47fb6543cc898f88523ecf20880035”],“configuration_url”:null,“connections”:,“created_at”:“2024-09-06T08:37:58.273045+00:00”,“disabled_by”:null,“entry_type”:null,“hw_version”:null,“id”:“95808cf5fc5a0914abb0458d4d1e9d77”,“identifiers”:[[“mqtt”,“3A6968E27B39EAE15EAC758A5DD4BFFB”]],“labels”:,“manufacturer”:“Ecowitt”,“model”:“GW2000A”,“model_id”:null,“modified_at”:“2024-09-06T08:37:58.273118+00:00”,“name_by_user”:null,“name”:“GW2000”,“primary_config_entry”:“6d47fb6543cc898f88523ecf20880035”,“serial_number”:null,“sw_version”:“GW2000A_V3.1.1”,“via_device_id”:null}

I posted some updated scripts that include multiple enhancements and bug fixes.

See pastebin links at 6 Routines to Delete/Rename/Move Devices & Entities and their corresponding registry entries, data, and metadata - #7 by puterboy

Check if it fixes your issue too.

1 Like

It does work with disabled devices & entities which confusingly are called deleted_devices and deleted_entities in core.device_registry and core.entity_registry (found in /homeassistant/.storage)

You can look in those files to see which devices & entities are considered disabled/deleted and use my routine to completely remove them

1 Like

I’ve touched up the script a bit, to use as a MariaDB database backend.
I use both Home Assistant and MariaDB in a docker container.
@puterboy, you can use it if you want to extend your script.
Thanks for your work.

#!/bin/bash
#===============================================================================
# DESCRIPTION:
# Delete devices and/or entities *completely from homeassistant registries & databases
#    - Delete devices '.storage' info from:
#        core.device_registry
#    - Delete entity '.storage' info from:
#        core.entity_registry
#        core.restore_state
#        homeassistant.exposed_entities
#    - Delete corresponding entity data from the following home-assistant_v2.db tables:
#        states_meta
#        states
#        statistics_meta
#        statistics
#        statistics_short_term
#        state_attributes
#    - NOTE: Timestamped backups are generated for every modified database or registry
#    - NOTE: Deleted devices can only be accessed by id (-D), not by name
#    - NOTE: MQTT devices may consistently be republished if topics are retained.
#            So, you need to delete the corresponding topics from the 'homeassistant' tree
#            e.g., use MQTT Explorer
#            Until then, you may want to stop MQTT from re-adding any existing published
#            entities *and* corresponding devices, by adding the following to 'config.yaml'
#                mqtt:n
#                  discovery: false
#
#    _ NOTE: Need .my.cnf file with client config to connect to database
#
#            [client]
#            user = <<db_user>>
#            password = <<password>>
#            host = <<docker_ip for MariaDB/MySQL>>
#
#
#===============================================================================
# USAGE:
#    Usage: delete_device_entity [-d DEV_NAME | -D DEV_ID | -e ENT_NAME | -E ENT_ID ] [-A]
#        where you need to specify exactly one of: -d, -D, -e, -E
#        If -d or -D specified OR -e or -E plus -A specified, then
#            the corresponding device and ALL its entities are deleted
#        If -e or -E specified with no '-A', then ONLY the specified entity is deleted
#
#===============================================================================
# VERSION: 0.5
#
#===============================================================================
#
# CHANGELOG
#     0.5 (June 2024)
#      - First official release
#===============================================================================
#
# AUTHOR:
#    Jeff Kosowsky
#    Copyright June 2024
#
#===============================================================================
#### VARIABLES

BASE=/docker/homeassistant
DATABASE="hassdb"
STORAGE="$BASE/.storage"

DEVREGISTRY=core.device_registry
ENTREGISTRY=core.entity_registry
RESTOREREGISTRY=core.restore_state
EXPOSEDENTITIES=homeassistant.exposed_entities

#===============================================================================
#PARSE OPTIONS

PROG=${0##*/}
usage() {
    cat <<EOF >&2
Usage: $PROG [-d DEV_NAME | -D DEV_ID | -e ENT_NAME | -E ENT_ID ] [-A]
    where you need to specify exactly one of: -d, -D, -e, -E
        If -d or -D specified OR -e or -E plus -A specified, then
            the corresponding device and all its entities are deleted
        If -e or -E specified with no '-A', then the specified entity alone is deleted
EOF
    exit 1
    }

NUMDEV=0
while getopts hd:D:e:E:A options; do
    case "${options}" in
        d) DEVNAME=${OPTARG}
            ((NUMDEV++))
            ;;
        D) DEVID=${OPTARG}
           if ! [[ "$DEVID" =~ ^[[:xdigit:]]{32}$ ]]; then
               echo "Error: DEV_ID must be 32 char hex" >&2
               usage
           fi
           ((NUMDEV++))
           ;;
        e) ENTNAME=${OPTARG}
           ((NUMDEV++))
           ;;
        E) ENTID=${OPTARG}
           ((NUMDEV++))
            if ! [[ "$ENTID" =~ ^[[:xdigit:]]{32}$ ]]; then
                echo "Error: ENT_ID must be 32 char hex" >&2
                usage
            fi
            ;;
        A) ALL=1
           ;;
        :) echo "Error: -${OPTARG} requires an argument." >&2
           usage
           ;;
        *) usage
           ;;
    esac
done

if [[ $NUMDEV -ne 1 || -n "$ALL" && !( -n "$ENTID" || -n "$ENTNAME" ) ]]; then
    echo "Error: Need to enter exactly one type option (-d, -D, -e, -E) plus optional '-A' if entity specified" >&2
    usage
fi
shift $((OPTIND-1))
[ -n "$1" ] && usage #No other arguments

#===============================================================================
#### FUNCTIONS
myjoin() {
    local separator="$1"
    shift
    local first="$1"
    shift
    printf "%s" "$first" "${@/#/$separator}"
}

print_exit() {
    echo "*** Don't forget to restart home assistant (docker start $id_ha)"
}

#===============================================================================
#### Test mysql connection

mysql -e '\q' $DATABASE 2>/dev/null

if ! [ $? == 0 ]; then
    echo "Error: Can't connect to database '$DATABASE'..." >&2
    exit 2
fi

#### FILE CHECKS

if ! [ -d "$STORAGE" ]; then
    echo "Error: Can't find storage directory '$STORAGE'..." >&2
    exit 3
fi

#===============================================================================

#### Stopping container of home-assistant

id_ha=$(docker ps |grep home-assistant| tr [:space:] '\t'| tr -s '\t'|cut -d $'\t' -f 1)

status=$(docker inspect $id_ha| jq .[].State.Status)
if [ $status != "exited" ]; then
   echo "Home Assistant esta arrancado, ejecute: "
   echo "docker stop $id_ha"
   exit 4
fi

#### Get device and entity data

cd $STORAGE

#Get metadata for state and statistics tables from $DATABASE
declare -A METADATA #Associative array for state metadata id: entity_id -> metadata_id
while IFS='|' read meta entity; do
    METADATA["$entity"]="$meta"
done <<< $(mysql -s -e "SELECT concat_ws('|' , metadata_id, entity_id) FROM states_meta" $DATABASE)

# for id in ${!METADATA[@]}; do echo "$id ---> ${METADATA[$id]}"; done #JJKDEBUG

declare -A METADATA_STAT #Associative array for statistic metadata id: statistic_id -> id
while IFS='|' read id statistic; do
    METADATA_STAT["$statistic"]="$id"
done <<< $(mysql -s -e "SELECT concat_ws('|' , id, statistic_id) FROM statistics_meta" $DATABASE)

# for id in ${!METADATA_STAT[@]}; do echo "$id ---> ${METADATA_STAT[$id]}"; done #JJKDEBUG

## Get Device id & Name
if [ -n "$ALL" ]; then #Find device containing entity if '-e' or '-E' together with '-A' specified
    if [ -n "$ENTID" ]; then
        DEVID=$(cat $ENTREGISTRY | jq -M '.data.entities[], .data.deleted_entities[] | select(.id == "'$ENTID'") | .device_id' | tr -d \" 2>/dev/null)
    elif [ -n "$ENTNAME" ]; then
        DEVID=$(cat $ENTREGISTRY | jq -M '.data.entities[], .data.deleted_entities[] | select(.entity_id == "'$ENTNAME'") | .device_id' | tr -d \" 2>/dev/null)
    fi
    if [ -z "$DEVID" ]; then
        echo "Error: Can't find device id corresponding to entity '${ENTID:-$ENTNAME}' in '$DEVREGISTRY'" >&2
        exit 5
    fi    
    unset ENTID ENTNAME
fi
    
if [ -n "$DEVID" ]; then #Look up by DEVID
    { read -r DEVNAME; read -r  DEVUSERNAME; } \
        <<< $(cat $DEVREGISTRY | jq -M '.data.devices[], .data.deleted_devices[] | select(.id == "'$DEVID'") | .name, .name_by_user' | tr -d \" 2>/dev/null)

    if [ -z "$DEVNAME" ]; then
        echo "Error: Can't find device id '$DEVID' in '$DEVREGISTRY'" >&2
        exit 6
    fi
elif [ -n "$DEVNAME" ]; then  #Look up by DEVNAME
    { read -r DEVID; read -r  DEVUSERNAME; } \
        <<< $(cat $DEVREGISTRY | jq -M '.data.devices[], .data.deleted_devices[] | select(.name == "'$DEVNAME'") | .id, .name_by_user' | tr -d \" 2>/dev/null)

    if [ -z "$DEVID" ]; then
        echo "Error: Can't find device name '$DEVNAME' in '$DEVREGISTRY'" >&2
        exit 7
    fi
fi

#Get Entity id's and associated info
if [ -n "$ENTID" ]; then
    ENT_IDS=($ENTID)
    if [ -z "$(cat $ENTREGISTRY | jq -M '.data.entities[], .data.deleted_entities[] | select(.id == "'$ENTID'")')" ]; then
        echo "Error: Can't find '$ENTID' in '$ENTREGISTRY'" >&2
        exit 8
    fi
elif [ -n "$ENTNAME" ]; then
    ENT_IDS=($(cat $ENTREGISTRY | jq -M '.data.entities[], .data.deleted_entities[] | select(.entity_id == "'$ENTNAME'") | .id' | tr -d \" 2>/dev/null))
    if [ ${#ENT_IDS[@]} -eq 0 ]; then
        echo "Error: Can't find any entities corresponding to '$ENTNAME' in '$ENTREGISTRY'" >&2
        exit 9
    fi
else #Device + Entities
    echo -e "Device: Name: $DEVNAME [$DEVID]\tUser name: $DEVUSERNAME"

    #Get Corresponding Entity id's
    ENT_IDS=($(cat $ENTREGISTRY | jq -M '.data.entities[], .data.deleted_entities[] | select(.device_id == "'$DEVID'") | .id' | tr -d \" 2>/dev/null))
    if [ ${#ENT_IDS[@]} -eq 0 ]; then
        echo -n "Can't find any entities corresponding to '$DEVNAME' ($DEVID) in '$ENTREGISTRY'. Delete device anyway? (y/n) "
        read -e ANSWER
        echo
        if [ "$ANSWER" != "y" ]; then
            echo "Aborting..."
            exit 10
        fi
    fi
fi
#for id in ${ENT_IDS[@]}; do echo "|$id|"; done #JJKDEBUG

[ ${#ENT_IDS[@]} -ne 0 ] && printf "\n%-32s %-7s %-7s %-20s %-20s %-20s\n" ID META_ID STAT_ID ENTITY_ID "(UNIQUE_ID)" "[ORIGINAL_NAME]"
for id in ${ENT_IDS[@]}; do
    #Note first need to declare assoc arrays since index is hex and can be interpeted as numbered base for an indexed array
    declare -A ENTIDNAMES[$id] ENTUIDS[$id] ENTORIGNAMES[$id] METAIDS[$id] METASTATIDS[$id]

    { read -r entity_id; read -r  unique_id; read -r  original_name; } \
        <<< $(cat $ENTREGISTRY | jq -M '.data.entities[], .data.deleted_entities[] | select(.id == "'$id'") | .entity_id, .unique_id, .original_name' | tr -d \")

    ENTIDNAMES[$id]=$entity_id
    ENTUIDS[$id]=$unique_id
    ENTORIGNAMES[$id]=$original_name
    METAIDS[$id]=${METADATA["$entity_id"]}
    [ -z "${METAIDS[$id]}" ] && METAIDS[$id]='-'
    METASTATIDS[$id]=${METADATA_STAT["$entity_id"]}
    [ -z "${METASTATIDS[$id]}" ] && METASTATIDS[$id]='-'

    printf "%s %-7s %-7s %s (%s) [%s]\n" "$id" "${METAIDS[$id]}" "${METASTATIDS[$id]}" "${ENTIDNAMES[$id]}" "${ENTUIDS[$id]}" "${ENTORIGNAMES[$id]}"
    [ "${METAIDS[$id]}" = '-' ] && echo -e "     WARNING: No 'metadata_id' stored in database for $entity_id\n"
done

echo -ne "\nAre you sure you want to COMPLETELY delete the above ${DEVID:+device &} ${#ENT_IDS[@]} entities (including all associated db data)?(y/n) "
read -e ANSWER
## ANSWER=y  ## Force response to use in batch jobs, uncomment this line and comment out the previous line.
echo
if [ "$ANSWER" != "y" ]; then
    echo "Aborting..."
    exit 11
fi


#===============================================================================
#### Delete entries in '.storage' registries

DATE="$(date +%Y%m%d-%H%M%S)"

### Delete entry in $DEVREGISTRY (core.device_registry) if device
if [ -n "$DEVID" ]; then
   cp -a $DEVREGISTRY{,-$DATE}
   echo "...'$DEVREGISTRY' backed up to '${DEVREGISTRY}-$DATE'"
   cat $DEVREGISTRY-$DATE | jq -cr -M '.data |= del(.devices[], .deleted_devices[] | select(.id == "'$DEVID'"))' >| $DEVREGISTRY
   if [ $? -ne 0 ]; then #Error - unwind & exit
       echo "Error: Failed to delete *device* entries from '$DEVREGISTRY', unwinding..."
       mv ${DEVREGISTRY}{-$DATE,}       
       exit 12
   fi
   echo "   DELETED *device* entries from '$DEVREGISTRY'"
fi

if [ ${#ENT_IDS[@]} -eq 0 ];then #Device only so exit
    print_exit
    exit 0
fi

### Delete entries in $ENTREGISTRY (core.entity_registry)
cp -a $ENTREGISTRY{,-$DATE}
echo "...'$ENTREGISTRY' backed up to '${ENTREGISTRY}-$DATE'"
ENTJOINS=".entity_id == \"$(myjoin "\" or .entity_id == \"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $ENTREGISTRY-$DATE | jq -M '.data |= del(.entities[], .deleted_entities[] | select('"$ENTJOINS"'))' >|  $ENTREGISTRY
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$ENTREGISTRY', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    exit 13
fi
echo "   DELETED *entity* entries from '$ENTREGISTRY'"

### Delete entries in $RESTOREREGISTRY (core.restore_state)
cp -a $RESTOREREGISTRY{,-$DATE}
echo "...'$RESTOREREGISTRY' backed up to '${RESTOREREGISTRY}-$DATE'"
ENTJOINS=".state.entity_id == \"$(myjoin "\" or .state.entity_id == \"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $RESTOREREGISTRY-$DATE | jq -cr -M '.data |= del(.[] | select('"$ENTJOINS"'))' >| $RESTOREREGISTRY
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$RESTOREREGISTRY', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    mv ${RESTOREREGISTRY}{-$DATE,}    
    exit 14
fi
echo "   DELETED *entity* entries from '$RESTOREREGISTRY'"

### Delete entries in $EXPOSEDENTITIES(homeassistant.exposed_entities)
cp -a $EXPOSEDENTITIES{,-$DATE}
echo "...'$EXPOSEDENTITIES' backed up to '${EXPOSEDENTITIES}-$DATE'"
ENTJOINS=".\"$(myjoin "\", .\"" ${ENTIDNAMES[@]})\"" #Selection string for all entities
cat $EXPOSEDENTITIES-$DATE | jq -cr -M '.data |= del(.exposed_entities | select('"$ENTJOINS"'))' >| ${EXPOSEDENTITIES}
if [ $? -ne 0 ]; then #Error - unwind & exit
    echo "Error: Failed to delete *entity* entries from '$EXPOSEDENTITIES', unwinding..."
    mv ${DEVREGISTRY}{-$DATE,}
    mv ${ENTREGISTRY}{-$DATE,}
    mv ${RESTOREREGISTRY}{-$DATE,}
    mv ${EXPOSEDENTITIES}{-$DATE,}
    exit 15
fi
echo "   DELETED *entity* entries from '$EXPOSEDENTITIES'"

#===============================================================================
#### Delete entries in 'MySQL/MariaDB $DATABASE'

### cp -a $DATABASE{,-$DATE}
### echo "...'$DATABASE' backed up to '$DATABASE-$DATE'"

#Needed later (NOTE: 'comm' needs lexicographical sort)
ATTRIBUTES_USED_ORIG="$(mysql $DATABASE -s -e "SELECT DISTINCT attributes_id FROM states ORDER BY CAST(attributes_id AS nchar)")"

if ! [[ $(IFS=; echo "${METAIDS[*]}")  =~ ^-*$ ]]; then ## Consists only of '-'
    METAIDJOINS="metadata_id = '$(myjoin "' OR metadata_id = '" ${METAIDS[@]})'"

    NUMDELS=$(mysql $DATABASE -s -e "SET SESSION foreign_key_checks=OFF; DELETE FROM states WHERE $METAIDJOINS; SELECT ROW_COUNT()")
    echo "   DELETED $NUMDELS entries from 'states' table"

    NUMDELS=$(mysql $DATABASE -s -e "DELETE FROM states_meta WHERE $METAIDJOINS; SELECT ROW_COUNT()")
    echo "   DELETED $NUMDELS entries from 'states_meta' table"
fi

if ! [[ $(IFS=; echo "${METASTATIDS[*]}")  =~ ^-*$ ]]; then ## Consists only of '-'
    METAIDSTATJOINS="metadata_id = '$(myjoin "' OR metadata_id = '" ${METASTATIDS[@]})'"
    NUMDELS=$(mysql $DATABASE -s -e "DELETE FROM statistics WHERE $METAIDSTATJOINS; SELECT ROW_COUNT()")
    echo "   DELETED $NUMDELS entries from 'statistics' table"

    NUMDELS=$(mysql $DATABASE -s -e "DELETE FROM statistics_short_term WHERE $METAIDSTATJOINS; SELECT ROW_COUNT()")
    echo "   DELETED $NUMDELS entries from 'statistics_short_term' table"

    IDSTATJOINS="id = '$(myjoin "' OR id = '" ${METASTATIDS[@]})'"
    NUMDELS=$(v"DELETE FROM statistics_meta WHERE $IDSTATJOINS; SELECT ROW_COUNT()")
    echo "   DELETED $NUMDELS entries from 'statistics_meta' table"
fi

#===============================================================================
#### Delete orphaned entries in 'states_attributes' database

#NOTE: 'comm' needs lexicographical sort
ATTRIBUTES_USED="$(mysql $DATABASE -s -e "SELECT DISTINCT attributes_id FROM states ORDER BY CAST(attributes_id AS nchar)")"
ATTRIBUTES_ALL="$(mysql $DATABASE -s -e "SELECT attributes_id FROM state_attributes ORDER BY CAST(attributes_id AS nchar)")"

NEWLY_ORPHANED=($(comm -2 -3 <(echo "$ATTRIBUTES_USED_ORIG") <(echo "$ATTRIBUTES_USED")))
ALL_ORPHANED=($(comm -2 -3 <(echo "$ATTRIBUTES_ALL") <(echo "$ATTRIBUTES_USED")))
MISSING=($(comm -1 -3 <(echo "$ATTRIBUTES_ALL") <(echo "$ATTRIBUTES_USED")))

echo "There are '${#NEWLY_ORPHANED[@]}' NEWLY orphaned attributes and '${#ALL_ORPHANED[@]}' TOTAL orphaned attributes in 'state_attributes' table"
echo "(Also, there are ${#MISSING[@]} attributes that are referenced in 'states' but not found in 'state_attributes table)"

if [ ${#ALL_ORPHANED} -gt 0 ]; then
    echo -ne "\nDelete orphaned attributes from 'state_attributes' table?([N]ewly/[T]otal/[A]bort) "
    read -e ANSWER  
    ## ANSWER=T    ## Force response to use in batch jobs, uncomment this line and comment out the previous line.
    echo
    case $ANSWER in
        N) DELATTRS="$(myjoin ',' "${NEWLY_ORPHANED[@]}")"
           ;;
        T) DELATTRS="$(myjoin ',' "${ALL_ORPHANED[@]}")"
           ;;
        *) DELATTRS=""
           ;;
    esac
    NUMDELS=$(mysql $DATABASE -s -e "DELETE FROM state_attributes WHERE attributes_id IN ($DELATTRS); SELECT ROW_COUNT()")
    echo "   DELETED $NUMDELS entries from 'state_attributes' table"
fi

#===============================================================================
echo -e "\n***If device/entities are MQTT integrations, don't forget to unpublish the entities!"
print_exit

This really should be included as a standard part of Homeassistant. Since HA is a DIY solution, many people do a lot of experimentation before they get the setup they want. Thus there is often a need for cleaning up old clutter. Having to run scripts that directly manipulates to databases and other important stuff, or even worse: Trying to run the required commands without the guidance from a script, is (maybe to big of) a hurdle to many users.

7 Likes

If it is any help for your context, in the ZHA, HA version 2025.7, you can find the option to remove a device in the Reconfigure section.

That may or may not completely remove all traces of the device from the underlying databases (both the sqlite3 home-assistant_v2.db as well as the json registries in .storage)

Remove in HA world can just mean remove from the UI and officially-sanctioned API access methods but the data may still be persist in the underlying databases.

If you use that integration, would be great if you could check how thoroughly the devices along with all its entities was removed