Postgres data is incomplete

I’ve been using HA for a few years now just using the standard SQlite DB. I’m running a second instance in my RV now and decided to use Postgres instead since I’m more familiar with it. I have both HA and Postgres running as Docker containers.

I connected to the Postgres DB tonight to see if I could manually add some missing data. But what I’m seeing doesn’t make sense. When I query the states table, the entity_id field is blank for every record.

homeassistant=# select entity_id, count(*) from states group by entity_id;
 entity_id |  count
-----------+---------
           | 1576471
(1 row)

Similarly, when I describe the table, the entity_id field is a 1 byte character type.

homeassistant=# \d states
                                           Table "public.states"
        Column         |           Type           | Collation | Nullable |             Default
-----------------------+--------------------------+-----------+----------+----------------------------------
 state_id              | integer                  |           | not null | generated by default as identity
 entity_id             | character(1)             |           |          |
 state                 | character varying(255)   |           |          |
 attributes            | character(1)             |           |          |
 event_id              | smallint                 |           |          |
 last_changed          | timestamp with time zone |           |          |
 last_changed_ts       | double precision         |           |          |
 last_updated          | timestamp with time zone |           |          |
 last_updated_ts       | double precision         |           |          |
 old_state_id          | integer                  |           |          |
 attributes_id         | integer                  |           |          |
 context_id            | character(1)             |           |          |
 context_user_id       | character(1)             |           |          |
 context_parent_id     | character(1)             |           |          |
 origin_idx            | smallint                 |           |          |
 context_id_bin        | bytea                    |           |          |
 context_user_id_bin   | bytea                    |           |          |
 context_parent_id_bin | bytea                    |           |          |
 metadata_id           | integer                  |           |          |

I’m not sure why the data is being displayed this way. Home Assistant seems to be acting normal and I don’t see any obvious errors in the HA or Postgres containers. I feel like I’m missing something obvious.

HA also displays all historic data normally so it doesn’t seem like the DB has become corrupted.

recorder conf:

recorder:
  db_url: postgresql://homeassistant:$DB_PASS@localhost:5433/homeassistant

docker compose

version: '3.3'

services:
  homeassistant:
    restart: always
    environment:
      - TZ=America/New_York
    volumes:
      - ./config:/config
    network_mode: host
    image: homeassistant/home-assistant:2023.7

  postgres:
    image: postgres:15.3
    restart: always
    env_file:
      - .env
    environment:
      POSTGRES_DB: homeassistant
      POSTGRES_USER: homeassistant
      POSTGRES_PASSWORD: ${DB_PASS}
    volumes:
      - pgdata:/var/lib/postgresql/data
    ports:
      - 5433:5432

volumes:
  pgdata:

Those fields are deprectaed

SQLite does not allow to remove a column, that’s why they decided to keep them rather than doing table copy’s at upgrade time, I guess.

Ahhh that makes sense! Glad I’m not crazy. Now which fields are valid? Or is this whole table deprecated? Almost all of the fields are empty.

Just look in github for the source file I mentioned if you want to know.
If a table is not empty, considering the tables are cleaned up on a rolling 10 days (by default), then it is not deprecated.

Yeah I looked through the schema file and didn’t really see anything valuable in the table except the context_*_bin fields. What’s weird is that these fields are unused, but methods like from_event() and to_native() still use them. I’ll dig a bit more tonight.

Mmm… That’s just the table containing all the values (state and attributes) used throughout HA.
Not sure what is valuable to you :wink:

Sure. I understand the purpose of the table. What I’m saying is that it isn’t immediately obvious which device a state record belongs to. Example:

state_id              | 2348
entity_id             | <null>
state                 | 243.2
attributes            | <null>
event_id              | <null>
last_changed          | <null>
last_changed_ts       | <null>
last_updated          | <null>
last_updated_ts       | 1690748114.304422
old_state_id          | 2344
attributes_id         | 15
context_id            | <null>
context_user_id       | <null>
context_parent_id     | <null>
origin_idx            | 0
context_id_bin        | \x0189a870d580510c0478834a5cf95e87
context_user_id_bin   | <null>
context_parent_id_bin | <null>
metadata_id           | 14

Looks like metadata_id is a foreign key to state_meta which gives the name of the entity. Kinda weird but I guess it makes sense since the entity_id can change. Less expensive to change one record in one table than every record an entity belongs to.

Still, would be nice if the entity_id field was not null and had an ON UPDATE CASCADE relationship to state_meta so it’s a little easier to follow. But I’m sure the HA folks know a lot better than I do what they’re doing.