Migrate to PostgreSQL

I’ve followed the instructions to migrate to SQLAlchemy. SQLite is fine, but I’m looking to move to PostgreSQL. Is there a mechanism to migrate data between SQLAlchemy backends?

Perhaps this helps: Converting from other Databases to PostgreSQL

There are two tables holding state changes: events and states.
The sensor values are in states.state and events.time_fired is the timestamp, but what are the other timestamps in the states table actually for?

                                       Table "public.events"
   Column   |           Type           |                         Modifiers
------------+--------------------------+-----------------------------------------------------------
 event_id   | integer                  | not null default nextval('events_event_id_seq'::regclass)
 event_type | character varying(32)    |
 event_data | text                     |
 origin     | character varying(32)    |
 time_fired | timestamp with time zone |
 created    | timestamp with time zone |
Indexes:
    "events_pkey" PRIMARY KEY, btree (event_id)
    "ix_events_event_type" btree (event_type)
Referenced by:
    TABLE "states" CONSTRAINT "states_event_id_fkey" FOREIGN KEY (event_id) REFERENCES events(event_id)

                                        Table "public.states"
    Column    |           Type           |                         Modifiers
--------------+--------------------------+-----------------------------------------------------------
 state_id     | integer                  | not null default nextval('states_state_id_seq'::regclass)
 domain       | character varying(64)    |
 entity_id    | character varying(255)   |
 state        | character varying(255)   |
 attributes   | text                     |
 event_id     | integer                  |
 last_changed | timestamp with time zone |
 last_updated | timestamp with time zone |
 created      | timestamp with time zone |
Indexes:
    "states_pkey" PRIMARY KEY, btree (state_id)
    "states__significant_changes" btree (domain, last_updated, entity_id)
    "states__state_changes" btree (last_changed, last_updated, entity_id)
Foreign-key constraints:
    "states_event_id_fkey" FOREIGN KEY (event_id) REFERENCES events(event_id)

states has the timestamp columns last_changed, last_updated and created. I guess created is only a technical value?
The created column also shows values that are -3 hours compared to the others, so maybe there’s something wrong with its time zone handling?
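
One way to check is to compare the session time zone with the raw values side by side (a quick sketch in psql; any recent rows will do):

    -- Time zone the session renders "timestamp with time zone" values in:
    SHOW timezone;
    -- Compare created with the other timestamps on the newest rows:
    SELECT last_updated, created, created - last_updated AS delta
    FROM states
    ORDER BY state_id DESC
    LIMIT 5;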

In the history component there’s code that uses both last_updated and last_changed.

    """Return states changes during UTC period start_time - end_time."""
    states = recorder.get_model('States')
    query = recorder.query('States').filter(
        (states.last_changed == states.last_updated) &
        (states.last_changed > start_time))

    if end_time is not None:
        query = query.filter(states.last_updated < end_time)

So actually the query is:

            states.last_changed == states.last_updated and
            states.last_changed > start_time and
            states.last_updated < end_time

Because last_changed == last_updated, the query can also be written in a simpler way?

            states.last_changed == states.last_updated and
            states.last_changed > start_time and
            states.last_changed < end_time

Now it’s a little more intuitive?
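
In plain SQL against the schema above, the simplified filter would read (a sketch; :start_time and :end_time stand for the UTC period boundaries):

    SELECT *
    FROM states
    WHERE last_changed = last_updated
      AND last_changed > :start_time
      AND last_changed < :end_time;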

I tried creating partitions (daily or monthly) in Postgres but there weren’t any gains, actually the opposite. I tested with 4000 rows/day, i.e. about 1.5M rows for a year of data. If you are purging old records, though, it would make more sense to just drop old partitions.
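
For reference, this is roughly what declarative range partitioning looks like in PostgreSQL 10+ (a sketch with illustrative names, not the recorder’s actual DDL):

    -- Partitioned copy of the states table, split by month on last_updated:
    CREATE TABLE states_part (
        LIKE states INCLUDING DEFAULTS
    ) PARTITION BY RANGE (last_updated);

    CREATE TABLE states_part_2017_01 PARTITION OF states_part
        FOR VALUES FROM ('2017-01-01') TO ('2017-02-01');

    -- Purging a month of old records is then a cheap metadata operation:
    DROP TABLE states_part_2017_01;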

Well, although this is a really, really old thread, maybe the following is interesting for others with this problem. I also recently migrated from the default sqlite database to postgresql. My procedure was as follows:

  1. create a postgresql database to be used by Home Assistant and make sure that the homeassistant user can access it and create tables in it
  2. write a config for pgloader:

         load database
           from sqlite:///full/path/to/home-assistant_v2.db
           into postgresql:///home
         with data only, drop indexes, reset sequences, truncate;

  3. reconfigure the recorder integration so it uses postgresql. As I was already into this, I excluded some entities and events I’m not interested in.
  4. stop Home Assistant
  5. start Home Assistant, let it run for 20 seconds or so in order to let it create the necessary database tables, and stop it again
  6. run pgloader with the configuration mentioned above: write it to a file, e.g. migration.config, then run pgloader migration.config. This config deletes all entries HA might have written in the 20 seconds it was running, before it copies the contents of the old sqlite database into the new, shiny postgresql database. (A quick sanity check for this step is shown below.)
  7. restart Home Assistant and be happy
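
As that sanity check (an optional extra, not part of the procedure itself), the row counts in the new database can be compared with the old sqlite file:

    -- In psql, against the new database:
    SELECT count(*) AS states_rows FROM states;
    SELECT count(*) AS events_rows FROM events;
    -- Compare with: sqlite3 home-assistant_v2.db "SELECT count(*) FROM states;"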

pgloader is a great tool for such conversions since it is fast and highly configurable. Nonetheless I advise against using it to also create the database tables, since it wouldn’t choose the same column types in postgresql that Home Assistant does: the attributes column in postgresql should be of type text, whereas in sqlite it was varchar.
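
If the tables did end up being created by pgloader, the type can be corrected afterwards (a sketch; run it before pointing HA at the database again):

    -- Align the column type with what Home Assistant itself creates:
    ALTER TABLE states ALTER COLUMN attributes TYPE text;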


I had to add

    set timezone to 'UTC'

to the pgloader config to get correct times.

Thank you @Jennes! This was very helpful!

I don’t know if this is a valid topic for a question. I’m already on Postgres. I want to know about using Postgres partitions for the history tables. I want to store all history (no purges), so I disabled auto purge in HA. But some SQL queries still start to slow down after a year of history. So the questions are simple:

  • Is it difficult to migrate to a partitioned model for history? (I’m an experienced developer.)
  • How will migrations behave after Home Assistant updates?
  • Has anyone else had a similar experience? I did not find any mention of this topic quickly (maybe I searched badly).

I’m not using partitions, so I am no help in this vein. That said, I have been running postgresql as my core HA history store and as an analytic db for going on two years now with very good success. I run HA and postgresql in docker containers on a 16 GB intel i7 device with a number of other docker containers running as well, and I appear to be hardly taxing the box. DB size is 600 GB. I am running HA 2021.6.6, so I cannot speak to some of the latest db changes.

I use simple triggers to make a copy of each new state and event record that HA adds. I let HA prune its db at 65 days and currently have over two years of data in my analytic dbs. The post linked below has the setup and stored procedures I use to do this:

https://community.home-assistant.io/t/home-assistant-db-vs-mariadb/224295/4
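
The core of the copy-on-insert idea looks roughly like this (a minimal sketch with illustrative names; the linked post has the actual setup):

    -- Archive table with the same shape as the live states table:
    CREATE TABLE states_archive (LIKE states INCLUDING ALL);

    -- Copy every row HA inserts, so purging the live table at 65 days
    -- never touches the archive:
    CREATE OR REPLACE FUNCTION archive_state() RETURNS trigger AS $$
    BEGIN
        INSERT INTO states_archive SELECT NEW.*;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER states_archive_copy
        AFTER INSERT ON states
        FOR EACH ROW EXECUTE PROCEDURE archive_state();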

There are several folks using the timescale layer on top of postgresql to get better time series abilities; I have yet to install this. If you search for either or both of ‘postgresql’ or ‘timescale’ you will get some good info.

Good hunting, I have been very pleased with my use of postgresql with Home Assistant. I use a Jupyter container to do most of my analysis on postgresql.


Thanks for the answer! Really appreciate it. I will continue my research.

How does resetting the sequences work for you? I get heaps of unique constraint issues afterward.
Wouldn’t it make more sense to keep them?
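
For what it’s worth, when a loaded sequence ends up behind the data, it can be realigned manually (a sketch based on the sequence names in the schema above):

    -- Point each sequence just past the highest imported id:
    SELECT setval('states_state_id_seq', (SELECT max(state_id) FROM states));
    SELECT setval('events_event_id_seq', (SELECT max(event_id) FROM events));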