In-memory MariaDB (MySQL) add-on for recorder/history integration

Hello, thanks for the add-on, I'm trying to set it up. I have this line in config:
automation: !include automations.yaml

And I have an error after adding your config:
duplicated mapping key at line 32, column -196: automation:

Also, should I remove default_config: to get your recorder: configuration to work? Should I then add all of the integrations from Default Config - Home Assistant manually? Thank you

P.S. Looks like your recorder: config works with default_config:; only the question about automation is left. I just removed that automation section to test, but I want to add it back, and then I get the error I showed earlier.

I don’t think default_config causes the error.

The "duplicated mapping key" error means the automation: key ends up defined twice. If you separate things out from the main configuration.yaml with lines like automation: !include automations.yaml, your automations.yaml must not repeat the automation: key, so it should contain only this:

  - alias: Auto purge with repack
    trigger:
      platform: time
      at: "04:12:00"
    action:
      service: recorder.purge
      data:
        keep_days: 7
        repack: true

Similarly, if you have a recorder: !include recorder.yaml line in configuration.yaml, the separated recorder.yaml should contain only something like:

db_url: mysql://homeassistant:PASSWORD@45207088-mariadb/homeassistant?charset=utf8mb4
auto_purge: false
exclude:
  event_types:
    - call_service
include:
  entities:
    - <the entity ids you really need>

I've made these copy-paste errors several times myself. :slight_smile: For more details see: Splitting up the configuration - Home Assistant


Hello Laszlo (and other readers),
In-memory MariaDB looks like a great add-on for HA on SD cards!

Overall my question is: as a new homeassistant-addon-mariadb-inmemory user, how can I demonstrate to myself that In-memory MariaDB is in fact working and that the database is using RAM rather than the SD card?

More detail about my question:
I've been an HA user since 2017 and run my own HA in a venv, but I'm building some HA systems to send across the country to my sons to get them up to speed with HA quickly. I'm building them on RPi 4s with:

  • Home Assistant 2022.11.1
  • Supervisor 2022.10.2
  • (hassio) Operating System 9.3
  • using SD card

and to maximize the time until the SD card fails, I’m trying out In-memory MariaDB. From the description it seems like your add-on is just what I need. Great!

But as a newbie to both MariaDB and In-memory MariaDB, I wonder how to know that it's doing what I think I want it for: reducing SD card writes to a bare minimum. How do I know that I haven't misconfigured it so badly that it's doing nothing for me?

Testing I’ve done so far is to:

  • install Hassio on several 1GB RPi 4s
  • install In-memory MariaDB per the GitHub docs, with the password changed but all other config at defaults
  • verified in the add-on's log that "s6-rc: info: service mariadb-core-post successfully started" and similar lines all say successfully started
  • ensured in configuration.yaml that recorder is configured to exclude only sensor.date and that purge_keep_days is 1 day
  • made an automation that increments a counter once a second

and I’m going to let it run for several days.

After 8 hours or so I do see that the history of

  • sensor.disk_use_percent_config is staying constant, and
  • sensor.memory_use is slowly increasing.
    [edit] But does that prove that I’ve correctly set In-memory MariaDB to minimize SD writes?

So, to restate the question: in addition to what I've just described, is there a more straightforward way to show that In-memory MariaDB is doing what I think I want it for: reducing SD card writes to a bare minimum?

Joe

Hi Joe!

I've configured the add-on to have a tmpfs (i.e. in-memory) file-system at /tmp, and configured MariaDB to store its database at /tmp/databases. So I'm quite sure it uses memory instead of the SD card. :wink:

If you have HA CLI access, use these commands:

docker exec -it addon_45207088_mariadb /bin/bash
df

I've used a separate tmpfs for the /tmp folder to be able to limit the maximum memory usable by the database. As far as I was able to understand, HA OS doesn't use the SD card for virtual memory, but uses a compressed in-memory filesystem as well (I don't remember the details now), and I didn't want to let the database eat up all the RAM and CPU from the system.
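If you are curious, a size-limited tmpfs mount is roughly equivalent to this (a sketch only; the add-on sets up the equivalent internally, and 200m is just its default size):

mount -t tmpfs -o size=200m tmpfs /tmp
# everything written under /tmp now lives in RAM (capped at 200MB) and is gone after unmount/reboot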

Things to consider:

  • recorder likes to record a lot of stuff, so I prefer an "opt in" approach instead of the default "opt out"; this can lead to complaints from new users about why their graph isn't shown
  • memory exports to SD
    • by default the add-on exports its database from memory to the SD card when it is backed up by HA, or when it is stopped gracefully (add-on restart or update, host reboot); the content is compressed, so in my case 150MB is the maximum size for the database, the database files are 40-60MB (it needs some extra space in /tmp/databases to repack its files after old data is deleted from the database), and the compressed export is less than 10MB, so 10MB for each backup is OK I think
    • by default the periodic export of the memory content to the SD card is not turned on, but I use it; 10MB/day is nothing, HA writes much more other stuff to the SD card daily
    • the periodic database export to the SD card is not for the latest sensor data shown on graphs (those will be lost after a power failure) but for the long-term statistics; long-term graphs can hold much more info than the latest sensor data, and I like them
  • statistics data grows continuously, and I haven't had time to read the docs on how to limit its timespan (if that is possible); e.g. I purge my database daily, but I still have 1-year-old statistics in the database; it is on my todo list…
  • the latest 2.5.1.6 version seems to be quite stable, and it survives power outages as well (the last exported content will be re-imported); this caused some problems lately: HA doesn't wait for the full startup of the add-ons, and started to use the database before its data was imported, causing errors
  • the only problem I see now (beyond the growing statistics data size) is that updating the add-on is better done manually from the CLI, because it is better to stop HA before we pull the database out from under it; this is not the best experience for new users, but I don't see a better approach now

Oh, and I use an eMMC-SD adapter and have put eMMC memory into the SD card slot on my Pis; AFAIK eMMC is much faster and has much better wear leveling than a plain SD card.

Hi Laszlo, thanks for the well thought out, informative reply!

Now that you explain what to look for, it's very plain that it's there! Here's mine as of a few minutes ago, showing your /tmp in a tmpfs; I'm still using your default "size: 200m", so the size shown seems right:

tmpfs 204800 3980 200820 2% /tmp

and in addition to watching sensor.memory_use slowly increase within HA, I'll also monitor it from the CLI.

so I’ll change my settings:

enabled: true
periodic: disabled

from disabled to hourly. What I think I'm expecting is that /tmp usage will increase continuously with SD usage staying steady, until once an hour the /tmp usage decreases and SD usage increases. (The whole purpose of the test setup I'm writing about here is to demonstrate to myself that I've correctly configured In-memory MariaDB to minimize SD writes.)

Things to consider about your things to consider: :grinning_face_with_smiling_eyes:

Agreed. The purpose of this exercise is to show me that I've got your software working. Now that I'm sure I have basic functionality installed and I have only a few recorder exclusions, I will add a lot more sensors and automations for it to record, to really demonstrate the growth of /tmp versus SD.

I've changed to hourly in this test setup so as to demonstrate the database being stored in RAM with periodic exports to the SD card.
[edit] I changed it back to daily so I can see a larger change in memory before it's exported to the SD card.

What you're saying here is that the latest sensor data in RAM is intended, for example, for the temperature over the past 10 minutes, but the long-term data exported to SD is intended, for example, for the temperature over the past 10 months. Right?

I will keep watch for how to do that, although I have "Auto update" toggled ON right now.

And as asides, THANK YOU for the method to change to the bash shell and for the info about eMMC-SD adapters. I didn't realize that bash was easily available in HAOS, and didn't know those adapters existed (I use a separate SSD on my own HA instance).

More or less I agree with you, so I'll answer only where I think we misunderstand each other. :slight_smile:

What I think I'm expecting is that /tmp usage will increase continuously with SD usage staying steady, until once an hour the /tmp usage decreases and SD usage increases.

Nope. The database in the in-memory /tmp location is always there in memory, in its entirety. When it is written to SD (the /data folder of the add-on's docker container), that is a mysqldump export; it is like a backup. Everything remains in the in-memory /tmp AND you get a "copy" of it in /data, a snapshot on the SD card. (I use mysqldump, so it is like a plain old database backup, but I intentionally do not use that word, because backup is also a function in HA; I call it export.)

This /tmp → /data export can happen "on certain occasions" and periodically; the export is the same either way. In case of an HA backup (one of those "certain occasions"), the exported content in /data becomes part of the HA backup file. In all cases (periodic included), the exported database content in /data is re-imported into /tmp when the add-on is started, because /tmp is empty at start: when the add-on is stopped, /tmp is gone.
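Conceptually, the export and the later re-import are something like this (a simplified sketch, without the actual file names and options the add-on uses, and with credentials omitted):

# export: dump everything from the in-memory DB and compress it onto the SD card
mysqldump --all-databases | gzip > /data/databases.sql.gz
# re-import on add-on start: /tmp is empty, so replay the last export into the fresh DB
gunzip -c /data/databases.sql.gz | mysql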

Yeah, under normal circumstances this in-memory database behaves like a real one and you don't lose data; only a power outage will cause data loss, because any change in the in-memory /tmp database after the last export to SD will be lost. But this is a trade-off.

I have only a few recorder exclusions

I prefer only INclusions in the case of entity states. My states tables are huge even though everything is opt-in in my case. I use EXclusion only for events; they are much smaller, not a big problem. In my case, states require 100 times more database space. :confused:

I’ve changed to hourly [periodic export]

Yeah, in the case of a small database it is not an issue, but if the in-memory DB add-on's export size grows to several MBs, it means more wear for the SD card. It is a decision, another trade-off.

What you're saying here is that the latest sensor data in RAM is intended, for example, for the temperature over the past 10 minutes, but the long-term data exported to SD is intended, for example, for the temperature over the past 10 months. Right?

Not completely right. The "latest" AND the "long term" data are both in RAM; that is where the real database lives. It is regularly exported to SD in its entirety. So, in the case of an hourly export, after let's say 50 minutes, the latest 50 minutes exist only in RAM, and everything older exists on the SD as well, as a snapshot taken 50 minutes ago. So with hourly exports you can lose only 0-60 minutes of data (the part that is only in RAM); everything else is both in RAM and on SD.

“Auto update” toggled ON right now

Spending nearly 4 decades around IT, I still don't have the courage to turn on anything automatic on "critical" stuff. :smiley: My heating is critical. :smiley:

bash, eMMC

Yeah, bash, the CLI, ssh to HA or even ssh to the host OS are a bit hidden in the docs; they don't want absolutely inexperienced new users to ruin everything.

The main problem with eMMC (on an SD adapter) is that it is expensive: my 32GB eMMC, even ordered from a "far, big, eastern" country :stuck_out_tongue:, cost nearly as much as my second-hand rPi 3s. But if I have to travel 200+ km to recover a failed SD, that costs even more. This is another trade-off.

Again, thanks so much for your gracious and well-expressed reply.
I've been running test RPi systems here for a few days now, and here are some results.

THE MAIN THING THAT I’VE LEARNED FROM YOU
To quote myself: "how can I demonstrate to myself that In-memory MariaDB is in fact working and that the database is using RAM rather than the SD card?"

And to gather and summarize how I understand your answers:
On a new install’s (i.e. a test install’s) CLI run the commands:

docker exec -it addon_45207088_mariadb /bin/bash
df

and verify that there is no reference to any tmpfs mounted as /tmp

Then install, configure, and start In-memory MariaDB per the GitHub docs and run the command
df

and expect to see an item like:

Filesystem           1K-blocks      Used Available Use% Mounted on
...
tmpfs                   204800      3976    200824   2% /tmp

and maybe see the second number, "Used", increase at first as history starts being stored.

SOME OTHER THINGS THAT I’VE ALSO LEARNED FROM YOU:

  1. Don't auto-update software (how quickly you convinced me!)
  2. The sequence from Recorder → History in RAM, then RAM → disk drive isn't what I thought it was (so I'm going to study the HA docs about their inner workings; that's not a subject for this thread)
  3. Notably: "The "latest" AND the "long term" data are both in RAM; that is where the real database lives. It is regularly exported to SD in its entirety."
  4. In HAOS it's possible to use bash and other features that are not widely documented (also not a subject for this thread)
  5. There is such a thing as an eMMC (on an SD adapter) to replace SDs without requiring a separate SSD
  6. Once my testing is finished and I start building my sons' actual HAs, I will include only a minimum of history

The two HA instances that I’ll end up sending to my sons will be in two corners of the continental USA while I’m in a third corner, so I especially appreciate your help! It would be phone calls for me rather than 200+km travels, but I still want to minimize service calls!!

Only one point to clarify: docker exec -it addon_45207088_mariadb /bin/bash drops you into the internal filesystem of the already installed and running in-memory add-on's docker container; with df you can see that /tmp is a separate tmpfs with a configurable, limited maximum capacity, and that the HA database is at /tmp/databases/homeassistant. In this add-on, tmpfs is not simply enabled (that is an add-on parameter saying whether it wants a tmpfs mounted at /tmp); I also limit its size.

And note: when the database is exported from RAM to SD, it is compressed on the fly to minimize SD wear as much as possible.

Oh, and the recorder integration (the HA component that stores the states into a DB) has a commit_interval parameter. The only problem with it is that if you increase it to e.g. 1 minute, you can't read the data sitting in the commit "cache", i.e. you won't see the last 1 minute on the charts. This was the reason I started this add-on; maybe it has changed in the past year(s), but I don't think so.
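For reference, that parameter goes into configuration.yaml like this (the value is in seconds; db_url as in the add-on's docs):

recorder:
  db_url: mysql://homeassistant:PASSWORD@45207088-mariadb/homeassistant?charset=utf8mb4
  commit_interval: 60  # buffer DB writes for 1 minute; the buffered minute won't show on charts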

And some off topic: add ssh access to the system (use the community add-on). And don’t forget to enable port forwarding on the routers. :wink:

PSA/FYI: New function

Periodically purge old statistics values. Enabled by default (purge older than 6 months).

Home Assistant never deletes old statistics data, so the database size can outgrow the memory limit.

Optionally it can archive the old statistics values (export them with mysqldump to the /share folder before deleting them). Because why not.
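As a sketch of what "archive then delete" means (simplified, with hypothetical table/column/file names and credentials omitted; the add-on's real queries may differ):

# export the old rows to /share first, then delete them from the live DB
mysqldump homeassistant statistics --where="created < NOW() - INTERVAL 6 MONTH" | gzip > /share/statistics_archive.sql.gz
mysql homeassistant -e "DELETE FROM statistics WHERE created < NOW() - INTERVAL 6 MONTH"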

So currently it behaves more or less identically to a true database; the only situation where you can lose some data is a power outage. In that case the periodic data export function is useful: you will lose only the data recorded since the last periodic export.

I use daily periodic export, and monthly statistics purge with archive. So I can lose at most the past day's sensor data, but I can still draw nice half-year-long statistics graphs (even after a power outage).

Hi, I'm running a system where MariaDB is already installed and available. I would like to create a DB for HA and integrate it with this extension, to get the in-memory functionality and reduce writes to the SD card.
Is it possible to do what I mean?

If you want to import a "legacy" MariaDB into this add-on (see the sketch after the list):

  • stop HA core
  • mysqldump the old homeassistant database
  • login into this add-on’s command line with docker exec -it addon_45207088_mariadb /bin/bash
  • import the dump
  • modify HA configuration.yaml (see the add-on’s documentation)
  • start HA
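A rough sketch of the dump and import steps (hypothetical user and file names; adjust to your setup, and you will be prompted for the passwords):

# on the old server: dump the existing database
mysqldump -u homeassistant -p homeassistant > homeassistant.sql
# inside this add-on's container, after the docker exec above: load the dump
mysql -u homeassistant -p homeassistant < homeassistant.sql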

If you want to use this add-on next to an HA Container installation (see: Installation - Home Assistant), that is not possible. Although add-ons are docker images, they require quite some infrastructure around them to run (HA and Supervisor APIs, managed configuration options, the backup/restore system, etc.), so add-ons can't be used if you installed HA Container.


Thanks for this add-on!
I am running the latest HassOS directly on an x86-64 laptop, installed on the hard drive.

Add-on version: 2.6.1.2
You are running the latest version of this add-on.
System: Home Assistant OS 11.1 (amd64 / generic-x86-64)
Home Assistant Core: 2023.11.3
Home Assistant Supervisor: 2023.11.3
After 1 night of running it, I can hear the hard drive still spinning and being accessed.

Can I expect the hard drive to spin down?

I think HA writes to the drive quite regularly on its own:

  • the HA core log file (the home-assistant.log file)
  • each add-on has its own log that goes into the host's log system, and some of the add-ons are quite chatty; at the very least the regular healthcheck causes some logging
  • HA saves its state in a lot of files under the .storage folder

So I think it is impossible to have an HA installation that never writes to the hard drive or SD or eMMC.

If you configured this in-memory DB add-on to never export its in-memory DB to storage (SD card, hard drive), it shouldn't write to the log or the storage at all, so I think it is something else that writes to the drive regularly.
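If you want to find out what is still writing, one rough check from the host OS shell is the kernel's per-device write counter (a sketch, assuming the drive shows up as sda; field 10 of /proc/diskstats is sectors written):

awk '$3 == "sda" { print $10 * 512, "bytes written" }' /proc/diskstats
sleep 60
awk '$3 == "sda" { print $10 * 512, "bytes written" }' /proc/diskstats
# if the two readings differ, something wrote to the drive during that minute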

Hello.
Here I tried to do a little comparison using In-memory MariaDB:
https://4pda-to.translate.goog/forum/index.php?showtopic=871505&st=44960&_x_tr_sl=ru&_x_tr_tl=en#entry129852450

Wow! Nice work! As I understand it, the SD card writes during the 3-day test periods were:

  • SQLite: 1GB
  • MariaDB: 420MB
  • In-memory MariaDB: 130MB

I would like to add:

  • By default, In-memory MariaDB does not write anything to the SD card at 02:00.
  • Though even if it were turned on, with your approx. 100MB database that would be only around 30MB (5+10+15MB) during the 3 days; and indeed I don't see any uptick in the "eMMC written" chart around 02:00.
  • So my assumption is that with the default settings, HA on its own has written 130MB to the SD card, MariaDB added another 290MB, and SQLite added another 870MB of wear to the SD card during these 3 days with your setup.

Yeah, HA saves its internal state as text files in the /homeassistant/.storage folder as states change, plus a lot of log data is stored by the OS, but I thought it was less than 40MB/day. Let's say, with a 32GB SD card/eMMC, the whole card gets written over approximately once a year. Hmmm, it should work, even if I thought it would take more time. Nice new info, thank you!

I redid the stats with the included commit-to-archive enabled every day at 2:00 (still the same link at the top, under item 3; two spoilers for comparison).
There are almost no differences at all.

I would also like to add a few wishes:

  1. I use an "In-memory MariaDB Memory usage" sensor defined as a command_line sensor:
command_line:
  - sensor:
      name: In-memory MariaDB Memory usage
      unique_id: inmemory_mariadb_memory_usage
      # ask df inside the add-on's container for the Use% of /tmp and strip the '%'
      command: ssh deoptim@localhost docker exec -i addon_45207088_mariadb df /tmp | awk 'NR == 2 {print $5}' | tr -d '%'
      value_template: "{{ value | round(1) }}"
      scan_interval: 600
      unit_of_measurement: '%'

Since otherwise I don't know whether the allocated tmpfs space (200MB by default) for the database is running out or not. It would be great to display this kind of statistic separately as entities.

  2. As I understand it, if we make a daily commit to the archive, there are limits on the database size (the larger the database, the longer we need to make the commit-to-archive period for it to make any sense).
    In that case it would be good to split the commit file into periods (say, weeks). What do you think?

Yeah! :slight_smile: I see the small eMMC writes at 02:00, approx. 1, 5, 10MB; that is the wear the in-memory add-on puts on the SD card! :slight_smile: My assumption was correct: all the other 100+ MB is HA alone. :frowning:

  1. tmpfs usage sensor:

    • It was on my wish list also.
    • The current sensors (CPU percent, etc.) are provided by the Supervisor and are hard-wired; they can't be extended.
    • A hacky solution is to use the REST API of HA to periodically inject sensor states (see the sketch after this list). This is what I implemented, see: Report tmpfs usage diagnostic values as sensors · lmagyar/homeassistant-addon-mariadb-inmemory@37dbe3f · GitHub. I'll test it for a few days in my beta repo, then I will release it. I tried to follow the strange naming of the sensors created by the Supervisor.
    • A better solution would be to create an integration (with devices and entities), giving these sensors a unique_id and making it possible for users to edit their names, icons, etc. But I will not do this. The only argument for it would be that the REST API doesn't support unique_ids, see: HA.core REST API POST unique_id for entities?, but I hope there will be some standard solution in the future.
  2. incremental exports:

    • It is not completely clear what you meant by: "As I understand it, if we make a daily commit to the archive, there are limits on the database size (the larger the database, the longer we need to make the commit-to-archive period for it to make any sense). In that case it would be good to split the commit file into periods (say, weeks)."
    • The add-on uses mysqldump+gzip to save all the databases regularly, and it overwrites the previous dump.
    • So incremental dumps could decrease the wear on the SD card, but it isn't worth the effort. The HA database has 2 main parts:
      • States and Events tables: these are purged regularly and are usually the bigger part of the DB. But they hold only a few days of data, so their content will be genuinely different when the next dump runs.
      • Statistics tables: HA never deletes any data from them; the add-on deletes it (because we have limited storage space) after, let's say, 6 months. We could decrease the SD card wear by dumping only, say, the last day of them, but recovering from 180 dumps in a reliable way would be a nightmare, and we would save maybe 1-10MB of SD card wear per day, while HA writes 100+ MB daily whatever we do. So I won't implement this.
    • UPDATE: Currently the add-on has to keep 2 consecutive copies of the dump, because it is possible that a dump is not written to the SD card completely before a power failure, even if the OS reports that it handed over all the data (this happened to me due to a faulty capacitor on my eMMC-SD adapter board). So there is an elaborate logic to handle even these 1+1 sqldump files; I'm sure an incremental version of this would cause more problems than it would solve.
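For the tmpfs sensor in point 1: injecting a state through HA's REST API looks roughly like this from inside an add-on (a sketch with a made-up entity id; the actual implementation is in the commit linked above):

# POST a state to HA core through the Supervisor proxy (SUPERVISOR_TOKEN is provided to add-ons)
curl -X POST \
  -H "Authorization: Bearer ${SUPERVISOR_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"state": "2", "attributes": {"unit_of_measurement": "%"}}' \
  http://supervisor/core/api/states/sensor.inmemory_mariadb_tmp_use_percent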

And the new version (with the sensors) is out. :sunglasses:

This is an awesome add-on; I want to start by thanking you for creating and maintaining it.

I'm assuming this thread is for general inquiries, so please direct me to where I can get a couple of general questions answered:

Is there a set of documentation that discusses the general technical improvements made in this fork? I'm starting from the assumption that this add-on makes some general config changes to store the DB on a tmpfs block device…is that fundamentally it, or are there features enabled that go beyond the tmpfs redirection? I get that this can be found in the repository, but a summary is so much nicer to start with :grin:.

How does this work in concept? You start with a DB installed into tmpfs…how do you persist between reboots when configuring the periodic update interval? Is the gzipped file just a raw data export into a delimited file format, or is it a binary representation? What's the command used to do the export? What about the import? Does it block HA's load during startup, or does it run in parallel? What happens to the data if HA runs in parallel while the add-on has not started yet? What about the data load: if I had 32GB in memory, does it suck it in through multiple files to create the new DB? Does this use internal capabilities of MariaDB to accomplish this task?

I'm digging deeper into this as I am more of your fringe case. I log everything. Always. I am a heavy IoT-centric user with over 500 WiFi-enabled sensors and switches. I need to fundamentally understand what happens when I run out of space on tmpfs, as I will need to balance the substantial performance increase in reads when visualizing sensor data over a long period of time using the stock analytics of HA vs. the cost of additional RAM added to the multiple physical hosts that run the HA guest. I virtualize HAOS with libvirt and also use the live migration features to move the guest to other hosts in my home.

I have around 95GB of sensor data over a year in the stock SQLite DB. My HA runs as a VM on the amd64 architecture; I care more about blazing performance than about write cycles. Has there been consideration of creating a "store 8GB in memory and the rest, 80GB, on disk" mode?

Thank you again for your time on this project - it’s exciting to see something you can throw some hardware at to make a fun experience in a UI with a crazy amount of data that you get to collect on your property.

The add-on has a documentation tab; everything is there (OK, it should be, I've updated it a bit now). And the differences compared to the official MariaDB add-on are highlighted at the beginning.

All the other differences compared to the official version are not so important: e.g. it uses a different base image, some DB config tuning, that's all. There were some s6-overlay optimizations, but those got merged into the official add-on already.

To answer your specific questions:

  • There are some tweaks in mariadb/rootfs/etc/my.cnf.d/mariadb-server.cnf, basically configuring the smallest values for anything that can take up space on the file-system or that isn't really needed when all the data is in RAM (a sketch follows this list). But due to my very limited MariaDB experience/knowledge, I'm sure these are not optimal.
  • The gzipped file is a mysqldump SQL file; it can be executed anywhere. I've added this to the docs. It is slow to export and slow to import, but it has the advantage that the deleted statistics values can be imported anywhere if somebody needs them.
  • Yes, it blocks HA's DB access during import (not HA completely). Otherwise HA would start its statistics calculations immediately, during the data import, and that leads to a catastrophic mess. That is the reason some DB-related warnings should be ignored during startup (see the logger config in the docs).
  • Do not try it with a 32GB database, even if you have enough memory. :slight_smile: Importing SQL and dumping it is quite slow; that's not a problem on a <100MB DB (approx. 1 minute on an rPi 3), so I've never thought about using other techniques.
  • Running out of space: huhhh, I can't remember exactly, but nothing good happens. I can't remember whether InnoDB was still able to answer queries when it was out of disk space; as I remember, it required a restart to function properly, but an add-on restart is required anyway after changing config values like the tmpfs size. Better not to try this. :slight_smile:
  • No, it can't split the DB between memory and disk. If you need this level of performance tuning, you need a real MariaDB (or any DB) installation and fine-tuned settings. I have zero expertise to fine-tune MariaDB at this high level. :frowning:
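To give a flavor of the config tweaks in the first point, they are along these lines (illustrative values only, not the actual file contents; see the file in the repo):

[mysqld]
datadir = /tmp/databases            # keep the data files on the tmpfs
innodb_buffer_pool_size = 16M       # tiny cache; the data already sits in RAM
innodb_log_file_size = 8M           # small redo log to save tmpfs space
innodb_flush_log_at_trx_commit = 0  # crash durability is pointless on a RAM disk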

The main goal of this add-on is to run on rPi-level devices with 1-4GB RAM and a 1-200MB DB, to save their SD card from regular damage by write wear. What you need (I think) is performance tuning of a real DB installation.
