Home Assistant OS Release 8

Lots of good points, thanks. Fully agree it's not ideal for an OS, but I chose the Pro Endurance not for speed but for longevity, as in theory they are designed to cope with "write intensive" applications and the price differential was not huge. In the real world I found the performance to be perfectly fine, but it still got whacked by too much activity. I'd like to stick with hassio, but I'm contemplating moving back to either a different OS, or even a container under Docker, so the problem will just go away. In the meantime, I'll just keep feeding it cards… :wink:

But these "write intensive" tasks are "big chunks" or sequential writes (like video or images). This type of writing will probably have a WAF (write amplification factor) close to 1, because it will almost always write full "pages" (the host writes will very much equal the flash writes). So this is the perfect scenario for flash storage, as there is virtually no write amplification and you get the maximum lifetime out of your flash storage.

If you look at the details for this card: MicroSDXC PRO Endurance Memory Card w Adapter 128GB Memory & Storage - MB-MJ128GA/AM | Samsung US

Continuous recording up to 25x longer than speed-focused cards¹ gives you long-lasting, best-in-class endurance up to 43,800 hours (5 years)². The PRO Endurance secures data with an industry-leading limited warranty up to 5 years³ and captures surveillance footage up to 128GB⁴ of storage space.

You see that the advertisement really only claims to last longer under permanent, continuous sequential writes (WAF = 1), aka no write amplification :warning:

But that is quite the opposite of what HA does with your SD card! The writes HA produces are very small (random) chunks of data that are written every 5 seconds, will in most cases never fill a whole page, and therefore have a very high WAF (many more flash writes than host/filesystem writes).

As an example, think of a car that claims 1000 km on one full tank/charge. This claim is usually theoretical - tested in a scenario with no wind, a completely flat surface, a completely empty car (often even without seats and driver), etc. The real-life range will most likely be smaller than the seller claims - still, you might get 80% of it on tours outside the city when driving consciously. In the city, with traffic lights and lots of stops, the range may already degrade to 50% of what the seller claimed. But even that scenario is still not what HaOS does with our SD cards. Think about turning your car on, moving it one meter :warning: and turning it off again. What mileage can you expect with usage like that? Maybe 10% of what the seller claimed - maybe 5%? Maybe less? Most likely your warranty will be void too, because you are using the car totally outside the specs it was meant for… (SD cards also typically have a "limited" warranty :wink: )

And that is roughly how HaOS treats our flash storage. The mileage will vary, but with a better-suited commit interval the life expectancy of our SD cards could be greatly improved (with the same usage pattern!)

That's really sad actually. Not so much for you, as you are aware of the problems and that they could easily be fixed in software, but rather for all the other users who don't have a clue that their SD cards are actually being killed by a software (mis)configuration in HaOS and that their lifetime could be greatly (by many years) longer than it is.

Especially in these times we should be conscious about not generating unnecessary amounts of waste - even more so when one line of code could fix this for (probably) thousands of users…

recorder:
  commit_interval: 10

That should make a difference.

1 Like

A difference, maybe - it could even make things worse on the flash layer in the end, by resulting in higher write amplification (more host writes) = earlier-dying flash storage.


Your setting is on the application level (indeed influencing the DB writes on the FS layer), but the "real" problem here is at the filesystem (host) level in conjunction with the flash level.

If HaOS (or any program/add-on/etc.) only wants to write 1 byte in a 5-second interval, it will still cause a full page write.

If we do the calculation and assume a page is 16 KiB, then the WAF (write amplification factor) is 16384 divided by 1 = 16384. This means the host (filesystem) wanted to write :one: byte, but on the flash layer it results in :one::six::three::eight::four: bytes being written and worn. The "extra" 16383 bytes will actually cause more write amplification in the future, because that one intended byte needs to be rewritten again to "free" the page for new data. The problem is that the minimum amount a flash storage can erase is a block, which typically consists of 256 pages.
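To make that arithmetic concrete, here is a minimal sketch of the worst case (the 16 KiB page size is the same assumption as above, not a measured value for any particular card):

```python
# Minimal sketch of the write-amplification arithmetic above.
# The 16 KiB page size is an assumption taken from the post,
# not a measured value for any real card.
import math

PAGE_SIZE = 16 * 1024  # assumed flash page size in bytes

def waf(host_bytes: int, page_size: int = PAGE_SIZE) -> float:
    """WAF if a write smaller than a page still burns a whole page."""
    pages_touched = math.ceil(host_bytes / page_size)
    return (pages_touched * page_size) / host_bytes

print(waf(1))          # host writes 1 byte  -> WAF = 16384.0
print(waf(PAGE_SIZE))  # host writes 16 KiB  -> WAF = 1.0
```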

@mobile.andrew.jones the problem here is really the commit interval of the data/storage filesystem, which is "burned into" HaOS and is nothing you or I can change (and certainly not something that is available to configure in YAML!)

Do you have proof or actual measurements for these statements?

ext4 is probably a lot less bad than you think. ext4 introduced the concept of extents, which essentially causes the file system to allocate space in larger, sequential blocks. That, and other optimizations geared towards SSDs (e.g. relatime), makes ext4 not unsuitable as an FS for flash storage. After all, it was the default for many distros even on SSD drives. Also, Android used it for a long time (and still does for some configurations, I think? Not sure), and Android essentially exclusively runs on flash storage.

Afaik HaOS uses ext4 with the defaults (commit interval of 5 seconds), which will probably cause heavy write amplification for most users. :floppy_disk::hammer:

Note that the 5 s only applies to data which has reached the (ext4) file system level. A write() first causes a dirty page in the page cache, which gets written out every dirty_expire_centisecs (which is 30 s). So in the absence of fsync() etc., data is typically only written after ~30 s (see also this post).
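For reference, those writeback timers can be read on any Linux host - a small sketch, assuming the standard /proc layout (the usual kernel defaults are 3000 centiseconds = 30 s for expiry and 500 centiseconds = 5 s for the writeback wakeup):

```python
# Read the kernel writeback timers mentioned above.
# Values are in centiseconds, so divide by 100 to get seconds.
from pathlib import Path

for name in ("dirty_expire_centisecs", "dirty_writeback_centisecs"):
    centisecs = int(Path(f"/proc/sys/vm/{name}").read_text())
    print(f"{name}: {centisecs / 100:.0f} s")
```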

Afaict, the problems are on higher layers, where small writes to storage are requested. E.g. a simple INSERT INTO statement sent to an SQLite database (using an implicit transaction) will cause an actual write to the flash! SQLite is an ACID-compliant database, and the D(urability) in ACID requires it to make sure the data hits the disk (think fsync). That will cause writes on F2FS or ext4, no matter what commit interval.
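As a rough illustration of that difference - a minimal sketch using Python's built-in sqlite3 module, with a made-up table and file name, not the recorder's actual schema:

```python
import sqlite3

con = sqlite3.connect("demo.db", isolation_level=None)  # autocommit mode
con.execute("CREATE TABLE IF NOT EXISTS states (entity TEXT, value TEXT)")

# Worst case: in autocommit mode every INSERT is its own implicit
# transaction, so SQLite makes each one durable on its own.
for i in range(100):
    con.execute("INSERT INTO states VALUES (?, ?)", ("sensor.demo", str(i)))

# Friendlier: wrap many INSERTs in one explicit transaction, so the
# durability cost is paid once for the whole batch.
con.execute("BEGIN")
con.executemany(
    "INSERT INTO states VALUES (?, ?)",
    [("sensor.demo", str(i)) for i in range(100)],
)
con.execute("COMMIT")
con.close()
```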

The standard configuration of the recorder and its SQLite database backend has received quite some love in the last few years. It doesn't cause flushes to disk as much anymore, by pooling writes and using bigger transactions. However, there are countless other places where things write to disk. Just recently ZHA was found to cause quite some writes (see ZHA database writes (zigbee.db) · Issue #77049 · home-assistant/core · GitHub). But of course there are a lot more (custom) integrations and add-ons which might batter the storage.

In the end, the whole Home Assistant platform pushes the limits of cheap storage: we run 24h and have a mostly random write workload. From what I can tell, the number of reported broken SD cards went down. But as the platform grows older, and people run their installations for longer, reports of failing flash storage will continue to come in.

I am not against tuning ext4 parameters or even switching to F2FS, however, these things need a bit of thought. We cannot leave existing installations stranded. In general, instead of spreading FUD here, I'd prefer a PR from you so we can discuss the merits of a concrete change :wink:

4 Likes

Measurements (in software) are probably close to impossible with (an unmodified) HaOS on flash storage without SMART (like SD cards). On the other hand, measuring the WAF on SSDs (which feature SMART) doesn't make much sense, as they usually mitigate bad filesystem configurations (like a flush interval of 5 seconds) with a small amount of SLC flash which acts like a buffer before a full page write to the valuable TLC/MLC/… flash happens. That way SSDs can even achieve a WAF < 1 :muscle:

But consulting the technical sheets of a random flash vendor (for example SanDisk/WD) can give some clue:

7.0 SYSTEM OPTIMIZATION TECHNIQUES

7.1 Write Amplification

The higher the write amplification, the faster the device will wear out. Write amplification is directly correlated to the workload. This means pure sequential workload will produce the lowest WA factor and random activity will result in higher WA

HA produces a lot of this random/small workload, which could easily be mitigated by just using a sane commit interval on the filesystem. It's really the only "system wide" and guaranteed-to-work solution, as it's close to impossible to stop random scripts/programs etc. from writing to the filesystem layer (which will then at some point cause a flash layer write) whenever they want :point_down:

Some measurements can actually be found here, which compare the ext4 default with a custom commit interval of 10 minutes (together with deactivating swap and activating zram). The results:

Before: 8 times within 60 seconds a few bytes were written to the card, now it took 12 minutes for 8 write attempts using larger data chunks

Write Amplification significantly decreased.

That's a very interesting fact I wasn't aware of at all! Still, it makes me wonder: if this is the default, why (on average) was a "dirty" page write performed every 7.5 seconds in the linked test scenario, and not only every 30 seconds? :thinking:

I think one really has to distinguish here. HaOS certainly does push the limits (or maybe goes beyond them), but HA with another install method and "properly" configured filesystem settings is nothing that needs to kill (cheap) flash storage fast. My first HA install (for around 2 years) was pip-based, on an 8 GB SD card with Armbian on an SBC - that system runs to this day (~5 years later) with the same SD card, 24/7 (not with HA anymore, but some other small stuff).

Wearing out flash fast is not a necessity :warning:

That's very nice to read :+1: - the last time (I think it was on GitHub), I remember you writing that you didn't have any intention of changing this and wanted to stick with the defaults (from the hard-disk age).

I would love to be able to deliver that, but all I know is that I can edit /etc/fstab on a Linux install and add commit=600. Besides, until now I expected there would have been no chance that a PR like this would be accepted.
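For what it's worth, whether a commit= option is actually in effect can be checked from the mounted options - a small sketch assuming a standard /proc/mounts (ext4 only lists commit= in there when it differs from its 5-second default):

```python
# Show the mount options of every ext4 filesystem and whether a
# commit= interval is set; ext4 falls back to its default of 5 s
# when no commit= option shows up in the options list.
with open("/proc/mounts") as mounts:
    for line in mounts:
        device, mountpoint, fstype, options, *_ = line.split()
        if fstype == "ext4":
            commit = [o for o in options.split(",") if o.startswith("commit=")]
            print(mountpoint, commit[0] if commit else "commit=5 (default)")
```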

I am not questioning the write amplification problem. It is real, and I am aware.

What I am questioning is whether the commit interval is the culprit.

It can be a factor, for sure! But in the end it's only one variable in the whole system. A commit interval of 10 minutes does nothing if an SQL database calls fsync every 5 seconds.

That is an interesting link indeed, we probably can use the script to monitor writes and see what changes are doing to HAOS.
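Something along those lines could be as simple as sampling the sectors-written counter in /proc/diskstats - a rough sketch (not the script from the linked post; the device name is an assumption and needs to match your system, e.g. mmcblk0 for an SD card, sda for a SATA SSD):

```python
# Sample the sectors-written counter for one block device from
# /proc/diskstats and report the bytes written per interval.
import time

DEVICE = "mmcblk0"   # assumed block device name - adjust for your system
SECTOR_SIZE = 512    # /proc/diskstats counts 512-byte sectors
INTERVAL = 60        # seconds between samples

def sectors_written(device: str) -> int:
    with open("/proc/diskstats") as stats:
        for line in stats:
            fields = line.split()
            if fields[2] == device:
                return int(fields[9])  # 10th field: sectors written
    raise ValueError(f"device {device!r} not found")

previous = sectors_written(DEVICE)
while True:
    time.sleep(INTERVAL)
    current = sectors_written(DEVICE)
    print(f"{(current - previous) * SECTOR_SIZE} bytes written in {INTERVAL}s")
    previous = current
```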

However, I would be very careful about applying their results to HAOS. Multiple changes were made to the system, and swap was also disabled. HAOS has no swap on the storage. Also, they use an idle system as the benchmark. That is vastly different from a system which actually does something :sweat_smile:

Lastly: 10 minutes commit interval seems really long! I don’t think that is acceptable in a production system.

What database system are you using on that installation? With just Core there are fewer opportunities for writes, of course.

I'll prepare a PR, it should be easy enough. I'll probably push it to 30 s or 60 s; I don't want people to lose too much data. In any case, I'll also make some measurements to see how it affects writes. Just turning knobs without verifying that they have the intended effect is wrong. :no_good_man:

I expect it will have quite some effect on a new/vanilla/non-active installation. However, I expect that my production system (which uses MariaDB and InfluxDB) will not see that much of a difference as these databases likely sync data to disk quite frequently. We’ll see :slight_smile:

1 Like

I expect it to be one of the most important ones, as it "catches" all the little "write" attempts on the filesystem layer.

Indeed, if a flush/sync is called, that interval is overridden - but that's also the idea behind it, actually.

A commit interval of 10 minutes also doesn't mean the system will always wait 10 minutes before it writes. It only means it waits up to a maximum of 10 minutes to write a dirty page - if a whole page can be written after 1 or 2 minutes, the system should flush/sync it, and that should result in a write to the flash storage. (from my limited knowledge)

That would indeed be very nice, to visualize the harm (hopefully very little) that's caused by the default ext4 commit interval, and it could even show how an increased interval does (hopefully again) less harm/damage to the flash storage by minimizing the wear of the flash cells.

That's as flash-friendly as it can get regarding swap and flash storage :+1:
Any idea what is reported as swap on a HaOS system via the systemmonitor integration?


Well, I have had it for years on all my systems that run from cheap flash (mostly SBCs with SD cards or other eMMC/NAND modules that I don't want to kill) and never had any problems with it. The "only" thing to expect is that, on a sudden power loss, the amount of data lost can (in theory) reach a maximum of 10 minutes' worth.

Maybe this would be a good setting to actually let the user choose? Because if someone runs, for example, HaOS on a Raspberry Pi but with an SSD, that person might want to choose the 5 seconds, but another user with a USB stick or SD card might rather choose 300, 600 or even 900 seconds and trade a (possible) small data loss on power failure against a greatly (hopefully by years) improved lifetime of their flash storage?

Besides, there is almost no cheaper/easier UPS than for 5 V SBCs. With luck, even a random power bank from the drawer can be charged and discharged at the same time and can act as a poor man's UPS :battery: That way the possibility of data loss on power failure can be greatly reduced :ok_hand:

This particular installation is no more (only the SBC and the same SD card are still in use). But I have always just used the default DB (SQLite), which - I expect - most users probably do too?

Maybe this could really be a setting? Rather than trying to fit both ends and choosing a value in the middle that doesn't really suit either side…

In onboarding, the user could be asked whether "cheap" flash storage like an SD card/USB stick/etc. is used or whether the system is installed on an SSD/HDD - based on that, either a high or a low commit interval could be set. Or it could even be set "freely" somewhere in the system/hardware settings if the user has advanced mode enabled.

:muscle:

Do a lot of users use MariaDB? I tried to squeeze it out of the analytics, but by the looks of it, it is not listed as an integration. I only found SQL at place 91 (3275 installations or 2.5%) - if that includes MariaDB, then it doesn't look like a frequently used setup (even less so if that were narrowed down to only the HaOS install type).

So while it might make no big difference for you or other people who use more expensive flash storage (like SSDs, which mitigate a low commit interval and also have abilities like TRIM, which SD cards don't), it would be great if an OS that is (also) available for SBCs treated "cheap" flash in a graceful manner :heart_decoration:

We use ZRAM (compressed RAM block device) as a swap “device”. Essentially trading CPU cycles for more memory.
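If anyone wants to verify that on a running system, the active swap devices are listed in /proc/swaps - a small sketch (on HAOS the entry should be a /dev/zram* device rather than a file on the data partition):

```python
# List the active swap devices and flag whether each one is zram-backed
# (RAM, trading CPU for memory) or lives on a real storage device.
with open("/proc/swaps") as swaps:
    devices = [line.split()[0] for line in swaps.readlines()[1:]]

for device in devices:
    kind = "zram (RAM-backed)" if "zram" in device else "on storage"
    print(device, "->", kind)
```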

The add-on analytics says 17.7%, see Installations | Home Assistant Analytics.

The recorder backend uses SQLAlchemy, and I expect that SQLite and MariaDB cause somewhat similar writes (as we pool writes and transactions the same way). Of course there will be some difference, but it won’t be huge.

I only have this production instance. I can try to generate some data on test instances running SQLite, but it won't be fully representative :slight_smile:

2 Likes

Hi, I recently migrated my installation from an SD card to the internal SSD of a generic x64 machine (an old laptop), and now GRUB will not start the OS automatically; I have to manually hit "enter" to boot it. Does anyone know how to fix this?

Editing the boot options is mostly done from the OS when booted. I tried to use the GRUB command line too, but I can't seem to persist any edit I make to the boot menu. So you'll need an SSH connection to the host when booted and then use the edit options from the OS. You'll need to locate the grub.cfg file, afaik.

1 Like