15,000+ Entities in Home Assistant? Seeking Advice for Large-Scale Setup

Great suggestions, @mirekmal. When reviewing the HA docs, I raised an eyebrow at using SQLite for an operation of this size. The initial goals will be heavily skewed toward monitoring and data logging while we determine reliability. We have some batch processes that take close to a year to complete (so seconds/minutes tend to be inconsequential), so it’s critical that we can log and manage that data effectively.

I have some experience with Postgres, and from my understanding it scales extremely well. I’ll look into what the database migration process involves, and then see if I can upgrade another tower and run it as a dedicated machine. I should probably do the same for my separate Django project; it already runs on Postgres, but since it’s going to be a MASSIVE database, moving it to a separate computer sounds like a good move.
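For reference, pointing Home Assistant’s recorder at an external PostgreSQL instance is mostly a one-option change in `configuration.yaml`. The host, credentials, and retention values below are placeholders to adapt:

```yaml
# configuration.yaml -- hypothetical host/credentials
recorder:
  db_url: postgresql://hass:change_me@192.168.1.50/homeassistant
  commit_interval: 30     # batch writes; fine when seconds don't matter
  purge_keep_days: 365    # year-long batch processes need long retention
```

Note that `db_url` only points new data at Postgres; migrating existing SQLite history is a separate (optional) export/import step.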

Thanks! Well, I have my ~100 Zigbee devices already. Zigbee2MQTT is beautiful. I know Ubiquiti has features on the network side to “rescan and reset” the 2.4 GHz channels to avoid interference. I’m hoping this helps in the journey. We are spreading most of these sensors across 20,000 sqft, but we had serious Wi-Fi congestion problems when I tried IoT stuff back in 2021 (on the crummy old network). I’m hoping to leave that in the past. I’ll do my best to avoid buying more smart stuff until I see how the results come back.

Good idea with building automatic notifications to determine downtime. I was scratching my head trying to figure out just how to do this.

I assume Zigbee devices becoming “unavailable” would become a big issue if a scheduled automation relies on that device, or if it’s a time-based event (the device is available when the countdown starts, unavailable when the countdown ends, and therefore stays on when it should be off).
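One way to catch that failure mode is an automation that alerts when a critical device drops off the network. A hedged sketch (the `entity_id` is a made-up example, and the 5-minute grace period is a guess to filter brief blips):

```yaml
automation:
  - alias: "Notify when a critical Zigbee device drops off"
    trigger:
      - platform: state
        entity_id: switch.irrigation_valve_1   # hypothetical entity
        to: "unavailable"
        for: "00:05:00"    # ignore brief blips
    action:
      - service: notify.persistent_notification
        data:
          message: "{{ trigger.entity_id }} has been unavailable for 5 minutes"
```

A group helper containing all critical devices would let one automation watch everything instead of listing entities individually.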

I was aiming to purchase sensors and controls rather than build them. I was shocked at how easily Zigbee2MQTT integrates and at the vastness of “ready to use” sensors.

ESPHome looks fantastic; I was able to get a few devices flashed and play with DS18B20 sensors. I was eyeballing the Ethernet ESP devices (to avoid Wi-Fi)! I found some AliExpress listings that look favorable (about $3 each in bulk-ish quantities). Maybe they’re even PoE :-D, I’ll have to look into this. Either way, a cable is better than RF and a battery.

I appreciate the insight, and I’ll pause my Zigbee purchases while exploring this route.

Come on, mariadb/mysql/postgresql run a lot of the internet!

2 Likes

I’d encourage the use of containers over physical machines, as it’ll make your life easier. You’ll likely want a dev environment, a QA/test environment, and a prod environment. With containers and docker-compose it’s easy to spin new environments (HASS, Postgres, MQTT, etc.) up and down, easy to back everything up, easy to upgrade, and so on.
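A minimal docker-compose sketch of that stack; image tags, volume paths, and the password are assumptions to adapt:

```yaml
# docker-compose.yml -- a minimal sketch of the HA + Postgres + MQTT stack
services:
  homeassistant:
    image: ghcr.io/home-assistant/home-assistant:stable
    volumes:
      - ./config:/config
    network_mode: host        # simplest for device discovery
    restart: unless-stopped
  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: homeassistant
      POSTGRES_USER: hass
      POSTGRES_PASSWORD: change_me
    volumes:
      - ./pgdata:/var/lib/postgresql/data
    restart: unless-stopped
  mosquitto:
    image: eclipse-mosquitto:2
    ports:
      - "1883:1883"
    volumes:
      - ./mosquitto:/mosquitto/config
    restart: unless-stopped
```

A second copy of this file with different volume paths and ports gives you a throwaway test environment on the same host.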

1 Like

Redundancy. Make sure two or more sensors agree; otherwise, you have a fault.
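That cross-check can be expressed directly in Home Assistant as a template binary sensor that flags whenever two redundant sensors disagree (the float-switch entity IDs below are hypothetical):

```yaml
# configuration.yaml -- hypothetical entity_ids for two redundant float switches
template:
  - binary_sensor:
      - name: "Sump float disagreement"
        device_class: problem
        state: >
          {{ is_state('binary_sensor.sump_float_1', 'on')
             != is_state('binary_sensor.sump_float_2', 'on') }}
```

An automation triggering on this sensor turning `on` then covers both stuck-open and stuck-closed failures.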

It is. I started my home automation using MQTT and Tasmota firmware on the devices, but ESPHome is so much easier to set up on a device and integrates very well into Home Assistant. If you are going to have a number of devices identical except for the device name, adding `name_add_mac_suffix: true` to your devices’ YAML appends the MAC address to the device name, so you can install identical code on all of the devices.
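A minimal ESPHome sketch of that pattern, combined with the DS18B20 setup mentioned earlier (the base name and GPIO pin are assumptions; `one_wire`/`dallas_temp` is the current ESPHome syntax for these sensors):

```yaml
# Flash this identical config to every board; each gets a unique name.
esphome:
  name: greenhouse-temp        # base name; MAC suffix is appended
  name_add_mac_suffix: true

one_wire:
  - platform: gpio
    pin: GPIO4                 # hypothetical data pin for the DS18B20 bus

sensor:
  - platform: dallas_temp
    name: "Soil Temperature"   # with one sensor on the bus, no address needed
```

With multiple DS18B20s on one bus you would add each sensor’s 1-Wire address to tell them apart.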

I bought a wt32-eth01 and flashed ESPHome to it. Easily done, but it’s still in my drawer as I don’t have an application that needs the reliability of Ethernet. In your case it may be worth investigating further.
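For anyone trying the same board: the WT32-ETH01 is usually configured in ESPHome with the community-documented LAN8720 pinout. Treat the pins below as a starting point to verify against your board revision:

```yaml
# ESPHome ethernet block for a WT32-ETH01 (common community pinout)
ethernet:
  type: LAN8720
  mdc_pin: GPIO23
  mdio_pin: GPIO18
  clk_mode: GPIO0_IN
  phy_addr: 1
  power_pin: GPIO16
```

The `ethernet:` block replaces `wifi:` entirely; the device then needs no credentials and survives 2.4 GHz congestion untouched.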

2 Likes

Sure, but is it running on the same (relatively) low-performance machine as HA? Or on dedicated servers/clusters, or even a reasonable NAS with 10 GbE (or faster) connectivity and fast all-flash storage? This is what I mean.

To some extent, this is a project to improve my networking and automation skill set. We are planning on moving to farmland soon, and I’m trying to figure out how we’ll automate our greenhouses before we finalize the greenhouse drawings. I’ve found that planning ahead is much simpler than retrofitting.

I’d love to set up a cluster of the old computers and run a dedicated server. We have a dedicated (remotely hosted) server for our e-commerce site, and I’d love to understand more about creating clusters, distributing loads, etc.

I’ve been avoiding learning docker. I want to learn it, but it’s been low on the list while I learn the other technologies. It sounds like it’s time to start running through some tutorials.

I have yet to discover a good excuse for Docker, virtual machines, Proxmox, or other multi-purpose tools.

I suspect that most use these tools because they are too cheap to dedicate one computer to Home Assistant. OK, maybe that’s harsh. They only run a few automations and have a few devices, so they want to use the same PC for other applications.

You are proposing an industrial-strength installation. You won’t be playing Minecraft or browsing PornHub. It is a serious tool. You don’t need clusters since Home Assistant distributes the operating code across the devices. My advice is to install the Home Assistant x86 binary on an Intel NUC. An i5 or i7 should be sufficient. Buy two for a backup. Make daily offline backups. (I recently had a NUC go crazy. Rather than fix it, I simply installed the latest Home Assistant binary on it and performed a system restore from last night’s backup. My total downtime was about one hour. The problem NUC, by the way, had a bad RAM module. Fixed that and ran a stress test on it, and it’s ready for the next time I need a replacement.)

As far as planning, run a 2-inch PVC electrical conduit between all buildings. If you need Ethernet or Fiber between them for any reason, it’s a simple pull. I would run low-voltage wiring, 14 to 16 gauge, everywhere that you might want sensors or other devices in the future.

1 Like

I think this is a perfect use case for Proxmox. Clustering is great for high availability.

1 Like

That’s an interesting opinion to say the least.

@stevemann Float switches are super cheap, and their failure is pretty catastrophic. We will definitely have multiples of them and set up the necessary logic.

Thank you for the ESPHome setup details on the YAML part and MAC addresses.

I’m putting in an order for 20 of the Ethernet devices; pretty cheap at around $3.50 each.

Why run Docker or any other virtualization package just to run Home Assistant when there is a perfectly good x86 binary?

Because the maintenance is much, much easier. If your machine has enough power, Docker is way better than a bare metal install.

That might not apply to every project available on Docker, but for many it is valid.

2 Likes

Maintenance, availability, ease of use, backup, using your hardware for more than one piece of software, the list goes on. It has nothing to do with being “cheap”, that’s just silly.

2 Likes

My point exactly.

Buy a second PC for all the other software you want to run. I see no advantage in adding another layer of complexity (Docker, etc.) that supposedly makes maintenance or backup any easier.

I recently had to replace my Home Assistant NUC; it took less than an hour of downtime. Backups couldn’t be easier, as I use Samba Backup to make a full backup at 3 AM every day.

It would be a tough sell to convince me that I need Docker or Proxmox.

1 Like

WTF does that have to do with clustering to achieve the level of performance needed for this setup?

1 Like

Agreed, I am going down the wrong rabbit hole and not addressing the OP’s main question.

But does 15K entities exceed the capabilities of “bare metal” Home Assistant on capable PC hardware? My modest system has over 1,500 entities, and I am not even using half of my available 8 GB of memory. Since I am adding devices and entities almost daily, what indications would I see that my NUC is “hitting the wall”?

1 Like

It’s a bit funny that you list file-level backup, hardware replacement and «less than an hour» as positive points. In a virtualized setup you would have a few seconds downtime, at most.

Not everyone is using NUCs. You have people running it off all kinds of hardware: proper servers, NAS boxes, and so on. To dismiss one of the biggest gains in the industry over the last 10-15 years as «being cheap» is plain ignorant.