15,000+ Entities in Home Assistant? Seeking Advice for Large-Scale Setup

Hello everyone,

I’m venturing into a project where I intend to use Home Assistant to automate various aspects of my aquaculture business. The scale I’m looking at seems to be beyond what a typical home setup might involve, and I’d love to hear your experiences or advice on how best to approach this.

Background: I tried IoT automation in 2021 but ran into problems, mainly because of my vendor choice (devices had to ping the vendor’s servers) and my lack of technical experience. Since then, I’ve honed my skills in Python, Linux, and PLCs/HMIs. I am considering Home Assistant because of its impressive array of off-the-shelf integrations, and it’s been working great in initial tests.

Key Considerations:

  1. Entities: I estimate I’ll have over 15,000 entities. For context, I’m planning on several hundred IP/PoE cameras to monitor cultivation tanks from multiple angles, and each camera contributes 27 entities. I’m already beyond 2,000 entities, and my setup is running on an “upgraded mini tower PC”. I can provide stats if needed. Does anyone have experience running HA at this scale?

  2. Networking: Thankfully, I’ve upgraded to enterprise-grade networking hardware, so Wi-Fi congestion is less of a concern. The cameras will be connected via Cat6 cables, but I’d love to know if there are other unforeseen challenges I should watch for.

  3. Zigbee Devices: I’m also considering a few hundred Zigbee devices. I’ve read that I may need multiple coordinators running separate networks if I reach the coordinator limit. Has anyone faced challenges with, or can anyone offer insights into, managing a large number of Zigbee devices?

  4. Security: My HA network will be isolated using VLANs and blocked from the internet. However, I’m always open to additional security advice when dealing with such a vast number of IoT devices.

  5. Backups: Given the scale and importance of this project, regular and reliable backups are crucial. My Home Assistant instance resides on an “upgraded old PC”. I’m curious about best practices, strategies, or tools the community recommends for backing up both Home Assistant’s configuration and the entire system. I plan to routinely back up the Linux install and the Home Assistant installs to a local NAS and an offsite NAS.

  6. Critical Processes: I don’t fully trust IoT devices/plugs to turn a pump on and off when it is transferring 2,000 gallons of water from one tank to another. The second tank counts as full when physical float switches, pressure sensors, an ultrasonic flow sensor on the pump (reports in L/min), or ultrasonic height sensors say so (four methods to check “fullness”). I plan on using PLCs, both for easy integration with the hardware above and for my own comfort. I know there are “input boards” for Home Assistant, but I worry about connection issues or other failures that might occur. Am I being ignorant about the limitations of Home Assistant here?

  7. PLC/HMI integration through MQTT: This looks like a future project, but our HMIs can communicate via MQTT (as a client or as a client + broker). Communication with HA would be primarily for data logging, but I would like to build a few things that allow HA to run/initialize programs on the HMI. I understand that these programs, because they are connected to the physical world, need to be designed from the perspective of “this will fail; how many layers will you implement to catch different edge-case scenarios?”
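
As a rough sanity check on the entity estimate in item 1, here is a minimal Python sketch. It uses only the 27-entities-per-camera figure from above; the function names and the 300-camera example are made up for illustration, and it ignores Zigbee devices and anything else that adds entities.

```python
import math

ENTITIES_PER_CAMERA = 27  # figure quoted in item 1


def camera_entities(cameras: int) -> int:
    """Entities contributed by a given number of cameras."""
    return cameras * ENTITIES_PER_CAMERA


def cameras_to_reach(target_entities: int) -> int:
    """Smallest camera count whose entities alone reach the target."""
    return math.ceil(target_entities / ENTITIES_PER_CAMERA)


if __name__ == "__main__":
    print(camera_entities(300))      # 300 is a hypothetical camera count
    print(cameras_to_reach(15_000))  # cameras needed to hit 15,000 entities
```

So cameras alone would account for the 15,000 figure somewhere in the mid-hundreds of units, which is consistent with “several hundred” cameras.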

I aim to have a reliable and efficient system that tracks the system parameters (photos and cultivation parameters in our cultivation systems), monitors/logs power consumption, and eventually integrates with my PLCs/HMIs. Given my context, any suggestions, experiences, or general advice from the community would be greatly appreciated.

Thank you!


So you are putting a serious business on that? Invest in two computers and use proxmox to have redundant setups and really easy backups.

Also HA is not a video system. It has some really good video integrations, but you probably want an nvr.


As nickrout is saying, at this scale and criticality you should be looking at running it on hardware that is designed for high uptime / availability. An old pc has several single points of failure. Some decent server-grade hardware with redundant drives, maybe ECC RAM, IPMI ++ would suit this better. If you run a hypervisor on the hardware you can do snapshots before upgrading, fast backup and recovery etc.

  1. Try frigate or any other NVR for these cameras, and integrate that into HA for alerts etc.
  2. VLANs, routing, the usual :)
  3. Not here as a home user
  4. Try WireGuard or other VPN servers to connect remotely. Make sure FW is up to date. Perhaps even a light pentest to identify weak spots if this is for a business.
  5. 3 machines, all running Docker Swarm or Proxmox or other VM tech. Depending on choice, your backup will vary. NOTE: 1 copy is not a backup. You want something off-site, so prepare for that.
  6. comes down to network stability and protocols used by the PLC. It might be worth a separate cable depending on the tech.

In short, the more 9’s you want in your uptime, the more $ you are going to spend. If your processes are as critical as a bank’s, then 99.9999% might be for you, but then I wouldn’t host it myself.

If it only needs to function 99% of the time, then you might have a fighting chance!
Calculate your uptime here: https://uptime.is/
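
If you would rather do the math locally, the calculation is short. A minimal sketch, assuming a 365-day year; the function name is just illustrative:

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of allowed downtime per year for a given availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.9999):
        minutes = downtime_hours_per_year(pct) * 60
        print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99% that is roughly 87.6 hours a year, while six nines leaves well under a minute.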


Thank you. Yes, serious business, but most of our automation on home assistant will be turning lights on or off. Power monitoring/recording. Nothing will be high speed.

I have around 5 of these old computers, I’m not too concerned with power consumption.
The computer is a Lenovo ThinkCentre M83:

  • Intel Core i5-4570 3.2GHz
  • 32 GB RAM
  • 500GB SSD

I’ll investigate proxmox as I have been using KVM.

With regards to Cameras. Yes, I am running the Cameras to PoE Routers → NVRs. I figure I’ll be viewing the feed on occasion, but mostly I want it for time lapses.

I was looking into ECC RAM if I decide to go the DIY route for a NAS. I’ll read more into it! I haven’t heard about IPMI ++ before, thank you, more reading!

I am running KVM, so I’ll be sure to take snapshots. I assume I could use supervisord + crontab (or some other cron job/scheduler) to automate it, but looking around, it looks like there are already packaged solutions out there for scheduled backups. I think Proxmox, as @nickrout mentioned, has this feature integrated.

@belastingvormulier

  1. Thanks! I was looking at Frigate for some later integration of OpenCV/TensorFlow! I’m planning on using Reolink 12MP cameras with their NVR (ugh), but we will see how that goes.
  2. Sweet.
  3. Awesome.
  4. We are running on UI networking equipment. I’ve VPNed through wifiman/UI teleport with ease. With regards to the pen test, is this usually coordinated with a white hat security firm?
  5. I like it. I’m going to look more into proxmox.

I’ll look into uptime calculations. We aren’t doing anything high speed, and I don’t believe we are doing anything “time critical” on Home Assistant (if the LEDs turn on at 6:01 instead of 6:00, boohoo; even if they turned on at 6:10, I wouldn’t be crying… but I would be scratching my head).

Our current data-gathering intervals are usually on the order of days between measurements. If HA happens to miss a few readings (I’m planning on a reading every 10 minutes), I likely won’t even notice. I’ll get uptime monitoring going.

Thanks for the suggestions!

Another thing I forgot to mention is that you should look into using PoE / LAN variants of the Zigbee coordinator(s). That way you are free to move your HA installation around (in VMware it is called vMotion) rather than being locked to the server/PC that has a USB stick installed.


@fleskefjes That would be great! I saw another thread where people suggested the CC2652P2 Based Zigbee to PoE Coordinator 2023 – TubesZB, but it’s out of stock. I see they have an Ethernet/RJ45 + USB offering that is in stock. I’ll do some reading on this and see if there is an alternative device or stock somewhere else. I assume I would need to re-pair all of my Zigbee devices when migrating to the new PoE stick?

I’ll see if proxmox has something equivalent to vmotion.

Personally, if I were considering something at this scale, I would consider Ethernet-connected ESP devices and skip Zigbee altogether. Otherwise, you are potentially creating a different “Wi-Fi congestion” situation, just with a different protocol. IP-based wired devices can scale essentially infinitely. Since it appears you are installing a robust Ethernet network already, why deal with the potential RF issues? The downside is you’d have to build more sensors and controls instead of purchasing off the shelf.

I am certainly far from expert, but for what it’s worth:

  1. Few hundred Zigbee devices is a bit of a risk here. I have 50 running across 4 coordinators (1x Conbee II, 2x Aqara E1s and 1x Aqara G3). I constantly have small drops here and there (connection loss, disconnects, drops in LQI etc.). + Zigbee tendency to jump around for best signal (sometimes jumping into “Unavailable”). + potential congestion between Zigbee and Wi-Fi (even my comparatively tiny setup has this problem).

For my home it’s not as big of a deal, but for a business venture it may prove to be a KILLER “feature”. I appreciate that hardwired options are likely to be more expensive, but they are probably more reliable. Alternatively, I’d run a pilot of ~100 devices for 2-3 months before committing to spending a small fortune. Build notification automations around it so you notice when things go offline, and monitor LQI.
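
To make the “notice when things go offline & monitor LQI” idea concrete, here is a minimal Python sketch of the check itself, independent of how the data gets out of HA. The `DeviceReading` class, the device names, and the LQI threshold of 50 are all assumptions for illustration; the only thing taken from Home Assistant is that an unreachable entity reports the state "unavailable".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LQI_WARN_THRESHOLD = 50  # assumed cut-off; tune it during the pilot


@dataclass
class DeviceReading:
    name: str
    state: str           # e.g. "on", "off", "unavailable"
    lqi: Optional[int]   # link quality, None if the device did not report it


def find_problem_devices(
    readings: List[DeviceReading],
    lqi_threshold: int = LQI_WARN_THRESHOLD,
) -> List[Tuple[str, str]]:
    """Return (device name, reason) pairs for devices that need attention."""
    problems = []
    for reading in readings:
        if reading.state == "unavailable":
            problems.append((reading.name, "unavailable"))
        elif reading.lqi is not None and reading.lqi < lqi_threshold:
            problems.append((reading.name, f"low LQI ({reading.lqi})"))
    return problems


if __name__ == "__main__":
    sample = [
        DeviceReading("tank_3_temp", "21.4", 120),   # hypothetical devices
        DeviceReading("tank_7_temp", "unavailable", None),
        DeviceReading("pump_room_plug", "on", 31),
    ]
    for name, reason in find_problem_devices(sample):
        print(f"{name}: {reason}")
```

Running something like this on a schedule during the pilot and logging the output would give you a dropout history to base the buying decision on.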


How are you running 4 co-ordinators?


3x HomeKit device + 1x ZHA. Though I guess you can count HomeKit devices altogether as 1x


Another consideration is the recorder. I’m not sure how much history you want to keep or for how many entities, but I doubt SQLite or even MariaDB as an add-on can handle this, not size-wise but performance-wise. Perhaps a separate DB instance on a dedicated machine?
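
For a feel of the write load, here is a back-of-the-envelope Python sketch using the 15,000 entities and the 10-minute reading interval mentioned earlier in the thread. It assumes, pessimistically, that every entity produces one recorded state per interval; in practice the number depends on how often states actually change and on which entities are excluded from the recorder. The function name is illustrative only.

```python
MINUTES_PER_DAY = 24 * 60


def state_rows_per_day(entities: int, interval_minutes: float) -> int:
    """Upper-bound estimate of recorded state rows per day."""
    readings_per_entity = MINUTES_PER_DAY / interval_minutes
    return round(entities * readings_per_entity)


if __name__ == "__main__":
    per_day = state_rows_per_day(15_000, 10)
    print(f"{per_day:,} rows/day, {per_day * 365:,} rows/year")
```

Under those assumptions that is a couple of million rows per day, so retention settings and excluding noisy entities matter as much as the choice of database.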


Great suggestions @mirekmal . When reviewing HA docs, I raised an eyebrow at using SQLite for an operation of this size. The initial goals will be heavily skewed toward monitoring and data logging as we determine reliability. We have some batch processes that take close to a year to complete (so seconds/minutes tend to be inconsequential), so it’s critical that we can log and manage that data effectively.

I have some experience with Postgres, and from my understanding, it scales extremely well. I’ll see what kind of process it is to migrate databases, and then I’ll see if I can enhance another tower and get it running as a dedicated machine. I should probably do this for my separate Django project as well. It runs on Postgres, but since it’s going to be a MASSIVE database, moving the DB to a separate computer sounds like a good move.

Thanks! Well, I have my ~100 Zigbee devices already. Zigbee2MQTT is beautiful. I know Ubiquiti has some network features to “rescan and reset” the 2.4 GHz channels to avoid interference. I’m hoping this helps in the journey. We are spreading most of these sensors across 20,0000 sqft, but we had serious Wi-Fi congestion problems when I tried IoT stuff back in 2021 (with the crummy network). I’m hoping to leave that in the past. I’ll do my best to avoid buying more smart stuff and see how the results come back.

Good idea with building automatic notifications to determine downtime. I was scratching my head trying to figure out just how to do this.

I assume Zigbee devices becoming “unavailable” would be a big issue if a scheduled automation uses that device, or if it’s a time-based event (the device is available when the timer countdown starts but unavailable when the countdown ends, and therefore stays on when it should be off).
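
One way I could imagine guarding against that “stays on when it should be off” case is to keep a record of the desired state and periodically re-apply it, instead of firing a command once and hoping it lands. A minimal Python sketch of that idea follows; the `reconcile` function, the device names, and the `send_command` callback are all hypothetical and not part of any Home Assistant API.

```python
from typing import Callable, Dict, List


def reconcile(
    desired: Dict[str, str],
    actual: Dict[str, str],
    send_command: Callable[[str, str], None],
) -> List[str]:
    """Re-send commands to reachable devices that are in the wrong state.

    Returns the devices that are still unavailable, so the caller can
    retry them later and raise a notification if they stay offline.
    """
    still_unavailable = []
    for device, wanted in desired.items():
        current = actual.get(device, "unavailable")
        if current == "unavailable":
            still_unavailable.append(device)
        elif current != wanted:
            send_command(device, wanted)
    return still_unavailable


if __name__ == "__main__":
    sent = []
    pending = reconcile(
        desired={"grow_light_1": "off", "grow_light_2": "off"},
        actual={"grow_light_1": "on", "grow_light_2": "unavailable"},
        send_command=lambda device, state: sent.append((device, state)),
    )
    print("commands sent:", sent)
    print("retry later:", pending)
```

Run on an interval, a loop like this would switch the device off as soon as it reappears rather than leaving it on until someone notices.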

I was aiming to purchase sensors and controls rather than build them. I was shocked at how easy Zigbee2MQTT was to integrate and at the vastness of “ready to use” sensors.

ESPHome looks fantastic, and I was able to get a few devices flashed and play with it for DS18B20 sensors. I was eyeballing the Ethernet ESP devices (to avoid Wi-Fi)! I found some AliExpress listings that look favorable (about $3/ea in bulkish quantities). Maybe they are PoE :-D, I’ll have to look into this. Either way, a cable is better than RF and a battery.

I appreciate the insight, and I’ll pause my Zigbee purchases while exploring this route.

Come on, mariadb/mysql/postgresql run a lot of the internet!


I’d encourage the use of containers vs physical machines, as it’ll make your life easier. You’ll likely want a dev environment, a QA/test environment and a prod environment. With containers and docker-compose it’s easy to spin new environments (HASS, Postgres, MQTT, etc.) up and down. It’s easy to back everything up, easy to upgrade, etc.


Redundancy. Make sure two or more sensors agree, else you have a fault.
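
A minimal Python sketch of that voting idea, applied to the four “tank is full” signals described in the first post. The sensor names, the `TankStatus` enum, and the two-vote threshold are assumptions for illustration; on a real system this logic would more likely live in the PLC, with a grace period for sensors that trip at slightly different times.

```python
from enum import Enum
from typing import Dict


class TankStatus(Enum):
    NOT_FULL = "not_full"
    FULL = "full"
    FAULT = "fault"  # a lone sensor disagrees with all the others


def evaluate_tank(full_votes: Dict[str, bool], min_agree: int = 2) -> TankStatus:
    """Combine independent 'tank is full' signals into one status."""
    votes_for_full = sum(1 for is_full in full_votes.values() if is_full)
    if votes_for_full == 0:
        return TankStatus.NOT_FULL
    if votes_for_full >= min_agree:
        return TankStatus.FULL
    # Some sensors say full, but fewer than min_agree: treat as a fault.
    return TankStatus.FAULT


if __name__ == "__main__":
    readings = {
        "float_switch": True,        # hypothetical sensor names
        "pressure_sensor": True,
        "flow_sensor": False,
        "ultrasonic_height": False,
    }
    print(evaluate_tank(readings))
```

Whatever the status, a FAULT result should probably stop the pump and raise an alert rather than be ignored.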

It is. I started my home automation using MQTT and Tasmota firmware on the devices, but ESPHome is so much easier to set up on a device and integrates very well into Home Assistant. If you are going to have a number of devices identical except for the device name, this line: name_add_mac_suffix: true in your devices’ YAML code adds the MAC address as a suffix to the device name, so you can install identical code to all of the devices.

I bought a WT32-ETH01 and flashed ESPHome to it. Easily done, but it’s still in my drawer as I don’t have an application that needs the reliability of Ethernet. In your case it may be worth investigating further.
