Moving Home Assistant to AWS

maxperron · January 17, 2020, 9:12pm

Do you route all your traffic through the VPN, or only IoT traffic?

rafaborges · January 20, 2020, 8:37am

Only IoT traffic. The reality is that data transfer costs can get out of control in a home environment where you have people doing all sorts of things, like streaming, gaming and downloads. Also, I see no technical reason to route everything through it, as you don’t benefit much from doing so.

If, for some reason, I need privacy, I have a VPN account with a major provider. The pfSense box is also configured as a client to this VPN, so I can easily switch profiles depending on where I want my endpoint to be.

I also run an OpenVPN box so I can easily connect to my network from my mobile. This gives me peace of mind when working on sensitive documents over public Wi-Fi.

BrunoZumba · March 13, 2020, 10:53pm

Hello, @rafaborges

Did you manage to get everything working?

I was thinking about moving my Home Assistant instance to Amazon EC2. My plan was to install HA on EC2, have an MQTT broker at my home (Raspberry Pi) and a few MQTT IoT devices at home (lamps and stuff).

So, the IoT devices communicate on my LAN with the MQTT broker on the Raspberry Pi, and the Raspberry Pi communicates with HA on EC2.

I know: since I am still using a Raspberry Pi, why don’t I run HA on it? Well, I want to use something in the cloud for my own particular reasons.

I have two questions:

1 - Will I be able to access my HA instance via a web browser on EC2?

2 - You said you were using VPC on a VPN and stuff… is that necessary? Can’t I just run everything using only EC2?

Thanks
And best regards from another HUEBR fella

thisisbenwoo · March 20, 2020, 8:57pm

Why do you say it will not be cost-effective? Surely it doesn’t require many resources. Can you give more info on this?

rafaborges · March 24, 2020, 11:15pm

It all depends on your goals.

In my tests I was using a VPN so I could have an EC2 instance on my local network and HA could see all devices easily. How expensive is it to use the VPN? Well, consider US East (Ohio), with the connection active 24 hours a day for 30 days (720 hours). Also, let’s say that we transfer about 500 GB out through that connection each month. The site-to-site VPN is charged on an hourly basis, for each hour the connection is active. For this AWS Region, the rate is $0.05 per hour (a total of $36.00). Now you add the data transfer out ($0.09 per GB, first GB is free), which results in a charge of $44.91. So, for the VPN alone, you will pay $80.91 per month. And I’m not including the dedicated firewall you need for this.

For a home user I can’t call $80 cost-effective. And I’m not even talking about the t2.micro ($10 per month) for your HA instance, maybe RDS as a managed service for your database and S3 for backups.
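As a rough check of the VPN arithmetic, here is a minimal sketch that uses only the rates quoted in this post (they are not looked up from current AWS pricing). The function name and constants are made up for illustration.

```python
HOURS_PER_MONTH = 30 * 24        # connection up 24 hours a day for 30 days
VPN_RATE_PER_HOUR = 0.05         # site-to-site VPN rate quoted in the post (USD)
TRANSFER_RATE_PER_GB = 0.09      # data transfer out rate quoted in the post (USD)
FREE_GB = 1                      # first GB out is free


def vpn_monthly_cost(gb_out):
    """Return (connection charge, transfer charge, total) in USD."""
    connection = HOURS_PER_MONTH * VPN_RATE_PER_HOUR
    transfer = max(gb_out - FREE_GB, 0) * TRANSFER_RATE_PER_GB
    return round(connection, 2), round(transfer, 2), round(connection + transfer, 2)


connection, transfer, total = vpn_monthly_cost(500)
print(f"connection ${connection:.2f} + transfer ${transfer:.2f} = ${total:.2f}")
# prints: connection $36.00 + transfer $44.91 = $80.91
```

The connection charge is fixed, so at this volume the data transfer is the larger part of the bill and the part you can actually reduce.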

Of course you can bypass the VPN by opening some ports on your network to allow remote connections, but you still need good hardware (like a router with a built-in firewall) and a static IP address (or DynDNS). But hey, do you really want to open your home network to the world? Are the security compromises worth all of this?

Chances are your home automation system will also include some workloads that are better handled by local processing, like video streaming and local media management. So while the cloud can be interesting in some scenarios, if you have more ambitious goals, you will probably get more bang for your buck with a local server.

Vasco · March 25, 2020, 12:23pm

Nice terms being used at AWS. I guess you meant shift.

danielperna84 · March 25, 2020, 7:21pm

Do you have any kind of fallback solution in case internet connectivity goes down? Not sure how well keepalived / VRRP works through a VPN.

rafaborges · April 1, 2020, 5:12pm

I’m embarrassed now. That was a very bad typo…

rafaborges · April 1, 2020, 5:30pm

I don’t know enough about VRRP to tell you how it works with VPNs, but when it comes to Home Assistant, as I have already mentioned, I see a VPN to AWS as overkill.

That said, a high-availability solution can be achieved easily if your router supports multiple WANs. On mine, WAN1 is an Ethernet port that connects directly to my ISP and WAN2 is a 5G USB stick with an unlimited data plan (I use a TP-Link WR902AC to connect the USB dongle to the Ethernet port).

Charlie_Hauser · June 8, 2020, 2:11pm

@rafaborges Thanks for the post! You’ve inspired me to use Home Assistant for my office, and I thought this would be a good route. I currently have the “supervised home assistant” running on a t2.micro instance. I would like to stay within the free tier if possible. I am having an issue with the status check failing every 24 hours. This is my first endeavor into using AWS, but my suspicion is that the t2.micro does not have enough resources to run Home Assistant, Grafana, InfluxDB and Node-RED. I noticed you moved your database to RDS in the OP, which I am beginning to explore. Currently everything is running on the EC2 instance. Do you have any other recommendations for improvements?

rafaborges · June 8, 2020, 9:59pm

Hi Charlie, when you say “status check failing every 24 hours”, what do you mean by that? Are you talking about the status check tab on the EC2 console, or about a specific message in the hassio log? You are right that a t2.micro is not very powerful for running all of these, but if you don’t mind some lag and waiting, it should not be a problem. My first deployment was on a Raspberry Pi 3 B+, which has slightly worse specs than a t2.micro.

Regarding the free tier, keep in mind that it’s only free for a year. After the trial period is over, you pay the running costs for the resources you are using.

Here are some recommendations:

Get away from the Supervised Hassio because it was deprecated a couple of months ago.
Be mindful about security: rest assured that people will try to hack their way into your server. Two-factor authentication and certificate-based SSH are must-haves. Use only add-ons that support Ingress.
AWS charges for data getting out of AWS, so make sure you only stream out data that is relevant.
I used the recorder component with RDS for Postgres running on a db.t3.small, but a db.t3.micro will suffice for most use-cases.
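For the RDS setup, the recorder's `db_url` option takes an SQLAlchemy-style connection URL. Below is a minimal sketch of assembling one for Postgres. The helper name, user, password, endpoint and database name are all hypothetical placeholders, and the only real work is escaping special characters in the credentials.

```python
from urllib.parse import quote_plus


def recorder_db_url(user, password, host, database, port=5432):
    """Build a PostgreSQL URL, escaping characters such as '@' or '/' in the
    credentials so they are not mistaken for URL delimiters."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical values: replace with your own RDS endpoint and credentials.
url = recorder_db_url(
    "hass",
    "p@ss/word",
    "mydb.example.us-east-2.rds.amazonaws.com",
    "homeassistant",
)
print(url)
# prints: postgresql://hass:p%40ss%2Fword@mydb.example.us-east-2.rds.amazonaws.com:5432/homeassistant
```

The resulting string is what would go under `recorder:` as `db_url` in `configuration.yaml`, ideally via a secret rather than in plain text.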

Charlie_Hauser · June 9, 2020, 3:09am

The status check tab on the EC2 console fails and restarting is the only way to resolve the problem. Thanks for the recommendations. I will continue to experiment with it to improve stability!

robmarkcole · June 9, 2020, 3:58am

@rafaborges nice to stumble across this thread, as I have been experimenting with a hybrid RPi/cloud setup. I created integrations for Rekognition and S3 which might be of interest. I previously experimented with cloud RDB on Google and found it not to be cost-effective, but I am interested in finding a low-cost cloud solution for storage if you can suggest one. In my approach the RPi is the gateway to the cloud, and the cloud is leveraged where compute- or storage-intensive services are required. The RPi 4 is quite capable hardware, but things like graphing with Grafana become slow when the history gets large, and on the Pi you cannot do object detection as fast as Rekognition! Cheers

rafaborges · June 9, 2020, 9:49am

Here are some questions to help you understand what’s going on:

What metric are you using to monitor the health of your instance?
If the status check alerts you, can you still access your EC2 instance?
Does the time of the alert match the time of an event on hassio?

I don’t believe the instance size is the problem here. I used to run this very same configuration on a Raspberry Pi 3 B+ with no issues at all, so I guess the problem is related to something other than the instance itself.
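One quick way to answer the second question from outside AWS is to test whether the instance still accepts TCP connections. This is a minimal sketch, assuming SSH on port 22 and the Home Assistant frontend on its default port 8123. The host below is a placeholder, so replace it with your instance's public address.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Placeholder host: replace with the public IP or DNS name of your instance.
HOST = "127.0.0.1"
for name, port in {"ssh": 22, "home assistant": 8123}.items():
    state = "reachable" if port_open(HOST, port, timeout=1) else "unreachable"
    print(f"{name} (port {port}): {state}")
```

If both ports are unreachable while the console still shows the instance as running, the operating system itself has probably hung. If only 8123 fails, the problem is more likely inside Home Assistant or its containers.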

rafaborges · June 9, 2020, 4:20pm

I don’t have much information about RDB on Google, but I can say that an average hassio database, using Postgres on RDS (a single db.t3.micro instance), will set you back around 15 USD/month. Does this price tag qualify as low to you? It doesn’t sound like much, but over time it can be hard to justify for the average residential user when, after a year, you will have spent enough money to buy a small computer.

In my case, I ended up buying a powerful NUC to run all those services. I paid around $500, whereas with the equivalent cloud-native services I would have a monthly bill of around $40.
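To put those two numbers side by side, here is a minimal break-even sketch. It uses only the figures from this post and deliberately ignores electricity and maintenance for the local box, which would push the break-even point out somewhat. The function name is made up for illustration.

```python
def break_even_months(hardware_cost, monthly_cloud_bill):
    """Months of cloud billing after which buying the hardware is cheaper."""
    return hardware_cost / monthly_cloud_bill


nuc_cost = 500        # one-off NUC purchase quoted above (USD)
cloud_bill = 40       # estimated monthly cloud bill quoted above (USD)
rds_only = 15         # RDS db.t3.micro estimate quoted above (USD/month)

print(f"NUC pays for itself after {break_even_months(nuc_cost, cloud_bill)} months")
print(f"RDS alone costs ${rds_only * 12} per year")
# prints:
# NUC pays for itself after 12.5 months
# RDS alone costs $180 per year
```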

In the future, when we make Amazon Timestream publicly available, it will be much cheaper to store time-series data. Currently this would be a challenge because hassio uses SQLAlchemy, so the recorder component does not support time-series databases. But with enough interest from the community, we can think about building Recorder support for time-series databases, like InfluxDB, Timescale, and even Timestream.

lymkin · June 9, 2020, 4:24pm

I suppose using AWS for HA is OK if your internet is never down, but I moved to HA precisely to get rid of the dependency on being connected to the internet. I still want all my automations, etc. to work when Spectrum is down 2-3 times a month.

Cool work nonetheless!

Charlie_Hauser · June 9, 2020, 5:18pm

I am not able to access the EC2 instance via SSH, and the Home Assistant frontend is not reachable. I have run Home Assistant on a Pi without issue before as well. After doing some research I have found some similar issues. I had installed docker.io rather than Docker CE, which I believe is causing the problems. I will follow this guide and see if that fixes things.

robmarkcole · June 10, 2020, 7:53am

@rafaborges the only ‘free’ solution I am aware of is Google BigQuery, which gives 10GB per month on the free tier. I think you are charged for queries, however.

RE micro instance, 15 USD/month probably is not justified for a residential user. I also am running a local db on a Synology.

RE Timestream, my experience of the HA community is that there are people interested in almost every niche imaginable, so I am sure there would be some level of interest. This would probably be of most interest to commercial users, perhaps those who want to combine data feeds from multiple instances. Like you say, this could be implemented via Recorder, and I imagine it would be very similar to InfluxDB.

Happyman · September 21, 2022, 12:44am

Novice here, but I had an idea. To create an economical approach, could you run HA in a container with OpenVPN, with OpenVPN also running on a Pi in your local home network? Use the local Pi as a reverse VPN (VPN gateway). Therefore you won’t have to open any ports on your home network. And I don’t think AWS will charge you for the VPN connection. Most of HA’s network traffic will travel through AWS, and only local traffic will travel through the VPN gateway.

Thoughts? Clarification? Should be pretty cheap I think

chowhi123 · January 28, 2023, 6:03pm

Who needs a raise? Obviously this guy.