AI vibecoding for Home Assistant with Cursor, VS Code or other MCP-enabled IDEs

Not so long ago I moved into an old English house. Everything about it was great… except the heating.

A single Honeywell thermostat in one room controlled a gas boiler for the whole house.
Rooms on the sunny side were getting way too hot while the rooms in the shade were still chilly. Finding a “compromise temperature” that worked for everyone was almost impossible. It was a daily source of arguments at home.

The simplest solution would have been to buy a multi-zone Honeywell system with smart TRVs on each radiator.
When a room reaches the target temperature, the TRV closes the radiator so it doesn’t overheat, while colder rooms keep heating. And when all rooms are warm and their TRVs are closed, the boiler turns off.

But I’d been wanting to get into smart home stuff for a while, so I decided to build such a system myself, of course on top of Home Assistant. I really wanted the freedom to create automations my way, not just what a closed vendor solution allows. Plus, I didn’t want to depend on any vendor cloud.

I had a Raspberry Pi 5 lying in a drawer, silently judging me for not doing anything fun with it.
So I flashed Home Assistant onto it and, in a small fit of madness, ordered:

  • TRV radiator valves
  • A relay to switch the boiler
  • A Zigbee USB dongle to control the whole circus.

Surprisingly, it all turned out much cheaper than the ready-made commercial systems.

Two weekends flew by. I went through several “generations” of heating control scripts: from the simplest “turn on the boiler if at least one TRV is open, turn it off if all are closed” to scripts that analyze how fast each room cools down depending on the outdoor temperature, use external temperature sensors, and do predictive on/off switching to smooth out temperature swings.
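
For anyone curious, the very first "generation" of that logic fits in a few lines. This is only a sketch of the rule as described above, not the actual Home Assistant automation: the `Trv` class and its `valve_open` field are hypothetical stand-ins for whatever state your TRVs report.

```python
from dataclasses import dataclass


@dataclass
class Trv:
    """Hypothetical stand-in for one radiator valve."""
    room: str
    valve_open: bool  # True while the valve is still letting heat in


def boiler_should_run(trvs: list[Trv]) -> bool:
    """Simplest rule: boiler on if at least one TRV is open, off if all are closed."""
    return any(t.valve_open for t in trvs)


rooms = [Trv("lounge", False), Trv("bedroom", True)]
print(boiler_should_run(rooms))  # True: the bedroom valve is still open
```

The later generations replace `valve_open` with predictions based on cooling rates and outdoor temperature, but the boiler decision stays the same shape.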

Now, some background: I used to be a pretty good software engineer before I founded an IoT startup with a few million business users and moved over to “the dark side” of business management.

I still love coding and building things with my hands, but working with Home Assistant at the beginning was… not exactly fun.
Most of the time I was asking ChatGPT how to write automations correctly, then manually copying the code into HA, finding that it didn’t work the way I expected, going back into HA, reading logs, and debugging with ChatGPT again.

After a few iterations of doing this I realised that, in the age of AI and vibe-coding, this workflow just doesn’t make sense.

What I really wanted was to be able to describe what I want in Cursor AI (my IDE), and let it:

  1. Connect to Home Assistant
  2. Analyze my current HA configuration
  3. Generate the right automations and dashboards for this setup
  4. Deploy them directly to my HA instance
  5. Then, if something goes wrong, fetch the logs, understand what’s happening and adjust the code.

I couldn’t find an existing solution for this (beyond “just use SSH” or MCP servers limited to the external API or SSH), so I did the only logical thing and went down the rabbit hole of building it myself.

It wasn’t exactly easy. Home Assistant is a collection of many subsystems with different APIs: dashboards, automations, helpers, variables… Some things use one protocol, others another. In other words, it looks exactly like a mature system with a long history of evolution.

But in the end, I managed to build what I wanted:

  • Home Assistant Agent - an add-on installed from the Home Assistant UI. An AI IDE such as Cursor or VS Code connects to this agent (via the MCP server) and performs actions on your HA instance

  • Home Assistant MCP - an MCP server you add in Cursor or VS Code IDE that gives it the ability to “walk” your Home Assistant instance

After that, creating automations and dashboards became much more enjoyable for me. I just write a natural-language description in Cursor / VS Code, and it talks to HA and does what I ask. Of course, things don’t always work perfectly on the first try - but that’s the same in normal software development. The difference is: with Cursor I can say “this part doesn’t work as expected”, ask it to check the logs, and let it propose a fix.

Honestly, I thought this thing would only be useful for a couple of geeks like me.
But when I wrote about it on my personal FB page, the post got over a million views and a lot of interest.

So I decided to open-source it on GitHub and share it with you here.

I’ll be happy if the result of my hobby turns out to be useful for you and starts living its own life in the HA community.

GitHub link (setup takes under 5 minutes):
https://github.com/Coolver/home-assistant-vibecode-agent


I used this today to fix some of my legacy template notifications. Really powerful! Thank you for building this out!


This is great. I used it to figure out how to configure a card I couldn’t figure out. I used Cursor, but would prefer to use opencode that is set up on VSC (opencode extension). How do I set up the MCP server for that?

I got it to work directly with opencode: opencode->HA MCP Helper. I couldn’t get it to work with VSC in the middle, but since it is mainly a chat session, I don’t really need to use VSC on the frontend.

I have a question about your workflow. I’m interested to hear what is the recommended setup to sync changes to HAOS from Cursor running remotely.

Do you:

  • Use remote SSH from Cursor
  • Samba share the files
  • Push to git and sync inside HAOS
  • Something else?

It seems to work quite well in VSC for me, but whenever I try to list automations or update an automation, the AI runs out of context, condenses it and completely loses track of what it was doing.
Is anyone else having a similar issue?
I suspect that the answer from the MCP tool is so big that it blows the context completely.
I started with 32k context and then tried to upsize it in steps, but even with 256k context it fails the same way :man_shrugging:
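
To illustrate why a "list automations" call can eat the context, here is a rough sketch that compares a full payload with a trimmed summary. Everything in it is an assumption for illustration: the payload shape is made up, `summarize_automations` is a hypothetical helper and not part of the MCP server, and the "about 4 characters per token" rule is only a rule of thumb.

```python
import json


def estimate_tokens(text: str) -> int:
    # Rule of thumb (assumption): roughly 4 characters per token.
    return len(text) // 4


def summarize_automations(automations: list[dict]) -> list[dict]:
    # Keep only what is needed to pick an automation; drop triggers and actions.
    return [{"id": a.get("id"), "alias": a.get("alias")} for a in automations]


# Hypothetical payload standing in for a large tool response.
full = [
    {
        "id": str(i),
        "alias": f"Heating rule {i}",
        "trigger": [{"platform": "state", "entity_id": f"climate.room_{i}"}] * 5,
        "action": [{"service": "switch.turn_on", "entity_id": "switch.boiler"}] * 5,
    }
    for i in range(200)
]
compact = summarize_automations(full)

# Print the estimated token cost of the full listing vs. the summary.
print(estimate_tokens(json.dumps(full)), estimate_tokens(json.dumps(compact)))
```

If the listing alone costs tens of thousands of tokens, raising the context limit only delays the problem. Asking for a summary first and then fetching one automation at a time keeps each response small.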

YES, and it’s related to the AI client. ChatGPT, MS Copilot and Claude all exhibited that behavior and it was a bit maddening. AI clients have 2 parts:

  1. The friendly, enthusiastic, overly verbose foreground agent doing your bidding
  2. The AI Ops “man behind the curtain” that decides how resources are handed out to sessions. It limits you in various ways, without consulting the foreground agent, based on criteria that are hard to see.

My ChatGPT and MS Copilot experiences were so bad over this type of thing that I quit them and only use Claude. You have to tell Claude to watch resources and let you know based on completing bigger packages of work. Also, Claude has a Usage dashboard under Settings.

Here are things to watch:

  • If you’re doing a lot of yaml creation, start worrying and saving your work at around 50% daily usage.
  • Ask Claude specifically: “I am about to create a new package (or do a system review), should I start that now or wait? If I should wait, until when?”
  • If you get the “compressing your session” message for a second time, figure out how to abandon that session and start a new one. Leave it, don’t delete it until all work is done for that topic. You can ask Claude and it will give you guidance.
  • Larger work with tons of files, like code reviews or documentation work, has to be broken into multiple sessions. Ask Claude to create a tracker and also update the preferences file so you don’t have to make the same requests to do it a certain way over and over again.

You can ask Claude to create and maintain a document of your preferences. Things like:
“Keep answers short and brief with minimal scrolling”
“We are in discussion mode only, create no code until I agree on requirements”
“Provide brief functional improvement suggestions but do not code them unless discussed and agreed”

TLDR: Claude (well, all of them) is its own worst enemy: it over-provides, over-answers simple requests and uses up my resources, which limits my time and productivity. It’s annoying, and I had to give Claude rules and also rearrange my day to fit the work into Claude’s AI Ops limitations.

Yes. I have spent a bit of time solving this. I now use MCP for logs, current states of the system, and some small file writing. I also keep a local git repo (on my laptop) that stays in sync with HA.

The magic is a couple of configuration.yaml shell commands that allow Claude to do “git stuff” server side, plus my subagent that runs patch shell commands.

Patch is used instead of MCP for writing large files. The subagent uploads just the patch, then runs the patch shell command, then reloads Home Assistant (if needed) and resyncs my git repo. I set this up separately from @Coolver’s built-in repo, as I was having issues with it for some reason.
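
A rough illustration of why the patch approach saves so much context, using Python’s standard `difflib`. This is not the poster’s actual subagent or shell command; `make_patch` and the file contents are made up for the example.

```python
import difflib


def make_patch(old: str, new: str, path: str) -> str:
    """Return a unified diff that turns `old` into `new`."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical 500-line file with a one-line change.
old = "".join(f"line {i}\n" for i in range(500))
new = old.replace("line 250\n", "line 250 changed\n")
patch = make_patch(old, new, "automations.yaml")

# Compare how many characters the model has to emit in each case.
print(len(new), len(patch))
```

For a one-line change the patch is a small fraction of the full file, so the model writes (and pays for) only the changed hunk plus a few lines of surrounding context.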

FYI, I developed all this by doing things and then asking Claude “what recommendations do you have to reduce context / token usage” and then massaging the responses from there. Turns out the biggest bang was that I was burning through context trying to read and write files. As of this week I tend to run out of usage instead of context.

I have also done a fair bit of “what do you recommend for my Claude instructions or history to reduce context”. Combined with the above, it’s a really tight system now. Still evolving though, as I keep asking it to do more.

I’m still not convinced that running Claude Code on the server and giving it direct API access, no MCP at all, may be best in the long run.

If you value your HA install in a production environment at all: no. I do it in the system I develop in (and enable git for immediate rollback when your agent does a dum… Because it will… A LOT.)

In prod - real - mom’s using this. Nope. No. Not ever. Amen. Even then, the current MCP implementation needs to be strictly controlled. One wrong move with the wrong tool and full platter. You are either left with a dead system or pwned by prompt injection in seconds.


I can’t wait for the end of paternalistic “AI Bad” posts.


I hope you weren’t interpreting AI bad…

Hell, with my day job that’d be hella hypocritical… Notice I didn’t say AI bad, full HAL. I said not uncontrolled… Way different.

I’m all for fully automated CI/CD, but architected, checkpointed, etc. With VMs, containers and OSS it’s not an issue to construct the Lego blocks correctly to prevent the agent from shooting itself in the foot or worse.

Access good.

Uncontrolled access waaaay bad. That’s what I said. Have it build a parallel HA install and deploy to test. Check it, and then if it works, an automated git push deploys… (because honestly, it’s inevitable on the AI code thing. I’m all for it, just safely. This stuff amplifies stupidity logarithmically)
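
The gate described above can be sketched in a few lines. This is only an illustration of the idea: `promote_if_green` and the check names are hypothetical, and real checks would inspect the parallel test install rather than return fixed values.

```python
from typing import Callable

Check = Callable[[], bool]


def promote_if_green(checks: dict[str, Check], deploy: Callable[[], None]) -> list[str]:
    """Run every check against the test install; deploy only if none fail.

    Returns the names of the failed checks (an empty list means the deploy ran).
    """
    failed = [name for name, check in checks.items() if not check()]
    if not failed:
        deploy()
    return failed


# Hypothetical checks with hard-coded results, for illustration only.
deployed: list[str] = []
failed = promote_if_green(
    {"config_valid": lambda: True, "test_instance_boots": lambda: False},
    deploy=lambda: deployed.append("prod"),
)
print(failed, deployed)  # ['test_instance_boots'] []
```

The point is that the agent never calls `deploy` itself; it can only change the test install, and the promotion step stays outside its reach.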


Sir, we are hobbyists here! Do you do what you suggested for all HA system or core updates? Because THOSE have broken my system more than AI.

And you’re a hobbyist, with a system that thinks like a CI/CD platform - make it do the work.

Look, HA code is ESOTERIC at BEST, and that’s EXACTLY the wrong kind of thing you want in a fast-moving platform where a single misplaced double quote can kill a boot. That’s the real problem, but guess what it’s AMAZING at: Java, Perl, building a Docker container, basic sysadmin of documented tasks - pretty much any tool-use code model can smoke those tasks.

You can ask IT to spin up your platform once, then get the benefits. My submission is that it’s worth the project if you’re going to consider smashing your box with an unbounded LLM - *taps head*. Use it to protect itself… Don’t use it to build the code; use it to build its own sandbox first, THEN use it to build the code. It’s 1 VM… Ask any of them how to (I just asked two and they both offered me a few options… One I may adopt in my own.)

Have fun!