Hi Cao,
I had a simple HA setup with a thermostat, smart lights, door and window sensors, and a couple of other devices. With HA, all the data was locked in the .db file on the Raspberry Pi. After two months of passively collecting data, I tested it over the course of a day with a Python script I wrote. I simulated unusual behavior, i.e. things that had not been observed in the training data, to see whether a model, e.g. a simple one-class SVM, would pick up on it. This included someone opening the door when I was not at home, flickering lights, lights turning on in the middle of the night (which never happened in the training data), and making the thermostat's temperature sensor read higher than usual, basically simulating a fire or something. I also tried more subtle things, like opening the window when it was cold outside while the heating was on.

Obviously, you can hard-code all of this, e.g. with a rule that turns off the heating when it detects that the windows are open, but as I explained, I wanted to see whether a machine learning model can recognize something like this automatically. It worked partly, but mostly on time-related things; my model was not good enough to recognize more subtle things like the heating scenario. The FP rate was also kind of high, around 2%.

I hoped that with a larger dataset from diverse users the model would be smarter and generalize better. On the other hand, it was hard to integrate, as your dataset and mine share only a small set of devices, and the usage patterns are probably very different as well. Anyway, those first results showed that it worked quite nicely in some cases but overall was not robust enough, and that it can probably only work very well with lots and lots of data. After that, I got carried away with other things.
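
To make the experiment described above a bit more concrete, here is a minimal sketch of that kind of test using scikit-learn's OneClassSVM. It is not the original script: the feature set (cyclic hour of day, light, window, heating, temperature), the helper name featurize, the synthetic "normal" data, and nu=0.02 are all assumptions made up for illustration. In scikit-learn, nu is an upper bound on the fraction of training points treated as errors, so it loosely plays the role of a target false-positive rate.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def featurize(hour, light_on, window_open, heating_on, temp_c):
    """Turn one sensor snapshot into a numeric feature vector (hypothetical features)."""
    angle = 2 * np.pi * hour / 24.0  # encode the hour cyclically so 23:00 is close to 00:00
    return [np.sin(angle), np.cos(angle), light_on, window_open, heating_on, temp_c]


# Synthetic "normal" behavior, invented for this sketch (not real Home Assistant data).
rng = np.random.default_rng(0)
normal = []
for _ in range(2000):
    hour = rng.uniform(0, 24)
    light_on = 1.0 if 18 <= hour < 23 else 0.0      # lights only in the evening
    window_open = float(rng.random() < 0.1)          # window open about 10% of the time
    heating_on = 0.0 if window_open else float(rng.random() < 0.5)  # never heat with open window
    temp_c = rng.normal(21.0, 0.7)                   # indoor temperature around 21 °C
    normal.append(featurize(hour, light_on, window_open, heating_on, temp_c))
X_train = np.array(normal)

# Scale the features, then fit a one-class SVM on normal data only.
model = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", nu=0.02, gamma="scale"),
)
model.fit(X_train)

# Simulated unusual situations, analogous to the ones described above.
scenarios = {
    "lights on at 3 a.m.": featurize(3, 1, 0, 0, 21.0),
    "fire-like temperature": featurize(14, 0, 0, 0, 45.0),
    "window open with heating on": featurize(14, 0, 1, 1, 21.0),
}
for name, x in scenarios.items():
    label = model.predict([x])[0]  # -1 means anomaly, +1 means normal
    print(f"{name}: {'anomaly' if label == -1 else 'normal'}")

# Fraction of the training data itself that gets flagged (rough false-positive estimate).
print("flagged in training data:", np.mean(model.predict(X_train) == -1))
```

A reading far outside the training range, like the fire-like temperature, should be easy for such a model to flag. Whether subtler feature combinations such as the open window with the heating on get flagged depends heavily on the chosen features, the scaling, and the nu and gamma settings.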