HASS-data-detective: Access InfluxDB or use an external DBMS?

Hi there,

I am just about to get my feet wet with data science in my home automation. Currently I am asking myself which database I want to use to store these large amounts of data. My recorder setup currently only preserves the last day in an SQLite DB. I stumbled upon the InfluxDB addon and integration, which seems to be perfect for this very purpose. But checking the source of HASS-data-detective, I cannot find a way to access InfluxDB. The only thing that is constantly mentioned is the HASS recorder and its settings.

I am a bit confused. Is it better to use the HASS recorder with an external DBMS instead?


I am interested in using InfluxDB from Data Detective too (I have around a year of data in there). Maybe @robmarkcole knows?

Data Detective doesn’t support InfluxDB, but I would like to add that, so please share your progress. This should help you out -> https://www.influxdata.com/blog/getting-started-with-influxdb-and-pandas/


Hey there,

my current workaround for this is to configure the Jupyter supervisor addon to install influxdb as one of its init_commands, so the config looks something like this:

# ... other stuff
init_commands:
  - pip install influxdb

After that you can connect to your InfluxDB from within your notebooks very easily using:

from influxdb import InfluxDBClient

# 'a0d7b954-influxdb' is the hostname of the InfluxDB addon on the internal Hass.io network
db = InfluxDBClient('a0d7b954-influxdb', 8086, <USERNAME>, <PASSWORD>, 'homeassistant')

Follow the influxdb-python docs to query your data (they can be found here: https://influxdb-python.readthedocs.io/en/latest/index.html) and process it using pandas or anything else of your choice.
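
For instance, here is a minimal sketch of pulling a query result into a pandas DataFrame (the measurement "°C" and the entity_id below are just placeholders for whatever your InfluxDB integration actually records):

import pandas as pd
from influxdb import InfluxDBClient

# same client as above; credentials are placeholders
db = InfluxDBClient('a0d7b954-influxdb', 8086, '<USERNAME>', '<PASSWORD>', 'homeassistant')

# with the default Home Assistant schema the unit is the measurement name
# and entity_id is a tag -- adjust both to your own data
rs = db.query('SELECT "value" FROM "°C" WHERE "entity_id" = \'living_room_temperature\'')

# ResultSet.get_points() yields plain dicts, which pandas consumes directly
df = pd.DataFrame(list(rs.get_points()))
df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time').sort_index()
print(df.head())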


Nice one! I’m quite green to this. Do you know by chance how to:

  • Query my InfluxDB addon database running on Hass.io on my Pi from Python (say the PyCharm IDE) running on my Windows machine (same network), and/or
  • copy my InfluxDB database to my Windows machine and then connect to/query it locally? I think I found the DB in my backups, but I don’t get how to connect up easily. Dunno if it needs to be on a webserver or something…

I’ll search around for this, but if you know and could point me that would be tops.

My thought is I’d like to do the heavy analytical lifting on my PC, then move the optimised stuff back to the Jupyter addon.


Hmmm… I can’t retrieve query results via the PyCharm IDE, but the query URL it generates (seen in the error) returns data when I paste it into my browser. Guessing I’m missing something basic. Will keep trying…

from influxdb import DataFrameClient

cli = DataFrameClient('http://192.168.x.xxx', 8086, 'homeassistant', 'MyPassword', 'home_assistant')
rs = cli.query('SELECT * FROM "Mbit/s"')

Query via browser:
http://192.168.1.XXX:8086/query?q=SELECT+*+FROM+"Mbit%2Fs"&db=home_assistant

browser response:

{"results":[{"statement_id":0,"series":[{"name":"Mbit/s","columns":["time","attribution_str","domain","entity_id","friendly_name_str","icon_str","value"],"values":[["2019-05-20T00:04:29.650247936Z",null,"sensor","speedtest_download","Speedtest Download","mdi:speedometer",4.12],["2019-05-20T02:34:59.870418176Z",null,"sensor","speedtest_download","Speedtest Download","mdi:speedometer",4.34],["2019-05-20T03:18:30.728969984Z",null,"sensor","speedtest_download","Speedtest Download","mdi:speedometer",9.51],["2019-05-20T06:33:41.20925312Z",null,"sensor","speedtest_download","Speedtest Download","mdi:speedometer",13.86],["2019-05-

And the PyCharm traceback:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\mahko\Anaconda3\envs\LightingAnalysis\lib\site-packages\influxdb\_dataframe_client.py", line 194, in query
    results = super(DataFrameClient, self).query(query, **query_args)
  File "C:\Users\mahko\Anaconda3\envs\LightingAnalysis\lib\site-packages\influxdb\client.py", line 512, in query
    response = self.request(
  File "C:\Users\mahko\Anaconda3\envs\LightingAnalysis\lib\site-packages\influxdb\client.py", line 323, in request
    response = self._session.request(
  File "C:\Users\mahko\Anaconda3\envs\LightingAnalysis\lib\site-packages\requests\sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\mahko\Anaconda3\envs\LightingAnalysis\lib\site-packages\requests\sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\mahko\Anaconda3\envs\LightingAnalysis\lib\site-packages\requests\adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='http', port=80): Max retries exceeded with url: //192.168.1.XXX:8086/query?q=SELECT+%2A+FROM+%22Mbit%2Fs%22&db=home_assistant (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0009AB3A0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
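
For what it’s worth, the traceback shows requests trying to reach host='http', port=80, which suggests the 'http://' prefix in the host argument is the culprit; the client expects a bare hostname or IP. An untested sketch (IP and credentials are placeholders):

from influxdb import DataFrameClient

# host must be a bare hostname/IP; the client adds the scheme itself
cli = DataFrameClient(host='192.168.1.XXX', port=8086,
                      username='homeassistant', password='MyPassword',
                      database='home_assistant')

# DataFrameClient.query() returns a dict of measurement name -> DataFrame
result = cli.query('SELECT * FROM "Mbit/s"')
df = result['Mbit/s']
print(df.head())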

Did you eventually find a way to make it work? I’m interested in doing something similar as well and am struggling.

No, I haven’t come back to it yet. Let me know if you crack it.

Make any progress by chance? I’m looking to come back to it.