Reading text and time from a python file and having it as a sensor in HA

bachoo786 · October 11, 2017, 6:10pm

Hi all

I have a python file that contains text and time. What I want to achieve is to use the command line sensor to extract the text and time and present it in HA as a sensor.

This is what I have so far in my configuration.yaml but my sensor does not read any of the data from the python file:

  - platform: command_line
    name: Football
    command: "python /home/homeassistant/.homeassistant/python_scripts/output.py | grep 'Manutd' | sed 's/^.*: //'"

This is my output.py:

Football Times
Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00

I believe my sensor is not reading the values from the python file because of the spacing and selection of “Manutd” in this case. Can someone please help me with this?

I also need the unit_of_measurement to be in time 24 hour format. How can I achieve this?

Many thanks.

bachoo786 · October 12, 2017, 12:48am

@robmarkcole @Tinkerer maybe you guys can help please?

netopiax · October 12, 2017, 1:19am

First off let me say this is a cool idea.

With that said, is your goal just to publish the time and date of the ‘Manutd’ game? What is the python script actually doing?

Why the grep and sed commands? Specifically sed is (if I have this right) removing everything from the beginning of the line through the last colon followed by a space - not sure why you want to do that.
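To see what that sed expression does to the sample line, here is a rough Python equivalent. It uses re.sub as a stand-in for sed, and the second test string is made up purely for illustration.

```python
import re

line = "Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00"

# Python stand-in for: sed 's/^.*: //'
result = re.sub(r"^.*: ", "", line, count=1)
print(result == line)  # True: no colon is followed by a space, so nothing is removed

# Made-up input: when ": " is present, the greedy .* removes up to the LAST ": ".
demo = re.sub(r"^.*: ", "", "Team: Manutd: 12:00", count=1)
print(demo)  # 12:00
```

So on this particular text the sed step appears to be a no-op, and the sensor would receive the whole line that grep matched.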

My advice is, make a copy of output.py that simply returns the exact information you are looking for. Ideally it would take a command line parameter (e.g. Manutd) and return the info just for that game. I’ll confess I am not sure what exactly the sensor looks for to interpret as a time and date, but a UNIX timestamp is a good starting guess.
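A minimal sketch of that idea, assuming the scraped text looks like the sample posted above. The function name kickoff_for and the file name in the usage comment are hypothetical, and the sample string stands in for whatever the real script scrapes.

```python
import re
import sys

# Sample text copied from the thread; a real script would use the scraped output.
SAMPLE = "Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00"


def kickoff_for(team, text=SAMPLE):
    """Return the HH:MM that directly follows the team name, or None if absent."""
    match = re.search(re.escape(team) + r"(\d{2}:\d{2})", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Usage (hypothetical file name): python kickoff.py Manutd
    team = sys.argv[1] if len(sys.argv) > 1 else "Manutd"
    print(kickoff_for(team) or "unknown")
```

A command line sensor could then call the script with the team name as its only argument and use whatever it prints as the state.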

RobDYI · October 12, 2017, 2:04am

I think you should either format the output of output.py as JSON and then use a REST sensor to pick the info you need, or get your python program to post the info to MQTT, which HA can pick up.

I think your grep needs to be something like this if you want to use this method.

grep -E -o "Manutd.{0,5}"
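For anyone who wants to check what that pattern matches without a shell, here is a rough Python equivalent. It uses re.search in place of grep, with the sample line from the first post.

```python
import re

line = "Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00"

# Same pattern as the grep: "Manutd" plus up to five following characters.
match = re.search(r"Manutd.{0,5}", line)
matched = match.group(0) if match else ""
print(matched)  # Manutd12:00

# Dropping the team name leaves only the time.
time_only = matched[len("Manutd"):]
print(time_only)  # 12:00
```

Note that the match still includes the team name, so the sensor would show Manutd12:00 unless the name is stripped afterwards.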

Tinkerer · October 12, 2017, 8:24am

I’d start by sorting that output to be something you can parse. I’d suggest that if you want to stick with plain text then you should output a series of lines like:

Day Team HH:MM

Also, that’s not a python script, that’s 2 lines of text, you can’t expect python to know what to do with that.

You’d be better off working with a file sensor, which would require you to output your data in JSON format.
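A sketch of how the squashed text could be turned into "Day Team HH:MM" lines. It assumes every team name is a run of letters immediately followed by its time, and that the weekday appears once in the text. The function name to_lines is made up for this example.

```python
import re

raw = "Football Times\nSaturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00"

WEEKDAYS = r"\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b"


def to_lines(text):
    """Turn the squashed fixture text into 'Day Team HH:MM' lines."""
    day_match = re.search(WEEKDAYS, text)
    day = day_match.group(1) if day_match else "Unknown"
    # Assumption: each team is a run of letters directly followed by HH:MM.
    pairs = re.findall(r"([A-Za-z]+)(\d{2}:\d{2})", text)
    return ["{} {} {}".format(day, team, time) for team, time in pairs]


for fixture in to_lines(raw):
    print(fixture)
```

With one fixture per line, a plain grep for the team name would be enough to pick out a single game.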

bachoo786 · October 12, 2017, 8:43am

Thank you all for your replies.

I should have mentioned that the output.py file is derived from a python script which obtains the information shown in the output.py file from a website.

I could save it as a text file or a python file but I need to get the data stored in this output file in JSON format which is something I am not familiar with.

To summarize I would need to edit the python script that obtains the data for the output file so that it is in JSON format. Can anyone please guide me on how I can do that?
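One possible way to produce JSON, sketched against the sample text from the first post. It assumes the scraped string keeps the "TeamHH:MM" pattern, and fixtures_as_json is a hypothetical helper name.

```python
import json
import re

raw = "Football Times\nSaturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00"


def fixtures_as_json(text):
    """Build a JSON object that maps each team name to its kickoff time."""
    times = dict(re.findall(r"([A-Za-z]+)(\d{2}:\d{2})", text))
    return json.dumps(times, sort_keys=True)


print(fixtures_as_json(raw))
```

Once the data is a JSON object keyed by team, a sensor template should be able to select a single key instead of showing the whole string.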

@RobDYI the command grep -E -o "Manutd.{0,5}" did not make any difference to the sensor. It is still blank, with only white space in it.

Tinkerer · October 12, 2017, 11:43am

Well, you’ve not shared the python script, which makes that impossible.

If you were to share the file (say on hastebin.com) then I’m sure somebody who knows Python will be able to help you.

bachoo786 · October 12, 2017, 12:17pm

Sorry here is the python script.

    # import libraries
    import json
    import urllib2
    from bs4 import BeautifulSoup
    # specify the url
    quote_page = 'http://www.bbc.co.uk/sport/football/premier-league/scores-fixtures/'
    # query the website and return the html to the variable 'page'
    page = urllib2.urlopen(quote_page)
    # parse the html using Beautiful Soup and store in variable 'soup'
    soup = BeautifulSoup(page, 'html.parser')
    # Take out the <div> of name and get its value
    name_box = soup.find('div', attrs={'id':'ja-user2'})
    name = name_box.text.encode('utf-8').strip() # strip() removes leading and trailing whitespace

    with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
        json.dump(name, outfile)

As you must have noticed I have managed to dump the data in json format in a data.txt file. However the data saved on this text file is not specific to ‘Manutd’ in json format.

Here is the data.txt:

"Football Times\nSaturday 14 October 2017Manutd12:30\u00a0Arsenal17:30Liverpool12:30\u00a0Mancity15:00Chelsea15:00\u00a0Spurs15:00"

When I add it as a file sensor, the output is as above from the data.txt file and not just Manutd for example

RobDYI · October 12, 2017, 3:49pm

Have you tried another way to get this info besides scraping? (HA does have a scrape sensor)

It looks like there are many free APIs that can provide this info in a more digestible form.

This one might work.

https://market.mashape.com/sportsop/soccer-sports-open-data

I got it from here

https://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/#footballdata

bachoo786 · October 12, 2017, 4:06pm

Well, I want to scrape other information from other websites.

I have used HA scrape but the output is the same as I posted above.

The data I am scraping is from a table, hence the difficulty in getting it into a sensor as text with a time unit.

RobDYI · October 12, 2017, 4:29pm

Ok, I think you need to add some more lines to your python script to help find the exact info you want. I am not sure if you want sensors for each team or just looking for a Manutd sensor.

I would search Google for exactly what you want to do and look at the many examples. I would start by first finding the character position (index) of the numbers you want, so search for the text ‘Manutd’.

I found this page showing how to do that

https://www.tutorialspoint.com/python/string_find.htm

I modified their example

Example
#!/usr/bin/python

str1 = name
str2 = "Manutd"
startnum = str1.find(str2) + 1

we can rewrite all this to

startnum = name.find('Manutd') + 1

and now we need to find out how to get the next 5 characters starting at the startnum position. I found it here and this should return the time 12:30.

newname = name [startnum, 5]

https://www.tutorialspoint.com/python/python_strings.htm

You can modify your python script to accept inputs from HA for different teams and it will return the searched text to HA.

bachoo786 · October 12, 2017, 5:11pm

Hmm, I am working your example into my script. Can I ask a noob question: what is startnum?

RobDYI · October 12, 2017, 5:28pm

startnum is a variable I made up to hold the character position of Manutd. You can actually write all the changes like this with no new variables except the newly found text.

So add this line

newname = name [name.find('Manutd') +1, 5]

before this one and change this one too

with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
    json.dump(newname, outfile)

bachoo786 · October 12, 2017, 5:45pm

Right, so when I run my script after adding the line you mentioned above I get an error:

Traceback (most recent call last):
  File "football.py", line 15, in <module>
    newname = name [name.find('Manutd') +1, 5]
TypeError: string indices must be integers, not tuple
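A short illustration of why that line raises a TypeError. The string is a shortened copy of the scraped text and the index numbers are arbitrary. Inside square brackets a comma builds a tuple, while a slice needs a colon.

```python
name = "Saturday 14 October 2017Manutd12:30"

try:
    name[3, 5]  # the comma builds the tuple (3, 5), which is not a valid string index
except TypeError as err:
    error_text = str(err)
    print("TypeError:", err)

# A slice uses a colon: characters from index 3 up to, but not including, index 5.
print(name[3:5])  # ur
```

The slice form takes a start and an end position, not a start and a length, so "the next 5 characters" has to be written as [start:start + 5].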

bachoo786 · October 12, 2017, 6:21pm

Sorry to bother you again but can you tell me please why do you want to find the character (index) of the numbers? How does that help?

RobDYI · October 12, 2017, 6:52pm

What I was trying to do was create a program to output 12:30. To do this, we need to locate text and find where that text begins.

This program will output 12:30, I checked it

str2 = "Manutd"
startnum = int(name.find(str2))
newname = name[startnum + 6 :startnum + 11]

print (newname)

with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
    json.dump(newname, outfile)
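The same find-and-slice approach can be wrapped in a small function so the offset follows the length of the team name. This is only a sketch: time_after is a made-up name, and it assumes the time is always the five characters right after the team. It also guards against find() returning -1, which would otherwise slice unrelated characters without any error.

```python
def time_after(text, team, width=5):
    """Return the `width` characters that follow `team` in `text`, or None if the team is absent."""
    start = text.find(team)      # find() returns -1 when the team is not present
    if start == -1:
        return None
    begin = start + len(team)    # skip past the team name instead of hard-coding 6
    return text[begin:begin + width]


name = "Football Times\nSaturday 14 October 2017Manutd12:30\u00a0Arsenal17:30Liverpool12:30\u00a0Mancity15:00Chelsea15:00\u00a0Spurs15:00"
print(time_after(name, "Manutd"))   # 12:30
print(time_after(name, "Chelsea"))  # 15:00
print(time_after(name, "Everton"))  # None
```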

bachoo786 · October 12, 2017, 8:14pm

That works very well producing “12:30” as expected for Manutd. Thank you!!

I was wondering why does the data.txt produce “12:30” and not just 12:30 ? Wouldn’t this be an issue in value templating?

In my configuration.yaml. I have added the following:

platform: file
name: Manutd
file_path: /home/homeassistant/.homeassistant/python_scripts/data.txt
unit_of_measurement: 'hours'

But I am not getting the time i.e. in hours state. What am I doing wrong?
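As far as I understand (treat this as an assumption, not confirmed in this thread), unit_of_measurement is only a label shown next to the state and does not reformat the value. If the goal is a clean 24-hour HH:MM string, the script itself could normalise it before writing the file. A sketch, with as_24h as a hypothetical helper name:

```python
from datetime import datetime


def as_24h(text):
    """Return text normalised to 24-hour HH:MM, or None if it is not a valid time."""
    cleaned = text.strip().strip('"')  # drop whitespace and any JSON-style quotes
    try:
        return datetime.strptime(cleaned, "%H:%M").strftime("%H:%M")
    except ValueError:
        return None


print(as_24h('"12:30"'))  # 12:30
print(as_24h("9:05"))     # 09:05
print(as_24h("25:00"))    # None
```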

RobDYI · October 12, 2017, 8:40pm

I think you are saving the data.txt as JSON. Just save it normally and it won’t have the quotation marks. Also, I would modify the program to create text files for each team you want, and then you can point the HA sensor to the corresponding text file.
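A small comparison of the two ways of saving, using in-memory io.StringIO buffers in place of data.txt so the snippet leaves no files behind. In the real script the plain version would be outfile.write(newname).

```python
import io
import json

value = "12:30"

as_json = io.StringIO()
json.dump(value, as_json)   # JSON encodes a string with surrounding double quotes
print(as_json.getvalue())   # "12:30"

as_text = io.StringIO()
as_text.write(value)        # a plain write stores the characters unchanged
print(as_text.getvalue())   # 12:30
```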

bachoo786 · October 12, 2017, 9:51pm

Ok that works great.

How about showing all the teams from the python file? Would I need separate python scripts and separate text files for each team?

RobDYI · October 13, 2017, 12:14am

I would just output all the teams you want in the same python script. Something like this: use a separate text file for each team and point each HA sensor to its file. I would rename the variables and file names to correspond with the teams; I didn’t, so you can follow how to repeat this for more teams. You will also want to put the python script in a shell command and run it when you want the text files to update, for example every month.

str2 = "Manutd"
startnum = int(name.find(str2))
newname = name[startnum + 6 :startnum + 11]
str3 = "nextteam"  #note the # of char in the team name
startnum3 = int(name.find(str3))
newname3 = name[startnum3 + 8 :startnum3 + 13]  #adjusted the 6 to 8 and the 11 to 13 for team char length

print (newname)

with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
    json.dump(newname, outfile)
print (newname3)

with open('/home/homeassistant/.homeassistant/python_scripts/data3.txt', 'w') as outfile:
    json.dump(newname3, outfile)
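With more than a couple of teams, the repeated blocks could be replaced by a loop. This is a sketch under a few assumptions: the team list is an example, a temporary directory stands in for the python_scripts folder, the file names are derived from the team names, and the files are written as plain text so they carry no JSON quotes.

```python
import os
import tempfile

name = "Saturday 14 October 2017Manutd12:30\u00a0Arsenal17:30Liverpool12:30\u00a0Mancity15:00"
teams = ["Manutd", "Arsenal", "Liverpool"]

# Assumption: a temporary directory stands in for the python_scripts folder.
out_dir = tempfile.mkdtemp()

written = {}
for team in teams:
    start = name.find(team)
    if start == -1:
        continue                       # team not in this week's text
    begin = start + len(team)          # offset follows the team name length
    kickoff = name[begin:begin + 5]    # the five characters HH:MM
    path = os.path.join(out_dir, team.lower() + ".txt")
    with open(path, "w") as outfile:
        outfile.write(kickoff)         # plain text, so no JSON quotes
    written[team] = path

print(written)
```

Each file sensor would then point at its own team file, and adding a team only means extending the list.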