Reading text and time from a python file and having it as a sensor in HA

Hi all

I have a python file that contains text and time. What I want is for a command line sensor to extract the text and time and present them in HA as a sensor.

This is what I have so far in my configuration.yaml but my sensor does not read any of the data from the python file:

  - platform: command_line
    name: Football
    command: "python /home/homeassistant/.homeassistant/python_scripts/output.py | grep 'Manutd' | sed 's/^.*: //'"

This is my output.py:

Football Times
Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 Mancity15:00Newcastle17:30 Spurs15:00

I believe my sensor is not reading the values from the python file because of the spacing and the selection of “Manutd” in this case. Can someone please help me with this?

I also need the unit_of_measurement to be a time in 24-hour format. How can I achieve this?

Many thanks.

@robmarkcole @Tinkerer maybe you guys can help please?

First off let me say this is a cool idea.

With that said, is your goal just to publish the time and date of the ‘Manutd’ game? What is the python script actually doing?

Why the grep and sed commands? Specifically sed is (if I have this right) removing everything between the beginning of the line and the first colon followed by a space - not sure why you want to do that.

My advice is to make a copy of output.py that simply returns the exact information you are looking for. Ideally it would take a command line parameter (e.g. Manutd) and return the info just for that game. I’ll confess I am not sure what exactly the sensor looks for to interpret as a time and date, but a UNIX timestamp is a good starting guess.
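
As a rough sketch of that suggestion, a script could take the team name as an argument and print only its kick-off time. The sample string is copied from the output posted above; the function name kickoff_time and the regex are assumptions about the layout (team name immediately followed by HH:MM), not something taken from the original script.

```python
import re
import sys

# Sample text copied from the output.py contents posted in this thread.
SAMPLE = ("Football Times\n"
          "Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 "
          "Mancity15:00Newcastle17:30 Spurs15:00")


def kickoff_time(text, team):
    """Return the HH:MM that directly follows `team` in `text`, or None."""
    match = re.search(re.escape(team) + r"(\d{2}:\d{2})", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical usage: python kickoff.py Manutd
    team = sys.argv[1] if len(sys.argv) > 1 else "Manutd"
    print(kickoff_time(SAMPLE, team) or "unknown")
```

A command_line sensor could then call the script with the team name and use whatever it prints as the state, with no grep or sed needed.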

I think you should either format the output of output.py as JSON and use a rest sensor to pick out the info you need, or have your python program publish the info to MQTT, which HA can pick up.

I think your grep needs to be something like this if you want to use this method.

grep -E -o "Manutd.{0,5}"

I’d start by sorting that output to be something you can parse. I’d suggest that if you want to stick with plain text then you should output a series of lines like:

Day Team HH:MM

Also, that’s not a python script, that’s 2 lines of text; you can’t expect python to know what to do with that.

You’d be better off working with a file sensor, which would require you to output your data in JSON format.
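
To illustrate what JSON output could look like, here is a minimal sketch that turns the raw text into a team-to-time mapping. It assumes every entry is a run of letters immediately followed by HH:MM, as in the sample posted above; the variable names are made up for the example.

```python
import json
import re

# Raw text copied from the output posted earlier in this thread.
raw = ("Football Times\n"
       "Saturday 26 August 2017Manutd12:00 Arsenal15:00Liverpool15:00 "
       "Mancity15:00Newcastle17:30 Spurs15:00")

# Each fixture entry is a run of letters immediately followed by HH:MM.
fixtures = dict(re.findall(r"([A-Za-z]+)(\d{2}:\d{2})", raw))

# json.dumps keeps everything on a single line by default.
payload = json.dumps(fixtures, sort_keys=True)
print(payload)
```

If that single line is written to a file, a file sensor should be able to pick one team out with a value_template along the lines of {{ value_json.Manutd }}; treat that template as an untested assumption.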

Thank you all for your replies.

I should have mentioned that the output.py file is derived from a python script which obtains the information shown in the output.py file from a website.

I could save it as a text file or a python file but I need to get the data stored in this output file in JSON format which is something I am not familiar with.

To summarize I would need to edit the python script that obtains the data for the output file so that it is in JSON format. Can anyone please guide me on how I can do that?

@RobDYI this grep, grep -E -o "Manutd.{0,5}", did not make any difference to the sensor. It is still blank, showing only white space.

Well, you’ve not shared the python script, which makes that impossible.

If you were to share the file (say on hastebin.com) then I’m sure somebody who knows Python will be able to help you.

Sorry here is the python script.

    # import libraries
    import json
    import urllib2
    from bs4 import BeautifulSoup
    # specify the url
    quote_page = 'http://www.bbc.co.uk/sport/football/premier-league/scores-fixtures/'
    # query the website and return the html to the variable 'page'
    page = urllib2.urlopen(quote_page)
    # parse the html using Beautiful Soup and store in variable 'soup'
    soup = BeautifulSoup(page, 'html.parser')
    # Take out the <div> of name and get its value
    name_box = soup.find('div', attrs={'id':'ja-user2'})
    name = name_box.text.encode('utf-8').strip() # strip() is used to remove leading and trailing whitespace

    with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
        json.dump(name, outfile)

As you must have noticed, I have managed to dump the data in JSON format in a data.txt file. However, the data saved in this text file is not specific to ‘Manutd’.

Here is the data.txt:

"Football Times\nSaturday 14 October 2017Manutd12:30\u00a0Arsenal17:30Liverpool12:30\u00a0Mancity15:00Chelsea15:00\u00a0Spurs15:00"

When I add it as a file sensor, the output is the whole of data.txt as above and not just the Manutd entry, for example.

Have you tried another way to get this info besides scraping? (HA does have a scrape sensor.)

It looks like there are many free APIs that can provide this info in a more digestible form.

This one might work.

https://market.mashape.com/sportsop/soccer-sports-open-data

I got it from here

https://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/#footballdata

Well, I want to scrape other information from other websites.

I have used the HA scrape sensor but the output is the same as I posted above.

The data I am scraping is from a table, hence the difficulty in getting it into a sensor as text with a time as the unit.

Ok, I think you need to add some more lines to your python script to help find the exact info you want. I am not sure if you want sensors for each team or just looking for a Manutd sensor.

I would search Google for exactly what you want to do and look at the many examples. I would start by first finding the character position (index) of the numbers you want, so search for the text ‘Manutd’.

I found how to do that here:

https://www.tutorialspoint.com/python/string_find.htm

I modified their example

Example
#!/usr/bin/python

str1 = name;
str2 = "Manutd";
startnum = str1.find(str2) +1 ;

We can rewrite all this to

startnum = name.find('Manutd') +1 ;

and now we need to find how to get the next 5 characters starting at position startnum. I found it here, and this should return the time 12:30.

newname = name [startnum, 5]

https://www.tutorialspoint.com/python/python_strings.htm

You can modify your python script to accept inputs from HA for different teams and it will return the searched text to HA.

Hmm, I am working your example into my script. Can I ask a noob question: what is startnum?

startnum is a variable I made up to hold the character position of Manutd. You can actually write all the changes like this, with no new variables except the newly found text.

So add this line

newname = name [name.find('Manutd') +1, 5]

before this one and change this one too

with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
    json.dump(newname, outfile)

Right, so when I run my output script after adding the line you mentioned above, I get an error:

Traceback (most recent call last):
  File "football.py", line 15, in <module>
    newname = name [name.find('Manutd') +1, 5]
TypeError: string indices must be integers, not tuple

Sorry to bother you again but can you tell me please why do you want to find the character (index) of the numbers? How does that help?
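
For what it is worth, the TypeError above comes from the comma inside the square brackets: Python reads name[a, b] as indexing with the tuple (a, b), while a slice needs a colon, name[a:b]. A small sketch with a shortened sample string (made up for this example) shows the difference:

```python
# Shortened sample string, made up for this illustration.
name = "Saturday 14 October 2017Manutd12:30 Arsenal17:30"

start = name.find("Manutd")      # index where "Manutd" begins, or -1 if absent

try:
    name[start, 5]               # comma: the index is the tuple (start, 5)
except TypeError as err:
    print("comma form fails:", err)

# Colon form: characters from start up to, but not including, start + 6.
print(name[start:start + 6])
```

The index matters because the time sits right after the team name, so once you know where the name starts you can count forward to the digits.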

What I was trying to do was create a program to output 12:30. To do this, we need to locate text and find where that text begins.

This program will output 12:30; I checked it.

str2 = "Manutd"
startnum = int(name.find(str2))
newname = name[startnum + 6 :startnum + 11]

print (newname)

with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
    json.dump(newname, outfile)
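
A possible variation on the snippet above, offered only as a sketch: using len() on the team name avoids hard-coding the 6 and 11, and checking for -1 covers the case where the team is not in the text. The helper name time_after and the width parameter are assumptions for this example.

```python
def time_after(text, team, width=5):
    """Return the `width` characters that follow `team`, or None if absent."""
    start = text.find(team)
    if start == -1:              # str.find returns -1 when nothing is found
        return None
    begin = start + len(team)    # first character after the team name
    return text[begin:begin + width]


# Sample text based on the data.txt contents posted in this thread.
name = ("Football Times\nSaturday 14 October 2017Manutd12:30\u00a0Arsenal17:30"
        "Liverpool12:30\u00a0Mancity15:00Chelsea15:00\u00a0Spurs15:00")

print(time_after(name, "Manutd"))
print(time_after(name, "Arsenal"))
```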

That works very well, producing "12:30" as expected for Manutd. Thank you!!

I was wondering why data.txt contains "12:30" and not just 12:30? Wouldn’t this be an issue in value templating?

In my configuration.yaml I have added the following:

  - platform: file
    name: Manutd
    file_path: /home/homeassistant/.homeassistant/python_scripts/data.txt
    unit_of_measurement: 'hours'

But the sensor state does not show the time in hours. What am I doing wrong?

I think you are saving data.txt as JSON. Just save it normally and it won’t have the " characters. Also, I would modify the program to create a text file for each team you want, and then you can point each HA sensor to the corresponding file.
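
To show where the quotes come from, here is a small sketch comparing json.dump with a plain write. It uses in-memory buffers instead of data.txt and assumes Python 3 (the scraper in this thread is Python 2, where io.StringIO expects unicode strings).

```python
import io
import json

newname = "12:30"

as_json = io.StringIO()
json.dump(newname, as_json)      # JSON encodes a string with surrounding quotes

as_text = io.StringIO()
as_text.write(newname)           # a plain write stores the characters as they are

print(as_json.getvalue())        # "12:30"
print(as_text.getvalue())        # 12:30
```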

Ok that works great.

How about showing all the teams from the python file? Would I need separate python scripts and separate text files for each team?

I would just output all the teams you want in the same python script. Something like this: write a separate text file for each team and point each HA sensor to its file. I would rename the variables and file names to correspond with the teams; I didn’t, so you can follow how to repeat it for more teams. You will also want to put the python script in a shell command and run it when you want the text files to update every month.

str2 = "Manutd"
startnum = int(name.find(str2))
newname = name[startnum + 6 :startnum + 11]
str3 = "nextteam"  #note the # of char in the team name
startnum3 = int(name.find(str3))
newname3 = name[startnum3 + 8 :startnum3 + 13]  #adjusted the 6 to 8 and the 11 to 13 for team char length

print (newname)

with open('/home/homeassistant/.homeassistant/python_scripts/data.txt', 'w') as outfile:
    outfile.write(newname)  # plain write, so the file has no JSON quotes
print (newname3)

with open('/home/homeassistant/.homeassistant/python_scripts/data3.txt', 'w') as outfile:
    outfile.write(newname3)
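
If the list of teams grows, the repeated blocks above could be folded into a loop. This is only a sketch: the function name write_team_files, the lower-case file naming and the temporary output folder are assumptions for the example, and in practice you would point out_dir at your python_scripts folder.

```python
import os
import tempfile


def write_team_files(text, teams, out_dir):
    """Write one plain-text file per team holding its HH:MM; return the paths."""
    written = {}
    for team in teams:
        start = text.find(team)
        if start == -1:
            continue                      # skip teams that are not in the text
        begin = start + len(team)         # first character after the team name
        path = os.path.join(out_dir, team.lower() + ".txt")
        with open(path, "w") as outfile:
            outfile.write(text[begin:begin + 5])
        written[team] = path
    return written


# Sample text based on the data.txt contents posted in this thread.
name = ("Football Times\nSaturday 14 October 2017Manutd12:30\u00a0Arsenal17:30"
        "Liverpool12:30\u00a0Mancity15:00Chelsea15:00\u00a0Spurs15:00")

out_dir = tempfile.mkdtemp()              # stand-in for the real folder
paths = write_team_files(name, ["Manutd", "Arsenal", "Spurs"], out_dir)
print(sorted(paths))
```

Each file sensor would then use the matching path as its file_path.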