HA-Multiscrape issues

Hello,

I am struggling to get multiscrape working.

Here is my config:

- resource: https://hollywoodtheatre.org
  scan_interval: 6000
  log_response: true
  sensor:
    - unique_id: hollywood_daily
      name: Hollywood Daily Showings
      select: "body > header > div.primary-header__container.p-2.md\:p-4.lg\:p-8.bg-gradient-to-b.from-black\/90.to-black\/0 > div > div.relative.z-20.w-1\/5.sm\:w-2\/5.grow.flex.gap-4 > button > svg > path:nth-child(5)"
      value_template: "{{ value }}"

For now I am just trying to grab the link to the home page, since I couldn’t get the information I actually want to work properly, or really at all. I have debug logging on, and the error I keep getting is:

2025-01-29 06:54:51.700 DEBUG (MainThread) [custom_components.multiscrape.scraper] Scraper_noname_0 # Hollywood Daily Showings # Tag selected: None
2025-01-29 06:54:51.700 ERROR (MainThread) [custom_components.multiscrape.sensor] Scraper_noname_0 # Hollywood Daily Showings # Unable to scrape data: Could not find a tag for given selector

I am using Inspect to get the selector. When I used select: "body" it did return the text of the entire website, but I cannot get it to grab just the part that I am looking for. I’ve never done web scraping before and am learning as I go, so please bear with me if I am missing something simple.

My ultimate goal is to grab the movie times for the day (and hopefully week or month later) and display them, or use them in voice responses.

I am grateful for any help I can get!

Thanks in advance!

No need to use multiscrape with that page. If you look at the Network tab in Chrome developer tools, you can see a request to https://hollywoodtheatre.org/wp-json/gecko-theme/v1/show-list?view=today&_locale=user
which we can call directly with a rest sensor, without any scraping.

These would be your sensors:

sensor:
  - platform: rest
    name: "Hollywood Theatre Show 1"
    resource: "https://hollywoodtheatre.org/wp-json/gecko-theme/v1/show-list?view=today&_locale=user"
    method: GET
    value_template: "{{ value_json.shows[0].display_date }} - {{ value_json.shows[0].title }} - {{ value_json.shows[0].events[0].start_time }}"
    scan_interval: 3600

  - platform: rest
    name: "Hollywood Theatre Show 2"
    resource: "https://hollywoodtheatre.org/wp-json/gecko-theme/v1/show-list?view=today&_locale=user"
    method: GET
    value_template: "{{ value_json.shows[1].display_date }} - {{ value_json.shows[1].title }} - {{ value_json.shows[1].events[0].start_time }}"
    scan_interval: 3600

  - platform: rest
    name: "Hollywood Theatre Show 3"
    resource: "https://hollywoodtheatre.org/wp-json/gecko-theme/v1/show-list?view=today&_locale=user"
    method: GET
    value_template: "{{ value_json.shows[2].display_date }} - {{ value_json.shows[2].title }} - {{ value_json.shows[2].events[0].start_time }}"
    scan_interval: 3600
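
The three sensors above each fetch the same URL separately. As a rough alternative sketch, the top-level rest: integration lets several sensors share a single request. This assumes the endpoint keeps returning a top-level shows list as used above, and it goes at the top level of configuration.yaml rather than under sensor:. I have not tested it against this site.

```yaml
# Sketch only: one HTTP request shared by all three sensors.
# Assumes the JSON still has the "shows" list used in the templates above.
rest:
  - resource: "https://hollywoodtheatre.org/wp-json/gecko-theme/v1/show-list?view=today&_locale=user"
    method: GET
    scan_interval: 3600  # seconds between requests (1 hour)
    sensor:
      - name: "Hollywood Theatre Show 1"
        value_template: "{{ value_json.shows[0].display_date }} - {{ value_json.shows[0].title }} - {{ value_json.shows[0].events[0].start_time }}"
      - name: "Hollywood Theatre Show 2"
        value_template: "{{ value_json.shows[1].display_date }} - {{ value_json.shows[1].title }} - {{ value_json.shows[1].events[0].start_time }}"
      - name: "Hollywood Theatre Show 3"
        value_template: "{{ value_json.shows[2].display_date }} - {{ value_json.shows[2].title }} - {{ value_json.shows[2].events[0].start_time }}"
```

If the theatre lists fewer than three shows on a given day, the templates for the missing entries will fail to render, so you may want to guard them with a check on the length of value_json.shows.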



You can change the scan_interval to whatever you need; it is currently set to 3600 seconds (1 hour).
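
Since the goal is to show all of the day's movie times, one sensor that keeps the whole list as an attribute may scale better than one sensor per show. The following is a minimal sketch, assuming the response keeps its top-level shows key; the sensor name is just something I made up for illustration. The state holds only the number of shows, because a state is limited to 255 characters, and the full list goes into the shows attribute.

```yaml
# Sketch only: one sensor, with the full show list stored as an attribute.
sensor:
  - platform: rest
    name: "Hollywood Theatre Shows Today"  # hypothetical name
    resource: "https://hollywoodtheatre.org/wp-json/gecko-theme/v1/show-list?view=today&_locale=user"
    method: GET
    # State: how many shows are listed today (kept short on purpose).
    value_template: "{{ value_json.shows | count }}"
    # Copy the top-level "shows" key into the sensor's attributes.
    json_attributes:
      - shows
    scan_interval: 3600
```

In a template (a card or a voice response, for example) you could then list the titles with something like {{ state_attr('sensor.hollywood_theatre_shows_today', 'shows') | map(attribute='title') | join(', ') }}. The entity ID here is what I would expect Home Assistant to generate from that name, so check the actual ID in your entity list.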

Is that what you wanted? Let me know if you need help pulling out any other values.