Special characters wrongly scraped/displayed by the scrape sensor in HA - any tips?

Hi,

I use the scrape sensor to get a short weather notice from DWD. HA is running in Docker, and I have upgraded to the latest version, 2022.7.5. The following sensor is configured:

platform: scrape
resource: https://opendata.dwd.de/weather/text_forecasts/html/VHDL50_DWPG_LATEST_html
name: "Wetter heute Berlin kurz"
select: "strong"
scan_interval: 900

The issue is that special characters are sometimes wrongly encoded and show up as question marks (characters like “ä”/“ö”/“ü”, potentially also “ß”). It does not happen every time, only occasionally. On the source webpage itself, the special characters are rendered properly in the text.

Inspecting the HA frontend with the browser's inspector (Chrome Version 103.0.5060.53) shows the following, which matches exactly what the browser displays:

<div class="text-content">
          <!--?lit$450737128$-->
Heute H�hepunkt der Hitzeentwicklung, bis morgen starke W�rmebelastung. Am Donnerstag �rtlich Gewitter.

        </div>

When I look at prior states of the sensor, the encoding seems to have been handled correctly. I took this markup from HA by opening the sensor and checking the historical values:

<div class="chartTooltip  top " style="top:713.7999995529651px;left:640px;">
              <div class="title"><!--?lit$450737128$--><!---->wetter heute berlin kurz<!----></div>
              <!--?lit$450737128$--><div class="beforeBody">
                    <!--?lit$450737128$--><!---->sensor.wetter_heute_berlin_kurz<!---->
                  </div>
              <div>
                <ul>
                  <!--?lit$450737128$--><!----><li>
                      <div class="bullet" style="background-color:#7f80cd;border-color:#7f80cd;"></div>
                      <!--?lit$450737128$-->
Heute starke Wärmbelastung sowie Höhepunkt der Hitzeentwicklung. Am Donnerstag örtlich Gewitter, weiterhin heiß.

20. Juli 2022 um 13:27:54
20. Juli 2022 um 17:30:07
                    </li><!---->
                </ul>
              </div>
              <!--?lit$450737128$-->
            </div>

Does anyone have an idea how I can debug this further, or how to set it up correctly? It seems that in some cases the correct encoding is not used/found by the scrape sensor…
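
To illustrate what I suspect is happening, here is a minimal Python sketch. It assumes the DWD page is delivered in a single-byte encoding such as ISO-8859-1 and is sometimes decoded as UTF-8 instead. I have not verified that assumption against the server or the scrape sensor's code. The sample string and the helper name `decode_as` are made up for illustration.

```python
# Sketch only: assumes the page bytes are ISO-8859-1 (unverified assumption)
# and shows what happens when such bytes are decoded as UTF-8 instead.

REPLACEMENT = "\ufffd"  # U+FFFD, the character rendered as a question-mark glyph


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD for bytes that cannot be decoded."""
    return raw.decode(encoding, errors="replace")


# Hypothetical sample text, encoded the way the page is assumed to be served.
sample = "Heute Höhepunkt der Hitzeentwicklung, örtlich Gewitter.".encode("iso-8859-1")

wrong = decode_as(sample, "utf-8")        # mismatched decoder
right = decode_as(sample, "iso-8859-1")   # matching decoder

print(wrong)
print(right)
print("replacement characters present:", REPLACEMENT in wrong, REPLACEMENT in right)
```

If this is indeed the cause, the umlauts come out intact with the matching decoder and turn into replacement characters with the mismatched one, which looks like the broken state above. A next debugging step could be to compare the charset declared in the page's HTTP `Content-Type` header and its HTML `meta` tag with the encoding the bytes are actually in.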

Thanks for any hints,

André

Adding screenshot of the dashboard used:

I created a bug for this: https://github.com/home-assistant/core/issues/75610