Haveibeenpwned stopped working: failed fetching data (HTTP Status_code = 403)

tedsluis · January 18, 2019, 5:22pm

Hi, I have been using haveibeenpwned for quite some time, but for the past few weeks it hasn’t worked. The entity in the Lovelace UI returns unknown and I get this error in the HA log:

failed fetching data (HTTP Status_code = 403)

I turned on debug logging:

# configuration.yaml
logger:
  default: info
  logs:
    homeassistant.components.sensor.haveibeenpwned: debug

And I see this in my HA log:

jan 18 17:31:18 INFO (MainThread) [homeassistant.loader] Loaded sensor.haveibeenpwned from homeassistant.components.sensor.haveibeenpwned
jan 18 17:31:18 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.haveibeenpwned
jan 18 17:31:19 DEBUG (SyncWorker_7) [homeassistant.components.sensor.haveibeenpwned] Checking for breaches for email: [email protected]
jan 18 17:31:19 ERROR (SyncWorker_7) [homeassistant.components.sensor.haveibeenpwned] Failed fetching data for [email protected](HTTP Status_code = 403)
jan 18 17:31:27 DEBUG (SyncWorker_5) [homeassistant.components.sensor.haveibeenpwned] Checking for breaches for email: [email protected]
jan 18 17:31:27 ERROR (SyncWorker_5) [homeassistant.components.sensor.haveibeenpwned] Failed fetching data for [email protected](HTTP Status_code = 403)
jan 18 17:31:33 DEBUG (SyncWorker_7) [homeassistant.components.sensor.haveibeenpwned] Checking for breaches for email: [email protected]

I checked the haveibeenpwned API documentation and found this:

403	Forbidden — no user agent has been specified in the request

And this:

Specifying the user agent
Each request to the API must be accompanied by a user agent request header. 
Typically this should be the name of the app consuming the service, for example "Pwnage-Checker-For-iOS".  
A missing user agent will result in an HTTP 403 response. A valid request would look like:

GET https://haveibeenpwned.com/api/{service}/{parameter}
User-Agent: Pwnage-Checker-For-iOS

The user agent should accurately describe the nature of the API consumer
such that it can be clearly identified in the request. 
Not doing so may result in the request being blocked.
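
For anyone who wants to see what the documented requirement looks like outside of HA, here is a minimal Python sketch that builds (but does not send) such a request. The `build_request` helper and the example address are hypothetical; the URL pattern is the v2 `breachedaccount` endpoint used in the curl commands in this thread.

```python
from urllib.parse import quote
from urllib.request import Request

# v2 endpoint as used in the curl commands in this thread.
API_URL = "https://haveibeenpwned.com/api/v2/breachedaccount/{account}"


def build_request(account, user_agent):
    """Build a breach lookup request that carries a User-Agent header.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    if not user_agent:
        # The API docs quoted above say a missing user agent results in a 403.
        raise ValueError("a user agent is required by the API")
    url = API_URL.format(account=quote(account, safe=""))
    return Request(url, headers={"User-Agent": user_agent})


req = build_request("user@example.com", "Pwnage-Checker-For-iOS")
print(req.full_url)
print(req.get_header("User-agent"))  # urllib stores header names capitalized
```

Passing the result to `urllib.request.urlopen` would actually send it; that step is left out here on purpose so the sketch does not add to the API traffic.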

So I checked the HA code at https://github.com/home-assistant/home-assistant/blob/master/homeassistant/components/sensor/haveibeenpwned.py to see whether it specifies a User-Agent. It does, so a missing user agent shouldn’t be the cause of the 403…

So now I wonder whether others are experiencing this issue too. How can I solve it?

anon43302295 · January 18, 2019, 5:32pm

I had this earlier today, but it seems to be working again now. I’m guessing there was a problem on the server end.

Silicon_Avatar · January 18, 2019, 5:49pm

Very recently another massive data breach was discovered, so I’m sure HaveIBeenPwned is getting hammered lately with people checking if they’ve been compromised.

Sangeeth_Raja · January 18, 2019, 8:17pm

I am also facing the same issue. Getting a 403 response code.

lassieee · January 20, 2019, 2:18pm

I was checking a total of 5 email addresses and apparently that has gotten me blocked:

Request Blocked

You have been blocked from accessing this resource on Have I Been Pwned.

This may be due to violating one or more of the acceptable use terms of the API.

It may also be due to your traffic patterns being similar to other users who may have violated the acceptable use terms.

Tips to avoid requests being blocked include:

Stick well within the published rate limit
Don't distribute requests over multiple IP addresses in an attempt to circumvent the rate limit
Only query the email addresses of people who have a reasonable expectation that you should do so
Avoid prolonged querying of the API over an extended period of time

--------

I found out by using curl to contact the api:
curl https://haveibeenpwned.com/api/v2/breachedaccount/[email protected]

Is anyone else having this? And most importantly, how do I get rid of it?
I disabled the haveibeenpwned component hoping that I would be able to query their service again in a couple of days, but after 3 days I’m still blocked.
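
On the first tip in that message (staying well within the rate limit), a client that checks several addresses could space out its requests. Below is a minimal Python sketch of that idea. The `Throttle` class is hypothetical, and the 2-second interval is only a placeholder, not the published limit; check the API documentation for the real value.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait().

    Hypothetical helper; clock and sleep are injectable so it can be tested.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous request: pause for the rest.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Placeholder interval (assumption), deliberately not the published limit.
throttle = Throttle(min_interval=2.0)
```

Calling `throttle.wait()` before each lookup would then keep the requests at least `min_interval` seconds apart, no matter how many addresses are configured.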

anon43302295 · January 20, 2019, 4:35pm

Interesting.

I get the same, but haven’t had a 403 for ages. Anyone know how we can get unblocked?

raph · January 26, 2019, 2:38pm

Well this sucks. I set up the component to check my wife’s and my email addresses and notify me when something happens. Now I happen to check my hassio logs and find all those errors. So I go to check all the addresses manually on their site, and it turns out I’m in this newest breach. Sucks to find out later than everybody else just because I relied on my raspi. I mean, this is exactly the reason why I set it up in the first place.
Has anyone found a way to get this component working again? Rebooting and disabling the component for a while did nothing for me.

lassieee · January 27, 2019, 2:43pm

Same for me. I have had it disabled for over a week now, and I’m still not unblocked.
The error message I get when using curl contains this line:
“If you believe your request meets these requirements and was still blocked, please send this entire response body along with any communication you send regarding the error.”
I haven’t found any contact information to address this issue to, though. I read that haveibeenpwned uses a Cloudflare service to block IP addresses (part of the error message shows class="cferror_details"), so maybe I should contact Cloudflare.
I’ve no idea lol

Megachip · February 24, 2019, 3:18pm

Anything new on that problem? I’m blocked too (and I’m only checking one account).

lassieee · February 24, 2019, 4:30pm

After a full month my IP was still blocked, so I contacted the creator of haveibeenpwned.com, Troy Hunt. He is really accessible. He responded really quickly and unblocked my IP, so I’m back in business. I advise you to do the same, as those IP bans aren’t temporary.

tedsluis · February 27, 2019, 6:20am

My IP is unblocked too, now. I contacted Troy via:

and explained my case. He responded very quickly. At first he was a bit cautious (and friendly). He wanted to see the Home Assistant HTTP response body while trying to access HaveIBeenPwned. The HA debug logging was not enough for him. Without the HTTP response body he could not see what was going on.

I did not know how to capture it out of HA, so I sent him the HTTP response body from a curl on the command line (with the HA user agent in it):

$ curl -A "Home Assistant HaveIBeenPwned Sensor Component" https://haveibeenpwned.com/api/v2/breachedaccount/[email protected]?truncateResponse=true
<!DOCTYPE html>
<head>
<title>Request Blocked</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<h1>You have been blocked from accessing this resource on Have I Been Pwned</h1>

<p>This may be due to violating one or more of <a href="https://haveibeenpwned.com/API/v2#AcceptableUse">the acceptable use terms of the API</a> or for not complying with <a href="https://haveibeenpwned.com/API/v2">the API specifications</a>. It may also be due to your traffic patterns being similar to other users who may have violated the acceptable use terms.</p>

<p>Tips to avoid requests being blocked include:</p>
<ol>
<li>Stick well within the published rate limit</li>
<li>Don't distribute requests over multiple IP addresses in an attempt to circumvent the rate limit</li>
<li>Only query the email addresses of people who have a reasonable expectation that you should do so</li>
<li>Avoid prolonged querying of the API over an extended period of time</li>
<li>Clearly identify your app in the user agent string <a href="https://haveibeenpwned.com/API/v2#UserAgent">per the API docs</a>.</li>
</ol>
<p>If you believe your request meets these requirements and was still blocked, please send this entire response body along with any communication you send regarding the error.</p>
<div class="cf-error-details cf-error-1020">
  <h1>Access denied</h1>
  <p>This website is using a security service to protect itself from online attacks.</p>
  <ul class="cferror_details">
    <li>Ray ID: 4af59d5a2d0fc859</li>
    <li>Timestamp: 2019-02-26 21:48:13 UTC</li>
    <li>Your IP address: MY IP ADDRESS</li>
    <li class="XXX_no_wrap_overflow_hidden">Requested URL: haveibeenpwned.com/api/v2/breachedaccount/[email protected]?truncateResponse=true </li>
    <li>Error reference number: 1020</li>
    <li>Server ID: FL_20F331</li>
    <li>User-Agent: Home Assistant HaveIBeenPwned Sensor Component</li>
  </ul>
</div>

</body>
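
Since a missing user agent and a block like this one both come back as a 403, the response body is what tells them apart. Here is a minimal Python sketch that checks a body for the block page and pulls out the details worth including in a report. The `describe_block` function is hypothetical, and the markers it looks for are taken only from the response shown above, so a differently worded block page would not be recognized.

```python
import re

# Fields from the error details list in the response body shown above.
LABELS = ("Ray ID", "Timestamp", "Error reference number")


def describe_block(body):
    """Return block details if body looks like the 'Request Blocked' page.

    Hypothetical helper; returns None when the block page markers are absent.
    """
    if "Request Blocked" not in body and "You have been blocked" not in body:
        return None
    details = {}
    for label in LABELS:
        match = re.search(r"<li>" + re.escape(label) + r":\s*([^<]+)</li>", body)
        if match:
            details[label] = match.group(1).strip()
    return details


sample = (
    "<title>Request Blocked</title>"
    "<li>Ray ID: 4af59d5a2d0fc859</li>"
    "<li>Error reference number: 1020</li>"
)
print(describe_block(sample))
```

The full body is still what should be sent along with any report, as the block page itself asks; the extracted fields are only a quick way to confirm which kind of 403 it is.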

After this he wrote that it looks like I got caught up in the net of other abusive traffic on the same network, and he unblocked my IP address. He apologized for the inconvenience and wrote that it’s increasingly hard to keep the bad stuff out and let the good stuff in. And he is doing all this in his spare time! I think that is great!

baldfox · July 19, 2019, 1:06am

I spoke with Troy, he alerted me to this post:

Looks like we might need to update the API version used by the component and/or subscribe.

gurbina93 · August 7, 2019, 12:30pm

Mine stopped working for the 3rd time in a year. I have waited for 3 weeks and I am still blocked.

And anyway, it seems it will stop working on August 18. Can anyone open the component and confirm the version being used?

"Versions 1 and 2 of the API for searching breaches and pastes by email address will be disabled in 4 weeks from today on August 18."

Example:
GET https://haveibeenpwned.com/api/v2/breachedaccount/[email protected]

Paying 3.5 USD to check if an email has been breached seems a bit steep though. I think this might mark the end of this component, unless HA reaches out to the service and comes to an agreement on a specific service for HA, or the component is kept alive for whoever is willing to pay 3.5 USD just for this.
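
For reference, here is a minimal Python sketch of what a lookup against the newer, key-based API might look like. It assumes version 3 uses the same `breachedaccount` path and expects the subscription key in a `hibp-api-key` request header, as the v3 API documentation describes; the `build_v3_request` helper and the placeholder key are hypothetical, and nothing is sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed v3 endpoint: same path as the v2 example above with the version bumped.
V3_URL = "https://haveibeenpwned.com/api/v3/breachedaccount/{account}"


def build_v3_request(account, api_key, user_agent):
    """Build (but do not send) a v3 lookup with an API key and user agent.

    Hypothetical helper; the header name is an assumption based on the v3 docs.
    """
    if not api_key:
        raise ValueError("the key-based API needs an API key")
    url = V3_URL.format(account=quote(account, safe=""))
    headers = {
        "hibp-api-key": api_key,   # assumed header name for the paid key
        "User-Agent": user_agent,  # still required, as in v2
    }
    return Request(url, headers=headers)


req = build_v3_request(
    "user@example.com",
    "YOUR-API-KEY",  # placeholder, not a real key
    "Home Assistant HaveIBeenPwned Sensor Component",
)
print(req.full_url)
```

Whether the component ends up working this way depends on how it gets updated; the sketch only shows the shape of the request.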

lassieee · August 16, 2019, 6:18pm

Mozilla offers a similar tool: https://monitor.firefox.com/
Perhaps someone with the right skillset can create a new component for this?

Fusion · November 17, 2019, 9:40pm

Actually, Mozilla’s data is provided by haveibeenpwned so this would be bypassing the part where you are financially helping.

lassieee · November 18, 2019, 6:09am

That would only be true if haveibeenpwned offered their services to Mozilla for free. If they don’t, we would actually be supporting Troy.