IMAP custom event data template - maximum text/content size not working

I am trying to search for a string in the body of a recurring email using the IMAP integration.
The email is over 32k, so I have to rely on the proposed workaround:

which states: “the advantage is that there is no limitation to the size of the email text, as this is available as a variable within the template.”


PS: I can’t configure a max size over 30000, and leaving the field blank reverts it to 2048.

But no matter how I try to configure the integration and template, I still hit the event 32k limit as per log details:

2024-02-14 16:07:31.901 WARNING (MainThread) [homeassistant.components.imap.coordinator] Custom imap_content event skipped, size (38315) exceeds the maximal event size (32168), sender: [email protected], subject: ❄️ Avis d’événement de pointe | Jeudi 15 février 2024

If I forward the email to myself, the template works as expected.

Can you replicate this, or am I configuring it incorrectly?

Issue opened here: home-assistant/core issue #110614 (IMAP custom event data template - maximum text/content size not working)

Thank you!

Have you tried setting the Max message size back to its default value, or at least to a value significantly smaller than the absolute maximum? The point of using the custom event template is to be able to parse large message data without forcing it into the event, keeping the event’s size within its limit. If you set the message size as high as possible and your message maxes it out, you are very likely to exceed the event’s max size once the other data (sender, recipient, etc.) is included in the event.

Yes, I did. Actually, I was hoping that leaving the field empty would mean “ignore max size”, but it reverts to 2048, with the same result…

Did you get positive results with the default value (2048)? Or did you delete your previous message because it failed? I will have to do further testing, but it doesn’t seem to work as expected. Thx

I thought I had, but then I realized that I hadn’t 100% ruled out that my search term was within the first 2048 characters… that’s why I deleted the previous post.

I am seeing the same thing as you, the custom template does not seem to have access to the complete contents of the message body, just the portion up to the Max message size setting.

Thanks for the feedback. If you have a chance, would you mind checking with a message over 32k? Here it simply doesn’t process the message/template at all and returns a log error like the one in my original post. Under 32k, it processes the message up to the maximum size.

My test message was 40k.

I think you are confusing the 32K event size limit and the 30K message size limit. They are related, but not the same thing.

Can you elaborate and provide examples? Let’s say my message is 40k and the string is located somewhere in the middle of the data (at 20k): what will happen with my template above and the event?

Currently, the use case above doesn’t return anything other than an error message in the log.

I even tried setting up the template to filter the sender instead and the expected result (custom: true) still doesn’t show up in events.

If this is the intended behavior, I’m not sure what this means: “the advantage is that there is no limitation to the size of the email text, as this is available as a variable within the template.”

I agree that what I am seeing is not what I would expect from the description in the docs or the PR you linked.

If your Max message size is set high enough that your search term is within its limit, the template will return true.

EDIT: There’s more nuance to this, see below.
TL;DR: the template should find the string in the case described.

The error message is due to exceeding the event’s size limit. This causes the event to not be posted to the event bus. Your event-trigger-based content sensor won’t update with the data from that email, because there isn’t an event…

It’s a balancing act of setting the message size limit high enough that it captures your search term, but not so high that the sum of the text and all the other data (sender, server, headers, etc) exceeds the 32K event limit.
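
The balancing act can be sketched in a few lines of Python. This is only an illustration: the 32168 limit is the value reported in the log line earlier in this thread, while the helper names, the field names, and the use of the JSON-serialized length as the "event size" are assumptions, not how the integration necessarily measures it.

```python
import json

# Limit taken from the warning in the log above.
MAX_EVENT_SIZE = 32168


def estimated_event_size(text: str, max_message_size: int, other_fields: dict) -> int:
    """Rough size in bytes of an event payload after truncating the text.

    Assumption: the size is approximated by the JSON-serialized payload.
    """
    payload = dict(other_fields, text=text[:max_message_size])
    return len(json.dumps(payload).encode("utf-8"))


def fits_in_event(text: str, max_message_size: int, other_fields: dict) -> bool:
    """True if the truncated text plus the other fields stays under the limit."""
    return estimated_event_size(text, max_message_size, other_fields) <= MAX_EVENT_SIZE


# Hypothetical 40k message with about 3k of other data (sender, headers, ...).
body = "x" * 40000
other = {"sender": "a@example.com", "subject": "s", "headers": "h" * 3000}

print(fits_in_event(body, 30000, other))  # text cut at 30000 plus the rest
print(fits_in_event(body, 2048, other))   # text cut at the default 2048
```

With the text cut at 30000 characters, the remaining fields alone are enough to push the payload past the limit; with the default 2048 there is plenty of headroom.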

There is something else going on here…

The original sender’s message is 72k and the searched string is at 26k, so it should trigger an event, but it doesn’t.

However, if I forward the message to myself, its size drops to 64k, which is still over 32k, yet it does trigger events.

So there is something going on with the message beyond size, maybe the message type or contents?

original:

Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable


forwarded:

Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

For the original message, the error in the log states that the size is 38k, so I suspect that the message contents other than the text body (HTML encoding, headers) already push it over 32k. Once forwarded, it gets converted to plain text and drops back below the size threshold.
Could this be the issue?

Is there a way to attach files other than JPGs to a PM? I could send you a copy of both messages if you want to have a look.

I agree… there is something weird going on. With message size set to 2048, the template seems to pick up search results up to just over 25K characters into the message body. So it doesn’t work either the way the docs describe or the way I described earlier…

Hopefully the open issue will get some attention so I can get this working. Thx for your feedback.

OK, found the issue:

With the maximum message size set to 30k, it is much too close to the 32k event limit. If the message text is cut at 30k, the remaining attributes (everything other than the text/body) quickly fill the remaining 2k and push the event over the 32k limit. Consequently, the event is not triggered.

The solution is to keep the max message size lower. Since you are using a template to do the filtering before the event is published (without a text size limitation), you don’t need a complete/long text field in the event data anyway.

Case closed.

That’s what I said in post #10 :grin:

I’ve run a few more tests since last night, and it seems to mostly work as described out to 40K (that’s as far as I tested) when I use random unique strings for the search terms. However I had a couple instances where the results weren’t as expected. These mostly occurred with search terms that were made up of multiple words and included punctuation.


I may have misunderstood what you meant originally…
I marked your post with the solution tag :wink:

I always try to avoid punctuation and spaces (multiple words) as they can end up as codes in the raw message.

h &agrave;&nbsp;9  h, en matin&eacute;e.</li>=0A</ul>=0A=0A<p style=3D"ma

That was actually the case here: the original string is “9 h, en matinée” (notice the unusual space between “9” and “h”, which you can easily miss), and I ended up using “h, en matin” to avoid the space issue and the punctuation.
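
To illustrate why accented characters and punctuation are risky as search terms, here is a minimal Python sketch that decodes the raw fragment quoted above using only the standard library. The `decode_body` helper is hypothetical, and whether the integration’s template sees the raw or the decoded text is not established in this thread; the sketch only shows how the two forms differ.

```python
import html
import quopri

# Raw fragment as quoted from the message source in this thread.
raw = b'h &agrave;&nbsp;9  h, en matin&eacute;e.</li>=0A</ul>=0A=0A<p style=3D"ma'


def decode_body(raw_bytes: bytes, charset: str = "utf-8") -> str:
    """Undo quoted-printable encoding, then resolve HTML entities."""
    unquoted = quopri.decodestring(raw_bytes)  # =0A -> newline, =3D -> "="
    return html.unescape(unquoted.decode(charset))  # &eacute; -> é, etc.


raw_text = raw.decode("utf-8")
decoded = decode_body(raw)

print("matinée" in raw_text)      # False: the raw source holds "matin&eacute;e"
print("matinée" in decoded)       # True once the entities are resolved
print("h, en matin" in raw_text)  # True: this substring avoids the encoded parts
```

A plain ASCII substring such as “h, en matin” matches in both forms, which is why it is the safer search term.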

Thanks again for assisting on this one,
cheers,