I made some sensors for Paperless-ngx

If you don’t know Paperless-ngx, it is an open-source platform for organizing documents, with OCR, tagging, categories, etc. I use it, so I made a few sensors using its REST API and the RESTful sensor in Home Assistant.

[Image: Paperless-ngx dashboard card in Home Assistant, built with mushroom-template-card]

I have posted the code etc. here: Monitoring Paperless-ngx in Home-Assistant - Flemming's Blog

Hi. This is the first time I’ve heard of Paperless-ngx. Looks interesting. Are you running Paperless-ngx on the same machine as your HA?

I am running Home Assistant Operating System and have Portainer. Not sure how to install it.

I kind of do, but kind of not.

I am running UnRAID as my base OS, with Paperless-ngx running as a Docker container on it and Home Assistant OS running as a VM on it.

So from a network point of view, Paperless-ngx and Home Assistant are on two separate machines.
Anyway, I have an installation guide for UnRAID; it is mostly “the same” for other Docker-based setups.

There are also some HA add-ons that can be found on GitHub (e.g. GitHub - TheBestMoshe/home-assistant-addons: Collection of Home Assistant Addons). However, none of them is reliably maintained.

I decided that there can not be enough forks…

For everyone else creating REST sensors for Paperless-ngx:
You should try to minimize traffic, especially if your document database is huge.
For example, to fetch the document count you should never send the plain request to

http://<paperless-ngx_ip>:<port>/api/documents/

as this will return all document objects with all their data, including the full text.
Instead, call the API with query parameters like

?page=1&page_size=1&truncate_content=true

This will return an object like this:

{
  "count": 3822,
  "next": "http://<paperless-ngx_ip>:<port>/api/documents/?page=2&page_size=1&truncate_content=true",
  "previous": null,
  "all": [ ... ],
  "results": [
    { ... } // Only 1 object returned with truncated text inside
  ]
}

So your REST-sensor should look like this:

sensor:
  - platform: rest
    name: Paperless-ngx - Number of documents in Total
    resource: http://<paperless-ngx_ip>:<port>/api/documents/?page=1&page_size=1&truncate_content=true
    method: GET
    headers:
      Authorization: Token <User-Token>
    value_template: "{{ value_json.count }}"
    scan_interval: 300

Do you know if something has changed in the API? When adding ?page=1&page_size=1&truncate_content=true to /api/documents/ I still get all document objects including all data with full text. Also changing the page number doesn’t give the expected result.

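Edit: I already found the issue. I had to put the URL in the curl command in quotes; without them the shell treats each & as a control operator, so everything after ?page=1 never reaches the API. With quotes it does indeed return a minimal object.
Like this: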

curl -H "Authorization: Token <token>" "http://<url>:<port>/api/documents/?page=1&page_size=1&truncate_content=true" | jq

Hi to all,

I have trouble getting the REST sensors to work. No matter what I do, the sensors do not show up in Home Assistant. I followed the instructions by @flemmingss (creating the user in Paperless, generating the token, etc.)

To define the sensors in HA, I tried:

  • including them under the sensor section in my configuration.yaml (as described above) … the configuration check is successful, but no new entities appear
  • as I already use the REST integration in my configuration.yaml (rest: !include rest.yaml) to get information from my Grocy container, I have a separate rest.yaml … so I tried to define the sensors there via:
- resource: http://<IP>:<PORT>/api/documents/?page=1&page_size=1&truncate_content=true
  method: GET
  headers:
    Authorization: Token <TOKEN>
  scan_interval: 300
  sensor:
    - name: Paperless_total_documents
      value_template: "{{ value_json.count }}"
      unit_of_measurement: "Documents"

- resource: http://<IP>:<PORT>/api/tags/10/
  method: GET
  headers:
    Authorization: Token <TOKEN>
  scan_interval: 300
  sensor:
    - name: Paperless_inbox_documents
      value_template: "{{ value_json.document_count }}"
      unit_of_measurement: "Documents"

- resource: http://<IP>:<PORT>/api/tags/15/
  method: GET
  headers:
    Authorization: Token <TOKEN>
  scan_interval: 300
  sensor:
    - name: Paperless_todo_documents
      value_template: "{{ value_json.document_count }}"
      unit_of_measurement: "Documents"

Again no success … so far I got no error from HA's configuration checks, and checking the access via curl works! So what the heck is going on? Could someone help me here?

EDIT: I found in the logs that it could not fetch the data:

Error fetching data: http://IP:PORT/api/tags/10/ failed with All connection attempts failed

But Again … CURL works?!?

EDIT2: I found that for some reason the REST sensor does not fetch any data over plain http:// … using https:// and the reverse proxy I use for access from “outside”, the sensors work nicely. What is the problem here?

Many thanks in advance!

Greetings and happy holidays!
Sven

I’m wondering if NGX offers a total page count of all documents. I can see a total document count, but not total pages. What do you think?

Hi,

Just a question: I understand that it is a nice thing to see Paperless stats in HA.
With HA, though, I always ask the obvious follow-up question: “For what possible automation?”

I mean, will you trigger something when you reach 1,000 documents? What action would you take then?

Really, I am curious- can you give me an example?

/KNEBB

Not exactly automation. I think most people have some section on their dashboard to show “warnings” - robot vacuum sensors need cleaning, washing machine is done, some device needs its battery replaced, plant needs watering, tea timer is running… So you can just add another one for when there are documents in the inbox.
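
As a rough sketch of such a warning: assuming the inbox count sensor from earlier in the thread ended up with the entity ID sensor.paperless_inbox_documents (check yours under Developer Tools → States), a template binary sensor could turn on whenever the inbox is not empty. The sensor name is made up for illustration.

```yaml
# configuration.yaml (sketch; the entity ID below is an assumption)
template:
  - binary_sensor:
      - name: Paperless inbox has documents  # hypothetical name
        # On when the inbox counter is above zero. int(0) makes
        # "unknown" / "unavailable" states count as an empty inbox.
        state: "{{ states('sensor.paperless_inbox_documents') | int(0) > 0 }}"
```

The resulting binary sensor can then go into the same conditional card or “warnings” list as the other entries.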

The base integration will only give you stats, but with the REST API you can run automations. What I plan to set up is the following:

I add a tag to bills I still need to pay. In some cases I do not pay them right away, so in order not to forget, I want to trigger a notification like “hey, this doc has had the tag ‘unpaid’ for 14 days”.
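
One possible way to get there, as an untested sketch: let a REST sensor count the documents that carry the tag and were added more than 14 days ago, then send a daily notification while that count is above zero. Note that this only approximates “tagged for 14 days” by using the date the document was added. The tag ID, the tags__id__all and added__date__lt filter parameters, the sensor name, the resulting entity ID and the notify.notify target are all assumptions here, so check them against the API documentation of your Paperless-ngx version and your own setup.

```yaml
# configuration.yaml (sketch). If you already use "rest: !include rest.yaml",
# put the list item under "rest:" into that file instead.
rest:
  - resource_template: >-
      http://<IP>:<PORT>/api/documents/?tags__id__all=<TAG_ID>&added__date__lt={{ (now() - timedelta(days=14)).strftime('%Y-%m-%d') }}&page_size=1&truncate_content=true
    method: GET
    headers:
      Authorization: Token <TOKEN>
    scan_interval: 3600
    sensor:
      - name: Paperless_unpaid_overdue  # hypothetical name
        # "count" is the number of documents matching the filters above
        value_template: "{{ value_json.count }}"
        unit_of_measurement: "Documents"

automation:
  - alias: Paperless unpaid bills reminder  # hypothetical name
    trigger:
      - platform: time
        at: "09:00:00"
    condition:
      # Only notify while at least one matching document exists
      - condition: numeric_state
        entity_id: sensor.paperless_unpaid_overdue
        above: 0
    action:
      - service: notify.notify  # assumption: replace with your notifier
        data:
          message: >-
            {{ states('sensor.paperless_unpaid_overdue') }} document(s) still
            carry the unpaid tag after more than 14 days.
```

Keeping page_size=1 and truncate_content=true means the response stays small, as recommended earlier in the thread, because only the count field is used.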