I am searching for a way to integrate it into Home Assistant while keeping the “offline” functionality as it is.
My first idea was to buy something like this and connect it to an ESP32 board running ESPHome, from which I can emulate the same button presses that are made on the keypad.
You misunderstand the merger/splitter plug you have found.
Normally a 10/100 Mbit ethernet link only uses wires 1+2 and 3+6 of the 8 wires available.
The merger/splitter plug will typically move 1+2 and 3+6 onto the otherwise unused 4+5 and 7+8 for one of the two connectors, thereby utilizing all 8 wires on the cable connected to the side with only 1 connector. A second identical plug at the other end then splits them out again into two normal ethernet connectors.
In other words, for both connectors on the side with 2 connectors there will be no connection on wires 4, 5, 7 and 8.
If the RJ45 connector is the only one connected to the button panel, then it is not that simple anymore.
You have 7 buttons, a number display, and maybe even lights too.
With only 8 wires available it requires more than just a shorted wire here and there to make it work.
In the case of the Desky project, it uses both UART on one wire (for one-way comms from the controller to the control panel) and shorting of other wires (for control from the panel to the controller)!
This project might give you some structure for how to get started.
Good places to start are: doing a teardown of both controller and handset and posting photos, measuring voltages on the wires on the handset side (probably accessible), pulling out the logic analyser, and setting up an ESPHome UART debug to try to read UART messages on the wires.
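
For the UART debug part, a minimal sketch of the ESPHome side could look like the fragment below, to be added to an existing ESP32 config. The pin (GPIO16), the baud rate (9600) and the id desk_uart are only assumptions for illustration; the real values depend on what you measure on your handset wires. Check the voltage first, since a 5 V signal needs a level shifter or voltage divider before it goes into an ESP32 pin.

```yaml
# Listen-only UART sniffer: nothing is transmitted, received bytes are just logged.
uart:
  id: desk_uart          # hypothetical id, pick any name
  rx_pin: GPIO16         # assumption: whichever GPIO you wired to the suspected data line
  baud_rate: 9600        # assumption: a guess, confirm with the logic analyser
  debug:
    direction: RX
    dummy_receiver: true # needed because no other component is reading from this UART
    after:
      timeout: 50ms      # flush a log line once the line has been quiet for 50 ms
    sequence:
      - lambda: UARTDebug::log_hex(direction, bytes, ' ');
```

If the log shows repeating frames that change when the display changes, you probably have the right wire and baud rate. Garbage or nothing usually means a wrong baud rate, an inverted signal, or the wrong wire.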