Texecom2mqtt: Texecom alarm panel and MQTT integration with HA support

daern · October 5, 2021, 3:19pm

FWIW, I had no problems getting an installer account on Texecom Cloud to manage firmware on my own alarm. I think I even declared to them that I only intended to use it to support my own, DIY installation. Worth an ask.

iMiMx · October 5, 2021, 3:21pm

I have an installer account on their website, for firmwares and Wintex etc, so presume it wouldn’t be too much trouble for Cloud.

Although the Cloud, the App and most of their other software are, quite frankly, embarrassing; it’s hard to believe they think they are acceptable.

daern · October 5, 2021, 3:23pm

Yeah, never used any of their cloud services apart from for updating firmware (it’s actually easier than faffing around plugging things in). I have better functionality through this integration and the HA app

harrymrh · October 5, 2021, 8:30pm

I have replaced the host name with the IP address. No improvement, I am afraid. The messages are getting through to the MQTT broker, but the devices show as unavailable.

daern · October 6, 2021, 2:21pm

Have updated SmartCom to v03.01.00 (I see there is a v03.02.xx now available too), the add-on back to the current latest version and will see how this works moving forward.

The bit I’m confused about now is that, despite removing “Texecom Connect” from ARC 1, and doing a “reset digi” from the panel, Texecom Cloud still appears able to talk to the panel. I wonder if something, somewhere is holding the configuration and this extra chatter is destabilising the interface with HA…

iMiMx · October 6, 2021, 2:40pm

Maybe the 3.02.x is hardware revision specific?

I seem to recall they released a new-ish one, newer than mine, perhaps this year, I forget, that had different lights (or something). I did just check again, but only shows me the version I’m running.

My SmartCOM (cabled) has no internet access at any time, as it’s just way too chatty for my liking:

14:38:35.849393 IP texecom-smartcom.sec.themunki.net.48820 > one.one.one.one.domain: 32580+ A? broker5.texe1.ltd. (35)
14:38:35.852461 IP texecom-smartcom.sec.themunki.net.41860 > one.one.one.one.domain: 30019+ A? broker5.texe.com. (34)
14:38:40.852442 IP texecom-smartcom.sec.themunki.net.45486 > one.one.one.one.domain: 32580+ A? broker5.texe1.ltd. (35)
14:38:40.857792 IP texecom-smartcom.sec.themunki.net.43564 > one.one.one.one.domain: 30019+ A? broker5.texe.com. (34)
14:38:45.857774 IP texecom-smartcom.sec.themunki.net.48820 > one.one.one.one.domain: 32580+ A? broker5.texe1.ltd. (35)
14:38:45.862523 IP texecom-smartcom.sec.themunki.net.41860 > one.one.one.one.domain: 30019+ A? broker5.texe.com. (34)

Even with all the ARC gunf removed, it still tries to constantly resolve various Texecom hosts… which it shouldn’t be doing, in my view.
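
For anyone wanting to quantify that chatter, here is a minimal Python sketch that counts the hostnames queried in a capture like the one above. The function name and the regular expression are my own assumptions, written against the line format shown here; other tcpdump versions or options may print DNS queries differently.

```python
import re
from collections import Counter

# Matches the DNS question part of a tcpdump line, e.g. "32580+ A? broker5.texe1.ltd. (35)".
# The pattern is an assumption based on the capture shown above.
QUERY_RE = re.compile(r"\bA\? (\S+?)\.? \(\d+\)")


def count_dns_queries(lines):
    """Return a Counter mapping each queried hostname to how often it was looked up."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


capture = [
    "14:38:35.849393 IP texecom-smartcom.sec.themunki.net.48820 > one.one.one.one.domain: 32580+ A? broker5.texe1.ltd. (35)",
    "14:38:35.852461 IP texecom-smartcom.sec.themunki.net.41860 > one.one.one.one.domain: 30019+ A? broker5.texe.com. (34)",
    "14:38:40.852442 IP texecom-smartcom.sec.themunki.net.45486 > one.one.one.one.domain: 32580+ A? broker5.texe1.ltd. (35)",
]

for host, n in count_dns_queries(capture).items():
    print(f"{host}: {n}")
```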

daern · October 7, 2021, 8:01am

Yup, I’d agree. Mine is pretty adamant about connecting to the cloud whether or not I tell it to do so, which kinda sucks, although arguably, that is its primary purpose so most users would probably expect it!

Been pondering if this perpetual cloud connection is interfering with the HA interface, especially given how much the Texecom interfaces dislike having more than one app talking to them at the same time. I’ve not yet black-holed mine as I’m trying to change just one thing at a time so I can actually find what the cause of my lockups is, but once things get stable I will probably do the same as you.

MarkB1 · October 7, 2021, 4:10pm

daern:

Hi all - not sure how everyone else is doing, but I’m still stuck on 1.0.36 as if I run anything newer, it will end up freezing until I manually restart the add-on. I’ve been retesting with the current latest version (1.1.6) and the same has happened after 2-3 days of running.

As a note, there’s nothing in the logs (the last entry was a routine PIR status change) and the add-on doesn’t crash (or the watchdog would restart it) - it simply stops generating further log until manually restarted, but as there’s no obvious way to know this without physically verifying that the alarm and HA have fallen out of sync, it’s a bit hard to detect.

I’ve reverted back to 1.0.36 again, but am aware that I can’t do this forever as eventually there will be a breaking change.

Anyone else experiencing similar behaviour?

Same here,

Reverted back to 1.0.36.
Every version since locks up and is not restarted by the watchdog; it just keeps sending panel information like voltage and current.

daern · October 8, 2021, 1:07pm

Two days in and mine is still stable. What version firmware are you running on your SmartCom? I’ve upgraded mine from 3.0.x to 3.1.x and, while it’s still early days, things look to have improved.

TomDrewitt · October 9, 2021, 7:17am

I’ve been struggling with the arming side of things and found this thread. I’ve tried adding the mapping to my config but get the below when I try.

May I add, I’m very new to HA and it’s probably something simple. Any help would be appreciated.

.

RogTP · October 9, 2021, 7:31am

Your structure is wrong. Indentation needs to be as below.

texecom:
  host: 192.168.x.xxx
  udl_password: 'xxxxxxx'
mqtt:
  host: mqtt://core-mosquitto
  username: texecom
  password: xxxxxxx
  client_id: texecom2mqtt
  keepalive: 30
homeassistant:
  discovery: true
areas:
  - id: house
    full_arm: armed_away
    part_arm_1: armed_home
zones:
  - id: inner_doors
    device_class: door
  - id: external_bell
    device_class: sound
log: info

daern · October 11, 2021, 2:09pm

Well it’s been 5 days since I updated my SmartCom firmware from 3.00.xx to 3.01.xx and this integration has been 100% stable for the whole time. I’ll keep monitoring, but I’m happy to say that my issues have been fixed and I would strongly recommend that anyone else having issues with lockups also look to update their SmartCom. @MarkB1 you may want to see if this also applies to you - would be interesting to get another confirmation of a fix.

For anyone that has never updated their alarm firmware - you can either do it locally using a breakout board connected directly to the main panel or, as I did, through Texecom’s cloud service, which allows you to update it remotely without even removing the panel. In order to do this, you will need access to an installer account, although Texecom were kind enough to let me have one when I asked them so I’d suggest that this is probably the way to go.

iMiMx · October 12, 2021, 1:57pm

If it’s just the SmartCOM you need to check for updates, you can do it via their app (at least the V1 app, not sure about V2):

Enter the Engineers code into the app
Don’t press ‘Login’
Instead, press the ‘i’ at the top of the screen
Will then check for SmartCOM updates

EDIT: Does look like Texecom have removed the V1 Connect app from the Apple App Store, only V2 available now - which needs an installer account

Aturner44 · October 15, 2021, 12:50pm

I’ve seen this error mentioned a few times throughout this thread but haven’t seen an answer to it. I think it’s an intermittent fault that is still lurking around but hasn’t been fully diagnosed yet so I guess there won’t be much I can do, but I’ll post the logs anyway, they may be of some help in diagnosing further.

{"id":"A","name":"HOME","number":1,"status":"disarmed"}
2021-10-14 07:06:06 - PANEL: Open/Close (Away Armed) (Areas: A; Parameter: 1; Group: 5)
2021-10-14 07:06:06 - DEBUG: Publishing to texecom2mqtt/log: {"type":"OpenClose","description":"Open/Close (Away Armed)","timestamp":"2021-10-14T07:06:06Z","areas":["A"],"parameter":1}
/snapshot/app/dist/texecom/texecom.js:98
            throw new Error(`CRC is invalid (${crc}), message: ${hexMessage}`);
            ^
Error: CRC is invalid (112), message: 744d08c601430070
    at Texecom.parseBuffer (/snapshot/app/dist/texecom/texecom.js:98:19)
    at Socket.<anonymous> (/snapshot/app/dist/texecom/texecom.js:32:18)
    at Socket.emit (events.js:315:20)
    at addChunk (_stream_readable.js:295:12)
    at readableAddChunk (_stream_readable.js:271:9)
    at Socket.Readable.push (_stream_readable.js:212:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:186:23)

Cheers

marshn · October 15, 2021, 1:28pm

I looked into this over a year ago, and I’m confident that I got to the bottom of it. I posted my findings on the Texecom Forum, but didn’t get much of a response from Texecom.

Essentially, the CRC errors occur when a message is being sent by the panel at the same time as a message is being sent to the panel. I suspect that there is a bug in the Texecom firmware which is reusing the same buffer for the transmit and receive mechanism, but have no way of proving it.

I don’t use the texecom2mqtt software, I have a compiled C++ daemon which I am developing. The functionality is probably very similar.

The testing was done using a setup which causes the panel to generate frequent zone messages, while I send it status requests. This makes the issue occur every few minutes, which makes it easier to capture and observe. In ‘real life’ I was only seeing perhaps one CRC failure a week, but the missed messages are really awkward to deal with.

The following is one of the posts that I made on the Texecom Forum last August. If you have the patience to read it all I believe that it demonstrates there is a flaw in the Texecom firmware. If you have a Texecom Forum account, the thread there is titled “Connect protocol, corrupt messages”, and the first post was made on 22nd May 2020:

This example is using a test panel, which is an Elite 168 running firmware 5.02. The panel is using the Connect protocol via a USB-COM connected to COM 2 of the panel. Zone event messages have been requested from the panel and zones 57 & 58 are being toggled every seven seconds to create regular unsolicited zone update messages. The panel is also being polled with commands every few seconds (in this case to determine the status of zone 79).
This scope trace shows the TX and RX serial connection between the USB-COM and the panel. The zone status request command is the top (blue) trace, and the two unsolicited zone update messages are the bottom (red) trace. The scope has been configured to display the serial data bytes within the packets, and these byte values correspond to the values sent and received by the panel interface software.

The seven byte zone status command packet (top trace) contains:
74 43 07 75 03 4f b4
The first zone update packet (first eight bytes of bottom trace) is corrupt, and contains:
74 4d 08 64 03 4f b4 8c
The packet should have contained:
74 4d 08 64 01 39 41 8c
In this example the 5th to 7th bytes of the corrupt packet are overwritten by a sequence of bytes from the zone status request command. It is interesting that the panel has sent the CRC that is correct for the intended packet (8c), but has sent the wrong data preceding it.
Additionally, whenever this corruption occurs, the panel does not respond to the command which has been sent. Resending the command results in the expected response.
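
The mismatch can be reproduced offline. The Python sketch below assumes the Connect protocol's checksum is a CRC-8 with polynomial 0x85 and initial value 0xFF, computed over every byte before the final one. That is an assumption on my part rather than anything Texecom has documented in this thread, although it does appear to reproduce the bytes quoted above. The function names are illustrative only.

```python
def crc8(data: bytes) -> int:
    """CRC-8, MSB first. Polynomial 0x85 and initial value 0xFF are assumptions."""
    crc = 0xFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x85) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def crc_ok(packet: bytes) -> bool:
    """True when the last byte equals the CRC of all preceding bytes."""
    return crc8(packet[:-1]) == packet[-1]


command = bytes.fromhex("74430775034fb4")     # zone status request
intended = bytes.fromhex("744d08640139418c")  # zone update the panel should have sent
corrupt = bytes.fromhex("744d0864034fb48c")   # zone update actually received

for name, packet in (("command", command), ("intended", intended), ("corrupt", corrupt)):
    print(f"{name}: crc_ok={crc_ok(packet)} calculated={crc8(packet[:-1]):02x}")
```

Under those assumptions the trailing 8c validates the intended packet rather than the bytes that were actually sent, which fits the payload having been overwritten after the CRC was worked out.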

An edited extract from my logfile is pasted to the end of this message.
It appears that there is an issue with the panel which results in disruption to the protocol when a command is sent to the panel while an unsolicited message is being prepared or sent. Both the packet transmitted to the panel (which is ignored) and the packet received from the panel are apparently corrupted. Although in this example the command message is a zone status request I have seen this happen with all of the commands that we use (including ‘get time/date’, ‘get system power’, ‘get log pointer’ etc). I have not seen an example of the corruption occur when a command does not immediately precede an unsolicited message.
The scope traces appear to confirm that this is not an issue with third party software, and that the problem is with the panel hardware or firmware.
I have multiple examples of this corruption which have been obtained using two different Elite 168 panels, so I do not believe that this is likely to be caused by faulty panel hardware.
This corruption only happens a few times a day but it means that it is impossible to be sure that all zone state changes have been captured using unsolicited messages. Polling the panel makes the situation worse as a message ‘collision’ becomes more likely. There is no pattern to the corruption that I have identified, such as connection duration, panel run time etc.
As I have noted before, using a different panel COM port does not seem to affect the frequency of the corrupt packets. The problem is no more or less frequent when using a COM-IP rather than a local serial connection.

Logfile extract:

get_zone_details_from_panel(): Zone=79
Sending command: (seq=117)  
0000  74 43 07 75 03 4f b4                              tC.u.O.

rx_packet message buf   : 
0000  74 4d 08 64 03 4f b4 8c                           tM.d.O..
CRC Failure - Calculated=61, Received=8C  crc: fail

rx_packet message buf   : 
0000  74 4d 08 65 01 3a 41 a6                           tM.e.:A.
msg_type (4D)=MESSAGE (seq=65) Message seq incorrect - processing message anyway: expected=100, received=101
message_handler() Zones::retrieve(58) 
Msg: Zone event: zone 58 'Cam 18' is 'active+Auto Omit'

rx_packet message buf   :
0000  74 4d 08 66 01 39 40 25                           tM.f.9@%
msg_type (4D)=MESSAGE (seq=66)
message_handler() Zones::retrieve(57)
Msg: Zone event: zone 57 'Cam 17' is 'secure+Auto Omit'

rx_packet message buf   :
0000  74 4d 08 67 01 3a 40 0f                           tM.g.:@.
msg_type (4D)=MESSAGE (seq=67)
message_handler() Zones::retrieve(58)
Msg: Zone event: zone 58 'Cam 18' is 'secure+Auto Omit'

Timeout waiting for response, resending last command
0000  74 43 07 75 03 4f b4                              tC.u.O.

rx_packet message buf   : 
0000  74 52 29 75 03 00 00 01  20 20 20 20 20 20 20 20  tR)u....         
0010  20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                   
0020  20 20 20 20 20 20 20 20  7e                               ~
msg_type (52)=RESPONSE (seq=75)
get_zone_state_from_panel(): Zone=79 Zones::retrieve(79)
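
For readers following the hex dumps, this Python sketch splits a frame into the fields the log appears to show: a 0x74 header, a type byte ('C', 'M' or 'R'), the total length, a sequence number, the payload and a trailing CRC byte. The layout is inferred from the extract above, and the class and function names are hypothetical; it is not an official description of the protocol.

```python
from dataclasses import dataclass

HEADER = 0x74  # ASCII 't', first byte of every frame in the extract
TYPES = {0x43: "COMMAND", 0x4D: "MESSAGE", 0x52: "RESPONSE"}


@dataclass
class Frame:
    kind: str
    sequence: int
    payload: bytes
    crc: int


def parse_frame(raw: bytes) -> Frame:
    """Split one frame into fields; raise ValueError if it is malformed."""
    if len(raw) < 5 or raw[0] != HEADER:
        raise ValueError("not a frame: bad header or too short")
    if raw[2] != len(raw):
        raise ValueError(f"length byte {raw[2]} does not match {len(raw)} bytes")
    if raw[1] not in TYPES:
        raise ValueError(f"unknown type byte {raw[1]:#04x}")
    # Everything between the sequence byte and the final CRC byte is payload.
    return Frame(TYPES[raw[1]], raw[3], raw[4:-1], raw[-1])


frame = parse_frame(bytes.fromhex("744d0865013a41a6"))
print(frame.kind, frame.sequence, frame.payload.hex())
```

This only splits the bytes; it does not check the CRC or interpret the payload.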

Aturner44 · October 15, 2021, 3:50pm

Cheers Neil, you’ve really nailed this it seems.

Have you shared this information with Daniel Chesterton before?

You clearly have a much deeper understanding of this subject. My initial thoughts are that it’s unlikely we will get Texecom to implement any fixes in their firmware and, as you point out, this could be a limitation of the hardware due to using the outdated serial ports for communication. Surely it’s time they implemented a true IP connection.

It seems as though we cannot stop the unsolicited messages, can you think of a way to mitigate this?

From my logs I find that the system is constantly polling the panel for battery level, so I suppose it could be a game of roulette: if I can reduce the amount of information being transferred, it reduces the chances of a clash. I don’t particularly need this battery information, so I could turn that off for a start. In reality, though, I suppose it doesn’t make the system any more reliable; it’s still a fault that could occur at any moment.

I also wonder if the solution is to catch the error, and then poll the system for the state of all of the sensors, this way you could eliminate missing a change of state.
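
That idea can be sketched as follows, in Python purely for illustration (texecom2mqtt itself is JavaScript, judging by the stack trace earlier in the thread). `CrcError`, `handle_frame`, `decode`, `apply_update` and `resync` are all hypothetical names, not part of texecom2mqtt or any Texecom API; the point is only that a bad frame triggers a full state refresh instead of an unhandled exception.

```python
class CrcError(Exception):
    """Raised by the (hypothetical) decoder when a frame fails its CRC check."""


def handle_frame(raw, decode, apply_update, resync):
    """Decode one frame; on a CRC failure, refresh all zone states instead of crashing.

    decode, apply_update and resync are callables supplied by the caller.
    Returns True if the frame was applied, False if a resync was triggered.
    """
    try:
        update = decode(raw)
    except CrcError:
        # The corrupted update is lost, so re-read every zone to catch up.
        resync()
        return False
    apply_update(update)
    return True


# Tiny demonstration with stand-in callables.
states = {}
resyncs = []


def fake_decode(raw):
    if raw == b"bad":
        raise CrcError("CRC is invalid")
    return ("zone57", raw.decode())


def fake_apply(update):
    zone, state = update
    states[zone] = state


handle_frame(b"active", fake_decode, fake_apply, lambda: resyncs.append("full poll"))
handle_frame(b"bad", fake_decode, fake_apply, lambda: resyncs.append("full poll"))
print(states, resyncs)
```

A resync restores the current state but cannot recover a momentary change that started and finished in between.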

I could be completely wrong, this is a very new platform for me.

marshn · October 16, 2021, 7:34am

I haven’t shared anything with Daniel, but I expect that he is aware of the Texecom forum, and the protocol related threads there.

To be clear, I don’t feel that the issue is caused by a shortcoming in the Texecom hardware, or even the use of a serial port. I strongly suspect that there is a bug in the Texecom firmware which results in a buffer being used by both the Tx and Rx code. Everything works fine until a Tx and Rx happen simultaneously, and then the issue occurs when the messages are corrupted by one another.

Unsolicited messages could be turned off, but in the application I was developing they were essential. They are sent immediately by the panel every time a zone status changes. Continually polling the panel isn’t an option on a system with 168 (or even 640) zones, as a full poll could take over a minute. For home automation this isn’t workable; a door or PIR might only be active for a couple of seconds and would be missed.
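
A rough back-of-the-envelope check in Python shows the scale of the problem. The 0.4 s per request figure is purely an assumed round-trip time chosen for illustration, not a measurement from this thread, so plug in your own.

```python
def full_poll_seconds(zone_count: int, seconds_per_request: float) -> float:
    """Time to query every zone once, one request per zone, sequentially."""
    return zone_count * seconds_per_request


ASSUMED_ROUND_TRIP = 0.4  # seconds per zone status request (assumption)

for zones in (168, 640):
    total = full_poll_seconds(zones, ASSUMED_ROUND_TRIP)
    print(f"{zones} zones: about {total:.0f} s per full poll")
```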

The battery level polling will be a mechanism to stay logged into the panel. After a minute or two the panel will timeout and close the port if it doesn’t receive any requests. The integration will then need to log into the panel again.

For us, polling wasn’t an option as we were looking at using panels with many sensors, and polling was too slow. Even if the ‘auto logout’ feature could be disabled a regular exchange with the panel would still be required to ensure that it was still connected, and that would cause CRC errors from time to time.

We couldn’t find a way around it, and Texecom either wouldn’t or couldn’t fix the issue, so we dropped our plans for the application that was being developed.

Aturner44 · October 19, 2021, 6:33am

This sort of thing is very frustrating, I can understand why Texecom aren’t particularly interested as it probably isn’t a high priority in terms of business needs. Although, if the software you were developing was for use on such scales I’m sure there are almost endless applications.

I understand what you mean about the hardware, but I can’t help but feel that if they updated the hardware to communicate over a true IP connection, the system would be open to more possibilities. Not only that, but you’d be able to have simultaneous connections and many other benefits. I can see why they wouldn’t want to do this though: their firmware has been built over years to interface via serial, plus all the different add-ons for the system communicate this way. Serial is probably more robust in these types of applications too.

I wonder, could the bug be on an embedded piece of hardware that controls the transmission of data and has no way of being flashed? Irrelevant at this point but interesting.

I didn’t realise how slow polling would be, but it makes perfect sense, obviously depending on how many sensors you have. I didn’t mean to continually poll instead of listening for messages, just that in this case, instead of having the application crash due to the CRC error, it could catch the error and poll the system. But you are correct, you’d still miss a momentary change of state. The only other way I can think of (feel free to correct me), and it seems very convoluted and slow, would be to keep a log of the system states and, when an error occurs, grab the log from the panel and compare the data for any discrepancies, meaning the application could backtrack for anything that may have been missed. Now I already see two problems with this: the panel would still be slow at serving the log (I don’t know if this time can be reduced by just requesting the log from specific time periods, probably not), and I’m not entirely sure the log records every single change of state.

Obviously Daniel’s advice is to leave Watchdog turned on, which restarts the application, which then polls the entire system on startup, so it is essentially already working under this principle. Like you say, not entirely reliable as it could miss a change in state. This is all fine for us home automation enthusiasts, but I can see why this would be frustrating for yourself trying to develop a more commercially viable piece of software.

Of course the battery level is just to keep the connection alive, I should’ve realised with just how consistently it does it.

marshn · October 19, 2021, 12:39pm

The Premier panels use a Renesas M16CR5F364AMNFA microcontroller, which has six UARTs built in, so there isn’t any intermediate firmware between the Texecom firmware and the serial hardware.

We rewrote our software to interface directly to the serial port (cutting out the COM/IP) to rule out everything but the panel itself. The scope traces confirm what our logfiles were telling us, and there is no doubt that this is a panel issue.

Unfortunately, because the corruption only happens when there is a simultaneous Tx and Rx, it will result in the loss of a status update when it occurs. It may be tolerable on a very simple system, but as soon as that system becomes a bit more complex with more updates the issue becomes more serious.

The only way forward that I can see would be to find a way of disabling the panel connection timeout. Panel comms could be monitored by regularly toggling a zone, and checking that those zone updates arrive. This would remove the necessity to send any data to the panel, which should eliminate the CRC fails. Unfortunately, I don’t know of a way of disabling the panel comms timeout.
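
The monitoring half of that idea might look like the Python sketch below. It assumes a zone is being toggled externally every few seconds and only checks that its updates keep arriving. `HeartbeatMonitor` and its parameters are hypothetical, and nothing here disables the panel's timeout, which remains the unsolved part.

```python
import time


class HeartbeatMonitor:
    """Flags the link as dead when a regularly toggled zone goes quiet."""

    def __init__(self, max_silence: float, clock=time.monotonic):
        self.max_silence = max_silence  # seconds allowed between heartbeat updates
        self._clock = clock
        self._last_seen = clock()

    def update_received(self) -> None:
        """Call whenever an update for the heartbeat zone arrives."""
        self._last_seen = self._clock()

    def link_alive(self) -> bool:
        """True while the last heartbeat update is recent enough."""
        return self._clock() - self._last_seen <= self.max_silence


# Demonstration with a fake clock so nothing has to sleep.
now = [0.0]
monitor = HeartbeatMonitor(max_silence=21.0, clock=lambda: now[0])
now[0] = 7.0
monitor.update_received()
now[0] = 20.0
print(monitor.link_alive())
now[0] = 30.0
print(monitor.link_alive())
```

Because the monitor only listens, it sends nothing to the panel and so should not add to the collisions described above.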

daern · October 22, 2021, 12:38pm

Sadly, all is still not quite stable and I suspect I’m hitting the issues mentioned above.

I do, however, have a HA weirdness - when the add-on dies (and all of the UI elements go unavailable), it doesn’t restart until I open the “logs” page within the add-on, at which point the log immediately starts and the add-on returns to service. I have the watchdog enabled.

Anyone else seeing this slightly weird behaviour?