Service Node-RED exited with code 256 (by signal 6)

Since the latest two or three updates I get this error when my “Fire” scene launches. It involves a lot of actions (turning on lights, opening blinds, etc.). It does not work and crashes Node-RED:

Welcome to Node-RED
===================
30 Jul 15:33:39 - [info] Node-RED version: v3.0.2
30 Jul 15:33:39 - [info] Node.js  version: v18.17.0
30 Jul 15:33:39 - [info] Linux 6.1.34 x64 LE
30 Jul 15:33:39 - [info] Loading palette nodes
30 Jul 15:33:40 - [info] Dashboard version 3.5.0 started at /endpoint/ui
/config/node-red/node_modules/node-red-node-pi-gpio/testgpio: line 8: python: command not found
30 Jul 15:33:40 - [warn] rpi-gpio : Raspberry Pi specific node set inactive
30 Jul 15:33:41 - [warn] ------------------------------------------------------
30 Jul 15:33:41 - [warn] [node-red-node-rbe/rbe] 'rbe' already registered by module node-red
30 Jul 15:33:41 - [warn] ------------------------------------------------------
30 Jul 15:33:41 - [info] Settings file  : /etc/node-red/config.js
30 Jul 15:33:41 - [info] Context store  : 'store' [module=localfilesystem]
30 Jul 15:33:41 - [info] Context store  : 'default' [module=memory]
30 Jul 15:33:41 - [info] User directory : /config/node-red/
30 Jul 15:33:41 - [warn] Projects disabled : editorTheme.projects.enabled=false
30 Jul 15:33:41 - [info] Flows file     : /config/node-red/flows.json
30 Jul 15:33:41 - [info] Server now running at http://127.0.0.1:46836/
30 Jul 15:33:41 - [info] Starting flows
30 Jul 15:33:41 - [info] Started flows
30 Jul 15:33:41 - [error] [api-current-state:Temperatuur > 26] InputError:  sensor.buienradar_temperature
30 Jul 15:33:41 - [info] [mqtt-broker:Hass MQTT] Connected to broker: mqtt://192.168.1.142:1883
30 Jul 15:33:43 - [error] [api-current-state:Op Vakantie?] InputError:  input_boolean.vakantiestand_hoofdschakelaar_helper
30 Jul 15:33:43 - [error] [api-current-state:Op Vakantie?] InputError:  input_boolean.vakantiestand_hoofdschakelaar_helper
30 Jul 15:33:46 - [info] [server:Home Assistant] Connecting to http://supervisor/core
30 Jul 15:33:46 - [info] [server:Home Assistant] Connected to http://supervisor/core
<--- Last few GCs --->
[15234:0x7f5761046330]   419146 ms: Mark-sweep (reduce) 2045.1 (2085.7) -> 2045.0 (2084.2) MB, 687.1 / 0.0 ms  (+ 26.0 ms in 8 steps since start of marking, biggest step 5.1 ms, walltime since start of marking 720 ms) (average mu = 0.272, current mu = 0.2
[15234:0x7f5761046330]   420059 ms: Mark-sweep 2046.0 (2084.2) -> 2046.0 (2088.2) MB, 911.9 / 0.0 ms  (average mu = 0.140, current mu = 0.001) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
[15:40:39] INFO: Service Node-RED exited with code 256 (by signal 6)
2023/07/30 15:40:39 [error] 334#334: *332 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.2, server: a0d7b954-nodered, request: "GET /comms HTTP/1.1", upstream: "http://127.0.0.1:46836/comms", host: "hass.nl:8123"
[15:40:40] INFO: Starting Node-RED...
> start
> node $NODE_OPTIONS node_modules/node-red/red.js --settings /etc/node-red/config.js
30 Jul 15:40:40 - [info]

I did find this:

Which says:

And this “could” be an issue, since I run on ESXi as a VM…

I have no issues at all with ZwaveJS or Node-RED in “normal” situations and everything runs fine… (even as part of a bigger flow with around 40 lights turning off).

Any help appreciated on how to find what causes this behavior.

Simplify your script by cutting it in half.
See if it runs and if not cut it in half again until it runs.
When it runs add some nodes back and test again until it fails.

Thanks for replying. That’s exactly what I did…

This is a piece of the “flow”:

In the red block a device triggers, and ALA (All Lamps On) is then triggered. The rest (dotted line) is just some small things like a message to Telegram and turning the fan off.

As you can see in the screenshot, triggering the ALA works fine without errors. But when the first smoke detector triggers the flow, Node-RED crashes with the above error message.

I am sure this always worked, as I have tested this heavily in the past with my smoke detectors and a spray can of fake smoke…

I do this now and then, and I think the last time was around two or three months ago; all was fine…

Any other tips?

If it is the smoke detector’s trigger that causes the error, then look at the complete output of that node.
And remember it is the small things that make it all go haywire in software coding.

Hi WallyR, I did what you suggested…

This is the node output:
{
  "payload": "on",
  "data": {
    "entity_id": "binary_sensor.fgsd002_smoke_sensor_smoke_alarm_smoke_detected",
    "old_state": {
      "entity_id": "binary_sensor.fgsd002_smoke_sensor_smoke_alarm_smoke_detected",
      "state": "off",
      "attributes": {
        "device_class": "smoke",
        "friendly_name": "Kantoor Rookmelder"
      },
      "last_changed": "2023-07-30T13:40:52.317206+00:00",
      "last_updated": "2023-07-30T13:40:52.317206+00:00",
      "context": {
        "id": "01H6KGFHYXE2YMZ1ME8VY9QVFW",
        "parent_id": null,
        "user_id": "925e73a7b041443aaa84fcbe893ef6dc"
      },
      "original_state": "off"
    },
    "new_state": {
      "entity_id": "binary_sensor.fgsd002_smoke_sensor_smoke_alarm_smoke_detected",
      "state": "on",
      "attributes": {
        "device_class": "smoke",
        "friendly_name": "Kantoor Rookmelder"
      },
      "last_changed": "2023-07-31T15:11:11.196653+00:00",
      "last_updated": "2023-07-31T15:11:11.196653+00:00",
      "context": {
        "id": "01H6P81MTWCZYP398H9QETC11B",
        "parent_id": null,
        "user_id": "925e73a7b041443aaa84fcbe893ef6dc"
      },
      "original_state": "on",
      "timeSinceChangedMs": 34
    }
  },
  "topic": "binary_sensor.fgsd002_smoke_sensor_smoke_alarm_smoke_detected",
  "_msgid": "7764f842cbc7ad2c",
  "_event": "node:cfbb088e.8de148"
}

This is the Node-RED add-on output after the crash:

31 Jul 17:10:23 - [info] Stopping flows
31 Jul 17:10:24 - [info] [server:Home Assistant] Closing connection to http://supervisor/core
31 Jul 17:10:31 - [info] Stopped flows
31 Jul 17:10:31 - [info] Updated flows
31 Jul 17:10:32 - [info] Starting flows
31 Jul 17:10:32 - [info] Started flows
31 Jul 17:10:32 - [error] [api-current-state:Temperatuur > 26] InputError:  sensor.buienradar_temperature
31 Jul 17:10:32 - [info] [mqtt-broker:Hass MQTT] Connected to broker: mqtt://192.168.1.142:1883
31 Jul 17:10:34 - [error] [api-current-state:Op Vakantie?] InputError:  input_boolean.vakantiestand_hoofdschakelaar_helper
31 Jul 17:10:34 - [error] [api-current-state:Op Vakantie?] InputError:  input_boolean.vakantiestand_hoofdschakelaar_helper
31 Jul 17:10:37 - [info] [server:Home Assistant] Connecting to http://supervisor/core
31 Jul 17:10:37 - [info] [server:Home Assistant] Connected to http://supervisor/core
<--- Last few GCs --->
[305:0x7fdde487e330] 71789740 ms: Scavenge 2018.4 (2063.1) -> 2018.1 (2073.6) MB, 6.0 / 0.0 ms  (average mu = 0.535, current mu = 0.274) allocation failure; 
[305:0x7fdde487e330] 71789756 ms: Scavenge 2025.1 (2073.6) -> 2025.6 (2074.6) MB, 10.5 / 0.0 ms  (average mu = 0.535, current mu = 0.274) allocation failure; 
[305:0x7fdde487e330] 71790503 ms: Scavenge 2025.9 (2074.6) -> 2025.2 (2097.3) MB, 746.9 / 0.0 ms  (average mu = 0.535, current mu = 0.274) allocation failure; 
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
[17:11:22] INFO: Service Node-RED exited with code 256 (by signal 6)
2023/07/31 17:11:23 [error] 332#332: *314 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.2, server: a0d7b954-nodered, request: "GET /comms HTTP/1.1", upstream: "http://127.0.0.1:46836/comms", host: "hass.nl:8123"
[17:11:24] INFO: Starting Node-RED...
> start
> node $NODE_OPTIONS node_modules/node-red/red.js --settings /etc/node-red/config.js
31 Jul 17:11:24 - [info] 
Welcome to Node-RED
===================
31 Jul 17:11:24 - [info] Node-RED version: v3.0.2
31 Jul 17:11:24 - [info] Node.js  version: v18.17.0
31 Jul 17:11:24 - [info] Linux 6.1.34 x64 LE
2023/07/31 17:11:24 [error] 332#332: *314 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.2, server: a0d7b954-nodered, request: "GET /comms HTTP/1.1", upstream: "http://127.0.0.1:46836/comms", host: "hass.nl:8123"
31 Jul 17:11:25 - [info] Loading palette nodes
2023/07/31 17:11:25 [error] 332#332: *314 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.2, server: a0d7b954-nodered, request: "GET /comms HTTP/1.1", upstream: "http://127.0.0.1:46836/comms", host: "hass.nl:8123"
31 Jul 17:11:26 - [info] Dashboard version 3.5.0 started at /endpoint/ui
31 Jul 17:11:26 - [warn] ------------------------------------------------------
31 Jul 17:11:26 - [warn] [node-red-node-rbe/rbe] 'rbe' already registered by module node-red
31 Jul 17:11:26 - [warn] ------------------------------------------------------
31 Jul 17:11:26 - [info] Settings file  : /etc/node-red/config.js
2023/07/31 17:11:26 [error] 332#332: *314 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.2, server: a0d7b954-nodered, request: "GET /comms HTTP/1.1", upstream: "http://127.0.0.1:46836/comms", host: "hass.nl:8123"
31 Jul 17:11:26 - [info] Context store  : 'store' [module=localfilesystem]
31 Jul 17:11:26 - [info] Context store  : 'default' [module=memory]
31 Jul 17:11:26 - [info] User directory : /config/node-red/
31 Jul 17:11:26 - [warn] Projects disabled : editorTheme.projects.enabled=false
31 Jul 17:11:26 - [info] Flows file     : /config/node-red/flows.json
31 Jul 17:11:26 - [info] Server now running at http://127.0.0.1:46836/
31 Jul 17:11:27 - [info] Starting flows
31 Jul 17:11:27 - [info] Started flows
31 Jul 17:11:27 - [error] [api-current-state:Temperatuur > 26] InputError:  sensor.buienradar_temperature
31 Jul 17:11:27 - [info] [mqtt-broker:Hass MQTT] Connected to broker: mqtt://192.168.1.142:1883
31 Jul 17:11:29 - [error] [api-current-state:Op Vakantie?] InputError:  input_boolean.vakantiestand_hoofdschakelaar_helper
31 Jul 17:11:29 - [error] [api-current-state:Op Vakantie?] InputError:  input_boolean.vakantiestand_hoofdschakelaar_helper
31 Jul 17:11:32 - [info] [server:Home Assistant] Connecting to http://supervisor/core
31 Jul 17:11:32 - [info] [server:Home Assistant] Connected to http://supervisor/core

This is what happens in Node-RED:

Thus, if this triggers the ALA flow with a change node set to ON:
[image]
Just as if I would manually press the ALA button in Node-RED or enable the flow from within a scene in HA.

The latter two (pressing the button in NR or enabling the scene via HA and Node-RED) normally work.

The nodes are “yellow” and the flow stops and NR crashes…

What’s possibly wrong here?

Any help appreciated…

Help :hear_no_evil: please

This is the error that causes the issue.

It is a variant of an out-of-memory error and usually comes from a memory leak.
The memory leak might not even be in your current flow, so it is a beast to hunt down, and maybe you will not be able to find it at all.
Try to restart Node Red and see if that helps.
Then update all palettes that have updates ready.

If this does not help then you might have to search Google for solutions to that error with Node Red (or Node.js, which Node Red is based on).
It is probably some extra custom palette you have installed, since it does not seem to be a common problem for users of Node Red.

Done this already.

It crashes every time, directly after starting this flow.

Palettes are up to date (I check those regularly), and nothing new there…

OMG, this is something that “just started”; it had worked ever since I created it, which was around two years ago.

Googling the error did not really give me any direction, so I created this topic in the hope of getting some guidance…

:ok_man::pray:

Some robotic help gives me the advice below.

But I would like some human help on how to best troubleshoot/solve this.

The “FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory” message indicates that Node-RED has exhausted its allocated heap memory and is unable to continue running. This error occurs when the JavaScript code executed by Node-RED requires more memory than is available in the allocated heap. To address this issue, you can try the following steps:

  1. Increase Heap Memory: You can allocate more heap memory to Node-RED by setting the --max-old-space-size flag when starting Node-RED. For example:
     node --max-old-space-size=1024 node_modules/node-red/red.js
     This command allocates 1024 MB of heap memory to Node-RED. Adjust the value as needed based on the available resources on your system.

  2. Check for Memory-Intensive Nodes: Review your Node-RED flows for any nodes that might be consuming a significant amount of memory. Certain nodes or flows that process large datasets or use memory-intensive operations might contribute to the memory issues. Consider optimizing or refactoring these parts of your flows.

  3. Profile Memory Usage: Node-RED has a built-in memory profiling feature that can help you identify memory usage trends and potential bottlenecks. You can enable this by adding the following line to your Node-RED settings file (settings.js):
     require('node-red').start({memoryOptimization: true});
     With memory optimization enabled, Node-RED will log memory usage statistics, which can help you pinpoint memory-consuming parts of your flows.

  4. Reduce Flow Complexity: Simplify your flows where possible. Complex logic, nested loops, and excessive data manipulation can lead to higher memory usage. Streamline your flows to reduce memory overhead.

Is 1 a good practice?
For 2, how do I do that best?
For 3, any experience?
For 4, no clue on where to look or what to do…

It’s not as if there are courses on the above topics, other than watching videos and asking for help on forums…

Chiming in again… hoping that asking for help again will encourage someone to step in :slight_smile:

Another question.

When I try to change this:

It also puts it in the YAML:

But on save and restart it’s gone… how come?

Since yesterday I seem to have the same issue. Did you solve it?

I upgraded from HA 2023.8.3 to 2023.9.2, and after that Node-RED will not start. The logs give me this:
16 Sep 10:52:46 - [info] Starting flows
16 Sep 10:52:46 - [info] [join-server:8eb29663.f595e8] Starting server on port 1820…
16 Sep 10:52:46 - [info] [join-server:8eb29663.f595e8] Sending Registration nodered
16 Sep 10:52:46 - [info] [join-server:8eb29663.f595e8] Saved device Name: nodered
16 Sep 10:52:46 - [warn] [presence-faker:Schemerlamp randomly on] new node status: starting up…
<--- Last few GCs --->
[559:0x7fc5d70fd330] 287872 ms: Scavenge (reduce) 1888.8 (2038.0) -> 1887.9 (2038.0) MB, 12.0 / 0.0 ms (average mu = 0.251, current mu = 0.246) allocation failure;
[559:0x7fc5d70fd330] 292925 ms: Mark-sweep (reduce) 1890.1 (2039.3) -> 1886.2 (2039.3) MB, 4991.6 / 0.0 ms (average mu = 0.291, current mu = 0.328) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
[10:56:37] INFO: Service Node-RED exited with code 256 (by signal 6)
[10:56:39] INFO: Starting Node-RED…

No, I did not. However, it now keeps the 4096 setting, and I have not tried again since.

Same problem. I just installed the palette, and Node-RED tried to start, and start, and start…
Is there an alternative?

I was sufficiently interested in these postings to go away for a few weeks (months…) and try to find some answers, mainly because I don’t want to get stuck like this myself and because there seems to be very little information to help when NR falls over like this, at least here in the HA forum.

I run a number of Node-RED flows to monitor and control my Solar PV and battery system, and I am keen to prevent NR memory issues. I have had NR shut down and restart in the past, and traced this back to some lazy coding on my part. I had set up a big loop to repeatedly fire commands to my Pylontech battery console, and used split nodes to break array lists of commands into parts. It all worked, but as I did not bother to join the splits back up at the end of each loop, the message got bigger and bigger until NR ran out of memory (about every 30 days or so). Simply deleting the ‘msg.parts’ bit did not fix it, and I had to rewrite my code to join each split back when I had done with it, before going back round the outer loop again.
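For anyone hitting the same thing, the shape of the fix was a join node to collapse each split back into one message before looping round again. For completeness, this is a hedged sketch of the interim cleanup I tried first, in a function node at the end of the loop (the property choices are illustrative only; in my case only the proper join actually fixed it):

// End-of-loop cleanup in a function node (illustrative only)
delete msg.parts;        // the split-node bookkeeping that was accumulating
delete msg.payload;      // drop the processed command data before looping again
msg.payload = "next";    // pass a small token back to the start of the loop
return msg;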

Starting from the top:
Node-RED program flows run in Node-RED, which itself runs on Node.JS (hence the ‘node’ in Node-RED). Node is a JavaScript runtime, designed to run JavaScript ‘server-side’, and it is built around the V8 JavaScript engine that was developed for use in Google Chrome.

V8 is embedded into Node, and does all the JS work, including the memory management. V8, like many ‘interpreted’ languages, uses a ‘heap’ for dynamic memory allocation and has a garbage collection (GC) system to recover no-longer-required memory.

In an HA environment with NR as an addon, the NR addon release is based on a version of Node which in turn includes a version of V8. NR is run in a Docker container, using the base Linux operating system.

How does V8 memory and garbage collection work?
This is the long and complicated bit!

For an overview, the following has most of the details. https://v8.dev/blog/trash-talk

I have also found the following informative. https://jayconrod.com/posts/55/a-tour-of-v8-garbage-collection

JavaScript started life as a simple interpreted language (script) to run just on client web-browsers. To this end it uses self-managed memory with a garbage collector. Compiled languages such as C often require the programmer to allocate and de-allocate (recover) memory. In a compiled language, the compiler can typically see all the required variables, their size and enclosing scope up-front. The compiler at compile time allocates space for variables, and the execution code only has to deal with dynamic-length variables such as strings and variable-size arrays. These all go on a ‘heap’, entirely managed by the programmer.

JavaScript has moved on considerably over the past two decades, and execution engines such as V8 now work on a just-in-time compile process, where JS is compiled to tokens prior to execution. However JS still frees the programmer from having to allocate and de-allocate memory. V8 has changed over the years from a basic ‘stop-everything’ GC, and now runs a sophisticated generational, incremental garbage collection system, mostly so as to improve real-time performance in high-use web pages.

V8 controlled memory sits within the overall Node RSS (resident set size) and includes the stack, and the heap. RSS is where the code sits (this includes Node code, Node-RED and execution flows). The stack is a last-in, first-out memory space (like a stack of plates) used to hold functions. Every time a JS function is called, V8 pushes the function details onto the (top of the) stack. When the function ends (returns) the function details are pulled off the stack. Details for each function on the stack include all the primitive variables, as well as pointers to non-primitive objects, arrays, and strings (and function control items like closures and call-backs).

Generally the stack looks after itself. Starting with the main program (which is effectively just another function) the stack grows and shrinks in use. The only time ‘stack overflow’ can be hit is if a program nests a really large number of function calls, or more likely when a loop continually calls a function without limit. Typically this happens in non-terminating loops or recursive function calls, so this is very much down to the programmer to fix. The stack is not garbage collected.

Now to the ‘heap’. This is a block of memory used to hold objects (strings, arrays, JS objects and all their associated primitive variables and further pointers). When the JS code requires an ‘object’ variable, the main pointer to the object goes onto the stack (inside the function being called) and the rest sits in space allocated in the heap. This works well until the variable is no longer required (the function ends, comes off the stack, and function variables are no longer needed). At this point the old space has to be garbage collected and returned to a ‘free list’.

V8 at the time of writing uses the ‘Orinoco’ project approach (named after the Wimbledon Common Womble I assume). If anyone is interested this is my understanding of how it all works:-

At start-up Node gets memory from the OS to load and run itself (the RSS).

  • This memory is subdivided to provide code space, the stack, and the heap space.
  • The ‘heap’ space allocated by V8 is divided into two parts - the New Space and the Old Space
  • The New Space (generational) is divided into two sections - the Nursery and the Intermediate
  • The Nursery is divided into two halves - the To space and the From space.

Thus the New Space has three equal sized sections - Nursery To, Nursery From, and Intermediate. These are only ever a few MB big, and do not change size once initialised.

All new memory allocation is done in the Nursery To space. This naturally fills up quite quickly.

On the ‘generation principle’ that the most recently allocated stuff is the stuff most likely to become free soonest, the generational approach tries to free up the newest stuff first. In V8 Orinoco, this is done by the scavenger, and this works by regularly going through the Nursery, swapping over the From and To, and moving required memory from the From to the To. Anything in the From (was the written-To) that is still required is moved to the (new) To space and marked as having been saved once.

After a while, the scavenger passes back over the Nursery, again swapping the To and From, and anything still required that was first-generation scavenged is moved to the Intermediate space, and marked as saved twice.

Scavenging happens all the time, using multiple background threads taking ‘free’ execution time off the main program thread. The process keeps a master list of root pointers from the stack, and follows them into the heap, marking everything touched recursively. Anything not marked is not moved during the scavenge. The ‘From’ space thus becomes empty of anything wanted, and this space is then effectively clear. The ‘To’ space fills up with newly allocated memory as well as first-generation saved memory. Since everything is moved into an empty To space, compaction happens naturally and fragmentation is minimal. Once required memory is moved, the root pointers in the stack are updated, as is the primary root search table, so there is quite a lot going on.

The multiple-thread approach used by the scavenger requires that the root pointer table is divided up and allocated out to different threads. This, and the fact that the main program is still running and potentially allocating new variables on the heap, requires a bit of collision management using ‘write barriers’. Also V8 uses memory in page blocks, each 1MB (each page in Unix is 4k or more depending on processor architecture, so a V8 page is many Unix pages). Thus New Space is physically divided into 1MB chunks, with the scavenger only working across some of the pages at any one time (swapping individual To and From pages) and only freeing up lightly used pages to then be re-used.

Once something has been saved twice, now sitting in the Intermediate space, if it is still required when the scavenger visits for the third time it is moved to the Old Space.

The Old Space is just one large block, and this uses the Intermediate Garbage Collector running a Mark-Sweep(-Compact) process. Unlike the scavenger, this runs (for the key part of the process) in full-thread mode, stopping everything else from running. It runs only when required, and is triggered by a heuristic algorithm, so there are no specific circumstances that can be relied upon to trigger the GC. The basic idea behind Orinoco is that the Mark-Sweep GC only runs when it absolutely has to.

The Mark-Sweep GC again starts with the root pointer table, works through the entire table, and covers both the New and the Old spaces. Anything touched is first marked. Then the sweep goes through the heap and picks up anything not marked, adding it to a ‘free memory’ list. Compaction happens only to very fragmented areas, page by page, and therefore the process is often just a Mark-Sweep. Free space in the heap is maintained as a list, sorted by size order. New allocations are thus placed into the spaces ideally just large enough for their requirement. In reality, the Mark stage is multi-thread concurrent and only the Sweep stage takes over the main execution thread, and a great deal of effort has been spent on optimising all these routines and algorithms so as to reduce the (main thread) time-out impact on JS execution so as to give the end web-browser user the best possible experience.

The scavenge cycle is continuous, and has to keep up with the demand for new memory space. The mark-sweep cycle is controlled by a set of heap sizes. The maximum heap size is set at start-up and prevents the heap (and thereby also Node) from going over a given limit. Under this is a physical heap size, which is the amount of memory currently allocated by the operating system. Since Node-RED runs in a Docker Container in HA, Docker passes all memory requests to the underlying operating system (Unix) and will assume (unless set otherwise) that all OS space can be used. At first Unix will provide pages of real memory, then it will start to page out less-used pages and expand into virtual memory space. High demand for memory can introduce paging issues where the OS is trying to push out pages to disc at the same time as getting wanted pages back in, all while programmes are trying to run. Excessive demands will ultimately lead to the OS running out of useable memory completely, whereupon Unix typically starts to kill off processes. A key thing I learnt at college all those years ago was that the paging program must sit in protected (unpaged) real memory - paging out the paging program is akin to turning off the WIFI smart plug powering the WIFI router and trying to switch it back on again.

As well as the physical memory in use, V8 keeps track of the maximum-used heap size and the currently-used heap size. When the ‘currently-used’ size increases close to the ‘max-used’ size, the heap is expanded by taking more available memory in the available-heap or increasing the available-heap up to the maximum-heap size. To make this work, the increase in used-heap size is based on current size multiplied by a factor which is based on the rate of increase. This means that a fast growing heap will increase max-used heap size more quickly than a slow growing heap. Since V8 is designed to run Chrome, it is heavily optimized towards web browser clients. Web pages, when loaded, typically demand a lot of new memory quickly, which rapidly increases the heap-used size then plateaus. Over time memory use may reverse as web pages are closed, but clearly the max-used heap size only goes up. To counter this another algorithm looks at heap memory plateauing behaviour and, where possible, reduces the max heap size. Clearly this reduction in max-used heap will only happen given both time and opportunity for the algorithm to work.

It is worth noting that major mark-sweep garbage collection is triggered by the heap currently-used size approaching the max-used heap limit, which will only happen after memory allocation, and that heap size adjustment is only made after a successful GC.

When the currently-used and max-used size both approach the fixed max-heap-limit, GC has to be run aggressively to try and recover memory before the heap runs out of space.

To make (in reality to encourage) Major Garbage Collection run, memory requests have to consume close to the current heap limit, and then allow the GC time and resource to execute and re-adjust the heap sizes.

This is a simplified interpretation of all the documents I have read, as there are also large object spaces. In reality V8 does not allocate very much memory for the heap, and large objects of say >32k cannot actually fit into the standard generations in the first place, so they have their own space that is not moved by the scavenger.

There are V8 start-up settings to control the V8 heap, including the heap size, the young space size, and the max-old-space-size. This last setting can be set when starting Node, and in the HA Node-RED addon this is exposed as an advanced start-up setting. This is the only Node-RED configuration that can be used to modify heap and GC ‘behaviour’.
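The add-on start-up log earlier in this thread shows Node being launched as ‘node $NODE_OPTIONS node_modules/node-red/red.js’, so (assuming your install passes NODE_OPTIONS through in the same way) the setting ends up as something like:

NODE_OPTIONS="--max-old-space-size=4096"

The 4096 here matches the value mentioned earlier in this thread; it is only an example, not a recommendation.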

The default max heap size depends on processor (64/32 bit) and Node version. Unofficially it appears that this was 1.4GB up to Node 11, 2.0GB from v13, and 4.0 GB from 14 onwards. These figures are for 64bit processors, halve these for 32 bit - so yes your entire heap could be just 700MB, which when divided between the new and old space, and divided into three for the new spaces, does not give a lot of room to start with.
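Rather than trusting these unofficial figures, you can ask V8 what limit your own Node is actually running with. A minimal check using the built-in node:v8 module (run it with the same Node binary that runs Node-RED):

// heap-limit.js - print this Node's V8 heap size limit in MB
const v8 = require('node:v8');
console.log(v8.getHeapStatistics().heap_size_limit / (1024 * 1024), 'MB');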

I have not covered the more tricky subjects of call-backs and asynchronous processing. In JS, call-backs are functions passed with a function call that are separately executed at the end of the main function by way of a return. This clearly gets messy on the stack - where is the call-back in relation to the main function? If the main function goes first and the call-back goes next, then when the main function ends it will clear from the stack but the stack will not unwind as there is another (call-back) function on top of it. This means that variable pointer references still remain. Asynchronous processing with promises becomes even more interesting. If anyone reading this knows how GC works around JS promises, please do explain!

So, what are the problems with just letting Node-RED / Node / V8 get on with it?

The big challenge is ‘memory leak’. This is where memory is allocated for something, and then either the reference (pointer) is lost or the program or programmer loses track of the memory and does not release it when genuinely no longer required. In C and Python (which uses reference counting) it is very possible to leave bits of memory with a reference pointer that neither the runtime nor the programmer can get to. JS is more resilient (as it uses ‘reachability’), but it is still possible to create variables (use memory) and not clear these when the program has effectively done with them.

There are several conclusions that I believe can be inferred.
The first point is that GC only happens when the heap is getting close to the limit. If the heap grows quickly, the max limit rises quickly and can possibly reach ‘full’ before the GC has had time to be operational and effective. Even if the GC runs, it does not always clear everything, so it could be possible for the GC to run and not find enough free space. If the New Space area has become over-full, then scavenge will not work and this area will only be cleaned by the main GC, so if the main GC fails, then scavenge will also fail, and Orinoco has to give up and shut down Node.

Creating a large array in memory will take up a large block in the heap. If this array is of objects, and then the objects are re-written to be bigger objects, the allocation system may move the array to new space leaving holes that are now no longer big enough for re-use. In particular, working on two arrays at the same time using a loop to repeatedly rewrite objects could interleave one array write with the other, particularly as JS execution is becoming more and more asynchronous. Re-writing such arrays with increasingly larger objects could very quickly create a lot of holes in the heap that become unusable without compaction. Since compaction of the Old Space is the very last thing the GC does (if at all) this could fill the heap to failure before the unused space can be recovered.
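As a contrived illustration of that anti-pattern (hypothetical code, not taken from any real flow): interleaving two arrays of ever-growing objects forces repeated re-allocation and leaves behind freed holes of every intermediate size:

// Anti-pattern sketch: two arrays rewritten with ever-bigger objects, interleaved
const a = [];
const b = [];
for (let i = 1; i <= 10000; i++) {
    a[i % 100] = { data: "x".repeat(i) };  // each rewrite needs a bigger allocation
    b[i % 100] = { data: "y".repeat(i) };  // interleaved with the other array's allocations
}
// The freed smaller objects leave fragmented gaps that stay unusable until a compacting GC runs.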

How to deal with memory
In genuine cases, where there is a requirement for more space, max-old-space-size can be increased. This only changes the old space size, and it has the effect of making the Mark-Sweep GC less efficient, so it is not recommended to mess with any of the other standard default settings (allow V8 to work these out for itself). Making the old space very big just delays the inevitable GC and may well mean that, when it happens, there is not enough resource left to allow the GC to work correctly and in time. GC more regularly on a smaller heap is better than GC as a last dying panic on a larger heap.

There are several things that can be done to mitigate or prevent heap issues.

Starting from the bottom and working up, all memory management stages can generate failure, which could be any one of:

  • hardware (memory) faults
  • Linux paging issues
  • Docker / container allocation limitations
  • V8 (stack or heap) failure
  • Node.JS
  • Node-RED
  • C/C++ or Python code run within Node-RED
  • Node-RED nodes (3rd party)
  • Node-RED JavaScript (programmer written function nodes)
  • Node-RED flow (poor programming)

and much of this is out of our scope, particularly when using third-party nodes that may have embedded Python or C/C++, or asynchronous JS. It is worth noting that any C/C++ contained in Node-RED is memory-managed outside of the V8 heap in its own space. The programmer is entirely responsible for allocation and recovery, as there is no GC of this space.

Anyway, stuff to avoid doing (a list compiled from several places, and certainly not exhaustive):

Stack

  • Avoid recursive functions without clear termination (also use tail-recursive functions where possible, as some JS engines can optimise these by turning them into a simple loop)
  • Keep variables local to the function. Use ‘let’ and not ‘var’ to avoid global variable declaration

Heap

  • Copy data by value and not by reference (ironically this is more economical, as it avoids multi-pointer references to the same object, which reduce the efficiency of the scavenger and GC). In Node-RED there is a ‘deep copy’ option in the change node when using ‘Set’ - see the sketch after this list.
  • Delete unwanted data ASAP
  • Check loops for possible leakage and ensure loops terminate
  • Check timers and API (http) calls to ensure they are closed / deleted when no longer required
  • Avoid large data growth without allowing time for the GC to work
  • Set heap size to an appropriate value (not as big as possible - should ideally be just a bit more than is required for normal operation)
  • Consider long term leakage tracking with auto restart, and/or a plan to find and fix
  • Consider short term debugging using multiple heap dumps prior to known failure to trace heap behaviour
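As an example of the deep-copy bullet above, a minimal function-node sketch using RED.util.cloneMessage, which is available inside function nodes (the property name here is made up):

// Deep copy instead of sharing a reference to the same heap object
const copy = RED.util.cloneMessage(msg.payload); // value copy: no shared pointers
copy.processed = true;                           // mutate the copy, not the original
msg.payload = copy;
return msg;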

To elaborate on some of these points:

Undeclared variables in JS are auto-created, but as global variables (the same as ‘var’ at the top level of a script). Thus any such variables end up remaining for the entire duration of the Node-RED program.

Variables can be set to ‘null’ or re-assigned at any point when no longer needed. The JS ‘null’ means no pointer, which means no reference into the heap, which means the GC can recover the old memory.
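To make both of those points concrete, a small sketch (the function names are hypothetical):

// Assigning to an undeclared name in non-strict JS creates a global - it is never collected
function leaky() {
    cache = new Array(1000000).fill(0);  // no let/const: becomes a global variable
}

// Block-scoped and nulled: the GC can recover the memory early
function tidy() {
    let cache = new Array(1000000).fill(0);
    // ... use cache ...
    cache = null;  // drop the reference as soon as the data is done with
}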

Timers and call-backs all set up event listeners. These wonderful things sit on the stack until they time out. Setting a load of timeouts for 10 seconds will leave loads of variables stuck in the heap.

I have read that if a caller is cleared before the promise returns this is not an issue for GC, but recursive timers for API and http calls are a problem.
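A small illustration of the timer point above (hypothetical function-node code): a pending timeout keeps its closure, and everything the closure references, alive until it fires or is cleared:

// Each pending timeout pins its closure (including 'big') in the heap
const big = msg.payload;               // imagine this is a large object
const t = setTimeout(function () {
    node.send({ payload: big });       // 'big' stays reachable for the full 10 seconds
}, 10000);
// If the work becomes unnecessary, clear the timer so the closure can be collected:
// clearTimeout(t);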

From a Node-RED flow perspective, the following things consume and hold on to memory.

  • Multiple outflow lines from a node
  • API calls (timeouts)
  • Timers and timeouts
  • Nodes that store stuff
  • Processing large files / arrays / objects entirely in memory

The multiple lines out are a problem because Node-RED makes a complete copy of the message for each outflow connection. If the message is large, this multiplies the problem.
API calls and http calls are problematic because the return includes res and req. These message properties are very large, contain circular references, and should ideally be got rid of ASAP.
Large arrays, particularly of objects, should be managed to reduce overall size and to delete unwanted data. Large files and large arrays that exceed even half the heap size are going to be a problem, and a different approach, working on just part of the array / file at a time, is required.
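For the res/req point, this is the kind of cleanup I mean - a function node on any branch that no longer needs to answer the http request (hedged: if the branch still feeds an http-response node, msg.res must be kept until the reply has been sent):

// Strip the heavyweight http objects once this branch no longer needs them
delete msg.req;   // incoming request object - large, with circular references
delete msg.res;   // response object - keep it if an http-response node is still downstream!
return msg;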

Other things to look out for are the use of ‘let’ and ‘const’ which should be tightly scoped. The use of functions and {} code blocks can restrain variable scope even further. The idea being, only have data as long as you absolutely need it and let the GC recover it early during the New Space generational cycle rather than letting it get moved into the Old Space.

Looking at the problematic code posted above, I note that almost everything points to consuming lots of memory and not letting the GC do its job. Many http calls, multiple lines out, lots of timers, many complex nodes working (potentially asynchronously) together. The HA WebSocket nodes variously use API calls and a WebSocket connection with HA. The WebSocket is single, permanent, and efficient. API calls are temporary, use more memory with timeouts and http returns, so these I believe are much more likely to cause issues with heap use and GC failure.

Memory Leak
As well as trying to avoid memory leaks in the first place, another approach is to set up a watchdog to monitor the heap memory. JavaScript inherently exposes absolutely nothing about memory to the coder; however, V8 and internal process information is available and can be accessed via JS functions using a built-in module.

https://nodejs.org/docs/latest-v18.x/api/v8.html#v8getheapstatistics

The node:v8 module exposes information including several calls that return memory or heap details.
Getting this module loaded needs require('node:v8'), which cannot be called from within the JavaScript in a NR function node. However, I discovered while doing this work that the latest Node-RED function nodes can be set to pull in the V8 module. Once done, the v8.getHeapStatistics() call returns detailed data. There are contrib nodes for getting heap details or dumping a heap snapshot, but this can easily be done directly in a function node. The hard part seems to be trying to work out what each individual field represents.

This is my project (I’m calling it ‘Project Compost’) to monitor and watch the heap.

Function node to read V8 heap memory figures

[{"id":"1a8578e1722bf90e","type":"function","z":"249a680bf3f915e2","g":"53c4c81afcce308f","name":"Read Memory Use","func":"\nmsg.payload = {\n    \"memory\": process.memoryUsage(),\n    \"heap\": v8.getHeapStatistics(),   \n    \"space\": v8.getHeapSpaceStatistics()}\n\nreturn msg;","outputs":1,"timeout":"","noerr":0,"initialize":"","finalize":"","libs":[{"var":"v8","module":"v8"},{"var":"process","module":"process"}],"x":370,"y":1560,"wires":[["f812f2fa619bc191","3c575eaca7a42b1b","c879c5ac63f83116"]]}]

This just shows how to load in the required modules ‘V8’ and ‘process’ so that they can be called.
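For readability, the function body inside that export (with ‘v8’ and ‘process’ added as module imports on the node’s setup tab) is just:

// Function node body - 'v8' and 'process' are loaded via the node's module imports
msg.payload = {
    "memory": process.memoryUsage(),        // rss, heapTotal, heapUsed, external, arrayBuffers
    "heap": v8.getHeapStatistics(),         // heap_size_limit, total/used heap sizes, etc.
    "space": v8.getHeapSpaceStatistics()    // per-space (new/old/code/large-object) figures
};
return msg;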

There is no point calling this too often, since heap memory sizes will not change except following a GC. As an experiment, I am calling this monitoring flow every 1 minute, and with other flows running every 20 seconds to poll my inverter, I see regular changes. At the moment this is just running and recording one day of data to context in an array, and just the RSS space size, the heap, and the used heap. I had need to restart HA following an update (problem with an integration that stopped working and had to be updated) and the result was very interesting indeed.
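The recording side is just a capped array in flow context; a sketch of my approach (the context key and field choices are arbitrary):

// Keep one day of samples (one per minute = 1440) in flow context
const MAX_SAMPLES = 1440;
const history = flow.get("heapHistory") || [];
history.push({
    ts: Date.now(),
    rss: msg.payload.memory.rss,
    heapTotal: msg.payload.heap.total_heap_size,
    heapUsed: msg.payload.heap.used_heap_size
});
if (history.length > MAX_SAMPLES) history.shift();  // drop the oldest sample
flow.set("heapHistory", history);
return msg;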

On the left - my RSS/heap use after Node-RED has been running for quite some time. On the right, just after a restart. This certainly suggests a memory leak over time. After start-up I would expect my memory use (RSS) to be around 200 MiB, but I was seeing this climb to almost 400 MiB.

Later results:
Well, I went through all my Node-RED (function) nodes when writing this posting and replaced every var with let wherever possible. I believe that I have noticed a difference in just doing this.

Here is my current RSS/heap use after Node-RED has been running for almost three weeks, and the figure is only slightly higher than just after the restart - around 220 to 230 MiB (average).

I am now just monitoring memory use using a graph card in HA.

In conclusion, I have been quite surprised how much difference using ‘var’ can make to a long term heap memory-leak. It seems that, if I want to run Node-RED on a ‘semi-industrial’ long-term basis, then I will have to put much more thought and effort into writing my code…

Note for the Forum Moderators (assuming that anyone has actually read this far…) - I have written this without any recourse to AI. This is all my own work, having taken almost two months over this, and I accept all responsibility for my human mistakes.
