Monitoring health of ESP nodes within rules

jgrad
Normal user
Posts: 100
Joined: 29 Aug 2016, 22:03
Location: Slovenia

#1 Post by jgrad » 15 Apr 2024, 14:53

Hi,

On the main page of every ESPEasy node there is a list of the other ESP nodes on the same network. For every node there is also an AGE value, which indicates whether the node is alive. Is it possible to reference the AGE of a particular node within rules, e.g. to implement a rule on one node that checks the health of the other nodes on the network and sends notifications?

If there is another way to implement such monitoring, please share it.

BR.

TD-er
Core team member
Posts: 8762
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

#2 Post by TD-er » 15 Apr 2024, 15:06

You could use the "sendto" command to send something to a specific node.
See: https://espeasy.readthedocs.io/en/lates ... nd-publish

For example a command like "event,fromnode=%unit%"

Code: Select all

sendto,N,"event,fromnode=%unit%"
With N being the unit nr of the other node

And then in the rules on that node:

Code: Select all

on fromnode do
  logentry,"Received from node: %eventvalue1%"
endon
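
A minimal, untested sketch of how this could become a heartbeat check. The timer numbers, the 60 and 180 second intervals, and the unit numbers (1 for the monitoring node, 2 for the monitored node) are assumptions for illustration only, not something stated in this thread. On the monitored node (unit 2):

Code: Select all

on System#Boot do
  loopTimerSet,1,60                  // assumed interval: one heartbeat per minute
endon

on Rules#Timer=1 do
  sendto,1,"event,fromnode=%unit%"   // 1 = assumed unit nr of the monitoring node
endon

And on the monitoring node (unit 1), where every heartbeat restarts a watchdog timer:

Code: Select all

on fromnode=2 do
  timerSet,2,180                     // restart the watchdog for node 2
endon

on Rules#Timer=2 do
  logentry,"No heartbeat from node 2 for 180 seconds"
endon

Note that in this sketch the watchdog only starts after the first heartbeat has arrived, so a node that never reports is not detected.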

#3 Post by jgrad » 15 Apr 2024, 15:47

OK, this is one option, but I would have to modify the rules on both sides. Is there a way to use the already available info that is distributed via broadcast?

#4 Post by TD-er » 15 Apr 2024, 16:32

Not yet, but I will think about what could be done to make this available.
It feels more like a system variable, so maybe we can do something like %u_age%(N) with N being the unit nr.
A bit like the standard conversions.

#5 Post by jgrad » 18 Apr 2024, 20:32

Yes, system variables like:
%u_age%(N)
%u_name%(N)
%u_IP%(N)

would be perfect.

The values are available in the response to http://localhost/json, but I don't know how to get this response and parse it within rules.

Ath
Normal user
Posts: 3531
Joined: 10 Jun 2018, 12:06
Location: NL

#6 Post by Ath » 18 Apr 2024, 20:47

jgrad wrote: 18 Apr 2024, 20:32
Values are available in response to http://localhost/json but I dont know how to get this response and parse it within rules.

You cannot fetch the JSON output from rules on the same ESP, and rules can't process the resulting (large) JSON text anyway.

We'll try to add the functions.
/Ton (PayPal.me)

#7 Post by Ath » 18 Apr 2024, 23:15

jgrad wrote: 18 Apr 2024, 20:32
%u_IP%(N)

That's already available, named %c_u2ip%(N,x), where N = unit nr and x determines what's returned for an empty IP: 1 = "" (empty string), 2 = 0
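
For illustration only, assuming a node with unit nr 2 is on the node list, a rule line like the following would be expected to write that node's IP to the log, or an empty string (x = 1) when no IP is known:

Code: Select all

logentry,"IP of unit 2: %c_u2ip%(2,1)"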

#8 Post by TD-er » 18 Apr 2024, 23:48

Maybe also add %u_build%(N)

#9 Post by Ath » 21 Apr 2024, 17:32

@jgrad I've created pull request #5039 to add the requested conversions. The names are somewhat different than requested, as conversions are required to start with "%c_".

Quick documentation available in the PR. (Documentation is also updated, and will be deployed on ESPEasy RTD when the PR is merged).
Download available in this GH Action Run (You'll need a free github account to be able to download from there)

#10 Post by jgrad » 21 Apr 2024, 22:22

@Ath

Thanks for your effort. I uploaded ESP_Easy_mega_20240421_normal_ESP32_4M316k_ETH and successfully ran a basic test.
I also implemented a node health check with email notification within rules. After a few days, once I am sure that everything works as expected, I will post the rules code used for monitoring.

Can you also add the possibility to fetch the load of a selected node?

#11 Post by Ath » 21 Apr 2024, 23:26

jgrad wrote: 21 Apr 2024, 22:22 Can you also add possibility to fetch load of selected node?
Well, I wasn't sure if that would be useful, but as you're asking for it I've added it. And I also added the Type column value, both numeric and string-converted, that shows the type of chip in the remote unit (not only ESP's... ;))

A new Actions run is working for you

#12 Post by jgrad » 22 Apr 2024, 19:51

Hi @Ath

Thanks for adding "load" as a possible parameter. It can be useful for monitoring load performance (not tested yet).

Below I am sharing a simple rule that checks the presence of all ESP nodes with IDs between 1 and 100. If a node's age is above 300 seconds, a mail notification is sent. Nodes that are not present are skipped.

Code: Select all

on System#Boot do
  let,10,1            // counter, also used as the ID of the node checked in the current cycle
  loopTimerSet,2,1    // timer 2 fires every second; one node ID is checked per tick
endon

on Rules#Timer=2 do
  if %c_ubuild%([int#10])>0      // node is on the list of active nodes; ID is stored in int#10
    if %c_uage%([int#10])>300    // age since the last status report exceeds 300 seconds
      // send the email notification
      notify,1,"ESPEasy node %c_uname%([int#10]) has not reported its status to the node list for %c_uage%([int#10]) seconds! (configured threshold=300 seconds)%CR%%LF%"
    endif
  endif
  let,10,[int#10]+1   // increment counter
  if [int#10]>100     // restart counter after ID 100
    let,10,1
  endif
endon

#13 Post by TD-er » 22 Apr 2024, 20:06

Maybe you should also set some limit on the number of emails being sent, as the node will be "forgotten" after 10 minutes (or was it 5???) so it might hit the email call several times in a row until it is forgotten.

N.B. you can also nest variables, so you can keep track of sent emails per node.
I suggest to keep your "counter" in a higher variable index like "100" or "1000", so you can do stuff like this:

Code: Select all

let,[int#1000],1 // email sent

logentry,"Email sent state of node [int#1000]: [int#%v1000%]"
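
A rough, untested sketch of how the two ideas could be combined, based on the rule from post #12. It assumes the counter is moved to variable 1000 and that variables 1..100 are free to hold a per-node "email sent" flag; both choices are illustrative only.

Code: Select all

on System#Boot do
  let,1000,1                     // counter = unit nr of the node checked this cycle
  loopTimerSet,2,1               // check one unit nr per second
endon

on Rules#Timer=2 do
  if %c_ubuild%([int#1000])>0
    if %c_uage%([int#1000])>300
      if [int#%v1000%]=0         // no email sent yet for this node
        notify,1,"ESPEasy node %c_uname%([int#1000]) has not reported for %c_uage%([int#1000]) seconds"
        let,[int#1000],1         // remember that an email was sent
      endif
    else
      let,[int#1000],0           // node reports again, allow a new email
    endif
  else
    let,[int#1000],0             // node not on the list, clear its flag
  endif
  let,1000,[int#1000]+1
  if [int#1000]>100
    let,1000,1
  endif
endon

With this layout a node that stops reporting should trigger a single email, and the flag is cleared once the node reports again or drops off the node list.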

#14 Post by jgrad » 23 Apr 2024, 09:55

A node disappears from the node list after 600 seconds (10 min) of inactivity.

With the rules above, if a node is disconnected, an email is sent about 3 times while the node's age is between 300 and 600 seconds, because each ID is checked once every 100 seconds. After that the node disappears from the list (%c_ubuild%(N)=-1) and no more emails are sent.
