Reboot due to Const Interval: TIMER_100MSEC

Moderators: grovkillen, Stuntteam, TD-er

stefanru
Normal user
Posts: 41
Joined: 29 Jul 2017, 00:48

#1 Post by stefanru » 11 Feb 2021, 10:12

Hi,

my ESP is rebooting roughly once a day with the below error:
Reset Reason: Hardware Watchdog
Last Action before Reboot: Const Interval: TIMER_100MSEC

I am posting this in the Software topic because I have built my own plugin into the firmware.
The plugin is built with ESPSerial and provides a SoftwareSerial port, because I read two smart meters and needed a second serial server.
The plugin uses PLUGIN_TEN_PER_SECOND.
I think the error has to do with that?
Is there a better way to do the loop?

My Plugin Code can be found here:
https://github.com/StefanRu1/SoftwareSe ... _SSSRV.ino

P.S.: The ESPSerial is working fine, but I also get messages like "Ser2N: serial buffer full!" in the log. Any idea what might cause this? It would be perfect if there were an alternative to PLUGIN_TEN_PER_SECOND.

P.P.S: Here my Firmware Information:

Firmware
Build: 20111 - Mega
System Libraries: ESP82xx Core 2843a5ac, NONOS SDK 2.2.2-dev(38a443e), LWIP: 2.1.2 PUYA support
Git Build:
Plugin Count: 47 [Normal]
Build Origin: Self built
Build Time: Dec 12 2020 11:48:29
Binary Filename: ESP_Easy_mega_20201212_normal_ESP8266_4M1M
Build Platform: Windows-10-10.0.18362-SP0
Git HEAD: pygit2_not_installed

Thanks and best regards,
Stefan

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

#2 Post by TD-er » 11 Feb 2021, 10:57

Sure there is an alternative to PLUGIN_TEN_PER_SECOND
It is called PLUGIN_FIFTY_PER_SECOND. :)


What baud rate are you using?
What amount of data is received and at what interval?
You're writing what you receive to a network-connected client.
Maybe you should also set a timeout on this WiFiClient object? The default timeout is too long (longer than the watchdog timer).

#3 Post by stefanru » 11 Feb 2021, 11:25

Hi,

thanks for your answer.
I will check whether PLUGIN_FIFTY_PER_SECOND helps.

The ESP only receives data from two smart meters, and I read that data with my smart home solution via serial.
The baud rate is 9600.
For the first smart meter I use Communication - Serial Server.
For the second smart meter I use my plugin code.

The meter should not send much data. It uses the SML format and sends one data telegram every second, with the following parameters:
Baud rate: 9600
Max. transfer time: 400 ms

Telegram OBIS:
1 81 81 C7 82 03 FF Manufacturer X X X X
2 01 00 00 00 09 FF Device number 0.0.9 X X X X
3 01 00 01 08 00 FF Register, positive active energy, no tariff 1.8.0 X X
4 01 00 02 08 00 FF Register, negative active energy, no tariff 2.8.0 X X X
8 01 00 0F 07 00 FF Current positive active power, magnitude 15.7.0 X X X X
9 01 00 01 11 00 FF Signed meter reading (EDL40 mode only) 1.17.0 X X

I found that "Ser2N: serial buffer full!" is not coming from my SoftwareSerial plugin but from the normal serial server.

The solution is working fine.
Nevertheless my problem is the reboot with the message "Const Interval: TIMER_100MSEC". Any Ideas why this happens?

Thanks and best regards,
Stefan

#4 Post by TD-er » 11 Feb 2021, 11:28

Well if it tries to send the data to some connected host, but that connection is lost or something else causes a timeout, then you're waiting for the default timeout.
This default timeout (set in the WiFiclient object) is too long.

#5 Post by stefanru » 11 Feb 2021, 11:38

Hi,

Is this the answer to "Ser2N: serial buffer full!"?
I see that this happens as long as the normal SerialServer is running.
If I switch it off, the error goes away.
But I do not know if the reboots are caused by this.

I do not really understand the timeout.
I changed the only option, "TX receive timeout", to 2000 ms. That makes no sense, because I am only using RX.

But now the ESP is completely gone :-(
And it does not even come back. I will have to go to the basement and reset the ESP.

I checked a bit more and the reboots are occurring very often, more than once an hour.
The WiFi connection is strong and I do not think it is a connection / WiFi problem.
I have a lot of ESPs working fine, but not this one with ser2net.
As said, I assume it has something to do with my plugin.

Best regards,
Stefan

#6 Post by TD-er » 11 Feb 2021, 12:12

Edit:
Realized later my reply was with another thread in mind, so makes no sense here

#7 Post by stefanru » 11 Feb 2021, 13:29

Ok, no Problem.

So I changed the SoftwareSerial code to PLUGIN_FIFTY_PER_SECOND. It has now been running for 60 minutes without a reboot. Let's see.

What is strange, and what I do not understand, is why the built-in ser2net is always logging "Ser2N: serial buffer full!".

Perhaps someone has an idea?

Best regards,
Stefan

#8 Post by TD-er » 11 Feb 2021, 13:36

That log entry is not present in your plugin code you linked, but it is present in _P020_Ser2Net.ino

        if (bytes_read != P020_BUFFER_SIZE)
        {
          if (bytes_read > 0) {
            if (Plugin_020_init && ser2netClient.connected())
            {
              ser2netClient.write((const uint8_t *)serial_buf, bytes_read);
              ser2netClient.flush();
            }
          }
        }
        else // if we have a full buffer, drop the last position to stuff with string end marker
        {
          while (Serial.available()) { // read possible remaining data to avoid sending rubbish...
            Serial.read();
          }
          bytes_read--;

          // and log buffer full situation
          addLog(LOG_LEVEL_ERROR, F("Ser2N: serial buffer full!"));
        }

Your plugin does read from a serial port and apparently Ser2net is also enabled?
It does receive data which is stored in a buffer, but the buffer is not being read.
So I wonder, is this plugin active? (as in assigned to a task and enabled)

#9 Post by stefanru » 11 Feb 2021, 14:17

Hi,

thanks again for your help. I am really happy that you try to help me.

Yes you are right. The error is only coming from ser2net and has nothing to do with my plugin.
I do not understand why i get this error from ser2net.

I have both enabled ser2net and my plugin. (see screenshot)
Both are working and i get data. ( I have 2 meters and need two serialservers for two sensors).

Everything is fine, apart from the log message and the reboots roughly every hour.
I have the feeling that the problem is solely caused by ser2net and not by my plugin, but I do not understand why.

Any Idea?

Thanks again,
Stefan
[Attachment: Capture.JPG (screenshot, 50.22 KiB)]

#10 Post by TD-er » 11 Feb 2021, 14:32

The largest message you posted was 76 bytes long.
Let's round that to 80 bytes. At 9600 bps, receiving this message takes roughly 80 msec. (Rule of thumb: 1 byte takes about 1 msec at 9600 bps.)
Since you're receiving it via software serial, you are handling quite a lot of interrupts during 80 msec, which may cause issues with the WiFi handling.
For proper stable WiFi handling you should at least every 10 msec handle your WiFi, which is done by calling delay().
N.B. this also resets the Watchdog timer.
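
The rule of thumb can be checked with a bit of arithmetic. This is only a minimal sketch, assuming 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte). The function names are made up for illustration and are not part of ESPEasy.

```cpp
#include <cstdint>

// Assumption: 8N1 framing, i.e. 10 bits on the wire for every payload byte.
constexpr std::uint32_t kBitsPerByte8N1 = 10;

// Time in microseconds needed to receive `bytes` bytes at `baud` bits/sec
// (rounded down).
constexpr std::uint64_t transferTimeUs(std::uint32_t bytes, std::uint32_t baud) {
  return (static_cast<std::uint64_t>(bytes) * kBitsPerByte8N1 * 1000000ULL) / baud;
}

// Number of complete 10 msec intervals that pass while the message arrives.
// Each of these is a moment where the WiFi would like to be serviced.
constexpr std::uint32_t wifiSlotsDuring(std::uint32_t bytes, std::uint32_t baud) {
  return static_cast<std::uint32_t>(transferTimeUs(bytes, baud) / 10000ULL);
}
```

With these assumptions, transferTimeUs(80, 9600) comes out at about 83 msec, so roughly eight of those 10 msec intervals pass while a single 80-byte message is being received.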

What might be happening here is that the WiFi stuff doesn't get enough attention which may cause the network connection to get lost without the code being notified of this.
So when trying to send data, the WiFiClient object may get stuck waiting for a reply => WD reboot.

When using a hardware serial port, you don't have these issues as receiving data doesn't interrupt the running code.
It simply appends new data to an internal buffer which should be read in due time.
Software Serial is way more intrusive in the code execution as it literally interrupts this for every single bit (or byte?) of data received.

Really 'chatty' devices should not be captured via Software Serial, unless they run at a higher bitrate and/or only send short messages.
For example a GPS unit also may not run as stable as one would expect on Software Serial.

Maybe you can add a yield(); in this while loop:
https://github.com/StefanRu1/SoftwareSe ... #L150-L154

#11 Post by stefanru » 11 Feb 2021, 19:06

Hi,

thanks again for your detailed analysis.
I added delay(0); at lines 153, 156 and 177.
Let's see if this helps to prevent the reboots.
Why did you suggest yield()? Is there a reason to prefer yield()?

I also checked where the "Ser2N: serial buffer full!" message comes from, and it is only coming from ser2net.
I can also disable my plugin and the error still stays, so it has nothing to do with the SoftwareSerial server at all.

I am now thinking about not using ser2net but two SoftwareSerial servers instead, but this seems wrong to me. Hardware serial should be better?

Lets see if the delay(0) at least solved the reboots every hour.

Thanks and best regards,
Stefan

#12 Post by TD-er » 11 Feb 2021, 19:13

yield() is a more lightweight version of delay().
So it will give some attention to the WiFi part.

#13 Post by stefanru » 11 Feb 2021, 20:42

Hi,

ok thanks for the answer.

Regretfully, the delay(0) at the various points does not prevent the reboot.
Perhaps i have to check if the plugin is causing the issue or just the Communication - SerialServer on its own?

Nevertheless i will try yield( ) but do not think this will change the behavior.
Any Ideas are welcome. I will try some more stuff...

P.S.: I did try now with a lot of parameters and so on. Nothing helped.
But i am pretty sure the error is coming from ser2net used in the standard Communication - SerialServer and not from my plugin.
I do not understand why the standard Communication - SerialServer has this problem and my software serial server does not?
I read both with a telnet connection on my smart home.

I will now try to run both smart meters with the software serial plugin :-(
This seems somehow wrong, but the hardware serial seems to cause my issues.

Thanks and best regards,
Stefan

#14 Post by TD-er » 11 Feb 2021, 23:07

2 more things for you to try.
I just merged a fix related to WiFi code. See: https://github.com/letscontrolit/ESPEasy/pull/3507
It is now already part of the mega branch.

The other thing you can try is to change 2 options in the WiFi settings (Tools->Advanced, bottom of the page):
- Periodical send Gratuitous ARP: Checked
- CPU Eco Mode: Unchecked

#15 Post by stefanru » 11 Feb 2021, 23:15

Hi,

thanks so much.
I will try and report here.

I also found this git Issue from 2017.
I now also try to switch off all log levels. Perhaps this helps.

Best regards and thanks again,
Stefan

#16 Post by TD-er » 11 Feb 2021, 23:45

Yep, generating logs may just take that last bit of resources you need.
Especially the log to a logserver is notorious for causing issues with flaky network.

#17 Post by stefanru » 12 Feb 2021, 09:50

Hi,

I did a lot of testing now.
And I was wrong. My plugin is the root cause.
As long as my plugin is disabled the normal serial server is working fine.
No errors in the log.

As soon as I switch on my code, the error "Ser2N: serial buffer full!" appears.
I also see that in my smart home the normal serial server does not send every update anymore.

So somehow my code is influencing the normal serial server.
I think you are totally right that my code is somehow taking too much time and the normal serial server is not able to do its processing in the time it needs :-(

I have now gone back to PLUGIN_TEN_PER_SECOND, but this does not seem to help much.
I am now playing a bit with the buffer settings in my code. Let's see....

P.S.: Is there a way to see why or where it loses time?
Sorry, I am not very experienced in ESPEasy plugin programming.

Thanks again,
Stefan

#18 Post by TD-er » 12 Feb 2021, 10:18

If you're building using the "normal" or "custom" PlatformIO environment, you will also have a timing stats page (see button on the Tools tab)
N.B. the timing stats will be cleared every time you load that page.

#19 Post by stefanru » 12 Feb 2021, 11:59

Sounds interesting.

I am using Visual Studio Code and PlatformIO.
I build normal_ESP8266_4M1M.

I see a monitor button but nothing with timing stats?
Am I using the wrong tools or am I just blind?

P.S.: I now see that the error "Ser2N: serial buffer full!" is causing CRC errors on the smart home telnet receiver connected to the normal serial server.
So my code is definitely breaking the data in the normal serial server...

Thanks and best regards,
Stefan

#20 Post by TD-er » 12 Feb 2021, 12:07

I meant the "tools" on the ESPEasy web interface.
There are several buttons like "Reboot", "Log", .... "Timing stats".

#21 Post by stefanru » 12 Feb 2021, 13:55

Ah, I am really dumb, sure.

I did more testing and even tried to use two software serial servers, but then no data is received any more.

So for now I found out that my SoftwareSerial plugin causes "Ser2N: serial buffer full!" on the normal SerialServer and the data gets corrupted. I have CRC errors on the receiver side on the normal serial server port.
Software Serial has no CRC errors.

For Software Serial, FIFTY PER SECOND seems to provide more data than TEN PER SECOND, but I use TEN PER SECOND and hope it does not influence the normal serial server so much.

The timings look like this:
Is there something obviously wrong? Or does something show where my problems are coming from?
To me this looks good. The timings in SoftwareSerial are nearly the same as in normal serial.


Thanks again for the help!

Best regards,
Stefan
[Attachment: Capture.JPG (screenshot of the timing stats, 297.02 KiB)]

#22 Post by TD-er » 12 Feb 2021, 14:20

See here for more info on how to interpret the timing stats: https://espeasy.readthedocs.io/en/lates ... ming-stats

As you may have noticed in your screenshot, both the SERIAL_IN and the TEN_PER_SECOND calls are highlighted, meaning one of the times is > 100 msec.
I've chosen this as something to focus on as anything that takes > 100 msec to run will cause (obviously) a delay in the execution of the TEN_PER_SECOND call.
You can also see it is quite affected as the number of calls/sec for the 10/sec call = 7.37 times a second. (N.B. your stats cover roughly 1000 seconds, so the statistics are usable)
The 50/sec call only achieved 32.7 calls/sec.
This is a good indication we do see blocking calls which affect the scheduling of others.
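
To put numbers on this, the achieved call rate can be compared with the nominal one. A small sketch with a hypothetical helper (not an ESPEasy function), using integer math on the values quoted above:

```cpp
// Share of the nominal call rate that was actually achieved, in permille
// (tenths of a percent). The measured rate is passed multiplied by 100,
// so 7.37 calls/sec is written as 737.
constexpr int achievedPermille(int callsPerSecTimes100, int nominalPerSec) {
  return (callsPerSecTimes100 * 10) / nominalPerSec;
}

// Values from the timing stats discussed above.
constexpr int kTenPerSecond   = achievedPermille(737, 10);   // 7.37 of 10 calls/sec
constexpr int kFiftyPerSecond = achievedPermille(3270, 50);  // 32.7 of 50 calls/sec
```

So the 10/sec call reached about 73.7% of its nominal rate and the 50/sec call about 65.4%. In other words, roughly a quarter to a third of the scheduled calls did not happen on time.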

Sadly the timing stats don't show the median and/or the standard deviation, so it is hard to tell how often those "max" values occur.
But since both plugins apparently affect each other, it is likely these max values do occur often. (maybe every received message?)

#23 Post by stefanru » 12 Feb 2021, 15:55

Hi,

ok yes i understand.

For me, a solution is to hook up a second ESP. But I really hoped to be able to process it with one,
since the amount of data should not be that large, as I showed before.
Normally it is one data telegram every second.
And it would also be OK if I only get the data once per second or only every two seconds from the ESP.

Would it be possible in the SoftwareSerial server to collect all data in a big buffer and only send it once a second?
I tried this by enlarging the ESPSerial buffer to 4096, but it did not work.
I think reading the input from the GPIO is the timing-critical part? Correct?

Any Ideas how i can get these 2 data streams from the 2 smart meters handled with one ESP?

Again thanks for all your help,
Stefan

#24 Post by TD-er » 12 Feb 2021, 16:01

The buffering is what is done in the controller, so if you use a controller (or publish the string using MQTT) the sending is done via a buffer in the background.

MQTT does keep its connection open, so you don't have overhead of making the connection.
Ser2net does also keep the connection open, or at least should keep it open.
However, I am not sure how many connections can be kept open, so maybe both try to reconnect all the time?

#25 Post by stefanru » 12 Feb 2021, 17:04

Hi thanks,

no i think the connections stay open.
I can connect with telnet on both ports. From SerialServer and from my Plugin using a WiFiServer.
Both connections stay open all the time and are not interrupted...

I will have to dig deeper. I only see the buffer in ESPeasySerial, which works best with 256. Lower values corrupt the data.
I will check a bit deeper, but I think I have to use two ESPs, regretfully.
I do not see a way to reduce the impact of the code so that the two serial servers are not competing with each other for resources.

Thanks again,
Stefan

#26 Post by stefanru » 12 Feb 2021, 19:14

Hi TD-er,

I had success.
I read all your comments again and checked P020_Ser2Net.
I made 2 small changes in the P20 SerialServer which helped. I also increased the buffer size of ESPeasySerial in my plugin so that I can also use once per second without losing data.
No "Ser2N: serial buffer full!" anymore. No CRC errors on the connection to the P20 SerialServer.
I hope the reboots will also not occur anymore. I will update here once it has been running long enough.

I could even increase the poll rate of my SoftwareSerial server again from once a second to ten times a second.
Since my application is not time critical and data is only sent once a second anyway, I will reconsider that, but I wanted to try.

So here are my 2 changes. I have not yet tested which one was the essential one.
But if it is important for you I can test it:
Line 254: delay(1) -> delay(0)
Line 14: #define P020_BUFFER_SIZE 128 -> #define P020_BUFFER_SIZE 256

I know increasing the buffer does not cure the root cause. But since I was not able to cure it even when using only once per second in my plugin, I did not see another way.
Perhaps this is also caused by all the data coming from the smart meter only once a second?

So can these two changes, or one of them once I know which one was essential, be made in the ESPEasy git repository?

P.S.: I will also rework my SoftwareSerial plugin and harmonize the code with the P20 Serial Server, so that I would also see buffer full messages.

Thanks again for guiding and helping me! Great support!
Stefan

#27 Post by TD-er » 12 Feb 2021, 22:25

Keep in mind that simply increasing that buffer in P020 may cause other issues.
The buffer is allocated "on the stack" and since we only have 4k of stack, you must be really careful and keep a close eye on the reported lowest free stack value to make sure you will not run into a stack overflow.
I guess 256 bytes may still work fine, but be aware there is a chance you may run into a stack overflow when increasing it further.
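
As a rough way to reason about this, one can subtract the extra buffer bytes from the lowest free stack value reported on the system info page. This is only a pessimistic back-of-the-envelope sketch. The helper names, the example value of 1568 bytes and the 512 byte safety margin are assumptions for illustration, not ESPEasy values or recommendations.

```cpp
// Estimated lowest free stack after growing a stack-allocated buffer.
// Pessimistic: assumes the buffer is in use on the deepest call path.
constexpr long stackLeftAfterGrowth(long lowestFreeStack,
                                    long oldBufferBytes,
                                    long newBufferBytes) {
  return lowestFreeStack - (newBufferBytes - oldBufferBytes);
}

// True if the estimate still leaves at least `marginBytes` of headroom.
// The margin is a hypothetical value you have to pick yourself.
constexpr bool growthLooksSafe(long lowestFreeStack,
                               long oldBufferBytes,
                               long newBufferBytes,
                               long marginBytes) {
  return stackLeftAfterGrowth(lowestFreeStack, oldBufferBytes, newBufferBytes)
         >= marginBytes;
}
```

For example, with a lowest free stack of 1568 bytes, going from 128 to 256 bytes still leaves about 1440 bytes by this estimate, while going to 2048 bytes would not fit at all.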

#28 Post by stefanru » 12 Feb 2021, 23:20

Ok thanks.

Free Stack is looking good: 3664 (1568 - sendContentBlocking)

It works fine now but reboots still occur.
I adapted now the SoftwareSerialServer Coding as far as possible to the P020_Ser2Net.
Let's see.

Thanks and best regards,
Stefan

#29 Post by stefanru » 13 Feb 2021, 01:06

Hi,

so i am quite happy.
I see no CRC errors, no buffer overflows, data is received fast and good.
I have the same error handling now as in P20.

The only problem, i still have the reboots.
But the error changed to:
Reset Reason: Hardware Watchdog
Last Action before Reboot: Const Interval: TIMER_20MSEC

But i have no idea why.

The timing is looking like this now.
Ser2Net needs twice as much time in SERIAL_IN as the complete TEN per second call in SoftwareSerial.
What might cause these reboots? Any idea what I can check or try?
I see that it is still a bit overloaded, but it seems to work with the slightly bigger buffers.
Can this load also cause reboots?

One idea that came to my mind is that the smart meter takes too long to send data to SERIAL_IN?
I have read something in the smart meter documentation about 400 ms max.
Is there a timeout?

So I have 2 identical smart meters. The data is transferred optically, so I have two phototransistors, one for each meter.
One is connected to RX and read by SERIAL_IN in Ser2Net.
The other is connected to D6, read in by ESPSERIAL and sent by my SoftwareSerial plugin.

Thanks,
Stefan
[Attachment: Capture.JPG (screenshot of the timing stats, 227.33 KiB)]

#30 Post by TD-er » 13 Feb 2021, 11:42

Watchdog timer reboots are -as I told before- caused by some code taking too long to run.
Typical use cases are waiting for a timeout (without calling delay()) or running an infinite loop (waiting for some condition which is not going to happen)

If the last scheduled call was in the 20msec loop, then it was part of the PLUGIN_FIFTY_PER_SECOND call.
In your screenshot there was nothing really sticking out, but of course it had not yet crashed :)

If your plugin is not using the 50/sec call, then it is likely it is from another active plugin.
Could it be that the other plugin does receive bogus data? For example, if the cable used is too long, the GPIO pins can switch from high to low or vice versa.
So we have to check the code of that plugin to see if it may run into some limbo state waiting forever in a while loop on data that will not arrive.

#31 Post by stefanru » 13 Feb 2021, 14:04

Hi TDer,

thanks again i got it. I really appreciate your help!

Yes, my plugin is not using fifty_per_second.
Also not the P20 Plugin.
I have only P20 Plugin and my Plugin enabled.

I also think perhaps it has to do with bad data coming to RX.
Because the SERIAL_IN method from P20 Plugin is using so much time.

But the data i receive from the P20 Plugin looks fine and works well, that's strange.

I have now also changed my code to initialize the web server the same way the P20 plugin does.
The code is now nearly identical between P020 and my plugin.
Of course I do not have SERIAL_IN because I use ESPSERIAL in software mode, so I do all the processing in a TEN_PER_SECOND loop.

As you described, something running too long is triggering the watchdog, but I do not really see anything on the timings page.
Perhaps it is strange behavior or wrong error handling when the connection is somehow lost?
But as said, I now have all this similar to the P20 code.

I am really happy with the results so far. It is working as intended. Data is transferred constantly to my smart home server.
I cannot see any CRC errors.
It is just annoying that the watchdog is triggered at intervals between 10 minutes and 1 h 30 min.

Here the new Version of my Plugin. Perhaps you can have a look?
https://github.com/StefanRu1/SoftwareSe ... _SSSRV.ino

Thanks again,
Stefan

#32 Post by stefanru » 13 Feb 2021, 16:56

Hi,

i now tested running without a client connected.
So essentially nothing is happening. No data transferred...

The strange thing is the reboots also occur.
This time i have
Reset Reason: Hardware Watchdog
Last Action before Reboot: Const Interval: TIMER_MQTT

There is no MQTT or something.
Can this be a Problem of the ESP? Or Power? Even though the log says Hardware Watchdog?

Thanks and best regards,
Stefan

#33 Post by TD-er » 13 Feb 2021, 17:39

Not sure.
I was thinking, it could still be related to the software serial, if it is triggered with false pulses it still generates interrupts and I am not entirely sure if it keeps triggering these interrupts per bit or maybe waits till a full byte is received.
Can you test with the plugins disabled to see if the ESP still reboots?

#34 Post by stefanru » 13 Feb 2021, 19:39

Sure i will disable the plugins and check if my ESP is stable or not.
I will post an update as soon as i know more.

Thanks,
Stefan

#35 Post by stefanru » 14 Feb 2021, 11:50

Hi,

i can confirm now that it is working with P20 Ser2Net alone.
With P132 SoftwareSerial alone it is rebooting. :-(

So i have to focus on P132 SoftwareSerial.
I am just adding more error handling for ESPSerial and try to set also a timeout.
What is an appropriate timeout for ESPSerial?

Let's see...

Thanks and best regards,
Stefan

#36 Post by TD-er » 14 Feb 2021, 13:05

Maybe you can also try to run your plugin only on HW serial, as I do suspect the SW serial to cause some issues with the WiFi.
A timeout in receiving data should be something like the longest expected message times 2.

For sending the data to a controller, you should set the timeout in the controller.
If you maintain your own WiFiClient object to create a network connection, set the timeout to that WiFiClient object right after creating it (before making a connection)

Typical timeout could be 3x - 5x the average connection time (depends on whether it is local or somewhere online)
Just try to keep it < 1000 msec.
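
These rules of thumb can be written down as two small helpers. This is a sketch only: the function names are hypothetical, 8N1 framing (10 bits per byte) is assumed for the serial side, and the cap of 999 msec is just one way to express "keep it < 1000 msec".

```cpp
#include <cstdint>

// Assumption: 8N1 framing, 10 bits on the wire per payload byte.
constexpr std::uint32_t kWireBitsPerByte = 10;

// Receive timeout in msec: twice the time the longest expected message
// needs on the wire, rounded up.
constexpr std::uint32_t receiveTimeoutMs(std::uint32_t longestMessageBytes,
                                         std::uint32_t baud) {
  return static_cast<std::uint32_t>(
      (2ULL * longestMessageBytes * kWireBitsPerByte * 1000ULL + baud - 1) / baud);
}

// Connection timeout in msec: `factor` (typically 3 to 5) times the average
// connection time, capped so that it stays below 1000 msec.
constexpr std::uint32_t connectTimeoutMs(std::uint32_t avgConnectMs,
                                         std::uint32_t factor) {
  return (avgConnectMs * factor < 1000u) ? avgConnectMs * factor : 999u;
}
```

For an 80-byte message at 9600 bps this gives a receive timeout of 167 msec. A local host that usually connects in about 50 msec would get a 200 msec connection timeout with a factor of 4. The resulting value is what you would then set on the WiFiClient object before connecting.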

#37 Post by stefanru » 14 Feb 2021, 13:11

Hi,

I have a ESP8266 and two sensors. So i think i can not use only HW serial. Am i right?
Is there a way to only have HW serial with 2 inputs? Other Board?
Do i miss something here?

I just started the P132 SoftwareSerial Coding with improved logging and timeouts in ESPSerial and Webserver to 200 ms.
I monitored the run with LogLevel Info.

But no errors nothing and suddenly a reboot after 16 minutes.
Reset Reason: Hardware Watchdog
Last Action before Reboot: Const Interval: TIMER_20MSEC

Do you have an Idea what might cause this? Or what i can try more?
The setup of P20 and P132 is the same: same input, same smart meter, same diode to read it in. P20 alone runs fine, P132 alone gets the Hardware Watchdog.

My Coding with improved logging and timeouts can be found here:
https://github.com/StefanRu1/SoftwareSe ... _SSSRV.ino

Since i do not see any error before Hardware Watchdog i must miss something.

Thanks and best regards,
Stefan

#38 Post by TD-er » 14 Feb 2021, 13:13

No I meant for testing purposes, only use your P132 using HW serial, so not using the other one right now.
Since you try to dive into your code, I want to avoid you spending a lot of time on it if the issue is related to SW serial.

#39 Post by stefanru » 14 Feb 2021, 13:15

Ah ok i got it.

So I switch my plugin to use ESPSerial with HW serial instead of SW serial and test whether it works fine.
Very good idea.

I configured it now as:
Plugin_132_SS = new ESPeasySerial(ESPEasySerialPort::serial0, 3, 1, false, ExtraTaskSettings.TaskDevicePluginConfigLong[2], false);
I hope this is correct; pins 3 and 1 seem to be HW serial?
I can connect to the port but I do not receive any data. Do you have an idea what I am doing wrong?

OK, got it now: I still had once per second in the main loop. With ten per second it is running.
The strange thing is that, as soon as the plugin is initialized, I get the payload on the console as an unknown command:

65633: Command unknown: EMH5wdrbeYybRi@w$rbeYybRUw8rbeYybRUwLrbeYybRUwrbeYybRUcvbbrcqc57vbbrcvY
65830: Command unknown: EMH5rbeYzbcwvbbrcw
65832: Command unknown: EMH5b
65941: Command unknown: rbeYzww`2EMHw`
66663: Command unknown: EMH5wdrbeYzbRi@w$rbeYzbRUw8rbeYzbRUwLrbeYzbRUwrbeYzbRUc{vbbrcqc1vbbrcvY
66861: Command unknown: EMH5rbeY{bc.vbbrcw
66863: Command unknown: EMH5b
66972: Command unknown: rbeY{ww`2EMHw`

What is going on? I do not get it.

Thanks again,
Stefan

#40 Post by TD-er » 14 Feb 2021, 14:07

Looks almost like "inverted" data.
I checked but before you also did not have the "inverted" boolean set, so unless you made some error when connecting it, I have no idea what's wrong.

#41 Post by stefanru » 14 Feb 2021, 15:20

Hi,

why do you think the data is inverted?
"EMH5" is correct; this is the name of the smart meter.
The data seems to be correct, but I do not understand why it is printed as "Command unknown" instead of being transferred via Webserver.
Where are these messages coming from? In software mode the data is just transferred.

In the normal Serial plugin I have to provide data bits, parity and stop bits. I do not see this in ESPSerial.
Could it have to do with this?

Nevertheless, I am happy that it is working. The reboots are not nice and I would like to find the root cause, but for my usage I can also live with the reboots.
I configured my smart home to reconnect if the connection is lost, and as long as the ESP always comes back to life properly, I do not have a big problem with the reboots,
except that it is not nice and I would like to know the root cause.

If you still get an idea of what I can test or do, I would be happy. But it is not urgent.

I have uploaded the latest code to git, also with a commented-out line for hardware serial with ESPSerial:
https://github.com/StefanRu1/SoftwareSe ... _SSSRV.ino

Thanks for all your help and have a nice weekend,
Stefan

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands
Contact:

Re: Reboot due to Const Interval: TIMER_100MSEC

#42 Post by TD-er » 14 Feb 2021, 23:53

Ah OK, I get it.
You probably have the "Enable serial" checked on the tools -> Advanced page.
Then it will try to parse all incoming data from serial as a command.

Did you also have this checked when using the previous setup?

stefanru
Normal user
Posts: 41
Joined: 29 Jul 2017, 00:48

Re: Reboot due to Const Interval: TIMER_100MSEC

#43 Post by stefanru » 15 Feb 2021, 00:01

Hi TD-er,

thanks for your reply.
Yes, I have it enabled.
I just read the documentation! Wow.
Sorry that I did not read this earlier...
Strange that it did not have any effect on my setup with P20 on RX/TX.
I did not have problems with the P20 plugin or ESPSerial in software mode with this advanced setting enabled.

I have disabled it now.
Do you think this could be the cause of my reboots?
Should I test again with ESPSerial in hardware mode?
Or just see how it behaves now?

Thanks again,
Stefan

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands
Contact:

Re: Reboot due to Const Interval: TIMER_100MSEC

#44 Post by TD-er » 15 Feb 2021, 09:32

Short answer: yes :)

I can imagine (I have to look at the code) that P020 may not have been impacted, since this plugin uses the PLUGIN_SERIAL_IN call.
If my memory serves me well, incoming data is first offered to PLUGIN_SERIAL_IN, and if that fails, it is fed through the command interpreter.

Maybe it could have caused reboots if you were still receiving the serial data on Serial0 while testing your plugin on SW serial.
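
The order described above (PLUGIN_SERIAL_IN first, command interpreter as fallback) can be sketched roughly as follows. This is only an illustration, not the actual ESPEasy code: `routeSerialLine`, `SerialInHandler` and `Routed` are hypothetical names, and it assumes that the command interpreter is only used when "Enable serial" is checked.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a plugin's PLUGIN_SERIAL_IN handling:
// returns true when the plugin claims the received data.
using SerialInHandler = std::function<bool(const std::string&)>;

enum class Routed { Plugin, CommandInterpreter, Dropped };

// Decide where one chunk of received serial data ends up.
Routed routeSerialLine(const std::string& line,
                       bool enableSerial,
                       const std::vector<SerialInHandler>& plugins) {
  // 1) Offer the data to the plugins first.
  for (const auto& handler : plugins) {
    assert(handler);  // an empty std::function must not be called
    if (handler(line)) {
      return Routed::Plugin;
    }
  }
  // 2) Nobody claimed it. With "Enable serial" checked (assumption),
  //    it is parsed as a command, so meter payload would be logged
  //    as "Command unknown".
  if (enableSerial) {
    return Routed::CommandInterpreter;
  }
  // 3) Otherwise the data is not interpreted at all.
  return Routed::Dropped;
}
```

With no plugin claiming the data and `enableSerial` set to true, a payload such as "EMH5b" falls through to the command interpreter, which would match the "Command unknown" lines in the log above.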

stefanru
Normal user
Posts: 41
Joined: 29 Jul 2017, 00:48

Re: Reboot due to Const Interval: TIMER_100MSEC

#45 Post by stefanru » 15 Feb 2021, 16:57

Thanks for the explanation.

I switched off "Enable serial" on the tools -> Advanced page.
It looked very promising: 2 h without a reboot.
But now I have the reboots again.

So when I have a bit of time, I will try the way you proposed: change ESPSerial to hardware and check whether the reboots also occur with ESPSerial in hardware mode.
I will post an update here when I have found the time to do this.

Thanks again for all your help!
Stefan

stefanru
Normal user
Posts: 41
Joined: 29 Jul 2017, 00:48

Re: Reboot due to Const Interval: TIMER_100MSEC

#46 Post by stefanru » 12 Mar 2021, 17:08

Hi TD-er,

I have not found the time to test ESPSerial in hardware mode.
But I have a new finding that I do not understand. Perhaps you have an idea.

Since my ESP hung once, I activated RULES and implemented a reboot rule to run once a day at night.
After doing this, I noticed that the ESP is no longer rebooting by itself. It has now been running for over 5 hours.
That's strange.

I have now removed my rule again but still have rules activated under Advanced.
I will monitor whether the behavior stays the same.

Does switching on rules under the Advanced tab change anything with regard to watchdog behavior?

Thanks and best regards,
Stefan

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands
Contact:

Re: Reboot due to Const Interval: TIMER_100MSEC

#47 Post by TD-er » 12 Mar 2021, 17:35

stefanru wrote: 12 Mar 2021, 17:08 [...]

Does switching on rules under the Advanced tab change anything with regard to watchdog behavior?
[...]
Yep
ESPEasy does generate events on several occasions.
If rules are not active, the events will not be generated, stored in a queue and processed.
So there is for sure a difference here.

What also makes a difference is that rules processing calls delay() every now and then, which keeps the watchdog timer from triggering.
And of course it will take some time to check the rules, which may also change behavior.

If a plugin gets stuck processing anything in the PLUGIN_TEN_PER_SECOND call, then it will crash with the reason given in the topic title.
But if things get delayed somewhat, then the plugin call could, for example, handle a few more bytes per call instead of waiting for new characters which may never arrive.
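
As a rough illustration of "handle what is there instead of waiting", a periodic call can drain only the bytes that are already buffered, up to a fixed budget, and then return. This is a minimal sketch and not taken from the plugin: `FakeSerial` and `drainBounded` are hypothetical names, and `FakeSerial` only mimics the `available()`/`read()` pair of a serial port.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for a serial port; it only mimics available()/read().
struct FakeSerial {
  std::deque<char> rx;
  std::size_t available() const { return rx.size(); }
  char read() {
    char c = rx.front();
    rx.pop_front();
    return c;
  }
};

// Move at most maxBytes of already buffered data into 'out', then return.
// The loop never waits for more characters, so the time spent per call
// stays bounded no matter what the sender does.
std::size_t drainBounded(FakeSerial& port, std::string& out, std::size_t maxBytes) {
  assert(maxBytes > 0);
  std::size_t moved = 0;
  while (moved < maxBytes && port.available() > 0) {
    out.push_back(port.read());
    ++moved;
  }
  return moved;
}
```

Called from something like the ten-per-second hook, each invocation forwards whatever has arrived so far and leaves the rest for the next call, so a missing character cannot hold up the loop.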

stefanru
Normal user
Posts: 41
Joined: 29 Jul 2017, 00:48

Re: Reboot due to Const Interval: TIMER_100MSEC

#48 Post by stefanru » 12 Mar 2021, 18:46

OK,

interesting.
I will monitor it further and write my findings here.
If I find a bit of time, I will also test SoftwareSerial with HardwareMode.

Thanks,
Stefan
