Oversampling on analog input hangs nodeMCU

Moderators: grovkillen, Stuntteam, TD-er

mackowiakp
Normal user
Posts: 527
Joined: 07 Jun 2018, 06:47
Location: Gdynia/Poland

Oversampling on analog input hangs nodeMCU

#1 Post by mackowiakp » 15 Aug 2019, 07:35

I do not know whether the conclusion in the topic title is correct, but it is the result of almost 2 months of observations and tests.
I use eight nodeMCU boards in a home automation system. Three of them use the ADC input. On all three I added a resistor so that the voltage at the ADC input - taking into account that the nodeMCU PCB contains a resistive voltage divider at the input - never exceeds 4.2V. About 2 months ago, one of these three devices began to hang. It happens rarely and irregularly: sometimes twice a day, and sometimes the device works properly for a whole week. Keep in mind that I reboot all devices daily from a rule. At first I thought it was directly related to WiFi, so I added a rule to reconnect WiFi every 30 minutes just for testing. This did not solve the problem: the connection was not re-established. What's more, the rules that control the relays and the time-dependent rules stop working as well. I cannot tell whether this is because the device hangs on its own or because it "loses" the time previously synchronized with the NTP server. I looked for differences in device configuration for a long time (and, as you know, the most obvious thing is the hardest to find). The software versions are the same everywhere; I try to keep them up to date.
Well, the device that hangs had ADC oversampling enabled. The other two devices using the ADC did not, and they do not hang. So I turned off oversampling on this "fatal" device and now it doesn't hang.
In fact, I don't need oversampling for anything. But I think it might be a problem for others. Is it an issue or not? If so, I will move the topic to github.

ThomasB
Normal user
Posts: 1064
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Oversampling on analog input hangs nodeMCU

#2 Post by ThomasB » 15 Aug 2019, 23:27

mackowiakp wrote: 15 Aug 2019, 07:35 So I turned off oversampling on this "fatal" device and now it doesn't hang.
In fact, I don't need oversampling for anything. But I think it might be a problem for others. Is it an issue or not? If so, I will move the topic to github.
This sounds like it should be reported as a github issue. Especially if you can duplicate the bug in a basic installation (minimal plugins/controllers).

I was curious about the internal ADC plugin's oversampling function (it's not explained in the wiki or readthedocs). A quick review of P002_ADC code showed that the oversampling mode is basic data averaging. The plug-in's Interval value is used both as the sample period and read interval. The oversampling mode causes faster ADC reads (10Hz) and requires float math.

Although I didn't see anything obvious that explained your hang issue, I did find an unrelated but significant bug. If Oversampling is enabled and the Interval is greater than 109 minutes, the ADC value will be corrupted. At 10 samples per second, the 16-bit sampling counter (Plugin_002_OversamplingCount) overflows after 65536 samples (about 109 minutes) and wraps around to 0, which breaks the data averaging math.
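
To illustrate the wrap-around, here is a minimal sketch rather than the plugin's actual code. It assumes a 32-bit sum that does not wrap, a 16-bit counter, a fixed 10 Hz sample rate and a constant ADC reading; the names wrappedCount, averageWithWrappedCount and maxSafeIntervalSeconds are made up for this example.

```cpp
#include <cstdint>

// Hypothetical model: value of a 16-bit counter after `samples` increments from 0.
constexpr uint16_t wrappedCount(uint32_t samples) {
  return static_cast<uint16_t>(samples);  // conversion keeps only the low 16 bits
}

// Average computed from an unwrapped 32-bit sum and the wrapped 16-bit count.
// Returns 0 when the wrapped count is exactly 0, to avoid dividing by zero.
constexpr float averageWithWrappedCount(uint32_t samples, uint16_t adcValue) {
  return wrappedCount(samples) == 0
             ? 0.0f
             : static_cast<float>(samples * adcValue) / wrappedCount(samples);
}

// Longest whole-second interval that still fits in the counter at 10 samples/s.
constexpr uint32_t maxSafeIntervalSeconds() { return UINT16_MAX / 10; }
```

Under these assumptions, a constant reading of 40 averages to 40 after 600 samples, but after 66050 samples (6605 s at 10 Hz) the counter holds only 514 and the quotient comes out at roughly 5140. This idealized model is not meant to reproduce the exact figure reported in the test, only the mechanism.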

I did a test to confirm this. Setup: Oversampling enabled, Interval set to 6605 (~110 minutes), data value formula not used. The expected ADC reading was 40, but 27754 was reported.
Log output shows:

Code:

112674760: ADC  : Analog value: 27754 = 27754.199 (97 samples)
112674764: EVENT: ADCINTERNAL#Analog=27754.20
Screenshot of device page:
adc_bug.jpg

If you submit your hang problem to ESPEasy github then it would be helpful to mention this bug too.

- Thomas

mackowiakp
Normal user
Posts: 527
Joined: 07 Jun 2018, 06:47
Location: Gdynia/Poland

Re: Oversampling on analog input hangs nodeMCU

#3 Post by mackowiakp » 16 Aug 2019, 05:32

ThomasB wrote: 15 Aug 2019, 23:27 A quick review of P002_ADC code showed that the oversampling mode is basic data averaging
I also looked at the source code and came to the same conclusions.
The error you are showing seems obvious. After all, the ADC is 10 bit, so the highest value that can be read from it is 1024. It is therefore not possible for any average, however it is calculated, to be 27754.199. I don't know whether the nodeMCU "hanging" and the error you detected have something in common, although it seems possible. I will wait for the moderators' opinion because I do not want to clutter github unnecessarily.

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#4 Post by TD-er » 17 Aug 2019, 21:41

On ESP32 this bug is even worse, since that one has 12 bit ADC.
I am using the ADC + oversampling in the unit in my car to keep an eye on the car battery voltage.
It has been running like this since January.
But I do read the value every minute.
Oh and a crash can occur if it is dividing by zero.
But I have not yet looked at the code, so no idea if that may happen here.

ThomasB
Normal user
Posts: 1064
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Oversampling on analog input hangs nodeMCU

#5 Post by ThomasB » 17 Aug 2019, 23:59

TD-er wrote: 17 Aug 2019, 21:41 On ESP32 this bug is even worse, since that one has 12 bit ADC.
Fortunately the sampled 12-bit data is accumulated in an int32, so major disaster is averted.
TD-er wrote: 17 Aug 2019, 21:41 Oh and a crash can occur if it is dividing by zero. But I have not yet looked at the code, so no idea if that may happen here.
I also saw that a fatal division by zero was possible. But it would only occur if oversampling is enabled and the user has chosen a read interval close to 6554 secs (or a multiple of that period). I admit it is a fairly unusual configuration, but fixing it is preferable to ignoring it.

But at this point it's not known whether these observed bugs are the cause of the OP's reported hang that occurs with oversampling enabled. For example, it could be a bug outside the ADC plugin that does not play nice with the oversampling function. Some generous developer is needed to volunteer their precious free time to debug this. Wink wink.

- Thomas

mackowiakp
Normal user
Posts: 527
Joined: 07 Jun 2018, 06:47
Location: Gdynia/Poland

Re: Oversampling on analog input hangs nodeMCU

#6 Post by mackowiakp » 18 Aug 2019, 09:03

"Dividing by zero" - sounds pretty nice... :shock:

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#7 Post by TD-er » 18 Aug 2019, 12:25

mackowiakp wrote: 18 Aug 2019, 09:03 "Dividing by zero" - sounds pretty nice... :shock:
Yep, that's a really nice feature for really greedy programmers that don't want to share :)

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#8 Post by TD-er » 18 Aug 2019, 12:56

I just took a quick look.

Code:

float P002_getOutputValue(struct EventStruct *event, int16_t &raw_value) {
  float float_value = 0.0;
  if (PCONFIG(0) && Plugin_002_OversamplingCount > 0) {
    float sum = static_cast<float>(Plugin_002_OversamplingValue);
    float count = static_cast<float>(Plugin_002_OversamplingCount);
    if (Plugin_002_OversamplingCount >= 3) {
      sum -= Plugin_002_OversamplingMaxVal;
      sum -= Plugin_002_OversamplingMinVal;
      count -= 2;
    }
    float_value = sum / count;
    raw_value = static_cast<int16_t>(float_value);
  } else {
    raw_value = P002_performRead(event);
    float_value = static_cast<float>(raw_value);
  }
  return P002_applyCalibration(event, float_value);
}
I don't see yet how it can happen to divide by zero.
Either Plugin_002_OversamplingCount is 1 or 2, and then it simply divides by that count, or it is >= 3, and then it divides by count - 2, which is 1 or more.
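
The branch structure can be checked with a small sketch. This is a simplified model of the function quoted above, not the plugin code itself: effectiveDivisor and trimmedMean are hypothetical names, the calibration step is left out, and a count of 0 returns 0 here whereas the quoted function falls back to a direct ADC read.

```cpp
#include <cstdint>

// Divisor used for a given sample count: counts of 3 or more drop the
// min and max sample, smaller counts are used as they are.
constexpr uint16_t effectiveDivisor(uint16_t count) {
  return count >= 3 ? static_cast<uint16_t>(count - 2) : count;
}

// Mean of the samples, with min and max removed once there are at least 3.
// Returns 0 for an empty accumulator (simplification, see the note above).
constexpr float trimmedMean(uint32_t sum, uint16_t minVal, uint16_t maxVal,
                            uint16_t count) {
  return count == 0
             ? 0.0f
             : count >= 3
                   ? static_cast<float>(sum - minVal - maxVal) / effectiveDivisor(count)
                   : static_cast<float>(sum) / effectiveDivisor(count);
}
```

For any count from 1 to 65535 the divisor in this model is at least 1, so a zero divisor would need a different code path or a counter that changes between the check and the division.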

If no values have been read, the min value is still set to the maximum possible ADC output value, while the sum is still 0.
In that situation, subtracting the max and min values may not be the best option, since it would push the raw value below 0.
Well, only mathematically speaking...
The sum, count, min and max values are unsigned ints (signed may be better here?), so I really have to get another coffee to figure out what will be added or subtracted here :) (can you tell this line was edited a few times while writing this reply?)
As far as I can see now, it should still work, since the count is 0 and the averaging branch is then skipped.
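
The difference between doing such a subtraction on unsigned integers and on floats can be sketched as follows. In the function quoted above the subtraction is applied to the float copy named sum, so the second variant is the one that matches it. The helper names are hypothetical, and 1023 is only an assumed initial min value for a 10-bit ADC.

```cpp
#include <cstdint>

// Subtraction performed on unsigned 32-bit integers wraps modulo 2^32.
constexpr uint32_t subtractUnsigned(uint32_t sum, uint32_t value) {
  return sum - value;
}

// Subtraction performed after converting both operands to float can go negative.
constexpr float subtractAsFloat(uint32_t sum, uint32_t value) {
  return static_cast<float>(sum) - static_cast<float>(value);
}
```

With an empty sum of 0 and a min value of 1023, the unsigned variant yields 4294966273 while the float variant yields -1023.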

TL;DR:
I don't yet see where it could go wrong.

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#9 Post by TD-er » 18 Aug 2019, 13:08

There is one other thing that may be happening here.
If oversampling is enabled, it takes a sample 10 times per second, as ThomasB also observed.
The ADC is also used internally by the WiFi to perform RF calibration.
I'm not sure how often this happens, but I can imagine it will happen when the WiFi does a reconnect.
Maybe that's causing some undesired behavior here?
We could wrap the analog read function in some kind of check of the current WiFi connection state and skip the ADC conversion while there is WiFi activity going on (e.g. during the connection phase).

ThomasB
Normal user
Posts: 1064
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Oversampling on analog input hangs nodeMCU

#10 Post by ThomasB » 18 Aug 2019, 17:48

TD-er wrote: 18 Aug 2019, 12:56 I don't see yet how it can happen to divide by zero.
++Plugin_002_OversamplingCount in PLUGIN_TEN_PER_SECOND can wrap past 0xffff and restart at zero before a read occurs. As noted, this breaks the ADC value. An unlikely division-by-zero scenario might occur if the wrapped sample counter is on 0x0002 before the statement float_value = sum / count is executed (count will have a 0 divisor). This is a very unlikely condition and certainly not the cause of the OP's hang.
TD-er wrote: 18 Aug 2019, 13:08 We could wrap the analog read function in some kind of check to see what WiFi connection state is active and don't perform ADC conversion when there is WiFi activity going on (e.g. during connection phase)
I think that would be good for a temporary test to verify the source of the problem. But as a long term fix, this may cause problems in some installations that are prone to WiFi connectivity issues. That is to say, if the ADC value is something critical, the ADC read could be delayed too long to safely control the locally attached hardware.

- Thomas

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#11 Post by TD-er » 19 Aug 2019, 20:11

ThomasB wrote: 18 Aug 2019, 17:48 [...]
I think that would be good for a temporary test to verify the source of the problem. But as a long term fix, this may cause problems in some installations that are prone to WiFi connectivity issues. That is to say, if the ADC value is something critical, the ADC read could be delayed too long to safely control the locally attached hardware.
True, but if the alternative is a hanging system...

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#12 Post by TD-er » 26 Aug 2019, 12:15


mackowiakp
Normal user
Posts: 527
Joined: 07 Jun 2018, 06:47
Location: Gdynia/Poland

Re: Oversampling on analog input hangs nodeMCU

#13 Post by mackowiakp » 26 Aug 2019, 12:31

THX !

TD-er
Core team member
Posts: 8643
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Oversampling on analog input hangs nodeMCU

#14 Post by TD-er » 26 Aug 2019, 12:52

Just an idea: what if this plugin skips reading the ADC only while connecting to WiFi (during RF calibration), and only when running in oversampling mode?
Then timing is less likely to be an issue (oversampling is done over a longer time); it becomes more a matter of stable values.
A value read during WiFi connection may also deviate from the other values, so skipping it will also improve the stability of the read value.

Since we're using event-based WiFi, it is known when the WiFi reconnect phase is active, so the gap may only last for a period of 1 - 3 seconds.
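
A hardware-independent sketch of that idea might look like the following. It is not ESPEasy code: Accumulator, shouldReadAdc and addSampleIfAllowed are hypothetical names, and the wifiConnecting flag is assumed to be maintained elsewhere from the WiFi events.

```cpp
#include <cstdint>

// Hypothetical accumulator state; the real plugin keeps its own variables.
struct Accumulator {
  uint32_t sum;
  uint16_t count;
};

// Skip the ADC only when both conditions hold: oversampling mode is active
// and a WiFi (re)connect is in progress.
constexpr bool shouldReadAdc(bool oversampling, bool wifiConnecting) {
  return !(oversampling && wifiConnecting);
}

// Oversampling path: returns the updated state, or the unchanged state
// when the sample is skipped because WiFi is connecting.
constexpr Accumulator addSampleIfAllowed(Accumulator acc, uint16_t sample,
                                         bool wifiConnecting) {
  return shouldReadAdc(true, wifiConnecting)
             ? Accumulator{acc.sum + sample, static_cast<uint16_t>(acc.count + 1)}
             : acc;
}
```

Because skipped samples change neither the sum nor the count, the average is simply taken over fewer samples for that interval. A plain read without oversampling is never blocked in this sketch.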

mackowiakp
Normal user
Posts: 527
Joined: 07 Jun 2018, 06:47
Location: Gdynia/Poland

Re: Oversampling on analog input hangs nodeMCU

#15 Post by mackowiakp » 26 Aug 2019, 13:06

In my (private) opinion, the situation is clear. The processor in the nodeMCU can only do what it can. If one or two ADC measurements are lost along the way, nothing bad will happen. As one of the Polish ministers said, "Sorry, that's the climate we have..."
