Cannot update any of my modules past mega-20190215


jobst
Normal user
Posts: 22
Joined: 19 Mar 2018, 12:05

Cannot update any of my modules past mega-20190215

#1 Post by jobst » 01 Mar 2020, 02:42

No matter what I do I cannot get past the mega-20190215 on all of my SONOFF (basic) modules (resoldered 4MB).

When I upgrade the SONOFFs to, for example, mega-20190630, using either OTA or serial flashing (USB), I can see the modules starting, because I have a rule

Code: Select all

on System#Boot do
 Event lamp_on
 delay 1000ms
 Event lamp_off
 delay 1000ms
 Event lamp_on
 delay 1000ms
 Event lamp_off
endon
so the LED flashes as requested, but then they are totally unresponsive.

Also the DHCP server on my network has the correct entry:

Code: Select all

Mar  1 12:12:14 yorkstreet dhcpd: DHCPDISCOVER from 5c:cf:7f:7c:71:d4 via eth_0
Mar  1 12:12:14 yorkstreet dhcpd: DHCPOFFER on 192.168.200.156 to 5c:cf:7f:7c:71:d4 via eth_0
Mar  1 12:12:14 yorkstreet dhcpd: DHCPREQUEST for 192.168.200.156 (192.168.200.1) from 5c:cf:7f:7c:71:d4 via eth_0
Mar  1 12:12:14 yorkstreet dhcpd: DHCPACK on 192.168.200.156 to 5c:cf:7f:7c:71:d4 via eth_0
Sometimes I get responses like

Code: Select all

64 bytes from 192.168.200.156: icmp_seq=57 ttl=255 time=1216 ms
64 bytes from 192.168.200.156: icmp_seq=66 ttl=255 time=3433 ms
64 bytes from 192.168.200.156: icmp_seq=71 ttl=255 time=194 ms
64 bytes from 192.168.200.156: icmp_seq=78 ttl=255 time=434 ms
64 bytes from 192.168.200.156: icmp_seq=85 ttl=255 time=213 ms
64 bytes from 192.168.200.156: icmp_seq=87 ttl=255 time=7.61 ms
64 bytes from 192.168.200.156: icmp_seq=96 ttl=255 time=79.4 ms
64 bytes from 192.168.200.156: icmp_seq=104 ttl=255 time=1412 ms
64 bytes from 192.168.200.156: icmp_seq=121 ttl=255 time=3.34 ms
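
The gaps in icmp_seq above can be turned into a rough loss figure. This is a minimal sketch, assuming the pasted lines are every reply received between the first and last sequence number shown; the helper name `summarize_ping` is hypothetical and not part of any tool mentioned in this thread.

```python
import re

# Matches the reply lines printed by ping, e.g. "icmp_seq=57 ttl=255 time=1216 ms".
PING_LINE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")

def summarize_ping(lines):
    """Return (received, expected, loss_percent, max_rtt_ms) for a ping excerpt."""
    replies = [(int(m.group(1)), float(m.group(2)))
               for m in map(PING_LINE.search, lines) if m]
    seqs = [seq for seq, _ in replies]
    expected = max(seqs) - min(seqs) + 1      # sequence numbers spanned by the excerpt
    received = len(seqs)
    loss_percent = 100.0 * (expected - received) / expected
    max_rtt_ms = max(rtt for _, rtt in replies)
    return received, expected, loss_percent, max_rtt_ms

sample = [
    "64 bytes from 192.168.200.156: icmp_seq=57 ttl=255 time=1216 ms",
    "64 bytes from 192.168.200.156: icmp_seq=66 ttl=255 time=3433 ms",
    "64 bytes from 192.168.200.156: icmp_seq=71 ttl=255 time=194 ms",
    "64 bytes from 192.168.200.156: icmp_seq=78 ttl=255 time=434 ms",
    "64 bytes from 192.168.200.156: icmp_seq=85 ttl=255 time=213 ms",
    "64 bytes from 192.168.200.156: icmp_seq=87 ttl=255 time=7.61 ms",
    "64 bytes from 192.168.200.156: icmp_seq=96 ttl=255 time=79.4 ms",
    "64 bytes from 192.168.200.156: icmp_seq=104 ttl=255 time=1412 ms",
    "64 bytes from 192.168.200.156: icmp_seq=121 ttl=255 time=3.34 ms",
]

received, expected, loss_percent, max_rtt_ms = summarize_ping(sample)
print(f"{received}/{expected} replies, {loss_percent:.1f}% lost, worst RTT {max_rtt_ms} ms")
```

Under that assumption only 9 of 65 requests were answered, which fits the unstable link described in the rest of the post.
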
The following rule also still responds (the LED turns on/off, and so does the output):

Code: Select all

on black_button#Switch do
 if [black_button#Switch]=0
  gpio,12,0
  gpio,13,1
 else
  gpio,12,1
  gpio,13,0
 endif
endon
Sometimes even this works.
I turn the lamp on using the button then issue this

Code: Select all

wget -q http://192.168.200.156/control?cmd=event,lamp_off
and it actually turns off.
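
For scripted testing, the same request can be sent from Python. This is only a sketch: the `/control?cmd=event,...` URL is taken from the wget line above, while the helper names `event_url` and `send_event` and the 5 second timeout are assumptions for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen

def event_url(host, event):
    """Build the ESPEasy HTTP command URL that triggers a rules event."""
    # Keep the comma unescaped so the URL matches the wget example.
    return f"http://{host}/control?cmd=" + quote(f"event,{event}", safe=",")

def send_event(host, event, timeout=5.0):
    """Send the event; raises an exception on timeout or connection failure."""
    with urlopen(event_url(host, event), timeout=timeout) as resp:
        return resp.status

print(event_url("192.168.200.156", "lamp_off"))
```

Calling `send_event("192.168.200.156", "lamp_off")` in a loop and counting the exceptions would give a rough success rate for a given firmware build.
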

Loading the webpage is pointless.

It really seems there is a problem with the WIFI ALTHOUGH the SONOFF module I am testing is lying on my table and the WIFI router is about 2 meters away. Looking at the logs of the WIFI Router (Nighthawk R7000) there is a constant connection/disconnection going on ... not stable at all.
Longest is about

I am not sure what to do.
There seems to be nothing on the forum matching my problem (well, that of course depends on whether I am searching correctly).

Any ideas?

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#2 Post by ThomasB » 01 Mar 2020, 04:36

Flash memory may have been corrupted. I suggest starting over using the latest Mega release, as follows:
1. Back up the existing settings and rule files.
2. Using the ESP.Easy.Flasher.exe flash utility, serial flash with blank_4MB.bin.
3. Serial flash with ESP_Easy_mega-20200222_normal_ESP8266_4M1M.bin (this is the latest release on github). Enable the Force-DOUT setting before flashing.
4. Reboot, connect via AP, and reconfigure with your old settings.
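
For readers without the Windows flasher, steps 2 and 3 could be approximated with esptool. This is a hedged sketch, not the procedure from the post: it swaps in esptool's `erase_flash` for flashing blank_4MB.bin, and the serial port, the helper names, and the assumption that `esptool.py` is on the PATH are all illustrative.

```python
import subprocess

def flash_commands(port, firmware, flash_mode="dout"):
    """Return the esptool command lines for a full erase followed by a DOUT flash."""
    base = ["esptool.py", "--port", port]
    return [
        base + ["erase_flash"],                                        # stands in for blank_4MB.bin
        base + ["write_flash", "--flash_mode", flash_mode, "0x0", firmware],
    ]

def run(commands):
    """Execute each command in order and stop on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

# The port name is an assumption; adjust it for your serial adapter.
for cmd in flash_commands("/dev/ttyUSB0", "ESP_Easy_mega-20200222_normal_ESP8266_4M1M.bin"):
    print(" ".join(cmd))
```

Nothing is flashed until `run(...)` is called, so the printed commands can be checked first.
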

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#3 Post by TD-er » 01 Mar 2020, 14:03

It does indeed seem like there is something corrupted on the flash.
Maybe your flash modules can only handle a different flash write speed?

Problem is that the OTA may use different settings to write the sketch compared to the normal writes of our sketch.

There is a command with the very confusing name "erase" (will change it later to wifierase or erasewifi) that you could test first.
What this does is turn on WiFi persistent settings, disconnect, disable persistent settings, and reconnect.
So it is supposed to clear any stored wifi settings outside our reach.
It is possible the current WiFi calibration data is causing instability with the current core version.

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#4 Post by ThomasB » 02 Mar 2020, 22:31

jobst wrote:
No matter what I do I cannot get past the mega-20190215 on all of my SONOFF (basic) modules (resoldered 4MB).
@TD-er, I decided to try the OP's configuration with ESP_Easy_mega-20200222_normal_ESP8266_4M1M.bin. And I'm having WiFi problems too.

Background info:
I used a Sonoff Basic (original version with external Flash chip) that was working correctly with the latest ESPEasy release. Then replaced the original 1MB chip with 4MB (Winbond 25Q32BVSIG, Flash Chip Vendor: 0xEF, Device: 0x4016). This chip was from the same lot that has successfully upgraded the memory on a couple Sonoff TH10 modules.

The 4MB chip boots reliably (per serial log), but browser access mostly times out. Occasionally it shows all or part of a web page. After many failed attempts I was able to load my rules and settings files via the browser, and to confirm that NTP and Mail Notifications are working.

It's using static IP. The router is 20 feet away and RSSI is -57dBm. There will be periods of reasonable browser response. But followed by long periods of browser access timeouts.

Here's a serial boot log example (serial log disabled):

Code: Select all

>reboot
97382 : Info  : Command: reboot

 ets Jan  8 2013,rst cause:1, boot mode:(3,7)

load 0x4010f000, len 1392, room 16
tail 0
chksum 0xd0
csum 0xd0
v3d128e5c
~ld
¬U90 : Info  :

INIT : Booting version: mega-20200222 (ESP82xx Core 2_6_3, NONOS SDK 2.2.2-dev(38a443e), LWIP: 2.1.2 PUYA support)
91 : Info  : INIT : Free RAM:30368
92 : Info  : INIT : Warm boot #5 Last Task: Background Task - Restart Reason: Software/System restart
94 : Info  : FS   : Mounting...
119 : Info  : FS   : Mount successful, used 75802 bytes of 957314
576 : Info  : CRC  : program checksum       ...OK
586 : Info  : CRC  : SecuritySettings CRC   ...OK
693 : Info  : INIT : Free RAM:27208
694 : Info  : INIT : I2C
694 : Info  : INIT : SPI not enabled
955 : Info  : INFO : Plugins: 79 [Normal] [Testing] (ESP82xx Core 2_6_3, NONOS SDK 2.2.2-dev(38a443e), LWIP: 2.1.2 PUYA support)
1060 : Info  : WIFI : Set WiFi to STA
1163 : Info  : IP   : Static IP : 192.168.1.254 GW: 192.168.1.1 SN: 255.255.255.0 DNS: 8.8.8.8
1164 : Info  : WIFI : Connecting MY_ROUTER attempt #0
2386 : Info  : WIFI : Connected! AP: MY_ROUTER (C4:E9:84:XX:XX:XX) Ch: 1 Duration: 1034 ms
2388 : Info  : WIFI : Static IP: 192.168.1.254 (ESPEZ-TEST-0) GW: 192.168.1.1 SN: 255.255.255.0   duration: 188 ms
2392 : Info  : Webserver: start
2395 : Info  : firstLoopConnectionsEstablished
2500 : Info  : WD   : Uptime 0 ConnectFailures 0 FreeMem 18528 WiFiStatus 3
32500 : Info  : WD   : Uptime 1 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
58019 : Info  : Webserver: stop
58120 : Info  : WIFI : Disconnected! Reason: '(200) Beacon timeout' Connected for 55 s
58148 : Info  : IP   : Static IP : 192.168.1.254 GW: 192.168.1.1 SN: 255.255.255.0 DNS: 8.8.8.8
58149 : Info  : WIFI : Connecting MY_ROUTER attempt #1
59360 : Info  : WIFI : Connected! AP: MY_ROUTER (C4:E9:84:43:52:19) Ch: 1 Duration: 1033 ms
59362 : Info  : WIFI : Static IP: 192.168.1.254 (ESPEZ-TEST-0) GW: 192.168.1.1 SN: 255.255.255.0   duration: 179 ms
59363 : Info  : Webserver: start
62500 : Info  : WD   : Uptime 1 ConnectFailures 0 FreeMem 16912 WiFiStatus 3
92500 : Info  : WD   : Uptime 2 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
122500 : Info  : WD   : Uptime 2 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
152500 : Info  : WD   : Uptime 3 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
182500 : Info  : WD   : Uptime 3 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
212500 : Info  : WD   : Uptime 4 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
242500 : Info  : WD   : Uptime 4 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
272500 : Info  : WD   : Uptime 5 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
302500 : Info  : WD   : Uptime 5 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
332500 : Info  : WD   : Uptime 6 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
362500 : Info  : WD   : Uptime 6 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
392500 : Info  : WD   : Uptime 7 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
422500 : Info  : WD   : Uptime 7 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
452500 : Info  : WD   : Uptime 8 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
482500 : Info  : WD   : Uptime 8 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
512500 : Info  : WD   : Uptime 9 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
542500 : Info  : WD   : Uptime 9 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
572500 : Info  : WD   : Uptime 10 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
602500 : Info  : WD   : Uptime 10 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
632500 : Info  : WD   : Uptime 11 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
662500 : Info  : WD   : Uptime 11 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
692500 : Info  : WD   : Uptime 12 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
722500 : Info  : WD   : Uptime 12 ConnectFailures 0 FreeMem 16720 WiFiStatus 3
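
Longer captures are easier to judge when the disconnect events are extracted automatically. The sketch below assumes the log line layout shown above ("millis : Info : WIFI : Disconnected! Reason: ..."); the function name `wifi_disconnects` is hypothetical.

```python
import re

# Layout assumed from the log above:
# 58120 : Info  : WIFI : Disconnected! Reason: '(200) Beacon timeout' Connected for 55 s
DISCONNECT = re.compile(
    r"^(\d+) : Info\s+: WIFI : Disconnected! Reason: '\((\d+)\) ([^']+)' Connected for (\d+) s"
)

def wifi_disconnects(log_lines):
    """Return one dict per WiFi disconnect found in an ESPEasy serial log."""
    events = []
    for line in log_lines:
        m = DISCONNECT.match(line.strip())
        if m:
            millis, code, reason, connected_s = m.groups()
            events.append({
                "millis": int(millis),            # uptime timestamp of the event
                "code": int(code),                # numeric disconnect reason
                "reason": reason,                 # text of the reason
                "connected_s": int(connected_s),  # how long the link had been up
            })
    return events

sample = [
    "58019 : Info  : Webserver: stop",
    "58120 : Info  : WIFI : Disconnected! Reason: '(200) Beacon timeout' Connected for 55 s",
    "59360 : Info  : WIFI : Connected! AP: MY_ROUTER (C4:E9:84:XX:XX:XX) Ch: 1 Duration: 1033 ms",
]
print(wifi_disconnects(sample))
```

In the boot log above this finds a single beacon timeout after 55 s, followed by roughly 11 minutes without a logged disconnect, even though browser access was still timing out.
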
With serial log set to Debug More I noticed MQTT is disconnecting and connecting, along with bcn_timout,ap_probe_send_start. See this:

Code: Select all

426377 : Info  : Subscribed to: ESPEZ_TEST/#
426381 : Info  : EVENT: MQTT#Disconnected
426481 : Info  : EVENT: MQTT#Connected
bcn_timout,ap_probe_send_start
bcn_timout,ap_probe_send_start
446930 : Error : MQTT : Connection lost, state: Disconnected
446984 : Info  : MQTT : Connected to broker with client ID: ESPClient_84:F3:EB:XX:XX:XX
446988 : Info  : Subscribed to: ESPEZ_TEST/#
446994 : Info  : EVENT: MQTT#Disconnected
447094 : Info  : EVENT: MQTT#Connected
450920 : Info  : Dummy: value 1: -1.00
450927 : Info  : EVENT: STATEVAR#rlyval=-1.00
450993 : Info  : EVENT: STATEVAR#bootmode=1.00
451093 : Info  : EVENT: STATEVAR#=0.00
451194 : Info  : EVENT: STATEVAR#=0.00
453936 : Info  : WD   : Uptime 8 ConnectFailures 0 FreeMem 13496 WiFiStatus 3
bcn_timout,ap_probe_send_start
bcn_timout,ap_probe_send_start
467516 : Error : MQTT : Connection lost, state: Disconnected
467567 : Info  : MQTT : Connected to broker with client ID: ESPClient_84:F3:EB:XX:XX:XX
467570 : Info  : Subscribed to: ESPEZ_TEST/#
467577 : Info  : EVENT: MQTT#Disconnected
467678 : Info  : EVENT: MQTT#Connected
468419 : Info  : EVENT: Clock#Time=Mon,16:43
bcn_timout,ap_probe_send_start
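
The spacing of the MQTT drops can be read off the millisecond timestamps. A minimal sketch, assuming the "Connection lost" line format shown above; `mqtt_loss_intervals` is a hypothetical helper name.

```python
import re

# Matches e.g. "446930 : Error : MQTT : Connection lost, state: Disconnected"
LOST = re.compile(r"^(\d+) : Error : MQTT : Connection lost")

def mqtt_loss_intervals(log_lines):
    """Return the gaps in milliseconds between consecutive MQTT connection losses."""
    stamps = [int(m.group(1))
              for m in (LOST.match(line.strip()) for line in log_lines) if m]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

sample = [
    "446930 : Error : MQTT : Connection lost, state: Disconnected",
    "446984 : Info  : MQTT : Connected to broker with client ID: ESPClient_84:F3:EB:XX:XX:XX",
    "bcn_timout,ap_probe_send_start",
    "467516 : Error : MQTT : Connection lost, state: Disconnected",
]
print(mqtt_loss_intervals(sample))
```

For the excerpt above the two losses are about 20.6 seconds apart, and each one follows a pair of bcn_timout messages.
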
I added 100uF across Vcc as a precaution against voltage sags during WiFi startup. I also decided to change the 4MB chip with another new one. The issues remain the same. I also tried an older 2019 mega release and things did not improve.
TD-er wrote:
There is a command with the very confusing name "erase" (will change it later to wifierase or erasewifi) that you could test first.
What this does is turn on WiFi persistent settings, disconnect, disable persistent settings, and reconnect.
So it is supposed to clear any stored wifi settings outside our reach.
It is possible the current WiFi calibration data is causing instability with the current core version.
Interesting. Perhaps the poor WiFi performance is a RF Calibration problem. Since I started with a blank memory chip I would expect the Calibration data to be recalculated. But I also tried the "erase" command without any improvement to WiFi performance. Maybe there's a problem with calibration?

Lastly, the ESP8266EX chip is date coded 05-2018. Google didn't find any reported silicon bug Errata for the ESP8266 family.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#5 Post by TD-er » 03 Mar 2020, 08:50

Hmm, the issues you describe are quite similar to what I'm seeing on my test node ESP32.
So that would suggest there is something not right in the way the ESPEasy code does interact with the WiFi.

Could it be that the flash chip reports QIO compatibility, while the ESP is only routed on the board using a slower connection? (For reference on the flash I/O modes: https://github.com/espressif/esptool/wi ... lash-Modes )

So it might be that we are now doing so much more in between delay calls that the WiFi stack is not given enough time to handle its connection.
If so, then that may be explained by quality differences in the internal clock.
For example, if the clock is spot on, then we have probably enough time to respond to requests, but if the clock is a bit too slow, we may arrive late and thus have our answers (or requests) missed.

You could try to enable NTP once to set the time and disable it then for about an hour.
As soon as you enable it again, it will also report the clock wander in msec/sec.


Another thing I noticed recently is that the initial NTP request may take like forever on some nodes.
Not sure why, but I've seen it happen on ESP32 as well as ESP8266.
On my ESP32 it took about 200 seconds yesterday for the NTP to get an answer and only after it got an answer the node responded to pings also.
There was no WiFi reconnect in between.

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#6 Post by TD-er » 03 Mar 2020, 14:14

On my problematic ESP32 node, I had NTP off for a while to get some idea of the clock drift.
It doesn't look that much, so I guess that's not likely to be the issue here.

Code: Select all

14014926: NTP : NTP replied: delay 31 mSec Accuracy increased by 0.167 seconds
14014931: Time set to 1583240874.167 Time adjusted by 146.76 msec. Wander: 0.04 msec/second
14014935: Local time: 2020-03-03 14:07:54
That's roughly an hour of "stand alone" run time.
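
As a cross-check of the numbers in that log: if the reported wander is simply the time adjustment divided by the seconds since the previous sync (an assumption; the exact ESPEasy formula is not shown in this thread), the stand-alone period and the drift in ppm follow directly. The function names are hypothetical.

```python
def elapsed_seconds(adjust_ms, wander_ms_per_s):
    """Seconds since the previous sync, assuming wander = adjustment / elapsed time."""
    return adjust_ms / wander_ms_per_s

def wander_ppm(wander_ms_per_s):
    """Convert msec/second to parts per million (1 msec/sec equals 1000 ppm)."""
    return wander_ms_per_s * 1000.0

seconds = elapsed_seconds(146.76, 0.04)   # values from the log above
print(f"about {seconds / 60:.0f} minutes without NTP, drift about {wander_ppm(0.04):.0f} ppm")
```

That works out to about 61 minutes and roughly 40 ppm. The logged wander is rounded to two decimals, so the elapsed time is only approximate.
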

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#7 Post by ThomasB » 03 Mar 2020, 17:15

TD-er wrote:
On my problematic ESP32 node, I had NTP off for a while to get some idea of the clock drift.
It doesn't look that much, so I guess that's not likely to be the issue here.
I can try the NTP wander test on my ESP8266 device. But web access is painful (persistent timeouts). Is there a secret command to enable/disable NTP via the serial port?

Late last night I moved the device so it was just five feet from the WiFi router. This morning the Wifi Status LED was blinking rapidly. Serial port was not connected so I had no way to see the log messages. Hopefully I can catch this situation with the serial port connected.

EDIT:
The blinking Wifi status LED appeared again. The serial log shows repeated attempts to reconnect to WiFi. It reports WiFi status 6 (disconnect code), freeheap=16304, freestack=3376. The "erase" command restored the lost WiFi connection, but browser access remains problematic.
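
The numeric WiFiStatus values in these logs appear to be the Arduino `wl_status_t` codes. The table below is an assumption based on the ESP8266 Arduino core 2.x headers (later core versions may number them differently), and `wifi_status_name` is a hypothetical helper.

```python
# Assumed wl_status_t values for ESP8266 Arduino core 2.x; verify against your core version.
WL_STATUS = {
    0: "WL_IDLE_STATUS",
    1: "WL_NO_SSID_AVAIL",
    2: "WL_SCAN_COMPLETED",
    3: "WL_CONNECTED",
    4: "WL_CONNECT_FAILED",
    5: "WL_CONNECTION_LOST",
    6: "WL_DISCONNECTED",
    255: "WL_NO_SHIELD",
}

def wifi_status_name(code):
    """Translate a WiFiStatus number from the log into its symbolic name."""
    return WL_STATUS.get(code, f"unknown ({code})")

print(wifi_status_name(3), wifi_status_name(6))
```

With this mapping, the "WiFiStatus 3" in the watchdog lines means connected and the status 6 seen here means disconnected.
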

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#8 Post by TD-er » 03 Mar 2020, 22:11

I can make you a "DOUT" build to test.
I did test on a few of my nodes, but couldn't find much of a difference, but maybe it does make a difference on your module?

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#9 Post by ThomasB » 03 Mar 2020, 23:01

I can make you a "DOUT" build to test.
That is a good idea. Since it is a custom build, please make it a [test_ESP8266_4M1M] without VCC. Extra foam.

BTW, I flash with Grovkillen's Easy flasher. I've already tried its "Force D-OUT" function; I assumed that changed the bin's flash mode header so DOUT mode was enabled at boot. But no change to the problem.

I discovered an interesting hardware issue with the Sonoff Basic module (original version). It is missing the GPIO2 boot mode pull-up resistor. For reference, my Sonoff TH10's work fine with the upgraded 4MB Flash and they have the resistor. I thought I found the problem and of course I added the missing resistor.

espez_pullup.jpg

Now web browser access works and I can visit the various pages. And web log functions too (previously reported resource errors). I haven't seen a repeat of the WiFi disconnect loop either.

But something isn't right. The pages take a couple seconds to load and randomly take much longer. My other devices have nearly instantaneous page loads.

So I think the new resistor is not a final solution. Rather, I suspect a board layout issue that has been worsened by the 4MB flash chip. Perhaps coupled noise on the power supply (or elsewhere) that has been slightly attenuated with the newly installed resistor. Or evil demons.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#10 Post by TD-er » 03 Mar 2020, 23:22

Is it powered via mains power, or 3v3?

Maybe also try "the other" option?
Some Sonoff devices in the past were missing some proper decoupling and thus became unstable when powered via mains.

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#11 Post by ThomasB » 03 Mar 2020, 23:32

Some Sonoff devices in the past were missing some proper decoupling and thus became unstable when powered via mains.
For my safety all the Sonoff testing is with 3.3V power. This allows death-free connection to the serial port. It's the exact same setup I always use with Sonoff devices, which has worked great until this miserable module came along.

I've already tried other 2A USB type supplies. Haven't tried mains power, so I've put that on the honey-do list.

I should also try your custom bin with DOUT.

- Thomas


ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#13 Post by ThomasB » 04 Mar 2020, 00:34

I found that the Flash Memory chip does not have its own decoupling cap. Its Vcc pin connects to the nearby status LED. To remedy this, I installed a 0.1uF SMD cap across the LED Anode (3.3V) and the ground plane near the ESP8266 (see photo, red arrow). It's hard to say for certain, but it seems to have further improved web page browsing.

I've also noticed that this Sonoff Basic has less RF range than my NodeMCU boards. All these little quirks suggest that the "4MB upgrade problem" is a Sonoff Basic Module (ESP8266 version) hardware issue. Something related to dirty power, coupled noise, and/or EMI-RFI.

espez_decoupling.jpg

Thanks for the custom build (and the extra foam). I installed it and everything is still working (no change). Maybe the OP can try it out and see if it helps his installation.

- Thomas
Last edited by ThomasB on 04 Mar 2020, 00:48, edited 2 times in total.

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#14 Post by TD-er » 04 Mar 2020, 00:46

Let me know if this build of yours is working fine (system variable replacements + web page serving, not corrupting or missing data).

If it is working fine, I will trigger a new "nightly build" tomorrow morning.
So depending on our phase shift in perceived sunrise, it may already be a nightly build for you.

Maybe also look at the load times of the web pages :)

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#15 Post by ThomasB » 04 Mar 2020, 01:16

I don't see anything unusual with the test release. But the installation is simple: just the switch, dummy, and Sysinfo plugins.

NTP and Rules are working. System var replacements are Ok. OTA flash, Email Notifications and HTTP event commands are Ok.

I noticed that the "erase" command is no longer recognized. So that suggests that you have renamed it.

Here is the timing stats page for the mega-20200222_test_ESP8266_4M1M_VCC version.

espez_timing_old.jpg


Here is the timing stats for your test release.

espez_timing_new.jpg

- Thomas

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#16 Post by ThomasB » 04 Mar 2020, 01:43

Further testing revealed a problem. I enabled the Openhab MQTT controller plugin and the Sonoff device crashed (unresponsive). I've cold booted and still can't connect with the browser.

I need to finish up some other tasks, but will return to debugging later tonight.

-Thomas

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#17 Post by ThomasB » 04 Mar 2020, 05:42

I used ESP Easy Flasher and reloaded ESP_Easy_mega-20200222_normal_ESP8266_4M1M (with Force D-OUT). Web browsing is working, but now it takes several seconds to load some pages. So the progress made today has slipped away. This makes me think that my "hardware problem" comments are a false conclusion.

I used OTA and reflashed the test file provided earlier today. Rules are working (a LED toggles when I press the button) but web browsing is nothing but page timeouts now. Occasionally a page will load, but that's a rare event. Serial log does not offer any explanation (WiFi is connected).

BTW, I found that the "erase" (RF Calibration) command has been renamed to "erasesdkwifi". Tried it, no WiFi improvement.

I OTA flashed some 2019 releases. Here's a summary:

* Fast Page Loading, no browser issues:
ESP_Easy_mega-20190805_normal_ESP8266_4M

* Slow Page Loading and/or page timeouts:
ESP_Easy_mega-20190511_test_core_260_sdk2_alpha_ESP8266_4M
ESP_Easy_mega-20190928_test_core_260_sdk3_alpha_ESP8266_4M1M
ESP_Easy_mega-20191016-11-PR_2667_test_ESP8266_4M_VCC has slow page loads.

I need to sleep on this. Maybe tomorrow some new detail will emerge that solves all this madness.
TD-er wrote:
So it might be that we are now doing so much more in between delay calls that the WiFi stack is not given enough time to handle its connection.
That seems to be the best explanation so far. Just need to figure out how to make my 4MB upgraded Sonoff Basic Module play nice with ESPEasy.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#18 Post by TD-er » 04 Mar 2020, 09:43

Well if the last build was working fine, then I think it may still be related to the way we use flash.

First of all, there is the obvious one, of using DOUT instead of DIO.
Second one may not be that obvious at first.
The reason the build size is now smaller is because I changed the way how the chunked transfers are being sent.

This is a bit of a C++ issue, as we had this struct with the code defined in the struct define itself.
Just like if you code everything in a .h file.
This means the code for every function call is inline, meaning the compiler generates code for every call made to that function.
One of the functions we used a lot was of the operator+=.
So the code for that was repeated over and over again in the binary and thus it increases the size of the binary a lot.
But also the number of loads from the flash during execution.

This could mean your hardware changes have been helping a bit (pun intended :) ) and also the changes of the webserver code.
The DOUT also gives the flash a bit more time to process.

OpenHAB MQTT (and Domoticz MQTT) do seem to be quite aggressive on the network, so maybe I should make their reconnect attempts somewhat more relaxed.

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#19 Post by ThomasB » 04 Mar 2020, 19:02

Well if the last build was working fine, then I think it may still be related to the way we use flash.
Unfortunately the story has become convoluted and perplexing.

Here's a timeline summary:
1. Sonoff Basic module is upgraded to 4MB flash. ESP_Easy_mega-20200222_normal_ESP8266_4M1M.bin is loaded.
2. Module boots, but web browser mostly times out. Occasionally a page loads.
3. Let it run overnight. Next morning it was stuck in an endless connect/disconnect loop. The loop ended after running the "erase" command.
4. Reviewed hardware. Found floating GPIO2 boot mode pin. Added 10K pull-up (per typical configuration).
5. Web browsing now working, but pages take a couple seconds to load and randomly take much longer. Improvement, but not fully fixed.
6. A decoupling cap is added to the Flash memory. Page browsing seems a little bit better, but could be a wishful illusion.
7. OTA flashed DOUT test firmware from TD-er ("extra foam" release). Page loading is similar to mega-20200222.
8. Attempted to enable MQTT controller. Browser timed-out during the submit. {Note: MQTT was not enabled, as discovered later.}
9. ALL browser access now times out. No successful page loads after several minutes of trying.
10. Performed cold reset, then serial flashed mega-20200222. Browser now works, but slow loads and some timeouts. Browser performance seems to have gotten worse.
11. ESP_Easy_mega-20190805_normal_ESP8266_4M.bin works fantastic. Very fast browser page loading, no hangs/timeouts.
12. Tried other randomly chosen 2019 releases. Mega-20190805 remains the winner.

New information:
1. OTA re-flashed with the DOUT test firmware ("extra foam" release). Today's results are about the same as mega-20200222: slow loads, some timeouts.
2. Confirmed that MQTT was never enabled during yesterday's DOUT testing. Enabled it, no change to behavior (slow loads, some timeouts). Have disabled it for now.
3. I discovered that OTA flash defaults to DIO. So the test firmware was not using DOUT flash mode.
4. I reloaded the DOUT test software using serial flashing with "force D-OUT" option. DOUT mode is now enabled, per ESPEZ system info. No improvements to browser performance (Slow loads, some timeouts).
5. Pings also experience random timeouts.

The current theory seems to be that the system does not give enough priority to servicing WiFi, and that slowing down Flash access may buy some time for WiFi services. But perhaps this isn't the solution, since I don't see any performance difference with DOUT enabled. So at this point I would suggest continuing to use DIO, since that offers faster code execution than DOUT.

Not sure where to go from here. If you have any ideas then that would be great.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#20 Post by TD-er » 04 Mar 2020, 20:24

ThomasB wrote: 04 Mar 2020, 19:02 [...]
So at this point I would suggest continuing to use DIO, since that offers faster code execution than DOUT.
Well the speed difference between DOUT and DIO is negligible for the simple reason that it only makes a difference when writing.
"Writing" is also sending the commands to the flash, like "gimme my dataz".
But since we're reading blocks of data, writing is only done for a fraction of the time.
And from that fraction, only the address data can be written via 2 pins in DIO mode.
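
A rough clock count makes this concrete. The sketch below is illustrative only: it assumes the usual Winbond-style dual read commands (dual output: 8 command clocks, 24 address clocks on one line, 8 dummy clocks; dual I/O: 8 command clocks, 12 address clocks on two lines, 4 mode clocks) and a 32-byte read per transaction. None of these figures come from this thread, so check them against the datasheet of the actual chip.

```python
def read_cycles(n_bytes, addr_lines, dummy_cycles, data_lines=2):
    """SPI clock cycles for one flash read transaction."""
    command = 8                           # command byte is always sent on one line
    address = 24 // addr_lines            # 24-bit address over 1 or 2 lines
    data = n_bytes * 8 // data_lines      # data is returned over 2 lines in both modes
    return command + address + dummy_cycles + data

dout = read_cycles(32, addr_lines=1, dummy_cycles=8)  # dual output: address on 1 line
dio = read_cycles(32, addr_lines=2, dummy_cycles=4)   # dual I/O: address on 2 lines
print(dout, dio)
```

Under these assumptions a 32-byte read takes 168 clocks in DOUT and 152 in DIO. The data phase (128 clocks) is identical, and the difference shrinks further for longer reads.
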

There are some chips out there that don't support DIO, but do support DOUT, so it may add support for just that extra board. (or maybe you don't want to support that as it has probably had more costs savings than you would like)


As far as I know, the flash mode set in the flasher only affects the flash speed and the speed the bootloader selects at boot.
The run speed of the sketch is set at compile time.


For this 20190805_normal_ESP8266_4M version, can you see what core version was used?


ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#22 Post by ThomasB » 04 Mar 2020, 20:57

Thanks for the DOUT information.
For this 20190805_normal_ESP8266_4M version, can you see what core version was used?
ESP82xx Core 2_5_2, NONOS SDK 2.2.1(cfd48f3), LWIP: 2.1.2 PUYA support

I also tried ESP_Easy_mega-20191016-11-PR_2667_test_ESP8266_4M_VCC. It uses the exact same core version as noted above. But unlike 20190805, this version experiences browser page hangs, slow loads.
Can you test this one: ...
No problem. Shouldn't take long to report back.

- Thomas

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#23 Post by ThomasB » 04 Mar 2020, 21:04

Can you test this one: ...
Respect! That works fantastic. Fast page loads, no hangs. :D

So what is the secret to this magic?

BTW, I used OTA flash. System info reports DOUT mode. The previous "DOUT" test file reported DIO mode.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#24 Post by TD-er » 04 Mar 2020, 21:18

Well I was a bit afraid it could be a fix here.

Our default SDK build we use is PIO_FRAMEWORK_ARDUINO_ESPRESSIF_SDK22x_190703
Your test build is using PIO_FRAMEWORK_ARDUINO_ESPRESSIF_SDK22x_191122

The problem is that neither one is a "one size fits all" build and it also does seem to be related to a lot of environmental factors like RSSI value, number of other APs in the neighborhood etc.

Of the last builds, you can also try the "stage" builds (called "beta") as those use PIO_FRAMEWORK_ARDUINO_ESPRESSIF_SDK22x_191105
But be aware that in the last nightly build apparently the OTA support may have been broken in the stage builds (see issues on Arduino repo), so don't install it on a board plastered away.

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#25 Post by ThomasB » 04 Mar 2020, 21:48

The problem is that neither one is a "one size fits all" build and it also does seem to be related to a lot of environmental factors like RSSI value, number of other APs in the neighborhood etc.
That's an interesting twist to this debugging saga.

I placed a "working" NodeMCU next to it and they both report similar neighborhood networks. Each scan populates a different list so they are never quite the same, but close enough. The various networks report nearly the same RSSI values too.

So there must be more to this than just environmental conditions. Could it be ESP8266EX chip revision?

I have eight ESPEasy nodes running mega-20200222. It's a mix of Sonoff Basic 1MB, Sonoff TH10 upgraded to 4MB, and NodeMCU. They all work great (fast browser access), except this particular Sonoff Basic with 4MB Flash upgrade.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#26 Post by TD-er » 04 Mar 2020, 22:05

Well I didn't want to make the post too elaborate, but now you make me...

The main differences in those SDK blobs are probably in some tweaked settings like timeout, when to switch to what bandwidth etc.
At least that's what it looks like when investigating the symptoms. I don't have access to their source code, so it is all guessing.

The RSSI value is only an indicator on the power of the received signal. It does not tell you anything about the SNR.
And this SNR is also not a single value that must be 'just right' as it is a measured effect caused by a lot of variables like impedance matching, decoupling capacitors, etc.
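
To illustrate why equal RSSI does not mean equal link quality: in dB terms, SNR is the received power minus the noise floor. The noise floor values below are hypothetical (ESPEasy does not report them in this thread); only the -57 dBm RSSI is taken from an earlier post.

```python
def snr_db(rssi_dbm, noise_floor_dbm):
    """Signal-to-noise ratio in dB from received power and noise floor, both in dBm."""
    return rssi_dbm - noise_floor_dbm

quiet_board = snr_db(-57, -95)   # hypothetical clean board
noisy_board = snr_db(-57, -70)   # hypothetical board with noise coupled into the RF front end
print(quiet_board, noisy_board)
```

Both boards would report the same -57 dBm, yet the second has 25 dB less margin.
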

So what this SDK blob probably does is try to set all kinds of parameters just right so you have the highest transfer rate, best stability and best power consumption... pick 2.

So even though you may have the same RSSI and maybe even the same SNR values, there is still a possibility that one node may be as unstable as a drunk on roller skates while the other one is beating 10000$ web servers in page serving latency.
Then it comes to what parameters were chosen for when to switch to a lower bandwidth, when to advertise what data rate, etc.

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#27 Post by ThomasB » 04 Mar 2020, 22:34

Espressif's closed source core libraries are the work of the devil. Too bad they demand their secrecy.

I appreciate all the hand holding and helpful advice. Hopefully the OP resolves his problem with what was learned in this debug adventure.

Thanks again. You've earned another good guy club award.

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#28 Post by TD-er » 04 Mar 2020, 22:37

Should I add a build for "bad wifi fallback"?
And how should it be called?

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#29 Post by ThomasB » 04 Mar 2020, 23:02

Should I add a build for "bad wifi fallback"?
It seems like that would salvage some installations. But it's your call.
And how should it be called?
Maybe the file name should include _ALT_WIFI_ or _EXPERIMENTAL_WIFI_.

But the real issue is how to educate the user base. There's a place on the forum for such info, but it has been ignored for a long time:
viewtopic.php?f=6&t=5818

- Thomas

TD-er
Core team member
Posts: 8739
Joined: 01 Sep 2017, 22:13
Location: the Netherlands

Re: Cannot update any of my modules past mega-20190215

#30 Post by TD-er » 04 Mar 2020, 23:11

ThomasB wrote: 04 Mar 2020, 23:02 [...]
There's a place for such info, but it has been ignored for a long time:
viewtopic.php?f=6&t=5818
Hmm not sure if I must touch that topic... it is sticky

ThomasB
Normal user
Posts: 1065
Joined: 17 Jun 2018, 20:41
Location: USA

Re: Cannot update any of my modules past mega-20190215

#31 Post by ThomasB » 04 Mar 2020, 23:46

Hmm not sure if I must touch that topic... it is sticky
Never touch your face after contact. Immediately wash your hands. See a health care provider if you have a persistent cough or symptoms due to SDK blobs.

- Thomas
