Heltec Wireless Stick ESP32 - LMIC problem - radio.c:1065

BTW, depending on taste, you might want to add a Serial.println() at the end of the clause doing the prints. It will make a more readable log, but will take more vertical space. If you’re capturing output to a file with something like Tera Term, I’d add the newline; if just using a “peephole” into the data, I’d add the newlines afterwards, with an editor.

Here is a new error:
sketch\loraWan.cpp.o:(.literal._Z15my_runloop_oncev+0x14): undefined reference to `os_getTimeSecs'

Delightful. That’s a bug in the oslmic.h header file (#685): it declares an API that isn’t implemented. I will edit the post.

Post is edited. Switched to using millis() since I really don’t want to depend on os_getTime(), as I suspect it might be at fault.
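For readers following along, here is a minimal sketch of the kind of status logging discussed above, written as plain C++ so it can be tried off the device. It is not the wrapper from the edited post. The names `formatStatus` and `dueForLog` are hypothetical, and the values are passed in as parameters; on the device the timestamp would come from `millis()`, and the other two fields would presumably come from the LMIC state that the `op=` and `txrx=` columns report.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format one status line in the style of the logs in this thread:
// "<timestamp in hex> op=<opmode in hex> txrx=<flags in hex>".
// Hypothetical helper; all three values are supplied by the caller.
std::string formatStatus(uint32_t stamp, uint16_t opmode, uint8_t txrxFlags) {
    char buf[48];
    std::snprintf(buf, sizeof buf, "%lX op=%X txrx=%X",
                  static_cast<unsigned long>(stamp),
                  static_cast<unsigned>(opmode),
                  static_cast<unsigned>(txrxFlags));
    return std::string(buf);
}

// True when at least intervalMs have elapsed since lastMs.
// Unsigned subtraction keeps the comparison correct when a
// millis()-style counter wraps around.
bool dueForLog(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(nowMs - lastMs) >= intervalMs;
}
```

Keeping the rate limit separate from the formatting means the run loop can be serviced on every pass while the log is only written occasionally.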

After the successful joining, the device sends data once with the following timestamp

01:25:59.414 → GEIGER: Sending to TTN …
01:25:59.414 → 11FC267 op=C txrx=0

and then no more.

  • Is it still transmitting, or does it crash at this point?
  • Did you search your code for all calls to os_runloop_once() and replace with the call to the substitute?

Since the ASSERT() didn’t fire, it seems that updating the check in the core library is a good idea.

This morning (Germany) I started the test again. The device and the logging are working without problems.

BTW: I am not the developer of the software, which is also very complex. It is therefore difficult for me to understand the context, and this takes time, especially since I am not very experienced in programming. I hope this is not frustrating for you.

The original code looks like Arduino code, and thus is written for a single-core platform. Be aware that you ported this to the ESP32, which is a dual-core platform, so two CPUs run code in parallel. This need not, but may, cause race conditions.

The MultiGeiger project uses exactly this board. I also opened an issue about the problem on the project’s GitHub, but the only feedback I received was that this problem has never occurred.

Did you test on different LoRaWAN gateways?

I have two different gateways in use: TTNI and LPS8. The problem occurred on both gateways.

In the meantime the device no longer sends any data to TTN. Here is the log data with the corresponding change:
GEIGER: Sending to TTN …
35B23E7F op=888 txrx=20
GEIGER: Sending to TTN …
36416BE1 op=888 txrx=20
GEIGER: Sending to TTN …
36D0867D op=C txrx=0
GEIGER: Sending to TTN …
375FA04C op=C txrx=0
GEIGER: Sending to TTN …
37EEBAA5 op=C txrx=0

In a second test, the device stopped data transmission to the TTN without a status change:

6F512B6C op=888 txrx=20
GEIGER: Sending to TTN …
6FE05C54 op=888 txrx=20
GEIGER: Sending to TTN …
706F8CF2 op=888 txrx=20
GEIGER: Sending to TTN …
GEIGER: Sending to TTN …
GEIGER: Sending to TTN …
GEIGER: Sending to TTN …

In the first case, it’s clear that the device has decided to re-join the network (that’s what op=C means). The other failure is not so clear, but that’s OK; we have something to fix and these things are step-by-step.

The rejoin almost certainly means that downlinks aren’t working, which suggests a relatively simple integration problem.
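To make the `op=` values in these logs easier to read, here is a small off-device decoder. It is only a sketch: the bit values in the table are as I recall them from the LMIC `lmic.h` header and are an assumption to verify against the header of your installed library version, and `decodeOpmode` is a hypothetical helper, not part of the library.

```cpp
#include <cstdint>
#include <string>

// Opmode bit values as recalled from the LMIC lmic.h header.
// ASSUMPTION: check these against your installed library version.
struct OpBit { uint16_t mask; const char *name; };

static const OpBit kOpBits[] = {
    {0x0001, "SCAN"},    {0x0002, "TRACK"},    {0x0004, "JOINING"},
    {0x0008, "TXDATA"},  {0x0010, "POLL"},     {0x0020, "REJOIN"},
    {0x0040, "SHUTDOWN"},{0x0080, "TXRXPEND"}, {0x0100, "RNDTX"},
    {0x0200, "PINGINI"}, {0x0400, "PINGABLE"}, {0x0800, "NEXTCHNL"},
    {0x1000, "LINKDEAD"},
};

// Turn an opmode value into a '|'-separated list of flag names.
// Bits that are not in the table are reported as "?".
std::string decodeOpmode(uint16_t opmode) {
    if (opmode == 0) return "NONE";
    std::string out;
    uint16_t rest = opmode;
    for (const OpBit &b : kOpBits) {
        if (opmode & b.mask) {
            if (!out.empty()) out += '|';
            out += b.name;
            rest = static_cast<uint16_t>(rest & ~b.mask);
        }
    }
    if (rest != 0) {
        if (!out.empty()) out += '|';
        out += '?';
    }
    return out;
}
```

Under that assumed table, `decodeOpmode(0xC)` gives `JOINING|TXDATA`, which matches the reading above that the device has gone back to joining, and `decodeOpmode(0x888)` gives `TXDATA|TXRXPEND|NEXTCHNL`.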

We can also tell that they’re not calling os_runloop_once() nearly often enough – it appears they call once every 2.5 minutes. This just won’t work.

@rodrigop, can you post a few more things?

  1. A link to the exact version of the software you’re using as a starting point
  2. A link to, or a zip file of, the sources you’re currently using (since you have local changes)
  3. A link to the hardware portions of this – the MultiGeiger code looks incorrect to me, but I can’t make changes without testing and without studying the hardware to make sure I’m doing no harm.

It appears the original works only accidentally; and that some minor change you made is causing things to break. Certainly, the downlink will not be reliable with their code. Usually not hard to fix… but needs to be fixed.

@rodrigop sent me enough info that I finally went and grabbed the code.

There’s a basic problem – they left something important out of loop(). @rodrigop, you said you just grabbed a zip of the repo? I’ll fork it and push my changes to my fork; you can then test it. How does that sound?

That sounds very good, I have 2 devices to test, 1 in TTN V2 and the other in TTN V3.

Hi,

If you go to the release “Fix LoRaWAN stability versions” in terrillmoore/MultiGeiger on GitHub and download https://github.com/terrillmoore/MultiGeiger/archive/V1.14.1-beta.zip, you’ll get code that contains the patch from post #7 of this thread (Heltec Wireless Stick ESP32 - LMIC problem - radio.c:1065, by tmm), and a patch that causes os_runloop_once() to get called on every pass through loop() (provided LoRaWAN is in use).

Good luck…
–Terry

The device stops sending to TTN after the join and the transmission of 1-2 data packets. I put the wrapper back in. Below is the log:

11:22:25.323 → 35B7ABF op=C txrx=0
11:27:25.925 → GEIGER: Sending to TTN …
11:27:25.960 → 47A276F op=888 txrx=20
11:32:26.077 → GEIGER: Sending to TTN …
11:32:26.112 → 59866A0 op=888 txrx=20
11:37:26.241 → GEIGER: Sending to TTN …
11:42:26.266 → GEIGER: Sending to TTN …
11:47:26.284 → GEIGER: Sending to TTN …
11:52:26.290 → GEIGER: Sending to TTN …

It looks like os_runloop_once() is still only called when sending.

Paul.

I built the poll_lorawan() call into MultiGeiger’s loop(). The call is now made for each measuring cycle. Here are the log entries:

13:44:23.897 → AD3418 op=0 txrx=0
13:45:23.880 → E66D08 op=0 txrx=0
13:46:23.878 → 11FA5A9 op=0 txrx=0
13:46:23.948 → GEIGER: Sending to TTN …
13:47:24.658 → 1599563 op=900 txrx=20
13:48:24.635 → 192CDE2 op=900 txrx=20
13:49:24.669 → 1CC0738 op=900 txrx=20
13:50:24.663 → 2054013 op=900 txrx=20
13:51:24.660 → 23E7986 op=900 txrx=20
13:51:24.660 → GEIGER: Sending to TTN …
13:52:24.814 → 277D590 op=900 txrx=20
13:53:24.832 → 2B10ECD op=900 txrx=20

Fingers crossed.

Paul.

As it happens, my patch somehow missed adding the call to poll_transmission() in loop(). I did all the prep work but forgot to add the call itself; I was in too much of a hurry last night. Your change (calling poll_lorawan()) works, but will be suboptimal for people who are not using LoRaWAN. No need to change for this experiment. Thanks!
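The loop structure being discussed can be sketched off-device as follows. This is an illustration under stated assumptions, not the MultiGeiger code: `LoopState`, `loopOnce` and the `loraEnabled` setting are hypothetical names, and counters stand in for the real calls. The run loop is serviced on every pass, but only when LoRaWAN is in use, while an uplink is queued only when the send interval has elapsed.

```cpp
#include <cstdint>

// Counters stand in for the real work so the control flow
// can be checked without hardware. All names are hypothetical.
struct LoopState {
    bool     loraEnabled;     // "LoRaWAN in use" setting
    uint32_t sendIntervalMs;  // time between uplinks
    uint32_t lastSendMs;      // when the previous uplink was queued
    uint32_t polls;           // stands in for calls to os_runloop_once()
    uint32_t sends;           // stands in for queued uplinks
};

// One pass through loop().
void loopOnce(LoopState &s, uint32_t nowMs) {
    if (!s.loraEnabled) {
        return;  // nothing to service when LoRaWAN is not in use
    }
    ++s.polls;   // on the device: os_runloop_once(), every pass

    // Queue an uplink only when the interval has elapsed.
    // Unsigned subtraction tolerates counter wraparound.
    if (static_cast<uint32_t>(nowMs - s.lastSendMs) >= s.sendIntervalMs) {
        s.lastSendMs = nowMs;
        ++s.sends;  // on the device: hand the payload to the stack
    }
}
```

The point of separating the two is that the stack needs frequent servicing to handle receive windows and joins, independent of how rarely the application has data to send.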

You’re right. I’ve adjusted it accordingly. I noticed that with the V3 device, packets get lost: they arrive at the gateway, but not at the application.