-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deinit function issue #269
Comments
Hm, apparently there is a race condition where it deinitializes the buffers before stopping the driver threads... Looks like a bug to me. |
Could you provide a working example to reproduce the issue? |
Thank you for your feedback @vroland . I had the same hunch, using rtos and threads are useful features, but also prone to race conditions. This is the relevant snippet: // setting the board in advance and removing the board argument from epd_init as it would otherwise fail
epd_set_board(&epd_board_v5);
// init epd with 1K buffer. No rendering will be done here, just using the framebuffer to store the image
epd_init(EPD_DISPLAY, EPD_LUT_1K);
hl = epd_hl_init(EPD_BUILTIN_WAVEFORM);
framebuffer = epd_hl_get_framebuffer(&hl);
<download image into framebuffer, confirmed to work>
// de-init to free internal RAM
epd_deinit();
<image download completed, more internal RAM available>
// we can now init epdiy with 64K as we have more internal RAM after the download
epd_init(EPD_DISPLAY, EPD_LUT_64K); ----> Always fails at this line
// main rendering starts here
epd_poweron();
// As we want to render with 16 grayscales, init epdiy again with 64K
epd_hl_update_screen(&hl, MODE_GC16, 25);
// rendering complete, de-init epdiy to free RAM and power off
epd_deinit();
epd_poweroff(); |
Any progress on this so far? If you need any additional information, please let me know 👍 |
@vroland Is it possible to revert this function to a previous state if you don't have any time to look into this bug? |
Not that easily, that would be a version before V7 support. Sorry, I haven't had so much time to put towards this lately, but if you want to investigate yourself: There are rendering threads started in |
Thank you very much for your feedback! May I ask you provide a link to the section creating/exiting these treads? As C++ isn't my strength, finding the section responsible for this is pretty difficult. Thank you very much in advance! |
I'm onto implementing some tests also for this on my vector instructions branch. You're using the V5 board? |
Thank you very much for taking the time to look into this issue! Yes, I'm using the v5 board! |
@aceisace Can you check again with the vector-extension-experimentation branch? Looking at it again it may also be an out of memory issue in your program. |
Hello Ace can you please give feedback about this? |
@vroland Apologies for the late reply. As I have already switched to an older version since several months, I had to do some refactoring before the new code could be tested. I've had time today to test it, but it still doesn't seem to work: --init--
free heap: 159 KB (49.69% free)
I (12823) esp_temperature: Characterized using eFuse Vref
I (12825) epd: Space used for waveform LUT: 64K
assert failed: lq_init line_queue.c:23 (queue.bufs != NULL)
Backtrace: 0x40082469:0x3ffc2770 0x400926ed:0x3ffc2790 0x40099ee5:0x3ffc27b0 0x400f7201:0x3ffc28d0 0x400f65a5:0x3ffc2920 0x400f62e5:0x3ffc2970 0x400e2b19:0x3ffc2990 0x400e481a:0x3ffc2aa0 0x400e6dca:0x3ffc30a0 0x40095e01:0x3ffc30c0
0x40082469: panic_abort at /.../esp-idf/esp-idf/components/esp_system/panic.c:408
0x400926ed: esp_system_abort at /.../esp-idf/esp-idf/components/esp_system/esp_system.c:137
0x40099ee5: __assert_func at /.../esp-idf/esp-idf/components/newlib/assert.c:85
0x400f7201: lq_init at /.../esp-idf/v9-neo/components/epdiy/src/output_common/line_queue.c:23 (discriminator 1)
0x400f65a5: epd_renderer_init at /.../esp-idf/v9-neo/components/epdiy/src/render.c:319
0x400f62e5: epd_init at /.../esp-idf/v9-neo/components/epdiy/src/epdiy.c:491 After this it reboots and gets stuck at the same place before rebooting again |
hey, i was having the same assert error message, it turns out the board was just out of internal sram in a project that used both wifi and the epdiy driver. i also can confirm that i was able to use both wifi and epd_init() at the same without problems in v4-v5 firmware, but when i moved to v7, same project, i could no longer since i faced this issue, it seems v7 uses quite a lot more sram. |
@amadeok Thanks for your input, however, as I have been using epdiy for several years now, I recognised the mentioned issue about epd_init() failing when there is too little RAM and have already implemented thorough debugging at each step, logging stack and heap. On the wrover, which has around 320KB of RAM, attempting to init epdiy with less than 30% free stack results in this error, however, in my case it is well above this threshold, so I can confirm it is not related to this problem in any way. Furthermore, as I am using a v5 based board for this test, stack and heap should not be an issue. The problem is not that I cannot init epdiy, but rather that the deinit function does not undo everything from the init function, leading to the error caused. It boils down to be an issue with the threads not finishing or coming out clean. You can reproduce this yourself by initialising epdiy, then de-initialising and then initialising again. |
why are you deinitializing and initializing multiple times? also a cheap solution could be to just restart the board |
So basically, as my setup requires downloading an image from a server (non-jpeg, lossless), I cannot initialise epdiy at the beginning as the download would fail due to lack of RAM, required for SSL, Wi-Fi, temporary buffers and operation buffers. If I try to initialise epdiy and then download the image, it fails. To tackle this issue, I found a workaround, initialise epdiy with a smaller footprint of internal RAM (mode 1K instead of 64K). That way, 63Kb of RAM of internal RAM are still free, plenty to download and process the image without any memory issues and loading it in the external RAM section used by epdiy (hence the need to initialise). As the download is now complete, I can free up the RAM used by ssl, wi-fi etc., then call deinit to free up internal RAM and allocate the free memory to initialise epdiy again, now with 64K for the highest possible quality. In either case, the issue in question is that if I initialise epdiy and de-initialise, it should revert back to the state it was in before calling init and cleanup buffers, thread and RAM. However, this is clearly not the case so this is a bug in the library, which should be fixed. Consider a use-case where a user needs to initialise and de-init epdiy several times to e.g. make use of the faster mono rendering while also supporting high-quality rendering at the cost of speed and more internal RAM. If I were to restart the board, all the data in the external RAM would be lost since it does not survive reboots. Only RTC or file-based memory is able to retain information after reboots (RTC only for soft reboots, but is very limited in memory). RTC memory does not survive a hard-reboot and file-based memory is way too slow and inefficient. |
it can be reproduced, i went to the demo example and pasted this at the beggining fo app_main()
this is the output:
eventually it crashes |
You are using the latest version or at least a much more recent version of epdiy. Without modifying the epd_init function in the more recent version, you cannot reproduce this error. Just split the setting of the board from the other parts in epd_init and you'll be able to get the same result. For the error mentioned in the first comment, I was using a version back when v6 was the most recent one. |
Hello @vroland assigned this to you but I would also like to research this issue and help discovering what is the problem. Lately I’m using only v7 and I must say at this point all my v5 boards are gone but still would be great to be able to init and deinit epdiy without having this issue described by Ace |
yes the test i did above was with the latest firmware, v4 board, and there's a memory leak or something going there, as you can see the memory drops by about 3kb at each iteration i just did the same test with firmware tag v6 from 3 years ago, board v5, and the memory leak is much more severe:
|
Indeed @amadeok . It's more than just a memory leak and/or heap/stack poisoning. While I only have access to v5 and v7 from @martinberlin , this issue happens on both boards, on both esp32's and even on the more recent as well as the older version of epdiy. @martinberlin In particular, I believe some of the RTOS operations to be linked to this issue used by the init and more importantly, de-init function. But with my limited understanding and expertise with RTOS and cpp, I cannot figure out more by myself |
From the logs you posted it looks a lot like a memory leak. You think it is also something else? I think this is part of why I need to do some refactoring work, I'm not super confident the error and memory management is clean in unexpected paths like continuous initializing / deinitializing. |
@vroland Yes, indeed, I think this is not only related to a memory leak, but also RTOS threads in particular after de-init:
Yes, sure, from time to time, a software may need to be refactored and I'm willing to help with testing too. Is there a development branch that is being developed on for the refactoring? I'm sure that some users can help you with refactoring. I can also help with pretty much anything related to python. |
No branch yet, but I'm trying to get build some test cases and then continuously improve from there. |
Concerning the test cases, apart from testing if epdiy works with a specific release of esp-idf, wouldn't it make things a lot easier for you and all other contributors if there were unittests that run on each commit via GitHub actions? On Inkycal, it's even possible to simulate a display for running the automated tests which are part of the integration test |
That would be the end goal. Unfortunately to do that exhaustively you'd need full emulation in CI or a hardware farm somewhere, which is a lot of complexity. There are esp32 emulators so, it could be worth to look into. |
True enough, a CI will get pretty complex, although it will ensure quality and stability of all core functions throughout all branches. Meanwhile, wouldn't it be easier then to use unittests for all core functions at least which do not require an emulator? Another variant is to have a board just for testing with a test-sketch which does nothing but perform all unittests on-device. This is kind of what kindle did back when it launched. |
Yep, now we just need all the unit tests ;) I intend to add them over time during the course of some refactoring, but it will be a process. Then finding a way to make as much of it work in CI as possible is the next step. |
@aceisace You download an image and then show the heap to be |
@western-hoolock I thought it was the case that some memory was freed while a task on another core could still be running. But here it was just not freed. |
I'm not convinced that (at least in 2.0.0) it's not freed properly. In Line 292 in 1b089e6
My best guess is still heap fragmentation caused by user code that ran before initialization. Which should be easy to check for by calling |
Re-initialising after a de-init causes the following error:
render.c -> assert(render_context.line_queues[i].buf != NULL);
->
assert failed: epd_renderer_init render.c:262 (render_context.line_queues[i].buf != NULL)
My code runs in the following way:
Due to internal RAM constrictions, I cannot simply init epdiy with a 64K buffer and then download the image. As a workaround, I had to init with 1K first prior to the image download, then de-init to free up memory before initialising again with 64K.
As epdiy only allows setting the board once, I removed the board setting function from the epd_init function and am setting the board manually beforehand.
May I ask for a clue on what is causing this error? This error has not appeared before using an epdiy version from around last year, but after the upgrade to the latest main branch, this error causes the esp to reboot each time at this exact place.
The text was updated successfully, but these errors were encountered: