STM32L475 will not send ACK packets after data stage finished #1494

Open
tfx2001 opened this issue Jun 5, 2022 · 13 comments
@tfx2001
Contributor

tfx2001 commented Jun 5, 2022

Operating System

Windows 10

Board

STM32L475VET6

Firmware

custom firmware, basically CDC.

What happened ?

During device enumeration, after the data stage of SET_LINE_CODING, the device does not send the ACK packet. After checking the log, I think the XFRC bit should not be cleared when OTEPSPR is set but STPKTRX is clear.

case GRXSTS_PKTSTS_OUTDONE:
  // Occurred on STM32L47 with dwc2 version 3.10a but not found on other version like 2.80a or 3.30a
  // May (or not) be 3.10a specific feature/bug or depending on MCU configuration
  // XFRC complete is additionally generated when
  // - setup packet is received
  // - complete the data stage of control write is complete
  if ((epnum == 0) && (bcnt == 0) && (dwc2->gsnpsid >= DWC2_CORE_REV_3_00a))
  {
    uint32_t doepint = epout->doepint;
    if (doepint & (DOEPINT_STPKTRX | DOEPINT_OTEPSPR))
    {
      // skip this "no-data" transfer complete event

After changing line 1038 to if (doepint & DOEPINT_STPKTRX), it works fine for me. But when I compile with -O2, some asserts fail. Maybe there is a better way to handle it?
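To make the difference between the two conditions concrete, here is a minimal sketch that evaluates both against the doepint values from the debug log in this issue. The helper names are hypothetical (not part of TinyUSB), and the bit positions and the DWC2_CORE_REV_3_00a value are assumptions based on the STM32 OTG_DOEPINTx register layout; please check them against dwc2_type.h before relying on this.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: bit positions follow the STM32 OTG_DOEPINTx layout. */
#define DOEPINT_XFRC     (1u << 0)   /* transfer completed */
#define DOEPINT_OTEPSPR  (1u << 5)   /* status phase received (control write) */
#define DOEPINT_STPKTRX  (1u << 15)  /* setup packet received */

/* ASSUMPTION: GSNPSID encoding for core revision 3.00a. */
#define DWC2_CORE_REV_3_00a 0x4F54300Au

/* Hypothetical helper: the condition currently quoted in this issue.
 * Returns true when the zero-length OUTDONE event would be skipped. */
bool skip_outdone_current(uint32_t gsnpsid, uint8_t epnum, uint16_t bcnt,
                          uint32_t doepint) {
  if (epnum != 0 || bcnt != 0 || gsnpsid < DWC2_CORE_REV_3_00a) {
    return false;
  }
  return (doepint & (DOEPINT_STPKTRX | DOEPINT_OTEPSPR)) != 0;
}

/* Hypothetical helper: the proposed condition, which ignores OTEPSPR. */
bool skip_outdone_proposed(uint32_t gsnpsid, uint8_t epnum, uint16_t bcnt,
                           uint32_t doepint) {
  if (epnum != 0 || bcnt != 0 || gsnpsid < DWC2_CORE_REV_3_00a) {
    return false;
  }
  return (doepint & DOEPINT_STPKTRX) != 0;
}
```

With gsnpsid = 0x4F54310A from the log, both variants skip the event for doepint = 0xA011 (STPKTRX set) and keep it for 0x2011. They differ only for 0x2031, where OTEPSPR is set without STPKTRX: the current condition skips it and the proposed one does not. This only models the decision and says nothing about the later reports that the change is not a complete fix.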

How to reproduce ?

TinyUSB v0.13.0. Use CDC on STM32L475VET6.

Debug Log as txt file

USBD init
CDC init
dwc2->guid = 2000
dwc2->gsnpsid = 4F54310A
dwc2->ghwcfg1 = 0

...

handle_rxflvl_irq: 970:
  EP 00, Byte Count 8, Setup Data Received
  daint = 00000000, doepint = A010
handle_rxflvl_irq: 970:
  EP 00, Byte Count 0, Out Transfer Complete (ISR)
  daint = 00010000, doepint = A011
  FIX extra transfer complete on setup/data compete
handle_rxflvl_irq: 970:
  EP 00, Byte Count 0, Setup Complete (ISR)
  daint = 00010000, doepint = A018
daint = 00010000,on EP 80 with 0 bytes
  Set Control Line State: DTR = 0, RTS = 0

USBD Setup Received 21 20 00 00 00 00 07 00
  CDC control request
  Set Line Coding
  Queue EP 00 with 7 bytes ...
handle_rxflvl_irq: 970:
  EP 00, Byte Count 7, Out Data Received
  daint = 00000000, doepint = 2010
handle_rxflvl_irq: 970:
  EP 00, Byte Count 0, Out Transfer Complete (ISR)
  daint = 00010000, doepint = 2011
handle_rxflvl_irq: 970:
  EP 00, Byte Count 0, Out Transfer Complete (ISR)
  daint = 00010000, doepint = 2031
  FIX extra transfer complete on setup/data compete        // should not be skipped

Screenshots

stuck on status stage:

image

assert failed after changing:

image

@hathach
Owner

hathach commented Jun 7, 2022

This is mostly tested with my L476 discovery board. These lines of code were written while doing the generic dwc2 driver #1163, with tons of testing (trial and error). Maybe I didn't fully resolve this issue on the L4 series just yet (possibly a race condition that is less likely to happen on my L476 discovery board due to clock/hardware setup, etc.). I don't have the official dwc2 specs, and the STM manuals are not very helpful on this particular issue.

I have kind of forgotten the details; #126 may contain some useful info with links to the STM32Cube driver. Unfortunately I don't have an L475 to test/fix this, but I will try to pull out my L476 to test your suggested changes later on.

@electretmike

I am running into what appears to be this problem. I can reliably reproduce it on the L476disco board by reducing the system clock to 16MHz (or anything under 48MHz actually).

Applying the suggested change makes the problem go away. Is this then the correct fix?

@hathach
Owner

hathach commented Jun 21, 2024

> I am running into what appears to be this problem. I can reliably reproduce it on the L476disco board by reducing the system clock to 16MHz (or anything under 48MHz actually).
>
> Applying the suggested change makes the problem go away. Is this then the correct fix?

I am not sure either. I will try to lower the clock and pull out my L476 to test whenever I have time.

@electretmike

Thanks for looking.
In the meantime, we found that the change does not fully fix the issue. Opening and closing the CDC port from Windows now fails about once every 20 tries. So it's better, but not fully fixed.

@HiFiPhile
Collaborator

#2576 changed the interrupt flow. Although the L475 doesn't have DMA, it's worth a try.

@electretmike

Thanks. I will test that change, but I can only do that next week. I will post an update.

@alano-ee

> #2576 changed the interrupt flow. Although the L475 doesn't have DMA, it's worth a try.

I have tested this change on an L476; it does not seem to fix the issue described.

image
After "Queue EP 00 with 7 bytes ...", the flow stops: it waits for the event, and after around 30 seconds a new event is received and it continues without having handled those bytes. I have also checked that the 7 requested bytes are indeed sent with the packet.

@HiFiPhile
Collaborator

@alano-ee with #2576 I'm not sure if the fix above case GRXSTS_PKTSTS_OUTDONE: is still necessary; it doesn't exist in ST's driver. What happens without it?

@electretmike

@HiFiPhile Without that case statement, I end up at the default, which triggers a breakpoint.

And with that statement, I end up in an infinite loop, because the DOEPINT_STPKTRX flag is not cleared. The comment claims that it is cleared later, but handle_rxflvl_irq is called in a loop until all flags are cleared, and they never are.

(Should this comment move to #2576?)

@HiFiPhile
Collaborator

Hmm... in ST's code, handle_rxflvl_irq is pretty simple: https://github.com/STMicroelectronics/stm32l4xx_hal_driver/blob/42d324ac323088701270f2c19f328bcad0457c52/Src/stm32l4xx_hal_pcd.c#L1109-L1139

DOEPINT_STPKTRX will be cleared in handle_epout_irq; normally it takes care of the special case of GenID 3.10A:

if ((doepint & DOEPINT_STPKTRX) && (dwc2->gsnpsid >= DWC2_CORE_REV_3_00a)) {
  epout->doepint = DOEPINT_STPKTRX;
} else {
  .

But I don't have an L47x board to see what really happens.
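For readers wondering why the snippet assigns DOEPINT_STPKTRX rather than using a read-modify-write: the DOEPINT flags are write-1-to-clear, so writing a mask clears exactly the flags in that mask and leaves the others pending. Below is a small software model of that behaviour. The w1c_reg_t type and its helpers are hypothetical and only simulate the register; the bit positions are assumed from the STM32 OTG_DOEPINTx layout.

```c
#include <stdint.h>

/* ASSUMPTION: bit positions follow the STM32 OTG_DOEPINTx layout. */
#define DOEPINT_XFRC     (1u << 0)   /* transfer completed */
#define DOEPINT_STPKTRX  (1u << 15)  /* setup packet received */

/* Hypothetical software model of a write-1-to-clear status register. */
typedef struct {
  uint32_t pending;  /* flags currently latched by the "hardware" */
} w1c_reg_t;

/* Writing a 1 to a bit clears that flag; writing 0 leaves it untouched. */
void w1c_write(w1c_reg_t *reg, uint32_t value) {
  reg->pending &= ~value;
}

/* Reading returns the flags that are still pending. */
uint32_t w1c_read(const w1c_reg_t *reg) {
  return reg->pending;
}
```

Starting from the logged value 0xA011, writing DOEPINT_STPKTRX leaves 0x2011 pending, so XFRC is still there for the handler to process. If no code path ever writes the STPKTRX bit, the flag stays set, which would be consistent with the loop described above never terminating.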

@electretmike

I have tried to change the interrupt handling to match what the STM32 HAL does: HiFiPhile#2
I could only test with an STM32L486, but it seems to work there. But this is all based on looking at the STM32 HAL driver, not on any documentation.

@HiFiPhile
Collaborator

@electretmike could you confirm whether the issue is resolved with the latest #2576?

@electretmike

I have been busy with other projects for the last few weeks and will be away for the next few. I hope to be able to test in about 4 weeks.
