
Fix initiation interval of pooling and zeropadding layers on Vitis backend #1141

Open · wants to merge 3 commits into main
Conversation

@steltze steltze commented Dec 4, 2024

On the Vitis backend with io_stream, the zeropadding and pooling layers don't reach II=1 and are therefore slower than, for example, the Conv layers.
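A common way to reach II=1 in HLS for such layers is to flatten the nested row/column loops into a single pipelined loop, so the tool pays loop-entry overhead once rather than per row. The sketch below illustrates that pattern for 2-D zero-padding; it is a hypothetical standalone version with plain arrays (the actual hls4ml code operates on `hls::stream` types), so the names and interface here are assumptions, not the code in this PR.

```cpp
#include <cassert>

// Sketch: flattened, pipelined 2-D zero-padding (hypothetical names).
// A single flat loop over all output pixels lets HLS pipeline at II=1.
template <int IN_H, int IN_W, int PAD>
void zeropad2d(const float in[IN_H][IN_W],
               float out[IN_H + 2 * PAD][IN_W + 2 * PAD]) {
    const int OUT_H = IN_H + 2 * PAD;
    const int OUT_W = IN_W + 2 * PAD;
    for (int idx = 0; idx < OUT_H * OUT_W; ++idx) {
#pragma HLS PIPELINE II = 1
        int r = idx / OUT_W;
        int c = idx % OUT_W;
        // Interior pixels copy the input; the border is filled with zeros.
        bool inside = (r >= PAD) && (r < PAD + IN_H) &&
                      (c >= PAD) && (c < PAD + IN_W);
        out[r][c] = inside ? in[r - PAD][c - PAD] : 0.0f;
    }
}
```

In real HLS code the per-iteration division/modulo would typically be replaced by incrementing row/column counters, since `/` and `%` cost extra logic; the flat-loop structure is the point of the sketch.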

Type of change

  • Bug fix (non-breaking change that fixes an issue)

Tests

Synthesized the zeropadding and pooling models from the pytests. The code achieves II=1, and the latency cycles match the trip count.

Input size = 128x128x3

C-Synthesis results with Vitis HLS 2023.2

| Layer | Latency (cycles) | FFs | LUTs |
| --- | --- | --- | --- |
| Zeropadding (before) | 19487 | 169 | 596 |
| Zeropadding (after) | 17689 | 471 | 1675 |
| Pooling (before) | 32769 | 764 | 1432 |
| Pooling (after) | 16387 | 795 | 1392 |
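The pooling numbers are consistent with the fix taking the layer from II=2 to II=1 over one pass of the 128×128 feature map (a rough sanity check, assuming roughly one cycle per pixel position plus a few cycles of pipeline fill/flush):

```python
# Sanity-check the pooling latencies against the 128x128 input size.
H, W = 128, 128
trip_count = H * W                 # 16384 pixel positions
assert trip_count == 16384

pooling_after = 16387              # reported latency after the fix
pooling_before = 32769             # reported latency before the fix

# After: latency ~= trip count, i.e. II=1 with a small pipeline overhead.
assert pooling_after - trip_count == 3
# Before: almost exactly 2x, i.e. II=2.
assert abs(pooling_before / pooling_after - 2) < 0.01
```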

Tested also on a dummy CNN.

Test Configuration:

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

@JanFSchulte JanFSchulte added the please test Trigger testing by creating local PR branch label Dec 4, 2024
@jmitrevs (Contributor) commented:
Is it important for the II to be 1? Generally, conv layers in io_stream have a larger II. For zero-padding, at least, the utilization seems to go up.

@steltze (Author) commented Jan 15, 2025:

@jmitrevs In the model that I am working with, we only use separable convolutions. If II=1 for the zeropadding and maxpooling layers, the depthwise and pointwise convolutions have smaller latency (cycles).

[screenshot: synthesis latency report]

Depthwise-pointwise latency ≈ 512 × 512 = 262144 cycles (the image size), which is less than the zero-padding latency of 787473 cycles.
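The latency comparison above can be checked with quick arithmetic (assuming one stream transaction per pixel, which is a simplification):

```python
# Quick check of the latency figures quoted above (cycles).
image_pixels = 512 * 512            # one transaction per pixel at II=1
assert image_pixels == 262144       # ~ depthwise-pointwise latency

zeropad_latency = 787473            # reported zero-padding latency before the fix
# Padding is ~3x slower than the convolutions it feeds: the bottleneck.
assert zeropad_latency / image_pixels > 3
```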

Yes, this change allocates more resources. But since we are focusing on latency, padding and pooling appear to be the bottlenecks instead of the convolutions, which does not make much sense given that they don't perform such heavy computations.

I can take some more measurements to get a grasp on how resource utilization scales
