doc: graph: add document for sdpa with compressed key and value #2301

Open
wants to merge 1 commit into
base: main

wzt1997
Contributor

@wzt1997 wzt1997 commented Dec 20, 2024

General

  1. Add document for SDPA with compressed Key and Value

@wzt1997 wzt1997 added the documentation label (A request to change/fix/improve the documentation. Codeowner: @oneapi-src/onednn-doc) Dec 20, 2024
@wzt1997 wzt1997 requested a review from TaoLv December 20, 2024 03:49
@wzt1997 wzt1997 self-assigned this Dec 20, 2024
@wzt1997 wzt1997 requested review from a team as code owners December 20, 2024 03:49
Contributor

@mgouicem mgouicem left a comment

(open question) Do we want separate documents for non-quantized and quantized patterns?
My understanding was that the existing SDPA page would group the different data types, given that it currently has a subsection for the floating-point pattern.

doc/graph/sdpa_with_compressed_kv.md (1 outdated review thread, resolved)
@TaoLv
Contributor

TaoLv commented Dec 20, 2024

(open question) Do we want separate documents for non-quantized and quantized patterns?

Thank you @mgouicem! That's also my question. When I initially added the floating-point section, I planned to include the fp, int8-quantized, and KV-only-quantized patterns together on a single page. But with this PR, that seems like too much information for one page. Maybe we need @ranukund's input on which format is better.

@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from 51d1971 to 9c91ae9 Compare December 24, 2024 01:26
@wzt1997
Contributor Author

wzt1997 commented Dec 24, 2024

Do we want separate documents for non-quantized and quantized patterns?

Thanks for the question! I think it's better to put the quantized SDPA patterns in a separate document, as they require much more information than the pure floating-point patterns, such as the fpmath mode setting, group quantization, and the extra dynamic quantization ops.

It's also worth noting that this PR only documents the quantized SDPA patterns with compressed KV, not the pure quantized SDPA. We may need to think about merging them in the future.

Contributor

@TaoLv TaoLv left a comment

The documentation folder has changed on the main branch. Please rebase the PR. Thanks.

doc/graph/sdpa_with_compressed_kv.md (5 outdated review threads, resolved)
@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from 9c91ae9 to 2f307af Compare December 24, 2024 05:34
@vpirogov
Member

@ranukund, please help with the review.

@TaoLv TaoLv added the component:graph-api Codeowner: @oneapi-src/onednn-graph label Jan 3, 2025
Contributor

@ranukund ranukund left a comment

I've left a few comments, please incorporate as you see fit.

doc/graph/fusion_patterns/sdpa_with_compressed_kv.md (10 outdated review threads, resolved)
@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from 2f307af to 05e789e Compare January 6, 2025 01:51
@github-actions github-actions bot removed the component:graph-api Codeowner: @oneapi-src/onednn-graph label Jan 6, 2025
@wzt1997
Contributor Author

wzt1997 commented Jan 6, 2025

Thank you @ranukund! The suggestions are incorporated now.

@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from 05e789e to dbb71a0 Compare January 6, 2025 02:05
@mgouicem mgouicem added the component:graph-api Codeowner: @oneapi-src/onednn-graph label Jan 6, 2025
doc/graph/fusion_patterns/sdpa_with_compressed_kv.md (8 outdated review threads, resolved)
@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from dbb71a0 to 3d5e987 Compare January 7, 2025 01:45
@github-actions github-actions bot removed the component:graph-api Codeowner: @oneapi-src/onednn-graph label Jan 7, 2025
Contributor

@ranukund ranukund left a comment

A few more minor edits suggested; please incorporate. Thanks!

doc/graph/fusion_patterns/sdpa_with_compressed_kv.md (4 outdated review threads, resolved)
@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from 3d5e987 to 7036823 Compare January 9, 2025 01:17
@github-actions github-actions bot added the component:graph-api Codeowner: @oneapi-src/onednn-graph label Jan 9, 2025
@wzt1997
Contributor Author

wzt1997 commented Jan 9, 2025

A few more minor edits suggested; please incorporate. Thanks!

Thanks Ranu. I've incorporated the suggestions.

@wzt1997 wzt1997 force-pushed the zhitao/doc-compressed-sdpa branch from 7036823 to bc13d21 Compare January 10, 2025 01:24
Labels
component:graph-api (Codeowner: @oneapi-src/onednn-graph)
documentation (A request to change/fix/improve the documentation. Codeowner: @oneapi-src/onednn-doc)
5 participants