benchdnn: input: graph: remove case sdpa-compressed-kv-int8-gs128.json
Cover it instead by rewriting the data types of sdpa-compressed-kv-int4-gs32.json (via --dt).
TaoLv committed Jan 6, 2025
1 parent 94389ba commit 539767a
Showing 3 changed files with 2 additions and 550 deletions.
1 change: 0 additions & 1 deletion tests/benchdnn/inputs/graph/complex_fusion/harness_mha_all
@@ -24,7 +24,6 @@
 --reset --expected-n-partitions=0 --case=complex_fusion/mha/dynamic_quantized_mha-Bert_large-inf-int8-bs1-fake.json
 --reset --case=complex_fusion/mha/sdpa-plain-wo-scale-int8-bs1.json
 --reset --case=complex_fusion/mha/sdpa-compressed-kv-int4-gs32.json
---reset --case=complex_fusion/mha/sdpa-compressed-kv-int8-gs128.json
 --reset --case=complex_fusion/mha/sdpa-compressed-k-int8-gs32.json
 --reset --case=complex_fusion/mha/sdpa-compressed-v-int8-gs32.json

3 changes: 2 additions & 1 deletion tests/benchdnn/inputs/graph/complex_fusion/harness_mha_ci
@@ -21,5 +21,6 @@
 --reset --expected-n-partitions=0 --case=complex_fusion/mha/MHA-starcoder-inf-int8-bs1.json
 --reset --expected-n-partitions=0 --case=complex_fusion/mha/dynamic_quantized_mha-Bert_large-inf-int8-bs1-fake.json
 --reset --case=complex_fusion/mha/sdpa-plain-wo-scale-int8-bs1.json
---reset --case=complex_fusion/mha/sdpa-compressed-kv-int8-gs128.json
 --reset --case=complex_fusion/mha/sdpa-compressed-v-int8-gs32.json
+--reset --case=complex_fusion/mha/sdpa-compressed-kv-int4-gs32.json
+--reset --dt=0:s8+2:s8+6:s8+8:s8 --case=complex_fusion/mha/sdpa-compressed-kv-int4-gs32.json