cpu: x64: enable matmul-based IP for forward inference #2341
Conversation
```diff
-    VDISPATCH_INNER_PRODUCT(is_fwd(), VERBOSE_BAD_PROPKIND);
+    VDISPATCH_INNER_PRODUCT(
+            get_prop_kind() == prop_kind::forward_training,
+            VERBOSE_BAD_PROPKIND);
```
Could you clarify why jit_brgemm_ip gets restricted to forward training?
IIUC, jit_brgemm_ip comes after the new matmul_ip in the dispatch list, so there is a chance that a forward inference case not handled by matmul_ip (per the documented restrictions) will also be skipped by jit_brgemm_ip and will fall through to a lower-performance implementation.
> so there is a chance that if a forward inference case is not handled by matmul_ip (per the documented restrictions)
If a blocked layout cannot be used for the weights, it'll fall back to a plain layout, so all cases that can be handled by brgemm-ip can also be handled by the matmul-based ip.
The goal is to remove brgemm-ip completely so we can't use it as a complement.
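For what it's worth, the dispatching can be verified from the user side. Below is a minimal sketch using the public oneDNN C++ API to see which implementation picks up a forward-inference inner product; the shapes and data type are illustrative assumptions, not taken from this PR.

```cpp
#include <iostream>
#include "oneapi/dnnl/dnnl.hpp"

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);

    // Hypothetical shapes, chosen only for illustration.
    const memory::dim MB = 32, IC = 512, OC = 1024;

    // Plain src/dst; weights use format_tag::any so the library may pick
    // a blocked layout when the dispatched implementation supports it,
    // or fall back to a plain one otherwise.
    memory::desc src_md({MB, IC}, memory::data_type::f32, memory::format_tag::nc);
    memory::desc wei_md({OC, IC}, memory::data_type::f32, memory::format_tag::any);
    memory::desc dst_md({MB, OC}, memory::data_type::f32, memory::format_tag::nc);

    auto ip_pd = inner_product_forward::primitive_desc(
            eng, prop_kind::forward_inference, src_md, wei_md, dst_md);

    // Reports which implementation was dispatched for this case
    // (e.g. matmul-based vs. brgemm-based vs. a reference fallback).
    std::cout << "impl: " << ip_pd.impl_info_str() << "\n";
    return 0;
}
```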
This PR introduces a matmul-based inner product implementation for inference that will be used instead of the current brgemm-based one. The latter will be disabled for inference.
This PR is targeting oneDNN v3.8 and will be merged after the oneDNN v3.7 code freeze date.
The implementation allows using blocked weights layouts directly or via the special tag `any`. The Inner Product weights must meet ONE of the following requirements to enable using the blocked layouts:
If none of the above requirements are met then a plain layout will be used.
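To make the `any` behavior concrete, here is a minimal sketch of the usual query-and-reorder pattern on the user side; the helper name, the plain `oi` starting layout, and the surrounding setup are assumptions for illustration, not code from this PR.

```cpp
#include "oneapi/dnnl/dnnl.hpp"

using namespace dnnl;

// Returns weights in the layout chosen by the inner product primitive:
// a blocked layout when the requirements above are met, a plain one otherwise.
memory prepare_ip_weights(const engine &eng, stream &strm,
        memory user_wei, // user weights in a plain "oi" layout
        const inner_product_forward::primitive_desc &ip_pd) {
    // weights_desc() reflects the layout the implementation picked
    // for format_tag::any.
    if (ip_pd.weights_desc() == user_wei.get_desc()) return user_wei;

    // Reorder the plain user weights into the chosen (possibly blocked) layout.
    memory opt_wei(ip_pd.weights_desc(), eng);
    reorder(user_wei, opt_wei).execute(strm, user_wei, opt_wei);
    strm.wait();
    return opt_wei;
}
```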
The new matmul-based implementation is able to leverage blocked weights layouts for all cases used in the performance testing and therefore provides the best possible performance.
Below is a comparison of the matmul-based vs brgemm-based inner product. A value above 100% means the matmul-based implementation is better.