[Feature] [Optimum] [Intel] [OpenVINO] Add OpenVINO backend support through Optimum-Intel #454

Open

tjtanaa wants to merge 9 commits into main
Conversation

@tjtanaa (Contributor) commented on Nov 7, 2024

Description

This PR integrates the OpenVINO backend into Infinity's OptimumEmbedder class via the optimum-intel library.

Types of Change

  • New feature
  • Documentation update

Checklist

  • I have read the CONTRIBUTING guidelines.
  • My code follows the code style of this project.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • My changes generate no new warnings.
  • I have updated the documentation accordingly.

Additional Notes

Multiple inference precisions can be specified in libs/infinity_emb/infinity_emb/transformer/utils_optimum.py via the OpenVINO config:

"ov_config": {
    "INFERENCE_PRECISION_HINT": "bf16"  # supports fp32, fp16 and bf16
}

The inference precision hint is hardcoded to bf16 because it offers the fastest inference speed.
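
For reference, here is a minimal sketch of how such a precision hint can be passed when loading an embedding model with optimum-intel. It is illustrative only: the model id is a placeholder, the exact call in utils_optimum.py differs, and it assumes optimum-intel's from_pretrained accepts an ov_config dict.

# Minimal sketch: load an embedding model through optimum-intel with an
# OpenVINO inference precision hint. The model id is a placeholder.
from optimum.intel import OVModelForFeatureExtraction
from transformers import AutoTokenizer

model_id = "BAAI/bge-small-en-v1.5"  # placeholder model
ov_config = {"INFERENCE_PRECISION_HINT": "bf16"}  # "fp32" and "fp16" also work

model = OVModelForFeatureExtraction.from_pretrained(
    model_id,
    export=True,         # export to OpenVINO IR if no pre-converted files exist
    ov_config=ov_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

batch = tokenizer(["hello world"], return_tensors="pt")
token_embeddings = model(**batch).last_hidden_state  # pool to get sentence embeddings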

We have also run an MTEB evaluation (bank classification dataset) on the INT4 weight-only quantized model with BF16 inference precision; the drop in accuracy is only 0.71%.

Based on the speed/accuracy tradeoff and ease of use, we think settling on a single effective configuration improves the user experience of infinity_emb.

License

By submitting this PR, I confirm that my contribution is made under the terms of the MIT license.

@codecov-commenter commented on Nov 12, 2024

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 44.15584% with 43 lines in your changes missing coverage. Please review.

Project coverage is 78.20%. Comparing base (c9a8404) to head (295e840).
Report is 8 commits behind head on main.

Files with missing lines                                   Patch %   Lines
...nity_emb/infinity_emb/transformer/utils_optimum.py      42.30%    30 Missing ⚠️
...y_emb/infinity_emb/transformer/embedder/optimum.py      43.47%    13 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #454      +/-   ##
==========================================
- Coverage   79.18%   78.20%   -0.99%     
==========================================
  Files          41       41              
  Lines        3248     3308      +60     
==========================================
+ Hits         2572     2587      +15     
- Misses        676      721      +45     


@michaelfeil mentioned this pull request on Nov 12, 2024
@michaelfeil marked this pull request as ready for review on November 16, 2024 at 04:03
@greptile-apps bot (Contributor) left a comment

PR Summary

This PR adds OpenVINO backend support through Optimum-Intel integration, enabling optimized inference on Intel hardware with BF16 precision.

  • Added OpenVINO execution provider in primitives.py and integrated model loading/optimization in utils_optimum.py with hardcoded BF16 precision (a simplified sketch follows this list)
  • Implemented OpenVINO model file handling and caching in utils_optimum.py with get_openvino_files() function
  • Added CHECK_OPTIMUM_INTEL dependency check in _optional_imports.py for graceful fallback
  • Modified Docker.template.yaml and added Dockerfile.intel_auto to support OpenVINO builds
  • Updated OptimumEmbedder class to handle OpenVINO model loading while maintaining compatibility with existing ONNX runtime
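
As a rough illustration of the provider dispatch and dependency check described above (only CHECK_OPTIMUM_INTEL, the provider strings, and the module paths named in this summary come from the PR; the function itself is a simplified sketch, not the actual implementation):

# Simplified sketch of choosing between the OpenVINO and ONNX Runtime paths.
from infinity_emb._optional_imports import CHECK_OPTIMUM_INTEL  # module path per the summary above


def load_embedding_model(model_name: str, provider: str):
    if provider == "OpenVINOExecutionProvider":
        CHECK_OPTIMUM_INTEL.mark_required()  # raise a helpful error if optimum-intel is missing
        from optimum.intel import OVModelForFeatureExtraction

        return OVModelForFeatureExtraction.from_pretrained(
            model_name, ov_config={"INFERENCE_PRECISION_HINT": "bf16"}
        )
    elif provider == "CPUExecutionProvider":
        from optimum.onnxruntime import ORTModelForFeatureExtraction

        return ORTModelForFeatureExtraction.from_pretrained(model_name, provider=provider)
    else:
        raise ValueError(f"Unsupported execution provider: {provider!r}")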

7 file(s) reviewed, 13 comment(s)

@@ -16,6 +16,7 @@ cpu:
# "RUN poetry install --no-interaction --no-ansi --no-root --extras \"${EXTRAS}\" --without lint,test && poetry cache clear pypi --all"
COPY requirements_install_from_poetry.sh requirements_install_from_poetry.sh
RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

style: Consider pinning the optimum[openvino] version to ensure reproducible builds

Comment on lines 18 to 19
RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

logic: Installing optimum[openvino] after requirements_install_from_poetry.sh may override dependency versions. Consider integrating this into the requirements script

Comment on lines 42 to 43
RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

logic: Installing optimum[openvino] after poetry install could override poetry-managed dependencies. Consider adding optimum[openvino] to pyproject.toml instead.

Comment on lines 49 to 50
RUN ./requirements_install_from_poetry.sh --without lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

logic: Redundant installation of optimum[openvino] - this package was already installed in the previous stage

Comment on lines 59 to 60
RUN ./requirements_install_from_poetry.sh --with lint,test "https://download.pytorch.org/whl/cpu"
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

logic: Third redundant installation of optimum[openvino] - consider consolidating into a single installation in base image

Comment on lines +66 to +67
except Exception as e: # show error then let the optimum intel compress on the fly
    print(str(e))

logic: printing error to stdout could mask critical failures. Consider proper error logging or propagating the exception
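
A small sketch of the suggested pattern; the helper names are hypothetical placeholders, not functions from this PR:

# Sketch: log the failure with its traceback instead of printing to stdout,
# then fall back to letting optimum-intel compress the model on the fly.
# load_precompressed_model and compress_on_the_fly are hypothetical placeholders.
import logging

logger = logging.getLogger(__name__)

try:
    model = load_precompressed_model(model_name)
except Exception:
    logger.exception(
        "Loading a pre-compressed OpenVINO model failed; "
        "falling back to on-the-fly compression via optimum-intel."
    )
    model = compress_on_the_fly(model_name)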


if provider == "OpenVINOExecutionProvider":
    CHECK_OPTIMUM_INTEL.mark_required()
    filename = ""

logic: empty filename could cause issues if get_openvino_files fails. Consider setting a default model path or handling this case explicitly
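
One possible shape for handling this explicitly (a sketch only; the real signature of get_openvino_files in utils_optimum.py may differ, and the surrounding variable names mirror the snippet above):

# Sketch: make the "no pre-exported OpenVINO files" case explicit instead of
# silently continuing with an empty filename.
if provider == "OpenVINOExecutionProvider":
    CHECK_OPTIMUM_INTEL.mark_required()
    try:
        filename = get_openvino_files(model_name_or_path, revision)  # signature assumed
    except Exception:
        logger.warning(
            "No pre-exported OpenVINO files found for %s; "
            "optimum-intel will export/compress the model on the fly.",
            model_name_or_path,
        )
        filename = ""  # sentinel: let optimum-intel handle export on the fly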

        use_auth_token=True,
        prefer_quantized="cpu" in provider.lower(),
    )
elif provider == "CPUExecutionProvider":

logic: missing else clause for unsupported providers could lead to undefined model state

Comment on lines +136 to +139
if files_optimized:
    file_optimized = files_optimized[-1]
if file_name:
    file_optimized = file_name

logic: file_name overrides files_optimized[-1] without checking if file_name exists, could cause errors if file_name is invalid
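
For illustration, one way to address this (variable names follow the snippet above; this is a sketch, not the PR's code):

# Sketch: only honor an explicitly requested file_name if it is actually among
# the discovered optimized files; otherwise fail with a clear error.
if file_name:
    if any(f.name == file_name for f in files_optimized):
        file_optimized = file_name
    else:
        raise FileNotFoundError(
            f"Requested file {file_name!r} not found among optimized files: {files_optimized}"
        )
elif files_optimized:
    file_optimized = files_optimized[-1]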

openvino_files = [p for p in repo_files if p.match(pattern)]

if len(openvino_files) > 1:
logger.info(f"Found {len(openvino_files)} onnx files: {openvino_files}")

syntax: log message incorrectly refers to 'onnx files' when listing OpenVINO files
