neural-compressor

Here are 2 public repositories matching this topic...

intel / auto-round

Advanced Quantization Algorithm for LLMs/VLMs.

rounding quantization awq int4 gptq neural-compressor

Updated Jan 10, 2025
Python

huggingface / optimum-benchmark

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.

benchmark pytorch openvino onnxruntime text-generation-inference neural-compressor tensorrt-llm

Updated Dec 17, 2024
Python

