Add OpenVINO weights compression to docs (#435)
* Add weights compression to docs

* Update optimization_ov.mdx
helena-intel authored Nov 6, 2023
1 parent 9562235 commit 99a3970
Showing 1 changed file with 22 additions and 1 deletion.
23 changes: 22 additions & 1 deletion docs/source/optimization_ov.mdx
@@ -62,6 +62,27 @@ tokenizer.save_pretrained(save_dir)

The `quantize()` method applies post-training static quantization and exports the resulting quantized model to the OpenVINO Intermediate Representation (IR). The resulting graph is represented with two files: an XML file describing the network topology and a binary file describing the weights. The resulting model can be run on any target Intel device.
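
As a quick sanity check after export, the XML/binary pair can be located with a few lines of standard-library Python. This is only a sketch: `find_ir_files` is a hypothetical helper, not part of `optimum-intel`, and the file names `openvino_model.xml` and `openvino_model.bin` are an assumption about what ends up in the save directory.

```python
import tempfile
from pathlib import Path


def find_ir_files(model_dir):
    """Return (xml_path, bin_path) pairs for IR models found in model_dir.

    Hypothetical helper for illustration; not an optimum-intel API.
    """
    pairs = []
    for xml_path in sorted(Path(model_dir).glob("*.xml")):
        # The weights file is expected next to the topology file, same stem.
        bin_path = xml_path.with_suffix(".bin")
        if bin_path.exists():
            pairs.append((xml_path, bin_path))
    return pairs


# Demo on a temporary directory with placeholder files (assumed file names).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "openvino_model.xml").touch()
    (Path(tmp) / "openvino_model.bin").touch()
    found = [(x.name, b.name) for x, b in find_ir_files(tmp)]
```

In practice, the same helper would be pointed at the `save_dir` passed to `quantize()`.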

### Weights compression

For large language models (LLMs), it is often beneficial to quantize only the weights and keep activations in floating-point precision. This method does not require a calibration dataset. To enable weights compression, pass `weights_only=True` to the `quantize()` method of `OVQuantizer`:

```python
from optimum.intel.openvino import OVQuantizer, OVModelForCausalLM
from transformers import AutoModelForCausalLM

save_dir = "int8_weights_compressed_model"
model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-3b")
quantizer = OVQuantizer.from_pretrained(model, task="text-generation")
quantizer.quantize(save_directory=save_dir, weights_only=True)
```

To load the optimized model for inference:

```python
optimized_model = OVModelForCausalLM.from_pretrained(save_dir)
```

Weights compression is supported for both PyTorch and OpenVINO models: the starting model can be an `AutoModelForCausalLM` or an `OVModelForCausalLM` instance.
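
To give a rough intuition for why weight-only quantization needs no calibration dataset, here is a minimal NumPy sketch. It assumes symmetric per-row int8 quantization, which is an illustrative choice and not necessarily the scheme `OVQuantizer` applies internally. The function names and the toy layer shapes are hypothetical.

```python
import numpy as np


def quantize_weights_int8(weights):
    """Symmetric per-row int8 quantization (illustrative, assumed scheme)."""
    # One scale per output channel (row): the largest magnitude maps to 127.
    max_abs = np.abs(weights).max(axis=1, keepdims=True)
    scales = np.where(max_abs == 0, 1.0, max_abs / 127.0)
    q = np.clip(np.round(weights / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)


def dequantize_weights(q, scales):
    """Recover approximate floating-point weights from int8 values and scales."""
    return q.astype(np.float32) * scales


rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 8)).astype(np.float32)      # toy linear layer
activations = rng.standard_normal((2, 8)).astype(np.float32)  # kept in floating point

# The scales depend only on the weights themselves, so no input data is needed.
q_weights, scales = quantize_weights_int8(weights)
restored = dequantize_weights(q_weights, scales)

# Compare the layer output with original and with compressed weights.
reference = activations @ weights.T
approx = activations @ restored.T
max_error = float(np.abs(reference - approx).max())
```

The weights are stored as int8 plus one scale per row, while activations pass through unchanged; the per-element rounding error is bounded by half of the row's scale.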

## Training-time optimization

@@ -221,4 +242,4 @@ text = "He's a dreadful magician."
outputs = cls_pipe(text)

[{'label': 'NEGATIVE', 'score': 0.9840195178985596}]
```
```
