# optimum-intel
**Repository Path**: yangweijie_admin/optimum-intel
## Basic Information
- **Project Name**: optimum-intel
- **Description**: No description available
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-20
- **Last Updated**: 2026-04-20
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Optimum Intel
🤗 [Optimum Intel](https://huggingface.co/docs/optimum-intel/en/index) is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by [OpenVINO](https://docs.openvino.ai) to accelerate end-to-end pipelines on Intel architectures.
[OpenVINO](https://docs.openvino.ai) is an open-source toolkit that enables high-performance inference on Intel CPUs, GPUs, and dedicated deep learning inference accelerators (see the [full list of supported devices](https://docs.openvino.ai/2024/about-openvino/compatibility-and-support/supported-devices.html)). It ships with a set of tools to optimize your models with compression techniques such as quantization, pruning, and knowledge distillation. Optimum Intel provides a simple interface to optimize your Transformers and Diffusers models, convert them to the OpenVINO Intermediate Representation (IR) format, and run inference using OpenVINO Runtime.
## Installation
To install the latest release of 🤗 Optimum Intel with the corresponding required dependencies, you can use `pip` as follows:
```bash
python -m pip install -U "optimum-intel[openvino]"
```
Optimum Intel is a fast-moving project with regular additions of new model support, so you may want to install from source with the following command:
```bash
python -m pip install "optimum-intel @ git+https://github.com/huggingface/optimum-intel.git"
```
**Deprecation Notice:** The `openvino`, `nncf`, `neural-compressor`, and `ipex` extras (e.g., `pip install optimum-intel[openvino,nncf]`) are **deprecated** and will be **removed in a future release**.
## Export
To export your model to the [OpenVINO IR](https://docs.openvino.ai/2025/documentation/openvino-ir-format.html) format, use the `optimum-cli` tool.
Below is an example of exporting the [TinyLlama/TinyLlama_v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1) model:
```sh
optimum-cli export openvino --model TinyLlama/TinyLlama_v1.1 ov_TinyLlama_v1_1
```
Additional information on exporting models is available in the [documentation](https://huggingface.co/docs/optimum-intel/en/openvino/export).
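When the export has to run from a script (for example in a CI job), the same command can be assembled and launched from Python. The following is a minimal sketch, not part of Optimum Intel: `build_export_command` and `run_export` are hypothetical helper names, and the optional `--weight-format` flag is assumed to be supported by your installed version of `optimum-cli` (check `optimum-cli export openvino --help`).

```python
import shutil
import subprocess


def build_export_command(model_id, output_dir, weight_format=None):
    """Return the argument list for `optimum-cli export openvino`."""
    cmd = ["optimum-cli", "export", "openvino", "--model", model_id]
    if weight_format is not None:
        # Assumption: the installed optimum-cli accepts --weight-format.
        cmd += ["--weight-format", weight_format]
    cmd.append(output_dir)
    return cmd


def run_export(model_id, output_dir, weight_format=None):
    """Run the export; raise if optimum-cli is missing or the export fails."""
    if shutil.which("optimum-cli") is None:
        raise RuntimeError("optimum-cli not found; install optimum-intel first")
    subprocess.run(build_export_command(model_id, output_dir, weight_format), check=True)


# Same command as the shell example above, expressed as an argument list.
command = build_export_command("TinyLlama/TinyLlama_v1.1", "ov_TinyLlama_v1_1")
print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting issues with model IDs and output paths. Call `run_export(...)` only in an environment where `optimum-cli` is installed.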
## Inference
To load an exported model and run inference using Optimum Intel, use the corresponding `OVModelForXxx` class instead of `AutoModelForXxx`:
```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Directory produced by the export step above
model_id = "ov_TinyLlama_v1_1"
model = OVModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The OpenVINO model plugs into the regular Transformers pipeline API
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
results = pipe("Hey, how are you doing today?", max_new_tokens=100)
```
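The `results` object is a plain Python structure. For a single prompt, the Transformers text-generation pipeline returns a list with one dictionary per generated sequence, and each dictionary holds the text under the `generated_text` key. Below is a small sketch of reading it; `extract_generated_texts` is a hypothetical helper, and `sample_results` is made-up placeholder data standing in for real pipeline output.

```python
def extract_generated_texts(results):
    """Collect the generated strings from text-generation pipeline output."""
    return [item["generated_text"] for item in results]


# Illustrative stand-in for the output of pipe(...); not real model output.
sample_results = [
    {"generated_text": "Hey, how are you doing today? <model continuation>"}
]

for text in extract_generated_texts(sample_results):
    print(text)
```

By default the generated text includes the prompt followed by the continuation, which is why the placeholder string starts with the original question.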
For more details on Optimum Intel inference, refer to the [documentation](https://huggingface.co/docs/optimum-intel/en/openvino/inference).
**Note:** Alternatively, an exported model can be run with the [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai) framework,
which provides optimized execution methods for high-performance generative AI.
## Quantization
Post-training static quantization can also be applied. Here is an example of how to apply static quantization to a Whisper model, using the [LibriSpeech](https://huggingface.co/datasets/openslr/librispeech_asr) dataset for the calibration step.
```python
from optimum.intel import OVModelForSpeechSeq2Seq, OVQuantizationConfig

model_id = "openai/whisper-tiny"
# Quantize to int8, calibrating on 50 samples from LibriSpeech
q_config = OVQuantizationConfig(dtype="int8", dataset="librispeech", num_samples=50)
q_model = OVModelForSpeechSeq2Seq.from_pretrained(model_id, quantization_config=q_config)

# The directory where the quantized model will be saved
save_dir = "nncf_results"
q_model.save_pretrained(save_dir)
```
You can find more information in the [documentation](https://huggingface.co/docs/optimum-intel/en/openvino/optimization).
## Running the examples
Check out the [`notebooks`](https://github.com/huggingface/optimum-intel/tree/main/notebooks) directory to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.
Do not forget to install the requirements for each example you want to run:
```sh
cd <example-folder>
pip install -r requirements.txt
```
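To install the requirements of several examples in one pass, a short script can collect every `requirements.txt` under a local checkout. This is a sketch under assumptions: it assumes each example keeps its dependencies in a file named `requirements.txt`, as the command above suggests, and `find_requirements` and `pip_install_commands` are hypothetical helpers rather than anything shipped with the repository.

```python
import sys
from pathlib import Path


def find_requirements(root):
    """Return every requirements.txt below `root`, sorted for stable output."""
    return sorted(Path(root).rglob("requirements.txt"))


def pip_install_commands(root):
    """Build one `pip install -r` argument list per requirements file."""
    return [
        [sys.executable, "-m", "pip", "install", "-r", str(path)]
        for path in find_requirements(root)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_requirements.py <path-to-examples>
    # Only prints the commands; review them before running anything.
    for cmd in pip_install_commands(sys.argv[1]):
        print(" ".join(cmd))
```

The script only prints the commands, so nothing is installed until you run them yourself. Using `sys.executable` keeps the installs in the same Python environment that runs the script.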