remove test models and fix order of checks (#1401)
* remove test models after the llm bench tests pass, to reduce disk space usage
* fix the order of checks in the wwb tests (the csv file is produced only by a
successful run of the cli command, yet the tests tried to open the file first
and only afterwards checked the cli command's return code)
* reduce inference counts in the llm bench tests (halves execution time,
from 72 min to 36 min)
eaidova authored Dec 18, 2024
1 parent b31b6a1 commit 7d2a303
Showing 3 changed files with 24 additions and 18 deletions.
22 changes: 14 additions & 8 deletions .github/workflows/llm_bench-python.yml
@@ -61,7 +61,6 @@ jobs:
SRC_DIR: ${{ github.workspace }}
LLM_BENCH_PYPATH: ${{ github.workspace }}/tools/llm_bench
WWB_PATH: ${{ github.workspace }}/tools/who_what_benchmark
OPENVINO_LOG_LEVEL: 3

steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
@@ -103,30 +102,34 @@ jobs:
- name: Test native pytorch model on Linux
run: |
git clone --depth 1 https://huggingface.co/katuni4ka/tiny-random-qwen
python ./tools/llm_bench/benchmark.py -m tiny-random-qwen -d cpu -n 1 -f pt
python ./tools/llm_bench/benchmark.py -m tiny-random-qwen -d cpu -n 1 -f pt -ic 20
rm -rf tiny-random-qwen
env:
GIT_LFS_SKIP_SMUDGE: 0
- name: Test tiny-random-baichuan2 on Linux Optimum Intel
run: |
optimum-cli export openvino --model katuni4ka/tiny-random-baichuan2 --trust-remote-code --weight-format fp16 ./ov_models/tiny-random-baichuan2/pytorch/dldt/FP16
python ./tools/llm_bench/benchmark.py -m ./ov_models/tiny-random-baichuan2/pytorch/dldt/FP16/ -d cpu -n 1 --optimum
python ./tools/llm_bench/benchmark.py -m ./ov_models/tiny-random-baichuan2/pytorch/dldt/FP16/ -d cpu -n 1 --optimum -ic 10
rm -rf ./ov_models/tiny-random-baichuan2
- name: Test OpenVINO/LCM_Dreamshaper_v7-int8-ov on Linux Optimum Intel
run: |
huggingface-cli download OpenVINO/LCM_Dreamshaper_v7-int8-ov --local-dir ov_models/lcm_dreamshaper_v7
python ./tools/llm_bench/benchmark.py -m ./ov_models/lcm_dreamshaper_v7/ -pf ./tools/llm_bench/prompts/stable-diffusion.jsonl -d cpu -n 1 --optimum
python ./tools/llm_bench/benchmark.py -m ./ov_models/lcm_dreamshaper_v7/ -pf ./tools/llm_bench/prompts/stable-diffusion.jsonl -d cpu -n 1 --optimum -ic 4
- name: Test OpenVINO/LCM_Dreamshaper_v7-int8-ov on Linux with GenAI
run: |
python ./tools/llm_bench/benchmark.py -m ./ov_models/lcm_dreamshaper_v7/ -pf ./tools/llm_bench/prompts/stable-diffusion.jsonl -d cpu -n 1
python ./tools/llm_bench/benchmark.py -m ./ov_models/lcm_dreamshaper_v7/ -pf ./tools/llm_bench/prompts/stable-diffusion.jsonl -d cpu -n 1 -ic 4
- name: Test OpenVINO/LCM_Dreamshaper_v7-int8-ov on Linux with GenAI and LoRA
run: |
wget -O ./ov_models/soulcard.safetensors https://civitai.com/api/download/models/72591
python ./tools/llm_bench/benchmark.py -m ./ov_models/lcm_dreamshaper_v7/ -pf ./tools/llm_bench/prompts/stable-diffusion.jsonl -d cpu -n 1 --lora ./ov_models/soulcard.safetensors --lora_alphas 0.7
python ./tools/llm_bench/benchmark.py -m ./ov_models/lcm_dreamshaper_v7/ -pf ./tools/llm_bench/prompts/stable-diffusion.jsonl -d cpu -n 1 --lora ./ov_models/soulcard.safetensors --lora_alphas 0.7 -ic 4
rm -rf ./ov_models/lcm_dreamshaper_v7/
- name: Test TinyLlama-1.1B-Chat-v1.0 in Speculative Decoding mode on Linux
run: |
optimum-cli export openvino --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --trust-remote-code --weight-format fp16 ov_models/TinyLlama-1.1B-Chat-v1.0/FP16
optimum-cli export openvino --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --trust-remote-code --weight-format int8 ov_models/TinyLlama-1.1B-Chat-v1.0/INT8
python ./tools/llm_bench/benchmark.py -m ./ov_models/TinyLlama-1.1B-Chat-v1.0/FP16/ --draft_model ./ov_models/TinyLlama-1.1B-Chat-v1.0/INT8/ -p "Why is the Sun yellow?" -d cpu --draft_device cpu -n 1 --assistant_confidence_threshold 0.4
python ./tools/llm_bench/benchmark.py -m ./ov_models/TinyLlama-1.1B-Chat-v1.0/FP16/ --draft_model ./ov_models/TinyLlama-1.1B-Chat-v1.0/INT8/ -p "Why is the Sun yellow?" -d cpu --draft_device cpu -n 1 --num_assistant_tokens 5
python ./tools/llm_bench/benchmark.py -m ./ov_models/TinyLlama-1.1B-Chat-v1.0/FP16/ --draft_model ./ov_models/TinyLlama-1.1B-Chat-v1.0/INT8/ -p "Why is the Sun yellow?" -d cpu --draft_device cpu -n 1 --assistant_confidence_threshold 0.4 -ic 20
python ./tools/llm_bench/benchmark.py -m ./ov_models/TinyLlama-1.1B-Chat-v1.0/FP16/ --draft_model ./ov_models/TinyLlama-1.1B-Chat-v1.0/INT8/ -p "Why is the Sun yellow?" -d cpu --draft_device cpu -n 1 --num_assistant_tokens 5 -ic 20
rm -rf ov_models/TinyLlama-1.1B-Chat-v1.0
- name: Test whisper-tiny on Linux
run: |
GIT_LFS_SKIP_SMUDGE=1 git clone --depth 1 --branch main --single-branch https://huggingface.co/datasets/facebook/multilingual_librispeech
@@ -138,11 +141,14 @@ jobs:
optimum-cli export openvino --trust-remote-code --model openai/whisper-tiny ./ov_models/whisper-tiny
python ./tools/llm_bench/benchmark.py -m ./ov_models/whisper-tiny --media multilingual_librispeech/data/mls_polish/train/audio/3283_1447_000/3283_1447_000000.flac -d cpu -n 1 --optimum
python ./tools/llm_bench/benchmark.py -m ./ov_models/whisper-tiny --media multilingual_librispeech/data/mls_polish/train/audio/3283_1447_000/3283_1447_000000.flac -d cpu -n 1
rm -rf ./ov_models/whisper-tiny
rm -rf multilingual_librispeech
- name: Test InternVL2-1B on Linux
run: |
optimum-cli export openvino --model OpenGVLab/InternVL2-1B ./ov_models/internvl2-1B --task image-text-to-text --trust-remote-code
python ./tools/llm_bench/benchmark.py -m ./ov_models/internvl2-1B --media https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/d5fbbd1a-d484-415c-88cb-9986625b7b11 --prompt "What is unusual on this image?" -ic 20
python ./tools/llm_bench/benchmark.py -m ./ov_models/internvl2-1B --media https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/d5fbbd1a-d484-415c-88cb-9986625b7b11 --prompt "What is unusual on this image?" -ic 20 --optimum
rm -rf ./ov_models/internvl2-1B
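The cleanup pattern added across these workflow steps is the same each time: export or download a model, benchmark it, then delete it so the CI runner's disk usage stays bounded. A standalone sketch (paths and the export/benchmark commands are illustrative stand-ins, not the real tooling calls):

```shell
set -e
MODEL_DIR=./ov_models/example-model          # placeholder path

# Stand-in for the optimum-cli export / huggingface-cli download step.
mkdir -p "$MODEL_DIR"
echo dummy > "$MODEL_DIR/model.bin"

# Stand-in for the benchmark.py invocation against the exported model.
echo "benchmark would run against $MODEL_DIR"

# Reclaim the disk space once the step no longer needs the model.
rm -rf "$MODEL_DIR"
```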
- name: WWB Tests
run: |
pip install git+https://github.com/huggingface/optimum-intel.git
11 changes: 7 additions & 4 deletions tools/llm_bench/task/image_generation.py
@@ -25,11 +25,14 @@
stable_diffusion_hook = StableDiffusionHook()


def collects_input_args(image_param, model_type, model_name, callback=None):
def collects_input_args(image_param, model_type, model_name, infer_count=None, callback=None):
input_args = {}
input_args["width"] = image_param.get('width', DEFAULT_IMAGE_WIDTH)
input_args["height"] = image_param.get('height', DEFAULT_IMAGE_HEIGHT)
input_args["num_inference_steps"] = image_param.get('steps', DEFAULT_INFERENCE_STEPS if 'lcm' not in model_name else LCM_DEFAULT_INFERENCE_STEPS)
if infer_count is None:
input_args["num_inference_steps"] = image_param.get('steps', DEFAULT_INFERENCE_STEPS if 'lcm' not in model_name else LCM_DEFAULT_INFERENCE_STEPS)
else:
input_args["num_inference_steps"] = infer_count
guidance_scale = image_param.get('guidance_scale', None)

if guidance_scale is not None:
@@ -57,7 +60,7 @@ def collects_input_args(image_param, model_type, model_name, callback=None):
def run_image_generation(image_param, num, image_id, pipe, args, iter_data_list, proc_id, mem_consumption, callback=None):
set_seed(args['seed'])
input_text = image_param['prompt']
input_args = collects_input_args(image_param, args['model_type'], args['model_name'])
input_args = collects_input_args(image_param, args['model_type'], args['model_name'], args["infer_count"])
out_str = f"Input params: Batch_size={args['batch_size']}, " \
f"steps={input_args['num_inference_steps']}, width={input_args['width']}, height={input_args['height']}"
if 'guidance_scale' in input_args:
@@ -120,7 +123,7 @@ def run_image_generation(image_param, num, image_id, pipe, args, iter_data_list,
def run_image_generation_genai(image_param, num, image_id, pipe, args, iter_data_list, proc_id, mem_consumption, callback=None):
set_seed(args['seed'])
input_text = image_param['prompt']
input_args = collects_input_args(image_param, args['model_type'], args['model_name'], callback)
input_args = collects_input_args(image_param, args['model_type'], args['model_name'], args["infer_count"], callback)
out_str = f"Input params: Batch_size={args['batch_size']}, " \
f"steps={input_args['num_inference_steps']}, width={input_args['width']}, height={input_args['height']}"
if 'guidance_scale' in input_args:
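The precedence that the `collects_input_args` change introduces can be shown on its own: an explicit `infer_count` (the new `-ic` flag) overrides both the per-prompt `steps` field and the model-type default. A minimal sketch, with placeholder default values rather than the tool's real constants:

```python
# Illustrative placeholders; the real defaults live in llm_bench.
DEFAULT_INFERENCE_STEPS = 20
LCM_DEFAULT_INFERENCE_STEPS = 4

def pick_num_steps(image_param, model_name, infer_count=None):
    """Mirror of the new num_inference_steps selection logic."""
    if infer_count is None:
        # No -ic given: fall back to the prompt's 'steps' field,
        # then to the model-type default (LCM models use fewer steps).
        default = (LCM_DEFAULT_INFERENCE_STEPS if "lcm" in model_name
                   else DEFAULT_INFERENCE_STEPS)
        return image_param.get("steps", default)
    # -ic always wins, which is what lets CI cap inference counts.
    return infer_count
```

This is what allows the workflow above to pass `-ic 4` to the diffusion tests and cut their runtime regardless of per-prompt settings.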
9 changes: 3 additions & 6 deletions tools/who_what_benchmark/tests/test_cli_text.py
@@ -94,9 +94,8 @@ def test_text_gt_data():
"CPU",
]
)
assert result.returncode == 0
data = pd.read_csv(temp_file_name)

assert result.returncode == 0
assert len(data["questions"].values) == 2


@@ -174,9 +173,8 @@ def test_text_language_autodetect():
"CPU",
]
)
assert result.returncode == 0
data = pd.read_csv(temp_file_name)

assert result.returncode == 0
assert "马克" in data["prompts"].values[0]


@@ -196,9 +194,8 @@ def test_text_hf_model():
"--hf",
]
)
assert result.returncode == 0
data = pd.read_csv(temp_file_name)

assert result.returncode == 0
assert len(data["prompts"].values) == 2


