Running YOLOv8 Inference on an Intel GPU with OpenVINO in Python
As AI continues to advance, running deep-learning inference on GPUs has become a standard way to accelerate AI applications. Intel's Arc GPUs perform well at AI inference, and paired with the OpenVINO toolkit their hardware acceleration can be fully exploited. This article shows how to use OpenVINO from Python to run YOLOv8 model inference on an Intel GPU.
1. Environment Setup
To call an Intel GPU through OpenVINO from Python, we first need to set up the environment: Python itself, the OpenVINO toolkit, and the related dependency libraries.
1.1 Installing Python
First, make sure Python is installed on the system; version 3.8 or later is recommended. Check the installed version with:
python --version
If Python is not installed, download and install it from the official Python website.
1.2 Installing OpenVINO
OpenVINO is a powerful AI inference toolkit that supports multiple hardware platforms, including Intel's Arc GPUs. Install it as follows:
- Visit the OpenVINO website and download the version for your operating system.
- Extract and install the OpenVINO toolkit.
- Configure the environment variables so the OpenVINO tools can be called from the command line.
You can confirm the installation succeeded by sourcing the environment setup script, which prints an initialization message:
source /opt/intel/openvino/setupvars.sh
Or on Windows:
"C:\Program Files (x86)\Intel\openvino_2022\bin\setupvars.bat"
1.3 Installing Python Dependencies
The inference code uses the following Python libraries:
- OpenVINO's Python API (openvino)
- Image processing (opencv-python; the headless variant is sufficient here, since no GUI windows are used)
- Helper libraries: numpy, plus tqdm and torch, which the example script below imports
Install these dependencies with pip:
pip install openvino opencv-python-headless numpy tqdm torch
Note that openvino must be version 2024.0.0 or newer; older releases do not support calling the iGPU:
pip install "openvino>=2024.0.0"
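To double-check which runtime version pip actually installed, you can query it from the OpenVINO runtime itself:
python -c "from openvino.runtime import get_version; print(get_version())"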
2. YOLOv8 Model Inference
With the environment configured, we can use OpenVINO to run YOLOv8 inference on the Intel GPU. Below is a complete example.
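The script reads an OpenVINO IR model (an .xml file plus its .bin weights). If you start from YOLOv8 .pt weights, they can be converted first with the Ultralytics exporter. The following is a minimal sketch assuming a recent ultralytics package; yinlie2048.pt is a hypothetical name standing in for your own trained weights, and the _bs1 suffix in the path below suggests the author renamed the exported folder per batch size:

from ultralytics import YOLO

model = YOLO("yinlie2048.pt")  # your trained YOLOv8 weights (hypothetical file name)
# Export to OpenVINO IR; imgsz and batch should match what the inference script expects
model.export(format="openvino", imgsz=2048, batch=1, half=True)
# This produces a "yinlie2048_openvino_model" directory containing yinlie2048.xml and .bin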
import cv2
import numpy as np
from openvino.runtime import Core
import time
from tqdm import tqdm
import torch
import os


class SimpleLetterBox:
    def __init__(self, new_shape=(640, 640), center=True, resize_only=False, use_gpu=False):
        """Initialize SimpleLetterBox with target shape and padding mode."""
        self.new_shape = new_shape
        self.center = center
        self.resize_only = resize_only
        self.use_gpu = use_gpu

    def __call__(self, image):
        if self.resize_only:
            if self.use_gpu:
                # Run the resize on the GPU through OpenCV's OpenCL backend (UMat)
                image_gpu = cv2.UMat(image)
                resized_image = cv2.resize(image_gpu, self.new_shape, interpolation=cv2.INTER_LINEAR).get()
            else:
                resized_image = cv2.resize(image, self.new_shape, interpolation=cv2.INTER_LINEAR)
            return resized_image

        # Resize image and pad to the target shape
        shape = image.shape[:2]  # current shape [height, width]
        new_shape = self.new_shape

        # Scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))

        # Compute padding
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]
        if self.center:
            dw /= 2  # divide padding into 2 sides
            dh /= 2

        if shape[::-1] != new_unpad:  # resize
            if self.use_gpu:
                image_gpu = cv2.UMat(image)
                image = cv2.resize(image_gpu, new_unpad, interpolation=cv2.INTER_LINEAR).get()
            else:
                image = cv2.resize(image, new_unpad, interpolation=cv2.INTER_LINEAR)

        # Add gray padding (YOLO convention: value 114)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        padded_image = cv2.copyMakeBorder(
            image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114)
        )
        return padded_image


batch_size = 1
use_gpu = True
input_shape = (2048, 2048)
letterbox = SimpleLetterBox(input_shape, resize_only=False, use_gpu=use_gpu)

imgs_dir = r"C:\Users\leaper\Downloads\dataset2000\images\part"
imgs_paths = os.listdir(imgs_dir)

# Initialize the OpenVINO runtime and compile the model for the first GPU device
core = Core()
model_path = f"C:/Users/leaper/Desktop/ultralytics/yinlie2048_openvino_model_bs{batch_size}/yinlie2048.xml"
model = core.read_model(model_path)
compiled_model = core.compile_model(model, "GPU.0")

total_preprocess_time = 0
total_inference_time = 0
num_batches = 0

# Process images in batches
for i in tqdm(range(0, len(imgs_paths), batch_size)):
    imgs_batch = []

    # Load images and letterbox them to the model input size
    for j in range(batch_size):
        if i + j < len(imgs_paths):
            img_path = os.path.join(imgs_dir, imgs_paths[i + j])
            image = cv2.imread(img_path)
            if image is not None:
                imgs_batch.append(letterbox(image))

    if not imgs_batch:  # skip the batch if no image could be read
        continue

    start_preprocess_time = time.time()
    im = np.stack(imgs_batch)
    im = im[..., ::-1].transpose((0, 3, 1, 2))  # BGR to RGB, BHWC to BCHW, (n, 3, h, w)
    im = np.ascontiguousarray(im)  # contiguous
    im = torch.from_numpy(im)
    im = im.half() / 255  # uint8 -> fp16, normalized to [0, 1]
    preprocess_time = (time.time() - start_preprocess_time) * 1000  # convert to milliseconds

    # Model inference (recent OpenVINO releases accept torch tensors as inputs)
    start_inference_time = time.time()
    output = compiled_model([im])
    inference_time = (time.time() - start_inference_time) * 1000  # convert to milliseconds

    total_preprocess_time += preprocess_time
    total_inference_time += inference_time
    num_batches += 1
    # print(f"Batch {i // batch_size + 1} - Preprocessing time: {preprocess_time:.2f} ms, Inference time: {inference_time:.2f} ms")

# Output average times
average_preprocess_time = total_preprocess_time / num_batches
average_inference_time = total_inference_time / num_batches
print(f"Average preprocessing time: {average_preprocess_time:.2f} ms")
print(f"Average inference time: {average_inference_time:.2f} ms")
Author: ZhouDevin