代码收藏家技术教程, 15 days ago

Hand Landmarks Detection in Python with OpenCV and MediaPipe

Contents

1. Function Description

2. Code Implementation

3. Results

4. Complete Code

5. Library Functions Used

6. References

For more interesting code examples, see 【Programming】.

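1. Function Description

This article uses opencv-python and mediapipe to detect hand landmarks (keypoints). The result contains landmarks only, not a bounding box for the hand, but a box can be derived by expanding the extent of the landmarks outward.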
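
Deriving a hand box from the landmarks can be sketched as follows. This is a minimal illustration rather than part of MediaPipe: the helper name landmarks_to_bbox and the margin parameter are assumptions, and the input is taken to be a plain list of normalized (x, y) pairs, for example built with [(lm.x, lm.y) for lm in handLms.landmark].

```python
def landmarks_to_bbox(landmarks, img_w, img_h, margin=0.1):
    """Return (left, top, right, bottom) in pixels around normalized (x, y) landmarks.

    margin is a hypothetical knob: the tight box is expanded on every side
    by this fraction of its own width and height.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    # Expand the tight box outward.
    pad_x = (x_max - x_min) * margin
    pad_y = (y_max - y_min) * margin

    # Convert to pixels and clamp to the image bounds.
    left = max(0, int((x_min - pad_x) * img_w))
    top = max(0, int((y_min - pad_y) * img_h))
    right = min(img_w - 1, int((x_max + pad_x) * img_w))
    bottom = min(img_h - 1, int((y_max + pad_y) * img_h))
    return left, top, right, bottom


# Three made-up points standing in for the 21 landmarks of one hand.
points = [(0.25, 0.5), (0.75, 0.25), (0.5, 0.75)]
tight_box = landmarks_to_bbox(points, img_w=100, img_h=200, margin=0.0)
padded_box = landmarks_to_bbox(points, img_w=100, img_h=200, margin=0.5)
```

The resulting tuple can be passed to cv2.rectangle as two corner points, (left, top) and (right, bottom).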

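Thumb: on the outermost side of the palm and usually the thickest and shortest digit. It is opposable, which is what lets humans grasp and manipulate objects.

Index Finger: also called the forefinger, next to the thumb. It is commonly used to point at a direction or an object, and is used heavily in typing and writing.

Middle Finger: between the index finger and the ring finger, and the longest of the fingers. It is sometimes used to express certain emotions or attitudes.

Ring Finger: in many traditions a ring is worn on this finger as a symbol of marriage or love.

Little Finger: the thinnest and shortest of the fingers, usually used to support or steady objects.

Pinky is the colloquial name for the little finger, common in informal or spoken usage.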

Landmark 0 (Wrist): The base of the palm (root landmark).
Landmark 1 (Thumb base): The base joint of the thumb, near the wrist.
Landmark 2 (Thumb first joint): The first joint of the thumb.
Landmark 3 (Thumb second joint): The second joint of the thumb.
Landmark 4 (Thumb tip): The tip of the thumb.
Landmark 5 (Index base): The base joint of the index finger.
Landmark 6 (Index first joint): The first joint of the index finger.
Landmark 7 (Index second joint): The second joint of the index finger.
Landmark 8 (Index tip): The tip of the index finger.
Landmark 9 (Middle base): The base joint of the middle finger.
Landmark 10 (Middle first joint): The first joint of the middle finger.
Landmark 11 (Middle second joint): The second joint of the middle finger.
Landmark 12 (Middle tip): The tip of the middle finger.
Landmark 13 (Ring base): The base joint of the ring finger.
Landmark 14 (Ring first joint): The first joint of the ring finger.
Landmark 15 (Ring second joint): The second joint of the ring finger.
Landmark 16 (Ring tip): The tip of the ring finger.
Landmark 17 (Pinky base): The base joint of the pinky finger.
Landmark 18 (Pinky first joint): The first joint of the pinky.
Landmark 19 (Pinky second joint): The second joint of the pinky.
Landmark 20 (Pinky tip): The tip of the pinky finger.
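
The numbering above follows a regular pattern: index 0 is the wrist, and each finger then occupies four consecutive indices from base to tip. A small sketch of that pattern, with hypothetical names FINGER_NAMES, finger_indices, and TIP_IDS chosen for this example:

```python
FINGER_NAMES = ["thumb", "index", "middle", "ring", "pinky"]


def finger_indices():
    """Map each finger name to its four landmark indices, ordered base to tip."""
    # Finger i (0 = thumb) covers indices 1 + 4*i through 4 + 4*i.
    return {
        name: list(range(1 + 4 * i, 5 + 4 * i))
        for i, name in enumerate(FINGER_NAMES)
    }


FINGERS = finger_indices()
# The last index of every group is the fingertip.
TIP_IDS = [ids[-1] for ids in FINGERS.values()]

print(FINGERS["index"])  # [5, 6, 7, 8]
print(TIP_IDS)           # [4, 8, 12, 16, 20]
```

A table like this is handy when only the fingertips, or only one finger, need to be drawn or measured.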

2. Code Implementation

Import the required libraries

import cv2
import mediapipe as mp
import time

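Set up MediaPipe's hand landmark detector and its landmark-drawing utilities

mpHands = mp.solutions.hands
# static_image_mode=True runs detection on every input, which suits a single image;
# min_tracking_confidence is ignored in that mode.
hands = mpHands.Hands(static_image_mode=True,
                      max_num_hands=2,
                      min_detection_confidence=0.5,
                      min_tracking_confidence=0.5)
mpDraw = mp.solutions.drawing_utils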

Initialize the timing variables used to compute the frame rate; these are of little use for a single image

pTime = 0
cTime = 0

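Read the image, convert it from BGR to RGB, and run hand landmark detection. The result is stored in results; help(results) lists its attributes

img = cv2.imread("1.jpg")
if img is None:  # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("1.jpg")
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; MediaPipe expects RGB
results = hands.process(imgRGB)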

The help output for the three attributes of the detection result is shown below (truncated)

 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  multi_hand_landmarks = [landmark {
 |    x: 0.692896
 |    y: 0.489765495
 |    z:...
 |
 |  multi_hand_world_landmarks = [landmark {
 |    x: 0.000626816414
 |    y: 0.08...
 |
 |  multi_handedness = [classification {
 |    index: 1
 |    score: 0.889409304
 |   ...
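
Reading these attributes can be sketched as below. The function describe_hands is a hypothetical helper, and it assumes that multi_hand_landmarks and multi_handedness list the hands in the same order. So that the snippet runs without a camera or image, a SimpleNamespace stand-in replaces the real results object; its values are copied from the help output above, and the "Right" label is an illustrative assumption.

```python
from types import SimpleNamespace


def describe_hands(results):
    """Return one (label, score, wrist_xy) tuple per detected hand."""
    if not results.multi_hand_landmarks:
        return []  # no hand detected
    summary = []
    for hand_lms, handedness in zip(results.multi_hand_landmarks,
                                    results.multi_handedness):
        top = handedness.classification[0]  # highest-scoring class
        wrist = hand_lms.landmark[0]        # landmark 0 is the wrist
        summary.append((top.label, round(top.score, 3), (wrist.x, wrist.y)))
    return summary


# Stand-in for the object returned by hands.process(imgRGB); illustrative only.
fake_results = SimpleNamespace(
    multi_hand_landmarks=[
        SimpleNamespace(landmark=[SimpleNamespace(x=0.692896, y=0.489765495, z=0.0)])
    ],
    multi_handedness=[
        SimpleNamespace(classification=[
            SimpleNamespace(index=1, score=0.889409304, label="Right")
        ])
    ],
)
empty_results = SimpleNamespace(multi_hand_landmarks=None, multi_handedness=None)

print(describe_hands(fake_results))
```

With the real results object the call is the same: describe_hands(results).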

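If any hands were detected, iterate over the landmarks and draw them

if results.multi_hand_landmarks:
    h, w, c = img.shape  # the image size does not change, so read it once
    for handLms in results.multi_hand_landmarks:
        for idx, lm in enumerate(handLms.landmark):
            # print(idx, lm)
            # lm.x and lm.y are normalized; scale them to pixel coordinates
            cx, cy = int(lm.x * w), int(lm.y * h)
            cv2.circle(img, (cx, cy), 3, (255, 0, 255), cv2.FILLED)

        # Draw the landmarks and the connections between them
        mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)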
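
The conversion from normalized to pixel coordinates can be isolated in a small helper. This is a sketch under one assumption: a landmark of a hand that is partly outside the frame may come back slightly outside [0, 1], so the result is clamped to valid pixel indices. The name to_pixel is made up for this example.

```python
def to_pixel(x_norm, y_norm, img_w, img_h):
    """Convert normalized coordinates to integer pixel coordinates inside the image."""
    # Scale by the image size, then clamp to [0, size - 1].
    px = min(max(int(x_norm * img_w), 0), img_w - 1)
    py = min(max(int(y_norm * img_h), 0), img_h - 1)
    return px, py


print(to_pixel(0.5, 0.25, 640, 480))   # (320, 120)
print(to_pixel(1.2, -0.1, 640, 480))   # (639, 0)
```

Clamping matters mainly when the pixel position is used to index into the image array; cv2.circle itself tolerates centers outside the image.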

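Compute the elapsed time and draw the frame rate in the top-left corner. For a single image pTime is still 0, so the displayed value is 0; the number only becomes meaningful in the video loop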

cTime = time.time()
fps = 1/(cTime-pTime)
pTime = cTime

cv2.putText(img,str(int(fps)), (10,70), cv2.FONT_HERSHEY_PLAIN, 3, (255,0,255), 3)
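
The instantaneous value 1/(cTime-pTime) jumps around from frame to frame. One common way to steady it is an exponential moving average, sketched below. The class FpsMeter and its alpha parameter are assumptions made for this example and are not part of OpenCV or MediaPipe.

```python
class FpsMeter:
    """Exponentially smoothed frames-per-second estimate."""

    def __init__(self, alpha=0.9):
        self.alpha = alpha      # weight given to the previous estimate
        self.prev_time = None   # timestamp of the previous frame
        self.fps = 0.0

    def tick(self, now):
        """Register a frame at time `now` (seconds) and return the smoothed FPS."""
        if self.prev_time is not None:
            dt = now - self.prev_time
            if dt > 0:  # skip the update rather than divide by zero
                inst = 1.0 / dt
                if self.fps == 0.0:
                    self.fps = inst  # first measurement: no history to blend with
                else:
                    self.fps = self.alpha * self.fps + (1 - self.alpha) * inst
        self.prev_time = now
        return self.fps


meter = FpsMeter(alpha=0.5)
readings = [meter.tick(t) for t in (0.0, 0.5, 0.75)]
print(readings)  # [0.0, 2.0, 3.0]
```

In the video loop the call would be fps = meter.tick(time.time()), with the putText line left unchanged.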

Display the image, save the result, and close all windows on exit

cv2.imshow("Image", img)
cv2.imwrite("result.jpg", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

3. Results

Input image

Output result

Input image

Output result

Input image

Output result

Input image

Output result

4. Complete Code

Single image

import cv2
import mediapipe as mp
import time

mpHands = mp.solutions.hands
hands = mpHands.Hands(static_image_mode=True,  # single image: run detection directly
                      max_num_hands=2,
                      min_detection_confidence=0.5,
                      min_tracking_confidence=0.5)
mpDraw = mp.solutions.drawing_utils

pTime = 0
cTime = 0


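img = cv2.imread("1.jpg")
if img is None:  # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("1.jpg")
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; MediaPipe expects RGB
results = hands.process(imgRGB)
# print(results.multi_hand_landmarks)

if results.multi_hand_landmarks:
    h, w, c = img.shape  # the image size does not change, so read it once
    for handLms in results.multi_hand_landmarks:
        for idx, lm in enumerate(handLms.landmark):
            # print(idx, lm)
            # lm.x and lm.y are normalized; scale them to pixel coordinates
            cx, cy = int(lm.x * w), int(lm.y * h)
            cv2.circle(img, (cx, cy), 3, (255, 0, 255), cv2.FILLED)

        # Draw the landmarks and the connections between them
        mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)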


cTime = time.time()
fps = 1/(cTime-pTime)
pTime = cTime

cv2.putText(img,str(int(fps)), (10,70), cv2.FONT_HERSHEY_PLAIN, 3, (255,0,255), 3)

cv2.imshow("Image", img)
cv2.imwrite("result.jpg", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Video

import cv2
import mediapipe as mp
import time

cap = cv2.VideoCapture(0)

mpHands = mp.solutions.hands
hands = mpHands.Hands(static_image_mode=False,
                      max_num_hands=2,
                      min_detection_confidence=0.5,
                      min_tracking_confidence=0.5)
mpDraw = mp.solutions.drawing_utils

pTime = 0
cTime = 0

while True:
    success, img = cap.read()
    if not success:  # camera unavailable or stream ended
        break
    img = cv2.flip(img, 1)  # mirror the frame for a selfie-style view
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    results = hands.process(imgRGB)
    # print(results.multi_hand_landmarks)

    if results.multi_hand_landmarks:
        h, w, c = img.shape
        for handLms in results.multi_hand_landmarks:
            for idx, lm in enumerate(handLms.landmark):
                # print(idx, lm)
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 3, (255, 0, 255), cv2.FILLED)

            mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)

    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime

    cv2.putText(img, str(int(fps)), (10, 70), cv2.FONT_HERSHEY_PLAIN, 3, (255, 0, 255), 3)

    cv2.imshow("Image", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

# Release the camera and close the window on exit
cap.release()
cv2.destroyAllWindows()

5. Library Functions Used

mediapipe.solutions.hands.Hands is the main interface for hand landmark detection in the MediaPipe framework.

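Key parameters

static_image_mode:

Defaults to False, which treats the input images as a video stream. In this mode the interface tries to detect hands in the first input image and, once detection succeeds, localizes the hand landmarks. In subsequent images it simply tracks those landmarks, without running detection again, until it loses track of any of the hands.

When set to True, hand detection runs on every input image, which suits a batch of static, possibly unrelated images.

max_num_hands:

The maximum number of hands to detect. Defaults to 2.

min_detection_confidence:

The minimum confidence value (between 0 and 1) from the hand detection model for a detection to be considered successful. Defaults to 0.5.

min_tracking_confidence:

The minimum confidence value (between 0 and 1) from the landmark tracking model for the hand landmarks to be considered successfully tracked. If tracking fails, hand detection is invoked automatically on the next input image. A higher value makes the solution more robust at the cost of higher latency. This parameter is ignored when static_image_mode is True. Defaults to 0.5.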
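
How the four parameters relate can be sketched with a small options builder. The function hands_options is a hypothetical helper written for this example, not a MediaPipe API; the intent is that its return value is passed on as mp.solutions.hands.Hands(**hands_options(...)).

```python
def hands_options(static_images, max_hands=2, detection_conf=0.5, tracking_conf=0.5):
    """Build keyword arguments for Hands(); hypothetical helper for illustration."""
    for name, value in (("detection_conf", detection_conf),
                        ("tracking_conf", tracking_conf)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    options = {
        "static_image_mode": static_images,
        "max_num_hands": max_hands,
        "min_detection_confidence": detection_conf,
    }
    # The tracking threshold is ignored for static images, so only set it
    # when the input is a video stream.
    if not static_images:
        options["min_tracking_confidence"] = tracking_conf
    return options


photo_options = hands_options(static_images=True)
video_options = hands_options(static_images=False, tracking_conf=0.7)
print(photo_options)
print(video_options)
```

Leaving min_tracking_confidence out for static images changes nothing functionally; it only makes explicit that the value has no effect there.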

Return values

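Handedness:

Indicates whether each detected hand is a left or a right hand. In the results object used in the code above, this is the multi_handedness attribute.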

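Landmarks:

Each hand has 21 landmarks, and each landmark has x, y, and z coordinates. x and y are normalized to [0.0, 1.0] by the image width and height respectively. z is the landmark depth, with the depth at the wrist taken as the origin; the smaller the value, the closer the landmark is to the camera. z uses roughly the same scale as x. In the results object used in the code above, this is the multi_hand_landmarks attribute.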
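
Because a smaller z means closer to the camera, the relative depth can be used to rank landmarks. The sketch below picks the fingertip nearest the camera from a list of 21 z values, for example [lm.z for lm in handLms.landmark]. The names TIP_IDS and closest_fingertip are assumptions for this example, and the z values are made up.

```python
TIP_IDS = {"thumb": 4, "index": 8, "middle": 12, "ring": 16, "pinky": 20}


def closest_fingertip(z_values):
    """Return the name of the fingertip with the smallest z (nearest the camera)."""
    if len(z_values) != 21:
        raise ValueError("expected 21 z values, one per landmark")
    return min(TIP_IDS, key=lambda name: z_values[TIP_IDS[name]])


# Made-up depths: every landmark at the wrist depth except two fingertips.
z = [0.0] * 21
z[8] = -0.07    # index tip, well in front of the wrist
z[12] = -0.03   # middle tip, slightly in front of the wrist
nearest = closest_fingertip(z)
print(nearest)  # index
```

The comparison is only relative: z is measured against the wrist of the same hand, so values from different hands are not directly comparable.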

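WorldLandmarks:

The same 21 hand landmarks in world coordinates. Each landmark has x, y, and z, giving real-world 3D coordinates in meters, with the origin at the geometric center of the hand. In the results object used in the code above, this is the multi_hand_world_landmarks attribute.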
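
Since world landmarks are in meters, distances between them have a physical reading. The sketch below measures the gap between the thumb tip (4) and the index tip (8), a common way to detect a pinch. The helper landmark_distance is hypothetical, the input is assumed to be a list of 21 (x, y, z) tuples such as [(lm.x, lm.y, lm.z) for lm in hand.landmark], and the coordinates are made up.

```python
import math


def landmark_distance(world_landmarks, i, j):
    """Euclidean distance in meters between world landmarks i and j."""
    return math.dist(world_landmarks[i], world_landmarks[j])


# Made-up coordinates: everything at the origin except the two tips.
points = [(0.0, 0.0, 0.0)] * 21
points[4] = (0.03, 0.0, 0.0)   # thumb tip
points[8] = (0.0, 0.04, 0.0)   # index tip

gap = landmark_distance(points, 4, 8)
print(f"thumb-index gap: {gap * 100:.1f} cm")
```

A pinch detector would compare gap against a threshold; the right threshold depends on the application and would need to be tuned.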

HandLandmarkerResult:
  Handedness:
    Categories #0:
      index        : 0
      score        : 0.98396
      categoryName : Left   # Left denotes the left hand
  Landmarks:
    Landmark #0:
      x            : 0.638852
      y            : 0.671197
      z            : -3.41E-7
    Landmark #1:
      x            : 0.634599
      y            : 0.536441
      z            : -0.06984
    ... (21 landmarks for a hand)
  WorldLandmarks:
    Landmark #0:
      x            : 0.067485
      y            : 0.031084
      z            : 0.055223
    Landmark #1:
      x            : 0.063209
      y            : -0.00382
      z            : 0.020920
    ... (21 world landmarks for a hand)