Hand Landmarks Detection in Python with OpenCV

Contents

  • 1. Functional Description
  • 2. Implementation
  • 3. Results
  • 4. Complete Code
  • 5. Library Functions Used
  • 6. References



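    1. Functional Description

    Hand landmark detection implemented with opencv-python and mediapipe. The result contains landmarks only, not a bounding box for the hand itself, but a box can be derived by expanding outward from the landmark coordinates.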
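    As a minimal sketch of deriving a hand box from the landmarks: the helper below takes normalized (x, y) pairs, finds their extent, pads it, and clamps it to the image. The function name hand_bbox and the margin parameter are assumptions made for illustration; they are not part of mediapipe.

```python
def hand_bbox(points, img_w, img_h, margin=0.1):
    """Return (left, top, right, bottom) in pixels around normalized points.

    points: iterable of (x, y) pairs normalized to the image size.
    margin: hypothetical padding, as a fraction of the box width/height.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    # Expand the tight box outward by the requested margin
    pad_x = (x_max - x_min) * margin
    pad_y = (y_max - y_min) * margin

    # Convert to pixels and clamp to the image borders
    left = max(0, int((x_min - pad_x) * img_w))
    top = max(0, int((y_min - pad_y) * img_h))
    right = min(img_w - 1, int((x_max + pad_x) * img_w))
    bottom = min(img_h - 1, int((y_max + pad_y) * img_h))
    return left, top, right, bottom


print(hand_bbox([(0.25, 0.25), (0.5, 0.75)], 100, 200, margin=0.0))
```

    With a detection result, the points could be built as [(lm.x, lm.y) for lm in handLms.landmark] and the box drawn with cv2.rectangle.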

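    Thumb: sits on the outer edge of the palm and is usually the thickest and shortest digit. It is opposable, which lets humans grasp and manipulate objects.

    Index Finger: also called the forefinger, next to the thumb. It is typically used to point at a direction or an object and is used heavily in typing and writing.

    Middle Finger: between the index finger and the ring finger, and the longest of the fingers. It is sometimes used to express certain emotions or attitudes.

    Ring Finger: by tradition the finger on which a ring is worn, symbolizing marriage or love.

    Little Finger: the thinnest finger and, apart from the thumb, the shortest. It is usually used to support or steady objects.

    Pinky is the colloquial name for the little finger, common in informal settings and spoken language.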


    Landmark 0 (Wrist): The base of the palm (root landmark).
    Landmark 1 (Thumb base): The base joint of the thumb, near the wrist.
    Landmark 2 (Thumb first joint): The first joint of the thumb.
    Landmark 3 (Thumb second joint): The second joint of the thumb.
    Landmark 4 (Thumb tip): The tip of the thumb.
    Landmark 5 (Index base): The base joint of the index finger.
    Landmark 6 (Index first joint): The first joint of the index finger.
    Landmark 7 (Index second joint): The second joint of the index finger.
    Landmark 8 (Index tip): The tip of the index finger.
    Landmark 9 (Middle base): The base joint of the middle finger.
    Landmark 10 (Middle first joint): The first joint of the middle finger.
    Landmark 11 (Middle second joint): The second joint of the middle finger.
    Landmark 12 (Middle tip): The tip of the middle finger.
    Landmark 13 (Ring base): The base joint of the ring finger.
    Landmark 14 (Ring first joint): The first joint of the ring finger.
    Landmark 15 (Ring second joint): The second joint of the ring finger.
    Landmark 16 (Ring tip): The tip of the ring finger.
    Landmark 17 (Pinky base): The base joint of the pinky finger.
    Landmark 18 (Pinky first joint): The first joint of the pinky.
    Landmark 19 (Pinky second joint): The second joint of the pinky.
    Landmark 20 (Pinky tip): The tip of the pinky finger.
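    The list above follows a regular pattern: index 0 is the wrist, and each finger then occupies four consecutive indices from base to tip. A small sketch of that pattern is shown below; the helper name landmark_label and the constants are made up for illustration, and the labels are the ones used in this list rather than mediapipe's own enum names.

```python
FINGERS = ["Thumb", "Index", "Middle", "Ring", "Pinky"]
JOINTS = ["base", "first joint", "second joint", "tip"]


def landmark_label(idx):
    """Map a landmark index (0-20) to the label used in the list above."""
    if not 0 <= idx <= 20:
        raise ValueError(f"landmark index out of range: {idx}")
    if idx == 0:
        return "Wrist"
    # Indices 1-20 are five fingers with four landmarks each
    finger, joint = divmod(idx - 1, 4)
    return f"{FINGERS[finger]} {JOINTS[joint]}"


# Fingertips are the last landmark of each finger
FINGERTIP_IDS = [4 * (i + 1) for i in range(5)]

for i in FINGERTIP_IDS:
    print(i, landmark_label(i))
```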

    2. Implementation

    Import the required libraries

    import cv2
    import mediapipe as mp
    import time
    

    Set up mediapipe's hand landmark detector and its landmark drawing utility

    mpHands = mp.solutions.hands
    hands = mpHands.Hands(static_image_mode=False,
                          max_num_hands=2,
                          min_detection_confidence=0.5,
                          min_tracking_confidence=0.5)
    mpDraw = mp.solutions.drawing_utils
    

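    Track time and frame rate (of little use for a single image)

    pTime = time.time()  # start time; with 0 here the displayed rate would always round to 0
    cTime = 0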
    

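    Read the image and detect the hand landmarks. The result is stored in results; its attributes can be inspected with help(results)

    img = cv2.imread("1.jpg")
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError("1.jpg")
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR, MediaPipe expects RGB
    results = hands.process(imgRGB)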
    

    The help output for the three attributes of the detection result is shown below

     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |
     |  multi_hand_landmarks = [landmark {
     |    x: 0.692896
     |    y: 0.489765495
     |    z:...
     |
     |  multi_hand_world_landmarks = [landmark {
     |    x: 0.000626816414
     |    y: 0.08...
     |
     |  multi_handedness = [classification {
     |    index: 1
     |    score: 0.889409304
     |   ...
    

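    If any hands were detected, iterate over their landmarks and draw them

    if results.multi_hand_landmarks:
        h, w, c = img.shape  # image height, width and channel count
        for handLms in results.multi_hand_landmarks:
            for idx, lm in enumerate(handLms.landmark):
                # idx is the landmark index (0-20); lm.x and lm.y are normalized,
                # so scale them by the image size to get pixel coordinates
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 3, (255, 0, 255), cv2.FILLED)
    
            # Draw the landmarks together with the connections between them
            mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)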
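    The loop above draws every landmark; a common variation is to keep only some of them by index. The sketch below picks the five fingertips (indices 4, 8, 12, 16, 20) and converts them to pixel coordinates, clamping to the image in case a normalized value falls slightly outside [0, 1]. The function name fingertip_pixels is an assumption for illustration.

```python
FINGERTIP_IDS = (4, 8, 12, 16, 20)


def fingertip_pixels(points, img_w, img_h):
    """Return {landmark index: (cx, cy)} for the five fingertips.

    points: sequence of 21 normalized (x, y) pairs, in landmark order.
    """
    tips = {}
    for idx in FINGERTIP_IDS:
        x, y = points[idx]
        # Scale to pixels, then clamp so the point stays drawable
        cx = min(max(int(x * img_w), 0), img_w - 1)
        cy = min(max(int(y * img_h), 0), img_h - 1)
        tips[idx] = (cx, cy)
    return tips


# 21 dummy points, with the thumb tip pushed outside the frame
demo = [(0.5, 0.25)] * 21
demo[4] = (1.25, -0.5)
print(fingertip_pixels(demo, 640, 480))
```

    With a real detection, points could be [(lm.x, lm.y) for lm in handLms.landmark], and each returned pixel passed to cv2.circle.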
    

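    Measure the elapsed time and show the frame rate in the top-left corner

    cTime = time.time()
    fps = 1/(cTime-pTime)  # frames per second = 1 / time since the previous frame
    pTime = cTime
    
    cv2.putText(img,str(int(fps)), (10,70), cv2.FONT_HERSHEY_PLAIN, 3, (255,0,255), 3)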
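    The instantaneous value 1/(cTime-pTime) jumps around from frame to frame in a video. One possible refinement, sketched here under the assumption that an exponential moving average is acceptable, is a small meter class; FpsMeter and its smoothing parameter are hypothetical names, not part of OpenCV or mediapipe.

```python
import time


class FpsMeter:
    """Smoothed frames-per-second estimate (exponential moving average)."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # weight given to the previous estimate
        self.prev = None            # timestamp of the previous tick
        self.fps = 0.0

    def tick(self, now=None):
        """Register one frame; `now` can be injected for testing."""
        now = time.perf_counter() if now is None else now
        if self.prev is not None:
            dt = now - self.prev
            if dt > 0:  # guard against division by zero
                inst = 1.0 / dt
                if self.fps == 0.0:
                    self.fps = inst  # first measurement: no history yet
                else:
                    self.fps = (self.smoothing * self.fps
                                + (1 - self.smoothing) * inst)
        self.prev = now
        return self.fps


meter = FpsMeter(smoothing=0.5)
for t in (0.0, 0.5, 0.75):
    print(meter.tick(t))
```

    In the video loop, meter.tick() would be called once per frame and its return value passed to cv2.putText.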
    

    Show the image, save the result, and close all windows on exit

    cv2.imshow("Image", img)
    cv2.imwrite("result.jpg", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    3. Results

    [Figures: four input images, each paired with its detection output]

    4. Complete Code

    Single image

    import cv2
    import mediapipe as mp
    import time
    
    mpHands = mp.solutions.hands
    hands = mpHands.Hands(static_image_mode=False,
                          max_num_hands=2,
                          min_detection_confidence=0.5,
                          min_tracking_confidence=0.5)
    mpDraw = mp.solutions.drawing_utils
    
    pTime = time.time()  # start time; with 0 here the displayed rate would always round to 0
    cTime = 0
    
    
    img = cv2.imread("1.jpg")
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError("1.jpg")
    imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR, MediaPipe expects RGB
    results = hands.process(imgRGB)
    
    if results.multi_hand_landmarks:
        h, w, c = img.shape  # image height, width and channel count
        for handLms in results.multi_hand_landmarks:
            for idx, lm in enumerate(handLms.landmark):
                # Scale the normalized coordinates to pixels
                cx, cy = int(lm.x * w), int(lm.y * h)
                cv2.circle(img, (cx, cy), 3, (255, 0, 255), cv2.FILLED)
    
            # Draw the landmarks together with the connections between them
            mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)
    
    
    cTime = time.time()
    fps = 1/(cTime-pTime)
    pTime = cTime
    
    cv2.putText(img,str(int(fps)), (10,70), cv2.FONT_HERSHEY_PLAIN, 3, (255,0,255), 3)
    
    cv2.imshow("Image", img)
    cv2.imwrite("result.jpg", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    Video

    import cv2
    import mediapipe as mp
    import time
    
    cap = cv2.VideoCapture(0)  # default camera
    
    mpHands = mp.solutions.hands
    hands = mpHands.Hands(static_image_mode=False,
                          max_num_hands=2,
                          min_detection_confidence=0.5,
                          min_tracking_confidence=0.5)
    mpDraw = mp.solutions.drawing_utils
    
    pTime = 0
    cTime = 0
    
    while True:
        success, img = cap.read()
        if not success:  # camera unavailable or stream ended
            break
        img = cv2.flip(img, 1)  # mirror the frame horizontally
        imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        results = hands.process(imgRGB)
    
        if results.multi_hand_landmarks:
            h, w, c = img.shape
            for handLms in results.multi_hand_landmarks:
                for idx, lm in enumerate(handLms.landmark):
                    # Scale the normalized coordinates to pixels
                    cx, cy = int(lm.x * w), int(lm.y * h)
                    cv2.circle(img, (cx, cy), 3, (255, 0, 255), cv2.FILLED)
    
                mpDraw.draw_landmarks(img, handLms, mpHands.HAND_CONNECTIONS)
    
    
        cTime = time.time()
        fps = 1/(cTime-pTime)
        pTime = cTime
    
        cv2.putText(img,str(int(fps)), (10,70), cv2.FONT_HERSHEY_PLAIN, 3, (255,0,255), 3)
    
        cv2.imshow("Image", img)
        if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
            break
    
    cap.release()
    cv2.destroyAllWindows()
    

    5. Library Functions Used

    mediapipe.solutions.hands.Hands is the MediaPipe interface for hand landmark detection used throughout this article.

    Key parameters

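    static_image_mode:

  • Defaults to False, which treats the input images as a video stream. In this mode the interface tries to detect hands in the first input image and, once detection succeeds, localizes the hand landmarks. In subsequent images it simply tracks those landmarks until it loses track of any of the hands.
  • When set to True, hand detection runs on every input image, which suits a batch of static, possibly unrelated images.

    max_num_hands:

  • Maximum number of hands to detect. Defaults to 2.

    min_detection_confidence:

  • Minimum confidence value (between 0 and 1) from the hand detection model for a detection to count as successful. Defaults to 0.5.

    min_tracking_confidence:

  • Minimum confidence value (between 0 and 1) from the landmark tracking model for the hand landmarks to count as successfully tracked. If tracking fails, hand detection is invoked automatically on the next input image. A higher value makes the solution more robust at the cost of higher latency. Ignored when static_image_mode is True. Defaults to 0.5.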
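    As a hedged illustration of how these four parameters fit together, the helper below builds the keyword arguments for the two use cases in this article (single image versus video). The function name hands_options and its argument names are assumptions; only the dictionary keys are the real parameter names described above.

```python
def hands_options(for_video, max_hands=2, det_conf=0.5, track_conf=0.5):
    """Build keyword arguments for mediapipe.solutions.hands.Hands.

    for_video=True enables tracking between frames;
    for_video=False runs detection on every image.
    """
    for name, value in (("det_conf", det_conf), ("track_conf", track_conf)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    if max_hands < 1:
        raise ValueError("max_hands must be at least 1")
    return {
        "static_image_mode": not for_video,
        "max_num_hands": max_hands,
        "min_detection_confidence": det_conf,
        # Ignored by MediaPipe when static_image_mode is True
        "min_tracking_confidence": track_conf,
    }


print(hands_options(for_video=False))
```

    The dictionary can then be unpacked into the constructor, for example mp.solutions.hands.Hands(**hands_options(for_video=True)).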
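    Return values

    The names below follow the HandLandmarker result format. On the results object returned by Hands.process they correspond to multi_handedness, multi_hand_landmarks and multi_hand_world_landmarks.

    Handedness

  • Indicates whether a detected hand is a left or a right hand.

    Landmarks

  • There are 21 hand landmarks, each made up of x, y and z coordinates. x and y are normalized to [0.0, 1.0] by the image width and height respectively. z is the landmark depth, with the depth at the wrist taken as the origin; the smaller the value, the closer the landmark is to the camera. z uses roughly the same scale as x.

    WorldLandmarks

  • The 21 hand landmarks in world coordinates. Each landmark consists of x, y and z, giving real-world 3D coordinates in meters with the origin at the hand's geometric center.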
    Example result:

    HandLandmarkerResult:
      Handedness:
        Categories #0:
          index        : 0
          score        : 0.98396
          categoryName : Left   # Left means the left hand
      Landmarks:
        Landmark #0:
          x            : 0.638852
          y            : 0.671197
          z            : -3.41E-7
        Landmark #1:
          x            : 0.634599
          y            : 0.536441
          z            : -0.06984
        ... (21 landmarks for a hand)
      WorldLandmarks:
        Landmark #0:
          x            : 0.067485
          y            : 0.031084
          z            : 0.055223
        Landmark #1:
          x            : 0.063209
          y            : -0.00382
          z            : 0.020920
        ... (21 world landmarks for a hand)
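    Because world landmarks are expressed in meters, the distance between two of them is a physical length. As a small sketch, the snippet below measures the distance between world landmarks #0 and #1 from the sample output above; the helper name distance_m is made up for illustration, and math.dist requires Python 3.8 or later.

```python
import math

# World landmarks #0 (wrist) and #1 (thumb base) from the sample output
wrist = (0.067485, 0.031084, 0.055223)
thumb_base = (0.063209, -0.00382, 0.020920)


def distance_m(a, b):
    """Euclidean distance between two (x, y, z) points, in meters."""
    return math.dist(a, b)


d = distance_m(wrist, thumb_base)
print(f"{d * 100:.1f} cm")
```

    The same helper applied to the thumb tip (4) and index tip (8) gives a simple pinch distance.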
    
    

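    6. References

  • https://ai.google.dev/edge/mediapipe/solutions/vision/hand_landmarker/python
  • Finger landmark detection and tracking, face recognition and tracking with MediaPipe (MediaPipe实现手指关键点检测及追踪,人脸识别及追踪)
  • https://github.com/webdevpathiraja/Hand-Gesture-Sign-Detection-Project/tree/main
  • The Mediapipe framework (1): hand landmark detection (Mediapipe框架(一)人手关键点检测)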


    Author: bryant_meng
