
A More Stable Gesture Recognition Method -- Based on Hand Skeleton and Keypoint Detection

Overview

This post introduces and demonstrates how to extract the hand skeleton and keypoints with MediaPipe, and how to build gesture recognition on top of that.

Introduction

MediaPipe was introduced in an earlier article; see the link below:

Google open-sources gesture recognition -- based on TF Lite/MediaPipe

What can it do, and which languages and platforms does it support? See the two figures below:

(Figures: overview of MediaPipe solutions and the languages/platforms it supports)

This article focuses on hand skeleton and keypoint extraction; readers interested in other features can explore them on their own. GitHub repo: https://github.com/google/mediapipe

Results

Hand skeleton extraction with keypoint annotation:

Gesture recognition for digits 0-6:

Implementation Steps

For details, see:

https://google.github.io/mediapipe/solutions/hands

(1) Install mediapipe by running pip install mediapipe

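A quick sanity check that the package is importable (the version in the comment is only an example):

import mediapipe as mp
print(mp.__version__)  # e.g. 0.8.x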

(2) Download the hand detection and hand landmark (skeleton) models from:

https://github.com/google/mediapipe/tree/master/mediapipe/modules/hand_landmark


(3) Test code (real-time webcam test):

import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

# For webcam input:
hands = mp_hands.Hands(
    min_detection_confidence=0.5, min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
  success, image = cap.read()
  if not success:
    print("Ignoring empty camera frame.")
    # If loading a video, use 'break' instead of 'continue'.
    continue

  # Flip the image horizontally for a selfie-view display, and convert
  # the BGR image to RGB before processing.
  image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
  # To improve performance, optionally mark the image as not writeable to
  # pass by reference.
  image.flags.writeable = False
  results = hands.process(image)

  # Draw the hand annotations on the image.
  image.flags.writeable = True
  image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
  if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
      mp_drawing.draw_landmarks(
          image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  cv2.imshow('result', image)
  if cv2.waitKey(5) & 0xFF == 27:  # press Esc to quit
    break
hands.close()
cap.release()
cv2.destroyAllWindows()

Output and results:

(Screenshots: real-time hand skeleton and keypoint detection results)

Image detection (multiple hands supported):

import cv2
import mediapipe as mp
from os import listdir

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

# For static images:
hands = mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=5,
    min_detection_confidence=0.2)
img_path = './multi_hands/'
save_path = './'
index = 0
file_list = listdir(img_path)
for filename in file_list:
  index += 1
  file_path = img_path + filename
  # Read an image and flip it around the y-axis for correct handedness output.
  image = cv2.flip(cv2.imread(file_path), 1)
  # Convert the BGR image to RGB before processing.
  results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

  # Print handedness and draw hand landmarks on the image.
  print('Handedness:', results.multi_handedness)
  if not results.multi_hand_landmarks:
    continue
  image_height, image_width, _ = image.shape
  annotated_image = image.copy()
  for hand_landmarks in results.multi_hand_landmarks:
    print('hand_landmarks:', hand_landmarks)
    print(
        f'Index finger tip coordinates: (',
        f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].x * image_width}, '
        f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * image_height})'
    )
    mp_drawing.draw_landmarks(
        annotated_image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  # Flip back before saving so the output matches the original orientation.
  cv2.imwrite(
      save_path + str(index) + '.png', cv2.flip(annotated_image, 1))
hands.close()



Summary and Further Notes

Summary: MediaPipe's hand detection and skeleton-extraction model is more stable than traditional methods, and it also provides 3D coordinates for the finger joints, which is a big help for gesture recognition and for further development of gesture-driven interactions.
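For example, because every landmark carries x, y and z, the 3D distance between two keypoints can be measured directly. Below is a minimal sketch of a "pinch" check between the thumb tip and index finger tip, assuming mp_hands from the code above; the 0.05 threshold is an arbitrary assumption that would need tuning:

import math

def is_pinching(hand_landmarks, threshold=0.05):
  # Distance between thumb tip (4) and index finger tip (8) in normalized space.
  thumb = hand_landmarks.landmark[mp_hands.HandLandmark.THUMB_TIP]
  index = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
  dist = math.sqrt((thumb.x - index.x) ** 2 +
                   (thumb.y - index.y) ** 2 +
                   (thumb.z - index.z) ** 2)
  return dist < threshold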

Other notes:

(1) The hand landmark indices and their ordering are defined in the figure below (see also the snippet that follows):

(Figure: MediaPipe hand landmark numbering, 21 keypoints indexed 0-20)
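If the figure is not at hand, the same numbering can be read directly from the mp_hands.HandLandmark enum:

for lm in mp_hands.HandLandmark:
  print(lm.value, lm.name)  # 0 WRIST, 1 THUMB_CMC, ..., 4 THUMB_TIP, ..., 20 PINKY_TIP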

(2) The hand landmark coordinates (x, y, z) are output as normalized fractions less than 1, so x and y must be mapped back to pixel coordinates before they can be drawn on the image; you can jump to the relevant definitions from the source shown above, and a demo is given here. As for z, it represents depth relative to the wrist, and the smaller the value, the closer the landmark is to the camera:

def Normalize_landmarks(image, hand_landmarks):
  new_landmarks = []
  height, width = image.shape[:2]
  for i in range(0, len(hand_landmarks.landmark)):
    float_x = hand_landmarks.landmark[i].x
    float_y = hand_landmarks.landmark[i].y
    # z is the depth relative to the wrist; the smaller the value,
    # the closer the landmark is to the camera.
    float_z = hand_landmarks.landmark[i].z
    print(float_z)
    # Map the normalized [0, 1] x/y values to pixel coordinates
    # (returns None if the point falls outside the image).
    pt = mp_drawing._normalized_to_pixel_coordinates(float_x, float_y, width, height)
    new_landmarks.append(pt)
  return new_landmarks
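A possible way to use this helper (a minimal sketch; it assumes image and results come from the webcam loop shown earlier):

if results.multi_hand_landmarks:
  for hand_landmarks in results.multi_hand_landmarks:
    for pt in Normalize_landmarks(image, hand_landmarks):
      if pt is not None:  # points outside the frame come back as None
        cv2.circle(image, pt, 3, (0, 255, 0), -1)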

(3) Building on this, you can write a simple gesture-recognition program, or a small demo that reacts to the hand moving toward or away from the screen. Of course, the raw keypoint coordinates alone may not be enough; you may also need to compute joint angles, track previous states, and so on, for example like the following:
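As a rough illustration (not necessarily the exact logic used in the demo above), one way to combine joint angles into a simple digit classifier is to treat a finger as extended when the angle at its middle joint is close to straight, then count the extended fingers. The angle thresholds below are assumptions that would need tuning:

import math

def joint_angle(a, b, c):
  # Angle at landmark b (in degrees) formed by landmarks a-b-c, using x/y only.
  v1 = (a.x - b.x, a.y - b.y)
  v2 = (c.x - b.x, c.y - b.y)
  dot = v1[0] * v2[0] + v1[1] * v2[1]
  norm = math.hypot(*v1) * math.hypot(*v2) + 1e-6
  return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def count_extended_fingers(hand_landmarks, angle_threshold=160.0):
  lm = hand_landmarks.landmark
  # (MCP, PIP, TIP) index triplets for the index, middle, ring and pinky fingers.
  fingers = [(5, 6, 8), (9, 10, 12), (13, 14, 16), (17, 18, 20)]
  count = 0
  for mcp, pip, tip in fingers:
    # A nearly straight finger gives an angle close to 180 degrees at the PIP joint.
    if joint_angle(lm[mcp], lm[pip], lm[tip]) > angle_threshold:
      count += 1
  # Thumb: use the angle at the IP joint (landmarks 2-3-4) with a looser threshold.
  if joint_angle(lm[2], lm[3], lm[4]) > 150.0:
    count += 1
  return count

A count alone distinguishes 0 through 5; to also cover 6 as in the demo, you would additionally need to check which fingers are extended (for example, thumb plus pinky only) rather than only how many.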