A More Stable Gesture Recognition Method: Hand Skeleton and Keypoint Detection

Overview

This article introduces and demonstrates MediaPipe-based hand skeleton and keypoint extraction, and shows how to build gesture recognition on top of it.

Introduction

MediaPipe was covered in an earlier article; see the link below:

Google's Open-Source Gesture Recognition Based on TF Lite/MediaPipe

What can it do, and which languages and platforms does it support? See the two images below:

[Images: overview of MediaPipe solutions, and supported languages and platforms]

This article focuses on hand skeleton and keypoint extraction; other features are left for interested readers to explore. GitHub: https://github.com/google/mediapipe

Demo Results

Hand skeleton extraction and keypoint annotation:

Gesture recognition for 0~6:

Implementation Steps

For details, see the link below:

https://google.github.io/mediapipe/solutions/hands

(1) Install MediaPipe: run pip install mediapipe
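To confirm the installation succeeded, a minimal sanity check like the one below should import the package and print the hands solution module (a quick sketch; nothing is assumed beyond the import itself):

import mediapipe as mp

# If the install worked, this prints the hands solution module object.
print(mp.solutions.hands)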


(2) Download the hand detection and skeleton extraction models from:

https://github.com/google/mediapipe/tree/master/mediapipe/modules/hand_landmark


(3) Test the code (real-time webcam test):

import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

# For webcam input: video mode with tracking between frames.
hands = mp_hands.Hands(
    min_detection_confidence=0.5, min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
  success, image = cap.read()
  if not success:
    print("Ignoring empty camera frame.")
    continue

  # Flip the image horizontally for a selfie view, and convert BGR to RGB
  # because MediaPipe expects RGB input.
  image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
  # Mark the image as not writeable to pass by reference and improve performance.
  image.flags.writeable = False
  results = hands.process(image)

  # Draw the hand annotations on the image.
  image.flags.writeable = True
  image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
  if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
      mp_drawing.draw_landmarks(
          image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  cv2.imshow('result', image)
  if cv2.waitKey(5) & 0xFF == 27:  # press Esc to exit
    break
cv2.destroyAllWindows()
hands.close()
cap.release()

Output and results:

[Images: real-time hand skeleton and keypoint detection results]

Image detection (multiple hands supported):

import cv2
import mediapipe as mp
from os import listdir

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

# For static images: detection runs on every image (no tracking), up to 5 hands.
hands = mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=5,
    min_detection_confidence=0.2)
img_path = './multi_hands/'
save_path = './'
index = 0
file_list = listdir(img_path)
for filename in file_list:
  index += 1
  file_path = img_path + filename
  # Read an image and flip it around the y-axis for correct handedness output.
  image = cv2.flip(cv2.imread(file_path), 1)
  # Convert the BGR image to RGB before processing.
  results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

  # Print handedness and draw hand landmarks on the image.
  print('Handedness:', results.multi_handedness)
  if not results.multi_hand_landmarks:
    continue
  image_height, image_width, _ = image.shape
  annotated_image = image.copy()
  for hand_landmarks in results.multi_hand_landmarks:
    print('hand_landmarks:', hand_landmarks)
    print(
        f'Index finger tip coordinates: (',
        f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].x * image_width}, '
        f'{hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP].y * image_height})'
    )
    mp_drawing.draw_landmarks(
        annotated_image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  # Flip back before saving so the output matches the original orientation.
  cv2.imwrite(
      save_path + str(index) + '.png', cv2.flip(annotated_image, 1))
hands.close()


[Images: multi-hand detection results]

Summary and Additional Notes

Summary: Compared with traditional methods, MediaPipe's hand detection and skeleton extraction is more stable, and it provides 3D coordinates for the finger joints, which helps greatly with gesture recognition and further gesture-driven development.

Additional notes:

(1) The hand landmark indices and their ordering are defined as in the figure below:

[Image: MediaPipe hand landmark map, 21 keypoints: 0 = WRIST, 1-4 = thumb (CMC/MCP/IP/TIP), 5-8 = index finger, 9-12 = middle finger, 13-16 = ring finger, 17-20 = pinky (each MCP/PIP/DIP/TIP)]
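If the figure is not at hand, the same numbering can be listed directly from the mp_hands.HandLandmark enum; a quick sketch:

import mediapipe as mp

mp_hands = mp.solutions.hands

# Print each landmark index and its name (0 = WRIST ... 20 = PINKY_TIP).
for landmark in mp_hands.HandLandmark:
  print(landmark.value, landmark.name)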

(2) The hand landmark coordinates are normalized: x and y lie in [0, 1] relative to the image width and height, so they must be mapped back to pixel coordinates before being drawn on the image (you can jump to the definition of draw_landmarks in the source to see how this is done). The z value is a depth with the wrist as origin; per the MediaPipe documentation, the smaller (more negative) the value, the closer the landmark is to the camera. Demo code:

def Normalize_landmarks(image, hand_landmarks):
  new_landmarks = []
  for i in range(0, len(hand_landmarks.landmark)):
    float_x = hand_landmarks.landmark[i].x
    float_y = hand_landmarks.landmark[i].y
    # z is depth with the wrist as origin; smaller (more negative) values
    # mean the landmark is closer to the camera.
    float_z = hand_landmarks.landmark[i].z
    print(float_z)
    width = image.shape[1]
    height = image.shape[0]

    # Map normalized [0, 1] coordinates to pixel coordinates; this helper
    # returns None for points that fall outside the image.
    pt = mp_drawing._normalized_to_pixel_coordinates(float_x, float_y, width, height)
    new_landmarks.append(pt)
  return new_landmarks
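A possible usage sketch, assuming image and results come from the webcam loop above (note that _normalized_to_pixel_coordinates is a private MediaPipe helper, so entries may be None for points outside the frame):

# Usage sketch: label each landmark with its index at its pixel position.
if results.multi_hand_landmarks:
  for hand_landmarks in results.multi_hand_landmarks:
    points = Normalize_landmarks(image, hand_landmarks)
    for idx, pt in enumerate(points):
      if pt is not None:  # points outside the image come back as None
        cv2.putText(image, str(idx), pt,
                    cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0, 255, 0), 1)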

(3) Based on this, you can build a simple gesture recognizer, or a small program that reacts to the hand approaching or moving away from the screen. Of course, the raw joint coordinates alone are not enough; you may also need to compute joint angles and track previous states, for example like the sketch below:
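As a concrete illustration in the spirit of the 0~6 demo above, here is a minimal finger-counting heuristic. It is a sketch of my own, not part of MediaPipe: count_fingers is a hypothetical helper that assumes a roughly upright hand in the mirrored webcam view, compares each fingertip with its PIP joint (two indices earlier in the landmark map), and uses an x comparison for the thumb. A robust recognizer would also use joint angles and temporal state, as noted above.

# A minimal finger-counting heuristic (an illustrative sketch, not a MediaPipe API).
# Assumes a roughly upright hand; image y grows downward, so an extended
# fingertip sits above (has a smaller y than) its PIP joint.
def count_fingers(hand_landmarks):
  lm = hand_landmarks.landmark
  finger_tips = [mp_hands.HandLandmark.INDEX_FINGER_TIP,
                 mp_hands.HandLandmark.MIDDLE_FINGER_TIP,
                 mp_hands.HandLandmark.RING_FINGER_TIP,
                 mp_hands.HandLandmark.PINKY_TIP]
  count = 0
  for tip in finger_tips:
    if lm[tip].y < lm[tip - 2].y:  # tip above PIP -> finger extended
      count += 1
  # Thumb heuristic: compare x of tip and IP joint (depends on handedness
  # and on the horizontal flip applied to the frame).
  if lm[mp_hands.HandLandmark.THUMB_TIP].x < lm[mp_hands.HandLandmark.THUMB_IP].x:
    count += 1
  return count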