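Hands-on | Building a Drowsiness Detection System with Python and MediaPipe (Detailed Steps + Source Code)

Overview

This article builds a drowsiness detection system with Python and MediaPipe, including detailed steps and source code. (WeChat official account: OpenCV與AI深度學習)

Background

Drowsy driving is extremely dangerous. Reportedly, 21% of traffic accidents are caused by it. The risk is highest on highways, where most vehicles are on long trips and travelling fast, so the consequences are more severe.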

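Traffic authorities generally advise drivers to rest and recover before continuing to drive, in order to avoid a tragedy.

As computer vision developers, can we help by designing a system that detects drowsiness automatically and reminds the driver to take a break in time? As shown in the figure below, this article explains in detail how to implement a drowsiness detection system with Python and MediaPipe.

Implementation

Idea: most fatigued drivers start to doze off, so we use how often and how long the driver's eyes stay closed to decide whether the driver is fatigued (or drowsy).

Detailed implementation steps

[1] Eye landmark detection.

MediaPipe was introduced in an earlier article; see the linked post below for details:

------ Introduction to MediaPipe and Gesture Recognition ------

We use Face Mesh to detect the eye landmarks. Face Mesh returns 468 facial landmarks:

Because we are focusing on driver drowsiness, we only need the landmarks that belong to the eye regions out of the 468 points. The eye regions contain 32 landmarks (16 per eye). To compute the EAR, we only need 12 of them (6 per eye).

Using the figure above as a reference, the 12 selected landmarks are:

Left eye: [362, 385, 387, 263, 373, 380]
Right eye: [33, 160, 158, 133, 153, 144]

The selected landmarks are listed in the order P1, P2, P3, P4, P5, P6.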

import cv2
import numpy as np
import matplotlib.pyplot as plt
import mediapipe as mp


mp_facemesh = mp.solutions.face_mesh
mp_drawing  = mp.solutions.drawing_utils
denormalize_coordinates = mp_drawing._normalized_to_pixel_coordinates


# Jupyter/IPython magic; remove this line when running as a plain Python script
%matplotlib inline

Get the landmark (index) points for both eyes.

# Landmark points corresponding to left eye
all_left_eye_idxs = list(mp_facemesh.FACEMESH_LEFT_EYE)
# flatten and remove duplicates
all_left_eye_idxs = set(np.ravel(all_left_eye_idxs)) 


# Landmark points corresponding to right eye
all_right_eye_idxs = list(mp_facemesh.FACEMESH_RIGHT_EYE)
all_right_eye_idxs = set(np.ravel(all_right_eye_idxs))


# Combined for plotting - Landmark points for both eye
all_idxs = all_left_eye_idxs.union(all_right_eye_idxs)


# The chosen 12 points:   P1,  P2,  P3,  P4,  P5,  P6
chosen_left_eye_idxs  = [362, 385, 387, 263, 373, 380]
chosen_right_eye_idxs = [33,  160, 158, 133, 153, 144]
all_chosen_idxs = chosen_left_eye_idxs + chosen_right_eye_idxs
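The "flatten and remove duplicates" step can be tried without MediaPipe installed. The sketch below assumes that a connection constant such as FACEMESH_LEFT_EYE is a frozenset of (start_index, end_index) pairs; the four pairs used here are only an illustrative stand-in, not the full constant, and the variable names are hypothetical.

```python
# Illustrative stand-in for a connection set such as FACEMESH_LEFT_EYE:
# a frozenset of (start_index, end_index) pairs. Not the full constant.
sample_connections = frozenset([(263, 249), (249, 390), (390, 373), (373, 374)])

# Flatten the pairs and drop duplicates; sort only to make the output readable.
unique_idxs = sorted({idx for pair in sample_connections for idx in pair})
print(unique_idxs)

# Check which hand-picked indices are covered by this (partial) sample set.
chosen = [263, 373, 380]
missing = [idx for idx in chosen if idx not in unique_idxs]
print(missing)
```

A check like the last two lines, run against the real constant, is a cheap way to catch a mistyped landmark index before computing any distances.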

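[2] Detecting whether the eyes are closed: computing the Eye Aspect Ratio (EAR).

To detect whether the eyes are closed, we can use the Eye Aspect Ratio (EAR) formula:

The EAR formula returns a single scalar that reflects how open the eye is:

1. We use MediaPipe's Face Mesh solution to detect and retrieve the relevant landmarks in the eye region (points P1 to P6 in the figure below).

2. Once the relevant points are retrieved, the Eye Aspect Ratio (EAR) is computed as the ratio between the height and the width of the eye.

The EAR is nearly constant while the eye is open and approaches zero as the eye closes. It is partially insensitive to the individual and to head pose. The aspect ratio of the open eye varies only slightly between individuals, and it is fully invariant to uniform scaling of the image and to in-plane rotation of the face. Because both eyes blink at the same time, the EAR values of the two eyes are averaged.

Top: detected landmarks P i on an open eye and a closed eye.

Bottom: the eye aspect ratio (EAR) plotted over several frames of a video sequence. A single blink is present.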
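The invariance claim can be checked numerically. The following is a minimal sketch, assuming six synthetic (x, y) points ordered P1 to P6 and computing the height-to-width ratio as the sum of the two vertical distances over twice the horizontal distance. The helper names ear and transform, and the point values, are made up for illustration.

```python
import math

def ear(points):
    """Height-to-width ratio for six (x, y) points ordered P1..P6."""
    p1, p2, p3, p4, p5, p6 = points
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)

def transform(points, scale, angle_deg):
    """Apply a uniform scaling and an in-plane rotation about the origin."""
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(scale * (x * cos_a - y * sin_a), scale * (x * sin_a + y * cos_a))
            for x, y in points]

# Synthetic open eye: corners at P1 and P4, lids at P2, P3 (top) and P5, P6 (bottom).
eye = [(0.0, 0.0), (2.0, 1.0), (4.0, 1.0), (6.0, 0.0), (4.0, -1.0), (2.0, -1.0)]

base = ear(eye)
moved = ear(transform(eye, scale=3.5, angle_deg=25.0))
print(round(base, 4), round(moved, 4))
```

Both values agree because scaling multiplies every distance by the same factor and rotation leaves distances unchanged, so the ratio is unaffected.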

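First, we have to compute the Eye Aspect Ratio for each eye:

|| denotes the L2 norm, which is used to compute the distance between two vectors.

To obtain the final EAR value, the authors suggest taking the average of the two EAR values.

In general, the average EAR value lies in the range [0.0, 0.40]. The EAR value drops rapidly while the eye is closing.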
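Written out, the definitions above take the following form. This is a reconstruction from the landmark order P1 to P6 and from the get_ear(…) code in this article, since the original formula image is not available here.

```latex
\mathrm{EAR} = \frac{\lVert P_2 - P_6 \rVert + \lVert P_3 - P_5 \rVert}{2\,\lVert P_1 - P_4 \rVert},
\qquad
\mathrm{EAR}_{\mathrm{avg}} = \frac{\mathrm{EAR}_{\mathrm{left}} + \mathrm{EAR}_{\mathrm{right}}}{2}
```

The numerator sums the two vertical lid distances and the denominator is twice the horizontal corner-to-corner distance, so the value shrinks toward zero as the lids come together.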

Now that we are familiar with the EAR formula, let's define the three functions we need: distance(…), get_ear(…), and calculate_avg_ear(…).

def distance(point_1, point_2):
    """Calculate l2-norm between two points"""
    dist = sum([(i - j) ** 2 for i, j in zip(point_1, point_2)]) ** 0.5
    return dist

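The get_ear(…) function takes the .landmark attribute as an argument. At each index position there is a NormalizedLandmark object, which stores normalized x, y, and z coordinate values.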
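Because the landmarks are normalized, they have to be mapped to pixel coordinates before distances are measured. The sketch below is a hypothetical stand-in named to_pixel that shows the idea under the assumption that normalized values lie in [0, 1] and that out-of-range values yield None. It is not the MediaPipe implementation itself.

```python
import math

def to_pixel(norm_x, norm_y, image_width, image_height):
    """Map normalized [0, 1] coordinates to integer pixel coordinates.

    Hypothetical helper for illustration. Returns None when either
    coordinate lies outside [0, 1], e.g. a landmark outside the frame.
    """
    def in_unit_range(value):
        return 0.0 <= value <= 1.0

    if not (in_unit_range(norm_x) and in_unit_range(norm_y)):
        return None
    # Clamp so that a value of exactly 1.0 still maps to a valid pixel index.
    x_px = min(math.floor(norm_x * image_width), image_width - 1)
    y_px = min(math.floor(norm_y * image_height), image_height - 1)
    return x_px, y_px

center = to_pixel(0.5, 0.5, 640, 480)
corner = to_pixel(1.0, 1.0, 640, 480)
outside = to_pixel(1.2, 0.5, 640, 480)
print(center, corner, outside)
```

A None result is the reason a function like get_ear(…) needs error handling: measuring a distance to a missing point raises an exception.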

def get_ear(landmarks, refer_idxs, frame_width, frame_height):
    """
    Calculate Eye Aspect Ratio for one eye.


    Args:
        landmarks: (list) Detected landmarks list
        refer_idxs: (list) Index positions of the chosen landmarks
                            in order P1, P2, P3, P4, P5, P6
        frame_width: (int) Width of captured frame
        frame_height: (int) Height of captured frame


    Returns:
        ear: (float) Eye aspect ratio
        coords_points: (list or None) Pixel coordinates of P1..P6
    """
    try:
        # Convert each chosen landmark from normalized to pixel coordinates
        coords_points = []
        for i in refer_idxs:
            lm = landmarks[i]
            coord = denormalize_coordinates(lm.x, lm.y, 
                                             frame_width, frame_height)
            coords_points.append(coord)


        # Distances between the landmark pairs used in the EAR formula
        P2_P6 = distance(coords_points[1], coords_points[5])
        P3_P5 = distance(coords_points[2], coords_points[4])
        P1_P4 = distance(coords_points[0], coords_points[3])


        # Compute the eye aspect ratio
        ear = (P2_P6 + P3_P5) / (2.0 * P1_P4)


    except Exception:
        # Fall back to 0.0 if any landmark could not be converted or measured
        ear = 0.0
        coords_points = None


    return ear, coords_points

Finally, we define the calculate_avg_ear(…) function:

def calculate_avg_ear(landmarks, left_eye_idxs, right_eye_idxs, image_w, image_h):
    """Calculate Eye aspect ratio"""


    left_ear, left_lm_coordinates = get_ear(
                                      landmarks, 
                                      left_eye_idxs, 
                                      image_w, 
                                      image_h
                                    )
    right_ear, right_lm_coordinates = get_ear(
                                      landmarks, 
                                      right_eye_idxs, 
                                      image_w, 
                                      image_h
                                    )
    Avg_EAR = (left_ear + right_ear) / 2.0


    return Avg_EAR, (left_lm_coordinates, right_lm_coordinates)
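Before running on real images, the averaging logic can be sanity-checked with synthetic pixel coordinates. This is a minimal sketch that does not call MediaPipe; ear_from_points and make_eye are hypothetical helpers and the eye geometry is invented for the test.

```python
import math

def ear_from_points(points):
    """EAR for six pixel points ordered P1..P6."""
    p1, p2, p3, p4, p5, p6 = points
    return (math.dist(p2, p6) + math.dist(p3, p5)) / (2.0 * math.dist(p1, p4))

def make_eye(x_offset, half_height):
    """Six pixel points for a synthetic eye that is 60 pixels wide.

    Image coordinates are used, so a smaller y value is higher in the frame.
    """
    return [
        (x_offset, 100),                     # P1: one eye corner
        (x_offset + 20, 100 - half_height),  # P2: upper lid
        (x_offset + 40, 100 - half_height),  # P3: upper lid
        (x_offset + 60, 100),                # P4: other eye corner
        (x_offset + 40, 100 + half_height),  # P5: lower lid
        (x_offset + 20, 100 + half_height),  # P6: lower lid
    ]

# Both eyes wide open (lids 18 px apart) versus nearly closed (4 px and 6 px apart).
open_avg = (ear_from_points(make_eye(100, 9)) + ear_from_points(make_eye(200, 9))) / 2.0
closed_avg = (ear_from_points(make_eye(100, 2)) + ear_from_points(make_eye(200, 3))) / 2.0
print(round(open_avg, 2), round(closed_avg, 2))
```

With these made-up points the open eyes average 0.3 and the nearly closed eyes average about 0.08, which has the same shape as the drop the article describes for real images.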

Let's test the EAR formula. We compute the average EAR value for the image used earlier and for another image in which the eyes are closed.

image_eyes_open  = cv2.imread("test-open-eyes.jpg")[:, :, ::-1]
image_eyes_close = cv2.imread("test-close-eyes.jpg")[:, :, ::-1]


for idx, image in enumerate([image_eyes_open, image_eyes_close]):
   
    image = np.ascontiguousarray(image)
    imgH, imgW, _ = image.shape


    # Creating a copy of the original image for plotting the EAR value
    custom_chosen_lmk_image = image.copy()


    # Running inference using static_image_mode, since each image is independent
    with mp_facemesh.FaceMesh(static_image_mode=True, refine_landmarks=True) as face_mesh:
        results = face_mesh.process(image).multi_face_landmarks


        # If detections are available.
        if results:
            for face_id, face_landmarks in enumerate(results):
                landmarks = face_landmarks.landmark
                EAR, _ = calculate_avg_ear(
                          landmarks, 
                          chosen_left_eye_idxs, 
                          chosen_right_eye_idxs, 
                          imgW, 
                          imgH
                      )


                # Print the EAR value on the custom_chosen_lmk_image.
                cv2.putText(custom_chosen_lmk_image, 
                            f"EAR: {round(EAR, 2)}", (1, 24),
                            cv2.FONT_HERSHEY_COMPLEX, 
                            0.9, (255, 255, 255), 2
                )                
             
                # plot(...) is a visualization helper that is not defined in this article
                plot(img_dt=image.copy(),
                     img_eye_lmks_chosen=custom_chosen_lmk_image,
                     face_landmarks=face_landmarks,
                     ts_thickness=1, 
                     ts_circle_radius=3, 
                     lmk_circle_radius=3
                )

Result:

As you can see, the EAR value is 0.28 with the eyes open and 0.08 (close to zero) with the eyes closed.

[3] Designing a real-time detection system.

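First, we declare two thresholds and a counter.

EAR_THRESH: the threshold used to check whether the current EAR value is within range.
D_TIME: a counter variable that tracks how much time has elapsed with EAR < EAR_THRESH.
WAIT_TIME: the threshold that determines whether the time elapsed with EAR < EAR_THRESH has exceeded the permitted limit.

When the application starts, we record the current time (in seconds) in a variable t1 and read the incoming frame.
Next, we preprocess the frame and pass it through MediaPipe's Face Mesh solution pipeline.
If any landmark detections are available, we retrieve the relevant (P i) eye landmarks. Otherwise, we reset t1 and D_TIME (D_TIME is also reset here to keep the algorithm consistent).
If detections are available, we compute the average EAR of both eyes from the retrieved eye landmarks.
If EAR < EAR_THRESH, we add the difference between the current time t2 and t1 to D_TIME, then set t1 to t2 for the next frame.
If D_TIME >= WAIT_TIME, we sound the alarm; otherwise we continue to the next frame.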
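The timing logic in the steps above can be sketched independently of the camera and of MediaPipe. The class below is a hypothetical illustration: the name DrowsinessTimer, the threshold values 0.18 and 1.0 seconds, and the simulated frames are all assumptions. Resetting D_TIME when the eyes reopen is also an assumption, since the steps above only spell out the reset for frames with no detection.

```python
from dataclasses import dataclass

@dataclass
class DrowsinessTimer:
    ear_thresh: float = 0.18  # assumed EAR_THRESH, tune per camera and driver
    wait_time: float = 1.0    # assumed WAIT_TIME in seconds
    d_time: float = 0.0       # D_TIME: seconds spent with EAR below the threshold
    t1: float = 0.0           # timestamp of the previous frame

    def start(self, now):
        """Record the start time before the first frame is read."""
        self.t1 = now
        self.d_time = 0.0

    def update(self, ear, now):
        """Feed one frame. Pass ear=None when no face was detected.

        Returns True when the alarm should sound.
        """
        if ear is not None and ear < self.ear_thresh:
            # Eyes closed: accumulate the time since the previous frame.
            self.d_time += now - self.t1
        else:
            # No detection, or eyes open (assumed): reset the counter.
            self.d_time = 0.0
        self.t1 = now
        return self.d_time >= self.wait_time

# Simulated (timestamp, average EAR) pairs instead of real video frames.
timer = DrowsinessTimer(ear_thresh=0.18, wait_time=1.0)
timer.start(now=0.0)
frames = [(0.1, 0.30), (0.2, 0.10), (0.7, 0.09), (1.3, 0.08), (1.4, 0.31)]
alarms = [timer.update(ear, now) for now, ear in frames]
print(alarms)
```

In a live loop, now would come from a clock such as time.perf_counter() and ear from calculate_avg_ear(…). Passing the timestamp in explicitly keeps the logic easy to test with recorded or simulated data.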

Reference link:

https://learnopencv.com/driver-drowsiness-detection-using-mediapipe-in-python/
