Eye Movement Recognition: an open-source model for real-time detection and classification of subtle eye and eyebrow movements

Eye Movement Recognition

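Developed by shayan5422

A real-time system that detects and classifies subtle eye and eyebrow movements into three classes: "Yes", "No", and "Normal".

Category: Face · License: MIT · Tags: #real-time-eye-eyebrow-recognition #CNN-LSTM #non-verbal-interaction

Downloads: 105

Released: 11/8/2024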

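Model Overview

The model uses a CNN-LSTM architecture: the CNN extracts spatial features from each frame, and the LSTM models the temporal dynamics across the frame sequence. The design aims for robust, reliable performance in real-world conditions.

Model Features

Real-time detection

Continuously processes a live camera feed and detects eye and eyebrow movements without noticeable delay.

GPU acceleration

Uses TensorFlow-Metal on macOS to run computation on the GPU.

Extensible design

The system is designed so that additional facial gestures or movements can be added.

High accuracy

Described as highly accurate at distinguishing the supported movements, which makes it a practical tool for real-time facial gesture recognition.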

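Model Capabilities

Real-time eye and eyebrow movement detection

Facial expression classification

Non-verbal communication assistance

Use Cases

Human-computer interaction

Gesture-controlled interfaces

Enhances user-interface interaction through eye and eyebrow movements.

Offers a more natural way to interact.

Assistive technology

Non-verbal communication tools

Lets people with speech impairments communicate through eye and eyebrow movements.

Improves communication efficiency.

Behavioral analysis

Facial expression monitoring

Supports facial expression analysis in psychology or market research.

Provides objective behavioral data.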

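🚀 Eye and Eyebrow Movement Recognition Model

This model is a real-time system that detects and classifies subtle facial movements of the eyes and eyebrows. It uses a CNN-LSTM architecture to capture the spatial features of individual frames and the temporal dynamics of frame sequences, and it is suited to human-computer interaction, assistive technology, and similar applications.

🚀 Quick Start

Prerequisites

Hardware: a Mac with Apple Silicon (M1, M1 Pro, M1 Max, M2, etc.) for Metal GPU support.
Operating system: macOS 12.3 (Monterey) or later.
Python: version 3.9 or later.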
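
The prerequisites above can be checked from Python before installing anything. The following is a minimal sketch using only the standard library; the helper name `check_prerequisites` and its parameters are illustrative and not part of the project.

```python
import platform
import sys


def check_prerequisites(system, machine, mac_version, python_version):
    """Return a list of unmet prerequisites (empty if all are met).

    Hypothetical helper, not part of the project. The arguments mirror
    platform.system(), platform.machine(), platform.mac_ver()[0]
    and sys.version_info[:2].
    """
    problems = []
    if system != "Darwin":
        problems.append("macOS is required")
    if machine != "arm64":
        problems.append("Apple Silicon (arm64) is required")
    # Compare versions numerically, e.g. "12.3.1" -> (12, 3, 1)
    parts = tuple(int(p) for p in mac_version.split(".") if p.isdigit())
    if parts < (12, 3):
        problems.append("macOS 12.3 or later is required")
    if tuple(python_version) < (3, 9):
        problems.append("Python 3.9 or later is required")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(
        platform.system(),
        platform.machine(),
        platform.mac_ver()[0],
        sys.version_info[:2],
    )
    print("OK" if not issues else "\n".join(issues))
```

Passing the platform values in as arguments keeps the check easy to try with made-up inputs on any machine.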

Installation

Clone the repository

git clone https://huggingface.co/shayan5422/eye-eyebrow-movement-recognition
cd eye-eyebrow-movement-recognition

Install Homebrew (if not already installed)

Homebrew is a package manager for macOS that simplifies software installation.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Install Micromamba

Micromamba is a lightweight package manager that is compatible with Conda environments.

brew install micromamba

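Create and activate a virtual environment

Use Micromamba to create an isolated environment for the project.

# Create a new environment named 'eye_movement' with Python 3.9
micromamba create -n eye_movement python=3.9

# Activate the environment
micromamba activate eye_movement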

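Install the required libraries

Install TensorFlow with Metal support (tensorflow-macos and tensorflow-metal) along with the other required libraries.

# Install TensorFlow for macOS
pip install tensorflow-macos

# Install the TensorFlow Metal plugin for GPU acceleration
pip install tensorflow-metal

# Install the remaining dependencies
pip install opencv-python dlib imutils tqdm scikit-learn matplotlib seaborn h5py

⚠️ Important

Installing dlib on macOS can be difficult. If you run into problems, try installing it through Conda or consult dlib's official installation instructions.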
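
After the pip commands finish, it can help to confirm that every dependency is importable in the active environment. The sketch below uses only the standard library; `REQUIRED_MODULES` and `find_missing` are illustrative names, not part of the project.

```python
import importlib.util

# Import names differ from some pip package names
# (opencv-python -> cv2, scikit-learn -> sklearn).
REQUIRED_MODULES = [
    "tensorflow", "cv2", "dlib", "imutils", "tqdm",
    "sklearn", "matplotlib", "seaborn", "h5py",
]


def find_missing(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("All dependencies found" if not missing else f"Missing: {missing}")
```

This only checks that the packages can be located. To see whether the Metal GPU is visible to TensorFlow, `tf.config.list_physical_devices('GPU')` should return a non-empty list.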

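Download dlib's pretrained shape predictor

This model is required for facial landmark detection.

# Navigate to the project directory
cd /path/to/your/project/eye-eyebrow-movement-recognition/

# Download the shape predictor
curl -LO http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

# Decompress the file
bunzip2 shape_predictor_68_face_landmarks.dat.bz2

Make sure shape_predictor_68_face_landmarks.dat is in the same directory as the script.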
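
If `bunzip2` is not available, the archive can also be decompressed from Python. This is a small sketch using the standard `bz2` module; the helper name `decompress_bz2` is illustrative.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src, dst):
    """Decompress the .bz2 file at src into dst, streaming in chunks."""
    with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return Path(dst)


if __name__ == "__main__":
    archive = Path("shape_predictor_68_face_landmarks.dat.bz2")
    if archive.exists():
        # with_suffix("") drops the trailing ".bz2"
        out = decompress_bz2(archive, archive.with_suffix(""))
        print(f"Wrote {out} ({out.stat().st_size} bytes)")
    else:
        print(f"{archive} not found; download it first")
```

Unlike `bunzip2`, this leaves the original archive in place.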

Load the model

import tensorflow as tf

# Load the trained model
model = tf.keras.models.load_model('final_model_sequences.keras')

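Run predictions

import cv2
import numpy as np
import dlib
from imutils import face_utils
from collections import deque
import queue
import threading

# Initialize dlib's face detector and facial landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

# Thread-safe queues connecting the capture loop and the prediction worker
input_queue = queue.Queue()
output_queue = queue.Queue()

# Number of frames per input sequence
max_seq_length = 30

# ASSUMPTION: the class order below is a placeholder; it is not given in this
# document. Replace it with the label order used when the model was trained.
index_to_text = {0: "Yes", 1: "No", 2: "Normal"}

def preprocess_frame(frame, detector, predictor):
    # PLACEHOLDER (assumption): the original ROI extraction is not shown here.
    # This stand-in only converts the full frame to a normalized grayscale
    # image of shape (64, 256, 1); replace it with your own eye/eyebrow crop.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (256, 64))  # dsize is (width, height)
    return (resized.astype('float32') / 255.0)[..., np.newaxis]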

def prediction_worker(model, input_q, output_q):
    # Consume sequences until the None sentinel arrives
    while True:
        sequence = input_q.get()
        if sequence is None:
            break
        # Preprocess and predict
        # [Add your actual prediction logic here]
        # Example:
        prediction = model.predict(sequence, verbose=0)
        class_idx = int(np.argmax(prediction))
        confidence = float(np.max(prediction))
        output_q.put((class_idx, confidence))

# Start the prediction thread
thread = threading.Thread(target=prediction_worker, args=(model, input_queue, output_queue))
thread.start()

# Start video capture
cap = cv2.VideoCapture(0)
frame_buffer = deque(maxlen=max_seq_length)
last_text = None  # most recent prediction label, redrawn on every frame

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Detect a face and locate its landmarks
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if len(rects) > 0:
        rect = rects[0]
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)
        # Extract the region of interest and preprocess it
        # [Add your actual ROI extraction and preprocessing logic here]
        # Example:
        preprocessed_frame = preprocess_frame(frame, detector, predictor)
        frame_buffer.append(preprocessed_frame)
    else:
        # No face found: pad the sequence with a blank frame
        frame_buffer.append(np.zeros((64, 256, 1), dtype='float32'))

    # Once the buffer is full, send the sequence for prediction
    if len(frame_buffer) == max_seq_length:
        sequence = np.array(frame_buffer)
        input_queue.put(np.expand_dims(sequence, axis=0))
        frame_buffer.clear()

    # Collect any finished predictions, keeping the newest
    try:
        while True:
            class_idx, confidence = output_queue.get_nowait()
            movement = index_to_text.get(class_idx, "Unknown")
            last_text = f"{movement} ({confidence*100:.2f}%)"
    except queue.Empty:
        pass

    # Draw the latest prediction so it stays visible between predictions
    if last_text is not None:
        cv2.putText(frame, last_text, (30, 30), cv2.FONT_HERSHEY_SIMPLEX,
                    0.8, (0, 255, 0), 2, cv2.LINE_AA)

    # Display the frame
    cv2.imshow('Real-time Movement Prediction', frame)

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up
cap.release()
cv2.destroyAllWindows()
input_queue.put(None)
thread.join()
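
The loop above collects frames into non-overlapping windows of `max_seq_length` and adds a batch axis before prediction. The sketch below reproduces only that buffering step with synthetic frames, so the resulting array shapes are visible without a camera or model. `collect_sequences` is an illustrative name, and the (64, 256, 1) frame shape is taken from the blank-frame placeholder in the listing.

```python
from collections import deque

import numpy as np


def collect_sequences(frames, seq_len=30):
    """Group frames into non-overlapping batches of shape (1, seq_len, H, W, C)."""
    buffer = deque(maxlen=seq_len)
    batches = []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == seq_len:
            # Stack to (seq_len, H, W, C), then add the batch axis
            batches.append(np.expand_dims(np.array(buffer), axis=0))
            buffer.clear()
    return batches


# 65 synthetic frames yield two full sequences; the last 5 frames stay buffered
frames = [np.zeros((64, 256, 1), dtype="float32") for _ in range(65)]
batches = collect_sequences(frames)
print(len(batches), batches[0].shape)  # 2 (1, 30, 64, 256, 1)
```

Because the buffer is cleared after each sequence, a prediction is produced only once every 30 frames. Leaving the buffer uncleared would instead give a sliding window that yields a new sequence on every frame, at the cost of more inference calls.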

⚠️ Important

Replace the placeholder comments with the actual preprocessing and prediction logic from your script.
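
One possible shape for the missing preprocessing step is sketched below, assuming the model expects 64×256 single-channel frames (the shape of the blank placeholder frame in the listing). It crops the bounding box around the eyebrow and eye points of dlib's 68-point layout (indices 17-26 and 36-47), resizes with nearest-neighbor sampling in NumPy, and scales pixel values to [0, 1]. The function name, the margin, and the choice of landmarks are assumptions; the author's actual preprocessing is not shown in this document.

```python
import numpy as np

# Eyebrows (17-26) and eyes (36-47) in dlib's 68-point landmark layout
EYE_REGION_IDX = list(range(17, 27)) + list(range(36, 48))


def crop_eye_region(gray, landmarks, out_h=64, out_w=256, margin=10):
    """Crop the eye/eyebrow area from a grayscale image.

    gray: 2-D uint8 array; landmarks: (68, 2) array of (x, y) points,
    e.g. from face_utils.shape_to_np. Returns float32 (out_h, out_w, 1).
    """
    pts = np.asarray(landmarks)[EYE_REGION_IDX]
    h, w = gray.shape
    # Keep points inside the image so the crop is never empty
    xs = np.clip(pts[:, 0], 0, w - 1)
    ys = np.clip(pts[:, 1], 0, h - 1)
    # Bounding box around the selected points, padded and clipped to the image
    x0 = max(int(xs.min()) - margin, 0)
    x1 = min(int(xs.max()) + margin, w - 1)
    y0 = max(int(ys.min()) - margin, 0)
    y1 = min(int(ys.max()) + margin, h - 1)
    roi = gray[y0:y1 + 1, x0:x1 + 1]
    # Nearest-neighbor resize: pick a source row/column for each output pixel
    rows = np.arange(out_h) * roi.shape[0] // out_h
    cols = np.arange(out_w) * roi.shape[1] // out_w
    resized = roi[rows][:, cols]
    return (resized.astype("float32") / 255.0)[..., np.newaxis]
```

In the capture loop, `crop_eye_region(gray, shape)` could stand in for the `preprocess_frame` call, since `gray` and the landmark array `shape` are already computed there. In practice `cv2.resize` with interpolation would likely give smoother results than nearest-neighbor sampling.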

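✨ Key Features

Real-time detection: continuously processes live webcam input and detects eye and eyebrow movements without noticeable delay.
GPU acceleration: uses TensorFlow-Metal on macOS for GPU-accelerated computation.
Extensible design: currently supports the "Yes", "No", and "Normal" movements, and is designed to be extended with more facial gestures or movements.
User-friendly interface: overlays predictions directly on the live video stream for immediate visual feedback.
High accuracy: described as highly accurate at distinguishing the supported movements, which makes it a practical tool for real-time facial gesture recognition.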

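📦 Installation Guide

Installation follows the steps in the Quick Start section above: clone the repository, install the dependencies, and download the pretrained dlib shape predictor.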

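📚 Documentation

Intended Use

The model is suited to a range of applications, including but not limited to:

Human-computer interaction (HCI): enhancing user interfaces with gesture-based control.
Assistive technology: providing non-verbal communication tools for people with speech impairments.
Behavioral analysis: monitoring and analyzing facial expressions for psychology or market research.
Gaming: creating more immersive and responsive gaming experiences through facial gesture control.

⚠️ Important

This model is intended for research and educational purposes only. In real-world applications, make sure you comply with privacy and ethical guidelines.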

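Model Architecture

The model uses a CNN-LSTM architecture to capture spatial and temporal features:

TimeDistributed CNN layers:
- Conv2D: extracts spatial features from each frame independently.
- MaxPooling2D: reduces the spatial dimensions.
- BatchNormalization: stabilizes and speeds up training.
Flatten layer: flattens the CNN output in preparation for the LSTM.
LSTM layer: captures temporal dependencies across the frame sequence.
Fully connected layer: performs the final classification from the combined spatio-temporal features.
Output layer: uses softmax activation to produce a probability distribution over the three classes ("Yes", "No", "Normal").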
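
The layer list above can be sketched in Keras as follows. This is an illustration under stated assumptions, not the author's actual model: the filter count, LSTM and dense unit counts, and the single convolution block are hypothetical, and the input shape (30 frames of 64×256×1) is inferred from the inference code rather than stated in the architecture description.

```python
from tensorflow.keras import layers, models


def build_cnn_lstm(seq_len=30, height=64, width=256, channels=1, num_classes=3):
    """Build a small CNN-LSTM classifier (layer sizes are hypothetical)."""
    return models.Sequential([
        layers.Input(shape=(seq_len, height, width, channels)),
        # TimeDistributed applies the same CNN layers to every frame
        layers.TimeDistributed(
            layers.Conv2D(16, (3, 3), activation="relu", padding="same")),
        layers.TimeDistributed(layers.MaxPooling2D((2, 2))),
        layers.TimeDistributed(layers.BatchNormalization()),
        # Flatten each frame's feature map while keeping the time axis
        layers.TimeDistributed(layers.Flatten()),
        # The LSTM summarizes the frame sequence into one vector
        layers.LSTM(32),
        layers.Dense(32, activation="relu"),
        # Softmax over the three classes
        layers.Dense(num_classes, activation="softmax"),
    ])


model = build_cnn_lstm()
```

Calling `model.summary()` lists the per-layer output shapes, which is a quick way to compare a sketch like this against the shapes of the released `final_model_sequences.keras` file.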