• Hackers Realm

Realtime Human Pose Estimation using Python | OpenCV - Mediapipe | Deep Learning Tutorial

Realtime human pose estimation is the process of identifying and tracking the movement of the human body in a recorded video or a live feed. It is a deep learning task with applications in character animation, gaming, sports, robotics and security.

In this project tutorial we are going to detect and track human body landmarks and their movement from a recorded video or a realtime recording using Mediapipe and OpenCV.





Project Information

The objective of the project is to estimate human body pose in realtime. The Mediapipe module is used to detect and track body movements, and it works on both recorded video and live input. It detects 33 landmark points on the human body and runs smoothly in realtime because it tracks the pose from frame to frame instead of running a full detection on every frame.



Install Modules


Let us install the OpenCV module and Mediapipe module.

!pip install opencv-python
!pip install mediapipe

Import Modules

import cv2
import mediapipe as mp


## initialize pose estimator
mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose
pose = mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5)
  • Initialization of the pose estimator

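  • min_detection_confidence=0.5 - Minimum confidence (0 to 1) for the person detection to be considered successful

  • min_tracking_confidence=0.5 - Minimum confidence (0 to 1) for the landmarks to be considered tracked successfully; below it, person detection runs again on the next frame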

  • Image reference of points covered on the human body

  • Very helpful baseline for other projects
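
As a text stand-in for a reference image, the sketch below lists the 33 landmark indices with their names. The names are written out by hand and are assumed to mirror the mp_pose.PoseLandmark enum; landmark_name is a hypothetical helper, so check the list against list(mp_pose.PoseLandmark) in your installed version before relying on it.

```python
# Landmark names in index order, written out by hand (assumed to match
# mp_pose.PoseLandmark; verify against your installed Mediapipe version).
POSE_LANDMARK_NAMES = [
    "NOSE", "LEFT_EYE_INNER", "LEFT_EYE", "LEFT_EYE_OUTER",
    "RIGHT_EYE_INNER", "RIGHT_EYE", "RIGHT_EYE_OUTER",
    "LEFT_EAR", "RIGHT_EAR", "MOUTH_LEFT", "MOUTH_RIGHT",
    "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_ELBOW", "RIGHT_ELBOW",
    "LEFT_WRIST", "RIGHT_WRIST", "LEFT_PINKY", "RIGHT_PINKY",
    "LEFT_INDEX", "RIGHT_INDEX", "LEFT_THUMB", "RIGHT_THUMB",
    "LEFT_HIP", "RIGHT_HIP", "LEFT_KNEE", "RIGHT_KNEE",
    "LEFT_ANKLE", "RIGHT_ANKLE", "LEFT_HEEL", "RIGHT_HEEL",
    "LEFT_FOOT_INDEX", "RIGHT_FOOT_INDEX",
]


def landmark_name(index):
    """Return the name for a landmark index (hypothetical helper)."""
    return POSE_LANDMARK_NAMES[index]


# print the index of every landmark next to its name
for i, name in enumerate(POSE_LANDMARK_NAMES):
    print(i, name)
```

The same index is what you pass to pose_landmarks.landmark[...] when you want one specific point.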



Pose Estimation for Video



cap = cv2.VideoCapture('test_video.mp4')
while cap.isOpened():
    # read frame; ret is False when the video ends or the read fails
    ret, frame = cap.read()
    if not ret:
        break

    # resize the frame for portrait video
    frame = cv2.resize(frame, (350, 600))
    # convert to RGB
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # process the frame for pose detection
    pose_results = pose.process(frame_rgb)
    # print(pose_results.pose_landmarks)

    # draw skeleton on the frame
    mp_drawing.draw_landmarks(frame, pose_results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
    # display the frame
    cv2.imshow('Output', frame)

    # press 'q' to stop early
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


  • Demonstration of how the Human Pose estimation works

  • cap.read() - Reads the next frame from the video for processing

  • frame = cv2.resize(frame, (350, 600)) - Resized to portrait size to cover the whole body in this particular video.

  • frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) - Converts each frame from BGR to RGB, since OpenCV reads frames as BGR and Mediapipe expects RGB

  • pose_results = pose.process(frame_rgb) - Processes each frame and returns the detected pose landmarks

  • mp_drawing.draw_landmarks(frame, pose_results.pose_landmarks, mp_pose.POSE_CONNECTIONS) - Draws every landmark found by the pose estimator initialized earlier and connects them into a skeleton

  • cap.release() - Releases the capture object

  • cv2.destroyAllWindows() - Closes all the windows that were opened

  • You can clearly see how certain parts are marked very well

  • The video was converted and resized for better display



# get landmark for a specific point
pose_results.pose_landmarks.landmark[32]

x: 0.35414522886276245
y: 0.8367241024971008
z: 0.16406674683094025
visibility: 0.9696751236915588

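  • Landmark of a specific point detected from the video; index 32 is the right foot index, x and y are normalized to the frame width and height, and visibility is the likelihood that the point is visible in the frame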
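
Because x and y are normalized, they usually need to be scaled by the frame size before they can be used as pixel positions. The sketch below is one possible way to do that. landmark_to_pixel and its min_visibility parameter are hypothetical names, not part of Mediapipe, and the SimpleNamespace object is only a stand-in for pose_results.pose_landmarks so that the example runs without a video.

```python
from types import SimpleNamespace


def landmark_to_pixel(pose_landmarks, index, width, height, min_visibility=0.5):
    """Hypothetical helper: pixel (x, y) for one landmark, or None."""
    # pose_landmarks is None when no person was detected in the frame
    if pose_landmarks is None:
        return None
    lm = pose_landmarks.landmark[index]
    # skip points the model considers unlikely to be visible
    if lm.visibility < min_visibility:
        return None
    # x and y are normalized; scale them and clamp to the frame borders
    px = max(0, min(int(lm.x * width), width - 1))
    py = max(0, min(int(lm.y * height), height - 1))
    return px, py


# Stand-in for pose_results.pose_landmarks, holding only index 32
# with the values printed above
point = SimpleNamespace(x=0.35414522886276245, y=0.8367241024971008,
                        z=0.16406674683094025, visibility=0.9696751236915588)
fake_landmarks = SimpleNamespace(landmark={32: point})

# 350 x 600 is the frame size used for the recorded video
print(landmark_to_pixel(fake_landmarks, 32, 350, 600))
```

With a real result you would pass pose_results.pose_landmarks instead of fake_landmarks. Checking for None first matters because indexing into the landmarks of a frame with no detected person raises an error.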


Realtime Pose Estimation



cap = cv2.VideoCapture(0)
while cap.isOpened():
    # read frame; ret is False when the webcam read fails
    ret, frame = cap.read()
    if not ret:
        break

    # resize the frame for portrait video
    # frame = cv2.resize(frame, (350, 600))
    # convert to RGB
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # process the frame for pose detection
    pose_results = pose.process(frame_rgb)
    # print(pose_results.pose_landmarks)

    # draw skeleton on the frame
    mp_drawing.draw_landmarks(frame, pose_results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
    # display the frame
    cv2.imshow('Output', frame)

    # press 'q' to stop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


  • In realtime, when body parts come into the frame, it marks the pose and tracks the movement.

  • cv2.VideoCapture(0) - Zero is the index of the default webcam; if you have multiple webcams, change the value of the parameter.



Final Thoughts


  • A very good baseline for other movement detection projects like movement sensors or skeleton movement recording for animation projects.

  • Adjust the min. detection confidence and min. tracking confidence to tune the module and obtain better results.
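
As a rough illustration of the skeleton movement recording idea, the sketch below flattens the landmarks of each frame into one CSV row. landmarks_to_row is a hypothetical helper, the column layout is an assumption, and the SimpleNamespace objects are stand-ins for real pose_results.pose_landmarks values so that the example runs on its own.

```python
import csv
import io
from types import SimpleNamespace


def landmarks_to_row(frame_index, pose_landmarks, num_points=33):
    """Hypothetical helper: flatten one frame's landmarks into a CSV row."""
    row = [frame_index]
    if pose_landmarks is None:
        # no person detected: keep the row but leave the values empty
        return row + [""] * (num_points * 4)
    for lm in pose_landmarks.landmark:
        row.extend([lm.x, lm.y, lm.z, lm.visibility])
    return row


# Stand-in data in place of real pose_results.pose_landmarks objects
fake_point = SimpleNamespace(x=0.5, y=0.5, z=0.0, visibility=1.0)
fake_landmarks = SimpleNamespace(landmark=[fake_point] * 33)

# in-memory buffer; swap for open('poses.csv', 'w', newline='') to write a file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(landmarks_to_row(0, fake_landmarks))  # frame with a pose
writer.writerow(landmarks_to_row(1, None))            # frame without a pose
```

Inside the video loop you would call landmarks_to_row once per frame with a running frame counter and pose_results.pose_landmarks.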


In this project tutorial we have explored the Realtime Human Pose Estimation using Mediapipe. This is a deep learning project reusable for any other movement detection project with recorded videos or live recording.


Get the project notebook from here


Thanks for reading the article!!!


Check out more project videos from the YouTube channel Hackers Realm
