I am currently working on a "game" that lets its user shift the position of a circle while watching a video. The video shows two individuals who take turns speaking. The user's task is to move the circle to the active speaker. While this is happening, I plan to switch the video at some point without the user noticing, while the "game" and the circle continue to be shown.

To achieve this, I wrote the following code. It takes input from the user, sends all the data to a TCP server, and writes the information to a log file. But I have run into an issue: the audio and video are not synchronised, and even with the lowest value, waitKey(1), the audio runs faster than the video.

Any help on how to fix this issue will be highly appreciated. Thanks in advance.

PS: I am using Visual Studio Code and my Python version is 3.9.6 64-bit.

import cv2 as cv #import the OpenCV library
import numpy as np #Import Numpy Library
import socket  # socket creation for Telnet
from datetime import datetime
from telnetlib import Telnet #telnet client
from ffpyplayer.player import MediaPlayer #ffpyplayer for playing audio
current_date_time = datetime.now().strftime('%Y-%m-%d-%H:%M:%S')
HOST = '127.0.0.1'  # The remote host
PORT = 4212  # The same port as used by the server
TCP_PORT = 9999  # port used to connect with the server file
def send_data(message, s, output):
    s.sendall(message.encode())
    data = s.recv(1024)
    output.write('\n'+ current_date_time+' '+message+ '\n')
    return data
def circle(frame, left):
    if left:
        cv.circle(frame,(450,250),20,(255,255,255),50)
    if not left:
        cv.circle(frame,(1400,250),20,(255,255,255),50)
def video():
    cap1 = cv.VideoCapture('P1.mp4') # the video that we want
    player = MediaPlayer('P1.mp4')
    circle_is_left = True
    if (cap1.isOpened()== False):
        print("Error opening video 1")  
    while (cap1.isOpened()):
        ret,frame = cap1.read() #capture frame-by-frame video
        audio_frame,val=player.get_frame() # capture frame-by-frame audio
        if ret== True:
            key_pressed = cv.waitKey(1)
            if key_pressed == ord(' '): #pressing space bar ends the video
                with open('out.txt', 'a') as output:
                    send_data('video 1 is changed',s,output)
                break
            elif key_pressed == 2: #left key pressed changes circle to left
                circle_is_left = True
                with open('out.txt', 'a') as output:
                    send_data('left',s,output)
            elif key_pressed == 3: # right key pressed changes circle to right
                circle_is_left = False
                with open('out.txt', 'a') as output:
                    send_data('Right ',s,output)
            circle(frame, circle_is_left) #display the circle at all times
            cv.imshow('cap1',frame) #display resulting frame 
            if val != 'eof' and audio_frame is not None:
                img,t = audio_frame
    cap1.release()
    cv.destroyAllWindows()
    cap2 = cv.VideoCapture('P2.mov') # the video that we want
    player2 = MediaPlayer('P2.mov')
    circle_is_left = True
    if (cap2.isOpened()== False):
        print("Error opening video 2")  
    while (cap2.isOpened()):
        ret,frame = cap2.read() #capture frame-by-frame video
        audio_frame,val=player2.get_frame() # capture frame-by-frame audio
        if ret== True:
            key_pressed = cv.waitKey(1)
            if key_pressed == ord(' '): #pressing space bar ends the video
                with open('out.txt', 'a') as output:
                    send_data('video 1 is changed',s,output)
                break
            elif key_pressed == 2: #left key pressed changes circle to left
                circle_is_left = True
                with open('out.txt', 'a') as output:
                    send_data('left',s,output)
            elif key_pressed == 3: # right key pressed changes circle to right
                circle_is_left = False
                with open('out.txt', 'a') as output:
                    send_data('Right ',s,output)
            circle(frame, circle_is_left) #display the circle at all times
            cv.imshow('cap2',frame) #display resulting frame 
            if val != 'eof' and audio_frame is not None:
                img,t = audio_frame
    cap2.release()
    cv.destroyAllWindows()
def main():
    print("Game1.py is connected to TCP server")
    video()
if __name__=='__main__':
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, TCP_PORT))
    main()
                It seems you run the same code for both P1 and P2, so you could create a function and run it with different filenames. For P1 you use the same P1.mp4 for cv and MediaPlayer, but for P2 you use different names, P2.mov and P2.mp4, so maybe you are playing the file with the wrong audio.
– furas
                Jul 13, 2021 at 16:25
                @furas I did that, and now the video and audio at least break evenly. But I am still struggling to fix the synchronisation issue.
– user9228288
                Jul 14, 2021 at 3:57
                Synchronisation is a big problem and I don't know a solution for it. MediaPlayer runs an external program which starts at a different moment than VideoCapture, and you can't control it. I would try some modules that access the audio device directly, e.g. pyaudio, winsound, playsound.
– furas
                Jul 14, 2021 at 13:17
                You can't speed it up; you can only use waitKey(milliseconds) to decide how fast it displays. Theoretically, if you set waitKey(33) it will show a frame every 33 ms, which gives 1000 ms / 33 ms ≈ 30 frames per second, but the code also runs other functions which need time, so you would have to measure the time between frames and use a corrected value in waitKey. And if MediaPlayer starts faster than cv, you could start MediaPlayer inside the loop when you get the first cap1.read(); maybe then they will start at almost the same time.
– furas
                Jul 15, 2021 at 5:22
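
The correction described in the comment above could be sketched roughly as follows. This is only an illustration, not code from the thread: `corrected_delay_ms` is a hypothetical helper, and the 30 FPS default is an assumption (the real rate could be read with `cap1.get(cv.CAP_PROP_FPS)`).

```python
import time

def corrected_delay_ms(frame_start, fps=30.0, now=None):
    """Return the waitKey delay (ms) left in this frame's time budget.

    frame_start and now are timestamps in seconds, e.g. from
    time.perf_counter(). fps=30.0 is an assumed target rate.
    """
    if now is None:
        now = time.perf_counter()
    budget_ms = 1000.0 / fps                  # time available per frame
    spent_ms = (now - frame_start) * 1000.0   # time already used by read(), drawing, etc.
    # waitKey(0) would block until a key is pressed, so never go below 1
    return max(1, int(budget_ms - spent_ms))
```

In the loop, `frame_start = time.perf_counter()` would be taken just before `cap1.read()`, and the result passed to `cv.waitKey(...)` in place of the fixed `1`. Because each frame is corrected separately and the delay is truncated to whole milliseconds, small errors can still add up over a long video.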
                1000 FPS is only a theory. waitKey(1) waits at least 1 ms but it may wait longer; besides, read() and other functions also need some time, so in the end you get much slower video. waitKey can reduce the video speed when it is too fast, but in your situation other code can slow you down. Maybe you should run send_data in a separate thread. Or maybe you should open the file only once, because opening it again and again also slows down the code.
– furas
                Jul 15, 2021 at 6:11
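
One way the "separate thread" suggestion might look is a queue drained by a background worker, so the video loop only enqueues a message and carries on. This is a minimal sketch under assumptions: `start_worker` and `handle` are hypothetical names, and `None` is used as a stop sentinel.

```python
import queue
import threading

def start_worker(handle):
    """Run handle(message) on a background thread for each queued message."""
    messages = queue.Queue()

    def run():
        while True:
            message = messages.get()   # blocks until a message arrives
            if message is None:        # None is the stop sentinel
                break
            handle(message)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return messages, thread
```

With the question's code, `handle` could be something like `lambda m: send_data(m, s, output)`, with `out.txt` opened once before the loop. The key handlers would then call `messages.put('left')` instead of `send_data(...)`, and `messages.put(None)` followed by `thread.join()` would flush the queue before the file is closed.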

You can use cv2.CAP_PROP_POS_MSEC to synchronize video and audio as follows:

import cv2
import time
from ffpyplayer.player import MediaPlayer
def video(file):
    cap = cv2.VideoCapture(file)
    player = MediaPlayer(file)
    start_time = time.time()
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        _, val = player.get_frame(show=False)
        if val == 'eof':
            break
        cv2.imshow(file, frame)
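        elapsed = (time.time() - start_time) * 1000  # wall-clock time since start, msec
        play_time = int(cap.get(cv2.CAP_PROP_POS_MSEC))  # current video position, msec
        # wait until the wall clock reaches the video position (at least 1 msec)
        sleep = max(1, int(play_time - elapsed))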
        if cv2.waitKey(sleep) & 0xFF == ord("q"):
            break
    player.close_player()
    cap.release()
    cv2.destroyAllWindows()

See https://github.com/otamajakusi/opencv_video_with_audio
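
To make the timing line in the answer easier to follow, here is the same calculation pulled out into a small function with two worked cases. `wait_ms` is a hypothetical name used only for this illustration.

```python
def wait_ms(play_time_ms, elapsed_ms):
    """Delay for cv2.waitKey: time until the frame's timestamp, at least 1 ms."""
    return max(1, int(play_time_ms - elapsed_ms))

# Video ahead of the wall clock: wait out the difference.
print(wait_ms(play_time_ms=400, elapsed_ms=370))  # 30
# Video behind the wall clock: wait the minimum so it can catch up.
print(wait_ms(play_time_ms=400, elapsed_ms=450))  # 1
```

Note that this paces the video against the wall clock measured from `start_time`, on the assumption that the audio began playing at about the same moment; it does not read the audio position itself.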
