I'm working on a web app that turns audio into text using Streamlit. I am using the SpeechRecognition library, which has a limit of 3 minutes, so I am working on a fix that splits the audio into 3-minute chunks. I am testing this on a 15-minute audio file, and the first two chunks work perfectly. But for the chunks after that, I get this error:
FileNotFoundError: [WinError 2] The system cannot find the file specified
Traceback:
  File "C:\Users\marcu\AppData\Roaming\Python\Python39\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\marcu\OneDrive\Desktop\Coding\auto notes\test.py", line 51, in <module>
    main()
  File "C:\Users\marcu\OneDrive\Desktop\Coding\auto notes\test.py", line 26, in main
    audio = pydub.AudioSegment.from_file(temp_audio_file.name)
  File "C:\Users\marcu\AppData\Roaming\Python\Python39\site-packages\pydub\audio_segment.py", line 728, in from_file
    info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
  File "C:\Users\marcu\AppData\Roaming\Python\Python39\site-packages\pydub\utils.py", line 274, in mediainfo_json
    res = Popen(command, stdin=stdin_parameter, stdout=PIPE, stderr=PIPE)
  File "c:\program files\python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "c:\program files\python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args)
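A note on reading this traceback: the innermost frames are in `subprocess.Popen`, called from pydub's `mediainfo_json`, so the file Windows cannot find may be the external program pydub tries to launch rather than the audio file itself. pydub relies on external ffmpeg/ffprobe executables for non-WAV formats. A minimal check, assuming those tools are expected on `PATH`, could look like the sketch below. The helper name `find_media_tools` is hypothetical and not part of pydub.

```python
import shutil


def find_media_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    Hypothetical helper for diagnosis only; it does not call pydub.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_media_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If either tool is reported as not found, installing ffmpeg and adding its `bin` folder to `PATH` is the usual thing to try, though that is only one possible cause of this error.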
Here is the script:
import streamlit as st
import speech_recognition as sr
import os
import math

def file_selector(folder_path='.'):
    filenames = os.listdir(folder_path)
    selected_filename = st.selectbox('Select a file', filenames)
    return os.path.join(folder_path, selected_filename)

def main():
    st.title("Audio to Text Converter")
    # Upload the audio file
    audio_file = st.file_uploader("Upload an audio file", type=["mp3", "wav", "ogg"])
    if audio_file is not None:
        # Split the audio file into 5-minute chunks
        CHUNK_DURATION = 5 * 60 # 5 minutes
        r = sr.Recognizer()
        with sr.AudioFile(audio_file) as source:
            audio_duration = math.ceil(source.DURATION)
            num_chunks = math.ceil(audio_duration / CHUNK_DURATION)
            for i in range(num_chunks):
                chunk_start = i * CHUNK_DURATION
                chunk_end = min((i + 1) * CHUNK_DURATION, audio_duration)
                audio_text = r.record(source, offset=chunk_start, duration=chunk_end-chunk_start)
                text = r.recognize_google(audio_text)
                # Display the text for this chunk
                st.header(f"Text from Audio (Chunk {i+1}/{num_chunks})")
                st.write(text)

if __name__ == '__main__':
    main()
I have asked around on Discord and in other places, but no one seemed to know the fix. I wondered whether this was due to a miscalculation of how many chunks there should be, but when I print num_chunks it returns 5, which is correct for a 15-minute audio file. I also tested this with another file, but got the same error after the first 2 chunks. Thanks for the help in advance!
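One possible explanation for only the first two chunks working, offered as an assumption rather than a confirmed diagnosis: `Recognizer.record` reads from the current position of the already-open source, and `offset` skips ahead from that position rather than from the start of the file. The sketch below simulates that arithmetic in plain Python with no audio library involved. `simulate_reads` and its parameters are hypothetical names used only for illustration.

```python
def simulate_reads(total_seconds, chunk_seconds, use_absolute_offset):
    """Simulate successive record() calls on one open stream.

    Returns the (start, end) window in seconds that each call would read.
    Assumption: reads are sequential, and `offset` is skipped from the
    current stream position, not from the start of the file.
    """
    position = 0
    windows = []
    num_chunks = -(-total_seconds // chunk_seconds)  # ceiling division
    for i in range(num_chunks):
        # The script in the question passes offset = i * chunk length.
        offset = i * chunk_seconds if use_absolute_offset else 0
        start = min(position + offset, total_seconds)
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        position = end
    return windows


if __name__ == "__main__":
    # 15-minute file, 3-minute chunks, offset passed as in the question
    print(simulate_reads(900, 180, use_absolute_offset=True))
    # Same file, relying only on `duration` for each read
    print(simulate_reads(900, 180, use_absolute_offset=False))
```

Under that assumption, passing `offset=chunk_start` makes the second call skip a chunk and the third call start at the end of the file, so every later chunk is empty. Dropping `offset` and passing only `duration` would read the chunks back to back.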