Audio Video 2 Text
This revision is from 2024/07/02 14:32.
Extract Audio
ffmpeg -i input.mp4 -ar 16000 -ac 1 output.wav
Transcribe to text
pocketsphinx_continuous -infile output.wav >/tmp/out.txt
Do a whole bunch at once
#!/bin/bash
for file in *.mp4; do
ffmpeg -i "$file" -ar 16000 -ac 1 "${file%.mp4}.wav"
pocketsphinx_continuous -infile "${file%.mp4}.wav" >"${file%.mp4}.txt"
done
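A slightly more defensive variant of the loop above might look like the sketch below. It skips the unmatched *.mp4 pattern when the directory has no videos, skips transcription when ffmpeg fails, and can print what it would do without running anything. The names out_name, convert_all and DRY_RUN are made up for illustration; the ffmpeg and pocketsphinx_continuous invocations are the same as above.

```shell
#!/bin/bash
# Sketch only: out_name, convert_all and DRY_RUN are hypothetical names,
# not part of the original script.

# Print the output path for an input video: strip a trailing ".mp4"
# from $1 and append the extension given in $2.
out_name() {
    printf '%s\n' "${1%.mp4}.$2"
}

# Convert and transcribe every .mp4 in the current directory.
convert_all() {
    for file in *.mp4; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$file" ] || continue
        wav=$(out_name "$file" wav)
        txt=$(out_name "$file" txt)
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only report what would be done.
            printf '%s -> %s -> %s\n' "$file" "$wav" "$txt"
            continue
        fi
        # Skip transcription when audio extraction fails.
        ffmpeg -i "$file" -ar 16000 -ac 1 "$wav" || continue
        pocketsphinx_continuous -infile "$wav" >"$txt"
    done
}

convert_all
```

Setting DRY_RUN=1 before running the script lists the planned input, .wav and .txt names for each video, which is a cheap way to check the file names before a long run.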
LLMDataHub: Awesome Datasets for LLM Training: https://github.com/Zjh-819/LLMDataHub