If you only have a piece of text, how can you create a SubRip subtitle file (.srt) using Python? In this tutorial, we will show you how to do it.
SubRip Subtitle File
A SubRip subtitle file is a plain-text file with the extension .srt. Its content may look like this:
1
00:00:00,000 --> 00:00:01,428
Homework is important

2
00:00:01,428 --> 00:00:04,400
because it develops core skills in young children
......
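Each entry consists of an index line, a time line, the text, and a blank line that separates it from the next entry. As a minimal sketch of that layout (the helper name `build_srt` and the tuple layout of `cues` are assumptions made for illustration, not part of any library), the sample above could be assembled like this:

```python
def build_srt(cues):
    """Join (start, end, text) tuples into SubRip text.

    `cues` is a hypothetical input layout: start and end are strings
    that are already formatted as HH:MM:SS,mmm.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Index line, time line, text line. Entries are numbered from 1.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Joining with "\n" puts one blank line between entries.
    return "\n".join(blocks)


cues = [
    ("00:00:00,000", "00:00:01,428", "Homework is important"),
    ("00:00:01,428", "00:00:04,400", "because it develops core skills in young children"),
]
print(build_srt(cues))
```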
To create a .srt file, we have to answer these questions:
- How to split a text to display in a video?
- How to get the duration of each line?
- How to convert the start time and end time of each line to the SubRip time format?
How to split a text to display in a video?
We should not display a long text in a single video frame. To fix this problem, we can split a long text into short lines, either manually or with Python code.
Here is an example:
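One possible way to split the text in Python is sketched below, using only the standard library. The function name `split_text` and the limit of 42 characters per line are assumptions chosen for illustration; pick a limit that suits your video.

```python
import re
import textwrap


def split_text(text, max_chars=42):
    """Split text into subtitle-sized lines.

    max_chars=42 is an assumed limit, not a value from this tutorial.
    """
    lines = []
    # Split after sentence-ending punctuation first, then wrap long sentences.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if sentence:
            lines.extend(textwrap.wrap(sentence, width=max_chars))
    return lines


# The second sentence is made-up sample text.
text = ("Homework is important because it develops core skills in young children. "
        "It also teaches them to plan their time.")
for line in split_text(text):
    print(line)
```

Splitting on sentence boundaries before wrapping keeps a subtitle line from spanning two sentences, which usually reads better on screen.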
How to get the duration of each line?
To get the duration of each text line, we can use a TTS (text-to-speech) model. For example, we can use a VITS model to convert each line to speech; the duration of a line is then the number of generated audio samples divided by the sample rate.
Convert Text to Speech in Python Using VITS Model
Once we have the duration of each text line, we can compute the start time and end time of each line in the whole audio: each line starts where the previous one ends.
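The timing calculation can be sketched without running a TTS model. In the example below, the function name `line_timings`, the sample counts, and the 22050 Hz sample rate are assumptions for illustration; in practice the counts would come from the audio generated for each line.

```python
from itertools import accumulate


def line_timings(sample_counts, sample_rate=22050):
    """Return a (start, end) pair in seconds for each line.

    sample_counts holds the number of audio samples generated per line.
    The default sample rate is assumed to match the TTS output.
    """
    # Accumulate integer sample counts first to avoid floating-point drift.
    ends = [total / sample_rate for total in accumulate(sample_counts)]
    # Each line starts where the previous one ends; the first starts at 0.
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))


# Hypothetical sample counts: 1.0 s and 2.0 s of audio.
print(line_timings([22050, 44100]))  # [(0.0, 1.0), (1.0, 3.0)]
```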
How to convert the start time and end time to the SubRip time format?
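The SubRip time format is hours:minutes:seconds,milliseconds, with a comma before the milliseconds. The start time and end time are separated by an arrow:
00:00:01,428 | --> | 00:00:04,400
start time | | end time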
Since we have the start time and end time of each text line in seconds, we can convert them to the SubRip time format. Here is the tutorial:
Python Convert Seconds to Days, Hours, Minutes and Seconds
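As a rough sketch of this conversion, the function below turns a number of seconds into a SubRip timestamp. The name `seconds_to_srt_time` is hypothetical; it is only a stand-in for a helper such as `sec2time`, whose actual implementation is not shown in this tutorial.

```python
def seconds_to_srt_time(seconds):
    """Format a non-negative number of seconds as HH:MM:SS,mmm."""
    # Work in whole milliseconds so rounding happens only once.
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


print(seconds_to_srt_time(1.428))  # 00:00:01,428
print(seconds_to_srt_time(4.4))    # 00:00:04,400
```

Converting to integer milliseconds before splitting into fields avoids results such as 59.9996 seconds being printed as "59,1000".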
Then we can create a .srt file easily. Here is some example code:
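import numpy as np
import soundfile as sf

# data (the list of text lines), tokenizer, model and sec2time are defined elsewhere
sample_rate = 22050  # should match the sample rate of the TTS model output
audio_data = []
i = 0
start_time = 0
end_time = 0
line_info = ""
for text in data:
    # Tokenize inputs
    inputs = tokenizer(text)
    # Generate speech
    outputs = model.run(None, {"text": inputs})
    wav = outputs[0]
    # wav.shape[0] is a sample count; divide by the sample rate to get seconds
    # (this assumes sec2time expects seconds)
    end_time = start_time + wav.shape[0] / sample_rate
    audio_data.append(wav)

    i += 1
    info = str(i)
    start_info = sec2time(start_time)
    end_info = sec2time(end_time)
    start_time = end_time  # the next line starts where this one ends
    time_info = f"{start_info} --> {end_info}"  # subrip subtitle file time format
    text_info = text
    line_info += info + "\n" + time_info + "\n" + text_info + "\n\n"

print(len(audio_data))
audio = np.concatenate(audio_data, axis=0)
# Write the subtitles and the concatenated audio to files
with open("sub-text.srt", "w", encoding="utf-8") as f:
    f.write(line_info)
sf.write("sub-audio.wav", audio, sample_rate)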