Title: | Read Subtitle Files as Tabular Data |
---|---|
Description: | Read 'SubRip' <https://sourceforge.net/projects/subrip/> subtitle files as data frames for easy text analysis or manipulation. Easily shift numeric timings and export subtitles back into valid 'SubRip' timestamp format to sync subtitles and audio. |
Authors: | Kiernan Nicholls [aut, cre, cph] |
Maintainer: | Kiernan Nicholls |
License: | GPL-3 |
Version: | 1.0.4 |
Built: | 2025-03-07 02:47:56 UTC |
Source: | https://github.com/k5cents/srt |
Convert the SubRip file format to a tabular data frame of times and text.
read_srt(path, collapse = "\n")
path | A path or connection to an |
collapse | The character with which to separate subtitle lines. |
The SubRip format is a non-tabular, line-oriented text file. Groups of subtitle text are separated by a blank line, and each group is preceded by an index and a timestamp string giving the times at which the subtitle appears and disappears. These components (index, time, text) can be parsed individually and combined into a data frame of subtitle groups.
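The parsing described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper name `parse_srt_sketch` and the column names `n`, `start`, `end`, and `subtitle` are assumptions chosen for this example.

```r
# Hypothetical helper (not part of the package): parse SubRip lines into a
# data frame using base R only. Column names are assumed for illustration.
parse_srt_sketch <- function(lines, collapse = "\n") {
  # Convert "HH:MM:SS,mmm" timestamps to numeric seconds
  to_seconds <- function(stamp) {
    parts <- strsplit(sub(",", ".", stamp, fixed = TRUE), ":", fixed = TRUE)
    vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
  }
  # Blank lines separate subtitle groups; count them to label each group
  blank <- !nzchar(trimws(lines))
  group <- cumsum(blank)[!blank]
  blocks <- unname(split(lines[!blank], group))
  # Line 1 of each group is the index, line 2 is "start --> end"
  times <- strsplit(vapply(blocks, `[`, character(1), 2), " --> ", fixed = TRUE)
  data.frame(
    n = as.integer(vapply(blocks, `[`, character(1), 1)),
    start = to_seconds(vapply(times, `[`, character(1), 1)),
    end = to_seconds(vapply(times, `[`, character(1), 2)),
    # Remaining lines are the subtitle text, joined with `collapse`
    subtitle = vapply(
      blocks,
      function(b) paste(b[-(1:2)], collapse = collapse),
      character(1)
    ),
    stringsAsFactors = FALSE
  )
}

example_lines <- c(
  "1", "00:00:01,000 --> 00:00:03,500", "Hello there.", "General greeting.",
  "",
  "2", "00:00:04,250 --> 00:00:06,000", "Second subtitle."
)
parsed <- parse_srt_sketch(example_lines, collapse = " ")
print(parsed)
```

The sketch assumes well-formed input, with every group carrying an index line and a timestamp line; a real file may need additional validation.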
A data frame of subtitles.
# read linear text to tabular data
read_srt(srt_example(), collapse = " ")
srt comes bundled with a number of sample files in its inst/extdata
directory. This function makes them easy to access.
srt_example()
It's a Wonderful Life (1946) entered the public domain in 1974.
The path or name of an example .srt file.
srt_example()
Parse components of a subtitle file
srt_seconds(x)
srt_index(x)
srt_text(x, collapse = "\n")
x | A character vector with the lines of an |
collapse | The character with which to separate subtitle lines. |
The parsed individual components of a subtitle: integer indexes, numeric times, and collapsed string subtitles.
# return individual components of each subtitle
x <- readLines(srt_example())
head(srt_seconds(x)[[1]])
head(srt_index(x))
head(srt_text(x))
Uniformly shift subtitle times
srt_shift(x, seconds)
x | A subtitle data frame from |
seconds | The number of seconds to shift the start and end time. |
A typical workflow reads a linear .srt file, shifts its times, and writes the result back:
read_srt(file) %>% srt_shift(2.1) %>% write_srt(file)
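To make the shift step concrete, the following base R sketch applies a uniform offset to a small stand-in data frame. It is an illustration under assumptions rather than the package's code: the helper `shift_sketch` and the column names `start` and `end` are hypothetical.

```r
# Hypothetical stand-in for a subtitle data frame; column names are assumed
subs <- data.frame(
  n = 1:2,
  start = c(1.0, 4.25),
  end = c(3.5, 6.0),
  subtitle = c("Hello there.", "Second subtitle."),
  stringsAsFactors = FALSE
)

# Sketch of a uniform shift: add the same offset to both time columns
shift_sketch <- function(x, seconds) {
  x$start <- x$start + seconds
  x$end <- x$end + seconds
  x
}

shifted <- shift_sketch(subs, 2.1)
print(shifted)
```

Because the same value is added to both columns, the duration of every subtitle is unchanged; only its position on the timeline moves.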
The subtitle data frame with its numeric start and end times uniformly shifted by the given number of seconds.
# shift all start and end times by some amount
x <- read_srt(srt_example(), collapse = " ")
srt_shift(x, 1.234)
Write subtitle data frame as SubRip text file
write_srt(x, path = NULL, wrap = TRUE, width = 40)
x | A subtitle data frame from |
path | File or connection to write to. |
wrap | If |
width | If |
SubRip text files format each subtitle as four components, the last of which is a blank line separating it from the next subtitle:
1. A numeric counter identifying each sequential subtitle
2. The time that the subtitle should appear on the screen, followed by --> and the time it should disappear
3. The subtitle text itself on one or more lines
4. A blank line containing no text, indicating the end of this subtitle
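The four components above can be assembled from numeric times with a few lines of base R. This is a sketch under assumptions, not the package's implementation: the helpers `format_stamp_sketch` and `format_srt_sketch` are hypothetical names, and the sketch assumes non-negative times expressed in seconds.

```r
# Hypothetical helper: format numeric seconds as a SubRip "HH:MM:SS,mmm" stamp
format_stamp_sketch <- function(seconds) {
  ms <- round(seconds * 1000)  # work in whole milliseconds
  sprintf(
    "%02d:%02d:%02d,%03d",
    as.integer(ms %/% 3600000),        # hours
    as.integer((ms %/% 60000) %% 60),  # minutes
    as.integer((ms %/% 1000) %% 60),   # seconds
    as.integer(ms %% 1000)             # milliseconds
  )
}

# Hypothetical helper: build the counter, time line, and text of each subtitle.
# The trailing "\n" plus the separator used when printing yields the blank line.
format_srt_sketch <- function(n, start, end, subtitle) {
  paste0(
    n, "\n",
    format_stamp_sketch(start), " --> ", format_stamp_sketch(end), "\n",
    subtitle, "\n"
  )
}

blocks <- format_srt_sketch(
  n = 1:2,
  start = c(1, 4.25),
  end = c(3.5, 3661.007),
  subtitle = c("Hello there.", "Second subtitle.")
)
cat(blocks, sep = "\n")
```

Rounding to whole milliseconds before splitting into fields avoids floating-point drift when the times have been shifted by a fractional amount.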
The path to the written file, invisibly.
# read and write without line breaks
x <- read_srt(srt_example(), collapse = " ")
write_srt(x, tempfile(fileext = ".srt"), wrap = FALSE)