SRT subtitle splitter in Python
by 半瓶墨水
at 2011-02-20 16:46:48
original http://www.2maomao.com/blog/srt-subtitle-splitter-in-python/
VeryCD finally couldn't hold out
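This weekend I sorted through the 100-plus GB of movies I had pulled down from VeryCD and realized I still hadn't watched The Godfather 1/2/3. When I sat down to watch them, I found they had no subtitles. 射手影音 (the Shooter player) could not fetch any either, so I went looking on 射手网 (the Shooter subtitle site), and every subtitle I found there was a single file covering the whole film. So I Googled for a subtitle splitting tool, lamenting along the way how far download sites have fallen: even the foreign ones have started using the same deceptive click-bait tricks that are common in China.
I searched for a long time without finding anything usable; the few programs I did try fell well short of what I needed.
So I studied the SRT format and wrote a Python script myself. Surprisingly it took a whole hour, but it works in the end. The code follows: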
# -*- coding: utf-8 -*-
# Author: 半瓶墨水  # 2011-02-20, 13:59:14
# Email : realfun AT gmail DOT com
# Usage : split srt files
# Examples:
#   Split "教父.srt" (The Godfather) into four parts, with the start time
#   offset by 29 seconds (every timestamp is moved 29 seconds earlier).
#   The first part is 00:42:55 long, followed by the second (00:46:24)
#   and the third (00:42:39); whatever remains becomes the fourth part.
#     srt.py 教父.srt offset=00:00:29 00:42:55 00:46:24 00:42:39
#   *NOTICE* Serial numbers are not handled
#
# I wrote this mainly because I could not find matching subtitles for the
# copy of The Godfather I downloaded from VeryCD; VeryCD is gone
import sys


def srttime2int(st):
    """Convert an SRT timestamp (HH:MM:SS[,mmm]) to milliseconds.

    >>> srttime2int("-00:01:19")
    -79000
    >>> srttime2int("00:01:19")
    79000
    >>> srttime2int("10:21:29")
    37289000
    >>> srttime2int("00:01:19,601")
    79601
    >>> srttime2int("10:21:29,601")
    37289601
    """
    negative = st[0] == '-'
    if negative:
        st = st[1:]
    # The millisecond part is optional and defaults to 0
    value = ((int(st[0:2]) * 60 + int(st[3:5])) * 60 + int(st[6:8])) * 1000 + int(st[9:12] or 0)
    return -value if negative else value


def int2srttime(i):
    """Convert milliseconds back to an SRT timestamp.

    >>> int2srttime(-79601)
    '-00:01:19,601'
    >>> int2srttime(79601)
    '00:01:19,601'
    >>> int2srttime(37289601)
    '10:21:29,601'
    """
    negative = i < 0
    if negative:
        i = -i
    # Floor division keeps every field an integer
    srttime = "%02d:%02d:%02d,%03d" % (i // 3600000, i % 3600000 // 60000, i % 60000 // 1000, i % 1000)
    return '-' + srttime if negative else srttime


def srt_split():
    if len(sys.argv) < 3:
        print("\n\n\tplease read source code for usage\n\n")
        sys.exit(0)
    filename = sys.argv[1]
    splits = sys.argv[2:]
    offset = 0
    if splits[0].startswith("offset="):
        offset = srttime2int(splits[0][len("offset="):])
        splits = splits[1:]
    splits.append("24:00:00")  # assume no more than 24 hours
    splits = [srttime2int(s) for s in splits]
    # Work on raw bytes so the subtitle text keeps its original encoding
    with open(filename, "rb") as src:
        lines = src.read().splitlines()
    f = open("001.srt", "wb")
    newlines = []
    count = 0
    limit = splits[0] + offset
    for line in lines:
        # Timing line, e.g. 00:01:16,564 --> 00:01:18,532
        if len(line) >= 29 and line[13:16] == b"-->":
            start = srttime2int(line[:12].decode("ascii"))
            end = srttime2int(line[17:29].decode("ascii"))
            if start > limit:
                # This cue belongs to the next part: flush the current file
                f.writelines(newlines)
                f.close()
                newlines = []
                offset += splits[count]
                count += 1
                limit += splits[count]
                f = open("%03d.srt" % (count + 1), "wb")
            # Rewrite the timestamps relative to the start of the current part
            line = (int2srttime(start - offset) + " --> " + int2srttime(end - offset)).encode("ascii")
        newlines.append(line + b"\r\n")
    f.writelines(newlines)
    f.close()


if __name__ == "__main__":
    import doctest
    doctest.testmod()
    srt_split()
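The header of the script notes that serial numbers are not handled, so each output file after the first starts at whatever number the original file had reached. A minimal sketch of one way to restart the numbering at 1 might look like the following. The names `renumber` and `TIMING` are hypothetical and not part of the script above, and the sketch assumes the lines of one output file have already been decoded to text.

```python
import re

# Matches an SRT timing line such as "00:01:16,564 --> 00:01:18,532"
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def renumber(lines):
    """Return a copy of `lines` with cue serial numbers restarted at 1.

    Hypothetical helper: a line counts as a serial number only if it is
    all digits AND the line right after it is a timing line, so a subtitle
    whose text happens to be a number is left untouched.
    """
    out = []
    serial = 0
    for i, line in enumerate(lines):
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if line.strip().isdigit() and TIMING.match(following):
            serial += 1
            out.append(str(serial))
        else:
            out.append(line)
    return out
```

Such a helper could be run over each part before it is written out. Checking the following line, rather than relying on blank-line separators alone, is a deliberate choice here because subtitle files found online are not always cleanly separated.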