SRT subtitle splitter in Python

2011-02-21 00:46

by 半瓶墨水

at 2011-02-20 16:46:48

original http://www.2maomao.com/blog/srt-subtitle-splitter-in-python/

VeryCD finally could not hold out any longer

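This weekend I sorted through the hundred-plus GB of movies I had pulled down from VeryCD and realized I still had not watched The Godfather 1/2/3. When I sat down to watch them I found there were no subtitles, and 射手影音 (SPlayer) could not download any either, so I had to search 射手网 (Shooter) by hand. The few files I found were all single subtitle files covering the whole film, while my copy was split into several parts. So I Googled for a subtitle splitting tool, lamenting along the way how far download sites have fallen: even foreign ones have started using the click-bait tricks that are common in China.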

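After a long search I had not found a single usable tool. There were a few programs, but none came close to doing what I needed.
I looked into the SRT format and wrote a Python script myself. It took a whole hour, surprisingly, but it got the job done. The code is below: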

Python: SRT subtitle splitter
#! /usr/bin/env python
# -*- coding: utf-8 -*-

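#Author: 半瓶墨水 # 2011-02-20, 13:59:14
#Email : realfun AT gmail DOT com
#Usage : split srt files
#Examples:
#   Split "教父.srt" (The Godfather) into four parts, shifting timestamps 29 seconds earlier.
#    The three durations are the lengths of parts one to three, in order;
#    whatever remains becomes part four.
#      srt.py 教父.srt offset=00:00:29 00:42:55 00:46:24 00:42:39
# *NOTICE* Cue serial numbers are not handled (they are copied through unchanged)
#
#I wrote this mainly because I could not find matching subtitles for the copy of The Godfather I downloaded from VeryCD; VeryCD is Gone :(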

import sys

def srttime2int(st):
  """
  Convert an SRT timestamp ("HH:MM:SS" or "HH:MM:SS,mmm", optionally
  prefixed with "-") to milliseconds.

  >>> srttime2int("-00:01:19")
  -79000
  >>> srttime2int("00:01:19")
  79000
  >>> srttime2int("10:21:29")
  37289000
  >>> srttime2int("00:01:19,601")
  79601
  >>> srttime2int("10:21:29,601")
  37289601
  """
  negative = st.startswith('-')
  if negative:
    st = st[1:]
  # hours, minutes and seconds sit at fixed columns; milliseconds are optional
  value = ((int(st[0:2])*60+int(st[3:5]))*60+(int(st[6:8])))*1000+int(st[9:12] or 0)
  return -value if negative else value

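def int2srttime(i):
  """
  Convert milliseconds to an SRT timestamp "HH:MM:SS,mmm";
  negative values get a leading "-".

  >>> int2srttime(-79601)
  '-00:01:19,601'
  >>> int2srttime(79601)
  '00:01:19,601'
  >>> int2srttime(37289601)
  '10:21:29,601'
  """
  negative = i < 0
  if negative:
    i = -i
  # floor division keeps every field an integer on both Python 2 and Python 3
  srttime = "%02d:%02d:%02d,%03d" % (i//3600000, i%3600000//60000, i%60000//1000, i%1000)
  return '-' + srttime if negative else srttime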

def srt_split():
  if len(sys.argv) < 3:
    print("\n\n\tplease read source code for usage\n\n")
    sys.exit(0)

  filename = sys.argv[1]
  splits = sys.argv[2:]
  offset = 0
  if splits[0].startswith("offset="):
    offset = srttime2int(splits[0][len("offset="):])
    splits = splits[1:]
  splits.append("24:00:00") #assume no more than 24 hours :)
  splits = [srttime2int(s) for s in splits]

  # work on raw bytes so the encoding of the subtitle text does not matter
  with open(filename, "rb") as src:
    lines = src.read().splitlines()
  f = open("001.srt", "wb")
  newlines = []
  count = 0
  limit = splits[0] + offset
  for line in lines:
    # timing line, e.g. 00:01:16,564 --> 00:01:18,532
    if len(line)>=29 and line[13:16] == b"-->":
      start = srttime2int(line[:12].decode("ascii"))
      end = srttime2int(line[17:29].decode("ascii"))
      if start > limit:
        # this cue belongs to the next part: flush the current file, open the next one
        f.writelines(newlines)
        f.close()
        newlines = []
        offset += splits[count]
        count += 1
        limit += splits[count]
        f = open("%03d.srt" % (count+1), "wb")
      # rebase the cue so that each part starts counting from zero
      line = (int2srttime(start - offset) + " --> " + int2srttime(end - offset)).encode("ascii")
    newlines.append(line + b"\r\n")
  f.writelines(newlines)
  f.close()

if __name__ == "__main__":
    import doctest
    doctest.testmod()
    srt_split()
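
To make the meaning of the command-line arguments more concrete, here is a small sketch that works out which stretch of the original timeline ends up in each output file for the example command in the header comment. The helpers `hms_to_ms` and `segment_bounds` are hypothetical names used only for illustration; they are not part of the script. The sketch assumes a non-negative offset and plain "HH:MM:SS" arguments, and it only mirrors how the loop above accumulates `offset` and `limit`.

```python
def hms_to_ms(text):
    """Convert "HH:MM:SS" to milliseconds (no sign, no millisecond part)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000


def segment_bounds(durations, offset="00:00:00"):
    """Return (file name, start_ms, end_ms) in the original timeline.

    One entry per duration, plus a final open-ended entry (end is None)
    for whatever remains after the last cut.
    """
    start = hms_to_ms(offset)
    bounds = []
    for index, duration in enumerate(durations, 1):
        end = start + hms_to_ms(duration)
        bounds.append(("%03d.srt" % index, start, end))
        start = end  # the next part begins where this one stops
    bounds.append(("%03d.srt" % (len(durations) + 1), start, None))
    return bounds


# Arguments taken from the example command in the header comment.
example = segment_bounds(["00:42:55", "00:46:24", "00:42:39"], "00:00:29")
```

With these arguments, 001.srt covers 00:00:29 to 00:43:24 of the original file, 002.srt runs up to 01:29:48, 003.srt up to 02:12:27, and 004.srt takes the rest. Within each file the script subtracts that part's start time from every cue, so each part counts from roughly zero again.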
