Sunday, May 18, 2008

Snake in the torrents ...

I decided to branch out last week and use Python seriously for the first time.

Turns out it is as good as all the praise it gets. This conversion happened round about the same time I switched to Transmission for my torrenting needs, as the JRE + Azureus don't do my memory usage any real favours. The main thing that I loved about Azureus was that I could have it subscribe to an RSS feed and download "linux ISOs" automatically. Coupled with the fact that everyone's favourite release group, EZTV, provides its releases in RSS form, it was perfect. However, the plugin stopped working correctly about a month ago and I've been forced to actually operate my torrent client, gasp!

Anyways, while having a shower (yes, I come up with concepts in the shower), I thought maybe I could replicate that functionality using Python! Transmission has a feature that makes it scan a folder for torrents, so in theory I would simply need to do the following ...
  • Read a list of files/shows
  • Extract the respective links from an RSS/XML feed
  • Calculate which ones are the most recent
  • Download the torrent files
Below is the entire script. It stores all torrent information in an SQLite database, meaning you can extend it any way you want. The script is released under GPL 2.0. I haven't done any real testing, but it works fine under the following conditions:
  • Python 2.5.1
  • OS X 10.5
  • Atom RSS source feed (Mininova)
My plan is to schedule it as a cron job and sit back and watch.
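For the cron part, an entry along these lines would do it (the interpreter path, script path, and half-hour interval are just placeholders; adjust to taste):

```shell
# Run the feed scraper every 30 minutes (paths are hypothetical)
*/30 * * * * /usr/bin/python /Users/you/scripts/channel.py
```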


#Channel 0.1
# Jonathan Dalrymple
# May 17th, 2008

from xml.etree import ElementTree as ET
import os
import shutil
import sqlite3
import urllib

currentDirectory = os.path.dirname( os.path.abspath( __file__ ))

#Create SQL
dbConn = sqlite3.connect( os.path.join( currentDirectory,"torrentsDB") )

#Create the new table
sql = "DROP TABLE IF EXISTS torrents"

dbConn.execute( sql )

sql = """CREATE TABLE torrents (
    id INTEGER PRIMARY KEY,
    title TEXT,
    date TEXT,
    url TEXT
)"""

dbConn.execute( sql )

#Get XML file (paste your feed URL between the quotes)
response = urllib.urlretrieve( "" )

shutil.copyfile(response[0], os.path.join( currentDirectory,"rssSource.xml") )

xmlFile = os.path.join( currentDirectory, "rssSource.xml" )


try:
    tree = ET.parse( xmlFile )

    selection = tree.getiterator('item')

    i = 0
    #For each item tag
    for element in selection:
        #Get the required elements from the selection
        title = element.findtext('title')
        date = element.findtext('pubDate')
        enclosure = element.find('enclosure').attrib['url']

        #Parameterised query avoids quoting problems in titles
        sql = "INSERT INTO torrents (title, date, url) VALUES (?, ?, ?)"

        dbConn.execute( sql, (title, date, enclosure) )

        i += 1

    #Commit records
    dbConn.commit()

    print '%d Torrents have been processed and added to the database' % (i)

except Exception, inst:
    print 'Parse Error: %s' % (inst)

#Read the config file for the shows
configFile = open( os.path.join( currentDirectory, "shows.txt" ) )
shows = configFile.readlines()

#Get the url for the show
print "The following torrents where found ..."

for show in shows:
    try:
        dataSet = dbConn.cursor()

        dataSet.execute( "SELECT title, url FROM torrents WHERE title LIKE ? ORDER BY id DESC LIMIT 1", ('%' + show.rstrip() + '%',) )

        row = dataSet.fetchone()

        #Test to ensure that a record exists
        if row is not None:
            print row[0]

            #download the torrent file
            response = urllib.urlretrieve( row[1] )

            newFilename = os.path.join( currentDirectory, show.rstrip() + ".torrent" )
            #Move from the temp folder
            shutil.copyfile( response[0], newFilename )

    except Exception, inst:
        print 'Download Error: %s' % (inst)

print 'Complete'
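Since everything lands in an SQLite table, pulling the most recent match back out takes only a couple of lines. A minimal sketch of the lookup the script performs (the table contents and show name here are made-up sample data):

```python
import sqlite3

# Same schema the script creates (sample rows are hypothetical)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE torrents (id INTEGER PRIMARY KEY, title TEXT, date TEXT, url TEXT)"
)
rows = [
    ("American Dad S03E16", "Sun, 18 May 2008", "http://example.com/ad.torrent"),
    ("Battlestar Galactica S04E07", "Sat, 17 May 2008", "http://example.com/bsg.torrent"),
]
conn.executemany("INSERT INTO torrents (title, date, url) VALUES (?, ?, ?)", rows)

# LIKE is case-insensitive for ASCII in SQLite, so 'battlestar' still matches;
# ORDER BY id DESC picks the row inserted most recently
cur = conn.execute(
    "SELECT title, url FROM torrents WHERE title LIKE ? ORDER BY id DESC LIMIT 1",
    ("%" + "battlestar" + "%",),
)
row = cur.fetchone()
print(row[0])
```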

Lastly, to create the config file, just open your favourite text editor and list your TV shows, one per line. Mine looks like this:

Battlestar Galactica
American Dad

It's not case sensitive, so don't panic.
