Kodi Community Forum

Full Version: Python Scripts for rTorrent RSS Download and Automatic sorting of TV Shows
I've made these scripts, and thought I'd share them.
This is the first code I've ever written in Python and it's taken probably 4-5 hours total to make these. I'm a C# coder by trade, so forgive the somewhat C#-ish way of naming things.

They're not pretty, but they work. Someday I might clean them, but until then, maybe they'll be of help to someone.

Also, I didn't bother learning about logging in Python, so I just print stuff and >> it to some logfile when I execute the script.

For starters, what do these scripts actually do?
ezrss.py:
Downloads torrents off ezrss.it for shows you've "subscribed" to. Subscription in this case, is defined by the shows you've got a directory for. It can EASILY be modified to use a show-per-line file if you'd rather have that.

adoptFile.py:
Called by rtorrent on hash_done (more on that later), whenever a download is completed.
It'll walk through all your "TV Show Directories" and consider all the directories as "adoption candidates".
It should be safe against cases like the following.

Consider this filename:
Super.Heroes.S01E03.somecrap.avi

Consider these folders:
/TV Shows/Heroes
/TV Shows/Super Heroes

Every word in the directory name must be present in the filename, or it won't be considered a candidate.
So "Heroes" would also be considered a candidate for the file above, but since "Super Heroes" has two matches, that folder wins. The winning folder will always be the one with the highest "score".
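The scoring rule described above can be sketched roughly like this. It is a minimal illustration rather than the script itself: it is written for Python 3 (the scripts in this thread are Python 2), the function names `score_directory` and `pick_winner` are made up for the example, and the `required > 0` guard and the tie-break by name are assumptions that the original does not have.

```python
def score_directory(dir_name, filename):
    """Return (score, required) for one show directory against a filename."""
    # Words that are just separators are skipped, as in the original script.
    words = [w for w in dir_name.split() if w not in ("-", "_")]
    score = sum(1 for w in words if w in filename)
    return score, len(words)


def pick_winner(dir_names, filename):
    """Return the best-matching directory name, or None if nothing qualifies."""
    candidates = []
    for name in dir_names:
        score, required = score_directory(name, filename)
        # Every word must be present; the "required > 0" guard is an assumption
        # added here so a directory made only of separators never qualifies.
        if required > 0 and score == required:
            candidates.append((score, name))
    if not candidates:
        return None
    # Highest score wins; ties fall back to name order (also an assumption).
    return max(candidates)[1]


winner = pick_winner(["Heroes", "Super Heroes"], "Super.Heroes.S01E03.somecrap.avi")
print(winner)
```

Both folders qualify here, but "Super Heroes" matches two words against one for "Heroes", so it is the one picked.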

Now, the code...

First, the script that downloads stuff from EZRSS to rtorrents watch folder:
Code:
import urllib, os, feedparser

def getDownloadHistory(historyFile):
        dh = []
        if not os.access(historyFile, os.R_OK):
                return dh

        f = open(historyFile, 'r')
        for line in f:
                dh.append(line.replace('\n',''))

        f.close()
        return dh

def markAsDownloaded(historyFile, url):
        f = open(historyFile, 'a')
        f.write(url)
        f.write('\n')
        f.close()

def isShowSubscribed(showname, subscribedShows):
        for s in subscribedShows:
                parts = s.split()
                reqScore = 0
                score = 0
                for part in parts:
                        if showname.lower().find(part.lower()) >= 0:
                                score = score + 1

                        reqScore = reqScore + 1

                isSubscribed = reqScore == score
                #print '%s scored %i out of %i' % (showname, score, reqScore)
                if isSubscribed:
                        return True
        return False

def downloadFile(localDir, remoteFile):
        filename = remoteFile.split('/')[-1]
        webFile = urllib.urlopen(remoteFile)
        localFile = open(localDir + filename, 'wb')  # binary mode: .torrent files are not text
        localFile.write(webFile.read())
        webFile.close()
        localFile.close()

def getShowList(showfolderFile):
        showfolders = []
        f = open(showfolderFile, 'r')
        for line in f:
                sh = line.replace('\n','')
                if len(sh) > 0:
                        showfolders.append(sh)
        f.close()

        shows = []
        for showfolder in showfolders:
                dirs = filter(lambda x : os.path.isdir(os.path.join(showfolder, x)), os.listdir(showfolder))
                for dir in dirs:
                        show = dir
                        xkeys = getFolderKeywords(showfolder + '/' + dir)
                        if(len(xkeys) > 0):
                                show = show + ' ' + xkeys

                        shows.append(show)

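        return shows

# Alternative: read one show name per line from a file instead of scanning folders.
# In the original post this code sat after the return above (so it never ran) and
# used an undefined name; the function name here is only a suggestion.
def getShowListFromFile(showFile):
        f = open(showFile, 'r')
        shows = []
        for show in f:
                shows.append(show.replace('\n',''))

        f.close()
        return shows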

def getFolderKeywords(dir):
        xkeys = ''
        xfile = dir + '/keywords.list'

        if not os.access(xfile, os.R_OK):
                return xkeys

        f = open(xfile, 'r')
        for key in f:
                if len(xkeys) > 0:
                        xkeys = xkeys + ' '
                xkeys = xkeys + key.replace('\n', '')

        f.close()
        return xkeys

downloadTo = '/home/jonas/torrents/watch/'
historyFile = '/home/jonas/torrentscripts/dltorrents.list'
showfolderFile = '/home/jonas/torrentscripts/showfolders.list'
hist = getDownloadHistory(historyFile)
shows = getShowList(showfolderFile)

d = feedparser.parse("http://www.ezrss.it/feed/")
s = filter(lambda x: isShowSubscribed(x.title, shows), d.entries)
s = filter(lambda x: hist.count(x.link) == 0, s)

print 'Checking for new shows'

for show in s:
        print '%s is new (%s)' % (show.title, show.link)
        downloadFile(downloadTo, show.link)
        markAsDownloaded(historyFile, show.link)

print 'Done'

About this script:
1) The showfolders.list file contains a list of where my TV Shows are stored. I've got two harddrives where I keep those:
/media/data1/TV Shows
/media/data2/TV Shows

Each of these folders contains folders with this typical pattern:
/ShowName/Season [n]

This script assumes you want to subscribe to anything you've got a show folder for.
If there is a /ShowName/keywords.list, the script will also include these keywords when searching in the RSS feed. E.g. you'd want to include a 720p line in keywords.list for shows you prefer in HD quality.
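To make the folder layout and keywords.list behaviour above a bit more concrete, here is a small sketch. It is written for Python 3 (the script above is Python 2), it builds a throwaway folder tree in a temp directory, and the names `folder_keywords`, `subscription_terms` and `is_subscribed` are hypothetical helpers for this example, not functions from the script.

```python
import os
import tempfile


def folder_keywords(show_dir):
    """Return the extra words listed in <show_dir>/keywords.list, if that file exists."""
    path = os.path.join(show_dir, "keywords.list")
    if not os.path.isfile(path):
        return []
    words = []
    with open(path) as f:
        for line in f:
            words.extend(line.split())
    return words


def subscription_terms(show_root):
    """Map each show folder under show_root to the words a feed title must contain."""
    terms = {}
    for name in sorted(os.listdir(show_root)):
        show_dir = os.path.join(show_root, name)
        if os.path.isdir(show_dir):
            terms[name] = name.split() + folder_keywords(show_dir)
    return terms


def is_subscribed(title, terms):
    """True if every word of at least one show appears in the title (case ignored)."""
    title = title.lower()
    return any(all(w.lower() in title for w in words) for words in terms.values())


# Throwaway demo layout: two show folders, one of them with a keywords.list.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "House", "Season 6"))
os.makedirs(os.path.join(root, "Fringe"))
with open(os.path.join(root, "House", "keywords.list"), "w") as f:
    f.write("720p\n")

terms = subscription_terms(root)
print(terms)
print(is_subscribed("House S06E10 720p HDTV x264", terms))
print(is_subscribed("House S06E10 HDTV XviD", terms))
```

With the 720p keyword in place, only the 720p release of House counts as subscribed; the plain HDTV release is skipped.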

Secondly, the script that sorts anything rtorrent completes:
Code:
import xmlrpclib, os, sys, re, shutil

def adoptionCandidates(basedir, file):
        dirs = filter(lambda x : os.path.isdir(os.path.join(basedir, x)), os.listdir(basedir))
        if os.path.isdir(file):
                print '%s is a directory. Aborting' % file
                return []

        (filepath, filename) = os.path.split(file)

        ignoredPhrases = ['-','_']

        candidates = []
        for dir in dirs:
                dirParts = dir.split()
                score = 0
                requiredScore = 0

                for part in dirParts:
                        if ignoredPhrases.count(part) > 0:
                                continue

                        requiredScore = requiredScore + 1

                        if filename.find(part) >= 0:
                                score = score + 1

                if score == requiredScore:
                        candidates.append( (os.path.join(basedir, dir), score) )

                #print '%s scored %i (req: %i)' % (dir, score, requiredScore)

        #for (dir, score) in candidates:
        #       print '%s with score %i' % (dir, score)

        return candidates

def getSeasonNumber(filename):
        patterns =      [
                                '.*S(\d+)E(\d+).*',
                                '.*(\d+)x(\d+).*'
                        ]
        for pattern in patterns:
                p = re.compile(pattern, re.I)
                # use the filename that was passed in, not the global orphanFile path
                g = p.findall(filename)
                if len(g) > 0:
                        season = int(g[0][0])
                        return season

        return None


def getRtorrentId(filename):
        downloads = rtorrent.download_list('started')
        for dl in downloads:
                rfile = rtorrent.d.get_base_filename(dl)
                if rfile == filename:
                        return dl

def rtorrentNotifyMove(id, newBasePath):
        rtorrent.d.set_directory(id, newBasePath)
        rtorrent.d.resume(id)

print '--------------- BEGIN ---------------'

#basic input
orphanFile = sys.argv[1]
showLocations = ['/media/data2/TV Shows', '/media/data1/TV Shows']
allowedSourceLocation = '/home/jonas/torrents/completed'

#rtorrent xmlrpc
rtorrent = xmlrpclib.ServerProxy('http://localhost')
(fpath, fname) = os.path.split(orphanFile)
print 'File path: %s' % fpath
print 'File name: %s' % fname

candidates = []

if not orphanFile.startswith(allowedSourceLocation):
        print 'STOP! This file is not located in %s' % allowedSourceLocation
        exit()

if os.path.isdir(orphanFile):
        print 'STOP! Source is a directory and cannot be automatically sorted!'
        exit()

print 'Attempting to find a home for file %s' % orphanFile

for location in showLocations:
        candidates.extend(adoptionCandidates(location, orphanFile))

candidates.sort(lambda (da, sa), (db, sb): sb-sa)

if len(candidates) <= 0:
        print 'No one wanted this file :('
        exit()

for (dir, score) in candidates:
       print 'Candidate: %s with score %i' % (dir, score)

print 'Winner is %s with score %i' % candidates[0]

#Determine Season and Episode number
season = getSeasonNumber(fname)
if not season:
        print 'STOP! Season could not be determined.'
        exit()

print 'Season was determined to be %i' % season
finaldir = os.path.join(candidates[0][0], 'Season %s' % season)

print 'Will move it to %s' % finaldir

#Check if season folder is present
if not os.path.isdir(finaldir):
        print 'Season dir doesn\'t exist. Creating now'
        os.mkdir(finaldir)

rid = getRtorrentId(fname)
print '%s was resolved to rtorrent id: %s' % (fname, rid)

shutil.move(orphanFile, finaldir)
print 'Move successful.  Notifying rtorrent about this'

print 'telling rtorrent to move %s to %s' % (rid, finaldir)
rtorrentNotifyMove(rid, finaldir)
print '----------------- done -----------------'

About this:
This is activated by rtorrent from .rtorrent.rc. These are the lines for it:
Code:
#Sort file
system.method.set_key = event.download.finished,move_complete,"d.set_directory=~/torrents/completed;execute=mv,-u,$d.get_base_path=,~/torrents/completed"
system.method.set_key = event.download.hash_done,sort_finished,"branch=$d.get_complete=,\"execute={~/torrentscripts/postProcessDownload.pl,$d.get_base_path=}\""

(post size limit hit)
(continued from above)
I run this on hash_done because rtorrent obviously isn't happy that I move something it's hashing to another physical harddrive. Sometimes it accepts it, sometimes it crashes. This way rtorrent is ok with it.

This also requires that you've got XMLRPC enabled in rtorrent. I use XMLRPC to set the directory of a download after I've moved it, so it can keep seeding it after it's been moved and scraped by XBMC.
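As a rough sketch of that XMLRPC step: the two helpers below make the same calls the script makes (`download_list`, `d.get_base_filename`, `d.set_directory`, `d.resume`). Those method names are copied from the script in this thread and may well differ between rtorrent versions. The example is written for Python 3, and `FakeRtorrent` / `FakeDownloads` are made-up stand-ins so the logic can be tried without a running rtorrent; against a real setup you would pass an XMLRPC `ServerProxy` instead.

```python
def find_download_id(rtorrent, base_filename):
    """Return the id of the started download whose base filename matches, else None."""
    for download_id in rtorrent.download_list("started"):
        if rtorrent.d.get_base_filename(download_id) == base_filename:
            return download_id
    return None


def notify_move(rtorrent, download_id, new_directory):
    """Point the download at its new directory, then resume it so it can keep seeding."""
    rtorrent.d.set_directory(download_id, new_directory)
    rtorrent.d.resume(download_id)


class FakeDownloads(object):
    """Hypothetical stand-in for the d.* namespace; records calls instead of sending them."""

    def __init__(self, names):
        self.names = names  # download id -> base filename
        self.calls = []

    def get_base_filename(self, download_id):
        return self.names[download_id]

    def set_directory(self, download_id, path):
        self.calls.append(("set_directory", download_id, path))

    def resume(self, download_id):
        self.calls.append(("resume", download_id))


class FakeRtorrent(object):
    """Hypothetical stand-in for the XMLRPC proxy object."""

    def __init__(self, names):
        self.d = FakeDownloads(names)

    def download_list(self, view):
        return sorted(self.d.names)


proxy = FakeRtorrent({"ABC123": "Super.Heroes.S01E03.somecrap.avi"})
rid = find_download_id(proxy, "Super.Heroes.S01E03.somecrap.avi")
if rid is not None:  # only notify when the download was actually found
    notify_move(proxy, rid, "/TV Shows/Super Heroes/Season 1")
print(proxy.d.calls)
```

The `rid is not None` check is an addition: the script as posted calls the notify step even when no matching download was found.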

The script itself looks at the show folders and determines which is the best match. If no decent match turns up, it'll leave the file where it is.

I hope this is helpful to someone looking to auto-sort their downloads.
I realize this "documentation" is terrible, but feel free to ask if you can't make it work. If someone actually shows interest in these scripts I'll probably clean them up and make a more substantial guide for using them.

However with basic knowledge of scripting on linux and rtorrent you can probably make these work as-is.
Wow this is great! I want to try your script but I found out you're using postProcessDownload.pl in .rtorrent.rc. Can you please post the code for this script? Thanks in advance.
I have tried your script. It works well but has a problem with upper and lower case. For example, Lie.to.Me.S02E07.Black.Friday.HDTV.XviD-FQM.\[VTV\].avi
is not matched to the Lie To Me folder. Hope you can make this case insensitive. Thanks in advance.


pvr@pvr-desktop:~$ python ./adoptFile.py /media/data/Downloads/Lie.to.Me.S02E07.Black.Friday.HDTV.XviD-FQM.\[VTV\].avi
--------------- BEGIN ---------------
File path: /media/data/Downloads
File name: Lie.to.Me.S02E07.Black.Friday.HDTV.XviD-FQM.[VTV].avi
Attempting to find a home for file /media/data/Downloads/Lie.to.Me.S02E07.Black.Friday.HDTV.XviD-FQM.[VTV].avi
Greys Anatomy scored 0 (req: 2)
South Park scored 0 (req: 2)
The Vampire Diaries scored 0 (req: 3)
Star Wars The Clone Wars scored 0 (req: 5)
V 2009 scored 1 (req: 2)
Stargate Universe scored 0 (req: 2)
Legend of the Seeker scored 0 (req: 4)
Warehouse 13 scored 0 (req: 2)
Merlin 2008 scored 0 (req: 2)
Heroes scored 0 (req: 1)
Defying Gravity scored 0 (req: 2)
Lie To Me scored 2 (req: 3)
FlashForward scored 0 (req: 1)
Eastwick scored 0 (req: 1)
Smallville scored 0 (req: 1)
Supernatural scored 0 (req: 1)
The Forgotten scored 0 (req: 2)
House scored 0 (req: 1)
Amazing Race scored 0 (req: 2)
Dollhouse scored 0 (req: 1)
Fringe scored 0 (req: 1)
No one wanted this file :(
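
For reference, the "Lie To Me scored 2 (req: 3)" line in that log can be reproduced with a few lines of Python. This is only an illustration of the comparison (written for Python 3), not the script itself: the folder word "To" is compared case-sensitively against a filename that contains "to".

```python
filename = "Lie.to.Me.S02E07.Black.Friday.HDTV.XviD-FQM.[VTV].avi"
words = "Lie To Me".split()

# Case-sensitive substring test, as in the script posted above.
case_sensitive = sum(1 for w in words if w in filename)

# Same test with both sides lowercased first.
case_insensitive = sum(1 for w in words if w.lower() in filename.lower())

print(case_sensitive, case_insensitive)
```

The first count comes out as 2 of 3, so the folder is rejected; with both sides lowercased all 3 words match.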
Hi there !

Great script, exactly what I am looking for, just one question.

Would it be possible to adapt this to work with folders too?
This is something I've been looking at for a while. I like :0.

Changes I've made:
I've modified the script to ignore upper/lower case in the filename. I've posted the script below for reference. I simply forced the directory name parts and the filename to lower case before running string.find(). You can also tell the regex to ignore case.
The Regex can also be improved as '.*(\d+)x(\d+).*' identifies "top_gear.14x04.720p_hdtv_x264-fov.mkv" as:

# Run findall
>>> regex.findall(string)
[(u'4', u'04')]

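This slices off the first digit of the season ('1') because the leading .* is greedy: it swallows as much of the filename as it can and leaves only the '4' for (\d+). Removing the .* at the start of the regex, i.e. '(\d+)x(\d+).*', you get: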

# Run findall
>>> regex.findall(string)
[(u'14', u'04')]

Hopefully that's an improvement that doesn't introduce a different bug.
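
The two regex results above can be reproduced with the snippet below (Python 3 here, so the tuples print without the u prefix). `season_number` is a hypothetical helper shown only to illustrate the un-anchored patterns; it is not the function from the script.

```python
import re

name = "top_gear.14x04.720p_hdtv_x264-fov.mkv"

# Leading .* is greedy: it backtracks only as far as needed, so the first
# group is left with a single digit.
greedy = re.compile(r".*(\d+)x(\d+).*", re.I)
# Without the leading .* the first group starts at the first digit it finds.
plain = re.compile(r"(\d+)x(\d+)", re.I)

print(greedy.findall(name))
print(plain.findall(name))


def season_number(filename):
    """Return the season as an int, or None if no marker is found."""
    for pattern in (r"S(\d+)E(\d+)", r"(\d+)x(\d+)"):
        m = re.search(pattern, filename, re.I)
        if m:
            return int(m.group(1))
    return None


print(season_number(name))
print(season_number("Super.Heroes.S01E03.somecrap.avi"))
```

One thing to keep in mind: a pattern like (\d+)x(\d+) will also match a resolution such as 1920x1080 if that appears in the filename before any season marker, so it is not bulletproof either.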


torrentSorter.py
Code:
#! /usr/bin/python
import xmlrpclib, os, sys, re, shutil

def adoptionCandidates(basedir, file):
        dirs = filter(lambda x : os.path.isdir(os.path.join(basedir, x)), os.listdir(basedir))
        if os.path.isdir(file):
                print '%s is a directory. Aborting' % file
                return []

        (filepath, filename) = os.path.split(file)
        #set filename to lowercase for string comparisons
        filename=filename.lower()

        ignoredPhrases = ['-','_']

        candidates = []
        for dir in dirs:
                dirParts = dir.split()
                score = 0
                requiredScore = 0

                for part in dirParts:
                        if ignoredPhrases.count(part) > 0:
                                continue
                        requiredScore = requiredScore + 1
                        
                        #force lower case for string comparison.
                        part=part.lower()
                        if filename.find(part) >= 0:
                                score = score + 1

                if score == requiredScore:
                        candidates.append( (os.path.join(basedir, dir), score) )

                #print '%s scored %i (req: %i)' % (dir, score, requiredScore)

        #for (dir, score) in candidates:
        #       print '%s with score %i' % (dir, score)

        return candidates

def getSeasonNumber(filename):
        patterns =      [
                                '.*S(\d+)E(\d+).*',
                                '(\d+)x(\d+).*'
                               # commented out regex thought below Season was '4' not 14. Def better way of doing that
                               #top_gear.14x04.720p_hdtv_x264-fov.mkv
                                #'.*(\d+)x(\d+).*'
                        ]
        for pattern in patterns:
                p = re.compile(pattern, re.I)
                # use the filename that was passed in, not the global orphanFile path
                g = p.findall(filename)
                if len(g) > 0:
                        season = int(g[0][0])
                        return season
        return None


def getRtorrentId(filename):
        downloads = rtorrent.download_list('started')
        for dl in downloads:
                rfile = rtorrent.d.get_base_filename(dl)
                if rfile == filename:
                        return dl

def rtorrentNotifyMove(id, newBasePath):
        rtorrent.d.set_directory(id, newBasePath)
        rtorrent.d.resume(id)

print '--------------- BEGIN ---------------'

#basic input
orphanFile = sys.argv[1]
showLocations = ['/mnt/TV']
allowedSourceLocation = '/Torrents/Complete'

#rtorrent xmlrpc
rtorrent = xmlrpclib.ServerProxy('http://localhost')
(fpath, fname) = os.path.split(orphanFile)
print 'File path: %s' % fpath
print 'File name: %s' % fname

candidates = []

if not orphanFile.startswith(allowedSourceLocation):
        print 'STOP! This file is not located in %s' % allowedSourceLocation
        exit()

if os.path.isdir(orphanFile):
        print 'STOP! Source is a directory and cannot be automatically sorted!'
        exit()

print 'Attempting to find a home for file %s' % orphanFile

for location in showLocations:
        candidates.extend(adoptionCandidates(location, orphanFile))

candidates.sort(lambda (da, sa), (db, sb): sb-sa)

if len(candidates) <= 0:
        print 'No one wanted this file :('
        exit()

for (dir, score) in candidates:
       print 'Candidate: %s with score %i' % (dir, score)

print 'Winner is %s with score %i' % candidates[0]

#Determine Season and Episode number
season = getSeasonNumber(fname)
if not season:
        print 'STOP! Season could not be determined.'
        exit()

print 'Season was determined to be %i' % season
finaldir = os.path.join(candidates[0][0], 'Season %s' % season)

print 'Will move it to %s' % finaldir

#Check if season folder is present
if not os.path.isdir(finaldir):
        print 'Season dir doesn\'t exist. Creating now'
        os.mkdir(finaldir)

rid = getRtorrentId(fname)
print '%s was resolved to rtorrent id: %s' % (fname, rid)

shutil.move(orphanFile, finaldir)
print 'Move successful.  Notifying rtorrent about this'

print 'telling rtorrent to move %s to %s' % (rid, finaldir)
rtorrentNotifyMove(rid, finaldir)
print '----------------- done -----------------'

Edit: Found/Fixed. Works fine now.

Useful bits out of .rtorrent.rc
Code:
# This is a resource file for rTorrent.

# Default directory to save the downloaded torrents.
directory = /Torrents/Downloading

# Default session directory. Make sure you don't run multiple instance
# of rtorrent using the same session directory. Perhaps using a
# relative path?
session= /Torrents/Downloading/rtorrent.session

# Watch a directory for new torrents, and stop those that have been
# deleted.
schedule = watch_directory,5,5,load_start=/Torrents/TorrentFiles/Auto/*.torrent
#schedule = untied_directory,5,5,stop_untied=

scgi_port = 127.0.0.1:5000

#Sort file
system.method.set_key = event.download.finished,move_complete,"d.set_directory=/Torrents/Complete/;execute=mv,-u,$d.get_base_path=,/Torrents/Complete/"
system.method.set_key = event.download.hash_done,sort_finished,"branch=$d.get_complete=,\"execute={~/scripts/torrentSorter.py,$d.get_base_path=}\""

I'll keep playing around with it and post the completed script(s) when I've finished.
If any of you guys are up for it it sounds like rTorrent would integrate perfectly with my TV searcher/sorter, Sick Beard. I imagine it would save you a fair amount of the trouble of building your own sorting/renaming/etc scripts from scratch as long as you could figure out how to hook it up. I'd be more than willing to help from the Sick Beard side if somebody feels like taking it on :0) If rTorrent can download each episode into its own dir and then call a python script with that dirname as an argument then it would be extremely trivial!
Great script jonasw! I only ended up needing torrentSorter.py as I am already using flexget for grabbing torrents. I am not a Python coder either; however, for my setup I needed a couple of modifications. I still need to tweak things and finish testing.

The following snippet of code added to the top of adoptionCandidates gave me more correct finds.

        patterns = [
                'S(\d+)E(\d+)',
                '(\d+)x(\d+)'
        ]
        for pattern in patterns:
                p = re.compile(pattern, re.I)
                m = p.split(file)
                try:
                        if len(m[1]) > 0:
                                trimfile=m[0]
                except IndexError:
                        pass
        try:
                print "testing filename used: %s" %trimfile
        except NameError:
                print "planned exit no season match found"
                nothing=[]
                return nothing

Also, a few lines down in the same function, I just modified everything to use upper case, as I had a lot of misses and did not want to restructure my whole library.

filename=filename.upper()

part=part.upper()
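
The idea behind the trimming snippet above, as far as I can tell, is to match folder names only against the part of the filename that comes before the season marker, so words in the episode title can't produce false hits. A minimal sketch of that idea (Python 3; `show_name_part` is a hypothetical name, and unlike the snippet above it stops at the first pattern that matches):

```python
import re

# Same two season markers as in the snippet above.
SEASON_MARKERS = [
    re.compile(r"S(\d+)E(\d+)", re.I),
    re.compile(r"(\d+)x(\d+)", re.I),
]


def show_name_part(filename):
    """Return the text before the first season marker, or None if there is none."""
    for marker in SEASON_MARKERS:
        m = marker.search(filename)
        if m:
            return filename[:m.start()]
    return None


print(show_name_part("Lie.to.Me.S02E07.Black.Friday.HDTV.XviD-FQM.[VTV].avi"))
print(show_name_part("top_gear.14x04.720p_hdtv_x264-fov.mkv"))
print(show_name_part("movie.avi"))
```

For the first file only "Lie.to.Me." is left for matching, so "Black" and "Friday" from the episode title no longer count towards any folder's score.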
Hi, is it possible to change it so it uses the Transmission BitTorrent client instead of rtorrent?
I don't know if anyone else is using this but I found this to be pretty handy. It wasn't exactly perfect for my uses so I made a few minor changes. The big thing is that my version below handles directories. I've also removed the rtorrent control since that wasn't really applicable to my setup but adding it back in would be pretty simple if someone was so inclined.
Code:
import os, sys, re, shutil, time

showLocations = ['/mnt/fs2/TV', '/mnt/fs3/TV', '/mnt/fs3/TV']

videoTypes = ['.avi', '.mkv', '.wmv', '.mp4']

def adoptionCandidates(basedir, file):
    dirs = filter(lambda x : os.path.isdir(os.path.join(basedir, x)), os.listdir(basedir))
    if os.path.isdir(file):
        print '%s is a directory. Aborting' % file
        return []

    (filepath, filename) = os.path.split(file)

    ignoredPhrases = ['-','_']

    candidates = []
    for dir in dirs:
        dirParts = dir.split()
        score = 0
        requiredScore = 0

        for part in dirParts:
            if ignoredPhrases.count(part) > 0:
                continue

            requiredScore = requiredScore + 1

            if filename.find(part) >= 0:
                score = score + 1

        if score == requiredScore:
            candidates.append( (os.path.join(basedir, dir), score) )

        #print '%s scored %i (req: %i)' % (dir, score, requiredScore)

    #for (dir, score) in candidates:
    #       print '%s with score %i' % (dir, score)

    return candidates

def getSeasonNumber(filename):
    patterns =      [
                '.*S(\d+)E(\d+).*',
                '(\d+)x(\d+).*'  # no leading .*, otherwise 14x04 is read as season 4 (see earlier post)
            ]
    for pattern in patterns:
        p = re.compile(pattern, re.I)
        g = p.findall(filename)
        if len(g) > 0:
            season = int(g[0][0])
            return season

    return None

def handleDirectory(directory):
    for f in os.listdir(directory):
        f = os.path.join(directory, f)
        if os.path.isdir(f):
            handleDirectory(f)
        else:
            (root, ext) = os.path.splitext(f)
            if ext in videoTypes:
                handleFile(f)
            
def handleFile(orphanFile):
    (fpath, fname) = os.path.split(orphanFile)

    candidates = []
    print orphanFile

    if os.path.isdir(orphanFile):
        print 'STOP! Source is a directory and cannot be automatically sorted!'
        return

    print 'Attempting to find a home for file %s' % orphanFile
    
    for location in showLocations:
        candidates.extend(adoptionCandidates(location, orphanFile))
    
    candidates.sort(lambda (da, sa), (db, sb): sb-sa)

    if len(candidates) <= 0:
        print 'No one wanted this file :('
        return  # skip this file but keep going through the rest of the directory

    for (dir, score) in candidates:
           print 'Candidate: %s with score %i' % (dir, score)
    
    print 'Winner is %s with score %i' % candidates[0]
    
    #Determine Season and Episode number
    season = getSeasonNumber(fname)
    if not season:
        print 'STOP! Season could not be determined.'
        return  # skip this file but keep going through the rest of the directory
    
    print 'Season was determined to be %i' % season
    finaldir = os.path.join(candidates[0][0], 'Season %s' % str(season).rjust(2, '0'))
    
    print 'Will move it to %s' % finaldir
    
    #Check if season folder is present
    if not os.path.isdir(finaldir):
        print 'Season dir doesn\'t exist. Creating now'
        os.mkdir(finaldir)
    
    if not os.path.isfile(os.path.join(finaldir, fname)):
        shutil.copy(orphanFile, finaldir)
        print 'Copy successful. '

if __name__ == '__main__':
    f = sys.argv[1]
    if os.path.isdir(f):
        handleDirectory(f)
    else:
        handleFile(f)
Found this thread via a Google search. Glad to see others already thought of what I'm trying to do. :)

I'm using Horn's modified version on my ReadyNAS (installed with rssdler and Transmission).
Thanks a lot for this little script.
I modified it to suit my needs and wanted to post this here. Maybe it helps someone ;)

https://github.com/xkonni/scripts/blob/m...entSort.py
xkonni wrote: Thanks a lot for this little script.
I modified it to suit my needs and wanted to post this here. Maybe it helps someone ;)

https://github.com/xkonni/scripts/blob/m...entSort.py

Since you seem to be active in this I'll throw this question at you in hopes of getting this finally working!

When the script runs I immediately get an error

"* Inactive: Cannot change the directory of an open download after the files have been moved."

and the torrent is broken (will no longer start).

The move is happening just fine, it's just not notifying rtorrent about it (the error above). What version are you using? Is there something in your .rtorrent.rc I am missing? Any help would be HUGELY appreciated!
Oh excuse me for not checking back here for such a long time.
I recently updated rtorrent and then ran into the issue you described.

Updated my script on github to work again.

https://github.com/xkonni/scripts/commit...entSort.py
xkonni,

Many thanks for maintaining your script.
I've updated my server from Debian Squeeze to Wheezy a few days ago, so rtorrent got updated from 0.8.9bpo to 0.9.3.