2008-03-27, 13:09
I think XBMC needs a better stacking algorithm for finding sequences in filenames. I came to this opinion after the current algorithms failed to stack some of my files, with the problems that followed. I was also inspired and challenged by the needs and examples given in a thread here. Fortunately for you, I have already done most of the work of creating the algorithm, and I can share it if you want to implement it in XBMC. I would do it myself, but it would take me an extensive amount of time to learn C++ and the other Xbox/XBMC specifics. My algorithm can be implemented with PCRE or the Python regex library, plus simple language constructs and decision structures.
I encourage you to test it thoroughly and post your comments in this thread. I have created a demonstration of the algorithm in PHP here:
http://tinyurl.com/ypgmt2
The idea is simple and is based entirely on numbers: if two strings have the same content except for a certain number (which can be in any position), they are considered a sequence.
movie 1
Movie 2 //no
Movie 1
movie 2 //yes
In strict mode these examples won't match, but the concept is still the same...
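To make the basic idea concrete, here is a minimal Python sketch of the non-strict comparison between two names. It is not the code behind the PHP demo: the function name differs_by_one_number is made up for illustration, and comparing the surrounding text exactly (including letter case) is an assumption.

```python
import re

NUMBER = re.compile(r"\d+")

def differs_by_one_number(a, b):
    """Return True if a and b are identical except for one run of digits."""
    for m in NUMBER.finditer(a):
        # Text surrounding this particular number in a.
        prefix, suffix = a[:m.start()], a[m.end():]
        if len(b) <= len(prefix) + len(suffix):
            continue
        if b.startswith(prefix) and b.endswith(suffix):
            # Whatever b has in place of the number must also be a number.
            middle = b[len(prefix):len(b) - len(suffix)]
            if NUMBER.fullmatch(middle) and middle != m.group():
                return True
    return False

print(differs_by_one_number("ocean12", "ocean13"))  # True
print(differs_by_one_number("ocean12", "river13"))  # False
```

Because every run of digits in the first name is tried in turn, the number can sit anywhere in the string.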
Here's some info I typed up quickly, but you really can see what it can do by testing it yourself.
Right now it only deals with numbers, but I'm thinking about supporting other types such as 1a, 1b, or Roman numerals.
Quote:
Rules:
Not Strict (unchecked):
Parses all strings for runs of digits. Any sequence counts, no matter which part of the string it appears in.
Some example sequences:
ocean12
ocean13
o324234sdf
o234890790834sdf
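One possible way to collect such sequences from a whole list of names is to group them by the text around each number. This is only a sketch under assumptions: find_sequences is a hypothetical name, the names are assumed to be unique, and a name with several numbers may land in more than one group.

```python
import re
from collections import defaultdict

def find_sequences(names):
    """Group names that share the same text around one run of digits."""
    groups = defaultdict(list)
    for name in names:
        for m in re.finditer(r"\d+", name):
            # Key: everything before and after this number.
            key = (name[:m.start()], name[m.end():])
            groups[key].append((int(m.group()), name))
    # Keep groups with at least two members, ordered by the number.
    return [[n for _, n in sorted(members)]
            for members in groups.values() if len(members) > 1]

print(find_sequences(["ocean12", "ocean13", "o324234sdf", "o234890790834sdf", "readme"]))
# [['ocean12', 'ocean13'], ['o324234sdf', 'o234890790834sdf']]
```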
Strict:
Strict mode still looks for sequences in any part of the string, but is stricter about what it accepts.
Every number that follows characters a-z or a-z\s* must have (part|pt|dvd|cd|title|file|disc) before it,
unless the number is part of a 1-1 or 1/1 sequence: \s*\d+/\d+ or \s*\d+-\d+
Also, this will not count:
sdfdfdsfpart1 or asdfdsfpart 1
but this will:
sdfdf part1
and so will this:
sdfdf_part_1
The main advantage of strict mode is that it will not treat transformers01, or blade 1 and blade 2, as a sequence, while keeping all the powerful sequence-finding capability of non-strict mode.
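The strict rules could be read roughly as in the Python sketch below. It is a literal interpretation of the rules as written, not the demo's actual code: strict_numbers and the pattern names are hypothetical, case-insensitive matching is an assumption, and the episode-identifier exclusion is left out here.

```python
import re

KEYWORDS = r"(?:part|pt|dvd|cd|title|file|disc)"
# A keyword directly before the number that is not glued to preceding letters.
KEYWORD_BEFORE = re.compile(r"(?<![a-z])" + KEYWORDS + r"\s*$", re.IGNORECASE)
# Letters (optionally followed by whitespace) directly before the number.
LETTERS_BEFORE = re.compile(r"[a-z]\s*$", re.IGNORECASE)
# "1-1" or "1/1" style pairs, which the rules exempt.
PAIR = re.compile(r"\d+[-/]\d+")

def strict_numbers(name):
    """Return the digit runs in name that this reading of strict mode accepts."""
    pair_spans = [m.span() for m in PAIR.finditer(name)]
    accepted = []
    for m in re.finditer(r"\d+", name):
        before = name[:m.start()]
        in_pair = any(s <= m.start() and m.end() <= e for s, e in pair_spans)
        needs_keyword = LETTERS_BEFORE.search(before) is not None
        if in_pair or not needs_keyword or KEYWORD_BEFORE.search(before):
            accepted.append(m.group())
    return accepted

print(strict_numbers("sdfdf part1"))     # ['1']
print(strict_numbers("sdfdfdsfpart1"))   # []
print(strict_numbers("transformers01"))  # []
```

The numbers that survive this filter would then go through the same comparison as in non-strict mode.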
In both strict and non-strict modes, this TV show episode identifier will never count as part of a sequence:
\d+\s*x\s*\d+
like:
01 X 08
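As a rough illustration of this exclusion, the Python sketch below drops any digits that belong to such an identifier before sequences are looked for. The helper name numbers_outside_episode_ids is made up, and matching case-insensitively is an assumption based on the example using a capital X while the pattern has a lowercase x.

```python
import re

# The episode pattern from the rules, matched case-insensitively (assumption).
EPISODE = re.compile(r"\d+\s*x\s*\d+", re.IGNORECASE)

def numbers_outside_episode_ids(name):
    """Return the digit runs in name that are not part of an NNxNN identifier."""
    spans = [m.span() for m in EPISODE.finditer(name)]
    return [m.group() for m in re.finditer(r"\d+", name)
            if not any(s <= m.start() and m.end() <= e for s, e in spans)]

print(numbers_outside_episode_ids("01 X 08"))              # []
print(numbers_outside_episode_ids("show 01 X 08 part 2"))  # ['2']
```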
My stacking algorithm can cover pretty much all realistic patterns of sequences within strings, giving far more possibilities than XBMC's current stacking algorithms offer. If you have comments, questions, or suggestions, post them in the XBMC thread.
-Enjoy! -plex