Windows - Scraping - Custom Directory Structure & File Names

Knottyboy Offline
Junior Member
Posts: 17
Joined: Jan 2010
Reputation: 0
Post: #1
Hey All,

So I've left my XBMC setup alone for a while. I have a custom regexp:

Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]*

Which looks for:

Show\Season 01\[ep01] - Name Of Episode.avi

My only issue is multipart episodes. These are named in two conventions:

[ep01-02] - Episode 01 Name + Episode 02 Name
or
[ep01a] - Ep Name (p1)
[ep01b] - Ep Name (p2)

I had a look here:
http://wiki.xbmc.org/index.php?title=Add...s/TV_shows

and it says:

Quote:Note that when using a custom RegExp to match a custom file naming convention, add (.*) to the end of the RegExp so that the 3rd RegExp capture is found for the second episode.


So I have added .* to the end of my "Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]*" leaving me with:

Quote: Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]*.*


This still works for the single episode files however nothing changes on the multipart episodes.


I must confess I did bring this up about a year ago (almost to the day =S): http://forum.xbmc.org/showthread.php?tid=107867.
Sorry about that but it's been a year and I still haven't got round to making sense of it.

Perhaps someone could break down the RegExp for me, as I just made an educated guess at what it should be and it worked (well, for the single-episode files).

Do I need to change the naming format for my multipart episodes? Currently I'm assuming I can get away with leaving them as "[ep01-02] The Begriming + The Second Comming.avi".

Thanks in advance,

Knotty.
jmarshall Offline
Team-XBMC Developer
Posts: 26,221
Joined: Oct 2003
Reputation: 178
Post: #2
You need a capture around the .* at the end.

Knottyboy Offline
Junior Member
Posts: 17
Joined: Jan 2010
Reputation: 0
Post: #3
(2012-08-16 00:46)jmarshall Wrote:  You need a capture around the .* at the end.

Excuse my ignorance but what is a capture?
olympia Offline
Team-Kodi Member
Posts: 2,503
Joined: May 2008
Reputation: 32
Post: #4
()
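
To spell that out: a capture is any part of the pattern wrapped in parentheses, and the regex engine hands back the text each one matched separately from the overall match. Below is a minimal Python sketch that breaks the original expression down piece by piece. Python's re module is only a stand-in for XBMC's own regex engine (an assumption), and the sample path is the one from the first post.

```python
import re

# The original pattern from the first post, split up with re.VERBOSE
# so every piece can carry a comment.
PATTERN = re.compile(r"""
    Season        # the literal text Season
    [\._ ]        # one separator: dot, underscore or space
    [0]*          # any leading zeros
    ([0-9]+)      # capture 1: the season number
    [\\/]         # a path separator, backslash or forward slash
    [ep[0]*       # a character class, not literal text: any run of e, p, [ or 0
    ([0-9]+)      # capture 2: the episode number
    [^\\/]*       # the rest of the file name (anything but a path separator)
""", re.VERBOSE)

path = r"Show\Season 01\[ep01] - Name Of Episode.avi"
groups = PATTERN.search(path).groups()
print(groups)  # the two captures: season and episode
```

Only the two parenthesised parts come back as captures; everything else is matched but not returned on its own.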
Knottyboy Offline
Junior Member
Posts: 17
Joined: Jan 2010
Reputation: 0
Post: #5
(2012-08-16 10:03)olympia Wrote:  ()

I thought as much but they all seem closed off to me.

I tried running it through http://regexpal.com/

(At work so have not tried it in XBMC yet)

But this seems to select the right bit of text:

Quote:Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]...

From the below:

C:\Show\Season 01\[ep01-02] - Info.avi

the quote string returns:

Quote: Season 01\[ep01-02]

Is this what I need it to return in order to scrape the correct info?

(Will do this by trial and error tonight but thought I'd ask the question to give myself a head start.)

I appreciate that adding the ... only allows for 3 following characters, but I think this should be okay as I don't believe I have any two-part episodes which go into 3 figures (i.e. [ep100-101]).
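
For reference, the regexpal result can be reproduced with a short Python sketch. It assumes Python's re module behaves like the tester for this pattern, and it separates the whole match (what a tester highlights) from the captures (the parenthesised parts only).

```python
import re

# Pattern from this post: "[^\\/]" followed by three "any character" dots.
pattern = re.compile(r"Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]...")

path = r"C:\Show\Season 01\[ep01-02] - Info.avi"
match = pattern.search(path)

whole_match = match.group(0)  # the text a tester such as regexpal highlights
captures = match.groups()     # only the parenthesised parts of the pattern

print(whole_match)
print(captures)
```

The highlighted text and the captures are different things: the trailing dots widen the overall match but add no capture, so the pattern still has only two.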
olympia Offline
Team-Kodi Member
Posts: 2,503
Joined: May 2008
Reputation: 32
Post: #6
If you follow the advice from jmarshall and the pointer from me, then this should leave you with:
Code:
Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]*(.*)
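
A quick way to check what each capture holds is to run the expression against a sample path. The sketch below uses Python's re module, which is an assumption: XBMC uses its own regex engine and the thread does not confirm it behaves identically. The second pattern is a hypothetical variant, not something posted in this thread, that wraps the rest of the file name in the third capture instead.

```python
import re

path = r"C:\Show\Season 01\[ep01-02] - Info.avi"

# Pattern as posted above: "(.*)" comes after the greedy "[^\\/]*".
posted = re.compile(r"Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]*(.*)")

# Hypothetical variant (not from this thread): capture the rest of the name itself.
variant = re.compile(r"Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)([^\\/]*)")

posted_groups = posted.search(path).groups()
variant_groups = variant.search(path).groups()

print(posted_groups)
print(variant_groups)
```

In Python, the greedy `[^\\/]*` consumes the rest of the file name before `(.*)` gets a turn, so the third capture of the posted pattern comes back empty, while the variant's third capture holds `-02] - Info.avi`. If multipart episodes still fail to scrape in XBMC, this may be worth testing there.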
Knottyboy Offline
Junior Member
Posts: 17
Joined: Jan 2010
Reputation: 0
Post: #7
(2012-08-16 13:22)olympia Wrote:  If you follow the advice from jmarshall and the pointer from me, then this should leave you with:
Code:
Season[\._ ][0]*([0-9]+)[\\/][ep[0]*([0-9]+)[^\\/]*(.*)

Heh will try yours first as you guys actually understand it. Trust it will work but will update the post either way.

Thank you to the both of you.