docuwiki.com scraper in development
#1
Greetings.

I've started working on a docuwiki.com scraper. Even though it's a wiki, it has a fairly static structure that can be scraped most of the time. I've got the basics down (extracting the title/year/narrator and the episode titles/plots, if they exist). I'm running into an issue, though: I'm unsure how the documentaries should be organized, since they fall somewhere between TV shows and movies.

They are like TV shows in that *some* of them come as multi-part series. They are like movies in that a good number are only a single part. What I'm not sure of is how they need to be organized to be scanned into XBMC so that the single-parters are recognized as single-part documentaries (movies, essentially) and the multi-parters are recognized as a show with episodes. Do I need to separate them into different source directories with different scrapers: a "movie" scraper for the single-part documentaries and a "tvshow" scraper for the multi-part ones? It would be essentially the same scraper, so it seems a bad hack to do it that way.

The next problem relates to advancedsettings.xml. When creating the regexp for <tvshowmatching>, it seems it needs to detect both the season and the episode from each name. The main problem is that documentaries don't have a season; for simplicity's sake my scraper currently outputs all episodes as part of season 1. Documentaries are usually named Name.XXofYY.EpisodeTitle.quality.ripgroup.avi. How can I recognize 2of3 or 5of9 as being season 1 episode 2, or season 1 episode 5, based on those file names? It's escaping me because I need to capture a 1 for the season, but there is not reliably a "1" present in the names.
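
To make the naming problem concrete, here is a minimal Python sketch of the matching I'm after, run outside XBMC. It assumes the Name.XXofYY.EpisodeTitle.quality.ripgroup.avi layout described above; the pattern, the `parse_part` helper, and the sample file names are all hypothetical illustrations, not anything taken from XBMC or from <tvshowmatching>. The season is never captured from the name, it is simply supplied as a default of 1.

```python
import re

# Hypothetical pattern for names like Name.2of3.EpisodeTitle.quality.ripgroup.avi.
# Only the part number and the total are captured; there is no season group.
PART_RE = re.compile(r"^(?P<name>.+?)\.(?P<part>\d+)of(?P<total>\d+)\.", re.IGNORECASE)


def parse_part(filename, default_season=1):
    """Return (name, season, episode, total) for multi-part names, else None."""
    match = PART_RE.match(filename)
    if match is None:
        # No XXofYY token: treat the file as a single-part documentary.
        return None
    return (
        match.group("name"),
        default_season,  # assumed, since the file name carries no season
        int(match.group("part")),
        int(match.group("total")),
    )
```

With this, a name such as Some.Documentary.2of3.Episode.Title.DVDRip.group.avi comes out as season 1, episode 2 of 3, while a name without an XXofYY token returns None and could be handled as a movie. Whether <tvshowmatching> can be given a fixed season in the same way is exactly what I don't know.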

Basically, it seems documentaries don't fit in very well as either movies or TV shows, so any pointers on getting this done would be much appreciated. Or should I be submitting a trac ticket for a third type of video file: movies, tvshows, and the new documentaries? I'm not particularly excited to submit a trac ticket for this, though, because I imagine it could be a few months before anything solid happens (if ever) with respect to a new type of video file.

journey4712


Messages In This Thread
docuwiki.com scraper in development - by journey4712 - 2008-10-16, 10:00
[No subject] - by journey4712 - 2008-10-16, 10:18
[No subject] - by spiff - 2008-10-27, 18:15
How is this going? - by voiddreamer - 2008-12-05, 00:16
[No subject] - by Jyujinkai - 2010-05-16, 05:34