[WIP] AniDB.net Anime Video Scraper
(2012-08-28, 02:30)akovia Wrote: Hi,
New user here and have been reading all day now but still feel a little lost.

I have an extensive anime collection and have created my own filing/naming scheme that is incompatible with XBMC. I'm searching for the best strategy to attack renaming everything with a high probability of success. I don't mind hand editing some stuff to get it accurate, but I would like to start on the right track, so I'm looking for some success stories or a pointer to an updated guide that walks you through the process and tells you what to expect. Maybe just a good naming scheme that catches most series, movies, and OVAs.

Is FileBot a good option for renaming, and is there a successful script for it?

Lastly, I did read that there will be no support for OPs & EDs (specials?), which is fine, but I was curious how you deal with them as far as naming and filing so as not to disrupt matching. (sub-folder?)

Any help, strategies, or anecdotes would be much appreciated.

The best strategy for ANIME is to use AniDB O'Matic (AOM) (http://wiki.anidb.net/w/AniDB_O'Matic) or the AniDB applet (http://anidb.net/perl-bin/animedb.pl?show=applet). Either one can index your entire collection based on CRC32 hash, so current filenames do not matter, and then rename the files from the AniDB database. FileBot claims it can do that, but it doesn't.

This assumes of course that your files are as you received them and you haven't re-encoded them. If they've been re-encoded and thus don't match any known files by CRC32 hash, then you'll have to rename them manually.
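To check whether a file still matches the CRC32 tag in its name before trying either client, something like the following could work. It is only a sketch in Python and not what AOM or the applet do internally; the function name `file_crc32` is made up for illustration.

```python
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Return the CRC32 of a file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        # Read in chunks so large video files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08X")
```

If the result differs from the tag in the filename, the file has been changed since release and hash-based renaming will most likely not find it.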

You might have to read up on the forums at AniDB to actually figure out how to customize your naming schemes. I currently use AOM.

As for OP/ED/Specials, I keep them in the same folder as the episodes. Both renaming clients handle them fine. As for the scraper, it appears to work on some, but even if it doesn't... Due to the hit-or-miss nature of the scraper, I always use the Video Files option instead of TV Shows, and they always show up there.
Reply
@pigers

Try deleting the "Cache" folder in your XBMC userdata folder.

-----------------------------------------------------
@akovia

If you use the AniDB Applet (option > go advanced), you can use a renaming script like this one; it works great with REGEX 2.4.

Code:
Trunc(str, len):=$repl(%str%, ".{" $len($repl(%str%, "(.?){" %len% "}$", "")) "}$", "")
TruncEllipse(str, len):= { $len(%str%) = $len($Trunc(%str%, %len%)) ? %str% : $Trunc(%str%, %len%) "…" }

AT:={ %ATe% ? %ATe% " (" %ATr% ")" : %ATr% }                  #anime title: English with romaji in parentheses, else romaji
ET:=[%ETe%,%ETr%]                                              #episode title (English or romaji)
ET:=$TruncEllipse(%ET%,"64")                                   #truncate long titles
GT:="[" [%GTs%,%GTl%] "]"                                      #group tag (short or long)

EpNoPad:=$pad(%EpNo%,$max($len(%EpHiNo%),$len(%EpCount%)),"0") #episode number padding
EpNoPad:="ep"%EpNoPad%                                         #add "ep" in front
EpNoPad:=$repl(%EpNoPad%,'ep[sS]',"S00E0")                     #rename specials from "epS" to "S00E"

Src:="["%Source%"]"                                            #set [source] e.g. [Blu-ray]
Ver:={%Ver%="1"?"":"v"%Ver%}                                   #set version if applicable
Res:="["%FVideoRes%"]"                                         #set resolution e.g. [1280x720]

MEpNo:={%ET%="Complete Movie"?%EpNoPad% %Ver%:%EpNoPad% %Ver%" - "%ET%} #Only show title when not "Complete Movie"

Movie := %AT%" "%MEpNo%" "%Res%%Src%%GT%                       #for movies set to "animetitle ep01 [resolution][source][group]"
Other := %AT%" "%EpNoPad% %Ver% " - "%ET%" "%Res%%Src% %GT%    #all else set to "animetitle ep## - episode title [resolution][source][group]"

FileName:= {%Type% = "Movie"? %Movie% : %Other%}               #check if the file is a movie or not and rename appropriately

AT:= { %ATe% ? %ATe% "" : %ATr% }
ET:=[%ETe%,%ETr%]
ET:=$TruncEllipse(%ET%,"64")
GT:="[" [%GTs%,%GTl%] "]"
CRC:="["$uc(%FCrc%)"]"

EpNoPad:=$pad(%EpNo%,$max($len(%EpHiNo%),$len(%EpCount%)),"0")
EpNoPad:=$repl(%EpNoPad%,'[sS]',"S00E0")

MEpNo:={%ET%="Complete Movie"?%EpNoPad% :%EpNoPad% " - "%ET%}

Movie := %AT%" - "%MEpNo%" - "%GT%%CRC%
Other := %AT%" - "%EpNoPad%" - "%ET%" - "%GT%%CRC%

FileName:= {%Type% = "Movie"? %Movie% : %Other%}

Movie: Gintama - The Movie - 1 - [STORM][33F80A06].mkv
Episode: Mobile Suit Gundam - Unicorn - 1 - Day of the Unicorn - [THORA][3474BCB6].mkv
Special: Amagami SS Plus - S00E01 - Special 1 - [BtotheO][94089C5E].mkv
Reply
Thank you for the reply aelfwyne.
(2012-08-28, 05:01)aelfwyne Wrote: The best strategy for ANIME is to use AniDB O'Matic (AOM) (http://wiki.anidb.net/w/AniDB_O'Matic) or the AniDB applet (http://anidb.net/perl-bin/animedb.pl?show=applet). Either one can index your entire collection based on CRC32 hash, so current filenames do not matter, and then rename the files from the AniDB database. FileBot claims it can do that, but it doesn't.

This assumes of course that your files are as you received them and you haven't re-encoded them. If they've been re-encoded and thus don't match any known files by CRC32 hash, then you'll have to rename them manually.
Unfortunately this is the case. As I was using regular media players prior to this, I always renamed my files to keep them orderly (Series - Epno/Movie/OVA [Group][origCRC].ext). This way I was able to make a bash script to embed just the filename (minus bracketed info) into the metadata "Title". This of course trashed the CRC, which is why I kept the original in the filename in case I ever needed to reference it.
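For anyone with a similar scheme, the "filename minus bracketed info" title and the preserved CRC can be pulled apart roughly like this. The original bash script is not shown in this thread, so this Python version is only a guess at the idea; both function names are hypothetical.

```python
import re
from pathlib import Path

# Any [...] tag, together with the whitespace in front of it.
BRACKET_TAG = re.compile(r"\s*\[[^\]]*\]")
# A bracketed tag made of exactly 8 hex digits is treated as the original CRC32.
CRC_TAG = re.compile(r"\[([0-9A-Fa-f]{8})\]")


def title_from_filename(filename):
    """Drop the extension and all bracketed tags from a filename."""
    stem = Path(filename).stem
    return BRACKET_TAG.sub("", stem).strip()


def original_crc(filename):
    """Return the CRC32 tag kept in the filename, or None if there is none."""
    match = CRC_TAG.search(filename)
    return match.group(1).upper() if match else None
```

With a name like "Amagami SS Plus - 01 [UTW][333C09E0].mkv" the title part is what would go into the metadata "Title" field, and the CRC stays available for reference.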

I really don't mind some manual labor to get it right, just trying to avoid doing things more than once if possible.

As far as AOM is concerned, it appears that their default naming scheme is %a - %ep - %e%G (anime - episode with version# - episode name GROUP). This is where I have been getting pretty lost. After reading through here it seems like it needs to be a format something like anime - SxxExx - epname and that, since there are no real seasons per se, you need a root folder under "TV Shows" for each season. Also, it can pick up some specials if you rename them in yet another pattern starting with S00.

I guess what I was hoping for is someone else that has a large diverse collection to reveal their naming scheme that would cover the bulk of everything, and how to manually fix anything that is not caught by the scraper. (NFO files, custom regex, or other strategy)
Quote:As for OP/ED/Specials, I keep them in the same folder as the episodes. Both renaming clients handle them fine. As for the scraper, it appears to work on some, but even if it doesn't... Due to the hit-or-miss nature of the scraper, I always use the Video Files option instead of TV Shows, and they always show up there.
That sounds like a good solution.

I guess I'll just keep reading for now to see if I can get some confidence in how to proceed.

Reply
Above your message you have my naming scheme, incl. the link to the regex I use. So far 239 anime (all of which appear in XBMC). Placed like this: Animes/Animename/Episode.xxx (I don't use "Seasons")

Now, if you re-encoded your anime, the AniDB applet (and AOM too, I think) will not work, since the CRC hash no longer matches. But as long as they are correctly named, incl. the original CRC in the filename, the AniDB scraper will show them in XBMC (ofc the release needs to be in AniDB). So I guess you will have to rename them manually, as aelfwyne said, with a renaming program, if the regex isn't working with your current naming scheme. Or you download them again :)

There are anime with numbers in their names ("Gundam 00", "NieA Under7", "Mobile Suit Gundam - The 8th MS Team", "Hunter X Hunter (2011)", etc.) which are going to cause problems.

But there are solutions; one is ofc the tvshow.nfo file. Sometimes it's already enough to leave out the number ("Mobile Suit Gundam - The MS Team").

When you scrape an anime like NieA Under7, you're going to get 13 episodes of Episode 7 :) So what I did was name the files like this: NieA Under Seven - 01 - Alien & Launching UFO - Bath - [tipota](bc6b35b4).mkv
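A quick way to find the folders that might need this treatment is to list every title that contains a digit. This is just a rough Python sketch; the function name is made up, and a digit in the title does not guarantee the scraper will actually trip over it.

```python
import re


def titles_with_digits(titles):
    """Return the titles that contain at least one digit."""
    return [title for title in titles if re.search(r"\d", title)]
```

Anything it returns is a candidate for spelling out the number, dropping it, or adding a tvshow.nfo.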

After that, you manually scrape them in XBMC, export the Video Database and copy & paste the <tvshow> info from the database XML into a tvshow.nfo and place it into the folder.
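The copy & paste step could also be scripted. The sketch below is in Python and assumes the exported XML holds one <tvshow> element per show, each with a <title> child; that layout is an assumption, so check it against your own export first. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def tvshow_nfo_text(exported_xml, title):
    """Return the <tvshow> block for the given title as text, or None if absent."""
    root = ET.fromstring(exported_xml)
    # Assumed layout: <tvshow> elements somewhere below the root, each with a <title>.
    for show in root.iter("tvshow"):
        if show.findtext("title") == title:
            return ET.tostring(show, encoding="unicode")
    return None
```

The returned text is what would be saved as tvshow.nfo in the show's folder.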

I guess when I have time I can write a step-by-step guide on how I do it.

Ps. Sorry for my English.

Reply
Hello :D

I'm relatively new to scraping anime, and I'm having trouble scraping the episode info for the most recent episodes of series like Naruto, One Piece and HxH (2011), and for other full series like Jigoku Shojo, where the 3rd season is recognized but shows no episode info. I have already checked thetvdb.com to see whether episode info is actually posted there, and it is, so that's not the issue. What is your opinion? Anyway, I hope you guys can help me out.

P.S. Do I need to post my debug log? It's been a long time since I posted one, so I forgot how, sorry. Can someone remind me how to do it?
Reply
Thanks Vaneska,
I decided to do some preliminary testing and copied some anime to a new TV Shows folder. I used FileBot (with a few hiccups) to rename them as..
TV Shows/Anime/ Anime - Epno - Episode.ext

I only did Movies and OVAs so far as they seem to give the most trouble from what I've read. (And apparently read correctly) 19 out of 21 have checkmarks by them in the Library after scanning them. 19 have coverart thumbs and series information but no episodes when you click the folder. (Not the same 19 with checkmarks) Only 2 were completed correctly with coverart, series, and episode information. What I don't understand is if it picks up the series correctly to add the coverart and info, where are things going wrong that I can't see my episodes in the folder? I am also curious as to why the thumbnails are stretched beyond recognition in the series information screen.

Is there no way from the interface to point it to the correct AniDB anime? It seems like sometimes I can manually input a name to scrape, but sometimes it won't let me.

I also tried your script for the AniDB applet to see if it would accept my files, but the applet hides the start button so I can't try it. It's like the applet is bigger than the window allotted for it. *sigh*

Starting to get frustrated...

Edit: Well, it seems I forgot to add the proper regex to advancedsettings.xml.
Code:
<advancedsettings>
<tvshowmatching action="prepend">
<regexp>(?:[\ _-]{2,3})(\d{1,3})</regexp>
</tvshowmatching>
</advancedsettings>
With this I don't really need to do much renaming from my original scheme. Hopefully just on the few that aren't recognized. Will update when I've gone through a few more.
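To see what that regex picks out of a filename before rescanning, it can be tried in Python. Python's `re` is not the engine XBMC uses, so treat this as an approximation; `episode_from` is a made-up helper name.

```python
import re

# Same pattern as in the <regexp> element above: 2-3 separator characters
# (space, underscore or hyphen) followed by 1-3 digits, which are captured.
EPISODE_PATTERN = re.compile(r"(?:[\ _-]{2,3})(\d{1,3})")


def episode_from(filename):
    """Return the first number the pattern captures, or None if nothing matches."""
    match = EPISODE_PATTERN.search(filename)
    return match.group(1) if match else None
```

A name like "Amagami SS - 01 [UTW].mkv" matches because of the " - " in front of the number, while "Amagami SS 01.mkv" does not, since a single space is not enough for the separator part.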

Quote:But as long as they are correctly named, incl. the original CRC in the filename, the AniDB scraper will show them in XBMC (ofc the release needs to be in AniDB).
Just an FYI. It doesn't seem to be checking the CRC. When I renamed a series back to the way I had it, I tried the update before adding back the CRC string and it worked fine.

i.e. Amagami SS - 01 [UTW].mkv
Reply
(2012-08-28, 23:35)akovia Wrote: Is there no way from the interface to point it to the correct AniDB anime? It seems like sometimes I can manually input a name to scrape, but sometimes it won't let me.

Not from the interface, but from the filesystem there is.

Create a file named tvshow.nfo in the same folder as the unrecognized Anime. Inside that file, include ONLY the URL of the anime on AniDB, and it will use that URL to scrape the anime. The only problem I have personally with this is that it sometimes still won't scrape the individual episodes, but at least it gets the folder.
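If there are many folders to fix, creating the file can be scripted. A minimal Python sketch, with a made-up function name; the URL is whatever you copy from the anime's page on AniDB.

```python
from pathlib import Path


def write_url_nfo(show_folder, anidb_url):
    """Write a tvshow.nfo containing only the given URL and return its path."""
    nfo_path = Path(show_folder) / "tvshow.nfo"
    # Only the URL goes into the file, with no XML around it.
    nfo_path.write_text(anidb_url.strip(), encoding="utf-8")
    return nfo_path
```

After writing the file, refresh the show in XBMC so the scraper has a chance to pick the URL up.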

Reply
EDIT: Delete this post, nm, on second thought I didn't need to post a second time.... meh wish I could delete my own post.
Reply
(2012-08-29, 03:44)aelfwyne Wrote:
(2012-08-28, 23:35)akovia Wrote: Is there no way from the interface to point it to the correct AniDB anime? It seems like sometimes I can manually input a name to scrape, but sometimes it won't let me.

Not from the interface, but from the filesystem there is.

Create a file named tvshow.nfo in the same folder as the unrecognized Anime. Inside that file, include ONLY the URL of the anime on AniDB, and it will use that URL to scrape the anime. The only problem I have personally with this is that it sometimes still won't scrape the individual episodes, but at least it gets the folder.

Luckily I haven't had to resort to that yet now that I have figured a few things out, but I have seen a dialog with a keyboard to scrape manually before and haven't found the right conditions or sequence to get back there. I asked in the irc channel but didn't get a response.

I also found out a little about the check-marks. The string I used in FileBot had not put an extension on the files and I didn't catch it since they were auto filed. After fixing that and renaming my files back to how they were, I was able to get files recognized and the check-marks went away. The check-marks are supposed to represent whether the series was watched or not, so it can be a bit confusing. I like the option to know if your folder doesn't have any video files in it or not, but it needs to be a different icon. For now I'm waiting till I'm finished with importing my library to set series as watched or not, just to be safe.

If anyone is following this I was able to add series without adding SxxExx. My original filenames work fine. Amagami SS Plus - 01 [UTW][333C09E0].mkv
For OVAs and Movies, I located the correct name of the series with FileBot. (This took a while to figure out.) Once you rename the folder from the result of FileBot, you can pretty much just add S01E01 in the filename to get it recognized. So far there hasn't been anything it hasn't found that way, but I have a long way to go.

I can't thank you guys enough for all the help. If I can get through my entire library and be satisfied, I might attempt a small guide for idiots like myself. =P
Reply
Quick Update:
I found how to do a manual search from the interface.

Example:
You have a show that's listed wrong and want to change it.
Do either of the following to fix it.

To keep the same folder name for the series/ova/movie:
1. Rename the offending folder to something generic temporarily

2. Clean the Database (System > Settings > Video > Library > Clean Library)

3. Rename the offending folder back to its original name.

4. Navigate back to anime you wanted to fix (Videos > Files > ?? > TV Shows)
(note: the folder will still have the wrong artwork it had before but no info associated with it.)

5. Show "TV Show Information" by hitting "i" or R-Click with mouse. This will bring up a window that shows all anidb matches along with their anidb #. If you don't see what you want from that list, there is a manual button you can click and enter your own search terms. Once you find the one you want, it should update everything including artwork with no problems.


If you don't care what the folder name is:
1. Rename the offending folder to any name not used elsewhere. (i.e. you could just append a - or something)

2. Reload the TV Shows directory by navigating out then back in.

3. Follow step 5 from above on the new folder.

4. (optional) Follow step 2 from above to make the database forget about the old folder and information.

It seems that there could be an option built into the interface to access this functionality directly, but I'm just tickled it can be done at all. I hope this helps someone as I'm finding it very useful with some problem shows.

Cheers

Edit: Of course you could just do this for all new folders that have not been scanned yet, never giving it a chance to select the wrong anime. Always select "TV Show Information" instead of "Scan For New Content" on each new folder.
Reply
(2012-08-28, 16:43)Vaneska Wrote: If you use the AniDB Applet (option > go advanced), you can use a renaming script like this one; it works great with REGEX 2.4.

Hey, just noticed.... your code block has a couple of issues.

First, it looks like you accidentally pasted in TWO copies of code. Only the second instance of each item will apply, so it mostly works.

Second, on shows with fewer than 10 episodes, it is using single-digit numbers (without a leading 0). Looks okay, but causes the scraper to not pick up the episodes. If you simply add a 0 in front of each episode number, the scraper picks them up fine. Working on a fix (but I don't understand regex lol)..
Reply
Okay, I have a much better working renaming script for the AniDB Applet. It solves the problem of scraping episodes when total # of eps < 10 by simply always padding to at least a single leading 0.
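For anyone not using the applet, the padding rule boils down to this: use the width of the highest episode number or the episode count, whichever is longer, but never fewer than 2 digits. Here is a Python sketch of that rule for plain numeric episodes only (specials are left out); the function name is made up.

```python
def pad_episode(ep_no, highest_ep, ep_count, min_width=2):
    """Zero-pad an episode number to the show's width, at least min_width digits."""
    width = max(len(highest_ep), len(ep_count), min_width)
    return ep_no.zfill(width)
```

So episode 7 of a 9-episode show becomes 07, and episode 5 of a 120-episode show becomes 005.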

This script also includes some added info, but none of it seems to hurt xbmc's odds of scraping.

One big note, is that I have enabled FILE MOVING. The reason for this is because it guarantees that each show is in a properly named folder. You MUST change the path in this code to a valid path on your system, or don't enable file moving.

Code:
Trunc(str, len):=$repl(%str%, ".{" $len($repl(%str%, "(.?){" %len% "}$", "")) "}$", "")
TruncEllipse(str, len):= { $len(%str%) = $len($Trunc(%str%, %len%)) ? %str% : $Trunc(%str%, %len%) "…" }

#AT:= { %ATe% ? %ATe% " (" %ATr% ")" : %ATr% }                  #anime title: English with romaji in parentheses, else romaji
AT:= { %ATe% ? %ATe% "" : %ATr% }
ET:=[%ETe%,%ETr%]
ET:=$TruncEllipse(%ET%,"64")
GT:="[" [%GTs%,%GTl%] "]"

CRC:="["$uc(%FCrc%)"]"
Src:="["%Source%"]"                                             #set [source] e.g. [Blu-ray]
Ver:={%Ver%="1"?"":"v"%Ver%}                                #set version if applicable
Res:="["%FVideoRes%"]"                                          #set resolution e.g. [1280x720]
Codec:="["$repl(%FVCodec%,"H264/AVC","h264")"]"                 #set video codec

TmpPad:=$max($len(%EpHiNo%),$len(%EpCount%))                    # set minimum. There HAS to be an easier way but i stupid.
EpNoPad:=$pad(%EpNo%,$max(%TmpPad%,'2'), "0")
EpNoPad:=$repl(%EpNoPad%,'[sS]',"S00E0")                        #rename to "S00E"

MEpNo:={%ET%="Complete Movie"?%EpNoPad% %Ver%:%EpNoPad% %Ver%" - "%ET%}         #Only show title when not "Complete Movie"

Movie := %AT%" - "%MEpNo%" - "%GT%%Src%%Res%%Codec%%CRC%                   #for movies set to
Other := %AT%" - "%EpNoPad%%Ver%" - "%ET%" - "%GT%%Src%%Res%%Codec%%CRC%        #all else set to

FileName:= {%Type% = "Movie"? %Movie% : %Other%}                                #check if the file is a movie or not and rename appropriately
PathName:="J:\TmpAnime\" $repl(%AT%,'[\\":/*|<>?]',"")

Remember, CHANGE THE PATHNAME (last line) to something valid on your system.
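The $repl in the PathName line strips characters that are not allowed in Windows folder names from the anime title. The same character class can be tried out in Python to see what a title turns into; this is only an illustration of that one replacement, and `safe_folder_name` is a made-up name.

```python
import re

# Same character class as in the PathName line: \ " : / * | < > ?
ILLEGAL_CHARS = re.compile(r'[\\":/*|<>?]')


def safe_folder_name(title):
    """Remove characters that cannot appear in a Windows folder name."""
    return ILLEGAL_CHARS.sub("", title)
```

Note that the characters are removed rather than replaced, so a title with a slash in it ends up with the two halves joined together.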
Reply
(2012-08-30, 01:50)aelfwyne Wrote:
(2012-08-28, 16:43)Vaneska Wrote: If you use the AniDB Applet (option > go advanced), you can use a renaming script like this one; it works great with REGEX 2.4.

Hey, just noticed.... your code block has a couple of issues.

First, it looks like you accidentally pasted in TWO copies of code. Only the second instance of each item will apply, so it mostly works.

Second, on shows with fewer than 10 episodes, it is using single-digit numbers (without a leading 0). Looks okay, but causes the scraper to not pick up the episodes. If you simply add a 0 in front of each episode number, the scraper picks them up fine. Working on a fix (but I don't understand regex lol)..

Thx, there was a second Trunc string in the code, fixed it. With single digits in a show with fewer than 10 eps, I haven't encountered any problems so far.
Reply
(2012-08-28, 23:35)akovia Wrote: I also tried your script for the AniDB applet to see if it would accept my files, but the applet hides the start button so I can't try it. It's like the applet is bigger than the window allotted for it. *sigh*

Are you on OSX? :)

AniDB.net > Profile > Custom CSS > div.g_section.applet object { width: 1000px; height: 700px }
Reply
(2012-08-30, 08:53)Vaneska Wrote:
(2012-08-28, 23:35)akovia Wrote: I also tried your script for the AniDB applet to see if it would accept my files, but the applet hides the start button so I can't try it. It's like the applet is bigger than the window allotted for it. *sigh*

Are you on OSX? :)

AniDB.net > Profile > Custom CSS > div.g_section.applet object { width: 1000px; height: 700px }

Xubuntu actually. I don't think I need to use AOM anymore as my system is working fine now, but that's a great tip.

I still have a long way to go, but I do need to work on some minor issues like sets, multi-episode files, and figuring out how to rename the titles of some anime so they are grouped together in the interface.

Also, could you elaborate on this?
Quote:After that, you manually scrape them in XBMC, export the Video Database and copy & paste the <tvshow> info from the database XML into a tvshow.nfo and place it into the folder.
Reply