Anime nerds, assemble!
#16
I've expanded on a bunch of stuff: http://wiki.xbmc.org/index.php?title=Ani...ldid=77354

Hopefully this explains things a bit better.
#17
I followed the wiki instructions for changing "advancedsettings.xml" so that XBMC would index my anime files (§1.3), with no luck. I replaced the wiki regexp with the following (naive) one and it worked fine:
Code:
[ _]0*(\d+)(?:v\d)?[ _]

I think the problem was that my episode numbers tend to have a leading zero (like "[HorribleSubs] Shigatsu wa Kimi no Uso - 02 [1080p].mkv") and probably XBMC had problems with that. In any case, many thanks for the guide.
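
To see what the naive pattern captures, here is a small sketch using Python's `re` module. It is only an approximation, since Kodi uses its own regex engine, and the helper name `naive_episode` is made up for illustration. The second file name is one of the test names from post #19 and shows where a number in the show title gets picked up instead of the episode.

```python
import re

# The naive pattern from this post, unchanged.
NAIVE = re.compile(r"[ _]0*(\d+)(?:v\d)?[ _]")

def naive_episode(filename):
    """Return the first number the naive pattern captures, or None."""
    match = NAIVE.search(filename)
    return int(match.group(1)) if match else None

# Leading zero is skipped by 0*, so "02" is captured as 2.
print(naive_episode("[HorribleSubs] Shigatsu wa Kimi no Uso - 02 [1080p].mkv"))  # 2

# The "7" in the show title matches first, so the real episode (01) is missed.
print(naive_episode("[avatar-nyanko] Koikoi 7 - 01 (DVD) [5E95FA4A]"))  # 7
```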
#18
Thanks for the updated regex! I'll update the page with it.
#19
I've updated my regex filters for anime, cleaned them up, fixed some bugs, and am pretty happy with them. They're not simple, but they handle every naming scheme I've tested, whereas the above naive regex fails on several.

It captures the season number when it appears in a parent directory (including multiple levels up). It does not capture OP/ED entries as long as there's no spacing (eg: OP1, ED1; if entered as "OP 1" instead of "OP1", it will still be found as episode 1), but it will capture entries like SP1 or OVA1 (special episode 1, placed in season 0). It handles all sorts of variations in filename formatting, with or without CRC code anchors. It does not capture data from inside bracketed areas (resolution, 10-bit, etc). It avoids capturing numbers from the show name itself (eg: KoiKoi 7).

It still does not handle multi-episode naming (eg: Dokuro-chan 01-02), as I haven't been able to make any sense of how Kodi processes those.

It's rather complex (though it's now only 3 regexes each for CRC-anchored and non-anchored files), so I don't know if you want to include it in the wiki, but posting here for reference.

Complete version with comments explaining the design, and test file names to check against:

Code:
<advancedsettings>
    <tvshowmatching action="prepend">
        <!-- Regex info: -->
        <!-- First capturing match is assigned to the season.  Second capturing match is assigned to the episode. -->
        <!-- (?i) turns on case-insensitive matching -->
        <!-- (?:stuff) is a non-capturing group for 'stuff', so as not to interfere with season/episode numbers. -->

        <!-- Anime specific matching. -->
        
        <!--
        Building the regex from back to front:
        
        Closing checksum (optionally followed by random text, but we don't have to match that) contained in (), {} or [].
        Preceded by any number of bracketed items of any sort of content, with possible spacing, dashes or underscores in between
        Possibly preceded by unbracketed text. Make sure it doesn't find episode numbers inside brackets.
        Preceded by the episode number (optionally labelled), with possible version number
        Preceded by various combinations of dash, dot, underscore or space, to separate the title from the episode number
        Possibly preceded by a season number, or 'Special' or 'OVA'
        -->
        
        <!-- Regexes listed in order of match preference -->
        <!-- The regexes in the prepend set are anchored to checksums, so should be checked before normal Kodi defaults. -->
        
        <!-- For reference, this is the regex containing everything from the episode number onwards.  It will be the same for all regexes in this section. -->
        
        <!-- <regexp>(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])</regexp> -->
        

        <!-- Anything with the filename marked as Special/OVA/OAV/etc goes to season 0, regardless of what the directory may say. -->
        
        <!-- EG: [SHiN-gx] Fight Ippatsu! Juuden-chan!! - Special 1 [DVD][720x480 AR h.264 FLAC][v2][FF09021F].mkv -->
        <!-- EG: [gleam] Kurenai OVA - 01 [OAD][0e73f000].mkv -->
        <!-- EG: [Jarzka] Saki Picture Drama 1 [480p 10bit DVD FLAC] [BA3CE364] -->
        <regexp>(?i)(Special|SP|OVA|OAV|Picture Drama)(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])</regexp>

        <!-- Then check if we have an explicit season directory. -->
        
        <!-- Inside a directory that specifies the season.  May include any number of subdirectories.  Doesn't try to find season markers in the file name. -->
        <!-- EG: Saki/Season 1/Saki [Jarzka]/[Jarzka] Saki 01 - Encounter [480p 10bit DVD FLAC] [9EED32CB] -->
        <!-- EG: Saki/Season 3/[Underwater-FFF] Saki Zenkoku-hen - The Nationals - 01 (720p) [AF65724D] -->

        <regexp>(?i)[\\/](?:S(?:eason)?\s*(?=\d))?(Specials|\d{1,3})[\\/](?:[^\\/]+[\\/])*[^\\/]+(?:\b|_)(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])</regexp>
        
        <!-- Include season marker in the filename. -->
        <!-- EG: [CoalGuys] K-ON!! S2 - 05 [4B19B10F] -->
        
        <regexp>(?i)[-._ ]+S(?:eason ?)?(\d{1,3})(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])</regexp>


        <!-- Anything else gets the default blank first capture, which sets the file to season 1. -->
        
        <!-- EG: [avatar-nyanko] Koikoi 7 - 01 (DVD) [5E95FA4A] -->
        <!-- EG: [gg]_Chuunibyou_Demo_Koi_ga_Shitai!_-_01_[5B6EFD1F] -->
        <!-- EG: [Eclipse] Akane-iro ni Somaru Saka - 01 (1024x576 h264) [39920E63].mkv -->
        <!-- EG: [gg]_Bakemonogatari_-_01_[CC0CF5D2].mkv -->
        <!-- EG: [Doki]_Asobi_ni_Iku_yo!_-_03v2_(1280x720_h264_AAC)_[B5B9C6F3].mkv -->
        <!-- EG: [Coalgirls]_Yuru_Yuri_02_(1280x720_Blu-Ray_FLAC)_[43E5A6B4] -->
        <!-- EG: Touch 01(DVD) - (112ceb61) Central Anime  -->
        <!-- EG: Cross Game 02 - Central Anime (1280x720) [BF23052D].mp4 -->
        <!-- EG: [Taka]_Naruto_Shippuuden_135_[480p][9073B8C2] -->

        <regexp>(?i)((?=\b|_))(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])</regexp>
        
        <!-- Multipart episode handling is still uncertain. (?:-(\d{1,3}))? -->
        <!-- EG: [Triad]_Dokuro-chan_-_01-02 [12345678].mkv -->

    </tvshowmatching>
    
    <tvshowmatching action="append">
        <!-- Alternate version that does not include checksums. Put this after normal XBMC patterns. -->
        <!-- Since it doesn't use the checksum anchor, need to make sure it's not a directory name. -->

        <!-- For reference, this is the regex containing everything from the episode number onwards.  It will be the same for all regexes in this section. -->
        
        <!-- <regexp>(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?[^][)(}{\\/]*$</regexp> -->


        <!-- Anything with the filename marked as Special/OVA/OAV/etc goes to season 0, regardless of what the directory may say. -->

        <regexp>(?i)(Special|SP|OVA|OAV|Picture Drama)(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?[^][)(}{\\/]*$</regexp>

        <!-- Inside a directory that specifies the season. -->
        <!-- EG: Saki/Season 2/[HorribleSubs] Saki Episode of Side A - 14 [720p] -->

        <regexp>(?i)[\\/](?:S(?:eason)?\s*(?=\d))?(Specials|\d{1,3})[\\/](?:[^\\/]+[\\/])*[^\\/]+(?:\b|_)[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?(?:\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?[^][)(}{\\/]*?$</regexp>

        <!-- Include season marker in the filename. -->
        <!-- EG: [DeadFish] Toaru Kagaku no Railgun S - S2 - 01 [720p][AAC].mp4 -->
      
        <regexp>(?i)[-._ ]+S(?:eason ?)?(\d{1,3})(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?[^][)(}{\\/]*$</regexp>

        
        <!-- EG: [a.f.k.] Lucky Star - 01.avi -->
        <!-- EG: Air Master - 04 [HQA&N!].avi -->
        <!-- EG: [ANE] Yosuga no Sora - Ep01v2 [BDRip 1080p x264 FLAC] -->
        <!-- EG: [DeadFish] Jinrui wa Suitai Shimashita - Special 01 [BD][720p][AAC].mp4 -->

        <regexp>(?i)((?=\b|_))(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?[^][)(}{\\/]*$</regexp>

    </tvshowmatching>
</advancedsettings>

Edit: Fixed some issues in the regexes.
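
For anyone who wants to try the patterns outside Kodi, here is a rough sketch that runs the last regex of the prepend section (the "anything else" fallback) through Python's `re` module against a few of the test names listed in the comments. Python's engine is not the one Kodi uses, so treat the results as indicative only. The `parse` helper and the "blank first capture means season 1" mapping are assumptions based on the comment in the XML above.

```python
import re
import warnings

# Fallback pattern from the prepend section, copied verbatim (split in two for length).
DEFAULT_CRC = (
    r"(?i)((?=\b|_))(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)"
    r"[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])"
)

with warnings.catch_warnings():
    # Python warns about "[[" inside a character class; the pattern still compiles.
    warnings.simplefilter("ignore", FutureWarning)
    PATTERN = re.compile(DEFAULT_CRC)

def parse(filename):
    """Return (season, episode), or None if the pattern does not match."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    # Assumption: an empty first capture means season 1, as the XML comment says.
    season = int(match.group(1)) if match.group(1) else 1
    return season, int(match.group(2))

print(parse("[avatar-nyanko] Koikoi 7 - 01 (DVD) [5E95FA4A]"))  # (1, 1)
print(parse("[Doki]_Asobi_ni_Iku_yo!_-_03v2_(1280x720_h264_AAC)_[B5B9C6F3].mkv"))  # (1, 3)
print(parse("[Taka]_Naruto_Shippuuden_135_[480p][9073B8C2]"))  # (1, 135)
```

Note that the "7" in "Koikoi 7" is skipped because the repeated group keeps only its last capture before the bracketed material.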
#20
@Kinematics, you are my hero!
#21
Note: While writing up some info for an additional post, I found both a way to improve the regexes, and some problematic corner cases. I'm updating the regex in the above post to account for the newest fixes.

Part 2 of my anime naming adventures.


So, after many headaches involving renaming files, manual .nfo creation, and moving stuff all over my hard drive, I finally got all of Bakemonogatari working, and this can probably help anyone going through similar problems with this or similar shows.

Bakemonogatari is split up into multiple, individually-named seasons: Bakemonogatari (1st season), Nisemonogatari (2nd season), Monogatari Second Season (3rd season), along with specials named Nekomonogatari (Black), Hanamonogatari, Tsukimonogatari, etc.

If you use AniDB [Mod] as the scraper, each season is treated as its own show, so you need to place them in individually named directories. If you use TheTVDB, all the series are considered seasons of the same show, and must either all be in the same directory, or have custom .nfo files written for each video (major hassle).

The actual final fix (tuned for TheTVDB, since AniDB failed at scraping several episodes) involved adjusting the regexes used for name parsing (as posted above) so that they recognized the season number in the directory name. I just hadn't accounted for such things when I originally wrote them.


So, what happens is that you set up a directory structure like so:

Code:
Bakemonogatari
|---- Season 1
    |---- Files
|---- Season 2
    |---- Nisemonogatari
        |---- Files
|---- Season 3
    |---- Monogatari Second Season
        |---- Files
Note that the subdirectories (Nisemonogatari, Monogatari Second Season) are optional, and could also be named after the torrent folder (eg: [Coalgirls]_Nisemonogatari_(1280x720_Blu-ray_FLAC)). The point is that whether you have individual files, or an entire folder of files, you don't need to rename anything in order to get the season number identified; just put it in the correct directory.

After that, you need to figure out where to put the specials.

If the files are marked with recognized 'special' values (eg: Special, SP, OVA, OAV, Picture Drama), and their numbering matches that of the scraper site (eg: Saki Picture Drama 1), then the files can be placed anywhere in the directory structure with no further modifications.

Unfortunately, in the case of Bakemonogatari, the specials -will- need to be renamed, because it's the scraper site's 'number' of the special that matters, not the actual number in the file name. For example, Nekomonogatari (Black) Part 1 is the 3rd special, so it needs to have E03 as part of its name.

In this case you have two options. You can either create an additional 'Specials' (or 'Season 0') directory that you put all the specials into (though you will still need to mark the 'episode' number for each), or you can rename the files to contain an explicit season number (EG: S00E03).

For example:
Code:
Bakemonogatari
|---- Specials
    |---- Nekomonogatari (Black) - Part 1 - E03
    |---- Nekomonogatari (Black) - Part 2 - E04
    |---- Nekomonogatari (Black) - Part 3 - E05
    |---- Nekomonogatari (Black) - Part 4 - E06
|---- Season 1
    |---- Files
    |---- Nekomonogatari (Black) - S00E03
|---- Season 2
    |---- Nisemonogatari
        |---- Files
|---- Season 3
    |---- Monogatari Second Season
        |---- Files


Addendum: I rewrote the regex so that you can use a directory name of 'Specials' instead of 'Season 0', as a matter of convenience.


Notes for the regex patterns:

You can name a season via directory name in any of the following ways -

/Season 2/
/Season2/
/S 2/
/S2/
/2/
/Specials/

The episode number will be the last valid number before all the bracketed material at the end of the file name. Valid numbers are formatted like: 3, 03, E03, Ep03, E003 (up to three digits are valid, so up to episode #999).
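
The directory forms listed above can be checked with a short Python sketch that uses only the directory fragment of the season-directory regex. The function name `season_from_path`, the sample paths, and the mapping of 'Specials' to season 0 are assumptions for illustration; Python's `re` is not Kodi's engine, so this is an approximation.

```python
import re

# Directory fragment taken from the start of the "season directory" regex.
SEASON_DIR = re.compile(r"(?i)[\\/](?:S(?:eason)?\s*(?=\d))?(Specials|\d{1,3})[\\/]")

def season_from_path(path):
    """Return the season number implied by a directory name, or None."""
    match = SEASON_DIR.search(path)
    if match is None:
        return None
    token = match.group(1)
    # Assumption: 'Specials' stands in for season 0, as described in this post.
    return 0 if token.lower() == "specials" else int(token)

for folder in ("Season 2", "Season2", "S 2", "S2", "2", "Specials"):
    print(folder, "->", season_from_path(f"Show/{folder}/file.mkv"))
```

One consequence of accepting a bare `/2/` is that any folder named with one to three digits is read as a season number.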
#22
Ugh. OK, one more fix. The consolidation I did of the regexes to simplify the code was technically correct on a per-name basis (which is what all my testing checked against), but messed up the assumptions about the order that the files would be analyzed in. In particular, Specials that were stored in a 'Season 1' directory would be marked as part of season 1, instead of season 0. Had to separate the regex again so that 'Special'/'OAV'/etc takes priority over the directory name.
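
A rough Python sketch of why the order matters, assuming the first matching regex wins. It uses the CRC-anchored 'Special' and season-directory patterns from post #19. The `first_match` and `to_season` helpers, the sample path, and the rule that a non-numeric first capture means season 0 are assumptions for illustration, and Python's `re` only approximates Kodi's engine.

```python
import re
import warnings

# Shared tail (episode number through checksum) from the prepend section.
TAIL = (
    r"(?:[ _.-]*(?:ep?[ .]?)?(\d{1,3})(?:[_ ]?v\d+)?)+(?=\b|_)"
    r"[^])}]*?(?:[[({][^])}]+[])}][ _.-]*)*?(?:[[({][\da-f]{8}[])}])"
)
SPECIAL = r"(?i)(Special|SP|OVA|OAV|Picture Drama)" + TAIL
SEASON_DIR = (
    r"(?i)[\\/](?:S(?:eason)?\s*(?=\d))?(Specials|\d{1,3})[\\/]"
    r"(?:[^\\/]+[\\/])*[^\\/]+(?:\b|_)" + TAIL
)

with warnings.catch_warnings():
    # Python warns about "[[" inside a character class; the patterns still compile.
    warnings.simplefilter("ignore", FutureWarning)
    special = ("special", re.compile(SPECIAL))
    season_dir = ("season_dir", re.compile(SEASON_DIR))

def to_season(capture):
    # Assumption: a non-numeric first capture (e.g. "OVA") means season 0.
    return int(capture) if capture.isdigit() else 0

def first_match(path, ordered):
    """Try each (name, pattern) in order and report the first one that matches."""
    for name, pattern in ordered:
        match = pattern.search(path)
        if match:
            return name, to_season(match.group(1)), int(match.group(2))
    return None

path = "Saki/Season 1/[Jarzka] Saki Picture Drama 1 [480p 10bit DVD FLAC] [BA3CE364]"
print(first_match(path, [special, season_dir]))  # ('special', 0, 1)
print(first_match(path, [season_dir, special]))  # ('season_dir', 1, 1)
```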

Sorry about so many corrections right after the initial post, but as usual, showing off what you've done immediately brings all the bugs to the surface.
#23
Hi everyone! First of all, thank you for the amazing effort :)

Do you know if it is possible to easily adapt these regular expressions for movie matching as well?
(I am assuming that tvshowmatching works only for TV shows/series, and I couldn't find anything in the wiki for movies.)

In my case I just need to strip group names in brackets that come before the movie name (no need to deal with episode numbers or seasons); files named like this are not recognized when scanned as movies. Any help?

Thank you!
#24
I don't think that's possible at the moment for movies. The only option I see is cleanstrings, but they only apply to patterns after the name. I could be wrong, though, since I'm very bad at regex.
#25
Thank you for the quick reply Ned.

But shouldn't it be possible to modify the default regexes used by xbmc to parse the movies? I can't even find those or a reference on how to change them. Since all is "open" they must be somewhere.
#26
The cleanstrings option in advancedsettings.xml (wiki) states that it matches and removes everything to the right of the pattern. Any pattern before the name of the movie would then blank out the actual file name itself. At least, as far as I understand.
#27
Finally got around my problem: remove group names within brackets before movie name

I will leave my solution here:

I used the default movies scraper, with recursive scan enabled. Then to get it to ignore things like [GroupName] in files like [GroupName]MovieName.mkv I added:

Code:
<RegExp input="$$1" output="\1" dest="1">
    <expression noclean="1">[%20_]*%5b[a-zA-Z0-9]*%5d[%20_]*(.*)</expression>
</RegExp>

I added this to the scraper's default configuration file, whose location may vary with the XBMC version. Mine was here: /opt/xbmc-bcm/xbmc-bin/share/kodi/addons/metadata.themoviedb.org/tmdb.xml

Add it immediately before:

Code:
<CreateSearchUrl dest="3">

The solution above was inspired by this link.

I still had another problem. Some files also had information about the movie (e.g. 720p, BluRay, etc.) in parentheses right after the movie name, and this was not being cleaned properly. So, following Advancedsettings.xml#cleanstrings on the wiki, I cleaned it up by adding this to advancedsettings.xml:

Code:
<video>
        <cleanstrings action="prepend">
                <regexp>(\(.*\))</regexp>
        </cleanstrings>
</video>

Notice that the
Code:
action="prepend"
is crucial! I did not include it initially and ended up inadvertently replacing the defaults, which then caused problems in recognizing other files.
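
As a rough illustration of what that cleanstrings pattern does, here is a Python sketch. It assumes the behavior described in post #26 (the match and everything to its right are removed); the `clean_title` helper and the sample names are made up, and Kodi's actual cleaning pipeline may do more than this.

```python
import re

# The cleanstrings pattern from this post, unchanged.
CLEAN = re.compile(r"(\(.*\))")

def clean_title(name):
    """Drop the first parenthesized block and everything after it."""
    match = CLEAN.search(name)
    # Assumption: mimic "remove the match and everything to its right".
    return name[:match.start()].strip() if match else name

print(clean_title("MovieName (720p BluRay)"))  # MovieName
print(clean_title("MovieName"))                # MovieName
```

Because the cut starts at the first opening parenthesis, anything else in parentheses after the name is dropped along with the quality tags in this sketch.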

Hope this helps someone!
#28
What are people using to scrape their anime movies? There's no AniDB option for movies, and if I try setting the content to TV and using AniDB it (unsurprisingly) doesn't work.
#29
I just use IMDB for anime movies.
#30
I use MediaElch for my shows and movies (anime and non-anime alike).

I double-check TVDB for the season order, and name files like so:
Code:
ShowName/Season ##/ShowName - S##E## - EpisodeName - Quality.ext
basically the same as I would for a normal TV Show.

For anime OVAs, I first check whether A) it is listed by itself on TVDB, or B) it is listed as part of a longer show.
If it is listed in neither form, I use MediaElch to create a series for it, copying information from AnimeNewsNetwork and Wikipedia for the individual episodes. (If the OVA is a single episode, it is treated as a movie.)

For anime movies, I first check whether the movie is on TheMovieDB; if not, I check IMDB (odds are it will be listed there).
If the movie is not listed on either, I do the same as I do for OVAs.

Lastly, whether found or not, I scrape the data with MediaElch and keep it, so I never have to look it up again.

My Movie Naming Scheme is this
Code:
MovieName (Year)/MovieName (Year).Quality.ext
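
The two naming schemes in this post can be sketched as small Python helpers. The function names are hypothetical, and reading "##" as a two-digit, zero-padded number is an assumption.

```python
def episode_path(show, season, episode, title, quality, ext):
    """Build ShowName/Season ##/ShowName - S##E## - EpisodeName - Quality.ext."""
    # Assumption: "##" means zero-padded to two digits.
    tag = f"S{season:02d}E{episode:02d}"
    return f"{show}/Season {season:02d}/{show} - {tag} - {title} - {quality}.{ext}"

def movie_path(name, year, quality, ext):
    """Build MovieName (Year)/MovieName (Year).Quality.ext."""
    folder = f"{name} ({year})"
    return f"{folder}/{folder}.{quality}.{ext}"

print(episode_path("ShowName", 1, 2, "EpisodeName", "720p", "mkv"))
# ShowName/Season 01/ShowName - S01E02 - EpisodeName - 720p.mkv
print(movie_path("MovieName", 2014, "1080p", "mkv"))
# MovieName (2014)/MovieName (2014).1080p.mkv
```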
