Release The Definitive Anime Regex for Absolute Numbering
#1
After months of looking for a decent regex to deal with anime without having to resort to renaming everything, I finally decided to have a crack at it myself.

Having zero experience with regex, and with the very poor and scattered information in the wiki, it wasn't easy, but I think they turned out really great. I spent about 2 days looking at tutorials trying to figure out what the hell was going on, until inspiration struck.

The whole point of what I did is matching the last valid number in the string. Instead of worrying what comes before the episode, which leads to many pitfalls (most regexes fail at Eyeshield 21 123.avi for example), I'm always certain I have the right episode no matter the format. I also don't try to match the stuff that comes after the episode number. If it's a number inside brackets or inside other text it's not a valid episode number. For example in [Commie] Ace of the Diamond - 58v2 [38BCD21A].mkv 38 or 21 are not valid episode numbers, only 58 is (there's an exception for a number followed by v, to account for version numbers).
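
To make the "last valid number" idea concrete, here is a small Python sketch that runs the single-episode pattern from this post against the two filenames mentioned above. Treat it as an approximation only: Python's re module is not the regex engine Kodi uses, the &lt; entities are written as a plain <, and the anime/ prefix and the episode() helper are assumptions made for the demo.

```python
import re

# Single-episode pattern from this post, with "&lt;" unescaped to "<".
SINGLE_EPISODE = re.compile(
    r"(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
    r".+(?=\w)(?<![a-df-z0-9])(\d+)(?<=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])"
)


def episode(path):
    """Hypothetical helper: return the episode number the pattern picks, or None."""
    m = SINGLE_EPISODE.search(path)
    # Group 1 is the intentionally empty season group; group 2 is the episode.
    return int(m.group(2)) if m else None


# The "anime/" folder prefix is assumed so the pattern's folder check passes.
print(episode("anime/Eyeshield 21 123.avi"))
print(episode("anime/[Commie] Ace of the Diamond - 58v2 [38BCD21A].mkv"))
```

The greedy .+ pushes the match as far right as it can, so the last number that passes the lookaround checks wins: 123 for the first file and 58 for the second.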

I do have to make sure I'm dealing with anime in the first place, since this regex doesn't play nice with regular TV shows with seasons. I have all my anime under an anime folder, so I make sure anime is in the path (this can of course be modified to match your folder structure, but keeping shows with absolute ordering and those with seasons separate is a must for this to work). Even if your setup isn't like that, you only have to move files, not rename them, for it to work.

I also check that the file has certain extensions. That's completely optional, I just added it because in some shows I have a lot of garbage files that the default regex picks up, so I end up with a lot of duplicates, which is annoying. I do absolutely no bookkeeping on my files, so if you do even the minimum (like not have an extra folder full of junk for each episode), you may not need that and can safely delete it.
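
The extension check is just a group of lookaheads, and it can be tried on its own. The sketch below pulls that piece out of the full pattern and tests it in Python; the has_video_extension() name and the sample paths are made up for illustration, and Python's re module is only a stand-in for Kodi's regex engine.

```python
import re

# The optional extension filter, copied out of the full pattern.
# Each lookahead succeeds only if the path ends in that extension.
EXT_FILTER = re.compile(
    r"(?i)(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
)


def has_video_extension(path):
    """Hypothetical helper: True if the path would pass the extension filter."""
    return EXT_FILTER.match(path) is not None


print(has_video_extension("anime/Show/Show - 03.mkv"))  # video file
print(has_video_extension("anime/Show/Show - 03.url"))  # junk file
```

Because the lookaheads consume nothing, deleting that group from the full pattern leaves the rest of the match untouched.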

Specials need to be numbered according to the TVDB if you're using that scraper, though you don't need to add S00. That's more of a scraper issue than a regex issue. I still have no clue how AniDB deals with specials, so I can't help you there. So without further ado here it is:

Code:
<advancedsettings>
    
<tvshowmatching action="prepend">


<!-- Anime specific matching exclusively for Absolute Ordering. Everything will be matched to season 1, except Specials that will be assigned season 0 -->

<!-- Anime must be contained in a folder called anime (case insensitive) at any depth, to avoid messing with regular TV Shows. You can change it to match your folder structure. If your anime isn't separate from your other shows then simply delete "Anime" from the regex, but bear in mind it won't play nice with shows with more than 1 season -->

<!-- Files must have a mkv, mp4, avi or ogm extension to avoid matching spam files that sometimes come with some downloads like .url and the like. Also optional. Remove (?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$)) if you want to match any extensions. -->

<!-- The last number without brackets not immediately followed by letters other than "v" will be matched. That means shows like Macross 7, Eyeshield 21 or Hunter x Hunter (2011) pose no problem to match. -->

<!-- Specials are automatically assigned to season 0, so there's no need to add "S00" to the filename, but that doesn't affect the match. The file must be in a folder called "Specials" (case insensitive) for each show somewhere inside your anime folder. I recommend numbering after TVDB. -->

<regexp defaultseason="0">(?i)Anime.*Specials(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$)).+(?=\w)(?&lt;![a-df-z0-9])(\d+)(?&lt;=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])</regexp>

        <!-- EG: Anime/Kurenai/Specials/[gleam] Kurenai OVA - 01 [OAD][0e73f000].mkv -->
        


<!-- We are looking for a multi episode with the format Anime/Show 01-02.mkv. There can be any number of folders in between and numbers either before or after. It will match only the last set, so in the following file it will pick episodes 08 and 09, for example, and not 01 and 12. It will not match single episodes even if the folder name has a range of episodes, so no need to rename anything. -->

<!-- smb://n5200xxx/data/anime/started/hack/[kaa]_hack_twilight_01-12.dvd(complete)/hack_twilight_08-09.dvd(aac)[kaa][06da16fc].ogm -->


<regexp>(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$)).+(?=\w)(?&lt;![a-df-z0-9])(\d+)(\-\d+)(?&lt;=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])</regexp>


<!-- This one matches single episodes, always matching the last valid number as the episode. It supports a wide array of formats either with CRC, resolution, hash, etc. or without. -->


<regexp>(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$)).+(?=\w)(?&lt;![a-df-z0-9])(\d+)(?&lt;=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])</regexp>



</tvshowmatching>
    
  
</advancedsettings>
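
For anyone who wants to poke at the multi-episode pattern outside Kodi, this Python sketch runs it against the example path from the comment above. It is a rough check, not a Kodi test: Python's re module stands in for Kodi's engine, &lt; is written as <, and the episode_range() helper and the second (single-episode) path are made up for the demo.

```python
import re

# Multi-episode pattern from the settings above, with "&lt;" unescaped to "<".
MULTI_EPISODE = re.compile(
    r"(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
    r".+(?=\w)(?<![a-df-z0-9])(\d+)(\-\d+)(?<=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])"
)


def episode_range(path):
    """Hypothetical helper: return (first, last) episode numbers, or None."""
    m = MULTI_EPISODE.search(path)
    if m is None:
        return None
    # Group 3 keeps its leading "-", so strip it before converting.
    return int(m.group(2)), int(m.group(3).lstrip("-"))


FOLDER = "smb://n5200xxx/data/anime/started/hack/[kaa]_hack_twilight_01-12.dvd(complete)/"
print(episode_range(FOLDER + "hack_twilight_08-09.dvd(aac)[kaa][06da16fc].ogm"))
# Made-up single-episode file in the same folder: the 01-12 in the folder name is ignored.
print(episode_range(FOLDER + "hack_twilight_08.dvd(aac)[kaa][06da16fc].ogm"))
```

The (?!.*[\\\/]) lookahead is what rejects the 01-12 range: a slash still follows it, so it cannot be part of the filename.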

I'd be happy to answer any questions regarding the regex. If something is not being matched, I'd also like to hear about it. By design it doesn't match stuff like MyShow ep01.avi. It can be done, but since no files originally come in that format and there might be a slight risk of false positives, I decided not to support that. MyShow S01E102.avi works without problem though (as long as you only have 1 season), so if you've already renamed some of your stuff, this will still work.
#2
When you say "Anime must be contained in a folder called anime", is this the library name or the actual folder path?

For instance, I have a folder on my NAS called .anime (note the period at the beginning of the name), and the entry in Kodi is called anime. So my path is "smb://nas/video/.anime/". Will this work, or do I need to adjust the regex to account for this?
#3
(2016-05-16, 08:38)wolfgame Wrote: When you say "Anime must be contained in a folder called anime", is this the library name or the actual folder path?

For instance, I have a folder on my NAS called .anime (note the period at the beginning of the name), and the entry in Kodi is called anime. So my path is "smb://nas/video/.anime/". Will this work, or do I need to adjust the regex to account for this?

That would work. Some folder in the whole path needs to contain the word "anime". It doesn't matter if there's a period before it, or any other character before or after.
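
A quick way to convince yourself is to run the single-episode pattern over a path like the one in the question. This is a Python sketch, so it only approximates Kodi's own regex engine; the show name, the second path and the episode() helper are invented for the example.

```python
import re

# Single-episode pattern from the first post, with "&lt;" unescaped to "<".
SINGLE_EPISODE = re.compile(
    r"(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
    r".+(?=\w)(?<![a-df-z0-9])(\d+)(?<=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])"
)


def episode(path):
    """Hypothetical helper: return the matched episode number, or None."""
    m = SINGLE_EPISODE.search(path)
    return int(m.group(2)) if m else None


# "anime" only has to appear somewhere in the path, so ".anime" is fine.
print(episode("smb://nas/video/.anime/Some Show/Some Show - 05.mkv"))
# No "anime" anywhere in the path, so this pattern leaves the file alone.
print(episode("smb://nas/video/tv/Some Show - 05.mkv"))
```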
#4
This old post might just solve a problem I have with adding a big and long running Anime TV Show.
The context: I would like to add an Anime TV show of which the filenames have several issues:
  • the show has over 100 episodes,
  • the episodes don't match any of the naming conventions.
Running into this thread's quote (referred from Google) made me think: I could indeed try to match a part of the path with my specific TV Show.
(2016-02-11, 07:19)crazygambit Wrote: I realize I'm very late on this and you may not need it anymore, but there's an easy fix to that. If you keep your anime separate from your tv shows, it's trivial to implement. Since the regex looks at the whole path you can just have it match the name of the parent folder your anime is at. I recently posted a set of regexes that do just that for absolute ordering, just look through my post history.
So I started looking through crazygambit's posts and found this one, great!

My question: Looking at your regex something bothered me.
Why do you have "px" at the end of your regex patterns?
#5
Just noting that I updated my own version of the anime regex patterns, partly improved based on crazygambit's setup. However I was also having failure issues with crazygambit's regex, so had to discard it and rebuild from scratch.

Example failure: Anime/[SHiN-gx] Fight Ippatsu! Juuden-chan!! - Special 1 [DVD][720x480 AR h.264 FLAC][v2][FF09021F].mkv

I made the full post over here.
#6
Hi Kinematics,
Thanks for the info! I will have an in-depth look at your regex patterns later on.
At first sight it looks very promising. Especially the part "...filename marked as Special/OVA/OAV/etc goes to season 0,..."

Note that I do not have problems with crazygambit's regex patterns, I only have a question.
To be honest I only use part of crazygambit's advanced settings configuration.
#7
(2017-01-27, 16:38)Abberlin3 Wrote: This old post might just solve a problem I have with adding a big and long running Anime TV Show.
The context: I would like to add an Anime TV show of which the filenames have several issues:
  • the show has over 100 episodes,
  • the episodes don't match any of the naming conventions.
Running into this thread's quote (referred from Google) made me think: I could indeed try to match a part of the path with my specific TV Show.
(2016-02-11, 07:19)crazygambit Wrote: I realize I'm very late on this and you may not need it anymore, but there's an easy fix to that. If you keep your anime separate from your tv shows, it's trivial to implement. Since the regex looks at the whole path you can just have it match the name of the parent folder your anime is at. I recently posted a set of regexes that do just that for absolute ordering, just look through my post history.
So I started looking through crazygambit's posts and found this one, great!

My question: Looking at your regex something bothered me.
Why do you have "px" at the end of your regex patterns?

So it doesn't match a number followed by either p or x in the filename, like 1080p or 1280x720.
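
Here is a small Python sketch of that behaviour, using two made-up filenames with a resolution after the episode number. One observation, offered tentatively: the ranges a-u and w-z in the final character class already include p and x, so the explicit px looks redundant, and the sketch compares the pattern with and without it. Python's re module is only a stand-in for Kodi's regex engine, and the episode() helper is hypothetical.

```python
import re

# Single-episode pattern from the first post, with "&lt;" unescaped to "<".
ORIGINAL = (
    r"(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
    r".+(?=\w)(?<![a-df-z0-9])(\d+)(?<=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])"
)
WITH_PX = re.compile(ORIGINAL)
# Same pattern with the trailing "px" dropped from the last character class.
WITHOUT_PX = re.compile(ORIGINAL.replace("}px]", "}]"))


def episode(pattern, path):
    """Hypothetical helper: return the matched episode number, or None."""
    m = pattern.search(path)
    return int(m.group(2)) if m else None


for name in ("anime/Show - 07 1080p.mkv", "anime/Show - 07 1280x720.mkv"):
    # 1080 and 1280 are rejected because a letter follows them;
    # 720 is rejected because a letter precedes it.
    print(name, episode(WITH_PX, name), episode(WITHOUT_PX, name))
```

Both variants pick episode 7 for these samples, which fits the idea that the letter ranges are doing the work.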

(2017-02-11, 01:22)Kinematics Wrote: Just noting that I updated my own version of the anime regex patterns, partly improved based on crazygambit's setup. However I was also having failure issues with crazygambit's regex, so had to discard it and rebuild from scratch.

Example failure: Anime/[SHiN-gx] Fight Ippatsu! Juuden-chan!! - Special 1 [DVD][720x480 AR h.264 FLAC][v2][FF09021F].mkv

I made the full post over here.

I don't have any filenames with h.264 in the name, only h264, so I missed that. I do have filenames where the additional stuff like codecs and resolutions is not inside brackets, which is why I went with my solution. Unfortunately it's very common for the episode number to come after a period, so I was never gonna catch that. Maybe I will borrow a little from yours and not match anything inside brackets. It's really cool that you were inspired to make your own though. I think this should come standard with Kodi, since the default regex is pretty weak, so it's nice to see more people putting some thought into this.
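
For reference, the failure is easy to reproduce in a Python sketch (an approximation only, since Python's re module is not Kodi's engine; the episode() helper is made up for the demo). The 264 in h.264 comes after a period and is followed by a space, so it passes every check and, being further right than the 1 in "Special 1", it wins.

```python
import re

# Single-episode pattern from the first post, with "&lt;" unescaped to "<".
SINGLE_EPISODE = re.compile(
    r"(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
    r".+(?=\w)(?<![a-df-z0-9])(\d+)(?<=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])"
)


def episode(path):
    """Hypothetical helper: return the matched episode number, or None."""
    m = SINGLE_EPISODE.search(path)
    return int(m.group(2)) if m else None


FAILING = (
    "Anime/[SHiN-gx] Fight Ippatsu! Juuden-chan!! - Special 1 "
    "[DVD][720x480 AR h.264 FLAC][v2][FF09021F].mkv"
)
# The intended episode is 1, but the pattern picks the codec number instead.
print(episode(FAILING))
# Written as h264 (the form in my own files), the codec number is skipped.
print(episode(FAILING.replace("h.264", "h264")))
```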
#8
This regex works for 99% of my stuff, but I'm having issues with some episodes and have no idea how to fix them.

example:
Dragon Ball Super - 020 - A Warning from Jaco! Freeza and his 1000 Soldiers Draw Near [OGG] [0DBDB6A5]

Dragon Ball Super - 028 - The God of Destruction from Universe 6 - His Name is Champa [OGG] [EDC33DD3]

Obviously the problem is that it thinks the last numbers that aren't in brackets are the episode numbers. (1000 and 6, instead of 020 and 028)

How might I fix this?
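
The behaviour described above can be reproduced with a short Python sketch. Several things here are assumptions, not anything confirmed in this thread: the anime/ folder and .mkv extension added to the filenames, the episode() helper, and above all the TITLE_DASH_NUMBER pattern, which is only one possible idea for files named "Title - NNN - Episode title" and has not been tested in Kodi. Python's re module is also not the engine Kodi uses.

```python
import re

# Single-episode pattern from the first post, with "&lt;" unescaped to "<".
SINGLE_EPISODE = re.compile(
    r"(?i)()Anime(?:(?=.*\.mkv$)|(?=.*\.mp4$)|(?=.*\.avi$)|(?=.*\.ogm$))"
    r".+(?=\w)(?<![a-df-z0-9])(\d+)(?<=\d)(?!.*[\\\/])(?![a-uw-z0-9\])}px])"
)

# Hypothetical, narrower pattern: take the first " - NNN - " in the filename.
TITLE_DASH_NUMBER = re.compile(r"(?i)()Anime.*[\\/][^\\/]*? - (\d+) - [^\\/]*$")


def episode(pattern, path):
    """Hypothetical helper: return the matched episode number, or None."""
    m = pattern.search(path)
    return int(m.group(2)) if m else None


FILES = [
    "anime/Dragon Ball Super - 020 - A Warning from Jaco! Freeza and his "
    "1000 Soldiers Draw Near [OGG] [0DBDB6A5].mkv",
    "anime/Dragon Ball Super - 028 - The God of Destruction from Universe 6 "
    "- His Name is Champa [OGG] [EDC33DD3].mkv",
]
for path in FILES:
    # The original pattern picks the last bare number in the episode title.
    print(episode(SINGLE_EPISODE, path), episode(TITLE_DASH_NUMBER, path))
```

If a pattern along these lines were listed before the general single-episode one, it would presumably get the first chance at these files, but that ordering behaviour is something to verify in Kodi itself.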
