Kodi Community Forum
[WIP] AniDB.net Anime Video Scraper - Printable Version


- bambi73 - 2011-04-01

pathw Wrote:I think a shared copy that's updated would be better. But maybe instead of an xml file, we could make a simple website where you give it the anidbid and it gives you the tvdbid, and we let anyone edit it :)? Crowdsourcing and all that.
At first glance it's a nice idea, but I see a few problems:
1/ That XML is not a simple mapping between anidbid and tvdbid; it can contain much more information, so making a web application to edit or add to it will not be an easy task.
2/ You need to manage hosting with a DB, manage rights, etc.
3/ You must convince people to cooperate, and that will be quite hard; they want everything delivered to their home, packed and prepared. For example, until now (~1 year) I have had no offer to help with updating this file, and no one has ever asked about its function.

Personally I have no time and energy for such a project, but if you are willing to work on it I can support it in the future.

pathw Wrote:I noticed though that the fanart section, that looks up alternate names and prequels, does not look up this file?
When a mapping is found it jumps directly to GetFanartDataAPI; no lookup by name is done.

pathw Wrote:I have added the support. You could just merge it in if you want.
Please send me your version of the scraper and I'll take a look. I think I have an idea how you did it, but I don't like naming specials and others like S99E01 instead of C01.

pathw Wrote:hmm but I don't get this. The art on the image is always of different aspect ratios. Why does xbmc scale it so badly?
My guess is that it depends on how skins work with pictures.


- ZERO <ibis> - 2011-04-03

Imagine if someone convinced anidb to integrate fan art and per episode descriptions right on their site, now that would be nice.

Onto what I think may be an interesting request. Is there any way to have the scraper automatically download all available fan art and place it in the extrafanart folder?


- bambi73 - 2011-04-04

ZERO <ibis> Wrote:Imagine if someone convinced anidb to integrate fan art and per episode descriptions right on their site, now that would be nice.
It would be nice, but I'm a bit sceptical personally :)

ZERO <ibis> Wrote:Onto what I think may be an interesting request. Is there any way to have the scraper automatically download all available fan art and place it in the extrafanart folder?
As I already stated somewhere above, the scraper doesn't download any pictures; it only provides information on where to download them. That information is there, but XBMC downloads only the first picture in the list, and you can select a different one later. If you want the kind of functionality you described, you will need some Python script/addon to do it for you.
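
For anyone who wants to try that route, here is a minimal sketch of the script side in plain Python. It is not an XBMC addon and uses no XBMC API. The function name download_extrafanart, the fanartN file naming and the .jpg default are assumptions made only for this illustration, and the list of fanart URLs has to be supplied from somewhere else.

```python
import os
import shutil
import urllib.parse
import urllib.request


def download_extrafanart(fanart_urls, show_dir):
    """Save every URL in fanart_urls under <show_dir>/extrafanart/.

    Hypothetical helper: the naming scheme (fanart1, fanart2, ...) is an
    assumption, not something the scraper or XBMC prescribes.
    """
    target_dir = os.path.join(show_dir, "extrafanart")
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(fanart_urls, start=1):
        # Reuse the extension from the URL path, falling back to .jpg.
        url_path = urllib.parse.urlparse(url).path
        ext = os.path.splitext(url_path)[1] or ".jpg"
        path = os.path.join(target_dir, "fanart%d%s" % (index, ext))
        # Stream the picture to disk instead of holding it in memory.
        with urllib.request.urlopen(url) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)
        saved.append(path)
    return saved
```

The function returns the list of files it wrote, so a caller can check that every picture actually arrived.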


- bambi73 - 2011-04-04

Today I posted a request to publish v1.1.0 in the official repository, so you can expect an update soon (not sure how long it'll take :))

Copy & paste from the change log (details later and/or if someone is interested in them):

1.1.0:
Fixed: Workaround for bug #11377 (causes scraper freezing or wrong parses in specific cases)
Changed: Splitting settings into categories
Changed: Slightly improved Google search
Added: Possibility to specify sources (URLs) for anidb.xml and anime-list.xml files
Added: Possibility to select official title (+language) over main title
Added: Possibility to use personal anime mapping file


Btw, if you guys have translations of the settings into other languages, please send them to me. I'll add them to the next version.
And maybe a "translation" into English would be nice too ;)


- ZERO <ibis> - 2011-04-04

bambi73 Wrote:As I already stated somewhere above, the scraper doesn't download any pictures; it only provides information on where to download them. That information is there, but XBMC downloads only the first picture in the list, and you can select a different one later. If you want the kind of functionality you described, you will need some Python script/addon to do it for you.

On this note, and noting that others have commented about cool things the scraper could do but cannot because it is not a full Python plugin, I suggest the following:

How about a feature support plugin? Basically a Python plugin that works with this scraper to offer additional features that would not be possible otherwise. For example:
The ability to perform the automatic downloading of fan art requested above
The ability to auto-generate theme.mp3 files by also looking up the series on gendou.com and automatically downloading the full version of the OP1
The ability to have an anime version of TV Show Next Aired that uses data from anidb to show the next episode release date (I have TV Show Next Aired disabled b/c it is borked as far as anime goes, thinking it is some crazy crap)
Etc.

I think there are all sorts of features that could be made possible with an expansion plugin that works hand in hand with this scraper in order to provide us anime lovers the same interface experiences offered to users of American content.


- bambi73 - 2011-04-04

ZERO <ibis> Wrote:On this note, and noting that others have commented about cool things the scraper could do but cannot because it is not a full Python plugin, I suggest the following:

How about a feature support plugin? Basically a Python plugin that works with this scraper to offer additional features that would not be possible otherwise. For example:
The ability to perform the automatic downloading of fan art requested above
The ability to auto-generate theme.mp3 files by also looking up the series on gendou.com and automatically downloading the full version of the OP1
The ability to have an anime version of TV Show Next Aired that uses data from anidb to show the next episode release date (I have TV Show Next Aired disabled b/c it is borked as far as anime goes, thinking it is some crazy crap)
Etc.

I think there are all sorts of features that could be made possible with an expansion plugin that works hand in hand with this scraper in order to provide us anime lovers the same interface experiences offered to users of American content.
It's hard to argue with you, because everything you wrote would be nice to have, but ... you need some support for scraper and Python addon cooperation on the XBMC side, and there have been no signs that any regular developer is interested in this topic, which is bad. Of course you can always program it yourself and post a patch :)
Personally I'm satisfied with how the AniDB scraper works for me now, so don't expect any activity from my side. I simply have no time and energy for such a large project, even though it looks like a fine challenge :)


- ZERO <ibis> - 2011-04-06

Oh, if I could code XBMC addons in SourcePawn or PHP I would have released it already. Unfortunately Python gives me brain cancer, but maybe I will pick it up eventually. Although I wonder if it may be possible to attract a programmer with some $$ b/c I would be willing to pay to get support for those features :D


- NoValidTitle - 2011-04-08

Ok, I'm sorry to be a pest but would someone mind giving me a hand? I've spent hours reading and tinkering but I don't fully understand all the scraper stuff apparently. I'm not even a huge anime watcher to be honest, just a few series. I'm running XBMC Live(installed) and I'm trying to use the anidb scraper to get my One Piece episodes into my library. No matter what I do the closest I can seem to get is it scans my folder and adds every episode as a special and not an actual episode. My episodes are named like this: One Piece - 001 - I'm Luffy! The boy who will become the Pirate King! [K-F&AKUPX].avi which I used WebAOM to rename.

Where do I even start at fixing this?


- bambi73 - 2011-04-08

NoValidTitle Wrote:Ok, I'm sorry to be a pest but would someone mind giving me a hand? I've spent hours reading and tinkering but I don't fully understand all the scraper stuff apparently. I'm not even a huge anime watcher to be honest, just a few series. I'm running XBMC Live(installed) and I'm trying to use the anidb scraper to get my One Piece episodes into my library. No matter what I do the closest I can seem to get is it scans my folder and adds every episode as a special and not an actual episode. My episodes are named like this: One Piece - 001 - I'm Luffy! The boy who will become the Pirate King! [K-F&AKUPX].avi which I used WebAOM to rename.

Where do I even start at fixing this?

Your problem lies in the parsing of episode and season numbers from file names, which is done by XBMC itself even before the scraper is started. What you are looking for is the <tvshowmatching> setting in Advancedsettings.xml (http://wiki.xbmc.org/index.php?title=Advancedsettings.xml). I guess you haven't touched this setting and have the defaults there, so in this case your file name is caught by the 4th default regexp

Code:
<regexp>[\._ \-]([0-9]+)([0-9][0-9])([\._ \-][^\\/]*)</regexp>  <!-- foo.103 -->
and the result is SeasonNr='0' (first capture group) and EpisodeNr='01' (second capture group), which is wrong because SeasonNr=0 means specials. You must add something like
Code:
<tvshowmatching action="prepend">
  <regexp>(?i)[/\\].*?()\s-\s(\d{2,3})([^/\\]*)</regexp>
</tvshowmatching>
It expects " - " before EpisodeNr (two or three digits). An empty first capture group means SeasonNr=1.
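
To see why the default pattern files this episode under specials, here is a small sketch in plain Python (not XBMC code) that runs both patterns from this post against the file name. The folder path and the helper name parse_episode are made up for the illustration, Python's re module is only a stand-in for the regex engine XBMC uses, and treating an empty season group as season 1 simply mirrors the rule described above.

```python
import re

# Both patterns are copied from the post above.
DEFAULT_4TH = r"[\._ \-]([0-9]+)([0-9][0-9])([\._ \-][^\\/]*)"
CUSTOM = r"(?i)[/\\].*?()\s-\s(\d{2,3})([^/\\]*)"

# Hypothetical library path; only the file name comes from the thread.
PATH = ("/anime/One Piece/One Piece - 001 - I'm Luffy! The boy who will "
        "become the Pirate King! [K-F&AKUPX].avi")


def parse_episode(pattern, path):
    """Return (season, episode) from the first match, or None if no match."""
    match = re.search(pattern, path)
    if match is None:
        return None
    season_text, episode_text = match.group(1), match.group(2)
    # Assumption taken from the post: an empty season group means season 1.
    season = int(season_text) if season_text else 1
    return season, int(episode_text)


print(parse_episode(DEFAULT_4TH, PATH))  # (0, 1): season 0, i.e. a special
print(parse_episode(CUSTOM, PATH))       # (1, 1): season 1, episode 1
```

With the default pattern the greedy first group gives back digits until the second group can take "01", which leaves "0" as the season. In this sketch the custom pattern also needs a path separator somewhere before the file name, so it is tried against the full path rather than the bare file name.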


- NoValidTitle - 2011-04-08

bambi73 Wrote:Your problem lies in the parsing of episode and season numbers from file names, which is done by XBMC itself even before the scraper is started. What you are looking for is the <tvshowmatching> setting in Advancedsettings.xml (http://wiki.xbmc.org/index.php?title=Advancedsettings.xml). I guess you haven't touched this setting and have the defaults there, so in this case your file name is caught by the 4th default regexp

Code:
<regexp>[\._ \-]([0-9]+)([0-9][0-9])([\._ \-][^\\/]*)</regexp>  <!-- foo.103 -->
and the result is SeasonNr='0' (first capture group) and EpisodeNr='01' (second capture group), which is wrong because SeasonNr=0 means specials. You must add something like
Code:
<tvshowmatching action="prepend">
  <regexp>(?i)[/\\].*?()\s-\s(\d{2,3})([^/\\]*)</regexp>
</tvshowmatching>
It expects " - " before EpisodeNr (two or three digits). An empty first capture group means SeasonNr=1.

I'll give that a try! I appreciate your time.


- ZERO <ibis> - 2011-04-10

Found what I believe to be a small bug. It appears that studios are not scraped correctly. I assume that the studio is supposed to come from anidb.net and is listed under "Animation Work". I noticed that not all shows scrape this item correctly, instead leaving it blank.

For example Ore no Imouto ga Konna ni Kawaii Wake ga Nai does not scrape the studio even though under the staff section it says: "Animation Work (アニメーション制作) AIC Build"

I also noticed that when I ran the scraper it actually made my genre list more restricted than before. Where it had read Comedy, Seinen it changed to just Seinen when according to anidb it should actually say Comedy, Novel, Seinen.

Also for some shows like 30-sai no Hoken Taiiku the scraper fails to find the studio or genres even though they are all right there on anidb: http://anidb.net/perl-bin/animedb.pl?show=anime&aid=8106


- bambi73 - 2011-04-10

ZERO <ibis> Wrote:Found what I believe to be a small bug. It appears that studios are not scraped correctly. I assume that the studio is supposed to come from anidb.net and is listed under "Animation Work". I noticed that not all shows scrape this item correctly, instead leaving it blank.

For example Ore no Imouto ga Konna ni Kawaii Wake ga Nai does not scrape the studio even though under the staff section it says: "Animation Work (アニメーション制作) AIC Build"
Unfortunately it's not a problem of the scraper but of the data provided by AniDB. You can check it yourself: there is no "Animation Work" type creator. Maybe they limit the creator list to 15 records and "Animation Work" didn't make it under this limit.

ZERO <ibis> Wrote:I also noticed that when I ran the scraper it actually made my genre list more restricted than before. Where it had read Comedy, Seinen it changed to just Seinen when according to anidb it should actually say Comedy, Novel, Seinen.
The scraper picks up only genres with weight 500 or 600, which for Ore no Imouto are Seinen, Novel, Earth, Asia, Japan, Present. Additionally it filters out those which don't qualify (IMHO ;)) as genres, in this case everything except Seinen. Maybe I can leave Novel in too, because when I wrote this part of the scraper I linked Novel to Visual Novel (Eroge) in my mind and filtered it out. But in this case it means Light Novel, which is "legal" for me :).
As for Comedy, it now has weight 400, so it's not in the genre list. The reason it was there in the past is that Comedy originally had weight 600 but was lowered to 400 in November. You can check the CREQ history for this record on AniDB.
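
The two steps described above (weight check first, then the "not really a genre" filter) can be sketched roughly like this in plain Python. This is not the scraper's actual code: the names pick_genres and NOT_GENRES are made up, the exclusion list contains only the names mentioned in this post, and the 600 values are placeholders because only "500 or 600" is known for those categories.

```python
# Weights the scraper accepts, as stated in the post.
ACCEPTED_WEIGHTS = (500, 600)

# Categories dropped as "not really genres". Assumption: this covers only
# the names mentioned in the post; the scraper's full list is not shown.
NOT_GENRES = {"Novel", "Earth", "Asia", "Japan", "Present"}


def pick_genres(category_weights):
    """Keep categories with an accepted weight that are not excluded."""
    return [
        name
        for name, weight in category_weights.items()
        if weight in ACCEPTED_WEIGHTS and name not in NOT_GENRES
    ]


# Illustrative input for Ore no Imouto. 600 is a placeholder for "500 or
# 600"; Comedy's 400 is the value given in the post.
ore_no_imouto = {
    "Seinen": 600, "Novel": 600, "Earth": 600, "Asia": 600,
    "Japan": 600, "Present": 600, "Comedy": 400,
}

print(pick_genres(ore_no_imouto))  # ['Seinen']
```

Comedy falls out at the weight check, the other five at the exclusion list, which leaves only Seinen.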

ZERO <ibis> Wrote:Also for some shows like 30-sai no Hoken Taiiku the scraper fails to find the studio or genres even though they are all right there on anidb: http://anidb.net/perl-bin/animedb.pl?show=anime&aid=8106
For this show I see studio Gathering in my DB, and there are only two genres, each with a weight of only 400, on AniDB (you can check it here - half star = 100 weight points)


- ZERO <ibis> - 2011-04-10

OK, I will try to look into why the studio is not being reported in the API output. However, your comments on the genre section have led me to a request:

Can we have the ability to adjust the weights and/or use a fallback option? For example, you can sort by weights, but if fewer than X results occur, then take the Y highest. Conversely, you could have another option to prevent too many results by limiting the output to Z.

My other request involves the filtering. As you stated, there are some results that are filtered out in order to provide higher quality output. However, some users may want more or less restrictive filters. For example, some users may still want to filter out Novel while people like me want to be able to allow Novel and Visual Novel as well. If there was an option that let us edit this list and/or add to it, that would be awesome! It could be implemented by listing the items separated by commas, so the program knows how to break them up.
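
A rough sketch of how the two requests could behave together, in plain Python. Nothing here exists in the scraper: select_genres, parse_list and all parameter names are hypothetical, the defaults are arbitrary, and the weights in the example are made up.

```python
def parse_list(text):
    """Split a comma-separated setting such as 'Earth, Asia' into a set."""
    return {item.strip() for item in text.split(",") if item.strip()}


def select_genres(weights, excluded="", min_weight=500,
                  min_results=2, fallback_count=3, max_results=5):
    """Pick genres by weight, with a fallback and an upper limit.

    min_results, fallback_count and max_results play the roles of
    X, Y and Z from the request above.
    """
    blocked = parse_list(excluded)
    allowed = {n: w for n, w in weights.items() if n not in blocked}
    # Highest weight first; ties are broken alphabetically.
    ranked = sorted(allowed, key=lambda n: (-allowed[n], n))
    picked = [n for n in ranked if allowed[n] >= min_weight]
    if len(picked) < min_results:
        # Fallback: too few results, so take the Y highest instead.
        picked = ranked[:fallback_count]
    return picked[:max_results]


# Made-up weights, only to show the fallback kicking in.
example = {"Seinen": 600, "Comedy": 400, "Novel": 500, "Japan": 500}
print(select_genres(example, excluded="Japan, Novel"))  # ['Seinen', 'Comedy']
```

With Japan and Novel excluded only Seinen passes the weight check, which is fewer than min_results, so the fallback returns the highest-ranked remaining entries.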

BTW, I have made a thread reporting the issue here: http://anidb.net/perl-bin/animedb.pl?show=cmt&id=36248#c199729

bambi73 Wrote:For this show I see studio Gathering in my DB, and there are only two genres, each with a weight of only 400, on AniDB (you can check it here - half star = 100 weight points)

I have checked this in the API output and see that it is listed; however, every time I refresh, it will not load the studio. I even deleted it from my xbmc database and did it again, and it still comes up as "Not available" for studio. Perhaps there is something tricky going on where the scraper is not writing the studio if it did not find anything to use for genre?


- Armitage - 2011-04-15

Thanks for the update of the scraper! It really works well, good job!


- bambi73 - 2011-04-19

1.2.0:
Changed: Replace "`" with "'" in all significant texts
Changed: Configuration for genres
Added: Loading characters + actors/seiyus

Should be available soon from XBMC repo