2010-02-03, 10:59
I have been working on a scraper for Anime News Network (ANN). Initially I was going to use Google for searching, since ANN already uses it for its own search. However, I learned from this Bing thread that Google does not allow scraping, so I am using Bing instead, with the AppID from that same thread. I am not sure what ANN's policy on scraping is, but it seems they don't mind (judging from this thread from 5 years ago).
TV Shows: v0.46 download (xml+jpg)
Movies: v0.12 download (xml+jpg)
Settings for TV Shows scraper:
Enable All Language Casts
Retrieve voice actors for other languages in addition to Japanese.
Enable Unlisted Specials / 1 Episode OVA Workaround
Allow as many special episodes (season 0) as there are normal season 1 episodes. So if the series has 26 normal episodes, you can also include up to 26 specials. These episodes will not have a title; they will be named "Special Episode" with "Special" as the air date. This workaround also allows you to include an OVA with a single episode (e.g. Hoshi no Koe), for which ANN does not provide an episode listing. Just name it 0x01.
Enable TVDB Fanart
Retrieve fanart from TVDB using the main title from ANN. Matches with the same premiere date as the one listed on ANN are preferred.
Include Alternative Titles in Fanart Search
In addition to the main title, also search with all the alternative titles listed on ANN.
Enable TVDB Banner (With ANN Thumbnail Fallback)
Get banners from TVDB using the main title from ANN. If none is found, the ANN thumbnail is used instead.
Enable TVDB Poster (With ANN Thumbnail Fallback)
Same as the banner setting, but for posters.
Enable TVDB Episode Details (Using Episode Title Matching)
Retrieve episode overviews and other details from TVDB. Matching is done by comparing episode titles (rather than episode numbers).
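To illustrate the title matching idea, here is a rough sketch in Python. It is not the scraper's actual code (the scraper itself is xml); the function names, the dictionary fields and the normalization step are all assumptions made for the example.

```python
import re

def normalize(title):
    """Lowercase and collapse punctuation so small formatting differences do not block a match."""
    return re.sub(r"[\W_]+", " ", title.lower()).strip()

def match_by_title(ann_episodes, tvdb_episodes):
    """Map each ANN episode number to the TVDB entry with the same normalized title.

    ann_episodes: list of (number, title) pairs (hypothetical shape).
    tvdb_episodes: list of dicts with "title" and "overview" keys (hypothetical shape).
    """
    tvdb_by_title = {normalize(ep["title"]): ep for ep in tvdb_episodes}
    matches = {}
    for number, title in ann_episodes:
        hit = tvdb_by_title.get(normalize(title))
        if hit is not None:
            matches[number] = hit
    return matches

# Made-up data: the TVDB list is in a different order and formatted slightly differently.
ann = [(1, "The Beginning!"), (2, "A New Friend")]
tvdb = [
    {"title": "A new friend", "overview": "Overview B"},
    {"title": "The Beginning", "overview": "Overview A"},
]
result = match_by_title(ann, tvdb)
```

Matching on titles means the details still line up when the two sites number the episodes differently, but an episode whose title differs between ANN and TVDB gets no details at all.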
The Movies scraper has some of the same settings, using TMDB instead of TVDB, but TMDB's search doesn't work as well, so the fanart search will fail more often.
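For the fanart settings above, the "same premiere date is preferred" rule could look roughly like this in Python. Again, this is only a sketch under assumptions: the candidate list, the "first_aired" field and the fallback to the first search result are made up for illustration and are not taken from the scraper or from the TVDB API.

```python
from datetime import date

def pick_fanart_source(candidates, ann_premiere):
    """Prefer the candidate whose premiere date equals ANN's; otherwise fall back to the first one."""
    for candidate in candidates:
        if candidate["first_aired"] == ann_premiere:
            return candidate
    # Assumed fallback: take the first search result, or None if there were no results.
    return candidates[0] if candidates else None

# Made-up search results for one title.
candidates = [
    {"name": "Example Show (2005)", "first_aired": date(2005, 4, 1)},
    {"name": "Example Show", "first_aired": date(1998, 10, 3)},
]
chosen = pick_fanart_source(candidates, date(1998, 10, 3))
```

With alternative titles enabled, the same check would simply run over the combined results of all title searches.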
UPDATES:
2010/03/26:
TV: Adapted scraper to ANN's new html.
TV: Fixed a bug where the scraper tried to get fanart regardless of the setting.
MOVIE: Fixed a small bug introduced in v0.11.
2010/03/17:
Recovering the information that is missing from the old forum backup.
-----------
2010/02/08 - 2010/03/16:
Bunch of changes during this period.
-----------
2010/02/08:
Changed method for ANN thumbnail fallback (to fix a possible bug)
Changed fanart code to include fanart from all alternative names, instead of just the first one that has fanart
Fixed some exception cases in the voice actor scraping
2010/02/04:
Updated scraper to work with ANN's new html for the casts section