Developing an Amazon Movie Scraper
#31
jelockwood,

I am looking forward to testing your scraper. I have been following this thread since October. I have found that many DVDs on IMDB have no cover art, whereas Amazon has it, and IMDB is also missing DVD information. I appreciate all your efforts.
Advice on Hardware
PC-LUXA2 CPU-AMD Phenom II X6 1100T RAM-12GB Video Card-AMD Radeon HD 6800
OS-Win7 32bit - Kodi - 14.0 Helix SKIN - Aeon MQ 5 Keyboard - DiNovo Mini

-Semper Fi
gyrene2083
#32
gyrene2083 Wrote:jelockwood,

I am looking forward to testing your scraper. I have been following this thread since October. I have found that many DVDs on IMDB have no cover art, whereas Amazon has it, and IMDB is also missing DVD information. I appreciate all your efforts.

If the only problem is cover art, then you could use the IMDB scraper, and manually select a local picture, or put a picture in the directory with a .tbn file extension. I wrote the scrapers because some titles are not listed at all on IMDB and I still wanted to include them in the XBMC library.

The download link is now live so you can give it a go.
#33
both are now sitting in svn (r16563). cheers again!
#34
I just tried the Amazon scrapers (which I mostly wrote) for the first time in several weeks, and damn, they no longer work for me.

Currently neither is finding any results (so it is not simply an issue of scraping info from a selected result). This was the original problem that I had (constructing a correct query in the scraper, and then getting/showing the list of results). This was originally solved by C-Quel generously providing his original Amazon scraper effort which I then finished off.

Could anyone else confirm whether the Amazon scrapers (either US or UK) are still working for them and, if so, which DVD title they used successfully?

If on the other hand, other users confirm it is broken, would anyone be able to assist in diagnosing it?

What held me up last time is that, without a LAN packet sniffer, I could not see what request the scraper sent or what result came back from Amazon, so I could not tell how far it got. Once I got past this and moved on to scraping the film info, that part could easily be tested by seeing how many fields successfully returned results.
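One way to see the request and the response without a packet sniffer is to reproduce the fetch outside XBMC and save both to disk. The following is only a sketch under stated assumptions: the search URL is a placeholder (the real query format is not given in this thread), and `trace_fetch`, `http_get`, and `fake_fetch` are hypothetical helper names, not part of XBMC.

```python
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import quote_plus


def http_get(url):
    """Fetch a URL over the network and return the body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def trace_fetch(url, dump_dir, fetch=http_get):
    """Fetch url and save the request URL and raw response for inspection."""
    dump_dir = Path(dump_dir)
    dump_dir.mkdir(parents=True, exist_ok=True)
    body = fetch(url)
    (dump_dir / "request.txt").write_text(url + "\n", encoding="utf-8")
    (dump_dir / "response.html").write_text(body, encoding="utf-8")
    return body


def fake_fetch(url):
    """Stand-in for the network so this sketch runs offline."""
    return "<html><body>results for " + url + "</body></html>"


# Placeholder URL (assumption): substitute the query the scraper really builds.
search_url = "http://example.invalid/search?q=" + quote_plus("Blade Runner")
dump_dir = tempfile.mkdtemp()
body = trace_fetch(search_url, dump_dir, fetch=fake_fetch)
print(len(body), "characters saved to", dump_dir)
```

Dropping the `fetch=fake_fetch` argument makes the call use the real network, and the saved `response.html` can then be checked by hand against the search-results expression.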
#35
Try this...

Change Get SearchResults from

imageColumn"[^:]*a href="([^"]*)"[^:]*[^>]*alt="([^"]*)"

to

productTitle"><a href="([^"]*)"> ([^<]*)</a>

or, properly formatted (XML-escaped):

productTitle&quot;&gt;&lt;a href=&quot;([^&quot;]*)&quot;&gt; ([^&lt;]*)&lt;/a&gt;

It might not be perfect, as I simply glanced at Amazon with no tools to hand.
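The suggested expression can be tried outside XBMC before editing the scraper. The sketch below uses Python's `re` module, which is not the regex engine XBMC uses, and the sample markup is invented for illustration (it is not real Amazon HTML). It also shows that XML-escaping the plain pattern yields the escaped form quoted above.

```python
import re
from xml.sax.saxutils import escape

# The replacement expression from this post, in its unescaped form.
NEW_PATTERN = r'productTitle"><a href="([^"]*)"> ([^<]*)</a>'

# Hypothetical markup, made up for this example only.
SAMPLE_HTML = (
    '<div class="productTitle"><a href="http://example.invalid/dp/1"> First Film</a></div>\n'
    '<div class="productTitle"><a href="http://example.invalid/dp/2"> Second Film</a></div>\n'
)

# Group 1 captures the link, group 2 captures the title text.
results = re.findall(NEW_PATTERN, SAMPLE_HTML)
for url, title in results:
    print(title, "->", url)

# Escape &, <, > and double quotes so the pattern can sit inside an XML file.
escaped = escape(NEW_PATTERN, {'"': "&quot;"})
print(escaped)
```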
#36
I have already thanked C-Quel (again) via a private message, but this fix does so far look successful. I will do some more testing and then put updated versions on my download page and issue a request for them to be included as updated and fixed versions in XBMC.

Many thanks again to C-Quel and everyone else who has helped out in the past.

C-Quel Wrote:Try this...

Change Get SearchResults from

imageColumn"[^:]*a href="([^"]*)"[^:]*[^>]*alt="([^"]*)"

to

productTitle"><a href="([^"]*)"> ([^<]*)</a>

or, properly formatted (XML-escaped):

productTitle&quot;&gt;&lt;a href=&quot;([^&quot;]*)&quot;&gt; ([^&lt;]*)&lt;/a&gt;

It might not be perfect, as I simply glanced at Amazon with no tools to hand.
#37
jelockwood Wrote:I have already thanked C-Quel (again) via a private message, but this fix does so far look successful. I will do some more testing and then put updated versions on my download page and issue a request for them to be included as updated and fixed versions in XBMC.

Many thanks again to C-Quel and everyone else who has helped out in the past.

Please use our tracker instead and attach a unified diff against the previous scraper.
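For anyone unsure what a unified diff looks like, here is a minimal sketch using Python's standard `difflib`. The file names and the surrounding `header`/`footer` lines are placeholders; only the two expressions come from this thread. In practice the two texts would be read from the old and new scraper files.

```python
import difflib

# Placeholder file contents: one changed line between two unchanged ones.
old_text = (
    "header\n"
    'imageColumn"[^:]*a href="([^"]*)"[^:]*[^>]*alt="([^"]*)"\n'
    "footer\n"
)
new_text = (
    "header\n"
    'productTitle"><a href="([^"]*)"> ([^<]*)</a>\n'
    "footer\n"
)

# unified_diff yields the patch line by line; the file names are hypothetical.
diff_lines = difflib.unified_diff(
    old_text.splitlines(keepends=True),
    new_text.splitlines(keepends=True),
    fromfile="scraper.xml.orig",
    tofile="scraper.xml",
)
patch = "".join(diff_lines)
print(patch)
```

Lines starting with `-` are removed from the old file, lines starting with `+` are added in the new one, and unprefixed lines are context.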
Always read the online manual (wiki), FAQ (wiki) and search the forum before posting.
Do not PM or e-mail Team-Kodi members directly asking for support. Read/follow the forum rules (wiki).
Please read the pages on troubleshooting (wiki) and bug reporting (wiki) before reporting issues.
#38
Amazon does not give permission to get info via HTTP. They have a web service that is legal to use, but you have to delete the info after 3 months... hehe, this means that movies should automatically start to disappear from the library if they were scanned via an Amazon web service scraper ;)
#39
both scrapers disabled in svn
#40
ultrabrutal Wrote:Amazon does not give permission to get info via HTTP. They have a web service that is legal to use, but you have to delete the info after 3 months... hehe, this means that movies should automatically start to disappear from the library if they were scanned via an Amazon web service scraper ;)

you gonna ruin everything
#41
Ok, I did some more testing of this updated version and found a couple more issues.

1. There was a problem processing the DVD title for some entries on Amazon.co.uk, because some titles are formatted differently from others. I believe I have modified the scraper to cope with this better.

2. I took the opportunity to add support for scraping the DVD "Writers" information if available on the Amazon pages. This applies to both the Amazon.com and Amazon.co.uk versions.

I have put the updated versions at this URL for those keen to get them before they appear in the next XBMC release.

http://homepage.mac.com/jelockwood/scrapers.html
#42
Thanks, however please (always) create a new ticket on trac for each new scraper update:
http://trac.xbmc.org (unified diff if possible, or better yet both diff and the full file).

Thanks again :D
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
#43
Gamester17 Wrote:Thanks, however please (always) create a new ticket on trac for each new scraper update:
http://trac.xbmc.org (unified diff if possible, or better yet both diff and the full file).

Thanks again :D

I have reopened and updated the original Trac ticket I used to submit the first version. The purpose of my previous message was to let people who have been following this thread know what was happening.
#44
Please create a new trac ticket for updates if the old ticket has been closed, instead of reopening the old ticket (it is OK to post updates to the old ticket only if it has never been closed). This process makes ticket management easier.

Thanks again!
#45
Neither does IMDB.

ultrabrutal Wrote:Amazon does not give permission to get info via HTTP. They have a web service that is legal to use, but you have to delete the info after 3 months... hehe, this means that movies should automatically start to disappear from the library if they were scanned via an Amazon web service scraper ;)