I don't think I'm really qualified to defend the XBMC scraper format. However, every other scraper format I've seen seems to be limited by the need for programming skills, or, if not that, by the type of data it can gather: provisioned for only one type, with no desire for expansion and no possibility of drop-in usability... Everyone's so proprietary and defensive about their formats. Personally, I started making this because I thought it'd be a good way to add an expandable, non-proprietary info scraper to media manager apps (like the one I'm creating myself) that requires the user only to know regular expressions, which one can learn in the span of an afternoon, and how to put together an XML file.
There's nothing out there at the moment that has that kind of flexibility. Hell, I'm even adding to the already available library of things that this format can scrape. If they don't want flexibility and want to stick to what they know, that's fine with me. If no one else cares to use XBMC's scraper format, I still will, because I see the potential for expansion. I can offer what I have as a standard, but I can't force it on them.
And honestly, if I did want to go to bat for XBMC's scraper format: pulling info from the filesystem would be as easy as writing a scraper to do so, if I'm understanding the format properly. As for running their queries or whatever it is they wanted to run, at least ScraperXML returns XML-formatted info, against which they could run whatever kind of queries they wanted. They just want to stick with their format, and they are going to be stubborn about it.
Even in the case of MeediOS, they just want something to drop in that they don't have to manage. One person said clear as day that they really didn't want to have anything to do with managing the scrapers. So I see a little difficulty in bringing all the open source HTPCs under the same roof as far as information gathering goes, but that isn't going to stop my quest to keep expanding XBMC's scraper capabilities through ScraperXML while maintaining compatibility with XBMC's current scrapers...
Maybe when I'm done with XBMC's scraper compatibility, I'll investigate their format and then add compatibility with it. Maybe the way to bring everyone under the same roof is to support all their proprietary systems in one library and give the user a choice of how to update their information from online sources.
Quote: 1. The method of outputting results with the XBMC scripting engine is fairly cryptic. The script writer has to basically construct an XML document for the output. And what makes this even worse is this construction is embedded in an existing XML document, which means all special characters must be escaped. This dramatically complicates things, reducing the maintainability of existing scripts and making new scripts much more difficult to write.
XBMC's scraper format requires less of a learning curve than programming. If necessary, I can provide a program I've written that builds a RegExp block (with its expression, and the option to add all the information for that block). All you have to do is write your info without worrying about escaping characters, and it spits out the RegExp. The options for both the RegExp block and the expression can be set by clicking their corresponding controls. It's what was originally going to be the scraper editor... I could probably even add predefined regular expressions with descriptions to make it easier.
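To give a rough idea of what "not worrying about escaping" can look like, here is a minimal Python sketch, not the actual editor. The function name `build_regexp_block` is made up for illustration, and the `input`, `output`, and `dest` attribute names and their default values are my assumption of how a RegExp block is laid out. The point is only that an XML library does the escaping for you when it serializes the block.

```python
import xml.etree.ElementTree as ET


def build_regexp_block(expression, output, input_buffer="$$1", dest="3"):
    """Build a RegExp block as a string (hypothetical helper).

    The attribute names and defaults are assumptions about the scraper
    layout. The caller passes plain, unescaped text; ElementTree escapes
    <, > and & when it serializes the element.
    """
    block = ET.Element(
        "RegExp",
        {"input": input_buffer, "output": output, "dest": dest},
    )
    expr = ET.SubElement(block, "expression")
    expr.text = expression
    return ET.tostring(block, encoding="unicode")


# Write the pattern and the output template exactly as you would think of them.
block = build_regexp_block(
    expression=r"<h1>([^<]+)</h1>",
    output=r"<title>\1</title>",
)
print(block)
```

Reading the block back with any XML parser gives the original, unescaped pattern and output template, so the escaping only ever exists in the file on disk.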
Quote: 2. The output of the search function is only a text based string and a url for the details page for the movie, tv show, etc. The string is not even consistent, sometimes it is title, sometimes it is title and year, sometimes it is official title, english title, then year. It's just not reliable. This would create difficulties with the auto approval system in Moving Pictures.
Not true, the output of the search function is always a URL (or am I wrong?)... The output of each function is consistent. I think his concern, like mine, is what is put into each function, and how to make a decision on which result is the best match to the input. I can see how that would be a valid concern. But the output of each function is consistent:
CreateSearchUrl: <url>whatever-page-link</url> (I know some output without the <url> tags, but I think this should be standard for this function, and I have made it standard in all the new-type scrapers I'm working on)
GetSearchResults: <results><entity><title/><url/><id/><what/><ever/><else/><you/><want/></entity>........</results>
GetDetails: <details><all/><the/><details/><you/><could/><possibly/><gather/></details>
I don't know why they think it would be hard to auto-accept a search result based on its <title> or <id> or <url> or <aired> or whatever else.
The information is returned as XML, which means it's queryable, even sortable and comparable.
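As a rough illustration of the sortable and comparable part, here is a small Python sketch that ranks GetSearchResults-style output against the title you were looking for and auto-accepts the top hit only if it is close enough. The sample data, the function names, and the 0.9 threshold are all made up for the example, and the fuzzy comparison from `difflib` is just one possible way to score a match. It is not how Moving Pictures or XBMC actually decides.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Made-up search results in the <results><entity>...</entity></results> shape.
SAMPLE_RESULTS = """
<results>
  <entity><title>The Matrix Reloaded</title><url>http://example.com/2</url><id>2</id></entity>
  <entity><title>The Matrix</title><url>http://example.com/1</url><id>1</id></entity>
  <entity><title>Matrix Revolutions</title><url>http://example.com/3</url><id>3</id></entity>
</results>
"""


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_results(results_xml, wanted_title):
    """Return (score, title, url, id) tuples, best match first."""
    root = ET.fromstring(results_xml)
    scored = []
    for entity in root.findall("entity"):
        title = entity.findtext("title", default="")
        url = entity.findtext("url", default="")
        ident = entity.findtext("id", default="")
        scored.append((similarity(title, wanted_title), title, url, ident))
    return sorted(scored, key=lambda row: row[0], reverse=True)


def auto_accept(results_xml, wanted_title, threshold=0.9):
    """Return the best result if it clears the (arbitrary) threshold, else None."""
    ranked = rank_results(results_xml, wanted_title)
    if ranked and ranked[0][0] >= threshold:
        return ranked[0]
    return None


best = auto_accept(SAMPLE_RESULTS, "the matrix")
print(best)
```

The same ranking could just as easily use <id> or <aired> when the scraper returns them, since every field is an ordinary element you can look up by name.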
Quote: 3. The way the scraper engine is written, it's just not that extensible. What if someone wanted to add XML parsing tools to be provided to the scraper? What if someone wants to execute XSLT queries on a web page found on the internet? Or what about pulling details from the filesystem rather than a URL? These are not features I particularly care about, but my point is it would not be easy to add these features with the way the XBMC stuff is coded. The core of the scraper engine would have to be modified and this would bring risk to existing functionality, because everything as it is, is lumped together in a big class / file.
Valid arguments, I'd say, but then you can't possibly program to allow EVERYTHING to be done, or you run the risk of making it much too complicated for the COMMON user to work with. And let's face it... in the world of HTPC the user is a COMMON user. There are more non-programmers using these apps than there are programmers, and the majority of people, from my point of view, want a smaller learning curve for their HTPC, so they spend less time getting it to do what they want and more time enjoying the fact that it does what they want.