What's new in Gotham for Scrapers
Kodi Community Forum (https://forum.kodi.tv), Forum: Scrapers (https://forum.kodi.tv/forumdisplay.php?fid=60)
Thread: What's new in Gotham for Scrapers (/showthread.php?tid=180262)
What's new in Gotham for Scrapers - Karlson2k - 2013-12-12

Updated 24.03.2014

There are some changes to scraper processing in the upcoming Gotham release. I'll summarize the changes in this thread.
Code:
<?xml version="1.0" encoding="UTF-8"?>

2 means that you don't need any workarounds or hacks to correctly process national (non-US-ASCII) characters.

For 4, a new expression attribute "utf8" was introduced, which can be "yes", "no" or "auto" ("auto" is the default value). Example of use:

Code:
<RegExp input="$$2" output="<details>\1</details>" dest="5">

In "auto" mode the regexp pattern is checked for non-US-ASCII characters, Unicode properties or character codes greater than 255 (like "\x{2000}"), and if any are found, UTF-8 mode is enabled. In non-UTF-8 mode everything is processed as ASCII strings and Unicode properties are not available.

RE: What's new in Gotham for Scrapers - Karlson2k - 2013-12-13

Forgot to say: if you have any questions, feel free to ask them here.

RE: What's new in Gotham for Scrapers - hoopsdavis - 2013-12-16

As for the issue with the music library and scrapers: currently I have issues with artist names, where wrong names are listed for many artists. (What I've noticed is that if I have an artist "Duke Ellington" with 5 albums, and the last album listed is by Duke Ellington and Ella Fitzgerald, the next artist listed will be labelled "Duke Ellington and Ella Fitzgerald" regardless of who it is. Is this an issue others are seeing, and will it need to be addressed in the scraper tool or within the XBMC v13 build?)

Looking at the image below, you'll notice that the artist "Al Jarreau" has an album with tracks featuring Kathleen Battle. The next artist to the right is Alex Bugnon, but he's labelled as "Al Jarreau/Kathleen Battle". I'm seeing this with quite a few artists.

RE: What's new in Gotham for Scrapers - Karlson2k - 2013-12-17

(2013-12-16, 23:36)hoopsdavis Wrote: As far as the issue with the music library and scrapers, Currently I have issues with Artist names, wrong names are listed on may Artist.

It's not related to the scraper processing changes in Gotham. If you have a problem with your scraper, open a separate thread in this forum, or use XBMC General Help and Support.
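The "auto" check described in the first post (non-US-ASCII characters, Unicode properties, or character codes above 255 in the pattern) could be sketched roughly as below. This is only an illustrative Python approximation and not XBMC's actual code: the function name needs_utf8_mode and both helper patterns are hypothetical, and the sketch assumes PCRE-style "\p{...}" and "\x{...}" escapes. It also ignores escaped backslashes, so a literal "\\p{L}" would be misdetected.

```python
import re

# Hypothetical helpers: PCRE-style Unicode property escapes such as \p{L}, \P{Lu}, \pL
_UNICODE_PROPERTY = re.compile(r"\\[pP](?:\{[^}]*\}|[A-Za-z])")
# Hexadecimal character codes written as \x{...}
_HEX_CODE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def needs_utf8_mode(pattern: str) -> bool:
    """Guess whether a regexp pattern should be run in UTF-8 mode.

    Mirrors the three conditions named in the first post; the real
    XBMC logic may differ in detail.
    """
    # 1. Any character outside US-ASCII in the pattern itself
    if any(ord(ch) > 127 for ch in pattern):
        return True
    # 2. Any Unicode property escape
    if _UNICODE_PROPERTY.search(pattern):
        return True
    # 3. Any \x{...} character code greater than 255
    return any(int(m.group(1), 16) > 255 for m in _HEX_CODE.finditer(pattern))
```

With this sketch, a plain pattern such as "<title>([^<]*)</title>" stays in non-UTF-8 mode, while a pattern containing "\x{2000}" or "\p{L}" switches UTF-8 mode on. Setting utf8="yes" or utf8="no" explicitly would bypass a check like this.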
RE: What's new in Gotham for Scrapers - MaDDoGo - 2014-01-11

Hi @Karlson2k

I have a problem with the filmaffinity scraper. I think that points 1 and 2 are OK in the scraper (the first line is the same). I think the problem is the third step: in the log I see that we have to search in iso-8859-1 or there are no matches, so I think that the XML parsed by XBMC is in iso-8859-1. How can we convert it to UTF-8?

Thanks

RE: What's new in Gotham for Scrapers - Karlson2k - 2014-01-11

MaDDoGo,
As I mentioned in the first post, all XMLs are converted to UTF-8 before parsing. The XML can be in any encoding as long as the correct encoding is given in the XML header. Even if the encoding is incorrect or missing, XBMC will try to find a suitable encoding, but the result is not guaranteed to be correct and it costs time and resources.

Try the latest nightly; it contains some additional webserver charset detection logic. If that doesn't help, check Trac for similar problems.

RE: What's new in Gotham for Scrapers - MaDDoGo - 2014-01-16

Hi,

I tried the next nightly and all the problems are gone. Sorry for the inconvenience, but I thought the problems came from the scraper.

Thank you

RE: What's new in Gotham for Scrapers - Karlson2k - 2014-03-24

Updated first post. Changes: all scraper-generated XMLs are always processed as UTF-8.

RE: What's new in Gotham for Scrapers - scott967 - 2014-03-24

(2014-03-24, 15:30)Karlson2k Wrote: Updated first post.

Is there some reason this can't be extended to nfo files? At least UTF-16LE encoding is ignored, from my testing (music videos).

scott s.

RE: What's new in Gotham for Scrapers - Karlson2k - 2014-03-25

Starting from Gotham alpha 10, all XMLs (including .nfo) should be automatically converted to UTF-8 before processing. Scraper-generated XMLs are forcibly processed as UTF-8.

Do you mean that your nfo files weren't converted? If your nfo file is valid XML (with a proper charset in the declaration), then it must be loaded correctly.
Even if your file doesn't have a charset in its declaration but starts with "<?xml", XBMC should detect and use UTF-16 encoding. Send me your nfo file and I'll try to reproduce.

RE: What's new in Gotham for Scrapers - pancheto - 2014-07-21

(2014-03-24, 15:30)Karlson2k Wrote: Updated first post.

Unfortunately not. Using the FilmAffinity scraper on Windows 8.1 64-bit with XBMC Gotham v13.2beta2, a warning comes out:

Code:
CScraperUrl::Get: Can't find precise charset for HTML "http://www.filmaffinity.com/es/film109378.html", using "CP1252" as fallback

RE: What's new in Gotham for Scrapers - joemcisaac - 2015-04-01

Hi there,

I'm not sure I have the right part of the forum to be asking this, but... I'm good with the scraping part. What I would like to know is this: when I scrape my 10 hard drives, all separated out for different subjects, I want to tell XBMC or Kodi to save the scraped covers to an SD card instead of my unit, because they take up a lot of space. I know you guys are busy and you have been doing an amazing job, but I just thought I would ask about this.

RE: What's new in Gotham for Scrapers - ironic_monkey - 2015-04-02

Can't you simply make the thumbnails dir a link?
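The link suggestion above could look something like the following. This is a rough Python sketch under stated assumptions, not a tested Kodi procedure: the function name relocate_thumbnails and both paths are hypothetical, the real thumbnails directory depends on your platform and profile, and it is probably safest to do this while Kodi is not running. On Windows, creating symlinks may need elevated privileges.

```python
import os
import shutil
from pathlib import Path


def relocate_thumbnails(thumbs_dir: Path, new_location: Path) -> None:
    """Move thumbs_dir to new_location and leave a symlink in its place."""
    if thumbs_dir.is_symlink():
        return  # already a link, nothing to do
    if new_location.exists():
        # Refuse to merge into an existing directory
        raise FileExistsError(f"{new_location} already exists")
    new_location.parent.mkdir(parents=True, exist_ok=True)
    if thumbs_dir.exists():
        # Move the existing thumbnails to the new storage
        shutil.move(str(thumbs_dir), str(new_location))
    else:
        thumbs_dir.parent.mkdir(parents=True, exist_ok=True)
        new_location.mkdir()
    # The old path now points at the new storage
    os.symlink(new_location, thumbs_dir, target_is_directory=True)
```

A call might look like relocate_thumbnails(Path("/path/to/userdata/Thumbnails"), Path("/path/to/sdcard/Thumbnails")), with both paths replaced by the ones on your own system.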