TheTVDB scraper not finding TV Shows with hyphens like "Stargate SG-1" and "K-ville"?
#1
Was there a recent addition to XBMC that filters out hyphens in TV show names? I just updated to the newest T3CH build (12263 2008-03-23) and I can't find a TV show unless I first strip out the hyphens. This means I have to mislabel shows like "Stargate SG-1" or "K-ville" to get them to scan, even though they used to scan just fine.

I'm just wondering if something is being done with hyphens on purpose (much like with dots) or if it's an error. I checked http://www.thetvdb.com/api/GetSeries.php...me=K-ville just to make sure it wasn't an issue with thetvdb.com and it returns the correct results no problem.
#2
I think I noticed an issue with this myself once. Dashes were becoming spaces.

Can you possibly post a log of a scan for me?
#3
So is this similar to the following thread, only with hyphens instead of dots being converted to spaces by XBMC before lookup?
TheTVDB scraper not finding TV-shows like "L.A. Ink" and "The O.C."
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
#4
Well, things like dots, colons, ampersands, etc. are, I think, taken care of now in TheTVDB's new API, because things like "Star Trek Voyager" match up fine with "Star Trek: Voyager". I think the GetSeries search essentially ignores those characters.

Ideally the API should just start ignoring dashes as well, but since Scott recently redid the APIs and dashes don't work, there might have been a reason for it.

I'll TRY to get a moment to dig into the code here and see what's up.
#5
i have a fix sitting in my tree, but i need to investigate some more before i commit, to make sure it's the correct one.

if you want to have a look, agathorn, it's in CIMDB::GetURL()

the whole shenanigans with the regexp shouldn't be applied to tvshows. however, it would still be nice to keep the year hint somehow
#6
The API on my end should be ignoring all of those characters. Coco discovered that it's not ignoring apostrophes, so I'll be fixing that later. If any other special characters are found that the API isn't handling correctly, please let me know.

The way it works on the API side is this:
- Sphinx indexer looks at the tables/fields I want and stores the fields I want it to store. When it stores them, it can do character/word/phrase translation. It's here that I have it translate those special characters to an empty string.

- When you do a search with the API, I call Sphinx search. It uses the same config, so it'll do the same character/word/phrase translation before searching the index for the series name.

The result is that you should be fine passing special characters along to the API as long as I have them in the Sphinx configuration.
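
The two steps above can be sketched roughly as follows. This is only an illustration in Python, not the actual Sphinx configuration: the `STRIPPED_CHARS` set, the function names, and the dictionary index are all assumptions made for the example. The point is that the same translation runs when the index is built and when a query comes in.

```python
# Hypothetical sketch of "same translation at index time and at search time".
# STRIPPED_CHARS is an assumed set, not the real Sphinx character table.
STRIPPED_CHARS = ".:&-"


def normalize(name: str) -> str:
    """Translate the special characters to an empty string, then lowercase
    and collapse whitespace."""
    table = str.maketrans("", "", STRIPPED_CHARS)
    return " ".join(name.translate(table).lower().split())


def build_index(series_names):
    """Map each normalized name to the original series names."""
    index = {}
    for name in series_names:
        index.setdefault(normalize(name), []).append(name)
    return index


def search(index, query):
    """Normalize the query the same way the index was normalized."""
    return index.get(normalize(query), [])


index = build_index(["Star Trek: Voyager", "K-Ville", "Stargate SG-1"])
print(search(index, "Star Trek Voyager"))  # ['Star Trek: Voyager']
print(search(index, "K-ville"))            # ['K-Ville']
print(search(index, "K ville"))            # []
```

In this toy version, the last lookup misses because the hyphen was replaced with a space before the query was sent, so "k ville" no longer matches the indexed "kville". That is the same kind of mismatch as the dashes-becoming-spaces behaviour mentioned earlier in the thread.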
#7
I've been having problems trying to add The X-Files to my TV library.
I have 2 xboxes, one running the t3ch build of 6-16-08 and another running svn of 7-18-05, and neither will add The X-Files to the library when scraping against thetvdb.com

main folder = The X-Files
season folders = The X-Files S1, The X-Files S2, etc
avi files = The X-Files.s01e01.Pilot.avi, etc...
#8
I've had the same problem. It's to do with the "-". You can get around it, but it won't work if you do a full scan; you need to look up TV show information for The X-Files only. When it returns the results and The X-Files isn't there, click on the 'Manual' button in the bottom left. I'm pretty sure I got mine working by clearing everything, then putting in just 'x-files' or 'x - files'. Note that with the on-screen keyboard the '-' is really thick; it almost looks like a '='.
#9
Use nfo files. There's 2 parts to this problem. First, XBMC removes dashes before sending the search data to the API. Second, our API doesn't handle dashed words that have the dash removed as well as it should. There are lengthy posts on both sites about this. The ideal solution is me tweaking sphinx on my site to handle it properly, but it's extremely complicated so it's not likely to happen anytime soon. I've stated elsewhere that if someone has experience with sphinx, I'd be happy to give them access to ours so they can configure it.
#10
szsori Wrote:There's 2 parts to this problem. First, XBMC removes dashes before sending the search data to the API. Second, our API doesn't handle dashed words that have the dash removed as well as it should.
FYI, there are the same problems with dots ("."), see: http://www.xbmc.org/forum/showthread.php?t=29220

#11
what about url-encoding the search string?
#12
The root of the XBMC side of this issue is really that 'scene releases' contain loads of crap dots and hyphens. That is why XBMC removes them in the first place: many of the websites that XBMC has lookup scrapers for do not support lookups if the search string contains a dot or hyphen that is not supposed to be there. thetvdb.com is one such site; just open a web browser on your computer and try looking up "The.IT.Crowd" or "Off.Centre" on www.thetvdb.com.

Doctor.Who.S01E01.release-crap-567ty8reuihvn.avi
Knight.Rider-S01E01-release-crap-567ty8reuihvn.avi
Off.Centre.S01E01.TVRip.Xvid-ITMN.avi
The.IT.Crowd.S02E01.WS.PDTV.XviD-ANGELiC.avi
Mythbusters.S06E01.hdtv.xvid-fqm.avi
#13
scene naming is irrelevant for tvshows as it gets the show name from the folder, and only the season/episode identifier from the filename. and movies are supposed to be cleaned of known tags to minimize that impact.
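
A rough sketch of that split, written in Python for illustration only. The regular expression and the `identify` helper are assumptions made for this example and are not XBMC's actual matching code: the show name comes from the folder, and only the season/episode numbers come from the filename.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern for identifiers such as "S01E01" or "s01e01".
EPISODE_RE = re.compile(r"[Ss](\d{1,2})[Ee](\d{1,2})")


def identify(show_folder: str, filename: str):
    """Return (show name, season, episode), or None if no identifier is found.

    The show name is the last component of the show folder; everything in
    the filename other than the season/episode identifier is ignored.
    """
    match = EPISODE_RE.search(filename)
    if match is None:
        return None
    show = PurePosixPath(show_folder).name
    return show, int(match.group(1)), int(match.group(2))


print(identify("/tv/The X-Files", "The X-Files.s01e01.Pilot.avi"))
# ('The X-Files', 1, 1)
print(identify("/tv/Off Centre", "Off.Centre.S01E01.TVRip.Xvid-ITMN.avi"))
# ('Off Centre', 1, 1)
```

With this approach the release tags in the filename never reach the search string, so the only punctuation that matters is whatever is in the folder name itself.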
#14
kraqh3d Wrote:scene naming is irrelevant for tvshows as it gets the show name from the folder, and only the season/episode identifier from the filename. and movies are supposed to be cleaned of known tags to minimize that impact.

Is punctuation considered "known tags" as well? I'm hoping we can find an easy solution until I'm able to dig into sphinx a lot more.

Can someone do a test build where it doesn't remove special characters and see how it works for the different scrapers? I know this is something that should be fixed on my end eventually too, but we get a ton of support questions about this only from XBMC users.

Coco said he spoke with spiff about this and he had a good explanation about why the characters were stripped, so perhaps spiff should be consulted before any action is taken. You know, since the rest of us might be missing something only a grumpy dev would notice. ;)
#15
tags are things like "divx", "ac3", "hdtv", "pdtv", "dvdrip", "internal", "proper", "repack", etc. there may even be group monikers in the list. it's been a while since i looked. but additionally, all "-" and "." are replaced with space. (gimme 5 minutes and i'll look at the code to confirm.)

** edit **
update... minus and period are replaced with space, and then the string is url-encoded. i could test how it works when they are left in place.
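
For comparison, here is a minimal Python sketch of the behaviour described above and of the proposed test variant. It is not the XBMC code: `build_search_term` and its `strip_punctuation` flag are hypothetical names, and the whitespace collapsing is an extra step added for the example.

```python
from urllib.parse import quote_plus


def build_search_term(title: str, strip_punctuation: bool = True) -> str:
    """Build a URL-encoded search term from a show title.

    With strip_punctuation=True, minus and period are replaced with a space
    before encoding. With False, they are left in place.
    """
    if strip_punctuation:
        title = title.replace("-", " ").replace(".", " ")
    # Collapse repeated spaces (added for this sketch only).
    title = " ".join(title.split())
    return quote_plus(title)


print(build_search_term("Stargate SG-1"))                           # Stargate+SG+1
print(build_search_term("Stargate SG-1", strip_punctuation=False))  # Stargate+SG-1
print(build_search_term("L.A. Ink"))                                # L+A+Ink
print(build_search_term("L.A. Ink", strip_punctuation=False))       # L.A.+Ink
```

URL-encoding alone leaves "-" and "." untouched, so whether those characters reach the scraper depends entirely on the replacement step before it.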