data privacy
#1
Hello there,

Just a simple yet (so I suppose) interesting question that I couldn't find an answer to.
Does anyone here know whether this cute program, along with its familiar companions, reveals the user's IP address while retrieving information from IMDb, freedb or the like? I mean, if I complete my collection of files with additional data, do I reveal the contents of my private collection, either directly to the company hosting the database or to the companies distributing the software?
I hope I was able to make myself clear, since my English isn't that good. ;)
#2
Well, as far as scrapers go, I'm sure the site you are scraping from (e.g. IMDb, themoviedb.org...) can see your IP address, since you are grabbing info from their site. As for XBMC itself "phoning home" to the developers... I don't think you have to worry about that.
#3
What XBMC sends is basically the title of the movie, based on a cleaned version of your filename. Obviously the other end will have your IP address, and (should they want to) could log all the requests you make from that IP address and build some idea of what movies you have in your collection. In addition to the IP address, they'd also see that the request came from XBMC, based on the user-agent string. If you use .nfo files with URLs in them, then all they'll see is that your IP address is fetching the URL using XBMC, i.e. the filename information is not sent at all. You're welcome to use something like Wireshark to confirm this for yourself.

No other user information is sent anywhere.

This is identical to you googling for those same movies, or searching for them directly on IMDb or themoviedb.org's websites.
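
To make that concrete, here is a rough Python sketch of what a scraper-style lookup hands to the remote site. It is not XBMC's actual code: the cleaning rules, the `example.invalid` URL and the user-agent value are all made up for illustration. The request is only built, never sent.

```python
import re
import urllib.parse
import urllib.request


def clean_title(filename):
    """Turn a filename into a search title (hypothetical rules, not XBMC's)."""
    name = re.sub(r"\.[A-Za-z0-9]{2,4}$", "", filename)  # drop the file extension
    name = re.sub(r"[._]+", " ", name)                   # dots/underscores -> spaces
    # Cut everything from the first common release tag onwards.
    name = re.sub(r"\b(720p|1080p|dvdrip|xvid|x264|bluray)\b.*$", "", name, flags=re.I)
    return name.strip()


def build_search_request(filename,
                         base_url="https://example.invalid/search",  # placeholder URL
                         user_agent="XBMC/illustrative"):            # placeholder value
    """Build (but do not send) the kind of request a scraper would make."""
    query = urllib.parse.urlencode({"q": clean_title(filename)})
    return urllib.request.Request(f"{base_url}?{query}",
                                  headers={"User-Agent": user_agent})


req = build_search_request("The.Matrix.1999.720p.x264.mkv")
print(req.full_url)                  # the cleaned title ends up in the URL
print(req.get_header("User-agent"))  # the user-agent header travels with it
```

The remote site sees the URL, the headers and the IP address the connection comes from. Nothing else about the file or the machine is part of the request.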

As for what XBMC collects: We don't collect anything whatsoever if we can help it. The only thing we get from XBMC users is:
1. how many are hitting the RSS feed on feedburner.
2. basic website usage statistics collected via google analytics.

Cheers,
Jonathan
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.


#4
jmarshall Wrote:As for what XBMC collects: We don't collect anything whatsoever if we can help it. The only thing we get from XBMC users is:
1. how many are hitting the RSS feed on feedburner.
2. basic website usage statistics collected via google analytics.

1. I wonder how many unique users have read the RSS feed over, say, the last year. Why don't you make these numbers public? I think they would be interesting to know :)
#5
Last time I looked it was about 80-90k/day, and it's steadily increasing. I guess that gives some idea of the number of users (who have the RSS feed on and haven't changed it) that either load XBMC up or idle on the homepage on a day-to-day basis.

Cheers,
Jonathan
#6
@ jmarshall

Thanks for your reply,

If I get this right, then what I'd be looking for is a way to use the method of putting URLs in the .nfo file, but sending the request without showing that it comes from XBMC. Is that possible? Maybe by using some other software to scrape the files?
#7
jan_g Wrote:@ jmarshall

Thanks for your reply,

If I get this right, then what I'd be looking for is a way to use the method of putting URLs in the .nfo file, but sending the request without showing that it comes from XBMC. Is that possible? Maybe by using some other software to scrape the files?

Why would you want to do that? No matter what software you use, be it XBMC or an external manager, it's still going to hit the site it's scraping from.

You could always build XBMC from source and alter the user-agent string, but I really don't see what benefit you would get from it.

What specifically are your concerns and what are you trying to achieve?
#8
Well,
the thing is, I don't like the idea of companies building databases on all sorts of details concerning people's private lives. Neither my contacts on Facebook nor my private movie collection should be a figure in some table that I don't know about :) I really think this is a disturbing side effect of all the wonderful benefits of modern digital life.
So what do you say... there is simply no way?
#9
Right now TheTVDB doesn't compile any sort of stats. It's not something that's out of the question in the future, though, since I think Nielsen-type ratings for digital viewing would be quite interesting. Before we took that step we'd have to discuss the method and implications with all of our API users (XBMC, Media Portal, etc). I've also envisioned some sort of opt-in system for this, but I doubt it'd get used much.

Another option would be an opt-out system that would allow people to opt out immediately on their first scrape. The different API users would have to get on board with it, though. The benefit is that we'd be able to completely ditch the log files and simply work via anonymized user keys, meaning it'd actually be more private than us working from our logs.

There are a few things to consider with this:

1. You cannot be sued for possession of copyrighted media, only for distributing it (this is why we see torrent users getting sued but not Usenet users). Not a privacy matter, but related.

2. There's no way for a site like ours to know if you actually have the media or not. For all we know you could be compiling a database of series that interest you, but without actually having the media.

3. There's no way of knowing how someone got the media even if we assume they have it. Did they record it themselves? Purchase from iTunes?

4. I understand privacy concerns, but frankly most sites don't care one bit what individual users are doing. They might want compiled data about all of their users. This isn't a big selling point to privacy advocates, but it's still the truth.

5. Having the networks know which series are popular for digital viewers and the extent of that popularity might allow shows to continue even if they weren't popular in the mainstream. Think of the following: Firefly, Defying Gravity, Dresden Files, etc. Once again not a privacy matter, but it's all loosely connected.

I'll soon be creating a thread on our site asking for feedback on this, since I want to keep the community happy while still researching ways that we can better support our site.

In the meantime, perhaps you should look at an anonymizing proxy if you're that concerned about it. Scrapers are simply website requests, so it wouldn't be very difficult. If you don't want to do that, your best bet is to just create nfo files manually. If your collection isn't huge it probably wouldn't take too much time.
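
Since scrapers are plain web requests, sending them through a proxy is a small change on the client side. Below is a minimal Python sketch of the idea, not anything XBMC or TheTVDB ships. The proxy address `127.0.0.1:8080` is a made-up placeholder, and the final request is left commented out so nothing is actually sent.

```python
import urllib.request


def make_proxied_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical proxy address; replace it with a proxy you actually trust.
opener = make_proxied_opener("http://127.0.0.1:8080")

# The target site would then see the proxy's IP address instead of yours.
# opener.open("https://example.invalid/search?q=Firefly")  # not executed here
```

Keep in mind that this only moves the trust: the proxy operator can now see every request instead of the site you are scraping.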
#10
jan_g Wrote:Well,
the thing is, I don't like the idea of companies building databases on all sorts of details concerning people's private lives. Neither my contacts on Facebook nor my private movie collection should be a figure in some table that I don't know about :) I really think this is a disturbing side effect of all the wonderful benefits of modern digital life.
So what do you say... there is simply no way?

In short, no. If you hit the site, no matter what your user-agent is, and no matter whether you use a scraper, web browser, API or anything else, there is the potential for them to collect this information.
#11
Thanks again guys,
if I understand correctly, one way could be to reroute the requests via a, let's say, trusted service/site, e.g. an XBMC server system that manages the request and the response, followed by a "hardwired" immediate deletion of the requesting user's IP address once all packets have been confirmed as received. Theoretically, that should be possible, right? I'm aware of things like the maintenance expense, of course...
And while making users feel better about their private matters, it would equally provide a popularity barometer, while of course suffering from the same weaknesses regarding the data's potential to act as a reference, considering that requests can be generated by machines...
#12
Look into an anonymizing proxy like I said. It's going to be easiest for you and won't require a massive amount of hardware and bandwidth to get running. If someone wanted to put something together for all XBMC users' scraper data, you'd probably be looking at upwards of 15 TB of transfer a month, just based on numbers from my site and what I've heard of the movie db.

Here's one to get you started:
http://www.hidemyass.com/vpn/

Note that this software works on the actual computer. If you're running an Xbox and want that traffic anonymized you'd have to route the traffic through your PC, which isn't that difficult (look into ICS).