Clean scraping API
#60
I've done an initial analysis of the data gathered during GSoC and I just wanted to add some calculated values to the example above.

Code:
P(Episode    | Video) = 0.843797662418
P(Movie      | Video) = 0.154526991252
P(MusicVideo | Video) = 0.00167534633044

I got these numbers simply by looking at the number of entries of each type in the posted data, e.g.
Code:
len(Episode)     = 640650
len(Movie)       = 117324
len(MusicVideo)  = 1272
len(Video)       = 233118
len(TotalVideos) = len(Episode) + len(Movie) + len(MusicVideo)

Here len(Video) counts the videos that are not in the database (left unscraped for a reason, or simply missed).
I left len(Video) out of the total since I want the probabilities to add up to 1.

Code:
P(Episode|Video) = len(Episode) / len(TotalVideos)
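
As a rough sketch of the calculation above (the counts are the ones quoted in this post; the function name class_priors is just something I made up for illustration), this is roughly how the three values fall out in Python:

```python
# Counts quoted in the post. len(Video) (the unclassified videos) is
# deliberately left out so that the probabilities add up to 1.
counts = {"Episode": 640650, "Movie": 117324, "MusicVideo": 1272}


def class_priors(counts):
    """Return count / total for every class (hypothetical helper)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


priors = class_priors(counts)
for name, p in priors.items():
    print(f"P({name} | Video) = {p}")
```

Since every count is divided by the same total, the three values sum to 1 by construction.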

I also checked the runtimes of movies (using 10k movies), where I think it could be valid to assume a normal distribution with µ=105.624426079, σ=22.9860217337.
[Image: runtime distribution of the sampled movies]

I have noticed that the movie database seems to contain some TV shows as well, which might be what is causing the distribution to shift slightly to the left.
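
To make the normal-distribution assumption a bit more concrete, here is a minimal sketch using Python's statistics.NormalDist with the µ and σ quoted above. It assumes the runtimes are in minutes, and the two example runtimes (45 and 105) are values I picked purely for illustration, not numbers from the data:

```python
from statistics import NormalDist

# µ and σ as quoted in the post (assumed to be minutes).
movie_runtime = NormalDist(mu=105.624426079, sigma=22.9860217337)

# Density of the assumed movie model at two illustrative runtimes.
# A runtime near the mean is far more "movie-like" than a short one.
density_short = movie_runtime.pdf(45)   # hypothetical episode-length video
density_long = movie_runtime.pdf(105)   # hypothetical feature-length video

# Share of movies the model expects within one σ of the mean.
low = movie_runtime.mean - movie_runtime.stdev
high = movie_runtime.mean + movie_runtime.stdev
share_within_one_sigma = movie_runtime.cdf(high) - movie_runtime.cdf(low)

print(density_short, density_long, share_within_one_sigma)
```

If the TV shows mixed into the movie database really do pull the curve to the left, refitting on a cleaner sample (for example with NormalDist.from_samples) should give a somewhat higher µ.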

I want to add a disclaimer here, since it's just a very quick, initial analysis, and it has been a long time since I last did any statistics :)
Randomly selected movies/episodes from tmdb and tvdb might also be a better indicator of the runtime distribution.

Cheers,
Tobias



