Interested in GSOC project "Using machine learning to improve suggestions"
#1
Hello all,

I am Vedant Rathore, an undergraduate student at the Indian Institute of Technology Guwahati, India, majoring in Chemical Engineering. I am really interested in working on a recommendation engine for Kodi, as it aligns well with my skills and interests.

I am a regular Kodi user and would love to see this project deployed to production. However, after searching the Kodi GitHub organization, I was unable to find a good starting point for contributing to this project. Any guidance on how to start would be highly appreciated.

Regards,
Vedant Rathore
#2
Well, start with setting up a local development system. Check out the code and follow the guide for your platform. https://github.com/xbmc/xbmc/blob/master.../README.md
Get back when you have it running, and don't be afraid to ask if anything goes wrong; that does happen.
#3
Hey, I was able to build Kodi on my macOS device and tinkered with the code a little. I had a few thoughts regarding the project:

1) I recommend making the recommendation engine separate from the existing codebase and deploying it as another service that communicates with the client over a REST interface. Decoupling it this way would make it easier to test, and it would let us use a different ecosystem such as Python, which has useful tools like TensorFlow for building these kinds of systems. The model could be served from a web server of choice using a framework like Flask or Django, and it would query the production DB to infer suggestions for the specific user.

2) For training the recommendation engine, we could use the MovieLens dataset (https://grouplens.org/datasets/movielens/) and combine it with user watch history, watch lists and other data sources. Together these would make a good training dataset.

3) The algorithm will be the crux of the project. Basic algorithms like collaborative filtering and matrix decomposition could serve as a good starting point; however, more complex deep learning approaches have shown excellent accuracy for companies like YouTube and Netflix. Examples of such algorithms are a) Deep Neural Networks for YouTube Recommendations (https://static.googleusercontent.com/med.../45530.pdf) and b) movie recommendations using LSTMs (https://medium.com/deep-systems/movix-ai...03d6a31607). This is something I would have to consult the mentor on and decide accordingly.
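
To make the collaborative-filtering baseline from point 3 a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration: the ratings are made up, the names `RATINGS`, `cosine` and `recommend` are hypothetical, and a real system would use MovieLens-scale data and probably a library rather than nested dictionaries.

```python
from math import sqrt

# Toy ratings (user -> {movie: rating}); made-up data for illustration only.
RATINGS = {
    "alice": {"Alien": 5, "Blade Runner": 4, "Amelie": 1},
    "bob": {"Alien": 4, "Blade Runner": 5, "The Thing": 5},
    "carol": {"Amelie": 5, "Chocolat": 4, "Alien": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """User-based collaborative filtering.

    Each unseen movie gets the similarity-weighted average of the ratings
    given by the other users; the best-scoring movies are returned first.
    """
    seen = ratings[user]
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, other_ratings)
        if sim <= 0:
            continue
        for movie, rating in other_ratings.items():
            if movie in seen:
                continue
            scores[movie] = scores.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    ranked = sorted(scores, key=lambda m: scores[m] / weights[m], reverse=True)
    return ranked[:top_n]
```

Calling `recommend("alice", RATINGS)` ranks the movies alice has not rated yet by how highly users with similar taste rated them. The same interface could later sit in front of a matrix-factorization or deep learning model.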

Any suggestions on how to move forward with the project and how to start contributing would be highly appreciated. 

Regards,
Vedant Rathore
#4
Some input: I would try to do it without a central server. A server would cause much more work afterwards, since somebody would have to run it, and I don't really want user data to be sent to a server.
So my original thought when creating the project idea was to do everything locally, which raises some problems but solves others.

One idea might be to extend the Kodi interface to expose "recommendations" and then write an add-on (type) that can set recommendations somehow. The interesting part here would be finding out how best to set such recommendations.

Also slightly related: https://github.com/xbmc/xbmc/pull/13323
#5
Hey "Razze",

I agree with your point about maintaining an external server and the privacy issues that come with it. I think we could build a Python add-on that ships with the trained model file and serves recommendations offline. To expose a recommendations interface in Kodi, I think I will have to dive into the Kodi modules and the UI code.

Also, one question I had in mind: is there any bug fixing or code contribution I should be doing for this project, or should I start working on the proposal and think about the process we are going to follow to tackle the issues? I also checked out the PR and understood some parts of the VideoDatabase code. If there is anything else you'd like me to do, please tell me.

Any help or guidance would be highly appreciated,

Regards,
Vedant Rathore
#6
As for the models, we could distribute better-trained models through our add-on repository quite easily. This just requires hooking into the repo and pushing updates when needed, which can then be used locally.

You don't have to start with bug fixing or other contributions. All you have to do is get Kodi building locally, get a rough idea of how to implement this, and write your proposal.
Of course we always welcome contributions, whether or not you end up doing GSoC with us. If in the meantime you find and fix an issue, we welcome a pull request on GitHub. For now, however, focus on getting Kodi building and running, and on your proposal.
#7
It would also make sense to check out a Kodi database file, so that you know what kind of data you have available to base recommendations on.
#8
Hey @Razze and @Martijn 
 
Thanks for your responses. I agree about using an add-on so we can serve trained models. I've already built Kodi on my local machine, so I will now focus more on the implementation details and the Kodi modules I will work with during the summer. I will also start on a basic draft proposal that we can iterate on later. I have my mid-semester exams this week, so I'll be a little inactive, but I will try to work on this as much as I can. Thanks!

Regards,
Vedant Rathore
#9
Hey,

I am Santosh Rajan, an undergraduate computer student at the National Institute of Technology, Pondicherry, India.
I am 20 years old and have been coding for about 3 years.

I mainly work on machine learning projects.
Familiar with:
  • Python
  • C++
  • TensorFlow
  • Keras
  • MySQL
My rig: Ryzen 1600, GTX 1080

I am not a Kodi user, at least not yet!
I do not have any other job/project during the summer.

@Razze I would like to know more about the database files and their structure.
#10
(2018-03-01, 04:10)nottherealsanta Wrote:
@Razze I would like to know more about the database files and their structure.

Hey there :)

Please read this https://github.com/xbmc/xbmc/blob/master...tabase.cpp or, better, install Kodi and open your database with an SQLite viewer.
See https://kodi.wiki/view/Userdata for your respective system to find the path to those files.
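
If you would rather poke at the file from a script than from a viewer, something like the following sketch lists every table and its columns using only Python's standard `sqlite3` module. It makes no assumptions about the Kodi schema; the function name `describe_database` is made up for this example, and the path you pass in is whatever file you find under your userdata folder. It is safest to run it on a copy of the database.

```python
import sqlite3


def describe_database(path):
    """Return {table_name: [column names]} for every table in an SQLite file."""
    conn = sqlite3.connect(path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Printing the returned dictionary gives a quick overview of which tables and columns are available before deciding which of them are useful for recommendations.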
#11
Hi all!
 
First, let me introduce myself. My name is Marko Prcać, and I am currently finishing a master's degree in mathematics at the University of Primorska (located in Koper, Slovenia), at the Faculty of Mathematics, Natural Sciences and Information Technologies. I spent the last year as a full-time intern at the company Dinit d.o.o., developing various solutions for the credit card industry, mostly in C#, Python and all the standard front-end technologies, with substantial use of TensorFlow, Tesseract and OpenCV.

My idea for a recommender system comes from a lecture by Prof. Jure Leskovec (Slovenian professor at Stanford University and chief scientist at Pinterest), which I attended a couple of months ago. In the lecture, Prof. Leskovec explained how simple graph theory concepts can be leveraged to serve people interesting pins in near real time (much faster than Pinterest was able to do with traditional ML algorithms).
 
I believe that we can apply the same approach to recommending "what to watch" and be a little more interesting than YouTube suggestions are.
 

Basic Idea:

    - Pinterest constructs bipartite graphs in which the nodes on one side are only boards and the nodes on the other side are only pins. It then uses a simple algorithm to perform a random walk from a starting pin and returns a set of pins that are potentially interesting to the user.
 
    - In our case the starting pin would be the currently viewed (or last viewed) movie, and we would perform a similar random walk over the graph of movies and movie boards.
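
A rough sketch of that walk in plain Python, assuming boards are simply lists of movie titles. The `BOARDS` data is made up, and the restart probability and step count are arbitrary illustrative values, not anything taken from Pinterest's system.

```python
import random
from collections import Counter

# Hypothetical boards (board -> movies); made-up data for illustration only.
BOARDS = {
    "sci-fi": ["Alien", "Blade Runner", "The Thing"],
    "ridley-scott": ["Alien", "Blade Runner", "Gladiator"],
    "historical": ["Gladiator", "Spartacus"],
}


def build_index(boards):
    """Invert board -> movies into movie -> boards (the other side of the graph)."""
    movie_boards = {}
    for board, movies in boards.items():
        for movie in movies:
            movie_boards.setdefault(movie, []).append(board)
    return movie_boards


def random_walk_recommend(start, boards, steps=1000, restart=0.3, top_n=3, seed=None):
    """Random walk with restart over the movie/board bipartite graph.

    Each step hops movie -> random board containing it -> random movie on
    that board. Movies visited most often are returned as recommendations.
    """
    rng = random.Random(seed)
    movie_boards = build_index(boards)
    visits = Counter()
    current = start
    for _ in range(steps):
        if rng.random() < restart:
            current = start  # jump back so results stay close to the start movie
        board = rng.choice(movie_boards[current])
        current = rng.choice(boards[board])
        if current != start:
            visits[current] += 1
    return [movie for movie, _ in visits.most_common(top_n)]
```

With this toy data, `random_walk_recommend("Alien", BOARDS)` tends to favour movies that share several boards with the start movie, which is the behaviour we want from the graph.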


What are movie boards?

    - Imagine a board as just a set of movies that have something in common. The same author, genre or actors are some of the naturally occurring relationships, but boards can also be formed by prominent community members and by users themselves. Maybe Pinterest would even decide to support the open source community by sharing some of their users' favourite choices.


What is the graph?

    - A simple relational database with foreign keys from movies to boards (queryable in both directions)
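
As a sketch of what I mean, here is one possible layout using Python's built-in `sqlite3`. All table and function names (`movie`, `board`, `board_movie`, `make_graph`, `movies_on`, `boards_of`) are hypothetical and not part of any existing Kodi schema.

```python
import sqlite3

# Hypothetical schema: a junction table links boards and movies in both directions.
SCHEMA = """
CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL UNIQUE);
CREATE TABLE board (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE board_movie (
    board_id INTEGER NOT NULL REFERENCES board(id),
    movie_id INTEGER NOT NULL REFERENCES movie(id),
    PRIMARY KEY (board_id, movie_id)
);
CREATE INDEX board_movie_by_movie ON board_movie(movie_id);
"""


def make_graph(boards):
    """Load a {board: [titles]} mapping into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    for name, titles in boards.items():
        board_id = conn.execute(
            "INSERT INTO board (name) VALUES (?)", (name,)
        ).lastrowid
        for title in titles:
            conn.execute("INSERT OR IGNORE INTO movie (title) VALUES (?)", (title,))
            (movie_id,) = conn.execute(
                "SELECT id FROM movie WHERE title = ?", (title,)
            ).fetchone()
            conn.execute("INSERT INTO board_movie VALUES (?, ?)", (board_id, movie_id))
    conn.commit()
    return conn


def movies_on(conn, board):
    """Board -> movies direction of the graph."""
    rows = conn.execute(
        "SELECT m.title FROM movie m "
        "JOIN board_movie bm ON bm.movie_id = m.id "
        "JOIN board b ON b.id = bm.board_id "
        "WHERE b.name = ? ORDER BY m.title",
        (board,),
    )
    return [row[0] for row in rows]


def boards_of(conn, title):
    """Movie -> boards direction of the graph."""
    rows = conn.execute(
        "SELECT b.name FROM board b "
        "JOIN board_movie bm ON bm.board_id = b.id "
        "JOIN movie m ON m.id = bm.movie_id "
        "WHERE m.title = ? ORDER BY b.name",
        (title,),
    )
    return [row[0] for row in rows]
```

The composite primary key covers board-to-movie lookups and the extra index covers movie-to-board lookups, so each hop of a random walk is a cheap indexed query.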


Where is the Machine Learning part?

    - An advanced version of the algorithm assigns weights to graph connections so that more relevant items are selected more often. The weight-assigning function should be derived from available user data.
        - Idea: a kind of running average that moves through the user's timeline, observing n movies to see which parameters influenced the selection of movie n+1.
        - Other uses of ML: pick up patterns in order to bias the algorithm towards selecting certain kinds of movies.
    - Construct interesting boards from online data.
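
To illustrate where the weights would plug in, here is a small sketch. Note that it uses a simple counting heuristic (boards containing more already-watched movies get more weight) as a stand-in for the learned weighting function, which is still to be designed. The names `board_weights` and `weighted_step` are hypothetical.

```python
import random


def board_weights(history, boards, smoothing=1.0):
    """Weight each board by how many already-watched movies it contains.

    The smoothing term keeps boards with no watched movies reachable,
    so the walk can still explore outside the user's usual taste.
    """
    watched = set(history)
    return {
        board: smoothing + len(watched & set(movies))
        for board, movies in boards.items()
    }


def weighted_step(movie, boards, weights, rng):
    """One hop movie -> board -> movie, choosing the board in proportion to its weight."""
    candidates = [board for board, movies in boards.items() if movie in movies]
    chosen = rng.choices(candidates, weights=[weights[b] for b in candidates], k=1)[0]
    return rng.choice(boards[chosen])
```

A trained model would replace `board_weights` while `weighted_step` stays the same, which keeps the random walk and the ML part nicely separated.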
 

Advantages:

    - The system does not need a central server
    - Can be agnostic of users' private data
    - Can return results in batches
    - Has natural stages to it:
        - Construct a very simple movie graph based on available movie rankings
        - Construct a basic random walk algorithm
        - Grow the graph to the maximum viable size for an add-on (optional: ask the community for board suggestions)
        - (at this stage the recommender should function)
        - (Parallel task up to this point) Collect as much data as possible (this should not interfere with programming because it is more of an administrative task)
        - Observe relations, visualize connections
        - Start algorithm training
        - Iterate and improve on the base results as much as time allows

 
My computer:

I am currently in the market for a new computer. Computing power should not be an issue, however, since I know how to use AWS, Azure ML and Paperspace. If I get chosen for the project, I believe I will be granted access to the university's cluster.
 

Other things about me:

    - Water polo player: I played professionally and semi-professionally during college. I am a dedicated team player with a good understanding of interpersonal relations.
    - GitHub: most of my latest work belongs to my employer; I will bring one of my older repositories up to my current level and share it ASAP.
    - LinkedIn: https://www.linkedin.com/in/marko-prca%C4%87-83472568/
 

Why I dislike the YouTube, Google, Quora and similar recommenders: they box us in based on streaks of our attention and focus on a single topic … humans like to explore different things! The Kodi recommender should enable people to have fun and widen their horizons at the same time, not end up watching similar movies over and over again!

 

Looking forward to your reply! BR Marko!
#12
Hey and welcome :)

That does indeed sound very good and seems like a perfect match.
The only thing I can think of is that we should generalize it in some way, so that you can focus on movies for now but we can easily add shows, music or even pictures later.
#13
(2018-03-07, 10:14)Razze Wrote: That does indeed sound very good and seems like a perfect match.
The only thing I can think of is that we should generalize it in some way, so that you can focus on movies for now but we can easily add shows, music or even pictures later.

Thank you for your opinion Razze!

I believe that the board and pin concept can easily be extended to shows, music and pictures. Pinterest does it for pictures, and shows are basically very similar to movies, except that you have to put the "next episode" at the top of the list.

Let me know what the desired next steps would be from your side when you have time. I will be back on the forum after my lectures.

BR! Marko!
#14
Well, the first thing would be to set up a local environment to compile and debug Kodi. You can find information about that in our wiki.
#15
Hey @Razze 

I apologize for the long break; I just finished my midterm examinations yesterday. It seems a lot has happened in this thread in the meantime.

However, as I said earlier, I am currently working on understanding the Kodi database engine to find out what kind of data we have available for the project. I am also working on a draft proposal, which should be completed by tomorrow. To reinforce my proposal, I've also started working on a deep-learning-based demo recommendation engine that uses the MovieLens dataset.

Please do tell me if you need me to work on something different than this.

Regards,
Vedant Rathore