Hi all!
First, let me introduce myself. My name is Marko Prcać. I am currently finishing a master's degree in mathematics at the University of Primorska (located in Koper, Slovenia), at the Faculty of Mathematics, Natural Sciences and Information Technologies. I spent the last year as a full-time intern at Dinit d.o.o., developing various solutions for the credit card industry, mostly in C#, Python and the standard front-end technologies, with substantial use of TensorFlow, Tesseract and OpenCV.
My idea for the recommender system comes from a lecture by Prof. Jure Leskovec (Slovenian professor at Stanford University and chief scientist at Pinterest), which I attended a couple of months ago. In the lecture, Prof. Leskovec explained how simple graph theory concepts can be leveraged to serve people interesting pins in near real time (much faster than Pinterest was able to with traditional ML algorithms).
I believe that we can apply the same approach to recommending "what to watch" and be a little more interesting than YouTube's suggestions are.
Basic Idea:
- Pinterest constructs a bipartite graph in which the nodes on one side are only boards and the nodes on the other side are only pins. It then uses a simple algorithm to perform a random walk from a starting pin and returns a set of pins that are potentially interesting to the user.
- In our case the starting pin would be the currently viewed (or last viewed) movie, and we would perform a similar random walk over the graph of movies and movie boards.
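To make the idea concrete, here is a minimal sketch of such a walk in Python. It is only an illustration under my own assumptions, not Pinterest's actual algorithm: the function names, the board data, the restart probability and the step count are all hypothetical. Each step goes movie -> random board containing it -> random movie on that board, and the movies visited most often are returned.

```python
import random
from collections import Counter, defaultdict


def build_movie_index(boards):
    """Invert {board: [movies]} into {movie: [boards]} (the other side of the graph)."""
    movie_to_boards = defaultdict(list)
    for board, movies in boards.items():
        for movie in movies:
            movie_to_boards[movie].append(board)
    return dict(movie_to_boards)


def recommend(boards, start_movie, steps=1000, restart_prob=0.3, top_n=3, seed=None):
    """Random walk with restarts over a movie/board bipartite graph.

    restart_prob and steps are illustrative values (assumptions), not tuned ones.
    """
    rng = random.Random(seed)
    movie_to_boards = build_movie_index(boards)
    if start_movie not in movie_to_boards:
        return []  # unknown movie: nothing to walk from

    visits = Counter()
    current = start_movie
    for _ in range(steps):
        # Occasionally jump back to the start so results stay related to it.
        if rng.random() < restart_prob:
            current = start_movie
        board = rng.choice(movie_to_boards[current])  # movie -> board
        current = rng.choice(boards[board])           # board -> movie
        if current != start_movie:
            visits[current] += 1
    return [movie for movie, _ in visits.most_common(top_n)]


# Hypothetical example boards.
example_boards = {
    "Christopher Nolan": ["Inception", "Interstellar", "Memento"],
    "Sci-fi": ["Interstellar", "Alien"],
    "Romantic comedies": ["Notting Hill", "Love Actually"],
}
print(recommend(example_boards, "Inception", seed=1))
```

Because the walk only follows existing edges, movies on boards that are not connected to the starting movie can never be returned.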
What are movie boards?
- Think of a board as just a set of movies that have something in common. The same author, genre, actors etc. are some of the naturally occurring relationships, but boards can also be formed by prominent community members and by users themselves. Maybe Pinterest would even decide to support the open source community by sharing some of their users' favourite choices.
What is the graph?
- A simple relational database with foreign keys from movies to boards (queryable in both directions: movie to boards and board to movies)
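A possible layout for this, sketched with Python's built-in sqlite3 module. The table and column names are my own assumptions, and since a movie can sit on many boards I assume a link table between the two rather than a single foreign key column. The two query helpers show the lookups in both directions.

```python
import sqlite3


def create_graph_db(conn):
    """Create the (hypothetical) movie, board and link tables."""
    conn.executescript("""
        CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL UNIQUE);
        CREATE TABLE board (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE board_movie (
            board_id INTEGER NOT NULL REFERENCES board(id),
            movie_id INTEGER NOT NULL REFERENCES movie(id),
            PRIMARY KEY (board_id, movie_id)
        );
        -- Index for the reverse direction (movie -> boards).
        CREATE INDEX idx_board_movie_movie ON board_movie(movie_id);
    """)


def add_edge(conn, board, movie):
    """Insert the board, the movie and the edge between them if missing."""
    conn.execute("INSERT OR IGNORE INTO board(name) VALUES (?)", (board,))
    conn.execute("INSERT OR IGNORE INTO movie(title) VALUES (?)", (movie,))
    conn.execute(
        "INSERT OR IGNORE INTO board_movie(board_id, movie_id) "
        "SELECT b.id, m.id FROM board b, movie m "
        "WHERE b.name = ? AND m.title = ?",
        (board, movie),
    )


def movies_on_board(conn, board):
    """Board -> movies, sorted by title."""
    rows = conn.execute(
        "SELECT m.title FROM movie m "
        "JOIN board_movie bm ON bm.movie_id = m.id "
        "JOIN board b ON b.id = bm.board_id "
        "WHERE b.name = ? ORDER BY m.title",
        (board,),
    )
    return [row[0] for row in rows]


def boards_of_movie(conn, movie):
    """Movie -> boards, sorted by name."""
    rows = conn.execute(
        "SELECT b.name FROM board b "
        "JOIN board_movie bm ON bm.board_id = b.id "
        "JOIN movie m ON m.id = bm.movie_id "
        "WHERE m.title = ? ORDER BY b.name",
        (movie,),
    )
    return [row[0] for row in rows]
```

An SQLite file would also fit the "no central server" goal, since the whole graph can live inside the addon.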
Where is the Machine Learning part?
- An advanced version of the algorithm assigns weights to the graph connections so that more relevant items are selected more often. The weight-assigning function should be derived from the available user data.
- Idea: a kind of running average that moves through the user's timeline, observing n movies to see which parameters influenced the selection of movie n+1.
- Other uses of ML: pick up patterns in order to bias the algorithm towards selecting certain kinds of movies.
- Construct interesting boards from online data.
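The weighted version only changes how the next node is picked during the walk. A small sketch, assuming the weights have already been computed somehow (the values and names below are made up for illustration; how to learn them from user data is the open part of the proposal):

```python
import random
from collections import Counter


def weighted_step(neighbors, rng):
    """Pick one neighbor with probability proportional to its edge weight.

    neighbors: {node: weight}; weights must be non-negative with a positive sum.
    """
    nodes = list(neighbors)
    weights = [neighbors[node] for node in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


def sample_counts(neighbors, draws, seed=None):
    """Count how often each neighbor is chosen over a number of draws."""
    rng = random.Random(seed)
    return Counter(weighted_step(neighbors, rng) for _ in range(draws))


# Hypothetical weights on the edges leaving one board.
edges = {"Interstellar": 9.0, "Alien": 1.0}
print(sample_counts(edges, draws=1000, seed=0))
```

With all weights equal this reduces to the plain uniform walk, so the basic recommender can ship first and the weights can be added later.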
Advantages:
- System does not need a central server
- Can be agnostic of users private data
- Can return results in batches
- Has natural stages to it:
  - Construct a very simple movie graph based on available movie rankings
  - Construct the basic random walk algorithm
  - Grow the graph to the maximum viable size for an addon (optional: ask the community for board suggestions)
  - (at this stage the recommender should function)
  - (Parallel task up to this point) Collect as much data as possible (this should not interfere with programming because it is more of an administrative task)
  - Observe relations, visualize connections
  - Start training the algorithm
  - Iterate and improve on the base results as much as time allows
My computer:
I am currently in the market for a new computer. Computing power should not be an issue, however, since I know how to leverage AWS, Azure ML and Paperspace. If I am chosen for the project, I believe I will also be granted access to the university's cluster.
Other things about me:
- Water polo player: I played professionally and semi-professionally during college. I am a dedicated team player with a good understanding of interpersonal relations.
- GitHub: most of my latest work belongs to my employer, so I will bring one of my older repositories up to my current level and share it ASAP.
- LinkedIn:
https://www.linkedin.com/in/marko-prca%C4%87-83472568/
Why I dislike the YouTube, Google, Quora and similar recommenders: they box us in based on streaks of our attention and focus on a single topic, but humans like to explore different things! The Kodi recommender should let people have fun and widen their horizons at the same time, not end up watching similar movies over and over again!
Looking forward to your reply! BR Marko!