Content-based portion of a potential movie recommendation system based on scraped IMDB reviews. This was a university group project. My personal contributions were a web scraper that collects various text data from IMDB movie descriptions and reviews spanning thousands of web pages, which worked flawlessly, and getting a cosine similarity matrix working to make content-based recommendations. The recommendation part technically works as intended, but our experimental design was questionable and we didn't prepare a method for determining whether a recommendation is good or not.
I may revisit this later, but I suspect I would need to obtain data on whether users like certain groupings of movies to tell whether the recommendations are helpful.
But still, hey, that web scraper's a beaut, right?
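
For anyone curious what the cosine similarity step looks like, here is a minimal sketch of the general idea. It is not the project's actual code: it assumes each movie is represented by raw word counts from its text, and the movie titles, descriptions, and function names (`tokenize`, `similarity_matrix`, `recommend`) are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(texts):
    """Pairwise cosine similarity between every pair of texts."""
    vectors = [Counter(tokenize(t)) for t in texts]
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


def recommend(titles, matrix, title, k=2):
    """Return the k titles most similar to `title`, excluding itself."""
    i = titles.index(title)
    others = [j for j in range(len(titles)) if j != i]
    ranked = sorted(others, key=lambda j: matrix[i][j], reverse=True)
    return [titles[j] for j in ranked[:k]]


# Hypothetical placeholder data, not taken from the scraped dataset.
movies = {
    "Space Heist": "a crew of thieves plans a daring heist on a space station",
    "Orbital Robbery": "thieves attempt a heist aboard an orbiting space station",
    "Garden Romance": "two gardeners fall in love over a summer of roses",
}
titles = list(movies)
matrix = similarity_matrix(list(movies.values()))

print(recommend(titles, matrix, "Space Heist", k=1))
```

Raw word counts are the simplest possible choice here. A real setup would probably want some term weighting (TF-IDF, for instance) and stop-word removal so that common words like "a" and "of" don't inflate the scores.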