This movie recommender uses item-based Collaborative Filtering (Item CF) to recommend films based on item similarity. The algorithm is implemented as a series of MapReduce jobs.
- Builds a rating matrix and a co-occurrence matrix, then multiplies the two matrices to generate the recommendation list
- Implements the Item Collaborative Filtering (Item CF) algorithm as 4 MapReduce jobs
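The matrix steps above can be sketched in plain Java, without Hadoop, to make the data flow concrete. This is a minimal illustration and not the project's actual Mapper/Reducer code: the class name `ItemCfSketch`, the method names, the `user,movie,rating` line format, and the choice to normalize each row of the co-occurrence matrix by its row sum are all assumptions made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of Item CF; names and input format are assumptions.
class ItemCfSketch {

    // Group "user,movie,rating" lines into user -> (movie -> rating).
    static Map<String, Map<String, Double>> groupByUser(String[] lines) {
        Map<String, Map<String, Double>> byUser = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.trim().split(",");
            byUser.computeIfAbsent(f[0], k -> new TreeMap<>())
                  .put(f[1], Double.parseDouble(f[2]));
        }
        return byUser;
    }

    // For every ordered movie pair (a, b), count how many users rated both.
    static Map<String, Map<String, Double>> coOccurrence(Map<String, Map<String, Double>> byUser) {
        Map<String, Map<String, Double>> counts = new TreeMap<>();
        for (Map<String, Double> rated : byUser.values()) {
            for (String a : rated.keySet()) {
                for (String b : rated.keySet()) {
                    counts.computeIfAbsent(a, k -> new TreeMap<>()).merge(b, 1.0, Double::sum);
                }
            }
        }
        return counts;
    }

    // Divide each row by its sum so the weights in a row add up to 1.
    static Map<String, Map<String, Double>> normalize(Map<String, Map<String, Double>> counts) {
        Map<String, Map<String, Double>> norm = new TreeMap<>();
        for (Map.Entry<String, Map<String, Double>> row : counts.entrySet()) {
            double sum = 0;
            for (double v : row.getValue().values()) {
                sum += v;
            }
            Map<String, Double> out = new TreeMap<>();
            for (Map.Entry<String, Double> e : row.getValue().entrySet()) {
                out.put(e.getKey(), e.getValue() / sum);
            }
            norm.put(row.getKey(), out);
        }
        return norm;
    }

    // Multiply the normalized matrix by one user's rating vector and sum per movie.
    static Map<String, Double> score(Map<String, Map<String, Double>> norm, Map<String, Double> rated) {
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, Map<String, Double>> row : norm.entrySet()) {
            double total = 0;
            for (Map.Entry<String, Double> e : row.getValue().entrySet()) {
                Double rating = rated.get(e.getKey());
                if (rating != null) {
                    total += e.getValue() * rating;
                }
            }
            scores.put(row.getKey(), total);
        }
        return scores;
    }

    public static void main(String[] args) {
        String[] lines = {"u1,m1,4", "u1,m2,2", "u2,m2,4", "u2,m3,2"};
        Map<String, Map<String, Double>> byUser = groupByUser(lines);
        Map<String, Map<String, Double>> norm = normalize(coOccurrence(byUser));
        System.out.println(score(norm, byUser.get("u1"))); // prints {m1=3.0, m2=2.0, m3=1.0}
    }
}
```

In this toy dataset, user `u1` never rated `m3`, yet `m3` receives a score of 1.0 because it co-occurs with `m2`, which `u1` did rate. A real recommender would typically filter out movies the user has already rated before ranking.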
This program is built with Maven and can be run on Hadoop. Using Docker simplifies the environment setup.
- Set up Docker
- Set up Hadoop on a Docker image
- Build the MapReduce jobs with an IDE such as IntelliJ
- Run jobs on Hadoop
- cd RecommenderSystem
- hdfs dfs -mkdir /input
- hdfs dfs -put input/* /input
- hdfs dfs -rm -r /dataDividedByUser
- hdfs dfs -rm -r /coOccurrenceMatrix
- hdfs dfs -rm -r /Normalize
- hdfs dfs -rm -r /Multiplication
- hdfs dfs -rm -r /Sum
- cd src/main/java/
- hadoop com.sun.tools.javac.Main *.java
- jar cf recommender.jar *.class
- hadoop jar recommender.jar Driver /input /dataDividedByUser /coOccurrenceMatrix /Normalize /Multiplication /Sum
- hdfs dfs -cat /Sum/*
- #args0: original dataset
- #args1: output directory for DividerByUser job
- #args2: output directory for coOccurrenceMatrixBuilder job
- #args3: output directory for Normalize job
- #args4: output directory for Multiplication job
- #args5: output directory for Sum job
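The argument list above can be summarized as a small lookup from job name to output directory. The sketch below is only an illustration of that mapping: `DriverArgsSketch` and `outputDirs` are hypothetical names, and the project's real `Driver` class may parse its arguments differently. The job names are taken from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the project's Driver: maps args1..args5 to job names.
class DriverArgsSketch {

    // Job names in argument order, as listed in the README.
    static final String[] JOBS = {
        "DividerByUser", "coOccurrenceMatrixBuilder", "Normalize", "Multiplication", "Sum"
    };

    // args[0] is the original dataset; args[1..5] are the per-job output directories.
    static Map<String, String> outputDirs(String[] args) {
        if (args.length != JOBS.length + 1) {
            throw new IllegalArgumentException(
                "expected 6 arguments (dataset + 5 output directories), got " + args.length);
        }
        Map<String, String> dirs = new LinkedHashMap<>();
        for (int i = 0; i < JOBS.length; i++) {
            dirs.put(JOBS[i], args[i + 1]);
        }
        return dirs;
    }

    public static void main(String[] args) {
        String[] example = {
            "/input", "/dataDividedByUser", "/coOccurrenceMatrix",
            "/Normalize", "/Multiplication", "/Sum"
        };
        System.out.println("dataset -> " + example[0]);
        for (Map.Entry<String, String> e : outputDirs(example).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Hadoop's file-based output formats refuse to write into an output directory that already exists, which is why the `hdfs dfs -rm -r` commands clear all five output directories before the jobs are run again.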