TY - GEN
T1 - Told you I didn't like it
T2 - 32nd IEEE International Conference on Data Engineering, ICDE 2016
AU - Hwang, Won Seok
AU - Parc, Juan
AU - Kim, Sang Wook
AU - Lee, Jongwuk
AU - Lee, Dongwon
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/6/22
Y1 - 2016/6/22
N2 - We study how to improve the accuracy and running time of top-N recommendation with collaborative filtering (CF). Unlike existing works that use mostly rated items (which make up only a small fraction of a rating matrix), we propose the notion of pre-use preferences of users toward a vast number of unrated items. Using this novel notion, we effectively identify uninteresting items that have not been rated yet but are likely to receive very low ratings from users, and impute them as zero. This simple-yet-novel zero-injection method applied to a set of carefully chosen uninteresting items not only addresses the sparsity problem by enriching a rating matrix but also completely prevents uninteresting items from being recommended as top-N items, thereby improving accuracy greatly. As our proposed idea is method-agnostic, it can be easily applied to a wide variety of popular CF methods. Through comprehensive experiments using the MovieLens dataset and the MyMediaLite implementation, we successfully demonstrate that our solution consistently and universally improves the accuracies of popular CF methods (e.g., item-based CF, SVD-based CF, and SVD++) by two to five orders of magnitude on average. Furthermore, our approach reduces the running time of those CF methods by 1.2 to 2.3 times when its setting produces the best accuracy. The datasets and code that we used in experiments are available at: https://goo.gl/KUrmip.
UR - http://www.scopus.com/inward/record.url?scp=84980361816&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84980361816&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2016.7498253
DO - 10.1109/ICDE.2016.7498253
M3 - Conference contribution
AN - SCOPUS:84980361816
T3 - 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016
SP - 349
EP - 360
BT - 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 May 2016 through 20 May 2016
ER -