Douban Dataset


This anonymized Douban dataset contains 129,490 unique users and 58,541 unique movie items, with a total of 16,830,839 movie ratings. The social friend network contains a total of 1,692,952 claimed social relationships.

The dataset includes two files: the user-item rating file "uir.index" and the user social friend network file "social.index".

Format of "uir.index" file:
UserId ItemId Rating

Format of "social.index" file:
UserId1 UserId2
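
As a minimal sketch of how the two formats above might be read in Python: the code below assumes one record per line, whitespace-separated fields, integer user and item IDs, and numeric ratings. None of these details are guaranteed by this README, and the function names (parse_ratings, parse_social, load_lines) are hypothetical helpers, not part of the dataset.

```python
from typing import Iterable, List, Tuple


def parse_ratings(lines: Iterable[str]) -> List[Tuple[int, int, float]]:
    """Parse 'UserId ItemId Rating' records (assumed one per line)."""
    ratings = []
    for line in lines:
        parts = line.split()  # assumption: fields are whitespace-separated
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        user_id, item_id, rating = parts
        # Assumption: IDs are integers and ratings are numeric.
        ratings.append((int(user_id), int(item_id), float(rating)))
    return ratings


def parse_social(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse 'UserId1 UserId2' records (assumed one per line)."""
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        edges.append((int(parts[0]), int(parts[1])))
    return edges


def load_lines(path: str) -> List[str]:
    """Read all lines of a text file, e.g. 'uir.index' or 'social.index'."""
    with open(path, encoding="utf-8") as handle:
        return handle.readlines()
```

With the files in the working directory, parse_ratings(load_lines("uir.index")) and parse_social(load_lines("social.index")) would return the records as lists of tuples. With roughly 16.8 million ratings, a streaming or array-based reader may be preferable to building a Python list.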

Note: When we crawled the Douban dataset, Douban only allowed a Facebook-like friendship mechanism (Douban now also supports a Twitter-like following mechanism). The social relationships in the file "social.index" should therefore be mutual. For example, if "126 257" exists in the "social.index" file, then "257 126" should also appear somewhere in the file. However, due to crawling issues (timeouts, network problems, temporary blocking, etc.), some relationships may be missing from the dataset. So if you see "126 257" in the file but cannot find "257 126", you can safely add "257 126" to the dataset.
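
The repair described in the note can be sketched as follows. This is only an illustration, assuming the relationships have already been read into (UserId1, UserId2) pairs; the function names missing_reverse and symmetrize are hypothetical and not part of the dataset.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def missing_reverse(edges: Iterable[Edge]) -> Set[Edge]:
    """Return the reverse pairs that are absent from the edge list."""
    edge_set = set(edges)
    return {(v, u) for (u, v) in edge_set if (v, u) not in edge_set}


def symmetrize(edges: Iterable[Edge]) -> Set[Edge]:
    """Return the edges with every missing reverse pair added."""
    result = set()
    for u, v in edges:
        result.add((u, v))
        result.add((v, u))  # add the mutual relationship if it was missed
    return result


# Example: "126 257" is present but "257 126" is not.
example = [(126, 257), (1, 2), (2, 1)]
repaired = symmetrize(example)
```

Here missing_reverse(example) reports only (257, 126), since the pair (1, 2) already has its reverse. Using a set also removes duplicate lines, which may or may not be desirable depending on how the file is used.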

Please don't forget to cite our WSDM '11 paper in your research paper :)

The BibTeX entry for our WSDM '11 paper:

 author = {Ma, Hao and Zhou, Dengyong and Liu, Chao and Lyu, Michael R. and King, Irwin},
 title = {Recommender systems with social regularization},
 booktitle = {Proceedings of the fourth ACM international conference on Web search and data mining},
 series = {WSDM '11},
 year = {2011},
 address = {Hong Kong, China},
 pages = {287--296},
 numpages = {10},

The plain text of our WSDM '11 paper:

H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King. Recommender systems with social regularization. In Proceedings of the fourth ACM international conference on Web search and data mining, WSDM '11, pages 287-296, Hong Kong, China, 2011.
pub/data/douban.txt · Last modified: 2012/02/01 04:35 by admin