Paper Reading Report

Author: Zhang Sheng (张胜)    Date: October 9

I. Title
Parallel Spectral Clustering in Distributed Systems

II. Source
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

III. Abstract
Spectral clustering algorithms have been shown to be more effective in finding clusters than some traditional algorithms, such as k-means. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the size of a data set is large. To perform clustering on large data sets, we investigate two representative ways of approximating the dense similarity matrix. We compare one approach by sparsifying the matrix with another by the Nyström method. We then pick the strategy of sparsifying the matrix via retaining nearest neighbors and investigate its parallelization. We parallelize both memory use and computation on distributed computers. Through an empirical study on a document data set of 193,844 instances and a photo data set.

IV. Research Problem

V. Research Objectives

VI. Research Methods

VII. Conclusions

VIII. Remarks
Note: This section is mainly your own view of the paper. Which aspects are worth learning from? Which parts may be worth exploring further (improving)?