
Nanjing University Data Mining, Chapter 8: Cluster Analysis.pdf

Uploaded by: weiwoduzun  Document ID: 1758230  Upload date: 2018-08-22  Format: PDF  Pages: 26  Size: 197.43 KB

Data Mining
Fall 2006
Chapter 8: Cluster Analysis
Zhi-Hua Zhou
Department of Computer Science

[...]
otherwise exit

Hierarchical methods
A hierarchical method creates a hierarchical decomposition of the given set of data objects.
Two schemes:
  agglomerative: bottom-up, starts with each object forming a separate group
  divisive: top-down, starts with all the objects in the same cluster
Representatives:
  AGNES (AGglomerative NESting)
  DIANA (DIvisive ANAlysis)
  BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
  CURE (Clustering Using REpresentatives)
  ROCK (RObust Clustering using linKs)
  CHAMELEON

AGNES
AGglomerative NESting, an agglomerative method
  Step 1: every object is placed into a cluster of its own
  Step 2: merge the clusters according to the minimum Euclidean distance between the nearest objects in the clusters
  Step 3: if all objects have been merged into one "whole" cluster, exit; otherwise go to Step 2

DIANA
DIvisive ANAlysis, a divisive method
  Step 1: all the objects are placed in one cluster
  Step 2: split the clusters according to the maximum Euclidean distance between the nearest objects in the clusters
  Step 3: if each cluster contains only one object, exit; otherwise go to Step 2

Density-based methods
A density-based method creates clusters by continuing to grow a cluster as long as the density of the data objects in the neighborhood exceeds some threshold.
Representatives:
  DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  OPTICS (Ordering Points To Identify the Clustering Structure)
  DENCLUE (DENsity-based CLUstEring)
  CLIQUE (CLustering In QUEst)
Basic idea: for each object of a cluster, the neighborhood of a given radius ε (called the ε-neighborhood) has to contain at least a minimum number of objects (MinPts).
Key concepts:
  An object P whose ε-neighborhood contains no fewer than MinPts objects is a core object with respect to ε and MinPts.
  An object M is directly density-reachable from object P with respect to ε and MinPts if M is within the ε-neighborhood of P and that neighborhood contains at least MinPts objects (that is, P is a core object).
  An object Q is density-reachable from object P with respect to ε and MinPts if there is a chain of objects p_1, ..., p_n with p_1 = P and p_n = Q such that each p_(i+1) is directly density-reachable from p_i with respect to ε and MinPts.
  An object S is density-connected to object R with respect to ε and MinPts if there is an object O such that both S and R are density-reachable from O with respect to ε and MinPts.

DBSCAN

Grid-based method
A grid-based method quantizes the object space into a finite number of cells that form a grid structure, and then performs clustering operations on the grid structure.
Representatives:
  STING (STatistical INformation Grid)
  WaveCluster
  CLIQUE (CLustering In QUEst)

STING
  The spatial area is divided into rectangular cells.
  There are usually several levels of cells, corresponding to different levels of resolution.
  A cell at a high level is partitioned to form a number of cells at the next lower level.

Model-based method
A model-based method hypothesizes a model for each of the clusters and finds the best fit of the data to that model.
Two schemes:
  statistical method: uses probability measures
  neural network method: uses competitive excitatory and inhibitory mechanisms
Representatives:
  COBWEB
  CLASSIT
  AutoClass
  Competitive learning
  SOM (Self-Organizing Maps)

To master data mining, you should read more and practice more.

More

Game Over
Hope you enjoy the course
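The three AGNES steps listed in the slides can be illustrated with a small sketch. This is not the lecturer's code: the function name `agnes_single_link`, the `target_clusters` parameter (added so that merging can stop before a single cluster remains), and the brute-force pair search are all assumptions made for illustration. It uses only the Python standard library.

```python
import math

def agnes_single_link(points, target_clusters=1):
    """Hypothetical sketch of AGNES with nearest-object (single-link) distance.

    points: list of coordinate tuples.
    Returns (clusters, merges); clusters hold indices into points.
    """
    # Step 1: every object is placed into a cluster of its own.
    clusters = [[i] for i in range(len(points))]

    def single_link(a, b):
        # Cluster distance = minimum Euclidean distance between
        # the nearest pair of objects, one from each cluster.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    merges = []
    # Steps 2-3: merge the closest pair, then repeat until done.
    while len(clusters) > max(target_clusters, 1):
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = single_link(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), d))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return clusters, merges
```

Calling `agnes_single_link(points)` with the default `target_clusters=1` runs until the single "whole" cluster of Step 3 is reached; the returned `merges` list records the order of merges, which is the information a dendrogram would display. The brute-force search is chosen for readability and does not scale to large data sets.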

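The density-based key concepts above (core object, density-reachable, density-connected) can be made concrete with a short sketch. It illustrates the definitions only and is not the full DBSCAN algorithm from the slides. The function names and the parameters `eps` (the neighborhood radius) and `min_pts` (MinPts) are hypothetical, and the sketch assumes that an object counts as a member of its own neighborhood and as reachable from itself when it is a core object.

```python
import math
from collections import deque

def eps_neighborhood(points, p, eps):
    """Indices of all objects within distance eps of object p (p included)."""
    return [q for q in range(len(points))
            if math.dist(points[p], points[q]) <= eps]

def is_core(points, p, eps, min_pts):
    """p is a core object if its eps-neighborhood holds at least min_pts objects."""
    return len(eps_neighborhood(points, p, eps)) >= min_pts

def density_reachable_from(points, p, eps, min_pts):
    """Set of objects density-reachable from p; empty if p is not a core object."""
    if not is_core(points, p, eps, min_pts):
        return set()
    reached = {p}
    queue = deque([p])
    while queue:
        current = queue.popleft()
        # Only core objects can extend the chain p_1, ..., p_n.
        if not is_core(points, current, eps, min_pts):
            continue
        for q in eps_neighborhood(points, current, eps):
            if q not in reached:
                reached.add(q)
                queue.append(q)
    return reached

def density_connected(points, s, r, eps, min_pts):
    """True if some object o exists from which both s and r are density-reachable."""
    return any({s, r} <= density_reachable_from(points, o, eps, min_pts)
               for o in range(len(points)))
```

Density-reachability is not symmetric: a border object can be reached from a core object, but nothing is reachable from the border object itself. Density-connectedness restores symmetry, which is why it is the relation used to define a cluster.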