
(3.5.1)--Bigdataanalysis3-5DataReduction.pdf

Uploaded by: 职教中国 | Document ID: 13831183 | Uploaded: 2022-10-28 | Format: PDF | Pages: 15 | Size: 625.77 KB
Description

What is data reduction?

Data reduction derives a condensed data set from the original, much larger data set while preserving the integrity of the original data. Analysis on the condensed data set is then noticeably more efficient, and its results are essentially the same as those obtained from the original data set.

Criteria for data reduction

The time spent on data reduction should not exceed or cancel out the time saved by analyzing the reduced data.
The reduced data should be much smaller than the original data, yet produce the same or almost the same analysis results.

Data reduction techniques

Data reduction - Dimension reduction

Dimension reduction - Attribute subset selection

Decision tree induction: apply decision tree induction to the initial data to obtain an initial decision tree. Attributes that do not appear in the tree are considered irrelevant; deleting them from the initial attribute set yields a better subset of attributes.

Reduction based on statistical analysis

Data reduction - Data compression

Lossless compression: the original data can be restored from the compressed data without losing any information. Example: string compression, which has a broad theoretical foundation and well-developed algorithms.
Lossy compression: only an approximation of the original data can be reconstructed. Example: audio/video compression. Sometimes a fragment can be reconstructed without decompressing the whole data set.

Principal component analysis (PCA): assume the data to be compressed consist of N tuples (data vectors) in k dimensions. PCA searches for c orthogonal k-dimensional vectors that best represent the data, where c ≤ k. Projecting the original data onto this smaller space achieves the compression.

Data reduction - Data cube aggregation

Data reduction - Discretization

Three types of attribute values:
Nominal: values from an unordered set.
Ordinal: values from an ordered set.
Continuous: real numbers.

Discretization reduces the number of values of a continuous attribute by dividing its range into several intervals.

Data reduction - Concept hierarchy generation

Youth, Middle aged, Prime of life
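The lossless compression property described above (the original data can be restored exactly) can be illustrated with a short round trip. This is only a sketch: the slides do not name a specific algorithm, so the choice of Python's standard `zlib` module (DEFLATE) and the sample string are assumptions made for illustration.

```python
import zlib

# Hypothetical sample: a highly repetitive string, which compresses well.
original = ("data reduction " * 200).encode("utf-8")

# Compress at the highest compression level (9), then restore.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)

# Lossless: the restored bytes are identical to the original bytes.
print(len(original), len(compressed), restored == original)
```

A lossy scheme, by contrast, would only guarantee that the restored data is close to the original, not identical.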
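The PCA projection described above can be sketched for the simplest case, c = 1. The code below is an illustration under stated assumptions, not the method from the slides: it finds only the first principal component, it uses power iteration on the covariance matrix instead of a full eigendecomposition, and the function names and the five sample points are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the dominant principal component."""
    n, k = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - mean[j] for j in range(k)] for r in rows]
    # Sample covariance matrix (k x k).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    # Power iteration: repeatedly multiply by cov and renormalise.
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all points identical: no direction to find
            break
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, v):
    """Map each k-dimensional row to one coordinate along direction v."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]

# Hypothetical data lying close to the line y = 2x.
data = [(2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8), (6.0, 12.0)]
mean, direction = first_principal_component(data)
scores = project(data, mean, direction)
print(direction, scores)
```

Each two-dimensional point is reduced to a single score, so the data shrinks from k = 2 values per tuple to c = 1. For c > 1, a library eigendecomposition would normally be used in place of power iteration.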
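Dividing the range of a continuous attribute into intervals, as described for discretization above, can be sketched with equal-width binning. The slides do not say which binning strategy is intended, so equal-width bins, the function name `equal_width_bins`, and the sample ages are all assumptions for illustration.

```python
def equal_width_bins(values, n_bins):
    """Return (bin index per value, bin edges) using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    if width == 0:
        # All values identical: everything falls into the first bin.
        return [0] * len(values), edges
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    return bins, edges

# Hypothetical continuous attribute: ages of nine people.
ages = [18, 22, 25, 31, 38, 44, 47, 52, 60]
bins, edges = equal_width_bins(ages, 3)
print(edges)  # [18.0, 32.0, 46.0, 60.0]
print(bins)   # [0, 0, 0, 0, 1, 1, 2, 2, 2]
```

Nine distinct ages are reduced to three interval indices. Attaching a label to each interval would give one level of a concept hierarchy; which label belongs to which interval is not specified in the source.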
