Cite this article
  • 李春贵,王萌,原庆能.基于启发式信息熵的粗集数值属性离散化算法[J].广西科学院学报,2007,23(4):235-237.
  • LI Chun-gui, WANG Meng, YUAN Qing-neng. Discretization of Numerical Attributes in Rough Set Theory Based on Information Entropy with Heuristics Information[J]. Journal of Guangxi Academy of Sciences, 2007, 23(4): 235-237.
基于启发式信息熵的粗集数值属性离散化算法
李春贵, 王萌, 原庆能
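(Department of Computer Science, Guangxi University of Technology, Liuzhou, Guangxi 545006)
Abstract:
Under the consistency assumption, the statistical properties of the data set serve as heuristic knowledge for selecting cut points from the candidate cut-point set: the expected value and variance of the data set determine the region in which the optimal cut point is searched. On this basis, a new information-entropy-based algorithm for discretizing numerical attributes in rough sets is proposed and validated on the UCI benchmark data sets. The new algorithm yields exactly the same set of cut points as a previously reported algorithm, and the same discretized decision tables, but at a different time cost: its computational efficiency is 40%~50% higher.
Key words:  information entropy  rough set  numerical attribute  discretization  statistical property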
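Received: 2007-07-15
Funding: Supported by the Guangxi Natural Science Foundation (桂科自0481016) and the 2006 Scientific Research Fund of the Guangxi Department of Education (149)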
Discretization of Numerical Attributes in Rough Set Theory Based on Information Entropy with Heuristics Information
LI Chun-gui, WANG Meng, YUAN Qing-neng
(Department of Computer, Guangxi University of Technology, Liuzhou, Guangxi, 545006, China)
Abstract:
Under the consistency assumption in machine learning, the statistical properties of the data set are used as heuristic information for selecting discretization points from the candidate point set; specifically, the mean and variance of the data set determine the region in which the optimal discretization points are searched. A novel algorithm for discretizing numerical attributes based on information entropy is proposed and tested on UCI data sets. The experimental results show that the new algorithm selects the same set of discretization points as the existing algorithm and produces the same discretized decision tables, but at a lower time cost: it saves about 40%~50% of the computing time compared with the existing algorithm.
Key words:  information entropy  rough set  numerical attribute  discretization  statistical property
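
The abstract does not spell out the exact search rule, so the following is only a minimal sketch of the general idea under stated assumptions: candidate cut points are taken as midpoints between adjacent distinct attribute values, the search is limited to the window mean ± k·std, and the cut with the lowest weighted class entropy is chosen. The function names, the window width `k`, and the fallback to the full candidate set are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_entropy(values, labels, cut):
    """Weighted class entropy after splitting the samples at `cut`."""
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    n = len(labels)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)


def candidate_cuts(values):
    """Midpoints between adjacent distinct sorted attribute values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def best_cut(values, labels, k=1.0):
    """Pick the lowest-entropy cut, searching only within mean +/- k*std.

    Assumption: `k` and the fallback below are illustrative choices,
    not the rule used in the paper.
    """
    cuts = candidate_cuts(values)
    if not cuts:
        return None  # a constant attribute has no cut point
    mu, sigma = mean(values), pstdev(values)
    window = [c for c in cuts if mu - k * sigma <= c <= mu + k * sigma]
    search = window or cuts  # hypothetical fallback when the window is empty
    return min(search, key=lambda c: split_entropy(values, labels, c))
```

For the toy attribute `[1.0, 2.0, 3.0, 7.0, 8.0, 9.0]` with labels `a, a, a, b, b, b`, the window keeps three of the five candidates and `best_cut` returns `5.0`, the cut that separates the two classes. Restricting the search this way evaluates fewer candidates per attribute, which is the kind of saving the abstract reports, although the sketch makes no claim to reproduce the paper's cut sets or timings.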
