Cite this article:
- Tang Yifang, Zhong Dafu, Zhang Shichao. Pre-Processing for Data Cleansing[J]. Guangxi Sciences, 2005, 12(2): 118-122.
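Abstract:
To improve the quality of data cleansing, this paper proposes three pre-processing methods: eliminating dirty data fields, using unified abbreviations, and data conversion. Based on these three methods and an algorithm that stores duplicate records in a linked list, a data cleansing system is designed. A comparison of efficiency and accuracy with other methods shows that the system yields more accurate data than existing data cleansing systems.
Keywords: data cleansing, dirty data, pre-processing, external source file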
Received: 2005-01-06. Revised: 2005-03-07.
Funding: Supported by an Australian national large project (ARC: DP0343109).

Pre-Processing for Data Cleansing
Tang Yifang1, Zhong Dafu1, Zhang Shichao1,2
(1. Coll. of Math. & Comp. Sci., Guangxi Normal Univ., Guilin, Guangxi, 541004, China; 2. Faculty of Info. Tech., Sydney Tech. Univ., Sydney, Australia)
Abstract:
To improve the quality of data cleansing, three pre-processing methods are proposed: eliminating dirty data fields, using unified abbreviations, and data conversion. Based on these methods and an algorithm that stores duplicate records in a linked list, a data cleansing system is implemented. The system achieves higher accuracy than existing data cleansing systems.
Key words: data cleansing, dirty data, pre-processing, external source file
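
The pre-processing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the abbreviation table, the function names, and the choice of case folding as the "data conversion" step are all assumptions made for illustration, and a Python list per key stands in for the linked list that the paper uses to store duplicate records.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table (assumption); the paper's external
# source file is not reproduced on this page.
ABBREVIATIONS = {
    "univ.": "university",
    "coll.": "college",
    "tech.": "technology",
}


def strip_dirty_chars(value: str) -> str:
    """Replace anything other than word characters, whitespace, and
    periods with a space, then collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s.]", " ", value)
    return re.sub(r"\s+", " ", cleaned).strip()


def unify_abbreviations(value: str) -> str:
    """Lower-case each word (the assumed conversion step) and expand
    any word found in the abbreviation table."""
    words = (w.lower() for w in value.split())
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def normalize(value: str) -> str:
    """Apply the pre-processing steps in order."""
    return unify_abbreviations(strip_dirty_chars(value))


def group_duplicates(records):
    """Group records that share a normalized key and return only the
    groups with more than one member."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize(record)].append(record)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Calling `group_duplicates` on a list of raw field values returns a dictionary from each normalized key to the original values that collapse onto it, so values that differ only in stray symbols, spacing, case, or abbreviation end up in the same group.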