To generate data that is consistent with functional dependencies (FDs), a directed acyclic graph (DAG) is used to model the FD set. An algorithm for tuple generation with a single FD (TGSFD) is proposed to generate data consistent with a single FD, and an attribute sorting algorithm extends it to data generation under multiple FDs. To exploit pipelining for more efficient data generation, the concept of a minimal independent attribute subset (MIAS) is introduced, together with an algorithm for partitioning the attribute set. Experimental results show that TGSFD and the attribute sorting algorithm guarantee the FD consistency of the generated data, and that attribute set partitioning combined with pipelining effectively improves the efficiency of data generation.
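The ideas summarized above can be illustrated with a minimal Python sketch. It is not the TGSFD, attribute sorting, or partitioning algorithm of the paper, whose details are not given here. All function names are hypothetical, and the sketch rests on three assumptions: each FD has a single attribute on its right side, each attribute is determined by at most one FD, and an "independent attribute subset" is read as a connected component of the FD graph. Attributes are ordered topologically so that determinants are generated first, and a lookup table reuses the dependent value whenever a determinant value repeats.

```python
import random
from graphlib import TopologicalSorter  # Python 3.9+


def sort_attributes(attributes, fds):
    """Order attributes so each FD's left side precedes its right side."""
    sorter = TopologicalSorter({a: set() for a in attributes})
    for lhs, rhs in fds:
        sorter.add(rhs, *lhs)
    return list(sorter.static_order())  # raises CycleError if not a DAG


def generate_rows(attributes, fds, n_rows, domain_size=5, seed=0):
    """Generate rows consistent with every FD (assumed: one FD per right side)."""
    determined_by = {}
    for lhs, rhs in fds:
        if rhs in determined_by:
            raise ValueError(f"{rhs} is determined by more than one FD")
        determined_by[rhs] = tuple(lhs)
    rng = random.Random(seed)
    order = sort_attributes(attributes, fds)
    memo = {rhs: {} for rhs in determined_by}  # determinant value -> dependent value
    rows = []
    for _ in range(n_rows):
        row = {}
        for attr in order:
            value = rng.randrange(domain_size)
            if attr in determined_by:
                key = tuple(row[a] for a in determined_by[attr])
                # Reuse the value already bound to this determinant, if any.
                value = memo[attr].setdefault(key, value)
            row[attr] = value
        rows.append(row)
    return rows


def satisfies(rows, fds):
    """Check that equal left-side values always imply equal right-side values."""
    for lhs, rhs in fds:
        seen = {}
        for row in rows:
            key = tuple(row[a] for a in lhs)
            if seen.setdefault(key, row[rhs]) != row[rhs]:
                return False
    return True


def partition_attributes(attributes, fds):
    """Group attributes linked by FDs (assumed reading of independent subsets)."""
    parent = {a: a for a in attributes}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for lhs, rhs in fds:
        for a in lhs:
            parent[find(a)] = find(rhs)
    groups = {}
    for a in attributes:
        groups.setdefault(find(a), []).append(a)
    return list(groups.values())


attributes = ["A", "B", "C", "D", "E"]
fds = [(("A",), "B"), (("A", "B"), "C"), (("D",), "E")]
rows = generate_rows(attributes, fds, n_rows=200)
print(satisfies(rows, fds))
print(partition_attributes(attributes, fds))  # [['A', 'B', 'C'], ['D', 'E']]
```

Under these assumptions, the two groups returned by `partition_attributes` share no FD, so their columns could be produced by separate workers or pipeline stages and joined row by row afterwards. The sketch itself does not implement pipelining.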