Development of a scalable real-world data generator for benchmarking data matching applications.

Apeh, E. T., 2008. Development of a scalable real-world data generator for benchmarking data matching applications. Masters Thesis (Masters). Bournemouth University.

Full text not available from this repository.

Abstract

Personal data has now gained prominence as the lifeblood of societal existence. Both government and business organisations now collect and store vast amounts of personal data obtained from various sources. The existence of databases holding personal details and the proliferation of personal data sources have, however, brought about an increase in poor quality personal data. Techniques for effectively handling the now prevalent data quality problem are in huge demand. In many cases, obtaining and using the appropriate technique that will most adequately address a particular occurrence of poor quality data is more of an art than a science, as no one technique can adequately address the multiplicity, inconsistency, redundancy and inaccuracy that commonly occur in real world data. This thesis gives a detailed examination of the problem of data quality in databases and presents data matching as a potential solution. It presents an in-depth analysis of data matching algorithms together with a novel scalable system for simulating real world data for the purpose of benchmarking data matching algorithms. The thesis concludes by presenting the performance of Paribus, a commercially available data matching system, on simulated real world personal data, and directions for future work.

Item Type: Thesis (Masters)
Subjects: Generalities > Computer Science and Informatics
Group: School of Design, Engineering & Computing
ID Code: 15911
Deposited By: Mrs Jill Burns
Deposited On: 10 Aug 2010 13:50
Last Modified: 16 Oct 2012 10:49