Density Preserving Sampling (DPS) for error estimation and model selection.

Budka, M. and Gabrys, B., 2013. Density Preserving Sampling (DPS) for error estimation and model selection. IEEE Transactions on Neural Networks and Learning Systems, 24 (1), 22-34.

Full text available as:

PDF
IEEE_TPAMI_DPS.pdf - Accepted Version
442kB

Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk.

Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.

DOI: 10.1109/TNNLS.2012.2222925

Abstract

Estimation of the generalization ability of a classification or regression model is an important issue, as it indicates expected performance on previously unseen data and is also used for model selection. Commonly used generalization error estimation procedures, such as cross-validation (CV) and the bootstrap, are stochastic and thus require multiple repetitions to produce reliable results, which can be computationally expensive, if not prohibitive. The correntropy-based Density Preserving Sampling (DPS) procedure proposed in this paper eliminates the need to repeat the error estimation procedure by dividing the available data into subsets that are guaranteed to be representative of the input dataset. This makes it possible to produce low-variance error estimates with accuracy comparable to 10-times-repeated cross-validation at a fraction of the computation required by CV. The method can also be used successfully for model ranking and selection. This paper derives the Density Preserving Sampling procedure and investigates its usability and performance on a set of publicly available benchmark datasets with standard classifiers.
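The stochastic baseline that the abstract contrasts with DPS can be illustrated with a minimal sketch of repeated k-fold cross-validation. This is not the DPS procedure and is not taken from the paper: the synthetic data, the one-dimensional nearest-centroid classifier, and the helper names (make_data, fit_centroids, error_rate, kfold_error) are all assumptions made for illustration. Each repetition reshuffles the data, so the individual estimates can differ from run to run, which is why such procedures are usually repeated and averaged.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic two-class 1-D data (hypothetical, for illustration only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(2.0 * label, 1.0)  # class means at 0.0 and 2.0
        data.append((x, label))
    return data


def fit_centroids(train):
    """Nearest-centroid 'model': the mean of each class."""
    m0 = statistics.mean(x for x, y in train if y == 0)
    m1 = statistics.mean(x for x, y in train if y == 1)
    return m0, m1


def error_rate(model, test):
    """Fraction of test points assigned to the wrong class."""
    m0, m1 = model
    wrong = sum((abs(x - m1) < abs(x - m0)) != bool(y) for x, y in test)
    return wrong / len(test)


def kfold_error(data, k, rng):
    """One run of k-fold CV; the random shuffle makes the estimate stochastic."""
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errs = []
    for fold in folds:
        held_out = set(fold)
        train = [data[j] for j in idx if j not in held_out]
        test = [data[j] for j in fold]
        errs.append(error_rate(fit_centroids(train), test))
    return statistics.mean(errs)


data = make_data(100, random.Random(0))

# 10 repetitions of 10-fold CV, each with a different shuffle.
estimates = [kfold_error(data, 10, random.Random(seed)) for seed in range(10)]
repeated_cv_estimate = statistics.mean(estimates)
spread = statistics.pstdev(estimates)

print("per-repetition estimates:", estimates)
print("averaged estimate:", repeated_cv_estimate, "spread:", spread)
```

The ten repetitions train and test the model 100 times in total. A deterministic split into representative subsets, as the abstract describes for DPS, would aim to avoid that repetition.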

Item Type:	Article
ISSN:	2162-237X
Uncontrolled Keywords:	error estimation, model selection, sampling, cross-validation, bootstrap, correntropy
Group:	Faculty of Science & Technology
ID Code:	13829
Deposited By:	Professor Bogdan Gabrys LEFT
Deposited On:	20 Apr 2010 13:00
Last Modified:	14 Mar 2022 13:30
