Budka, M. and Gabrys, B., 2013. Density Preserving Sampling: Robust and Efficient Alternative to Cross-validation for Error Estimation. IEEE Transactions on Neural Networks and Learning Systems, 24 (1), 22-34.
Full text available as: PDF, TNNLS-2012-P-0123.pdf (Accepted Version, 540kB)
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
DOI: 10.1109/TNNLS.2012.2222925
Abstract
Estimation of the generalization ability of a classification or regression model is an important issue, as it indicates the expected performance on previously unseen data and is also used for model selection. Currently used generalization error estimation procedures, such as cross-validation (CV) or bootstrap, are stochastic and, thus, require multiple repetitions in order to produce reliable results, which can be computationally expensive, if not prohibitive. The correntropy-inspired density-preserving sampling (DPS) procedure proposed in this paper eliminates the need for repeating the error estimation procedure by dividing the available data into subsets that are guaranteed to be representative of the input dataset. This allows the production of low-variance error estimates with an accuracy comparable to 10 times repeated CV at a fraction of the computations required by CV. This method can also be used for model ranking and selection. This paper derives the DPS procedure and investigates its usability and performance using a set of public benchmark datasets and standard classifiers.
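The stochastic nature of cross-validation mentioned in the abstract can be illustrated with a small sketch. This is not the DPS procedure from the paper; it only shows, under assumed conditions, why a single k-fold CV run depends on the random split and why repetitions are commonly averaged. The synthetic data, the nearest-class-mean classifier, and the helper names (`make_data`, `nearest_mean_error`, `kfold_error`) are all hypothetical choices made for illustration.

```python
import random
import statistics


def make_data(n, seed):
    """Synthetic 1-D two-class data (an assumption for illustration):
    class 0 ~ N(0, 1), class 1 ~ N(1.5, 1), classes alternate."""
    rng = random.Random(seed)
    return [(rng.gauss(1.5 * (i % 2), 1.0), i % 2) for i in range(n)]


def nearest_mean_error(train, test):
    """Fit a nearest-class-mean rule on train; return the error rate on test."""
    means = {c: statistics.fmean(x for x, y in train if y == c) for c in (0, 1)}
    wrong = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) != y
    )
    return wrong / len(test)


def kfold_error(data, k, seed):
    """One run of k-fold CV; the fold assignment depends on the seed."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        test = [data[i] for i in fold]
        errors.append(nearest_mean_error(train, test))
    return statistics.fmean(errors)


data = make_data(100, seed=0)

# Ten single 10-fold CV runs on the same data, differing only in the split.
single_runs = [kfold_error(data, k=10, seed=s) for s in range(10)]

# "10 times repeated CV": average the ten runs to reduce split-induced variation.
repeated_cv = statistics.fmean(single_runs)
spread = max(single_runs) - min(single_runs)

print("single-run estimates:", [round(e, 3) for e in single_runs])
print("repeated-CV estimate:", round(repeated_cv, 3))
print("spread across runs:  ", round(spread, 3))
```

Each entry of `single_runs` requires training 10 models, so the repeated estimate here costs 100 model fits. That cost is what the abstract refers to when it describes repeated CV as computationally expensive.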
Item Type: | Article |
---|---|
ISSN: | 1045-9227 |
Uncontrolled Keywords: | Bootstrap, correntropy, cross-validation, error estimation, model selection, sampling |
Group: | Faculty of Science & Technology |
ID Code: | 20876 |
Deposited By: | Symplectic RT2 |
Deposited On: | 17 Jun 2013 09:08 |
Last Modified: | 14 Mar 2022 13:47 |