A Comparison of Re-sampling Techniques for Pattern Classification in Imbalanced Data-Sets.

Saul, M.A. and Rostami, S., 2018. A Comparison of Re-sampling Techniques for Pattern Classification in Imbalanced Data-Sets. In: UKCI 2018: 18th Annual UK Workshop on Computational Intelligence, 5-7 September 2018, Nottingham Trent University, Nottingham, United Kingdom. (In Press)

Full text available as:

PDF
UKCI2018_Sampling.pdf - Accepted Version
Restricted to Repository staff only until 8 September 2018.
Available under License Creative Commons Attribution Non-commercial No Derivatives.

2MB

Official URL: http://ukci2018.uk/

Abstract

Class imbalance is a common challenge in pattern classification of real-world medical data-sets. An effective and commonly used countermeasure is re-sampling. In this paper we implement an artificial neural network (ANN) with different re-sampling techniques and compare the resulting classification performance. The re-sampling strategies were a control, under-sampling, over-sampling, and a combination of under- and over-sampling. We found that over-sampling and the combined strategy both led to significantly better classifier performance in correctly predicting labelled classes than under-sampling alone.
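The four strategies named in the abstract can be illustrated with a minimal sketch. It assumes plain random re-sampling, which the abstract does not confirm; the paper may use other techniques. The function name `resample`, the strategy labels, and the midpoint target used for the combined strategy are all hypothetical choices made for illustration, and the ANN itself is not shown.

```python
import random
from collections import Counter


def resample(samples, labels, strategy="control", seed=0):
    """Randomly re-balance a labelled data-set (illustrative assumption,
    not the method used in the paper).

    strategy: "control"  - return the data unchanged in class sizes
              "under"    - shrink every class to the smallest class size
              "over"     - grow every class to the largest class size
              "combined" - move every class to the midpoint of the two
    """
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    sizes = [len(xs) for xs in by_class.values()]
    if strategy == "control":
        target = None
    elif strategy == "under":
        target = min(sizes)
    elif strategy == "over":
        target = max(sizes)
    elif strategy == "combined":
        target = (min(sizes) + max(sizes)) // 2
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    out = []
    for y, xs in by_class.items():
        if target is None or len(xs) == target:
            chosen = list(xs)
        elif len(xs) > target:
            # Under-sample: draw without replacement.
            chosen = rng.sample(xs, target)
        else:
            # Over-sample: keep all originals, add random duplicates.
            chosen = xs + rng.choices(xs, k=target - len(xs))
        out.extend((x, y) for x in chosen)

    rng.shuffle(out)
    return [x for x, _ in out], [y for _, y in out]


if __name__ == "__main__":
    # Toy data: 90 majority-class samples and 10 minority-class samples.
    X = list(range(100))
    y = [0] * 90 + [1] * 10
    for name in ("control", "under", "over", "combined"):
        _, new_y = resample(X, y, name)
        print(name, dict(Counter(new_y)))
```

On the toy data above, the control keeps the 90/10 split, under-sampling yields 10 per class, over-sampling 90 per class, and the combined strategy 50 per class. In practice re-sampling would normally be applied to the training split only, so that the test set keeps the original class distribution.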

Item Type: Conference or Workshop Item (Paper)
Additional Information: This paper is embargoed until after it has been presented at the conference.
Uncontrolled Keywords: machine learning; imbalanced data; over-sampling; under-sampling
Group: Faculty of Science & Technology
ID Code: 31059
Deposited By: Unnamed user with email symplectic@symplectic
Deposited On: 26 Jul 2018 13:34
Last Modified: 26 Jul 2018 14:01
