Saul, M. and Rostami, S., 2022. Assessing performance of artificial neural networks and re-sampling techniques for healthcare datasets. Health Informatics Journal, 28 (1). (In Press)
Full text available as: PDF (open access article), 14604582221087109.pdf, Published Version, available under License Creative Commons Attribution, 1MB.
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
DOI: 10.1177/14604582221087109
Abstract
Re-sampling methods for class imbalance problems have been shown to improve classification accuracy by mitigating the bias introduced by differences in class size. However, a model that applies a specific re-sampling technique before artificial neural network (ANN) training may not be suitable for classifying varied datasets from the healthcare industry. Five healthcare-related datasets were used across three re-sampling conditions: under-sampling, over-sampling and combi-sampling. Within each condition, different algorithmic approaches were applied to each dataset, and the results were statistically analysed for significant differences in ANN performance. In the combi-sampling condition, four of the five datasets did not show significant consistency in the optimal re-sampling technique between the two performance evaluation methods, F1-score and Area Under the Receiver Operating Characteristic Curve. In contrast, in the over-sampling and under-sampling conditions, all five datasets yielded the same optimal algorithmic approach across both evaluation methods. Furthermore, the optimal combi-sampling technique (under-sampling, over-sampling and convergence point) was found to be consistent across evaluation measures in only two of the five datasets. This study exemplifies how discrete ANN performances on datasets from the same industry can arise in two ways: the same re-sampling technique can generate varying ANN performance on different datasets, and different re-sampling techniques can generate varying ANN performance on the same dataset.
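As a rough illustration of the three re-sampling conditions named in the abstract, the sketch below uses plain random re-sampling on toy data. It is not the authors' method: the abstract does not name the specific algorithms compared, and the helper `resample_to`, its `target` parameter, and the reading of the "convergence point" as a shared per-class size are assumptions made only for this example. ANN training is omitted; an F1-score helper is included because it is one of the two evaluation measures mentioned.

```python
import random
from collections import Counter


def resample_to(X, y, target, seed=0):
    """Hypothetical helper: return (X, y) with exactly `target` rows per class."""
    rng = random.Random(seed)
    rows_by_class = {}
    for row, label in zip(X, y):
        rows_by_class.setdefault(label, []).append(row)

    X_out, y_out = [], []
    for label, rows in rows_by_class.items():
        if len(rows) >= target:
            # Under-sample: keep a random subset without replacement.
            chosen = rng.sample(rows, target)
        else:
            # Over-sample: keep all rows and add random duplicates.
            chosen = rows + [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


def f1_score(y_true, y_pred, positive=1):
    """F1-score for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy imbalanced data (assumed for illustration): 90 negative rows, 10 positive rows.
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10
sizes = Counter(y)

under = resample_to(X, y, target=min(sizes.values()))  # shrink to the minority size
over = resample_to(X, y, target=max(sizes.values()))   # grow to the majority size
combi = resample_to(X, y, target=50)                   # assumed convergence point
```

In a comparison of the kind the abstract describes, each re-sampled training set would be used to train a separate ANN, and the resulting F1-score and AUC values on held-out data would then be compared across techniques and datasets.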
Item Type: | Article |
---|---|
ISSN: | 1741-2811 |
Uncontrolled Keywords: | artificial intelligence; artificial neural networks; healthcare; no-free-lunch; re-sampling |
Group: | Faculty of Media & Communication |
ID Code: | 36820 |
Deposited By: | Symplectic RT2 |
Deposited On: | 04 Apr 2022 15:04 |
Last Modified: | 04 Apr 2022 15:04 |