AdSMOTE: A Technique for “High Proportion” Audio Augmentation

Jayathunge, K.; Yang, Xiaosong; Southern, Richard

Jayathunge, K., Yang, X. and Southern, R., 2023. AdSMOTE: A Technique for “High Proportion” Audio Augmentation. In: Soh, H., Geib, C. and Petrick, R., eds. Proceedings of the Inaugural 2023 Summer Symposium Series. Washington, DC: AAAI Publications, 14-18.

Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk.

Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.

Official URL: https://ojs.aaai.org/index.php/AAAI-SS/article/vie...

Abstract

Data augmentation is widely used in machine learning and deep learning, primarily because it reduces the generalisation gap between training and validation and artificially increases the number of available training data points. This is particularly relevant to audio datasets, which are usually smaller and, in some applications, suffer from imbalanced classes. This work presents adSMOTE (audio SMOTE), a novel sampling and augmentation strategy, and compares it to Specaugment, one of the most effective augmentation strategies for audio data. We show that our method outperforms the latter by a considerable margin when the proportion of synthetic training samples is high. We also provide source code for the complete algorithm, which can easily be integrated into an existing model, enabling the rapid development of augmentation frameworks.
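
The abstract does not describe how adSMOTE works internally, so the following is only background on the classic SMOTE idea that the name refers to: a synthetic sample is placed at a random point on the line segment between an existing sample and one of its nearest neighbours. The function name `smote_samples`, its parameters, and the use of plain feature vectors are assumptions made for illustration; this is a minimal sketch of generic SMOTE interpolation, not the authors' algorithm or their released source code.

```python
import numpy as np

def smote_samples(X, n_new, k=5, seed=0):
    """Generate n_new synthetic rows by classic SMOTE-style interpolation.

    X is an (n, d) array of feature vectors from a single class (n >= 2).
    This is generic SMOTE, shown as background; it is not adSMOTE itself.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise squared Euclidean distances; exclude each point from its own list.
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its k nearest neighbours.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position along the segment base -> neighbour.
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[chosen] - X[base])
```

For example, calling `smote_samples(features, n_new=3 * len(features))` would produce a training set in which synthetic samples outnumber real ones three to one, which is the kind of "high proportion" regime the abstract refers to. How adSMOTE adapts this idea to audio is described in the paper and its source code, not here.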

Item Type:	Book Section
Volume:	1
Issue:	1
Uncontrolled Keywords:	Audio Augmentation; SMOTE; Specaugment; Text-to-speech
Group:	Faculty of Media & Communication (Until 31/07/2025)
ID Code:	39202
Deposited By:	Symplectic RT2
Deposited On:	28 Nov 2023 10:47
Last Modified:	28 Nov 2023 10:47
