Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M., 2018. A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams. IEEE Transactions on Neural Networks and Learning Systems, 29 (1), 74-86.
Full text available as:
PDF (©2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users.)
07605500.pdf - Accepted Version. Available under License Creative Commons Attribution Non-commercial No Derivatives. 2MB
PDF (©2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users.)
ITNNLS_Sept16.pdf - Accepted Version. Restricted to Registered users only. Available under License Creative Commons Attribution Non-commercial No Derivatives. 2MB
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
DOI: 10.1109/TNNLS.2016.2614393
Abstract
Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier’s model using as few labels as possible. The challenge for streaming is that the data distribution may evolve over time and therefore the model must adapt. Another challenge is the sampling bias where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur as the training set needs to represent the whole evolving data. To tackle these challenges, we propose a novel bi-criteria AL approach (BAL) that relies on two selection criteria, namely a label uncertainty criterion and a density-based criterion. While the first criterion selects instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared to the state-of-the-art AL methods.
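To make the idea of combining an uncertainty signal with a density signal on a stream more concrete, the following is a minimal sketch and not the BAL algorithm itself. It rests on several assumptions that do not come from the paper: plain SGD logistic regression stands in for the Bayesian online logistic regression, a single running diagonal Gaussian stands in for the online growing Gaussian mixture model, and the two signals are combined by simple density-weighted uncertainty sampling rather than by the sample weighting the abstract describes. All class, function, and parameter names (`OnlineLogReg`, `RunningGaussian`, `query_score`, `run_stream`, `budget_scale`) are hypothetical.

```python
import math
import random


class OnlineLogReg:
    """Plain SGD logistic regression (assumed stand-in, not Bayesian)."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr

    def prob(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        err = self.prob(x) - y  # gradient of the log loss w.r.t. the logit
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


class RunningGaussian:
    """One diagonal Gaussian with running mean/variance (assumed stand-in
    for a growing Gaussian mixture), updated with Welford's method."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim

    def update(self, x):
        self.n += 1
        for i, xi in enumerate(x):
            d = xi - self.mean[i]
            self.mean[i] += d / self.n
            self.m2[i] += d * (xi - self.mean[i])

    def relative_density(self, x):
        """Unnormalised density in (0, 1]; equals 1.0 at the running mean."""
        if self.n < 2:
            return 1.0
        s = 0.0
        for i, xi in enumerate(x):
            var = self.m2[i] / (self.n - 1) + 1e-6
            s += (xi - self.mean[i]) ** 2 / var
        return math.exp(-0.5 * s)


def query_score(p, density):
    """Density-weighted uncertainty: high for uncertain points in dense regions."""
    uncertainty = 1.0 - abs(2.0 * p - 1.0)  # 1 at p = 0.5, 0 at p in {0, 1}
    return uncertainty * density


def run_stream(stream, dim, budget_scale=1.0, seed=0):
    """Process (x, y) pairs once; y is only used when the instance is queried."""
    rng = random.Random(seed)
    clf, dens = OnlineLogReg(dim), RunningGaussian(dim)
    queried = 0
    for x, y in stream:
        score = query_score(clf.prob(x), dens.relative_density(x))
        dens.update(x)  # the density model sees every instance, labelled or not
        if rng.random() < min(1.0, budget_scale * score):
            clf.update(x, y)
            queried += 1
    return clf, queried
```

Calling `run_stream(stream, dim=2)` on an iterable of `(features, label)` pairs with labels in `{0, 1}` returns the trained classifier and the number of labels requested. A single Gaussian cannot follow multi-modal or drifting data, which is one reason a growing mixture is the more natural choice for evolving streams.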
Item Type: | Article |
---|---|
ISSN: | 2162-2388 |
Additional Information: | Horizon 2020 Grant 687691 related to the Project: PROTEUS: Scalable Online Machine Learning for Predictive Analytics and Real-Time Interactive Visualization. |
Uncontrolled Keywords: | Active Learning; Data Streams; Bayesian Online Learning; Concept Drift |
Group: | Faculty of Science & Technology |
ID Code: | 24782 |
Deposited By: | Symplectic RT2 |
Deposited On: | 28 Sep 2016 09:39 |
Last Modified: | 14 Mar 2022 13:59 |