Adaptive Preprocessing for Streaming Data.

Zliobaite, I. and Gabrys, B., 2014. Adaptive Preprocessing for Streaming Data. IEEE Transactions on Knowledge and Data Engineering, 26 (2), 309-321.

Full text available as:

Zliobaite_Gabrys_Adaptive_Preprocessing_IEEE_TKDE_2014_post_print.pdf - Accepted Version

826kB

DOI: 10.1109/TKDE.2012.147

Abstract

Many supervised learning approaches that adapt to changes in data distribution over time (e.g., concept drift) have been developed. The majority of them assume that the data comes already preprocessed or that preprocessing is an integral part of a learning algorithm. In real-world applications, data that comes from, e.g., sensor readings is typically noisy and contains missing values and redundant features, and a very large part of the model development effort is devoted to data preprocessing. As data evolves over time, learning models need to be able to adapt to changes automatically. From a practical perspective, automating a predictor makes little sense if preprocessing requires manual adjustment over time. Nevertheless, adaptation of preprocessing has been largely overlooked in research. In this paper, we introduce and address the problem of adaptive preprocessing. We analyze when and under what circumstances it is beneficial to handle adaptivity of preprocessing and adaptivity of the learning model separately. We present three scenarios where handling adaptive preprocessing separately benefits the final prediction accuracy and illustrate them using computational examples. As a result of our analysis, we construct a prototype approach for combining adaptive preprocessing with an adaptive predictor online. Our case study with real sensory data from a production process demonstrates that decoupling the adaptivity of preprocessing and the predictor contributes to improving the prediction accuracy. The developed reference framework and our experimental findings are intended to serve as a starting point for systematic research on adaptive preprocessing mechanisms for adaptive learning with evolving data.
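
The decoupling described in the abstract can be pictured with a small sketch. This is an illustration only and is not the authors' prototype: the class names `AdaptiveScaler` and `OnlineLinearPredictor`, the function `run_stream`, the exponential-forgetting scaler, the gradient-descent predictor, and all parameter values are assumptions chosen for brevity. The point it shows is that the preprocessing step and the predictor each keep their own state and their own adaptation rate, so a shift in the raw sensor scale can be absorbed by the preprocessing without the predictor having to relearn.

```python
import random


class AdaptiveScaler:
    """Hypothetical preprocessing step: running mean/variance with forgetting."""

    def __init__(self, forgetting=0.05):
        self.forgetting = forgetting  # assumed adaptation rate of the preprocessing
        self.mean = None
        self.var = 1.0

    def update(self, x):
        if x is None:  # missing reading: nothing to learn from
            return
        if self.mean is None:
            self.mean = x
            return
        delta = x - self.mean
        self.mean += self.forgetting * delta
        # Exponentially weighted variance update.
        self.var = (1 - self.forgetting) * (self.var + self.forgetting * delta * delta)

    def transform(self, x):
        # A missing value is imputed with the running mean, i.e. 0 after scaling.
        if x is None or self.mean is None:
            return 0.0
        return (x - self.mean) / (self.var ** 0.5 + 1e-9)


class OnlineLinearPredictor:
    """Hypothetical predictor: one-feature linear model trained by online SGD."""

    def __init__(self, learning_rate=0.05):
        self.learning_rate = learning_rate  # adaptation rate of the predictor
        self.weight = 0.0
        self.bias = 0.0

    def predict(self, z):
        return self.weight * z + self.bias

    def update(self, z, y):
        error = self.predict(z) - y
        self.weight -= self.learning_rate * error * z
        self.bias -= self.learning_rate * error


def run_stream(stream, scaler, predictor):
    """Test-then-train loop; returns the absolute error at each step."""
    errors = []
    for x, y in stream:
        z = scaler.transform(x)            # preprocess with the current state
        errors.append(abs(predictor.predict(z) - y))
        scaler.update(x)                   # preprocessing adapts on its own
        predictor.update(z, y)             # predictor adapts on its own
    return errors


if __name__ == "__main__":
    random.seed(0)
    stream = []
    for t in range(400):
        offset = 0.0 if t < 200 else 5.0   # sensor offset shifts; target relation does not
        signal = random.gauss(0.0, 1.0)
        x = None if random.random() < 0.05 else signal + offset
        stream.append((x, 2.0 * signal))
    errors = run_stream(stream, AdaptiveScaler(0.05), OnlineLinearPredictor(0.05))
    print("mean abs error, last 50 steps:", sum(errors[-50:]) / 50)
```

In this toy stream only the input offset drifts, so the scaler is the component that needs to adapt. Whether separate adaptation helps in a given application depends on the kind of drift, which is the question the paper analyzes.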

Item Type: Article
ISSN: 1041-4347
Uncontrolled Keywords: Concept drift; adaptive preprocessing; streaming data
Subjects: UNSPECIFIED
Group: Faculty of Science and Technology
ID Code: 22865
Deposited By: Unnamed user with email symplectic@symplectic
Deposited On: 09 Nov 2015 12:02
Last Modified: 09 Nov 2015 12:02
