Salvador, M. M., Budka, M. and Gabrys, B., 2016. Towards automatic composition of multicomponent predictive systems. In: 11th International Conference on Hybrid Artificial Intelligence Systems, 18 - 20 April 2016, Seville, Spain.
Full text available as: PDF, HAIS18.pdf (Accepted Version, 713kB)
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
Official URL: http://hais2016.upo.es/
Abstract
Automatic composition and parametrisation of multicomponent predictive systems (MCPSs) consisting of chains of data transformation steps is a challenging task. In this paper we propose and describe an extension to the Auto-WEKA software which now allows such flexible MCPSs to be composed and optimised using a sequence of WEKA methods. In the experimental analysis we examine how significantly extending the search space, by incorporating additional hyperparameters of the models, affects the quality of the solutions found. In a range of extensive experiments, three different optimisation strategies are used to automatically compose MCPSs on 21 publicly available datasets. A comparison with previous work indicates that extending the search space improves the classification accuracy in the majority of cases. The diversity of the MCPSs found also indicates that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models. This can have a big impact on the development, maintenance and scalability of high-quality predictive models in modern application and deployment scenarios.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | KDD process; CASH problem; Bayesian optimisation; Data mining and decision support systems; Data preprocessing |
Group: | Faculty of Science & Technology |
ID Code: | 23388 |
Deposited By: | Symplectic RT2 |
Deposited On: | 12 Apr 2016 13:59 |
Last Modified: | 14 Mar 2022 13:55 |