Reich, T., Budka, M. and Hulbert, D., 2021. Impact of Data Quality and Target Representation on Predictions for Urban Bus Networks. In: IEEE Symposium Series on Computational Intelligence, 1-4 December 2020, Canberra, ACT, Australia, 2843-2852.
Full text available as:
PDF: _Thilo__IEEE_SSCI_2020.pdf - Accepted Version. Available under License Creative Commons Attribution Non-commercial. 3MB
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
Official URL: https://doi.org/10.1109/SSCI47803.2020
Abstract
Passengers of urban bus networks often rely on forecasts of Estimated Times of Arrival (ETA) and live-vehicle movements to plan their journeys. ETA predictions are unreliable due to the lack of good-quality historical data, while ‘live’ positions in mobile apps suffer from delays in data transmission. This study uses deep neural networks to predict the next position of a bus under various vehicle-location data-quality regimes. Additionally, we assess the effect of the target representation in the prediction problem by encoding it either as unconstrained geographical coordinates, progress along a known trajectory, or ETA at the next two stops. We demonstrate that without data cleaning, model predictions give false confidence if mean errors are used, highlighting the importance of a holistic assessment of the results. We show that target representation affects prediction accuracy by constraining the prediction space. The literature is vague about quality issues in public transport data. Here we show that noisy data is a problem and discuss simple but effective approaches to address these issues. Research generally focuses on only a single method of target representation. Therefore, comparing several methods is a useful addition to the literature. This gives insight into the value of addressing data quality issues in urban transport data to enable better predictions and improve the passenger experience. We show that ‘rephrasing’ the prediction problem by changing the target representation can yield massively improved predictions. Our findings enable researchers using deep learning approaches in public transport to make more informed decisions about essential data cleaning steps and problem representation for improved results.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | Public transport; ETA prediction; Traffic analysis; Modeling and prediction; Machine learning; Deep learning |
Group: | Faculty of Science & Technology |
ID Code: | 36200 |
Deposited By: | Symplectic RT2 |
Deposited On: | 06 Nov 2021 12:37 |
Last Modified: | 14 Mar 2022 14:30 |