GAMAFlow: Estimating 3D Scene Flow via Grouped Attention and Global Motion Aggregation.

Li, Z., Yang, X. and Zhang, J., 2024. GAMAFlow: Estimating 3D Scene Flow via Grouped Attention and Global Motion Aggregation. In: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). New York: IEEE, 3955-3959.

Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk.

Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.

DOI: 10.1109/ICASSP48485.2024.10447849

Abstract

The estimation of 3D motion fields, known as scene flow estimation, is an essential task in autonomous driving and robotic navigation. Existing learning-based methods either predict scene flow through flow-embedding layers or rely on local search methods to establish soft correspondences. However, these methods often neglect distant points which, in fact, represent the true matching elements. To address this challenge, we introduce GAMAFlow, a point-voxel architecture that models both local and global motion to predict scene flow iteratively. In particular, GAMAFlow integrates the advantages of (i) the point Transformer with Grouped Attention and (ii) Global Motion Aggregation to boost the efficacy of point-voxel correlation. This approach facilitates learning long-distance dependencies between the current frame and the next frame. Experiments demonstrate the performance gains achieved by GAMAFlow over existing works on both the FlyingThings3D and KITTI benchmarks.
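
The abstract does not say how Global Motion Aggregation is computed. As a rough illustration of the general idea only (letting every point draw on the motion features of all other points, including distant ones), the following NumPy sketch applies a plain attention-weighted average. The function name `aggregate_global_motion`, the feature shapes, and the use of context self-similarity as attention scores are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the per-row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_global_motion(context, motion):
    """Hypothetical global aggregation step (illustrative only).

    context: (N, C) per-point context features.
    motion:  (N, D) per-point motion features.
    Returns an (N, D) array in which each row is an attention-weighted
    average of ALL rows of `motion`, so distant points can contribute.
    """
    n, c = context.shape
    # Pairwise similarity between every pair of points, scaled by sqrt(C).
    scores = context @ context.T / np.sqrt(c)   # shape (N, N)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ motion                     # shape (N, D)


# Toy input: 6 points, 4 context channels, 3 motion channels.
rng = np.random.default_rng(0)
context = rng.normal(size=(6, 4))
motion = rng.normal(size=(6, 3))
aggregated = aggregate_global_motion(context, motion)
print(aggregated.shape)
```

Because the attention weights span all N points rather than a local neighbourhood, this sketch costs O(N²) in time and memory. A practical system would need some way to keep that tractable for large point clouds.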

Item Type:	Book Section
ISBN:	9798350344851
ISSN:	1520-6149
Additional Information:	14-19 April 2024, COEX, Seoul, Korea
Uncontrolled Keywords:	Solid modeling; Three-dimensional displays; Search methods; Estimation; Signal processing; Predictive models; Transformers; Scene Flow Estimation; Attention Model; Point-Voxel Correlation; 3D Perception
Group:	Faculty of Media & Communication
ID Code:	40135
Deposited By:	Symplectic RT2
Deposited On:	09 Jul 2024 10:24
Last Modified:	09 Jul 2024 10:24
