Luo, Y., Yuan, C., Gao, L., Xu, W., Yang, X. and Wang, P., 2025. FaTNET: Feature-alignment transformer network for human pose transfer. Pattern Recognition, 165, 111626. (In Press)
Full text available as:
PDF: PR-D-23-02324_R2_reduced size.pdf - Accepted Version (777kB). Restricted to Repository staff only until 4 April 2027. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
DOI: 10.1016/j.patcog.2025.111626
Abstract
Pose-guided person image generation converts an image of a person from a source pose to a target pose. The task is challenging because of extensive pose variability and occlusion. Existing methods rely heavily on CNN-based architectures, which are constrained by their local receptive fields and often struggle to preserve details of style and shape. To address this problem, we propose a novel transformer-based framework for human pose transfer that exploits global dependencies while preserving local features. The proposed framework consists of a transformer encoder, a feature alignment network and a transformer synthesis network, enabling the generation of realistic person images in desired poses. The core idea of our framework is to obtain a novel prior image aligned with the target image through the feature alignment network in an embedded, disentangled feature space, and then to synthesize the final fine image through the transformer synthesis network by recurrently warping the result of the previous stage with the correlation matrix between the aligned features and the source image. In contrast to previous convolutional and non-local methods, ours exploits a global receptive field while also preserving detailed features. Qualitative and quantitative experiments demonstrate the superiority of our model in human pose transfer.
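The abstract describes the pipeline only at a high level. As a rough illustration, the PyTorch sketch below shows one plausible reading of it: a transformer encoder embeds the source tokens, a feature alignment stage produces a prior aligned with the target pose, and a synthesis stage recurrently warps source features using a correlation (attention) matrix. All names (PoseTransferSketch, CorrelationWarp), shapes and layer counts are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CorrelationWarp(nn.Module):
    """Hypothetical module: warp source features with the correlation
    (attention) matrix between aligned target features and source features."""

    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, aligned, source):
        # aligned, source: (B, N, C) sequences of flattened patch tokens.
        corr = self.q(aligned) @ self.k(source).transpose(1, 2)
        attn = torch.softmax(corr / source.size(-1) ** 0.5, dim=-1)  # (B, N, N)
        return attn @ self.v(source)  # source features rearranged to the target layout


class PoseTransferSketch(nn.Module):
    """Illustrative encoder -> alignment -> recurrent warp/synthesis pipeline."""

    def __init__(self, dim=256, stages=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)  # transformer encoder
        self.align = nn.TransformerEncoder(layer, num_layers=2)    # feature alignment network
        self.warps = nn.ModuleList([CorrelationWarp(dim) for _ in range(stages)])
        self.refine = nn.TransformerEncoder(layer, num_layers=2)   # synthesis network

    def forward(self, src_tokens, pose_tokens):
        src = self.encoder(src_tokens)       # embed the source image
        out = self.align(src + pose_tokens)  # prior aligned with the target pose
        for warp in self.warps:              # recurrent warping stages
            out = self.refine(out + warp(out, src))
        return out


# Toy usage; batch size, token count and channel width are illustrative only.
B, N, C = 2, 1024, 256
out = PoseTransferSketch(dim=C)(torch.randn(B, N, C), torch.randn(B, N, C))
print(out.shape)  # torch.Size([2, 1024, 256])
```

In this reading, the correlation matrix plays the role of a soft warp: each target-aligned token gathers a weighted mixture of source tokens, which is how a global receptive field can coexist with preserved local detail.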
| Item Type: | Article |
|---|---|
| ISSN: | 0031-3203 |
| Uncontrolled Keywords: | People image generation; Human pose transfer; Generative adversarial network; Transformers |
| Group: | Faculty of Media & Communication |
| ID Code: | 40994 |
| Deposited By: | Symplectic RT2 |
| Deposited On: | 02 May 2025 12:29 |
| Last Modified: | 02 May 2025 12:29 |