Skip to main content

3D facial performance capture from monocular RGB video.

Liu, S., 2018. 3D facial performance capture from monocular RGB video. Doctoral Thesis (Doctoral). Bournemouth University.

Full text available as:

LIU, Shuang_Ph.D._2017.pdf



3D facial performance capture is an essential technique for animation production in featured films, video gaming, human computer interaction, VR/AR asset creation and digital heritage, which all have huge impact on our daily life. Traditionally, dedicated hardware such as depth sensors, laser scanners and camera arrays have been developed to acquire depth information for such purpose. However, such sophisticated instruments can only be operated by trained professionals. In recent years, the wide spread availability of mobile devices, and the increased interest of casual untrained users in applications such as image, video editing, virtual and facial model creation, have sparked interest in 3D facial reconstruction from 2D RGB input. Due to the depth ambiguity and facial appearance variation, 3D facial performance capture and modelling from 2D images are inherently ill-posed problems. However, with strong prior knowledge of the human face, it is possible to accurately infer the true 3D facial shape and performance from multiple observations captured with different viewing angles. Various 3D from 2D methods have been proposed and proven to work well in controlled environments. Nevertheless there are still many unexplored issues in uncontrolled in-the-wild environments. In order to achieve the same level of performance in controlled environments, interfering factors in uncontrolled environments such as varying illumination, partial occlusion and facial variation not captured by prior knowledge would require the development of new techniques. This thesis addresses existing challenges and proposes novel methods involving 2D landmark detection, 3D facial reconstruction and 3D performance tracking, which are validated through theoretical research and experimental studies. 3D facial performance tracking is a multidisciplinary problem involving many areas such as computer vision, computer graphics and machine learning. To deal with the large variations within a single image, we present new machine learning techniques for facial landmark detection based on our observation of the facial features in challenging scenarios to increase the robustness. To take advantage of the evidence aggregated from multiple observations, we present new robust and efficient optimisation techniques that impose consistency constrains that help filter out outliers. To exploit the person-specific model generation, temporal and spatial coherence in continuous video input, we present new methods to improve the performance via optimisation. In order to track the 3D facial performance, the fundamental prerequisite for good results is the accurate underlying 3D model of the actor. In this thesis, we present new methods that are targeted at 3D facial geometry reconstruction, which are more efficient than existing generic 3D geometry reconstruction methods. Evaluation and validation were obtained and analysed from substantial experiment, which shows the proposed methods in this thesis outperform the state-of-the-art methods and enable us to generate high quality results with less constraints.

Item Type:Thesis (Doctoral)
Additional Information:If you feel that this work infringes your copyright please contact the BURO Manager.
Uncontrolled Keywords:facial capture; facial animation
Group:Faculty of Media & Communication
ID Code:30227
Deposited By: Symplectic RT2
Deposited On:17 Jan 2018 14:34
Last Modified:09 Aug 2022 16:04


Downloads per month over past year

More statistics for this item...
Repository Staff Only -