4DSloMo: 4D Reconstruction for High Speed Scene
with Asynchronous Capture

Shanghai AI Laboratory · The Chinese University of Hong Kong · The University of Hong Kong · NVIDIA

Our method reconstructs high-speed, complex motion with high quality.

Abstract

Reconstructing fast dynamic scenes from multi-view videos is crucial for high-speed motion analysis and realistic 4D reconstruction. However, the majority of 4D capture systems are limited to frame rates below 30 FPS (frames per second), and direct 4D reconstruction of high-speed motion from low-FPS input may lead to undesirable results. In this work, we propose a high-speed 4D capture system that uses only low-FPS cameras, through novel capture and processing modules. On the capture side, we propose an asynchronous capture scheme that increases the effective frame rate by staggering the start times of cameras. By grouping cameras and leveraging a base frame rate of 25 FPS, our method achieves an equivalent frame rate of 100–200 FPS without requiring specialized high-speed cameras. On the processing side, we further propose a novel generative model to fix artifacts caused by 4D sparse-view reconstruction, since asynchrony reduces the number of viewpoints available at each timestamp. Specifically, we train a video-diffusion-based artifact-fix model for sparse 4D reconstruction, which restores missing details, maintains temporal consistency, and improves overall reconstruction quality. Experimental results demonstrate that our method significantly enhances high-speed 4D reconstruction compared to synchronized capture.
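
To make the staggered-start scheme concrete, the sketch below computes per-group start-time offsets; the function name and the equal-spacing grouping are illustrative assumptions on our part, not code from the paper.

```python
# Minimal sketch (not the authors' code): start-time offsets for an
# asynchronous capture rig in which cameras are split into groups that
# all run at the same base frame rate.

def staggered_offsets(num_groups: int, base_fps: float = 25.0) -> list[float]:
    """Offset each camera group by an equal fraction of one frame interval.

    With `num_groups` groups at `base_fps`, the interleaved timestamps
    cover the scene at an effective rate of num_groups * base_fps
    (e.g. 4 groups x 25 FPS = 100 FPS).
    """
    frame_interval = 1.0 / base_fps  # 40 ms at 25 FPS
    return [g * frame_interval / num_groups for g in range(num_groups)]


if __name__ == "__main__":
    # 4 groups at 25 FPS -> offsets 0, 10, 20, 30 ms -> ~100 FPS effective
    print(staggered_offsets(4))   # [0.0, 0.01, 0.02, 0.03]
    # 8 groups at 25 FPS -> ~200 FPS effective
    print(staggered_offsets(8))
```

Because every camera still records at the base frame rate, each interleaved timestamp is observed by only one camera group, which is exactly why the processing side needs an artifact-fix model for sparse-view reconstruction.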

Overview of 4DSloMo

Given several asynchronous multi-view videos, we first optimize an initial 4D Gaussian model for a fixed number of iterations. We then employ an artifact-fix video diffusion model to refine the rendered videos, and the refined videos are subsequently used to further update the 4D Gaussian model.
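
The alternating loop below is a minimal, runnable sketch of this overview under our own naming assumptions; the `FourDGaussianModel` and `ArtifactFixDiffusion` classes and all iteration counts are placeholders, not the released 4DSloMo implementation.

```python
# Illustrative skeleton of the alternating optimization described above.
# Both models are stubbed so only the control flow is shown.

class FourDGaussianModel:                      # placeholder for a 4D Gaussian model
    def optimize(self, videos, iters):         # fit Gaussians to supervision frames
        print(f"optimizing on {len(videos)} videos for {iters} iterations")

    def render(self, videos):                  # render each viewpoint/timestamp
        return [f"render_of_{v}" for v in videos]


class ArtifactFixDiffusion:                    # placeholder video diffusion model
    def refine(self, renders):                 # remove sparse-view artifacts
        return [f"refined_{r}" for r in renders]


def reconstruct(async_videos, warmup_iters=3000, refine_rounds=2, iters_per_round=2000):
    model = FourDGaussianModel()
    model.optimize(async_videos, iters=warmup_iters)    # initial sparse-view fit

    fixer = ArtifactFixDiffusion()
    for _ in range(refine_rounds):
        renders = model.render(async_videos)            # artifact-prone renders
        refined = fixer.refine(renders)                 # diffusion-based artifact fixing
        model.optimize(refined, iters=iters_per_round)  # update 4DGS with refined videos
    return model


if __name__ == "__main__":
    reconstruct(["cam0.mp4", "cam1.mp4", "cam2.mp4"])
```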

Results

Video

Citation

If you find this work helpful, please consider citing:
@article{
}