UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen1,3*, Yujin Wang1*, Xin Cai2, Zhiyuan You2, Zheming Lu3, Fan Zhang1, Shi Guo1, Tianfan Xue2,1
1Shanghai AI Laboratory, 2The Chinese University of Hong Kong, 3Zhejiang University
Our UltraFusion generates visually appealing results without ghosting artifacts in challenging scenes.

Abstract

Capturing high dynamic range (HDR) scenes is one of the most important problems in camera design. The majority of cameras use exposure fusion, which merges images captured at different exposure levels, to increase dynamic range. However, this approach can only handle inputs with a limited exposure difference, typically 3-4 stops. When applied to very high dynamic range scenes, where a much larger exposure difference is required, it often fails due to incorrect alignment, inconsistent lighting between inputs, or tone mapping artifacts. In this work, we propose UltraFusion, the first exposure fusion technique that can merge inputs with up to 9 stops of exposure difference. The key idea is to model exposure fusion as a guided inpainting problem, in which the under-exposed image guides the filling of missing highlight information in the over-exposed regions. By using the under-exposed image as soft guidance rather than a hard constraint, our model is robust to potential alignment issues and lighting variations. Moreover, by leveraging the image prior of a generative model, our model produces natural tone mapping even for very high dynamic range scenes. Our approach outperforms HDR-Transformer on the latest HDR benchmarks. Furthermore, to test its performance on ultra high dynamic range scenes, we capture a new real-world exposure fusion benchmark, the UltraFusion Dataset, with exposure differences of up to 9 stops, and experiments show that UltraFusion generates beautiful, high-quality fusion results under various scenarios.
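For intuition, each stop of exposure difference doubles the amount of captured light, so a 9-stop gap corresponds to a 512x brightness ratio, far beyond the 8-16x that a 3-4 stop pipeline handles. The minimal sketch below is illustrative only (not part of our pipeline); the saturation threshold is an assumption:

import numpy as np

def exposure_ratio(stops: float) -> float:
    # Each stop doubles the light reaching the sensor.
    return 2.0 ** stops

def overexposure_mask(img: np.ndarray, thresh: float = 0.95) -> np.ndarray:
    # img: float image in [0, 1], shape (H, W, 3). Pixels near saturation
    # carry no recoverable highlight detail and must be filled using the
    # under-exposed frame.
    return (img.max(axis=-1) >= thresh).astype(np.float32)

print(exposure_ratio(3), exposure_ratio(9))  # 8.0 512.0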

Overview of UltraFusion

We model HDR imaging as a guided inpainting problem. We use the over-exposed image as a reference and inpaint the missing information in its highlights. Unlike traditional inpainting, we use information from the under-exposed frame as soft guidance, so the inpainted highlights are not completely generated but stay consistent with the under-exposed frame.
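To make the formulation concrete, here is a hedged sketch of the guided-inpainting view of fusion. The inpaint_model argument is a hypothetical stand-in for a generative inpainter conditioned on a guidance image (e.g., a diffusion model), not the actual interface of our method:

import numpy as np

def guided_fusion(over, under, inpaint_model, thresh=0.95):
    # over, under: roughly aligned float images in [0, 1], shape (H, W, 3).
    # Saturated pixels in the over-exposed reference mark the regions to fill.
    mask = (over.max(axis=-1) >= thresh).astype(np.float32)
    # Soft guidance: the under-exposed frame conditions the generation of the
    # masked highlights instead of being copied pixel-for-pixel, so small
    # misalignments or lighting differences do not produce ghosting.
    return inpaint_model(image=over, mask=mask, guidance=under)

A hard constraint would instead paste under-exposed pixels directly into the mask, which is exactly what breaks under misalignment or lighting changes.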



Overview of our UltraFusion framework.

Visual Results

Visual results on static scenes.
Visual results on a dynamic scene.
Visual results on a real-captured scene.
Visual results on another real-captured scene.
Our UltraFusion can be extended to general image fusion.

BibTeX

@article{chen2025ultrafusion,
  title={UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion},
  author={Chen, Zixuan and Wang, Yujin and Cai, Xin and You, Zhiyuan and Lu, Zheming and Zhang, Fan and Guo, Shi and Xue, Tianfan},
  journal={arXiv preprint arXiv:2501.11515},
  year={2025}
}