UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen1,3*, Yujin Wang1*, Xin Cai2, Zhiyuan You2, Zheming Lu3, Fan Zhang1, Shi Guo1, Tianfan Xue2,1
1Shanghai AI Laboratory, 2The Chinese University of Hong Kong, 3Zhejiang University
Our UltraFusion generates visually appealing results without ghosting artifacts in challenging scenes.

Abstract

Capturing high dynamic range (HDR) scenes is one of the most important problems in camera design. The majority of cameras use exposure fusion, which merges images captured at different exposure levels, to increase dynamic range. However, this approach can only handle inputs with a limited exposure difference, typically 3-4 stops. When applied to very high dynamic range scenes, where a much larger exposure difference is required, it often fails due to incorrect alignment, inconsistent lighting between inputs, or tone mapping artifacts. In this work, we propose UltraFusion, the first exposure fusion technique that can merge inputs with up to 9 stops of exposure difference. The key idea is to model exposure fusion as a guided inpainting problem, in which the under-exposed image guides the filling of missing highlight information in the over-exposed regions. By using the under-exposed image as soft guidance rather than a hard constraint, our model is robust to potential alignment issues and lighting variations. Moreover, by leveraging the image prior of a generative model, our model produces natural tone mapping even for very high dynamic range scenes. Our approach outperforms HDR-Transformer on the latest HDR benchmarks. Furthermore, to test its performance on ultra high dynamic range scenes, we capture a new real-world exposure fusion benchmark, the UltraFusion Dataset, with exposure differences of up to 9 stops, and experiments show that UltraFusion generates beautiful, high-quality fusion results under various scenarios.
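For intuition, each stop of exposure difference doubles the amount of captured light, so a 9-stop gap corresponds to a 512x brightness ratio, far beyond the 8-16x that a 3-4 stop pipeline handles. The minimal sketch below is illustrative only (not part of our pipeline); the saturation threshold is an assumption:

import numpy as np

def exposure_ratio(stops: float) -> float:
    # Each stop doubles the light reaching the sensor.
    return 2.0 ** stops

def overexposure_mask(img: np.ndarray, thresh: float = 0.95) -> np.ndarray:
    # img: float image in [0, 1], shape (H, W, 3). Pixels near saturation
    # carry no recoverable highlight detail and must be filled using the
    # under-exposed frame.
    return (img.max(axis=-1) >= thresh).astype(np.float32)

print(exposure_ratio(3), exposure_ratio(9))  # 8.0 512.0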

Overview of UltraFusion

We model HDR imaging as a guided inpainting problem. We use the over-exposed image as a reference and inpaint the missing information in its highlights. Unlike traditional inpainting, we use information from the under-exposed frame as soft guidance, so the inpainted highlights are not completely generated but stay consistent with the under-exposed frame.
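To make the formulation concrete, here is a hedged sketch of the guided-inpainting view of fusion. The inpaint_model argument is a hypothetical stand-in for a generative inpainter conditioned on a guidance image (e.g., a diffusion model), not the actual interface of our method:

import numpy as np

def guided_fusion(over, under, inpaint_model, thresh=0.95):
    # over, under: roughly aligned float images in [0, 1], shape (H, W, 3).
    # Saturated pixels in the over-exposed reference mark the regions to fill.
    mask = (over.max(axis=-1) >= thresh).astype(np.float32)
    # Soft guidance: the under-exposed frame conditions the generation of the
    # masked highlights instead of being copied pixel-for-pixel, so small
    # misalignments or lighting differences do not produce ghosting.
    return inpaint_model(image=over, mask=mask, guidance=under)

A hard constraint would instead paste under-exposed pixels directly into the mask, which is exactly what breaks under misalignment or lighting changes.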



Overview of our UltraFusion framework.

Visual Results

Visual results on static scenes.
Visual results on a dynamic scene.
Visual results on a real-captured scene.
Visual results on another real-captured scene.
Our UltraFusion can be extended to general image fusion.

BibTeX

@article{chen2025ultrafusion,
  title={UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion},
  author={Chen, Zixuan and Wang, Yujin and Cai, Xin and You, Zhiyuan and Lu, Zheming and Zhang, Fan and Guo, Shi and Xue, Tianfan},
  journal={arXiv preprint arXiv:2501.11515},
  year={2025}
}