Dark Channel-Assisted Depth-from-Defocus from a Single Image

Abstract

We estimate scene depth from a single defocus-blurred image, using the dark channel as a complementary cue for its ability to capture local statistics and scene structure. Traditional depth-from-defocus (DFD) methods use multiple images captured with varying apertures or focus settings. Single-image DFD remains underexplored because the problem is underconstrained. Our method uses the relationship between local defocus blur and contrast variations as depth cues to improve scene structure estimation. The pipeline is trained end-to-end with adversarial learning. Experiments on real data show that incorporating the dark channel prior into single-image DFD yields meaningful depth estimates.

Motivation

(a) This work is primarily motivated by the need for a reliable, high-performing depth estimation method from a single defocused image for robotics and computer vision applications. A single defocused image, captured instantly by a system or robot with a fixed aperture setting without relying on autofocus, can provide fast depth cues. Blur is not merely noise but a signal.

(b) Existing depth-from-defocus (DFD) methods typically use a video sequence or multiple images captured with varying apertures or focus settings, exploiting the defocus relationship among images taken with different focal settings. Multi-image DFD techniques often outperform single-image approaches, because single-image DFD is a significantly less constrained, ill-posed, and more challenging task.

(c) The Dark Channel Prior (DCP) is commonly used to estimate depth from hazy, foggy, or underwater images, where it is used to compute the scene transmission map, which is a function of depth. DCP has also been adapted to space-variant blur analysis for deblurring, based on the sparsity of the dark channel in deblurred images. Although defocus blur arises from the camera's optics, whereas haze and fog arise from optical scattering, the dark channel plays an analogous role in both types of degraded images. In defocus-blurred images, regions near the focal plane exhibit less blur. The dark channel highlights these regions because of their greater intensity variability. Conversely, in heavily blurred areas far from the focal plane, the dark channel has reduced intensity variance and lacks sharp details because of the smoothing effect of blur. This hints at the presence and extent of defocus blur, and so provides cues for depth estimation. We leverage the combined local intensity deviation of the defocused image and its dark channel, namely the Local Defocus and Dark Channel Variation (LDDCV) map, to improve DFD performance.

(d) Our single-image DFD approach offers a promising alternative to multi-image or hardware-intensive methods, enabling rapid depth inference from limited data and improving system efficiency. A system could use a fixed-focus, wide-aperture camera (which induces defocus blur) to passively infer depth from a single shot. This reduces system complexity and cost compared with active depth-sensing techniques, making it a practical and scalable solution for real-world automation applications.

Kernel Density Estimate (KDE) Plot of Depth Values in the NYU-Depth V2 Dataset. The KDE plot shows how the dark channel intensity discrepancy and the LDDCV map difference change with the normalized, spatially varying blur level, which is a function of scene depth.

Architecture Overview

Dark Channel and LDDCV Map as Complementary Cues

The Dark Channel emphasizes shadows, edges, and darker structural elements in a scene. While it discards fine textures and details, it preserves the major structure and depth transitions, reducing noise and highlighting larger spatial elements. This provides valuable context about the 3D layout and relative distances between objects.
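As a rough illustration of the cue described above, the dark channel can be computed as the minimum over color channels followed by a minimum over a local patch. This is a minimal sketch and not the authors' implementation: the function name `dark_channel`, the 15-pixel patch (a common choice in the dehazing literature), and the edge padding are all assumptions.

```python
import numpy as np

def dark_channel(image, patch=15):
    """Dark channel of an H x W x 3 image.

    Takes the per-pixel minimum over the color channels, then the minimum
    over a patch x patch neighborhood. `patch` is assumed to be odd so the
    window is centered on each pixel.
    """
    img = np.asarray(image, dtype=np.float64)
    min_rgb = img.min(axis=2)                 # minimum over color channels
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    h, w = min_rgb.shape
    out = np.full((h, w), np.inf)
    # Sliding minimum built from shifted views of the padded image.
    for dy in range(patch):
        for dx in range(patch):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out
```

The shifted-view loop is slow for large patches but keeps the sketch dependency-free; a dedicated minimum filter would be the usual choice in practice.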

The LDDCV Map is a dual-channel intensity variation map obtained by concatenating the Local Defocus Variation (LDV) and the Local Dark Channel Variation (LDCV). Together, these cues reveal the presence and extent of defocus blur, offering stronger insights into depth from a single out-of-focus image.
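The construction of the two-channel map can be sketched as follows. The page does not give the exact definition of "local variation", so this sketch assumes a local standard deviation over a square window; the names `local_std` and `lddcv_map`, the window size of 7, and the reflect padding are hypothetical choices made for illustration only.

```python
import numpy as np

def local_std(channel, window=7):
    """Local standard deviation in a window x window neighborhood.

    Assumption: "local variation" is measured as a standard deviation.
    `window` is assumed odd and smaller than the image dimensions.
    """
    x = np.asarray(channel, dtype=np.float64)
    pad = window // 2
    h, w = x.shape
    p = np.pad(x, pad, mode="reflect")
    s1 = np.zeros((h, w))   # running sum of values
    s2 = np.zeros((h, w))   # running sum of squared values
    for dy in range(window):
        for dx in range(window):
            tile = p[dy:dy + h, dx:dx + w]
            s1 += tile
            s2 += tile * tile
    n = window * window
    var = s2 / n - (s1 / n) ** 2
    return np.sqrt(np.clip(var, 0.0, None))  # clip guards against round-off

def lddcv_map(defocused_gray, dark, window=7):
    """Stack the LDV (from the defocused image) and LDCV (from its dark
    channel) into an H x W x 2 array."""
    ldv = local_std(defocused_gray, window)
    ldcv = local_std(dark, window)
    return np.stack([ldv, ldcv], axis=-1)
```

Here `defocused_gray` is a single-channel version of the defocused image and `dark` is its dark channel; how the authors normalize or weight the two channels is not stated on this page.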

Datasets

NYU-Depth V2 EBD Dataset

To generate optically realistic depth-dependent defocus effects in the all-in-focus (AIF) NYUv2 RGB images, we select parameters corresponding to a synthetic camera with: focal length f = 9 mm, in-focus plane D_fp = 0.7 m, F-number F_n = 2 (shallow DoF), pixel size p_x = 7.5 µm, and aperture diameter A = f / F_n.

The defocus-blurred image I_df is generated by convolving the all-in-focus (AIF) image I with a point spread function (PSF) G(x, y, r):

I_df(x, y) = G(x, y, r) * I(x, y),   G(x, y, r) = (1 / (2πr²)) · exp( -(x² + y²) / (2r²) )
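The Gaussian PSF above can be discretized as in the following sketch. It assumes a single blur radius r for the whole image, whereas the dataset uses a depth-dependent r, so this illustrates only the kernel and one uniform blur pass. The names `gaussian_psf` and `blur_uniform`, the 3r truncation, and the edge padding are assumptions.

```python
import numpy as np

def gaussian_psf(r, half_width=None):
    """Discrete Gaussian PSF with standard deviation r (in pixels).

    The kernel is truncated at about 3r (an assumption) and renormalized
    so that its entries sum to 1.
    """
    if half_width is None:
        half_width = max(1, int(np.ceil(3 * r)))
    ax = np.arange(-half_width, half_width + 1)
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-0.5 * (xx ** 2 + yy ** 2) / r ** 2) / (2 * np.pi * r ** 2)
    return g / g.sum()

def blur_uniform(image, r):
    """Blur a single-channel image with one fixed radius r.

    The kernel is symmetric, so this shifted-sum correlation is equal to
    a convolution.
    """
    img = np.asarray(image, dtype=np.float64)
    k = gaussian_psf(r)
    hw = k.shape[0] // 2
    h, w = img.shape
    p = np.pad(img, hw, mode="edge")
    out = np.zeros((h, w))
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            out += k[dy, dx] * p[dy:dy + h, dx:dx + w]
    return out
```

A depth-dependent blur would need a different radius per pixel or per depth layer; that step is not shown here.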

Following the thin-lens model, the blur radius r is calculated as a function of scene distance d_gt:

r(d_gt) = (1 / (√2 · p_x)) · (A · f / (D_fp - f)) · |d_gt - D_fp| / d_gt
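To make the formula concrete, the sketch below evaluates it with the camera parameters listed earlier. It assumes that p_x sits in the denominator together with √2, so that r comes out in pixels, and that all lengths are expressed in meters; the function name `blur_radius_px` is hypothetical.

```python
import math

# Camera parameters from the text, converted to meters.
F = 9e-3        # focal length f
D_FP = 0.7      # in-focus plane distance D_fp
F_N = 2.0       # F-number F_n
P_X = 7.5e-6    # p_x from the text, treated here as the pixel pitch
A = F / F_N     # aperture diameter A = f / F_n

def blur_radius_px(d_gt):
    """Blur radius r in pixels for a scene point at distance d_gt (meters).

    Assumes the reading r = 1 / (sqrt(2) * p_x) * A f / (D_fp - f)
    * |d_gt - D_fp| / d_gt.
    """
    scale = 1.0 / (math.sqrt(2.0) * P_X)
    lens_term = A * F / (D_FP - F)
    return scale * lens_term * abs(d_gt - D_FP) / d_gt

for d in (0.7, 1.0, 2.0, 5.0):
    print(f"d_gt = {d:.1f} m  ->  r = {blur_radius_px(d):.2f} px")
```

Under these assumptions a point on the focal plane (0.7 m) has zero blur, a point at 2 m has a radius of roughly 3.6 px, and the radius saturates at roughly 5.5 px as the distance grows.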

Visual Comparison

[Interactive comparison; panels: Real Defocus Input, D3-Net, Ours]

Citation

@misc{medhi2025darkchannelassistedDFD,
  title={Dark Channel-Assisted Depth-from-Defocus from a Single Image},
  author={Moushumi Medhi and Rajiv Ranjan Sahay},
  year={2025},
  eprint={2506.06643},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2506.06643}, 
}