Camera

The end2end_imaging.camera module provides the end-to-end camera model that couples an optical lens with an image sensor for differentiable image simulation.

Camera ties a lens and a sensor into a single differentiable capture model. It renders a scene through the optics and the sensor to produce a simulated raw or RGB image, and it is the main entry point for image simulation and end-to-end co-design.

end2end_imaging.camera.Camera

Camera(lens_file, sensor_file, lens_type='geolens', sensor_type='rgb', device=None)

Bases: Renderer

End-to-end camera model coupling an optical lens with an image sensor.

Simulates the full image-capture pipeline used in computational imaging research:

  1. Unprocess: convert input sRGB to linear RGB via the invertible ISP.
  2. Lens simulation: convolve with the lens PSF (or ray-trace) to produce a degraded linear-RGB image at the sensor plane.
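  3. Sensor simulation: mosaic the linear-RGB image to Bayer raw, add shot and read noise, then pack the result for the network, either as RGGB planes or, for output_type "rgb", as an sRGB image produced by the forward ISP (demosaicing, gamma).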
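
The three stages above can be illustrated on a single row of pixels. This is only a minimal sketch under stated assumptions: the standard sRGB transfer function stands in for the library's invertible ISP, a fixed 3-tap kernel stands in for the lens PSF, and the noise model is a Gaussian approximation with made-up parameters (`shot_gain`, `read_std`). None of the function names below belong to end2end_imaging.

```python
import random


def srgb_to_linear(c):
    """Standard sRGB decoding; an assumed stand-in for sensor.unprocess."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Standard sRGB encoding; an assumed stand-in for the ISP gamma step."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def blur_row(row, psf):
    """Filter a 1-D row with a normalized, symmetric PSF, replicating edges."""
    half = len(psf) // 2
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, weight in enumerate(psf):
            j = min(max(i + k - half, 0), len(row) - 1)  # clamp to the row
            acc += weight * row[j]
        out.append(acc)
    return out


def add_noise(row, shot_gain=0.01, read_std=0.002, rng=None):
    """Signal-dependent (shot) plus constant (read) Gaussian noise."""
    rng = rng or random.Random(0)
    noisy = []
    for value in row:
        std = (shot_gain * max(value, 0.0) + read_std ** 2) ** 0.5
        noisy.append(min(max(value + rng.gauss(0.0, std), 0.0), 1.0))
    return noisy


srgb_row = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
linear_row = [srgb_to_linear(v) for v in srgb_row]       # 1. unprocess
blurred_row = blur_row(linear_row, [0.25, 0.5, 0.25])    # 2. lens blur
noisy_row = add_noise(blurred_row)                       # 3. sensor noise
output_row = [linear_to_srgb(v) for v in noisy_row]      # 3. back to sRGB
```

The blur and the noise are applied in linear space because that is where light adds up physically; applying them to gamma-encoded values would distort both.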

Attributes:

Name Type Description
lens

Optical lens object (GeoLens, HybridLens, etc.).

sensor

Image sensor object (RGBSensor, MonoSensor, etc.).

Initialize a camera from lens and sensor configuration files.

Parameters:

Name Type Description Default
lens_file str

Path to the lens configuration file. The accepted format depends on lens_type: JSON for all lens types; additionally .zmx / .seq for "geolens".

required
sensor_file str

Path to a JSON sensor configuration file.

required
lens_type str

Lens model. One of "geolens" (default), "hybridlens", "defocuslens", "diffraclens".

'geolens'
sensor_type str

Sensor model. One of "simple", "rgb" (default), "mono". "event" is not handled by the constructor source on this page and raises NotImplementedError.

'rgb'
device str or None

Compute device. Defaults to None (auto-selects GPU if available).

None

Raises:

Type Description
NotImplementedError

If lens_type or sensor_type is not recognised.
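
Because construction loads configuration files, it can be convenient to reject bad arguments before calling Camera. The sketch below is a hypothetical helper, not part of end2end_imaging. It encodes the rules stated on this page: the four lens types, the three sensor types that the constructor source actually branches on, and the file extensions accepted per lens type. The ValueError for a mismatched extension is this sketch's own choice; how the library reports that case is not documented here.

```python
from pathlib import Path

# Values taken from the constructor documentation and source on this page.
LENS_TYPES = ("geolens", "hybridlens", "defocuslens", "diffraclens")
SENSOR_TYPES = ("simple", "rgb", "mono")


def check_camera_args(lens_file, lens_type="geolens", sensor_type="rgb"):
    """Hypothetical pre-check: raise early on arguments Camera would reject."""
    if lens_type not in LENS_TYPES:
        raise NotImplementedError(f"Unsupported lens type: {lens_type}")
    if sensor_type not in SENSOR_TYPES:
        raise NotImplementedError(f"Unsupported sensor type: {sensor_type}")

    # JSON is accepted for every lens type; .zmx / .seq only for "geolens".
    allowed = {".json"}
    if lens_type == "geolens":
        allowed |= {".zmx", ".seq"}
    suffix = Path(lens_file).suffix.lower()
    if suffix not in allowed:
        raise ValueError(f"{lens_type!r} does not accept {suffix!r} lens files")
```

Calling `check_camera_args("lens.zmx", lens_type="hybridlens")` raises ValueError, while the same file with `lens_type="geolens"` passes.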

Example

>>> cam = Camera(
...     lens_file="datasets/lenses/camera/ef50mm.json",
...     sensor_file="test.json",
...     lens_type="geolens",
...     sensor_type="rgb",
... )

Source code in end2endimaging-src/end2end_imaging/camera.py
def __init__(
    self,
    lens_file,
    sensor_file,
    lens_type="geolens",
    sensor_type="rgb",
    device=None,
):
    """Initialize a camera from lens and sensor configuration files.

    Args:
        lens_file (str): Path to the lens configuration file.  The
            accepted format depends on *lens_type*: JSON for all lens
            types; additionally ``.zmx`` / ``.seq`` for ``"geolens"``.
        sensor_file (str): Path to a JSON sensor configuration file.
        lens_type (str, optional): Lens model.  One of ``"geolens"``
            (default), ``"hybridlens"``, ``"defocuslens"``,
            ``"diffraclens"``.
        sensor_type (str, optional): Sensor model.  One of ``"simple"``,
            ``"rgb"`` (default), ``"mono"``.  ``"event"`` has no branch
            below and raises ``NotImplementedError``.
        device (str or None, optional): Compute device.  Defaults to
            ``None`` (auto-selects GPU if available).

    Raises:
        NotImplementedError: If *lens_type* or *sensor_type* is not
            recognised.

    Example:
        >>> cam = Camera(
        ...     lens_file="datasets/lenses/camera/ef50mm.json",
        ...     sensor_file="test.json",
        ...     lens_type="geolens",
        ...     sensor_type="rgb",
        ... )
    """
    super().__init__(device=device)

    # Sensor
    if sensor_type == "simple":
        from end2end_imaging.sensor import Sensor

        self.sensor = Sensor.from_config(sensor_file)
    elif sensor_type == "rgb":
        from end2end_imaging.sensor import RGBSensor

        self.sensor = RGBSensor.from_config(sensor_file)
    elif sensor_type == "mono":
        from end2end_imaging.sensor import MonoSensor

        self.sensor = MonoSensor.from_config(sensor_file)
    else:
        raise NotImplementedError(f"Unsupported sensor type: {sensor_type}")
    # Use the resolved device: the raw ``device`` argument may still be None.
    self.sensor.to(self.device)
    sensor_res = self.sensor.res
    sensor_size = self.sensor.size

    # Lens
    if lens_type == "geolens":
        from end2end_imaging.deeplens import GeoLens

        self.lens = GeoLens(lens_file, device=device)
    elif lens_type == "hybridlens":
        from end2end_imaging.deeplens import HybridLens

        self.lens = HybridLens(lens_file, device=device)
    elif lens_type == "defocuslens":
        from end2end_imaging.deeplens import DefocusLens

        self.lens = DefocusLens(lens_file, device=device)
    elif lens_type == "diffraclens":
        from end2end_imaging.deeplens import DiffractiveLens

        self.lens = DiffractiveLens(lens_file, device=device)
    else:
        raise NotImplementedError(f"Unsupported lens type: {lens_type}")
    self.lens.set_sensor(sensor_res=sensor_res, sensor_size=sensor_size)

__call__

__call__(data_dict)

Alias for render().

Source code in end2endimaging-src/end2end_imaging/camera.py
def __call__(self, data_dict):
    """Alias for ``render()``."""
    return self.render(data_dict)

render

render(data_dict, render_mode='psf_patch', output_type='rggbif')

Simulate camera-captured images with lens aberrations and sensor noise.

This method performs the complete imaging pipeline: converts input to linear RGB, applies lens aberrations, converts to Bayer format, adds sensor noise, and prepares output for network training or testing.

Parameters:

Name Type Description Default
data_dict dict

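Dictionary containing essential imaging parameters:
- "img": sRGB image (torch.Tensor), shape (B, 3, H, W), range [0, 1]
- "iso": ISO values (torch.Tensor), shape (B,). A tensor is needed for the "rggbi" and "rggbif" outputs, which divide it by iso_scale and call .view() on the result.
- "field_center": Field center coordinates (torch.Tensor), shape (B, 2), range [-1, 1]. Read by the "psf_patch" and "psf_patch_depth_interp" modes and by the "rggbif" output.
- "depth": Depth map (torch.Tensor), required for "psf_pixel" and "psf_patch_depth_interp"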

required
render_mode str

Rendering method for lens aberration simulation. Options:
- "psf_patch": Per-patch PSF convolution (default)
- "psf_map": Spatially-varying PSF map convolution
- "psf_pixel": Pixel-wise PSF rendering
- "ray_tracing": Full ray tracing simulation
- "psf_patch_depth_interp": PSF patch with depth interpolation

'psf_patch'
output_type str

Output format type. Defaults to "rggbif".

'rggbif'

Returns:

Name Type Description
tuple

(data_lq, data_gt)
- data_lq: Low-quality network input with degradations
- data_gt: Ground-truth data for training

References

[1] "Unprocessing Images for Learned Raw Denoising", CVPR 2018.
[2] "Optical Aberration Correction in Postprocessing using Imaging Simulation", SIGGRAPH 2021.
[3] "Efficient Depth- and Spatially-Varying Image Simulation for Defocus Deblur", ICCV Workshop 2025.
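
Which data_dict keys are needed depends on both render_mode and output_type. The following sketch derives the key set from the render and pack_output source on this page. The helper names `required_keys` and `missing_keys` are hypothetical, and the "depth" requirement for "psf_pixel" follows the parameter description rather than an explicit lookup in the code.

```python
def required_keys(render_mode="psf_patch", output_type="rggbif"):
    """Hypothetical helper: data_dict keys that Camera.render relies on."""
    keys = {"img", "iso"}  # always read at the top of render()
    if render_mode in ("psf_patch", "psf_patch_depth_interp"):
        keys.add("field_center")  # per-sample patch center
    if render_mode in ("psf_pixel", "psf_patch_depth_interp"):
        keys.add("depth")  # depth map, per the parameter description
    if output_type == "rggbif":
        keys.add("field_center")  # used to build the field channel
    return keys


def missing_keys(data_dict, render_mode="psf_patch", output_type="rggbif"):
    """Return the sorted list of required keys absent from data_dict."""
    return sorted(required_keys(render_mode, output_type) - set(data_dict))
```

With the default arguments, a dictionary holding only "img" and "iso" reports "field_center" as missing, which is the case that would otherwise surface as a KeyError inside the rendering loop.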

Source code in end2endimaging-src/end2end_imaging/camera.py
def render(self, data_dict, render_mode="psf_patch", output_type="rggbif"):
    """Simulate camera-captured images with lens aberrations and sensor noise.

    This method performs the complete imaging pipeline: converts input to linear RGB,
    applies lens aberrations, converts to Bayer format, adds sensor noise, and prepares
    output for network training or testing.

    Args:
        data_dict (dict): Dictionary containing essential imaging parameters:
            - "img": sRGB image (torch.Tensor), shape (B, 3, H, W), range [0, 1]
            - "iso": ISO values (torch.Tensor), shape (B,)
            - "field_center": Field center coordinates (torch.Tensor), shape (B, 2), range [-1, 1]
            - "depth": Depth map (torch.Tensor), required for "psf_pixel" and "psf_patch_depth_interp"
        render_mode (str): Rendering method for lens aberration simulation. Options:
            - "psf_patch": Per-patch PSF convolution (default)
            - "psf_map": Spatially-varying PSF map convolution
            - "psf_pixel": Pixel-wise PSF rendering
            - "ray_tracing": Full ray tracing simulation
            - "psf_patch_depth_interp": PSF patch with depth interpolation
        output_type (str): Output format type. Defaults to "rggbif".

    Returns:
        tuple: (data_lq, data_gt)
            - data_lq: Low-quality network input with degradations
            - data_gt: Ground-truth data for training

    References:
        [1] "Unprocessing Images for Learned Raw Denoising", CVPR 2018.
        [2] "Optical Aberration Correction in Postprocessing using Imaging Simulation", SIGGRAPH 2021.
        [3] "Efficient Depth- and Spatially-Varying Image Simulation for Defocus Deblur", ICCV Workshop 2025.
    """
    data_dict = self.move_to_device(data_dict)
    img = data_dict["img"]
    iso = data_dict["iso"]

    # -----------------------------------------------
    # Step 1: Unprocess from sRGB to linear RGB space
    # -----------------------------------------------
    sensor = self.sensor
    img_linrgb = sensor.unprocess(img)  # (B, 3, H, W), [0, 1]

    # -----------------------------------------------
    # Step 2: Lens aberration simulation in linear RGB space
    # -----------------------------------------------
    if render_mode == "psf_patch":
        # Each image in the batch can have a different PSF
        img_lq_ls = []
        for b in range(img_linrgb.shape[0]):
            img_b = img_linrgb[b, ...].unsqueeze(0)
            patch_center = data_dict["field_center"][b, ...]
            img_lq_b = self.lens.render(
                img_b, method="psf_patch", patch_center=patch_center
            )
            img_lq_ls.append(img_lq_b)
        img_lq = torch.cat(img_lq_ls, dim=0)

    elif render_mode == "psf_map":
        img_lq = self.lens.render(img_linrgb, method="psf_map")

    elif render_mode == "psf_pixel":
        img_lq = self.lens.render(
            img_linrgb, method="psf_pixel", **data_dict
        )

    elif render_mode == "ray_tracing":
        img_lq = self.lens.render(
            img_linrgb, method="ray_tracing", **data_dict
        )

    elif render_mode == "psf_patch_depth_interp":
        img_lq_ls = []
        for b in range(img_linrgb.shape[0]):
            img_b = img_linrgb[b, ...].unsqueeze(0)
            patch_center = data_dict["field_center"][b, ...]  # shape (2,), as render_rgbd expects
            depth = data_dict["depth"][b, ...].unsqueeze(0)
            img_lq_b = self.lens.render_rgbd(
                img_b, depth, method="psf_patch", patch_center=patch_center
            )
            img_lq_ls.append(img_lq_b)
        img_lq = torch.cat(img_lq_ls, dim=0)

    else:
        raise NotImplementedError(f"Invalid render mode: {render_mode}")

    # -----------------------------------------------
    # Step 3: Convert linear RGB to Bayer space
    # -----------------------------------------------
    bayer_gt = sensor.linrgb2raw(img_linrgb)  # (B, 1, H, W), [0, 2**bit - 1]
    bayer_lq = sensor.linrgb2raw(img_lq)  # (B, 1, H, W), [0, 2**bit - 1]

    # -----------------------------------------------
    # Step 4: Simulate sensor noise
    # -----------------------------------------------
    bayer_lq = sensor.simu_noise(
        bayer_lq, iso
    )  # (B, 1, H, W), [black_level, 2**bit - 1]

    # -----------------------------------------------
    # Step 5: Pack output for network training
    # -----------------------------------------------
    data_lq, data_gt = self.pack_output(
        bayer_lq=bayer_lq,
        bayer_gt=bayer_gt,
        data_dict=data_dict,
        output_type=output_type,
    )
    return data_lq, data_gt

output_channels staticmethod

output_channels(output_type)

Return (in_channels, target_channels) required for an output_type.

Useful to validate network.in_chan / out_chan against the packing layout produced by pack_output before model construction.

Parameters:

Name Type Description Default
output_type str

One of "rgb", "rggbi", "rggbif".

required

Returns:

Type Description

tuple[int, int]: (input_channels, target_channels).

Raises:

Type Description
ValueError

If output_type is unknown.
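
A typical use is a guard that runs before a restoration network is built. The sketch below copies the documented mapping so that it runs on its own; in real code the lookup would be `Camera.output_channels(output_type)`. The helper name `check_network_channels` and its error message are hypothetical.

```python
# Copy of the mapping documented for Camera.output_channels.
OUTPUT_CHANNELS = {"rgb": (3, 3), "rggbi": (5, 4), "rggbif": (6, 4)}


def check_network_channels(in_chan, out_chan, output_type):
    """Hypothetical guard: fail fast if a network does not fit the packing."""
    if output_type not in OUTPUT_CHANNELS:
        raise ValueError(f"Unknown output_type: {output_type}")
    expected = OUTPUT_CHANNELS[output_type]
    if (in_chan, out_chan) != expected:
        raise ValueError(
            f"{output_type!r} needs (in, out) = {expected}, "
            f"got ({in_chan}, {out_chan})"
        )
    return expected
```

For "rggbif" the input has six channels (four RGGB planes, one ISO plane, one field plane) while the target keeps only the four RGGB planes.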

Source code in end2endimaging-src/end2end_imaging/camera.py
@staticmethod
def output_channels(output_type):
    """Return ``(in_channels, target_channels)`` required for an ``output_type``.

    Useful to validate ``network.in_chan`` / ``out_chan`` against the
    packing layout produced by :meth:`pack_output` before model construction.

    Args:
        output_type (str): One of ``"rgb"``, ``"rggbi"``, ``"rggbif"``.

    Returns:
        tuple[int, int]: ``(input_channels, target_channels)``.

    Raises:
        ValueError: If ``output_type`` is unknown.
    """
    mapping = {
        "rgb": (3, 3),
        "rggbi": (5, 4),
        "rggbif": (6, 4),
    }
    if output_type not in mapping:
        raise ValueError(f"Unknown output_type: {output_type}")
    return mapping[output_type]

pack_output

pack_output(bayer_lq, bayer_gt, data_dict, output_type='rggbi')

Pack Bayer data into network-ready inputs and targets.

Parameters:

Name Type Description Default
bayer_lq Tensor

Noisy Bayer image, shape (B, 1, H, W), range [~black_level, 2**bit - 1].

required
bayer_gt Tensor

Clean Bayer image, shape (B, 1, H, W), range [~black_level, 2**bit - 1].

required
data_dict dict

Per-sample metadata. Required keys by output_type:
- "rgb": none.
- "rggbi": "iso" (B,); optional "iso_scale" (default 1000).
- "rggbif": "iso" (B,), "field_center" (B, 2); optional "iso_scale".

required
output_type str

One of "rgb", "rggbi", "rggbif".

'rggbi'

Returns:

Name Type Description
tuple

(data_lq, data_gt).
- For "rgb": both tensors have shape (B, 3, H, W).
- For "rggbi" / "rggbif": shape (B, C, H/2, W/2), where C = 5 ("rggbi") or 6 ("rggbif") for data_lq and C = 4 for data_gt.
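
The halved spatial size comes from splitting each 2 x 2 Bayer cell into four planes. Here is a minimal pure-Python sketch of that packing plus the constant ISO plane. It assumes an RGGB pattern with R at the top-left and skips any normalization; the actual layout and scaling are decided by sensor.bayer2rggb, which is not shown on this page. `bayer_to_rggb` and `iso_plane` are hypothetical names.

```python
def bayer_to_rggb(bayer):
    """Split an H x W mosaic into four H/2 x W/2 planes (R, G1, G2, B).

    Assumes an RGGB pattern with R at the top-left pixel.
    """
    height, width = len(bayer), len(bayer[0])
    if height % 2 or width % 2:
        raise ValueError("Bayer height and width must be even")
    offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (row, col) inside each cell
    return [
        [[bayer[y + dy][x + dx] for x in range(0, width, 2)]
         for y in range(0, height, 2)]
        for dy, dx in offsets
    ]


def iso_plane(iso, height, width, iso_scale=1000):
    """Constant plane holding iso / iso_scale, like the packed ISO channel."""
    value = iso / iso_scale
    return [[value] * width for _ in range(height)]


bayer = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
planes = bayer_to_rggb(bayer)             # 4 planes of 2 x 2
packed = planes + [iso_plane(400, 2, 2)]  # 5 planes, as in "rggbi"
```

A 4 x 4 mosaic therefore becomes five 2 x 2 planes for "rggbi"; "rggbif" appends one more plane holding the field radius.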

Source code in end2endimaging-src/end2end_imaging/camera.py
def pack_output(self, bayer_lq, bayer_gt, data_dict, output_type="rggbi"):
    """Pack Bayer data into network-ready inputs and targets.

    Args:
        bayer_lq (torch.Tensor): Noisy Bayer image, shape ``(B, 1, H, W)``,
            range ``[~black_level, 2**bit - 1]``.
        bayer_gt (torch.Tensor): Clean Bayer image, shape ``(B, 1, H, W)``,
            range ``[~black_level, 2**bit - 1]``.
        data_dict (dict): Per-sample metadata. Required keys by ``output_type``:
            - ``"rgb"``: none.
            - ``"rggbi"``: ``"iso"`` (B,); optional ``"iso_scale"`` (default 1000).
            - ``"rggbif"``: ``"iso"`` (B,), ``"field_center"`` (B, 2); optional ``"iso_scale"``.
        output_type (str): One of ``"rgb"``, ``"rggbi"``, ``"rggbif"``.

    Returns:
        tuple: ``(data_lq, data_gt)``.
            - For ``"rgb"``: shape ``(B, 3, H, W)``.
            - For ``"rggbi"`` / ``"rggbif"``: shape ``(B, C, H/2, W/2)`` where
              ``C = 5`` or ``6`` for ``data_lq`` and ``C = 4`` for ``data_gt``.
    """
    sensor = self.sensor
    device = bayer_lq.device

    if output_type == "rgb":
        rgb_gt = sensor.isp(bayer_gt)
        rgb_lq = sensor.isp(bayer_lq)
        return rgb_lq, rgb_gt

    iso = data_dict["iso"]
    iso_scale = data_dict.get("iso_scale", 1000)

    rggb_gt = sensor.bayer2rggb(bayer_gt)  # (B, 4, H/2, W/2), [0, 1]
    rggb_lq = sensor.bayer2rggb(bayer_lq)  # (B, 4, H/2, W/2), [0, 1]
    B, _, H, W = rggb_lq.shape
    iso_channel = (iso / iso_scale).view(-1, 1, 1, 1).repeat(1, 1, H, W)

    if output_type == "rggbi":
        return torch.cat([rggb_lq, iso_channel], dim=1), rggb_gt

    if output_type == "rggbif":
        pixel_size = sensor.pixel_size
        field_channels = []
        for b in range(B):
            field_center = data_dict["field_center"][b, ...]
            # After shuffling to rggb, the pixel stride is 2 * pixel_size.
            grid_x, grid_y = torch.meshgrid(
                torch.linspace(
                    field_center[0] - W * pixel_size,
                    field_center[0] + W * pixel_size,
                    W,
                    device=device,
                ),
                torch.linspace(
                    field_center[1] + H * pixel_size,
                    field_center[1] - H * pixel_size,
                    H,
                    device=device,
                ),
                indexing="xy",
            )
            field_channels.append(torch.sqrt(grid_x**2 + grid_y**2).unsqueeze(0))
        field_channel = torch.cat(field_channels, dim=0).unsqueeze(1)
        return torch.cat([rggb_lq, iso_channel, field_channel], dim=1), rggb_gt

    raise NotImplementedError(f"Invalid output type: {output_type}")
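
The field channel built in the "rggbif" branch is a radial-distance map around the patch center. The sketch below reproduces the same arithmetic with plain lists so it can be checked by hand. `linspace` and `field_channel` are hypothetical helpers, and the units of the result are whatever field_center and pixel_size are expressed in.

```python
import math


def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def field_channel(field_center, height, width, pixel_size):
    """Radial distance from the origin for each packed pixel.

    height and width are the packed (half-resolution) sizes, so the grid
    half-extent width * pixel_size matches a stride of 2 * pixel_size.
    """
    cx, cy = field_center
    xs = linspace(cx - width * pixel_size, cx + width * pixel_size, width)
    # y runs from top (+) to bottom (-), as in the source above.
    ys = linspace(cy + height * pixel_size, cy - height * pixel_size, height)
    # Rows follow y and columns follow x, i.e. meshgrid with indexing="xy".
    return [[math.sqrt(x * x + y * y) for x in xs] for y in ys]


grid = field_channel((0.0, 0.0), height=3, width=3, pixel_size=0.5)
```

For a patch centered on the optical axis the map is zero in the middle and grows toward the corners, which gives the network a cue for how far each pixel sits from the axis.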

Renderer

Abstract base class for renderers. It handles device selection and defines the render(data_dict) interface that Camera implements; it performs no rendering itself. Documented here for reference.

end2end_imaging.camera.Renderer

Renderer(device=None)

Abstract base class for image simulation renderers.

Defines the render(data_dict) interface shared by all concrete renderers (e.g. Camera).

Attributes:

Name Type Description
device str

Compute device used for rendering.

Initialize the renderer.

Parameters:

Name Type Description Default
device str or None

Compute device. If None, auto-selects CUDA when available. Defaults to None.

None
Source code in end2endimaging-src/end2end_imaging/camera.py
def __init__(self, device=None):
    """Initialize the renderer.

    Args:
        device (str or None, optional): Compute device.  If ``None``,
            auto-selects CUDA when available. Defaults to ``None``.
    """
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    self.device = device

__call__

__call__(*args, **kwargs)

Alias for render().

Source code in end2endimaging-src/end2end_imaging/camera.py
def __call__(self, *args, **kwargs):
    """Alias for ``render()``."""
    return self.render(*args, **kwargs)

set_device

set_device(device)

Set the compute device.

Parameters:

Name Type Description Default
device str

Target device (e.g. 'cpu', 'cuda', 'cuda:0').

required
Source code in end2endimaging-src/end2end_imaging/camera.py
def set_device(self, device):
    """Set the compute device.

    Args:
        device (str): Target device (e.g. ``'cpu'``, ``'cuda'``, ``'cuda:0'``).
    """
    self.device = device

move_to_device

move_to_device(data_dict)

Move all tensor values in a dictionary to the configured device.

Parameters:

Name Type Description Default
data_dict dict

Dictionary whose tensor values will be transferred.

required

Returns:

Name Type Description
dict

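The same dictionary, updated in place: each tensor value is replaced by its copy on the configured device, and non-tensor values are left untouched.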
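
The in-place behaviour is easy to misread, so here is a small sketch of the same loop. To stay runnable without PyTorch it uses a stand-in class, `FakeTensor`, whose `to()` mimics the documented behaviour of `torch.Tensor.to` (returning `self` when the device already matches, a new object otherwise). Both `FakeTensor` and the free function form of `move_to_device` are assumptions made for illustration.

```python
class FakeTensor:
    """Stand-in for torch.Tensor so the sketch runs without PyTorch."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        # Same object if nothing changes, otherwise a new one.
        return self if device == self.device else FakeTensor(device)


def move_to_device(data_dict, device):
    """Same loop as Renderer.move_to_device, with the stand-in tensor type."""
    for key in data_dict:
        if isinstance(data_dict[key], FakeTensor):
            data_dict[key] = data_dict[key].to(device)
    return data_dict


img = FakeTensor("cpu")
batch = {"img": img, "iso_scale": 1000}
moved = move_to_device(batch, "cuda:0")
```

After the call, `moved` and `batch` are the same dictionary and `batch["img"]` lives on the new device, while the original `img` object still reports "cpu". Callers that keep their own reference to a tensor should read it back from the dictionary.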

Source code in end2endimaging-src/end2end_imaging/camera.py
def move_to_device(self, data_dict):
    """Move all tensor values in a dictionary to the configured device.

    Args:
        data_dict (dict): Dictionary whose tensor values will be transferred.

    Returns:
        dict: The same dictionary, updated in place with each tensor
            value replaced by its copy on the configured device.
    """
    for key in data_dict:
        if isinstance(data_dict[key], torch.Tensor):
            data_dict[key] = data_dict[key].to(self.device)
    return data_dict

render

render(data_dict)

Render an image from the input data dictionary.

Subclasses must override this method with their specific rendering pipeline.

Parameters:

Name Type Description Default
data_dict dict

Input data for rendering.

required

Raises:

Type Description
NotImplementedError

Always, unless overridden by a subclass.

Source code in end2endimaging-src/end2end_imaging/camera.py
def render(self, data_dict):
    """Render an image from the input data dictionary.

    Subclasses must override this method with their specific rendering
    pipeline.

    Args:
        data_dict (dict): Input data for rendering.

    Raises:
        NotImplementedError: Always, unless overridden by a subclass.
    """
    raise NotImplementedError
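
A concrete renderer only has to override render(); calling the instance then dispatches to it through `__call__`. The sketch below shows the pattern with a stand-in base class that mirrors the interface documented above but defaults to "cpu" instead of probing for CUDA, so that it runs without PyTorch. `IdentityRenderer` is a hypothetical subclass.

```python
class Renderer:
    """Stand-in mirroring the documented interface, without the CUDA probe."""

    def __init__(self, device=None):
        self.device = device if device is not None else "cpu"

    def __call__(self, *args, **kwargs):
        return self.render(*args, **kwargs)

    def render(self, data_dict):
        raise NotImplementedError


class IdentityRenderer(Renderer):
    """Hypothetical subclass: returns the input image unchanged."""

    def render(self, data_dict):
        return data_dict["img"]


renderer = IdentityRenderer()
result = renderer({"img": [[0.5]]})  # __call__ forwards to render()
```

Calling the base class directly, as in `Renderer()({})`, raises NotImplementedError, which is the signal that a subclass forgot to override render().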