Architecture

End2endImaging models the full camera pipeline — differentiable optics, sensor simulation, and neural image processing — as a single computation graph in PyTorch. Every stage is differentiable, so gradients flow from a downstream task loss (reconstruction, classification, detection) back through the network, the ISP, the sensor noise model, and into the optical design parameters. That is what makes optics–algorithm co-design possible.

The pipeline

The Camera class is the heart of the framework: it composes a lens model and a sensor into one differentiable capture, which then pairs with a reconstruction network.

Scene Image → [ DeepLens ] → Spectral Image → [ Sensor ] → Raw Image → [ Network ] → Output Image
                │                               │                        │
                GeoLens                         RGBSensor                UNet
                HybridLens                      MonoSensor               Restormer
                DiffractiveLens                                          NAFNet
                ParaxialLens
                PSFNetLens
  • Optics (DeepLens) turns a scene into a spectral, aberration-aware image by applying the lens PSF (ray tracing or PSF-map convolution).
  • Sensor converts that optical image into a raw capture — Bayer CFA, read/shot noise, and a composable ISP pipeline.
  • Network reconstructs a clean output image from the degraded raw capture.

Code structure

end2end_imaging/
├── camera.py                    # Camera = Lens + Sensor differentiable pipeline (+ Renderer)
├── utils.py                     # Image I/O, metrics, device selection, logging
├── deeplens/                    # Differentiable optics — the DeepLens engine
│   ├── lens.py                  #   Lens — shared interface for all lens models
│   ├── geolens.py               #   GeoLens — refractive ray tracing
│   ├── hybridlens.py            #   HybridLens — refractive + DOE (ray–wave)
│   ├── diffraclens.py           #   DiffractiveLens — pure wave optics
│   ├── defocuslens.py           #   DefocusLens — defocus / circle-of-confusion
│   ├── psfnetlens.py            #   PSFNetLens — neural PSF surrogate
│   ├── geolens_pkg/             #   GeoLens mixins (PSF, eval, optim, I/O, vis)
│   ├── geometric_surface/       #   Refractive surfaces (spheric, aspheric, ...)
│   ├── phase_surface/           #   Phase surfaces (ray optics)
│   ├── diffractive_surface/     #   Diffractive elements (wave optics)
│   ├── light/                   #   Ray and ComplexWave representations
│   ├── material/                #   Glass & plastic catalogs (Sellmeier, AGF)
│   ├── imgsim/                  #   PSF convolution & Monte Carlo rendering
│   └── surrogate/               #   PSF surrogate networks (MLP, SIREN)
├── sensor/                      # Sensor simulation
│   ├── sensor.py                #   Sensor — shared base class
│   ├── rgb_sensor.py            #   RGBSensor (Bayer CFA + noise + ISP)
│   ├── mono_sensor.py           #   MonoSensor
│   └── isp_modules/             #   ISP pipeline (demosaic, white balance, gamma, ...)
└── network/                     # Neural networks
    ├── reconstruction/          #   Image reconstruction (NAFNet, Restormer, UNet)
    ├── loss/                    #   Training losses (perceptual, PSNR, SSIM)
    ├── dataset.py               #   ImageDataset, PhotographicDataset
    └── depth_estimator.py       #   DepthAnythingV2Estimator (for depth-aware sim)

Optics — deeplens/

The optics stage is the DeepLens engine, vendored inside End2endImaging as end2end_imaging.deeplens. All lens models extend a common Lens interface and compute PSFs and rendered images in a fully differentiable way:

Lens model        Optical method
GeoLens           Multi-element refractive ray tracing (Zemax / Code V / JSON I/O)
HybridLens        Refractive lens + DOE: coherent ray trace, then ASM wave propagation
DiffractiveLens   Pure scalar wave optics through diffractive surfaces
PSFNetLens        Neural surrogate (MLP) wrapping a GeoLens for fast PSF prediction
DefocusLens       Analytic circle-of-confusion / defocus model
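
To make the "analytic circle-of-confusion" entry concrete, here is a minimal sketch of the textbook thin-lens blur-circle formula. It is an illustration under stated assumptions, not the DefocusLens implementation: the function name and its parameters are hypothetical, and the real model may parameterize defocus differently.

```python
def circle_of_confusion(focal_length, f_number, focus_distance, object_distance):
    """Thin-lens blur-circle diameter on the sensor.

    All lengths share one unit (e.g. mm). Hypothetical helper for
    illustration only; it is not part of the End2endImaging API.
    """
    if min(focus_distance, object_distance) <= focal_length:
        raise ValueError("distances must exceed the focal length")
    aperture = focal_length / f_number  # entrance-pupil diameter
    # c = A * f * |s - s_f| / (s * (s_f - f))
    return (
        aperture * focal_length * abs(object_distance - focus_distance)
        / (object_distance * (focus_distance - focal_length))
    )


# 50 mm f/2 lens focused at 2 m: a point at 2 m is sharp, one at 4 m is blurred.
in_focus = circle_of_confusion(50.0, 2.0, 2000.0, 2000.0)
background = circle_of_confusion(50.0, 2.0, 2000.0, 4000.0)
print(f"in focus: {in_focus:.3f} mm, background: {background:.3f} mm")
```

The blur diameter depends on object distance, which is why a depth-aware simulation needs a per-pixel depth estimate.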

The supporting subpackages — geometric_surface, phase_surface, diffractive_surface, light, material, imgsim, and surrogate — mirror the standalone DeepLens library. For the full optics reference, see the DeepLens documentation; the Optics (DeepLens) API page documents the lens models as exposed through End2endImaging.

Sensor — sensor/

The sensor stage turns the optical image into a physically plausible raw capture. RGBSensor applies a Bayer color-filter array, a read-noise and shot-noise model, and a full ISP pipeline (black-level compensation, white balance, demosaicing, color correction, gamma, tone mapping, and more). MonoSensor is the monochrome variant. Every ISP stage in isp_modules/ is an individual, differentiable torch.nn.Module, and the pipeline is invertible (raw ↔ sRGB).
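
As an illustration of what invertibility means for a single ISP stage, the sketch below round-trips a value through the standard sRGB transfer function and its inverse. This is an assumption for illustration: the gamma module in isp_modules/ may use a different curve, and the function names here are hypothetical.

```python
def linear_to_srgb(value):
    """Standard sRGB encoding of a linear intensity in [0, 1]."""
    if value <= 0.0031308:
        return 12.92 * value
    return 1.055 * value ** (1.0 / 2.4) - 0.055


def srgb_to_linear(value):
    """Inverse of linear_to_srgb: decode an sRGB value in [0, 1]."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4


# Forward then inverse recovers the input up to floating-point error.
for linear in (0.0, 0.001, 0.18, 0.5, 1.0):
    encoded = linear_to_srgb(linear)
    decoded = srgb_to_linear(encoded)
    print(f"{linear:.4f} -> {encoded:.4f} -> {decoded:.4f}")
```

A full raw ↔ sRGB inversion would chain the inverses of every stage in reverse order; lossy stages such as demosaicing can only be inverted approximately.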

Network — network/

The network stage reconstructs a clean image from the degraded sensor capture. reconstruction/ provides NAFNet, Restormer, and UNet; loss/ provides differentiable image-quality objectives (PerceptualLoss, PSNRLoss, SSIMLoss). dataset.py supplies dataset wrappers (ImageDataset, PhotographicDataset) and depth_estimator.py wraps Depth Anything V2 for depth-aware simulation (used by the defocus deblur example).
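
For reference, the quantity behind a PSNR objective can be sketched in a few lines. This uses the standard definition, PSNR = 10 · log10(MAX² / MSE), on flat lists of pixel values; it is not the PSNRLoss class itself, whose sign convention and reduction are not specified here. A loss built on it would typically minimize the negative PSNR.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Hypothetical helper for illustration; not the library's PSNRLoss.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val**2 / mse)


print(psnr([0.5, 0.5], [0.5, 0.6]))
```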

PSF surrogate networks live in DeepLens

The PSF surrogate models (MLP, SIREN) are part of the optics engine under deeplens/surrogate/, not network/. network/ holds the image reconstruction models. See the DeepLens Surrogate Networks reference.

Putting it together

Camera composes a lens and a sensor; Camera.render(data_dict, render_mode=...) produces a degraded capture and its ground truth (data_lq, data_gt). A reconstruction network then maps data_lq → data_gt, and because every stage is differentiable, a single loss optimizes the network and the optics jointly. See the Quickstart for the minimal loop and the Examples for complete training scripts.
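
The joint optimization described above can be sketched with toy stand-ins in plain PyTorch. Everything in this block is hypothetical: ToyLens, ToySensor, and ToyCamera are not the library's classes, the "img" dictionary key is an assumption, and the toy render() omits the render_mode argument. The lens is reduced to a learnable Gaussian PSF and the sensor to an RGGB mosaic with a Gaussian approximation of shot and read noise; the point is only that one loss updates both the network and the optical parameter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyLens(nn.Module):
    """Hypothetical lens: blur with a learnable Gaussian PSF."""

    def __init__(self, kernel_size=7, sigma=2.0):
        super().__init__()
        self.kernel_size = kernel_size
        # Optimizing log(sigma) keeps the blur width positive.
        self.log_sigma = nn.Parameter(torch.tensor(float(sigma)).log())

    def forward(self, img):  # img: (B, 3, H, W)
        half = self.kernel_size // 2
        x = torch.arange(-half, half + 1, dtype=img.dtype, device=img.device)
        g = torch.exp(-(x**2) / (2 * self.log_sigma.exp() ** 2))
        psf = torch.outer(g, g)
        psf = psf / psf.sum()  # normalize so the PSF preserves energy
        weight = psf.repeat(3, 1, 1, 1)  # same PSF per channel (depthwise)
        return F.conv2d(img, weight, padding=half, groups=3)


class ToySensor(nn.Module):
    """Hypothetical sensor: RGGB Bayer mosaic plus signal-dependent noise."""

    def __init__(self, read_std=0.01, shot_scale=0.01):
        super().__init__()
        self.read_std = read_std
        self.shot_scale = shot_scale

    def forward(self, img):
        _, _, h, w = img.shape
        mask = torch.zeros(1, 3, h, w, dtype=img.dtype, device=img.device)
        mask[:, 0, 0::2, 0::2] = 1  # R
        mask[:, 1, 0::2, 1::2] = 1  # G
        mask[:, 1, 1::2, 0::2] = 1  # G
        mask[:, 2, 1::2, 1::2] = 1  # B
        raw = (img * mask).sum(dim=1, keepdim=True)  # (B, 1, H, W)
        # Reparameterized noise, so gradients still reach the lens.
        std = torch.sqrt(self.shot_scale * raw.clamp(min=0) + self.read_std**2)
        return raw + std * torch.randn_like(raw)


class ToyCamera(nn.Module):
    """Hypothetical camera: composes a lens and a sensor."""

    def __init__(self, lens, sensor):
        super().__init__()
        self.lens = lens
        self.sensor = sensor

    def render(self, data_dict):
        data_gt = data_dict["img"]  # assumed key
        data_lq = self.sensor(self.lens(data_gt))
        return data_lq, data_gt


torch.manual_seed(0)
camera = ToyCamera(ToyLens(), ToySensor())
net = nn.Sequential(  # tiny stand-in for a reconstruction network
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1)
)
# One optimizer, two parameter groups: optics and network.
optimizer = torch.optim.Adam(
    [
        {"params": camera.parameters(), "lr": 1e-2},
        {"params": net.parameters(), "lr": 1e-3},
    ]
)

scene = torch.rand(2, 3, 32, 32)
sigma_before = camera.lens.log_sigma.exp().item()
for step in range(5):
    data_lq, data_gt = camera.render({"img": scene})
    loss = F.mse_loss(net(data_lq), data_gt)
    optimizer.zero_grad()
    loss.backward()  # gradients reach net and camera.lens.log_sigma
    optimizer.step()
sigma_after = camera.lens.log_sigma.exp().item()
print(f"sigma: {sigma_before:.4f} -> {sigma_after:.4f}")
```

Separate parameter groups let the optics and the network use different learning rates, which is a common choice when the two sets of parameters have very different scales.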

Next steps

  • API Reference — camera, sensor, and network class documentation
  • Examples — image simulation, defocus deblur, and end-to-end design
  • Setup — install End2endImaging and run your first simulation