zea.models.diffusion

Diffusion models for ultrasound image generation and posterior sampling.

To try this model, simply load one of the available presets:

>>> from zea.models.diffusion import DiffusionModel

>>> model = DiffusionModel.from_preset("diffusion-echonet-dynamic")

See also

A tutorial notebook where this model is used: Diffusion models for ultrasound image generation.

Classes

`DDS`(diffusion_model, operator[, disable_jit])	Decomposed Diffusion Sampling guidance.
`DPS`(diffusion_model, operator[, disable_jit])	Diffusion Posterior Sampling guidance.
`DiffusionGuidance`(diffusion_model, operator[, disable_jit])	Base class for diffusion guidance methods.
`DiffusionModel`(*args, **kwargs)	Implementation of a diffusion generative model.
`NuclearDiffusion`(diffusion_model, operator[, disable_jit])	Nuclear Diffusion posterior sampling guidance.

class zea.models.diffusion.DDS(diffusion_model, operator, disable_jit=False)[source]¶

Bases: DiffusionGuidance

Decomposed Diffusion Sampling guidance.

Reference paper: https://arxiv.org/pdf/2303.05754

Initialize the diffusion guidance.

Parameters:

diffusion_model (DiffusionModel) – The diffusion model to use for guidance.
operator (Operator) – The forward operator \(A\) that maps clean images to the measurement space.
disable_jit (bool) – Whether to disable JIT compilation of the guidance function.

Acg(x, **op_kwargs)[source]¶

__call__(noisy_images, measurements, noise_rates, signal_rates, n_inner=5, eps=1e-05, verbose=False, **op_kwargs)[source]¶

Run one DDS guidance step (public entry point).

Delegates to call(), which may be JIT-compiled depending on disable_jit.

Parameters:

noisy_images – Noisy images x_t of shape (n_images, *input_shape).
measurements – Target observations y.
noise_rates – Noise rates at the current diffusion time.
signal_rates – Signal rates at the current diffusion time.
n_inner (int) – Number of conjugate gradient iterations. Default: 5.
eps (float) – Convergence tolerance for the CG solver. Default: 1e-5.
verbose (bool) – When True, compute and return the measurement error for logging. Default: False.
**op_kwargs – Additional keyword arguments forwarded to the operator (e.g. mask).

Returns:

A (gradients, (measurement_error, (pred_noises, pred_images))) tuple (see call()).

call(noisy_images, measurements, noise_rates, signal_rates, n_inner, eps, verbose, **op_kwargs)[source]¶

Run one DDS guidance step via conjugate gradient.

Denoises x_t to obtain an initial x_0 estimate, then refines it by solving the normal equations \(A^\top A\, x = A^\top y\) with n_inner conjugate gradient iterations.

Parameters:

noisy_images – Noisy images x_t of shape (n_images, *input_shape).
measurements – Target observations y.
noise_rates – Noise rates at the current diffusion time.
signal_rates – Signal rates at the current diffusion time.
n_inner (int) – Number of conjugate gradient iterations.
eps (float) – Convergence tolerance; CG stops early when the residual norm falls below this threshold.
verbose (bool) – When True, compute and return the measurement error ‖y - A(x̂_0)‖. When False, the error is returned as 0.0 to avoid extra computation.
**op_kwargs – Additional keyword arguments forwarded to the operator (e.g. mask).

Returns:

A (gradients, (measurement_error, (pred_noises, pred_images))) tuple. gradients is a zero tensor because DDS performs guidance entirely inside the CG loop; the caller subtracts it as a no-op.

conjugate_gradient_inner_loop(i, loop_state, eps=1e-05)[source]¶

A single iteration of the conjugate gradient method. This involves minimizing the error of x along the current search vector p, and then choosing the next search vector.

Reference code from: https://github.com/svi-diffusion/
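
As a rough illustration of what such a loop computes, the sketch below runs plain conjugate gradient on the normal equations \(A^\top A\, x = A^\top y\) in NumPy. It is not zea's implementation: the dense matrix `A`, the function name `cg_normal_equations`, and the stopping rule on the residual norm are assumptions made for illustration.

```python
import numpy as np

def cg_normal_equations(A, y, x0, n_inner=5, eps=1e-5):
    """Refine x0 by solving A^T A x = A^T y with at most n_inner CG iterations."""
    x = x0.astype(float).copy()
    r = A.T @ y - A.T @ (A @ x)    # residual of the normal equations
    p = r.copy()                   # first search direction
    rs_old = r @ r
    for _ in range(n_inner):
        if np.sqrt(rs_old) < eps:  # residual norm below tolerance: stop early
            break
        Ap = A.T @ (A @ p)
        alpha = rs_old / (p @ Ap)  # step size that minimises the error along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p  # next conjugate search direction
        rs_old = rs_new
    return x

# Hypothetical toy problem: an overdetermined, noise-free linear system.
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 4))
x_true = rng.normal(size=4)
y = A @ x_true
x_hat = cg_normal_equations(A, y, x0=np.zeros(4), n_inner=10, eps=1e-10)
```

In DDS the starting point `x0` would be the denoised estimate of the clean image rather than zeros, and `A` would be the forward operator rather than a dense matrix.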

setup()[source]¶: Set up the DDS guidance function.

class zea.models.diffusion.DPS(diffusion_model, operator, disable_jit=False)[source]¶

Bases: DiffusionGuidance

Diffusion Posterior Sampling guidance.

Initialize the diffusion guidance.

Parameters:

diffusion_model (DiffusionModel) – The diffusion model to use for guidance.
operator (Operator) – The forward operator \(A\) that maps clean images to the measurement space.
disable_jit (bool) – Whether to disable JIT compilation of the guidance function.

__call__(noisy_images, **kwargs)[source]¶

Compute DPS gradients and denoiser outputs.

Calls the JIT-compiled gradient function obtained from setup().

Parameters:

noisy_images – Noisy images x_t of shape (n_images, *input_shape).
**kwargs – Keyword arguments forwarded to compute_error() (measurements, noise_rates, signal_rates, omega, and any operator kwargs such as mask).

Returns:

A (gradients, (measurement_error, (pred_noises, pred_images))) tuple. gradients is the gradient of the measurement error w.r.t. noisy_images and can be subtracted directly from the reverse-diffusion update.

compute_error(noisy_images, measurements, noise_rates, signal_rates, omega, **kwargs)[source]¶

Compute the DPS measurement error for gradient computation.

Following the DPS implementation, the loss is a standard L2 norm.

Parameters:

noisy_images – Noisy images x_t of shape (n_images, *input_shape).
measurements – Target observations y.
noise_rates – Noise rates at the current diffusion time.
signal_rates – Signal rates at the current diffusion time.
omega (float) – Scalar step-size weight for the measurement gradient.
**kwargs – Additional keyword arguments forwarded to the operator’s forward method (e.g. mask).

Returns:

A (measurement_error, (pred_noises, pred_images)) tuple where measurement_error is the scalar loss and pred_noises / pred_images are the denoiser outputs.
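
A minimal NumPy sketch of this error term follows. It makes several assumptions: the operator is a hypothetical element-wise mask, the predicted noise is passed in directly instead of coming from the network, and `omega` multiplies the L2 norm. The gradient with respect to `noisy_images` would come from automatic differentiation in the real backend, which NumPy does not provide, so it is not shown.

```python
import numpy as np

def dps_measurement_error(noisy_images, measurements, pred_noises,
                          noise_rates, signal_rates, omega, mask):
    # Estimate of the clean image x_0 from x_t and the predicted noise.
    pred_images = (noisy_images - noise_rates * pred_noises) / signal_rates
    # Hypothetical forward operator A(x) = mask * x (e.g. inpainting).
    residual = measurements - mask * pred_images
    # Weighted L2 norm of the residual; the placement of omega is an assumption.
    measurement_error = omega * np.sqrt(np.sum(residual**2))
    return measurement_error, (pred_noises, pred_images)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(2, 8, 8, 1))
noise = rng.normal(size=x0.shape)
signal_rates, noise_rates = 0.8, 0.6
mask = (rng.uniform(size=x0.shape) > 0.5).astype(float)
x_t = signal_rates * x0 + noise_rates * noise
y = mask * x0

# A perfect noise prediction reproduces the measurements; a poor one does not.
err_perfect, _ = dps_measurement_error(x_t, y, noise, noise_rates, signal_rates, 1.0, mask)
err_wrong, _ = dps_measurement_error(x_t, y, np.zeros_like(noise), noise_rates, signal_rates, 1.0, mask)
```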

setup()[source]¶: Set up the autograd function for DPS.

class zea.models.diffusion.DiffusionGuidance(diffusion_model, operator, disable_jit=False)[source]¶

Bases: ABC, Object

Base class for diffusion guidance methods.

Initialize the diffusion guidance.

Parameters:

diffusion_model (DiffusionModel) – The diffusion model to use for guidance.
operator (Operator) – The forward operator \(A\) that maps clean images to the measurement space.
disable_jit (bool) – Whether to disable JIT compilation of the guidance function.

abstractmethod __call__(*args, **kwargs)[source]¶: Call the guidance function.

abstractmethod setup()[source]¶: Set up the guidance function. Should be implemented by subclasses.

class zea.models.diffusion.DiffusionModel(*args, **kwargs)[source]¶

Bases: DeepGenerativeModel

Implementation of a diffusion generative model. Heavily inspired by https://keras.io/examples/generative/ddim/

Initialize a diffusion model.

Parameters:

input_shape – Shape of the input data. Typically of the form (height, width, channels) for images.
input_range – Range of the input data.
min_signal_rate – Minimum signal rate for the diffusion schedule.
max_signal_rate – Maximum signal rate for the diffusion schedule.
network_name – Name of the network architecture to use. Options are “unet_time_conditional” or “dense_time_conditional”.
network_kwargs – Additional keyword arguments for the network.
name – Name of the model.
guidance – Guidance method to use. Can be a string, a dict with “name” and “params” keys, or a DiffusionGuidance object.
operator – Operator to use. Can be a string, a dict with “name” and “params” keys, or an Operator object.
ema_val – Exponential moving average value for the network weights.
min_t – Minimum diffusion time for sampling during training.
max_t – Maximum diffusion time for sampling during training.
**kwargs – Additional arguments.

call(inputs, training=False, network=None, **kwargs)[source]¶

Call the score network.

Parameters:

inputs – A list [noisy_images, noise_rates_squared] as expected by the underlying time-conditional network.
training (bool) – Whether to run in training mode. When False and network is None, the EMA network is used.
network – Explicit network to call. If None, the EMA network is used during inference and the online network during training.
**kwargs – Extra keyword arguments forwarded to the network.

Returns:

Predicted noise tensor of the same shape as the input images.

denoise(noisy_images, noise_rates, signal_rates, training, network=None)[source]¶

Predict the noise component and derive the clean-image estimate.

Uses the score network to predict the noise ε in x_t, then computes the Tweedie estimate of x_0.

Parameters:

noisy_images – Noisy images x_t of shape (n_images, *input_shape).
noise_rates – Noise rates at the current diffusion time, broadcastable to noisy_images.
signal_rates – Signal rates at the current diffusion time, broadcastable to noisy_images.
training (bool) – Whether to call the network in training mode.
network – Explicit network to use. If None, chosen based on training (see call()).

Returns:

A (pred_noises, pred_images) tuple where pred_noises is the predicted noise ε and pred_images is the Tweedie estimate of x_0.
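
The clean-image estimate can be sketched in NumPy as below, assuming the forward process \(x_t = \text{signal\_rate} \cdot x_0 + \text{noise\_rate} \cdot \epsilon\) used in the Keras DDIM example. The `predict_noise` callable is a hypothetical stand-in for the network, and the oracle passed to it here simply returns the true noise.

```python
import numpy as np

def denoise_sketch(noisy_images, noise_rates, signal_rates, predict_noise):
    # The network is conditioned on the squared noise rate (see call()).
    pred_noises = predict_noise(noisy_images, noise_rates**2)
    # Invert x_t = signal * x_0 + noise_rate * eps for x_0.
    pred_images = (noisy_images - noise_rates * pred_noises) / signal_rates
    return pred_noises, pred_images

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(3, 4, 4, 1))
noise = rng.normal(size=x0.shape)

# One rate per image, shaped so that it broadcasts over (height, width, channels).
signal_rates = np.array([0.9, 0.5, 0.2]).reshape(-1, 1, 1, 1)
noise_rates = np.sqrt(1.0 - signal_rates**2)
x_t = signal_rates * x0 + noise_rates * noise

oracle = lambda images, noise_rates_squared: noise  # hypothetical perfect network
pred_noises, pred_images = denoise_sketch(x_t, noise_rates, signal_rates, oracle)
```

With a perfect noise prediction the estimate recovers `x0` exactly; with a trained network it is only an approximation that improves as the noise level decreases.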

diffusion_schedule(diffusion_times)[source]¶

Cosine diffusion schedule.

Implements the cosine schedule from Nichol & Dhariwal (2021).

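The noisy image at time t is defined as \(x_t = \text{signal\_rate}(t)\, x_0 + \text{noise\_rate}(t)\, \epsilon\), where \(\epsilon\) is standard Gaussian noise.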

Parameters:

diffusion_times – Tensor of diffusion times in [min_t, max_t].

Returns:

A (noise_rates, signal_rates) tuple of tensors with the same shape as diffusion_times.
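
For intuition, here is a NumPy sketch of a cosine schedule in the style of the Keras DDIM example that this class cites as inspiration. The default values of `min_signal_rate` and `max_signal_rate` are taken from that example and are assumptions here; zea's own defaults and exact formula may differ.

```python
import numpy as np

def cosine_schedule(diffusion_times, min_signal_rate=0.02, max_signal_rate=0.95):
    # Map diffusion times in [0, 1] linearly onto angles on the unit circle.
    start_angle = np.arccos(max_signal_rate)
    end_angle = np.arccos(min_signal_rate)
    angles = start_angle + diffusion_times * (end_angle - start_angle)
    # cos/sin guarantee signal_rate^2 + noise_rate^2 = 1 at every time.
    signal_rates = np.cos(angles)
    noise_rates = np.sin(angles)
    return noise_rates, signal_rates

times = np.linspace(0.0, 1.0, 5)
noise_rates, signal_rates = cosine_schedule(times)
```

The signal rate falls monotonically from `max_signal_rate` at time 0 to `min_signal_rate` at time 1, while the noise rate rises correspondingly.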

get_config()[source]¶

Returns the config of the object.

An object config is a Python dictionary (serializable) containing the information needed to re-instantiate it.

linear_diffusion_schedule(diffusion_times)[source]¶: Create a linear diffusion schedule.

log_likelihood(data, **kwargs)[source]¶

Approximate log-likelihood of the data under the model.

Parameters:

data – Data to compute log-likelihood for.
**kwargs – Additional arguments.

Returns:

Approximate log-likelihood.

property metrics¶: Metrics for training.

posterior_sample(measurements, n_samples=1, n_steps=20, initial_step=0, initial_samples=None, seed=None, **kwargs)[source]¶

Sample from the posterior distribution given measurements.

Parameters:

measurements – Input measurements. Typically of shape (batch_size, *input_shape).
n_samples (int) – Number of posterior samples to generate. Will generate n_samples samples for each measurement in the measurements batch.
n_steps (int) – Number of diffusion steps.
initial_step (int) – Step at which to begin the reverse diffusion loop. 0 runs all n_steps steps from maximum noise. Higher values skip early (high-noise) steps and require initial_samples to be provided. The number of effective steps is n_steps - initial_step.
initial_samples – Optional initial samples to warm-start the diffusion process. When provided, the diffusion process starts from a noised version of these samples, which can be used to speed it up. When initial_step == 0, samples are noised at the maximum noise level (max_t). When initial_step > 0, samples are noised at the noise level corresponding to initial_step. The initial_samples can be initial guesses such as solutions of previous frames (for sequences); see for instance SeqDiff. Must be of shape (batch_size, n_samples, *input_shape).
seed – Random seed generator.
**kwargs – Additional arguments passed to reverse_conditional_diffusion().

Returns:

Posterior samples from p(x|y), of shape (batch_size, n_samples, *input_shape).

prepare_diffusion(diffusion_steps, initial_step, verbose, disable_jit=False)[source]¶

Prepare the diffusion process.

Validates initial_step, computes the step size, and optionally creates a Keras progress bar.

Parameters:

diffusion_steps (int) – Total number of diffusion steps.
initial_step (int) – Step index at which reverse diffusion begins. Must satisfy 0 <= initial_step < diffusion_steps.
verbose (bool) – Whether to create a Keras Progbar.
disable_jit (bool) – When True, skip the initial_step range assertions (required when values are runtime tensors).

Returns:

A (step_size, progbar) tuple where step_size is the uniform time increment per step and progbar is a Progbar instance or None.
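
The bookkeeping can be sketched in plain Python as follows. The formula `step_size = max_t / diffusion_steps` and the helper name `prepare_diffusion_sketch` are assumptions; only the range check on initial_step and the count of effective steps are taken from the descriptions on this page.

```python
def prepare_diffusion_sketch(diffusion_steps, initial_step, max_t=1.0):
    # Range check described for initial_step.
    if not 0 <= initial_step < diffusion_steps:
        raise ValueError(
            f"initial_step must be in [0, {diffusion_steps}), got {initial_step}"
        )
    step_size = max_t / diffusion_steps          # assumed uniform time increment
    effective_steps = diffusion_steps - initial_step
    return step_size, effective_steps

step_size, effective_steps = prepare_diffusion_sketch(diffusion_steps=20, initial_step=5)
```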

prepare_schedule(base_diffusion_times, initial_noise, initial_samples, initial_step, step_size)[source]¶

Prepare the starting noisy images for the reverse diffusion loop.

Constructs the initial x_t tensor that is fed into the first diffusion step. Three cases are handled:

initial_samples provided and initial_step > 0: samples are mixed with noise at the noise level that corresponds to initial_step, skipping the highest-noise diffusion steps.
initial_samples provided and initial_step == 0: samples are mixed with noise at the maximum noise level (max_t), running the full diffusion process from a noised version of the samples.
initial_samples is None and initial_step == 0: the starting point is pure noise (initial_noise).

Parameters:

base_diffusion_times – Tensor of shape (n_images, *[1]*n_dims) filled with max_t.
initial_noise – Pure noise tensor of shape (n_images, *input_shape).
initial_samples – Optional samples of shape (n_images, *input_shape). When provided, the diffusion process starts from a noised version of these samples.
initial_step (int) – Step index at which reverse diffusion begins.
step_size (float) – Uniform time increment per diffusion step.

Returns:

Noisy images tensor of shape (n_images, *input_shape) to use as the starting point x_t for the diffusion loop.
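
The three cases can be illustrated with the NumPy sketch below. It assumes a cosine schedule like the one in the Keras DDIM example and a start time of `base_diffusion_times - initial_step * step_size`; both are assumptions, as are the helper names.

```python
import numpy as np

def cosine_rates(t, min_signal_rate=0.02, max_signal_rate=0.95):
    # Assumed cosine schedule; returns (noise_rates, signal_rates).
    start, end = np.arccos(max_signal_rate), np.arccos(min_signal_rate)
    angles = start + t * (end - start)
    return np.sin(angles), np.cos(angles)

def prepare_start(base_diffusion_times, initial_noise, initial_samples,
                  initial_step, step_size):
    if initial_samples is None:
        if initial_step != 0:
            raise ValueError("initial_step > 0 requires initial_samples")
        return initial_noise  # case 3: start from pure noise
    # Cases 1 and 2: noise the samples at the level where the loop starts.
    start_times = base_diffusion_times - initial_step * step_size
    noise_rates, signal_rates = cosine_rates(start_times)
    return signal_rates * initial_samples + noise_rates * initial_noise

rng = np.random.default_rng(0)
shape = (2, 4, 4, 1)
base_times = np.ones((2, 1, 1, 1))  # filled with max_t = 1.0
initial_noise = rng.normal(size=shape)
samples = rng.uniform(-1, 1, size=shape)

x_pure = prepare_start(base_times, initial_noise, None, 0, 0.05)
x_full = prepare_start(base_times, initial_noise, samples, 0, 0.05)
x_warm = prepare_start(base_times, initial_noise, samples, 10, 0.05)
```

A larger initial_step keeps more of the signal from `samples` in the starting point, which is why fewer reverse steps are needed.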

reverse_conditional_diffusion(measurements, initial_noise, diffusion_steps, initial_samples=None, initial_step=0, stochastic_sampling=False, seed=None, verbose=False, track_progress_type='x_0', disable_jit=False, **kwargs)[source]¶

Reverse diffusion process conditioned on some measurement.

Effectively performs diffusion posterior sampling p(x_0 | y) by interleaving reverse diffusion steps with gradient-based guidance (e.g. DPS or DDS).

Parameters:

measurements – Conditioning observations of shape (n_images, *measurement_shape).
initial_noise – Initial noise tensor of shape (n_images, *input_shape).
diffusion_steps (int) – Total number of diffusion steps.
initial_samples – Optional initial samples to warm-start the diffusion process. When provided, the diffusion process starts from a noised version of these samples. When initial_step == 0, samples are noised at the maximum noise level (max_t). When initial_step > 0, samples are noised at the noise level corresponding to initial_step.
initial_step (int) – Step at which to begin the reverse diffusion loop. 0 runs all diffusion_steps steps from maximum noise. Higher values skip early (high-noise) steps and require initial_samples to be provided. Number of effective steps will be diffusion_steps - initial_step.
stochastic_sampling (bool) – Whether to use stochastic DDPM sampling instead of deterministic DDIM sampling.
seed – Random seed generator.
verbose (bool) – Whether to show a Keras progress bar with the guidance error at each step.
track_progress_type (Literal[None, 'x_0', 'x_t']) – Intermediate output to store at each step. "x_0" stores the Tweedie-denoised estimate; "x_t" stores the noisy intermediate image; None disables tracking.
disable_jit (bool) – Whether to disable JIT compilation.
**kwargs – Additional keyword arguments forwarded to the guidance function and operator (e.g. omega, mask).

Returns:

Generated images of shape (n_images, *input_shape).

reverse_diffusion(initial_noise, diffusion_steps, initial_samples=None, initial_step=0, stochastic_sampling=False, seed=None, verbose=True, track_progress_type='x_0', disable_jit=False, training=False, network_type=None)[source]¶

Reverse diffusion process to generate images from noise.

Parameters:

initial_noise – Initial noise tensor of shape (n_images, *input_shape).
diffusion_steps (int) – Total number of diffusion steps.
initial_samples – Optional initial samples to warm-start the diffusion process. When provided, the diffusion process starts from a noised version of these samples. When initial_step == 0, samples are noised at the maximum noise level (max_t). When initial_step > 0, samples are noised at the noise level corresponding to initial_step.
initial_step (int) – Step at which to begin the reverse diffusion loop. 0 runs all diffusion_steps steps from maximum noise. Higher values skip early (high-noise) steps and require initial_samples to be provided. Number of effective steps will be diffusion_steps - initial_step.
stochastic_sampling (bool) – Whether to use stochastic DDPM sampling instead of deterministic DDIM sampling.
seed (SeedGenerator | None) – Random seed generator.
verbose (bool) – Whether to show a Keras progress bar.
track_progress_type (Literal[None, 'x_0', 'x_t']) – Intermediate output to store at each step. "x_0" stores the Tweedie-denoised estimate; "x_t" stores the noisy intermediate image; None disables tracking.
disable_jit (bool) – Whether to disable JIT compilation.
training (bool) – Whether to call the network in training mode.
network_type (Literal[None, 'main', 'ema']) – Which network weights to use. "main" uses the online network, "ema" uses the exponential-moving-average network. If None, the choice is determined by the training argument.

Returns:

Generated images of shape (n_images, *input_shape).

reverse_diffusion_step(shape, pred_images, pred_noises, signal_rates, next_signal_rates, next_noise_rates, seed=None, stochastic_sampling=False)[source]¶

A single reverse diffusion step.

Parameters:

shape – Shape of the input tensor.
pred_images – Predicted images.
pred_noises – Predicted noises.
signal_rates – Current signal rates.
next_signal_rates – Next signal rates.
next_noise_rates – Next noise rates.
seed – Random seed generator.
stochastic_sampling – Whether to use stochastic sampling (DDPM).

Returns:

next_noisy_images – Noisy images after the reverse diffusion step.
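
For the deterministic case, the update can be sketched in NumPy as in the Keras DDIM example: the clean-image estimate is re-noised to the next, lower noise level using the predicted noise. This sketch is an assumption about the form of the step and leaves out the stochastic (DDPM) branch, whose variance term is not described on this page.

```python
import numpy as np

def ddim_step(pred_images, pred_noises, next_signal_rates, next_noise_rates):
    """Deterministic DDIM update: re-noise the x_0 estimate to the next noise level."""
    return next_signal_rates * pred_images + next_noise_rates * pred_noises

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(2, 4, 4, 1))
noise = rng.normal(size=x0.shape)

# Current and next (less noisy) levels; each pair satisfies signal^2 + noise^2 = 1.
signal_rates, noise_rates = 0.6, 0.8
next_signal_rates, next_noise_rates = 0.8, 0.6

x_t = signal_rates * x0 + noise_rates * noise
# Oracle predictions stand in for the network outputs.
pred_noises = noise
pred_images = (x_t - noise_rates * pred_noises) / signal_rates
x_next = ddim_step(pred_images, pred_noises, next_signal_rates, next_noise_rates)
```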

sample(n_samples=1, n_steps=20, seed=None, **kwargs)[source]¶

Sample from the model.

Parameters:

n_samples – Number of samples to generate.
n_steps – Number of diffusion steps.
seed – Random seed generator.
**kwargs – Additional arguments.

Returns:

Generated samples of shape (n_samples, *input_shape).

start_track_progress(diffusion_steps, initial_step=0)[source]¶

Initialize progress tracking for the diffusion process.

Resets track_progress and sets track_progress_interval so that at most 50 frames are stored during the diffusion trajectory (to keep memory usage bounded for large step counts).

Parameters:

diffusion_steps (int) – Total number of diffusion steps.
initial_step (int) – Step index at which reverse diffusion begins.

store_progress(step, track_progress_type, next_noisy_images, pred_images)[source]¶

Store an intermediate diffusion frame in track_progress.

Frames are stored every track_progress_interval steps. Does nothing when track_progress_type is None.

Parameters:

step (int) – Current diffusion step index.
track_progress_type (Literal[None, 'x_0', 'x_t']) – Which tensor to store. "x_0" stores the Tweedie-denoised estimate (predicted clean image); "x_t" stores the noisy intermediate image.
next_noisy_images – Noisy images x_t after the current step.
pred_images – Predicted clean images x_0 at the current step.
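
One way to cap the number of stored frames at 50 is sketched below in plain Python. The ceiling-based interval and the helper names are assumptions; the actual rule used to set track_progress_interval may differ.

```python
import math

def track_interval(diffusion_steps, initial_step=0, max_frames=50):
    # Store every k-th step so that at most max_frames frames are kept.
    effective_steps = diffusion_steps - initial_step
    return max(1, math.ceil(effective_steps / max_frames))

def stored_steps(diffusion_steps, initial_step=0):
    interval = track_interval(diffusion_steps, initial_step)
    return [
        step
        for step in range(initial_step, diffusion_steps)
        if (step - initial_step) % interval == 0
    ]

few = stored_steps(20)     # short runs keep every step
many = stored_steps(1000)  # long runs are subsampled
```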

test_step(data)[source]¶: Custom test step so we can call model.fit() on the diffusion model.

train_step(data)[source]¶: Custom train step so we can call model.fit() on the diffusion model. Note: Only implemented for the TensorFlow backend.

class zea.models.diffusion.NuclearDiffusion(diffusion_model, operator, disable_jit=False)[source]¶

Bases: DPS

Nuclear Diffusion posterior sampling guidance.

A hybrid framework that combines diffusion posterior sampling (DPS) with low-rank temporal modeling for video restoration. This method replaces the sparsity assumption in Robust Principal Component Analysis (RPCA) with a learned diffusion prior while maintaining a nuclear norm penalty on the background component to encourage low-rank temporal structure.
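
To make the nuclear norm penalty concrete, the NumPy sketch below computes the nuclear norm (the sum of singular values) of a video reshaped so that each row is one flattened frame. The reshaping convention and the function name are assumptions for illustration, not the class's actual code.

```python
import numpy as np

def nuclear_norm(video):
    """video: (n_frames, height, width) -> sum of singular values of the frame matrix."""
    frames_as_rows = video.reshape(video.shape[0], -1)
    singular_values = np.linalg.svd(frames_as_rows, compute_uv=False)
    return singular_values.sum()

rng = np.random.default_rng(0)
frame = rng.normal(size=(16, 16))
static_video = np.stack([frame] * 8)  # identical frames: rank 1 over time
noisy_video = static_video + 0.5 * rng.normal(size=static_video.shape)

static_penalty = nuclear_norm(static_video)
noisy_penalty = nuclear_norm(noisy_video)
```

A temporally static background has a single nonzero singular value, so its nuclear norm is small relative to a video with frame-to-frame variation. Penalizing the nuclear norm therefore pushes the background component toward low temporal rank.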