zea.models.flow_matching

Flow matching generative model for ultrasound image generation and posterior sampling.

Replaces the cosine diffusion schedule and noise-prediction objective of DiffusionModel with a linear flow-matching schedule and a velocity-field prediction objective.

See also

DiffusionModel: DDIM-based counterpart.
Liu et al., Flow Straight and Fast, 2022. https://arxiv.org/abs/2209.03003
Lipman et al., Flow Matching for Generative Modeling, 2022. https://arxiv.org/abs/2210.02747
Esser et al., Scaling Rectified Flow Transformers for High-Resolution Image Synthesis, 2024. https://arxiv.org/abs/2403.03206

Classes

FlowMatchingModel(*args, **kwargs)

Flow matching generative model.

class zea.models.flow_matching.FlowMatchingModel(*args, **kwargs)

Bases: DiffusionModel

Flow matching generative model.

Implements conditional flow matching (CFM) with straight-line (linear) interpolation paths between data and noise. The forward process is:

\[x_t = (1 - t)\, x_0 + t\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I)\]

The network is trained to predict the velocity field

\[v_\theta(x_t, t) \approx v = \varepsilon - x_0\]

from which the clean-image estimate follows as

\[\hat{x}_0 = x_t - t\, v_\theta(x_t, t)\]
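The three formulas above can be checked numerically. The following is a minimal sketch in plain Python, with a short list of floats standing in for an image tensor; the helper names `interpolate`, `velocity_target`, and `clean_estimate` are hypothetical and not part of zea.

```python
import random


def interpolate(x0, eps, t):
    """Forward process: x_t = (1 - t) * x0 + t * eps, elementwise."""
    return [(1.0 - t) * a + t * e for a, e in zip(x0, eps)]


def velocity_target(x0, eps):
    """Regression target for the network: v = eps - x0."""
    return [e - a for a, e in zip(x0, eps)]


def clean_estimate(x_t, v, t):
    """Clean-image estimate: x0_hat = x_t - t * v."""
    return [x - t * vi for x, vi in zip(x_t, v)]


rng = random.Random(0)
x0 = [0.2, 0.5, 0.9]                     # stand-in for pixel values in (0, 1)
eps = [rng.gauss(0.0, 1.0) for _ in x0]  # eps ~ N(0, I)
t = 0.7

x_t = interpolate(x0, eps, t)
v = velocity_target(x0, eps)
x0_hat = clean_estimate(x_t, v, t)       # equals x0 when v is the exact velocity
```

With the exact velocity, `x0_hat` reproduces `x0` up to floating-point rounding; with a trained network the estimate is only approximate.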

At inference, images are generated by integrating the probability flow ODE

\[\frac{dx}{dt} = v_\theta(x_t, t)\]

backwards from \(t = 1\) (pure noise) to \(t = 0\) (clean data). The first-order Euler discretisation of this ODE is identical to the DDIM update rule under the linear schedule; the default solver is the second-order Euler–Heun scheme (see solver_step()).
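As an illustration of the backward Euler integration, the sketch below replaces the trained network with a toy velocity field that is exact when every data sample equals one fixed vector. The names `euler_sample`, `toy_velocity`, and `TARGET` are assumptions made for this example only.

```python
import random

TARGET = [0.25, 0.75]  # hypothetical dataset consisting of one "clean image"


def toy_velocity(x, t):
    # If x0 is always TARGET, then eps = (x_t - (1 - t) * TARGET) / t,
    # so the exact velocity eps - x0 simplifies to (x_t - TARGET) / t.
    return [(xi - ci) / t for xi, ci in zip(x, TARGET)]


def euler_sample(velocity, x, n_steps):
    """Integrate dx/dt = velocity(x, t) from t = 1 down to t = 0."""
    dt = 1.0 / n_steps
    for i in range(n_steps, 0, -1):
        t = i * dt
        # Euler update: x_{t - dt} = x_t - dt * v(x_t, t)
        x = [xi - dt * vi for xi, vi in zip(x, velocity(x, t))]
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in TARGET]  # start from pure noise at t = 1
sample = euler_sample(toy_velocity, noise, n_steps=10)
```

For this toy field the trajectory is a straight line, so the Euler steps land on `TARGET` up to rounding regardless of the number of steps. A learned velocity field is generally curved, which is where step count and solver order start to matter.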

Noise samples are drawn independently from \(\mathcal{N}(0, I)\) and paired with data samples via independent (random) coupling, i.e. vanilla CFM / Rectified Flow (Liu et al. 2022). Minibatch Optimal Transport coupling (OT-CFM, Tong et al. 2023) is not currently implemented.

All sampling, guidance (DPS/DDS), and posterior-sampling machinery from DiffusionModel is inherited unchanged.

Initialize a flow matching model.

Parameters:

input_shape – Shape of the input data, typically (height, width, channels) for images.
input_range – Range of the input data. Default (0, 1).
network_name – Network architecture. One of "unet_time_conditional", "dense_time_conditional", or "dit_time_conditional" (Diffusion Transformer).
network_kwargs – Extra keyword arguments forwarded to the network constructor.
name – Model name. Default "flow_matching_model".
guidance – Guidance method. Can be a string (e.g. "dps"), a dict with "name" and optional "params" keys, or a DiffusionGuidance instance.
operator – Forward operator. Same format as guidance.
ema_val – Exponential moving average coefficient for the inference network weights. Default 0.999.
min_t – Lower bound of the flow time interval. Default 0.0.
max_t – Upper bound of the flow time interval. Default 1.0.
solver – ODE solver used for (unconditional) sampling. One of "heun" (second-order Euler–Heun, the default) or "euler" (first-order). Heun evaluates the velocity field twice per step for higher accuracy and is purely an inference-time choice (no retraining needed). See solver_step().
**kwargs – Additional arguments forwarded to DiffusionModel.
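The parameters above could be collected as follows. This is a hedged sketch: the keyword names come from the parameter list, while the values (in particular the input shape) are illustrative assumptions, and the commented-out construction has not been verified against the library.

```python
# Keyword arguments named in the parameter list above. The values are
# illustrative assumptions, not recommended settings.
flow_matching_kwargs = {
    "input_shape": (128, 128, 1),  # (height, width, channels); hypothetical size
    "input_range": (0, 1),
    "network_name": "unet_time_conditional",
    "ema_val": 0.999,
    "min_t": 0.0,
    "max_t": 1.0,
    "solver": "heun",              # or "euler"
}

# With zea installed, the model would presumably be built like this:
#   from zea.models.flow_matching import FlowMatchingModel
#   model = FlowMatchingModel(**flow_matching_kwargs)
```

Because the solver is an inference-time choice, the same trained weights can be sampled with either "heun" or "euler".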

denoise(noisy_images, noise_rates, signal_rates, training, network=None)

Predict the velocity field and derive the clean-image estimate.

The network predicts the velocity \(v_\theta(x_t, t)\). The clean-image estimate follows as

\[\hat{x}_0 = x_t - t\, v_\theta(x_t, t)\]

To keep full compatibility with the parent’s sampling and guidance machinery (which expects a (pred_noises, pred_images) return value), the method also returns the corresponding noise estimate

\[\hat{\varepsilon} = \hat{x}_0 + v_\theta = x_t + (1 - t)\, v_\theta\]

under the name pred_noises. The parent’s reverse_diffusion_step() formula

\[x_{t - \Delta t} = \alpha_{t-\Delta t}\,\hat{x}_0 + \sigma_{t-\Delta t}\,\hat{\varepsilon}\]

is algebraically equivalent to the Euler step \(x_{t-\Delta t} = x_t - \Delta t\, v_\theta\) under the linear schedule, so no changes to the sampling loop are needed.
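The equivalence can be confirmed numerically for a single pixel value. The sketch below is not the library code; `denoise_outputs`, `parent_style_step`, and `euler_step` are hypothetical helpers that transcribe the formulas above.

```python
import random


def denoise_outputs(x_t, v, t):
    """Return (eps_hat, x0_hat) derived from a velocity prediction v."""
    x0_hat = x_t - t * v
    eps_hat = x0_hat + v
    return eps_hat, x0_hat


def parent_style_step(x0_hat, eps_hat, next_t):
    """x_{t-dt} = alpha * x0_hat + sigma * eps_hat with alpha = 1 - next_t, sigma = next_t."""
    return (1.0 - next_t) * x0_hat + next_t * eps_hat


def euler_step(x_t, v, dt):
    """x_{t-dt} = x_t - dt * v."""
    return x_t - dt * v


rng = random.Random(0)
x_t, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # arbitrary state and velocity
t, dt = 0.6, 0.1

eps_hat, x0_hat = denoise_outputs(x_t, v, t)
via_parent = parent_style_step(x0_hat, eps_hat, t - dt)
via_euler = euler_step(x_t, v, dt)
```

Both routes give the same next state up to rounding, because the coefficients of \(x_t\) sum to one and those of \(v_\theta\) collapse to \(-\Delta t\).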

Parameters:

noisy_images – Noisy images x_t of shape (n_images, *input_shape).
noise_rates – Flow times t, broadcastable to noisy_images.
signal_rates – 1 - t, broadcastable to noisy_images.
training (bool) – Whether to call the network in training mode.
network – Explicit network to use. If None, chosen based on training (see call()).

Returns:

A (pred_noises_est, pred_images) tuple where pred_noises_est is \(\hat{\varepsilon}\) and pred_images is \(\hat{x}_0\).

diffusion_schedule(diffusion_times)

Linear flow-matching schedule.

\[\text{noise\_rates} = t, \qquad \text{signal\_rates} = 1 - t\]

Parameters:

diffusion_times – Tensor of flow times in [min_t, max_t].

Returns:

A (noise_rates, signal_rates) tuple with the same shape as diffusion_times.
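A minimal plain-Python sketch of the schedule, using a list of floats in place of a tensor (the function name `linear_schedule` is hypothetical):

```python
def linear_schedule(diffusion_times):
    """Linear schedule: noise_rates = t, signal_rates = 1 - t."""
    noise_rates = list(diffusion_times)
    signal_rates = [1.0 - t for t in diffusion_times]
    return noise_rates, signal_rates


noise_rates, signal_rates = linear_schedule([0.0, 0.25, 0.5, 1.0])
# noise_rates  -> [0.0, 0.25, 0.5, 1.0]
# signal_rates -> [1.0, 0.75, 0.5, 0.0]
```

Under this schedule the two rates sum to one at every time step.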

get_config()

Returns the config of the object.

An object config is a Python dictionary (serializable) containing the information needed to re-instantiate it.

property metrics

Metrics for training.

reverse_diffusion_step(shape, pred_images, pred_noises, signal_rates, next_signal_rates, next_noise_rates, seed=None, stochastic_sampling=False)

A single reverse flow-matching step.

The deterministic (ODE) step is inherited unchanged from the parent. The stochastic step adds isotropic Langevin noise on top of the Euler update, turning the probability-flow ODE into a Langevin SDE:

\[x_{t - \Delta t} = x_t - \Delta t\, v_\theta(x_t, t) + \sqrt{2\,\Delta t}\; \mathbf{z}, \qquad \mathbf{z} \sim \mathcal{N}(0, I)\]

Under the linear schedule \(\alpha_t = 1 - t\), the time step is recovered as \(\Delta t = \alpha_{t-\Delta t} - \alpha_t\) (i.e. next_signal_rates - signal_rates), which is always positive during reverse sampling.
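The step described above can be sketched for a single pixel value as follows. This is an illustration under stated assumptions rather than the library implementation: `reverse_step` is a hypothetical scalar helper, the `shape` argument is dropped, and a `random.Random` instance stands in for the seed generator.

```python
import math
import random


def reverse_step(pred_image, pred_noise, signal_rate, next_signal_rate,
                 next_noise_rate, rng=None, stochastic_sampling=False):
    """One reverse step; optionally adds sqrt(2 * dt) * z Langevin noise."""
    # Deterministic part, equivalent to the Euler update x_t - dt * v.
    x_next = next_signal_rate * pred_image + next_noise_rate * pred_noise
    if stochastic_sampling:
        dt = next_signal_rate - signal_rate  # positive during reverse sampling
        x_next += math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
    return x_next


t, dt = 0.6, 0.1
signal_rate = 1.0 - t
next_noise_rate = t - dt
next_signal_rate = 1.0 - next_noise_rate

x_det = reverse_step(0.3, -0.2, signal_rate, next_signal_rate, next_noise_rate)
x_sto = reverse_step(0.3, -0.2, signal_rate, next_signal_rate, next_noise_rate,
                     rng=random.Random(0), stochastic_sampling=True)
```

The two results differ only by the injected noise term, whose scale shrinks as the step size decreases.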

Parameters:

shape – Shape of the output tensor.
pred_images – Clean-image estimate \(\hat{x}_0\).
pred_noises – Noise estimate \(\hat{\varepsilon}\) (equal to \(\hat{x}_0 + v_\theta\)).
signal_rates – Current signal rates \(\alpha_t = 1 - t\).
next_signal_rates – Next signal rates \(\alpha_{t - \Delta t} = 1 - (t - \Delta t)\).
next_noise_rates – Next noise rates \(t - \Delta t\).
seed – Random seed generator.
stochastic_sampling – Whether to add Langevin noise. Default False (deterministic Euler step).

Returns:

Updated noisy images \(x_{t - \Delta t}\).

solver_step(noisy_images, noise_rates, signal_rates, next_noise_rates, next_signal_rates, shape, network=None, training=False, seed=None, stochastic_sampling=False)

Single ODE solver step for the probability-flow ODE.

Integrates \(\frac{dx}{dt} = v_\theta(x_t, t)\) backwards by one step using either a first-order Euler update (solver="euler") or a second-order Euler–Heun (improved-Euler) update (solver="heun", the default).

The Heun update evaluates the velocity field twice per step:

\[\begin{split}\tilde{x}_{t-\Delta t} &= x_t - \Delta t\, v_\theta(x_t, t) \qquad\text{(Euler predictor)} \\ x_{t-\Delta t} &= x_t - \tfrac{\Delta t}{2}\big( v_\theta(x_t, t) + v_\theta(\tilde{x}_{t-\Delta t},\, t-\Delta t) \big) \qquad\text{(trapezoidal corrector)}\end{split}\]

where \(\Delta t = t - (t-\Delta t)\) equals noise_rates - next_noise_rates (positive during reverse sampling). Heun’s method only changes inference; it reuses the same trained velocity network and requires no retraining.

Stochastic sampling falls back to the first-order Euler–Maruyama update of reverse_diffusion_step(), since the deterministic Heun corrector does not apply to the Langevin SDE.
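To make the accuracy difference between the two deterministic updates concrete, the sketch below integrates a toy ODE with a known solution. The velocity field \(v(x, t) = t\,x\) is an assumption chosen only because its exact solution is available; it is not the trained network, and the helper names are hypothetical.

```python
import math


def euler_step(velocity, x, t, dt):
    """First-order update: x_{t-dt} = x_t - dt * v(x_t, t)."""
    return x - dt * velocity(x, t)


def heun_step(velocity, x, t, dt):
    """Second-order update: Euler predictor followed by a trapezoidal corrector."""
    v_now = velocity(x, t)
    x_pred = x - dt * v_now               # Euler predictor
    v_next = velocity(x_pred, t - dt)     # second velocity evaluation
    return x - 0.5 * dt * (v_now + v_next)


def integrate(step, velocity, x, n_steps):
    """Apply `step` repeatedly from t = 1 down to t = 0."""
    dt = 1.0 / n_steps
    for i in range(n_steps, 0, -1):
        x = step(velocity, x, i * dt, dt)
    return x


def toy_velocity(x, t):
    # dx/dt = t * x has the exact solution x(0) = x(1) * exp(-1/2).
    return t * x


x1 = 1.0
exact = x1 * math.exp(-0.5)
euler_error = abs(integrate(euler_step, toy_velocity, x1, 10) - exact)
heun_error = abs(integrate(heun_step, toy_velocity, x1, 10) - exact)
```

For the same ten steps the Heun error is markedly smaller than the Euler error, at the cost of two velocity evaluations per step.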

Parameters:

noisy_images – Current noisy images x_t.
noise_rates – Flow times t at the current step.
signal_rates – 1 - t at the current step.
next_noise_rates – Flow times t - Δt at the next step.
next_signal_rates – 1 - (t - Δt) at the next step.
shape – Shape of the image tensor.
network – Explicit network to use (None selects based on training).
training (bool) – Whether to call the network in training mode.
seed – Random seed generator (for stochastic sampling).
stochastic_sampling (bool) – Whether to use stochastic (Langevin) sampling.

Returns:

A (next_noisy_images, pred_images) tuple where next_noisy_images is \(x_{t - \Delta t}\) and pred_images is the clean-image estimate \(\hat{x}_0\) at the current step.

test_step(data)

Custom test step for Rectified Flow (independent coupling).

train_step(data)

Custom train step for Rectified Flow (independent coupling).

Trains the network to predict the velocity field \(v = \varepsilon - x_0\) from noisy observations \(x_t = (1 - t)\,x_0 + t\,\varepsilon\), where \(\varepsilon \sim \mathcal{N}(0, I)\) is sampled independently of \(x_0\).
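The construction of one training example can be sketched as below. Two details are assumptions not stated in this documentation: the flow time is drawn uniformly from [min_t, max_t], and the loss is a mean squared error. The helper names are hypothetical, and a list of floats stands in for an image.

```python
import random


def make_training_pair(x0, rng, min_t=0.0, max_t=1.0):
    """Build (x_t, t, v) for one sample using independent coupling."""
    t = rng.uniform(min_t, max_t)            # assumption: uniform time sampling
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # drawn independently of x0
    x_t = [(1.0 - t) * a + t * e for a, e in zip(x0, eps)]
    v = [e - a for a, e in zip(x0, eps)]     # velocity target eps - x0
    return x_t, t, v


def mse(pred, target):
    """Mean squared error (assumed loss, for illustration only)."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
x0 = [0.1, 0.4, 0.8]
x_t, t, v_target = make_training_pair(x0, rng)

perfect_loss = mse(v_target, v_target)           # a perfect prediction gives 0.0
baseline_loss = mse([0.0] * len(x0), v_target)   # predicting zero velocity
```

In a real training step the prediction would come from the network evaluated at `(x_t, t)` and the loss would be averaged over a minibatch.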

Note

Only implemented for the TensorFlow backend.