zea.models.flow_matching
Flow matching generative model for ultrasound image generation and posterior sampling.
Replaces the cosine diffusion schedule and noise-prediction objective of
DiffusionModel with a linear flow-matching
schedule and a velocity-field prediction objective.
See also
DiffusionModel: DDIM-based counterpart.
Liu et al., Flow Straight and Fast, 2022. https://arxiv.org/abs/2209.03003
Lipman et al., Flow Matching for Generative Modeling, 2022. https://arxiv.org/abs/2210.02747
Esser et al., Scaling Rectified Flow Transformers for High-Resolution Image Synthesis, 2024. https://arxiv.org/abs/2403.03206
Classes
FlowMatchingModel(*args, **kwargs) | Flow matching generative model. |
- class zea.models.flow_matching.FlowMatchingModel(*args, **kwargs)
Bases: DiffusionModel
Flow matching generative model.
Implements conditional flow matching (CFM) with straight-line (linear) interpolation paths between data and noise. The forward process is:
\[x_t = (1 - t)\, x_0 + t\, \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I)\]
The network is trained to predict the velocity field
\[v_\theta(x_t, t) \approx v = \varepsilon - x_0\]
from which the clean-image estimate follows as
\[\hat{x}_0 = x_t - t\, v_\theta(x_t, t)\]
At inference, images are generated by integrating the probability flow ODE
\[\frac{dx}{dt} = v_\theta(x_t, t)\]
backwards from \(t = 1\) (pure noise) to \(t = 0\) (clean data) using a simple Euler discretisation (identical to the DDIM update rule under this linear schedule).
Noise samples are drawn independently from \(\mathcal{N}(0, I)\) and paired with data samples via independent (random) coupling, i.e. vanilla CFM / Rectified Flow (Liu et al. 2022). Minibatch Optimal Transport coupling (OT-CFM, Tong et al. 2023) is not currently implemented.
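The relations above can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation: the toy batch, the helper name euler_sample, and the oracle velocity (which uses the known pair of data and noise in place of a trained network) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy batch: 4 "images" of shape (8, 8, 1) with values in [0, 1].
x0 = rng.uniform(0.0, 1.0, size=(4, 8, 8, 1))
eps = rng.standard_normal(x0.shape)            # noise, drawn independently of x0
t = rng.uniform(0.0, 1.0, size=(4, 1, 1, 1))   # one flow time per image

# Forward process: straight-line interpolation between data and noise.
x_t = (1.0 - t) * x0 + t * eps

# Velocity target that the network is trained to predict.
v = eps - x0

# Clean-image estimate recovered from x_t and the (here exact) velocity.
x0_hat = x_t - t * v


def euler_sample(velocity_fn, x_start, n_steps=10):
    """Integrate dx/dt = v from t = 1 down to t = 0 with fixed Euler steps."""
    x = x_start
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t_now = 1.0 - i * dt
        x = x - dt * velocity_fn(x, t_now)
    return x


# Oracle velocity for this fixed (x0, eps) pair; a trained network would go here.
x_gen = euler_sample(lambda x, t_now: eps - x0, x_start=eps)
```

With the exact velocity, both x0_hat and x_gen match x0 up to floating-point error, because the path from noise to data is a straight line. A learned velocity field is only approximate, so the number of Euler steps then affects sample quality.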
All sampling, guidance (DPS/DDS), and posterior-sampling machinery from DiffusionModel is inherited unchanged.
Initialize a flow matching model.
- Parameters:
input_shape – Shape of the input data, typically (height, width, channels) for images.
input_range – Range of the input data. Default (0, 1).
network_name – Network architecture. One of "unet_time_conditional" or "dense_time_conditional".
network_kwargs – Extra keyword arguments forwarded to the network constructor.
name – Model name. Default "flow_matching_model".
guidance – Guidance method. Can be a string (e.g. "dps"), a dict with "name" and optional "params" keys, or a DiffusionGuidance instance.
operator – Forward operator. Same format as guidance.
ema_val – Exponential moving average coefficient for the inference network weights. Default 0.999.
min_t – Lower bound of the flow time interval. Default 0.0.
max_t – Upper bound of the flow time interval. Default 1.0.
**kwargs – Additional arguments forwarded to DiffusionModel.
- denoise(noisy_images, noise_rates, signal_rates, training, network=None)
Predict the velocity field and derive the clean-image estimate.
The network predicts the velocity \(v_\theta(x_t, t)\). The clean-image estimate follows as
\[\hat{x}_0 = x_t - t\, v_\theta(x_t, t)\]
To keep full compatibility with the parent’s sampling and guidance machinery (which expects a (pred_noises, pred_images) return value), the method also returns the corresponding noise estimate
\[\hat{\varepsilon} = \hat{x}_0 + v_\theta = x_t + (1 - t)\, v_\theta\]
under the name pred_noises. The parent’s reverse_diffusion_step() formula
\[x_{t - \Delta t} = \alpha_{t-\Delta t}\,\hat{x}_0 + \sigma_{t-\Delta t}\,\hat{\varepsilon}\]
is algebraically equivalent to the Euler step \(x_{t-\Delta t} = x_t - \Delta t\, v_\theta\) under the linear schedule, so no changes to the sampling loop are needed.
- Parameters:
noisy_images – Noisy images x_t of shape (n_images, *input_shape).
noise_rates – Flow times t, broadcastable to noisy_images.
signal_rates – 1 - t, broadcastable to noisy_images.
training (bool) – Whether to call the network in training mode.
network – Explicit network to use. If None, chosen based on training (see call()).
- Returns:
A (pred_noises_est, pred_images) tuple where pred_noises_est is \(\hat{\varepsilon}\) and pred_images is \(\hat{x}_0\).
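The equivalence between the parent-style update and the Euler step can be verified numerically. This is a small NumPy check under stated assumptions: v_theta is a random stand-in for a network output, and the values of t and dt are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

x_t = rng.standard_normal((2, 4, 4, 1))
v_theta = rng.standard_normal(x_t.shape)   # stand-in for the network's velocity
t, dt = 0.7, 0.1                           # arbitrary flow time and step size

# Quantities described for denoise(): clean-image and noise estimates.
pred_images = x_t - t * v_theta            # x0_hat
pred_noises = x_t + (1.0 - t) * v_theta    # eps_hat = x0_hat + v_theta

# Linear schedule evaluated at the next time t - dt.
t_next = t - dt
next_signal_rate = 1.0 - t_next
next_noise_rate = t_next

# Parent-style update versus the plain Euler step.
parent_style = next_signal_rate * pred_images + next_noise_rate * pred_noises
euler_style = x_t - dt * v_theta

max_abs_diff = np.max(np.abs(parent_style - euler_style))
```

The two updates agree up to floating-point rounding, which is why returning the pair (noise estimate, image estimate) is enough to reuse a sampling loop written for the parent class.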
- diffusion_schedule(diffusion_times)
Linear flow-matching schedule.
\[\text{noise\_rates} = t, \qquad \text{signal\_rates} = 1 - t\]
- Parameters:
diffusion_times – Tensor of flow times in [min_t, max_t].
- Returns:
A (noise_rates, signal_rates) tuple with the same shape as diffusion_times.
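As an illustration of the schedule, here is a minimal NumPy sketch. The function name linear_flow_schedule is hypothetical; the actual method operates on backend tensors rather than NumPy arrays.

```python
import numpy as np


def linear_flow_schedule(diffusion_times):
    """Return (noise_rates, signal_rates) = (t, 1 - t) for flow times t."""
    t = np.asarray(diffusion_times, dtype=np.float64)
    return t, 1.0 - t


# Five evenly spaced flow times from clean data (t = 0) to pure noise (t = 1).
times = np.linspace(0.0, 1.0, 5)
noise_rates, signal_rates = linear_flow_schedule(times)
```

Under this schedule the two rates always sum to one, so every x_t is a convex combination of the data sample and the noise sample.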
- get_config()
Returns the config of the object.
An object config is a Python dictionary (serializable) containing the information needed to re-instantiate it.
- property metrics
Metrics for training.
- reverse_diffusion_step(shape, pred_images, pred_noises, signal_rates, next_signal_rates, next_noise_rates, seed=None, stochastic_sampling=False)
A single reverse flow-matching step.
The deterministic (ODE) step is inherited unchanged from the parent. The stochastic step adds isotropic Langevin noise on top of the Euler update, turning the probability-flow ODE into a Langevin SDE:
\[x_{t - \Delta t} = x_t - \Delta t\, v_\theta(x_t, t) + \sqrt{2\,\Delta t}\; \mathbf{z}, \qquad \mathbf{z} \sim \mathcal{N}(0, I)\]
Under the linear schedule \(\alpha_t = 1 - t\), the time step is recovered as \(\Delta t = \alpha_{t-\Delta t} - \alpha_t\) (i.e. next_signal_rates - signal_rates), which is always positive during reverse sampling.
- Parameters:
shape – Shape of the output tensor.
pred_images – Clean-image estimate \(\hat{x}_0\).
pred_noises – Noise estimate \(\hat{\varepsilon}\) (equal to \(\hat{x}_0 + v_\theta\)).
signal_rates – Current signal rates \(\alpha_t = 1 - t\).
next_signal_rates – Next signal rates \(\alpha_{t - \Delta t} = 1 - (t - \Delta t)\).
next_noise_rates – Next noise rates \(t - \Delta t\).
seed – Random seed generator.
stochastic_sampling – Whether to add Langevin noise. Default False (deterministic Euler step).
- Returns:
Updated noisy images \(x_{t - \Delta t}\).
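The step described above can be sketched in NumPy as follows. This is an illustration under assumptions, not the library's code: reverse_step_sketch is a hypothetical helper that takes a NumPy random generator in place of the seed argument and omits the shape argument.

```python
import numpy as np


def reverse_step_sketch(pred_images, pred_noises, signal_rates,
                        next_signal_rates, next_noise_rates,
                        rng=None, stochastic_sampling=False):
    """One reverse step: deterministic update, optionally plus Langevin noise."""
    # Deterministic part, equal to x_t - dt * v under the linear schedule.
    next_x = next_signal_rates * pred_images + next_noise_rates * pred_noises
    if stochastic_sampling:
        # Recover the (positive) time step from the signal rates.
        dt = next_signal_rates - signal_rates
        z = rng.standard_normal(next_x.shape)
        next_x = next_x + np.sqrt(2.0 * dt) * z
    return next_x


rng = np.random.default_rng(2)
x_t = rng.standard_normal((2, 4, 4, 1))
v_theta = rng.standard_normal(x_t.shape)   # stand-in for a network output
t, dt = 0.6, 0.05

pred_images = x_t - t * v_theta
pred_noises = pred_images + v_theta
args = (pred_images, pred_noises, 1.0 - t, 1.0 - (t - dt), t - dt)

x_det = reverse_step_sketch(*args)
x_sto = reverse_step_sketch(*args, rng=rng, stochastic_sampling=True)
```

Here x_det coincides with the Euler update x_t - dt * v_theta, while x_sto differs from it by a Gaussian perturbation with standard deviation sqrt(2 * dt).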
- train_step(data)
Custom train step for Rectified Flow (independent coupling).
Trains the network to predict the velocity field \(v = \varepsilon - x_0\) from noisy observations \(x_t = (1 - t)\,x_0 + t\,\varepsilon\), where \(\varepsilon \sim \mathcal{N}(0, I)\) is sampled independently of \(x_0\).
Note
Only implemented for the TensorFlow backend.
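The training objective can be illustrated independently of any backend with a short NumPy sketch. Several things here are assumptions rather than facts about the library: the helper name flow_matching_loss, uniform sampling of t in [min_t, max_t], and the mean-squared-error loss, none of which is specified above.

```python
import numpy as np


def flow_matching_loss(predict_velocity, x0, rng, min_t=0.0, max_t=1.0):
    """Velocity-prediction loss for one batch (MSE is an assumed choice)."""
    batch = x0.shape[0]
    # One flow time per sample, shaped to broadcast over the image axes.
    t = rng.uniform(min_t, max_t, size=(batch,) + (1,) * (x0.ndim - 1))
    # Independent coupling: noise is drawn without looking at x0.
    eps = rng.standard_normal(x0.shape)
    x_t = (1.0 - t) * x0 + t * eps
    target = eps - x0
    pred = predict_velocity(x_t, t)
    return float(np.mean((pred - target) ** 2))


rng = np.random.default_rng(3)
x0 = np.zeros((64, 8, 8, 1))   # hypothetical all-zero batch, for a simple check

# A predictor that always outputs zero stands in for an untrained network.
loss = flow_matching_loss(lambda x_t, t: np.zeros_like(x_t), x0, rng)
```

For this all-zero batch and zero predictor, the loss reduces to the mean of the squared noise, so it lands close to 1. In real training, predict_velocity would be the network and the loss would be minimised by gradient descent.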