Data

This page covers the zea data format: how files are structured, how to create and read them, and where to get existing datasets. More detail on the data handling classes can be found in the zea.data module documentation.

Note

For the configuration system (model, pipeline, and scan parameters in YAML), see Config. Example notebooks on data handling live in Examples.

The philosophy behind the zea data format is to store data in a single file alongside all parameters needed to process it (e.g. Parameters) and additional metadata (e.g. acquisition conditions and patient info). This makes it easy to manage and share data, and ensures that all necessary information is always available when loading a file.

Additionally, to support the cognitive ultrasound framework, the zea data format is designed to allow for flexible and efficient access to a part of the data (e.g. a single frame or transmit) without the need to load the entire file into memory.

Working with zea data files

zea stores each acquisition as a single HDF5 file that follows the schema described in the format reference on this page. The primary API is zea.File. It behaves like h5py.File, and additionally parses parameters into a Parameters object (the merged probe + scan parameters, via load_parameters()) and validates the file against the zea data spec.

Open and read an existing file

from zea import File

with File("my_acquisition.hdf5") as f:
    raw   = f.data.raw_data[:]        # all frames
    raw0  = f.data.raw_data[0]        # first frame only
    parameters = f.load_parameters()  # returns zea.Parameters (merged probe + scan)
    scan  = f.scan                    # returns zea.data.spec.ScanSpec (bare scan group)
    probe = f.probe                   # returns zea.Probe

# For remote files (Hugging Face Hub):
with File("hf://zeahub/picmus/.../contrast_speckle.hdf5") as f:
    raw0 = f.data.raw_data[0]         # first frame

See zea.File for the full API reference.

Create a new file

Use zea.File.create() to build a validated file from NumPy arrays. All inputs are checked against the full schema before anything is written to disk.

>>> import numpy as np
>>> from zea import File

>>> n_frames, n_tx, n_el, n_ax = 2, 32, 128, 512
>>> raw_data = np.zeros((n_frames, n_tx, n_ax, n_el, 1), dtype=np.float32)
>>> probe_geometry = np.zeros((n_el, 3), dtype=np.float32)

>>> scan = {
...    "sampling_frequency": np.float32(40e6),
...    "center_frequency":   np.float32(7e6),
...    "demodulation_frequency": np.float32(7e6),
...    "initial_times":      np.zeros(n_tx, dtype=np.float32),
...    "t0_delays":          np.zeros((n_tx, n_el), dtype=np.float32),
...    "tx_apodizations":    np.ones((n_tx, n_el),  dtype=np.float32),
...    "focus_distances":    np.full(n_tx, np.inf,  dtype=np.float32),
...    "transmit_origins":   np.zeros((n_tx, 3),    dtype=np.float32),
...    "polar_angles":       np.zeros(n_tx, dtype=np.float32),
...    "time_to_next_transmit": np.ones((n_frames, n_tx), dtype=np.float32) * 1e-4,
... }

>>> probe = {
...    "name": "verasonics_l11_4v",
...    "probe_geometry": probe_geometry,
... }

>>> f = File.create(
...    "my_acquisition.hdf5",
...    data={"raw_data": raw_data},
...    scan=scan,
...    probe=probe,
... )
>>> f.close()

Save from a Parameters object

When you already hold a Parameters object (e.g. loaded from an existing file), you can round-trip it to a new file: to_scan_dict() and to_probe_dict() reconstruct the dicts that create() expects, so no manual field-by-field reconstruction is needed:

>>> # load parameters from any file
>>> with File("source.hdf5") as f:
...     parameters = f.load_parameters() # returns a `zea.Parameters` object
...     raw_data = f.data.raw_data[:]

>>> # save those parameters to a new file, without manually reconstructing the scan and probe dicts
>>> f2 = File.create(
...     "output.hdf5",
...     data={"raw_data": raw_data},
...     scan=parameters.to_scan_dict(),
...     probe=parameters.to_probe_dict() or None,
...     overwrite=True,
... )
>>> f2.close()

Multi-track files

Some acquisitions interleave multiple transmit sequences in a single recording. These sequences may use parameters that cannot be expressed by a single Scan, or may be intended for different Pipelines, for example when alternating between focused B-mode and plane-wave Doppler pulses. Rather than splitting these into separate files, zea can store them as Tracks: self-contained bundles of raw data and scan parameters in a single HDF5 file, with a shared Probe and metadata. Each track exposes its own Parameters object (via track.load_parameters()) containing the parameters needed to beamform the raw data in that track, so a Pipeline can be specified per track and applied independently to each frame in that track.

Global timing information can be stored in the optional track_schedule parameter, which indicates which track each transmit event belongs to. Combined with the time_to_next_transmit of each transmit event, this makes it possible to reconstruct the full timing of the acquisition.

Illustrative example of a zea file with two tracks.

HDF5 layout

acquisition.hdf5
├── attrs:  us_machine, description, zea_version
├── probe/                  # probe_geometry, probe_center_frequency, …
├── metadata/               # credit, annotations, subject, …
├── metrics/                # optional evaluation metrics
├── track_schedule          # optional int32[n_total_tx]
└── tracks/
    ├── track_0/
    │   ├── attrs:  label="focused_bmode"
    │   ├── data/           # raw_data, image, …
    │   └── scan/           # focus_distances, t0_delays, …
    └── track_1/
        ├── attrs:  label="planewave_doppler"
        ├── data/
        └── scan/

Write: create a file with multiple tracks

>>> import numpy as np
>>> from zea import File
>>> from zea.probes import create_probe_geometry

>>> n_frames, n_ax, n_el = 2, 512, 128
>>> n_tx_focused, n_tx_pw = 3, 2
>>> pitch = 0.0003

>>> probe_geometry = create_probe_geometry(n_el, pitch)

>>> # One track index per global transmit event across all frames
>>> track_schedule = np.tile(
...     [0] * n_tx_focused + [1] * n_tx_pw, n_frames
... ).astype(np.int32)

>>> f = File.create(
...     "acquisition.hdf5",
...     tracks=[
...         # Track 0: focused B-mode
...         {
...             "label": "focused_bmode",
...             "data": {"raw_data": np.zeros((n_frames, n_tx_focused, n_ax, n_el, 1))},
...             "scan": {
...                 "sampling_frequency":     40e6,
...                 "center_frequency":       7e6,
...                 "demodulation_frequency": 7e6,
...                 "initial_times":          np.zeros(n_tx_focused),
...                 "t0_delays":              np.zeros((n_tx_focused, n_el)),
...                 "tx_apodizations":        np.ones((n_tx_focused, n_el)),
...                 "focus_distances":        np.full(n_tx_focused, np.inf),
...                 "transmit_origins":       np.zeros((n_tx_focused, 3)),
...                 "polar_angles":           np.zeros(n_tx_focused),
...                 "time_to_next_transmit": np.ones((n_frames, n_tx_focused)) * 1e-4,
...             },
...         },
...         # Track 1: plane-wave Doppler
...         {
...             "label": "planewave_doppler",
...             "data": {"raw_data": np.zeros((n_frames, n_tx_pw, n_ax, n_el, 1))},
...             "scan": {
...                 "sampling_frequency":     40e6,
...                 "center_frequency":       7e6,
...                 "demodulation_frequency": 7e6,
...                 "initial_times":          np.zeros(n_tx_pw),
...                 "t0_delays":              np.zeros((n_tx_pw, n_el)),
...                 "tx_apodizations":        np.ones((n_tx_pw, n_el)),
...                 "focus_distances":        np.full(n_tx_pw, np.inf),
...                 "transmit_origins":       np.zeros((n_tx_pw, 3)),
...                 "polar_angles":           np.zeros(n_tx_pw),
...                 "time_to_next_transmit": np.ones((n_frames, n_tx_pw)) * 2e-4,
...             },
...         },
...     ],
...     probe={"name": "L11-4v", "probe_geometry": probe_geometry},
...     track_schedule=track_schedule,
...     overwrite=True,
... )
>>> f.close()

Read: unpack multiple tracks from a file

>>> import zea

>>> with zea.File("acquisition.hdf5") as f:
...     probe = f.probe             # probe is shared across all tracks
...     # See track labels:
...     print(f.track_labels)          # ['focused_bmode', 'planewave_doppler']
...     # Unpack in the same order as track_labels (always safe):
...     focused_track, planewave_track = f.tracks
...     # Or fetch a specific track by name:
...     focused_track = f.get_track("focused_bmode")
...     focused_parameters = focused_track.load_parameters()
...     focused_raw  = focused_track.data.raw_data[:]
...     # access the global timing information for the focused track:
...     focused_track.timestamps
...     # ... process with e.g. a focused B-mode pipeline
...     planewave_parameters = planewave_track.load_parameters()
...     planewave_raw  = planewave_track.data.raw_data[:]
...     # access the global timing information for the planewave track:
...     planewave_track.timestamps
...     # ... process with e.g. a plane-wave Doppler pipeline
['focused_bmode', 'planewave_doppler']
array([[0.    , 0.0001, 0.0002],
       [0.0007, 0.0008, 0.0009]], dtype=float32)
array([[0.0003, 0.0005],
       [0.001 , 0.0012]], dtype=float32)
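
The timestamps printed above can be reproduced by hand from track_schedule and each track's time_to_next_transmit. The following is a minimal sketch in plain Python, not zea's implementation: the function name reconstruct_timestamps is hypothetical, and it assumes that the first transmit event defines t = 0 and that each track's intervals are consumed frame by frame.

```python
def reconstruct_timestamps(track_schedule, time_to_next_transmit):
    """Return {track_index: [start time of each transmit event]}.

    track_schedule: list of track indices, one per global transmit event.
    time_to_next_transmit: {track_index: flat list of intervals in seconds},
        flattened frame by frame (assumed ordering).
    """
    cursors = {track: 0 for track in time_to_next_transmit}
    timestamps = {track: [] for track in time_to_next_transmit}
    now = 0.0  # assumption: the first transmit event defines t = 0
    for track in track_schedule:
        timestamps[track].append(now)
        # Advance the global clock by the interval that follows this event.
        now += time_to_next_transmit[track][cursors[track]]
        cursors[track] += 1
    return timestamps


# Same setup as the example above: 2 frames, 3 focused + 2 plane-wave transmits.
n_frames, n_tx_focused, n_tx_pw = 2, 3, 2
schedule = ([0] * n_tx_focused + [1] * n_tx_pw) * n_frames
intervals = {
    0: [1e-4] * (n_frames * n_tx_focused),
    1: [2e-4] * (n_frames * n_tx_pw),
}
timestamps = reconstruct_timestamps(schedule, intervals)
print([round(t, 4) for t in timestamps[0]])
print([round(t, 4) for t in timestamps[1]])
```

Reshaped to (n_frames, n_tx) per track, the printed values agree with the two arrays shown in the read example.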

zea data format reference

Files created with zea 0.1.0 and later are fully described by the FileSpec class.

Note

The spec is the single source of truth. The documentation below is automatically generated from zea.data.spec. Run python docs/source/spec_doc.py to refresh it after spec changes.

File hierarchy

Every zea HDF5 file follows the layout shown below.

data_file.hdf5         (attrs: us_machine, description, zea_version)
├── data/
│   ├── raw_data                  float32 | int16  (n_frames, n_tx, n_ax, n_el, n_ch)
│   ├── aligned_data/             group (AlignedData)
│   ├── beamformed_data/          group (BeamformedData)
│   ├── envelope_data/            group (EnvelopeData)
│   ├── image/                    group (Image)
│   ├── segmentation/             group (Segmentation)
│   ├── sos_map/                  group (SosMap)
│   ├── strain_percentage_map/    group (StrainPercentageMap)
│   ├── tissue_doppler/           group (TissueDopplerMap)
│   ├── color_doppler/            group (ColorDopplerMap)
│   └── <custom>/                 group (any spatial map)
├── scan/
│   ├── sampling_frequency        float32  scalar
│   ├── center_frequency          float32  scalar | (n_tx,)
│   ├── t0_delays                 float32  (n_tx, n_el)
│   └── …
├── metadata/
│   ├── subject/                  group (Subject)
│   ├── annotations/              group (Annotations)
│   ├── ecg/                      group (Signal1D)
│   └── …
└── metrics/
    └── …

Root attributes

Stored as HDF5 root-level attributes (not groups).

| Attribute | Type | Description | Values | Status |
| --- | --- | --- | --- | --- |
| us_machine | str | Name of the ultrasound system. | e.g. "Verasonics Vantage 256" | optional |
| description | str | Free-text description of the acquisition. | - | optional |
| zea_version | str | Version of zea that wrote this file (set automatically). | - | optional |

Group reference

The fields of each group are listed below. Fields marked optional may be absent; all others are required.

Data group containing raw channel data, derived pipeline products, and optional grouped data products.

Data fields

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| raw_data | float32 or int16 | (n_frames, n_tx, n_ax, n_el, n_ch) | - | Raw channel data. | optional |
| aligned_data | AlignedData | group | - | | optional |
| beamformed_data | BeamformedData | group | - | | optional |
| envelope_data | EnvelopeData | group | - | | optional |
| image | Image | group | - | | optional |
| segmentation | Segmentation | group | - | | optional |
| sos_map | SosMap | group | - | | optional |
| strain_percentage_map | StrainPercentageMap | group | - | | optional |
| shear_wave_elastography_map | ShearWaveElastographyMap | group | - | | optional |
| tissue_doppler | TissueDopplerMap | group | - | | optional |
| color_doppler | ColorDopplerMap | group | - | | optional |

Grouped data products

Each grouped data product is an HDF5 sub-group. Spatial map groups (beamformed_data, envelope_data, image, segmentation, and custom maps) also include a coordinates field: per-pixel Cartesian positions in metres with shape (*spatial_dims, 3), where spatial_dims matches the spatial (non-channel) dimensions of values. Custom spatial maps are also accepted: any extra key passed to DataSpec is validated as a generic Map sub-group.

aligned_data

Time-of-flight corrected data. Values are float32 or int16 in (n_frames, n_tx, n_ax, n_el, n_ch); labels names each channel (RF or I/Q).

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 or int16 | (n_frames, n_tx, n_ax, n_el, n_ch) | - | required |
| labels | str | (n_ch) | - | optional |

beamformed_data

Beamformed (beamsummed) data. Values are float32 in (n_frames, z, x, n_ch) or (n_frames, z, x, y, n_ch); labels names each channel (RF or I/Q).

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y, n_ch) or (n_frames, z, x, n_ch) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

envelope_data

Envelope-detected data. Values are float32 in (n_frames, z, x) or (n_frames, z, x, y).

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y) or (n_frames, z, x) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

image

Reconstructed (log-compressed) image. Values are uint8 in (n_frames, z, x) or (n_frames, z, x, y).

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 or uint8 | (n_frames, x, z, y) or (n_frames, x, z) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

segmentation

Semantic segmentation mask. Values are bool in (n_frames, z, x, y, n_labels); labels names each channel.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | bool | (n_frames, z, x, y, n_spatial_ch) or (n_frames, z, x, n_spatial_ch) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

sos_map

Speed-of-sound map in m/s. Values are float32.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y, n_spatial_ch) or (n_frames, z, x, y) or (n_frames, z, x, n_spatial_ch) or (n_frames, z, x) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

strain_percentage_map

Strain map in %. Values are float32.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y, n_spatial_ch) or (n_frames, z, x, y) or (n_frames, z, x, n_spatial_ch) or (n_frames, z, x) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

shear_wave_elastography_map

Shear-wave elastography map in m/s. Values are float32.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y, n_spatial_ch) or (n_frames, z, x, y) or (n_frames, z, x, n_spatial_ch) or (n_frames, z, x) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

tissue_doppler

Tissue Doppler velocity map in m/s. Values are float32.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y, n_spatial_ch) or (n_frames, z, x, y) or (n_frames, z, x, n_spatial_ch) or (n_frames, z, x) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

color_doppler

Color Doppler velocity map in m/s. Positive = towards probe. Values are float32.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| values | float32 | (n_frames, z, x, y, n_spatial_ch) or (n_frames, z, x, y) or (n_frames, z, x, n_spatial_ch) or (n_frames, z, x) | - | required |
| coordinates | float32 | (…, 3) | - | optional |
| labels | str | (n_spatial_ch) | - | optional |
| description | str | scalar | - | optional |
| unit | str | scalar | - | optional |
| min | float32 | scalar | - | optional |
| max | float32 | scalar | - | optional |

Scan group with acquisition and transmit sequence parameters.

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| sampling_frequency | float32 | scalar | Hz | Sampling frequency. | required |
| center_frequency | float32 | scalar or (n_tx) | Hz | Center frequency of the transmit pulse. | required |
| demodulation_frequency | float32 | scalar or (n_tx) | Hz | Demodulation frequency. | required |
| initial_times | float32 | (n_tx) | s | A/D converter start times per transmit. | required |
| t0_delays | float32 | (n_tx, n_el) | s | Transmit delays per element. | required |
| tx_apodizations | float32 | (n_tx, n_el) | - | Transmit apodization per element. | required |
| focus_distances | float32 | (n_tx) | m | Transmit focus distances. | required |
| transmit_origins | float32 | (n_tx, 3) | m | Transmit beam origins (x, y, z). | required |
| polar_angles | float32 | (n_tx) | rad | Polar angles of transmit beams. | required |
| time_to_next_transmit | float32 | (n_frames, n_tx) | s | Time between transmit events. | optional |
| azimuth_angles | float32 | (n_tx) | rad | Azimuthal angles of transmit beams. | optional |
| sound_speed | float32 | scalar | m/s | Speed of sound. | optional |
| tgc_gain_curve | float32 | (n_ax) | - | Time-gain-compensation curve. | optional |
| waveforms_one_way | float32 | (n_tx, n_samples_one_way) | V | One-way transmit waveforms. | optional |
| waveforms_two_way | float32 | (n_tx, n_samples_two_way) | V | Two-way transmit waveforms. | optional |
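
The shapes in the scan table are tied together by n_tx, n_el, n_ax and n_frames, which must agree with the raw_data shape (n_frames, n_tx, n_ax, n_el, n_ch). As an illustration only, the sketch below checks a subset of these constraints on plain shape tuples. The helper check_scan_shapes is hypothetical and is not part of zea; the actual validation is done by the spec when a file is created.

```python
def check_scan_shapes(raw_data_shape, scan_shapes):
    """Return a list of human-readable shape mismatches (empty if consistent).

    raw_data_shape: (n_frames, n_tx, n_ax, n_el, n_ch)
    scan_shapes: {field name: shape tuple}, e.g. collected from array.shape
    """
    n_frames, n_tx, n_ax, n_el, _ = raw_data_shape
    # Expected shapes for a subset of the scan fields listed in the table.
    expected = {
        "initial_times": (n_tx,),
        "t0_delays": (n_tx, n_el),
        "tx_apodizations": (n_tx, n_el),
        "focus_distances": (n_tx,),
        "transmit_origins": (n_tx, 3),
        "polar_angles": (n_tx,),
        "time_to_next_transmit": (n_frames, n_tx),
        "tgc_gain_curve": (n_ax,),
    }
    problems = []
    for name, shape in scan_shapes.items():
        want = expected.get(name)
        if want is not None and tuple(shape) != want:
            problems.append(f"{name}: expected {want}, got {tuple(shape)}")
    return problems


raw_shape = (2, 32, 512, 128, 1)
good = {"t0_delays": (32, 128), "polar_angles": (32,)}
bad = {"t0_delays": (128, 32)}  # element and transmit axes swapped
print(check_scan_shapes(raw_shape, good))
print(check_scan_shapes(raw_shape, bad))
```

Fields that may be either scalar or per-transmit (center_frequency, demodulation_frequency) are deliberately left out of this sketch.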

Probe group with probe geometry and frequency parameters.

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| name | str | scalar | - | Probe model name/identifier. | optional |
| type | str | scalar | - | Probe geometry type (linear, phased, curved, …). | optional |
| probe_center_frequency | float32 | scalar | Hz | Probe nominal centre frequency. | optional |
| probe_bandwidth_percent | float32 | scalar | % | Fractional bandwidth as a percentage. | optional |
| probe_geometry | float32 | (n_el, 3) | m | Element positions (x, y, z) per element, shape (n_el, 3). | optional |
| element_width | float32 | scalar | m | Width of a single transducer element. | optional |
| element_height | float32 | scalar | m | Height (elevation aperture) of a single transducer element. | optional |
| lens_sound_speed | float32 | scalar | m/s | Speed of sound in the acoustic lens. | optional |
| lens_thickness | float32 | scalar | m | Thickness of the acoustic lens. | optional |

Optional metadata group for subject, acquisition context, annotations, and extra time-series signals (ECG, voice narration, probe orientation). Extra signal keys are accepted and validated as SignalND sub-groups.

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| subject | Subject | group | - | | optional |
| credit | str | scalar | - | Credit or attribution for the dataset. | optional |
| probe_pose | ProbePose | group | - | Sampled probe pose at the transducer tip. | optional |
| voice_narration | Signal1D | group | - | Voice narration signal. | optional |
| ecg | Signal1D | group | - | Electrocardiogram signal. | optional |
| text_report | str | scalar | - | Free-text report associated with the study. | optional |
| annotations | Annotations | group | - | Frame-level annotations. | optional |

Sub-groups

subject (Subject)

Subject metadata associated with the study.

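| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| id | str | scalar | - | optional |
| type | str | scalar | - | optional |
| age | uint8 | scalar | - | optional |
| sex | str | scalar | - | optional |
| fat_percentage | float32 | scalar | - | optional |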

annotations (Annotations)

Frame-level annotations, either per frame or broadcast labels.

| Field | Type | Shape | Unit | Status |
| --- | --- | --- | --- | --- |
| anatomy | str | (n_frames) or scalar | - | optional |
| view | str | (n_frames) | - | optional |
| label | str | (n_frames) | - | optional |
| image_quality | str | (n_frames) or scalar | - | optional |
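
Fields whose shape is "(n_frames) or scalar" can hold either one label per frame or a single label that applies to every frame. The sketch below shows one way a reader might normalise both forms to a per-frame list. The helper per_frame is hypothetical and only illustrates the broadcasting idea; it does not describe how zea handles these fields internally.

```python
def per_frame(annotation, n_frames):
    """Expand a scalar annotation to one label per frame; validate sequences."""
    if isinstance(annotation, str):
        # A single label is broadcast to every frame.
        return [annotation] * n_frames
    labels = list(annotation)
    if len(labels) != n_frames:
        raise ValueError(f"expected {n_frames} labels, got {len(labels)}")
    return labels


print(per_frame("heart", 3))                # scalar form
print(per_frame(["A4C", "A4C", "A2C"], 3))  # per-frame form
```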

probe_pose (ProbePose)

Sampled probe pose metadata at the tip of the transducer.

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| translation | float32 | (T, 3) | m | Position of the transducer tip, ordered as (x, y, z), where x is lateral along the transducer, y is elevation (out of plane), and z is axial (depth). | required |
| rotation | float32 | (T, 3) or (T, 4) | - | Orientation associated with the transducer-tip pose in the x-lateral, y-elevation, z-axial coordinate convention, interpreted according to rotation_representation. | required |
| rotation_representation | str | scalar | - | Rotation parameterization: one of euler_xyz, quaternion_wxyz, or quaternion_xyzw. | required |
| start_time_offset | float32 | scalar | s | Time offset between the first transmit event of the ultrasound acquisition and sample 0 of this data. Negative means this data starts before the first transmit event; positive means it starts after. | required |
| sampling_frequency | float32 | scalar | Hz | Sampling frequency. | required |
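
The two quaternion layouts differ only in where the scalar component w is stored, so converting between them is a reordering of the four components. The helper below is a hypothetical sketch for a single pose sample, not a zea function; it assumes the conventional meaning of the names (wxyz stores the scalar part first, xyzw stores it last).

```python
def to_wxyz(rotation, rotation_representation):
    """Reorder one quaternion so that the scalar part w comes first."""
    if rotation_representation == "quaternion_wxyz":
        w, x, y, z = rotation
    elif rotation_representation == "quaternion_xyzw":
        x, y, z, w = rotation
    else:
        # euler_xyz has three components and needs a real conversion instead.
        raise ValueError(f"not a quaternion representation: {rotation_representation}")
    return (w, x, y, z)


identity_xyzw = (0.0, 0.0, 0.0, 1.0)
print(to_wxyz(identity_xyzw, "quaternion_xyzw"))
```

For the identity rotation the result is (1.0, 0.0, 0.0, 0.0), since the identity quaternion has w = 1 and a zero vector part.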

ecg / voice_narration (Signal1D)

One-dimensional sampled signal with timing metadata.

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| samples | uint8 or float32 or int16 or complex64 | - | - | Signal samples. | required |
| start_time_offset | float32 | scalar | s | Time offset between the first transmit event of the ultrasound acquisition and sample 0 of this data. Negative means this data starts before the first transmit event; positive means it starts after. | required |
| sampling_frequency | float32 | scalar | Hz | Sampling frequency. | required |
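
Because start_time_offset is measured from the first transmit event, sample i of a signal falls at start_time_offset + i / sampling_frequency on the acquisition clock. The sketch below spells this out in plain Python; sample_times is a hypothetical helper, not part of zea.

```python
def sample_times(n_samples, start_time_offset, sampling_frequency):
    """Time of each sample in seconds, relative to the first transmit event."""
    return [start_time_offset + i / sampling_frequency for i in range(n_samples)]


# A 10 Hz signal that starts 0.5 s before the first transmit event.
times = sample_times(n_samples=6, start_time_offset=-0.5, sampling_frequency=10.0)
print([round(t, 2) for t in times])
```

With these values the sixth sample (index 5) lands at t = 0, i.e. it coincides with the first transmit event.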

Optional metrics group for acquisition-level quality and performance metrics.

| Field | Type | Shape | Unit | Description | Status |
| --- | --- | --- | --- | --- | --- |
| common_midpoint_phase_error | float32 | (n_frames) | - | | optional |
| coherence_factor | float32 | (n_frames) | - | | optional |

Custom fields

Beyond the standard data types (raw_data, beamformed_data, …), you can attach arbitrary custom spatial maps and custom metadata to any zea file.

Custom spatial maps (data group)

A custom map is a named entry in the data group that associates a pixel array with a per-pixel Cartesian coordinate grid. Each map is then a function from Cartesian space to some real values. Pass it as a sub-dict under the key you want:

import numpy as np
from zea import File
from zea.beamform.pixelgrid import cartesian_pixel_grid

n_frames = 2
values = np.zeros((n_frames, 64, 64, 1), dtype=np.uint8)   # (frames, z, x[, channels])

# Build a coordinate grid matching the values spatial shape.
# cartesian_pixel_grid returns shape (nz, nx, 3); broadcast to add the frame dimension.
coords_2d = cartesian_pixel_grid(
    xlims=(-0.02, 0.02), zlims=(-0.03, 0.0), grid_size_x=64, grid_size_z=64
)  # shape (64, 64, 3), last axis = [x, y, z] in metres
coordinates = np.broadcast_to(coords_2d, (n_frames, 64, 64, 3)).copy()
# For a simple placeholder without a real grid:
# coordinates = np.zeros((n_frames, 64, 64, 3), dtype=np.float32)

# `raw` is a raw_data array and `scan` is a scan dict, e.g. built as in "Create a new file".
f = File.create(
    "my_acquisition.hdf5",
    data={
        "raw_data": raw,
        "my_overlay": {          # <-- Example of a custom field not in the zea spec
            "values":      values,
            "coordinates": coordinates,  # shape (*spatial_dims, 3)
            # optional: "labels", "description", "unit"
        },
    },
    scan=scan,
)
f.close()

# Reading back
with File("my_acquisition.hdf5") as f:
    overlay_values      = f.data.my_overlay.values[:]
    overlay_coordinates = f.data.my_overlay.coordinates[:]

Note

cartesian_pixel_grid() and polar_pixel_grid() are convenient helpers for constructing coordinate grids that match typical beamformed images. See their docstrings for full details.

Custom metadata (metadata group)

Standard metadata fields (credit, annotations, text_report, subject, ecg, …) are validated by MetadataSpec. Pass a plain dict as the metadata argument of File.create():

f = File.create(
    "my_acquisition.hdf5",
    data={"raw_data": raw},
    scan=scan,
    metadata={
        "credit": "My Lab, 2024",
        "text_report": "Normal acquisition, no pathology.",
        "annotations": {
            "label": np.array(["healthy", "healthy"]),
        },
    },
)
f.close()

Custom signal keys (anything beyond the standard names) are accepted and stored as SignalND entries. Each entry is a dict with samples, start_time_offset, and sampling_frequency:

import numpy as np
from zea import File

n_samples = 500
respiratory_signal = {
    "samples":            np.sin(np.linspace(0, 2 * np.pi, n_samples)).astype(np.float32),
    "start_time_offset":  np.float32(-0.5),   # seconds before first transmit
    "sampling_frequency": np.float32(10.0),   # Hz
}

f = File.create(
    "my_acquisition.hdf5",
    data={"raw_data": raw},
    scan=scan,
    metadata={
        "credit": "My Lab, 2024",
        "respiratory_signal": respiratory_signal,   # <-- custom SignalND field
    },
)
f.close()

# Reading back
with File("my_acquisition.hdf5") as f:
    meta = f.metadata()
    samples = meta.respiratory_signal.samples        # numpy array
    fs      = meta.respiratory_signal.sampling_frequency

See MetadataSpec for the full list of supported standard fields.

Supported datasets & conversion

The zea toolbox supports several public and research ultrasound datasets. Conversion scripts live in zea/data/convert/ and can be invoked as:

python -m zea.data.convert --dataset "echonet"  --src <src> --dst <dst>
python -m zea.data.convert --dataset "camus"    --src <src> --dst <dst>
python -m zea.data.convert --dataset "picmus"   --src <src> --dst <dst>

Supported datasets:

  • EchoNet-Dynamic: large-scale cardiac ultrasound.

  • EchoNet-LVH: cardiac dataset for left ventricular hypertrophy.

  • CAMUS: Cardiac Acquisitions for Multi-structure Ultrasound Segmentation.

  • PICMUS: Plane-wave Imaging Challenge in Medical Ultrasound.

  • Custom: any dataset can be converted by following the layout described above.

Data acquisition platforms

Verasonics

Record data with your Verasonics script, save the workspace to .mat, then convert:

python -m zea.data.convert --dataset "verasonics" --src <src> --dst <dst>

See zea.data.convert.verasonics for details.

us4us: to be added in a future release.