zea.File¶
- class zea.File(name, mode='r', *args, **kwargs)[source]¶
Bases: File
h5py.File in zea format.
Initialize the file.
- Parameters:
name (str, Path, HFPath) – The path to the file. Can be a string or a Path object. Additionally can be a string with the prefix ‘hf://’, in which case it will be resolved to a huggingface path.
mode (str, optional) – The mode to open the file in. Defaults to “r”.
revision (str, optional) – HuggingFace revision (branch, tag, or commit hash) to download from. Only used when name starts with hf://. Defaults to "main". Example: revision="v0.1.0".
repo_type (str, optional) – HuggingFace repository type. Only used when name starts with hf://. Defaults to "dataset".
cache_dir (str or Path, optional) – Local cache directory for downloaded HuggingFace files. Only used when name starts with hf://.
*args – Additional arguments to pass to h5py.File.
**kwargs – Additional keyword arguments to pass to h5py.File.
- property acquisition_time: datetime | None¶
Return the acquisition timestamp as a timezone-aware UTC datetime.
Returns None when no timestamp was stored (e.g. human-subject files saved without an explicit acquisition_time).
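To illustrate what a "timezone-aware UTC datetime" looks like, the sketch below parses an ISO 8601 string of the kind File.create() accepts, using only the standard library. The helper name parse_acquisition_time is hypothetical and not part of zea, and treating a naive timestamp as UTC is an assumption made for this example; how zea converts the stored value internally is not described here.

```python
from datetime import datetime, timezone


def parse_acquisition_time(stored):
    """Hypothetical helper (not part of zea): ISO 8601 string -> aware UTC datetime."""
    if stored is None:
        # Mirrors the documented behaviour: no stored timestamp means None.
        return None
    parsed = datetime.fromisoformat(stored)
    if parsed.tzinfo is None:
        # Assumption for this sketch: a naive timestamp is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Normalise any other offset to UTC.
    return parsed.astimezone(timezone.utc)


stamp = parse_acquisition_time("2026-06-12T14:30:00+00:00")
shifted = parse_acquisition_time("2026-06-12T16:30:00+02:00")
print(stamp.isoformat())  # 2026-06-12T14:30:00+00:00
print(stamp == shifted)   # True
```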
- copy_key(key, dst)[source]¶
Copy a specific key to another file.
Will always copy the attributes and the scan data if it exists. Will warn if the key is not in this file or if the key already exists in the destination file.
- Parameters:
key (str) – The key to copy.
dst (File) – The destination file to copy the key to.
- classmethod create(path, data=None, scan=None, tracks=None, track_schedule=None, metadata=None, metrics=None, probe_name=None, probe=None, us_machine=None, description=None, acquisition_time=None, compression='lzf', chunk_frames=False, overwrite=False)[source]¶
Create a new zea HDF5 file from data, scan, and optional metadata.
All inputs are validated against the FileSpec schema (dtypes, shapes, dimension consistency) before anything is written to disk.
For single-track files, supply data and scan. For multi-track files, supply tracks (a list of dicts with "data" and "scan" keys, or TrackSpec objects) and optionally track_schedule.
- Parameters:
path – Destination file path.
data (dict | None) – Data dict accepted by DataSpec. Mutually exclusive with tracks.
scan (dict | None) – Scan-parameter dict accepted by ScanSpec. Mutually exclusive with tracks.
tracks (list | None) – List of track dicts (each with "data" and "scan" keys) or TrackSpec objects. Mutually exclusive with data/scan.
track_schedule (ndarray | None) – Optional int32 array of length n_total_tx indicating which track each global transmit belongs to. Only used with tracks.
metadata (dict | None) – Optional metadata dict accepted by MetadataSpec.
metrics (dict | None) – Optional metrics dict accepted by MetricsSpec.
probe_name (str | None) – Removed; use probe={'name': ...} instead.
probe (ProbeSpec | dict | None) – Probe specification as a Probe object or a plain dict accepted by ProbeSpec.
us_machine (str | None) – Name of the ultrasound machine.
description (str | None) – Free-text description of the acquisition.
acquisition_time (str | None) – UTC acquisition timestamp as an ISO 8601 string (e.g. "2026-06-12T14:30:00+00:00"). When None (default) the current UTC time is recorded automatically, unless the subject type is "human"; in that case no timestamp is saved by default (recording timestamps for human subjects may constitute Protected Health Information / PHI). To capture the current moment explicitly, pass datetime.now(timezone.utc).isoformat() (requires from datetime import datetime, timezone).
compression (str) – HDF5 compression filter (default "lzf").
chunk_frames (bool) – If True, use frame-wise chunking for all datasets containing a “frames” dimension. Each such dataset is stored with HDF5 chunking enabled, using a single frame (a single slice along the first dimension) per chunk.
overwrite (bool) – If False (default), raise if the file exists.
- Returns:
An open read-only File handle.
- Return type:
File
Single-track example:
>>> from datetime import datetime, timezone
>>> import numpy as np
>>> from zea import File
>>> n_frames, n_tx, n_ax, n_el = 2, 4, 64, 8
>>> raw = np.zeros((n_frames, n_tx, n_ax, n_el, 1), dtype=np.float32)
>>> probe_geometry = np.zeros((n_el, 3), dtype=np.float32)
>>> scan = {
...     "sampling_frequency": np.float32(40e6),
...     "center_frequency": np.float32(5e6),
...     "demodulation_frequency": np.float32(5e6),
...     "initial_times": np.zeros(n_tx, dtype=np.float32),
...     "t0_delays": np.zeros((n_tx, n_el), dtype=np.float32),
...     "tx_apodizations": np.ones((n_tx, n_el), dtype=np.float32),
...     "focus_distances": np.full(n_tx, np.inf, dtype=np.float32),
...     "transmit_origins": np.zeros((n_tx, 3), dtype=np.float32),
...     "polar_angles": np.zeros(n_tx, dtype=np.float32),
...     "time_to_next_transmit": np.ones((n_frames, n_tx), dtype=np.float32) * 1e-4,
... }
>>> File.create(
...     "example.hdf5",
...     data={"raw_data": raw},
...     scan=scan,
...     probe={"name": "verasonics_l11_4v", "probe_geometry": probe_geometry},
...     acquisition_time=datetime.now(timezone.utc).isoformat(),
...     overwrite=True,
... )
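The default rule described for acquisition_time can be restated as a small decision function. This is only a sketch of the documented behaviour under stated assumptions: resolve_acquisition_time and its subject_type argument are hypothetical names introduced for illustration, not zea functions or parameters.

```python
from datetime import datetime, timezone


def resolve_acquisition_time(acquisition_time, subject_type):
    """Hypothetical restatement of the documented default rule (not zea's code)."""
    if acquisition_time is not None:
        # An explicitly supplied timestamp is kept as given.
        return acquisition_time
    if subject_type == "human":
        # No timestamp is saved by default for human subjects (possible PHI).
        return None
    # Otherwise the current UTC time is recorded automatically.
    return datetime.now(timezone.utc).isoformat()


explicit = resolve_acquisition_time("2026-06-12T14:30:00+00:00", "human")
human_default = resolve_acquisition_time(None, "human")
phantom_default = resolve_acquisition_time(None, "phantom")
```

Here explicit keeps the supplied string, human_default is None, and phantom_default is an ISO 8601 string for the current UTC time.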
- property data: _GroupProxy¶
Lazy proxy for the data group of a single-track file.
Supports both the new tracks/track_0/data/ layout and the flat data/ layout (files without a tracks group).
Returns a GroupProxy so individual datasets can be accessed as attributes without loading everything into RAM:
    with File(path) as f:
        f.data.raw_data[:, :n_tx]  # read a slice
        f.data.image.values[0]  # nested group access
- Raises:
AttributeError – When the file contains more than one track. Use tracks to iterate over individual tracks.
- property description¶
Reads the description from the data file and returns it.
- get_scan_parameters()[source]¶
Returns a dictionary of parameters to initialize a scan object that comes with the file (stored inside datafile).
If there are no scan parameters in the hdf5 file, returns an empty dictionary.
- Returns:
The scan parameters.
- Return type:
dict
- classmethod get_shape(path, key)[source]¶
Get the shape of a key in a file.
- Parameters:
path (str) – The path to the file.
key (str) – The key to get the shape of.
- Returns:
The shape of the key.
- Return type:
tuple
- get_track(label)[source]¶
Return the track with the given label.
- Parameters:
label (str) – The exact label string assigned to the desired track.
- Returns:
The matching Track object.
- Return type:
Track
- Raises:
KeyError – If no track with that label exists, with a message listing the available labels so the error is self-diagnosing.
Example:
    with File("acquisition.hdf5") as f:
        focused = f.get_track("focused")
        raw = focused.data.raw_data[:]
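A "self-diagnosing" KeyError of the kind described above might look like the following sketch. It uses a plain dict as a stand-in for the file's tracks; find_track and the sample labels are hypothetical and do not reproduce zea's actual message format.

```python
def find_track(tracks_by_label, label):
    """Hypothetical stand-in for a label lookup over a plain dict of tracks."""
    try:
        return tracks_by_label[label]
    except KeyError:
        # List the labels that do exist so the caller can spot a typo.
        available = ", ".join(repr(name) for name in tracks_by_label)
        raise KeyError(
            f"No track labelled {label!r}. Available labels: {available}"
        ) from None


tracks_by_label = {"focused": "track_0", "planewave": "track_1"}  # made-up data
found = find_track(tracks_by_label, "focused")

try:
    find_track(tracks_by_label, "diverging")
except KeyError as err:
    message = err.args[0]
    print(message)
```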
- has_key(key)[source]¶
Check if the file has a specific key.
- Parameters:
key (str) – The key to check.
- Returns:
True if the key exists, False otherwise.
- Return type:
bool
- load_data(data_type, indices=None)[source]¶
Load data from the file.
Deprecated: Use file.data.<key> with standard h5py slice indexing instead:
    with File(path) as f:
        raw = f.data.raw_data[:]  # all frames
        raw = f.data.raw_data[0]  # first frame
        raw = f.data.raw_data[0, [0, 2]]  # frame 0, transmits 0 and 2
The indices parameter can be used to load a subset of the data. This can be:
- 'all' or None to load all data
- an int to load a single frame
- a List[int] to load specific frames
- a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.
For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.
- Parameters:
data_type (str) – The type of data to load. Options are ‘raw_data’, ‘aligned_data’, ‘beamformed_data’, ‘envelope_data’, ‘image’ and ‘image_sc’.
indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None, in which case all data is loaded.
- Return type:
ndarray
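The accepted forms of indices listed above can be summarised by a small normalisation routine. This is a sketch only: normalize_indices is a hypothetical helper, and the error it raises for lists on several axes is an illustrative choice, not zea's documented behaviour.

```python
def normalize_indices(indices=None):
    """Hypothetical sketch: turn the documented `indices` forms into an index tuple."""
    if indices is None or (isinstance(indices, str) and indices == "all"):
        return (slice(None),)  # load all data
    if isinstance(indices, (int, list, slice)):
        return (indices,)  # index the frame axis only
    if isinstance(indices, tuple):
        n_lists = sum(isinstance(axis, list) for axis in indices)
        if n_lists > 1:
            # Lists of indices on several axes are documented as unsupported.
            raise ValueError("Use a list on at most one axis; use slices for the others.")
        return indices
    raise TypeError(f"Unsupported indices type: {type(indices).__name__}")


all_frames = normalize_indices("all")
one_frame = normalize_indices(0)
frame_and_transmits = normalize_indices((0, [0, 2]))
```

A request such as ([0, 1], [0, 2]) is rejected here; replacing one of the lists with a slice, or slicing after loading, avoids the problem.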
- load_parameters(**overrides)[source]¶
Load the acquisition parameters (merged probe + scan) from the file.
Reads both the scan and probe groups and merges them into a single Parameters object that owns derivation, caching, and lazy loading of derived quantities. The probe and scan groups live at the same level and have non-overlapping field names, so merging is a plain dict union.
- Parameters:
**overrides – Override any parameter from the file. Custom (non-spec) keys are stored as passthrough parameters.
- Returns:
The merged, derivable parameters object.
- Return type:
Parameters
- Raises:
AttributeError – When the file contains more than one track. Use tracks and call .load_parameters() on each track.
>>> from zea import File
>>> path = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )
>>> with File(path) as f:
...     parameters = f.load_parameters()
>>> type(parameters).__name__
'Parameters'
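The "plain dict union" with overrides can be pictured with ordinary dictionaries. The sketch below is not zea's implementation: merge_parameters is a hypothetical function, the field values are made up, and the overlap check merely guards the non-overlap property stated above.

```python
def merge_parameters(probe, scan, **overrides):
    """Hypothetical sketch of a probe + scan merge using plain dicts."""
    shared = set(probe) & set(scan)
    if shared:
        # The field names are described as non-overlapping; guard that assumption.
        raise ValueError(f"Overlapping field names: {sorted(shared)}")
    merged = {**probe, **scan}  # plain dict union
    merged.update(overrides)  # overrides win; unknown keys pass through unchanged
    return merged


# Made-up example values, not read from any file.
probe = {"name": "verasonics_l11_4v", "probe_geometry": [[0.0, 0.0, 0.0]]}
scan = {"sampling_frequency": 40e6, "center_frequency": 5e6}
merged = merge_parameters(probe, scan, center_frequency=7e6, my_note="demo")
```

In this sketch center_frequency takes the overridden value, while my_note is carried along as a passthrough key.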
- load_transmits(key, selected_transmits)[source]¶
Load raw_data or aligned_data for a given list of transmits.
- Parameters:
key (str) – The type of data to load. Options are ‘raw_data’ and ‘aligned_data’.
selected_transmits (list, np.ndarray) – The transmits to load.
- property metadata: MetadataSpec¶
Return a validated MetadataSpec object from the file.
- Returns:
The validated metadata spec.
- Return type:
MetadataSpec
- Raises:
KeyError – If the file has no metadata group.
Example
>>> from zea import File
>>> path = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )
>>> with File(path, revision="v0.1.0") as f:
...     meta = f.metadata
...     print(meta.subject.type)
phantom
- property metrics: MetricsSpec¶
Return a validated MetricsSpec object from the file.
- Returns:
The validated metrics spec.
- Return type:
MetricsSpec
- Raises:
KeyError – If the file has no metrics group.
Example:
>>> with File("my_file.hdf5") as f:
...     met = f.metrics
...     print(met.coherence_factor.shape)
- property n_ax: int¶
Number of axial samples.
- property n_el: int¶
Number of elements.
- property n_frames: int¶
Number of frames.
- property n_tx: int¶
Number of transmit events.
- property name¶
Return the name of the file.
- property path¶
Return the path of the file.
- property probe: Probe¶
Returns a Probe object initialized with the parameters from the file.
- Returns:
The probe object.
- Return type:
Probe
Example
>>> from zea import File
>>> path = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )
>>> with File(path) as f:
...     probe = f.probe
>>> probe.name
'verasonics_l11_4v'
- property probe_name¶
Reads the probe name from the data file and returns it.
- recursively_load_dict_contents_from_group(path)[source]¶
Load dict from contents of group.
Deprecated: Use the module-level load_dict_from_hdf5_group() function instead, passing an h5py.Group directly.
- Parameters:
path (str) – Path to the group.
- Returns:
Dictionary with the contents of the group.
- Return type:
dict
- property scan: ScanSpec | None¶
Return the validated ScanSpec for this file.
This is the bare scan group as a spec object. For a full, derivable parameter object (merged probe + scan, with caching and derived properties) use load_parameters().
- Raises:
AttributeError – When the file contains more than one track. Use tracks and access .scan on each track instead.
>>> from zea import File
>>> path = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )
>>> with File(path, revision="v0.1.0", mode="r") as f:
...     scan = f.scan
>>> type(scan).__name__
'ScanSpec'
- property stem¶
Return the stem of the file.
- property track_labels: list[str | None]¶
Labels of all tracks in acquisition order.
Returns a list with one entry per track. Each entry is the label string stored on that track, or None for unlabelled tracks (e.g. single-track or legacy files). The list order matches tracks, so unpacking f.tracks in the same order as f.track_labels is always safe.
Example:
    with File("acquisition.hdf5") as f:
        print(f.track_labels)  # ['focused', 'planewave']
        focused, planewave = f.tracks  # safe: same order
- property track_schedule: ndarray | None¶
Track index for each global transmit event, shape (n_total_tx,).
Returns an int32 array that maps every transmit event (in acquisition order) to the track it belongs to, or None if no track_schedule dataset was stored in this file.
Example:
    with File("multi_track.hdf5") as f:
        sched = f.track_schedule  # e.g. array([0, 1, 0, 1, ...])
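To see how such a schedule maps global transmit events to tracks, the sketch below groups transmit indices by track index. A plain Python list with made-up values stands in for the int32 array, and transmits_per_track is a hypothetical helper, not a zea function.

```python
from collections import defaultdict


def transmits_per_track(track_schedule):
    """Group global transmit indices by the track index they are assigned to."""
    groups = defaultdict(list)
    for global_tx, track_index in enumerate(track_schedule):
        groups[int(track_index)].append(global_tx)
    return dict(groups)


schedule = [0, 1, 0, 1, 1]  # made-up stand-in for the int32 array
by_track = transmits_per_track(schedule)
print(by_track)  # {0: [0, 2], 1: [1, 3, 4]}
```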
- property tracks: list[Track]¶
Return a list of Track objects, one per track.
Each track exposes .data (a GroupProxy), .scan (a ScanSpec) and .load_parameters() (a Parameters factory method) for that specific track.
- Raises:
AttributeError – For flat-layout files that have no tracks/ group; use data and scan directly for those.
Example:
    with File("multi_track.hdf5") as f:
        for track in f.tracks:
            raw = track.data.raw_data[:]
            parameters = track.load_parameters()
- property us_machine¶
Reads the ultrasound machine name from the data file and returns it.
- validate()[source]¶
Lightweight structural validation — no array data is loaded into RAM.
Checks that the file has a data group and that all keys within it are recognised zea data types. For legacy files (before zea v0.1.0) a minimal key-name check is performed. For files created with zea v0.1.0 and later (via File.create()) the keys are checked against the DataSpec schema.
Use validate_spec() for a full validation that loads all data and checks dtypes, shapes, and cross-field dimension consistency.
- Returns:
{"status": "success"} on success.
- Return type:
dict
- Raises:
AssertionError – If the file is missing required groups or contains unrecognised data keys.
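A key-name check of this kind can be sketched with nested dicts standing in for HDF5 groups. Everything here is an assumption made for illustration: check_data_keys is a hypothetical function, and RECOGNISED_KEYS simply reuses the data types listed as load_data options, which may differ from the set zea actually checks.

```python
# Assumption: reuse the data types listed as load_data options.
RECOGNISED_KEYS = {
    "raw_data",
    "aligned_data",
    "beamformed_data",
    "envelope_data",
    "image",
    "image_sc",
}


def check_data_keys(groups):
    """Hypothetical structural check; inspects key names only, never array contents."""
    assert "data" in groups, "missing required group: data"
    unknown = sorted(set(groups["data"]) - RECOGNISED_KEYS)
    assert not unknown, f"unrecognised data keys: {unknown}"
    return {"status": "success"}


result = check_data_keys({"data": {"raw_data": None, "image": None}})

try:
    check_data_keys({"data": {"raw_data": None, "my_custom_key": None}})
except AssertionError as err:
    failure = str(err)
```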
- validate_spec()[source]¶
Full schema validation — loads all data into RAM.
Reads every dataset in the file and runs dtype, shape, and cross-dimension consistency checks as defined by FileSpec. Use this to confirm a file is fully spec-compliant before sharing or processing it.
For a fast, zero-IO structural check use validate() instead.
Note
This method only works on files created with zea v0.1.0 and later. Files written before zea v0.1.0 should be re-saved through File.create().
- Returns:
The fully validated spec object, with all data accessible as typed attributes (e.g. spec.data.raw_data, spec.scan.n_tx).
- Return type:
- Raises:
TypeError, ValueError – If the file does not conform to the spec.
>>> with File("my_file.hdf5") as f:
...     spec = f.validate_spec()
...     print(spec.scan.n_tx)
- property zea_version: str | None¶
Return the zea version that wrote this file, or None for legacy files.
Files created with zea v0.1.0 and later store a zea_version root attribute. Files written before zea v0.1.0 return None.