zea.data.convert.echonetlvh

Script to convert the EchoNet-LVH database to zea format.

Each video is cropped so that the scan cone is centered without padding, which allows it to be converted to the polar domain.

For more information about the dataset, see the following link:

  • The original dataset can be found at this link.

Functions

convert_echonetlvh(src, dst, no_rejection, ...)

Conversion script for the EchoNet-LVH dataset.

find_avi_file(source_dir, hashed_filename)

Find AVI file in the source EchoNet-LVH dataset.

load_cone_parameters(csv_path)

Load cone parameters from CSV file into a dictionary.

load_shapes(csv_path)

Load shapes from MeasurementsList.csv, keyed by avi filename.

load_splits(csv_path)

Load splits from MeasurementsList.csv and return the avi filenames in each split.

overwrite_splits(csv_path[, rejection_path])

Overwrite splits in a MeasurementsList.csv based on manual_rejections.txt or another txt file specifying which hashes to reject.

precompute_cone_parameters(source_path, ...)

Precompute and save cone parameters for all AVI files.

transform_measurement_coordinates_with_cone_params(...)

Transform measurement coordinates using cone parameters from fit_scan_cone.

transform_measurements_csv(csv_path[, ...])

Update a measurements CSV file in place with coordinates transformed using cone parameters.

Classes

LVHProcessor(path_out_h5, splits, cone_params)

Processor for EchoNet-LVH dataset.

class zea.data.convert.echonetlvh.LVHProcessor(path_out_h5, splits, cone_params, polar_shape=(600, 600), frame_bucket=128)[source]

Bases: object

Processor for EchoNet-LVH dataset.

__call__(avi_file)[source]

Takes a single avi_file and generates a zea dataset.

Sequential convenience wrapper around load(), compute() and save(); the parallel pipeline drives those stages directly.

Parameters:

avi_file (Path) – Path to the AVI file to process

compute(payload)[source]

Stage 2 (GPU, main thread only): polar conversion.

Takes a payload from load() and returns the host-side arrays and metadata for save(). Keep this on the main thread: there is a single device and concurrent tracing is not safe.

get_split(avi_file)[source]

Get the split (train/val/test) for a given AVI file.

load(avi_file)[source]

Stage 1 (I/O + host preprocessing, thread-safe): read+decode the AVI, fetch cone params, frame-pad, and build the cropped cartesian view (image_sc).

Runs no GPU/JAX work, so it is safe to call from worker threads. Returns a payload dict consumed by compute(), or raises on missing params or an all-zero cropped sequence.

run(files, load_workers=8, save_workers=2, max_pending_saves=None)[source]

Run over files as an overlapped load -> compute -> save pipeline so the GPU is not stalled on disk I/O.

Loads (decode) run on a thread pool, GPU compute stays on the main thread, and writes go to a small saver pool. A single bad file is logged and skipped rather than aborting the whole (multi-hour) run.

The in-flight save queue is bounded (max_pending_saves): each pending save pins a decoded volume in memory, so without a cap a transient write slowdown would let memory grow until the process is OOM-killed. h5py serialises its writes through a global lock, so a couple of save workers is enough to overlap the write with GPU compute; more just adds memory pressure.
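
The overlap described above can be illustrated with a minimal, standard-library sketch. It is not the zea implementation: run_pipeline and its load, compute and save callables are hypothetical stand-ins, and only the argument names load_workers, save_workers and max_pending_saves are taken from the signature of run(). The sketch assumes the cap on in-flight saves can be modelled with a semaphore.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(files, load, compute, save,
                 load_workers=8, save_workers=2, max_pending_saves=4):
    """Overlap load -> compute -> save and return the items that were saved."""
    # Hypothetical stand-in for the bounded save queue: one slot per pending write.
    slots = threading.BoundedSemaphore(max_pending_saves)
    saved = []

    def save_and_release(item, result):
        try:
            save(item, result)
            saved.append(item)
        except Exception as exc:
            print(f"save failed for {item}: {exc}")
        finally:
            slots.release()  # free the slot even if the write failed

    with ThreadPoolExecutor(load_workers) as loaders, \
            ThreadPoolExecutor(save_workers) as savers:
        # Loading starts right away on the loader pool.
        pending = [(item, loaders.submit(load, item)) for item in files]
        for item, future in pending:
            try:
                payload = future.result()
                result = compute(payload)  # runs on the calling thread only
            except Exception as exc:
                # One bad file is reported and skipped, not fatal.
                print(f"skipping {item}: {exc}")
                continue
            slots.acquire()  # blocks while max_pending_saves writes are in flight
            savers.submit(save_and_release, item, result)
    # Leaving the with-block waits for all outstanding saves.
    return sorted(saved)
```

Unlike a production pipeline, this sketch submits every load up front, so it bounds pending saves but not loaded payloads; a real run over many large videos would also need to limit how far loading runs ahead of compute.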

static save(out_h5, image_sc_np, polar, metadata)[source]

Stage 3 (I/O, thread-safe): write the zea HDF5 file.

polar may still be an unmaterialised device array from compute(); the blocking device->host copy happens here, off the main thread, so it overlaps with the next file’s GPU compute. No tracing is involved, so materialising it from a worker thread is safe.

static scan_convert(image_polar, cone_params, cartesian_shape, order=1)[source]

Scan-convert image_polar to Cartesian coordinates using the cone parameters, so that the result exactly matches the cropped original (i.e. the ‘image’ in the file).

zea.data.convert.echonetlvh.convert_echonetlvh(src, dst, no_rejection, rejection_path, convert_measurements, convert_images, max_files, force, max_workers=8)[source]

Conversion script for the EchoNet-LVH dataset. Unzips the dataset, overwrites splits if needed, precomputes cone parameters, converts images and/or measurements to zea format, and saves the dataset. It is called with argparse arguments through zea/zea/data/convert/__main__.py.

zea.data.convert.echonetlvh.find_avi_file(source_dir, hashed_filename)[source]

Find AVI file in the source EchoNet-LVH dataset.

Parameters:
  • source_dir (Path) – Source directory containing BatchX subdirectories

  • hashed_filename (str) – Hashed filename (with or without .avi extension)

Returns:

Path to the AVI file if found, else None
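
As a rough illustration of the lookup described above, the following sketch searches subdirectories for a hashed filename with or without the .avi extension. The function name find_avi_sketch and the "Batch*" glob pattern are assumptions made for this example; the actual search logic in zea may differ.

```python
from pathlib import Path


def find_avi_sketch(source_dir, hashed_filename):
    """Return the first matching AVI under a Batch* subdirectory, else None."""
    # Accept the name with or without the .avi extension.
    name = hashed_filename
    if not name.endswith(".avi"):
        name += ".avi"
    # Assumed layout: <source_dir>/Batch1, <source_dir>/Batch2, ...
    for batch_dir in sorted(Path(source_dir).glob("Batch*")):
        candidate = batch_dir / name
        if candidate.is_file():
            return candidate
    return None
```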

zea.data.convert.echonetlvh.load_cone_parameters(csv_path)[source]

Load cone parameters from CSV file into a dictionary.

Only loads the rows with status “success”.

Parameters:

csv_path – Path to the CSV file containing cone parameters

Returns:

Dictionary mapping avi_filename to cone parameters
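
The status filter can be sketched with the standard csv module. The column names avi_filename and status follow the description above; the remaining columns (crop_left, crop_top) and the function name load_successful_rows are hypothetical, and values are left as strings rather than converted to numbers.

```python
import csv
import io


def load_successful_rows(csv_file):
    """Map avi_filename to its remaining columns, keeping only successful rows."""
    params = {}
    for row in csv.DictReader(csv_file):
        if row.get("status") != "success":
            continue  # rows that failed cone fitting are ignored
        name = row.pop("avi_filename")
        row.pop("status")
        params[name] = dict(row)
    return params


# Hypothetical file contents, for illustration only.
example = io.StringIO(
    "avi_filename,status,crop_left,crop_top\n"
    "a.avi,success,10,4\n"
    "b.avi,error,,\n"
)
print(load_successful_rows(example))
```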

zea.data.convert.echonetlvh.load_shapes(csv_path)[source]

Load shapes from MeasurementsList.csv, keyed by avi filename.

Parameters:

csv_path (str | Path) – Path to the MeasurementsList.csv file

Returns:

Dictionary with the filename as key and the shape as value

zea.data.convert.echonetlvh.load_splits(csv_path)[source]

Load splits from MeasurementsList.csv and return the avi filenames in each split.

Parameters:

csv_path (str | Path) – Path to the MeasurementsList.csv file

Returns:

Dictionary with keys ‘train’, ‘val’, ‘test’, ‘rejected’ and values as lists of avi filenames
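
The shape of the returned dictionary can be illustrated with a small sketch. It assumes that the CSV has HashedFileName and split columns, that a video can appear on several rows (one per measurement), and that ".avi" is appended to the hash; none of these details are confirmed by this page, and group_by_split is a hypothetical name.

```python
import csv
import io


def group_by_split(csv_file, name_col="HashedFileName", split_col="split"):
    """Group unique avi filenames by their split label."""
    splits = {key: [] for key in ("train", "val", "test", "rejected")}
    seen = set()
    for row in csv.DictReader(csv_file):
        name = row[name_col] + ".avi"
        if name in seen:
            continue  # several measurement rows may refer to one video
        seen.add(name)
        splits.setdefault(row[split_col], []).append(name)
    return splits


# Hypothetical file contents, for illustration only.
example = io.StringIO(
    "HashedFileName,split\n0Xa,train\n0Xa,train\n0Xb,val\n0Xc,rejected\n"
)
print(group_by_split(example))
```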

zea.data.convert.echonetlvh.overwrite_splits(csv_path, rejection_path=None)[source]

Overwrite splits in a MeasurementsList.csv based on manual_rejections.txt or another txt file specifying which hashes to reject.

Parameters:
  • csv_path (Path) – Path to the MeasurementsList.csv to update in place

  • rejection_path – Path to the rejection txt file. If None, defaults to ./manual_rejections.txt

Returns:

None
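
A minimal in-memory sketch of the rejection step follows. It assumes the rejection file lists one hash per line and that rejected rows have their split set to "rejected"; the function names and the HashedFileName column are assumptions for illustration, and the real function rewrites the CSV on disk instead of returning new rows.

```python
def parse_rejections(text):
    """Parse rejection text, assuming one hash per line; blank lines are ignored."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def apply_rejections(rows, rejected):
    """Return copies of rows, with split set to 'rejected' for rejected hashes."""
    updated = []
    for row in rows:
        new_row = dict(row)  # leave the caller's rows untouched
        if new_row["HashedFileName"] in rejected:
            new_row["split"] = "rejected"
        updated.append(new_row)
    return updated


rows = [
    {"HashedFileName": "0Xa", "split": "train"},
    {"HashedFileName": "0Xb", "split": "val"},
]
print(apply_rejections(rows, parse_rejections("0Xb\n\n")))
```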

zea.data.convert.echonetlvh.precompute_cone_parameters(source_path, measurements_csv, cone_params_csv, max_files, max_workers=8)[source]

Precompute and save cone parameters for all AVI files.

This function loads the first frame from each AVI file, applies fit_scan_cone to determine cropping parameters, and saves these parameters to a CSV file for later use during the actual data conversion.

Parameters:
  • source_path (Path) – Source directory containing EchoNet-LVH data

  • measurements_csv (str | Path) – Path to the MeasurementsList.csv file

  • cone_params_csv (Path) – Path to the output CSV file

  • max_files – Maximum number of files to process (or None for all)

  • max_workers (int) – Number of worker threads used to process files in parallel

Returns:

Path to the CSV file containing cone parameters

zea.data.convert.echonetlvh.transform_measurement_coordinates_with_cone_params(row, cone_params)[source]

Transform measurement coordinates using cone parameters from fit_scan_cone.

Parameters:
  • row – A dict containing measurement data with X1, X2, Y1, Y2 coordinates

  • cone_params – Dictionary containing cone parameters from fit_scan_cone

Returns:

A new row with transformed coordinates, or None if cone_params is None
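
To make the idea concrete, the sketch below treats the transform as a pure translation into the cropped frame. This is a simplification: the keys crop_left and crop_top are hypothetical, and the actual output of fit_scan_cone and the real transform may involve more than a shift. Only the X1, X2, Y1, Y2 column names and the None behaviour come from the description above.

```python
def shift_into_crop(row, cone_params):
    """Translate X1, X2, Y1, Y2 by assumed crop offsets; None if no params."""
    if cone_params is None:
        return None
    new_row = dict(row)  # return a new row, leaving the input unchanged
    for key in ("X1", "X2"):
        new_row[key] = float(row[key]) - cone_params["crop_left"]
    for key in ("Y1", "Y2"):
        new_row[key] = float(row[key]) - cone_params["crop_top"]
    return new_row


row = {"X1": "120", "X2": "180", "Y1": "60", "Y2": "90", "Calc": "LVIDd"}
print(shift_into_crop(row, {"crop_left": 20, "crop_top": 10}))
```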

zea.data.convert.echonetlvh.transform_measurements_csv(csv_path, cone_params_csv=None)[source]

Update a measurements CSV file in place with coordinates transformed using cone parameters.

Parameters:
  • csv_path – Path to the CSV file to transform in place

  • cone_params_csv – Path to CSV file with cone parameters