zea.data.convert.echonetlvh
Script to convert the EchoNet-LVH database to zea format.
Each video is cropped so that the scan cone is centered without padding, such that it can be converted to the polar domain.
For more information about the dataset, refer to the following links:
The original dataset can be found at this link.
Functions
| convert_echonetlvh | Conversion script for the EchoNet-LVH dataset. |
| find_avi_file | Find AVI file in the source EchoNet-LVH dataset. |
| load_cone_parameters | Load cone parameters from CSV file into a dictionary. |
| load_shapes | Load shapes from MeasurementsList.csv and return avi filenames |
| load_splits | Load splits from MeasurementsList.csv and return avi filenames |
| overwrite_splits | Overwrite splits in a MeasurementsList.csv based on manual_rejections.txt or another txt file specifying which hashes to reject. |
| precompute_cone_parameters | Precompute and save cone parameters for all AVI files. |
| transform_measurement_coordinates_with_cone_params | Transform measurement coordinates using cone parameters from fit_scan_cone. |
| transform_measurements_csv | Update a measurements CSV file in place with coordinates transformed using cone parameters. |
Classes
| LVHProcessor | Processor for EchoNet-LVH dataset. |
- class zea.data.convert.echonetlvh.LVHProcessor(path_out_h5, splits, cone_params, polar_shape=(600, 600), frame_bucket=128)
Bases: object
Processor for EchoNet-LVH dataset.
- __call__(avi_file)
Takes a single avi_file and generates a zea dataset.
Sequential convenience wrapper around load(), compute() and save(); the parallel pipeline drives those stages directly.
- Parameters:
avi_file (Path) – Path to avi_file to be processed
- compute(payload)
Stage 2 (GPU, main thread only): polar conversion.
Takes a payload from load() and returns the host-side arrays and metadata for save(). Keep this on the main thread: there is a single device and concurrent tracing is not safe.
- load(avi_file)
Stage 1 (I/O + host preprocessing, thread-safe): read+decode the AVI, fetch cone params, frame-pad, and build the cropped cartesian view (image_sc).
Runs no GPU/JAX work, so it is safe to call from worker threads. Returns a payload dict consumed by compute(), or raises on missing params or an all-zero cropped sequence.
- run(files, load_workers=8, save_workers=2, max_pending_saves=None)
Run over files as an overlapped load -> compute -> save pipeline so the GPU is not stalled on disk I/O.
Loads (decode) run on a thread pool, GPU compute stays on the main thread, and writes go to a small saver pool. A single bad file is logged and skipped rather than aborting the whole (multi-hour) run.
The in-flight save queue is bounded (max_pending_saves): each pending save pins a decoded volume in memory, so without a cap a transient write slowdown would let memory grow until the process is OOM-killed. h5py serialises its writes through a global lock, so a couple of save workers is enough to overlap the write with GPU compute; more just adds memory pressure.
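The overlapped pipeline described for run() can be illustrated with a minimal standard-library sketch. This is not zea's implementation: run_pipeline, guarded_save, the stage callables and the fallback value for max_pending_saves are all hypothetical names and choices made for illustration. It shows loads on a thread pool, compute on the calling thread, saves on a small pool, a semaphore bounding the pending saves, and failed files being skipped.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(files, load, compute, save,
                 load_workers=8, save_workers=2, max_pending_saves=None):
    """Overlap load -> compute -> save; return the names of files that failed."""
    if max_pending_saves is None:
        max_pending_saves = 2 * save_workers  # assumed fallback, not zea's default
    slots = threading.BoundedSemaphore(max_pending_saves)
    failed = []
    failed_lock = threading.Lock()

    def guarded_save(name, result):
        try:
            save(name, result)
        except Exception:
            with failed_lock:
                failed.append(name)
        finally:
            slots.release()  # free the slot whether or not the write succeeded

    with ThreadPoolExecutor(load_workers) as loaders, \
            ThreadPoolExecutor(save_workers) as savers:
        # All loads are submitted up front here; a real pipeline would bound these too.
        pending = [(name, loaders.submit(load, name)) for name in files]
        for name, future in pending:
            try:
                payload = future.result()  # wait for the decode to finish
                result = compute(payload)  # compute stage stays on this thread
            except Exception:
                with failed_lock:
                    failed.append(name)    # skip the bad file, keep going
                continue
            slots.acquire()                # block while too many saves are pending
            savers.submit(guarded_save, name, result)
    return failed
```

Acquiring the semaphore on the main thread is what applies back-pressure: if writes slow down, compute pauses instead of queueing more results in memory.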
- static save(out_h5, image_sc_np, polar, metadata)
Stage 3 (I/O, thread-safe): write the zea HDF5 file.
polar may still be an unmaterialised device array from compute(); the blocking device->host copy happens here, off the main thread, so it overlaps with the next file’s GPU compute. No tracing is involved, so materialising it from a worker thread is safe.
- zea.data.convert.echonetlvh.convert_echonetlvh(src, dst, no_rejection, rejection_path, convert_measurements, convert_images, max_files, force, max_workers=8)
Conversion script for the EchoNet-LVH dataset. Unzips the source, overwrites splits if needed, precomputes cone parameters, converts images and/or measurements to zea format, and saves the dataset. It is called with argparse arguments through zea/zea/data/convert/__main__.py.
- zea.data.convert.echonetlvh.find_avi_file(source_dir, hashed_filename)
Find AVI file in the source EchoNet-LVH dataset.
- Parameters:
source_dir (Path) – Source directory containing BatchX subdirectories
hashed_filename (str) – Hashed filename (with or without .avi extension)
- Returns:
Path to the AVI file if found, else None
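As a rough illustration of this lookup, the sketch below searches the Batch* subdirectories for a hashed filename. It is an assumption about the behaviour, not the library's code: find_avi is a hypothetical name, and the flat layout source_dir/BatchX/<hash>.avi is assumed from the parameter description.

```python
from pathlib import Path


def find_avi(source_dir, hashed_filename):
    """Return the first matching AVI under source_dir/Batch*/, else None."""
    # Accept the hash with or without the .avi extension.
    name = hashed_filename
    if not name.endswith(".avi"):
        name += ".avi"
    for batch_dir in sorted(Path(source_dir).glob("Batch*")):
        candidate = batch_dir / name
        if candidate.is_file():
            return candidate
    return None
```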
- zea.data.convert.echonetlvh.load_cone_parameters(csv_path)
Load cone parameters from CSV file into a dictionary.
Only loads the rows with status “success”.
- Parameters:
csv_path – Path to the CSV file containing cone parameters
- Returns:
Dictionary mapping avi_filename to cone parameters
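The filtering step can be sketched with the standard csv module. This assumes the CSV has literal columns named avi_filename and status, which the text above suggests but does not confirm; load_successful_rows is a hypothetical name, and values are left as the strings read from the file.

```python
import csv


def load_successful_rows(csv_path):
    """Map avi_filename -> row dict, keeping only rows whose status is "success"."""
    with open(csv_path, newline="") as f:
        return {
            row["avi_filename"]: row
            for row in csv.DictReader(f)
            if row.get("status") == "success"
        }
```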
- zea.data.convert.echonetlvh.load_shapes(csv_path)
Load shapes from MeasurementsList.csv and return avi filenames
- Parameters:
csv_path (str | Path) – Path to the MeasurementsList.csv file
- Returns:
Dictionary with the filename as key and the shape as value
- zea.data.convert.echonetlvh.load_splits(csv_path)
Load splits from MeasurementsList.csv and return avi filenames
- Parameters:
csv_path (str | Path) – Path to the MeasurementsList.csv file
- Returns:
Dictionary with keys ‘train’, ‘val’, ‘test’, ‘rejected’ and values as lists of avi filenames
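A minimal sketch of grouping filenames by split is shown below. The column names HashedFileName and split are assumptions about MeasurementsList.csv, group_by_split is a hypothetical name, and the sketch assumes a file may appear on several rows (one per measurement), so each filename is kept only once per split.

```python
import csv
from collections import defaultdict


def group_by_split(csv_path, name_col="HashedFileName", split_col="split"):
    """Return {split: [avi filenames]} with each file listed once per split."""
    seen = defaultdict(dict)  # inner dict acts as an insertion-ordered set
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            name = row[name_col]
            if not name.endswith(".avi"):
                name += ".avi"
            seen[row[split_col]][name] = None
    return {split: list(names) for split, names in seen.items()}
```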
- zea.data.convert.echonetlvh.overwrite_splits(csv_path, rejection_path=None)
Overwrite splits in a MeasurementsList.csv based on manual_rejections.txt or another txt file specifying which hashes to reject.
- Parameters:
csv_path (Path) – Path to the MeasurementsList.csv to update in place
rejection_path – Path to the rejection txt file. If None, defaults to ./manual_rejections.txt
- Returns:
None
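One way to picture the in-place update is the sketch below, which relabels every listed hash as ‘rejected’ and rewrites the CSV. It rests on assumptions: one hash per line in the txt file, columns named HashedFileName and split, and the hypothetical helper name mark_rejected. Unlike the documented function, it returns the number of changed rows so the effect is easy to check.

```python
import csv


def mark_rejected(csv_path, rejection_path,
                  name_col="HashedFileName", split_col="split"):
    """Set split to "rejected" for every listed hash; return the rows changed."""
    with open(rejection_path) as f:
        rejected = {line.strip() for line in f if line.strip()}
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)
    changed = 0
    for row in rows:
        if row[name_col] in rejected and row[split_col] != "rejected":
            row[split_col] = "rejected"
            changed += 1
    # Rewrite the same file with the original column order.
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return changed
```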
- zea.data.convert.echonetlvh.precompute_cone_parameters(source_path, measurements_csv, cone_params_csv, max_files, max_workers=8)
Precompute and save cone parameters for all AVI files.
This function loads the first frame from each AVI file, applies fit_scan_cone to determine cropping parameters, and saves these parameters to a CSV file for later use during the actual data conversion.
- Parameters:
source_path (Path) – Source directory containing EchoNet-LVH data
measurements_csv (str | Path) – Path to the MeasurementsList.csv file
cone_params_csv (Path) – Path to the output CSV file
max_files – Maximum number of files to process (or None for all)
max_workers (int) – Number of worker threads used to process files in parallel
- Returns:
Path to the CSV file containing cone parameters
- zea.data.convert.echonetlvh.transform_measurement_coordinates_with_cone_params(row, cone_params)
Transform measurement coordinates using cone parameters from fit_scan_cone.
- Parameters:
row – A dict containing measurement data with X1,X2,Y1,Y2 coordinates
cone_params – Dictionary containing cone parameters from fit_scan_cone
- Returns:
A new row with transformed coordinates, or None if cone_params is None
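To make the idea concrete, the sketch below shifts measurement endpoints by a crop offset. The keys crop_left and crop_top are hypothetical; the actual fields produced by fit_scan_cone are not listed here, and the real transform may involve more than a translation. Only the X1, X2, Y1, Y2 columns and the None behaviour come from the description above.

```python
def shift_measurement(row, cone_params):
    """Return a copy of row with coordinates moved into the cropped frame."""
    if cone_params is None:
        return None
    new_row = dict(row)  # leave the caller's row untouched
    for key in ("X1", "X2"):
        # crop_left is an assumed key: columns removed from the left edge
        new_row[key] = float(row[key]) - cone_params["crop_left"]
    for key in ("Y1", "Y2"):
        # crop_top is an assumed key: rows removed from the top edge
        new_row[key] = float(row[key]) - cone_params["crop_top"]
    return new_row
```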
- zea.data.convert.echonetlvh.transform_measurements_csv(csv_path, cone_params_csv=None)
Update a measurements CSV file in place with coordinates transformed using cone parameters.
- Parameters:
csv_path – Path to the CSV file to transform in place
cone_params_csv – Path to CSV file with cone parameters