zea.data.convert.cetus

Convert the CETUS dataset to the zea format.

Note

Requires SimpleITK: pip install SimpleITK.

CETUS (Challenge on Endocardial Three-dimensional Ultrasound Segmentation) is a public MICCAI 2014 challenge dataset. It contains 3-D echocardiographic volumes from 45 patients with ground-truth left-ventricle segmentation masks at end-diastole (ED) and end-systole (ES). Volumes are stored as NIfTI (.nii.gz) files with isotropic voxel spacing.

Dataset splits:

  • Train - patients 1-30

  • Validation - patients 31-38

  • Test - patients 39-45

License

CC BY-NC-SA 4.0 - https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode

The CETUS dataset is available free of charge for non-commercial scientific research purposes only.

Reference

O. Bernard, et al. Standardized Evaluation System for Left Ventricular Segmentation Algorithms in 3D Echocardiography. IEEE Transactions on Medical Imaging, vol. 35, no. 4, pp. 967-977, April 2016. DOI: 10.1109/tmi.2015.2503890

Usage

python -m zea.data.convert cetus ./raw ./output --download

Functions

convert_cetus(args)

Convert the CETUS dataset into zea HDF5 files across dataset splits.

download_cetus(destination[, patients])

Download the CETUS dataset from the Girder server.

get_split(patient_id)

Determine which dataset split a patient ID belongs to.

process_cetus(source_path, output_path[, ...])

Convert a single CETUS patient time-point to a zea HDF5 file.

upload_cetus(output_folder, revision)

Upload the converted CETUS dataset to a HuggingFace Hub revision branch.

zea.data.convert.cetus.convert_cetus(args)

Convert the CETUS dataset into zea HDF5 files across dataset splits.

Processes all NIfTI B-mode volumes found under the source folder, assigns each patient to a train/val/test split, and executes per-file conversion tasks either serially or in parallel.

Usage:

python -m zea.data.convert cetus <source_folder> <destination_folder> --download

Parameters:

args (argparse.Namespace) –

An object with attributes:

  • src (str | Path): Path to the folder containing CETUS patient subfolders, or a directory to download into when --download is set.

  • dst (str | Path): Root destination folder for zea HDF5 outputs; split subfolders (train/val/test) will be created.

  • download (bool, optional): If True, download the dataset first from the Girder server.

  • no_hyperthreading (bool, optional): If True, run tasks serially instead of using a process pool.

  • upload (bool, optional): If True, upload the converted dataset to HuggingFace Hub after conversion. Only for zea maintainers with push access to the repository.
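For programmatic use, the attributes listed above can be assembled into an argparse.Namespace by hand. The following is a minimal sketch, not part of zea: the helper name make_cetus_args and the folder paths are hypothetical, and the False defaults chosen for the optional flags are assumptions.

```python
from argparse import Namespace
from pathlib import Path


def make_cetus_args(src, dst, download=False, no_hyperthreading=False, upload=False):
    """Bundle the attributes documented for convert_cetus (hypothetical helper)."""
    return Namespace(
        src=Path(src),                        # folder with CETUS patient subfolders
        dst=Path(dst),                        # root folder for the train/val/test outputs
        download=download,                    # fetch the dataset before converting
        no_hyperthreading=no_hyperthreading,  # True runs the tasks serially
        upload=upload,                        # maintainers only
    )


args = make_cetus_args("./raw", "./output", download=True, no_hyperthreading=True)

# With zea and SimpleITK installed, the conversion would then be started with:
#     from zea.data.convert.cetus import convert_cetus
#     convert_cetus(args)
```

Keeping the call itself in a comment lets the snippet run without zea installed; in practice the command-line entry point shown under Usage builds this namespace for you.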

zea.data.convert.cetus.download_cetus(destination, patients=None)

Download the CETUS dataset from the Girder server.

Downloads the NIfTI files for each patient: B-mode volumes and ground-truth segmentations at the ED and ES time points.

Parameters:
  • destination (str | Path) – Directory where the dataset will be downloaded.

  • patients (list[int] | None) – List of patient IDs to download (1-45). If None, all 45 patients are downloaded.

Return type:

Path

Returns:

Path to the downloaded dataset directory.
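When only a few patients are needed, it can help to check the ID list before starting a download. The sketch below is an illustration under assumptions: validate_patients is a hypothetical helper, not a zea function, and whether download_cetus performs a similar check internally is not stated here.

```python
def validate_patients(patients=None):
    """Return a sorted list of unique patient IDs, defaulting to all 45 (hypothetical helper)."""
    if patients is None:
        return list(range(1, 46))
    ids = sorted(set(patients))
    bad = [p for p in ids if not 1 <= p <= 45]
    if bad:
        raise ValueError(f"Patient IDs must be in 1-45, got {bad}")
    return ids


# One patient from each documented split (train, validation, test).
subset = validate_patients([39, 1, 31])

# With zea installed and network access, the subset could then be fetched with:
#     from zea.data.convert.cetus import download_cetus
#     dataset_dir = download_cetus("./raw", patients=subset)
```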

zea.data.convert.cetus.get_split(patient_id)

Determine which dataset split a patient ID belongs to.

Parameters:

patient_id (int) – Integer ID of the patient (1-45).

Returns:

"train", "val", or "test".

Return type:

str

Raises:

ValueError – If the patient_id does not fall into any defined split range.
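The documented behaviour can be illustrated with a small stand-alone re-implementation. This is a sketch based only on the split ranges listed at the top of this page; the names SPLIT_RANGES and get_split_sketch are hypothetical, and the actual zea implementation may differ.

```python
# Patient ID ranges per split, as documented (range end is exclusive).
SPLIT_RANGES = {
    "train": range(1, 31),   # patients 1-30
    "val": range(31, 39),    # patients 31-38
    "test": range(39, 46),   # patients 39-45
}


def get_split_sketch(patient_id: int) -> str:
    """Return "train", "val", or "test" for a patient ID in 1-45."""
    for split, ids in SPLIT_RANGES.items():
        if patient_id in ids:
            return split
    raise ValueError(f"Patient ID {patient_id} does not belong to any split")
```

For example, get_split_sketch(30) returns "train" and get_split_sketch(31) returns "val", while an ID outside 1-45 raises ValueError.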

zea.data.convert.cetus.process_cetus(source_path, output_path, overwrite=False)

Convert a single CETUS patient time-point to a zea HDF5 file.

Each file stores the 3D B-mode volume as image (the scan-converted image). If a corresponding ground-truth segmentation file exists, it is stored as a Segmentation map under data/segmentation. The image and the segmentation share a per-voxel coordinate grid of shape (D, H, W, 3) derived from the NIfTI voxel spacing.

Patient ID and citation are stored in the metadata group. License information is embedded in the file description.

Parameters:
  • source_path (str or Path) – Path to the source .nii.gz B-mode file.

  • output_path (str or Path) – Path to the output .hdf5 file.

  • overwrite (bool, optional) – Whether to overwrite an existing output file. Defaults to False.
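To make the (D, H, W, 3) coordinate grid concrete, the sketch below builds one from a volume shape and a voxel spacing using NumPy. It is an illustration under assumptions, not the code used by process_cetus: the function name voxel_coordinate_grid is hypothetical, the origin is placed at voxel (0, 0, 0), and the spacing is assumed to be given in the same (D, H, W) axis order as the shape.

```python
import numpy as np


def voxel_coordinate_grid(shape, spacing):
    """Return a (D, H, W, 3) array holding the physical position of every voxel.

    shape:   (D, H, W) number of voxels along each axis
    spacing: voxel size along each axis, in the same order (assumed)
    """
    # Physical positions along each axis, starting from an assumed origin of 0.
    axes = [np.arange(n, dtype=np.float32) * s for n, s in zip(shape, spacing)]
    # "ij" indexing keeps the (D, H, W) axis order of the volume.
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)


grid = voxel_coordinate_grid((2, 3, 4), (0.5, 0.5, 0.5))
print(grid.shape)  # (2, 3, 4, 3)
```

The last axis holds the three coordinates of each voxel, so grid[d, h, w] gives the position of voxel (d, h, w) in the units of the spacing.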

zea.data.convert.cetus.upload_cetus(output_folder, revision)

Upload the converted CETUS dataset to a HuggingFace Hub revision branch.

Only for zea maintainers with push access to the repository. Upload to main is blocked; merge the revision branch into main manually after verifying the upload.

Parameters:
  • output_folder (str | Path) – Root folder containing the train/val/test splits.

  • revision (str) – Target branch name on the Hub (must not be "main").

Return type:

None