zea.data.convert.camus

Convert the CAMUS dataset to the zea format.

Note

Requires SimpleITK: pip install SimpleITK.

CAMUS (Cardiac Acquisitions for Multi-structure Ultrasound Segmentation) is a public dataset containing 2-D echocardiographic sequences from 500 patients. Sequences are stored in NIfTI (.nii.gz) format and include both 2-chamber (2CH) and 4-chamber (4CH) apical views.

Dataset splits:

  • Train - patients 1-400

  • Validation - patients 401-450

  • Test - patients 451-500

License

CC BY-NC-SA 4.0 - https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode

The CAMUS dataset is available free of charge for non-commercial scientific research purposes only.

Reference

S. Leclerc, E. Smistad, J. Pedrosa, A. Ostvik, F. Cervenansky, F. Espinosa, T. Espeland, E. A. R. Berg, P.-M. Jodoin, T. Grenier, C. Lartizien, J. D’hooge, L. Lovstakken and O. Bernard. Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography. IEEE Transactions on Medical Imaging, vol. 38, no. 9, pp. 2198-2210, 2019. DOI: 10.1109/TMI.2019.2900516

Usage

python -m zea.data.convert camus ./raw ./output --download

For testing purposes, you can also convert a reduced dataset containing only 6 half-sequence files:

python -m zea.data.convert camus ./raw ./output --download --reduced-dataset

Functions

convert_camus(args)

Convert the CAMUS dataset into zea HDF5 files across dataset splits.

download_camus(destination[, patients])

Download the CAMUS dataset from the Girder server.

get_split(patient_id)

Determine which dataset split a patient ID belongs to.

process_camus(source_path, output_path[, ...])

Convert one CAMUS NIfTI half-sequence into the zea HDF5 format.

upload_camus(output_folder, revision[, repo_id])

Upload the converted CAMUS dataset to a HuggingFace Hub revision branch.

zea.data.convert.camus.convert_camus(args)

Convert the CAMUS dataset into zea HDF5 files across dataset splits.

Processes files found under the CAMUS source folder (after unzipping or downloading if needed), assigns each patient to a train/val/test split, creates matching output paths, and executes per-file conversion tasks either serially or in parallel.

Usage:

python -m zea.data.convert camus <source_folder> <destination_folder>
python -m zea.data.convert camus <source_folder> <destination_folder> --download
Parameters:

args (argparse.Namespace) –

An object with attributes:

  • src (str | Path): Path to the CAMUS archive or extracted folder, or a directory to download into when --download is set.

  • dst (str | Path): Root destination folder for zea HDF5 outputs; one subfolder per split will be created.

  • download (bool, optional): If True, download the dataset first from the Girder server.

  • no_hyperthreading (bool, optional): If True, run tasks serially instead of using a process pool.
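
The serial-versus-parallel behaviour controlled by no_hyperthreading can be illustrated with a minimal sketch. The helper name run_tasks and its signature are hypothetical and are not part of zea; the actual implementation may schedule its work differently.

```python
from concurrent.futures import ProcessPoolExecutor


def run_tasks(func, tasks, serial=False):
    """Apply func(*task) to every task and return the results in order.

    Hypothetical helper, not part of zea: it only illustrates switching
    between a plain loop and a process pool.
    """
    if serial:
        # Serial path: easier to debug, no worker processes are started.
        return [func(*task) for task in tasks]
    # Parallel path: one future per task, results collected in task order.
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(func, *task) for task in tasks]
        return [future.result() for future in futures]


# Serial usage with a built-in function as a stand-in for per-file conversion.
results = run_tasks(pow, [(2, 3), (3, 2)], serial=True)
```

On platforms that start worker processes with spawn, the parallel path should be called from under an `if __name__ == "__main__":` guard.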

zea.data.convert.camus.download_camus(destination, patients=None)

Download the CAMUS dataset from the Girder server.

Downloads NIfTI files for each patient.

Parameters:
  • destination (str | Path) – Directory where the dataset will be downloaded.

  • patients (list[int] | None) – List of patient IDs to download (1-500). If None, all patients are downloaded.

Return type:

Path

Returns:

Path to the downloaded dataset directory.
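
A small subset can be fetched by passing a list of patient IDs. The sketch below wraps the call in a hypothetical helper, download_subset, that checks the documented 1-500 range before delegating to download_camus. The wrapper and its validation are assumptions added for illustration; whether download_camus performs a similar check itself is not stated here.

```python
from pathlib import Path


def download_subset(destination, patients):
    """Validate patient IDs, then delegate to zea's download_camus.

    Hypothetical wrapper: only download_camus(destination, patients=...)
    comes from the documented signature above.
    """
    bad = [p for p in patients if not 1 <= p <= 500]
    if bad:
        raise ValueError(f"Patient IDs outside 1-500: {bad}")
    # Imported lazily so the validation above works without zea installed.
    from zea.data.convert.camus import download_camus

    return Path(download_camus(destination, patients=list(patients)))
```

For example, `download_subset("./raw", [1, 401, 451])` would request one patient from each documented split.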

zea.data.convert.camus.get_split(patient_id)

Determine which dataset split a patient ID belongs to.

Parameters:

patient_id (int) – Integer ID of the patient.

Returns:

“train”, “val”, or “test”.

Return type:

str

Raises:

ValueError – If the patient_id does not fall into any defined split range.
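
The documented split ranges (train 1-400, validation 401-450, test 451-500) imply logic along the following lines. This is a stand-in written from the ranges listed on this page, under the hypothetical name split_for_patient; it is not the library's source.

```python
def split_for_patient(patient_id: int) -> str:
    """Map a CAMUS patient ID to "train", "val", or "test".

    Hypothetical re-implementation based on the documented ranges.
    """
    if 1 <= patient_id <= 400:
        return "train"
    if 401 <= patient_id <= 450:
        return "val"
    if 451 <= patient_id <= 500:
        return "test"
    # IDs outside 1-500 belong to no split.
    raise ValueError(f"Patient ID {patient_id} is outside the range 1-500.")
```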

zea.data.convert.camus.process_camus(source_path, output_path, overwrite=False)

Convert one CAMUS NIfTI half-sequence into the zea HDF5 format.

Stores the scan-converted B-mode sequence (data/image), per-pixel Cartesian coordinates derived from the NIfTI voxel spacing, the full segmentation sequence (data/segmentation) with an explicit "unannotated" label channel for frames that lack manual annotations, and rich clinical metadata parsed from the accompanying Info_*.cfg file.

Parameters:
  • source_path (str, pathlike) – Path to a *_half_sequence.nii.gz file.

  • output_path (str, pathlike) – Destination HDF5 file path.

  • overwrite (bool, optional) – Overwrite existing output file. Defaults to False.
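
To make the phrase "per-pixel Cartesian coordinates derived from the NIfTI voxel spacing" concrete, the sketch below builds coordinate axes as pixel index times spacing. The function name, the origin at the first pixel, and the choice to keep the spacing's own units are all assumptions; the converter may centre the lateral axis or rescale units. With SimpleITK, the spacing of a loaded image can be read with `GetSpacing()`.

```python
def pixel_coordinates(n_rows, n_cols, row_spacing, col_spacing):
    """Return (row_axis, col_axis) coordinates as index * spacing.

    Hypothetical helper: assumes the origin sits at the first pixel and
    keeps whatever units the spacing is given in.
    """
    row_axis = [i * row_spacing for i in range(n_rows)]
    col_axis = [j * col_spacing for j in range(n_cols)]
    return row_axis, col_axis


# A 3 x 2 image with 0.5 (rows) and 0.25 (columns) spacing.
depth_axis, lateral_axis = pixel_coordinates(3, 2, 0.5, 0.25)
```

The coordinate of pixel (i, j) is then the pair (depth_axis[i], lateral_axis[j]).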

zea.data.convert.camus.upload_camus(output_folder, revision, repo_id='zeahub/camus')

Upload the converted CAMUS dataset to a HuggingFace Hub revision branch.

Only for zea maintainers with push access to the repository. Upload to main is blocked; merge the revision branch into main manually after verifying the upload.

Parameters:
  • output_folder (str | Path) – Root folder containing the train/val/test splits.

  • revision (str) – Target branch name on the Hub (must not be "main").

  • repo_id (str, optional) – HuggingFace Hub repository to upload to. Defaults to 'zeahub/camus'.

Return type:

None