zea.data.convert.camus

Convert the CAMUS dataset to the zea format.

Note

Requires SimpleITK: pip install SimpleITK.

CAMUS (Cardiac Acquisitions for Multi-structure Ultrasound Segmentation) is a public dataset containing 2-D echocardiographic sequences from 500 patients. Sequences are stored in NIfTI (.nii.gz) format and include both 2-chamber (2CH) and 4-chamber (4CH) apical views.

Dataset splits:

Train - patients 1-400
Validation - patients 401-450
Test - patients 451-500

License

CC BY-NC-SA 4.0 - https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode

The CAMUS dataset is available free of charge, strictly for non-commercial scientific research.

Reference

S. Leclerc, E. Smistad, J. Pedrosa, A. Ostvik, F. Cervenansky, F. Espinosa, T. Espeland, E. A. R. Berg, P.-M. Jodoin, T. Grenier, C. Lartizien, J. D’hooge, L. Lovstakken and O. Bernard. Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography. IEEE Transactions on Medical Imaging, vol. 38, no. 9, pp. 2198-2210, 2019. DOI: 10.1109/TMI.2019.2900516

Usage

python -m zea.data.convert camus ./raw ./output --download

For testing purposes, you can also convert a reduced dataset containing only 6 half-sequence files:

python -m zea.data.convert camus ./raw ./output --download --reduced-dataset

Functions

`convert_camus`(args)	Convert the CAMUS dataset into zea HDF5 files across dataset splits.
`download_camus`(destination[, patients])	Download the CAMUS dataset from the Girder server.
`get_split`(patient_id)	Determine which dataset split a patient ID belongs to.
`process_camus`(source_path, output_path[, ...])	Convert one CAMUS NIfTI half-sequence into the zea HDF5 format.
`upload_camus`(output_folder, revision[, repo_id])	Upload the converted CAMUS dataset to a HuggingFace Hub revision branch.

zea.data.convert.camus.convert_camus(args)

Convert the CAMUS dataset into zea HDF5 files across dataset splits.

Processes files found under the CAMUS source folder (after unzipping or downloading if needed), assigns each patient to a train/val/test split, creates matching output paths, and executes per-file conversion tasks either serially or in parallel.
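The choice between serial and parallel execution described above can be pictured with a small dispatcher. This is a minimal sketch using only the standard library, not the code zea uses: `run_tasks` and its `serial` flag are hypothetical names introduced for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def run_tasks(func, tasks, serial=False):
    """Apply func to each argument tuple, one by one or in a process pool.

    Hypothetical helper: it only illustrates the serial/parallel switch.
    """
    if serial:
        # Simple loop in the current process, easiest to debug.
        return [func(*task) for task in tasks]
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(func, *task) for task in tasks]
        # result() re-raises any exception raised in a worker process.
        return [future.result() for future in futures]
```

With a process pool, `func` has to be picklable, which in practice means a function defined at module level rather than a lambda or nested function.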

Usage:

python -m zea.data.convert camus <source_folder> <destination_folder>
python -m zea.data.convert camus <source_folder> <destination_folder> --download

Parameters:

args (argparse.Namespace) –

An object with attributes:

src (str | Path): Path to the CAMUS archive or extracted folder, or a directory to download into when --download is set.
dst (str | Path): Root destination folder for zea HDF5 outputs; split subfolders will be created.
download (bool, optional): If True, download the dataset first from the Girder server.
no_hyperthreading (bool, optional): If True, run tasks serially instead of using a process pool.
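Because convert_camus takes a namespace rather than keyword arguments, calling it from Python means building that object by hand. The sketch below assumes only the four attributes listed above; `build_camus_args` and `run_conversion` are hypothetical helpers, and the command-line entry point may set further attributes (for example for --reduced-dataset) that a hand-built namespace would also need.

```python
from argparse import Namespace


def build_camus_args(src, dst, download=False, no_hyperthreading=False):
    """Bundle the documented attributes into an argparse.Namespace (hypothetical helper)."""
    return Namespace(
        src=src,
        dst=dst,
        download=download,
        no_hyperthreading=no_hyperthreading,
    )


def run_conversion(args):
    # Imported lazily so the sketch can be read and tested without zea installed.
    from zea.data.convert.camus import convert_camus

    convert_camus(args)


args = build_camus_args("./raw", "./output", download=True)
# run_conversion(args)  # uncomment to start the download and conversion
```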

zea.data.convert.camus.download_camus(destination, patients=None)

Download the CAMUS dataset from the Girder server.

Downloads NIfTI files for each patient.

Parameters:

destination (str | Path) – Directory where the dataset will be downloaded.
patients (list[int] | None) – List of patient IDs to download (1-500). If None, all patients are downloaded.

Return type:

Path

Returns:

Path to the downloaded dataset directory.

zea.data.convert.camus.get_split(patient_id)

Determine which dataset split a patient ID belongs to.

Parameters:

patient_id (int) – Integer ID of the patient.

Return type:

str

Returns:

"train", "val", or "test".

Raises:

ValueError – If the patient_id does not fall into any defined split range.
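The documented behaviour amounts to a range lookup over the split boundaries listed at the top of this page. The following is a minimal sketch of that mapping, not the library's source; `split_for_patient` is a hypothetical stand-in for get_split.

```python
def split_for_patient(patient_id: int) -> str:
    """Map a CAMUS patient ID to its split (hypothetical stand-in for get_split)."""
    if 1 <= patient_id <= 400:
        return "train"
    if 401 <= patient_id <= 450:
        return "val"
    if 451 <= patient_id <= 500:
        return "test"
    # IDs outside 1-500 belong to no split.
    raise ValueError(f"Patient ID {patient_id} is outside the range 1-500.")
```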

zea.data.convert.camus.process_camus(source_path, output_path, overwrite=False)

Convert one CAMUS NIfTI half-sequence into the zea HDF5 format.

Stores the scan-converted B-mode sequence (data/image), per-pixel Cartesian coordinates derived from the NIfTI voxel spacing, the full segmentation sequence (data/segmentation) with an explicit "unannotated" label channel for frames that lack manual annotations, and rich clinical metadata parsed from the accompanying Info_*.cfg file.

Parameters:

source_path (str, pathlike) – Path to a *_half_sequence.nii.gz file.
output_path (str, pathlike) – Destination HDF5 file path.
overwrite (bool, optional) – Overwrite existing output file. Defaults to False.
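Since process_camus handles a single file, a caller that bypasses convert_camus has to enumerate the half-sequence files and pick an output path for each. A minimal sketch follows, assuming a flat output folder and a `.hdf5` extension (both choices made here for illustration); `plan_outputs` and `convert_all` are hypothetical helpers.

```python
from pathlib import Path

SUFFIX = "_half_sequence.nii.gz"


def plan_outputs(source_root, output_root):
    """Pair every half-sequence file under source_root with an HDF5 output path."""
    output_root = Path(output_root)
    pairs = []
    for src in sorted(Path(source_root).rglob("*" + SUFFIX)):
        # Drop the ".nii.gz" extension and keep the rest of the file name.
        stem = src.name[: -len(".nii.gz")]
        # The ".hdf5" extension is an assumption of this sketch.
        pairs.append((src, output_root / (stem + ".hdf5")))
    return pairs


def convert_all(source_root, output_root, overwrite=False):
    # Imported lazily so the planning step works without zea installed.
    from zea.data.convert.camus import process_camus

    for src, dst in plan_outputs(source_root, output_root):
        dst.parent.mkdir(parents=True, exist_ok=True)
        process_camus(src, dst, overwrite=overwrite)
```

Separating the planning step from the conversion makes it possible to inspect the list of input and output paths before any file is written.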

zea.data.convert.camus.upload_camus(output_folder, revision, repo_id='zeahub/camus')

Upload the converted CAMUS dataset to a HuggingFace Hub revision branch.

Only for zea maintainers with push access to the repository. Upload to main is blocked; merge the revision branch into main manually after verifying the upload.

Parameters:

output_folder (str | Path) – Root folder containing the train/val/test splits.
revision (str) – Target branch name on the Hub (must not be "main").
repo_id (str, optional) – HuggingFace Hub repository to upload to. Defaults to 'zeahub/camus'.

Return type:

None