The dataset 🗃️¶
Overall, the TrackRAD2025 challenge provides over 2.8 million unlabeled sagittal cine-MRI frames from 477 individual patients and over 10,000 labeled sagittal cine-MRI frames (plus 8,000 additional labels from frames annotated by multiple observers) from 108 individual patients. In total, a cohort of 477 unlabeled and 108 manually labeled patients has been prepared for the participants. For each patient, 2D sagittal cine MRI data (a time-resolved sequence of 2D images) was acquired during the course of radiotherapy treatment on 0.35 T (ViewRay MRIdian) or 1.5 T (Elekta Unity) MRI-linacs at six international centers. Tracking targets (typically tumors) in the thorax, abdomen and pelvis were included, as these regions are affected by motion and are the most frequently treated anatomies on MRI-linacs.
The training set, which comprises the 477 unlabeled cases plus 50 labeled cases, was publicly released. Participants can further subdivide this dataset locally into training and validation subsets. The remaining 58 labeled cases, which form the preliminary and final testing sets, are accessible only for evaluation via submission to the challenge. A couple of years after the challenge closes, the testing data will also be uploaded to the same location as the training set.
Detailed information about the dataset is provided in the following paper: "link data paper (expected early 2025)".
Data location¶
The training (and validation) dataset can be downloaded from HuggingFace starting from March 15th, 2025.
The preliminary testing and final testing datasets are not provided to participants; they can only be accessed indirectly, by uploading algorithms for evaluation.
Data structure¶
Each patient's data, both for the training and for the testing dataset, are organized using the following folder structure:
    dataset/
    |-- <patient>/
    |   |-- field-strength.json
    |   |-- frame-rate.json
    |   |-- frame-rate<scan>.json -> additional scans (for some of the unlabeled patients)
    |   |-- scanned-region.json
    |   |-- scanned-region<scan>.json -> additional scans (for some of the unlabeled patients)
    |   |-- images/
    |   |   |-- <patient_id>_frames.mha
    |   |   `-- <patient_id>_frames<scan>.mha -> additional scans (for some of the unlabeled patients)
    |   `-- targets/
    |       |-- <patient_id>_first_label.mha
    |       |-- <patient_id>_labels.mha
    |       `-- <patient_id>_labels<observer>.mha -> additional observers (for some of the labeled patients)
    |-- D_001/ -> this is a labeled patient
    |   |-- field-strength.json -> 0.35
    |   |-- frame-rate.json -> 4.0
    |   |-- scanned-region.json -> "thorax"
    |   |-- images/
    |   |   `-- D_001_frames.mha
    |   `-- targets/
    |       |-- D_001_first_label.mha
    |       `-- D_001_labels.mha
    |-- F_002/ -> this is an unlabeled patient
    |   |-- field-strength.json -> 1.5 -> same for all scans of one patient
    |   |-- frame-rate.json -> 1.65 -> frame rate of first scan
    |   |-- frame-rate2.json -> 1.3 -> frame rate of second scan
    |   |-- frame-rate3.json -> 1.65 -> frame rate of third scan
    |   |-- scanned-region.json -> "abdomen" -> anatomic region of first scan
    |   |-- scanned-region2.json -> "pelvis" -> anatomic region of second scan
    |   |-- scanned-region3.json -> "abdomen" -> anatomic region of third scan
    |   `-- images/
    |       |-- F_002_frames.mha -> first scan
    |       |-- F_002_frames2.mha -> second scan
    |       `-- F_002_frames3.mha -> third scan from the same patient
    |-- F_003/
    `-- ...
Please note that the dataset folder structure does not map one-to-one onto the interfaces that submissions need to implement. For details regarding submission requirements, please read the corresponding page.
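As a rough illustration of the folder structure above, the following sketch lists the scans of one patient folder together with their metadata. It assumes that each JSON file holds a single bare value (such as `4.0` or `"thorax"`), as the annotations in the tree suggest, and that the scan suffix of a frames file matches the suffix of its `frame-rate` and `scanned-region` files. The helpers `read_json_value` and `list_scans` are hypothetical names, not part of any challenge code, and reading the `.mha` images themselves is left to an image library of your choice.

```python
import json
from pathlib import Path


def read_json_value(path: Path):
    """Read a JSON file that holds a single value (assumed layout)."""
    with path.open() as f:
        return json.load(f)


def list_scans(patient_dir: Path) -> list[dict]:
    """Return one record per cine-MRI scan found in <patient>/images/."""
    patient_id = patient_dir.name
    prefix = f"{patient_id}_frames"
    # The field strength is stored once and shared by all scans of a patient.
    field_strength = read_json_value(patient_dir / "field-strength.json")
    scans = []
    for frames in sorted((patient_dir / "images").glob(f"{prefix}*.mha")):
        # "" for the first scan, "2", "3", ... for additional scans
        suffix = frames.stem[len(prefix):]
        scans.append({
            "frames": frames,
            "field_strength_T": field_strength,
            "frame_rate_Hz": read_json_value(patient_dir / f"frame-rate{suffix}.json"),
            "region": read_json_value(patient_dir / f"scanned-region{suffix}.json"),
        })
    return scans
```

Calling `list_scans(Path("dataset/F_002"))` on the example patient shown above would be expected to yield three records, one per frames file.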
Data license¶
Data is released under CC-BY-NC (Attribution-NonCommercial).
Data description¶
This challenge provides 2D+t sagittal cine MRI data collected at six international centers:
- Amsterdam University Medical Center, Amsterdam
- Catharina Hospital, Eindhoven
- GenesisCare, Sydney
- LMU University Hospital, Munich
- Sichuan Cancer Center, Chengdu
- University Medical Center Utrecht, Utrecht
For anonymization purposes, the provenance of the data is not provided, and each center is indicated by a letter from A to F. One of the centers also provided cine data acquired with an updated MRI sequence for gating at a 1.5 T MRI-linac; this specific data is indicated by the letter X.
Training set¶
0.35 T MRI-linac | A | E | Total |
---|---|---|---|
Unlabeled | 219 | 34 | 253 |
Labeled | 25 | - | 25 |
1.5 T MRI-linac | B | C | F | Total |
---|---|---|---|---|
Unlabeled | 63 | 60 | 101 | 224 |
Labeled | 15 | 10 | - | 25 |
For training, centers A, B and C provided both unlabeled and manually labeled data while centers E and F provided solely unlabeled data.
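Since participants may subdivide the training set locally, one option is to split at the patient level so that no patient contributes frames to both subsets. The sketch below is only one possible approach, not a recommendation from the organizers: it stratifies by center and assumes that patient IDs start with the center letter followed by an underscore (as in `D_001` and `F_002`). The function name `split_by_patient` and its default parameters are hypothetical.

```python
import random
from collections import defaultdict


def split_by_patient(patient_ids, val_fraction=0.2, seed=0):
    """Split patient IDs into (train, val) lists, stratified by center letter."""
    by_center = defaultdict(list)
    for pid in patient_ids:
        # "A_012" -> "A" (assumed naming convention)
        by_center[pid.split("_")[0]].append(pid)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for center in sorted(by_center):
        ids = sorted(by_center[center])
        rng.shuffle(ids)
        n_val = round(len(ids) * val_fraction)
        val.extend(ids[:n_val])
        train.extend(ids[n_val:])
    return train, val
```

With the labeled training counts listed above (25, 15 and 10 patients for centers A, B and C) and `val_fraction=0.2`, this holds out 5, 3 and 2 patients respectively.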
Preliminary testing set¶
0.35 T MRI-linac | A | Total |
---|---|---|
Labeled | 2 | 2 |
1.5 T MRI-linac | B | C | Total |
---|---|---|---|
Labeled | 3 | 3 | 6 |
Final testing set¶
0.35 T MRI-linac | A | D | Total |
---|---|---|---|
Labeled | 5 | 20 | 25 |
1.5 T MRI-linac | B | C | X | Total |
---|---|---|---|---|
Labeled | 8 | 11 | 6 | 25 |
For preliminary and final testing, manually labeled data was provided by centers A, B, C and D, together with the dataset labeled X.
Data protocols¶
Unlabeled data protocol¶
For the unlabeled training set data, sagittal cine MRI from one or multiple radiotherapy MRI-linac treatment fractions and from MRI-linac initial simulation imaging sessions were included.
Due to the machine design, image quality is degraded whenever the gantry of the 0.35 T MRI-linac moves. The 0.35 T unlabeled data therefore includes frames with degraded image quality due to gantry rotations, which participants are free to exclude using a method of their choice. By design, frames from the 1.5 T MRI-linac show no degradation due to gantry rotation. The 1.5 T unlabeled data can, however, contain temporal jumps and large changes in contrast within one cine MRI, because acquisitions separated by treatment interruptions were combined into a single cine MRI sequence during export.
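How such frames are detected is left to the participants. Purely as an illustration, the sketch below flags abrupt frame-to-frame changes in a per-frame summary value (for example the mean pixel intensity, computed beforehand). The function `flag_abrupt_changes`, the choice of summary value and the threshold are all hypothetical; this is a simple heuristic and not a method used or endorsed by the organizers, and it may well miss degradation that does not change the chosen summary value.

```python
from statistics import median


def flag_abrupt_changes(frame_means, threshold=5.0):
    """Return indices i where frame i differs abruptly from frame i-1.

    frame_means: one summary value per frame (e.g. mean pixel intensity).
    A change is flagged when its magnitude exceeds `threshold` times the
    median absolute frame-to-frame change (a hypothetical heuristic).
    """
    diffs = [abs(b - a) for a, b in zip(frame_means, frame_means[1:])]
    if not diffs:
        return []
    typical = median(diffs)
    if typical == 0:
        # Perfectly constant baseline: any non-zero change counts as abrupt.
        return [i + 1 for i, d in enumerate(diffs) if d > 0]
    return [i + 1 for i, d in enumerate(diffs) if d > threshold * typical]
```

The flagged indices could then be used either to drop individual frames or to cut a cine sequence into segments at the detected jumps.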
Labeled data protocol¶
To avoid degraded images during evaluation, labeled frames for the 0.35 T MRI-linac were either chosen from simulation cine MRIs prior to treatment start or, when taken from treatments, a manual selection was performed to avoid periods of gantry rotation.
Labeled frames from the 1.5 T MRI-linac were visually inspected and selected to avoid temporal jumps due to treatment interruptions.
Human observers generated the reference labels for both the training and testing sets. For dataset A, two observers (a medical student and a dentistry student) labeled the cine MRI frames using a labeling tool developed in-house specifically for the challenge. For dataset B, a medical physics researcher (assistant professor) with more than 10 years of experience in radiotherapy used the same in-house labeling tool to delineate the frames. For dataset C, two radiation oncologists independently labeled the cine MRI frames using ITK-SNAP. For dataset D, four radiation oncologists and one medical physicist independently labeled the cine MRI frames using software provided by the 0.35 T MRI-linac vendor.
For all labeled data, a medical physics doctoral student with 4 years of experience in tumor tracking then reviewed and, if necessary, corrected all labels used in this challenge using the in-house tool.
Data acquisition and pre-processing¶
All images were acquired using the clinically adopted imaging protocols of the respective centers for each anatomical site and reflect typical images found in daily clinical routine. The cine-MRI sequences used at the 0.35 T and 1.5 T MRI-linacs are standardized, which ensures uniformity of the data for a given field strength.
At the centers using the 0.35 T MRI-linac, the 2D cine-MRIs were acquired in sagittal orientation with the patient in treatment position in the MRI-linac bore. During simulation or delivery, the patients performed breath-holds to increase the duty cycle of the gated radiation delivery. The breath-holds are followed by periods of regular breathing. The sequence was a 2D balanced steady-state free precession (bSSFP) sequence at 4 Hz or 8 Hz with a slice thickness of 5, 7 or 10 mm and a pixel spacing of 2.4 × 2.4 or 3.5 × 3.5 mm².
At the centers using the 1.5 T MRI-linac, the 2D cine-MRIs were acquired in either interleaved sagittal and coronal or interleaved sagittal, coronal and axial orientations with the patient in treatment position in the MRI-linac. For the challenge, only the sagittal plane has been considered. During simulation or delivery, some patients performed breath-holds to increase the duty cycle of the gated radiation delivery, while others breathed freely. The breath-holds are followed by periods of regular breathing. The sequence was a balanced fast field echo (bFFE) acquired at 1.3 Hz to 3.5 Hz in the sagittal orientation with a slice thickness of 5, 7 or 8 mm and a pixel spacing of 1.0 × 1.0 to 1.7 × 1.7 mm².
The following pre-processing steps were performed on the data:
- Conversion from proprietary formats to .mha
- Anonymization
- Reshaping and orientation correction
- Resampling to 1 × 1 mm² in-plane pixel spacing
- Conversion to 16-bit unsigned integer
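To give a feel for what the resampling step implies for image dimensions, the following sketch computes the number of pixels per axis after resampling to a new in-plane spacing while keeping the field of view fixed. The interpolation method and exact rounding used by the organizers are not stated here, so the function `resampled_size` and its rounding rule are assumptions, and the input sizes in the usage note are purely illustrative.

```python
def resampled_size(size_px, spacing_mm, target_spacing_mm=(1.0, 1.0)):
    """Pixel count per axis after resampling with a fixed field of view.

    The physical extent along an axis is size * spacing; dividing by the
    target spacing gives the new pixel count (rounded, at least 1).
    """
    return tuple(
        max(1, round(n * s / t))
        for n, s, t in zip(size_px, spacing_mm, target_spacing_mm)
    )
```

For example, a hypothetical 100 × 100 frame at 3.5 × 3.5 mm² spacing would become 350 × 350 pixels at 1 × 1 mm², which is why the resampled frames can be noticeably larger than the native acquisitions.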