The ATTRACT trial consists of a retrospective and a prospective part. In the retrospective
part of the trial, radiology, pathology, and genomics data of 300 patients with stage II-III
colon cancer will be used to identify the genetic and radiomics features of colon tumors and
to define the clinical endpoints that serve as the outcomes of the predictive model.
Tumors will be manually segmented on CT images and used for AI (artificial
intelligence) model generation. Pathological annotations will be associated with the
corresponding anonymised profiles. Immunohistochemistry will be used to classify the samples
into the 4 molecular subtypes. RNA-seq profiles will also be generated from tissue
samples through targeted transcriptomics using custom NGS (next-generation sequencing) panels
specifically designed to evaluate gene expression and assess Tumor Mutational Burden (TMB).
Raw data will be processed and modelled using Topological Pathway Analysis to stratify
patients according to the relevant molecular features and define molecular annotations that
will be used to train the model for the identification of specific clinically relevant
groups. Raw data together with radiological data will be used to generate and train the
AI-models for the automated segmentation and the extraction of the radiogenomics signature.
Radiomics features will be extracted from manually segmented tumors. Standard PyRadiomics
tools as well as custom-made tools will be used. Feature robustness will be ensured by
selecting only those features with high inter-observer statistical correlation. Two families of AI
models will be generated, one family dedicated to segmentation, and the other dedicated to
radiogenomics-based phenotyping according to the clinical, molecular biology and pathological
data available. The two families will be fused for the creation of the ATTRACT AI-model. For
the generation of these models, deep learning (DL) architectures based on convolutional
neural networks (CNNs), such as UNet and MaskNet, will be applied. The training will be
performed using the manual ROI (region of interest) segmentations as ground truth. For the
generation of radiogenomics analysis models, radiomics and genomic features will be combined
using different multivariate algorithms. The classifiers will be trained to recognise the
cancer subtypes and clinical endpoints.
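The feature-robustness step described above (keeping only radiomics features with high inter-observer correlation) could be sketched as follows. This is a minimal illustration, not the trial's actual pipeline: the protocol does not name the correlation statistic or the cut-off, so the Pearson coefficient, the 0.9 threshold, the function names and the feature values are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no agreement information
    return cov / sqrt(var_x * var_y)


def robust_features(observer_a, observer_b, threshold=0.9):
    """Keep features whose values agree between two observers.

    Each argument maps a feature name to the per-patient values obtained
    from one observer's segmentations. The threshold is hypothetical.
    """
    return [
        name
        for name in observer_a
        if name in observer_b
        and pearson(observer_a[name], observer_b[name]) >= threshold
    ]


# Made-up values for four patients, segmented by two observers.
observer_a = {
    "glcm_contrast": [1.0, 2.0, 3.0, 4.0],
    "shape_sphericity": [0.9, 0.5, 0.7, 0.6],
}
observer_b = {
    "glcm_contrast": [1.1, 2.1, 2.9, 4.2],
    "shape_sphericity": [0.5, 0.8, 0.6, 0.9],
}
selected = robust_features(observer_a, observer_b)
print(selected)
```

In this toy input only the first feature is reproducible between observers, so it is the only one retained. An intraclass correlation coefficient would be an equally plausible choice of statistic.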
In the prospective part of the trial, patients with stage II-III colon cancer will be
recruited and will undergo a preoperative contrast-enhanced CT examination. The recruitment
rate will be 70 patients per year, for a total of 210 patients. After preoperative CT,
surgery will be performed according to international standard protocols. Adjuvant therapy
will be considered where indicated, following current guidelines. Pathological samples from
the prospectively enrolled patients will be analyzed. First, using RNA-seq data, TMB (of
coding genes) and clinical data, patients will be clustered with two different techniques:
the Markov Cluster algorithm (MCL) with t-SNE (t-distributed stochastic neighbor embedding),
and Multi-Layer Network clustering. Patients will be represented as nodes of a network, and
edges between nodes will be weighted and thresholded according to the Jaccard similarity.
The similarity will be
computed from gene expression, TMB, and perturbation information derived from Topological
Pathway Analysis. Clustering results will be matched with those obtained from
immunohistochemistry. Clinical follow-up data (e.g. therapy outcome), once available, will
also be fed into the workflow to reinforce the learning. The extracted knowledge will be
used to annotate the dataset used to train and validate the radiomics classification.
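The patient network described above, with edges weighted and thresholded by Jaccard similarity, can be illustrated with a short sketch. It assumes that each patient's molecular profile has already been reduced to a set of discrete labels. The labels, the 0.5 threshold and the function names are hypothetical and are not taken from the protocol.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_patient_network(profiles, threshold=0.5):
    """Return weighted edges between patients whose similarity passes the threshold.

    profiles maps a patient ID to a set of discrete molecular labels.
    The result maps (patient, patient) pairs to their Jaccard weight.
    """
    edges = {}
    for p, q in combinations(sorted(profiles), 2):
        weight = jaccard(profiles[p], profiles[q])
        if weight >= threshold:  # drop weak edges
            edges[(p, q)] = weight
    return edges


# Hypothetical profiles: the labels only illustrate the data shape.
profiles = {
    "P1": {"TP53_mut", "WNT_up", "TMB_high"},
    "P2": {"TP53_mut", "WNT_up"},
    "P3": {"KRAS_mut", "TGFB_up"},
}
network = build_patient_network(profiles)
print(network)
```

Here P1 and P2 share two of three labels and stay connected, while P3 shares nothing with either and becomes an isolated node. A graph clustering method such as MCL would then operate on the surviving weighted edges.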
Gross specimens will be analyzed to determine the transcriptomic molecular subtypes (CMS1,
CMS2, CMS3, CMS4) in accordance with the Colorectal Cancer (CRC) Subtyping Consortium
(CRCSC), assessing the presence or absence of core subtype patterns among existing gene
expression-based CRC subtyping algorithms. The concordance between the pathological
molecular profile and ctDNA (circulating tumor DNA) analysis during the protocol will be
related to the radiomics classification in order to provide a new comprehensive diagnostic
approach to CRC treatment and surveillance.
Prospective data will be used to validate the AI models. For the segmentation models, the
Dice Coefficient will be used as an indicator to measure the degree of overlap between the
automated and the expert segmentation. For the radiogenomics model, performance will be
evaluated using accuracy, the area under the receiver operating characteristic curve
(ROC-AUC) and clinical decision curves. To select the best AI models, the investigators will
also consider the models' response to variation of the input characteristics and will
produce saliency maps highlighting the features of the input image that contributed most to
the classification.
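For reference, the two main validation metrics named above can be computed as in the following sketch. It assumes flattened binary masks for the Dice coefficient and binary labels with continuous scores for ROC-AUC. The function names and toy inputs are illustrative only. The AUC uses the pairwise (Mann-Whitney) formulation, which is equivalent to the area under the ROC curve.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    overlap = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    return 2 * overlap / total if total else 1.0  # two empty masks agree fully


def roc_auc(labels, scores):
    """Probability that a random positive case is scored above a random negative.

    Ties count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy inputs: a 6-voxel mask pair and four scored cases.
dice = dice_coefficient([0, 1, 1, 1, 0, 0], [0, 1, 1, 0, 1, 0])
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(round(dice, 3), auc)
```

A Dice value of 1 means the automated and expert segmentations coincide exactly, and 0 means they do not overlap at all. An AUC of 0.5 corresponds to chance-level discrimination.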
Clinical evaluation will be performed every 6 months for 2 years, including regular
serum CEA (carcinoembryonic antigen) tests and whole-body CT every 6-12 months in patients
who are at higher risk of recurrence in the first 3 years, following ESMO (European Society
for Medical Oncology) guidelines. Disease-free survival (DFS) and relapse-free survival (RFS)
will be calculated.
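The protocol does not state how DFS and RFS will be estimated. One common choice for time-to-event endpoints with censored follow-up is the Kaplan-Meier estimator, sketched below purely as an illustration. The function name, the months unit and the follow-up data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient (here, in months)
    events : 1 if the event (e.g. recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored cases leave the risk set
    return curve


# Made-up follow-up of five patients (months, event flag).
curve = kaplan_meier([6, 12, 12, 18, 24], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 2))
```

Censored patients, for example those still disease-free at their last visit, contribute to the risk set up to their last follow-up time without lowering the curve.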