Line 33:
The data processing procedures are based on reference methods from the literature or on methods developed in-house. They are implemented in the open-source statistical software R and follow a set of specifications that facilitates collaborative work and traceability.
The data sent by the hybridization platforms are pre-processed with a pipeline adapted to each technology: background correction, quality control, filtering, aggregation and normalization. For genomic data (CGH, SNP arrays), an additional segmentation step identifies the altered regions along the genome.
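As an illustration of the normalization step, the following is a minimal sketch of quantile normalization, a method commonly applied to microarray intensities. The text does not say which normalization method is used, so the choice of method, the function name <code>quantile_normalize</code> and the list-of-lists data layout are assumptions made for this example, and Python is used here only for illustration even though the actual procedures are implemented in R.

```python
from statistics import mean


def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of intensities per array).

    Illustrative sketch only: ties are broken by probe order, which is a
    simplification compared with production implementations.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Sort each sample, then average across samples at each rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [mean(values) for values in zip(*sorted_samples)]

    normalized = []
    for s in samples:
        # Probe indices ordered by increasing intensity within this sample.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            # Each probe receives the cross-sample mean for its rank.
            out[probe] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After this transformation every sample shares the same empirical distribution of values, so differences between arrays reflect probe ranks rather than array-wide intensity shifts.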
The data analysis proceeds in three main stages:
* Class discovery, using unsupervised clustering, identifies the underlying molecular groups. The quality and variety of the supplied annotations are crucial for interpreting the resulting classification.
Line 42:
* Class prediction, using classification approaches, identifies the smallest combinations of molecular markers that characterize tumor groups and can guide treatment decisions.
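The class discovery stage can be illustrated with a minimal sketch of agglomerative hierarchical clustering with average linkage. The text only specifies "unsupervised clustering", so the algorithm, the Euclidean distance and the names <code>average_linkage</code> and <code>agglomerative_clusters</code> are assumptions chosen for this example, not the program's actual method.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, profiles):
    """Mean Euclidean distance over all pairs drawn from clusters a and b."""
    total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerative_clusters(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles: one list of measurements per sample (hypothetical layout).
    Returns a list of clusters, each a sorted list of sample indices.
    """
    if not 1 <= n_clusters <= len(profiles):
        raise ValueError("n_clusters must be between 1 and the number of samples")

    # Start with every sample in its own cluster.
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average linkage.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], profiles),
        )
        # a < b is guaranteed by combinations, so deleting b is safe.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

In practice the number of groups is not known in advance, which is why the supplied annotations matter for deciding whether a given partition is biologically meaningful.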
Results are interpreted through additional bioinformatics analyses (pathway analysis, combined genome and transcriptome study), and then validated against independent datasets from the literature or from the CIT program. Finally, the results are validated experimentally by RT-PCR on a microfluidic platform.