Integrating gene expression profiling and clinical data

Jurman, Giuseppe; Albanese, Davide; Merler, Stefano; Furlanello, Cesare
2008-01-01

Abstract

We propose a combination of machine learning techniques to integrate predictive profiling from gene expression with clinical and epidemiological data. Starting from BioDCV, a complete software setup for predictive classification and feature ranking without selection bias, we apply semisupervised profiling to detect outliers and derive informative patient subtypes. During the profiling process, sampletracking curves are extracted and then clustered according to a distance derived from dynamic time warping. Sampletracking also allows the identification of outlier cases, whose removal is shown to improve both predictive accuracy and the stability of the derived gene profiles. Here we propose to employ clinical features to validate the semisupervised procedure. The procedure is demonstrated in the analysis of a liver cancer dataset of 213 samples described by 1993 genes and by pathological features.
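
As an illustration of the clustering step, the following is a minimal sketch of a dynamic time warping distance between curves, from which a pairwise distance matrix can be built and passed to any clustering method. It is the textbook formulation with an absolute-difference local cost. The abstract does not specify the exact distance the authors derive from dynamic time warping, and the curves below are made-up values, not data from the study. The names `dtw_distance` and `pairwise_dtw` are hypothetical and are not part of BioDCV.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric curves.

    Assumption: the local cost is the absolute difference between points.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def pairwise_dtw(curves):
    """Symmetric matrix of DTW distances, usable as input to a clustering step."""
    k = len(curves)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(curves[i], curves[j])
    return dist


# Hypothetical curves, for illustration only.
curves = [
    [0.0, 0.1, 0.5, 0.9, 1.0],
    [0.0, 0.0, 0.1, 0.5, 0.9, 1.0],  # same shape as the first, delayed by one step
    [1.0, 0.9, 0.5, 0.1, 0.0],       # opposite trend
]
D = pairwise_dtw(curves)
```

In this toy example the delayed curve is at distance zero from the first one, because warping absorbs the time shift, while the curve with the opposite trend stays far from both. This tolerance to shifts is the usual motivation for preferring a warping-based distance over a pointwise one when grouping curves by shape.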
Use this identifier to cite or link to this document: https://hdl.handle.net/11582/8644