Brain Decoding: Biases in Error Estimation

Olivetti, Emanuele;Mognon, Andrea;Greiner, Susanne;Avesani, Paolo
2010

Abstract

Classification-based approaches to data analysis are attracting wide interest and growing adoption within the neuroscience community. Brain decoding, multivoxel pattern analysis and brain-computer interfaces are prominent examples of this trend. The core problem of these investigations is hypothesis testing, i.e., finding evidence within neural correlates of some effect produced by the stimulation protocol. A classification algorithm is trained on the recorded data to learn how to discriminate between different stimuli. The misclassification rate of its predictions is then estimated and used to carry out the statistical test. This generic classification problem can be implemented in several ways, depending on the exact neuroscientific question under investigation. However, some implementations produce biased estimates because of circular analysis, and these biases can invalidate the conclusions of a scientific study. The most suitable implementation of the classification problem must therefore be used in order to avoid biases, to detect weak stimulus-related information within noise, and to give the proper answer to the neuroscientific question at hand. In this work we propose different implementations of the classification-based approach for the case in which it comprises a variable selection step together with a classification step. For each implementation we investigate the associated bias. Analyses are conducted on synthetic data and on MEG data from a covert spatial attention task. The effects of the different implementations are quantified by means of the expected misclassification rate. The results show the importance of adopting a proper error rate estimation process.
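The circular-analysis bias mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the ranking of variables by absolute difference of class means, the 5-fold cross-validation, and all sizes and function names below are assumptions chosen for illustration. The data are pure noise, so an unbiased estimate of the misclassification rate should sit near chance (0.5). Selecting variables on the full dataset before cross-validation lets test-set labels leak into the selection step, which tends to push the estimate well below chance.

```python
import random


def make_noise_data(n_samples, n_features, rng):
    """Pure-noise data: no feature carries information about the labels."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # balanced binary labels
    rng.shuffle(y)
    return X, y


def class_means(X, y, features):
    """Mean of each listed feature, computed separately for each class."""
    means = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    return means


def select_features(X, y, k):
    """Keep the k features with the largest absolute class-mean difference."""
    all_features = range(len(X[0]))
    m = class_means(X, y, all_features)
    scores = [abs(a - b) for a, b in zip(m[0], m[1])]
    return sorted(all_features, key=lambda j: scores[j], reverse=True)[:k]


def predict(x, means, features):
    """Nearest-centroid prediction restricted to the selected features."""
    dist = {label: sum((x[j] - c) ** 2 for j, c in zip(features, mu))
            for label, mu in means.items()}
    return min(dist, key=dist.get)


def cv_error(X, y, k, n_folds, circular):
    """Cross-validated error; `circular` selects features on ALL data first."""
    n = len(X)
    global_features = select_features(X, y, k) if circular else None
    errors = 0
    for fold in range(n_folds):
        test_idx = range(fold, n, n_folds)
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        # Proper variant: selection sees only the training part of the fold.
        features = (global_features if circular
                    else select_features(X_train, y_train, k))
        means = class_means(X_train, y_train, features)
        errors += sum(predict(X[i], means, features) != y[i] for i in test_idx)
    return errors / n


def compare_estimates(n_repeats=20, n_samples=40, n_features=500,
                      k=10, n_folds=5, seed=0):
    """Average both error estimates over several independent noise datasets."""
    rng = random.Random(seed)
    circular_errs, nested_errs = [], []
    for _ in range(n_repeats):
        X, y = make_noise_data(n_samples, n_features, rng)
        circular_errs.append(cv_error(X, y, k, n_folds, circular=True))
        nested_errs.append(cv_error(X, y, k, n_folds, circular=False))
    return sum(circular_errs) / n_repeats, sum(nested_errs) / n_repeats


circular_error, nested_error = compare_estimates()
print(f"selection on all data (circular): {circular_error:.2f}")
print(f"selection inside each fold:       {nested_error:.2f}")
```

Since the labels are unrelated to the data by construction, any apparent discrimination in the first estimate is produced by the estimation procedure itself, not by the data.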
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/16889