Data analysis in combinatorial experiments: applying supervised principal component technique to investigate the relationship between ToF-SIMS spectra and the composition distribution of ternary metallic alloy thin films
Dell'Anna, Rossana; Canteri, Roberto
2008-01-01
Abstract
We apply a semi-supervised technique called Supervised Principal Component (SPC) to explore the relationship between the composition of a thin film combinatorial library and the peaks of Time-of-Flight Secondary Ion Mass Spectrometry (ToF-SIMS) spectra acquired from the library. SPC is first used to select a subset of the available multivariate features (the peak intensities of the ToF-SIMS spectra) based on their association with the outcome variable (the elemental concentration of the thin film samples). Next, using only the selected features, SPC creates optimal linear models that map the ToF-SIMS data onto the composition data. The models for the first two of the considered elemental concentrations use only eight of the 55 available ToF-SIMS peaks, which makes them much simpler to interpret than a model that uses all 55 peaks. The fraction of explained variance (R²) in the concentration data is about 0.80 in both cases. These results were obtained during the model validation phase, performed on test data reserved exclusively for this purpose. The model for the third considered element did not produce significant results because of the low variability of the dataset. This work illustrates for the first time that a multivariate analysis technique can establish the relationship between ToF-SIMS measurements and stoichiometric data in a combinatorial experiment. More generally, the described feature selection approach provides an example of how combinatorial experiments can accelerate the understanding of the chemical-physical behaviors under investigation.
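The two SPC stages described in the abstract (feature selection by association with the outcome, then a linear model built on the selected features only) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the association score is the absolute Pearson correlation (proportional to the standardized univariate regression coefficient when the outcome is centered), and the function names `fit_spc`, `predict`, and `r_squared` as well as the threshold value of 0.5 are hypothetical. In practice the threshold would normally be tuned by cross-validation.

```python
import numpy as np


def fit_spc(X, y, threshold, n_components=1):
    """Fit a supervised principal component model (illustrative sketch)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    # Stage 1: score each feature by its absolute correlation with the outcome
    # and keep only the features whose score exceeds the threshold.
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.flatnonzero(scores > threshold)
    if selected.size == 0:
        raise ValueError("no feature passes the threshold")

    # Stage 2a: principal components of the selected features only.
    _, _, vt = np.linalg.svd(Xc[:, selected], full_matrices=False)
    loadings = vt[:n_components].T  # shape (n_selected, n_components)

    # Stage 2b: least-squares regression of the outcome on the component scores.
    pcs = Xc[:, selected] @ loadings
    coef, *_ = np.linalg.lstsq(pcs, yc, rcond=None)
    return {"selected": selected, "loadings": loadings, "coef": coef,
            "x_mean": x_mean, "y_mean": y_mean}


def predict(model, X):
    """Predict the outcome for new samples using a fitted SPC model."""
    Xc = X - model["x_mean"]
    pcs = Xc[:, model["selected"]] @ model["loadings"]
    return pcs @ model["coef"] + model["y_mean"]


def r_squared(y_true, y_pred):
    """Fraction of variance in y_true explained by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for the real data (an assumption): 55 "peak intensities",
# of which the first 8 track a latent variable that also drives the outcome.
rng = np.random.default_rng(0)
n_samples, n_peaks, n_informative = 180, 55, 8
z = rng.normal(size=n_samples)
X = rng.normal(size=(n_samples, n_peaks))
X[:, :n_informative] = z[:, None] + 0.3 * X[:, :n_informative]
y = z + 0.3 * rng.normal(size=n_samples)

# Hold out the last 60 samples as test data used only for validation.
X_train, y_train = X[:120], y[:120]
X_test, y_test = X[120:], y[120:]

model = fit_spc(X_train, y_train, threshold=0.5)
r2_test = r_squared(y_test, predict(model, X_test))
print("selected features:", model["selected"])
print("test R^2:", round(float(r2_test), 3))
```

Computing R² on held-out samples mirrors the validation setup in the abstract, where the test data are used only to assess the fitted model and play no part in feature selection or regression.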