Fitting and simplification of mixtures for clustering conformational populations of small organic molecules
Sona, Diego;
2012-01-01
Abstract
Describing the conformational landscape of small organic molecules requires estimating the probability density of their conformational states along the relevant degrees of freedom. Traditional methods that rely on histograms of molecular dynamics (MD) or Monte Carlo (MC) samples can become very time-consuming as the dimension of the system increases, because they require huge datasets for statistical accuracy. Furthermore, these methods do not directly estimate the free-energy basins where the conformations are most stable. In this paper, we propose a novel clustering method that provides a parametric representation of the conformational space, scales very efficiently with the dimension, and detects the free-energy basins in a completely unsupervised way. Our algorithm is an online version of Bregman clustering for exponential family mixtures. Results on a dataset of MD samples of alanine dipeptide agree with previous work and confirm the feasibility of our approach.
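As an illustration of the traditional histogram approach that the abstract contrasts against (not the clustering method proposed in the paper), the following minimal sketch estimates a free-energy surface from the standard relation F = -k_B T ln p over two dihedral angles. The function name `free_energy_histogram`, the bin count, the temperature, and the synthetic two-basin samples standing in for MD data are all assumptions made for this example.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_histogram(phi, psi, bins=36, temperature=300.0):
    """Estimate F(phi, psi) = -k_B T ln p(phi, psi) from a 2D histogram.

    Angles are in degrees. The grid has bins**2 cells; for d degrees of
    freedom it would have bins**d cells, which is why histogram methods
    need very large datasets as the dimension grows.
    """
    # Wrap angles into [-180, 180) so no sample falls outside the grid.
    phi = (np.asarray(phi) + 180.0) % 360.0 - 180.0
    psi = (np.asarray(psi) + 180.0) % 360.0 - 180.0
    edges = np.linspace(-180.0, 180.0, bins + 1)
    density, _, _ = np.histogram2d(phi, psi, bins=[edges, edges], density=True)
    with np.errstate(divide="ignore"):
        free_energy = -K_B * temperature * np.log(density)
    # Shift so the deepest basin sits at zero; empty bins remain at +inf.
    free_energy -= free_energy[np.isfinite(free_energy)].min()
    return free_energy, edges


# Synthetic stand-in for MD samples (hypothetical data, not from the paper):
# two Gaussian basins in (phi, psi), with the first one more populated.
rng = np.random.default_rng(0)
phi = np.concatenate([rng.normal(-80, 15, 4000), rng.normal(-150, 15, 1000)])
psi = np.concatenate([rng.normal(-40, 15, 4000), rng.normal(150, 15, 1000)])

F, edges = free_energy_histogram(phi, psi)
centers = 0.5 * (edges[:-1] + edges[1:])
i, j = np.unravel_index(np.argmin(F), F.shape)
print("deepest basin near phi, psi =", centers[i], centers[j])
```

Here the basins have to be read off the grid after the fact, for instance by locating minima of `F`. A parametric mixture model, as proposed in the abstract, would instead describe each basin by a component of the mixture.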