D4: Distance diffusion for a truly equivariant molecular design

Cognolato, Samuel; Rigoni, Davide; Serafini, Luciano; Sperduti, Alessandro
2026-01-01

Abstract

In recent years, interest in generative models for de novo drug design has grown. State-of-the-art methods typically focus on either 2D structures or 3D structures, the latter also known as conformers. Designing 3D structures is more challenging because it involves predicting spatial coordinates, which requires SE(3)-equivariant architectures to ensure consistency under coordinate transformations such as rotations and translations. This study presents D4, a novel Distance and Discrete Denoising Diffusion model that uses the distance matrix of a molecule’s atoms to predict its 3D coordinates; interatomic distances are naturally unaffected by such transformations. This method sidesteps the difficulties of the coordinate-based training used by state-of-the-art methods and allows explicit conditioning of bond types on distances. Experiments on three well-established datasets of varying difficulty (QM9, GDB13, and ZINC250K) show that this approach significantly surpasses the performance of MiDi, a state-of-the-art approach for generating 3D molecular structures. Additionally, an ablation study confirms the significance of a novel regularization loss, which addresses errors in distance predictions and bounds the triangle inequality, validating the use of distance matrices in molecular generative models.
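
The invariance the abstract relies on can be checked numerically. The sketch below is not the D4 implementation: it is a minimal illustration, using a hypothetical four-atom toy geometry and helper names chosen here for clarity, that a pairwise distance matrix is unchanged by a rotation plus a translation, and that distances derived from real coordinates satisfy the triangle inequality while an arbitrarily perturbed matrix may not.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rotate_z(coords, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def translate(coords, shift):
    """Shift every point by the vector `shift`."""
    return [tuple(p + d for p, d in zip(point, shift)) for point in coords]


def max_abs_diff(m1, m2):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))


def max_triangle_violation(dist):
    """Largest d[i][j] - (d[i][k] + d[k][j]) over all index triples.

    The value is 0 (up to rounding) when the triangle inequality holds
    everywhere, and positive when some triple violates it.
    """
    n = len(dist)
    return max(
        dist[i][j] - dist[i][k] - dist[k][j]
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Hypothetical toy coordinates, for illustration only (not from any dataset).
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (-0.4, 1.0, 0.2), (0.3, -0.5, 0.9)]

# Apply a rigid motion: rotation about z followed by a translation.
moved = translate(rotate_z(coords, 1.234), (3.0, -2.0, 5.0))

# The distance matrix should be identical up to floating-point rounding.
invariance_error = max_abs_diff(distance_matrix(coords), distance_matrix(moved))

# Distances computed from actual coordinates satisfy the triangle inequality.
triangle_violation = max_triangle_violation(distance_matrix(coords))

# A hand-made symmetric matrix that no point configuration can realise.
inconsistent = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
inconsistent_violation = max_triangle_violation(inconsistent)
```

The last check shows why a model that predicts distances directly may need some form of regularization: nothing forces an arbitrary predicted matrix to correspond to a valid 3D arrangement of points.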

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/369088