Measuring the stability of spectral clustering

Andreotti, E.;
2021-01-01

Abstract

As an indicator of the stability of spectral clustering of an undirected weighted graph into k clusters, the kth spectral gap of the graph Laplacian is often considered. The kth spectral gap is characterized in this paper as an unstructured distance to ambiguity, namely as the minimal distance of the Laplacian to arbitrary symmetric matrices with vanishing kth spectral gap. As a conceptually more appropriate measure of stability, the structured distance to ambiguity of the k-clustering is introduced as the minimal distance of the Laplacian to Laplacians of graphs with the same vertices and edges but with weights that are perturbed such that the kth spectral gap vanishes. To compute a solution to this matrix nearness problem, a two-level iterative algorithm is proposed that uses a constrained gradient system of matrix differential equations in the inner iteration and a one-dimensional optimization of the perturbation size in the outer iteration. The structured and unstructured distances to ambiguity are compared on some example graphs. The numerical experiments show, in particular, that selecting the number k of clusters according to the criterion of maximal stability can lead to different results for the structured and unstructured stability indicators.
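
The relation between the kth spectral gap and the unstructured distance to ambiguity can be illustrated numerically. The following is a minimal sketch, not the paper's algorithm: it assumes the distance is measured in the Frobenius norm, in which case moving the kth and (k+1)st eigenvalues to their common mean closes the gap with a symmetric perturbation of norm gap/sqrt(2). The function names and the example graph (two triangles joined by a weak edge) are hypothetical and chosen only for illustration.

```python
import numpy as np


def laplacian(W):
    """Graph Laplacian L = D - W for a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def spectral_gap(L, k):
    """kth spectral gap lambda_{k+1} - lambda_k (ascending eigenvalues, k is 1-based)."""
    lam = np.linalg.eigvalsh(L)
    return lam[k] - lam[k - 1]


def gap_closing_perturbation(L, k):
    """Symmetric E that moves lambda_k and lambda_{k+1} to their mean.

    Assumption: Frobenius norm. E is in general NOT a Laplacian perturbation,
    so it only illustrates the unstructured distance.
    """
    lam, V = np.linalg.eigh(L)
    g = lam[k] - lam[k - 1]
    x, y = V[:, k - 1], V[:, k]
    return 0.5 * g * (np.outer(x, x) - np.outer(y, y))


# Hypothetical example graph: two unit-weight triangles joined by a weak edge.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
W = np.zeros((6, 6))
for i, j, w in edges:
    W[i, j] = W[j, i] = w

L = laplacian(W)
k = 2
gap = spectral_gap(L, k)
E = gap_closing_perturbation(L, k)
perturbed_gap = spectral_gap(L + E, k)
unstructured_distance = np.linalg.norm(E, "fro")

# Choosing k by the largest spectral gap (the unstructured stability indicator).
best_k = max(range(1, L.shape[0]), key=lambda m: spectral_gap(L, m))

print(f"gap_{k} = {gap:.4f}, ||E||_F = {unstructured_distance:.4f}, "
      f"gap after perturbation = {perturbed_gap:.2e}, best k = {best_k}")
```

The perturbed matrix L + E is symmetric but generally has neither the sparsity pattern nor the zero row sums of a graph Laplacian, which is the reason the abstract introduces the structured distance as the conceptually more appropriate measure. Computing that structured distance requires the two-level iteration described above and is not attempted in this sketch.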

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/362073