Learning and clustering graphs from high dimensional data

Pasadakis, Dimosthenis


Doctoral thesis


Università della Svizzera italiana

Pasadakis, Dimosthenis
Schenk, Olaf (Degree supervisor)

2023

PhD: Università della Svizzera italiana

Estimating the graphical structure of high dimensional data and identifying the clusters present in it are ubiquitous tasks in every scientific domain that deals with interacting or interconnected variables. We contribute to the advancement of these research fields with efficient and accurate algorithms that learn and cluster graphs. First, we contribute to the development of a performant precision matrix estimation routine based on the sparse quadratic approximation of the l1-regularized Gaussian maximum likelihood method. The proposed method exploits the presence of block structure in the underlying computations and is suitable for datasets characterized by reduced sparsity. Motivated by its effectiveness in high dimensional problems, we extend this method to the retrieval of graphs whose variables are only non-negatively correlated, and introduce two algorithms for sparse M-matrix estimation. The first is based on consecutive precision matrix estimations, while the second performs constrained optimization to retrieve the final graphical structure. Finally, we present a nonlinear reformulation of direct multiway spectral clustering that is posed as an unconstrained minimization problem. Our method promotes sharp indicator vectors that correspond to optimal graph cuts and improved clustering assignments. The advantages of all introduced algorithms are showcased in a series of comparative tests against the current state of the art on artificial datasets, and their real-world applicability is demonstrated with numerical experiments on biological, medical, and image data.
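
For orientation, the l1-regularized Gaussian maximum likelihood problem named in the abstract is commonly written in the following textbook form. The symbols are assumptions introduced here and do not come from the record: S denotes the sample covariance matrix, Θ the precision matrix being estimated, and λ > 0 the regularization weight. The thesis may use a variant, for example penalizing only off-diagonal entries.

```latex
\hat{\Theta} \;=\; \arg\min_{\Theta \succ 0}\;
  -\log\det\Theta \;+\; \operatorname{tr}(S\Theta) \;+\; \lambda \lVert \Theta \rVert_{1},
\qquad
\lVert \Theta \rVert_{1} = \sum_{i,j} \lvert \Theta_{ij} \rvert .
```

In the same notation, restricting the estimate to symmetric M-matrices would add the sign constraint Θ_ij ≤ 0 for all i ≠ j. How the two M-matrix algorithms in the thesis enforce it is not specified in this record.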

Collections

USI Faculty of Informatics

Language

English

Classification

Computer science and technology

License

License undefined

Open access status

green

Identifiers

NDP-USI 2023INF001
URN urn:nbn:ch:rero-006-121365
ARK ark:/12658/srd1324640

Persistent URL

https://n2t.net/ark:/12658/srd1324640

Statistics

Document views: 324

File downloads (2023INF001): 289


l1-regularization

Gaussian Markov random fields

M-matrices

Precision matrix estimation

Graph p-Laplacian

Spectral graph clustering
