Reduced Variable Neighbourhood Search for the Generation of Controlled Circular Data

Turchi, Marco
2021-01-01

Abstract

Many artificial intelligence and machine learning problems must be formulated in a directional space, where classical Euclidean geometry does not apply or has to be adapted to the circle. This is typical, for example, of computational linguistics and natural language processing, where language models based on Bag-of-Words, Vector Space, or Word Embedding representations are widely used for tasks such as document classification, information retrieval, and recommendation systems. In these contexts, assessing document clustering and outlier detection applications often requires generating data with directional properties, whose units follow given model assumptions and possibly form tight groups. We propose a Reduced Variable Neighbourhood Search heuristic for generating high-dimensional data controlled by the desired properties, aimed at representing several real-world contexts. The whole problem is formulated as a non-linear continuous optimization problem, and the proposed Reduced Variable Neighbourhood Search is shown to generate high-dimensional solutions in short computational time. A comparison with the state-of-the-art local search routine used to address this problem shows the greater efficiency of the proposed approach.
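The abstract does not give the objective function or the neighbourhood structures, so the following is only a minimal sketch of the generic Reduced Variable Neighbourhood Search template (random shaking in nested neighbourhoods, no local search step) applied to points on the unit hypersphere. The objective (matching a target mean pairwise cosine similarity), the Gaussian shaking move, and all names and parameters (`rvns`, `shake`, `target`, `k_max`, `step`) are assumptions made for illustration, not the formulation used in the paper.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_cosine(points):
    """Mean pairwise cosine similarity of a list of unit vectors."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum(a * b for a, b in zip(points[i], points[j]))
    return total / (n * (n - 1) / 2)


def objective(points, target):
    """Hypothetical objective: squared gap between the observed and
    the desired mean cosine similarity (an assumed 'controlled property')."""
    return (mean_cosine(points) - target) ** 2


def shake(points, k, rng, step=0.05):
    """Draw a random neighbour: perturb one point with Gaussian noise whose
    scale grows with the neighbourhood index k, then re-project it."""
    neighbour = [p[:] for p in points]
    i = rng.randrange(len(points))
    moved = [x + rng.gauss(0.0, k * step) for x in neighbour[i]]
    neighbour[i] = normalize(moved)
    return neighbour


def rvns(n=8, dim=10, target=0.8, k_max=5, max_iter=200, seed=0):
    """Generic RVNS loop: shake in neighbourhood k, accept only improvements,
    return to k = 1 on success, otherwise move to the next neighbourhood."""
    rng = random.Random(seed)
    best = [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
            for _ in range(n)]
    best_cost = objective(best, target)
    for _ in range(max_iter):
        k = 1
        while k <= k_max:
            candidate = shake(best, k, rng)
            cost = objective(candidate, target)
            if cost < best_cost:
                best, best_cost = candidate, cost
                k = 1
            else:
                k += 1
    return best, best_cost


if __name__ == "__main__":
    points, cost = rvns()
    print("mean cosine:", mean_cosine(points), "cost:", cost)
```

Because RVNS skips the local search step of basic Variable Neighbourhood Search, each iteration costs only one objective evaluation per neighbourhood tried, which is the usual reason for preferring it on high-dimensional instances.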
2021
ISBN: 978-3-030-69624-5
ISBN: 978-3-030-69625-2

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/325846