Self Organization of Documents in a Given Taxonomy
Adami, Giordano; Avesani, Paolo; Sona, Diego
2003-01-01
Abstract
Hierarchical supervised classifiers are highly demanding in terms of labelled examples, because the number of categories is proportional to the size of the given taxonomy. In this setting the bootstrapping process plays a key role, because too few labelled examples can prevent the learning techniques from being exploited successfully. This paper proposes a method for making a first hypothesis of categorization for a set of unlabelled documents with respect to a given empty hierarchy of concepts. The goal is to support semi-automated management of the bootstrapping. The proposed solution is based on a revised model of self-organizing maps in which the unsupervised learning is biased by a taxonomy given as input to the model.