Full Parsing Approximation, Finite-State Cascades and Grammar Organization for Information Extraction
Lavelli, Alberto
1999-01-01
Abstract
This paper proposes an approach to approximating full parsing that is suitable for Information Extraction (IE) from text. Sequences of cascades of finite-state rules analyze the text deterministically and build unambiguous structures. First, basic chunks are analyzed; then clauses are recognized and nested; finally, modifier attachment is performed and the global parse tree is built. The approach has been validated extensively, mainly on Italian, and has also been tested on English and Russian. A parser based on this approach has been implemented as part of Pinocchio, an environment for developing and running IE applications.
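The three stages named in the abstract (chunks, then clauses, then modifier attachment) can be illustrated with a toy cascade. The sketch below is not the paper's grammar or the Pinocchio parser: the tag set, the rule patterns, and the names `apply_stage`, `parse`, and `bracket` are assumptions made up for illustration. Each stage is a list of regular-expression rules over the labels produced by the previous stage, applied left to right with the first matching rule winning, so the result is deterministic and unambiguous.

```python
import re

# A node is (label, children). A leaf is (tag, word), with the word as a str.
# All rules below are hypothetical toy rules, not taken from the paper.
CHUNK_RULES = [
    ("PP", r"PREP (DET )?(ADJ )*NOUN "),
    ("NP", r"(DET )?(ADJ )*NOUN "),
    ("VG", r"(AUX )*VERB "),
]
CLAUSE_RULES = [("CLAUSE", r"NP VG (NP )?")]
ATTACH_RULES = [("S", r"CLAUSE (PP )*")]
CASCADE = [CHUNK_RULES, CLAUSE_RULES, ATTACH_RULES]


def apply_stage(nodes, rules):
    """One finite-state pass: at each position, apply the first rule that
    matches the upcoming labels and wrap the matched nodes in a new node."""
    out, i = [], 0
    while i < len(nodes):
        # Encode the remaining labels as a string of space-terminated symbols.
        tail = "".join(label + " " for label, _ in nodes[i:])
        for new_label, pattern in rules:
            m = re.match(pattern, tail)
            if m and m.group(0):  # ignore empty matches to guarantee progress
                n = m.group(0).count(" ")  # number of symbols consumed
                out.append((new_label, nodes[i:i + n]))
                i += n
                break
        else:
            out.append(nodes[i])  # no rule applies: pass the node through
            i += 1
    return out


def parse(tagged):
    """Run every stage of the cascade in order over (word, tag) pairs."""
    nodes = [(tag, word) for word, tag in tagged]
    for rules in CASCADE:
        nodes = apply_stage(nodes, rules)
    return nodes


def bracket(node):
    """Render a node as a labelled bracketing."""
    label, kids = node
    if isinstance(kids, str):
        return kids
    return "[" + label + " " + " ".join(bracket(k) for k in kids) + "]"


SENTENCE = [("the", "DET"), ("parser", "NOUN"), ("builds", "VERB"),
            ("a", "DET"), ("tree", "NOUN"), ("for", "PREP"),
            ("the", "DET"), ("sentence", "NOUN")]
result = parse(SENTENCE)
print(bracket(result[0]))
# [S [CLAUSE [NP the parser] [VG builds] [NP a tree]] [PP for the sentence]]
```

In this toy version the attachment stage simply hangs every following PP on the clause. Real modifier attachment needs lexical and semantic constraints, which this sketch leaves out.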