Accurate Parsing with Compact Tree-Substitution Grammars: Double-DOP

Sangati, Federico
2011-01-01

Abstract

We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser utilizes syntactic fragments of arbitrary size from a treebank to analyze new sentences, but, crucially, it uses only those which are encountered at least twice. This criterion allows us to work with a relatively small but representative set of fragments, which can be employed as the symbolic backbone of several probabilistic generative models. For parsing we define a transform-backtransform approach that allows us to use standard PCFG technology, making our results easily replicable. According to standard Parseval metrics, our best model is on par with many state-of-the-art parsers, while offering some complementary benefits: a simple generative probability model, and an explicit representation of the larger units of grammar.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/250654