Integrating Shallow and Linguistic Techniques for Information Extraction from Text

1995-01-01

Abstract

Many experiments have shown that traditional approaches to both Natural Language Processing (NLP) and Information Retrieval (IR) are not effective enough for extracting information from text. Shallow techniques (such as statistics and keyword analysis) tend to be imprecise, although efficient and transportable, whereas linguistic approaches tend to be very precise but neither robust nor efficient. Integrating NLP and IR is the challenge for the evolution of text processing systems over the next few years. This paper presents an architecture that integrates shallow and linguistic processing. Shallow techniques limit the linguistic analysis to the interesting sections of the text and help the parser reduce its overhead. The linguistic analyzer then carefully extracts the information, keeping under control both the combinatorics of parsing and any misdirected parsing efforts. Preliminary results show that the architecture has considerable advantages over traditional approaches to information extraction from text.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1106