This paper investigates the potential of a lightweight approach to the Expected Answer Type (EAT) recognition task in a specific restricted-domain Question Answering scenario. In this scenario, the input consists of automatically transcribed spoken requests, which may contain transcription errors. Our objective is to demonstrate that, when dealing with sub-optimal (i.e., noisy) inputs, good performance can be achieved with a Machine Learning approach based on simple features extracted from unprocessed questions. In contrast to traditional approaches, which rely on questions pre-processed at different levels (including lemmatization, part-of-speech (POS) tagging, and multiword recognition), our lightweight approach has the advantage of avoiding the additional errors that often arise when such processing is applied to noisy data.
Expected Answer Type Identification from Unprocessed Noisy Questions
Chowdhury, Faisal Mahbub;Negri, Matteo
2009-01-01
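The idea summarized in the abstract, learning the EAT directly from surface features of the raw transcript, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature set (lowercased unigrams, bigrams, and the first token), the multinomial Naive Bayes learner, the label names (`TIME`, `LOCATION`, `PRICE`), and the toy training questions. The only point it tries to convey is that no lemmatizer, POS tagger, or multiword recognizer is involved.

```python
import math
from collections import Counter, defaultdict


def surface_features(question):
    """Surface features from the raw transcript (illustrative choice, not the paper's).

    Only lowercasing and whitespace splitting are applied: no lemmatization,
    POS tagging, or multiword recognition.
    """
    tokens = question.lower().split()
    feats = ["uni=" + t for t in tokens]
    feats += ["bi=" + a + "_" + b for a, b in zip(tokens, tokens[1:])]
    if tokens:
        feats.append("first=" + tokens[0])
    return feats


class NaiveBayesEAT:
    """Multinomial Naive Bayes with add-one smoothing (hypothetical learner)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, questions, labels):
        for q, y in zip(questions, labels):
            self.label_counts[y] += 1
            for f in surface_features(q):
                self.feat_counts[y][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, question):
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for y, n in self.label_counts.items():
            denom = sum(self.feat_counts[y].values()) + len(self.vocab)
            score = math.log(n / total)
            for f in surface_features(question):
                # Features never seen in training (e.g. "uh") are skipped.
                if f in self.vocab:
                    score += math.log((self.feat_counts[y][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = y, score
        return best_label


# Toy training data with made-up labels, for illustration only.
TRAIN = [
    ("when does the museum open", "TIME"),
    ("what time does the concert start", "TIME"),
    ("where is the nearest pharmacy", "LOCATION"),
    ("where can i find a cash machine", "LOCATION"),
    ("how much is a ticket for the cinema", "PRICE"),
    ("how much does the guided tour cost", "PRICE"),
]
model = NaiveBayesEAT().fit(*zip(*TRAIN))

# A request with a filler word and a repeated token, as a transcript might contain.
print(model.predict("uh where is the the pharmacy"))
```

In this sketch, tokens unseen at training time contribute nothing to the score, so transcription noise degrades the prediction gradually instead of propagating through a chain of linguistic pre-processing steps.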