TIES 1.2 User Manual

2003-01-01

Abstract

TIES is a trainable information extraction system written in Java with an object-oriented design. The application package supplies a set of interfaces and classes for training, testing, and running an extraction task in both traditional (natural text) and wrapper (machine-generated or rigidly structured text) domains. TIES is based on a reimplementation of the Boosted Wrapper Induction (BWI) algorithm devised by Dayne Freitag and Nicholas Kushmerick [1]. The system architecture is built around boosting and wrapper induction techniques, but it is flexible enough to let programmers, if necessary, develop their own weak learner implementations to bootstrap and add new validation strategies. The default implementation exploits only simple features, which map an individual token to an arbitrary set of wildcards (e.g., capitalized, lower-case, punctuation). More complex features (e.g., morpho-syntactic ones), if available, can also be provided to the algorithm; in that case a different feature extraction method must be supplied. The system comes with a default implementation of every interface it defines, so the application can also be used without programming experience. The remaining sections of the tutorial give step-by-step instructions for installing and configuring TIES and for performing common tasks with it. You will benefit most from the tutorial if you complete the sections in order. You will perform these tasks in the actual TIES environment, not a simulation.
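
To make the idea of mapping a token to a set of wildcards concrete, the following is a minimal Java sketch. It is not taken from the TIES code base: the class name `WildcardFeatures`, the method `wildcardsFor`, and the wildcard labels (`<Any>`, `<Cap>`, `<Lower>`, `<Num>`, `<Punct>`) are all hypothetical names chosen for illustration, and the actual feature extraction interface in TIES may look quite different.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration only; this class is not part of the TIES API. */
public class WildcardFeatures {

    /** Returns the wildcard labels (assumed names) that match a single token. */
    public static Set<String> wildcardsFor(String token) {
        Set<String> wildcards = new LinkedHashSet<>();
        if (token.isEmpty()) {
            return wildcards; // an empty token matches nothing
        }
        wildcards.add("<Any>"); // most general wildcard: matches every non-empty token
        if (Character.isUpperCase(token.charAt(0))) {
            wildcards.add("<Cap>"); // token starts with an upper-case letter
        }
        if (token.chars().allMatch(Character::isLowerCase)) {
            wildcards.add("<Lower>"); // every character is a lower-case letter
        }
        if (token.chars().allMatch(Character::isDigit)) {
            wildcards.add("<Num>"); // every character is a digit
        }
        if (token.chars().noneMatch(Character::isLetterOrDigit)) {
            wildcards.add("<Punct>"); // no letters or digits; treated here as punctuation
        }
        return wildcards;
    }

    public static void main(String[] args) {
        // Print the wildcard set for a few sample tokens.
        for (String token : new String[] {"Speaker", ":", "room", "2003"}) {
            System.out.println(token + " -> " + wildcardsFor(token));
        }
    }
}
```

A weak learner could then test a token against any label in the returned set instead of the literal token, which is what lets a learned pattern generalize beyond the exact words seen in training. Richer features, such as morpho-syntactic tags, would require replacing this method with a different extractor.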
Use this identifier to cite or link to this document: https://hdl.handle.net/11582/2459