Corpus-Based Terminology

Gamper, J.; Stock, Oliviero

The manual acquisition of terminological data from domain-specific text material is a very time-consuming task. Recent advances in text-processing research provide a basis for automating this task. Computer-assisted term acquisition improves both the quantity and the quality of terminological work. This paper gives a brief overview of this new approach to terminology acquisition. Three subtasks are distinguished: compilation of an electronic text corpus, extraction of terminological data, and management of terminological data. Each subtask is discussed in some detail, identifying its core problems as well as proposed solutions. As a concrete initiative in this emerging field, we present an ongoing research project at the European Academy Bolzano, which illustrates the importance of computer-assisted terminology acquisition and the steps recently taken in this direction. The paper concludes with a summary of five selected papers that were presented at a workshop on corpus-based terminology in Bolzano.