Benchmarking Natural Language Processing Algorithms for Patent Summarization

Casola, Silvia; Lavelli, Alberto
2023-01-01

Abstract

The number of patent applications is enormous, and patent documents are long and complex. Methods for automatically obtaining the most salient information in a short text would thus be useful for patent professionals and other practitioners. However, patent summarization is currently under-researched; moreover, the proposed methods are difficult to compare directly as they are generally tested on different datasets. In this paper, we benchmark several extractive, abstractive, and hybrid summarization methods on the BigPatent dataset, compare automatic metrics and show qualitative insights.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/346667