KeyBLD: Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval

Minghan Li; Éric Gaussier

doi:10.1145/3404835.3463083

Conference paper. Year: 2021

Minghan Li

Role: Author

Laboratoire d'Informatique de Grenoble

Éric Gaussier

Role: Author
PersonId : 182833
IdHAL : eric-gaussier
ORCID : 0000-0002-8858-3233
IdRef : 074308297

Laboratoire d'Informatique de Grenoble

Abstract

Transformer-based models, and especially pre-trained language models like BERT, have shown great success on a variety of Natural Language Processing and Information Retrieval tasks. However, such models have difficulty processing long documents because of the quadratic complexity of the self-attention mechanism. Recent works either truncate long documents or segment them into passages that can be processed by a standard BERT model. A hierarchical architecture, such as a transformer, can then be adopted to build a document-level representation on top of the representations of each passage. However, these approaches either lose information or have high computational complexity (and, in the latter case, consume both time and energy). We follow here a slightly different approach in which one first selects key blocks of a long document by local query-block pre-ranking, and then aggregates a few blocks to form a short document that can be processed by a model such as BERT. Experiments conducted on standard Information Retrieval datasets demonstrate the effectiveness of the proposed approach.
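
The select-then-aggregate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed word-count blocks, the term-overlap scoring function, the choice to restore the kept blocks to document order, and every name below (`split_into_blocks`, `overlap_score`, `select_key_blocks`, `block_size`, `top_k`) are assumptions made for illustration only, since the abstract does not specify the pre-ranking function or the block definition.

```python
from collections import Counter


def split_into_blocks(text, block_size=8):
    """Split text into consecutive blocks of at most block_size words."""
    words = text.split()
    return [" ".join(words[i:i + block_size])
            for i in range(0, len(words), block_size)]


def overlap_score(query, block):
    """Stand-in pre-ranking score: number of block tokens that are query terms."""
    query_terms = set(query.lower().split())
    block_counts = Counter(block.lower().split())
    return sum(count for term, count in block_counts.items()
               if term in query_terms)


def select_key_blocks(query, document, block_size=8, top_k=2):
    """Keep the top_k highest-scoring blocks and join them in document order."""
    blocks = split_into_blocks(document, block_size)
    # Rank block indices by score; ties keep their original order.
    ranked = sorted(range(len(blocks)),
                    key=lambda i: overlap_score(query, blocks[i]),
                    reverse=True)
    # Assumption: the short document preserves the original block order.
    kept = sorted(ranked[:top_k])
    return " ".join(blocks[i] for i in kept)


if __name__ == "__main__":
    doc = "bert model one two filler words go here bert bert model three"
    print(select_key_blocks("bert model", doc, block_size=4, top_k=2))
```

In a real pipeline the resulting short document would then be passed, together with the query, to a model such as BERT; the overlap score here only stands in for whatever local pre-ranking function is actually used.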

Domains

Artificial Intelligence [cs.AI]

Contributor: Anne-Christine Jacob

https://hal.univ-grenoble-alpes.fr/hal-03369577

Submitted on: Thursday, October 7, 2021, 14:09:00

Last modified on: Wednesday, December 18, 2024, 09:24:56

Dates and versions

hal-03369577, version 1 (07-10-2021)

Identifiers

HAL Id: hal-03369577, version 1
DOI: 10.1145/3404835.3463083

Cite

Minghan Li, Éric Gaussier. KeyBLD: Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval. SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul 2021, Virtual Event, Canada. pp. 2207-2211, ⟨10.1145/3404835.3463083⟩. ⟨hal-03369577⟩

Collections

UGA CNRS LIG MIAI ANR LIG_SIDCH LIG_SIDCH_APTIKAL
