KeyBLD: Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval
Conference paper, 2021

Abstract

Transformer-based models, and especially pre-trained language models like BERT, have shown great success on a variety of Natural Language Processing and Information Retrieval tasks. However, such models have difficulty processing long documents due to the quadratic complexity of the self-attention mechanism. Recent works either truncate long documents or segment them into passages that can be processed by a standard BERT model. A hierarchical architecture, such as a transformer, can then be adopted to build a document-level representation on top of the representations of the individual passages. However, these approaches either lose information or have high computational complexity (and, in the latter case, are both time- and energy-consuming). We follow a slightly different approach here: we first select the key blocks of a long document through local query-block pre-ranking, and then aggregate a few of these blocks to form a short document that can be processed by a model such as BERT. Experiments conducted on standard Information Retrieval datasets demonstrate the effectiveness of the proposed approach.
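The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the block scorer below uses plain query-term overlap as a hypothetical stand-in for the paper's local pre-ranking, and all function names and parameter values (block size, length budget) are illustrative assumptions.

```python
from collections import Counter

def split_into_blocks(tokens, block_size=64):
    """Segment a tokenized document into contiguous, fixed-size blocks."""
    return [tokens[i:i + block_size] for i in range(0, len(tokens), block_size)]

def block_score(query_tokens, block_tokens):
    """Hypothetical local pre-ranker: count query-term occurrences in the
    block (a stand-in for the paper's query-block pre-ranking)."""
    counts = Counter(block_tokens)
    return sum(counts[t] for t in set(query_tokens))

def select_key_blocks(query, document, block_size=64, max_len=512):
    """Score every block against the query, keep the top-scoring ones,
    and concatenate them into a short document that fits a BERT-style
    input length budget."""
    q_tokens = query.lower().split()
    blocks = split_into_blocks(document.lower().split(), block_size)
    ranked = sorted(range(len(blocks)),
                    key=lambda i: block_score(q_tokens, blocks[i]),
                    reverse=True)
    kept, budget = [], max_len
    for i in ranked:
        if len(blocks[i]) > budget:
            break
        kept.append(i)
        budget -= len(blocks[i])
    # Restore document order so the aggregated short document stays coherent.
    return " ".join(" ".join(blocks[i]) for i in sorted(kept))

if __name__ == "__main__":
    query = "key block selection for long documents"
    doc = (" ".join(["filler"] * 200)
           + " selecting key blocks helps long document retrieval "
           + " ".join(["padding"] * 200))
    short_doc = select_key_blocks(query, doc, block_size=16, max_len=64)
    print(short_doc)  # this short document can be fed to a BERT-based ranker
```

Because only the aggregated short document is passed to the expensive transformer, the quadratic self-attention cost applies to at most max_len tokens rather than to the full document.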
No file deposited

Dates and versions

hal-03369577, version 1 (07-10-2021)

Cite

Minghan Li, Éric Gaussier. KeyBLD: Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval. SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul 2021, Virtual Event, Canada. pp. 2207-2211, ⟨10.1145/3404835.3463083⟩. ⟨hal-03369577⟩