On Modelling Corpus Citations in Computational Lexical Resources

In this article we examine how two standards for lexical resources, TEI and OntoLex, handle corpus citations in lexicons. We focus on modelling corpus citations in retrodigitised dictionaries with each standard, since this provides a suitably challenging use case. After describing the structure of an example entry from a legacy dictionary, we outline an encoding of that entry in each standard; this article features the first extended discussion of how the Frequency, Attestation and Corpus (FrAC) module of OntoLex deals with citations. After comparing the two approaches and weighing their advantages and disadvantages, we argue for combining them. In the last part of the article we discuss different ways of doing so and state our preference for a strategy that makes use of RDFa.

Keywords

Corpus Citations; Computational Lexical Resources; Electronic Lexicographic Resources; Linked Data; Text Encoding Initiative (TEI); RDFa; OntoLex; OntoLex-FrAC

Domains

Computation and Language [cs.CL]

Main file

2189_Paper.pdf (548.3 KB)

Origin: Files produced by the author(s)

Contributor: Gilles Sérasset

https://hal.science/hal-04535091

Submitted on: Friday, 5 April 2024, 19:53:11

Last modified on: Saturday, 14 December 2024, 03:36:17

Long-term archiving on: Saturday, 6 July 2024, 20:03:01

Dates and versions

hal-04535091, version 1 (05-04-2024)

Identifiers

HAL Id: hal-04535091, version 1

Cite

Anas Fahad Khan, Maxim Ionov, Christian Chiarcos, Laurent Romary, Gilles Serasset, et al. On Modelling Corpus Citations in Computational Lexical Resources. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), ELDA; ICCL, May 2024, Turin, Italy. pp. 12385–12394. ⟨hal-04535091⟩

Collections

UGA CNRS INRIA LIG LIG_TDCGE_GETALP INRIA2 LIG_SIDCH
