State of the art of augmenting metadata techniques and technology
Abstract
We have identified the main issues and challenges in metadata augmentation techniques and technologies suitable for use on a corpus of mathematical scientific documents. For most subtasks, tools were identified that cover the basic functionality a digital library of the EuDML type is expected to need, as in other projects such as PubMed Central or Portico. Generic, standard techniques for metadata enhancement and normalization are applicable there. The deliverable also reviews and identifies expertise and tools from some project partners (MU, CMD, ICM, FIZ, IU, and IMI-BAS). The main (unresolved) challenges are OCR of mathematics and reliable, robust conversion between different math formats (TeX and MathML), which is needed to normalize content into one primary metadata format (NLM Archiving DTD Suite) and so enable services such as math indexing and search. In the follow-up deliverable D7.2 [58], tools and techniques will be chosen for use in the EuDML core engine (combining YADDA and REPOX), or as a loosely coupled set of enhancement tools in a linked data fashion.