A semantic approach to analyze scientific paper abstracts
Abstract
Each domain and its underlying communities evolve over time, and each period is centered on specific topics that emerge from the textual sources characterizing the domain. Our analysis extends earlier studies of the same corpora, which focused on evaluating co-citations between articles in order to compute their importance scores (Grauwin and Jensen [1]). Our approach offers a general perspective of the domain by performing semantic comparisons between article abstracts using natural language processing techniques such as Latent Semantic Analysis, Latent Dirichlet Allocation, or semantic distances in lexicalized ontologies, i.e., WordNet. Moreover, visual graph representations are generated with Gephi to highlight the keywords of each paper and of the domain, a document similarity view, and a table of keyword-abstract overlap scores. These views are intended to shorten the learning curve of the domain and to facilitate the research process for someone interested in a particular subject. Finally, to further support the benefits of our approach, we present potential refinements of the classification methods that could be carried out as future improvements.
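To make the idea of a semantic comparison between abstracts concrete, the following is a minimal sketch of a pairwise document similarity matrix. It is not the method described above: as a simplified stand-in for Latent Semantic Analysis or Latent Dirichlet Allocation, it uses plain TF-IDF weighting with cosine similarity, which only captures shared vocabulary. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `similarity_matrix`) and the three sample abstracts are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency (always positive).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({t: term_freq[t] * idf[t] for t in term_freq})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Pairwise cosine similarities between all documents."""
    vectors = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vectors] for a in vectors]


# Hypothetical sample abstracts, for illustration only.
abstracts = [
    "Latent semantic analysis of scientific abstracts",
    "Semantic analysis of paper abstracts with topic models",
    "Protein folding dynamics in yeast cells",
]
sim = similarity_matrix(abstracts)
```

In this sketch the first two sample abstracts share several terms and therefore score higher with each other than with the third, which shares none. A matrix of this kind could serve as the edge weights of a document similarity graph; exporting it to a tool such as Gephi is outside the scope of the sketch.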
Origin | Files produced by the author(s) |
---|