Analysis of the Complementarity of Latent and Concept Spaces for Cross-Modal Video Search
Abstract
This paper studies the complementarity of the latent and concept spaces used by state-of-the-art hybrid cross-modal video retrieval systems such as [5]. We investigate whether these spaces really convey different features or whether they represent the same information. We use Principal Component Analysis (PCA) to study their optimal dimensionality and Canonical Correlation Analysis (CCA) to assess their similarity, and we check whether the hybrid approach is in fact similar to ensemble learning. Experiments on the MSR-VTT corpus show that the two spaces are very similar, paving the way for new models that could enforce more dissimilar spaces.
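As an illustration of the kind of analysis described above, the following minimal sketch applies PCA and CCA to two embedding matrices. It is not the authors' code: the data are synthetic stand-ins for the latent and concept embeddings, the function names `pca_explained_variance_ratio` and `canonical_correlations` are hypothetical, and the 95% variance threshold is an arbitrary assumption.

```python
import numpy as np


def pca_explained_variance_ratio(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional
    # to the variances along the principal components.
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    return var / var.sum()


def canonical_correlations(X, Y):
    """Canonical correlations between two views of the same samples.

    Uses the QR-based formulation: the canonical correlations are the
    singular values of Qx^T Qy, where Qx and Qy are orthonormal bases
    of the centered data matrices.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-ins (assumption): two 16-dimensional spaces that are
# both driven by the same 8 hidden factors, plus an unrelated space.
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(size=(n, 8))
latent = shared @ rng.normal(size=(8, 16)) + 0.1 * rng.normal(size=(n, 16))
concept = shared @ rng.normal(size=(8, 16)) + 0.1 * rng.normal(size=(n, 16))
unrelated = rng.normal(size=(n, 16))

# PCA: how many dimensions are needed to keep 95% of the variance?
ratio = pca_explained_variance_ratio(latent)
dims_95 = int(np.searchsorted(np.cumsum(ratio), 0.95) + 1)

# CCA: high canonical correlations indicate that the two spaces
# encode largely the same information.
cc_similar = canonical_correlations(latent, concept)
cc_unrelated = canonical_correlations(latent, unrelated)

print("dimensions for 95% variance:", dims_95)
print("mean canonical correlation (latent vs. concept):", cc_similar.mean())
print("mean canonical correlation (latent vs. unrelated):", cc_unrelated.mean())
```

In this toy setting, canonical correlations close to 1 between the two spaces would suggest that they are largely redundant, while values close to those obtained against an unrelated space would suggest that they are complementary.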
Origin | Files produced by the author(s) |
---|---|