Conference paper

Automatic Categorization of Social Sensor Data

Abstract: Microblogging and social networking sites such as Twitter and Facebook now generate large volumes of data in everyday life. The data broadcast through microblogging can provide useful information in many situations if it is captured and analyzed properly and in a timely manner. In a Smart City context, automatically identifying messages communicated via Twitter can contribute to situational awareness about the city, and it also surfaces useful information for people who seek information about the city. This paper addresses the processing and automatic categorization of microblogging data, in particular Twitter data, using Natural Language Processing (NLP) techniques together with a Random Forest classifier. Because processing Twitter messages is a challenging task, we propose an algorithm to preprocess them automatically. To this end, we collected Twitter messages for sixteen different categories from one geo-location. We used the proposed algorithm to preprocess the Twitter messages, and a Random Forest classifier then automatically categorized these tweets into predefined categories. It is shown that the Random Forest classifier outperformed Support Vector Machine (SVM) and Naive Bayes classifiers.
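The abstract does not spell out the preprocessing algorithm, so the following is only a minimal sketch of the kind of tweet cleanup such a pipeline typically starts with, not the authors' method. The function name `preprocess_tweet`, the regular expressions, and the tiny stop-word list are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical stop-word list for illustration; not taken from the paper.
STOP_WORDS = {"a", "an", "the", "is", "are", "at", "in", "on", "to", "of", "and", "for", "rt"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def preprocess_tweet(text: str) -> List[str]:
    """Return cleaned tokens from a raw tweet (illustrative assumption only)."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)  # drop digits, punctuation, and emoji
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = preprocess_tweet("RT @cityhall: Traffic jam on Main St! #traffic https://t.co/abc123")
print(tokens)
```

Tokens produced this way could then be turned into feature vectors and passed to a classifier such as a Random Forest; that step is omitted here because the abstract gives no details about the features or parameters used.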

Contributor: Olivera Kotevska
Submitted on: Thursday, July 6, 2017 - 23:02:26
Last modified on: Wednesday, October 7, 2020 - 03:02:43
Long-term archiving on: Wednesday, January 10, 2018 - 15:22:59


Publisher files allowed on an open archive




Olivera Kotevska, Sarala Padi, Ahmed Lbath. Automatic Categorization of Social Sensor Data. EUSPN/ICTH, Sep 2016, London, United Kingdom. pp. 596-603, ⟨10.1016/j.procs.2016.09.093⟩. ⟨hal-01542606⟩


