Automatic Categorization of Social Sensor Data - Université Grenoble Alpes
Conference paper. Year: 2016

Automatic Categorization of Social Sensor Data

Abstract

Microblogging and social networking sites such as Twitter and Facebook have greatly increased the amount of data generated in everyday life. The data broadcast through microblogging can provide useful information in many situations if it is captured and analyzed properly and in a timely manner. In a Smart City context, automatically identifying messages communicated via Twitter can contribute to situation awareness about the city and surface useful information for people who seek information about it. This paper addresses the processing and automatic categorization of microblogging data, in particular Twitter data, using Natural Language Processing (NLP) techniques together with a Random Forest classifier. Because processing Twitter messages is a challenging task, we propose an algorithm to preprocess them automatically. We collected Twitter messages for sixteen different categories from one geo-location, preprocessed them with the proposed algorithm, and used a Random Forest classifier to assign the tweets automatically to the predefined categories. The Random Forest classifier outperformed Support Vector Machines (SVM) and Naive Bayes classifiers.
Main file
1-s2.0-S1877050916322384-main.pdf (511.69 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01542606, version 1 (06-07-2017)

Cite

Olivera Kotevska, Sarala Padi, Ahmed Lbath. Automatic Categorization of Social Sensor Data. EUSPN/ICTH, Sep 2016, London, United Kingdom. pp. 596-603, ⟨10.1016/j.procs.2016.09.093⟩. ⟨hal-01542606⟩