Active Learning from Unreliable Data

Abstract : Classification algorithms have been widely adopted in large recommendation systems, e.g., for products, images and advertisements, under the common assumption that the data source is clean, i.e., that features and labels are correctly set. However, data collected from the field can be unreliable due to careless annotation or malicious data transformation. In our previous work, we proposed a two-layer learning framework for continuous learning in the presence of unreliable anomaly labels, which worked well for two use cases: (i) detecting 10 classes of IoT attacks and (ii) predicting 4 classes of task failures in big data jobs. To continue this study, we now challenge our framework with image datasets. The first layer, a quality model, filters out suspicious data, while the second layer, a classification model, predicts the class of each data instance. As we focus on the case of images, we will use widely studied datasets: MNIST, CIFAR-10, CIFAR-100 and ImageNet. Deep neural networks (DNNs) have demonstrated excellent performance in solving image classification problems; we will show that two collaborating DNNs can form a more robust and more accurate model.
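
The two-layer idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the quality model decides which data is suspicious, so the rule used here (drop a sample when the quality model's prediction disagrees with the label it arrived with) is an assumption. The toy nearest-centroid model stands in for the DNNs mentioned in the abstract, and all names (`CentroidModel`, `filter_batch`, `process_batch`) are hypothetical.

```python
from collections import defaultdict


class CentroidModel:
    """Toy nearest-centroid classifier; a stand-in for a DNN (assumption)."""

    def __init__(self):
        self.sums = {}                  # per-class sum of feature vectors
        self.counts = defaultdict(int)  # per-class number of samples seen

    def partial_fit(self, xs, ys):
        # Incrementally accumulate per-class feature sums and counts.
        for x, y in zip(xs, ys):
            if y not in self.sums:
                self.sums[y] = [0.0] * len(x)
            self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
            self.counts[y] += 1

    def predict(self, x):
        # Return the class whose centroid is closest (squared Euclidean).
        def sq_dist(y):
            centroid = [s / self.counts[y] for s in self.sums[y]]
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.sums, key=sq_dist)


def filter_batch(quality_model, xs, ys):
    """Layer 1: keep samples whose given label matches the quality model.

    This agreement rule is an assumed filtering criterion.
    """
    kept = [(x, y) for x, y in zip(xs, ys) if quality_model.predict(x) == y]
    return [x for x, _ in kept], [y for _, y in kept]


def process_batch(quality_model, classifier, xs, ys):
    """Filter a batch, then update both layers on the retained samples."""
    clean_xs, clean_ys = filter_batch(quality_model, xs, ys)
    quality_model.partial_fit(clean_xs, clean_ys)
    classifier.partial_fit(clean_xs, clean_ys)  # Layer 2
    return len(xs) - len(clean_xs)              # number of dropped samples


# Small trusted batch used to initialise both layers (made-up data).
init_xs = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
init_ys = [0, 0, 1, 1]
quality_model, classifier = CentroidModel(), CentroidModel()
quality_model.partial_fit(init_xs, init_ys)
classifier.partial_fit(init_xs, init_ys)

# Incoming batch in which the last two labels have been flipped.
batch_xs = [(0.05, 0.1), (0.95, 1.0), (0.1, 0.0), (1.1, 0.9)]
batch_ys = [0, 1, 1, 0]
dropped = process_batch(quality_model, classifier, batch_xs, batch_ys)
print(dropped, classifier.predict((0.2, 0.1)), classifier.predict((0.8, 0.9)))
```

In this sketch both layers are updated only with the samples that pass the filter, so mislabeled instances never reach the classification model; whether the actual framework updates the quality model in the same way is not stated in the abstract.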
Contributor : Zilong Zhao
Submitted on : Friday, February 22, 2019 - 10:13:27 AM
Last modification on : Thursday, May 9, 2019 - 10:19:35 AM


  • HAL Id : hal-02045455, version 1


Zilong Zhao, Sophie Cerf, Robert Birke, Bogdan Robu, Sara Bouchenak, et al. Active Learning from Unreliable Data. 13th EuroSys Doctoral Workshop (EuroDW 2019), Mar 2019, Dresden, Germany. ⟨hal-02045455⟩


