Abstract:
The rapid growth of unsolicited and unwanted messages has inspired the development of many anti-spam methods. Machine-learning methods such as Naïve Bayes (NB), support vector machines (SVMs), and neural networks (NNs) have been particularly effective at categorizing messages as spam or non-spam. They automatically construct word lists and their weights, usually in a bag-of-words fashion. However, traditional multilayer perceptron (MLP) NNs often suffer from slow optimization, convergence to poor local minima, and overfitting. To overcome these problems, we use a regularized NN with rectified linear units (RANN-ReL) for spam filtering. We compare its performance on three benchmark spam datasets (Enron, SpamAssassin, and the SMS spam collection) with four machine-learning algorithms commonly used in text classification, namely NB, SVM, MLP, and k-NN. We show that the RANN-ReL outperforms the other methods in terms of classification accuracy, false-negative rate, and false-positive rate. Notably, it classifies both the majority (legitimate) and minority (spam) classes well.
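The pipeline summarized above (bag-of-words features fed into an L2-regularized neural network with ReLU units) can be sketched as follows. This is not the authors' implementation; it is a minimal approximation using scikit-learn, with a hypothetical toy corpus standing in for the benchmark datasets.

```python
# Minimal sketch of a bag-of-words + regularized ReLU network spam filter.
# Assumptions: scikit-learn's MLPClassifier approximates the RANN-ReL setup;
# the toy messages and labels below are illustrative, not real dataset samples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

# Toy corpus: 1 = spam, 0 = legitimate (ham).
messages = [
    "win a free prize now",
    "claim your free cash reward",
    "meeting rescheduled to friday",
    "please review the attached report",
]
labels = [1, 1, 0, 0]

# Bag-of-words: each message becomes a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)

# ReLU hidden units; alpha is the L2 regularization strength,
# which counteracts the overfitting the abstract mentions.
clf = MLPClassifier(hidden_layer_sizes=(16,), activation="relu",
                    alpha=1e-3, max_iter=500, random_state=0)
clf.fit(X, labels)

# Classify unseen messages with the same vectorizer vocabulary.
preds = clf.predict(vectorizer.transform(["free prize cash",
                                          "see report friday"]))
```

The L2 penalty (`alpha`) shrinks the network weights, which is one common way to regularize an MLP; the paper's exact regularization scheme may differ.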