CLASSIFICATION OF SPAM COMMENTS ON YOUTUBE USING NAÏVE BAYES, SUPPORT VECTOR MACHINE, AND K-NEAREST NEIGHBORS

Burhanudin, Yunarti Musa'adah, Yaya Wihardi

Abstract


Social media has become popular today, and sharing daily moments on social media has become a routine. People can also discuss a post in its comment field, for example in the comments on a YouTube video. However, the popularity of social media also attracts users who spread spam content in comments. This research discusses the classification of spam comments on YouTube and tests several methods: Naïve Bayes, Support Vector Machine, and K-Nearest Neighbors. The dataset contains 1,956 records, which were used to train the models. Model evaluation using cross-validation showed that the Support Vector Machine method with a linear approach achieved the highest accuracy, 91.92%. This research is expected to contribute solutions for preventing spam content in social media comment fields.
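The evaluation idea in the abstract (train a text classifier, then score it with cross-validation) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it covers only a multinomial Naive Bayes classifier with Laplace smoothing, the eight sample comments are hypothetical toy data rather than the 1,956-record dataset, and the function names `train_nb`, `predict_nb`, and `cross_val_accuracy` are assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def train_nb(docs, labels):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior under Laplace smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_val_accuracy(docs, labels, k):
    """k-fold cross-validation: every k-th record (offset by fold) is held out."""
    n = len(docs)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_docs = [docs[i] for i in range(n) if i not in test_idx]
        train_labels = [labels[i] for i in range(n) if i not in test_idx]
        model = train_nb(train_docs, train_labels)
        correct += sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
    return correct / n


# Hypothetical toy comments; NOT taken from the dataset used in the study.
docs = [
    "check out my channel and subscribe",
    "great song love it",
    "subscribe to my channel for free gifts",
    "this video is amazing",
    "free gifts click here now",
    "love this song so much",
    "click here to check my video channel",
    "amazing performance great video",
]
labels = ["spam", "ham", "spam", "ham", "spam", "ham", "spam", "ham"]

model = train_nb(docs, labels)
print(predict_nb(model, "subscribe to my channel"))
print(cross_val_accuracy(docs, labels, k=4))
```

Comparing several methods, as the study does, would amount to running the same cross-validation loop once per classifier and comparing the resulting accuracies.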

Keywords


KNN; Naive Bayes; SVM






DOI: http://dx.doi.org/10.26798/jiko.v3i2.139





Copyright (c) 2018 Burhanudin, Yunarti Musa'adah, Yaya Wihardi


JIKO (Jurnal Informatika dan Komputer)

Published by
Lembaga Penelitian dan Pengabdian Masyarakat
Universitas Teknologi Digital Indonesia (formerly STMIK AKAKOM)

Jl. Raya Janti (Majapahit) No. 143 Yogyakarta, 55198
Tel. (0274) 486664

Website : https://www.utdi.ac.id/

e-ISSN : 2477-3964 
p-ISSN : 2477-4413