Faiz Jauhari Makarim Riza and Rangga Gelar Guntara and Muhammad Rizki Nugraha (2025) IMPLEMENTASI ALGORITMA NAIVE BAYES UNTUK FILTRASI SPAM KOMENTAR PADA YOUTUBE [Implementation of the Naive Bayes Algorithm for Filtering Spam Comments on YouTube]. S1 thesis, Universitas Pendidikan Indonesia.
Abstract
Growing user interaction on YouTube has brought new problems, one of which is the spread of spam comments promoting online gambling. Such comments harm the communities that form around channels. This study builds a spam comment classification system using the Naive Bayes algorithm. Model development follows the CRISP-DM framework: comments are collected through the YouTube API, then preprocessed through Unicode normalization, case folding, tokenizing, stopword removal, filtering, and manual labelling. Words are weighted with TF-IDF to produce the feature representation. The model is evaluated with K-Fold Cross Validation and confusion matrix analysis, achieving an accuracy of 97.1%, precision of 96.4%, recall of 95.6%, and an F1-score of 96%. The trained model is then deployed as a Command Line Interface (CLI) application that channel owners can use to detect and remove spam comments directly. In testing, the system proved highly effective. The study shows that combining appropriate preprocessing with a suitable classification algorithm can yield an accurate and practical spam detection system.
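As a rough illustration of the preprocessing steps named in the abstract, the sketch below applies Unicode normalization, case folding, tokenizing, and stopword removal to a single comment, and also checks that the reported F1-score follows from the reported precision and recall. It is not the thesis's code: the function names, the tiny stopword list, the choice of NFKC as the normalization form, and the sample comment are all assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical stopword list for illustration only; the list used in the
# thesis is not given in this record.
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}


def preprocess(comment: str) -> list[str]:
    """Normalize, case-fold, tokenize, and filter one comment."""
    # NFKC (an assumed choice) maps stylized compatibility characters,
    # such as fullwidth or mathematical-bold letters, to plain letters.
    text = unicodedata.normalize("NFKC", comment)
    # Case folding: a more aggressive form of lowercasing.
    text = text.casefold()
    # Tokenize on runs of ASCII letters and digits; punctuation and
    # emoji are dropped at this step.
    tokens = re.findall(r"[a-z0-9]+", text)
    # Stopword removal.
    return [t for t in tokens if t not in STOPWORDS]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Sample comment written partly in fullwidth letters.
print(preprocess("Ｍａｉｎ di SITUS ini!!"))  # ['main', 'situs']

# Precision 96.4% and recall 95.6% give an F1-score of about 96%.
print(round(f1_score(0.964, 0.956), 2))  # 0.96
```

Normalizing before tokenizing matters for this kind of data because spam comments often use stylized Unicode letters that would otherwise bypass a plain-text vocabulary.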
S_BIDI_2107090_Title.pdf (2MB)
S_BIDI_2107090_Chapter1.pdf (354kB)
S_BIDI_2107090_Chapter2.pdf (955kB), restricted to library staff
S_BIDI_2107090_Chapter3.pdf (430kB)
S_BIDI_2107090_Chapter4.pdf (1MB), restricted to library staff
S_BIDI_2107090_Chapter5.pdf (216kB)
S_BIDI_2107090_Appendix.pdf (535kB), restricted to library staff
Item Type: | Thesis (S1) |
---|---|
Additional Information: | https://scholar.google.com/citations?view_op=new_profile&hl=id Supervisor SINTA IDs: Rangga Gelar Guntara: 6738149; Muhammad Rizki Nugraha: 6770726 |
Uncontrolled Keywords: | YouTube, Spam Comments, Naive Bayes, TF-IDF, Classification, CLI |
Subjects: | Q Science > QA Mathematics > QA76 Computer software |
Divisions: | UPI Kampus Tasikmalaya > S1 Bisnis Digital |
Depositing User: | Faiz Jauhari Makarim Riza |
Date Deposited: | 08 Sep 2025 04:10 |
Last Modified: | 08 Sep 2025 04:10 |
URI: | http://repository.upi.edu/id/eprint/135898 |