eprintid: 138264
rev_number: 22
eprint_status: archive
userid: 218077
dir: disk0/00/13/82/64
datestamp: 2025-09-09 07:53:10
lastmod: 2025-09-09 07:53:10
status_changed: 2025-09-09 07:53:10
type: thesis
metadata_visibility: show
creators_name: Rafharum Fatimah, -
creators_name: Mahmudah Salwa Gianti, -
creators_name: Muhammad Rizalul Wahid, -
creators_nim: NIM2104428
creators_nim: NIDN0008049601
creators_nim: NIDN0001049402
creators_id: rafharumf@upi.edu
creators_id: msg.salwa@upi.edu
creators_id: rizalulwahid@upi.edu
contributors_type: http://www.loc.gov/loc.terms/relators/THS
contributors_type: http://www.loc.gov/loc.terms/relators/THS
contributors_name: Mahmudah Salwa Gianti, -
contributors_name: Muhammad Rizalul Wahid, -
contributors_nidn: NIDN0008049601
contributors_nidn: NIDN0001049402
contributors_id: msg.salwa@upi.edu
contributors_id: rizalulwahid@upi.edu
title: ANALISIS KOMPARATIF KINERJA MODEL NEURAL NETWORK RINGAN (CNN, LSTM, TRANSFORMER) UNTUK DETEKSI DEEPFAKE AUDIO PADA DATASET WAVEFAKE
ispublished: pub
subjects: T1
divisions: MKB_S1_PWT
full_text_status: restricted
keywords: CNN, Deepfake, LSTM, SVM, Transformer
note: https://scholar.google.com/citations?hl=id&user=6O8dAqkAAAAJ Supervisor SINTA IDs: Mahmudah Salwa Gianti: 6779018; Muhammad Rizalul Wahid: 6780434
abstract: The emergence of deepfake audio, produced by text-to-speech-based voice manipulation, poses a serious threat to digital security. This study therefore compares the performance of three lightweight deep learning models, namely Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Transformer, in detecting deepfake audio on the WaveFake dataset. The dataset was preprocessed and three main features were extracted: MFCC, mel-spectrogram, and spectrogram. These features were then used as input to each model with uniform training parameters. Evaluation results show that the CNN achieved the highest accuracy at 92.8%, followed by the LSTM at 87.8% and the Transformer at 84.9%. The CNN excels due to its ability to extract local patterns in audio data, while the Transformer still requires further optimization. The LSTM handles complex spectral dimensions reasonably well but is not as effective as the CNN. The study concludes that the CNN is the most effective architecture for detecting deepfake audio on the WaveFake dataset and has the potential to be applied in efficient digital security systems.
date: 2025-08-28
date_type: published
institution: Universitas Pendidikan Indonesia
department: KODEPRODI21204#Mekatronika dan Kecerdasan Buatan Kampus Purwakarta_S1
thesis_type: other
thesis_name: other
official_url: https://repository.upi.edu/
related_url_url: https://perpustakaan.upi.edu/
related_url_type: org
citation: Rafharum Fatimah, - and Mahmudah Salwa Gianti, - and Muhammad Rizalul Wahid, - (2025) ANALISIS KOMPARATIF KINERJA MODEL NEURAL NETWORK RINGAN (CNN, LSTM, TRANSFORMER) UNTUK DETEKSI DEEPFAKE AUDIO PADA DATASET WAVEFAKE. S1 thesis, Universitas Pendidikan Indonesia.
document_url: http://repository.upi.edu/138264/1/S_MKB_2104428_Title.pdf
document_url: http://repository.upi.edu/138264/2/S_MKB_2104428_Chapter1.pdf
document_url: http://repository.upi.edu/138264/3/S_MKB_2104428_Chapter2.pdf
document_url: http://repository.upi.edu/138264/4/S_MKB_2104428_Chapter3.pdf
document_url: http://repository.upi.edu/138264/5/S_MKB_2104428_Chapter4.pdf
document_url: http://repository.upi.edu/138264/6/S_MKB_2104428_Chapter5.pdf
document_url: http://repository.upi.edu/138264/7/S_MKB_2104428_Appendix.pdf