COMPARISON OF FEATURE REPRESENTATIONS FOR CATEGORIZATION AND PRIORITY PREDICTION OF REPORTS IN BUG TRACKING SYSTEMS

Daffa Almer Fauzan (2023) PERBANDINGAN REPRESENTASI FITUR PADA KATEGORISASI DAN PREDIKSI PRIORITAS LAPORAN DALAM BUG TRACKING SYSTEM. S1 thesis, Universitas Pendidikan Indonesia.

Official URL: http://repository.upi.edu/

Abstract

A Bug Tracking System (BTS) is software used during the software maintenance stage to keep the history of, and track, reports concerning change requests, fixes for defects and failures, and technical support throughout the software development life cycle. The category and priority of a report in a BTS can be assigned automatically by a machine learning model. The algorithm used in this research is Logistic Regression. The objective of the research is to identify an appropriate feature representation, taking into account both the contextual text features and the characteristics of the source dataset, for dealing with the class imbalance problem. Class imbalance arises when the number of samples per class label, whether for category or priority, is uneven, which reduces the model's ability to predict class labels that have relatively few samples. The feature representations compared are TF-IDF, TF-IDF with SMOTE and its variants (ADASYN and BorderlineSMOTE), TF-IDF with Word2Vec (CBOW and skip-gram), and TF-CHI with Word2Vec (CBOW and skip-gram). The results show that a model represented with TF-CHI and Word2Vec (CBOW) can increase precision by up to 51%, recall by up to 29%, F-score by up to 35%, and accuracy by up to 21%. However, TF-IDF with SMOTE and its variants can serve as an alternative when an anomaly occurs in TF-CHI, namely the formation of a cluster that contains most or all of the class labels.
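To make one of the compared representations concrete, the following is a minimal sketch of TF-IDF weighting followed by SMOTE-style oversampling of a minority class, written in plain Python. It is not the thesis's implementation: the function names `tfidf` and `smote_like`, the toy reports, and their labels are all invented for illustration, and the oversampling step is a simplified form of SMOTE (interpolation between a minority sample and one of its nearest minority neighbours). The Logistic Regression classifier itself is omitted.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) of TF-IDF weights for non-empty tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        # Term frequency is the count normalized by document length.
        rows.append([counts[t] / len(doc) * idf[t] for t in vocab])
    return vocab, rows


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours.

    Simplified SMOTE-style step; requires at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest minority neighbours of x (Euclidean distance), excluding x.
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: math.dist(x, m))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Hypothetical, hand-written reports: four "bug" and two "feature" (imbalanced).
reports = [
    ("app crashes on login", "bug"),
    ("crash when saving file", "bug"),
    ("error message on startup", "bug"),
    ("screen freezes after update", "bug"),
    ("add dark mode option", "feature"),
    ("please add export to csv", "feature"),
]
docs = [text.split() for text, _ in reports]
labels = [label for _, label in reports]

vocab, X = tfidf(docs)
minority = [x for x, y in zip(X, labels) if y == "feature"]
n_needed = labels.count("bug") - len(minority)

new_rows = smote_like(minority, n_needed, k=1)
X_balanced = X + new_rows
y_balanced = labels + ["feature"] * len(new_rows)
print(Counter(y_balanced))
```

In practice, oversampling of this kind is applied only to the training split, so that synthetic samples do not leak into the evaluation data used to compute precision, recall, F-score, and accuracy.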

Item Type: Thesis (S1)
Additional Information: https://scholar.google.com/citations?user=JNODF3cAAAAJ&hl=id SINTA ID: 6681751 SINTA ID: 6682222
Uncontrolled Keywords: Bug Tracking System; Class Imbalance; TF-IDF; TF-CHI; Word2Vec; Logistic Regression
Subjects: L Education > L Education (General)
Q Science > QA Mathematics > QA76 Computer software
T Technology > T Technology (General)
Divisions: UPI Kampus Cibiru > S1 Rekayasa Perangkat Lunak
Depositing User: Daffa Almer Fauzan
Date Deposited: 10 May 2023 07:19
Last Modified: 10 May 2023 07:41
URI: http://repository.upi.edu/id/eprint/87469
