OPTIMIZATION OF THE DISTILBERT MODEL USING THE TREE-STRUCTURED PARZEN ESTIMATOR FOR SOFTWARE REQUIREMENTS CLASSIFICATION

    Afwa Afini, Raditya Muhammad, and Indira Syawanodya (2025) OPTIMASI MODEL DISTILBERT BERBASIS TREE-STRUCTURED PARZEN ESTIMATOR PADA KLASIFIKASI PERSYARATAN PERANGKAT LUNAK [Optimization of the DistilBERT Model Using the Tree-structured Parzen Estimator for Software Requirements Classification]. S1 thesis, Universitas Pendidikan Indonesia.

    Abstract

    Manual classification of software requirements in large projects is time-consuming and error-prone, which can reduce system quality. Transformer models such as DistilBERT offer a solution, but their performance depends heavily on hyperparameter configuration. Generic default settings are not always suitable for domain-specific data such as software requirements, which are complex and imbalanced. A hyperparameter optimization strategy is therefore needed, understood in this study as a systematic effort to find a configuration better suited to the task than the default baseline. This study evaluates the effect of Tree-structured Parzen Estimator (TPE)-based hyperparameter optimization on DistilBERT for binary classification (functional vs. non-functional) and multiclass classification (11 non-functional subcategories) using the PROMISE_exp dataset. The optimization explores combinations of learning rate, batch size, number of epochs, and weight decay. In binary classification, accuracy increased from 0.82 to 0.88 and macro F1-score from 0.81 to 0.88. In the imbalanced multiclass task, accuracy increased from 0.65 to 0.72 and macro F1-score from 0.37 to 0.55. Per-class analysis also showed that TPE improved detection of minority classes without reducing performance on majority classes. The study thus contributes to more consistent, fair, and reliable classification of software requirements.
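    The TPE idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, and the abstract does not say which library or search space was used. The toy objective `toy_loss`, the quantile `gamma`, the kernel `bandwidth`, and every other name below are assumptions chosen for illustration. The sketch tunes a single value (log10 of a learning rate): it splits past trials into a "good" and a "bad" group, fits a Parzen (Gaussian kernel) density to each, and picks the candidate with the highest ratio of good density to bad density.

```python
import math
import random


def toy_loss(log_lr):
    # Hypothetical stand-in for a validation loss; its minimum is at log10(lr) = -4.5.
    return (log_lr + 4.5) ** 2


def kde(x, points, bandwidth):
    # Parzen (Gaussian kernel) density estimate at x from the observed points.
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def tpe_search(loss_fn, low, high, n_trials=40, n_startup=10, gamma=0.25,
               n_candidates=24, bandwidth=0.3, seed=0):
    if n_startup < 2:
        raise ValueError("n_startup must be at least 2 so both groups are non-empty")
    rng = random.Random(seed)
    trials = []  # list of (x, loss) pairs in evaluation order
    for i in range(n_trials):
        if i < n_startup:
            # Warm-up phase: plain random search.
            x = rng.uniform(low, high)
        else:
            # Split past trials into the best gamma fraction and the rest.
            ranked = sorted(trials, key=lambda t: t[1])
            n_good = max(1, int(gamma * len(ranked)))
            good = [t[0] for t in ranked[:n_good]]
            bad = [t[0] for t in ranked[n_good:]]
            # Draw candidates from the "good" density, clipped to the search range.
            candidates = [
                min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                for _ in range(n_candidates)
            ]
            # Keep the candidate that maximizes l(x) / g(x).
            x = max(
                candidates,
                key=lambda c: kde(c, good, bandwidth) / (kde(c, bad, bandwidth) + 1e-12),
            )
        trials.append((x, loss_fn(x)))
    return trials


trials = tpe_search(toy_loss, low=-6.0, high=-3.0)
best_x, best_loss = min(trials, key=lambda t: t[1])
print(f"best log10(lr) = {best_x:.3f}, loss = {best_loss:.5f}")
```

    In a real tuning run, `toy_loss` would be replaced by a function that fine-tunes DistilBERT with the sampled learning rate, batch size, number of epochs, and weight decay and returns a validation metric such as macro F1-score. A multi-dimensional search would normally rely on an established optimization library rather than this hand-written loop.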

    Files: S_RPL_2101968_Title.pdf (454kB)
    S_RPL_2101968_Chapter1.pdf (51kB)
    S_RPL_2101968_Chapter2.pdf (411kB, restricted to library staff)
    S_RPL_2101968_Chapter3.pdf (225kB)
    S_RPL_2101968_Chapter4.pdf (545kB, restricted to library staff)
    S_RPL_2101968_Chapter5.pdf (108kB)
    S_RPL_2101968_Appendix.pdf (131kB, restricted to library staff)
    Official URL: https://repository.upi.edu/
    Item Type: Thesis (S1)
    Additional Information: https://scholar.google.com/citations?hl=en&user=iwQNgUIAAAAJ; supervisor SINTA IDs: Raditya Muhammad: 6682222, Indira Syawanodya: 6681751
    Uncontrolled Keywords: DistilBERT, Hyperparameter Optimization, Software Requirements, Tree-structured Parzen Estimator
    Subjects: L Education > L Education (General)
    Q Science > QA Mathematics > QA75 Electronic computers. Computer science
    Q Science > QA Mathematics > QA76 Computer software
    T Technology > T Technology (General)
    Divisions: UPI Kampus Cibiru > S1 Rekayasa Perangkat Lunak
    Depositing User: Afwa Afini
    Date Deposited: 06 Oct 2025 09:25
    Last Modified: 06 Oct 2025 09:25
    URI: http://repository.upi.edu/id/eprint/136877
