TY  - THES
KW  - Automatic Short Answer Scoring
KW  - Cross-Prompt
KW  - Direct Scoring
KW  - Outlier
KW  - Similarity-Based Scoring
KW  - Specific-Prompt
Y1  - 2025/07/23/
N2  - In the era of digital education, the need for automated scoring of short text answers is steadily increasing. Automatic Short Answer Scoring (ASAS) aims to automate this assessment process efficiently and consistently. Two commonly used approaches in ASAS are direct scoring and similarity-based scoring. Although both approaches are widely used, previous research has mostly relied on metrics such as RMSE and Pearson correlation to assess model performance. This study provides a more in-depth analysis by comparing the two approaches in two evaluation scenarios, specific-prompt and cross-prompt, evaluating both the accuracy and the stability of the models. The dataset used is the Rahutomo dataset. The results show that direct scoring outperforms similarity-based scoring. In the specific-prompt scenario, an RMSE of 0.0817 and a Pearson correlation of 0.9504 were obtained, while in the cross-prompt scenario, the RMSE was 0.0917 and the Pearson correlation was 0.9286. By examining the distribution of residuals and outliers in addition to evaluation metrics, this study offers a more complete picture of model stability.
UR  - https://repository.upi.edu/
AV  - restricted
ID  - repoupi135104
A1  - Bayu Wicaksono
A1  - Rasim
A1  - Yaya Wihardi
PB  - Universitas Pendidikan Indonesia
N1  - https://scholar.google.com/citations?user=T3C8yrsAAAAJ&hl=en Supervisor SINTA IDs: Rasim: 5990962; Yaya Wihardi: 5994413
TI  - PERBANDINGAN PENDEKATAN DIRECT SCORING DAN SIMILARITY-BASED SCORING DALAM SISTEM PENILAIAN JAWABAN SINGKAT OTOMATIS [A Comparison of Direct Scoring and Similarity-Based Scoring Approaches in Automatic Short Answer Scoring Systems]
M1  - other
ER  -