WORD EMBEDDING-BASED TEXT SIMILARITY CALCULATION USING WORD MOVER'S DISTANCE FOR AUTOMATIC COMMENT GRADING IN AN ONLINE JUDGE SYSTEM

Muhammad Nabillah Fihira Rischa (2021) PERHITUNGAN TEXT SIMILARITY BERBASIS WORD EMBEDDING DENGAN WORD MOVER DISTANCE UNTUK PENILAIAN KOMENTAR OTOMATIS DALAM SISTEM ONLINE JUDGE. S1 thesis, Universitas Pendidikan Indonesia.

This is the latest version of this item.

Official URL: http://repository.upi.edu

Abstract

Source code comments are a form of inline documentation that programmers write to help others understand what a program does. The Department of Computer Science Education at Universitas Pendidikan Indonesia (UPI) operates an online judge system called the Computer Science Programming Contest (CSPC). Lecturers of the Algorithms and Programming course use it to check the correctness of the output of programs written by enrolled students. Source code comments, however, are still graded manually by the lecturers and their teaching assistants, which is inefficient and prone to bias and inconsistency among graders. We therefore propose a method for automatically grading source code comments in an online judge system using corpus-based text similarity. Word2vec, GloVe, and fastText models are trained on the Indonesian Wikipedia dump to produce word vectors, and similarity is measured with Word Mover's Distance (WMD). Experiments vary the number of training epochs. The models are compared on Spearman's rho correlation coefficient, mean absolute error (MAE), and performance measurements.
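
The distance and evaluation measures named in the abstract can be illustrated with a small, standard-library-only sketch. It is not the thesis's implementation. The two-dimensional vectors are invented for illustration, the function names are hypothetical, and the distance shown is the relaxed lower bound on WMD (each word moves all of its weight to its nearest counterpart) rather than the exact optimal-transport solution, which needs a linear-programming solver.

```python
import math
from collections import Counter

# Hypothetical 2-D embeddings, invented for illustration only; a trained
# Word2vec, GloVe, or fastText model would supply much larger vectors.
TOY_VECTORS = {
    "read": (1.0, 0.0),
    "load": (0.9, 0.1),
    "input": (0.0, 1.0),
    "data": (0.1, 0.9),
    "print": (-1.0, 0.0),
}


def nbow(tokens):
    """Normalized bag of words: each distinct word's share of the document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Lower bound on WMD: each word sends all its weight to its nearest word."""
    def one_way(source, target):
        return sum(
            weight * min(math.dist(vectors[w], vectors[t]) for t in target)
            for w, weight in source.items()
        )

    a, b = nbow(tokens_a), nbow(tokens_b)
    # The larger of the two one-directional bounds is the tighter lower bound.
    return max(one_way(a, b), one_way(b, a))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks; undefined if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and reference scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    reference = ["read", "input"]
    print(relaxed_wmd(reference, ["load", "data"], TOY_VECTORS))
    print(relaxed_wmd(reference, ["print"], TOY_VECTORS))
```

With these toy vectors, a comment that paraphrases the reference ("load data") lands closer to it than an unrelated one ("print"). How a distance is converted into a grade is not stated in the abstract, so the sketch stops at the distance itself.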

Item Type: Thesis (S1)
Uncontrolled Keywords: text similarity, word embedding, word mover’s distance, automatic scoring
Subjects: L Education > L Education (General)
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Fakultas Pendidikan Matematika dan Ilmu Pengetahuan Alam > Program Studi Ilmu Komputer
Depositing User: Muhammad Nabillah Fihira Rischa
Date Deposited: 18 Feb 2021 01:55
Last Modified: 18 Feb 2021 01:55
URI: http://repository.upi.edu/id/eprint/59325
