AUTOMATIC QUESTION GENERATION UNTUK SOAL VOCABULARY PADA READING COMPREHENSION TOEFL MENGGUNAKAN ALGORITMA LEARNING VECTOR QUANTIZATION

Riviawati Putri Giovani (2021) AUTOMATIC QUESTION GENERATION UNTUK SOAL VOCABULARY PADA READING COMPREHENSION TOEFL MENGGUNAKAN ALGORITMA LEARNING VECTOR QUANTIZATION. S1 thesis, Universitas Pendidikan Indonesia.

Official URL: http://repository.upi.edu

Abstract

One way to assess English proficiency is to take the TOEFL (Test of English as a Foreign Language). According to the Educational Testing Service website, TOEFL scores are used as an admission requirement by more than 11,000 universities and academic institutions in 150 countries. Because actual TOEFL questions are not freely distributed to test takers, candidates prepare by relying on the limited number of questions found in TOEFL preparation books. This study therefore aims to expand the available learning resources by producing a collection of practice questions. It focuses on generating vocabulary questions for the reading comprehension section of the TOEFL; vocabulary is widely recognized as one of the key components of second-language proficiency. Questions are generated from the text of news articles in English-language media such as The Guardian, New York Times, and the BBC, while historical TOEFL vocabulary questions serve as training data. The research proceeds in stages: data preprocessing, feature extraction, selecting target words with the Learning Vector Quantization algorithm, applying heuristics to choose answer candidates and distractors, and finally post-processing. In expert assessment, the generated questions scored 89% for answer existence, 55% for difficulty index, and 61% for distractor quality. An evaluation of the answer choices showed that 70.5% of them have a synonym relationship with the target word.
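The target-word selection step named in the abstract can be illustrated with a minimal sketch of Learning Vector Quantization. The abstract does not say which LVQ variant, features, or hyperparameters the thesis uses, so everything below is an assumption: the basic LVQ1 update rule, a hypothetical two-value feature vector per word, the labels "target" and "other", and the toy data itself.

```python
import math
import random

# Hypothetical toy data: each word is described by two made-up features
# (for example a normalised rarity score and a normalised length).
# The thesis's real features are not given in the abstract.
TOY_FEATURES = [
    [0.90, 0.80], [0.80, 0.90], [0.85, 0.70],   # words used as question targets
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.30],   # words not used as targets
]
TOY_LABELS = ["target", "target", "target", "other", "other", "other"]


def nearest_prototype(prototypes, x):
    """Return the index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i][0], x))


def train_lvq1(samples, labels, n_epochs=30, lr=0.3, seed=0):
    """Train one prototype per class with the LVQ1 rule (an assumed variant)."""
    rng = random.Random(seed)

    # Initialise each class prototype from the first sample of that class.
    prototypes = []
    seen = set()
    for x, y in zip(samples, labels):
        if y not in seen:
            prototypes.append((list(x), y))
            seen.add(y)

    order = list(range(len(samples)))
    for epoch in range(n_epochs):
        alpha = lr * (1 - epoch / n_epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, w_label = prototypes[nearest_prototype(prototypes, x)]
            # Pull the winning prototype toward the sample when the labels
            # match, push it away when they differ.
            sign = 1.0 if w_label == y else -1.0
            for d in range(len(w)):
                w[d] += sign * alpha * (x[d] - w[d])
    return prototypes


def predict(prototypes, x):
    """Classify x with the label of its nearest prototype."""
    return prototypes[nearest_prototype(prototypes, x)][1]


if __name__ == "__main__":
    protos = train_lvq1(TOY_FEATURES, TOY_LABELS)
    print(predict(protos, [0.95, 0.75]))
    print(predict(protos, [0.05, 0.10]))
```

In a pipeline like the one described, words classified as "target" would then be passed to the heuristics that pick the correct answer and the distractors; that step is not sketched here because the abstract gives no detail on those heuristics.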

Item Type: Thesis (S1)
Uncontrolled Keywords: Automatic Question Generation, Natural Language Processing, Machine Learning, Learning Vector Quantization.
Subjects: L Education > L Education (General)
L Education > LC Special aspects of education
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Fakultas Pendidikan Matematika dan Ilmu Pengetahuan Alam > Program Studi Ilmu Komputer
Depositing User: Riviawati Putri Giovani
Date Deposited: 01 Sep 2021 03:38
Last Modified: 01 Sep 2021 03:38
URI: http://repository.upi.edu/id/eprint/64915
