IMPLEMENTATION OF MORPHOLOGICAL ANALYSIS FOR HANDLING OUT-OF-VOCABULARY WORDS IN AN INDONESIAN PART-OF-SPEECH TAGGER USING A HIDDEN MARKOV MODEL

Febriyana Ramdhanti (2019) IMPLEMENTASI ANALISIS MORFOLOGI DALAM MENANGANI OUT-OF-VOCABULARY WORDS PADA PART-OF-SPEECH TAGGER BAHASA INDONESIA MENGGUNAKAN HIDDEN MARKOV MODEL. S1 thesis, Universitas Pendidikan Indonesia.

Files:
S_KOM_1404095_Titke.pdf (199kB)
S_KOM_1404095_Abstract.pdf (272kB)
S_KOM_1404095_Table_of_content.pdf (306kB)
S_KOM_1404095_Chapter1.pdf (309kB)
S_KOM_1404095_Chapter2.pdf (751kB), restricted to library staff (Staf Perpustakaan)
S_KOM_1404095_Chapter3.pdf (365kB)
S_KOM_1404095_Chapter4.pdf (1MB), restricted to library staff (Staf Perpustakaan)
S_KOM_1404095_Chapter5.pdf (185kB)
S_KOM_1404095_Bibliography.pdf (398kB)
S_KOM_1404095_Appendix.pdf (730kB), restricted to library staff (Staf Perpustakaan)
Official URL: http://repository.upi.edu

Abstract

Part-of-speech (PoS) tagging is a natural language processing (NLP) task that assigns a word category (part of speech) to each word in an input sentence. The hidden Markov model (HMM) is a probabilistic PoS tagging algorithm, so it depends heavily on the training corpus. The limited coverage of the training corpus and the breadth of the Indonesian vocabulary give rise to the problem of out-of-vocabulary (OOV) words. This research applies morphological analysis to address that problem. Two systems were built, an HMM PoS tagger with morphological analysis (AM, from the Indonesian "Analisis Morfologi") and an HMM PoS tagger without AM, both using the same training and test corpora. The test corpus consists of 6,676 tokens (740 input sentences) with an OOV rate of 30%. The HMM-only system achieved an accuracy of 97.54%, while the HMM system with morphological analysis achieved a highest accuracy of 99.14%.
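
The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy corpus, the tag names (PRP, VB, NN), the affix rules in AFFIX_RULES, and the add-one smoothing are all assumptions made for illustration. The sketch trains a bigram HMM, decodes with Viterbi, and for an OOV word either allows every tag (HMM only) or restricts the candidates to tags suggested by the word's affixes (HMM with a crude morphological analysis).

```python
import math
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical; not the thesis's train corpus).
TRAIN = [
    [("saya", "PRP"), ("makan", "VB"), ("nasi", "NN")],
    [("dia", "PRP"), ("membaca", "VB"), ("buku", "NN")],
    [("saya", "PRP"), ("membeli", "VB"), ("makanan", "NN")],
]

# Hypothetical affix rules: (kind, affix, tags the affix suggests).
AFFIX_RULES = [
    ("prefix", "me", {"VB"}),
    ("prefix", "ber", {"VB"}),
    ("prefix", "di", {"VB"}),
    ("suffix", "an", {"NN"}),
]

def train(corpus):
    """Count tag transitions and word emissions from a tagged corpus."""
    trans, emit, tag_count = defaultdict(Counter), defaultdict(Counter), Counter()
    for sent in corpus:
        prev = "<s>"
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            tag_count[tag] += 1
            prev = tag
    vocab = {w for sent in corpus for w, _ in sent}
    return trans, emit, tag_count, vocab

def candidate_tags(word, vocab, tags):
    """None for known words; otherwise the tags suggested by affixes,
    falling back to all tags when no rule matches."""
    if word in vocab:
        return None
    cands = set()
    for kind, affix, rule_tags in AFFIX_RULES:
        if len(word) > len(affix) + 2 and (
            (kind == "prefix" and word.startswith(affix))
            or (kind == "suffix" and word.endswith(affix))
        ):
            cands |= rule_tags
    return cands or set(tags)

def viterbi(words, model, use_morphology=True):
    """Return the most probable tag sequence for a non-empty word list."""
    trans, emit, tag_count, vocab = model
    tags = sorted(tag_count)

    def log_trans(prev, tag):
        # Add-one smoothed transition probability (an assumption of this sketch).
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + 1) / (total + len(tags)))

    def log_emit(tag, word):
        cands = candidate_tags(word, vocab, tags)
        if cands is None:  # known word: relative-frequency emission
            count = emit[tag][word]
            return math.log(count / tag_count[tag]) if count else float("-inf")
        # OOV word: a constant score for each allowed tag (a simplification,
        # not a normalized probability). Morphology narrows the allowed tags.
        if use_morphology and tag not in cands:
            return float("-inf")
        return 0.0

    # best[i][tag] = (best log score ending in tag at position i, previous tag)
    best = [{t: (log_trans("<s>", t) + log_emit(t, words[0]), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            e = log_emit(t, word)
            column[t] = max((best[-1][p][0] + log_trans(p, t) + e, p) for p in tags)
        best.append(column)

    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

MODEL = train(TRAIN)
print(viterbi(["menulis", "surat"], MODEL, use_morphology=False))
print(viterbi(["menulis", "surat"], MODEL, use_morphology=True))
```

In this toy setting both "menulis" and "surat" are OOV. Without the affix rules, the tagger relies only on transition statistics and starts the sentence with the most common initial tag; with them, the "me" prefix steers "menulis" toward VB. The thesis's actual morphological analyzer, tagset, and corpora are described in its chapters and may differ substantially from this sketch.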

Item Type: Thesis (S1)
Additional Information: Call number: S KOM FEB i-2019; Supervisors: I. Yudi Wibisono, II. Rosa Ariani Sukanto; Student ID (NIM): 1404095
Uncontrolled Keywords: Indonesian language, natural language processing, part-of-speech tagging, hidden Markov model, morphological analysis, out-of-vocabulary
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: Fakultas Pendidikan Matematika dan Ilmu Pengetahuan Alam > Program Studi Ilmu Komputer
Depositing User: Yayu Wulandari
Date Deposited: 15 May 2020 02:47
Last Modified: 15 May 2020 02:47
URI: http://repository.upi.edu/id/eprint/48708
