CONVOLUSIONAL RECURRENT NEURAL NETWORK DAN CONNECTIONIST TEMPORAL CLASSIFICATION DALAM PENGENALAN AKSARA SUNDA KUNO

    Rizky Sanjaya Tandia (2025) CONVOLUSIONAL RECURRENT NEURAL NETWORK DAN CONNECTIONIST TEMPORAL CLASSIFICATION DALAM PENGENALAN AKSARA SUNDA KUNO. S1 thesis, Universitas Pendidikan Indonesia.

    Abstract

    The Sundanese script is part of the cultural heritage and identity of the Sundanese people and needs to be preserved. Historical documents written in the script are steadily degrading, which makes digitizing them increasingly urgent. Many studies on Sundanese and other non-Latin scripts have combined Convolutional Neural Networks (CNN) with Recurrent Neural Networks (RNN) and reported good accuracy. However, most of them focus on modern scripts and character-level transliteration, whereas the characteristics of the ancient Sundanese script pose more complex challenges. In studies of non-Latin script recognition, combining Connectionist Temporal Classification (CTC) with the CRNN architecture has also been shown to improve word recognition accuracy over a plain CRNN. To address these gaps, this study proposes a combined CRNN-CTC model for word-level recognition of ancient Sundanese script and evaluates its effectiveness. Evaluation used Character Error Rate (CER), Word Error Rate (WER), and overall accuracy. The best modified model achieved a CER of 22.87%, a WER of 64.49%, and an accuracy of 71.03%. After hyperparameter tuning with label smoothing, the CER fell to 18.32% and the WER to 49.31%, while accuracy rose to 77.14%. On the test set the model reached an accuracy of 75.1%, and on image samples on HVS paper it reached 81%.
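The abstract reports CER and WER without stating how they were computed. The sketch below shows the common definition, in which each metric is the Levenshtein edit distance between the reference and the prediction, divided by the reference length (in characters for CER, in whitespace-separated words for WER). The function names `edit_distance`, `cer`, and `wer` are illustrative assumptions and are not taken from the thesis, which may use a different implementation or normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under this definition a single wrong character makes the whole word count as an error, so WER is typically much higher than CER on the same predictions, which matches the pattern in the figures reported above.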

    S_RPL_2004324_Title.pdf (455kB)
    S_RPL_2004324_Chapter1.pdf (116kB)
    S_RPL_2004324_Chapter2.pdf (310kB, restricted to library staff)
    S_RPL_2004324_Chapter3.pdf (205kB)
    S_RPL_2004324_Chapter4.pdf (455kB, restricted to library staff)
    S_RPL_2004324_Chapter5.pdf (34kB)
    S_RPL_2004324_Appendix.pdf (190kB, restricted to library staff)
    Official URL: https://repository.upi.edu/
    Item Type: Thesis (S1)
    Additional Information: Supervisor SINTA IDs: Indira Syawanodya: 6681751; Raditya Muhammad: 6682222
    Uncontrolled Keywords: Aksara Sunda, Sundanese Manuscript, Convolutional Recurrent Neural Network (CNN), Recurrent Neural Network (RNN), Connectionist Temporal Classification (CTC), CER, WER.
    Subjects: L Education > L Education (General)
    Q Science > QA Mathematics > QA76 Computer software
    Divisions: UPI Kampus Cibiru > S1 Rekayasa Perangkat Lunak
    Depositing User: Rizky Sanjaya Tandia
    Date Deposited: 05 Mar 2025 04:23
    Last Modified: 05 Mar 2025 04:23
    URI: http://repository.upi.edu/id/eprint/130171
