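IMPLEMENTASI VIDEO CAPTIONING MENGGUNAKAN OBJECT RELATIONAL GRAPH DENGAN PENDEKATAN NON-AUTOREGRESSIVE [Implementation of Video Captioning Using an Object Relational Graph with a Non-Autoregressive Approach]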

Muhammad Ilham Malik (2023) IMPLEMENTASI VIDEO CAPTIONING MENGGUNAKAN OBJECT RELATIONAL GRAPH DENGAN PENDEKATAN NON-AUTOREGRESSIVE. S1 thesis, Universitas Pendidikan Indonesia.

This is the latest version of this item.

  • S_KOM_1902563_Title.pdf (456kB)
  • S_KOM_1902563_Chapter1.pdf (80kB)
  • S_KOM_1902563_Chapter2.pdf (2MB), restricted to library staff
  • S_KOM_1902563_Chapter3.pdf (247kB)
  • S_KOM_1902563_Chapter4.pdf (4MB), restricted to library staff
  • S_KOM_1902563_Chapter5.pdf (46kB)

Official URL: http://repository.upi.edu

Abstract

A video captioning model should generate detailed captions that explain the content of a video while keeping inference time low. Existing methods, however, have limitations in both aspects. In this paper, we propose a video captioning model, Object Relational Graph with Non-Autoregressive Coarse-to-Fine (ORG-NACF), to address both. The ORG module obtains detailed object information and learns the relationships between objects. The NACF module, together with sequential cross-attention, reduces inference time while maintaining caption quality during caption generation. Experimental evaluation on the MSR-VTT benchmark dataset shows that the ORG-NACF model is competitive with the state-of-the-art model, exceeds it on several metrics, and offers faster inference: it runs 7 times faster than the baseline model. These results show that the ORG-NACF model can generate descriptive and detailed captions with lower inference time than existing methods.
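
The abstract does not spell out how the ORG module computes object relationships. As a rough illustration only, the sketch below shows one common way to relate detected objects: build a soft adjacency matrix from pairwise feature similarity, then let each object aggregate features from the others. The function names, the dot-product similarity, and the toy feature values are all assumptions made for this example, and the learned projection weights that a trained model would use are omitted. It should not be read as the thesis's actual implementation.

```python
import math

def row_softmax(scores):
    """Normalize each row of a score matrix so that it sums to 1."""
    result = []
    for row in scores:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

def relational_graph_step(features):
    """One aggregation step over a fully connected object graph.

    features: list of N object feature vectors (each a list of D floats).
    Returns (adjacency, enhanced): an N x N soft adjacency matrix and the
    N x D relation-aware features. Learned weights are omitted (assumption).
    """
    # Pairwise similarity between every two objects (plain dot product).
    scores = [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
              for fi in features]
    adjacency = row_softmax(scores)
    count, dim = len(features), len(features[0])
    # Each object's new feature is a weighted mix of all object features.
    enhanced = [[sum(adjacency[i][j] * features[j][d] for j in range(count))
                 for d in range(dim)]
                for i in range(count)]
    return adjacency, enhanced

# Three hypothetical detected objects with 2-dimensional features.
objects = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adjacency, enhanced = relational_graph_step(objects)
```

In this toy input the first two objects have similar features, so the first object assigns more weight to the second than to the third. A trained model would normally apply learned linear projections before the similarity and after the aggregation.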

Item Type: Thesis (S1)
Additional Information: Google Scholar link: https://scholar.google.co.id/citations?user=R20SJKYAAAAJ&hl=en SINTA IDs of thesis supervisors: Yaya Wihardi: 5994413; Yudi Wibisono: 260167
Uncontrolled Keywords: Video Captioning, Transformer, Object Detection, Graph Convolutional Network, Convolutional Neural Network
Subjects: L Education > L Education (General)
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Fakultas Pendidikan Matematika dan Ilmu Pengetahuan Alam > Program Studi Ilmu Komputer
Depositing User: Muhammad Ilham Malik
Date Deposited: 05 Sep 2023 18:42
Last Modified: 05 Sep 2023 18:42
URI: http://repository.upi.edu/id/eprint/102713

Available Versions of this Item

  • IMPLEMENTASI VIDEO CAPTIONING MENGGUNAKAN OBJECT RELATIONAL GRAPH DENGAN PENDEKATAN NON-AUTOREGRESSIVE. (deposited 05 Sep 2023 18:42) [Currently Displayed]
