Search Results

Found 2 documents matching the query

Ikhlas Purwanto

Studi pengukuran kemiripan rantai DNA virus H5N1 berbasis himpunan fuzzy = A Study of Fuzzy-Set-Based Similarity Measurement of H5N1 Virus DNA Sequences

"This thesis analyzes methods for measuring the similarity and distance between DNA sequences based on the theory of fuzzy genomes and the fuzzy polynucleotide space. The analysis is carried out by implementing the methods in a simple application written in Java, which returns similarity and distance values for two DNA sequences. The sample data are DNA sequences of influenza viruses whose genomes have already been mapped and whose subtypes are known. The main comparator is the DNA of the influenza virus of subtype H5N1. DNA sequences of other viruses, also available from NCBI (National Center for Biotechnology Information), are included as samples. The results show that viruses of the same type, namely influenza viruses, have higher similarity values and smaller distances. The measure cannot distinguish between subtypes of influenza virus. However, comparisons between influenza viruses and other kinds of viruses give relatively low similarity values and large distances. This makes it possible to distinguish influenza viruses from other kinds of viruses. The methods can therefore be used as one feature for classifying certain kinds of viruses."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2009

S-Pdf

UI - Skripsi Open Universitas Indonesia Library

Gibran Brahmanta Patriajati

Analisis Performa Pendekatan Topic Modeling dan Similarity Measure untuk Text Summarization secara Ekstraktif pada Teks Berbahasa Indonesia = Performance Analysis of Topic Modeling and Similarity Measure Approach for Extractive Text Summarization in Indonesian Text

"Extractive text summarization can improve the quality of the user experience in an information retrieval system. Research on extractive text summarization is language-specific. For English there are several studies, one of which is Belwal et al. (2021), who introduced an extractive text summarization method based on Topic Modeling and a Semantic Measure using WordNet. For Indonesian there are also several studies on extractive text summarization, but none has used the method introduced by Belwal et al. (2021). To apply that method to Indonesian, the Semantic Measure using WordNet must be replaced with a Similarity Measure using a Vector Space Model, because no publicly usable Indonesian WordNet model is available. Several methods can be used for each of the three components involved: Topic Modeling, the Vector Space Model, and the Similarity Measure. This study uses a hill-climbing approach to search for the combination of methods for these three components that maximizes the performance of the text summarization method of Belwal et al. (2021) on Indonesian. Evaluation uses the ROUGE-N metric, reported as F1 scores, on two datasets, Liputan6 and IndoSUM. The study found that the best-performing combination is Non-Negative Matrix Factorization for Topic Modeling, Word2Vec for the Vector Space Model, and Euclidean Distance for the Similarity Measure. On the Liputan6 dataset, this combination achieves a ROUGE-1 of 0.291, a ROUGE-2 of 0.140, and a ROUGE-3 of 0.079. On the IndoSUM dataset, it achieves a ROUGE-1 of 0.455, a ROUGE-2 of 0.337, and a ROUGE-3 of 0.300. This performance is quite competitive with that of other methods, such as TextRank and methods based on the BERT deep learning model, provided the input document is coherent."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2022

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
