
Search Results

Found 4 documents matching the query
Kenneth Jonathan
"The growing number of regulations raises several problems, chief among them that collecting and evaluating regulations takes increasingly long. A system that automates this work is therefore needed, and Information Retrieval is one such approach. This research aims to improve the effectiveness of an Information Retrieval system through a feature-based re-ranker that draws on several types of features: simple quantitative attributes, text matching scores, and document embeddings. Jaccard similarity scores, BM25 relevance values, and LemurTF_IDF relevance values were found to consistently improve re-ranking effectiveness in the legal domain. Features built on BERT and T5 embeddings were also beneficial, but contributed less than simple computed features such as Jaccard similarity. Using all features as input to the LambdaMART re-ranker improved all system metrics by about 4.17%, a statistically significant gain; the highest value of the main metric, recall@3, was achieved by DLH13 (Reranker) at 0.6632, an increase of 5.64%. However, when the experiment was repeated using only the three features named above, the improvement was 3.739%, which was not statistically significant."
Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2024
S-pdf
UI - Skripsi Membership  Universitas Indonesia Library
Febi Imanuela
"Technological development in Indonesia's healthcare sector has introduced doctor consultation services delivered through consumer health question-and-answer forums. Over time, duplicate questions have become a problem on these forums. The problem needs to be addressed in order to return answers to similar complaints faster and to keep the number of questions scalable with the capacity of the responding doctors. Detecting duplicate questions is challenging, however, because of the complexity of natural language. This study uses an Information Retrieval approach, treating a pair of duplicate questions in this domain as a query and a relevant document. After an initial ranking with BM25 as the baseline model, ranking performance is improved through re-ranking with the feature-based learning-to-rank model LambdaMART. The study uses features that compute distance and similarity between the vector representations of the query and the document, obtained from word embedding and transformer models. It also proposes scoring features derived from a Cross Encoder model and from the BM25 baseline model, along with features that count how many of the query's main-idea keywords also appear in the document. The experiments are evaluated using cross validation and error analysis, with Mean Reciprocal Rank (MRR) as the primary metric. The highest performance achieved is an MRR of 0.951 with a p-value of 0.016, a statistically significant improvement over the baseline. The study thus provides empirical support for the effectiveness of the proposed re-ranking model in automatically identifying queries and their relevant documents, which in this context are duplicate question pairs."
Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2024
S-pdf
UI - Skripsi Membership  Universitas Indonesia Library
Stojadinović, Slavenko M.
"This book introduces a new generation of metrological systems and their application in a digital quality concept. It discusses the development of an optimal collision-free measuring path based on CAD geometry, tolerances defined in a knowledge base, and AI techniques such as engineering ontology, ant colony optimization (ACO), and genetic algorithms (GA). This new approach, which combines geometric and metrological features, offers the following benefits: reduced preparation time through automatic generation of a measuring protocol; a mathematical model for the distribution of measuring points and collision avoidance; optimization of the measuring probe path; and analysis of part placement based on accessibility analysis and automatic configuration of measuring probes. The system is particularly useful for inspecting complex prismatic parts with a large number of tolerances, in all types of production. The implementation is demonstrated through several case studies relating to high-tech industries and advanced, non-conventional processes."
Switzerland: Springer Nature, 2019
e20506256
eBooks  Universitas Indonesia Library