Search Results

Found 2 documents matching the query
Hapnes Toba
Abstract:
A question answering system (QAS) is a computer system designed to find the most appropriate answer to a question posed in natural language. Research on QAS has been carried out since the early 1960s and has progressed rapidly since QAS evaluation forums were established in the 1990s. The fields of computer science that have contributed most to the development of QAS include information retrieval, natural language processing, and artificial intelligence. This doctoral research specifically explores the answer validation component. It aims to produce new methods that improve the relevance of text snippets and to find strategies for answer extraction that combine statistical and symbolic approaches. Two proposals are made to achieve this aim. The first is the use of an answer quality model, developed from a community-based QAS, as a tool for reranking text snippets. The second is the construction of an answer model by learning the least generalized answer bearing phrase (ABP-LG) as a means of predicting which part of a sentence is most likely to contain the answer. The ABP-LG model uses sentence structure information from the question and the text snippets as indicators of the probability that a part of a sentence contains the answer. Experiments on various data collections show that combining the ABP-LG model with a pattern-based system significantly improves answer extraction for both factoid and complex ('other'-type) questions. The advantage of the ABP-LG model over QAS based on named entities or dictionaries is its ability to learn indications of the 'way of answering' and its portability to different question domains, particularly for question types that can cover any context, such as the 'other' type. The weakness of the ABP-LG model observed during the experiments is its dependence on text quality. This last problem is partially addressed by the snippet reranking model, which filters the candidate sentences from information retrieval that are considered to contain the answer.

The task of a question answering system (QAS) is to find a final answer given a natural language question. Since its introduction in the 1960s, QAS has always been at the forefront of technological advances. Along with advances in information retrieval, computational linguistics, and artificial intelligence, research on QAS has broadened to unstructured textual documents in open domains. Evaluation forums for QAS have steered its development toward established, large-scale research methodologies and evaluations. This doctoral research investigates various techniques in the answer validation component. The main objective is to develop new methods for snippet reranking and answer extraction by combining statistical and symbolic (semantic) approaches. Two novel techniques are proposed. The first is a snippet reranking model developed from the characteristics of question-answer pairs in a community-based QAS. This answer quality model forms the basic ingredient of the snippet reranking process. The second is the least generalized answer bearing phrase (ABP-LG) model, which predicts the location of the final answer to a given question within a number of good-quality snippets obtained after reranking. The ABP-LG model employs syntactic tree information from question-answer (snippet) pairs as indicators to predict how likely each part of a snippet is to bear the answer. The experimental results show that the ABP-LG model, combined with the pattern-based approach, contributes considerably to answer extraction for factoid and complex ('other'-type) questions. The main advantage of the ABP-LG model over common approaches based on named-entity recognizers or dictionaries is its ability to predict the 'way of answering' for both factoid and complex question types. Based on the analysis of the experimental results, the main weakness of the ABP-LG model is its high dependency on good-quality snippets, which has been partially addressed by the snippet reranking model.
2014
D1990
UI - Dissertation Membership  Universitas Indonesia Library
Agus Widodo
Abstract:
At present, the selection of research areas still depends largely on expert opinion. Although experts have deep knowledge of their fields, they cannot be aware of every emerging research area, given how quickly sources of information on science and technology develop. At the same time, analyzing large amounts of data manually takes a long time and can be subjective. Several previous studies have used quantitative techniques that compute trends from the number of keywords for a research topic and forecast those trends. For forecasting trends from time series data, machine learning approaches are now widely studied alongside the statistical approaches that were previously common. Meanwhile, ensemble approaches that combine prediction results, prediction techniques, or data representations are believed to improve prediction accuracy. Multiple Kernel Learning (MKL) is an ensemble technique that combines kernels and uses a machine learning technique, namely the Support Vector Machine (SVM), as a classifier or predictor. In previous studies, MKL has been used to combine features, commonly called data integration, in the field of image processing, but still with a single kernel. In this study, MKL is used to combine time series features in the form of sliding windows and is applied to multiple kernels. In addition, this study proposes using historical data in place of the training dataset to select the prediction model that suits the characteristics of a time series, because each prediction model has strengths and limitations in predicting the many kinds of time series data.
Currently, the determination of research areas is still largely dependent on the opinion of experts. Although experts have in-depth knowledge of their fields, not all emerging research areas can be known to them, given the rapid development of information sources on science and technology. However, analyzing large amounts of data manually would take a long time, and the result could be subjective. Several previous studies have used quantitative techniques to calculate trends based on the number of keywords for research topics and to forecast their future trends. For trend forecasting of time series data, machine learning approaches are now studied extensively in addition to the statistical approaches that were commonly used before. Meanwhile, an ensemble approach that combines prediction results, prediction techniques, or data representations can increase prediction accuracy. Multiple Kernel Learning (MKL) is one such ensemble method; it optimizes the combination of kernels using a machine learning technique, such as the Support Vector Machine (SVM), as a classifier or predictor. In previous studies, MKL has been used to combine features, commonly referred to as the data integration approach, in the field of image processing, but it was still implemented on a single kernel. In this study, MKL is used to combine the features of time series data in the form of sliding windows and is tested on multiple kernels. In addition, this study proposes the use of historical data as a substitute for the training dataset to select the prediction technique based on the characteristics of the time series, since time series data are so diverse that no single prediction technique suits all types.
Depok: Universitas Indonesia, 2014
D1972
UI - Dissertation Membership  Universitas Indonesia Library
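
The abstract by Agus Widodo mentions two ingredients that a small example may make more concrete: turning a time series into sliding-window features, and combining several kernels over those features. The sketch below is only an illustration under stated assumptions and is not the dissertation's method. The keyword counts are made up, the function names are hypothetical, and the kernel weights are fixed by hand, whereas Multiple Kernel Learning would learn them together with an SVM.

```python
import math


def sliding_windows(series, width):
    """Split a series into (window, next_value) pairs for forecasting."""
    inputs, targets = [], []
    for start in range(len(series) - width):
        inputs.append(series[start:start + width])
        targets.append(series[start + width])
    return inputs, targets


def linear_kernel(a, b):
    """Dot product of two equally long windows."""
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two windows."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


def combined_kernel(a, b, weights, gamma=0.1):
    """Weighted sum of a linear and an RBF kernel.

    ASSUMPTION: the weights are fixed by hand for illustration;
    MKL proper would optimize them during training.
    """
    w_linear, w_rbf = weights
    return w_linear * linear_kernel(a, b) + w_rbf * rbf_kernel(a, b, gamma)


def gram_matrix(inputs, weights):
    """Pairwise combined-kernel values, as a kernel method would consume."""
    return [[combined_kernel(a, b, weights) for b in inputs] for a in inputs]


# Made-up yearly keyword counts for one research topic.
keyword_counts = [3, 5, 4, 6, 8, 7, 9]
windows, targets = sliding_windows(keyword_counts, width=3)
gram = gram_matrix(windows, weights=(0.5, 0.5))
```

A Gram matrix like `gram` could then be passed to any kernel-based predictor that accepts precomputed kernels; which predictor and which kernel families the dissertation used is not stated in the abstract.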