Search Results

Found 176397 documents matching the query

Nina Sevani

Penggunaan Bobot Dan Jarak Dalam Feature-Transfer Learning Untuk Klasifikasi Gambar = The Utilisation Of Weights And Distance In Feature-Transfer Learning For Image Classification

"Transfer learning extends traditional machine learning to cross-domain settings. Cross-domains are domains with different feature spaces or different marginal and conditional distributions, which makes them difficult to handle with traditional machine learning methods. Such differences arise in many real-world computer vision and pattern recognition cases, such as identifying victims of natural disasters in photos taken from above by a drone or helicopter. The differences in feature space and data distribution are caused by differing angles, lighting, and equipment. These conditions make image classification harder, especially in domains with limited labels. Transfer learning has been shown to perform well in many cross-domain cases, including those that use image datasets.
In transfer learning, it is important to measure the similarity between domains to avoid negative transfer. This study applies feature-representation-transfer and uses Maximum Mean Discrepancy (MMD) to measure the distance between features across the domains and reduce the domain discrepancy. After the similarity between domains is measured, features are selected based on the distance between them: a feature is selected if its distance is less than a specified threshold. Weights are then assigned to the selected features. In addition to selecting features based on domain similarity, the method also selects features that are significant between class labels and within class labels using ANOVA (Analysis of Variance). Only significant features are used for prediction.
The proposed method also applies an inter-cluster class label to reduce the difference in conditional distribution. The inter-cluster class label works by calculating the minimum distance from each instance in the target domain to the center of each class-label cluster, using Euclidean distance. Statistical properties such as mean and variance are used to describe the local data structure in each domain. The mean is used to determine the threshold and the centers of the class-label clusters, while the variance is used to select significant features. Labels are predicted from the selected, weighted features and the shortest distance from each instance to one of the class labels.
The proposed learning function has no additional parameters. In addition, labels are determined without iteration, which allows the method to run with limited resources. The experimental results show that the proposed method achieves a performance of 46.6% with SVM as the classifier and 51.7% with logistic regression. The accuracy obtained with SVM is on par with previous feature-representation-transfer methods, while the accuracy obtained with logistic regression outperforms them. These results indicate that feature selection based on statistical properties, combined with weights on the selected features and a minimum distance, can provide good accuracy without requiring large resources."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

D-pdf

UI - Disertasi Membership Universitas Indonesia Library
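
The nearest-class-center step described in the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the function names, the toy data, the weight values, and the choice to apply the weights inside the squared Euclidean distance are not taken from the dissertation, and the MMD and ANOVA feature-selection steps are omitted.

```python
import math
from collections import defaultdict


def class_centers(X, y):
    """Return the mean vector (cluster center) of the instances in each class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict_nearest_center(x, centers, weights):
    """Return the label whose center is closest to x.

    Distance is Euclidean with one weight per feature (an assumed scheme).
    """
    def dist(a, b):
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

    return min(centers, key=lambda label: dist(x, centers[label]))


# Hypothetical labeled source-domain data with two already-selected features.
source_X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.1]]
source_y = ["a", "b", "b", "b"]
source_y[1] = "a"  # classes: two "a" instances, two "b" instances
weights = [1.0, 0.5]  # hypothetical feature weights

centers = class_centers(source_X, source_y)
label = predict_nearest_center([0.9, 0.8], centers, weights)
print(label)
```

Because each prediction is a single pass over the class centers, there is no iterative optimization, which is in line with the abstract's claim that labels are determined without iteration.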

Muhammad Faisal Adi Soesatyo

Eksplorasi Keefektifan Cross-Lingual Transfer Learning untuk Constituency Parsing Bahasa Indonesia = Exploring the Efficacy of Cross-Lingual Transfer Learning for Indonesian Constituency Parsing

"The transfer learning approach has been used in a variety of problems, especially for low-resource languages, to improve model performance. This research investigates whether cross-lingual transfer learning can improve the performance of an Indonesian constituency parsing model. Constituency parsing analyzes a sentence by breaking it down into its constituents. Two kinds of labels are attached to these constituents: POS tags and syntactic tags. The parser model used in this study is an encoder-decoder model, the Berkeley Neural Parser. Eleven languages are used as source languages: English, German, French, Arabic, Hebrew, Polish, Swedish, Basque, Chinese, Korean, and Hungarian. Two Indonesian datasets in Penn Treebank format are used, Kethu and ICON. The study designs three experiment scenarios: learning from scratch (LS), zero-shot transfer learning (ZS), and transfer learning with fine-tuning (FT). On Kethu, the F1 score increases from 82.75 (LS) to 84.53 (FT), a relative gain of 2.15%. On ICON, the F1 score decreases from 88.57 (LS) to 84.93 (FT), a relative drop of 4.11%. The two datasets agree in one respect: on both, languages from the Semitic family score higher than the other language families."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
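
The percentages quoted in the abstract above are consistent with relative changes in F1 rather than absolute differences in points. A short check, using only the four scores given in the abstract (the function name is illustrative):

```python
def relative_change_pct(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# F1 scores quoted in the abstract: learning from scratch (LS) vs fine-tuning (FT).
kethu = relative_change_pct(82.75, 84.53)
icon = relative_change_pct(88.57, 84.93)

print(f"Kethu: {kethu:+.2f}%")
print(f"ICON:  {icon:+.2f}%")
```

In absolute terms the changes are +1.78 and -3.64 F1 points respectively.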

Mehler, Alexander, editor

Modeling, learning, and processing of text-technological data structures

"The book focuses on theoretical foundations of representing natural language texts as well as on concrete operations of automatic text processing. The present volume includes contributions to a wide range of topics in the context of processing of textual data. This relates to the learning of ontologies from natural language texts, the annotation and automatic parsing of texts as well as the detection and tracking of topics in texts and hypertexts."

Berlin: Springer, 2011

e20418145

eBooks Universitas Indonesia Library

Modell, Martin E.

Data analysis, data modeling, and classification

New York: McGraw-Hill, 1992

005.73 MOD d

Buku Teks SO Universitas Indonesia Library

Khalid Muhammad

Machine learning dengan data yang terenkripsi secara homomorfis = Machine learning with homomorphically encrypted data

"ABSTRACT

Machine learning can be used to analyze many kinds of data, including confidential data such as medical or financial data. A trained machine learning model can be wrapped in a web application so that people can access it easily via the internet. If the data to be analyzed is private or confidential, however, this poses a problem: the application administrator may be able to read the input. A homomorphic encryption scheme can be used to overcome this problem. The Paillier encryption scheme is one scheme with a homomorphic property. This research shows that one type of machine learning model can take input encrypted with the Paillier encryption scheme and produce output encrypted with the same key. The concept is demonstrated by training a machine learning model on the MNIST database of handwritten digits and then testing it on test data encrypted with the Paillier encryption scheme. The experiment shows that the model achieved an accuracy of 92.92."

2018

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library
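
To make the homomorphic property mentioned in the abstract above concrete, here is a toy sketch of textbook Paillier encryption and an encrypted dot product. It is not the thesis's implementation: the tiny primes are insecure and chosen only for readability, the linear scoring function is an assumed example of a model that needs only additions and multiplications by plaintext constants, and the weights are assumed to be non-negative integers.

```python
import math
import random


def keygen(p, q):
    """Textbook Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this form is valid because g = n + 1
    return (n, n + 1), (lam, mu)


def encrypt(pub, m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def encrypted_dot(pub, weights, enc_x):
    """Compute Enc(sum(w_i * x_i)) from plaintext weights and encrypted inputs.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to the
    power w multiplies its plaintext by w.
    """
    n, _ = pub
    n2 = n * n
    acc = 1  # 1 is a trivial encryption of 0
    for w, c in zip(weights, enc_x):
        acc = (acc * pow(c, w, n2)) % n2
    return acc


pub, priv = keygen(17, 19)  # toy primes, insecure; n = 323
x = [5, 7, 4]               # client's private input (hypothetical)
weights = [2, 3, 1]         # server's plaintext model weights (hypothetical)

enc_x = [encrypt(pub, v) for v in x]
enc_score = encrypted_dot(pub, weights, enc_x)  # the server never sees x
print(decrypt(pub, priv, enc_score))
```

The server works only with ciphertexts and the public key, and the result stays encrypted under the same key until the client decrypts it. Results are correct only while the true score stays below n.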

Strategi penyusunan kamus referensi dan analisis kinerja metode pelacakan kata pada pemeriksa ejaan bahasa indonesia

Fakultas Ilmu Komputer Universitas Indonesia, 1995

S26902

UI - Skripsi Membership Universitas Indonesia Library

Dwi Putri Ningsih

Peran Gaya Belajar dan Persepsi Pemelajar Bahasa Jerman Terhadap Strategi Metakognitif Dalam Konteks Pembelajaran Kemahiran Menyimak = The Role of Learning Styles and German Language Learners Perception Towards Metacognitive Strategies in The Context of Learning The Listening Skill

"This thesis examines the role of learning styles and learners' perceptions of metacognitive strategy instruction in improving German language learners' listening skills. The study used a mixed-methods (quantitative and qualitative) approach with a pre-experimental design. German language learners from two different classes were assigned to an experimental class and a control class. The control class received conventional listening instruction, while the experimental class received instruction using the metacognitive pedagogical cycle.
The results showed a significant correlation between learners' learning styles and their perceptions of metacognitive strategies. Post-test scores in the experimental class increased over the pre-test scores, while post-test scores in the control class did not. These results suggest that metacognitive strategy instruction can help learners raise their listening scores. However, a small-sample t-test showed no significant difference between the post-test results of the experimental and control classes."

Depok: Fakultas Ilmu Pengetahuan Budaya Universitas Indonesia, 2019

T52961

UI - Tesis Membership Universitas Indonesia Library

Ary Pramudito

Femap sebagai alternatif pemodelan dan komputasi metode elemen hingga dengan UI-FEAP

"ABSTRACT

UI-FEAP is a finite element method program used for numerical computation in a variety of structural analysis problems. Because UI-FEAP lacks good visualization facilities, a bridge was developed to an existing commercial visualization program, SDRC FEMAP. This bridge takes the form of a transformation program that is easy to extend and readjust as UI-FEAP's capabilities evolve. This ease is supported by the neutral file format available in FEMAP. The remaining problem is how to build a transformation program that is compatible with, and can adapt to, the capabilities of both programs."

2001

S34804

UI - Skripsi Membership Universitas Indonesia Library

Arif Rahman Hakim

Pengembangan Kerangka Kerja Forensik Digital dan Metode Baru berbasis Machine Learning untuk Investigasi Insiden Kebocoran Data = Development of a Digital Forensics Framework and New Machine Learning-based Methods for Investigating Data Leak Incidents

"One of the primary challenges in investigating data breach incidents is the lack of a specific framework tailored to the characteristics of such incidents, with clear steps that yield comprehensive investigative results. Another challenge is the analysis of large volumes of logs, which is time-consuming and prone to human error when performed manually. Machine learning (ML) approaches offer a potential solution; however, their performance is often suboptimal because of dataset imbalance. This study proposes a novel digital forensic framework named KARAFFE (Kalamullah Ramli–Arif Rahman Hakim–Forensic Framework for Exfiltration), designed specifically for the characteristics of data breach incidents. The stages and components of KARAFFE are structured to answer the investigative questions What, When, Who, Where, Why, and How (5WH) about the incident under investigation. Based on the established comparative characteristics, KARAFFE meets six key indicators, outperforming other existing frameworks. Furthermore, case study analysis demonstrates that KARAFFE enables a complete incident investigation, providing complete 5WH answers for the tested incidents. This study also introduces ARKAIV (Arif Rahman Hakim-Kalamullah Ramli-Advanced Investigation), an ML-based method capable of predicting exfiltration from event logs mapped to adversarial tactics. To support these predictions, the dataset was modified into sequences of tactics with exfiltration as the target, and a resampling scheme was designed to address dataset imbalance. SMOTEENN achieved the best performance, surpassing four other resampling techniques by raising the geometric mean from 0 on the initial dataset to 0.99 on the resampled dataset.
Furthermore, the ML model in ARKAIV was selected for optimal performance by applying five feature selection techniques, five ML classifiers, and two model validation methods. The ML-ARKAIV results indicate that Random Forest outperformed the four other classifiers (XGBoost, Logistic Regression, Naive Bayes, and Support Vector Machine), with mean accuracy of 99.1% (5-fold), 99.8% (10-fold), 99.7% (5-fold with 5 repetitions), and 99.74% (10-fold with 10 repetitions). Additionally, the case study analysis demonstrated that ARKAIV accurately predicted two exfiltration incidents and one non-exfiltration incident. These findings underscore ARKAIV's consistent performance and effectiveness in predicting exfiltration across various scenarios."

Depok: Fakultas Teknik Universitas Indonesia, 2025

D-pdf

UI - Disertasi Membership Universitas Indonesia Library
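
The geometric-mean figure in the abstract above is easier to read with a small worked example. The sketch below assumes the common binary definition, the square root of sensitivity times specificity. The abstract does not state which definition was used, and the labels here are hypothetical, not the dissertation's data.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced labels: 98 negatives and 2 positives.
y_true = [0] * 98 + [1] * 2
always_negative = [0] * 100  # a model that never predicts the minority class

accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
score = g_mean(y_true, always_negative)
print(accuracy, score)
```

A classifier that ignores the minority class can score high accuracy while its geometric mean is 0, which is one way a geometric mean of 0 can arise on an imbalanced dataset before resampling.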

Faul, A.C.

A concise introduction to machine learning

"The emphasis of the book is on the question of why: only if it is understood why an algorithm is successful can it be properly applied and its results trusted. Algorithms are often taught side by side without showing the similarities and differences between them. This book addresses the commonalities, and aims to give a thorough and in-depth treatment and develop intuition, while remaining concise."

London: CRC Press, 2020

e20528988

eBooks Universitas Indonesia Library
