Search Results

Found 152984 documents matching the query

Junanto Prihantoro

Analisis Performa Teknik Klasifikasi Data NILM Menggunakan Metode Naive Bayes dan K-NN = Performance Analisys Of Classification Technique For NILM Data Using Naive Bayes And K-NN Method

Household electricity contributes significantly to national energy consumption. Non-Intrusive Load Monitoring (NILM) is a technique used to determine the electrical energy use of each household appliance; it monitors and identifies the power of each appliance. Recently, several data classification methods, such as neural networks and deep learning, have been applied to develop NILM. In this paper, the naive Bayes method is used for NILM to classify the on-off state of electrical appliances. To improve accuracy, the data are preprocessed by normalisation and discretisation, and performance is compared for each method. The REDD dataset is used. The supervised learning methods used are Naive Bayes and K-Nearest Neighbor. The simulation results show that both methods can recognise NILM data with high accuracy. Naive Bayes with discretisation obtained the highest accuracy, 96.64%, followed by KNN with k = 5 at 96.1287%.

2019

T53159

UI - Tesis Membership Universitas Indonesia Library

Annisa Kamalia

Klasifikasi data talasemia menggunakan K-nearest neighbor dan naive bayes = Classification of data thalassemia using K-nearest neighbor and naive bayes

ABSTRACT

Thalassemia is a disease caused by abnormalities in hemoglobin. It is hereditary: the carriers of the thalassemia gene are the parents of the patient. In Indonesia, the number of thalassemia cases reached 7,029 in 2015. Thalassemia cannot yet be cured, but carrier status can be identified by screening. This final project compares the performance of two methods for classifying thalassemia data, K-Nearest Neighbor and Naive Bayes. The data comprise 82 records of thalassemia patients and 68 records of non-thalassemia patients from Harapan Kita Children and Women's Hospital (Rumah Sakit Anak dan Bunda Harapan Kita), West Jakarta. The results show that Naive Bayes achieves higher accuracy than K-Nearest Neighbor in classifying thalassemia. The average accuracy of Naive Bayes is 99.775% with an average running time of 0.0554 seconds, while the average accuracy of K-Nearest Neighbor is 97.142% with an average running time of 0.081 seconds. For specificity, both methods perform equally: K-Nearest Neighbor reaches 100% at K = 3, and Naive Bayes also reaches 100%. The highest average sensitivity is given by Naive Bayes at 99.59%, while K-Nearest Neighbor reaches 96.25% at K = 1.
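
The accuracy, sensitivity, and specificity figures reported above are standard confusion-matrix ratios. The following is a minimal sketch of how they are computed; the function name `binary_metrics` and the counts passed to it are hypothetical toy values, not results from the thesis.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases predicted correctly / missed
    tn/fp: negative cases predicted correctly / flagged wrongly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,     # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),     # share of positive cases detected
        "specificity": tn / (tn + fp),     # share of negative cases cleared
    }


# Toy counts for illustration only (assumed, not taken from the study).
metrics = binary_metrics(tp=8, fn=2, tn=9, fp=1)
print(metrics)
```

A specificity of 100%, as reported for both methods, corresponds to `fp == 0`: no non-thalassemia record is classified as thalassemia.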

2019

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Fitra Hidiyanto

Evaluasi Kinerja Metode K-NN dengan Beragam Metode Kalkulasi Jarak dalam Teknik Non-Intrusive Load Monitoring (NILM) = Performance Evaluation of K-NN Methode with Various Distance Calculation Methods in Non-Intrusive Load Monitoring (NILM) Technique

Non-Intrusive Load Monitoring (NILM) makes it possible to detect whether appliances are active or inactive, and even the characteristics of each appliance installed in homes, industrial sites, laboratories, and so on, by disaggregating the total electrical consumption measured at the central power panel. The application of NILM to energy efficiency, energy management, and appliance diagnostics in households, industry, and energy providers has shown promising improvement. K-NN is one of the simplest and most commonly used machine learning methods for classification, with good performance that competes with even more complex methods. K-NN has three characteristics that can be changed and optimized to improve accuracy: the data, the distance algorithm, and the value of k. In this paper, K-nearest neighbor (KNN) is applied to the AMPds2 NILM data, in which different appliances have similar load characteristics, using 9 different distance algorithms, 7 training-data proportions (10%-70%), and k values from 1 to 25, with active power alone as input and with active and reactive power as input, to find the best result. In addition, a Backpropagation Neural Network (BPNN) is run with training-data proportions of 25%, 50%, 75%, and 100%; hidden counts of 10, 20, and 30; 50000 and 150000 iterations; active and reactive power as input; and two input methods, static and dynamic. Finally, the performance of KNN and backpropagation in disaggregating the AMPds2 NILM data is compared.
The tests show that adding reactive power as an input improves the accuracy of KNN disaggregation on NILM data with similar load characteristics across different appliances by more than 20%, up to 95% accuracy, with precision and recall reaching 0.9565. In the comparison between KNN and backpropagation, both with active and reactive power as input, for separating the AMPds2 NILM data into clusters, KNN is competitive at 95% accuracy, while backpropagation with dynamic input reaches 99.85%.
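
The abstract varies two things in K-NN: the distance function and the input features (active power alone, or active plus reactive power). The sketch below illustrates the idea in plain Python with three common distance functions; it is not the thesis's implementation, the abstract does not list which 9 distances were used, and the appliance names and readings are hypothetical toy values.

```python
from collections import Counter


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices from nearest to farthest under the chosen distance.
    order = sorted(range(len(train_X)), key=lambda i: distance(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (active power, reactive power) readings: both appliances draw
# about the same active power and differ mainly in reactive power.
train_X = [(100.0, 5.0), (105.0, 6.0), (98.0, 4.0),
           (100.0, 60.0), (103.0, 62.0), (99.0, 58.0)]
train_y = ["heater", "heater", "heater", "motor", "motor", "motor"]
query = (101.0, 59.0)

for dist in (euclidean, manhattan, chebyshev):
    print(dist.__name__, knn_predict(train_X, train_y, query, k=3, distance=dist))
```

In this toy data the two appliances overlap in active power, so the reactive-power column is what separates them, which is the kind of effect the abstract attributes to adding reactive power as an input.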

Depok: Fakultas Teknik Universitas Indonesia, 2021

T-Pdf

UI - Tesis Membership Universitas Indonesia Library

Rifqi Wazirsyah

Klasifikasi Performa Mahasiswa Berdasarkan Data Teks Forum Diskusi Online Menggunakan Multinomial Naive Bayes dengan Vektorisasi Pembobotan TF-IDF = Classification of Student Performance based on Text Data of Online Discussion Forums using Multinomial Naive Bayes with TF-IDF Weighting Vectorization

E-Learning Management System (EMAS) is an application created by Universitas Indonesia with various features, one of which is an online discussion forum. In the forum, students can write text posts to hold discussions. These text posts play an important role in improving student performance, particularly with respect to graduation. In this final project, Multinomial Naive Bayes (MNB) is used to classify student performance based on text posts in the online discussion forum. Before classification, the posts are preprocessed and the words in the text are weighted using TF-IDF. The TF-IDF results are expressed as vectors; this process is called vectorization. The TF-IDF vectorized data comprise 228 documents, of which 219 belong to students who graduated and 9 to students who did not. The data are dominated by graduating students and are therefore imbalanced, so SMOTE is needed to balance them. The MNB model is then implemented for three splits of training and testing data, 70%:30%, 80%:20%, and 90%:10%, by training the model on the training data and testing it on the testing data. The implementation was repeated five times. The MNB model classifies student performance well, and the best results are obtained with 30% testing data: an average accuracy of 0.956, an average recall of 0.979, and an average f1-score of 0.977. The best average precision, however, is obtained with 20% testing data, at 0.977.
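
To make the vectorization step concrete, the sketch below computes TF-IDF vectors in plain Python. It assumes one common weighting, term frequency normalised by document length multiplied by log(N / document frequency); the abstract does not state which TF-IDF variant the thesis used, and the function name `tfidf_vectors` and the sample posts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([
            (counts[term] / len(doc)) * math.log(n_docs / df[term])
            for term in vocab
        ])
    return vocab, vectors


# Hypothetical forum posts, for illustration only.
docs = ["naive bayes classifier", "bayes theorem", "discussion forum post"]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(w, 3) for w in vec])
```

Terms that appear in many documents receive a smaller weight than terms that appear in few, so "bayes" is weighted lower than "classifier" in the first post.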

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2022

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Ghani Deori

Klasifikasi sekuens protein coronavirus penyebab COVID-19 menggunakan metode Naive Bayes dengan seleksi fitur Lasso = Classification of coronavirus protein sequences cause COVID-19 using Naive Bayes method with LASSO feature selection

SARS-CoV-2 is the virus that causes COVID-19. The COVID-19 pandemic was first detected in Wuhan, China. According to World Health Organization (WHO) data from www.who.int as of March 23, 2021, 123,216,178 people had been exposed to COVID-19 and 2,714,517 had died from it. In this undergraduate thesis, SARS-CoV-2 is classified using its protein sequences. Features are extracted from the SARS-CoV-2 protein sequences using the discere package for Python. The package produces 27 features, which are then selected using the LASSO (Least Absolute Shrinkage and Selection Operator) method. After feature selection, classification is carried out with two methods, Absolute Correlation Weighted Naïve Bayes and Naïve Bayes. The highest average accuracy, sensitivity, and specificity for Absolute Correlation Weighted Naïve Bayes are 81.85%, 74.81%, and 89.19%, respectively, whereas those for Naïve Bayes are 81.44%, 74.58%, and 88.24%, respectively. Absolute Correlation Weighted Naïve Bayes thus has a higher average accuracy, sensitivity, and specificity than Naïve Bayes.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Muhammad Rias Agnini Majdi

Klasifikasi Alat Musik Berdasarkan Suara yang Dihasilkannya dengan Ekstraksi Fitur MFCC dan Metode Klasifikasi k-NN = Musical Instrument Classification with MFCC Feature Extraction and k‑NN Classification Method

The types of instruments used in a piece of music are one way to describe that music. This undergraduate thesis discusses the use of MFCC feature extraction and the k-NN classification method to classify musical instruments by the sound they produce. MFCC is a method that processes sound data into a set of numeric features. k-NN is a classification method that uses the distances between the features of the observations. The work extracts features from the available sound data with MFCC and then uses those features to fit a k-NN model. The data are the instrument sound samples available in the Philharmonia Orchestra Sound Samples dataset. The results show that the k-NN model can reach an accuracy of up to 94.84%.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Natalia Aji Yuwanti

Analisis kinerja one dimensional naive bayes sebagai metode imputasi data masalah asuransi = Performance analysis of one dimensional naive bayes as a data imputation method for insurance problems

Machine learning methods are widely used to assist human work. Not all data are as expected; most data have missing values. Data with missing values must first be handled at the preprocessing stage, one option being missing-value imputation. This study analyzes the performance of One-Dimensional Naïve Bayes as a data imputation method for car insurance and safe driving problems. In simulations using SVM, imputation with the mode and imputation with One-Dimensional Naïve Bayes gave the same result on the Car Insurance data, namely 1.00. Further examination showed that the value imputed for each missing value by the mode and the value predicted by One-Dimensional Naïve Bayes were exactly the same. On the Safe Driver data, imputation with the mode gave an accuracy of 0.86, while imputation with One-Dimensional Naïve Bayes gave an accuracy of 0.85. These results indicate that mode imputation of missing values is still highly recommended for the data preprocessing stage in machine learning.
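
Mode imputation, the baseline recommended above, replaces every missing value in a column with that column's most frequent observed value. A minimal sketch follows; the function name `impute_mode`, the use of `None` as the missing-value marker, and the sample column are assumptions for illustration and do not come from the thesis.

```python
from collections import Counter


def impute_mode(column, missing=None):
    """Replace each missing entry with the most frequent observed value."""
    observed = [value for value in column if value is not missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if value is missing else value for value in column]


# Hypothetical categorical column with two missing entries.
vehicle_type = ["sedan", None, "suv", "sedan", None]
print(impute_mode(vehicle_type))
```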

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Ricco Yhandy Fernando

Implementasi Sistem Klasifikasi Penyakit Paru-Paru Dari Data Screening Menggunakan Metode Support Vector Machine Dan Ensemble Bagging Gaussian Naive Bayes = Implementation Of A Lung Disease Classification System Using The Support Vector Machine And Ensemble Bagging Gaussian Naïve Bayes Methods.

Lung disease is a serious disorder that attacks the human respiratory system and can be fatal if not treated seriously. At present, lung disease is still detected manually by specialist doctors, a process that takes a long time. This research therefore builds a system that detects and classifies lung diseases automatically. Two methods are used: Support Vector Machine and Ensemble Bagging Gaussian Naïve Bayes. The data are screening data for one hundred patients, obtained from a primary data source, a hospital in Yogyakarta. The study uses twelve lung symptoms and classifies them into five classes of lung disease: tuberculosis, chronic obstructive pulmonary disease, pneumonia, bronchial asthma, and lung cancer. The classification system is implemented in the PHP programming language. Classification performance is tested with a confusion matrix, and the application is tested with the System Usability Scale. In confusion-matrix testing, the Support Vector Machine achieved 93.9% accuracy, 92% recall, 79% precision, and an F1 score of 54%, while Ensemble Bagging Gaussian Naïve Bayes achieved 88.9% accuracy, 92% recall, 79% precision, and an F1 score of 54%. System testing with the System Usability Scale gave a score of 73, or grade B.

Depok: Fakultas Teknik Universitas Indonesia, 2024

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Nabilla Ayu Fauziyyah

Klasifikasi subtipe molekular kanker payudara menggunakan naive bayes classifier = Breast cancer molecular subtype classification using naive bayes classifier

ABSTRACT

Nowadays, many modern hospitals are equipped with comprehensive monitoring devices, which has led to a growing amount of stored medical data. These data have specific characteristics, and ordinary statistical methods usually cannot be applied to them directly. This gave rise to the idea of Medical Data Mining (MDM), which has proven suitable for analysing medical data. The Naive Bayes Classifier (NBC) is one implementation of MDM. Although MDM methods have been shown to produce accurate and satisfactory results in medical diagnosis, they are not yet fully accepted in medical practice. The main reason is resistance among medical personnel to new diagnostic methods. The aim of this study is to apply NBC to the medical records of breast cancer patients at a private hospital in Jakarta, Indonesia, and evaluate its performance in classifying the patients into five classes of molecular subtypes, and to compare its results with another popular MDM method, Decision Tree (DT). The analysis shows that NBC outperforms DT, with an accuracy of 92.8%. The study also shows empirically that NBC handles missing values fairly well and does not need a large dataset to classify most patients correctly.

2019

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Kresna Bima Sudirgo

Klasifikasi tipe batuan menggunakan kombinasi metode rock fabric number lucia-differential effective medium dan naive bayes untuk analisis potensi produksi pada sumur karbonat cekungan Jawa Timur Utara = Rock type classification using combinated rock fabric number lucia differential effective medium with naive bayes classification method for well production potential analysis in North East Java basin

ABSTRACT

The production capability of a well can be assessed from the porosity and permeability of its reservoir. Obtaining porosity and permeability values for a well generally requires logging and coring, both of which are costly. This study addresses that problem by attempting to obtain porosity and permeability data for a target well, without coring, using Lucia rock typing integrated with a geostatistical method, the Naive Bayes classifier. Lucia's rock typing method divides the reservoir rocks of a reference well into several classes based on the correlation between their porosity and permeability, by the particle size of the rock; the Naive Bayes classification method then correlates those classes with the Vp and Vs values and their derivatives. This integration yields a trend showing a good correlation between the Vp and Vs values and the rock classes that represent porosity and permeability. Porosity and permeability values can therefore be obtained by classifying the Vp and Vs values, and their derivatives, of the target well with the Naive Bayes classifier. Once the porosity and permeability values of the target well are obtained, production analysis can be carried out by examining the rock types of the prospect zone of the reference well. Besides an analysis of the production potential of the target well, the author can also identify damage and recommend cementing by analysing porosity and permeability.

2017

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library