Search Results

Found 130624 documents matching the query

Wahyu Nuryaningrum

Perbandingan Prediksi Tren Harga Saham Menggunakan Random Forest, Support Vector Regression, dan K-Nearest Neighbor = Prediction Comparison of Stock Market Trend using Random Forest, Support Vector Regression, and K-Nearest Neighbor

"Rapid economic development has made human needs effectively unlimited. One way to meet future living needs is to invest. Stocks are an investment instrument that offers attractive returns but carries a high risk of loss, because stock prices tend to move unpredictably over any given period. To minimize the risk of loss, stock price movements need to be predicted. Accurate predictions help investors determine the future value of a stock. This study compares three supervised machine learning algorithms, Random Forest, Support Vector Regression (SVR), and K-Nearest Neighbor (KNN), for predicting stock price movements, ranking them by accuracy. A model is considered more accurate if it has lower Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The best closing-price predictions were obtained with Support Vector Regression, which produced lower RMSE and MAE values than the other two methods. The study uses daily historical data from the investing.com website for the period March 2017 to February 2020, covering three Indonesian companies listed in IDX30."
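As a minimal sketch of the two error measures named in the abstract, the Python below computes RMSE and MAE for two sets of predictions and shows how lower values identify the closer fit. The function names and the price lists are illustrative assumptions, not code or data from the thesis.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing prices and predictions (not data from the thesis).
actual = [100.0, 102.0, 101.0, 105.0]
model_a = [101.0, 101.0, 102.0, 104.0]  # every prediction is off by 1
model_b = [103.0, 99.0, 104.0, 102.0]   # every prediction is off by 3

# The model with the lower RMSE and MAE is the more accurate one.
better = "model_a" if rmse(actual, model_a) < rmse(actual, model_b) else "model_b"
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason studies often report both.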

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Diva Arum Puspitasari

Analisis dan prediksi harga saham berbasis support vector regression (SVR) dan support vector machines - K nearest neighbor (SVM-KNN) = Analysis and prediction of stock price using support vector regression (SVR) and support vector machines - K nearest neighbor (SVM-KNN)

"Predicting stock price trends helps traders determine the future value of a stock. Under technical analysis, the trend is predicted by forecasting the closing price. Over time, a rising stock price corresponds to a profitable return. This undergraduate thesis analyzes and predicts stock closing prices one month ahead using the Support Vector Machines - K Nearest Neighbor (SVM-KNN) method. First, Support Vector Regression (SVR) is used to select the technical indicators that most influence the stock of the company under analysis. Second, SVM classifies the stock's return as profit or loss. The predicted class label then helps find the nearest neighbors that KNN uses to forecast the closing price. Experiments were run with 3, 4, and 5 selected technical indicators and, without feature selection, with all 13 technical indicators."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2017

S69143

UI - Skripsi Membership Universitas Indonesia Library

Restu Eka Firdaus

Perbandingan akurasi antara metode support vector machines SVM dan K-nearest neighbor knn untuk pengenalan wajah = Comparison of accuracy between support vector machine SVM and K nearest neighbor knn for face recognition

"Face recognition systems have been implemented with a wide range of methods, including PCA, ICA, LDA, EP, EBGM, Kernel, 3-D Morphable, 3-D Face Recognition, Bayesian Framework, HMM, and SVM.

This study uses the Local Binary Pattern (LBP) method to extract features from face images, and the SVM and KNN methods to measure the accuracy of the face recognition system. The data are face images of 25 mathematics students at Universitas Indonesia. Ten different images were taken of each individual: 5 wearing glasses, taken from 5 different angles, and 5 without glasses, taken from the same 5 angles.

In testing, the KNN method with K = 1 achieved the best accuracy, 96.20 at 100 iterations and 90 training data. This shows that KNN performed better than SVM, which reached only 94.80 at 100 iterations and 90 training data."
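To illustrate the K-Nearest Neighbor step, here is a minimal sketch of a KNN classifier in plain Python. It assumes Euclidean distance and majority voting, which the abstract does not specify; the function name, the two-value "feature vectors", and the labels are hypothetical stand-ins for LBP features and student identities.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=1):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Pair each training vector's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest vectors and return the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors (not data from the thesis).
train_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["student_a", "student_a", "student_b", "student_b"]

prediction = knn_predict(train_features, train_labels, [0.85, 0.15], k=1)
```

With K = 1 the query simply takes the label of its single closest training image, which is the setting the abstract reports as most accurate.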

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2018

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Revan Dzaky Fahrezi

Analisis Sentimen dan Text Clustering dari Hasil Review Maskapai Penerbangan Menggunakan Metode Naive Bayes, Support Vector Machine, dan K-Nearest Neighbor = Sentiment Analysis and Text Clustering From Airline Review Results Using Naive Bayes, Support Vector Machine, and K-Nearest Neighbor

"This study aims to integrate sentiment analysis and text clustering techniques to evaluate service quality based on the SERVQUAL model, which covers five main dimensions: Tangibility, Responsiveness, Reliability, Assurance, and Empathy. The methods used include Naïve Bayes, Support Vector Machine, and K-Nearest Neighbor to perform sentiment clustering, which varies across the SERVQUAL dimensions. The results show that customer sentiment differs in each dimension, with certain areas standing out for negative or positive sentiment. Text clustering helps identify common themes and problems that customers frequently face. The study concludes that combining sentiment analysis with text clustering gives more detailed, in-depth insight into service quality, enabling companies to take more targeted action on each SERVQUAL dimension and so improve overall customer satisfaction and loyalty."

Depok: Fakultas Teknik Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Velery Virgina Putri Wibowo

Klasifikasi tumor otak menggunakan metode K-Nearest Neighbor-Genetic Algorithm dan Support Vector Machine-Genetic Algorithm = Classification of brain tumor using K-Nearest Neighbor-Genetic Algorithm and Support Vector Machine-Genetic Algorithm methods

"The emergence of disease is an unavoidable problem throughout the world, including in Indonesia. Brain tumors are among the dangerous diseases that can cause death. One of the most common and deadly types of brain tumor is glioblastoma. Patients with glioblastoma have a fairly low survival rate and are generally diagnosed only after the tumor has advanced. Early and accurate diagnosis is therefore very important in determining whether a person has glioblastoma. In this study, two machine learning methods, K-Nearest Neighbor and Support Vector Machine, each combined with Genetic Algorithm feature selection (KNN-GA and SVM-GA), were applied and compared for classifying glioblastoma. The Genetic Algorithm (GA) selects the relevant features, which are then classified with KNN and SVM. The data are numerical data derived from Magnetic Resonance Imaging (MRI) results obtained from RSUPN Dr. Cipto Mangunkusumo. In the experiments, SVM-GA with a Radial Basis Function kernel, 5 features, and 90% training data was the best method for classifying the glioblastoma data, with accuracy, recall, precision, and f1-score of 92.35%, 93.19%, 92.62%, and 92.83%, respectively."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Anggoro Gagah Nugroho

Perbandingan pengenalan otomatis plat nomor Indonesia menggunakan metode K-nearest neighbor dan neural network = Comparison of automatic Indonesian plate number recognition using K-nearest neighbour and neural network methods

"A number plate is a form of motor vehicle identification. Every motor vehicle operating on the road must carry a number plate, or Tanda Nomor Kendaraan Bermotor (TNKB), that matches its area code, registration number, and validity period. Indonesian number plates come in three colors, black, red, and yellow, each serving a different function. Given the number of vehicles in Indonesia, an automatic number plate recognition system can simplify many plate-related tasks, such as checking plates in parking areas, finding stolen vehicles, and identifying cars that run red lights. This study examines two methods commonly used for automatic number plate recognition: K-Nearest Neighbor (KNN) and Neural Network (NN). Across the three test analyses carried out by the author, the neural network reached an accuracy of 88.8%, while K-Nearest Neighbor reached 72.2%. NN outperformed KNN in these tests because its variables could be modified to improve accuracy, whereas the accuracy obtained with KNN could not be changed."

Depok: Fakultas Teknik Universitas Indonesia, 2019

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Fahri Alamsyah

Perancangan sistem pengenalan wajah menggunakan Support Vector Machine (SVM), Logistic Regression, Multi-Layer Perceptron (MLP), Gaussian Naive Bayes, K-Nearest Neighbour (KNN), Decision Tree (DT), dan Convolutional Neural Network (CNN) = Development of face recognition system using Support Vector Machine (SVM), Logistic Regression, Multi-Layer Perceptron (MLP), Gaussian Naive Bayes, K-Nearest Neighbour (KNN), Decision Tree (DT), and Convolutional Neural Network (CNN)

"The digital world, and image processing in particular, keeps evolving because of society's needs and the importance of securing digital systems. One technology that has advanced rapidly is face recognition using artificial intelligence. The system recognizes the face of a person already registered in the database for validation or verification purposes. In this study, a face recognition system was designed using machine learning algorithms with Principal Component Analysis (PCA) for dimensionality reduction. Testing used several methods: Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbour (K-NN), Logistic Regression (LR), Multi-Layer Perceptron (MLP), and Convolutional Neural Network (CNN). CNN focuses on layers and does not require dimensionality reduction, so its results are more accurate. The machine learning classifiers other than CNN use standard/default settings, while the CNN uses the LeNet-5 architecture with a dropout rate of 0.25. Training ran for 60 epochs with a categorical cross-entropy loss function, the Adam optimizer, and a batch size of 20. The input data are 64 × 64 × 1 face images from the Olivetti faces dataset. The highest accuracy with PCA, SVM, and LR was 91.25%, while the best CNN accuracy reached 98.75%. In addition to accuracy, a confusion matrix and classification report were used to evaluate the performance of the classification models."

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Rany Dwi Cahyaningtyas

Prediksi Churn Pelanggan Berdasarkan Segmen Produk Susu Bubuk Balita menggunakan Model Customer Lifetime Value (CLV) dan Metode Klasifikasi K-Nearest Neighbor = Customer Churn Prediction based on Infant Powdered Milk Product Segment using Customer Lifetime Value (CLV) Model and K-Nearest Neighbor Classifier

"The wide variety of toddler powdered milk products gives consumers many choices, so producers need to maintain the loyalty of existing customers by understanding customer churn behaviour. Customer churn is defined as a customer's tendency to stop doing business with a company. This study focuses on predicting customer churn patterns so that the company can set strategies to reduce churn. It predicts churn by toddler powdered milk product segment using the Length, Recency, Frequency, Monetary (LRFM) model. The respondents are customers of PT. XYZ who purchased premium-segment toddler powdered milk (milk A) or ordinary-segment toddler powdered milk (milk B) during 2021. The variables are the LRFM variables and CLV, which is formed by weighting the LRFM variables. First, Fuzzy C-Means Clustering is used to label the target customers; then the K-Nearest Neighbor (KNN) classifier is used to predict churn. The result is three customer groups for each of milk A and milk B: churn customers with low CLV, potential-to-churn customers with medium CLV, and loyal customers with high CLV. Milk B had a larger share of churn customers, 43.4%, than milk A, 34%. The final stage analyzes the performance of KNN in terms of accuracy, recall, and f1-score for both milk A and milk B. The results show that KNN performance depends on the number of nearest neighbors chosen and on the proportion used to split the data."
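The abstract says only that CLV is formed by weighting the LRFM variables. One common way to read that is a weighted sum of normalized scores, sketched below. The equal weights, the assumption that each score is already scaled to the range 0 to 1 with higher meaning more valuable, and all names are hypothetical; the thesis may weight or normalize differently.

```python
LRFM_KEYS = ("length", "recency", "frequency", "monetary")


def clv_score(lrfm, weights):
    """Weighted sum of normalized Length, Recency, Frequency, Monetary scores."""
    return sum(weights[key] * lrfm[key] for key in LRFM_KEYS)


# Hypothetical equal weights and one hypothetical customer (not thesis data).
weights = {"length": 0.25, "recency": 0.25, "frequency": 0.25, "monetary": 0.25}
customer = {"length": 0.8, "recency": 0.4, "frequency": 0.6, "monetary": 0.2}

score = clv_score(customer, weights)
```

Under this reading, customers with low scores would fall into the churn group and customers with high scores into the loyal group once the clustering step assigns labels.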

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Annisa Kamalia

Klasifikasi data talasemia menggunakan K-nearest neighbor dan naive bayes = Classification of data thalassemia using K-nearest neighbor and naive bayes

"ABSTRACT

Thalassemia is a disease caused by abnormalities in hemoglobin. It is a hereditary disease in which the carriers of the thalassemia gene are the patient's parents. In Indonesia, the number of thalassemia cases reached 7,029 in 2015. Thalassemia cannot yet be cured, but carrier status can be identified through screening. This final project compares the performance of two methods for classifying thalassemia data, K-Nearest Neighbor and Naive Bayes. The data are records of 82 thalassemia patients and 68 non-thalassemia patients from Rumah Sakit Anak dan Bunda Harapan Kita, West Jakarta. The results show that Naive Bayes gives higher accuracy than K-Nearest Neighbor in classifying thalassemia. The average accuracy of Naive Bayes is 99.775% with an average running time of 0.0554 seconds, while the average accuracy of K-Nearest Neighbor is 97.142% with an average running time of 0.081 seconds. For specificity, both methods perform equally: K-Nearest Neighbor reaches 100% when K = 3, and Naive Bayes also reaches 100%. The highest average sensitivity is given by Naive Bayes at 99.59%, while K-Nearest Neighbor reaches 96.25% for K = 1."
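For readers unfamiliar with the three measures reported above, the following sketch shows how accuracy, sensitivity, and specificity are derived from the counts of a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the thesis.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp: patients correctly classified as thalassemia
    fn: thalassemia patients missed by the classifier
    tn: non-thalassemia patients correctly classified
    fp: non-thalassemia patients wrongly flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


# Hypothetical test-set counts (not data from the thesis).
metrics = screening_metrics(tp=8, fn=2, tn=10, fp=0)
```

A specificity of 100% means no non-thalassemia patient was flagged, while sensitivity below 100% means some true cases were missed.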

2019

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Nadia Hartini Kusumawijaya

Komparasi Kinerja Metode Random Forest Regression dengan Metode Support Vector Regression untuk Memprediksi Usia Biologis pada Data Pemeriksaan Medis = Comparison of the Performance of the Random Forest Regression Method with the Support Vector Regression Method for Predicting Biological Age on Medical Examination Data

"Aging is one of the main risk factors for disease and death. The rate of aging has been shown to vary among individuals of the same chronological age. This creates a need for a measure of aging that is more accurate, robust, and reliable than chronological age, namely biological age. In this study, the author builds models using Random Forest Regression (RF) and Support Vector Regression (SVR) to predict biological age from medical examination data, assesses and evaluates their performance, and compares the two methods. RF applies ensemble learning by combining several decision trees to produce a prediction. SVR works by constructing a hyperplane, or set of hyperplanes, in a high-dimensional space that can be used for linear or nonlinear regression. The dataset is medical data from the Ministry of Health of the Republic of Indonesia. Preprocessing covers missing-value handling, encoding, and outlier detection and handling. Feature selection is then performed using Spearman's rank correlation coefficient. After that, an RF model and an SVR model are built separately for each sex. Finally, model performance is evaluated and compared using Root Mean Square Error (RMSE), the coefficient of determination (R2), Adjusted R2, and running time.
The best hyperparameters for RF were {'max_depth': 15, 'n_estimators': 1150} for the male dataset and {'max_depth': 15, 'n_estimators': 1250} for the female dataset. The best hyperparameters for SVR were {'C': 2, 'epsilon': 0.2, 'gamma': 'scale', 'kernel': 'rbf', 'tol': 0.005} for the male dataset and {'C': 3, 'epsilon': 0.2, 'gamma': 'scale', 'kernel': 'rbf', 'tol': 0.005} for the female dataset. RF performed fairly well, with RMSE = 7.532; R2 = 0.403; Adjusted R2 = 0.351; running time = 0.154 for men and RMSE = 6.889; R2 = 0.340; Adjusted R2 = 0.264; running time = 0.179 for women. SVR performed similarly but slightly worse, with RMSE = 7.692; R2 = 0.376; Adjusted R2 = 0.321; running time = 0.035 for men and RMSE = 6.905; R2 = 0.337; Adjusted R2 = 0.306; running time = 0.080 for women. Based on the performance analysis in this study, the model built with Random Forest Regression is superior to the one built with Support Vector Regression in predicting biological age."
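The abstract reports both R2 and Adjusted R2. As a minimal sketch, the standard adjustment penalizes R2 for the number of predictors: Adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of samples and p the number of features. The function name and the example values below are assumptions, since the abstract does not state the sample or feature counts.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    denominator = n_samples - n_features - 1
    if denominator <= 0:
        raise ValueError("need more samples than features plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / denominator


# Hypothetical values (not from the thesis): R2 = 0.5, 101 samples, 10 features.
example = adjusted_r2(0.5, n_samples=101, n_features=10)
```

Because the penalty grows with the number of features relative to the sample size, Adjusted R2 never exceeds R2 when R2 is at most 1 and at least one feature is used, which matches the pattern in the reported figures.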

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

