Search Results

Found 8 documents matching the query

Dasarathy, Belur V.

Nearest neighbor (NN) norms : nn pattern classification techniques

Washington: IEEE Computer Society Press, 1991

R 519 DAS n

Reference Book, Universitas Indonesia Library

Rany Dwi Cahyaningtyas

Prediksi Churn Pelanggan Berdasarkan Segmen Produk Susu Bubuk Balita menggunakan Model Customer Lifetime Value (CLV) dan Metode Klasifikasi K-Nearest Neighbor = Customer Churn Prediction based on Infant Powdered Milk Product Segment using Customer Lifetime Value (CLV) Model and K-Nearest Neighbor Classifier

"The wide variety of powdered milk products for toddlers gives consumers many choices, so producers need to retain their existing customers by understanding customer churn behaviour. Customer churn is defined as the tendency of a customer to stop doing business with a company. This study focuses on predicting customer churn patterns so that companies can determine strategies to reduce churn. It predicts churn by product segment of toddler powdered milk using the Length, Recency, Frequency, Monetary (LRFM) model. The respondents are customers of PT. XYZ who purchased premium-segment toddler powdered milk (milk A) or ordinary-segment toddler powdered milk (milk B) during 2021. The variables are the LRFM variables and CLV, which is formed by weighting the LRFM variables. First, Fuzzy C-Means Clustering was applied to label the target customers; then the K-Nearest Neighbor (KNN) classifier was used to predict churn. The result is three customer groups for each of milk A and milk B: churn customers with low CLV, potential-to-churn customers with medium CLV, and loyal customers with high CLV. Milk B has a larger share of churn customers, 43.4%, than milk A, 34%. The final stage of the study analyzes the performance of KNN in terms of accuracy, recall, and f1-score for both milk A and milk B. The results show that the performance of KNN depends on the number of nearest neighbors chosen and on the data split proportion."
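
The weighting and classification steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the weights, the toy customers, and the function names `clv_score` and `knn_predict` are hypothetical, and the LRFM values are assumed to be already normalized to [0, 1] so that larger means more valuable.

```python
import math
from collections import Counter

def clv_score(lrfm, weights):
    """Weighted sum of normalized Length, Recency, Frequency, Monetary values."""
    return sum(w * x for w, x in zip(weights, lrfm))

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical weights and toy data, for illustration only.
weights = (0.1, 0.2, 0.3, 0.4)
train = [
    ((0.9, 0.8, 0.9, 0.9), "loyal"),
    ((0.8, 0.9, 0.8, 0.7), "loyal"),
    ((0.5, 0.5, 0.4, 0.5), "potential to churn"),
    ((0.4, 0.5, 0.5, 0.4), "potential to churn"),
    ((0.1, 0.2, 0.1, 0.1), "churn"),
    ((0.2, 0.1, 0.2, 0.1), "churn"),
]
query = (0.15, 0.15, 0.1, 0.2)

print(clv_score(query, weights))
print(knn_predict(train, query, k=3))
```

Changing `k` or the share of data held out for testing changes which neighbors vote, which is the dependence the abstract reports.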

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

S-pdf

UI - Undergraduate Thesis (Skripsi) Membership, Universitas Indonesia Library

Dinda Lusiafitri

Optimasi Rute Distribusi Pengantaran Uang Tunai (Cash-in-Transit) yang Mempertimbangkan Risiko Perjalanan Menggunakan Metode Ant Colony Optimization = Optimization of Cash Delivery Distribution Routes (Cash-in-Transit) by Considering Travel Risks Using Ant Colony Optimization Method

"Cash, the primary medium of exchange in economic transactions, is printed and circulated by a country's Central Bank. The distribution of cash from the Central Bank to commercial banks, for use by the public, is carried out by Cash-in-Transit (CIT) companies. CIT is the process of transporting and delivering cash from one location to another. One of the travel risks in this process is the loss of money due to robbery. This thesis therefore optimizes cash delivery with the dual objectives of minimizing travel cost and minimizing the risk of robbery during transport. The optimization problem is modeled as a modified Vehicle Routing Problem with Time Windows (VRPTW) whose objective function combines two aspects, travel cost and risk, into a single unitless value. The problem is solved with Ant Colony Optimization (ACO), a heuristic method based on the behavior of ants in finding travel paths, with the initial solution constructed using the Nearest Neighbor method. The experiment was implemented on a sample case with 1 depot and 15 commercial bank offices. The results show that ACO improves on the solution previously produced by the Nearest Neighbor method, with a decrease of 4.87%. This result comprises a 3.67% reduction in risk and a marginal 0.02% increase in total travel cost."
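
The Nearest Neighbor construction and the unitless blended objective mentioned in this abstract can be sketched as below. This is a simplified illustration, not the thesis's model: it assumes a single vehicle and ignores time windows, and the blending parameter `alpha`, the max-scaling, the toy matrices, and all function names are hypothetical choices. ACO would then try to improve on the route this heuristic returns.

```python
def combined(travel, risk, alpha):
    """Blend travel cost and risk into one unitless matrix.

    Each matrix is divided by its own maximum, then mixed with weight alpha.
    """
    tmax = max(max(row) for row in travel)
    rmax = max(max(row) for row in risk)
    n = len(travel)
    return [
        [alpha * travel[i][j] / tmax + (1 - alpha) * risk[i][j] / rmax
         for j in range(n)]
        for i in range(n)
    ]

def nearest_neighbor_route(cost, depot=0):
    """Greedy tour: always move to the cheapest unvisited node, then return."""
    route = [depot]
    unvisited = set(range(len(cost))) - {depot}
    while unvisited:
        current = route[-1]
        nxt = min(unvisited, key=lambda j: cost[current][j])
        route.append(nxt)
        unvisited.remove(nxt)
    route.append(depot)
    return route

def route_value(route, cost):
    """Total cost of consecutive legs along the route."""
    return sum(cost[a][b] for a, b in zip(route, route[1:]))

# Toy data: node 0 is the depot, nodes 1-3 are bank offices.
travel = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 3], [10, 4, 3, 0]]
risk = [[0, 5, 1, 1], [5, 0, 1, 1], [1, 1, 0, 5], [1, 1, 5, 0]]

route = nearest_neighbor_route(travel)
print(route, route_value(route, travel))
blended = combined(travel, risk, alpha=0.5)
print(nearest_neighbor_route(blended))
```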

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Undergraduate Thesis (Skripsi) Membership, Universitas Indonesia Library

Annisa Kamalia

Klasifikasi data talasemia menggunakan K-nearest neighbor dan naive bayes = Classification of data thalassemia using K-nearest neighbor and naive bayes

"ABSTRACT

Thalassemia is a disease caused by abnormalities in hemoglobin. It is a hereditary disease in which the carriers of the thalassemia gene are the parents of the sufferer. In Indonesia, the number of thalassemia cases reached 7,029 in 2015. Thalassemia cannot yet be cured, but carrier status can be identified by screening. This final project compares the performance of two methods for classifying thalassemia data, K-Nearest Neighbor and Naive Bayes. The data consist of 82 records of thalassemia patients and 68 records of non-thalassemia patients from Rumah Sakit Anak dan Bunda Harapan Kita (Harapan Kita Children and Women's Hospital), West Jakarta. The final results show that Naive Bayes gives higher accuracy than K-Nearest Neighbor in classifying thalassemia. The average accuracy of Naive Bayes is 99.775% with an average running time of 0.0554 seconds, and the average accuracy of K-Nearest Neighbor is 97.142% with an average running time of 0.081 seconds. For specificity, both methods perform equally: K-Nearest Neighbor reaches 100% when K = 3, and Naive Bayes also reaches 100%. The highest average sensitivity is given by Naive Bayes, 99.59%, while K-Nearest Neighbor gives 96.25% for K = 1."
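
For readers unfamiliar with the metrics reported in this abstract, the sketch below shows how sensitivity and specificity are computed from true and predicted labels. The labels are toy values (1 = thalassemia, 0 = non-thalassemia), not the thesis's data, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary classification.

    sensitivity = TP / (TP + FN): share of actual positives found.
    specificity = TN / (TN + FP): share of actual negatives correctly rejected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels: one thalassemia case is missed, no false alarms.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```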

2019

S-Pdf

UI - Undergraduate Thesis (Skripsi) Membership, Universitas Indonesia Library

Uji kinerja dan analisis k-support vector nearest neighbor terhadap decision tree dan naive bayes

"The K-Support Vector Nearest Neighbor (K-SVNN) algorithm is one alternative method evolved from K-Nearest Neighbor (K-NN); it aims to reduce prediction time while maintaining prediction accuracy. The method is still relatively young, so it has so far been compared only with other K-NN-based methods. This study analyzes the similarities, differences, and performance of K-SVNN relative to Decision Tree (DT) and Naïve Bayes (NB). This comparative testing is important for identifying the relative strengths and weaknesses of K-SVNN. Knowing these strengths and weaknesses makes it possible to demonstrate the method's reliability when it is implemented. Testing was carried out for both training and prediction. Training performance was measured by the time taken for training; prediction performance was measured by the time taken for prediction and the prediction accuracy obtained. The results show that K-SVNN achieves better accuracy than DT and NB, whereas its training and prediction times are longer than those of DT and NB."

005 JEI 3:1 (2013)

Journal Article, Universitas Indonesia Library

Wahyu Nuryaningrum

Perbandingan Prediksi Tren Harga Saham Menggunakan Random Forest, Support Vector Regression, dan K-Nearest Neighbor = Prediction Comparison of Stock Market Trend using Random Forest, Support Vector Regression, and K-Nearest Neighbor

"Rapid economic development has made human needs effectively unlimited. One way to meet future living needs is to invest. Stocks are an investment instrument with attractive returns but a high risk of loss, because stock prices tend to move unpredictably over a given period. To minimize the risk of loss, stock price movements need to be predicted. Accurate predictions help investors estimate the future value of a stock. This study compares three supervised machine learning algorithms for predicting stock price movements, Random Forest, Support Vector Regression (SVR), and K-Nearest Neighbor (KNN), based on their accuracy. A model is considered more accurate if it has lower Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values. In this study, the best closing price predictions were obtained with Support Vector Regression, which produced lower RMSE and MAE values than the other two methods. The study uses daily historical data from the investing.com website for the period March 2017 to February 2020, covering three Indonesian companies listed in IDX30."
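
The two error measures used in this abstract to rank the models can be computed as in the sketch below. The price series are made-up toy numbers, not data from the thesis, and the function names are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

# Toy closing prices and predictions, for illustration only.
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 102.0, 101.0]

print(rmse(actual, predicted), mae(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of prediction quality.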

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2021

S-pdf

UI - Undergraduate Thesis (Skripsi) Membership, Universitas Indonesia Library

Nurul Shabrina

Metode Bicluster Berbasis k-Nearest Neighbors dan Robust Least Squares Estimation menggunakan Principal Components (bi-KNNRLSP) untuk imputasi Missing values pada Data Ekspresi Gen = Missing values Imputation for Microarray Data Using Bicluster-Based k-Nearest Neighbors and Robust Least Squares Estimation with Principal Components (bi-KNN-RLSP)

"Microarray is a technology in biology that provides information about gene expression. Raw microarray data take the form of images, which must be converted into a gene expression matrix in which rows represent genes and columns represent experimental conditions. In practice, however, microarray data often contain missing values, which hinder data analysis. Imputation is one solution to missing values in microarray data: the missing entries in the data matrix are predicted or estimated so that a complete data matrix is obtained. The imputation method used in this study, bi-KNN-RLSP, uses the concepts of biclustering, principal component analysis, and quantile regression. Forming the biclusters requires a temporary complete matrix, which is obtained through a pre-imputation step with KNNimpute. The bi-KNN-RLSP experiments were carried out on cervical cancer cell line gene expression data at different missing rates, namely 1%, 5%, 10%, 15%, 20%, 25%, and 30%, with the parameter k=10 in the KNNimpute pre-imputation step. Performance was evaluated using the normalized root mean squared error (NRMSE). The average NRMSE over five runs is lower than that of the bi-RLSP and row average methods. The computation times of bi-KNN-RLSP and bi-RLSP do not differ significantly, so bi-KNN-RLSP achieves a smaller NRMSE than bi-RLSP at a comparable cost. It can therefore be said that replacing the row average pre-imputation in bi-RLSP with KNNimpute produces better imputation performance. In addition, the NRMSE of bi-KNN-RLSP increases as the missing rate increases."
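
To make the evaluation in this abstract concrete, the sketch below imputes hidden entries with the row average baseline and scores the result with NRMSE. It does not implement bi-KNN-RLSP or KNNimpute. The NRMSE definition used here (root of mean squared error divided by the variance of the true values) is one common convention and is an assumption, since the abstract does not state the exact formula; the toy matrix and function names are also hypothetical.

```python
import math
from statistics import mean, pvariance

def row_average_impute(matrix):
    """Replace None entries with the mean of the observed values in that row."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        fill = mean(observed)
        filled.append([fill if v is None else v for v in row])
    return filled

def nrmse(true_vals, imputed_vals):
    """Assumed definition: sqrt(mean squared error / variance of true values)."""
    mse = mean((t - i) ** 2 for t, i in zip(true_vals, imputed_vals))
    return math.sqrt(mse / pvariance(true_vals))

# Toy expression matrix (rows = genes, columns = conditions).
complete = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
masked = [[1.0, None, 3.0], [4.0, 6.0, None]]

imputed = row_average_impute(masked)
hidden = [(0, 1), (1, 2)]  # positions that were masked
true_vals = [complete[i][j] for i, j in hidden]
guess_vals = [imputed[i][j] for i, j in hidden]
print(nrmse(true_vals, guess_vals))
```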

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2022

S-pdf

UI - Undergraduate Thesis (Skripsi) Membership, Universitas Indonesia Library

Aulya Khatulistivani

Pengembangan dan analisis sistem parkir otomatis berbasis pengenalan plat nomor otomatis menggunakan metode K-nearest neighbor = Development and analysis of automatic parking system based on automatic license plate recognition using K-nearest neighbor method

"In current parking systems, the license plate (Tanda Nomor Kendaraan Bermotor, TNKB) is checked by matching it against the photo taken when the user enters the parking area. The operator then types the recognized plate number into the computer. This manual process takes a relatively long time. This thesis develops automatic license plate recognition to overcome the problem. Automatic license plate recognition is a technology that enables a computer to extract the characters on a license plate. The system uses OpenCV as the image processing library, the K-Nearest Neighbor (KNN) algorithm for Optical Character Recognition (OCR), and a database system for the parking data.

Based on the test results, the best combination of block size and weight for the thresholding process in license plate recognition is b=71 and w=20, with a character segmentation accuracy of 89%, a recognition accuracy of 82%, and a value of 26 for fully (100%) correct recognitions. The system reads license plates well at an optimal distance of 60 cm, with a character segmentation accuracy of 89%, a recognition accuracy of 79%, and a value of 26 for fully correct recognitions. The input image resolution also affects the recognition process.

The optimal resolution for recognition is 1024 x 768, with a character segmentation accuracy of 89%, an overall recognition accuracy of 81%, a value of 26 for fully correct recognitions, and an average processing time of 0.174 seconds. Recognition accuracy is also affected by other factors such as lighting and the condition of the plate (whether it is damaged or obstructed). Ideal plate conditions are therefore required for optimal automatic recognition. Overall, the automatic parking system achieves good recognition accuracy."

Depok: Fakultas Teknik Universitas Indonesia, 2018

S-Pdf

UI - Undergraduate Thesis (Skripsi) Membership, Universitas Indonesia Library
