Search Results

Found 3 documents matching the query

Fatma Irmadani

Klasifikasi Multikelas Credit Scoring pada Pinjaman Peer-to-Peer Menggunakan Metode Multinomial Logistic Regression = Multiclass Classification of Credit Scoring on Peer-to-Peer Loans Using the Multinomial Logistic Regression

Credit scoring is a method for predicting the risk that a prospective borrower will default or fall behind on payments. Loan providers use it when prospective borrowers apply for loans. One company that applies credit scoring to its borrowers is Lending Club, a Peer-to-Peer (P2P) online lending and borrowing service provider in the United States. This study performs multiclass credit scoring classification based on loan status (Loan Status) in the Lending Club dataset. Loan status has 3 classes: default, fully paid, and late. The classification uses Multinomial Logistic Regression (MLR), a supervised machine learning method that extends Logistic Regression to handle multiclass classification. The MLR model is implemented under 3 different SMOTE sampling strategy scenarios. The classification results are evaluated using accuracy, precision, recall, F1-Score, and AUC (Area Under the Curve) One versus All. The best-performing MLR model has an accuracy of 0.67 and an average AUC One versus All of 0.724932. Per class, the default class has a precision of 0.47, a recall of 0.02, and an F1-Score of 0.04; the fully paid class has a precision of 0.85, a recall of 0.83, and an F1-Score of 0.84; and the late class has a precision of 0.02, a recall of 0.84, and an F1-Score of 0.04. These results show that the default class scores poorly on every evaluation metric, the fully paid class scores well on every evaluation metric, and the late class scores fairly well only on recall (0.84). The poor results are thought to be caused by imbalanced data and overlapping classes.
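The per-class F1-Scores reported in the abstract above can be cross-checked against the stated precision and recall values. The following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the dictionary layout are illustrative and do not come from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-Score) per class, as stated in the abstract.
reported = {
    "default": (0.47, 0.02, 0.04),
    "fully paid": (0.85, 0.83, 0.84),
    "late": (0.02, 0.84, 0.04),
}

# Recompute F1 from precision and recall, rounded to two decimals like the abstract.
recomputed = {
    name: round(f1_score(p, r), 2) for name, (p, r, _) in reported.items()
}

for name, (_, _, stated) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.2f}, reported F1 = {stated:.2f}")
```

Because F1 is a harmonic mean, it stays close to the smaller of the two inputs, which is why the late class ends up with a low F1-Score despite its high recall.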

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Frischi Dwi Nabilah

Implementasi Metode CatBoost untuk Klasifikasi Multikelas Credit Scoring pada Pinjaman Peer-to-peer = Implementation of the CatBoost Method for Multi-class Classification of Credit Scoring on Peer-to-peer Lending

Credit scoring is a form of assessment used to determine the creditworthiness of borrowers. There is no agreement on when this method started to develop. However, human subjectivity and the inability of humans to process large volumes of loan applications every day are the reasons why credit scoring with machine learning is highly needed. To detect potentially problematic borrowers early, this final project predicts loan status in three classes: default, fully paid, and late. To that end, it uses a CatBoost model to predict loan status as a multi-class credit scoring classification problem. CatBoost is used because it is intended to handle multi-class classification on heterogeneous and imbalanced data. The data used is online peer-to-peer (P2P) lending data from LendingClub, which includes three types of information: loan information, borrower information, and the borrower's loan history. The P2P LendingClub loan data is imbalanced and has overlapping classes. Three SMOTE-NC sampling strategy scenarios are applied to observe the effects of imbalanced data and overlapping classes on this multi-class classification problem, resulting in three models. The performance of the CatBoost models is evaluated using precision, recall, f1-score, accuracy, and AUC one-vs-all. CatBoost yields good results for class 1 (fully paid), as the f1-scores in all three scenarios are above 0.75. However, the results for class 0 (default) and class 2 (late) are still unsatisfactory: the highest f1-score for class 0 (default) is only 0.15, while the f1-score for class 2 (late) is 0.04 in all three scenarios. The effects of imbalanced data and overlapping classes on precision, recall, f1-score, accuracy, and AUC one-vs-all vary depending on the class.
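The AUC one-vs-all metric named in the abstract above can be illustrated with a small sketch. It treats each class in turn as the positive class, computes a binary AUC from that class's predicted probability, and then averages the per-class values. The abstract does not say how the per-class AUCs are combined, so the unweighted mean is an assumption here, and the function names and toy probabilities are hypothetical rather than taken from the thesis. The class indices follow the abstract's labeling (0 = default, 1 = fully paid, 2 = late).

```python
def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_all_auc(y_true, proba, n_classes):
    """Return per-class one-vs-all AUCs and their unweighted mean (assumed averaging)."""
    per_class = []
    for k in range(n_classes):
        labels = [1 if y == k else 0 for y in y_true]  # class k vs. the rest
        scores = [row[k] for row in proba]             # predicted probability of class k
        per_class.append(binary_auc(labels, scores))
    return per_class, sum(per_class) / n_classes


# Hypothetical toy data: 0 = default, 1 = fully paid, 2 = late.
y_true = [0, 1, 1, 1, 2, 1]
proba = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.6, 0.2],
]

per_class_auc, mean_auc = one_vs_all_auc(y_true, proba, n_classes=3)
print(per_class_auc, mean_auc)
```

The pairwise formulation is quadratic in the number of samples, so it is only suitable for small illustrations; on a dataset the size of LendingClub a rank-based implementation from an established library would be the practical choice.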

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Yoel Zabarro

Analisis Kinerja Metode Random Forest untuk Klasifikasi Multikelas Credit Scoring = Performance Analysis of the Random Forest Method for Credit Scoring Multiclass Classification

Credit scoring is the process of evaluating the creditworthiness of an individual. Financial companies need credit scoring to minimize credit risk, because it can determine the eligibility of debtors. One financial company that provides P2P (Peer-to-Peer) loan services and applies credit scoring in debtor evaluation is LendingClub. This thesis performs multiclass credit scoring classification based on loan status, which consists of 3 classes: default, fully paid, and late. The classification uses random forest, a supervised machine learning method. Random forest is a tree-based method in which each tree uses a random set of variables. The random forest model is implemented under three different SMOTE sampling strategy scenarios. In each scenario, the model is run 5 times and evaluated using precision, recall, f1-score, accuracy, and AUC one vs all. The best average accuracy is 0.78, and the best average AUC one vs all is 0.679179. Per class, for the default class the best precision is 0.39, the best recall is 0.27, and the best f1-score is 0.28. For the fully paid class, the best precision is 0.82, the best recall is 0.95, and the best f1-score is 0.88. For the late class, the best precision is 0.02, the best recall is 0.02, and the best f1-score is 0.02. Overall, the models in all three scenarios are good only at predicting class 1 (fully paid) and perform poorly on class 0 (default) and class 2 (late). This is thought to be caused by data imbalance and class overlap in the dataset.

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
