Search Results

Found 215839 documents matching the query

Syahrul Amrie

Analisis sentimen terhadap layanan imigrasi menggunakan data Twitter, Instagram dan ulasan pada aplikasi M-Paspor di Google play store berbasis pembelajaran mesin = Sentiment analysis on immigration services using data Twitter, Instagram and review application M-paspor on Google play store based on machine learning

"Social media has grown rapidly and is no longer only a means of communication between individuals. Its functions and uses keep expanding, and private and government organizations widely use it to measure service levels. The Directorate General of Immigration (Ditjen Imigrasi), a government organization, uses social media in part to find out whether the services it provides have been well received by the public. Apart from social media, Immigration has also launched the M-Paspor application on the Google Play Store, where it can also gauge the effectiveness of the application. According to a survey by Balitbangham, an internal unit of the Ministry of Law and Human Rights (Kemenkumham), immigration services were rated very good, yet many comments and reviews on social media and the Google Play Store express dissatisfaction with those services. This contradicts the Balitbangham survey results. Analyzing social media data is difficult, however, because of its volume. This study therefore proposes a system that performs sentiment analysis on the text of comments and reviews, so that Immigration can take the best steps to improve services that are not yet optimal. The dataset was collected from Twitter and Instagram and from reviews on the Google Play Store. The results show that TF-IDF unigram feature extraction combined with the Support Vector Machine (SVM) algorithm and SMOTE produced the highest performance compared with Naïve Bayes (NB) and Random Forest (RF). In classification, SVM achieved 72% precision, 69% recall, 69% accuracy, and a 68% F1-score.
Immigration can use this model to obtain public feedback on its services, as input for improving them and for the relevant Directorate to formulate more efficient service strategies. Immigration will thus be able to respond quickly to the obstacles the public faces."

Jakarta: Fakultas Ilmu Komputer Universitas Indonesia, 2022

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Jefka Dhammananda

Analisis Sentimen dan Pemodelan Topik Ulasan Aplikasi E-Grocery Menggunakan Algoritma Naive Bayes dan Support Vector Machine: Studi Kasus Data Ulasan Segari di Google Play Store = Sentiment Analysis and Topic Modeling of E-grocery Application Reviews Using Naive Bayes and Support Vector Machine Algorithm: A Case Study of Segari Data Review on Google Play Store

"The rapid development of information and communication technology demands innovation in application development to keep pace. Segari is one of the popular online supermarket service providers in Indonesia. It is a customer-centric company whose core value, Be Obsessed with our Customers, puts customer needs first. A shortage of human resources and the large number of customer reviews to be analyzed hinder the extraction of information from those reviews. A machine learning model is therefore needed that automatically performs sentiment analysis and classifies reviews as positive or negative. Information from positive sentiment can serve as a reference for maintaining service quality, while negative sentiment can be used as evaluation material for improving Segari's services and application. This research discusses the implementation of a sentiment analysis model using customer reviews from the Google Play Store. The model development process covers data collection, data labeling, data preprocessing, feature extraction, sentiment classification, model evaluation, and topic modeling. The researcher used two classification algorithms, the Naive Bayes Classifier (NB) and the Support Vector Machine (SVM), on a total of 10,507 reviews. The data show that 74.37% of the reviews express positive sentiment and 25.63% express negative sentiment. The results indicate that the SVM algorithm with oversampling achieved the best model performance, with a recall of 89.98%. In addition, the researcher used Latent Dirichlet Allocation (LDA) to identify topics related to customer perspectives on Segari, which are then communicated to the relevant team. The analysis revealed that some customers are satisfied and others disappointed with the product delivery process.
Customers were generally satisfied with the quality and freshness of the products. Some customers were disappointed by missing or incomplete items in their orders. Opinions were mixed on the application's user interface, speed, and performance, and likewise on the available prices, promotions, and vouchers. Some customers were disappointed with the service provided by the customer service team. Overall, this research extends knowledge of sentiment analysis methods and provides insights on conducting research related to sentiment analysis and customer reviews."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Riko Wijayanto

Analisis Sentimen Berbasis Aspek pada Teks Ulasan Pengguna di Google Play Store = Aspect Based Sentiment Analysis of User's Review on Google Play Store

"The rapid development of information and communication technology (ICT) demands equally rapid innovation in application development. Tokopedia Seller is one of PT Tokopedia's main applications, intended for sellers carrying out the operational activities of selling products. Newly launched on Android, it is a pioneering application that needs a great deal of user input, including from the Google Play Store. However, the volume of incoming reviews and the diversity of opinions make the analysis of review sentiment and aspects slow, and many reviews are missed. A study is therefore needed that proposes an automated system for aspect-based sentiment analysis, with the aim of simplifying the analysis of user reviews. The review data used as experimental input consist of 6,221 labeled records from the Google Play Store, covering July to September 2021. The study shows that the Support Vector Machine (SVM) algorithm combined with SMOTE performed best compared with CNN and Logistic Regression, with 54% accuracy, 48% precision, and 52% recall for sentiment classification. In line with the sentiment analysis results, SVM with SMOTE also performed better, with 40% accuracy, 41% precision, and 40% recall. Because the reviews tend to be short, under 10 words, classification performance was less than optimal."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2022

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Hanif Sudira

Pembuatan model analisis sentimen untuk perhitungan brand reputation serta pemanfaatan topic modelling pada layanan Indihome menggunakan data Twitter dan Instagram = Creating a sentiment analysis model for calculation of brand reputation and utilization of topic modelling on Indihome services using Twitter and Instagram data

"The internet plays an increasingly important role in many aspects of people's lives. The need for internet access is an opportunity for internet providers, one of which is Telkom with IndiHome. As a state-owned enterprise (BUMN), Telkom acts as an internet service provider to meet public needs. Based on customer satisfaction surveys in 2019 and 2020, IndiHome's NPS did not reach its target: against a target of greater than or equal to 5, IndiHome's NPS was -1.67 in 2019 and 2.87 in 2020. This is because problems are still handled on the basis of reports, there is no way to find out which problems are occurring, and social media opinion has not been used because surveys are still relied on. This study builds a sentiment analysis and topic modelling model for IndiHome on Twitter and Instagram. Data were collected from March 2019 to April 2021. The resulting model uses the SVM method, with 70.13% accuracy on Twitter and 73.55% accuracy on Instagram. Sentiment is mostly negative, with NPS values of -79.49 on Twitter and -56.12 on Instagram. On both Twitter and Instagram the response to IndiHome has a negative index, meaning that people are not satisfied with IndiHome. The negative discussion topics are: IndiHome internet going down suddenly, slow IndiHome internet, IndiHome internet going down when it rains, high IndiHome costs, unresponsive IndiHome service, IndiHome service that does not resolve problems, internet being cut off (isolated) after payment has been made, technician appointments not kept on time, and wanting to unsubscribe or switch providers."

Jakarta: Fakultas Ilmu Komputer Universitas Indonesia, 2022

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Hanandi Rahmad Syahputra

Penerapan Discriminant Analysis dan Support Vector Machine dalam Memprediksi Tren Pergerakan Harga Saham di Bursa Efek Indonesia = The Implementation of Discriminant Analysis and Support Vector Machine in Predicting The Trend of Stock Price Movements on the Indonesia Stock Exchange.

"Predicting stock price movements is a very challenging task because the stock market is complex, non-linear, and full of uncertainty. However, based on the efficient market hypothesis and its level of efficiency, predicting stock price movements remains achievable. Many approaches have been applied, ranging from simple linear statistical approaches such as discriminant analysis (DA) to complex machine learning approaches such as the support vector machine (SVM). Both DA and SVM can be used for classification, such as predicting which of several classes a stock price trend falls into. In this study, stock price trends are classified into two classes, "highly possible to go up" and "highly possible to go down or be neutral", with class separation based on technical, fundamental, financial, and beta coefficient data for stocks on the Indonesia Stock Exchange (IDX). Using these variables, a number of prediction models with specific prediction periods or functions are trained and then used to predict stock price trends on the IDX. The prediction periods range from 1 month to 9 months. Stepwise linear regression (SLR) and sequential forward selection (SFS) are applied as feature selection methods to choose the most relevant variables so that the performance of each prediction model can be optimized. The number of features, the maximum significance value of the F-to-enter, the kernel function, and the parameter selection method are varied to produce 12 DA prediction models and 30 SVM prediction models. Several evaluation processes are then applied to select the prediction models with the best accuracy and effectiveness.
Of the 12 DA prediction models designed, 3 are considered feasible to apply, and of the 30 SVM prediction models designed, 11 are considered feasible. From these 14 feasible models, the 4 best prediction models for the 3-, 5-, 7-, and 9-month prediction periods and the 1 best prediction model for classifying the major trend over 9 months were selected. All five are SVM prediction models, so it can be concluded that SVM outperforms DA in predicting stock price trends on the Indonesia Stock Exchange."

Depok: Fakultas Ekonomi dan Bisnis Universitas Indonesia, 2020

T-pdf

UI - Tesis Membership Universitas Indonesia Library

Dealitha Winata

Pengembangan sistem penilaian esai otomatis simple-O untuk ujian esai berbahasa Jepang menggunakan algoritma support vector machine = Automatic Japanese essay grading system simple-O development using support vector machine algorithm

"The Department of Electrical Engineering, Faculty of Engineering, Universitas Indonesia has been developing Simple-O, an automatic essay grading system based on Latent Semantic Analysis (LSA), since 2007. Simple-O was initially developed only to grade essay exams in Indonesian; here it is developed to grade essay exams in Japanese. When first developed, Simple-O used only the LSA algorithm. Some years later, development began to combine LSA with classification algorithms such as Learning Vector Quantization (LVQ) and Support Vector Machine (SVM). Simple-O has also been developed with other algorithms such as Winnowing.
This thesis describes the development of the Simple-O automatic essay grading system for Japanese-language essay exams, using LSA for word processing and SVM for classification. SVM is a learning algorithm that determines a separating hyperplane for a set of data, whether linearly separable or non-linearly separable. SVM separates the scores produced by LSA into two classes in the first class variation and into nine classes in the second. The kernel type and parameters are also varied to find the right kernel type, parameters, and number of classes. The analysis and testing show that, with the right kernel type, parameters, and class variation, SVM can reach an accuracy of 100."

Depok: Fakultas Teknik Universitas Indonesia, 2018

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Karin Marshanda

Penerapan Adaptive Synthetic Sampling Approach dalam Menangani Ketidakseimbangan Kelas pada Dataset Wi-Fi Attacks = Application of Adaptive Synthetic Sampling Approach in Handling Class Imbalance in Wi-Fi Attacks Dataset

"An Intrusion Detection System (IDS) is a system for detecting attacks within networks, both local and internet-based. For misuse detection or anomaly detection, several researchers have used data mining to identify various types of intrusions, including those that occur infrequently. However, data mining is susceptible to data imbalance, which can reduce the effectiveness of classification algorithms because most classifiers assume a balanced distribution. To address this problem, this research handles data imbalance with the Adaptive Synthetic Sampling (ADASYN) method, which generates synthetic data for the minority class so that classification algorithms can work better. ADASYN works effectively when the predicted variable has two classes (binary class), but because this study deals with a multiclass problem, a One-Vs-One (OVO) approach is used to balance the classes. The effectiveness of ADASYN is evaluated by applying it to a Wi-Fi attacks dataset, the Aegean Wi-Fi Intrusion Dataset (AWID2). Data before and after rebalancing are evaluated using classification methods such as logistic regression and Support Vector Machine (SVM), and the precision, recall, specificity, and F1-score of the two datasets are compared. Although ADASYN improved only the precision values on the Wi-Fi attacks dataset, SVM with a polynomial kernel proved effective in detecting attack classes, even though the other metrics did not reach the same level."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

14-24-64198984

UI - Skripsi Membership Universitas Indonesia Library

Muhammad Nur Ichsan

Analisis Kinerja Model Support Vector Machine dalam Mengklasifikasi Tingkat Keparahan Penyakit Pestalotiopsis sp. pada Data Citra Daun Karet Menggunakan Fitur Warna dan Jumlah Bintik = Performance Analysis of Support Vector Machine Model in Classifying the Severity of Pestalotiopsis sp. Disease on Rubber Leaf Image Data Using Color and Number of Spots Features

"Saat ini, Indonesia menempati peringkat kedua sebagai produsen karet terbesar di dunia, menyumbang sekitar 29,8% dari kebutuhan global. Namun, produksi karet di Indonesia mengalami penurunan dari tahun ke tahun, salah satu faktornya adalah serangan penyakit gugur daun yang disebabkan oleh jamur Pestalotiopsis sp. Pada tahun 2021, luas perkebunan karet yang terkena penyakit mencapai 30.328,84 hektar dan tanaman yang terinfeksi oleh penyakit tersebut mengalami penurunan produksi lateks hingga 30%. Penyakit ini menyerang daun dengan gejala pembentukan bercak berukuran 0,5-2 cm yang menyebabkan nekrosis dan gugur. Pengklasifikasian tingkat keparahan penyakit Pestalotiopsis sp. secara morfologi melalui pengamatan jumlah bintik dan warna pada daun karet membutuhkan waktu dan tenaga besar, terutama karena luasnya perkebunan yang terinfeksi. Oleh karena itu, penggunaan metode machine learning diusulkan untuk mengurangi waktu dan usaha yang dibutuhkan dalam mengklasifikasi penyakit gugur daun akibat jamur Pestalotiopsis sp. Pada penelitian ini, model machine learning digunakan untuk mengklasifikasi 5 kelas tingkat keparahan penyakit Pestalotiopsis sp. yaitu tingkat 0 (sehat), tingkat 1 (terinfeksi ringan), tingkat 2 (terinfeksi sedang), tingkat 3 (terinfeksi parah), dan tingkat 4 (terinfeksi sangat parah). Dataset yang digunakan adalah citra daun tanaman karet yang diperoleh dari Pusat Penelitian Karet Sembawa. Model machine learning menerima input data citra daun tanaman karet, lalu citra disegmentasi menggunakan k-means clustering. Data yang telah tersegmentasi kemudian diekstraksi dengan fitur warna hue, saturation, dan value (HSV) dan fitur jumlah bintik dengan metode contour detection menggunakan Suzuki’s contour algorithm. Selanjutnya, fitur-fitur ini diklasifikasikan menggunakan Support Vector Machine (SVM) tipe one vs rest multiclass classification dan Grid Search Cross Validation dengan 5 fold untuk menemukan hyperparameter terbaik untuk SVM.
Hyperparameter terbaik adalah kernel radial basis function dengan C=100. Berdasarkan hasil percobaan sebanyak 5 kali, diperoleh kesimpulan bahwa model dengan akurasi tertinggi adalah model yang menggunakan fitur warna dan jumlah bintik dengan nilai rata-rata akurasi sebesar 81,86% dan nilai rata-rata Cohen’s kappa statistic sebesar 0,77 yang artinya model mampu mengklasifikasi data citra daun tanaman karet dengan cukup baik.

Indonesia currently ranks as the world's second-largest rubber producer, supplying about 29.8% of global demand. However, rubber production in Indonesia has declined from year to year, and one contributing factor is leaf fall disease caused by the fungus Pestalotiopsis sp. In 2021, the area of rubber plantations affected by the disease reached 30,328.84 hectares, and infected plants suffered a decrease in latex production of up to 30%. The disease attacks the leaves, forming spots measuring 0.5-2 cm that cause necrosis and leaf fall. Classifying the severity of Pestalotiopsis sp. morphologically, by observing the number of spots and the color of rubber leaves, requires a great deal of time and labor, especially given the large area of infected plantations. Therefore, machine learning methods are proposed to reduce the time and effort required to classify leaf fall disease caused by Pestalotiopsis sp. In this study, a machine learning model is used to classify five severity classes of Pestalotiopsis sp. disease: level 0 (healthy), level 1 (mildly infected), level 2 (moderately infected), level 3 (severely infected), and level 4 (very severely infected). The dataset consists of rubber leaf images obtained from the Sembawa Rubber Research Center. The model takes rubber leaf images as input, and each image is segmented using k-means clustering. Hue, saturation, and value (HSV) color features are then extracted from the segmented images, together with a spot-count feature obtained by contour detection using Suzuki's contour algorithm. These features are classified using a one-vs-rest multiclass Support Vector Machine (SVM), with 5-fold grid search cross-validation used to find the best SVM hyperparameters. Performance is evaluated using accuracy and Cohen's kappa statistic.
The best hyperparameters are the radial basis function kernel with C=100. Based on five experimental runs, the model with the highest accuracy is the one that uses both the color and spot-count features, with an average accuracy of 81.86% and an average Cohen's kappa statistic of 0.77, which indicates that the model classifies rubber leaf image data reasonably well."
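The abstract above reports accuracy together with Cohen's kappa statistic. As a minimal sketch of how the two metrics relate, the following computes both from a confusion matrix. The function name `accuracy_and_kappa` and the 5-class matrix are invented for illustration; they are not code or data from the thesis.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal (the accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    # Kappa rescales accuracy so that 0 means chance-level agreement
    # and 1 means perfect agreement (undefined when expected == 1).
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical confusion matrix for severity levels 0-4 (made-up counts).
    example = [
        [18, 2, 0, 0, 0],
        [3, 15, 2, 0, 0],
        [0, 2, 16, 2, 0],
        [0, 0, 3, 15, 2],
        [0, 0, 0, 2, 18],
    ]
    acc, kappa = accuracy_and_kappa(example)
    print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

Because kappa subtracts the agreement expected by chance, it is lower than raw accuracy whenever some agreement would occur by chance alone, which is why the two figures are usually reported side by side for multiclass problems.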

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2024

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Dilla Fadlillah Salma

Analisis akurasi metode support vector machine, random forest, dan logistic regression dalam mengklasifikasi data asuransi mobil dengan implementasi metode seleksi fitur one dimensional naive bayes classifier = Accuracy analysis of support vector machine, random forest, and logistic regression method in classifying car insurance data with one dimensional naive bayes classifier features selection implementation

"Kepemilikan dan penggunaan kendaraan mobil memiliki berbagai risiko negatif, seperti terjadinya kecelakaan. Untuk mengurangi beban risiko tersebut, perusahaan menjual produk asuransi mobil. Asuransi mobil merupakan salah satu produk perusahaan asuransi kendaraan yang bertujuan sebagai upaya perlindungan pemilik kendaraan mobil dari kerugian finansial yang terjadi pada kendaraan yang diasuransikannya. Untuk menawarkan produk asuransi, beberapa perusahaan menggunakan teknik penjualan dengan cara cold calling. Teknik penjualan tersebut akan lebih efektif menjual produk asuransi jika terlebih dahulu data nasabah calon pembeli asuransi diprediksi atau diklasifikasi ke dalam kelas membeli atau tidak membeli.
Pada skripsi ini, dilakukan klasifikasi dengan metode Support Vector Machine (SVM), Random Forest (RF), dan Logistic Regression (LR) dengan implementasi metode seleksi fitur One Dimensional Naïve Bayes Classifier (1-DBC). Data yang diperoleh berjumlah 4000 data dengan total 18 fitur. Diperoleh hasil bahwa akurasi SVM lebih tinggi dibandingkan dengan kedua metode lainnya. Selain itu, implementasi metode seleksi fitur telah berhasil meningkatkan akurasi dari metode Random Forest dan Logistic Regression. Dengan implementasi 1-DBC, ketiga metode klasifikasi memperoleh hasil akurasi tertinggi pada penggunaan 15 fitur.
Owning and using a car carries various risks, such as accidents. To reduce the burden of these risks, companies sell car insurance products. Car insurance is a product offered by vehicle insurance companies to protect car owners from financial losses involving their insured vehicles. To offer insurance products, some companies use cold calling as a sales technique. This technique is more effective if prospective customers are first predicted, or classified, as buyers or non-buyers.
In this thesis, classification is performed using Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR), with the One Dimensional Naïve Bayes Classifier (1-DBC) as the feature selection method. The dataset contains 4000 records with a total of 18 features. The results show that SVM achieved higher accuracy than the other two methods. In addition, feature selection improved the accuracy of Random Forest and Logistic Regression. With 1-DBC, all three classification methods reached their highest accuracy when 15 features were used."
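The abstract does not spell out how 1-DBC ranks features. The sketch below assumes one plausible reading: each feature is scored by the training accuracy of a Gaussian naive Bayes classifier that sees only that feature, and features are then sorted by score. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from statistics import mean, pstdev


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def single_feature_accuracy(values, labels):
    """Training accuracy of a Gaussian naive Bayes classifier on one feature."""
    classes = sorted(set(labels))
    params = {}
    for c in classes:
        xs = [v for v, y in zip(values, labels) if y == c]
        sigma = max(pstdev(xs), 1e-9)  # guard against zero variance
        prior = math.log(len(xs) / len(labels))
        params[c] = (mean(xs), sigma, prior)

    correct = 0
    for v, y in zip(values, labels):
        # Pick the class with the highest log posterior (up to a constant).
        pred = max(
            classes,
            key=lambda c: gaussian_logpdf(v, params[c][0], params[c][1]) + params[c][2],
        )
        correct += pred == y
    return correct / len(labels)


def rank_features(rows, labels):
    """Return (feature indices sorted best-first, per-feature scores)."""
    n_features = len(rows[0])
    scores = [
        single_feature_accuracy([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is mostly noise.
    rows = [(1.0, 5.0), (1.2, 3.0), (0.9, 4.0), (3.0, 5.1), (3.2, 2.9), (2.9, 4.1)]
    labels = [0, 0, 0, 1, 1, 1]
    order, scores = rank_features(rows, labels)
    print(order, scores)
```

Under this assumption, keeping the top-ranked features (for example, 15 of the 18) and retraining each classifier on that subset would correspond to the comparison the abstract describes.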

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2018

S-Pdf

UI - Skripsi Membership Universitas Indonesia Library

Rafiqatul Khairi

Klasifikasi Kanker Pankreas menggunakan Kernel-based Support Vector Machine = Pancreatic Cancer Classification using Kernel-based Support Vector Machine

"Kanker pankreas adalah penyakit di mana sel-sel tumor ganas (kanker) berkembang di jaringan pankreas, yaitu organ di belakang perut bagian bawah dan di depan tulang belakang, yang membantu tubuh menggunakan dan menyimpan energi dari makanan dengan memproduksi hormon untuk mengontrol kadar gula darah dan enzim pencernaan untuk memecah makanan. Biasanya, kanker pankreas jarang terdeteksi pada tahap awal. Salah satu tanda seseorang mengalami kanker pankreas adalah diabetes, terutama jika itu bertepatan dengan penurunan berat badan yang cepat, penyakit kuning, atau rasa sakit di perut bagian atas yang menyebar ke punggung. Di antara berbagai jenis kanker, kanker pankreas memiliki tingkat kelangsungan hidup terendah, yaitu hanya sekitar 3-6% dari mereka yang didiagnosis yang dapat bertahan hidup selama lima tahun. Jika pasien didiagnosis tepat waktu untuk perawatan, peluang mereka untuk bertahan hidup akan meningkat. Terdapat penanda tumor yang biasa digunakan untuk mengikuti perkembangan kanker pankreas, yaitu CA 19-9 yang dapat diukur dalam darah. Orang sehat dapat memiliki sejumlah kecil CA 19-9 dalam darah mereka. Kadar CA 19-9 yang tinggi seringkali merupakan tanda kanker pankreas. Tetapi kadang-kadang, kadar tinggi dapat menunjukkan jenis kanker lain atau gangguan non-kanker tertentu, seperti sirosis dan batu empedu. Karena kadar CA 19-9 yang tinggi tidak spesifik untuk kanker pankreas, CA 19-9 tidak dapat digunakan dengan sendirinya untuk skrining atau diagnosis. Ini dapat membantu memantau perkembangan kanker dan efektivitas pengobatan kanker. Dalam studi ini, metode Kernel-based Support Vector Machine digunakan untuk mengklasifikasikan hasil tes darah CA19-9 menjadi dua bagian; data pasien yang didiagnosis dengan kanker pankreas atau pasien normal (tidak terdiagnosis kanker pankreas). Metode ini memperoleh akurasi sekitar 95%.
Pancreatic cancer is a disease in which malignant (cancerous) tumor cells develop in the tissue of the pancreas, an organ located behind the lower part of the stomach and in front of the spine. The pancreas helps the body use and store energy from food by producing hormones that control blood sugar levels and digestive enzymes that break down food. Pancreatic cancer is rarely detected at an early stage. One sign of pancreatic cancer is diabetes, especially when it coincides with rapid weight loss, jaundice, or pain in the upper abdomen that spreads to the back. Among the various types of cancer, pancreatic cancer has the lowest survival rate: only about 3-6% of those diagnosed survive for five years. If patients are diagnosed in time for treatment, their chances of survival increase. A tumor marker commonly used to follow the course of pancreatic cancer is CA 19-9, which can be measured in the blood. Healthy people can have small amounts of CA 19-9 in their blood. High levels of CA 19-9 are often a sign of pancreatic cancer, but they can sometimes indicate other types of cancer or certain noncancerous disorders, such as cirrhosis and gallstones. Because a high level of CA 19-9 is not specific to pancreatic cancer, CA 19-9 cannot be used by itself for screening or diagnosis. It can, however, help monitor the progress of the cancer and the effectiveness of treatment. In this study, the Kernel-based Support Vector Machine method is used to classify CA 19-9 blood test results into two classes: patients diagnosed with pancreatic cancer and normal patients (not diagnosed with pancreatic cancer). The method achieved an accuracy of about 95%."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
