Search Results

Found 3 documents matching the query

Wongkar, Enggelin Giacinta

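Pengembangan Sistem Web Crawler Sebagai Sarana Riset Media Secara Otomatis (Studi Di Subdit Neraca Rumah Tangga Dan Institusi Nirlaba) = Development of a web crawler system as a tool for automated media research (a study at the Subdirectorate of Household National Account and Non-profit Institution)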

"With the rapid growth of data and information on the Internet, online content is expanding at an explosive rate. Online news, which was created as a complement to the original printed media, has even overtaken it. The Subdirectorate of Household National Account and Non-profit Institution of Statistics Indonesia is in charge of media research. Time and human resources are two important elements of that research, yet the current process is ineffective and inefficient. This study aimed to overcome that problem by developing a web crawler system that automatically summarizes news from online news sites (currently Bisnis and Kontan), produces output as a Microsoft Word file, and minimizes the number of similar news items. The system was developed using several information technology techniques, namely crawling and wrapping methods, together with a cosine similarity method to minimize similar news. The results show that media research using this system is much more effective and efficient."
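The abstract names cosine similarity as the way similar news items are minimized but gives no detail. The following is a minimal sketch of that idea, assuming a plain bag-of-words representation; the function names, the tokenizer, and the 0.8 threshold are illustrative assumptions, not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def drop_similar(articles, threshold=0.8):
    """Keep an article only if it is below the threshold against every kept one.

    The threshold value is a hypothetical choice for illustration.
    """
    kept = []
    for article in articles:
        if all(cosine_similarity(article, k) < threshold for k in kept):
            kept.append(article)
    return kept
```

Calling `drop_similar` on a list of crawled article texts returns the list with near-duplicates removed, keeping the first occurrence of each group.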

Jakarta: Sekolah Tinggi Ilmu Statistik (STIS-Statistics Institute Jakarta), 2014

JASKS 6:2 (2014)

Journal Article, Universitas Indonesia Library

Bahy Helmi Hartoyo Putra

Pengembangan sistem daring untuk deteksi situs merchant fraud berbasis struktur situs: studi kasus PT Nusa Satu Inti Artha = Development of online system for merchant site fraud detection based on site structure: case study PT Nusa Satu Inti Artha

"PT Nusa Satu Inti Artha, better known as DOKU, is a fintech company in the payment sector. DOKU is used by more than 100,000 online merchants across its two services, payment gateway and transfer service. As more merchants register, DOKU needs to carry out one stage of the registration process, merchant site verification, more efficiently. This research aims to develop a web crawler application that extracts the completeness data of a merchant site and automatically predicts the site's fraud level. The web crawler is built with the micro web framework Flask and contains modules that extract features, which are then scored by the machine learning model implemented in it. Model selection was done by nested cross-validation of four classifiers: Decision Tree Classifier, Random Forest Classifier, Extreme Gradient Boost Classifier, and Bernoulli Naive Bayes Classifier. The analysis shows that the Bernoulli Naive Bayes Classifier performed best, so this classifier is the one implemented in the web crawler. The results of developing the web crawler show that the time efficiency of the verification process can be increased by 4900%, with an AUC of 0.953 and a recall of 0.864."
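Nested cross-validation, which the abstract names as the model selection procedure, can be sketched as follows. This is an illustration under stated assumptions: the two toy candidates (`fit_majority`, `fit_any_flag`) are hypothetical stand-ins for the four classifiers in the thesis, accuracy is used in place of AUC, and the fold counts are arbitrary.

```python
from statistics import mean


def k_folds(indices, k):
    """Yield (train, test) index lists for k interleaved folds."""
    for i in range(k):
        test = indices[i::k]
        held_out = set(test)
        train = [j for j in indices if j not in held_out]
        yield train, test


def fit_majority(X, y):
    """Toy candidate: always predict the most common training label."""
    label = max(sorted(set(y)), key=y.count)
    return lambda row: label


def fit_any_flag(X, y):
    """Toy candidate: predict 1 when any binary feature is set (ignores training data)."""
    return lambda row: int(any(row))


def score(fit, X, y, train, test):
    """Fit on the train indices and return accuracy on the test indices."""
    model = fit([X[i] for i in train], [y[i] for i in train])
    return mean(int(model(X[i]) == y[i]) for i in test)


def nested_cv(candidates, X, y, outer_k=3, inner_k=2):
    """Return one (chosen candidate name, outer test score) pair per outer fold."""
    results = []
    for outer_train, outer_test in k_folds(list(range(len(X))), outer_k):
        # Inner loop: pick a candidate using only the outer training data.
        def inner_score(name):
            return mean(
                score(candidates[name], X, y, tr, te)
                for tr, te in k_folds(outer_train, inner_k)
            )

        best = max(candidates, key=inner_score)
        # Outer loop: evaluate the chosen candidate on data the selection never saw.
        results.append((best, score(candidates[best], X, y, outer_train, outer_test)))
    return results
```

The point of the nesting is that the outer test fold plays no part in choosing the candidate, so the outer scores estimate the performance of the whole selection procedure and not just of one fitted model.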

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2020

S-pdf

UI - Undergraduate Thesis (Membership), Universitas Indonesia Library

Pratama Amirullisan

Analisa dan rancang bangun sistem deteksi cepat konten web negatif berbasis teks menggunakan random sampling dan latent semantic analysis dengan algoritma singular value decomposition = Analysis and design of quick detection system to text based negative web content using random sampling and latent semantic analysis with singular value decomposition algorithm

"The need for the Internet is keenly felt; however, a lack of control over online browsing allows content that can damage morals to spread very quickly and to be freely accessible to everyone.

This study discusses the analysis and design of a quick detection system for text-based negative web content using random sampling and Latent Semantic Analysis with the Singular Value Decomposition algorithm. The system aims to classify websites with negative content. As a first step, a web crawler program follows the links on a website to collect the website's text content. All collected text is then classified using the Latent Semantic Analysis method, applying the Singular Value Decomposition algorithm, to distinguish websites with negative content from those with non-negative content. Testing was performed with both full sampling and random sampling to determine which detects negative-content websites faster.

The results show that Latent Semantic Analysis with Singular Value Decomposition successfully classifies negative-content websites, using a classification-result percentage threshold of 70% as the indicator, and that random sampling with only 30% of the total samples increased program execution speed by an average of 507.01%, with an average decrease in accuracy of only 27.19% compared with full sampling for negative-content websites."
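The comparison between full sampling and random sampling can be illustrated with a short sketch. It rests on assumptions: the 70% figure is read here as the share of classified pages that must be negative for a site to be flagged, and `is_negative_page` is a hypothetical stand-in for the Latent Semantic Analysis classifier, which is not reproduced here.

```python
import random


def sample_pages(pages, fraction=0.3, seed=None):
    """Pick a random subset of crawled pages; fraction=1.0 is full sampling."""
    rng = random.Random(seed)
    k = min(len(pages), max(1, round(len(pages) * fraction)))
    return rng.sample(pages, k)


def site_is_negative(pages, is_negative_page, fraction=1.0, threshold=0.7, seed=None):
    """Flag a site when the share of negative sampled pages reaches the threshold.

    is_negative_page is a caller-supplied page classifier (hypothetical here).
    """
    if not pages:
        return False
    chosen = sample_pages(pages, fraction, seed)
    share = sum(1 for page in chosen if is_negative_page(page)) / len(chosen)
    return share >= threshold
```

With `fraction=0.3` only about a third of the pages are classified, which is where the speed gain comes from; the cost is that the sampled share can differ from the full-site share, which is consistent with the accuracy drop the abstract reports.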

Depok: Fakultas Teknik Universitas Indonesia, 2016

S66330

UI - Undergraduate Thesis (Membership), Universitas Indonesia Library
