Search Results

Found 3 documents matching the query

Julizar Isya Pandu Wangsa

Studi Perbandingan Metode Clustering K-Means, DBSCAN, dan HDBSCAN pada BERTopic untuk Pendeteksian Topik = Comparative Study of K-Means, DBSCAN, and HDBSCAN Clustering Methods on BERTopic for Topic Detection

"Topic detection is the process of identifying the central themes in a large, unorganized collection of documents. With only a small amount of data, this is a simple task that can be done manually. With large amounts of data, proper processing is needed to obtain the topic representation of each document quickly and accurately, so machine learning is required. BERTopic is a topic modeling method that applies clustering, using a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model for text representation and class-based Term Frequency-Inverse Document Frequency (c-TF-IDF) for topic extraction. The clustering methods used in this research are K-Means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). BERT was chosen as the text representation method because it represents a sentence as a sequence of words and takes into account the context of each word in the sentence. The resulting text representation is a high-dimensional numeric vector, so its dimensionality is reduced with Uniform Manifold Approximation and Projection (UMAP) before clustering. The performance of the BERTopic model with each of the three clustering methods is analyzed using the coherence, diversity, and quality score metrics. The quality score is the product of the coherence value and the diversity value. The simulation results show that the BERTopic model with K-Means clustering achieves a better quality score than the other two clustering methods on 2 of the 3 datasets."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2023

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
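
The abstract above defines the quality score as the product of coherence and diversity. A minimal Python sketch of that arithmetic follows. The abstract does not say how diversity is computed, so the sketch assumes a common definition (the fraction of unique words among the top words of all topics). The function names, the example topics, and the coherence value are all hypothetical and are not taken from the thesis.

```python
def topic_diversity(topics):
    """Fraction of unique words among the top words of all topics.

    Assumed definition: the abstract only names the metric.
    """
    all_words = [word for topic in topics for word in topic]
    if not all_words:
        raise ValueError("topics must contain at least one word")
    return len(set(all_words)) / len(all_words)


def quality_score(coherence, diversity):
    """Quality score as stated in the abstract: coherence times diversity."""
    return coherence * diversity


# Hypothetical top-4 words for two topics; "order" appears in both.
topics = [
    ["driver", "order", "pickup", "late"],
    ["payment", "balance", "refund", "order"],
]

diversity = topic_diversity(topics)  # 7 unique words out of 8
coherence = 0.6                      # hypothetical value, not from the thesis
quality = quality_score(coherence, diversity)

print(f"diversity={diversity:.3f} quality={quality:.3f}")
```

Because the score is a product, a model with coherent but heavily overlapping topics is penalized just as one with distinct but incoherent topics would be.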

Anton Ade Putra

Analisis Sentimen dan Pemodelan Topik terkait Metaverse di Media Sosial: Studi Kasus Bidang Riset dan Pengembangan Metaverse Universitas T = Sentiment Analysis and Topic Modeling Related to Metaverse in Social Media: A Case Study at the Metaverse Research and Development Department of University T

"Universitas T has a roadmap for developing various types of Metaverse in the future. However, there are concerns that the existing roadmap may not align with the needs of society. This research therefore performs sentiment analysis and topic modeling on Metaverse-related social media posts to provide insights for the Metaverse development roadmap at Universitas T, taking public opinion and sentiment into account. The data used in this study are Indonesian-language tweets collected from August 2021 to April 2023. For the analysis, the LazyPredict library was used, yielding five classification models: Bernoulli Naive Bayes (BernoulliNB), Nearest Centroid, Calibrated Classifier CV, Logistic Regression, and Linear Support Vector Classification (LinearSVC). The results show that the BernoulliNB model performs best, with an average F1 score of 0.788. This research also identifies the topics discussed in relation to the Metaverse using the BERTopic library. The findings indicate negative topics such as uncertainty about Metaverse development, skepticism towards new technologies, limitations of internet infrastructure, ethical and Sharia concerns, legal uncertainty, privacy and security concerns, and skepticism about Indonesia's readiness to build the Metaverse. Positive topics include the launch of Metaverse Jagat Nusantara, the potential of cryptocurrencies in the context of the Metaverse, the name change of Facebook to Meta, virtual concerts in the Metaverse, life in the Metaverse world, domestic Metaverse technology development, digital transformation and innovation in the Metaverse era, the use of blockchain, cryptocurrencies, and NFTs in Metaverse technology, and Hajj rituals (Manasik Haji) in the Metaverse. The results of the sentiment analysis and topic modeling can give Universitas T valuable insights into trends and public perspectives regarding the Metaverse. This will help the university evaluate its existing Metaverse roadmap to ensure that it aligns with the needs of society."

Depok: Fakultas Ilmu Komputer Universitas Indonesia, 2023

TA-pdf

UI - Tugas Akhir Universitas Indonesia Library

Aurelio Naufal Effendy

Analitis Visual dari Sentimen Level Topik pada Ulasan Aplikasi Gojek = Visual Analytics of Topic-Level Sentiment in Gojek App Reviews

"The rapidly evolving digital era has made mobile applications like Gojek an essential solution for the needs of modern society. As a multi-service platform, Gojek contributes significantly to Indonesia's economy by supporting micro, small, and medium enterprises (UMKM), enhancing financial inclusion, and promoting mobility efficiency. Its more than 6 million user reviews on the Play Store are a valuable source of information for evaluating service quality. The growing number of Gojek users and services has also raised the demands on application quality, making user reviews important to analyze in order to gain insights and assess performance. This study aims to conduct topic-level sentiment analysis on user reviews of the Gojek application on the Play Store. Through visual analytics, in-depth insights from user reviews can be explored effectively by leveraging a Large Language Model (LLM). This study uses Gemma 2, an updated version of the Gemma model developed by Google. To understand user perception, sentiment analysis is conducted using a fine-tuned Gemma 2, and its performance is compared with several In-Context Learning (ICL) approaches: zero-shot, one-shot, and few-shot. Topic detection is then performed using BERTopic-HDBSCAN to identify the main topics in the reviews. Lastly, a knowledge graph is constructed to map relationships between entities in the reviews using the Neo4j LLM Knowledge Graph Builder. The results show that the fine-tuned Gemma 2 model delivers the best performance, with an accuracy of 0.955, sensitivity of 0.883, specificity of 0.976, precision of 0.915, and F1-score of 0.898, outperforming the ICL approaches. The topic-level sentiment analysis found that 74.16% of the reviews carry negative sentiment and identified eleven main topics. Only one topic is dominated by positive sentiment, while the rest are predominantly negative. These findings were further analyzed through visual analytics to generate several recommendations for Gojek, including improving the quality of driver partners, evaluating the service cost structure, improving the GoPay system, expanding promotions so that they are more evenly distributed and clearly explained, improving map accuracy, and further developing positive features such as ease and speed of service to increase user satisfaction and loyalty."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2025

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
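
The abstract above reports accuracy, sensitivity, specificity, precision, and F1-score for the sentiment classifier. As a reminder of how these five metrics relate, here is a minimal Python sketch that derives them from the counts of a binary confusion matrix. The function name and the counts are hypothetical. The abstract does not state which class is treated as positive or whether the scores are averaged across classes, so this illustrates the standard binary definitions and does not reproduce the thesis's evaluation.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, not taken from the thesis.
metrics = binary_metrics(tp=80, fp=10, tn=190, fn=20)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, as they are here with roughly three quarters of the reviews being negative.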
