Search Results

Found 295 documents matching the query

M. Misbachul Huda

Diversity-based attribute weighting for k-modes clustering

"Categorical data is a kind of data used in computation in computer science. Extracting information from categorical input data requires a clustering algorithm. Researchers have proposed many clustering algorithms; one designed for categorical data is k-modes. K-modes uses a simple matching approach based on similarity values: two objects with the same attribute value get a similarity of 1, and 0 otherwise. In practice, each attribute takes several distinct values, and each value occurs with a different frequency. A similarity of 0 or 1 is therefore not enough to represent the real semantic distance between a data object and a cluster. In this paper, we generalize the k-modes algorithm for categorical data by adding a weight and a diversity value for each attribute value to optimize categorical data clustering.

Categorical data is a type of data used in computation in computer science. A clustering algorithm is needed to extract information from categorical input data. Earlier researchers have developed various clustering algorithms, one of which is K-modes. K-modes uses a simple matching approach, which relies on similarity values. In K-modes, two data objects that match are given a value of 1, and two that do not match are given a value of 0. In practice, each data attribute has several kinds of attribute values, and each kind occurs a different number of times. Similarity values of 0 and 1 do not adequately represent the actual distance between a data object and a cluster. Therefore, in this paper we extend the K-modes algorithm for categorical data by adding a weight and a diversity value to each attribute in order to optimize categorical data clustering."

Surabaya: Institut Teknologi Sepuluh Nopember, Faculty of Information Technology, Department of Informatics Engineering, 2014

AJ-Pdf

Journal Article, Universitas Indonesia Library
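
The simple matching measure described in the k-modes record above can be sketched in a few lines of Python. This is a minimal illustration of the standard 0/1 matching and of a cluster mode, not the weighted, diversity-based variant the paper proposes. The function names `simple_matching` and `mode_of` and the sample records are hypothetical.

```python
from collections import Counter


def simple_matching(a, b):
    """Count the attributes on which two categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def mode_of(records):
    """Column-wise most frequent value, used as the cluster centre (mode)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Hypothetical cluster of three records with three categorical attributes.
cluster = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
centre = mode_of(cluster)
print(centre)
print(simple_matching(("blue", "large", "square"), centre))
```

Every mismatch costs exactly 1 here, no matter how common or rare the value is inside the cluster, which is the limitation the abstract points to.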

Yuliana Portti

Aplikasi algoritma metaheuristik basis fuzzy K-modes untuk supplier clustering = Application of metaheuristic based fuzzy K-modes algorithm to supplier clustering

"This study proposes three metaheuristic algorithms based on fuzzy K-modes for clustering binary data sets. Three metaheuristic methods are applied: Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Artificial Bee Colony (ABC). Each of the three is combined with the K-modes algorithm, with the aim of providing better initial modes for K-modes. The distance between a data point and a mode is computed using the Jaccard coefficient, which is applied because the data set contains many zero values. To cluster a real data set on automotive suppliers in Taiwan, the proposed algorithms are verified using benchmark data. The results show that PSO K-modes and GA K-modes perform better than ABC K-modes. In addition, in the case study, GA K-modes gives the smallest SSE and also has a shorter computation time than PSO K-modes and ABC K-modes.

This study proposes three metaheuristic-based fuzzy K-modes algorithms for clustering binary data sets. Three metaheuristic methods are applied: Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Artificial Bee Colony (ABC). Each is combined with the K-modes algorithm to supply better initial modes for K-modes. The similarity between two instances is calculated using the Jaccard coefficient, which is applied because the data set contains many zero values. To cluster a real data set on automobile suppliers in Taiwan, the proposed algorithms are verified using benchmark data. The experimental results show that PSO K-modes and GA K-modes are better than ABC K-modes. Moreover, in the case study, GA fuzzy K-modes gives the smallest SSE and also has a shorter computation time than PSO fuzzy K-modes and ABC fuzzy K-modes."

Depok: Fakultas Teknik Universitas Indonesia, 2015

T44406

UI - Thesis (Membership), Universitas Indonesia Library
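
To illustrate why the Jaccard coefficient suits sparse binary data, as the abstract above notes, here is a minimal Python sketch that compares it with plain matching similarity. The function names and sample vectors are hypothetical, and returning 1.0 for two all-zero vectors is an assumed convention, not something the thesis states.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient for binary vectors: shared ones / positions with any one."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two all-zero vectors are treated as identical.
    return both / either if either else 1.0


def matching_similarity(a, b):
    """Fraction of positions that agree, counting shared zeros as agreement."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


# Hypothetical sparse supplier profiles (1 = attribute present).
a = [1, 0, 0, 1, 0, 0]
b = [1, 0, 0, 0, 0, 1]
print(jaccard_similarity(a, b))
print(matching_similarity(a, b))
```

Shared zeros raise the matching similarity but are ignored by Jaccard, so two mostly empty profiles do not look alike merely because both lack most attributes.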

Novieka Distiasari

Aplikasi algoritma metaheuristik berbasis K-modes pada pengelompokan supplier = Application of metaheuristic based K-modes algorithms to supplier clustering

"ABSTRAK

Supplier clustering is important for providing information to buyers. This study proposes metaheuristics based on the K-modes algorithm for clustering data sets in binary form. Two metaheuristic methods are used in this study: particle swarm optimization (PSO) and genetic algorithm (GA). The metaheuristics are applied to provide better initial modes for the K-modes algorithm. This study uses the Jaccard measure for similarity and uses three data sets to validate the proposed algorithms. The experimental and statistical results show that the PSO-based K-modes algorithm is better than the GA-based K-modes algorithm. In the evaluation using data from an automobile company in Taiwan, the PSO-based K-modes algorithm has a smaller SSE than the GA-based K-modes algorithm.

ABSTRACT

Supplier clustering is important for providing more information to the buyer. This study proposes metaheuristic-based K-modes algorithms for clustering binary data sets. Two metaheuristic methods are applied: particle swarm optimization (PSO) and genetic algorithm (GA). The metaheuristics are applied to give better initial modes for the K-modes algorithm. For similarity measurement, this study uses the Jaccard measure, since the real data set contains more zeros than ones. To validate the proposed algorithms, three benchmark data sets are employed. The experimental and statistical results show that the PSO-based K-modes algorithm is better than the GA-based K-modes algorithm. The real data set comes from an exhibition company in Taiwan. In the model evaluation, the PSO-based K-modes algorithm has a lower SSE than the GA-based K-modes algorithm."

Depok: Fakultas Teknik Universitas Indonesia, 2015

T44694

UI - Thesis (Membership), Universitas Indonesia Library

Sarah Syarofina

Analisis Pemilihan Molekul Inhibitor Dipeptidil Peptidase 4 pada Perancangan Obat Diabetes Tipe 2 menggunakan Algoritma K-Modes Clustering dengan Levenshtein Distance = Molecular Selection Analysis of Dipeptidyl Peptidase-4 Inhibitors in The Drug Discovery of Type 2 Diabetes using K-Modes Clustering Algorithm with Levenshtein Distance

"New dipeptidyl peptidase 4 (DPP-4) inhibitors need to be developed to minimize the harmful side effects caused by registered DPP-4 inhibitor drugs. This study aims to produce a representative subset of DPP-4 inhibitor molecules by applying the K-Modes clustering algorithm with Levenshtein distance in the clustering process, and to analyze the selection of DPP-4 inhibitor molecules based on the logP criterion of Lipinski's Rule of 5. A total of 2053 DPP-4 inhibitor molecules were obtained from the ChEMBL website. Clustering was performed on the molecular fingerprints of the DPP-4 inhibitors, derived from their SMILES (Simplified Molecular Input Line Entry System) features. MACCS (Molecular Access System) Keys, ECFP (Extended Connectivity Fingerprint) with diameters 4 and 6, and FCFP (Functional Class Fingerprint) with diameters 4 and 6 were used to build five fingerprint data sets for clustering. The clustering procedure begins by determining the number of clusters, computing the Silhouette Coefficient as the cluster evaluation method. Applying K-Modes clustering with Levenshtein distance to the 2053 DPP-4 inhibitor molecules gave a maximum Silhouette Coefficient of 0.3947 on the MACCS data set, with 1258 clusters. Selecting molecules by the logP criterion and Lipinski's Rule of 5 yielded 778 DPP-4 inhibitor molecules across all data sets, of which 298 are inactive and 480 are active, with logP values ranging from -1.67 to 4.97.

New dipeptidyl peptidase 4 (DPP-4) inhibitors need to be developed to minimize the adverse side effects caused by registered DPP-4 inhibitor drugs. This study aims to produce a representative subset of DPP-4 inhibitor molecules by applying the K-Modes clustering algorithm with Levenshtein distance in the clustering process, and to analyze the selection of DPP-4 inhibitor molecules based on the logP value criterion. A total of 2053 DPP-4 inhibitor molecules were obtained from the ChEMBL website. Clustering was carried out on molecular fingerprints derived from the SMILES feature. The MACCS Keys, ECFP (diameters 4 and 6), and FCFP (diameters 4 and 6) methods were used to construct fingerprint data sets for the clustering process. The clustering procedure begins by determining the number of clusters from the Silhouette Coefficient. Applying K-Modes clustering with Levenshtein distance to the 2053 DPP-4 inhibitor molecules gave a maximum Silhouette Coefficient of 0.3947 on the MACCS data set, with 1258 clusters. Selecting molecules by the logP criterion and Lipinski's Rule of 5 yielded 778 DPP-4 inhibitor molecules across all data sets, of which 298 are inactive and 480 are active, with logP values ranging from -1.67 to 4.97."

Depok: Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Indonesia, 2020

T-pdf

UI - Thesis (Membership), Universitas Indonesia Library
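
The Levenshtein distance named in the record above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one sequence into another. A minimal Python sketch follows. How the thesis encodes fingerprints for this distance is not stated in the abstract, so treating them as bit strings is an assumption, and the sample fingerprints are hypothetical.

```python
def levenshtein(s, t):
    """Edit distance between two sequences, computed row by row."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into an empty string takes i deletions
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(
                prev[j] + 1,         # delete cs
                curr[j - 1] + 1,     # insert ct
                prev[j - 1] + cost,  # substitute, or keep when equal
            ))
        prev = curr
    return prev[-1]


# Hypothetical fingerprints written as bit strings (an assumed encoding).
fp_a = "1011001"
fp_b = "1001101"
print(levenshtein(fp_a, fp_b))
```

Keeping only two rows of the table holds memory at O(len(t)), which matters when many pairwise distances are computed during clustering.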

Riupassa, Pieter Agusthinus

The Molecular Diversity-based ISSR of Durio tanjungpurensis Originating from West Kalimantan, Indonesia

"ISSR-Based Molecular Diversity of Durio tanjungpurensis from West Kalimantan, Indonesia. Durian Tengkurak (Durio tanjungpurensis Navia) is a rare, exotic species of the Malvaceae family. It is important for germplasm conservation and is a potential genetic resource for future durian development. The aim of this study was to determine the molecular diversity of D. tanjungpurensis from West Kalimantan using Inter Simple Sequence Repeat (ISSR) markers. Ten ISSR primers were used to assess the genetic diversity of 60 Durian Tengkurak individuals from six natural endemic populations of D. tanjungpurensis. The genetic diversity parameters were based on binary data for the DNA bands of the PCR products, that is, band present or absent. The results show that the mean number of alleles, the mean effective number of alleles, genetic diversity, Shannon's information index, the number of polymorphic loci, and the percentage of polymorphic loci were 1.53, 1.29, 0.17, 0.26, 77.83, and 52.59, respectively. Analysis of molecular variance (AMOVA) showed higher genetic diversity within populations (65%) than among populations (35%). Cluster analysis using the UPGMA method on a Dice similarity matrix, together with principal coordinate analysis, grouped all individuals into three groups: group 1 (Hutan Rejunak and Tembaga), group 2 (Bukit Merindang), and group 3 (Hutan Rawak, Bukit Sagu 1, and Bukit Sagu 2). Further analysis of population structure using the STRUCTURE program merged groups 2 and 3 into one main group. This study succeeded in revealing the genetic diversity of Durian Tengkurak using ISSR markers.

The Durian Tengkurak (Durio tanjungpurensis Navia) is one of the endangered exotic species in the Malvaceae family. The species is important for germplasm conservation and is considered a potential genetic resource for future durian development. The objective of this research was to assess the molecular diversity of D. tanjungpurensis in West Kalimantan based on Inter Simple Sequence Repeat (ISSR) markers. We applied ten ISSR primers to reveal the genetic diversity of 60 individuals from six natural endemic D. tanjungpurensis populations. The genetic diversity parameters were estimated from binary data on PCR products (bands present or absent). The results showed that the mean number of observed alleles, the mean number of effective alleles, the genetic diversity, Shannon's information index, the number of polymorphic loci, and the percentage of polymorphic loci were 1.53, 1.29, 0.17, 0.26, 77.83, and 52.59, respectively. An analysis of molecular variance (AMOVA) showed that genetic diversity within populations (65%) was higher than that between populations (35%). UPGMA clustering and principal coordinate analysis, based on the Dice similarity matrix, classified the populations into three groups: 1) Hutan Rejunak and Tembaga, 2) Bukit Merindang, and 3) Hutan Rawak, Bukit Sagu 1, and Bukit Sagu 2. Further analysis of the population structure using the STRUCTURE software classified all individuals into two major categories, uniting Groups 2 and 3 into one. In conclusion, the ISSR markers employed in the study revealed a high level of genetic diversity in the Durian Tengkurak."

Institut Pertanian Bogor. Faculty of Mathematics and Natural Sciences, 2015

pdf

Journal Article, Universitas Indonesia Library

Memetics of ethno - clustering analysis

"Work on phylomemetic trees for certain cultural artifacts of ethnic cultures in the Indonesian archipelago is advanced by proposing a methodology that can yield a cultural tree reflecting the superposition of the phylomemetic trees of traditional songs, architectural designs, and fabric motif designs...."

Journal Article, Universitas Indonesia Library

Moch Galih Primantara

Clustering graf dengan metode k-way spectral clustering = Graph clustering with k-way spectral clustering method

" ABSTRAK

Clustering is one of the important topics in data mining. Graph theory can support clustering by constructing a graph that represents the data to be clustered. One graph clustering method is k-way spectral clustering, which uses the first k eigenvalues and eigenvectors of a graph's Laplacian matrix to perform clustering, where k is the desired number of clusters. This undergraduate thesis discusses the k-way spectral clustering algorithm with reference to Ng, Jordan, and Weiss (2002) and von Luxburg (2007).

ABSTRACT
Clustering is one of the most important topics in data mining. Graphs can be used for clustering by building a graph that represents the data to be clustered. K-way spectral clustering is one of many graph clustering methods. It uses the first k eigenvalues and eigenvectors of a Laplacian matrix to form clusters, where k is the desired number of clusters. This undergraduate thesis (skripsi) discusses the k-way spectral clustering algorithm of Ng, Jordan, and Weiss (2002) and von Luxburg (2007)."

Universitas Indonesia, 2016

S61791

UI - Undergraduate Thesis (Membership), Universitas Indonesia Library

Khadijah Fahmi Hayati Holle

Preference based term weighting for arabic fiqh document ranking

"In document retrieval, besides how well search results match the query, a subjective user assessment is also expected to be a deciding factor in document ranking. This preference aspect is evident in searches of fiqh documents: people tend to prefer a certain fiqh methodology without rejecting the others. A preference factor therefore needs to be considered alongside relevance in document ranking. This research proposes a preference-based term weighting method to rank documents according to user preference. The proposed method is combined with term weighting based on the document index and the book index, so that it accounts for both relevance and preference. The proposed method is Inverse Preference Frequency with α value (IPFα). In this method, the preference value is calculated by IPF term weighting; the preference values of terms that match the query are then multiplied by α. Combining IPFα with the existing weighting methods gives TF.IDF.IBF.IPFα. The proposed method is tested on a data set of several Arabic fiqh documents and evaluated using recall, precision, and F-measure. The proposed term weighting method ranks documents in the right order according to user preference, as shown by a recall of 75%, a precision of 100%, and an F-measure of 85.7%.

In search, besides how well the results match the query, there is a subjective user judgment that is expected to be a deciding factor in document ranking. This preference aspect is apparent in searches of fiqh documents. A person tends to favor a particular fiqh methodology while not disregarding the views of other fiqh methodologies. Preference therefore becomes a necessary factor alongside relevance in document ranking. This study accordingly proposes a preference-based term weighting method for ranking documents according to user preference. The proposed method is combined with term weighting based on document and book indexes, so that it takes both relevance and preference into account. The proposed weighting method is called Inverse Preference Frequency with α value (IPFα). The proposed weighting step computes the preference value of a term using IPF weighting. The preference value of a document term that matches a query term is then multiplied by α as a boost. IPFα is combined with the existing weighting methods to form TF.IDF.IBF.IPFα. The proposed method is tested on a data set of several Arabic-language fiqh documents. Evaluation uses recall, precision, and F-measure. The test results show that TF.IDF.IBF.IPFα weighting ranks documents in the correct order and in line with user preference, as indicated by maximum values of 75% for recall, 100% for precision, and 85.7% for F-measure."

Surabaya: Institut Teknologi Sepuluh Nopember Surabaya, Faculty of Information Technology, Department of Informatics Engineering, 2015

AJ-Pdf

Journal Article, Universitas Indonesia Library
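
The figures reported in the fiqh ranking record above are consistent with one another if the F-measure is taken as the harmonic mean of precision and recall. That definition is assumed here, since the abstract does not give the formula, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the balanced F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract: precision 100%, recall 75%.
score = f_measure(1.0, 0.75)
print(round(score * 100, 1))  # 85.7
```

With precision 1.0 and recall 0.75, the score is 1.5 / 1.75, which rounds to the 85.7% given in the abstract.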

Information retrieval of text document with weighting tf-idf and lcs

"Information retrieval for text documents requires a method that can return documents that are highly relevant to the user's request. One important step in text representation is the weighting process. Using LCS to adjust Tf-Idf weights takes into account words that appear in the same order in the query and in the document text. A very long but irrelevant document can produce a weight that does not represent the document's relevance. This research proposes using LCS to weight word order while taking into account the length of a document relative to the average document length in the corpus. The method returns text documents effectively. Adding the word-order feature, normalized by the ratio of document length to the overall documents in the corpus, yields precision and recall values as good as those of the method of Tasi et al.

A text document retrieval system needs a method that can return documents that are highly relevant to the user's request. One important stage in text representation is the weighting process. Using LCS to adjust Tf-Idf weights takes into account occurrences of the same word order in the query and in the document text. Very long but irrelevant documents cause the resulting weights to fail to represent document relevance. This study proposes an LCS method that weights word order while taking into account document length relative to the average document length in the corpus. The method retrieves text documents effectively. Adding the word-order feature, normalized by the ratio of document length to all documents in the corpus, produces precision and recall values as good as those of the method of Tasi et al."

Surabaya: Institut Teknologi Sepuluh Nopember Surabaya, Faculty of Information Technology, Department of Informatics Engineering, 2013

AJ-Pdf

Journal Article, Universitas Indonesia Library
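
The word-order signal described in the tf-idf and LCS record above can be illustrated with a word-level longest common subsequence, assuming that LCS in the abstract stands for longest common subsequence. The sketch below only computes the LCS length between a query and a document. How the paper folds this value into the Tf-Idf weight and normalizes it by document length is not given in the abstract, so that step is left out. The function name and sample texts are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)  # extend the common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip a token on one side
        prev = curr
    return prev[-1]


# Hypothetical query and document, split into words.
query = "information retrieval of text".split()
document = "retrieval of information from text document".split()
print(lcs_length(query, document))  # 3
```

The longest in-order match here is "retrieval of text", so a document that repeats the query words in a different order scores lower than one that preserves their sequence.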

Ade Azurat

Mechanizing logic in an aspect oriented attribute grammar system

"This paper reports preliminary work on using an aspect-oriented attribute grammar system called UU-AG to develop computer-aided verification tools. UU_AG provides an abstract and modular way to develop such tools and to upgrade them incrementally later on. The paper presents an example of a toy programming logic implemented in UU_AG and shows the implementation of the verification condition generator (VCG). We then extend the implementation with a new feature, such as a run-time trace generator, to validate the computation of the implemented inference engine."

2003

JIKT-3-2-Okt2004-77

Journal Article, Universitas Indonesia Library
