Search Results

Found 4 documents matching the query

Bayu Satria Persada

Perbandingan Depthwise Separable Convolutional Neural Network dan Convolutional Neural Network Sebagai Multiclass Keyword Spotting Pada Edge Device = Comparison of Depthwise Separable Convolutional Neural Network and Convolutional Neural Network As Multiclass Keyword Spotting On Edge Devices

"Artificial Intelligence (AI) has developed rapidly. Of its three main directions, namely computer vision, speech processing, and natural language processing, speech processing shows the lowest trend. Even so, speech processing applications such as speech recognition and keyword spotting are already widely implemented, for example keyword spotting models that use a Convolutional Neural Network (CNN) on microcontrollers, mobile devices, and other devices. Because a CNN alone does not necessarily produce high accuracy, this study tries a Depthwise Separable Convolutional Neural Network (DSCNN) to obtain higher accuracy. Keyword spotting models have not yet been widely implemented on other edge devices, where an edge device means a simple user-side device with limited computing capability. The F1 score of the DSCNN is compared with that of the CNN model. The DSCNN achieves its best F1 score with 4 depthwise separable convolution layers, 256 convolution filters, 512 depthwise convolution filters, the RMSprop optimizer, and a batch size of 126. The test results show that, in general, the DSCNN yields a better F1 score than the CNN: 31.8% versus 28.35%. However, the DSCNN uses more resources and has a longer response time."

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Gita Ayu Salsabila

Perbandingan Convolutional Neural Network dan Convolutional Recurrent Neural Network sebagai Model Multiclass Keyword Spotting pada Edge Device = Convolutional Neural Network and Convolutional Recurrent Neural Network Comparison as Multiclass Keyword Spotting Model on Edge Device

"During the COVID-19 pandemic, voice interfaces based on keyword spotting (KWS) have been used increasingly in various electronic systems because they require minimal physical contact. One system that can use KWS is an elevator navigation system, in which KWS recognizes keywords for the floor the user wants to reach. In this study, a KWS model for an elevator navigation system was built using a CNN (Convolutional Neural Network) and a CRNN (Convolutional Recurrent Neural Network) to recognize six specific keywords. During model development, CRNN hyperparameters related to the GRU implementation, batch normalization, dropout layer, optimizer, kernel size, and batch size were varied to test their effect on CRNN performance. These tests found that the CRNN performed best with a bidirectional GRU of two layers and 64 hidden units, a 3x3 kernel size, the Adam optimizer, a batch size of 163, and a batch normalization layer applied before the dropout layer. The CRNN model obtained from the best hyperparameter combination was then compared with the CNN model to evaluate classification performance when run on a Raspberry Pi 4B. Based on accuracy, percentage of RAM usage, and latency, the CNN model performed better than the CRNN."

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Muhammad As`Ad Muyassir

Rancang Bangun Metode Object Detection YOLOv5 pada Raspberry Pi 4B untuk Smart Trolley = Development of YOLOv5 Object Detection on Raspberry Pi 4B for Smart Trolley

"Supermarkets are currently the preferred place to shop for household needs because customers can choose the products they want without queuing. To pay, however, customers still need to queue at the cashier. This research therefore implements a cashierless system that performs checkout automatically and efficiently, so that customers no longer need to queue at the cashier. The cashierless system used in this study is a smart trolley, which detects products entering or leaving the customer's trolley and then checks out automatically when the customer leaves the supermarket. Product detection requires a machine learning model of the object detection type. The model must also be deployable on an edge device because detection takes place in the trolley, which has limited space. The model used is therefore YOLOv5, because it offers high accuracy and performance while remaining deployable on edge devices. The backbone variation test shows that the original backbone is better than the Swin Transformer backbone, with an F1-score of 98.64%, a model size of 7.7 MB, and a speed of 3.87 FPS on the test computer and 0.74 FPS on a Raspberry Pi 4B. The dataset variation test shows that combining the moving dataset with the static blur dataset produces the model with the best accuracy, 99.53% in the training phase and 99.44% in the testing phase. The light intensity test shows that using lamps to improve lighting around the detection area inside the trolley can increase the F1-score of the detection results up to 63.55%. The product speed variation test shows that the ideal product speed during detection is up to 36 cm/s on the test computer and below 7 cm/s on the Raspberry Pi 4B. The test with an increased sampling rate shows that products can be detected on the test computer at speeds of up to 124 cm/s for sufficiently wide products."

Depok: Fakultas Teknik Universitas Indonesia, 2022

S-pdf

UI - Skripsi Membership Universitas Indonesia Library

Anandwi Ghurran Muhajjalin Arreto

Perbandingan convolutional neural network dan multihead attention dengan recurrent neural network sebagai multiclass keyword spotting pada edge device = Comparison of convolutional neural network and multihead attention with recurrent neural network as multiclass keyword spotting on edge devices.

"Artificial Intelligence (AI) has developed so rapidly that it is now commonly seen and used by the general public. One frequently used type of AI is speech recognition, especially keyword spotting, whose use has grown because of the COVID-19 pandemic. Keyword spotting can be applied to elevators as a navigation system so that users do not need to touch the buttons and can instead move the elevator simply by saying the intended floor. Keyword spotting in an elevator system can be implemented with many methods; the methods tested in this thesis are CNN (Convolutional Neural Network) and MHAtt RNN (Multihead Attention Recurrent Neural Network). The study is scoped so that each method classifies six keywords and is evaluated across various scenarios that can occur in an elevator. In building the MHAtt RNN model, the best performance was obtained with 8 attention heads and an LSTM with 32 units. The model was trained using the Adam optimizer with a learning rate of 0.001 and a decay of 0.005 so that training would produce the best model. After testing on various scenarios that can occur inside an elevator, the results show that the CNN model overall performs better than the MHAtt RNN model because it has a higher F1-score and precision."

Depok: Fakultas Teknik Universitas Indonesia, 2021

S-pdf

UI - Skripsi Membership Universitas Indonesia Library
