
Search Results

Found 3599 documents matching the query
cover
Mary, Leena
"Extraction and representation of prosodic features for speech processing applications deals with prosody from a speech processing point of view, with topics including the significance of prosody for speech processing applications, why prosody needs to be incorporated in speech processing applications, and different methods for extraction and representation of prosody for applications such as speech synthesis, speaker recognition, language recognition and speech recognition."
New York: Springer, 2012
e20418411
eBooks  Universitas Indonesia Library
cover
Mary, Leena
"This updated book expands upon prosody for recognition applications of speech processing. It covers the importance of prosody for speech processing applications; builds on why prosody needs to be incorporated in speech processing applications; and presents methods for extraction and representation of prosody for applications such as speaker recognition, language recognition and speech recognition. The updated book also includes information on the significance of prosody for emotion recognition and various prosody-based approaches for automatic emotion recognition from speech."
Switzerland: Springer Cham, 2019
e20502221
eBooks  Universitas Indonesia Library
cover
New York: IEEE Press, c1979
621.381 9 AUT
Textbook  Universitas Indonesia Library
cover
F.X. Rahyono, 1956-
"This experimental phonetic research deals with the prosody of directive speech in Javanese. The research procedures were: (1) speech production, (2) acoustic analysis, and (3) perception test. The data investigated are three types of directive utterance, in the form of statements, commands, and questions. The data were obtained by recording dialogues that present polite as well as impolite speech. Three acoustic experiments were conducted for statements, commands, and questions in directive speech: (1) modifications of duration, (2) modifications of contour, and (3) modifications of fundamental frequency. The results of the subsequent perception tests, using 90 stimuli and 24 subjects, were analysed statistically with ANOVA (analysis of variance). Based on this statistical analysis, the prosodic characteristics of polite and impolite speech were identified."
Depok: Faculty of Humanities University of Indonesia, 2009
pdf
Journal Article  Universitas Indonesia Library
cover
Angela Anggundari
"ABSTRACT
This undergraduate thesis discusses speech recognition for Ericsson T18s Voice Dialing using discrete Hidden Markov Models (HMM), modified for speaker-dependent and speaker-independent applications. Its aim is for the system to recognize words spoken by a specific person whose voice is in the database (the speaker-dependent application) and by many people, including those not in the database (the speaker-independent application).
In this thesis the program was written in MATLAB 5.3 and run with Windows 98, a sound card with WAV recording, a microphone, and speakers. This is a modification of the original reference, in which the program ran on MATLAB 5.x, Visual C++, and a Texas Instruments TMS 320C6701 DSP board, and it makes the system easier and cheaper to apply.
In addition, this thesis has a broader application, covering both speaker-dependent and speaker-independent recognition, as a modification of the reference for this thesis (speaker independent) and of Ericsson T18s Voice Dialing (speaker independent).
The system works by recognizing the word spoken by the tester: the spoken word is compared with the words from the voices stored in the database at test time. The database contains 5 words, and the word spoken during the test is compared against these five. The word with the highest probability is then selected as the word recognized by the system.
The system's recognition performance is judged to be fairly good, because it confirms the existing theory for both the speaker-dependent and the speaker-independent application, although it is not as good as the original reference.

"
2001
S39927
UI - Undergraduate Thesis Membership  Universitas Indonesia Library
cover
Jurafsky, Dan
Upper Saddle River, N.J.: Pearson Education, 2009
410.285 JUR s
Textbook  Universitas Indonesia Library
cover
Jesus Romero-Trillo
"The book examines key issues in the development of prosody and delves into the role of intonation in the construction of meaning. The contributions tackle difficult areas of intonation for language learners, providing a theoretical analysis of each stumbling block as well as a practical explanation for teachers and teacher trainers. The numerous issues dealt with in the book include stress and rhythm, tone units and information structure, intonation and pragmatic meaning, tonicity and markedness, etc. "
Dordrecht, Netherlands: Springer, 2012
e20400640
eBooks  Universitas Indonesia Library
cover
Rao, K. Sreenivasa
"Predicting prosody from text for text-to-speech synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems."
New York: Springer, 2012
e20418380
eBooks  Universitas Indonesia Library
cover
Neustein, Amy, editor
"Forensic speaker recognition : law enforcement and counter-terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. "
New York: Springer, 2012
e20421082
eBooks  Universitas Indonesia Library
cover
Mohammad Salman Alfarisi
"

One of the main problems with existing Automatic Speech Recognition (ASR) systems is the lack of transparency in how voice data is handled, which raises concerns about the privacy of that data. On the other hand, developing an ASR system that is sufficiently accurate and can work offline requires a large amount of data, specifically voice data accompanied by transcripts. This is one of the main obstacles to speech recognition system development, especially for low-resource languages such as Bahasa Indonesia. This research therefore designs an automatic speech recognition system based on wav2vec 2.0, an artificial intelligence model that can recognize speech signals and convert them to text with good accuracy even when trained on only a small amount of labeled data. In tests on the Common Voice 8.0 dataset, the wav2vec 2.0 model produced a WER of 25.96%, about half the 50% WER of a conventional Bidirectional LSTM model, which also required 5 times more labeled data during training. However, the wav2vec model requires more computational resources: 2 times more RAM and 10 times more storage than the LSTM model.

"
Depok: Fakultas Teknik Universitas Indonesia, 2022
S-Pdf
UI - Undergraduate Thesis Membership  Universitas Indonesia Library